Interactive comment on “Can sampling biases explain the discrepancies between lower stratospheric water vapour trend estimates derived from the FPH observations at Boulder and a merged zonal mean satellite data set?”

The present study by Lossow et al. addresses the important question of whether the differences in lower stratospheric water vapour trends derived from the Boulder frost point hygrometer (FPH) time series and the merged zonal mean satellite data set by Hegglin et al. (2014) are caused by sampling biases. For that purpose, the authors compare water vapour trends at Boulder and for different latitude bands derived from various chemistry-climate models. The same comparison is done for several other satellite data sets. Overall, the analysis indicates that sampling biases are unlikely to be the reason for the trend discrepancies.
The paper is well written and provides an important contribution to the scientific community. Therefore, I recommend the manuscript for publication in ACP after some, mainly minor, modifications.
First of all, I have to say that it is a pity that the merged zonal mean satellite data set by Hegglin et al. is not included in the present study, but as I understand it, the data set is unfortunately still not publicly available, even 3 to 4 years after publication. Although Fig. 1 is mainly meant as a motivation for the subsequent analysis, it would be great to see the actual trend estimates for the merged satellite data here, in particular as such figures are often cited and the associated caveats tend to get lost. While extracting the percentage trends from the Hegglin et al. paper is not a problem, the conversion to mixing ratio trends by assuming a fixed reference mixing ratio (the same reference value for all altitudes?) is somewhat more troubling.
In some parts the paper is rather lengthy and provides a lot of detail, especially in section 4.1. Here the authors provide so much information about simulated water vapour trends at different altitudes, for different time periods, etc., that it is sometimes difficult to keep the focus on the main question of the paper, namely the role of sampling biases in trend estimates. Section 4.1 could just as well be part of an evaluation paper on modelled lower stratospheric water vapour trends. For the sake of clarity, I would suggest shortening this section drastically and focusing on one figure that makes the point. Other figures could be moved to a supplement.
Looking at the wide spread in simulated stratospheric water vapour trends, I am immediately tempted to ask for explanations for the model spread, but I understand that this would be beyond the scope of the paper (nevertheless, if there are any ideas, assumptions, etc., it would be great to briefly mention them). However, I am wondering how the choice of the model simulation used as transfer function for the merged zonal mean satellite data set could affect the merged data set. Maybe the authors could add a short discussion of that issue in section 5.
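To illustrate why the choice of reference value matters, here is a minimal sketch of the conversion as I understand it. The function name, the trend unit (percent per decade) and all numbers are hypothetical and chosen purely for illustration; they are not taken from the manuscript or from Hegglin et al. (2014).

```python
def percent_to_mixing_ratio_trend(trend_percent, reference_ppmv):
    """Convert a relative trend (percent per decade, assumed unit) into an
    absolute trend (ppmv per decade) using a reference mixing ratio."""
    return trend_percent / 100.0 * reference_ppmv


# Hypothetical relative trend; NOT a value from the manuscript.
trend_percent = -5.0

# The same relative trend gives different absolute trends depending on the
# reference mixing ratio (hypothetical values in ppmv).
for reference_ppmv in (3.5, 4.0, 4.5):
    absolute = percent_to_mixing_ratio_trend(trend_percent, reference_ppmv)
    print(f"reference {reference_ppmv:.1f} ppmv -> {absolute:+.3f} ppmv per decade")
```

Because the absolute trend scales linearly with the reference value, a single reference applied at all altitudes would over- or underestimate the mixing ratio trend wherever the actual mixing ratio departs from it.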
Overall I can only encourage the authors to continue their research on discrepancies in lower stratospheric water vapour trends among various data sets and to hopefully come up with reliable observational composites, which are key requisites for monitoring changes in atmospheric quantities, but also for model evaluation.

Specific Comments:
- Fig. 1b: Given that the trends from the FPH data and the merged zonal mean satellite data show different signs, plotting the trend difference is a bit confusing to me. This is different to, for example, Fig. 3, since the modelled trends at Boulder and for the zonal mean usually show the same sign.
- Different time periods (e.g. Fig. 3 and 5): The different trend estimates shown in the paper are often based on different time periods, which again makes it difficult at times to keep track of the overall picture.
- Fig. 4 and related discussion: The idea behind this figure and the related discussion is not clear to me. I also do not clearly see the link to the shorter observational time series presented in section 4.2. Furthermore, as stated on p 13, l 16/17, the trends shown in Fig. 4 are not statistically significant. Therefore I would recommend omitting this figure.
- Statistical significance: It would be helpful to mention the significance of a trend or difference right away. For example, on p 11, l 4/5 it is stated that "The trends derived from the adapted time series yield smaller values as those obtained from the full time series ...", but later on in the discussion section it is mentioned that these differences are not significant (p 16, l 23/24).
- Why are the FPH trend estimates not included in the various figures for comparison with the model data or the other satellite data sets (e.g. Fig. 3, 5 and 7)?
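
Regarding the comment on statistical significance, the kind of statement I have in mind could be as simple as the sketch below. It assumes independent 1-sigma trend uncertainties combined in quadrature and a 2-sigma significance threshold. Both are my assumptions and not necessarily the authors' method, and independence is questionable when two trends come from the same model run. The function name and all numbers are hypothetical.

```python
import math


def trend_difference(trend_a, sigma_a, trend_b, sigma_b, k=2.0):
    """Return (difference, combined uncertainty, significant?) for two trends.

    Assumes independent 1-sigma uncertainties (combined in quadrature) and
    calls the difference significant if it exceeds k combined sigmas.
    """
    diff = trend_a - trend_b
    sigma_diff = math.sqrt(sigma_a**2 + sigma_b**2)
    return diff, sigma_diff, abs(diff) > k * sigma_diff


# Hypothetical trends and uncertainties; NOT values from the manuscript.
diff, sigma_diff, significant = trend_difference(0.3, 0.1, 0.1, 0.1)
print(f"difference = {diff:.2f} +/- {sigma_diff:.2f}, significant: {significant}")
```

Reporting the difference together with its uncertainty at the point where it is first mentioned would let the reader judge immediately whether the smaller trends from the adapted time series are meaningful.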