I thank the authors for taking the suggestions of the first round of reviews into consideration. While the manuscript still does not describe actual application of the techniques to atmospheric data, the introductory material on merging of ozone and temperature measurements does add value to the manuscript. And the structure and readability have improved compared to the first version.
I believe, however, that there are some errors and inconsistencies in the formulas, calculations and quantitative results of the study. I urge the authors to triple check their calculations, addressing the comments listed below.
1. The factor of *12 added to Eqs 2, 3 and 5 is wrong: to convert months into years one should divide by 12, not multiply. But I would not advocate this: since the quantities in each equation are based on monthly means, and the value of n used to determine the scaling factor for the conversion to a confidence interval (1.96 or other) is the number of months, I think it’s best to write these formulas as “number of months…”
2. Spot checking Table A1 turns up a number of errors. For example, using Eq 2 and the numbers provided, I calculate time periods (months or solar rotation cycles) to identify an offset of 0.0008 watts m-2 nm-1 for SOLSTICE-SIM of 5.9 and 6.6, not 5 and 6 as listed in the table. Also, the value of 5.8 years to identify a drift of 0.0001 watts m-2 nm-1 year-1 in the SOLSTICE-SIM timeseries is inconsistent with Figure 5, which suggests around 2.2 years.
3. Apparently, the calculations behind Figure 5 use different values for sigma and phi than used for the prior calculations in section 3, and used in Table A1. This is not explained, and is therefore extremely confusing.
4. It doesn’t make sense to me why the “drift” in Eq 3 (and 5) should be used in units of yr-1, when the sigma and phi are based on monthly timeseries. Perhaps this explains why different sigma and phi values are used in the construction of Fig 5: these could be estimates of the standard deviation and autocorrelation of the annual mean timeseries. But could this be reliably done with just 3 years of data? It seems better to use a “drift” in units of month-1, and the original values of sigma and phi. When I do this, I get values pretty close to what is shown in Fig 5, but a little larger. For example, here is the calculation the way I think it has been done in Fig 5, for a “drift” of 1*10^-4:
(1.96*8.58e-5/(1e-4)*sqrt((1+0.58)/(1-0.58)))^(2/3) = 2.1994,
and the way I suggest it probably should be done:
(1.96*1.7e-4/(1e-4/12)*sqrt((1+0.89)/(1-0.89)))^(2/3)/12 = 2.5144
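For convenience, the two calculations above can be reproduced with a short script. This is only a sketch of the arithmetic as I have written it here: the function name and argument names are my own, and the expression is the form used in the two lines above, which may not match exactly how the authors implemented Eq 3.

```python
import math


def time_to_detect_drift(sigma, phi, drift, z=1.96):
    """Evaluate (z * sigma / drift * sqrt((1 + phi) / (1 - phi))) ** (2/3).

    The result is in the time unit of the drift (e.g. years if the drift
    is per year, months if the drift is per month). The name and signature
    are hypothetical, chosen for this illustration only.
    """
    return (z * sigma / drift * math.sqrt((1 + phi) / (1 - phi))) ** (2 / 3)


# As I think it was done for Fig 5: drift per year, result directly in years.
years_fig5 = time_to_detect_drift(sigma=8.58e-5, phi=0.58, drift=1e-4)

# As I suggest: monthly sigma and phi, drift converted to per month,
# result in months, then divided by 12 to express it in years.
years_monthly = time_to_detect_drift(sigma=1.7e-4, phi=0.89, drift=1e-4 / 12) / 12

print(f"Fig 5 style:   {years_fig5:.4f} years")
print(f"Monthly style: {years_monthly:.4f} years")
```

Running this should give approximately 2.20 and 2.51 years, matching the two values quoted above. Writing the calculation with the drift in month-1 keeps all inputs on the same (monthly) time base, with the conversion to years applied only once at the end.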
5. The abstract states “For relative drift to be identified within 0.1% yr-1 uncertainty, the overlap for these two satellites would need to be 2.6 years”, while Fig 5 suggests rather that ~2.6 years is needed to identify a drift of 10%! This is a big difference. If the 10% value is correct, it has a pretty substantial practical implication for the study—it seems unlikely that any reasonable overlap period (of a small number of years) will be able to do much to constrain drifts of any but the most egregious magnitude.
Some specific comments
P1, l27: Perhaps pedantic, but it’s the satellite missions that overlap, not the satellites themselves.
P1, l29: this seems to be a result from another study, not this study, so probably shouldn’t be in the abstract.
P1, l32: actually 6 months (5.9), see major comments.
P2, l48: another pedantic point: the missions should overlap, not the launches
P5, Fig 2: these temperatures must be for a specific altitude range?
P5, Fig 2: the relevance of Fig 2 to this study is questionable. It shows differences between 2 merged datasets with the offset and drift removed. But the offset and drift estimation is specifically the theme of this paper!
P6, l8: “our” as in the authors’, or more generally?
P6, l20: what “model” is being discussed here?
P6, l32: why can only “small” problems be identified?
P7, l18: “respect” is a unique word choice, and I don’t know exactly what the authors mean by it.
P8, l8: So many atmospheric measurements show variability at time scales longer than the annual cycle due to modes of internal variability (e.g., ENSO, NAO) or responses to external forcings (like solar variations!). So I’d be careful about implying that one year of data will “cover the full range”.
P9, l6: These 3 points are contained in a paragraph describing Fig 3: it would be nice if point 1 connected the pre-flight calibration estimates with what is shown in Fig 3.
P9, l9: It’s perhaps impossible in this case to know whether there is a true drift between the instruments, or a “multiplicative bias”. This might be worth pointing out at some point.
P9, l19: Does SIM also show a jump here?
P11, Fig 4: Is this the same data as in Fig 3, just shown now in monthly means? The wavelength of the measurements should be mentioned.
P12, l7: Actually, depending on how sampling is dealt with, monthly means can exacerbate sampling differences between instruments compared to shorter-term averages. See, e.g., Toohey et al., 2013.
P12, l26: actually 6 months.
P13, l29: This sentence, which includes Eq 3, is not grammatically correct.
P14, l4: grammatical issue with sentence 2 of Figure caption.