the Creative Commons Attribution 4.0 License.
On the use of satellite observations to fill gaps in the Halley station total ozone record
Susan Solomon
Kane A. Stone
Jonathan D. Shanklin
Joshua D. Eveson
Steve Colwell
John P. Burrows
Mark Weber
Pieternel F. Levelt
Natalya A. Kramarova
David P. Haffner
Download
- Final revised paper (published on 30 Jun 2021)
- Supplement to the final revised paper
- Preprint (discussion started on 22 Feb 2021)
Interactive discussion
Status: closed
-
RC1: 'Interesting, but needs clarification of uncertainty measures', Anonymous Referee #1, 26 Mar 2021
Overall Comments
The manuscript describes the use of satellite data of total ozone to fill gaps in the ground-based Dobson total ozone record at Halley Bay, Antarctica.
As mentioned in the text, Halley Bay has one of the longest and most important total ozone records. This record was key to the detection of the Antarctic ozone hole.
It is a good idea, and scientifically sound, to fill gaps in this important record with satellite data, as well as to check for consistency. Overall, the paper is well written and merits publication in ACP. Before publication, however, I suggest a few important clarifications. Throughout the manuscript, I get confused about the use of individual total ozone measurements, daily averages and differences, monthly averages and differences, and the corresponding standard deviations. Sometimes standard deviations appear to be mis-named "averages" as well. The description of the applied method to shift satellite data towards the Dobson data is also quite long-winded. It would benefit from shortening and clarification. There is no need to make a simple average bias correction appear much more complicated than it is.
Specific Comments
line 21, average of 2 Dobson Units: I don't think this is what is meant. My understanding is that each satellite record is shifted by the Δ from Fig. 3, so that it matches the Dobson data on average. Therefore the satellite average should reproduce the Dobson average exactly, by construction. What is probably meant here is "within a standard deviation of 2 Dobson units". Even more information is required here: is the (presumably) 2 DU standard deviation for monthly means or for daily means? Is the given value 1 or 2 standard deviations? Is it +-1 DU or +-2 DU? Is it even correct? In lines 184 and 185 the stated standard deviation of the differences is 6 to 7 DU. This is much larger than 2 and needs to be checked.
From text and Table 1 it appears that the "root mean square difference" (which is the same as the standard deviation!) for daily average data is about 12 DU. So the 2 DU are probably for the monthly average data, but the 12 DU for the daily data should be mentioned here as well. (Assuming a Gaussian distribution, about 68% of the data should be within +-1 standard deviation of the mean (which should be zero here by construction), and 95% of the data within +-2 standard deviations.)
While I call this lack of clarity out here for line 21, it exists throughout the text, and needs to be fixed everywhere.
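As a quick check of the Gaussian coverage figures quoted above, the fraction of a normal distribution within +-k standard deviations of its mean is erf(k/√2). The short sketch below evaluates it with the Python standard library; the function name `gaussian_coverage` is only an illustrative choice and nothing in it comes from the manuscript.

```python
import math

def gaussian_coverage(k: float) -> float:
    """Fraction of a normal distribution within +-k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

# Coverage for one and two standard deviations (roughly 68 % and 95 %).
within_1_sigma = gaussian_coverage(1.0)
within_2_sigma = gaussian_coverage(2.0)
print(f"+-1 sigma: {within_1_sigma:.4f}, +-2 sigma: {within_2_sigma:.4f}")
```

These fractions hold only if the Dobson minus satellite differences are close to Gaussian, which is the assumption stated in the comment itself.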
line 39: Here it says "throughout the year", line 37 said that no data are available for May to July. What is true now?
line 45: delete "the" before "satellite"?
line 50: replace "well tested" by "in place"?
Fig. 1: are those all measurements or daily averages? Please mention. Are the satellite data the original data from all satellites, or the adjusted data matching the Dobson?
Lines 79 to 104: It would be good to also give the size of the satellite ground pixels near Halley Bay for all the satellite instruments. In addition, I think it absolutely necessary to state which data version was used for each satellite, and from which URL the satellite data came. For GOME2, for example, there are data from Uni-Bremen, from DLR / EUMETSAT, from RAL / ESA_CCI, ... A table of URLs and versions would help here.
Line 85: "cross-calibrated" My understanding is that the current SBUV 8.6 version is not cross-calibrated between satellites, but relies on improved calibration at the radiance level for each satellite. Please check. Natalya Kramarova will know.
Line 103: should be "polarization effects"
Line 107: How were overpasses defined? Satellite foot-point within what distance? Same for all satellites?
line 127: Would it not be better to have Figure 3 and lines 155 to 160 right here in section 2.3? After all, the Figure shows the Δ-s that are discussed in lines 120 to 127?
Line 128, Section 2.4: Would it not be clearer, to have section 2.2 here, after section 2.3. That way, you would have a more logical flow. a.) discuss Δ-s for individual satellites b.) discuss how you use all satellites to fill in, and how that looks for the different months.
Table 1: What is shown here? Differences between monthly averages, or differences between daily averages? From the numbers, around 12 DU, it looks like it was daily averages. Were the satellite data Δ-adjusted or not? In April, that would make a large difference according to Fig. 3.
Line 143: Would the Δ-adjustment not take care of the Bass-Paur difference as well? Is it necessary to mention systematic biases here, since the filling-in method takes care of them anyways?
Figure 4: Having Figure 4 so close to Figure 3 confused me (Are they now using monthly Δ-s again? Or daily? Or what?). I guess the only point of Figure 4 is to show that 2019 was very different from the other years. This does not become very clear here. The stars for 2019 are easy to miss in the Figure, and they do not have error bars. It would be helpful to have a clearer Figure, that points out 2019 in a legend in the Figure, not just in the caption.
Also Figure 4: What are the error bars? Standard deviation of daily data or monthly data? Standard error of the mean? One or two standard deviations?
Lines 169 to 187, and Figure 5: I am confused. Does Fig. 5 show data, where 2003 to 2012 was the training period? Or which training periods were used to generate the data in the two panels of Fig. 5?
Lines 184, 185: I assume that the numbers are for the trained data? Please state the same numbers for the unadjusted satellite data. Only then can you conclude whether the adjusted data are better, or not. Check consistency with numbers in abstract and conclusions!!
Figure 6: The differences between the Dobson monthly means and the Δ-adjusted satellite data look rather large in 2019 and 2020, 5 to 10 DU. Is that consistent with the numbers given in lines 184, 185? Figure 4 shows that the 2019 Dobson data are flawed. Are the 2020 Dobson data flawed as well? Flaws of the Dobson data should be stated, and maybe even marked with different symbols in the Figure. How do the Δ-adjusted satellite data look in the other years? It would be good to plot the entire red time series.
Line 230: Are the 2 Dobson Units the average difference? Is that really relevant? In principle, the average difference should be zero, due to the Δ-adjustment. Of course zero is not realized in every subset / realization of the data. Is not the standard deviation between Dobson and Δ-adjusted satellite data a much more meaningful quantity, to show how well the two data sets agree?
Also line 230: Check consistency with the numbers in abstract and in lines 184, 185. Please give (also) the standard deviations of Dobson minus Δ-adjusted satellite data on the basis of monthly and daily means.
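The distinction drawn in these comments between the mean difference (zero by construction after a Δ-adjustment) and the spread of daily versus monthly differences can be illustrated with a small sketch. Everything below is synthetic: the series, the 12 DU noise level, the seasonal bias shape, and the 30-day "months" are assumptions chosen for illustration, not the Halley or satellite records, and the per-day Δ is only one plausible reading of the method the referee describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumed values, NOT the real records).
n_years, n_days = 10, 120                      # 120-day season, 10 years
doy = np.tile(np.arange(n_days), n_years)      # day index within each season
seasonal_bias = 5.0 * np.sin(2 * np.pi * np.arange(n_days) / n_days)  # DU
dobson = 300.0 + rng.normal(0.0, 20.0, size=doy.size)
satellite = dobson + seasonal_bias[doy] + rng.normal(0.0, 12.0, size=doy.size)

# Delta for each day of season: mean (Dobson - satellite) over all years.
diff = dobson - satellite
delta = np.array([diff[doy == d].mean() for d in range(n_days)])

# Apply the adjustment and look at what is left over.
adjusted = satellite + delta[doy]
resid = dobson - adjusted

mean_daily = resid.mean()                      # ~0 by construction
std_daily = resid.std()                        # spread of daily differences
rms_daily = np.sqrt(np.mean(resid**2))         # RMS^2 = mean^2 + std^2

# "Monthly" means: consecutive 30-day blocks within each year.
monthly_resid = resid.reshape(n_years, n_days // 30, 30).mean(axis=2).ravel()
std_monthly = monthly_resid.std()

print(f"mean {mean_daily:.2e} DU, daily std {std_daily:.1f} DU, "
      f"monthly std {std_monthly:.1f} DU")
```

On the training data the mean residual vanishes regardless of how noisy the satellite series is, so it says nothing about agreement; the RMS and standard deviation coincide only because the mean is zero. The monthly spread is smaller than the daily one here because the synthetic daily noise is independent, which real data need not satisfy.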
Citation: https://doi.org/10.5194/acp-2021-122-RC1
AC1: 'Reply on RC1', Lily Zhang, 29 May 2021
The comment was uploaded in the form of a supplement: https://acp.copernicus.org/preprints/acp-2021-122/acp-2021-122-AC1-supplement.pdf
-
RC2: 'Comment on acp-2021-122', Anonymous Referee #2, 21 Apr 2021
On the Use of Satellite Observations to Fill Gaps in the Halley Station Total Ozone Record
The authors use observations from multiple satellite instruments to create an ozone column dataset above the historically and scientifically significant Halley station in Antarctica, using the Dobson instrument as a calibration anchor. The result is a dataset that can fill in gaps due to a recent ice crack, check for calibration issues affecting the new Dobson data, and fill in gaps caused by other future geophysical or social disruptions.
General comments:
Overall, I enjoyed reading this work. The manuscript is well written and organized. Gaps in scientifically significant long-term datasets – such as the Halley Station’s Dobson – are an important problem. The paper contributes usefully to this topic. I have a few questions and requests for clarification.
The satellite instruments used in the study have a variety of measurement techniques, advantages and limitations. While it might not be necessary to do a deep dive into this, I think it is at least necessary to comment on the different spatial resolutions (and perhaps vertical sensitivities) and the spatial coincidence criteria used to define co-location with Halley Station.
Figure 1 is a key illustration of the datasets involved in this study – i.e., the adjusted average satellite and Halley Station Dobson. The figure is limited to 2013 – 2019, I assume because this allows features on the ~monthly timescale to be seen and because 2013 – 2015 was one of the time windows used to test the technique’s ability to reproduce the Dobson. And 2017 – 2018 was the motivating gap in the Dobson’s timeseries. Nonetheless, I would like to see the full timeseries of the two datasets. Figure 6 does some of this, but I believe it only inserts the satellite dataset into the Dobson gaps (?). That’s useful, but I’d also like to see both fully plotted to get a sense of how closely they agree.
I am not entirely comfortable with the conclusion that the adjusted satellite average reproduces the Halley Station Dobson to within about 2 DU. I don’t think it was sufficiently stated that the satellite reproduction of the Halley Dobson data varies seasonally. In Figure 5, some months show very close agreement; others show much larger differences, e.g., ±15 DU. 2 DU is the apparent result of averaging large positive differences and negative differences across the year. In addition, while there are similar annual patterns, there are also notable differences between the years shown.
In addition, the time periods chosen for the test have a particular combination of satellite instruments that are being used to compare with the Halley Station Dobson. What were the results of the comparison between the adjusted satellite average and the Dobson for wider time periods? How well do these test cases, 1998 – 2002 and 2013 – 2015, generally represent the physical conditions and satellite datasets available to fill other gaps and future gaps?
The approach described in the paper, and the resulting dataset, is useful, but I think care needs to be taken in asserting the accuracy in reproducing the Dobson shown here – especially since particular months of the year are often of more significant interest to the study of ozone chemistry than annual averages.
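The cancellation concern raised above can be made concrete with a few numbers. The monthly differences in this sketch are hypothetical values invented for illustration (they are not read from Figure 5); the point is only that a signed mean can be small while the typical monthly disagreement is several times larger.

```python
import numpy as np

# Hypothetical monthly mean differences (Dobson minus adjusted satellite), in DU.
monthly_diff = np.array([15.0, -12.0, 4.0, -6.0, 9.0, -8.0, 2.0, 6.0])

mean_diff = monthly_diff.mean()                 # signed mean: positives and negatives cancel
mean_abs_diff = np.abs(monthly_diff).mean()     # typical size of a monthly difference
rms_diff = np.sqrt(np.mean(monthly_diff**2))    # penalises the largest months most

print(f"mean {mean_diff:.2f} DU, mean |diff| {mean_abs_diff:.2f} DU, RMS {rms_diff:.2f} DU")
```

Reporting the mean absolute or RMS difference per month, alongside the signed mean, would keep the seasonal structure visible.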
Early in the paper (section 2.2) it is stated that both absolute and relative differences are computed. The paper then focuses on the absolute differences. Given the annual pattern to the absolute differences, I’m curious to know what the results of the relative differences were? Do percent differences show the same seasonality as the absolute differences? Would constructing a delta adjustment on the basis of a relative difference or SZA remove some of the annual pattern in the difference?
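To show what is at stake in the absolute versus relative question, the sketch below compares an additive and a multiplicative correction in a deliberately simple case. The function names, the three training values, and the assumed 2 % low satellite bias are all hypothetical and chosen only for illustration; they do not describe any instrument used in the paper.

```python
import numpy as np

def absolute_adjust(sat, dobson_train, sat_train):
    """Additive correction: shift by the mean training difference (DU)."""
    delta = np.mean(dobson_train - sat_train)
    return sat + delta

def relative_adjust(sat, dobson_train, sat_train):
    """Multiplicative correction: scale by the mean training ratio."""
    ratio = np.mean(dobson_train / sat_train)
    return sat * ratio

# Assumed scenario: the satellite reads 2 % low at all ozone amounts.
dobson_train = np.array([280.0, 300.0, 320.0])
sat_train = 0.98 * dobson_train

# An unusually low ozone day, outside the training range.
true_column = 150.0
sat_obs = 0.98 * true_column

abs_err = absolute_adjust(sat_obs, dobson_train, sat_train) - true_column
rel_err = relative_adjust(sat_obs, dobson_train, sat_train) - true_column
print(f"absolute adjustment error {abs_err:+.2f} DU, relative {rel_err:+.2f} DU")
```

In this constructed case the additive shift overcorrects at low ozone while the ratio does not. The result reverses if the underlying bias is a constant offset rather than a percentage, so the sketch only shows why the comparison is worth making, not which form suits the real data.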
Figures:
It would be helpful to expand the width of the figures to fill the width of the page. Figure 1, for example, would benefit from this since it can be difficult to see the structure in the data and the comparison between the Dobson and satellite measurements. This is important for understanding the work being described. Figure 3 as well.
Figure 4: why is there no monthly value for August (month 8) 2019? The caption title could add that 2019 is with the automated Dobson so that this context stands alone in the figure without the text. I’m assuming the error bars are the standard error of the mean? Is the August error bar larger because there are fewer days being used (polar night limitation)?
Figure 6: has tick labels that are too small to easily read; they are notably smaller than other figures.
Tables:
Table 1 is awkwardly split across page 5, which has the caption and titles, and page 6, which has all the values. Please note this to the typesetter and check that the proofs correct this.
Could be informative to add a column for the delta-corrected satellite average and/or the delta adjustment.
Abstract:
In a few cases, it might be helpful to the reader to add specifics. For example,
“by… adjusting overpass data” – adjusting how?
“Tests suggest that our method…” – what tests? Or rephrase to say “comparisons to ___ suggest that our method…”
“… our approach improves on the overall performance…” – what does overall performance refer to? Accuracy of the measurement? Comparison results? Completeness? Also want to be careful because the goal of the paper, as stated at P3L57-58, is not a high-performance dataset / “most accurate” dataset, but “to reproduce what the Dobson instrument would have observed”.
“… there was a significant difference between the two.” – I’d suggest being more quantitative. What was the difference? This also brings up a question of what the authors consider to be a threshold for good agreement?
Specific comments:
P2L1: A good paragraph starting sentence. But, suggest rephrasing “now”, since the interruption being discussed was a few years ago.
P2L49: (warning: pedantic point) Most readers will understand what is meant by “With the advanced multi-satellite observing system now…” but the wording here might suggest that the current (and past) suite of satellites are part of a coherent and coordinated “system”. We have a great scientific community and missions are chosen to provide complementary coverage. But I’m not entirely convinced the satellites used in this study, from different agencies, eras, and designs, are “an observing system”.
P4L79: missing parenthesis before the Munroe citation.
P4L76: “…some also have spectral information at other wavelengths.” – what instruments and what other wavelengths?
P4L80: Suggest re-wording “spin off of the” – it isn’t clear what is meant by “spin off”.
P4L81: “… of a somewhat improved…” – “somewhat” is hard to interpret. What was improved (or not) that is notable here?
P4L79-82: why put this information about GOME here when the details of this instrument are four paragraphs below? Doesn’t serve the flow.
P4L83: SBUV acronym used before defined. Need to carefully define, given it is an instrument and set of instruments.
P5L111: It is important to be clear what has been done to define the spatial coincidence criteria for the satellites and Halley Station.
P5L117: “… due to unusually high differences…” – what were the differences?
P5: Section 2.3: You state that the delta value for each DOY is averaged across all years in each satellite series. Was there any trend or interannual variability in the differences before combining all years?
P6L149: Why is April different?
P8: The observed differences are typically larger than the average delta, which is reduced by large positive and negative values averaging out. If this pattern is due to SZA-dependence, would a bias correction based on (or including) SZA rather than DOY produce a useful result?
P8: If there was an unusually high or low level of ozone abundance, will the absolute delta correction sufficiently reproduce the Dobson? Would a relative (%) adjustment scale better?
P9L190: suggest adding a comment about calculating averages for the months that border polar night, where there will be a reduced number of days contributing. I don’t know offhand when Halley Station’s latitude enters/exits polar night. That might be worth mentioning somewhere since it is relevant to data collection and chemistry.
P12L223: “Larger difference” – how large? Since this is an important finding, it would be good to state this. Hopefully this prompts investigation into why there might be differences between the measurements.
P13L227: were any of the satellites used in this study validated using the data collected at Halley by the Dobson?
P13: There was quite a bit of discussion about the DOY data but the conclusions focused on the monthly averages. Why not include a plot of the DOY comparison results as well and comment on their comparability to the monthly results?
Citation: https://doi.org/10.5194/acp-2021-122-RC2
AC2: 'Reply on RC2', Lily Zhang, 29 May 2021
The comment was uploaded in the form of a supplement: https://acp.copernicus.org/preprints/acp-2021-122/acp-2021-122-AC2-supplement.pdf