Articles | Volume 25, issue 20
https://doi.org/10.5194/acp-25-13863-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Meteorological influence on surface ozone trends in China: assessing uncertainties caused by multi-dataset and multi-method
Download
- Final revised paper (published on 27 Oct 2025)
- Supplement to the final revised paper
- Preprint (discussion started on 19 May 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-1880', Anonymous Referee #1, 14 Jun 2025
  - AC1: 'Reply on RC1', Jia Zhu, 26 Jul 2025
- RC2: 'Comment on egusphere-2025-1880', Anonymous Referee #2, 21 Jun 2025
  - AC2: 'Reply on RC2', Jia Zhu, 26 Jul 2025
  - AC3: 'Reply on RC2', Jia Zhu, 26 Jul 2025
Peer review completion
AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
AR by Jia Zhu on behalf of the Authors (26 Jul 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (28 Jul 2025) by Jason West
RR by Anonymous Referee #1 (12 Aug 2025)
ED: Reconsider after major revisions (20 Aug 2025) by Jason West
AR by Jia Zhu on behalf of the Authors (26 Aug 2025)
Author's response
Author's tracked changes
EF by Polina Shvedko (27 Aug 2025)
Manuscript
ED: Referee Nomination & Report Request started (03 Sep 2025) by Jason West
RR by Anonymous Referee #1 (03 Sep 2025)
ED: Publish as is (09 Sep 2025) by Jason West
AR by Jia Zhu on behalf of the Authors (11 Sep 2025)
Author's response
Manuscript
This study presents an analysis of the meteorological drivers of surface ozone (O3) trends in China from 2013 to 2022, based on an observational dataset of ozone and various supporting analyses, including statistical analyses, simple machine-learning and chemical transport modeling using GEOS-Chem.
The authors highlight the role of meteorological conditions in driving seasonal and regional ozone increases, and use these analyses to begin a discussion of the uncertainties arising from applying these different supporting datasets. The paper will be of interest for those using such large-scale observational datasets to isolate the drivers of air quality trends and may be of interest to policymakers. The use of a consistent metric is interesting. The paper represents a significant effort in gathering and providing an interesting high-level analysis of different ways to analyse the data.
The main result of the study is to assess the consistency between approaches using a coefficient of variability metric, in which higher CVs indicate lower consistency of meteorologically driven O3 trends derived from different datasets or methods. Initially this is used as a comparator between datasets, but towards the end of the MS the authors use this more quantitatively, with thresholds of 0.5 and 1.0 being applied to indicate consistency. How were these numbers chosen? What do they mean?
What other metrics could be used for comparison?
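To make the question about thresholds concrete, here is a minimal sketch of how such a metric could be computed. It assumes the CV is the population standard deviation of the trend estimates divided by the absolute value of their mean, which may differ from the manuscript's exact definition. The function name `coefficient_of_variation` and the trend values are hypothetical and purely illustrative.

```python
import statistics

def coefficient_of_variation(trends):
    """CV = population std / |mean| (an assumed definition, not necessarily the manuscript's)."""
    mean = statistics.fmean(trends)
    if mean == 0:
        raise ValueError("CV is undefined when the mean trend is zero")
    return statistics.pstdev(trends) / abs(mean)

# Hypothetical meteorologically driven O3 trends (ppb per year) from four datasets
example_trends = [0.8, 1.0, 1.2, 1.0]
cv = coefficient_of_variation(example_trends)
print(f"CV = {cv:.3f}")
```

Under this definition the ratio grows without bound as the mean trend approaches zero, for example when datasets disagree on the sign of the trend. That is one reason fixed thresholds such as 0.5 and 1.0 need explicit justification.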
Most of the manuscript discusses an analysis based on meteorological reanalyses, with the ML and CTM work in a supporting role as challenger methods to the MLR analysis.
In Section 2, the methods used are described. In the regression-based statistical analysis, the authors first use a time-series filter to retrieve trends in ozone and other fields, and then an MLR-based model to derive the drivers of these trends. I was not able to find further details of the method used, as it is described in a separate publication that is incorrectly referenced.
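As an illustration of the kind of detail that would help here, the sketch below separates a series into slow and fast components with an iterated centered moving average. This is an assumption for illustration only: the review does not identify the manuscript's actual filter, and the names `moving_average` and `iterated_filter`, the window, the iteration count, and the data are all hypothetical.

```python
def moving_average(series, window):
    """Centered moving average; the window shrinks at the edges of the series."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out

def iterated_filter(series, window, iterations):
    """Apply the moving average repeatedly to isolate the slowly varying component."""
    smoothed = list(series)
    for _ in range(iterations):
        smoothed = moving_average(smoothed, window)
    return smoothed

# Hypothetical series: a linear trend plus alternating short-term noise
raw = [0.1 * t + (1.0 if t % 2 == 0 else -1.0) for t in range(40)]
long_term = iterated_filter(raw, window=5, iterations=3)
short_term = [r - s for r, s in zip(raw, long_term)]
```

The window length and number of iterations determine which timescales end up in the long-term component, so stating them in the main text would let readers judge whether the trends from the different methods are comparable.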
The ML study is perhaps the least well justified: six of the predictors are proxies for time, with a further six (pressure, temperature, wind speed, RH and PBLH) being deemed sufficient to capture the meteorological drivers of ozone. I have reservations about this approach because the RF model is trained on MDA8 O3 concentrations. Are the authors satisfied that this model is sufficiently accurate to be used for attribution of drivers and to yield confident results? If so, what is the justification? What is the basis for using 50% of the variance explained as a threshold for inclusion? I'd like to see more here, particularly the basis for excluding, e.g., trends in emissions or atmospheric composition, which may be drivers. It would seem much more appropriate if they had used RF to predict the recovered LT O3 trend, with the meteorological data as predictors for the trend. L167 specifies how MDA8 was calculated, but much more detail is needed on how the trends were computed.
The use of GEOS-Chem is interesting, the experiment is well conceived, and the model is well validated in the supporting information. No information is given on how the trend data were extracted from the GC experiments, and this should be included in the main MS. The MERRA2 reanalysis was used to drive the CTM. Given the scope of the MS, why just one reanalysis? There seems to be an opportunity here to examine how the uncertainty in the GC trend depends on the meteorological product, and it is certainly necessary to discuss how the lack of independence of the GC and MLR(MERRA2) results affects the analysis in this paper.
Section 3.1 details the results, and leaves some questions unanswered. Please include a discussion of what the analysis says about which are the main drivers, etc. At present, this discussion is more of a comparison with other findings. In fact, the authors note that most of the outcomes are already published elsewhere (L214-L223), which reinforces the need for novel analysis in this section. I believe the MS would be improved by reporting drivers of the trends, particularly as Section 3.2 lumps all these drivers together as the meteorological impact on the MDA8 O3 trends. Maybe a figure showing the contribution of each driver would be useful here.
Section 3.2 addresses the consistency of the MLR results across different reanalyses. I don't understand why the uncertainty in the derived trends is not included here. Could it not be calculated? I suggest it is included, not least to allow a visual assessment of the consistency or difference between calculated trends and to support the CV analysis. If it can be calculated, please add it as an error bar to the figure.
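One simple way to obtain such an error bar is sketched below: an ordinary least-squares slope with its standard error. This is a minimal illustration under stated assumptions (independent, roughly Gaussian residuals and evenly spaced time steps), not the manuscript's method. The function name `linear_trend_with_se` and the data are hypothetical.

```python
import math

def linear_trend_with_se(values):
    """Ordinary least-squares slope per time step and its standard error."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three points")
    times = range(n)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    # Residual sum of squares around the fitted line
    rss = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, values))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se

# Hypothetical annual mean anomalies (ppb)
annual = [0.0, 1.2, 1.9, 3.1, 3.8, 5.2]
slope, se = linear_trend_with_se(annual)
# Approximate 95% half-width, assuming independent Gaussian residuals
error_bar = 1.96 * se
```

For short records a Student's t quantile would be more appropriate than 1.96, and autocorrelated residuals would widen the interval further. Both caveats are ignored in this sketch.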
Section 3.3 confronts the MLR method with its challengers. Here the MS intercompares the metrics and notes the differences across various domains. This provides a brief description of the uncertainty (i.e. spread) of results but stops short of providing a good assessment of the importance of individual drivers or making broad recommendations as to which analysis is the most robust, reliable or useful. The analysis of the FNL results is interesting. My main concern here is with the ML/RF approach: it may be undermined by the relatively low skill of the resulting model, which raises first-principles questions as to the robustness of these results. Does a statistical model of relatively low skill permit us to say much about the drivers?
Overall, the MS has a number of positive qualities: the use of multi-dataset and multi-method approaches is welcome. The MS shows that the analysis is quite robust for some regions and some seasons, and so has some policy relevance.
The MS would be much improved if the analysis were extended to identify the drivers of what the authors call uncertainty, i.e. intermodel spread. At present, the MS doesn't give enough information on how the ML and GC data were used to compute trends, or on whether that computation was as statistically advanced as for the reanalysis data, separating processes at different timescales. In short, it is unclear whether the comparison is between similar quantities.
It would be interesting to discuss the limitations of working with reanalysis datasets, and indeed the relative strengths and weaknesses of ML and GC data in deriving trends for comparison with observations. The ML and MLR analyses would be stronger if additional chemical, meteorological and climate variables (e.g. solar radiation, soil moisture, vegetation cover, or climate indices like ENSO) were included to capture a fuller picture of ozone drivers, and if their role in driving uncertainty were quantified. Similarly, clustering techniques would be valuable to augment the region-based approach and would provide a better understanding of the similarity between stations.
To enhance the impact of the study, in broad terms, I'd suggest that the authors provide more detailed justifications for their methods, expand the analysis to include additional variables and uncertainties, and focus on identifying the main drivers of ozone trends. Addressing these points would increase the value of the study for researchers and policymakers working to mitigate ozone pollution under changing meteorological conditions.
Finally, regarding data availability, the data do not conform to Copernicus policy, which states that "access to data is by depositing them (as well as related metadata) in FAIR-aligned reliable public data repositories, assigning digital object identifiers, and properly citing data sets as individual contributions." This needs to be addressed by archiving the entire O3 dataset in Zenodo or a similar repository and obtaining a DOI.
Minor comments
L31 rapid not repaid
L266 'uncertainties caused by multi-model' is not clear. How are they caused? What is 'multi-model' in this context?
L296 Interesting, but please add reasons why PBLH in FNL introduces these issues.
L300 should read 'for the whole of China'