Ground-level gaseous pollutants across China: daily seamless mapping and long-term spatiotemporal variations
- 1Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD, USA
- 2Department of Chemical and Biochemical Engineering, Iowa Technology Institute, Center for Global and Regional Environmental Research, University of Iowa, USA
- 3STI, Universities Space Research Association (USRA), Huntsville, AL, USA
- 4NASA Marshall Space Flight Center, Huntsville, AL, USA
Abstract. Gaseous pollutants at the ground level seriously threaten urban air quality and public health. There are few estimates of gaseous pollutants in China that are spatially and temporally resolved and continuous over long periods. This study takes advantage of big data and artificial intelligence technologies to generate seamless daily maps of three major pollutant gases, i.e., NO2, SO2, and CO, across China from 2013 to 2020 at a uniform spatial resolution of 10 km. Cross-validation indicated high data quality on a daily basis for NO2, SO2, and CO, with mean out-of-bag coefficients of determination (root-mean-square errors) of 0.84 (7.99 μg/m3), 0.84 (10.7 μg/m3), and 0.80 (0.29 mg/m3), respectively. The three pollutants declined significantly during the COVID-19 lockdown and recovered afterwards, in association with changes in anthropogenic emissions in eastern China; surface CO recovered faster than SO2 and NO2. All gaseous pollutants decreased significantly by 0.23, 2.01, and 49 μg/m3 per year (p < 0.001) across China during 2013–2020, especially in three urban agglomerations. The declining rates were larger during 2013–2017 but slowed down in recent years. Both the areas and occurrence probabilities of days exceeding air quality standards also gradually shrank over time, especially for SO2 and CO, for which such days almost disappeared during 2018–2020, suggesting significant improvements in air quality in China. This reconstructed dataset of surface gaseous pollutants, i.e., ChinaHighNO2, ChinaHighSO2, and ChinaHighCO, will benefit future (especially short-term) air pollution and environmental health-related studies.
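The two validation metrics quoted in the abstract, the coefficient of determination and the root-mean-square error, can be illustrated with a minimal sketch. It assumes the common definition R² = 1 − SS_res/SS_tot (the paper may instead use the squared correlation coefficient), and the function name `r2_and_rmse` and the sample concentrations are hypothetical, not taken from the study.

```python
import math


def r2_and_rmse(observed, predicted):
    """Return (R^2, RMSE) for paired observed and predicted values.

    Assumption: R^2 is computed as 1 - SS_res / SS_tot.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_obs = sum(observed) / n
    # Residual sum of squares: squared prediction errors
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observations are constant; R^2 is undefined")
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse


# Hypothetical daily NO2 values in ug/m3 (illustration only)
obs = [20.0, 35.0, 50.0, 65.0]
pred = [22.0, 33.0, 52.0, 63.0]
r2, rmse = r2_and_rmse(obs, pred)
print(f"R2 = {r2:.3f}, RMSE = {rmse:.2f} ug/m3")
```

Note that RMSE carries the units of the pollutant, so values for CO (mg/m3) are not directly comparable with those for NO2 and SO2 (μg/m3).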
-
Notice on discussion status
The requested preprint has a corresponding peer-reviewed final revised paper. You are encouraged to refer to the final revised version.
Journal article(s) based on this preprint
Jing Wei et al.
Interactive discussion
Status: closed
-
RC1: 'Comment on acp-2022-627', Anonymous Referee #1, 26 Sep 2022
Wei et al. estimated long-term, daily, seamless concentrations of different ground-level gaseous pollutants with high accuracy using machine learning and big data by combining monitors, satellites, and models. The public datasets are important for studying air quality in China and have also been widely adopted in public health-related studies. The study is well organized and the results are well presented. However, the manuscript still suffers from some flaws. I recommend the manuscript for publication after the following comments are well addressed.
Major comments:
- The authors have constructed many air quality datasets (e.g., PM2.5, PM10) across China. Please introduce the novelty of this study compared with previous studies. I think it is essential to add this content to the introduction.
- The authors should discuss the limitations of this paper and prospects for future work in the conclusion. The development of a high-resolution dataset might not be the final aim.
Specific comments:
Lines 41-43: Please spell out these abbreviations, e.g., NOx, VOCs, etc. Also, please double-check and correct such issues throughout the paper.
Lines 48 and 54: Should be MEE and WHO.
Lines 64-69: The authors are advised to highlight the main purpose and provide more description of the main work here to enrich the Introduction.
Lines 83-88: This long sentence should be split.
Line 97: 0.1° × 0.1°?
Figures 2 and 3: Please clarify which cross-validated method was used.
Section 3.2.3: Besides annual variations, it would also be interesting to see how the three gaseous pollutants changed in different seasons on both the national and regional scales during the study period.
Lines 286 and 294: References are needed to support the evidence here.
Figures 9 and 10: Since the air quality guidelines (AQG) were updated in 2021, it is suggested to show the spatial distributions and variations of the percentage of polluted days exceeding both the WHO-recommended long-term and short-term AQG levels and interim targets.
-
AC1: 'Reply on RC1', Jing Wei, 09 Dec 2022
The comment was uploaded in the form of a supplement: https://acp.copernicus.org/preprints/acp-2022-627/acp-2022-627-AC1-supplement.pdf
-
RC2: 'Comment on acp-2022-627', Anonymous Referee #2, 04 Oct 2022
The manuscript by Wei and colleagues titled “Ground-level gaseous pollutants across China: daily seamless mapping and long-term spatiotemporal variations” professes to generate seamless daily maps of three major pollutant gases, NO2, SO2, and CO, across China from 2013 to 2020 at a uniform spatial resolution of 10 km. While the topic is overall still quite interesting for the global air quality community, the manuscript has a number of serious scientific flaws which unfortunately led me to the recommendation of rejection. These issues are explained below, but are also clearly noted and commented upon in the annotated text that is included with this review.
The main premise of the generation of the daily maps of gaseous concentration is that the authors used artificial intelligence technologies and big data to produce these maps. The model used is not at all adequately described: it is simply named, Space-Time Extra-Tree (STET), and a reference to a previous work that produced O3 maps is given. This is not at all sufficient for the reader of this work to assess the model, its strengths, or its limitations, nor to assess whether a model that functioned well for one gas would work for another gas. Section 2.2 is extremely poor in reproducible content in that respect. The input parameters used in the model are not at all adequately described: in section 2.1.2 a long list of satellite, reanalysis, and model datasets are more or less simply named, without the most pertinent details of provenance, usability, references, validation and quality assurance being provided. Exactly how these input parameters were used in the STET model is not explained at all. Furthermore, these datasets have obvious important differences, e.g., between the OMI/GOME2 VCDs and the CAMS reanalysis VCDs, yet there is no discussion of how these were merged into a usable dataset. The meteorological ERA5 data are on a 3h level; how these were turned into daily means, and what it actually means that they were, is also not discussed. The main input parameters, both for the training of the model and the verification of the model, i.e., the ground-based measurements, are not at all adequately described. In section 2.1.1 it is not at all clear what these “reference-grade ground-based monitoring” stations are, how they were chosen, if and how the data pass QA/QC protocols, what the reference state is, how these stations were split for the verification of the STET and the training of the STET, how the gaps in the datasets were dealt with, how the hourly observations were turned into daily, etc.
The results are not sufficient to support the interpretations and conclusions. The section starts, not with the expected maps of the input parameters, maps of the output parameters and maps of the ground-based stations, but with model performance scatter plots for which it is not at all explained what is being compared to what. Absolute levels are also provided for biases, which have no meaning whatsoever if the actual levels of these gases around China are not provided to begin with. A section is also provided, 3.3, where this dataset is being compared, basically via Table 4, to numerous other related works. How the comparisons were made is unclear, how the statistics shown in the table were created is unclear, how such different datasets were homogenized before comparison is unclear, and the final statement that our gaseous pollutant datasets are superior to those from the studies is not at all shown in this work. It is impossible to assess the interpretations and conclusions stated by the authors based on the information provided in the results section.
Another premise that the authors mention numerous times, in the title even, is that the new dataset is long-term and that it will benefit future (especially short-term) air pollution and environmental health-related studies. They provide a section, 3.4, where they enumerate successful applications; however, it is unclear if these studies used their previous work on O3, or other similar works. The benefits of this work should be clearly stated, to support this work, and not generalities.
In conclusion, while it is possible that this work has potential for air quality-related studies, in the current manuscript the description of experiments and calculations is not sufficiently complete and precise to allow their reproduction by fellow scientists and provide traceability of results. I recommend that the authors take the opportunity of this review to reconsider their strategy for their future publications.
-
AC2: 'Reply on RC2', Jing Wei, 09 Dec 2022
The comment was uploaded in the form of a supplement: https://acp.copernicus.org/preprints/acp-2022-627/acp-2022-627-AC2-supplement.pdf
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
405 | 152 | 15 | 572 | 4 | 8 |