Estimating surface sulfur dioxide concentrations from satellite data over eastern China: Using chemical transport models vs. machine learning

Watson, Zachary; Li, Can; Liu, Fei; Freeman, Sean W.; Zhang, Huanxin; Wang, Jun; Lee, Shan-Hu

doi:https://doi.org/10.5194/acp-25-13527-2025

Articles | Volume 25, issue 20

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

Research article | 23 Oct 2025

Download

Final revised paper (published on 23 Oct 2025)
Supplement to the final revised paper
Preprint (discussion started on 25 Apr 2025)
Supplement to the preprint

Interactive discussion

Status: closed

RC1: 'Comment on egusphere-2025-1735', Anonymous Referee #1, 16 May 2025

This study examines the ability of two methods to translate satellite column measurements of SO2 into surface concentrations. The authors focus their study over eastern China, where there are substantial point sources of SO2. They compare and contrast the abilities of these two methods – one involving the GEOS-Chem model and the other a machine learning model – to reproduce in situ surface measurements of SO2 across their study region. They find that the machine learning model is generally better at reproducing the observed spatial and temporal variation, although the GEOS-Chem model approach also performed well. They highlight that the GEOS-Chem model approach is typically better in regions where in situ data are absent.
The following comments are intended to improve the value of the study to a wider readership.
The authors mention methodological uncertainties throughout the paper but this reviewer didn’t see any summation of these uncertainties reported alongside the surface SO2 concentration estimates. This kind of information would help any potential user to assess the usefulness of the reported data. This reviewer recommends that the authors explicitly state the origin and magnitude of each source of uncertainty. For example, using one year of model output to interpret multiple years of satellite data introduces an uncertainty that the authors have reported.
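One common way to report the kind of summary requested above is to list each uncertainty source with its magnitude and combine them in quadrature (root-sum-square). The sketch below is illustrative only: the source names and percentages are hypothetical placeholders, not values from the paper, and the combination assumes the sources are independent.

```python
import math


def combine_in_quadrature(relative_uncertainties):
    """Root-sum-square of relative uncertainties given as fractions.

    Valid only if the individual sources are independent of each other.
    """
    return math.sqrt(sum(u ** 2 for u in relative_uncertainties.values()))


# Hypothetical, illustrative magnitudes; NOT taken from the paper.
budget = {
    "satellite_retrieval": 0.30,
    "model_profile_shape": 0.20,
    "single_year_model_output": 0.10,
}

total = combine_in_quadrature(budget)

for source, value in budget.items():
    print(f"{source}: {value:.0%}")
print(f"combined (assuming independence): {total:.0%}")
```

If some sources are correlated, a simple quadrature sum understates or overstates the total, so a table of this kind would also need to say which terms are treated as independent.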
Line 127: increasing the time steps by 50%. Please assure this reader that this adjustment does not violate the Courant-Friedrichs-Lewy (CFL) condition.
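For readers unfamiliar with the CFL condition, a rough check is to compute the Courant number C = |u| Δt / Δx before and after the time-step change. The sketch below assumes a simple explicit one-dimensional advection scheme with a stability limit of C ≤ 1; the wind speed, grid spacing, and baseline time step are hypothetical and are not the model settings used in the paper, whose transport scheme may have a different stability limit.

```python
def courant_number(wind_speed_m_s, time_step_s, grid_spacing_m):
    """Courant number C = |u| * dt / dx for 1-D advection."""
    return abs(wind_speed_m_s) * time_step_s / grid_spacing_m


# Hypothetical values for illustration only.
wind_speed = 50.0        # m/s, a strong upper-level wind
grid_spacing = 25_000.0  # m, assumed horizontal grid spacing
dt_default = 300.0       # s, assumed baseline transport time step
dt_increased = 1.5 * dt_default  # the 50% increase under discussion

c_default = courant_number(wind_speed, dt_default, grid_spacing)
c_increased = courant_number(wind_speed, dt_increased, grid_spacing)

for label, c in (("default", c_default), ("increased", c_increased)):
    status = "satisfied" if c <= 1.0 else "violated"
    print(f"{label} time step: C = {c:.2f} (C <= 1 limit {status})")
```

Because C scales linearly with the time step, a 50% longer step raises the Courant number by the same factor, so the margin available at the fastest winds and smallest grid cells determines whether the change is safe.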
In the description of the GEOS-Chem technique, this reviewer was curious about SO2 retrievals with little or no sensitivity to the surface, perhaps due to elevated aerosols over industrialised regions. In those cases, perhaps the retrievals are removed from further analysis but the authors might also be misallocating an SO2 column with elevated values in the free troposphere to changes in the surface. This might help to explain the results shown in Figure 3. This reviewer is also wondering whether it might also explain why boundary layer height is the single most important predictor for the machine learning model (Figure 5). At least some discussion is needed about this point.
Line 140: Please explain to the reader why 40 km was chosen. Do alternative values significantly change the results?
Line 174: Is it normal practice to use so much data for training? Later in this paragraph the authors mention the resulting machine learning model overfitting the data. Have the authors considered using fewer data to train and more data to test the resulting model?
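The trade-off raised here can be explored by repeating the training with different hold-out fractions. The following is a minimal sketch of a random split using only the standard library; the function name `split_indices`, the sample count, and the fractions are hypothetical and do not reflect the authors' actual procedure.

```python
import random


def split_indices(n_samples, test_fraction, seed=0):
    """Shuffle sample indices and split them into (train, test) lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# Hypothetical sample count and hold-out fractions.
for test_fraction in (0.1, 0.3, 0.5):
    train, test = split_indices(1000, test_fraction)
    print(f"test fraction {test_fraction:.0%}: "
          f"{len(train)} training samples, {len(test)} test samples")
```

Comparing test-set skill across such splits gives one indication of whether the model is overfitting when most of the data are used for training.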
Line 420: This reviewer may have missed this point in the manuscript, but I didn’t see any evidence that the GEOS-Chem approach reproduced the temporal distribution observed in the CNEMC in situ data. Figure 7 shows a muted seasonal cycle with a correlation coefficient of typically less than 0.4, so the model only captures at most 20% of the observed variation.
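The link between the correlation coefficient and the share of variation captured follows from the coefficient of determination: for a simple linear relationship, the fraction of variance explained is r². A short sketch, assuming Pearson correlation and using illustrative r values:

```python
def explained_variance_fraction(r):
    """Fraction of variance explained by a linear fit with Pearson r."""
    return r ** 2


# Illustrative correlation values up to the 0.4 quoted in the comment.
for r in (0.2, 0.3, 0.4):
    print(f"r = {r:.1f} -> r^2 = {explained_variance_fraction(r):.2f}")
```

With r below 0.4, r² stays below 0.16, which is consistent with the "at most 20%" bound stated in the comment.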
Minor points
Line 108: regridding does not result in good data quality.
Line 120: The current version of GEOS-Chem bears little resemblance to the model described by Bey et al., 2001. I strongly suggest citing a more recent reference.
Line 125: When stating horizontal resolution, this reviewer suggests you label which of the values represents latitude and longitude.
Line 136: difference in (horizontal) resolution…

Citation: https://doi.org/10.5194/egusphere-2025-1735-RC1
- AC1: 'Authors' Reply to RC1', Zachary Watson, 15 Aug 2025
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-1735/egusphere-2025-1735-AC1-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2025-1735-AC1
RC2: 'Comment on egusphere-2025-1735', Anonymous Referee #2, 26 May 2025

The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-1735/egusphere-2025-1735-RC2-supplement.pdf

Citation: https://doi.org/10.5194/egusphere-2025-1735-RC2
- AC2: 'Authors' Reply to RC2', Zachary Watson, 15 Aug 2025
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-1735/egusphere-2025-1735-AC2-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2025-1735-AC2
- AC3: 'Authors' Reply to RC2', Zachary Watson, 15 Aug 2025
  
  The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2025/egusphere-2025-1735/egusphere-2025-1735-AC3-supplement.pdf
  
  Citation: https://doi.org/10.5194/egusphere-2025-1735-AC3

Peer review completion

AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload

AR by Zachary Watson on behalf of the Authors (15 Aug 2025): Author's response | Author's tracked changes | Manuscript

ED: Referee Nomination & Report Request started (03 Sep 2025) by Farahnaz Khosrawi

RR by Anonymous Referee #2 (15 Sep 2025)

ED: Publish subject to technical corrections (16 Sep 2025) by Farahnaz Khosrawi

AR by Zachary Watson on behalf of the Authors (16 Sep 2025): Author's response | Manuscript

Short summary

Air pollutants like sulfur dioxide impact human health and the environment. Our work estimated surface sulfur dioxide concentrations from satellite data over eastern China. One method used atmospheric models, and another method used machine learning. We found that compared to measurements from an air quality monitoring network, both methods accurately captured the locations of sulfur dioxide, but the machine learning method was generally much more accurate in the estimated concentrations.
