01 Apr 2022
Status: this preprint is currently under review for the journal ACP.

Statistical and machine learning methods for evaluating trends in air quality under changing meteorological conditions

Minghao Qiu1,a, Corwin M. Zigler2, and Noelle E. Selin1,3
  • 1Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, USA
  • 2Departments of Statistics and Data Science and Women’s Health, University of Texas, Austin, USA
  • 3Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, USA
  • a current address: Department of Earth System Science, Stanford University, USA

Abstract. Evaluating the influence of anthropogenic emissions changes on air quality requires accounting for the influence of meteorological variability. Statistical methods such as multiple linear regression (MLR) models with basic meteorological variables are often used to remove meteorological variability and estimate trends in measured pollutant concentrations attributable to emissions changes. However, the ability of these widely used statistical approaches to correct for meteorological variability remains unknown, limiting their usefulness in real-world policy evaluations. Here, we quantify the performance of MLR and other quantitative methods using two scenarios simulated by a chemical transport model, GEOS-Chem, as a synthetic dataset. Focusing on the impacts of anthropogenic emissions changes in the US (2011 to 2017) and China (2013 to 2017) on PM2.5 and O3, we show that widely used regression methods do not perform well in correcting for meteorological variability and identifying long-term trends in ambient pollution related to changes in emissions. The estimation errors, characterized as the differences between meteorology-corrected trends and emission-driven trends under constant meteorology scenarios, can be reduced by 30 %–42 % using a random forest model that incorporates both local and regional scale meteorological features. We further design a correction method based on GEOS-Chem simulations with constant emission input and quantify the degree to which emissions and meteorological influences are inseparable due to their process-based interactions. We conclude by providing recommendations for evaluating the effectiveness of emissions reduction policies using statistical approaches.
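To make the idea of an MLR meteorological correction and its estimation error concrete, the following is a minimal sketch on a toy dataset. It is not the authors' code or data: the monthly series, the single temperature covariate, the coefficient values, and the helper `ols` are all hypothetical choices made for illustration. Because the toy data are exactly linear, the MLR correction works here by construction, whereas the abstract reports that such methods perform poorly on GEOS-Chem simulated concentrations.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

# Hypothetical "truth" for the toy data (not values from the paper)
TRUE_TREND = -1.0   # emission-driven trend, concentration units per year
MET_EFFECT = 0.8    # concentration change per degree of temperature

rng = random.Random(0)
t = [m / 12 for m in range(84)]  # 7 years of monthly time steps, in years
# Temperature drifts upward over time, so it confounds the time trend
temp = [15 + 0.5 * ti + rng.gauss(0, 1.0) for ti in t]
conc = [40 + TRUE_TREND * ti + MET_EFFECT * Ti + rng.gauss(0, 0.3)
        for ti, Ti in zip(t, temp)]

# Uncorrected trend: regress concentration on time only
naive_trend = ols([[1.0, ti] for ti in t], conc)[1]
# Meteorology-corrected trend: add temperature as a covariate (MLR)
mlr_trend = ols([[1.0, ti, Ti] for ti, Ti in zip(t, temp)], conc)[1]

# Estimation error: estimated trend minus the emission-driven trend
naive_error = naive_trend - TRUE_TREND
mlr_error = mlr_trend - TRUE_TREND
print(f"naive error: {naive_error:+.3f}, MLR error: {mlr_error:+.3f}")
```

In this sketch the uncorrected trend absorbs the temperature drift, while the MLR coefficient on time stays close to the assumed emission-driven trend. Real concentrations respond nonlinearly to many local and regional meteorological features, which is the setting where the abstract finds a random forest model reduces the estimation error.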


Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on acp-2022-232', Anonymous Referee #1, 26 Apr 2022
  • RC2: 'Comment on acp-2022-232', Benjamin Wells, 11 May 2022
    • AC1: 'Reply on RC2', Minghao Qiu, 16 May 2022
      • EC1: 'Reply on AC1', Anne Perring, 24 May 2022

Total article views: 372 (including HTML, PDF, and XML)
  • HTML: 273
  • PDF: 91
  • XML: 8
  • Total: 372
  • BibTeX: 5
  • EndNote: 3
Views and downloads calculated since 01 Apr 2022.

Viewed (geographical distribution)

Total article views: 615 (including HTML, PDF, and XML), of which 615 have a defined geography and 0 are of unknown origin.
Latest update: 26 May 2022
Short summary
Evaluating the impacts of emission changes on air quality requires accounting for meteorological variability. Many studies use simple regression methods to correct for meteorology, but little is known about how well these methods perform. Using cases in the US and China, we show that widely used regression models do not perform well and can lead to biased estimates of emission-driven trends. We propose a novel machine learning method with lower bias and provide recommendations to policy makers and researchers.