Evaluating the influence of anthropogenic-emission changes on air quality requires accounting for the influence of meteorological variability. Statistical methods such as multiple linear regression (MLR) models with basic meteorological variables are often used to remove meteorological variability and estimate trends in measured pollutant concentrations attributable to emission changes. However, the ability of these widely used statistical approaches to correct for meteorological variability remains unknown, limiting their usefulness in real-world policy evaluations. Here, we quantify the performance of MLR and other quantitative methods using simulations from a chemical transport model, GEOS-Chem, as a synthetic dataset. Focusing on the impacts of anthropogenic-emission changes in the US (2011 to 2017) and China (2013 to 2017) on PM

Researchers and policymakers have long been interested in understanding the anthropogenic drivers of trends in observed air pollutant concentrations in order to inform air quality policies. Declining trends in pollutant concentrations such as particulate matter with diameters less than 2.5

Measured pollutant concentrations are often used as the primary basis for evaluating air quality actions. For example, in 2013, China's central government established targets that aimed to reduce annual average PM

Many studies use multiple linear regression (MLR) models with basic meteorological variables to correct for meteorological variability in order to estimate the impacts of emission changes on measured air quality
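As a minimal sketch of this kind of MLR correction, the Python example below regresses a synthetic daily concentration series on a linear time term plus standardized meteorological variables and reads the coefficient on time as the meteorology-corrected trend. The function names, the two predictors, and all numbers are hypothetical choices for illustration; the exact regression form used in the studies cited here is not reproduced. Because the synthetic data are generated from a linear relationship, MLR recovers the trend almost exactly in this toy case, which need not hold for real or simulated air quality data.

```python
import numpy as np

def mlr_corrected_trend(conc, met, years):
    """Fit conc ~ intercept + trend * years + met @ beta by ordinary least squares.

    Returns the coefficient on `years`, read here as the meteorology-corrected trend.
    """
    conc = np.asarray(conc, dtype=float)
    met = np.asarray(met, dtype=float)
    years = np.asarray(years, dtype=float)
    # Standardize the meteorological predictors (does not change the trend coefficient).
    met_std = (met - met.mean(axis=0)) / met.std(axis=0)
    design = np.column_stack([np.ones_like(years), years - years.mean(), met_std])
    coef, *_ = np.linalg.lstsq(design, conc, rcond=None)
    return coef[1]

def raw_trend(conc, years):
    """Uncorrected linear trend from a simple fit of concentration on time."""
    return np.polyfit(np.asarray(years, float), np.asarray(conc, float), 1)[0]

# Synthetic daily series (hypothetical numbers, for illustration only).
rng = np.random.default_rng(0)
n_days = 5 * 365
years = np.arange(n_days) / 365.0
temperature = 0.8 * years + rng.normal(size=n_days)  # meteorology that drifts over time
wind = rng.normal(size=n_days)
true_trend = -1.0  # emission-driven trend, concentration units per year
conc = (20.0 + true_trend * years + 1.5 * temperature - 1.0 * wind
        + rng.normal(scale=0.5, size=n_days))

uncorrected = raw_trend(conc, years)
corrected = mlr_corrected_trend(conc, np.column_stack([temperature, wind]), years)
```

Here `uncorrected` mixes the emission-driven trend with the drift in temperature, while `corrected` isolates the coefficient on time.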

Other statistical methods including non-linear regression or machine learning models have also been used to correct for meteorological variability

Despite a large number of papers that apply various meteorology correction methods, very little is known about whether these methods can effectively correct for meteorological variability and thus realistically estimate the counterfactual air quality and reveal the underlying impacts of anthropogenic-emission changes. Most studies cite the prediction performance of their statistical models (such as

Overview of research methodology. Terms and coefficients are linked to the associated terms in Eq. (

Here, we conduct a model experiment to evaluate the performance of widely used statistical models in correcting for meteorological variability and estimating emission-driven trends in air quality (see Fig.

GEOS-Chem is a global three-dimensional chemical transport model driven by assimilated meteorological data from the Goddard Earth Observation System (GEOS-5) of the NASA Global Modeling and Assimilation Office (GMAO) (

We use GEOS-Chem version 12.3.0 with a horizontal resolution of

Overview of GEOS-Chem scenarios and meteorological correction methods. RF: random forest. LASSO: least absolute shrinkage and selection operator.

Table

Observational scenarios simulate PM

Counterfactual scenarios simulate PM

It is important to note that we do not assume our GEOS-Chem simulations perfectly represent the underlying pollutant concentration in the real world (although the model compares relatively well with the observational data). Rather, our main focus is to evaluate how different statistical methods can explain the differences between the observational and counterfactual scenarios. The assumption here is that the differences between observational and counterfactual scenarios are useful approximations of the impacts of meteorological variability on pollutant concentrations. The implications of uncertainty in GEOS-Chem for our results can be found in the “Discussion” section.
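The evaluation logic can be sketched as follows, under stated assumptions: a correction method is applied to the observational series, a linear trend is fitted, and the result is compared with the trend of the counterfactual series. The helper names and the annual values below are hypothetical and are not GEOS-Chem output.

```python
import numpy as np

def linear_trend(series, years):
    """Slope of an ordinary least-squares line through (years, series)."""
    return np.polyfit(np.asarray(years, float), np.asarray(series, float), 1)[0]

def estimation_error(corrected_obs, counterfactual, years):
    """Trend of the meteorology-corrected series minus the counterfactual trend."""
    return linear_trend(corrected_obs, years) - linear_trend(counterfactual, years)

years = np.arange(2011, 2018)
counterfactual = 10.0 - 0.5 * (years - 2011)                  # emission-driven decline only
met_effect = np.array([0.0, 0.6, -0.4, 0.8, 0.2, 1.0, 0.9])   # hypothetical anomalies
observational = counterfactual + met_effect

# No correction at all versus an idealized correction that removes the full effect.
error_uncorrected = estimation_error(observational, counterfactual, years)
error_perfect = estimation_error(observational - met_effect, counterfactual, years)
```

A statistical method is judged by how close its error comes to the idealized case; the uncorrected error shows how much of the apparent trend is meteorological.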

For the US, we use the National Emissions Inventory 2011 (NEI 2011) as a baseline emission inventory and scale the emissions in 2012 to 2017 to match the annual total emissions each year

Natural emissions of multiple chemical species are calculated online (rather than prescribed) in the GEOS-Chem simulations and thus can be influenced by meteorological variability (see

We assess the performance of statistical and machine learning models to correct for the meteorological variability in the observational scenarios. We evaluate these methods with a commonly used framework (e.g., used in

Here,

We use the following 10 variables from MERRA-2 as our selected meteorological features for the statistical analysis: surface temperature, precipitation, humidity, planetary boundary layer height, cloud fraction, surface air pressure, and wind speed (

We also evaluate models that use both local and regional meteorological features. Regional meteorological features are important for explaining variability in local pollutant concentrations due to (1) pollution transport from neighboring locations and (2) influences from meteorological systems at the synoptic scale (i.e., large-scale weather systems that span over 1000 km such as circulation patterns)
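One simple way to build regional features, shown here only as an assumption and not necessarily the construction used in this study, is to average each meteorological field over a window of neighboring grid cells and stack the result alongside the local values. The function names and the window size are hypothetical.

```python
import numpy as np

def regional_mean(field, half_width):
    """Average each grid cell with its neighbors within `half_width` cells.

    `field` has shape (time, lat, lon); cells at the edges use the neighbors that exist.
    """
    field = np.asarray(field, dtype=float)
    _, n_lat, n_lon = field.shape
    out = np.empty_like(field)
    for i in range(n_lat):
        lat_slice = slice(max(i - half_width, 0), min(i + half_width + 1, n_lat))
        for j in range(n_lon):
            lon_slice = slice(max(j - half_width, 0), min(j + half_width + 1, n_lon))
            out[:, i, j] = field[:, lat_slice, lon_slice].mean(axis=(1, 2))
    return out

def local_and_regional_features(fields, half_width=2):
    """Stack local then regional versions of each variable: (time, lat, lon, 2 * n_vars)."""
    local = [np.asarray(f, dtype=float) for f in fields]
    regional = [regional_mean(f, half_width) for f in local]
    return np.stack(local + regional, axis=-1)

# Synthetic fields on a small grid (illustration only).
rng = np.random.default_rng(1)
temperature = rng.normal(size=(30, 6, 8))
wind_speed = rng.normal(size=(30, 6, 8))
features = local_and_regional_features([temperature, wind_speed], half_width=1)
```

The last axis of `features` holds local temperature, local wind speed, regional temperature, and regional wind speed, in that order, so a regression or random forest can be given both scales for every grid cell.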

We further design and evaluate an approach to correct for meteorological variability with GEOS-Chem simulations (referred to as the “constant-emis” approach). The constant-emis approach uses GEOS-Chem simulations with constant anthropogenic emissions and changing meteorological fields (“constant-emission scenarios” in Table

Compared to previous statistical and machine learning approaches, the constant-emis approach better captures the meteorological variability as simulated in GEOS-Chem (as

We use the surface air quality measurements from the Air Quality System administered by the US EPA

The surface air quality measurements in China are derived from the monitoring network administered by

We use the meteorological variables from MERRA-2 when performing meteorology corrections at these monitoring stations because station-level meteorological measurements are not available for all of these variables. This is consistent with previous analyses estimating the meteorology-corrected trends using observational air quality data

Figure 2a and c show the trends in PM

Trend estimates of daily annual PM

Figure 2b and d show the degree to which different meteorological correction methods can recover the emission-driven trends in the counterfactual scenarios. When no correction for meteorology is performed (“uncorrected” in Fig. 2b), we observe large estimation errors in trend estimates over the northeastern and southern US of up to 0.25

Trend estimates of daily annual PM

Meteorological variability also has a substantial influence on the summertime

Figure 3a and c show the trends in PM

Figure 3b shows the magnitude of estimation errors in the trend estimates of annual PM

Figure 3d shows the magnitude of errors in the trend estimates for summer

In our model experiments in both the US and China, we find that large differences remain between the trends estimated with statistical models (even the best-performing RF-regional model) and the counterfactual trends. The remaining differences could result from two factors: (1) the statistical model cannot capture the complex relationship between meteorology and pollutant concentrations, and/or (2) the differences between the observational and counterfactual scenarios depend not only on meteorological variability but also on how anthropogenic emissions interact with meteorology (i.e., the impacts of meteorology on air quality also depend on the level of emissions).

We quantify the potential magnitude of this second factor using our constant-emis approach. As the constant-emis approach captures the exact relationship between meteorology and pollutant concentrations in GEOS-Chem, the error in the constant-emis approach is only associated with the second factor above and thus provides a conceptual minimum of the estimation errors achievable by any statistical approach. Figure

Panels

However, the estimation errors calculated above are still non-negligible and can be large in certain regions. As shown in Fig. 4b and d, the constant-emis approach generally yields trend estimates biased by 10 % relative to the counterfactual trends, but the errors can be up to 40 % in certain areas. This error term is the result of ignoring how emissions could potentially influence the impacts of meteorology on the pollutant concentrations – that is, the impacts of the same meteorological variability on concentrations may be different in the start year (with high emissions) compared to the end year (with low emissions).
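A toy calculation illustrates why this error term appears. The sketch below is not GEOS-Chem and makes several assumptions: a hypothetical `simulate` function stands in for the chemical transport model, the counterfactual holds meteorology at its start-year value, and the constant-emis correction subtracts the meteorological effect inferred from a run with start-year emissions. When meteorology simply adds to concentrations, the correction is exact; when meteorology scales the emission-driven concentration, a residual error remains.

```python
import numpy as np

def simulate(emissions, met, interaction):
    """Toy stand-in for a chemical transport model (hypothetical, not GEOS-Chem).

    interaction=True: meteorology scales the emission-driven concentration.
    interaction=False: meteorology adds an offset that is independent of emissions.
    """
    emissions = np.asarray(emissions, dtype=float)
    met = np.asarray(met, dtype=float)
    if interaction:
        return emissions * (1.0 + 0.3 * met)
    return emissions + 3.0 * met

def trend(series, years):
    """Slope of an ordinary least-squares line through (years, series)."""
    return np.polyfit(years, series, 1)[0]

years = np.arange(5, dtype=float)
emissions = 10.0 - 1.0 * years               # declining anthropogenic emissions
met = np.array([0.0, 0.2, 0.5, 0.7, 1.0])    # meteorology drifting away from the start year

def constant_emis_error(interaction):
    observational = simulate(emissions, met, interaction)
    counterfactual = simulate(emissions, np.full_like(met, met[0]), interaction)
    constant_emission = simulate(np.full_like(emissions, emissions[0]), met, interaction)
    # Meteorological effect inferred from the constant-emission run (assumed form).
    met_effect = constant_emission - constant_emission[0]
    corrected = observational - met_effect
    return trend(corrected, years) - trend(counterfactual, years)

error_additive = constant_emis_error(interaction=False)
error_interacting = constant_emis_error(interaction=True)
```

In the interacting case the residual equals the trend of 0.3 × (E(t) − E(0)) × (m(t) − m(0)), which is about 30 % of the counterfactual trend for these made-up numbers; in the additive case it vanishes.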

Trends in

Figure

We find similar consistency in method performance between observational data and GEOS-Chem simulations in China (Fig. 5b). When the methods are applied to the observational data from the surface monitoring network, a much smaller reduction in PM

We designed a model experiment that enables us to directly quantify the performance of different statistical models to evaluate the trends in pollutant concentrations driven by anthropogenic-emission changes. Based on our evaluations of either PM

With our model experiments, we also quantify the estimation errors that arise from assuming emission impacts can be perfectly separated from meteorological variability. These errors likely bound the estimation errors that can be achieved by any statistical method under this assumption. In the future, more complex statistical and machine learning methods could be applied to distinguish emission-driven and meteorologically driven changes, but attribution solely based on observed concentrations and meteorology will be limited by physical interactions between emissions and meteorology. We find that the estimation errors resulting from these interactions are overall much smaller than the estimation errors of the existing statistical methods but can still be important for certain regions at certain times. However, the intertwined relationships between anthropogenic emissions and meteorology are often much more complex in reality than in our model experiments. For example, meteorology can also directly influence anthropogenic emissions (e.g., increased electricity consumption during extreme weather conditions,

While the GEOS-Chem model provides us with a framework to test statistical methods, its use in our model experiments introduces some uncertainty and limitations. Specifically, our experiments assess the performance of statistical methods in correcting for the meteorology–pollution relationships encoded in GEOS-Chem, which may differ from the complex relationships in the observational data. Several studies have shown that GEOS-Chem and similar models do not capture certain meteorology–pollution relationships in the observational data (e.g., temperature–

Changes in natural emissions due to meteorological variability play an important role in the air quality–meteorology relationship. Our model experiment considers natural emission changes that can be simulated online with assimilated meteorological fields in GEOS-Chem, including soil

Our research reveals multiple directions for future research to enhance our understanding of the usage of statistical models to evaluate trends in pollutant concentrations under changing meteorological conditions. One key but challenging question is to better understand the estimation errors in these existing approaches; e.g., why the MLR model is able to correct for the meteorological variability in some locations but not others. In this paper, we only test a selection of methods based on their popularity in the existing literature and propose a simple-to-use model (RF-regional). More complex models (such as convolutional neural networks) may offer better performance, but the estimation error will likely be bounded by the errors in the constant-emis approach. Our work only evaluates the statistical and machine learning models in Eqs. (

Using statistical methods to causally infer relationships between simulated air pollutant concentrations and anthropogenic emissions is challenging, and doing so in the context of observational data is even more challenging. Understanding the uncertainty in statistical models in characterizing the meteorology–pollution relationship is essential to evaluating the effectiveness of policy interventions with observational data. Here, we make several recommendations to researchers and policymakers based on our analysis.

For those who aim to infer causal effects of emission changes on air quality based on observational data on concentrations and meteorology, we recommend using multiple statistical methods to correct for meteorological variability when evaluating the impacts of policies or interventions on air quality. From our two case studies, we find a relatively large variation between the trend parameters estimated by different statistical methods (especially at the grid cell or monitor level). Some methods perform better in certain locations but not in others (though RF-regional is the best-performing method overall). Using multiple approaches (linear/non-linear and at the local/regional scale) may help to quantify uncertainty related to meteorological corrections. These findings also suggest that empirical analyses may benefit from considering the impacts of meteorological variability on air quality separately for each region or even for each monitor location (if data permit), instead of attempting to determine a general relationship between meteorological variability and air pollution over a large spatial domain. Finally, analysts should be particularly cautious when using statistical methods to estimate impacts of anthropogenic emissions on air quality in regions where pollution variability is dominated by meteorologically influenced environmental processes such as dust emissions, as we consistently show that typical statistical methods (in combination with the standard set of meteorological variables) do not work well in those regions.

Due to the non-negligible estimation errors in recovering the counterfactual trends even with the best-performing statistical approach we test, we believe these statistical analyses are most useful for understanding the impacts of anthropogenic emissions on air quality when aggregated across larger spatial areas, rather than for providing specific trends at individual monitor locations. There is a higher degree of consistency among the trend estimates across different methods when aggregated at the regional level, whereas assessment at the local level is more sensitive to method choice. The absolute magnitude of monitor-level trends needs to be interpreted with caution, considering both the uncertainty from the statistical methods and the limit of meteorological correction due to ignoring the interactions between meteorology and emissions.
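The effect of aggregation on the spread across methods can be sketched with synthetic numbers. The example assumes, purely for illustration, that each method's monitor-level error is independent across monitors; in practice errors are likely spatially correlated, so the real benefit of aggregation would be smaller. The function name and all values are hypothetical.

```python
import numpy as np

def method_spread(trends_by_method):
    """Range (max minus min) across methods; the first axis indexes methods."""
    trends = np.asarray(trends_by_method, dtype=float)
    return trends.max(axis=0) - trends.min(axis=0)

rng = np.random.default_rng(2)
n_methods, n_monitors = 4, 50
true_trend = -0.5
# Hypothetical monitor-level estimates: each method adds its own independent error.
estimates = true_trend + rng.normal(scale=0.2, size=(n_methods, n_monitors))

monitor_spread = method_spread(estimates)                # one value per monitor
regional_spread = method_spread(estimates.mean(axis=1))  # after averaging over monitors
```

Comparing `regional_spread` with the typical value of `monitor_spread` gives a quick diagnostic of how sensitive a reported trend is to method choice at each scale.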

Because measured pollutant concentrations are subject to the influence of underlying meteorological variability, many efforts have attempted to correct for the impacts of meteorological variability and use “meteorology-corrected” concentrations and trends to assist in evaluating the effectiveness of air quality policies. Our study evaluates existing methods that aim to correct for the meteorological variability and finds many of these methods do not perform well. This raises potential concerns about the use of meteorology-corrected concentrations as targets for policy evaluation. Meteorology-corrected concentrations and trends remain useful metrics to quantify the influence of emissions. However, a more comprehensive evaluation of the effectiveness of policy requires interpreting measurements with all available tools, ideally including both statistical analyses and physical models.

The GEOS-Chem simulation of different scenarios and the R scripts to implement the statistical methods to correct for meteorological variability are available at the following repository:

The supplement related to this article is available online at:

MQ and NES designed the research. MQ performed the statistical analysis and GEOS-Chem modeling simulations. All authors interpreted the results and wrote the paper.

The contact author has declared that none of the authors has any competing interests.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We thank Colette Heald and Valerie Karplus for helpful comments and discussions. We thank Yixuan Zheng for assistance with the MEIC emission inventory. We thank Ke Li for sharing code for stepwise MLR analysis. Minghao Qiu gratefully acknowledges the support of the MIT Martin Family Society of Fellows for Sustainability.

This publication was supported by the US EPA (grant no. RD-835872-01). Its contents are solely the responsibility of the grantee and do not necessarily represent the official views of the US EPA. Further, the US EPA does not endorse the purchase of any commercial products or services mentioned in the publication. This work was also supported by the National Institutes of Health (NIH; grant no. NIEHS R01ES026217).

This paper was edited by Anne Perring and reviewed by Benjamin Wells and one anonymous referee.