Global impact of COVID-19 restrictions on the surface concentrations of nitrogen dioxide and ozone

Keller, Christoph A.; Evans, Mathew J.; Knowland, K. Emma; Hasenkopf, Christa A.; Modekurty, Sruti; Lucchesi, Robert A.; Oda, Tomohiro; Franca, Bruno B.; Mandarino, Felipe C.; Díaz Suárez, M. Valeria; Ryan, Robert G.; Fakes, Luke H.; Pawson, Steven

doi:https://doi.org/10.5194/acp-21-3555-2021

Articles | Volume 21, issue 5

Research article

09 Mar 2021

Abstract

Social distancing to combat the COVID-19 pandemic has led to widespread reductions in air pollutant emissions. Quantifying these changes requires a business-as-usual counterfactual that accounts for the synoptic and seasonal variability of air pollutants. We use a machine learning algorithm driven by information from the NASA GEOS-CF model to assess changes in nitrogen dioxide (NO₂) and ozone (O₃) at 5756 observation sites in 46 countries from January through June 2020. Reductions in NO₂ coincide with the timing and intensity of COVID-19 restrictions, ranging from 60 % in severely affected cities (e.g., Wuhan, Milan) to little change (e.g., Rio de Janeiro, Taipei). On average, NO₂ concentrations were 18 (13–23) % lower than business as usual from February 2020 onward. China experienced the earliest and steepest decline, but concentrations since April have mostly recovered and remained within 5 % of the business-as-usual estimate. NO₂ reductions in Europe and the US have been more gradual, with a halting recovery starting in late March. We estimate that the global NO_x (NO + NO₂) emission reduction during the first 6 months of 2020 amounted to 3.1 (2.6–3.6) TgN, equivalent to 5.5 (4.7–6.4) % of the annual anthropogenic total. The response of surface O₃ is complicated by competing influences of nonlinear atmospheric chemistry. While surface O₃ increased by up to 50 % in some locations, we find the overall net impact on daily average O₃ between February and June 2020 to be small. However, our analysis indicates a flattening of the O₃ diurnal cycle with an increase in nighttime ozone due to reduced titration and a decrease in daytime ozone, reflecting a reduction in photochemical production.

The O₃ response is dependent on season, timescale, and environment, with declines in surface O₃ forecasted if NO_x emission reductions continue.

Received: 08 Jul 2020 – Discussion started: 18 Sep 2020 – Revised: 21 Jan 2021 – Accepted: 21 Jan 2021 – Published: 09 Mar 2021

1 Introduction

The stay-at-home orders imposed in many countries during the Northern Hemisphere spring of 2020 to slow the spread of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2, hereafter COVID-19) led to a sharp decline in human activities across the globe (Le Quéré et al., 2020). The associated decrease in industrial production, energy consumption, and transportation resulted in a reduction in the emissions of air pollutants, notably nitrogen oxides (NO_x = NO + NO₂) (Liu et al., 2020a; Dantas et al., 2020; Petetin et al., 2020; Tobias et al., 2020; Le et al., 2020). NO_x has a short atmospheric lifetime and is predominantly emitted during the combustion of fossil fuel for industry, transport, and domestic activities (Streets et al., 2013; Duncan et al., 2016). Atmospheric concentrations of nitrogen dioxide (NO₂) thus readily respond to local changes in NO_x emissions (Lamsal et al., 2011). While these emission reductions may provide both air quality and climate benefits, a quantitative assessment of the magnitude of these impacts is complicated by the natural variability of air pollution due to variations in synoptic conditions (weather), seasonal effects, and long-term emission trends, as well as the nonlinear responses between emissions and concentrations. Thus, simply comparing the concentration of pollutants during the COVID-19 period to those immediately beforehand or to the same period in previous years is not sufficient to indicate causality. An emerging approach to address this problem is to develop machine-learning-based “weather-normalization” algorithms to establish the relationship between local meteorology and air pollutant surface concentrations (Grange et al., 2018; Grange and Carslaw, 2019; Petetin et al., 2020). By removing the meteorological influence, these studies have tried to better quantify emission changes as a result of a perturbation.

Here, we adapt this weather-normalization approach to not only include meteorological information but also compositional information in the form of the concentrations and emissions of chemical constituents. Using a collection of surface observations of NO₂ and ozone (O₃) from across the world from 2018 to July 2020 (Sect. 2.1), we develop a “bias-correction” methodology for the NASA global atmospheric composition model GEOS-CF (Sect. 2.2), which corrects the model output at each observational site based on the observations for 2018 and 2019 (Sect. 2.3). These biases reflect errors in emission estimates, sub-grid-scale local influences (representational error), or meteorology and chemistry. Since the GEOS-CF model makes no adjustments to the anthropogenic emissions in 2020, and no 2020 observations are included in the training of the bias corrector, the bias-corrected model (hereafter BCM) predictions for 2020 represent a business-as-usual scenario at each observation site that can be compared against the actual observations. This allows the impact of COVID-19 containment measures on air quality to be explored, taking into account meteorology and the long-range transport of pollutants. We first apply this to the concentration of NO₂ (Sect. 3.1) and then O₃ (Sect. 3.2) and explore the differences between the counterfactual prediction and the observed concentrations. In Sect. 3.3, we explore how the observed changes in the NO₂ concentrations relate to emission of NO_x, and in Sect. 3.4 we speculate what the COVID-19 restrictions might mean for the second half of 2020.

2 Methods

2.1 Observations

Our analysis builds on the recent development of unprecedented public access to air pollution model output and air quality observations in near-real time. We compile an air quality dataset of hourly surface observations for a total of 5756 sites (4778 for NO₂ and 4463 for O₃) in 46 countries for the time period 1 January 2018 to 1 July 2020, as summarized in Fig. 1 and Table 1. More detailed maps of the spatial distribution of observation sites over China, Europe, and North America are given in Figs. A1–A3. The vast majority of the observations were obtained from the OpenAQ platform and the air quality data portal of the European Environment Agency (EEA). Both platforms provide harmonized air quality observations in near-real time, greatly facilitating the analysis of otherwise disparate data sources. For the EEA observations, we use the validated data (E1a) for the years 2018–2019 and revert to the real-time data (E2a) for 2020. For Japan, we obtained hourly surface observations for a total of 225 sites in Hokkaido, Osaka, and Tokyo from the Atmospheric Environmental Regional Observation System (AEROS) (MOE, 2020). To improve data coverage in under-sampled regions, we also included observations from the cities of Rio de Janeiro (Brazil), Quito (Ecuador), and Melbourne (Australia). All three cities offer continuous, hourly observations of NO₂ and O₃ over the full analysis period, thus offering an excellent snapshot of air quality at these locations. We include all sites with at least 365 d of observations between 1 January 2018 and 31 December 2019 and an overall data coverage of 75 % or more since the first day of availability. Only days with at least 12 h of valid data are included in the analysis. The final NO₂ and O₃ datasets comprise 8.9×10⁷ and 8.2×10⁷ hourly observations, respectively.
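The site-selection thresholds described above can be illustrated with a short sketch. It is not the code used in the study: the function names, the synthetic records, and the interpretation of "data coverage" as the fraction of valid days since the first valid day are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

TRAIN_START, TRAIN_END = date(2018, 1, 1), date(2019, 12, 31)
ANALYSIS_END = date(2020, 7, 1)  # end of the compiled record (exclusive here)

def valid_days(hourly, min_hours=12):
    """Return the dates that have at least `min_hours` valid hourly values."""
    counts = defaultdict(int)
    for timestamp, value in hourly:
        if value is not None:
            counts[timestamp.date()] += 1
    return {day for day, n in counts.items() if n >= min_hours}

def site_qualifies(hourly, min_train_days=365, min_coverage=0.75):
    """Apply the site-selection thresholds to (datetime, value) pairs."""
    days = valid_days(hourly)
    if not days:
        return False
    n_train = sum(TRAIN_START <= day <= TRAIN_END for day in days)
    # Assumption: coverage = valid days / calendar days since the first valid day.
    n_possible = (ANALYSIS_END - min(days)).days
    return n_train >= min_train_days and len(days) / n_possible >= min_coverage

def synthetic_site(start, hours_per_day):
    """Build a hypothetical record with a fixed number of valid hours per day."""
    record, day = [], start
    while day < ANALYSIS_END:
        for hour in range(hours_per_day):
            record.append((datetime(day.year, day.month, day.day, hour), 20.0))
        day += timedelta(days=1)
    return record

full_record = synthetic_site(date(2018, 1, 1), hours_per_day=24)
late_start = synthetic_site(date(2019, 7, 1), hours_per_day=24)
sparse_days = synthetic_site(date(2018, 1, 1), hours_per_day=6)
```

In this sketch, `full_record` passes both thresholds, `late_start` fails the 365 d training requirement, and `sparse_days` has no day with 12 h of valid data.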

Figure 1. Location of the 5756 observation sites included in the analysis. Red points indicate sites with both NO₂ and O₃ observations (3485 in total), purple points show locations with O₃ observations only (978 sites), and blue points show locations with NO₂ observations only (1293 sites). See the Appendix for detailed maps for North America, Europe, and China.

Table 1. Observational data sources used in the analysis. Time period covers 1 January 2018–1 July 2020.

2.2 Model

Meteorological and atmospheric chemistry information at each of the air quality observation sites is obtained from the NASA Goddard Earth Observing System Composition Forecast (GEOS-CF) model (Keller et al., 2020). GEOS-CF integrates the GEOS-Chem atmospheric chemistry model (v12-01) into the GEOS Earth System Model (Long et al., 2015; Hu et al., 2018) and provides global hourly analyses of atmospheric composition at 25×25 km² spatial resolution, available in near-real time at https://gmao.gsfc.nasa.gov/weather_prediction/GEOS-CF/data_access/, last access: 5 July 2020 (Knowland et al., 2020). Anthropogenic emissions are prescribed using monthly Hemispheric Transport of Air Pollution (HTAP) bottom-up emissions (Janssens-Maenhout et al., 2015), with imposed weekly and diurnal scale factors as described in Keller et al. (2020). The same anthropogenic base emissions are used for the years 2018–2020. Therefore, GEOS-CF does not account for any anthropogenic emission changes since 2018, notably any anthropogenic emission reductions related to COVID-19 restrictions. However, it does capture the variability in natural emissions such as wildfires (based on the Quick Fire Emissions Dataset, QFED) (Darmenov and Da Silva, 2015) or lightning and biogenic emissions (Keller et al., 2014). While the meteorology and stratospheric ozone in GEOS-CF are fully constrained by pre-computed analysis fields produced by other GEOS systems (Lucchesi, 2015; Wargan et al., 2015), no trace gas observations are directly assimilated into the current version of GEOS-CF. It thus provides a “business-as-usual” estimate of NO₂ and O₃ that can be used as a baseline for input into the meteorological normalization process.

2.3 Machine learning bias correction

2.3.1 Overall strategy

We use the XGBoost machine learning algorithm (https://xgboost.readthedocs.io/en/latest/#, last access: 15 March 2020) (Chen and Guestrin, 2016; Frery et al., 2017) to develop a machine learning model to predict the time-varying bias at each observation site at an hourly scale. XGBoost uses the Gradient Boosting framework to build an ensemble of decision trees, trained iteratively on the residual errors to improve the model predictions in a stagewise manner (Friedman, 2001). Based on the 2018–2019 observation–model differences, the machine learning model is trained to predict the systematic (recurring) model bias between hourly observations and the co-located model predictions. These biases can be due to errors in the model, such as emission estimates, sub-grid-scale local influences (representational error), or meteorology and chemistry. Since model biases are often site-specific, we train a separate machine learning model for each site.
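To illustrate the stagewise idea behind gradient boosting and how a learned bias turns raw model output into a bias-corrected prediction, the following sketch fits a sequence of one-split trees (stumps) to the observation–model difference. It is a deliberately simplified stand-in written in plain Python, not XGBoost and not the study's implementation; the single predictor (hour of day), the synthetic rush-hour bias, and all function names are assumptions.

```python
def fit_stump(x, residual):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residual) if xi <= threshold]
        right = [r for xi, r in zip(x, residual) if xi > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def fit_bias_model(x, bias, n_rounds=100, learning_rate=0.3):
    """Stagewise boosting: each stump is fitted to the current residuals."""
    base = sum(bias) / len(bias)
    fitted = [base] * len(bias)
    stumps = []
    for _ in range(n_rounds):
        residual = [b - f for b, f in zip(bias, fitted)]
        threshold, left, right = fit_stump(x, residual)
        stumps.append((threshold, left, right))
        fitted = [f + learning_rate * (left if xi <= threshold else right)
                  for xi, f in zip(x, fitted)]
    return base, learning_rate, stumps

def predict_bias(model, xi):
    """Sum the shrunken stump outputs on top of the mean bias."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(
        left if xi <= threshold else right for threshold, left, right in stumps)

# Synthetic training data (hypothetical): the model misses rush-hour NO2 peaks.
hours = list(range(24)) * 10
model_no2 = [20.0] * len(hours)
true_bias = [5.0 if h in (7, 8, 9, 17, 18, 19) else -2.0 for h in hours]
observed = [m + b for m, b in zip(model_no2, true_bias)]

# Train on the observation-model difference, then correct the model output.
bias_model = fit_bias_model(hours, [o - m for o, m in zip(observed, model_no2)])
bcm = [m + predict_bias(bias_model, h) for m, h in zip(model_no2, hours)]

mse_raw = sum((o - m) ** 2 for o, m in zip(observed, model_no2)) / len(hours)
mse_bcm = sum((o - p) ** 2 for o, p in zip(observed, bcm)) / len(hours)
```

Comparing `mse_bcm` with `mse_raw` shows how much of the recurring mismatch the correction removes. Applied to a period outside the training data, the same corrected prediction would play the role of the business-as-usual estimate against which observations are compared.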

The design of the XGBoost framework is determined by a set of hyperparameters, such as the learning rate, maximum tree depth, or minimum loss reduction. While a full hyperparameter optimization across all sites (e.g., by using a grid search approach) would be computationally prohibitive, we conducted hyperparameter sensitivity tests at a few selected sites and found that the XGBoost performance only improved marginally at these sites when using hyperparameters other than the model defaults (less than 5 % improvement). In addition, we found that the sites respond differently to the same change in hyperparameter setup, suggesting that there is no uniform hyperparameter design that is optimal across all sites. Based on this, we chose to use the default XGBoost model parameters at all locations, with a learning rate of 0.3, minimum loss reduction of 0, maximum tree depth of 6, and L1 and L2 regularization terms of 0 and 1, respectively.
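For reference, the default settings quoted above map onto XGBoost's native parameter names roughly as follows. This is a configuration sketch only; the variable name `xgb_params` is hypothetical, and how the study passed these values to the library is not stated in the text.

```python
# Default XGBoost settings quoted in the text, keyed by native parameter names.
xgb_params = {
    "eta": 0.3,       # learning rate (shrinkage applied to each new tree)
    "gamma": 0.0,     # minimum loss reduction required to make a further split
    "max_depth": 6,   # maximum depth of each tree
    "alpha": 0.0,     # L1 regularization term on leaf weights
    "lambda": 1.0,    # L2 regularization term on leaf weights
}
# A dictionary of this form is what xgboost.train(params, dtrain) accepts
# as its first argument; no training is performed in this sketch.
```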

For each location, we split the 2-year training dataset into eight quarterly segments (January–March, April–June, etc.) and train the model eight times, each time omitting one of the segments (8-fold cross validation). The omitted segment is used as test data to validate the general performance of the machine learning model and to provide an uncertainty estimate, as is further discussed below. This approach aims to reduce the auto-correlation signal that can lead to overly optimistic machine learning results (Kleinert et al., 2021), while still including data from all four seasons in the testing. Once trained, the final model prediction at each location consists of the average prediction of the eight models.
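The leave-one-quarter-out scheme can be sketched as below. The helper names are hypothetical, and the "models" in `ensemble_predict` are arbitrary callables standing in for the eight trained bias correctors.

```python
from datetime import datetime, timedelta

def quarter_id(ts):
    """Map a 2018-2019 timestamp to one of eight quarterly segments (0..7)."""
    return (ts.year - 2018) * 4 + (ts.month - 1) // 3

def quarterly_folds(timestamps):
    """Yield (held_out_segment, train_indices, test_indices) per segment."""
    segments = sorted({quarter_id(ts) for ts in timestamps})
    for held_out in segments:
        train = [i for i, ts in enumerate(timestamps) if quarter_id(ts) != held_out]
        test = [i for i, ts in enumerate(timestamps) if quarter_id(ts) == held_out]
        yield held_out, train, test

def ensemble_predict(models, features):
    """Average the predictions of the per-fold models."""
    predictions = [model(features) for model in models]
    return sum(predictions) / len(predictions)

# One timestamp per day for 2018-2019 (hourly data would work the same way).
timestamps = [datetime(2018, 1, 1, 12) + timedelta(days=d) for d in range(730)]
folds = list(quarterly_folds(timestamps))
```

Each fold holds out one contiguous quarter, so test samples are separated in time from most of the training data, which is the point of splitting by segment rather than at random.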

The observations used in this analysis are not always quality-controlled, which can cause issues if erroneous observations are included in the training, such as unrealistically high O₃ concentrations of several thousand parts per billion by volume. As an ad hoc solution to this problem, we remove from the analysis all observations that lie more than 2 standard deviations below or above the annual mean. Sensitivity tests using alternative thresholds of 3 or even 4 standard deviations resulted in no significant change in our results.
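A minimal sketch of this screening step, applied to the observations of one site and year, is given below. The use of the population standard deviation and the function name are assumptions; the text does not specify these details.

```python
from statistics import mean, pstdev

def remove_outliers(values, n_sigma=2.0):
    """Keep values within `n_sigma` standard deviations of the mean."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    return [v for v in values if abs(v - mu) <= n_sigma * sigma]

# Hypothetical O3 record (ppbv) with one erroneous spike.
o3 = [30.0] * 99 + [5000.0]
screened = remove_outliers(o3)
```

Here the single 5000 ppbv value is dropped while the remaining 99 values are kept. Note that a large spike inflates both the mean and the standard deviation, so the screening is only approximate.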

2.3.2 Evaluation of model predictors

The input variables fed into the XGBoost algorithm are provided in Table A1. The input features encompass 9 meteorological parameters (as simulated by the GEOS-CF model: surface northward and eastward wind components, surface temperature and skin temperature, surface relative humidity, total cloud coverage, total precipitation, surface pressure, and planetary boundary layer height), modeled surface concentrations of 51 chemical species (O₃, NO_x, carbon monoxide, volatile organic compounds (VOCs), and aerosols), and 21 modeled emissions at the given location. In addition, we provide as input features the hour of the day, day of the week, and month of the year; these allow the machine learning model to identify systematic observation–model mismatches related to the diurnal, weekly, and seasonal cycle of the pollutants. Finally, for sites with observations available for the full two years, we provide the calendar days since 1 January 2018 as an additional input feature to also correct for inter-annual trends in air pollution, e.g., due to a steady decrease in emissions not captured by the model. This follows a similar technique to Ivatt and Evans (2020) and Petetin et al. (2020).
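The calendar-based input features can be derived from each observation timestamp as in the following sketch. Only the name "trendday" appears in this paper; the other keys and the function name are hypothetical.

```python
from datetime import datetime

TREND_ORIGIN = datetime(2018, 1, 1)

def time_features(ts):
    """Return the calendar-based predictors for one hourly timestamp."""
    return {
        "hour": ts.hour,                       # diurnal cycle
        "weekday": ts.weekday(),               # weekly cycle, 0 = Monday
        "month": ts.month,                     # seasonal cycle
        "trendday": (ts - TREND_ORIGIN).days,  # days since 1 January 2018
    }

features = time_features(datetime(2020, 3, 15, 14))
```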

Gradient-boosted tree models consist of a tree-like decision structure, which can be analyzed to understand how the model uses the input features to make a prediction. Particularly useful in this context is the SHapley Additive exPlanations (SHAP) approach, which is based on game-theoretic Shapley values and represents a measure of each feature's responsibility for a change in the model prediction (Lundberg et al., 2017). SHAP values are computed separately for each individual model prediction, offering detailed insight into the importance of each input feature to this prediction while also considering the role of feature interactions (Lundberg et al., 2020). In addition, combining the local SHAP values offers a representation of the global structure of the machine learning model.
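The game-theoretic idea can be made concrete with a brute-force computation of Shapley values for a toy prediction function. This is not the tree-specific algorithm used in practice for boosted trees, and replacing "absent" features by a fixed baseline value is a simplifying assumption; the toy function and all names are hypothetical.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Average each feature's marginal contribution over all feature orderings."""
    n = len(instance)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)       # start with every feature "absent"
        previous = predict(current)
        for j in order:
            current[j] = instance[j]   # switch feature j to its actual value
            value = predict(current)
            phi[j] += value - previous
            previous = value
    return [p / len(orderings) for p in phi]

def toy_model(x):
    """One additive term plus an interaction between features 1 and 2."""
    return 2.0 * x[0] + x[1] * x[2]

phi = shapley_values(toy_model, instance=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is shared equally between the two features involved. The cost grows factorially with the number of features, which is why dedicated tree algorithms are needed for models with dozens of predictors.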

Figure A4 shows the distribution of the SHAP values for all NO₂ predictors separated by polluted sites (left panel) and non-polluted sites (right panel), with polluted sites defined as locations with an annual average NO₂ concentration of more than 15 ppbv. Generally, the model-predicted (uncorrected) NO₂ concentration is the most important predictor for the model bias, followed by the hour of the day, the day since 1 January 2018 (“trendday”), and a suite of meteorological variables including wind speed (u10m, v10m), planetary boundary layer height (zpbl), and specific humidity (q10m). All of these factors are expected to strongly affect NO₂ concentrations, and it is thus not surprising that the model biases are most sensitive to them. While there is considerable spread in the feature importance across the individual sites, there is little overall difference in the feature ranking between polluted and non-polluted sites.

Figure A5 shows the SHAP value distribution for all O₃ predictors, again separated into polluted and non-polluted sites (using the same definition as for the NO₂ sites). Unlike for NO₂, the bias-correction models for polluted sites exhibit different feature sensitivities than the non-polluted sites. At polluted locations, the availability of reactive nitrogen (NO₂, NO_y, PAN) is the dominant factor for explaining the model O₃ bias, reflecting the tight chemical coupling between NO_x and O₃ (Seinfeld and Pandis, 2016). This is followed by the month of the year, total precipitation (tprec), and O₃ concentration, again variables that are expected to be correlated with O₃. At non-polluted sites, the uncorrected O₃ concentration is on average the most relevant input feature for the bias correctors, followed by the month of the year and the odd oxygen concentration (O_x = NO₂ + O₃). The non-polluted sites are generally more sensitive to wind speed, reflecting the fact that O₃ production and loss at these locations is less dominated by local processes compared to the polluted sites.

2.3.3 Machine learning model skill scores

Figures 2 and 3 summarize the machine learning model statistics for NO₂ and O₃, respectively. The normalized mean bias (NMB), normalized root-mean-square error (NRMSE), and Pearson correlation coefficient (R) at each site are shown for both the training (blue) and the test (red) dataset. We define the NMB as the mean bias normalized by the average concentration at the given site, and the NRMSE as the root-mean-square error normalized by the difference between the 95th and 5th percentile concentrations. We prefer this percentile window to the mean as the NRMSE denominator because it is a better reference point for the concentration variability at a given site; using the mean would lead to very similar qualitative results.
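The three skill scores can be written out as in the sketch below. The sign convention for the bias (prediction minus observation), the use of observed concentrations for the percentile window, and the linear-interpolation percentile are assumptions, since the text does not spell them out.

```python
from math import sqrt

def percentile(values, q):
    """Linearly interpolated percentile (q in 0..100) of a list of numbers."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (position - lower)

def skill_scores(obs, pred):
    """Return (NMB, NRMSE, R) for paired observations and predictions."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    nmb = (mean_pred - mean_obs) / mean_obs
    rmse = sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n)
    nrmse = rmse / (percentile(obs, 95) - percentile(obs, 5))
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    r = cov / sqrt(var_obs * var_pred)
    return nmb, nrmse, r

# Hypothetical site: predictions are uniformly 1 ppbv too high.
obs = [float(v) for v in range(1, 102)]
pred = [v + 1.0 for v in obs]
nmb, nrmse, r = skill_scores(obs, pred)
```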

For both NO₂ and O₃, the bias-corrected model predictions show no bias when evaluated against the training data, NRMSEs of less than 0.3, and correlation coefficients between 0.6 and 1.0 (NO₂) and between 0.75 and 1.0 (O₃). Compared to the training data, the skill scores on the test data show a higher variability, with an average NMB of −0.047 for NO₂ and −0.034 for O₃, an NRMSE of 0.25 (NO₂) and 0.18 (O₃), and a correlation of 0.64 (NO₂) and 0.84 (O₃). We find no significant difference in skill scores between background and polluted sites or between different countries.

A number of factors likely contribute to the poorer statistical results at some of the sites. Importantly, some sites might be prone to overfitting if the training data include events that are not easily generalizable, such as unusual emission activity (e.g., biomass burning, fireworks, closure of a nearby point source) or weather patterns that are not frequently observed. In addition, the availability of test data at some locations is limited (less than 50 %), which can contribute to a poorer skill score.

2.3.4 Uncertainty estimation

To quantify the uncertainty of an individual model prediction at any given site, we use the standard deviation of the model–observation differences on the test data. For sites with 100 % test data coverage, this represents the standard deviation from a sample of 17 520 hourly model–observation pairs. The resulting individual NO₂ prediction uncertainties range from 3.9 to 28 ppbv (mean of 8.5 ppbv) at polluted sites and from 0.1 to 18 ppbv (mean of 4.9 ppbv) at clean sites. On a relative basis, this corresponds to an average uncertainty of 45 % at polluted sites and 65 % at clean sites. For O₃, we obtain an average individual prediction uncertainty of 14 ppbv (4.6–33 ppbv) at polluted sites and 9.0 ppbv (2.8–45 ppbv) at clean sites, corresponding to an average relative uncertainty of an individual prediction of 29 % and 33 % at polluted and clean sites, respectively.

The results presented in this paper are averages aggregated over multiple hours and locations, and the reported uncertainties are adjusted accordingly by calculating the mean uncertainty $\overline{\sigma}$ from the above-described hourly uncertainties $\sigma_{i}$:

$$\overline{\sigma}^{2} = \sum_{i=1}^{N} \left( \frac{\sigma_{i}}{N} \right)^{2} . \qquad (1)$$

This assumes that the errors across individual sites are uncorrelated. The error covariance across sites is complex: two urban sites close to each other might show a low degree of error correlation due to local-scale (street, canyon, etc.) differences, whereas two background sites further apart might show significantly more correlation due to regional-scale (synoptic) processes. In addition, our uncertainty calculation also implies that the aggregated mean error approaches zero. Given that the average mean biases of the machine learning models are clustered around zero (Figs. 2 and 3), this is a valid general assumption – especially when aggregating across multiple sites. For simplicity we keep the current analysis but acknowledge that it might lead to overly optimistic uncertainty estimates for sites with a relatively large mean bias.
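As a worked illustration of Eq. (1), the following sketch aggregates a set of individual uncertainties into the uncertainty of their mean. The function name and the example values are hypothetical, and the calculation inherits the assumption of uncorrelated errors discussed above.

```python
from math import sqrt

def aggregated_uncertainty(sigmas):
    """Uncertainty of the mean of N values with independent errors, Eq. (1)."""
    n = len(sigmas)
    return sqrt(sum((s / n) ** 2 for s in sigmas))

# 100 hypothetical hourly predictions, each with an uncertainty of 8.5 ppbv.
sigma_mean = aggregated_uncertainty([8.5] * 100)
```

When all N individual uncertainties are equal, Eq. (1) reduces to the familiar σ/√N scaling, so averaging 100 predictions lowers the uncertainty by a factor of 10. Correlated errors would make the true reduction smaller.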

Figure 2. Machine learning statistics between hourly observations and the corresponding bias-corrected model predictions for each observation location. Shown are the normalized mean bias (NMB), normalized root-mean-square error (NRMSE), and Pearson correlation coefficient (R) for the training data (blue) and the test data (red). Data are sorted by region: China, Europe, United States (USA), and rest of the world (ROW). The mean values across all locations are shown in the figure inset.
