Articles | Volume 24, issue 1
Research article
11 Jan 2024

Multidecadal ozone trends in China and implications for human health and crop yields: a hybrid approach combining a chemical transport model and machine learning

Jia Mao, Amos P. K. Tai, David H. Y. Yung, Tiangang Yuan, Kong T. Chau, and Zhaozhong Feng

Surface ozone (O3) is well known for posing significant threats to both human health and crop production worldwide. However, a multidecadal assessment of the impacts of O3 on public health and crop yields in China is lacking due to insufficient long-term continuous O3 observations. In this study, we used a machine learning (ML) algorithm to correct the biases of O3 concentrations simulated by a chemical transport model from 1981–2019 by integrating multi-source datasets. The ML-enabled bias correction offers improved performance in reproducing observed O3 concentrations and thus further improves our estimates of the impacts of O3 on human health and crop yields. The warm-season trends of increasing O3 in Beijing–Tianjin–Hebei and its surroundings (BTHs) as well as in the Yangtze River Delta (YRD), Sichuan Basin (SCB), and Pearl River Delta (PRD) regions are 0.32, 0.63, 0.84, and 0.81 µg m−3 yr−1 from 1981 to 2019, respectively. In more recent years, O3 concentrations experienced more fluctuations in the four major regions. Our results show that only BTHs have a perceptible increasing trend of 0.81 µg m−3 yr−1 during 2013–2019. Using accumulated O3 over a threshold of 40 ppb (AOT40-China) exposure–yield response relationships, the estimated relative yield losses (RYLs) for wheat, rice, soybean, and maize are 17.6 %, 13.8 %, 11.3 %, and 7.3 % in 1981, increasing to 24.2 %, 17.5 %, 16.3 %, and 9.8 % in 2019, with an increasing rate of +0.03 % yr−1, +0.04 % yr−1, +0.27 % yr−1, and +0.13 % yr−1, respectively. The estimated annual all-cause premature deaths induced by O3 increased from ∼55 900 in 1981 to ∼162 000 in 2019 with an increasing trend of ∼2980 deaths per year. The annual premature deaths related to respiratory and cardiovascular disease are ∼34 200 and ∼40 300 in 1998 and ∼26 500 and ∼79 000 in 2019, having a rate of change of −546 and +1770 deaths per year during 1998–2019, respectively. 
Our study, for the first time, used ML to provide a robust dataset of O3 concentrations over the past 4 decades in China, enabling a long-term evaluation of O3-induced crop losses and health impacts. These findings are expected to fill the gap of the long-term O3 trend and impact assessment in China.

1 Introduction

Surface ozone (O3), an important secondary air pollutant, is mainly generated through photochemical reaction of volatile organic compounds (VOCs), carbon monoxide (CO), and nitrogen oxides (NOx) in the presence of sunlight. As a strong oxidant, O3 at the ground level is detrimental to human health and vegetation. More recently, due to rapid urbanization and industrialization, summertime O3 pollution has become an emerging concern in China. Li et al. (2020) reported that the mean summer 2013–2019 trend in maximum daily 8 h average surface O3 (MDA8 O3) was +1.9 ppb yr−1 in China, with high values widely observed in the North China Plain (NCP), Yangtze River Delta (YRD), and Pearl River Delta (PRD) regions. On the regional scale, the exposure of humans and vegetation to O3 is greater in China than in other developed regions of the world (Lu et al., 2018). Several studies have suggested that climate and land cover changes play an important role in O3 pollution in addition to anthropogenic emissions (Fu and Tai, 2015; Wang et al., 2020). It has been suggested that global warming and the changing land use may further increase surface O3 by the late 21st century (Kawase et al., 2011; Wang et al., 2020), which can pose greater threats to human health and food security.

Meteorological factors can modulate the temporal and spatial patterns of O3 by affecting the physical and chemical processes within the atmosphere (Liu et al., 2019; Mao et al., 2020; Yin and Ma, 2020). High temperature, low relative humidity, and low planetary boundary layer height are conducive to photochemical production and O3 accumulation. Jacob and Winner (2009) summarized that the enhanced O3 levels at higher temperatures are primarily driven by increased biogenic VOC emissions from vegetation and reduced lifetimes of peroxyacetyl nitrate (PAN) due to accelerated decomposition of PAN into NOx. Moreover, the changes in wind speed and direction can affect O3 concentrations through transport. Land cover and land use change affects O3 air quality by perturbing surface fluxes, hydrometeorology, and concentrations of atmospheric chemical components (Tai et al., 2013; Fu and Tai, 2015; Liu et al., 2020; Ma et al., 2021). For instance, the terrestrial biosphere is a major source of isoprene, which plays a significant role in modulating O3 concentrations. In the Intergovernmental Panel on Climate Change (IPCC) A1B scenario, Tai et al. (2013) found that widespread crop expansion could reduce isoprene emission by ∼10 % globally compared with the present land use. Such a reduction could decrease O3 by up to 4 ppb in the eastern US and increase O3 by up to 6 ppb in southern and southeastern Asia, whereby the difference in the sign of responses is primarily determined by the different O3 production regimes.

The increasing health burden due to air pollution has become an important contributor to the global burden of disease. Some recent studies have demonstrated that short-term O3 exposure negatively impacts human health, especially with respect to respiratory and cardiovascular mortality (Shang et al., 2013; P. Yin et al., 2017; Feng et al., 2019; Zhang et al., 2022a). In 2015–2018, the estimated annual total premature mortality related to O3 pollution in 334 Chinese cities was 0.27 million for 2015, 0.28 million for 2016, 0.39 million for 2017, and 0.32 million for 2018 (Zhang et al., 2021). Maji and Namdeo (2021) reported that short-term all-cause, cardiovascular, and respiratory premature mortalities attributed to the ambient fourth-highest MDA8 O3 exposure were 156 000, 73 500, and 28 600 in 2019, showing increases of 19.6 %, 19.8 %, and 21.2 %, respectively, compared to 2015. Zhang et al. (2022b) reported that each 10 µg m−3 increase in the MDA8 O3 can lead to a rise of 0.41 % (95 % CI: 0.35 %–0.48 %) in all-cause mortality, 0.60 % (95 % CI: 0.51 %–0.68 %) in cardiovascular mortality, and 0.45 % (95 % CI: 0.28 %–0.62 %) in respiratory mortality.

The damage to plants induced by O3 is mainly caused by the stomatal uptake of O3 into the leaf interior instead of direct plant surface deposition (e.g., Clifton et al., 2020). In previous studies, a variety of concentration-based metrics have been widely used to assess the O3 risks to crop yield and ecosystem functions. Initially, a 7 h (09:00–15:59 LT) mean metric (M7) was proposed, which was later extended to 12 h (08:00–19:59 LT; referred to as M12) to include late-day O3 concentrations. Cumulative metrics have also been developed to evaluate the impacts of O3 on crops. The accumulated O3 over a threshold of 40 ppb (AOT40) is a widely used metric to evaluate the phytotoxic effects of O3. Compared to AOT40 using a linear function, another metric, W126, considers the nonlinear response of yield loss to O3 exposure whereby higher O3 concentrations will progressively induce more severe yield losses. However, many studies have suggested that the stomatal uptake of O3 is more related to vegetation damage than to O3 exposure per se (Feng et al., 2012, 2018; Pleijel et al., 2022). Therefore, in the past 2 decades, the flux-based approach has been developed and has increasingly been used to assess the relationships between the stomatal O3 uptake and crop yields. Tai et al. (2021) compared the results of the estimated global crop yield losses using three concentration-based and two flux-based O3-exposure metrics and showed that the concentration-based metrics differ greatly among themselves, while the two flux-based metrics, which lie close to the middle of the range covered by all metrics, are generally close to each other.

At present, a comprehensive long-term assessment of the impacts of O3 is hindered by a lack of continuous O3 observations in China (Lu et al., 2018; Gong et al., 2021). From both health and food perspectives, reliable long-term estimates of O3 are critically needed to better understand O3 damage over the past few decades since the beginning of rapid industrial transformation in the 1980s. In previous studies, various alternative approaches have been used to address the problem of insufficient observations. The multiple linear regression (MLR) model is often used for extrapolation to construct spatiotemporal distributions of air pollutants (Moustris et al., 2012; Abdullah et al., 2017). However, these linear statistical methods are generally limited by their incapability to capture the nonlinear relationships between air pollutants and precursors as well as meteorological fields. Chemical transport models (CTMs), based on mathematical representation of atmospheric physical and chemical processes, are also a common tool to simulate air pollutant concentrations spatiotemporally (Fusco and Logan, 2003; Liu and Wang, 2020a; H. Wang et al., 2022). Taking advantage of the CTM, Fu and Tai (2015) investigated the impacts of historical climate and land cover changes on tropospheric O3 in eastern Asia between 1980 and 2010. However, the utility of CTMs is often limited by their high computational cost when conducting long-term simulations at high spatiotemporal resolutions. Large biases also exist due to uncertainties in historical emission inventories, parameterization of physical and chemical processes, and initial and/or boundary conditions, and these errors tend to increase at finer spatiotemporal scales.

In recent years, machine learning (ML) methods have gained increasing popularity in air pollution studies (Liu et al., 2020; Ma et al., 2021). In the early stage of applying ML to atmospheric chemistry, ML methods were usually used as an independent method from CTMs (Hu et al., 2017; Zhan et al., 2017), for instance, to predict O3 concentrations by mapping the nonlinear relationships between observed O3 concentrations and their possible shaping factors. These applications are usually purely data-driven, whereby the ML algorithms do not involve any representation of the physical mechanisms behind the relevant processes. With powerful algorithms and user-friendly hyperparameter tuning processes, some well-trained ML models, driven by data from multiple sources including reanalysis and satellite data, have shown even higher predictive capacity than process-based models. The advantages of ML methods over CTMs include more flexible choices for input data and spatiotemporal resolution as well as substantially lower computational costs (Bi et al., 2022). However, purely data-driven ML methods are known for suffering a lack of transparency and interpretability, which renders it more difficult to offer adequate scientific interpretation for the physical mechanisms behind the relevant processes. Thus, a hybrid approach combining ML algorithms and CTM-simulated results has increasingly been used in recent years to predict air pollutants and understand their trends. Integrating data from various sources, ML methods have been used as a tool to correct the biases in the lower-resolution simulated results from CTMs (Di et al., 2017; Ivatt and Evans, 2020; Ma et al., 2021). Based on process-based CTMs integrating decades of accumulated knowledge in Earth system science while taking advantage of ML to address still-existing model errors, the hybrid approach has great potential to tackle air quality problems (Irrgang et al., 2021).

In this work, we incorporated the O3 concentrations directly simulated by the Goddard Earth Observing System coupled with Chemistry (GEOS-Chem) model at a lower resolution into a bias-corrected, finer-resolution dataset by integrating them with O3 observations from 2016 to 2018 (for validation purposes), high-resolution meteorological fields, land use data, and other geographical information from multiple sources using a tree-based ML algorithm called LightGBM. The final high-resolution hourly O3 dataset with a resolution of 0.25° × 0.25° from 1981 to 2019 was further used to assess the impacts of O3 on human health and crop yields over the past 4 decades. The simultaneous analysis of the combined impacts of O3 on agriculture and human health can offer more comprehensive policy implications for the mitigation of O3-related impacts across China.

2 Data and methods

2.1 Air quality, meteorological, land, and crop data

Hourly surface O3 observations (µg m−3) from 2016 to 2018 were obtained from the China National Environmental Monitoring Center Network (last access: 24 December 2023) established by the Ministry of Ecology and Environment of China. The MDA8 O3 of each site was calculated with at least 14 valid hourly values from 08:00 to 00:00 LT. A total of 1016 sites were selected after deleting the missing and abnormal data (Fig. 1).
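The MDA8 calculation described above can be sketched as follows. This is a minimal illustration rather than the authors' processing code: the 14-valid-hour requirement comes from the text, whereas the restriction to 8 h windows lying entirely within 08:00–24:00 LT and the hypothetical `min_valid_window` threshold of six valid hours per window are assumptions made here for illustration.

```python
def mda8(hourly, min_valid_day=14, min_valid_window=6):
    """Maximum daily 8 h average O3 for one site-day.

    hourly: list of 24 values indexed by local hour (0-23); None marks a gap.
    min_valid_day: valid hours required between 08:00 and 24:00 LT (from the text).
    min_valid_window: valid hours required per 8 h window (assumed here).
    Returns None when the day does not meet the validity requirement.
    """
    daytime = hourly[8:24]  # 16 hourly slots from 08:00 to 24:00 LT
    if sum(v is not None for v in daytime) < min_valid_day:
        return None
    best = None
    for start in range(len(daytime) - 7):  # nine 8 h windows inside the period
        window = [v for v in daytime[start:start + 8] if v is not None]
        if len(window) >= min_valid_window:
            mean = sum(window) / len(window)
            best = mean if best is None else max(best, mean)
    return best
```

Returning `None` for incomplete days keeps the screening of missing data separate from the averaging itself.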

Figure 1. Study domain and locations of the selected monitoring sites. The pink, blue, purple, and green rectangles indicate Beijing–Tianjin–Hebei and its surroundings (BTHs) as well as the Sichuan Basin (SCB), Yangtze River Delta (YRD), and Pearl River Delta (PRD) regions, respectively.

The surface meteorological fields used in this study include sea surface pressure, horizontal wind at 10 m, air temperature at 2 m, downward solar radiation, surface albedo, and total precipitation. The variables selected at 850 and 1000 hPa include relative humidity as well as horizontal and vertical velocity. These meteorological variables have been shown by many previous studies to correlate strongly with surface O3 concentrations as discussed above. Hourly reanalysis data for meteorological variables were obtained from the fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis dataset (ERA5) with a spatial resolution of 0.25° × 0.25° from 1981 to 2019 (last access: 24 December 2023). This spatial resolution sets the upper limit of resolution for our hybrid O3 product.

The national land use data with a spatial resolution of 1 km × 1 km for 2013 were obtained from the Resource and Environment Science data center of the Chinese Academy of Sciences (RESDC) (last access: 24 December 2023). Six primary types of land use are considered: cultivated land, forestland, grassland, water bodies, construction land, and unused land. Nationwide elevation data were also provided by the RESDC (last access: 24 December 2023), which are resampled based on the latest Shuttle Radar Topography Mission (SRTM) v4.1 data developed in 2000.

The spatial distribution of the harvested areas for four staple crops (wheat, rice, maize, soybean) in China was obtained from the Global Agro-Ecological Zones 2015 dataset. Crop harvesting dates with a resolution of 0.5° × 0.5° were provided by the Center for Sustainability and the Global Environment (Sacks et al., 2010). For crops having more than one growing season in a year, only the primary growing period was considered.

2.2 GEOS-Chem model

We used the GEOS-Chem global 3-D chemical transport model version 12.2.0 (last access: 24 December 2023), driven by assimilated meteorological data from Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA2) (last access: 24 December 2023) with a horizontal resolution of 2.0° latitude by 2.5° longitude and reduced vertical resolution of 47 levels. GEOS-Chem incorporates meteorological conditions, emissions, chemical information, and surface conditions to simulate the formation, transport, mixing, and deposition of ambient O3. It performs fully coupled simulations of O3–NOx–VOC–aerosol chemistry (Bey et al., 2001). Previous studies have demonstrated the ability of GEOS-Chem to reasonably reproduce the magnitudes and seasonal variations of surface O3 in eastern Asia (Wang et al., 2011; He et al., 2012). To provide long-term simulated O3 fields for incorporation into the ML model (see below), we conducted GEOS-Chem simulations at a resolution of 2.0° × 2.5°; higher resolutions of GEOS-Chem in nested grids are available but computationally prohibitive for multidecadal simulations. The original unit of GEOS-Chem-simulated O3 is parts per billion (ppb), which was converted to micrograms per cubic meter (µg m−3) assuming a constant temperature of 25 °C and pressure of 1013.25 hPa (1 µg m−3 is approximately 0.5 ppb) when compared with observations (P. Yin et al., 2017; Gong and Liao, 2019).
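The unit conversion stated above follows from the ideal gas law. The short sketch below reproduces the quoted factor under the same constant temperature and pressure; the function names are illustrative and are not taken from the study's code.

```python
R_GAS = 8.314462618  # universal gas constant, J mol-1 K-1
M_O3 = 47.998        # molar mass of O3, g mol-1


def ppb_to_ugm3(ppb, temp_c=25.0, pressure_hpa=1013.25):
    """Convert an O3 mixing ratio (ppb) to a mass concentration (ug m-3)."""
    temp_k = temp_c + 273.15
    pressure_pa = pressure_hpa * 100.0
    # Molar volume of an ideal gas in litres per mole (about 24.47 at 25 degC, 1 atm).
    molar_volume_l = R_GAS * temp_k / pressure_pa * 1000.0
    return ppb * M_O3 / molar_volume_l


def ugm3_to_ppb(ugm3, temp_c=25.0, pressure_hpa=1013.25):
    """Inverse conversion, from ug m-3 back to ppb."""
    return ugm3 / ppb_to_ugm3(1.0, temp_c, pressure_hpa)
```

Under these conditions 1 ppb corresponds to roughly 1.96 µg m−3, so 1 µg m−3 is about 0.51 ppb, consistent with the approximate factor of 0.5 quoted in the text.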

Global anthropogenic emissions of CO, NOx, SO2, and VOCs are from the Community Emissions Data System (CEDS), which has coverage over the simulation years of 1950–2014 (Hoesly et al., 2018). Biomass burning emissions are from the Global Fire Emissions Database (GFED4) inventory (van der Werf et al., 2017). Biogenic VOC emissions are computed by the Model of Emissions of Gases and Aerosols from Nature (MEGAN) v2.1 (Guenther et al., 2012), which is embedded in GEOS-Chem. Emissions of biogenic VOC species in each grid cell, including isoprene, monoterpenes, methyl butanol, sesquiterpenes, acetone, and various alkenes, are simulated as a function of canopy-scale emission factors modulated by environmental activity factors to account for changing temperature, light, leaf age, leaf area index (LAI), soil moisture, and CO2 concentrations (Sindelarova et al., 2014).

Dry deposition follows the resistance-in-series scheme of Wesely (1989), which depends on species properties, land cover types, and meteorological conditions, and uses the Olson land cover classes with 76 land types reclassified into 11 land types. Although transpiration is a potential mechanism via which the land cover affects ozone, we do not address it in this study because water vapor concentration in GEOS-Chem is prescribed from assimilated relative humidity (i.e., not computed online from evapotranspiration).

2.3 LightGBM machine learning model

The primary purpose of utilizing ML here was to minimize the biases of model output as compared with observations, whereby the biases could arise from incomplete model physics, input and parameter errors, numerical errors, coding errors, and representation errors (i.e., mismatch in spatial scales between model grid cells and site observations), so that the output of the hybrid model could have the closest values to the observations and enable more accurate impact evaluation. In this study, we used the LightGBM ML algorithm to integrate GEOS-Chem-simulated O3 at a lower resolution with higher-resolution multi-source data to produce higher-resolution hourly O3 and MDA8 O3 fields.

LightGBM is an ML algorithm based on the gradient-boosting decision tree (Chen and Guestrin, 2016), which has a high training efficiency and lower memory footprint and is thus suitable for processing massive high-dimensional data (Zhang et al., 2019). The general steps to build an ML model can be summarized as follows: (1) choose an algorithm that is appropriate for the problem (e.g., regression or classification), (2) clean the data and split them into training and test data, (3) train and tune the model with training data to capture prediction patterns well, (4) evaluate the model performance on test data, and (5) return to steps (3) and (4) until an optimal predictive ability is reached. The training and evaluation processes are both performed at the site level in accordance with the observations, whereby the predictor variables and model responses were first sampled at the same locations using the bilinear interpolation approach (Accadia et al., 2003). This approach of handling spatial-scale mismatch between model grid cells and site observations has been commonly used in previous studies (e.g., Li et al., 2021). When predicting the gridded O3 concentrations with the trained model, predictor variables at different spatial resolutions were all regridded to the same resolution of 0.25° × 0.25°, consistent with the ERA5 meteorological fields. By taking advantage of these higher-resolution datasets, the hybrid approach can not only correct the biases of the GEOS-Chem-simulated O3, but also refine them into a finer resolution. To evaluate if the hybrid approach truly benefits from using higher-resolution meteorological fields, we also repeated the whole training exercise with the input meteorology of GEOS-Chem (MERRA2 at 2.0° × 2.5°) instead of ERA5.
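The site-level sampling step can be illustrated with a minimal bilinear interpolation routine. This sketch assumes a regular latitude-longitude grid with ascending, evenly spaced coordinates; the function name and the toy grid in the usage note are hypothetical and do not reproduce the study's data handling.

```python
def bilinear_sample(grid, lats, lons, lat, lon):
    """Bilinearly interpolate a gridded field to a site location.

    grid[i][j] holds the value at (lats[i], lons[j]); lats and lons are
    ascending and evenly spaced. Points outside the grid are extrapolated
    from the nearest cell, so callers should keep sites inside the domain.
    """
    dlat = lats[1] - lats[0]
    dlon = lons[1] - lons[0]
    # Index of the grid cell whose south-west corner lies at or below the site.
    i = min(max(int((lat - lats[0]) // dlat), 0), len(lats) - 2)
    j = min(max(int((lon - lons[0]) // dlon), 0), len(lons) - 2)
    ty = (lat - lats[i]) / dlat  # fractional position between the two latitudes
    tx = (lon - lons[j]) / dlon  # fractional position between the two longitudes
    south = grid[i][j] * (1 - tx) + grid[i][j + 1] * tx
    north = grid[i + 1][j] * (1 - tx) + grid[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty
```

For example, with a single 2.0° × 2.5° cell whose corner values are 80, 90, 100, and 110, a site at the cell centre receives the average of the four corners, 95.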

During the model training process, the model was evaluated with 10-fold cross-validation to ensure the robustness and reliability of the model, whereby the training data were randomly partitioned into 10 subsets of approximately the same size, with 90 % of the data used to train individual models and the ensemble model and the remaining 10 % of data used to examine model performance (Xiao et al., 2018). This process was repeated 10 times so that each data record was left for testing once. The tuning of the hyperparameters was optimized using grid search optimization to improve detection performance and diagnostic accuracy (Wang et al., 2019). Statistical indicators, including the coefficient of determination (R2) and root mean square error (RMSE), were used in a subsequent assessment of model performance for GEOS-Chem alone and for the hybrid approach.
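The cross-validation procedure and the two evaluation statistics can be sketched in a few lines. The code below is a generic illustration, not the study's pipeline: `fit` is a hypothetical callable standing in for the trained learner, and R2 is computed here as 1 − SS_res/SS_tot, which is an assumption because the text does not state which definition was used.

```python
import math
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Randomly partition range(n_samples) into k folds of near-equal size."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[f::k] for f in range(k)]


def rmse(obs, pred):
    """Root mean square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def cross_validate(x, y, fit, k=10, seed=0):
    """Hold out each fold once and return the pooled out-of-fold predictions.

    fit(train_x, train_y) must return a callable that predicts one sample.
    """
    out_of_fold = [None] * len(y)
    for held_out in kfold_indices(len(y), k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        for i in held_out:
            out_of_fold[i] = model(x[i])
    return out_of_fold
```

Passing the pooled out-of-fold predictions to `rmse` and `r_squared` gives one cross-validated score per metric, since every record is predicted exactly once by a model that never saw it.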

Our analysis revealed that training the model with more than 2 years of data results in only marginal reductions in RMSE and enhancements in R2 (Fig. S1 in the Supplement); thus a timescale of 2 years appears to strike a good balance between computational burden and model accuracy. These results align with the findings of Ivatt and Evans (2020), who suggested that much of the variability in the power spectrum of surface O3 can be captured by timescales of a year or less. Therefore, here we utilized observations from the 2016–2017 period as the training data, which reduced the computing cost and training time, and observations in 2018 as the independent test data to evaluate model performance.

2.4 Ozone-exposure metric and exposure–yield response functions

Among O3-exposure indices, AOT40 has been widely used during the last 2 decades as it has been found to have a strong relationship with the relative yield of many crop species (Mills et al., 2007) and was thus used to quantify the impacts of surface O3 on crop yields in this study. The flux-based metrics, which require long-term simulations using a process-based stomatal uptake model, were beyond the scope of this study. The AOT40 (ppm-h) is defined as follows:

(1) $\mathrm{AOT40} = \sum_{i=1}^{n} \left( [\mathrm{O_3}]_i - 0.04 \right),$

where [O3]i is the hourly mean O3 concentration (ppm) during the 12 h of local daytime (08:00–19:59 LT), and n is the number of hours in the growing season defined as the 90 d prior to the start of the harvesting period according to the crop calendar.
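A minimal sketch of the AOT40 accumulation is given below. It assumes, following the usual AOT40 convention rather than anything stated explicitly in Eq. (1), that only hours with O3 above 0.04 ppm contribute to the sum; the function name and inputs are illustrative.

```python
def aot40_ppm_h(hourly_o3_ppm, local_hours, threshold_ppm=0.04):
    """Accumulated O3 over the threshold (ppm-h) for one growing season.

    hourly_o3_ppm: hourly mean O3 concentrations in ppm.
    local_hours: matching local hour of day (0-23) for each value.
    Only daytime hours (08:00-19:59 LT) above the threshold are accumulated;
    the clipping at the threshold is the standard convention assumed here.
    """
    total = 0.0
    for conc, hour in zip(hourly_o3_ppm, local_hours):
        if 8 <= hour <= 19 and conc > threshold_ppm:
            total += conc - threshold_ppm
    return total
```

In practice the inputs would cover the 90 d before the start of the harvesting period for each grid cell and crop.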

The exposure–yield response functions based on extensive field experimental studies have been established to relate a quantifiable O3-exposure metric to crop yields. It has been suggested that responses of crop yields were found to be greater in Asian experiments than in the American and European counterparts, indicating possibly higher O3 sensitivity of Asian crop varieties (Emberson et al., 2009; Feng et al., 2022). To better understand O3-induced risks to crops in China, the AOT40 exposure–yield functions developed based on field experiments in China are used in this study, which are named AOT40-China. The exposure–yield response functions for soybean are from Zhang et al. (2017), and those of the other three crops are from Feng et al. (2022). The statistical exposure–yield relationships used in this study are summarized in Table S1 in the Supplement.

2.5 Analysis of health impacts

All-cause mortality, cardiovascular disease mortality, and respiratory disease mortality are selected as the health outcomes of our study due to the high correlation between these endpoints and short-term O3 exposure found in previous studies. A log-linear exposure–response function is widely adopted and recommended by the World Health Organization (WHO) for health impact assessment in areas with severe air pollution. In particular, the log-linear model is the most widely applied exposure–response model at present in China (Lelieveld et al., 2015; H. Yin et al., 2017; Zhang et al., 2022b). The premature mortality is calculated following

(2) $\Delta M = \delta c \times \frac{\mathrm{RR} - 1}{\mathrm{RR}} \times P,$

where ΔM is the excess mortality attributable to O3 exposure, δc is the baseline mortality rate for a particular health endpoint (P. Yin et al., 2017; Madaniyazi et al., 2016), P is the exposed population, and RR is the relative risk defined as

(3) $\mathrm{RR} = \exp\left[ (X - X_0)\,\beta \right].$

Here, β is the exposure–response coefficient derived from epidemiological cohort studies (Shang et al., 2013), X represents the model-calculated O3 concentration, and the value of X0 is the threshold concentration below which no additional risk is assumed. Consistent with previous studies (Lelieveld et al., 2015; Liu et al., 2018), we used X0 = 75.2 µg m−3.
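Equations (2) and (3) can be combined into a short calculation. The sketch below is illustrative only: the function names are hypothetical, β is taken per µg m−3, and concentrations below X0 are clipped so that they contribute no excess risk, as the definition of the threshold implies.

```python
import math


def relative_risk(x, beta, x0=75.2):
    """RR = exp((X - X0) * beta), with no additional risk below X0 (ug m-3)."""
    return math.exp(beta * max(x - x0, 0.0))


def excess_mortality(x, beta, baseline_rate, population, x0=75.2):
    """Excess deaths: baseline rate * (RR - 1) / RR * exposed population."""
    rr = relative_risk(x, beta, x0)
    return baseline_rate * (rr - 1.0) / rr * population
```

As a purely hypothetical example, a warm-season MDA8 O3 of 115.2 µg m−3, β = 0.0004 per µg m−3, a baseline mortality rate of 0.006 per person per year, and one million exposed people give roughly 95 excess deaths per year; these inputs are placeholders and are not values from the study.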

In this study, the mean MDA8 O3 concentrations in warm seasons (May–September) were used to estimate the disease-specific health impacts of short-term exposure to O3. The province-level population and national baseline mortality rate for particular diseases were provided by the National Bureau of Statistics (last access: 24 December 2023). Because provincial-level data were unavailable, spatial differences in baseline mortality were not considered; that is, we assume the baseline mortality is evenly distributed across China (Dedoussi et al., 2020). The exposure–response coefficients were obtained from existing epidemiological studies in China (Table S2). If the corresponding coefficient of a province could not be found in published epidemiological studies, the datum closest to that province was selected as a substitute. If there were no neighboring provinces, the results of the national meta-analysis were used (Zhang et al., 2021).

3 Results

3.1 Model development and validation

The final selected features and their importance estimated by the LightGBM algorithm based on 10-fold cross-validation are shown in Fig. 2. GEOS-Chem-simulated O3 is the top predictor of surface O3 concentrations, accounting for 61 % and 58 % of all relative importance in the ML algorithm predicting hourly O3 and daily MDA8 O3, respectively. The result indicates that process-based GEOS-Chem simulations have high utility for O3 predictions under the hybrid approach (Ma et al., 2021). The meteorological variables with a high contribution to both the daily and the hourly models are downward surface solar radiation (SSRD), relative humidity at 1000 hPa (RH_1000hpa), and 10 m horizontal wind (U10 and V10). Other special features, including location (latitude and longitude), elevation, and diurnal and monthly patterns of O3, also contribute to ambient O3 estimations. The spatial distributions of bias-corrected O3 are consistent with observations for both training and test datasets (Fig. S2), indicating that there is no obvious overfitting; i.e., the model is able to generalize from the training set to the test set. The good generalization ability of the model gives us confidence in its ability to make accurate predictions based on new data. In general, the hybrid approach can yield good O3 estimates in the data-intensive regions, including eastern and central China, which are the hotspot areas of O3 pollution.

Figure 2. The feature importance plot for (a) MDA8 O3 and (b) hourly O3. The full list of candidate variables with their symbols, units, descriptions, and data sources is shown in Table S3.


Figure 3 shows the density scatterplots between O3 measurements and GEOS-Chem simulations, as well as the hybrid approach predictions for 2018. The R2 values of the hybrid approach and GEOS-Chem model are 0.66 and 0.27 at the hourly level and 0.72 and 0.53 at the MDA8 O3 level, respectively. Bias-corrected O3 concentrations have lower RMSEs in comparison with GEOS-Chem-simulated O3 concentrations, reduced from 31.1 to 23.8 µg m−3 for MDA8 O3 predictions and from 38.5 to 26.3 µg m−3 for hourly predictions. The MDA8 O3 model performance is better than that of the hourly model, indicating reduced errors upon temporal averaging. To test if using the higher-resolution meteorological data offers better prediction accuracy compared with the original input meteorology of GEOS-Chem, the MERRA2 dataset driving GEOS-Chem was also used to train the model. We found that the higher-resolution ERA5 dataset performed better in reproducing observed O3 concentrations with moderately smaller RMSEs and larger R2 values (Fig. S3), demonstrating the extent to which a higher-resolution meteorological dataset, despite not being strictly consistent with the input meteorology for the CTM, can help enhance the performance of the hybrid approach and help resolve finer spatial details within the original CTM grid cells. In summary, the result suggests that the CTM-simulated results can be substantially improved by applying ML with multi-source datasets, and the bias-corrected data can improve our understanding of long-term O3 trends and their implications for crops and human health over China, as discussed in the following sections.

Figure 3. Density scatterplots and linear regression statistics of O3 predictions vs. observations for 2018: (a) bias-corrected MDA8 O3 vs. observations, (b) GEOS-Chem MDA8 O3 vs. observations, (c) bias-corrected hourly O3 vs. observations, and (d) GEOS-Chem hourly O3 vs. observations. The model results are sampled at the same locations. The dashed red line indicates the 1:1 line, and the solid blue line indicates the line of best fit using orthogonal regression. R2 is the coefficient of determination, RMSE is the root mean square error, and N is the number of data points. The x and y axes represent the O3 observations and predictions, respectively.


In comparison with previous studies, Liu et al. (2020) used XGBoost to predict O3 in major urban areas of China at a resolution of 0.1° × 0.1°, and the R2 value and RMSE for MDA8 O3 were 0.74 and 23.8 µg m−3, respectively. Their result indicates that higher-resolution predictions may help enhance model accuracy but represent a trade-off between model accuracy and time efficiency, depending on the purpose. Instead of directly predicting O3 concentrations, Ivatt and Evans (2020) predicted biases in GEOS-Chem-simulated O3 concentrations and then corrected them with XGBoost. They also found that the corrected model performs considerably better than the uncorrected model, with RMSE reduced from 32.4 to 15.0 µg m−3 and Pearson's R raised from 0.48 to 0.84. Their larger RMSE reduction relative to ours arises mainly because they selected fewer sites for training, with all the urban and mountain sites (observations made at a pressure < 850 hPa) removed. The removal of these sites can improve the overall apparent performance of the model because O3 formation could have different characteristics in these areas. In general, ML methods have been proven to be a promising tool to improve air pollutant forecasts when a process-level understanding is still incomplete.

3.2 Spatiotemporal distribution and trends in O3 predictions

Figure 4 shows the spatial patterns of averaged annual and warm-season (May–September) MDA8 O3 from 1981 to 2019. MDA8 O3 concentrations are lower at the annual level than in the warm season. The annual and warm-season MDA8 O3 concentrations have similar spatial distributions, and both present an increasing trend over the past few decades, with more substantial increases observed between 1981 and 2010. The O3 levels in southern China are lower than those in northern China, but they are still relatively high in the PRD region, which is consistent with findings in previous studies (e.g., Liu and Wang, 2020a). During the first decade of 1981–1990, high-O3-concentration areas are mainly concentrated in the BTHs and northern Shandong. In the next 2 decades, O3 pollution expands extensively to most of eastern and northern China, spreading northward to Jilin and Liaoning; westward to Shanxi and Ningxia; and southward to northern Hunan, Jiangxi, and Zhejiang. Moreover, the SCB and PRD regions also experience aggravated O3 pollution during this period. In the last decade of the study period, O3 concentrations remain at high levels in BTHs and SCB without obvious changes. Next we analyze the interannual variability to understand the detailed changes and trends in O3.

Figure 4. Spatial distribution of the annual mean MDA8 O3 concentrations (µg m−3) during (a) 1981–1990, (b) 1991–2000, (c) 2001–2010, and (d) 2011–2019. Spatial distribution of the warm-season (May–September) mean MDA8 O3 concentrations during (e) 1981–1990, (f) 1991–2000, (g) 2001–2010, and (h) 2011–2019.

Figure 5 shows that the annual averaged MDA8 O3 concentrations increased from 87 µg m−3 in 1981 to 98 µg m−3 in 2019, with a growth rate of +0.26 µg m−3 yr−1, while the warm-season averaged MDA8 O3 concentrations increased from 100 µg m−3 in 1981 to 117 µg m−3 in 2019, with a growth rate of +0.51 µg m−3 yr−1. Moreover, the average annual and warm-season O3 concentrations have a more obvious upward trend before the 2000s, with growth rates of 0.38 and 0.71 µg m−3 yr−1, respectively, compared to that after the 2000s, when O3 concentrations appear to fluctuate within a certain range. GEOS-Chem-simulated O3 has a similar trend to the bias-corrected O3, but it generally overestimates O3 concentrations on a national scale (Fig. S4). The annual and warm-season averaged MDA8 O3 concentrations in BTHs as well as the YRD, SCB, and PRD regions are shown in Figs. S5 and S6. The warm-season increasing trends for BTHs as well as the YRD, SCB, and PRD regions are 0.32, 0.63, 0.84, and 0.81 µg m−3 yr−1, respectively, from 1981 to 2019.

Figure 5. The bias-corrected MDA8 O3 predictions (black line; upper y axis) and corresponding anomalies (colored bar; lower y axis) from 1981 to 2019: (a) annual mean and (b) warm-season mean (May–September). The trends (growth rates) are obtained through ordinary linear regression on mean values of MDA8 O3. The anomalies are defined as the annual mean minus the multidecadal average over 1981–2019.
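As a minimal sketch of the two quantities described in the caption, the following computes an ordinary least-squares trend and the anomalies relative to the period mean. The five-year series is hypothetical and serves only to show the arithmetic; the function names are illustrative and not taken from the study.

```python
def linear_trend(years, values):
    """Slope of the ordinary least-squares line through (year, value) pairs."""
    n = len(years)
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    den = sum((t - mean_t) ** 2 for t in years)
    return num / den

def anomalies(values):
    """Each value minus the mean over the whole period."""
    mean_v = sum(values) / len(values)
    return [v - mean_v for v in values]

# Hypothetical annual mean MDA8 O3 (µg m-3); not data from this study.
years = [1981, 1982, 1983, 1984, 1985]
mda8 = [87.0, 88.1, 87.4, 88.9, 88.6]

trend = linear_trend(years, mda8)   # µg m-3 per year
anoms = anomalies(mda8)             # µg m-3, sums to zero by construction
print(f"trend = {trend:+.2f} µg m-3 yr-1")
print("anomalies:", [round(a, 2) for a in anoms])
```

Because the slope depends on every point in the window, shifting the start or end year by even one year can change it noticeably when interannual variability is large.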


In recent years, the worsening O3 pollution has fueled numerous studies on ground-level O3 spatial distribution and changes in China, which have been conducted on local, regional, and national scales using different O3 fields from observations, CTMs, and ML estimates. In this study, we mainly focus on the regional and national O3 characteristics, and the O3 trends reported in recent studies are listed in Table 1. By comparing the results of existing works, we find that source-varied O3 fields can induce great uncertainty in the O3 trends. Moreover, the O3 trends are found to be very sensitive to the study period even with the same O3 fields (Wei et al., 2022), which indicates large interannual variability, mostly reflecting the changing anthropogenic emissions and meteorology (Lu et al., 2019; Li et al., 2020). In contrast to the perceptible O3 trends, Liu et al. (2020) suggested that O3 pollution in most parts of China undergoes only modest changes between 2005 and 2017, and their trends were not spatially continuous. T. Wang et al. (2022) also reported that O3 has small positive increase rates for 2013–2021 in many cities, and the O3 increase rates greatly differ from site to site even within the same region.

Table 1. Summary of reported regional and national MDA8 O3 trends (µg m−3 yr−1).


In comparison, our results indicate no obvious increasing trends in national MDA8 O3 within the same study period. On a regional scale, only BTHs have a perceptible increasing trend in more recent years, while no such trends are found over the YRD, SCB, and PRD regions during the same period. The summertime MDA8 O3 in BTHs has a change rate of +0.81 µg m−3 yr−1, which is much lower than the results using O3 observations (Li et al., 2020). One possible reason is that most observational sites are in urban regions, which usually suffer more serious O3 pollution, while the O3 concentrations from model simulations and ML methods are calculated on the scale of a grid cell with lower domain-averaged values. Moreover, gridded data at a relatively coarse resolution may fail to capture large site-to-site differences, leading to a larger discrepancy between O3 observations and gridded O3 estimates.

3.3 Seasonal characteristics of O3 predictions

Differences in averaged annual and warm-season O3 concentrations indicate that O3 has distinctive seasonal characteristics. Figure 6 shows the seasonal variations in O3 concentrations from 2011–2019, and results for the preceding 3 decades are shown in Figs. S7–S9. In winter, pollution is mainly concentrated in the coastal areas of southern China. In spring, O3 pollution primarily occurs in eastern China and the southern part of Yunnan Province. O3 pollution continues to worsen over eastern China in summer, particularly in BTHs, and further extends to SCB. The air quality in eastern and central China is greatly improved in autumn, while southern China experiences the most pollution in this period. In general, the peak and trough values of O3 concentrations appear in summer and winter, respectively. However, O3 concentrations are found to be minimum in summer and maximum in autumn over PRD, which is largely determined by the summer monsoon (Zhou et al., 2013; H. Wang et al., 2018). Figure S10 shows the seasonal averaged MDA8 O3 concentrations in different regions from 1981 to 2019. In winter, O3 concentrations do not experience much change across the four regions over the past few decades, staying mostly between 70 and 80 µg m−3. Moreover, wintertime O3 concentrations after the 2000s are generally lower than those before the 2000s in BTHs, YRD, and SCB. In contrast, summertime O3 concentrations increase dramatically over the four regions. In spring and autumn, O3 concentrations have an increasing trend in PRD, while they mostly fluctuate within a certain range in the other three regions. The results show that O3 in non-winter seasons has a more pronounced increase during 1981–2019, albeit with regional differences. The regional characteristics of O3 and its influencing factors are further discussed in Sect. 3.4. BTHs as well as the SCB, YRD, and PRD regions have been identified as hotspots of O3 pollution in China. These regions are characterized by high population density (L. Wang et al., 2018) and are also major agricultural areas (Monfreda et al., 2008), which may face greater burdens of crop yield and human health losses with high O3 concentrations. Therefore, here we provide more detailed analysis and investigation of these regions.

Figure 6. Spatial distribution of the bias-corrected MDA8 O3 predictions (µg m−3) from 2011–2019: (a) winter, (b) spring, (c) summer, and (d) autumn.

3.4 Regional characteristics of O3 predictions

Figure 7 shows the bar plots of the seasonal MDA8 O3 concentrations in each region from 1981–2019 for bias-corrected and GEOS-Chem-simulated O3. For the bias-corrected O3, the averaged summertime MDA8 O3 concentrations in BTHs, YRD, and SCB and autumn-time MDA8 O3 concentrations in PRD are 137±8, 119±10, 113±12, and 98±10 µg m−3, with increasing rates of 0.46, 0.73, 0.98, and 0.69 µg m−3 yr−1 from 1981 to 2019, respectively (Fig. S11). For GEOS-Chem-simulated O3, the averaged summertime MDA8 O3 concentrations in BTHs, YRD, and SCB and autumn-time MDA8 O3 concentrations in PRD are 141±7, 125±11, 120±14, and 100±12 µg m−3, respectively. This shows that O3 concentrations of the four regions have a consistent upward trend in the summer over the past few decades, but there are regional differences in other seasons. Compared to BTHs and YRD, PRD and SCB have more distinctive O3 increases in spring and autumn. Among these four regions, the O3 concentrations have the biggest seasonal differences in BTHs and the smallest seasonal differences in PRD.

Figure 7. The seasonal mean MDA8 O3 concentrations (µg m−3) in different regions during 1981–2019. Bias-corrected MDA8 O3 in (a) winter, (c) spring, (e) summer, and (g) autumn. GEOS-Chem MDA8 O3 in (b) winter, (d) spring, (f) summer, and (h) autumn. The error bar represents the standard deviation.


The spatiotemporal patterns of O3 in China have been proven to largely depend on both emissions and meteorology. The regional O3 pollution is usually found to be triggered by specific circulation patterns as local meteorological factors are modulated by synoptic-scale circulation patterns. China has a large territory and is affected by different weather systems. The continental high-pressure systems, components of the eastern Asian summer monsoon (EASM) and tropical cyclones, among others, are critical synoptic conditions leading to O3 formation and transport in China (T. Wang et al., 2022; Han et al., 2020). For instance, regional O3 pollution in northern China usually occurs under a typical weather pattern of an anomalous high-pressure system at 500 hPa (Gong and Liao, 2019), which creates favorable meteorological conditions for high O3 levels with high temperature, low relative humidity, anomalous southerlies, and divergence in the lower troposphere. As one of the most important components of EASM, the western Pacific subtropical high (WPSH) strongly influences summertime precipitation and atmospheric conditions in eastern China. A strong WPSH can decrease O3 levels over YRD as enhanced moisture is transported into YRD under prevailing southwesterly winds (Zhao and Wang, 2017). Located on the southern coast of China, PRD features a typical subtropical monsoon climate. There, O3 concentrations are usually the lowest in summer due to the prevailing southerlies with clean air from the ocean and the associated heavy rainfall, while the worst O3 pollution usually happens in autumn, mainly due to the occasional northerly winds during the monsoonal transition, thereby importing precursors from the north, and stable and still relatively warm and sunny weather conditions before the winter starts. Downdrafts in the periphery circulation of a typhoon system can also strongly enhance surface O3 before typhoon landing (Jiang et al., 2015; Lu et al., 2021; Li et al., 2022). 
On the one hand, the poor ventilation in the peripheral subsidence region of typhoons favors the accumulation of O3 and its precursors. On the other hand, the deep subsidence can transport the O3 in the upper troposphere and lower stratosphere to the surface, causing aggravated O3 pollution. Moreover, smaller-scale circulation patterns, such as land–sea and mountain–valley breezes, also influence O3 in coastal regions (Ding et al., 2004; Zhou et al., 2013; H. Wang et al., 2018).

When compared to the hybrid approach, GEOS-Chem generally has similar O3 distribution and trends over each region, while overestimating O3 concentrations (Table S4). GEOS-Chem particularly overestimates wintertime and autumn-time O3 concentrations in SCB, which are 10±1 and 17±3 µg m−3 higher than those of the hybrid approach, respectively. Previous studies have reported such model overestimates and have proposed a number of explanations involving precursor emissions, dry deposition, and vertical mixing in the planetary boundary layer (PBL) (Lin et al., 2008; Travis et al., 2016; Fiore et al., 2005). Both observational analyses and inter-model comparisons suggested that the summertime dry deposition of O3 calculated by the Wesely scheme in GEOS-Chem could be underestimated, which has been invoked as a cause for model overestimates of O3. The biased emissions in the model, consistent with the biased-high tropospheric NOx columns, result in overestimated O3. Travis et al. (2016) showed that GEOS-Chem with reduced NOx emissions provides an unbiased simulation of O3 observations from aircraft and reproduces the observed O3 production efficiency in the boundary layer. Lin et al. (2008) suggested that the excessive PBL mixing can lead to the biased-high O3 concentrations. The fully mixed O3 throughout the PBL means that the higher O3 concentrations in the upper PBL are brought down to the surface much more efficiently. Moreover, the excessive spatial averaging of emissions at coarser resolutions could also lead to systematic overestimation of regional O3 production (Wild and Prather, 2006). In summary, with a higher prediction accuracy, the hybrid approach lends greater credence to using model simulations to extrapolate historical O3 further back in time, which can furthermore provide us with more accurate estimates of the impacts of O3 on crop production and human health.

Figure 8. Estimated annual mean relative yield losses (RYLs; in %) of four staple crops from 1981–2019 using the AOT40-China metric. The estimated RYLs using bias-corrected O3 for (a) maize, (d) wheat, (g) soybean, and (j) rice. The estimated RYLs using GEOS-Chem-simulated O3 for (b) maize, (e) wheat, (h) soybean, and (k) rice. The differences in estimated RYLs between GEOS-Chem-simulated and bias-corrected O3 for (c) maize, (f) wheat, (i) soybean, and (l) rice. The GEOS-Chem-simulated O3 was regridded to 0.5° × 0.5° for comparison with bias-corrected O3.

3.5 Crop production losses attributable to O3 pollution

Figure 8 shows the relative yield losses (RYLs; RYL = 1 − RY, where RY is the relative yield defined as the ratio of the O3-affected yield to the yield without O3 exposure) calculated with GEOS-Chem and bias-corrected O3 using the AOT40-China metric. For a given crop, the RYLs show generally consistent spatial distribution across the metrics, with BTHs having the most serious crop yield losses due to high O3 concentrations. Compared to the bias-corrected O3, using GEOS-Chem-simulated O3 generally leads to larger yield losses, especially over BTHs and SCB, reflecting overestimated O3 concentrations by GEOS-Chem in cropland areas during the growing seasons (Fig. S12), primarily in spring and summer, which is consistent with the above analysis. GEOS-Chem-simulated O3 leads to slightly underestimated wheat yield loss only over some parts of BTHs, mostly because the primary growing period of wheat there is in winter and spring, and GEOS-Chem has lower O3 estimates than the hybrid approach during this period there (Table S4).

Figure 9 shows the bar plots of the relative yield for each crop using the AOT40-China exposure–yield response relationship. Crop yield losses are generally consistent with the O3 trends as the exposure–yield relationships used here are essentially a set of linear functions. Most crops experience aggravated yield losses over the past 4 decades due to enhanced O3 concentrations, except for wheat, which has the largest yield loss during the period from 1991 to 2000. The reason could be that BTHs have the highest O3 concentrations in spring during the 1990s, which is the primary growing season for wheat (Fig. S13).
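To make the link between O3 exposure and yield loss concrete, the sketch below accumulates AOT40 (the sum of hourly O3 exceedances above 40 ppb during daylight hours) and applies a linear exposure–yield function, with the relative yield loss taken as one minus the relative yield. It is an illustration under assumptions: the daylight window of 08:00–19:59, the slope value, the unit intercept, and the hourly data are all hypothetical, and the crop-specific AOT40-China coefficients used in this study are not reproduced here.

```python
def aot40_ppm_h(hourly_ppb, hours_of_day, threshold=40.0, day_start=8, day_end=20):
    """Accumulate hourly O3 exceedances above `threshold` (ppb) during daylight.

    The daylight window [day_start, day_end) is an assumption for illustration.
    Returns AOT40 in ppm h.
    """
    total_ppb_h = 0.0
    for conc, hour in zip(hourly_ppb, hours_of_day):
        if day_start <= hour < day_end and conc > threshold:
            total_ppb_h += conc - threshold
    return total_ppb_h / 1000.0  # ppb h -> ppm h

def relative_yield_loss(aot40, slope):
    """Linear exposure-yield response: RY = 1 - slope * AOT40, RYL = 1 - RY.

    `slope` (per ppm h) is a hypothetical crop-specific coefficient.
    RY is clamped to [0, 1] so that RYL stays a valid fraction.
    """
    relative_yield = min(max(1.0 - slope * aot40, 0.0), 1.0)
    return 1.0 - relative_yield

# Hypothetical hourly O3 (ppb) with the local hour of each sample.
hourly_ppb = [30.0, 50.0, 70.0, 60.0, 45.0]
hours_of_day = [6, 10, 13, 16, 21]

# Only the 10:00, 13:00, and 16:00 samples are in daylight and above 40 ppb.
exposure = aot40_ppm_h(hourly_ppb, hours_of_day)

# A season-long exposure of 10 ppm h with a hypothetical slope of 0.02 per ppm h.
loss = relative_yield_loss(10.0, 0.02)
print(f"AOT40 = {exposure:.3f} ppm h, RYL = {100 * loss:.1f} %")
```

Since the response is linear in AOT40 and AOT40 only counts exceedances, a modest positive bias in simulated O3 near the 40 ppb threshold can translate into a disproportionately large change in estimated yield loss.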

Figure 9. The estimated decadal mean relative yield losses (RYLs) of four staple crops using different metrics from 1981–2019. The estimated RYLs using bias-corrected O3 and GEOS-Chem-simulated O3 for (a) maize, (b) wheat, (c) soybean, and (d) rice. The error bar represents the standard deviation.


The average annual crop RYLs from 1981 to 2019 for wheat, rice, soybean, and maize range from 1.1 % to 13.4 %, 2.7 % to 13.4 %, 6.3 % to 24.8 %, and 0.8 % to 7.4 %, respectively. The differences in yield losses across crops reflect the dependence on crop-specific phenology and ecophysiology. The estimated annual RYLs using bias-corrected O3 for wheat, rice, soybean, and maize from 1981 to 2019 range from 17.5 %–25.5 %, 10.7 %–19.1 %, 7.3 %–17.9 %, and 7.1 %–12.7 %, with a growth rate of 0.03 % yr−1, 0.04 % yr−1, 0.27 % yr−1, and 0.13 % yr−1. Wheat is the most sensitive crop to the O3 concentrations, whereas maize is the least sensitive. Using GEOS-Chem-simulated O3, the estimated annual RYLs for wheat, rice, soybean, and maize from 1981 to 2019 are 18.7 %–28.7 %, 14.0 %–22.0 %, 12.4 %–23.1 %, and 7.9 %–13.2 %, having a growth rate of 0.08 % yr−1, 0.14 % yr−1, 0.23 % yr−1, and 0.11 % yr−1. There are noticeable differences in crop yield estimates using the bias-corrected and GEOS-Chem O3, again indicating the importance of the bias-corrected high-resolution O3 data in related crop issues.

In existing studies evaluating the O3-induced crop losses in China, which also use exposure–yield relationships derived from the experiments conducted in Asia, Zhang et al. (2017) reported that the ambient O3 concentrations in northeastern China cause substantial annual yield loss of soybean ranging from 23.4 % to 30.2 % during 2013 and 2014, depending on the O3 metric used (including AOT40, W126, SUM06, and a flux-based metric). Feng et al. (2022), using AOT40, indicated that the annual average RYLs of wheat, rice, and maize from 2017 to 2019 are 33 %, 23 %, and 9 %, respectively. Our correspondingly estimated RYLs for rice (18.0 %) and maize (10.0 %) are generally consistent with their results, while the RYLs for soybean (16.4 %) and wheat (23.4 %) are much lower than their estimates. Since we used the same exposure–yield response relationships as in their studies, the discrepancies are primarily attributed to the differences in the metrics used (only for soybean), O3 fields, and the sensitivity of the crop to the changes in O3 concentrations (Mukherjee et al., 2021; Feng et al., 2022; Mills et al., 2018). In Zhang et al. (2017), the O3 measurements are obtained from the experimental field (45°73′ N, 126°61′ E), and in Feng et al. (2022), the measured O3 concentrations are from over 3000 monitoring sites across eastern Asia. The results of the comparison are consistent with the previous analysis of O3 trends and variability from different sources, where the domain-averaged values of O3 observations are larger than gridded O3 from model simulations (Sect. 3.2) and thus lead to larger estimates of RYLs. On the one hand, it indicates that O3 fields should be considered a great source of uncertainty when comparing the results of previous studies using source-varied O3 fields. Moreover, different degrees of importance should be given for specific crops; for example, the changes in O3 concentrations have a larger impact on wheat. On the other hand, it again highlights the necessity and importance of bias correction for model-simulated O3 when studying O3-induced crop reduction.

Figure 10. (a) Annual premature mortality (in thousands) for different diseases over the past few decades, (b) annual mean province-based mortality (in thousands) attributed to different health endpoints, and (c) annual mean province-based population (in millions). The mortality is calculated using the bias-corrected O3.


3.6 Health impacts attributable to O3 pollution

The estimated annual all-cause premature deaths induced by O3 increased from 55 876 in 1981 to 162 370 in 2019 with an increasing trend of +2979 deaths per year. The annual premature deaths related to respiratory and cardiovascular diseases were 34 155 and 40 323 in 1998 and 26 471 and 79 021 in 2019, having a rate of change of −546 and +1773 deaths per year during 1998–2019, respectively (Fig. 10a). Among the three types of health outcomes, only respiratory diseases experienced a decreasing trend in premature mortality, and the premature mortality is consistently below 40 000. The decreasing trend of the respiratory-related mortality primarily results from the decreased annual baseline mortality rate over the past few decades (Fig. S14). As the total respiratory-related deaths decreased over the past few decades, respiratory O3-related deaths are decreasing even under aggravated O3 pollution. Based on GEOS-Chem-simulated O3, the corresponding estimated change rate for all-cause disease is +3516 deaths per year from 50 384 in 1981 to 176 741 in 2019. Premature mortality induced by respiratory disease decreased from 37 822 in 1998 to 29 079 in 2019 with a change rate of −584 deaths per year, while cardiovascular disease increased from 44 516 in 1998 to 85 980 in 2019 with a change rate of +1977 deaths per year (Fig. S15). The result shows that using GEOS-Chem-simulated O3 generally gives higher estimates of mortality than using the bias-corrected data. Figure 10b shows the provincial annual average premature mortality of different health endpoints. The five provinces with the highest all-cause mortality are Jiangsu (14 510; 95 % CI: 9022–19 935), Shandong (12 684; 95 % CI: 4258–20 990), Henan (12 290; 95 % CI: 4125–20 343), Guangdong (9268; 95 % CI: 7224–11 416), and Hebei (8276; 95 % CI: 2776–13 706), which are generally consistent with previous studies for China (Zhang et al., 2021, 2022a). Similar distribution can be found for respiratory and cardiovascular diseases but with a different ranking order. Generally, the provinces in densely populated areas (Fig. 10c) with higher O3 concentrations, such as BTHs, YRD, and PRD, have higher health burdens. In contrast, northeastern and southern China (excluding Guangdong) suffer the least life losses induced by O3 exposure (Fig. S16).

When compared with estimates from previous studies, our estimates are generally quite consistent with those given by Maji and Namdeo (2021), who reported that the short-term all-cause, cardiovascular, and respiratory premature mortalities attributed to ambient O3 exposure were 156 000, 73 500, and 28 600 in 2019, respectively. Based on O3 observations in 334 Chinese cities, Zhang et al. (2021) suggested that the national all-cause, respiratory, and cardiovascular mortalities attributable to O3 were 270 000 to 390 000, 49 000 to 63 000, and 150 000 to 220 000 across 2015–2018, respectively, which are much higher than most existing results. Since the methodological approaches are largely similar and since we use the log-linear exposure–response function, we attribute the very high estimated mortalities mainly to the concentration–response threshold X0 assumed to be 0 in their study. A lower X0 means that O3 can cause more adverse impacts on human health even at low concentrations, thus leading to higher mortalities.
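The sensitivity to the threshold X0 can be illustrated with a commonly used log-linear health impact function, in which the attributable fraction is 1 − exp(−β·max(X − X0, 0)) and the attributable deaths are that fraction times the baseline mortality rate and the exposed population. This is a sketch under assumptions, not necessarily the exact formulation of this study: the relative risk, baseline rate, population, O3 level, and the non-zero threshold of 70 µg m−3 are all hypothetical.

```python
import math

def attributable_deaths(o3, threshold, beta, baseline_rate, population):
    """Deaths attributable to O3 under a log-linear exposure-response function.

    o3, threshold: concentrations in µg m-3 (no effect at or below the threshold)
    beta: exposure-response coefficient per µg m-3
    baseline_rate: baseline mortality rate (deaths per person per year)
    """
    excess = max(o3 - threshold, 0.0)
    attributable_fraction = 1.0 - math.exp(-beta * excess)
    return baseline_rate * population * attributable_fraction

# Hypothetical inputs, for illustration only.
beta = math.log(1.005) / 10.0   # from an assumed relative risk of 1.005 per 10 µg m-3
baseline_rate = 0.007           # assumed deaths per person per year
population = 1_000_000
o3 = 120.0                      # assumed MDA8 O3 in µg m-3

deaths_x0_zero = attributable_deaths(o3, 0.0, beta, baseline_rate, population)
deaths_x0_70 = attributable_deaths(o3, 70.0, beta, baseline_rate, population)
print(f"X0 = 0:  {deaths_x0_zero:.0f} deaths")
print(f"X0 = 70: {deaths_x0_70:.0f} deaths")
```

With X0 = 0 the whole concentration counts as excess exposure, so the estimate is necessarily larger than with any positive threshold, which is the mechanism invoked above to explain the higher mortalities in Zhang et al. (2021).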

4 Conclusions and discussion

In this study, to have a more accurate characterization of O3 spatiotemporal distribution and trends as well as its impacts on agriculture and human health, we used a hybrid approach to generate bias-corrected O3 data across China from 1981 to 2019. The hybrid approach helps improve O3 predictions by taking advantage of a chemical transport model and a ML algorithm as well as increasing availability of high-resolution environmental and meteorological data. In the model training process, we found that utilizing a higher-resolution meteorological dataset, albeit one that is not the same as the default CTM input meteorology, has high potential to enhance the performance of the hybrid model in reproducing observed O3 concentrations. The validation shows that the bias-corrected O3 can achieve a higher prediction accuracy than the GEOS-Chem-simulated O3 alone when compared with historical in situ measurements. Before being corrected, the GEOS-Chem-simulated O3 concentrations tend to be overestimated, which leads to higher crop yield losses and larger O3-induced mortalities. Noticeable differences in crop RYLs and mortality estimates highlight the advantages of using high-resolution O3 data to improve our understanding of long-term O3 impacts.

When examining the regional and national O3 trends, we found that MDA8 O3 concentrations have a perceptible increasing trend before the 2000s but fluctuate within a certain range with large interannual variabilities in more recent years. The large discrepancies in previous studies indicate that the regional and national O3 trends in China still suffer great uncertainties, particularly when different approaches are used to produce the O3 estimates. However, these studies using source-varied O3 fields consistently show the great interannual variabilities of O3 concentrations. Some insights can be obtained from existing findings, which need to be carefully considered when examining O3 trends and comparing them with existing results. First, given the large site differences, the calculation of observational O3 trends is very sensitive to the subsets of data from networks. Thus, great uncertainty could still exist even when using O3 observations from the same source depending on the chosen subsets of data. Second, different formats of O3 fields (e.g., site-based and gridded) could lead to large uncertainties in the O3 trend estimates. A higher resolution of gridded O3 estimates from CTMs and ML may reduce the differences between O3 observational results. Third, the calculated O3 trends are very sensitive to the chosen study period due to large interannual variability and seasonal differences. The changing meteorological conditions are the major factor causing the large interannual O3 variations, and reductions in the emissions of NOx, SO2, and PM also have complex effects on ground-level O3 concentrations (T. Wang et al., 2022). Liu and Wang (2020a) suggested that the meteorological impacts on O3 trends vary from region to region and year by year and that it could be comparable with or even larger than the impacts of changes in anthropogenic emissions.

Our estimated RYLs for maize and rice in China are generally consistent with existing studies, while the RYLs for soybean and wheat are lower than their estimates, mainly due to the differences in the metrics used, O3 fields, and crop sensitivity to ambient O3 concentrations. This suggests that planting O3-resistant cultivars could be an effective approach to increase total crop production to meet increasing food demands. Although other metrics (e.g., M7, M12, and W126) have also been used in some studies (Van Dingenen et al., 2009; Avnery et al., 2013; Y. Wang et al., 2022), exposure–yield relationships are not available for all four major crops specific to China. The estimated RYLs for crops could be largely biased using metrics with exposure–yield relationships developed for the US or Europe (Fig. S17), as they are inadequate to represent Asian crop genotypes and environmental conditions. Therefore, the region-specific exposure–yield relationships are highly recommended to be used in future studies estimating O3-induced crop reduction, especially for regional studies.

In recent years, although existing studies have made efforts to quantify the O3-related health impacts in China, only a few studies have focused on nationwide acute O3 health burden assessment, particularly for assessment over multiple decades (Maji and Namdeo, 2021; Sahu et al., 2021; Zhang et al., 2021, 2022a). There are some remaining issues to be addressed regarding O3 health impacts. For instance, the existence of a “safe” threshold of O3 levels is still debated. A recent study reported that no consistent evidence was found for a threshold in the O3 mortality concentration–response relationship in seven cities of Jiangsu Province, China, during 2013–2014 (Chen et al., 2017; Maji and Namdeo, 2021). Given the importance of the threshold assumption in assessing health effects of air pollution, more studies are needed to determine a most likely threshold for O3 mortality association in the future. Moreover, the multiple temporal O3 metrics (e.g., 1 h maximum and daytime average O3 concentrations) have also proved to play an important role in the variability of estimated health effects, even though standard ratios are used to convert among multiple metrics (Anderson and Bell, 2010). In addition to the uncertainties from varying methodologies, interpretation of the O3 epidemiological impact is also constrained by the variability in geographical, seasonal, and demographic characteristics (P. Yin et al., 2017). Liu et al. (2013) suggested that associations between O3 and mortality appeared to be more evident during the cool season than in the warm season and stronger in the oldest age group and among those with less education. The effect modification by population susceptibility and the confounding effects of concomitant exposures (e.g., temperature, particulate matter) should be further considered in future work.

A major limitation of our study lies in the uncertain predictions in regions where monitoring data are scarce (e.g., the western half of China). The monitoring sites are sparsely distributed in those areas, which may fail to capture the accurate association between O3 concentrations and various predictors there, especially considering that the ML algorithm has likely over-emphasized such relationships in the data-intensive eastern regions. Second, the land use data were prescribed in 2013 due to the limited availability of data, and this may neglect some major land use changes in China over the past few decades. Although the land use data were found by the ML algorithm to contribute little to the overall model, more detailed land use data are expected to further increase model accuracy. In addition, although concentration-based metrics are easy to calculate and ensured to be scientifically sound in some experiments (Fuhrer et al., 1997; Mills et al., 2007), they do not consider the active responses of plant ecophysiology to ambient climatic and environmental changes and are thus likely inadequate for examining yield losses in a future climate and atmospheric environment (Tai et al., 2021). Therefore, flux-based metrics are recommended in future studies to better understand the long-term evolution of crop losses over China (Feng et al., 2012; Zhang et al., 2017; Tai et al., 2021; Pleijel et al., 2022), wherein more crop- and region-specific experiments and trials are needed to acquire appropriate metrics and exposure–yield response functions and calibrate the process-based crop model.

Despite these limitations, our study represents important progress in evaluating the long-term, multidecadal health burdens and agricultural losses resulting from O3 pollution in China. Across the four major regions, BTHs experience the highest RYLs for major crops due to elevated O3. On the other hand, the YRD and PRD regions have greater human health losses primarily due to their large population size. The results can provide important references for governments and agencies when making related national or regional policies to meet imperative environmental, health, and food security demands. To effectively address the impacts of O3, collaborative efforts can be made in multifaceted aspects: (1) implement stricter regulations and specific emission control measures for major ozone precursors from industrial, vehicular, and agricultural sources that account for region-specific chemical, meteorological, and terrestrial conditions; (2) encourage the adoption of more sustainable and adaptive agricultural practices that minimize O3 exposure and its damage on crops (e.g., cultivating O3-resistant crop varieties); (3) improve short-range O3 forecast capabilities of regional models, especially with the enhancement of artificial intelligence technology, which may enable better early-warning systems to prepare the public and farmers for O3 episodes; and (4) raise public awareness via promotional campaigns and educational programs to inform individuals, communities, and farmers about the risks associated with O3. It is important for policymakers to consider these suggestions and act to effectively mitigate the negative impacts of O3.

Data availability

Model output data used for analysis and plotting are available at the following open-access online repository: (Mao, 2023).


The supplement related to this article is available online at:

Author contributions

APKT designed the study and supervised the writing of the paper. JM conducted model simulations, analyzed the results, and wrote the draft with the assistance of TGY and KTC. DHYY performed the GEOS-Chem simulations. ZZF assisted in the interpretation of the results. All authors contributed to the discussion and improvement of the paper.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Atmospheric Chemistry and Physics. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.


Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.


Acknowledgements

This work was supported by the National Natural Science Foundation of China (NSFC)/Research Grants Council (RGC) Joint Research Scheme (reference nos. N_CUHK440/20 and 42061160479) awarded to Amos Pui Kuen Tai and Zhaozhong Feng.

Financial support

This research has been supported by the National Natural Science Foundation of China (NSFC)/Research Grants Council (RGC) Joint Research Scheme (reference nos. N_CUHK440/20 and 42061160479) awarded to Amos P. K. Tai and Zhaozhong Feng.

Review statement

This paper was edited by Leiming Zhang and reviewed by two anonymous referees.

References


Abdullah, S., Ismail, M., and Fong, S. Y.: Multiple Linear Regression (MLR) models for long term PM10 concentration forecasting during different monsoon seasons, J. Sustain. Sci. Manage., 12, 60–69, 2017. 

Accadia, C., Mariani, S., Casaioli, M., Lavagnini, A., and Speranza, A.: Sensitivity of Precipitation Forecast Skill Scores to Bilinear Interpolation and a Simple Nearest-Neighbor Average Method on High-Resolution Verification Grids, Weather Forecast., 18, 918–932, 2003.

Anderson, G. B. and Bell, M. L.: Does one size fit all? The suitability of standard ozone exposure metric conversion ratios and implications for epidemiology, J. Expos. Sci. Environ. Epidemiol., 20, 2–11, 2010.

Avnery, S., Mauzerall, D. L., and Fiore, A. M.: Increasing global agricultural production by reducing ozone damages via methane emission controls and ozone-resistant cultivar selection, Global Change Biol., 19, 1285–1299, 2013.

Bey, I., Jacob, D. J., Yantosca, R. M., Logan, J. A., Field, B. D., Fiore, A. M., Li, Q., Liu, H. Y., Mickley, L. J., and Schultz, M. G.: Global modeling of tropospheric chemistry with assimilated meteorology: Model description and evaluation, J. Geophys. Res.-Atmos., 106, 23073–23095, 2001.

Bi, J., Knowland, K. E., Keller, C. A., and Liu, Y.: Combining Machine Learning and Numerical Simulation for High-Resolution PM2.5 Concentration Forecast, Environ. Sci. Technol., 56, 1544–1556, 2022.

Chen, K., Zhou, L., Chen, X., Bi, J., and Kinney, P. L.: Acute effect of ozone exposure on daily mortality in seven cities of Jiangsu Province, China: No clear evidence for threshold, Environ. Res., 155, 235–241, 2017.

Chen, T. and Guestrin, C.: XGBoost: A Scalable Tree Boosting System, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 13 August 2016, San Francisco, California, USA, 785–794, 2016.

Clifton, O. E., Fiore, A. M., Massman, W. J., Baublitz, C. B., Coyle, M., Emberson, L., Fares, S., Farmer, D. K., Gentine, P., Gerosa, G., Guenther, A. B., Helmig, D., Lombardozzi, D. L., Munger, J. W., Patton, E. G., Pusede, S. E., Schwede, D. B., Silva, S. J., Sorgel, M., Steiner, A. L., and Tai, A. P. K.: Dry Deposition of Ozone over Land: Processes, Measurement, and Modeling, Rev. Geophys., 58, e2019RG000670, 2020.

Dang, R., Liao, H., and Fu, Y.: Quantifying the anthropogenic and meteorological influences on summertime surface ozone in China over 2012–2017, Sci. Total Environ., 754, 142394, 2021.

Dedoussi, I. C., Eastham, S. D., Monier, E., and Barrett, S. R. H.: Premature mortality related to United States cross-state air pollution, Nature, 578, 261–265, 2020.

Di, Q., Rowland, S., Koutrakis, P., and Schwartz, J.: A hybrid model for spatially and temporally resolved ozone exposures in the continental United States, J. Air Waste Manage. Assoc., 67, 39–52, 2017.

Ding, A., Wang, T., Zhao, M., Wang, T., and Li, Z.: Simulation of sea-land breezes and a discussion of their implications on the transport of air pollution during a multi-day ozone episode in the Pearl River Delta of China, Atmos. Environ., 38, 6737–6750, 2004.

Emberson, L. D., Büker, P., Ashmore, M. R., Mills, G., Jackson, L. S., Agrawal, M., Atikuzzaman, M. D., Cinderby, S., Engardt, M., Jamir, C., Kobayashi, K., Oanh, N. T. K., Quadir, Q. F., and Wahid, A.: A comparison of North American and Asian exposure–response data for ozone effects on crop yields, Atmos. Environ., 43, 1945–1953, 2009.

Feng, Z., Tang, H., Uddling, J., Pleijel, H., Kobayashi, K., Zhu, J., Oue, H., and Guo, W.: A stomatal ozone flux–response relationship to assess ozone-induced yield loss of winter wheat in subtropical China, Environ. Pollut., 164, 16–23, 2012.

Feng, Z., Calatayud, V., Zhu, J., and Kobayashi, K.: Ozone exposure- and flux-based response relationships with photosynthesis of winter wheat under fully open air condition, Sci. Total Environ., 619–620, 1538–1544, 2018.

Feng, Z., De Marco, A., Anav, A., Gualtieri, M., Sicard, P., Tian, H., Fornasier, F., Tao, F., Guo, A., and Paoletti, E.: Economic losses due to ozone impacts on human health, forest productivity and crop yield across China, Environ. Int., 131, 104966, 2019.

Feng, Z., Xu, Y., Kobayashi, K., Dai, L., Zhang, T., Agathokleous, E., Calatayud, V., Paoletti, E., Mukherjee, A., Agrawal, M., Park, R. J., Oak, Y. J., and Yue, X.: Ozone pollution threatens the production of major staple crops in East Asia, Nat. Food, 3, 47–56, 2022.

Fiore, A. M., Horowitz, L. W., Purves, D. W., Levy II, H., Evans, M. J., Wang, Y., Li, Q., and Yantosca, R. M.: Evaluating the contribution of changes in isoprene emissions to surface ozone trends over the eastern United States, J. Geophys. Res.-Atmos., 110, D12303, 2005.

Fu, Y. and Tai, A. P. K.: Impact of climate and land cover changes on tropospheric ozone air quality and public health in East Asia between 1980 and 2010, Atmos. Chem. Phys., 15, 10093–10106, 2015.

Fuhrer, J., Skärby, L., and Ashmore, M. R.: Critical levels for ozone effects on vegetation in Europe, Environ. Pollut., 97, 91–106, 1997.

Fusco, A. C. and Logan, J. A.: Analysis of 1970–1995 trends in tropospheric ozone at Northern Hemisphere midlatitudes with the GEOS-CHEM model, J. Geophys. Res.-Atmos., 108, 4449, 2003.

Gong, C. and Liao, H.: A typical weather pattern for ozone pollution events in North China, Atmos. Chem. Phys., 19, 13725–13740, 2019.

Gong, C., Yue, X., Liao, H., and Ma, Y.: A humidity-based exposure index representing ozone damage effects on vegetation, Environ. Res. Lett., 16, 044030, 2021.

Guenther, A. B., Jiang, X., Heald, C. L., Sakulyanontvittaya, T., Duhl, T., Emmons, L. K., and Wang, X.: The Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN2.1): an extended and updated framework for modeling biogenic emissions, Geosci. Model Dev., 5, 1471–1492, 2012.

Han, H., Liu, J., Shu, L., Wang, T., and Yuan, H.: Local and synoptic meteorological influences on daily variability in summertime surface ozone in eastern China, Atmos. Chem. Phys., 20, 203–222, 2020.

He, J., Wang, Y., Hao, J., Shen, L., and Wang, L.: Variations of surface O3 in August at a rural site near Shanghai: influences from the West Pacific subtropical high and anthropogenic emissions, Environ. Sci. Pollut. Res., 19, 4016–4029, 2012.

Hoesly, R. M., Smith, S. J., Feng, L., Klimont, Z., Janssens-Maenhout, G., Pitkanen, T., Seibert, J. J., Vu, L., Andres, R. J., Bolt, R. M., Bond, T. C., Dawidowski, L., Kholod, N., Kurokawa, J. I., Li, M., Liu, L., Lu, Z., Moura, M. C. P., O'Rourke, P. R., and Zhang, Q.: Historical (1750–2014) anthropogenic emissions of reactive gases and aerosols from the Community Emissions Data System (CEDS), Geosci. Model Dev., 11, 369–408, 2018.

Hu, X., Belle, J. H., Meng, X., Wildani, A., Waller, L. A., Strickland, M. J., and Liu, Y.: Estimating PM2.5 Concentrations in the Conterminous United States Using the Random Forest Approach, Environ. Sci. Technol., 51, 6936–6944, 2017.

Irrgang, C., Boers, N., Sonnewald, M., Barnes, E. A., Kadow, C., Staneva, J., and Saynisch-Wagner, J.: Towards neural Earth system modelling by integrating artificial intelligence in Earth system science, Nat. Mach. Intell., 3, 667–674, 2021.

Ivatt, P. D. and Evans, M. J.: Improving the prediction of an atmospheric chemistry transport model using gradient-boosted regression trees, Atmos. Chem. Phys., 20, 8063–8082, 2020.

Jacob, D. J. and Winner, D. A.: Effect of climate change on air quality, Atmos. Environ., 43, 51–63, 2009.

Jiang, Y. C., Zhao, T. L., Liu, J., Xu, X. D., Tan, C. H., Cheng, X. H., Bi, X. Y., Gan, J. B., You, J. F., and Zhao, S. Z.: Why does surface ozone peak before a typhoon landing in southeast China?, Atmos. Chem. Phys., 15, 13331–13338, 2015.

Kawase, H., Nagashima, T., Sudo, K., and Nozawa, T.: Future changes in tropospheric ozone under Representative Concentration Pathways (RCPs), Geophys. Res. Lett., 38, L05801, 2011.

Lelieveld, J., Evans, J. S., Fnais, M., Giannadaki, D., and Pozzer, A.: The contribution of outdoor air pollution sources to premature mortality on a global scale, Nature, 525, 367–371, 2015.

Li, K., Jacob, D. J., Shen, L., Lu, X., De Smedt, I., and Liao, H.: Increases in surface ozone pollution in China from 2013 to 2019: anthropogenic and meteorological influences, Atmos. Chem. Phys., 20, 11423–11433, 2020.

Li, K., Jacob, D. J., Liao, H., Qiu, Y., Shen, L., Zhai, S., Bates, K. H., Sulprizio, M. P., Song, S., Lu, X., Zhang, Q., Zheng, B., Zhang, Y., Zhang, J., Lee, H. C., and Kuk, S. K.: Ozone pollution in the North China Plain spreading into the late-winter haze season, P. Natl. Acad. Sci. USA, 118, e2015797118, 2021.

Li, Y., Zhao, X., Deng, X., and Gao, J.: The impact of peripheral circulation characteristics of typhoon on sustained ozone episodes over the Pearl River Delta region, China, Atmos. Chem. Phys., 22, 3861–3873, 2022.

Lin, J. T., Youn, D., Liang, X. Z., and Wuebbles, D. J.: Global model simulation of summertime U.S. ozone diurnal cycle and its sensitivity to PBL mixing, spatial resolution, and emissions, Atmos. Environ., 42, 8470–8483, 2008.

Liu, H., Liu, S., Xue, B., Lv, Z., Meng, Z., Yang, X., Xue, T., Yu, Q., and He, K.: Ground-level ozone pollution and its health impacts in China, Atmos. Environ., 173, 223–230, 2018.

Liu, J., Wang, L., Li, M., Liao, Z., Sun, Y., Song, T., Gao, W., Wang, Y., Li, Y., Ji, D., Hu, B., Kerminen, V. M., Wang, Y., and Kulmala, M.: Quantifying the impact of synoptic circulation patterns on ozone variability in northern China from April to October 2013–2017, Atmos. Chem. Phys., 19, 14477–14492, 2019.

Liu, R., Ma, Z., Liu, Y., Shao, Y., Zhao, W., and Bi, J.: Spatiotemporal distributions of surface ozone levels in China from 2005 to 2017: A machine learning approach, Environ. Int., 142, 105823, 2020.

Liu, T., Li, T. T., Zhang, Y. H., Xu, Y. J., Lao, X. Q., Rutherford, S., Chu, C., Luo, Y., Zhu, Q., Xu, X. J., Xie, H. Y., Liu, Z. R., and Ma, W. J.: The short-term effect of ambient ozone on mortality is modified by temperature in Guangzhou, China, Atmos. Environ., 76, 59–67, 2013.

Liu, Y. and Wang, T.: Worsening urban ozone pollution in China from 2013 to 2017 – Part 1: The complex and varying roles of meteorology, Atmos. Chem. Phys., 20, 6305–6321, 2020a.

Liu, Y. and Wang, T.: Worsening urban ozone pollution in China from 2013 to 2017 – Part 2: The effects of emission changes and implications for multi-pollutant control, Atmos. Chem. Phys., 20, 6323–6337, 2020b.

Lu, C., Mao, J., Wang, L., Guan, Z., Zhao, G., and Li, M.: An unusual high ozone event over the North and Northeast China during the record-breaking summer in 2018, J. Environ. Sci. (China), 104, 264–276, 2021.

Lu, X., Hong, J., Zhang, L., Cooper, O. R., Schultz, M. G., Xu, X., Wang, T., Gao, M., Zhao, Y., and Zhang, Y.: Severe Surface Ozone Pollution in China: A Global Perspective, Environ. Sci. Technol. Lett., 5, 487–494, 2018.

Lu, X., Zhang, L., Chen, Y., Zhou, M., Zheng, B., Li, K., Liu, Y., Lin, J., Fu, T.-M., and Zhang, Q.: Exploring 2016–2017 surface ozone pollution over China: source contributions and meteorological influences, Atmos. Chem. Phys., 19, 8339–8361, 2019.

Ma, R., Ban, J., Wang, Q., Zhang, Y., Yang, Y., He, M. Z., Li, S., Shi, W., and Li, T.: Random forest model based fine scale spatiotemporal O3 trends in the Beijing-Tianjin-Hebei region in China, 2010 to 2017, Environ. Pollut., 276, 116635, 2021.

Madaniyazi, L., Nagashima, T., Guo, Y., Pan, X., and Tong, S.: Projecting ozone-related mortality in East China, Environ. Int., 92–93, 165–172, 2016.

Maji, K. J. and Namdeo, A.: Continuous increases of surface ozone and associated premature mortality growth in China during 2015–2019, Environ. Pollut., 269, 116183, 2021.

Mao, J.: ML simulated hourly and MDA8 ozone in China from 1981 to 2019, OneDrive [data set] (last access: 24 December 2023), 2023.

Mao, J., Wang, L., Lu, C., Liu, J., Li, M., Tang, G., Ji, D., Zhang, N., and Wang, Y.: Meteorological mechanism for a large-scale persistent severe ozone pollution event over eastern China in 2017, J. Environ. Sci. (China), 92, 187–199, 2020.

Mills, G., Buse, A., Gimeno, B., Bermejo, V., Holland, M., Emberson, L., and Pleijel, H.: A synthesis of AOT40-based response functions and critical levels of ozone for agricultural and horticultural crops, Atmos. Environ., 41, 2630–2643, 2007.

Mills, G., Sharps, K., Simpson, D., Pleijel, H., Frei, M., Burkey, K., Emberson, L., Uddling, J., Broberg, M., Feng, Z., Kobayashi, K., and Agrawal, M.: Closing the global ozone yield gap: Quantification and cobenefits for multistress tolerance, Global Change Biol., 24, 4869–4893, 2018.

Monfreda, C., Ramankutty, N., and Foley, J. A.: Farming the planet: 2. Geographic distribution of crop areas, yields, physiological types, and net primary production in the year 2000, Global Biogeochem. Cy., 22, GB1022, 2008.

Moustris, K. P., Nastos, P. T., Larissi, I. K., and Paliatsos, A. G.: Application of Multiple Linear Regression Models and Artificial Neural Networks on the Surface Ozone Forecast in the Greater Athens Area, Greece, Adv. Meteorol., 2012, 894714, 2012.

Mukherjee, A., Yadav, D. S., Agrawal, S. B., and Agrawal, M.: Ozone a persistent challenge to food security in India: Current status and policy implications, Curr. Opin. Environ. Sci. Health, 19, 100220, 2021.

Pleijel, H., Danielsson, H., and Broberg, M. C.: Benefits of the Phytotoxic Ozone Dose (POD) index in dose-response functions for wheat yield loss, Atmos. Environ., 268, 118797, 2022.

Sacks, W. J., Deryng, D., Foley, J. A., and Ramankutty, N.: Crop planting dates: an analysis of global patterns, Global Ecol. Biogeogr., 19, 607–620, 2010.

Sahu, S. K., Liu, S., Liu, S., Ding, D., and Xing, J.: Ozone pollution in China: Background and transboundary contributions to ozone concentration & related health effects across the country, Sci. Total Environ., 761, 144131, 2021.

Shang, Y., Sun, Z., Cao, J., Wang, X., Zhong, L., Bi, X., Li, H., Liu, W., Zhu, T., and Huang, W.: Systematic review of Chinese studies of short-term exposure to air pollution and daily mortality, Environ. Int., 54, 100–111, 2013.

Sindelarova, K., Granier, C., Bouarar, I., Guenther, A., Tilmes, S., Stavrakou, T., Müller, J. F., Kuhn, U., Stefani, P., and Knorr, W.: Global data set of biogenic VOC emissions calculated by the MEGAN model over the last 30 years, Atmos. Chem. Phys., 14, 9317–9341, 2014.

Tai, A. P. K., Mickley, L. J., Heald, C. L., and Wu, S.: Effect of CO2 inhibition on biogenic isoprene emission: Implications for air quality under 2000 to 2050 changes in climate, vegetation, and land use, Geophys. Res. Lett., 40, 3479–3483, 2013.

Tai, A. P. K., Sadiq, M., Pang, J. Y. S., Yung, D. H. Y., and Feng, Z.: Impacts of Surface Ozone Pollution on Global Crop Yields: Comparing Different Ozone Exposure Metrics and Incorporating Co-effects of CO2, Front. Sustain. Food Syst., 5, 534616, 2021.

Travis, K. R., Jacob, D. J., Fisher, J. A., Kim, P. S., Marais, E. A., Zhu, L., Yu, K., Miller, C. C., Yantosca, R. M., Sulprizio, M. P., Thompson, A. M., Wennberg, P. O., Crounse, J. D., St. Clair, J. M., Cohen, R. C., Laughner, J. L., Dibb, J. E., Hall, S. R., Ullmann, K., Wolfe, G. M., Pollack, I. B., Peischl, J., Neuman, J. A., and Zhou, X.: Why do models overestimate surface ozone in the Southeast United States?, Atmos. Chem. Phys., 16, 13561–13577, 2016.

van der Werf, G. R., Randerson, J. T., Giglio, L., van Leeuwen, T. T., Chen, Y., Rogers, B. M., Mu, M., van Marle, M. J. E., Morton, D. C., Collatz, G. J., Yokelson, R. J., and Kasibhatla, P. S.: Global fire emissions estimates during 1997–2016, Earth Syst. Sci. Data, 9, 697–720, 2017.

Van Dingenen, R., Dentener, F. J., Raes, F., Krol, M. C., Emberson, L., and Cofala, J.: The global impact of ozone on agricultural crop yields under current and future air quality legislation, Atmos. Environ., 43, 604–618, 2009.

Wang, H., Lyu, X., Guo, H., Wang, Y., Zou, S., Ling, Z., Wang, X., Jiang, F., Zeren, Y., Pan, W., Huang, X., and Shen, J.: Ozone pollution around a coastal region of South China Sea: interaction between marine and continental air, Atmos. Chem. Phys., 18, 4277–4295, 2018.

Wang, H., Lu, X., Jacob, D. J., Cooper, O. R., Chang, K. L., Li, K., Gao, M., Liu, Y., Sheng, B., Wu, K., Wu, T., Zhang, J., Sauvage, B., Nédélec, P., Blot, R., and Fan, S.: Global tropospheric ozone trends, attributions, and radiative impacts in 1995–2017: an integrated analysis using aircraft (IAGOS) observations, ozonesonde, and multi-decadal chemical model simulations, Atmos. Chem. Phys., 22, 13753–13782, 2022.

Wang, L., Tai, A. P. K., Tam, C.-Y., Sadiq, M., Wang, P., and Cheung, K. K. W.: Impacts of future land use and land cover change on mid-21st-century surface ozone air quality: distinguishing between the biogeophysical and biogeochemical effects, Atmos. Chem. Phys., 20, 11349–11369, 2020.

Wang, L., Wang, S., Zhou, Y., Liu, W., Hou, Y., Zhu, J., and Wang, F.: Mapping population density in China between 1990 and 2010 using remote sensing, Remote Sens. Environ., 210, 269–281, 2018.

Wang, T., Xue, L., Feng, Z., Dai, J., Zhang, Y., and Tan, Y.: Ground-level ozone pollution in China: a synthesis of recent findings on influencing factors and impacts, Environ. Res. Lett., 17, 063003, 2022.

Wang, X., Gong, G., Li, N., and Qiu, S.: Detection Analysis of Epileptic EEG Using a Novel Random Forest Model Combined With Grid Search Optimization, Front. Hum. Neurosci., 13, 52, 2019.

Wang, Y., Zhang, Y., Hao, J., and Luo, M.: Seasonal and spatial variability of surface ozone over China: contributions from background and domestic pollution, Atmos. Chem. Phys., 11, 3511–3525, 2011.

Wang, Y., Wild, O., Ashworth, K., Chen, X., Wu, Q., Qi, Y., and Wang, Z.: Reductions in crop yields across China from elevated ozone, Environ. Pollut., 292, 118218, 2022.

Wei, J., Li, Z., Li, K., Dickerson, R. R., Pinker, R. T., Wang, J., Liu, X., Sun, L., Xue, W., and Cribb, M.: Full-coverage mapping and spatiotemporal variations of ground-level ozone (O3) pollution from 2013 to 2020 across China, Remote Sens. Environ., 270, 112775, 2022.

Wesely, M. L.: Parameterization of surface resistances to gaseous dry deposition in regional-scale numerical models, Atmos. Environ., 23, 1293–1304, 1989.

Wild, O. and Prather, M. J.: Global tropospheric ozone modeling: Quantifying errors due to grid resolution, J. Geophys. Res.-Atmos., 111, D11305, 2006.

Xiao, Q., Chang, H. H., Geng, G., and Liu, Y.: An Ensemble Machine-Learning Model To Predict Historical PM2.5 Concentrations in China from Satellite Data, Environ. Sci. Technol., 52, 13260–13269, 2018.

Yang, L., Luo, H., Yuan, Z., Zheng, J., Huang, Z., Li, C., Lin, X., Louie, P. K. K., Chen, D., and Bian, Y.: Quantitative impacts of meteorology and precursor emission changes on the long-term trend of ambient ozone over the Pearl River Delta, China, and implications for ozone control strategy, Atmos. Chem. Phys., 19, 12901–12916, 2019.

Yin, H., Pizzol, M., and Xu, L.: External costs of PM2.5 pollution in Beijing, China: Uncertainty analysis of multiple health impacts and costs, Environ. Pollut., 226, 356–369, 2017.

Yin, P., Chen, R., Wang, L., Meng, X., Liu, C., Niu, Y., Lin, Z., Liu, Y., Liu, J., Qi, J., You, J., Zhou, M., and Kan, H.: Ambient Ozone Pollution and Daily Mortality: A Nationwide Study in 272 Chinese Cities, Environ. Health Perspect., 125, 117006, 2017.

Yin, Z. and Ma, X.: Meteorological conditions contributed to changes in dominant patterns of summer ozone pollution in Eastern China, Environ. Res. Lett., 15, 124062, 2020.

Zhan, Y., Luo, Y., Deng, X., Chen, H., Grieneisen, M. L., Shen, X., Zhu, L., and Zhang, M.: Spatiotemporal prediction of continuous daily PM2.5 concentrations across China using a spatially explicit machine learning algorithm, Atmos. Environ., 155, 129–139, 2017.

Zhang, W., Feng, Z., Wang, X., Liu, X., and Hu, E.: Quantification of ozone exposure- and stomatal uptake-yield response relationships for soybean in Northeast China, Sci. Total Environ., 599–600, 710–720, 2017.

Zhang, X., Osei, F., Stein, A., Cheng, C., and Maji, K. J.: Temporal and spatial evolution of short-term exposure to ozone pollution: Its health impacts in China based on a meta-analysis, J. Clean. Product., 373, 133938, 2022a.

Zhang, X., Yan, B., Zhou, Y., Osei, F., Li, Y., Zhao, H., Cheng, C., and Stein, A.: Short-term health impacts related to ozone in China before and after implementation of policy measures: A systematic review and meta-analysis, Sci. Total Environ., 847, 157588, 2022b.

Zhang, Y., Wang, Y., Gao, M., Ma, Q., Zhao, J., Zhang, R., Wang, Q., and Huang, L.: A Predictive Data Feature Exploration-Based Air Quality Prediction Approach, IEEE Access, 7, 30732–30743, 2019.

Zhang, Z., Yao, M., Wu, W., Zhao, X., and Zhang, J.: Spatiotemporal assessment of health burden and economic losses attributable to short-term exposure to ground-level ozone during 2015–2018 in China, BMC Publ. Health, 21, 1069, 2021.

Zhao, Z. and Wang, Y.: Influence of the West Pacific subtropical high on surface ozone daily variability in summertime over eastern China, Atmos. Environ., 170, 197–204, 2017.

Zhou, D., Ding, A., Mao, H., Fu, C., Wang, T., Chan, L. Y., Ding, K., Zhang, Y., Liu, J., Lu, A., and Hao, N.: Impacts of the East Asian monsoon on lower tropospheric ozone over coastal South China, Environ. Res. Lett., 8, 044011, 2013.

Short summary
Surface ozone (O3) is well known for posing great threats to both human health and agriculture worldwide. However, a multidecadal assessment of the impacts of O3 on public health and agriculture in China is lacking because long-term O3 observations are insufficient. We used a hybrid approach combining a chemical transport model and machine learning to provide a robust dataset of O3 concentrations over the past 4 decades in China, thereby filling the gap in long-term O3 trend and impact assessment in China.