Ground-level gaseous pollutants (NO 2 , SO 2 , and CO) in China: daily seamless mapping and spatiotemporal variations

Gaseous pollutants at the ground level seriously threaten the urban air quality environment and public health. There are few estimates of gaseous pollutants that are spatially and temporally resolved and continuous across China. This study takes advantage of big data and artificial-intelligence technologies to generate seamless daily maps of three major ambient pollutant gases, i


Introduction
Air pollution has been a major environmental concern, affecting human health, weather, and climate (Anenberg et al., 2022; Kan et al., 2012; GBD 2019 Risk Factors Collaborators, 2020; Orellano et al., 2020), thus drawing worldwide attention. The sources of air pollution are complex. They include natural sources such as wildfires and anthropogenic emissions, including pollutants discharged from industrial production (e.g., smoke or dust, sulfur oxides, nitrogen oxides (NO x ), and volatile organic compounds (VOCs)), hazardous substances released from burning coal during heating seasons (e.g., dust, sulfur dioxide (SO 2 ), and carbon monoxide (CO)), and waste gases (e.g., CO, SO 2 , and NO x ) generated by transportation, especially in big cities.
Among various air pollutants, the following have been most widely recognized: particulate matter with diameters smaller than 2.5 and 10 µm (PM 2.5 and PM 10 ) and gaseous pollutants (e.g., ozone (O 3 ), nitrogen dioxide (NO 2 ), SO 2 , and CO, among others). Many countries have built ground-based networks to monitor a variety of conventional pollutants in real time. China has experienced serious ambient air pollution for a long time, prompting the establishment of a large-scale air quality monitoring network (MEE, 2018a). Over the years, much effort has been made to model different species of air pollutants. Many studies that focused on particulate matter in China have been carried out (Gao et al., 2022; Yang et al., 2022; Zhang et al., 2018). The global COVID-19 pandemic has motivated many attempts to estimate surface NO 2 concentrations from satellite-retrieved tropospheric NO 2 products (Tian et al., 2020; WHO, 2020), e.g., from the Ozone Monitoring Instrument (OMI) on board the NASA Aura spacecraft and the TROPOspheric Monitoring Instrument (TROPOMI) on board the Copernicus Sentinel-5 Precursor satellite, adopting different statistical regression (Chi et al., 2021; Qin et al., 2017; Zhang et al., 2018) and artificial intelligence (Chi et al., 2022; Dou et al., 2021; Liu, 2021; Wang et al., 2021; Zhan et al., 2018) models. By comparison, surface SO 2 and CO in China are less studied, limited by weaker signals and a lack of good-quality satellite tropospheric products (W. Liu et al., 2019; Wang et al., 2021). Such studies still face further challenges, e.g., satellite data gaps and missing values that seriously limit their application, and the neglect of spatiotemporal differences in air pollution in the modeling process. In addition, most previous studies focused on a single species or a few species over relatively short observational periods.
In view of the above problems, the purpose of this paper is to reconstruct daily concentrations of three ambient gaseous pollutants (i.e., NO 2 , SO 2 , and CO) in China. To this end, relying on the dense national ground-based observation network and big data, including satellite remote sensing products, meteorological reanalysis, chemical model simulations, and emission inventories, we map the three pollutant gases seamlessly (100 % spatial coverage) on a daily basis at a uniform spatial resolution of 10 km since 2013 in China. Estimates were made using an extended and powerful machine-learning model incorporating spatiotemporal information, i.e., Space-Time Extra-Trees. Natural and anthropogenic effects on air pollution, including their physical mechanisms and chemical reactions, were accounted for in the modeling. Using this dataset, spatiotemporal variations of the gaseous pollutants, the impacts of environmental protection policies and the COVID-19 pandemic, and population risk exposure to gaseous pollution are investigated.
To date, we have combined the advantages of artificial intelligence and big data to construct a virtually complete set of major air quality parameters concerning both particulate and gaseous pollutants over a long period of time across China, including PM 1 (1 km, 2000-present; Wei et al., 2019), PM 2.5 (1 km, 2000-present), PM 10 (1 km, 2000-present), O 3 (10 km, 1979-present; L. He et al., 2022), and NO 2 (1 km, 2019-present), serving environmental, public health, economic, and other related research. This study continues our previous work, adding two new species, SO 2 and CO, for the first time and extending the NO 2 data record back to 2013. Instead of addressing a single pollutant, this study deals with all three gaseous pollutants at compatible quality over the same period with the same spatial coverage and resolution. Because few public datasets of these three gaseous pollutants offer such spatiotemporal coverage for the whole of China, this dataset is highly valuable for studying their variations, relative proportions, and attribution of emission sources, as well as the diverse and joint effects of different pollutant species on public health.

Materials and methods
2.1 Big data
2.1.1 Ground-based measurements
Hourly measurements of ground-level NO 2 , SO 2 , and CO concentrations from ∼ 1600 reference-grade ground-based monitoring stations (Fig. 1) collected from the China National Environmental Monitoring Centre (CNEMC) network were employed in the study. This network includes urban assessing stations, regional assessing stations, background stations, source impact stations, and traffic stations, set up in a reasonable overall layout that covers industrial (∼ 14 %), urban (∼ 31 %), suburban (∼ 39 %), and rural (∼ 16 %) areas to improve the spatial representativeness, continuity, and comparability of observations (HJ 664-2013; MEE, 2013a). NO 2 is measured by chemiluminescence and differential optical absorption spectroscopy (DOAS), SO 2 by ultraviolet fluorescence and DOAS, and CO by non-dispersive infrared spectroscopy and gas filter correlation infrared spectroscopy. These measurements have been fully validated and have the same average error of indication of ±2 % of full scale (FS) for the three gaseous pollutants considered here, with additional quality control checks such as zero and span noise and zero and span drift (HJ 654-2013 and HJ 655-2013; MEE, 2013b, c). They have also been used as ground truth in almost all air pollutant modeling studies in China. All stations use the same technique to measure each gas routinely and continuously for 24 h a day at about sea level without time series gaps. However, the reference state (i.e., observational conditions like temperature and pressure) changed from the standard condition (i.e., 273 K and 1013 hPa) to the room condition (i.e., 298 K and 1013 hPa) on 31 August 2018 (MEE, 2018a). We thus first converted observations of the three gaseous pollutants after this date to the uniform standard condition for consistency. Here, daily values for each air pollutant were averaged from at least 30 % of valid hourly measurements at each station in each year from 2013 to 2020.
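The reference-state conversion and daily averaging described above can be illustrated with a minimal Python sketch. It assumes that the conversion is a pure temperature scaling at constant pressure (mass per volume proportional to 1/T) and that the 30 % validity threshold is applied to the 24 hourly slots of each day; the function names `to_standard_state` and `daily_mean` are hypothetical and not taken from the study.

```python
import math

T_STANDARD_K = 273.0  # standard reference temperature quoted in the text
T_ROOM_K = 298.0      # room reference temperature quoted in the text


def to_standard_state(conc_room, t_room=T_ROOM_K, t_std=T_STANDARD_K):
    """Convert a mass concentration reported at t_room to t_std (same pressure)."""
    # Assumption: fixed mixing ratio and pressure, so mass per volume scales as 1/T.
    return conc_room * t_room / t_std


def daily_mean(hourly, min_valid_fraction=0.3):
    """Average hourly values for one day; None or NaN marks a missing hour."""
    valid = [v for v in hourly if v is not None and not math.isnan(v)]
    # Assumption: the validity threshold is a fraction of the hourly slots in the day.
    if not valid or len(valid) < min_valid_fraction * len(hourly):
        return None  # too few valid hours to form a daily value
    return sum(valid) / len(valid)
```

Under these assumptions a value reported at 298 K is scaled up by 298/273 (about 9 %) when expressed at 273 K, and a day with fewer than 8 valid hours out of 24 yields no daily value.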

Main predictors
A new daily tropospheric NO 2 dataset at a horizontal resolution of 0.25° × 0.25° in China was employed, created using a developed framework integrating OMI Aura Quality Assurance for Essential Climate Variables (QA4ECV) and Global Ozone Monitoring Experiment-2B (GOME-2B) offline tropospheric NO 2 retrievals passing quality controls (i.e., cloud fraction < 0.3, surface albedo < 0.3, and solar zenith angle < 85°; He et al., 2020a). The reconstructed tropospheric NO 2 agreed well (R = 0.75-0.85) with multi-axis differential optical absorption spectroscopy (MAX-DOAS) measurements. Through this data fusion, the daily spatial coverage of satellite tropospheric NO 2 was significantly improved in China (average = 87 %). Areas with a small number of missing values were imputed via a nonparametric machine-learning model by regressing the conversion relationship with Copernicus Atmosphere Monitoring Service (CAMS) tropospheric NO 2 assimilations (0.75° × 0.75°), making sure that the interpolation was consistent with the OMI/Aura overpass time (Inness et al., 2019). The gap-filled tropospheric NO 2 was reliable compared with measurements (R = 0.94-0.98; Wei et al., 2022b). The above two-step gap-filling procedure allowed us to generate a daily seamless tropospheric NO 2 dataset that removes the effects of clouds from satellite observations.
Here, the reconstructed daily seamless tropospheric NO 2 together with CAMS daily ground-level NO 2 assimilations (0.75° × 0.75°) averaged from all 3-hourly data in a day and monthly NO x anthropogenic emissions (0.1° × 0.1°; Inness et al., 2019) were used as the main predictors for estimating surface NO 2 . Limited by the quality of direct satellite observations, daily model-simulated SO 2 and CO surface mass concentrations, averaged from all available data in a day provided by 1-hourly Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2, 0.625° × 0.5°), 3-hourly CAMS (0.75° × 0.75°), and 3-hourly Goddard Earth Observing System Forward-Processing (GEOS-FP, 0.3125° × 0.25°) global reanalyses were used as the main predictors to retrieve surface SO 2 and CO, together with CAMS monthly SO 2 and CO anthropogenic emissions.

Auxiliary factors
Meteorological factors have important and diverse effects on air pollutants (Li et al., 2019), e.g., the boundary layer height reflects their vertical distribution and variations (Seo et al., 2017); temperature, humidity, and pressure can affect their photochemical reactions (Xu et al., 2011); and rainfall and wind can influence their removal, accumulation, and transport (Dickerson et al., 2007; Li et al., 2019). Daily meteorological variables (Hersbach et al., 2020) were calculated from all hourly data in a day (i.e., accumulated for precipitation and evaporation and averaged for the others) and used as auxiliary variables to improve the modeling of gaseous pollutants. Other auxiliary remote sensing data describing land-use cover and/or change (i.e., Moderate-Resolution Imaging Spectroradiometer (MODIS) normalized difference vegetation index (NDVI), 0.05° × 0.05°) and population distribution density (i.e., LandScan™, 1 km) were employed as inputs to the machine-learning model because they are highly related to the type and amount of anthropogenic pollutant emissions. Surface terrain (i.e., Shuttle Radar Topography Mission (SRTM) digital elevation model (DEM), 90 m), which can affect the transmission of air pollutants, was also included. Table S1 in the Supplement provides detailed information about all the data used in this study. All variables were aggregated or resampled to a 0.1° × 0.1° resolution for consistency.
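As a rough illustration of the preprocessing described above, the following Python sketch aggregates hourly variables to daily values (summing precipitation and evaporation, averaging everything else) and coarsens a fine grid by block averaging. The variable names, the dictionary layout, and the choice of simple block means for aggregation are assumptions made for illustration; the text does not specify the resampling method.

```python
ACCUMULATED = {"precipitation", "evaporation"}  # summed over the day, per the text


def daily_aggregate(hourly_by_variable):
    """Map {variable name: [hourly values]} to {variable name: daily value}."""
    daily = {}
    for name, values in hourly_by_variable.items():
        total = sum(values)
        # Accumulated variables keep the daily total; all others use the daily mean.
        daily[name] = total if name in ACCUMULATED else total / len(values)
    return daily


def block_mean(grid, factor):
    """Aggregate a 2-D list to a coarser grid by averaging factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows - rows % factor, factor):
        out_row = []
        for c in range(0, cols - cols % factor, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out
```

For example, a 0.05° field could be brought to 0.1° with `block_mean(grid, 2)`; coarser inputs would instead need interpolation, which is not shown here.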

Pollutant gas modeling
Here, the developed Space-Time Extra-Trees (STET) model, which integrates the spatiotemporal autocorrelations of and differences in air pollutants into the extremely randomized trees (ERT; Geurts et al., 2006), was extended to estimate surface gaseous pollutants, i.e., NO 2 , SO 2 , and CO. ERT is an ensemble machine-learning model based on decision trees, capable of solving nonparametric, multivariable, nonlinear regression problems. Ensemble learning compensates for the limited learning ability of a single learner, greatly improving accuracy. The introduced randomness enhances the model's robustness to noise and minimizes its sensitivity to outliers and multicollinearity. It can handle high-dimensional, discrete, or continuous data without data normalization and is easy to implement and parallelize. However, several limitations exist, e.g., it is difficult to make predictions beyond the range of the training data, and over-fitting can occur in some regression problems with high noise. Training efficiency also diminishes as memory use grows when the number of decision trees is large. Compared with traditional tree-based models (e.g., random forest), ERT is more strongly randomized: at each node split it randomly selects a feature subset and also draws the candidate split thresholds at random rather than searching for the optimal threshold of each feature. This helps to create more independent decision trees, further reducing model variance and improving accuracy (Geurts et al., 2006). The STET model has been successfully applied to estimating high-quality surface O 3 in our previous study. It is thus extended here to regress the nonlinear conversion relationships between ground-based measurements and the main predictors and auxiliary factors for other species of gaseous pollutants.
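The extra randomization that distinguishes ERT from a random forest can be seen in how a single node chooses its split. The sketch below, in plain Python, follows the split rule of Geurts et al. (2006) for regression: draw a random subset of features, draw one uniform random threshold per feature, and keep the candidate with the largest variance reduction. It is an illustration of that rule only, not the STET implementation; the space and time terms are not shown, and the function names are hypothetical.

```python
import random


def variance(values):
    """Population variance of a non-empty list."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def random_split(X, y, n_candidates, rng):
    """Choose one node split in the Extra-Trees style.

    X is a list of feature rows and y the matching targets. Returns a
    (feature index, threshold) pair, or None if no usable split was drawn.
    """
    n_features = len(X[0])
    features = rng.sample(range(n_features), min(n_candidates, n_features))
    best, best_score = None, -1.0
    for f in features:
        column = [row[f] for row in X]
        lo, hi = min(column), max(column)
        if lo == hi:
            continue  # constant feature, cannot split on it
        threshold = rng.uniform(lo, hi)  # random cut point, not an optimized one
        left = [t for v, t in zip(column, y) if v < threshold]
        right = [t for v, t in zip(column, y) if v >= threshold]
        if not left or not right:
            continue
        # Variance reduction achieved by this candidate split.
        weighted = (len(left) * variance(left) + len(right) * variance(right)) / len(y)
        score = variance(y) - weighted
        if score > best_score:
            best, best_score = (f, threshold), score
    return best
```

A random forest would instead scan every possible threshold of each candidate feature; skipping that search is what makes the individual trees cheaper to grow and less correlated with one another.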
For surface NO 2 , the STET model was applied to the main variables of the satellite tropospheric NO 2 column, modeled surface NO 2 mass, and NO x emissions, together with ancillary variables of the previously mentioned meteorological, surface, and population variables (Eq. 1). For surface SO 2 (Eq. 2) and CO (Eq. 3), modeled surface SO 2 and CO concentrations and SO 2 and CO emissions were used as main predictors along with the same auxiliary variables as NO 2 to construct the STET models separately.
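Schematically, the three models can be written as below. This is a reconstruction from the variable definitions in this section rather than the published typesetting: the symbol f_STET for the fitted model is an assumption, and the NDVI input is left out because no symbol is defined for it here.

```latex
\begin{align}
\mathrm{NO}_{2(ijt)} &\sim f_{\mathrm{STET}}\left(\mathrm{SNO}_{2(ijt)},\ \mathrm{MNO}_{2(ijt)},\ \mathrm{ENO}_{x(ijm)},\ \mathrm{Meteorology}_{ijt},\ \mathrm{DEM}_{ijy},\ \mathrm{POP}_{ijy},\ P_s,\ P_t\right) \\
\mathrm{SO}_{2(ijt)} &\sim f_{\mathrm{STET}}\left(\mathrm{MSO}_{2(ijt)},\ \mathrm{ESO}_{2(ijm)},\ \mathrm{Meteorology}_{ijt},\ \mathrm{DEM}_{ijy},\ \mathrm{POP}_{ijy},\ P_s,\ P_t\right) \\
\mathrm{CO}_{ijt} &\sim f_{\mathrm{STET}}\left(\mathrm{MCO}_{ijt},\ \mathrm{ECO}_{ijm},\ \mathrm{Meteorology}_{ijt},\ \mathrm{DEM}_{ijy},\ \mathrm{POP}_{ijy},\ P_s,\ P_t\right)
\end{align}
```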
where NO 2(ijt) , SO 2(ijt) , and CO ijt indicate daily ground-based NO 2 , SO 2 , and CO measurements at one grid (i, j) on the tth day of a year; SNO 2(ijt) indicates the daily satellite tropospheric NO 2 column at one grid (i, j) on the tth day of a year; MNO 2(ijt) , MSO 2(ijt) , and MCO ijt indicate daily model-simulated surface NO 2 , SO 2 , and CO concentrations at one grid (i, j) on the tth day of a year; ENO x(ijm) , ESO 2(ijm) , and ECO ijm indicate monthly anthropogenic NO x , SO 2 , and CO emissions at one grid (i, j) in the mth month of a year; Meteorology ijt represents each meteorological variable at one grid (i, j) on the tth day of a year; DEM ijy and POP ijy indicate the elevation and population at one grid (i, j) in a given year; and P s and P t indicate the space and time terms.

Seamless mapping of surface gaseous pollutants
Using the constructed STET model, we generated daily 10 km datasets with complete coverage (spatial coverage = 100 %) for three ground-level gaseous pollutants from 2013 to 2020 in China, called ChinaHighNO 2 , ChinaHighSO 2 , and ChinaHighCO. Monthly and annual maps were generated by directly averaging daily data at each grid. They belong to a series of public long-term, full-coverage, high-resolution, and high-quality datasets of a variety of ground-level air pollutants for China (ChinaHighAirPollutants, CHAP) developed by our team. Figure 2 shows spatial distributions of the three pollutant gases across China on a typical day (1 January 2018). The spatial patterns of these gaseous pollutants were consistent with those observed on the ground, especially in highly polluted areas, e.g., severe surface NO 2 pollution in the North China Plain (NCP) and high surface SO 2 levels in Shanxi Province. The unique advantage of our dataset is that it can provide valuable gaseous pollutant information on a daily basis at locations in China where ground measurements are not available. This addresses the major issues of scanning gaps and numerous missing values in satellite remote sensing retrievals under cloudy conditions, e.g., the average spatial coverage of the official OMI/Aura daily tropospheric NO 2 product is only 42 % over the whole of China during the period 2013-2020 (Fig. S1). Our dataset provides spatially complete coverage, raising the daily coverage by 58 percentage points relative to the satellite product. In addition, reanalysis data do not simulate surface masses of gaseous pollutants well, underestimating them compared to our results and ground-based observations in China (Fig. S2). This is especially so for SO 2 , where high-pollution hotspots are easily misidentified.
Validation illustrates that our regressed results for surface NO 2 , SO 2 , and CO agree better with ground measurements than modeled results (slopes are close to 1, and correlations > 0.93), being 1.9-6.4 times stronger in slope and 1.3-3.5 times higher in correlation, but 5.9-7.7 times smaller in differences (Fig. S3). This shows that our model can take advantage of big data to significantly correct and reconstruct gaseous simulation results via data mining using machine learning. Figure 3 shows annual and seasonal maps for each gas pollutant during the period 2013-2020 across China.
Multi-year mean surface NO 2 , SO 2 , and CO concentrations were 20.3 ± 4.7 µg m −3 , 16.2 ± 7.7 µg m −3 , and 0.86 ± 0.22 mg m −3 , respectively. Pollutant gases varied significantly in space across China, where high surface NO 2 levels were mainly distributed in typical urban agglomerations, e.g., the Beijing-Tianjin-Hebei (BTH) region, the Yangtze River and Pearl River deltas (YRD and PRD), and scattered large cities with intensive human activities and highly developed transportation systems (e.g., Urumqi, Chengdu, Xi'an, and Wuhan, among others). High surface SO 2 concentrations were mainly observed in northern China (e.g., Shanxi, Hebei, and Shandong provinces), associated with combustion emissions from anthropogenic sources, and the Yunnan-Guizhou Plateau in southwest China, likely associated with emissions from volcanic eruptions. By contrast, except in some areas in central China (e.g., Shanxi and Hebei), surface CO concentrations were overall low.
Significant differences in spatial patterns were seen at the seasonal level. Surface NO 2 , SO 2 , and CO in summer (average = 15.9 ± 4.7 µg m −3 , 22.9 ± 13.4 µg m −3 , and 1.1 ± 0.3 mg m −3 , respectively) were the lowest, thanks to favorable meteorological conditions, e.g., abundant precipitation and high air humidity conducive to flushing and scavenging of different air pollutants (Yoo et al., 2014). Strong sunlight and high temperature also accelerate the photochemical reactions of NO 2 loss (Shah et al., 2020). Pollution levels were highest in winter, with average values increasing by ∼ 1.5-1.9 times those in summer. This difference was much larger in central and eastern China, e.g., 2.3-3.4 times higher in the BTH due to large amounts of direct NO x , SO 2 , and CO emissions from burning coal for heating in winter in northern China. The spatial patterns of the three gaseous pollutants were similar in spring and autumn.

Short-term pandemic effects on air quality
Many studies have focused on the effects of the COVID-19 pandemic on air quality (WHO, 2020). Most of them were done using ground-based observations (Huang et al., 2020; Su et al., 2020), tropospheric gas columns (Field et al., 2021; Levelt et al., 2022), or retrieved surface masses (Cooper et al., 2022; Ling and Li, 2021). The resulting conclusions could be affected by insufficient spatial representation due to the uneven distribution of ground monitors or by a large number of missing values in space due to the influence of clouds. Our seamless day-to-day gaseous pollutant dataset makes up for these shortcomings, allowing us to assess the changes in gaseous pollutants during the pandemic more accurately and quantitatively.
We first compared the spatial patterns of monthly relative differences from February to April between 2020 and 2019 in China (Fig. 4). In February, surface NO 2 fell sharply in China, especially in key urban agglomerations and megacities, with relative decreases of more than 50 %. A significant decrease in surface SO 2 (> 40 %) was observed in northern areas where heavy industry is the mainstay in China (e.g., Tianjin, Hebei, and Shandong), while little change was seen in southern China. Surface CO also showed drastic decreases, but the amplitude was smaller than for the other two gaseous pollutants. These changes were attributed to extensive plant closures and traffic controls due to the lockdown, which started at the end of January 2020, significantly reducing anthropogenic NO x , SO 2 , and CO emissions (Ding et al., 2020; Yang et al., 2022; Zheng et al., 2021). In March, surface NO 2 was still generally lower than the historical level in most eastern areas, especially in areas where the pandemic was severe, i.e., Wuhan, Hubei Province, and its surrounding areas. The decrease in surface SO 2 slowed markedly, by more than a factor of 2, in the NCP and central China, while surface CO almost returned to normal levels in most areas in China. In April, surface NO 2 and SO 2 were comparable to historical concentrations (within ±10 %), even increasing in some southern and northeastern areas due to rebounding anthropogenic emissions (Ding et al., 2020), especially in Hubei Province, indicating that their surface levels had almost recovered.
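The monthly relative differences discussed above amount to a simple per-grid calculation, sketched here in Python. The function name and the handling of missing or zero reference values are assumptions made for illustration.

```python
def relative_change_percent(current, reference):
    """Relative change of `current` with respect to `reference`, in percent."""
    if current is None or reference is None or reference == 0:
        return None  # undefined without a valid, non-zero reference value
    return 100.0 * (current - reference) / reference
```

With this convention, a grid cell whose monthly mean halves between 2019 and 2020 has a relative change of -50 %.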
Most previous studies have focused mainly on changes in air pollutants during the lockdown, with little attention paid to the recovery. We thus compared the time series of daily population-weighted concentrations of the three gaseous pollutants after the Lunar New Year between 2020 and 2019 in China (Fig. 5). Starting from New Year's Eve, surface gaseous pollutants decreased significantly in both the normal and pandemic years because factories closed and anthropogenic emissions fell during the Spring Festival holiday. However, gaseous pollutants in the normal year rose rapidly after falling to their lowest levels, owing to the return to work after the holidays. By contrast, their levels continued to decrease in 2020 and were lower than historical levels due to the sustained impacts of the strict lockdowns. They hit bottom in the 4th week after the Lunar New Year and then began to increase gradually. Surface NO 2 and SO 2 recovered in the middle of the 11th week (around the 72nd and 75th days, respectively) after the Lunar New Year (i.e., 2020 and 2019 concentrations intersected and then alternately changed). However, surface CO levels recovered at the end of the 5th week (around the 34th day), more than 2 times faster than NO 2 and SO 2 levels. This is attributed to more CO emissions from increased indoor cooking by residents, increased atmospheric oxidation capacity (Huang et al., 2020; Wei et al., 2022a), and a potentially higher sensitivity to temperature rises.
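A minimal Python sketch of the two quantities used in this comparison follows: a population-weighted mean over grid cells, and the first day on which the pandemic-year series climbs back to the reference-year series. Both function names are hypothetical, and the recovery criterion (first day at or above the reference after having been below it) is an assumed reading of "concentrations intersected".

```python
def population_weighted_mean(concentrations, populations):
    """Mean concentration weighted by the population of each grid cell."""
    weighted, total_pop = 0.0, 0.0
    for c, p in zip(concentrations, populations):
        if c is None or p <= 0:
            continue  # skip cells without data or without residents
        weighted += c * p
        total_pop += p
    return weighted / total_pop if total_pop > 0 else None


def first_recovery_day(pandemic, baseline):
    """1-based day on which `pandemic` first reaches `baseline` after being below it."""
    was_below = False
    for day, (p, b) in enumerate(zip(pandemic, baseline), start=1):
        if p < b:
            was_below = True
        elif was_below:
            return day
    return None  # no recovery within the series
```

Weighting by population shifts the national mean toward densely populated cells, so it tracks what people breathe rather than what the average square kilometer contains.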

Temporal variations and policy implications
Figures S4-S6 show annual-mean maps of each gaseous pollutant from 2013 to 2020 in China. Surface NO 2 , SO 2 , and CO changed greatly, peaking in 2013, with average values of 21.3 ± 8.8 µg m −3 , 23.1 ± 13.3 µg m −3 , and 1.01 ± 0.29 mg m −3 , respectively. They reached their lowest levels in 2020, particularly due to the noticeable effects of the COVID-19 pandemic. In general, national ambient NO 2 , SO 2 , and CO concentrations decreased by approximately 12 %, 55 %, and 17 % from 2013 to 2020, respectively. Large seasonal differences were observed in the amplitude of these changes (Fig. 6), e.g., surface NO 2 decreased the most in winter, especially in the three urban agglomerations (↓ 24 %-31 %), and changed the least in autumn (especially in the YRD). Surface SO 2 showed much larger decreases in all seasons, especially during the cold seasons (↓ 55 %-81 %), due to the implementation of stricter "ultra-low" emission standards. Surface CO had similar seasonal changes to SO 2 , but these were 1.5-3.3 times smaller in amplitude. To better investigate the spatiotemporal variations of ambient gaseous pollution, we calculated linear trends and significance levels using monthly anomalies obtained by removing seasonal cycles. Most of China showed significant decreasing trends, with average rates of 0.23 µg m −3 yr −1 , 2.01 µg m −3 yr −1 , and 0.05 mg m −3 yr −1 for surface NO 2 , SO 2 , and CO (p < 0.001), respectively (Fig. 7), especially in the three urban agglomerations and large cities (e.g., Wuhan and Chengdu). The largest downward trends mainly occurred in northern and central China, especially in the BTH (Table S2). This is mainly due to the widespread switch of winter heating fuel from coal to gas across China, which greatly reduced emissions of precursor gases (Koukouli et al., 2018).
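The trend calculation can be sketched in plain Python under two assumptions: the seasonal cycle is taken to be the multi-year mean of each calendar month, and the trend is an ordinary least-squares slope. The significance test is not shown, and the function names are hypothetical.

```python
def monthly_anomalies(values):
    """Remove the mean seasonal cycle from a monthly series that starts in January."""
    anomalies = []
    for i, v in enumerate(values):
        same_month = values[i % 12::12]  # all values for this calendar month
        anomalies.append(v - sum(same_month) / len(same_month))
    return anomalies


def linear_trend_per_year(anomalies):
    """Ordinary least-squares slope of a monthly series, expressed per year."""
    n = len(anomalies)
    t = [i / 12.0 for i in range(n)]  # time in years
    t_mean = sum(t) / n
    a_mean = sum(anomalies) / n
    cov = sum((ti - t_mean) * (ai - a_mean) for ti, ai in zip(t, anomalies))
    var = sum((ti - t_mean) ** 2 for ti in t)
    return cov / var
```

Because the within-year part of a steady decline is absorbed into the monthly climatology, the slope recovered from the anomalies is slightly smaller in magnitude than the true rate (by roughly 1.5 % for an 8-year record).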
Increasing trends of surface NO 2 were, however, found in Ningxia and Shanxi provinces in central China due to increased traffic emissions and new coal-burning power plants in underdeveloped areas without strict regulations on NO x emissions (Maji and Sarkar, 2020; Van der A et al., 2017).
We then divided the study period into three periods to investigate the impact of major environmental protection policies implemented in China on air quality (Fig. 7). During the Clean Air Action Plan (CAAP, 2013-2017), the rates of decrease for surface NO 2 , SO 2 , and CO accelerated in most populated areas in China, especially urban areas. This was due to dramatic reductions in main pollutant emissions like SO 2 and NO x (by 59 % and 21 %, respectively) through the upgrading of key industries, industrial structure adjustments, and coal-fired boiler remediation. In addition, the majority of gaseous pollutants dropped continuously during the Blue Sky Defense War (BSDW, 2018-2020), benefiting from continuous reductions in total air pollutant emissions and the impacts of COVID-19 (Jiang et al., 2021; Zheng et al., 2021). However, areas with trends passing the significance level shrank sharply, especially for SO 2 .
During the 13th Five-Year Plan (FYP, 2016-2020), the decreasing trends of the three gaseous pollutants across China slowed down compared to those during CAAP. Large decreases in surface NO 2 were mainly found in the BTH region and Henan Province, while slightly increasing trends occurred in southern China. Surface SO 2 significantly decreased in most areas, with a greater downward trend in Shanxi Province, mainly due to the reduction in coal consumption thanks to a strengthened clean-heating policy (Lee et al., 2021). Surface CO also continuously decreased, more rapidly in central China but less rapidly elsewhere. The continuous decline in gaseous pollutants is due to the binding reductions in total emissions of major pollutants like NO x (↓ 71 %) and SO 2 (↓ 48 %) in China (Wan et al., 2022).

Population-risk exposure to gaseous pollution
With the daily seamless datasets, we can evaluate the spatial and temporal variations of short-term population-risk exposure to the three gaseous pollutants by calculating the number of days in a given year exceeding the new recommended short-term minimum interim target (IT1) and desired air quality guidelines (AQG) level defined by the World Health Organization (WHO) in 2021 (WHO, 2021). The area exceeding the recommended IT1 levels (i.e., daily NO 2 > 120 µg m −3 , SO 2 > 125 µg m −3 , and CO > 7 mg m −3 ) was generally small in eastern China (Fig. S7). High NO 2 -exposure risks were mainly found in Beijing and Hebei Province and in a handful of big cities (e.g., Jinan, Wuhan, Shanghai, and Guangzhou), while high SO 2 -exposure risks were mainly observed in Hebei, Shandong, and Shanxi provinces. The risk of high CO pollution was small and found only in some scattered areas in the NCP. In general, both the area exposed to high pollution and the likelihood of such exposure have gradually decreased over time, almost disappearing since 2018.
By contrast, most areas of eastern China had a surface NO 2 exposure exceeding the AQG level (Fig. 8), especially in the north and economically developed areas in the south (proportion > 80 %). Both the extent and intensity are decreasing over time, but exceedance remains a problem, suggesting that stronger NO x controls are needed in the future. Most of the main air pollution transmission belt in China (i.e., the "2 + 26" cities, Fig. 1) had surface SO 2 levels exceeding the AQG level at the beginning of the study period. Thanks to strict control measures, these polluted areas sharply decreased after 2015, almost disappearing in 2020. Controlling CO was much more successful in China, with less than 10 % of the days in the BTH exceeding the desired standard in the early part of the study period. Figure 9 shows the percentage of days with pollution levels exceeding WHO air quality standards in three key regions. BTH was the only region experiencing high NO 2 and SO 2 exposure risks (i.e., daily mean > IT1), which dropped to zero from 2017 and 2016 onward, respectively, while YRD and PRD had no high risks of exposure to the three gaseous pollutants (Fig. 9a-b). There was also no regional high CO-pollution risk (Fig. 9c). However, although declining continuously, regional surface NO 2 levels failed to meet the short-term AQG level in 2020, with 61 %-73 % of the days exceeding this standard. More efforts toward mitigating NO 2 levels in these key regions are thus needed. Continual decreases in the number of days above the AQG level were also observed for surface SO 2 , falling to nearly zero in 2014, 2016, and 2018 in the PRD, YRD, and BTH, respectively. Less than 3 % of the days in the BTH and YRD had surface CO levels exceeding the AQG level. Surface CO levels were always below the AQG level in the PRD.
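The exceedance statistics in this section reduce to counting days above a daily threshold. A minimal Python sketch is given below, using the IT1 levels quoted above; the dictionary and function names are hypothetical, and missing days are simply excluded from the denominator as an assumption.

```python
# Daily IT1 thresholds quoted in the text (NO2 and SO2 in ug m-3, CO in mg m-3).
IT1_DAILY = {"NO2": 120.0, "SO2": 125.0, "CO": 7.0}


def percent_days_exceeding(daily_values, threshold):
    """Percentage of valid days in a year whose daily mean exceeds `threshold`."""
    valid = [v for v in daily_values if v is not None]
    if not valid:
        return None
    exceed = sum(1 for v in valid if v > threshold)
    return 100.0 * exceed / len(valid)
```

The same function applies to the AQG level by passing a different threshold, and running it per grid cell or per regional mean series gives maps or regional percentages, respectively.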

Data quality assessment
Here, the widely used out-of-sample 10-fold cross-validation (10-CV) method was adopted to evaluate the overall estimation accuracy of gaseous pollutants (Rodriguez et al., 2010; Wei et al., 2022a). An additional out-of-station 10-CV approach was used to validate the prediction accuracy of gaseous pollutants, performed on the basis of the ground-monitoring stations. The stations were randomly divided into 10 subsets, of which data samples from 9 subsets were used for model training and those from the remaining subset for model validation. This was done 10 times, in turn, to ensure that data from all stations were tested. This procedure generates independent training and test samples from different locations and indicates the spatial prediction ability of the model in areas where ground-based measurements are unavailable (Wu et al., 2021). Figure 10 shows the CV results of all daily estimates and predictions for ground-level NO 2 , SO 2 , and CO concentrations from 2013 to 2020 in China (sample size: N ≈ 3.6 million). Surface NO 2 and SO 2 concentrations mainly fell in the range of 200 to 500 µg m −3 . Daily estimates were highly correlated to observations, with the same coefficients of determination (R 2 = 0.84) and slopes close to 1 (0.86 and 0.84, respectively). Average root-mean-square error (RMSE; mean absolute error, MAE) values of surface NO 2 and SO 2 estimates were 7.99 (5.34) and 10.07 (4.68) µg m −3 , and normalized RMSE (NRMSE) values were 0.25 and 0.51, respectively. Most daily CO observations were less than 10 mg m −3 , agreeing well with our daily estimates (R 2 = 0.80, slope = 0.79), and the average RMSE (MAE) and NRMSE values were 0.29 (0.16) mg m −3 and 0.3. Compared to estimation accuracies (Fig. 10a-c), prediction accuracies slightly decreased, which is acceptable considering the weak signals of trace gases. Daily surface SO 2 , NO 2 , and CO predictions (Fig. 10d-f) agree well with ground measurements, with spatial R 2 values of 0.70, 0.68, and 0.61, respectively. Their respective RMSE (MAE) values were 14.28 (8.1) µg m −3 , 11.57 (7.06) µg m −3 , and 0.42 (0.24) mg m −3 , and NRMSE values were 0.35, 0.71, and 0.42, respectively, representing the accuracy for areas without ground-monitoring stations.
Figure 9. Percentage of days (%) exceeding the WHO-recommended short-term (a-c) minimum interim target (IT1) and (d-f) desired air quality guidelines (AQG) level for surface NO 2 , SO 2 , and CO for each year from 2013 to 2020 in three typical urban agglomerations: the Beijing-Tianjin-Hebei (BTH) region, the Yangtze River Delta (YRD), and the Pearl River Delta (PRD).
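The out-of-station validation and the error metrics reported in this section can be sketched in plain Python as follows. The fold assignment keeps all samples of a station in the same fold, so held-out stations are never seen in training. The function names are hypothetical, and normalizing the RMSE by the mean of the observations is an assumption, since the exact NRMSE definition is not given in this section.

```python
import math
import random


def station_folds(station_ids, n_folds=10, seed=0):
    """Assign every station to exactly one of n_folds validation folds."""
    unique = sorted(set(station_ids))
    random.Random(seed).shuffle(unique)
    return {sid: i % n_folds for i, sid in enumerate(unique)}


def error_metrics(observed, estimated):
    """RMSE, MAE, and NRMSE (RMSE divided by the mean observation, assumed)."""
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [e - o for o, e in zip(observed, estimated)]
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mae = sum(abs(d) for d in errors) / n
    return {"RMSE": rmse, "MAE": mae, "NRMSE": rmse / mean_obs}
```

In use, fold k is validated by training on every sample whose station maps to a different fold and scoring the samples whose station maps to k; the sample-based 10-CV differs only in that individual samples, not stations, are shuffled into folds.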

Estimate and prediction accuracy
The performance of our air pollution modeling was also evaluated on an annual basis, showing that our model works well in estimating and predicting the concentrations of different surface gaseous pollutants in different years (Table 1). The model performance has continuously improved over time, as indicated by increasing correlations and decreasing uncertainties. This is because of the increasing density of ground stations (especially in the suburban areas of cities) and updated quality control of measurements, e.g., improving the sampling flow calibration of monitoring instruments, flow calibration of dynamic calibrators, revision of precision, accuracy review, and data validity judgment (HJ 818-2018; MEE, 2018b). This has led to an increase in the number of data samples (e.g., from 169 000 in 2013 to more than 522 000 in 2020) and improvement in their quality.
Figure 11 shows the spatial validation of estimated daily pollutant gases across China. In general, our model works well at the site scale, with average CV-R 2 values of 0.77, 0.72, and 0.72 and NRMSE values of 0.25, 0.43, and 0.26 for surface NO 2 , SO 2 , and CO, respectively. In addition, approximately 93 %, 80 %, and 84 % of the stations had at least moderate agreement (CV-R 2 > 0.6) between our estimates and ground measurements. Except for some scattered sites, the estimation uncertainties were generally less than 0.3, 0.5, and 0.3 in more than 80 %, 77 %, and 76 % of the stations for the above three gaseous pollutant species, respectively.
Figure 12 shows the temporal validation of ground-level gaseous pollutants as a function of ground measurements in China. On the monthly scale (Fig. 12a-c), we collected a total of ∼ 119 000 matched samples of the three gaseous pollutants. Accuracies significantly improved, with increasing R 2 (decreasing RMSE) values of 0.93 (4.41 µg m −3 ), 0.97 (4.03 µg m −3 ), and 0.94 (0.13 mg m −3 ) for surface NO 2 , SO 2 , and CO, respectively. On the annual scale (Fig. 12d-f), more than ∼ 10 000 matched samples were collected, showing better agreement with observations (e.g., R 2 = 0.94, 0.98, and 0.97) and lower uncertainties (e.g., RMSE = 3.06 µg m −3 , 2.46 µg m −3 , and 0.07 mg m −3 ) for the above three gaseous pollutants, respectively.
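The validation statistics quoted in this section can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation. Two points are assumptions rather than statements from the text: R 2 is taken as the squared Pearson correlation between observations and estimates, and NRMSE is taken as RMSE divided by the mean observation. The helper `aggregate_mean` is a hypothetical name for the averaging step behind the monthly and annual comparisons.

```python
import math

def _mean(x):
    return sum(x) / len(x)

def rmse(obs, est):
    """Root-mean-square error between observations and estimates."""
    return math.sqrt(_mean([(e - o) ** 2 for o, e in zip(obs, est)]))

def mae(obs, est):
    """Mean absolute error."""
    return _mean([abs(e - o) for o, e in zip(obs, est)])

def nrmse(obs, est):
    """RMSE normalized by the mean observation (assumed normalization)."""
    return rmse(obs, est) / _mean(obs)

def r2(obs, est):
    """Squared Pearson correlation (assumed definition of R2)."""
    mo, me = _mean(obs), _mean(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_e = sum((e - me) ** 2 for e in est)
    return cov * cov / (var_o * var_e)

def aggregate_mean(keys, values):
    """Average values sharing a key, e.g. (station, year, month) tuples."""
    sums, counts = {}, {}
    for k, v in zip(keys, values):
        sums[k] = sums.get(k, 0.0) + v
        counts[k] = counts.get(k, 0) + 1
    return {k: sums[k] / counts[k] for k in sums}
```

Applying `aggregate_mean` to both the daily observations and the daily estimates with the same keys, and then scoring the paired means, gives monthly or annual statistics. Averaging cancels part of the day-to-day random error, which is consistent with the higher R 2 and lower RMSE reported at coarser time scales.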

Comparison with previous studies
We compared our results with those from previous studies on the estimation of the three gaseous pollutants using different developed models focusing on the whole of China. Here, only those studies applying the same out-of-sample cross-validation approach against ground-based measurements collected from the same CNEMC network were   Chi et al., 2021; Dou et al., 2021; Xu et al., 2019; Zhan et al., 2018). Some studies improved the spatial resolution by introducing NO 2 data from the recently launched Sentinel-5 TROPOMI satellite, but data are only available from October 2018 onward (Chi et al., 2022; Liu, 2021; Wang et al., 2021). Surface SO 2 estimated from an SO 2 emission inventory and surface CO from Measurement of Pollution in the Troposphere (MOPITT) and TROPOMI retrievals have a much lower data quality, with R 2 values that are smaller by 12 %-57 % and RMSE values that are larger by 41 %-47 % against ground measurements compared to ours (Liu et al., 2019; Wang et al., 2021). Overall, our gaseous pollutant datasets are superior to those from previous studies in terms of overall accuracy, spatial coverage, and length of data records.

Successful applications
Our surface gaseous pollutant datasets have been freely available to the public online since March 2021 (see "Data availability" section). A large number of studies have used the three gaseous pollutant datasets generated in this study to examine their single or joint impacts on environmental health from both long-term and short-term perspectives, benefiting from the unique daily, spatially seamless coverage. For example, a nearly linear relationship between long-term ambient NO 2 and adult mortality in China was observed (Y. Zhang et al., 2022); ambient NO 2 hindered the survival of middle-aged and elderly people, while acute exposure to ambient SO 2 increased the risk of asthma mortality in China (S. . Long-term SO 2 and CO exposure can increase the incidence rate of visual impairment in children in China, and short-term exposure to ambient CO can significantly increase the probability of hospitalization for stroke sequelae (R. Wang et al., 2022). Regional and national cohort studies have shown that exposure, especially short-term exposure, to multiple ambient gaseous (NO 2 , SO 2 , and CO) and particulate pollutants has negative effects of varying degrees on a variety of health outcomes, such as all-cause mortality (Feng et al., 2023), dementia mortality (T. , myocardial infarction mortality, cause-specific cardiovascular disease (Xu et al., 2022a, b), respiratory diseases (H. , ischemic and hemorrhagic strokes (F. He et al., 2022; H. Wu et al., 2022b; Xu et al., 2022c), metabolic syndrome (S. Han et al., 2022), influenza-like illness (Lu et al., 2023), incident dyslipidemia, diabetes (Mei et al., 2023), blood pressure (H. Wu et al., 2022a), renal and/or kidney function (S. , neurodevelopmental delays, serum liver enzymes (Y. , overweight and obesity, insomnia, and sleep quality (L. Wang et al., 2022). These studies attest well to the value of the CHAP dataset regarding current and future public health issues, among others.

Summary and conclusions
Exposure to gaseous pollution is detrimental to human health, a major public concern in heavily polluted regions like China, where ground-based observations are not as rich as in major developed countries. Moreover, pollutants travel long distances, affecting large downstream regions. To remedy such limitations, this study applied the machine-learning model called Space-Time Extra-Trees to estimate ambient gaseous pollutants across China, with extensive input variables from ground monitors, satellites, and models. Daily 10 km resolution (approximately 0.1° × 0.1°), seamless (spatial coverage = 100 %) datasets for ground-level NO 2 , SO 2 , and CO concentrations in China from 2013 to 2020 were generated. These datasets were cross-evaluated in terms of overall accuracy and predictive ability at different spatiotemporal levels. National daily estimates (predictions) of surface NO 2 , SO 2 , and CO were highly consistent with ground measurements, with average out-of-sample (out-of-station) CV-R 2 values of 0.84 (0.68), 0.84 (0.7), and 0.8 (0.61) and RMSEs of 7.99 (11.57) µg m −3 , 10.7 (14.28) µg m −3 , and 0.29 (0.42) mg m −3 , respectively.
Ambient pollutant gases varied significantly in space and time, with high levels mainly found in the North China Plain, especially in winter, due to more anthropogenic emissions, such as coal burning for heating. All gaseous pollutants sharply declined in China during the COVID-19 outbreak, while large differences were observed during their recovery times. For example, surface CO was the first to return to its historical level within the fifth week after the Lunar New Year in 2020, about 2 times faster than surface NO 2 and SO 2 levels. This is attributed to more home cooking and enhanced atmospheric oxidation. Temporally, surface NO 2 , SO 2 , and CO levels in China gradually decreased from peaks in 2013 (average = 21.3 ± 8.8 µg m −3 , 23.1 ± 13.3 µg m −3 , and 1.01 ± 0.29 mg m −3 , respectively), with annual rates of decrease of 0.23 µg m −3 , 2.01 µg m −3 , and 0.05 mg m −3 , respectively (p < 0.001), until 2020. Improvements in air quality have been made in the last 8 years thanks to the implementation of a series of environmental protection policies, greatly reducing pollutant emissions. In addition, both the areal extents of regions experiencing gaseous pollution and the probability of gaseous pollution occurring have gradually decreased over time, especially for surface CO and SO 2 , which have almost reached the short-term air quality guidelines level recommended by the WHO in most areas in China in 2020. This high-quality daily seamless dataset of gaseous pollutants will benefit future environmental and health-related studies focused on China, especially studies investigating short-term air pollution exposure.
Although a lot of new and/or useful data and analyses are presented in this study, they still suffer from some limitations. For example, our estimated surface SO 2 and CO concentrations should have larger uncertainties than those of NO 2 , since model simulations, instead of satellite retrievals, are supplemented during modeling to compensate for the lack of data in China. However, these data often have large biases in remote regions with few observations, such as western China (H. , as the surface measurements from MEE are mainly over eastern China. More influential factors stemming from regional economic and development differences and more parameters describing the complex meteorological system (e.g., winds at 850 hPa and the pressure system in the mid-troposphere) need to be considered in developing more powerful artificial intelligence models, which could be helpful in improving the accuracy of air pollutant retrievals. The spatiotemporal resolutions of gaseous pollutants will be further improved by integrating information from polar-orbiting and geostationary satellites to investigate diurnal variations. In a future study, we will also reconstruct data records over the last two decades and investigate their long-term spatiotemporal variations, filling the gap of missing observations. This will help us understand their formation mechanisms and impacts on fine particulate matter and ozone pollution in China.
Data availability. CNEMC measurements of gaseous pollutants are available at http://www.cnemc.cn (last access: 1 January 2023; CNEMC, 2023). The reconstructed OMI/Aura tropospheric NO 2 product is available at https://doi.org/10.6084/m9.figshare.13126847.v1 (He et al., 2020b). MODIS series products and the MERRA-2 reanalysis are available at https://search.earthdata.nasa.gov/ (last access: 1 January 2023; NASA, 2023a). The SRTM DEM is available at https://www2.jpl.nasa.gov/srtm/ (last access: 1 January 2023; NASA, 2023b), and LandScan™ population information is available at https://landscan.ornl.gov/ (last access: 1 January 2023; ORNL, 2023). The ERA5 reanalysis is available at https://cds.climate.copernicus.eu/ (last access: 1 January 2023; CDS, 2023), GEOS CF data are available at https://portal.nccs.nasa.gov/datashare/gmao/geos-cf/ (last access: 1 January 2023; NASA, 2023c), and the CAMS reanalysis and emission inventory are available at https://ads.atmosphere.copernicus.eu/ (last access: 1 January 2023; ADS, 2023).
Author contributions. JiW and ZL designed the study. JiW performed the research and wrote the initial draft of this paper. ZL, JuW, CL, and PG reviewed and edited the paper. MC copy-edited the article. All authors made substantial contributions to this work.
Competing interests. At least one of the (co-)authors is a member of the editorial board of Atmospheric Chemistry and Physics. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.

Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Financial support. This research has been supported by the National Aeronautics and Space Administration (grant nos. 80NSSC21K1980, 80NSSC19K0950, and ROSES-2020).
Review statement. This paper was edited by Hailong Wang and reviewed by three anonymous referees.