Spatiotemporal continuous estimates of daily 1-km PM2.5 from 2000 to present under the Tracking Air Pollution in China (TAP) framework

High spatial resolution PM2.5 data covering a long time period are urgently needed to support population exposure assessment and refined air quality management. In this study, we provided complete-coverage PM2.5 predictions with a 1-km spatial resolution from 2000 to the present under the Tracking Air Pollution in China (TAP, http://tapdata.org.cn/) framework. To support high spatial resolution modelling, we collected PM2.5 measurements from both national and local monitoring stations. To correctly reflect the temporal variations in land cover characteristics that affected the local variations in PM2.5, we constructed continuous annual geoinformation datasets, including the road maps and ensemble gridded population maps, in China from 2000 to 2021. We also examined various model structures and predictor combinations to balance the computational cost and model performance. The final model fused 10-km TAP PM2.5 predictions from our previous work, 1-km satellite aerosol optical depth retrievals and land use parameters with a random forest model. Our annual model had an out-of-bag R2 ranging between 0.80 and 0.84, and our hindcast model had a by-year cross-validation R2 of 0.76. This open-access 1-km resolution PM2.5 data product with complete coverage successfully revealed the local-scale spatial variations in PM2.5 and could benefit environmental studies and policy-making.


Introduction
Air pollution has been a nonnegligible environmental problem around the world.China implemented strict clean air policies in the past decade that considerably improved air quality.To support the policy evaluation and air quality management, we constructed the Tracking Air Pollution in China (TAP) platform (http://tapdata.org.cn/), which provides near real-time air pollution, i.e., PM2.5 and O3, distribution at a 0.1 degree (approximately 10 km) spatial resolution, from the fusion of ground measurements, satellite retrievals, and chemical transport model (CTM) simulations (Geng et al., 2021).The TAP data benefited the evaluations of clean air policies and the characterization of air pollution exposure (Xiao et al., 2020;Xiao et al., 2021c;Xiao et al., 2021b).However, with the improved air pollution control targets that require refined air quality management, the detailed monitoring of air pollution distribution at higher spatial resolutions is urgently needed.
Recent developments in machine learning algorithms and remote sensing techniques have fueled the production of air pollution data at high spatiotemporal resolutions.For example, moderate-resolution imaging spectroradiometer (MODIS) products provide aerosol optical depth (AOD) retrievals at a 3-km resolution, contributing to the prediction of ground-level PM2.5 concentrations at the local scale (Xie et al., 2015;He and Huang, 2018;Hu et al., 2019).The multi-angle implementation of atmospheric correction (MAIAC) algorithm provides AOD retrievals at a 1-km resolution and benefits predictions of PM2.5 distribution at a 1-km (Hu et al., 2014;Wei et al., 2020;Wei et al., 2021;Goldberg et al., 2019;Xiao et al., 2017;Bai et al., 2022b) or higher spatial resolution (Huang et al., 2021).Recently, with the Gaofen-5 satellite retrieval, Zhang et al. (2018) predicted the PM2.5 concentration at a 160-m resolution.However, most of these high-resolution data products are limited to after 2013 or cover a specific region of China.Few studies have filled the missing predictions resulted from missing satellite retrievals (Bai et al., 2022b;Ma et al., 2022).This discontinuous PM2.5 prediction in space and time not only limits the application of PM2.5 products in scientific research and environmental management but also biases the characterization of population exposure to PM2.5 pollution (Xiao et al., 2017).Additionally, although high-resolution PM2.5 prediction models widely included various land use data, e.g., road maps, land cover types and points of interest (POIs), to describe the local-scale spatial variations in air pollution emissions and air pollution levels, most studies used only one or two years of land use data during the whole study period and ignored the critical variations in them.This lack of temporal variations in land use data may affect the spatial accuracy of high-resolution PM2.5 predictions.
In this study, we constructed a high-resolution PM2.5 concentration prediction system under the TAP framework.By fusing high-resolution MAIAC satellite retrievals, TAP PM2.5 products at a 10-km resolution, satellite normalized difference vegetation index (NDVI) products, and various continuous long-term land use data, we provide 1-km resolution PM2.5 predictions from 2000 to the present with complete coverage and timely updates.The high quality and easy accessibility of our high-resolution PM2.5 data could support research on air pollution and environmental health at local scales and contribute to the management of local air quality.

Data and Method
The workflow of this study is shown in Fig. 1.First, we processed and assimilated various predictors.The daily scale varying predictors include satellite retrieval, TAP 10-km PM2.5 predictions and meteorological fields, and the land use variables include road map, population distribution, artificial impervious area and vegetation index.In China, the high-speed economic development in the past several decades has led to significant changes in land use and population distribution.To correctly reveal the temporal variations in land use parameters and further benefit the description of local-scale PM2.5 concentration variations, we constructed temporal continuous land use predictors through statistical and spatial modeling.We then optimized the model structure and selected model predictors through various examinations to balance the model performance and

Ground measurements of PM2.5
The hourly PM2.5 measurements from national air quality monitoring stations were downloaded from the China National Environmental Monitoring Center (http://www.cnemc.cn/).To examine the model's prediction ability of in space, we also collected PM2.5 measurements from local air quality monitoring stations operated by local government agencies (Supplementary Fig. 1).The hourly PM2.5 concentration measurements below 1 µg/m 3 , the lowest measurement limit of most monitors, and above 2000 µg/m 3 were excluded for quality control of measurements.Identical continuous measurements during at least three hours were also removed.Hourly data generated daily average PM2.5 concentrations with fewer than 18 hourly measurements were removed.

Full-coverage PM2.5 predictions at a 10-km resolution
The 10-km resolution PM2.5 predictions, which were estimated from a two-stage machine learning modeling system, were downloaded from the TAP website (http://tapdata.org.cn/)(Xiao et al., 2021c).Previous evaluations reported that the PM2.5 prediction model with the out-of-bag R 2 (the R 2 of the linear regression between measurements and predictions from trees that did not include these measurements for training) ranged between 0.80 and 0.88 (Geng et al., 2021).These TAP PM2.5 data at a 10-km resolution with complete coverage are updated in near real-time.
We filled the missing AOD retrievals with daily linear regressions when the AOD from one of the Aqua (crossover at 1:30 pm local time) or Terra satellites (crossover at 10:30 am local time) exists.After the daily linear interpolation, the Aqua and Terra AODs were averaged to reflect the daily aerosol loadings.

Meteorological fields and evaluation
Various reanalysis data products, including ERA5, MERRA-2, and GEOS-FP, have been used to provide meteorological files in previous air pollution prediction models (Geng et al., 2021;Wei et al., 2021;Xiao et al., 2017).In this study, to select the best-performing meteorological data with long temporal data coverage (from 2000 to the present) and timely updating, we evaluated ERA5-Land, ERA5 and MERRA-2 meteorological datasets with meteorological measurements at regional air quality monitoring stations during 2019.The evaluation results showed that ERA5-Land data at a 0.1-degree resolution outperformed the MERRA-2 reanalysis data and the ERA5 reanalysis data (Supplementary Table 1).We extracted and processed the ERA5-Land parameters, including 2-m temperature, 10-m u-and v-component of wind, surface pressure, and total precipitation, for model predictor selection.The 2-m relative humidity was calculated from the 2-m temperature and the 2-m dew point temperature.

Population density
We evaluated and fused various global gridded population density data that were publicly available, including the LandScan Global Population Database from 2000 to 2019 (Dobson et al., 2000); the Gridded Population of the World (GPW) data product (version 4) for 2000, 2005, 2010, 2015, and 2020(Doxsey-Whitfield et al., 2015); and the annual WorldPop data at a 1-km resolution from 2000-2020 (Wardrop et al., 2018;Reed et al., 2018).We linearly interpolated the GPW data for each year.For data quality evaluation, we obtained the sum population at the county or city level from 2000 to 2019 from the China City Yearbooks.The gridded population datasets were aggregated to county-or city sums and compared to the yearbook records.
The LandScan data outperformed the other two datasets (Supplementary Fig. 2); however, the spatial distribution of the LansScan data showed an unreasonable pattern and were excluded.Due to the significant spatial variations in data accuracy (Wang et al., 2004;Bai et al., 2018), we fused the GPW and WorldPop data to improve data quality across space.We first fitted linear regressions between gridded population and yearbook records of each county or city that has at least 6 matched data pairs.Then, we averaged the gridded population with the coefficient of determination (R 2 ) as a weight (Eq.( 1)).We selected the R 2 rather than the root mean square error as the weight because the spatial variation trends were more important than the number of populations in the prediction of PM2.5.
where  , represents the ensemble population of grid g in county/city i of year y;  (), and  (), represent the population of grid g year y of dataset GPW and WorldPop, respectively; and  , 2 and  , 2 represent R 2 of GPW and WorldPop in county/city i.
For counties/cities that did not have sufficient matched data pairs for regression fitting, we employed the weight of the nearest county/city for the ensemble (Supplementary Fig. 3).We subsequently constrained the city level and national sum population to be consistent with the record from the China City Statistical Yearbook and the China Statistical Yearbook.

Land cover
The percentage of artificial impervious area of each 1-km modeling grid was calculated from the annual global artificial impervious area (GAIA) data at a 30-m resolution from 2000 to 2018 (http://data.ess.tsinghua.edu.cn/gaia.html).To estimate the GAIA distribution after 2018, we fitted linear regressions with data from 2016 to 2018 for each grid and extrapolated the GAIA values of 2019, 2020 and 2021.The data from 2013 to 2017 were used to evaluate the performance of this linear extrapolation.The examination results comparing the GAIA data and the first-year, second-year and third-year extrapolated GAIA predictions showed that the R 2 values ranged from 0.996-0.999,0.985-0.989,and 0.969-0.979,respectively.

Road map
Limited road maps are available in China.We collected the annual road maps from 2013 to the present from OpenStreetMap (www.openstreetmap.org)(Barrington-Leigh and Millard-Ball, 2017), a crowdsourced collaborative geographic information collection project.OpenStreetMap data have been widely used for road density analysis and the construction of world road data products (Zhang et al., 2015;Meijer et al., 2018).A previous evaluation study reported on the considerable accuracy of the OpenStreetMap data (Haklay, 2010).To evaluate the quality of the OpenStreetMap data and to fill the historical data gap before 2013, we also collected road maps of 2000,2004,2005,2010,2012,2014 and 2015 from the survey.
We first extracted the length of various types of roads from the OpenStreetMap data and from the road survey at the grid level.
We combined some road types to make the road classification more comparable in OpenStreetMap and in the survey (Supplementary Table 2).To estimate the annual road distribution before 2013, we first compared the grid-level road length of 2014 extracted from OpenStreatMap and the survey map (Supplementary Table 3).Then, we modify the survey map data to construct the OpenStreetMap-type gridded road length of the years that the survey map data were available by the equation listed below: where  ,, represents the OpenStreetMap-type road length of year j over grid i and  , represents the survey-type road length of year j over grid i.
After estimating the OpenStreetMap-type road length of years when the survey maps were available, we filled the gap years by weighted linear interpolation.First, we estimated the city-level sum road length of different road types by a linear mixed effects model (LME) (Meijer et al., 2018): where  ,,, represents the sum road length of city c in province p of year j;  represents the fixed intercept and  ′ represents the province-level random intercept;  1 ,  2 ,  3 ,  4 , and  5 represent the fixed slope of the city average population density, per capita gross domestic product (GDP), city area, city average elevation and city average GAIA;  1 ′ represents the province-level random slope of population density.The log10 transformation was conducted for all the continuous variables to account for the skewed distribution (Meijer et al., 2018).Stepwise linear regression was used to select the significant predictors (Meijer et al., 2018).To evaluate the LME model performance, by-year cross validation and four-year cross validation were where  ,, and  ,, represent the road length over grid i of the starting and ending year with available OSMtype road data, respectively;  ,, and  ,, represent the road length of city c of the starting and ending year with available OSM-type road data, respectively; and  ,, represent the road length of city c of the gap year j.

Other auxiliary datasets
We downloaded the monthly Terra MODIS NDVI (MOD13A3) at a 1-km resolution and filled the missing NDVI data by the nearest neighbor spatial smoothing approach.The average elevation data at a 30-m resolution were extracted from the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) Global Digital Elevation Model (GDEM) version 2.

Data assimilation
All the predictors were assimilated to the 1-km MAIAC pixels by various geostatistical methods.The meteorological data and PM2.5 predictions at a 0.1 degree resolution were downscaled to the 1-km MAIAC pixels with the inverse distance weighting method.The elevation pixels falling in each 1-km grid cell were averaged.The NDVI data were assigned to the MAIAC pixels by the nearest neighbor method.

Optimization and evaluation of the prediction model
To make the PM2.5 prediction process efficient and highly accurate, we designed various examinations to optimize the model structure and identify key predictors of the PM2.5 prediction system.
Three model structures were evaluated: modelTwoStage has a two-stage design with the first stage model predicting the high pollution indicator and the second stage model predicting the residual between 10-km TAP PM2.5 predictions and measurements (Xiao et al., 2021c); modelResi is a one-stage model that predicts the residual between 10-km TAP PM2.5 predictions and measurements; and modelBase is a one-stage model that directly predicts the PM2.5 concentration with the 10km TAP PM2.5 prediction as a predictor.Since the underestimation of high pollution events are widely reported, in addition to the evaluations including all the test measurements, we conducted additional evaluations focusing on the prediction accuracy of haze events when the daily average PM2.5 concentration was higher than the 75 µg/m 3 national secondary air quality standard.
Then, we selected the critical predictors of the PM2.5 prediction model (Fig. 1).We first constructed the full model with all the predictors and then we removed the meteorological predictors in sequence, according to the importance of parameters estimated from the full model.Predictors with the smallest importance were removed first.Data from 2019 were used for model predictor optimization.
Various statistics were employed to characterize model performance.The out-of-bag (OOB) predictions were provided during the training of the random forest that the measurements were predicted by trees that were trained with randomly selected samples without each of these measurements.Comparing the out-of-bag predictions with measurements in linear regression provided us with the out-of-bag R 2 , the root mean square error (RMSE) and the mean prediction error (MPE).To evaluate the model's ability to reveal PM2.5 variations at the local scale and at locations without monitoring, we used the measurements from national stations for model training and the measurements from the high-density local stations from model evaluation.
These local stations are primarily located in Hebei, Henan, Shandong and Chengdu (Supplementary Fig. 1).The evaluation results of OOB and with test data were used to optimize the model structure and select predictors.Then, to evaluate the optimized final model's prediction ability in time, we conducted a by-year cross validation.
Consistent with the previous TAP PM2.5 prediction framework, the missing satellite AOD retrievals were filled by adjusting the first layer of the decision tree and setting the availability status of AOD as the cutoff point of the first layer of the decision trees.The performance of this gap-filling method has been fully evaluated in our previous studies (Xiao et al., 2021a;Geng et al., 2021) 3 Results

Temporal variations in predictors
The high-resolution PM2.5 prediction was supported by various high-resolution predictors.In addition to the 1-km resolution MAIAC AOD, we also constructed and presented various 1-km resolution predictors, including road map, population distribution, artificial impervious area, and NDVI (Fig. 2).
We evaluated the road length model for various road types through by-year cross validation and four-year cross validation (Supplementary Table 4).The cross-validation predictions of all road types were highly correlated with the OSM data.The four-year cross-validation performance were comparable to the by-year cross-validation performance, indicating that the model's temporal prediction ability was robust.The performance of the secondary road model was slightly better than that of the highway model and primary road model, showing higher correlations between local socioeconomic factors and secondary road length relative to highways and primary roads that are constructed nationally.The predicted highway length correctly revealed the temporal trends of the records of highway length from the China traffic yearbooks, with correlation coefficients of 0.99.Since the road type classifications of the OSM and China traffic yearbooks are inconsistent, we did not compare the lengths of other types of roads.We observed a consistent increasing trend in road length for all road types across China (Fig. 2 ).The predicted road maps also displayed the construction of some local landmarks, e.g., the 6 th Ring Road in Beijing.
Compared to the statistical yearbook records, our ensemble population data showed R 2 and root mean square error (RMSE) values of 0.74 and 0.19 million, respectively, outperforming other gridded population data (Supplementary Fig. 4).The changes in population density distribution were inconsistent across China due to the substantial internal migration during the past decades (Fig. 2).For example, we observed that the high-speed economic development in Shenzhen city and the whole Pearl River Delta (PRD) attracted a large migration population, but the populations in small cities in Northeast and Central China were consistent or decrease.The artificial impervious area also increased significantly across China, especially in regions with fast economic development.The consistent increase in NDVI over most parts of China, especially in the southeast, showed the achievement of environmental protection in China.We found considerable missingness in MAIAC retrievals over China, especially in the southeast and northeast regions with large populations.Thus, gap-filling is necessary to provide valuable PM2.5 predictions across China.

Optimization of the high-resolution PM2.5 prediction model
Three model structures, modelTwoStage, modelResi, and modelBase, were examined in this study.The evaluation results showed that these candidate models performed comparably in R 2 in all the evaluations (Fig. 3).However, the modelBase that directly predicts the measurements showed significantly larger prediction error than the other two models during haze events.For some years, e.g., 2017 and 2018, the average prediction error of the modelBase was more than double than the prediction error of modelTwoStage and modelResi.This result was consistent with our previous findings that the prediction of residuals enlarges the response of the dependent variable to the predictors, thus benefiting the prediction of extreme events (Xiao et al., 2021c).We did not observe significant differences between modelTwoStage and modelResi.Thus, considering the prediction performance and the model fitting time expense, the modelResi was selected.We then examined the contribution of meteorological fields to the high-resolution PM2.5 prediction (Supplementary Table 5).
Compared to the full model, the out-of-bag R 2 of the model without meteorological fields decreased from 0.85 to 0.80; however, the R 2 with test data decreased only 0.02, from 0.85 to 0.83.This evaluation showed that the contribution of meteorological fields to high-resolution PM2.5 predictions was limited.Potential reasons include the coarse resolution of the meteorological data limiting the characterization of high-resolution variations in meteorological fields or in PM2.5 distributions.Additionally, the meteorological effects on PM2.5 have been considered in the 0.1-degree PM2.5 data that served as a predictor in the model.
Comprehensively considering the model performance and the meteorological data update frequency, we removed meteorological fields.
Table 1 summarizes the out-of-bag performance of our final annual models and hindcast model.The model R 2 ranged between 0.80 and 0.84 for annual models.The small interannual variations indicated that our model was robust and provided predictions with constant quality during the study period.The very small model mean prediction error (bias) together with the slopes close to 1 showed the inexistence of systemic bias in the prediction models.Our model performance was comparable with previous studies (Huang et al., 2021;Wei et al., 2019;Wei et al., 2020;Wei et al., 2021).The highcast model performed comparably in the OOB evaluation, the test data evaluation and the by-year cross-validation evaluation (Fig. 4), showing great accuracy and high robustness.No considerable overfitting was observed, and no system bias was detected in the spatial prediction (test data) and temporal prediction (by-year cross-validation) examinations.

The spatiotemporal characteristics of the high-resolution PM2.5 map
The high-resolution PM2.5 maps revealed critical local patterns of annual (Fig. 5-6) and daily (Fig. 7-8) pollution distributions that could not be identified by the 0.1-degree resolution maps.Comparing the daily population weighted average PM2.5 concentrations from 2000 to 2021, the number of days with PM2.5 higher than 75 µg/m 3 were significantly reduced after 2013 across China (Fig. 5).BTH showed high pollution levels with the haze days occurred across the year.In recent years, benefited from the pollution control policies, high pollution days in BTH outside winter were basically removed.The annual maps of  Figure 6 highlighted the variations in spatial distribution of PM2.5 at the local scale.We compared the annual PM2.5 anomaly, which is the gridded PM2.5 minus the regional average PM2.5, in 2013 and 2021 in YRD.The pollution hotspots transferred from city center with monitors to rural regions with limited monitoring.We found that after 2013, although the percent of days and counties with population weighted PM2.5 violated than the primary (35 µg/m 3 ) and secondary air quality standard are continuously decreasing, the percent of days and counties with rural-county pollution higher than urban-county pollution significantly increased.In 2013, more than half of days and counties showed higher pollution in urban counties than in rural counties when the PM2.5 was greater than 35 µg/m 3 ; however, in 2021, more than 96% of this pollution days and counties showed lower pollution in urban counties than in rural counties.In 2017 and 2020, all the days with PM2.5 greater than 75 µg/m 3 showed higher rural-county pollution than urban-county pollution.One reason of the transportation of pollution hotspot is that the PM2.5 reduction during 2013-2021 was much greater in city centers than in rural regions.In 2021, most regional high pollution hotspots were transferred to around the junction of cities or towns.The daily maps showed more short-term local pollution variations.Figure 7 shows one haze event during November 18-25, 2013.Since November 18 th , the upper cyclone moved toward northwest and the North China Plain was covered by highpressure ridge with continuously strengthen downdraft, leading to stable atmosphere that was unfavorable for pollution control.
From November 18 -23, the pollution kept increasing and trigged the haze event.Then, since November 24 th , with the upperlevel ridge moved eastward to the ocean, the North China Plain was affected by the trough with increased vertical upward movement and raised boundary layer height.Both the horizontal and vertical diffusion was improved and the pollutant concentrations decreased sharply, leading to the end of this haze event.
The 1-km resolution pollution map was able to reveal regional characteristics.For example, the impact of local transportation emissions was observed on some days in the populous key regions (Fig. 8).These maps also indicated the importance of including time-varying land use data for air pollution predictions, especially in high-resolution predictions, since the land use characteristics led to noticeable spatial variations in the pollution distribution, as expected.To examine the impact of using   When constructing the model structure and selecting the predictors, we not only considered the prediction accuracy but also considered the computation time, data updating frequency and long-term data availability.For example, we did not include the POI data as predictors since we have no access to historical POI data in China, and there is no appropriate model to correctly predict POI distributions in previous publications.Including more land use variables will certainly improve the model accuracy; however, since historical data are unavailable, doing so will increase the uncertainty in historical predictions.Additionally, we did not include any spatial and temporal trends estimated from measurements in the hindcast model that could significantly improve the model performance statistics in the evaluations.The measurement-based spatiotemporal trends did not necessarily reflect the pollution distribution in regions without monitors (Bai et al., 2022a).Since the major aim of data fusing methods is to estimate the PM2.5 variations in regions without monitors, including measurement-based smoothing trends in space and time will hinder the achievement of this goal.
Our model still has several limitations.First, although we improved the model prediction accuracy during high pollution events, we still noticed an underestimation of PM2.5 levels.Several reasons could explain this underestimation.The AOD retrievals tended to misclassify high aerosol loading as cloud cover, leading to missing AOD during haze events.The CTM also hardly predicts high pollution events.The missing satellite retrievals together with the underestimated CTM simulations resulted in the underestimation of pollution levels.Furthermore, although we included some regional monitors to increase the density of monitors for model training, the number of monitors in western China is still insufficient.Thus, the uncertainty of PM2.5 predictions in these regions lacking data could be larger than in the regions with dense data.However, the distribution of monitors in China basically followed the population distribution in which populous regions hold more monitors; thus, the key regions of air pollution control have more training data and high-quality predictions, benefiting air quality management and environmental health studies in the future.

Conclusions
In this study, we constructed a high-resolution PM2.5 prediction model fused MAIAC satellite aerosol optical depth retrievals, 10-km TAP PM2.5 data, and land use variables including road map, population distribution, artificial impervious area and vegetation index.To describe the significant temporal variations in land use characteristics resulted from the economic development in China, we constructed temporal continuous land use predictors through statistical and spatial modelling.
Optimization of model structure and predictors was conducted with various performance evaluation methods to balance the model performance and computing cost.We revealed changes in local scale spatial pattern of PM2.5 associated with pollution control measures.For example, pollution hotspots transferred from city centers to rural regions with limited air quality monitoring.We showed that the land use data affected the predicted spatial distribution of PM2.5 and the usage of updated spatial data is beneficial.The gridded 1-km resolution PM2.5 predictions can be openly accessed through the TAP website (http://tapdata.org.cn/).
computing cost.With the selected model design, we fitted three models under the TAP framework: the hindcast model with training data from 2013 to 2020 to predict historical PM2.5 concentrations from 2000 to 2014; the annual model with training data of each corresponding year from 2015 to 2020; and the near real-time model with rolling one-year training data that provides near real-time PM2.5 predictions until the day before present day.

Figure 1
Figure 1 Workflow of this study.
conducted.Regarding the by-year cross-validation, we selected one-year data for model testing in sequence and used the data of the remaining years for model training.Regarding the four-year cross validation, we selected 2013, 2014, and 2015 for model testing in sequence and used the data four years after the corresponding testing year for model fitting.For example, when selecting 2013 for model testing, the data from 2017, 2018, and 2019 were used to fit the model.After estimating the city-level sum road length, we further used the sum road length as a weight to assign the road length changes to each gap year.The equation is listed below:  ,, =  ,, + ( ,, −  ,, ) ×  ,, − ,,  ,, − ,,

Figure 2
Figure 2 Estimated annual distributions of key predictors, including highway length, population, artificial impervious area, and MAIAC AOD, in 2000, 2010, and 2021 across China.

Figure 3
Figure 3 The performance of modelTwoStage, modelResi, and modelBase in the out-of-bag evaluation (OOB), the evaluation with test data from local stations (Test) and the evaluation with test data higher than 75 µg/m 3 (Test_haze).Left: the evaluation R 2 Right: the average prediction error.

Figure 4
Figure 4 Model performance of the hindcast model trained with all the data from 2013 to 2019

PM2. 5
distribution in 2000 showed that the most polluted regions were located in Beijing, Hebei, and north of Henan; in 2007 and 2013, the highly polluted regions extended and covered BTH, Shandong, Shanxi, Hunan, Sichuan basin and YRD.After 2013 with the strict pollution control policies, the air quality across China was significantly improved and the polluted regions shrinked in 2021.

Figure 6
Figure 6 The spatial distribution of annual average PM2.5 anomaly in 2013 and 2021 in YRD (upper) as well as the changes in annual average PM2.5 between 2013 and 2021 (lower left); and the temporal trends in percent of days and counties with rural-county pollution higher than urban-county pollution in this region.
temporally mismatched land use data on the retrieved spatial patterns of PM2.5, we used historical land use data (GAIA, NDVI, population and road map of 2000) to predict the daily PM2.5 distribution over these key regions of the same day.The historical land use data in 2000 led to incorrect spatial characterizations of the PM2.5 distribution in 2015.

Figure 8
Figure 8 The daily PM2.5 distribution in the Yangtze River Delta (YRD, year 2015 day 25) and PRD (year 2015 day 26), with the TAP 1-km PM2.5 predictions, the prediction with the historical land use predictors of year 2000, and the TAP 0.1-degree PM2.5 predictions.Gray shows the water body.