Himawari-8-derived diurnal variations in ground-level PM2.5 pollution across China using the fast space-time Light Gradient Boosting Machine (LightGBM)

Fine particulate matter with a diameter of less than 2.5 μm (PM2.5) has been used as an important atmospheric environmental parameter mainly because of its impact on human health. PM2.5 is affected by both natural and anthropogenic factors that usually have strong diurnal variations. Such information helps toward understanding the causes of air pollution, as well as our adaptation to it. Most existing PM2.5 products have been derived from polar-orbiting satellites. This study exploits the next-generation geostationary meteorological satellite Himawari-8/AHI (Advanced Himawari Imager) to document the diurnal variation in PM2.5. Given the huge volume of satellite data, a highly efficient tree-based Light Gradient Boosting Machine (LightGBM) method that builds on the idea of gradient boosting and incorporates the spatiotemporal characteristics of air pollution, namely the space-time LightGBM (STLG) model, is developed. An hourly PM2.5 dataset for China (i.e., ChinaHighPM2.5) at a 5 km spatial resolution is derived from Himawari-8/AHI aerosol products with additional environmental variables. Hourly PM2.5 estimates (number of data samples = 1 415 188) are well correlated with ground measurements in China (cross-validation coefficient of determination, CV-R2 = 0.85), with a root-mean-square error (RMSE) and mean absolute error (MAE) of 13.62 and 8.49 μg m−3, respectively. Our model captures the PM2.5 diurnal variations well, showing that pollution increases gradually in the morning, reaches a peak at about 10:00 LT (GMT+8), and then decreases steadily until sunset. The proposed approach outperforms most traditional statistical regression and tree-based machine-learning models with a much lower computational burden in terms of speed and memory, making it well suited to routine pollution monitoring.


Introduction
China has faced severe environmental problems during the last 2 decades, especially air pollution (Chan and Yao, 2008; Z. Li et al., 2017; Wei et al., 2021a). The sources of air pollution are numerous, coming from both natural processes (e.g., forest fires, biomass burning) and human activities (e.g., industrial production, transportation) (Sun et al., 2004; Wei et al., 2019a, b, 2021b). Particulate matter with a diameter of less than 2.5 µm (PM2.5) has a greater impact on the atmospheric environment and climate change than other air pollutants (e.g., PM10, nitrogen dioxide, NO2, and sulfur dioxide, SO2) (Jacob and Winner, 2009; Z. Li et al., 2017; Ramanathan and Feng, 2009). Moreover, these particles can cause great harm to human health due to their smaller size (Delfino et al., 2005; Kampa and Castanas, 2008; Kim et al., 2015; Lelieveld et al., 2015). China has established and operates multiple ground-based observation networks to monitor air pollution in real time across mainland China, including information about PM2.5 pollution.
For near-surface concentrations, the networks provide high-quality PM2.5 measurements every hour (even every few minutes) but with non-uniform coverage. In recent years, an increased effort has been made to estimate PM2.5 with products generated from multiple instruments on sun-synchronous satellites, e.g., the Multi-angle Imaging SpectroRadiometer (MISR) (Liu et al., 2005; van Donkelaar et al., 2006), the Moderate Resolution Imaging Spectroradiometer (MODIS) (Liu et al., 2007; Ma et al., 2014; Wei et al., 2019a, 2020, 2021a), and the Visible Infrared Imaging Radiometer Suite (VIIRS) (Wei et al., 2021c; Wu et al., 2016; Yao et al., 2019). However, because these satellites revisit a given location infrequently (one or two overpasses per day), they are unable to monitor the diurnal variation in pollution. Currently, most available PM2.5 datasets are at low temporal resolutions that cannot meet the requirements of real-time air pollution monitoring (Lennartson et al., 2018). For example, if people know when heavy pollution is likely to occur during the day, they can adjust their outdoor activities accordingly. Following the launch of the Himawari-8 Advanced Himawari Imager (Himawari-8/AHI) on 7 October 2014 (Bessho et al., 2016; Letu et al., 2020), near-surface PM2.5 concentrations in the Eastern Hemisphere can now be estimated and used to examine their diurnal cycle. Wang et al. (2017) used the linear mixed-effect (LME) model, and Sun et al. (2019) applied the geographically weighted regression (GWR) and support vector regression (SVR) models to estimate hourly PM2.5 concentrations in the Beijing-Tianjin-Hebei (BTH) region from the Himawari-8 aerosol optical depth (AOD) product. T. […] developed an improved LME model, and Xue et al. (2020) proposed an improved geographically and temporally weighted regression (IGTWR) model to derive hourly PM2.5 maps based on the Himawari-8 AOD product over central and eastern China.
In addition to traditional statistical regression models, several artificial intelligence models, including the random forest (RF), the gradient boosting decision tree (GBDT), the eXtreme Gradient Boosting (XGBoost), and the deep neural network (DNN), have recently been adopted successfully to obtain ground-level PM2.5 concentrations in local regions and in the whole of China (Gui et al., 2020; Liu et al., 2019; Zhang et al., 2020). Nevertheless, due to their poor data-mining ability, traditional statistical regression methods usually suffer from large uncertainties. While artificial intelligence methods can achieve high accuracies, they are often highly demanding on computational power and are thus often slow. In addition, spatiotemporal variations in PM2.5 have often been neglected in the models developed in previous studies (Liu et al., 2019; Sun et al., 2019; Wang et al., 2017), resulting in relatively low accuracies.
Focusing on the above issues, we have developed a new, highly efficient, and precise method for improving ground-level PM2.5 estimates by incorporating spatial and temporal information into the tree-based Light Gradient Boosting Machine (LightGBM) model. This new model is called the space-time LightGBM (STLG) model, and it has been used to generate a high-quality, high-temporal-resolution (hourly) PM2.5 dataset over eastern China (at a spatial resolution of 5 km) from the Himawari-8/AHI hourly AOD product. Section 2 provides details about the data used and introduces the development of the STLG model. Section 3 validates the hourly PM2.5 estimates and shows the diurnal PM2.5 variations across China. Comparisons with results from traditional models and from previous studies are also presented. Section 4 summarizes the study.

[…] Wei et al., 2020). The latest Himawari-8 version 2 hourly 5 km AODs at 500 nm across mainland China for that year were also collected. This AOD product is synthesized from level 2 10 min AODs generated by a newly developed Lambertian-surface-assumed aerosol retrieval algorithm (Letu et al., 2020; Yoshida et al., 2018). Himawari-8 AOD retrievals have been preliminarily evaluated against in situ AOD retrievals provided by the Aerosol Robotic Network (Giles et al., 2019) and the Sun-Sky Radiometer Observation Network, showing that they are consistent (R = 0.75), with a root-mean-square error (RMSE) and mean absolute error (MAE) of 0.39 and 0.21, respectively (Wei et al., 2019c). Here, only low-uncertainty AOD retrievals (500 nm) were selected for estimating PM2.5 concentrations.

Meteorological conditions
PM2.5 can be significantly affected by meteorological conditions. However, most currently available reanalysis meteorological products have low temporal resolutions (∼ 3-6 h). Recently (14 June 2018), the fifth-generation European Centre for Medium-range Weather Forecasts (ECMWF) global atmospheric reanalysis (ERA5) at a horizontal resolution of 0.25° × 0.25° has been released, as well as the land version (12 July 2019) at a horizontal resolution of 0.1° × 0.1°, both at an hourly timescale (1979 to the present). Here, we use seven ERA5 hourly meteorological parameters, i.e., the 2 m temperature (TEM), total evaporation (ET), relative humidity (RH), 10 m u- and v-components of wind, surface pressure (SP), and boundary-layer height (BLH).

Human influences
Human activity is a key factor affecting PM2.5 pollution. The global annual LandScan™ product at a 1 km spatial resolution for the year 2018 was selected to obtain the population distribution (POP) (Dobson et al., 2000). Monthly anthropogenic source emission data from the Multi-resolution Emission Inventory for China (MEIC) (M. Zheng et al., 2018) were also employed. This dataset is generated from agricultural, industrial, power, residential, and transportation information obtained at more than 700 anthropogenic sources, including a total of 10 atmospheric pollutants and greenhouse gases. Here, four main precursors, i.e., ammonia (NH3), nitrogen oxides (NOx), SO2, and volatile organic compounds (VOCs), were selected, together with direct PM emissions.

Ancillary data
Two additional ancillary datasets, namely, the MODIS monthly normalized difference vegetation index (NDVI) at a horizontal resolution of 0.05° × 0.05° and the Shuttle Radar Topography Mission (SRTM) 90 m digital elevation model (DEM) products, were selected to characterize land cover, its change, and topographical conditions in China. All selected variables (Table 1) with potential impacts on PM2.5 concentrations were resampled to the same spatial resolution as the Himawari-8 aerosol product, namely, 0.05° × 0.05°.

LightGBM model
The LightGBM model, a newly developed tree-based machine-learning approach, was introduced in 2017 (Ke et al., 2017). Using the gradient boosting framework to construct decision trees, this approach can tackle both regression and classification tasks and as such can be extended to PM applications. It also addresses the main challenge faced by traditional machine-learning approaches, namely, their computational complexity, which makes them very time-consuming. LightGBM is a fast, distributed, and highly efficient method that reduces the number of data samples (M) and features (N). The LightGBM model relies on three main techniques when constructing the decision tree.
1. Histogram-based algorithm. Continuous features are first discretized into bins, which are used to construct feature histograms without the need to sort during training. The algorithm then scans the bins to find the best split point from the feature histograms, which can significantly reduce the computation cost of the split gain. The overall complexity is O(M × N).
2. Gradient-based one-side sampling. Data samples are first sorted in descending order according to their absolute gradients, and the top a × 100 % of them are selected as a subset with large gradients. A further b × 100 % of the samples are then randomly chosen from the remaining data as a subset with small gradients. The sampled data with small gradients are multiplied by a weight coefficient (1 − a)/b to compensate for the undersampling. Consequently, a new classifier is learned from the sampled data, and the procedure is repeated until convergence.
3. Exclusive feature bundling. A graph with weighted edges is first constructed, and each weight corresponds to the total number of conflicts between two features. The features are then sorted in descending order according to the degree of each feature (the greater the degree, the greater the conflict with other features). Last, each feature is checked in the sorted sequence, and it is either assigned to an existing bundle with small conflicts or used to create a new bundle.
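As a rough illustration of step 1, the following sketch finds the best split of a single feature from binned gradient sums rather than from sorted raw values. It is a simplified stand-in, not LightGBM's implementation: the function name `best_histogram_split`, the equal-width binning, the default of 16 bins, and the squared-error gain (one unit of Hessian per sample) are all assumptions made for the example.

```python
def best_histogram_split(values, gradients, n_bins=16):
    """Find the best split threshold for one feature from a gradient histogram.

    Assumes a squared-error loss, so each sample contributes a Hessian of 1
    and the split gain depends only on gradient sums and sample counts.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature

    # Accumulate per-bin gradient sums and counts in one pass (no sorting).
    grad_sum = [0.0] * n_bins
    count = [0] * n_bins
    for v, g in zip(values, gradients):
        b = min(int((v - lo) / width), n_bins - 1)
        grad_sum[b] += g
        count[b] += 1

    total_g, total_n = sum(grad_sum), len(values)
    best_gain, best_threshold = 0.0, None
    left_g, left_n = 0.0, 0

    # Scan bin boundaries instead of every distinct feature value.
    for b in range(n_bins - 1):
        left_g += grad_sum[b]
        left_n += count[b]
        right_n = total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        right_g = total_g - left_g
        gain = (left_g ** 2 / left_n + right_g ** 2 / right_n
                - total_g ** 2 / total_n)
        if gain > best_gain:
            best_gain, best_threshold = gain, lo + (b + 1) * width
    return best_gain, best_threshold
```

Building the histogram touches every sample once, whereas the search for the split only touches the bins, which is where the saving over sorting-based split finding comes from.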
In addition to the main techniques mentioned above, LightGBM includes other optimizations, such as the leaf-wise tree growth strategy with depth restriction (Shi, 2007), histogram difference acceleration, sequential access to gradients, and support for categorical features and parallel learning. These advanced methodologies make it possible to reach a high accuracy and efficiency (Ke et al., 2017).
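The gradient-based one-side sampling step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's code: the function name `goss_sample`, the default fractions, and the fixed random seed are hypothetical, and a and b are treated as fractions of the full sample count.

```python
import random


def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return (indices, weights) selected by gradient-based one-side sampling.

    a: fraction of samples kept because they have the largest |gradient|.
    b: fraction of samples drawn at random from the remaining ones.
    """
    n = len(gradients)
    # Sort sample indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top, n_rand = int(a * n), int(b * n)

    top = order[:n_top]    # large-gradient samples are always kept
    rest = order[n_top:]   # small-gradient samples are subsampled
    rng = random.Random(seed)
    sampled = rng.sample(rest, min(n_rand, len(rest)))

    # Up-weight the sampled small-gradient data by (1 - a) / b so that their
    # total contribution approximates that of all small-gradient samples.
    amplify = (1.0 - a) / b
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights
```

With a = 0.2 and b = 0.1, each boosting iteration would see only 30 % of the samples, and each randomly drawn sample would carry a weight of 8.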

Model development
It is well known that air pollution has spatiotemporal heterogeneity, leading to large differences in PM2.5 concentrations in both time and space. Such characteristics have often been ignored in traditional statistical regression and artificial intelligence methods. Studies have shown that including spatiotemporal information improves PM2.5 estimates from remote sensing (Z. […]; Wei et al., 2019a, 2020). Therefore, we have introduced a new approach to integrating spatiotemporal information into the LightGBM model. The new model developed here is called the STLG model. The spatial feature is represented by the geographical distances of one pixel to other points in the circumscribed rectangle of the study region (Baez-Villanueva et al., 2020; Behrens et al., 2018). The distance is calculated using the haversine method (Eq. 1) to reflect the great-circle distance between two points on a sphere (Wei et al., 2021a). The temporal feature is represented by the day of the year (DOY), which is used to distinguish data records on different days of the year during the model training.

$$ d = 2r \arcsin\left( \sqrt{ \sin^2\left(\frac{\phi_2 - \phi_1}{2}\right) + \cos\phi_1 \cos\phi_2 \sin^2\left(\frac{\gamma_2 - \gamma_1}{2}\right) } \right) \qquad (1) $$

where ϕ and γ represent the latitude and longitude of a point on the sphere, respectively (subscripts 1 and 2 denote the two points), and r denotes Earth's mean radius (≈ 6371 km).

Figure 1 illustrates the flowchart of the new STLG model. In addition to Himawari-8 AODs, other auxiliary variables were considered and employed to improve PM2.5-AOD relationships. However, to avoid redundant information, we first calculated the normalized importance (%) of each feature to the PM2.5 estimation during the model training (Fig. 2). This importance represents the total gain of the splits that use the feature during decision-tree construction, not the feature's physical contribution. AOD is found to be the most important feature, accounting for about 17 %.
All meteorological factors have an important impact on the PM2.5 estimation, especially BLH, RH, and TEM (importance > 8 %), followed by two surface-related variables (i.e., NDVI and DEM) and POP. The influence of aerosol precursors and emissions (i.e., NH3, NOx, SO2, PM, and VOC) on the PM2.5 estimation cannot be ignored (importance > 2 %). Therefore, all 16 selected variables are included to establish the final model in this study.
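The space-time features described in this section (haversine distances to points of the circumscribed rectangle, plus DOY) might be computed as in the sketch below. The choice of reference points (four corners and the center), the function names, and the bounding box used in the usage note are assumptions for illustration; the text does not specify which points of the rectangle are used.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0  # Earth's mean radius


def haversine_km(lat1, lon1, lat2, lon2, r=EARTH_RADIUS_KM):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dgamma = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dgamma / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(min(1.0, h)))  # clamp rounding error


def space_time_features(lat, lon, day, bbox):
    """Return [distances to reference points..., day of year] for one record.

    bbox = (lat_min, lat_max, lon_min, lon_max) of the study region.
    The five reference points (corners and center) are an assumption.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    refs = [
        (lat_min, lon_min), (lat_min, lon_max),
        (lat_max, lon_min), (lat_max, lon_max),
        ((lat_min + lat_max) / 2, (lon_min + lon_max) / 2),
    ]
    dists = [haversine_km(lat, lon, rlat, rlon) for rlat, rlon in refs]
    doy = day.timetuple().tm_yday  # temporal feature: 1..365 (or 366)
    return dists + [doy]
```

For example, `space_time_features(39.9, 116.4, date(2018, 3, 1), (18.0, 54.0, 73.0, 135.0))` returns five distances followed by the DOY; such values would be appended to the AOD, meteorological, and other predictors before training.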
Here, two independent 10-fold cross-validation (10-CV) methods (Rodriguez et al., 2010), one based on all the data samples (i.e., out-of-sample) and the other on PM2.5 monitoring stations (i.e., out-of-station), were selected to validate the overall model performance and the spatial prediction ability, respectively.
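The difference between the two validation schemes lies in what is held out: individual records or entire stations. A minimal sketch of both splits is given below, assuming each record is a dictionary with a `"station"` key; the record layout and function names are hypothetical.

```python
import random


def kfold_indices(n_items, k=10, seed=0):
    """Shuffle 0..n_items-1 and deal the indices into k folds."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def out_of_sample_folds(records, k=10, seed=0):
    """Yield (train_idx, test_idx), holding out random individual records."""
    for fold in kfold_indices(len(records), k, seed):
        held = set(fold)
        train = [i for i in range(len(records)) if i not in held]
        yield train, sorted(fold)


def out_of_station_folds(records, k=10, seed=0):
    """Yield (train_idx, test_idx), holding out all records of some stations.

    No station contributes to both sets, so the test score reflects
    prediction at locations the model has never seen.
    """
    stations = sorted({r["station"] for r in records})
    for fold in kfold_indices(len(stations), k, seed):
        held = {stations[j] for j in fold}
        test = [i for i, r in enumerate(records) if r["station"] in held]
        train = [i for i, r in enumerate(records) if r["station"] not in held]
        yield train, test
```

In the out-of-sample split a station usually appears in both the training and test sets, so it mainly measures overall fitting accuracy; the out-of-station split is the stricter test of spatial prediction.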

Spatial-scale performance
The STLG model can largely minimize overfitting and shows a strong data-mining ability (Fig. 3), accurately establishing the relationships between hourly PM2.5 observations and influential variables (i.e., coefficient of determination, R2 = 0.97-0.98, RMSE = 4.18-7.31 µg m−3). Figure 4 illustrates the out-of-sample evaluation results of estimated hourly PM2.5 values over China from 08:00 to 17:00 LT in 2018. The STLG model is highly accurate in estimating hourly PM2.5 concentrations, with high sample-based CV-R2 values ranging from 0.81 to 0.85, strong slopes of ∼ 0.81-0.84, and small y-intercepts of ∼ 5.52-7.84 µg m−3. The uncertainties are overall small, with RMSEs (MAEs) ranging from 11.24 (6.82) µg m−3 to 15.56 (9.79) µg m−3. However, the performance of the STLG model varies slightly throughout the day, with small differences in the main evaluation indicators. The main reason is that fewer training samples are available around sunrise (Fig. 4a and b) and sunset (Fig. 4i and j) in optical remote sensing, which affects the model training. Air pollution also has clear diurnal variations at different PM2.5 pollution levels due to the different intensities of human activities and natural conditions. In general, our model is stable and robust, with an equal out-of-sample CV-R2 of 0.85 and an equal regression slope of 0.81 at most hours during the day in China (Fig. 4c-h).

Furthermore, out-of-station CV-R2 values range from 0.76 to 0.81, and RMSE (MAE) values range from 12.49 (7.85) µg m−3 to 17.61 (11.33) µg m−3 (Fig. 5), indicating that our model has a strong spatial prediction ability and can predict PM2.5 values well in areas without surface observations. The station-based accuracy is only slightly lower than the sample-based accuracy, further illustrating the robustness of our model.
However, both cross-validation results (e.g., slopes = 0.78-0.84) indicate that hourly PM2.5 concentrations are overall underestimated (Figs. 4-5), a common issue in fine-particle remote sensing. This can be explained by the large aerosol retrieval uncertainty, as well as the small number of data samples under highly polluted conditions (Wei et al., 2019c, d).
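The evaluation indicators used in this section can be computed from paired observations and estimates as sketched below. Treating R2 as the squared Pearson correlation of a linear fit of estimates against observations is an assumption about how the scatter-plot statistics are defined, and the function name `evaluate` is hypothetical.

```python
import math


def evaluate(obs, est):
    """Return R2, slope, intercept, RMSE, and MAE for estimates vs. observations."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_e = sum(est) / n
    sxx = sum((o - mean_o) ** 2 for o in obs)
    syy = sum((e - mean_e) ** 2 for e in est)
    sxy = sum((o - mean_o) * (e - mean_e) for o, e in zip(obs, est))

    slope = sxy / sxx                      # least-squares fit: est = slope*obs + b
    intercept = mean_e - slope * mean_o
    r2 = sxy ** 2 / (sxx * syy)            # squared Pearson correlation (assumed)
    rmse = math.sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / n)
    mae = sum(abs(e - o) for o, e in zip(obs, est)) / n
    return {"r2": r2, "slope": slope, "intercept": intercept,
            "rmse": rmse, "mae": mae}
```

A slope below 1 combined with a positive intercept, as reported here, means that estimates are pulled toward the mean: low concentrations tend to be overestimated and high concentrations underestimated, even when R2 is high.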
The regional performance of the STLG model for hourly PM2.5 estimates (Fig. 6) was also evaluated. Hourly PM2.5 estimates (number of data samples, N = 1 151 595) are highly consistent with ground measurements, with a high sample-based CV-R2 of 0.87 and a strong regression slope of 0.86, showing small estimation uncertainties (i.e., RMSE = 12.77 µg m−3, MAE = 8.12 µg m−3) over eastern China. The STLG model performs well (e.g., CV-R2 = 0.88, slope = 0.87) in two typical urban agglomerations of public concern in China, i.e., the Beijing-Tianjin-Hebei (BTH) (Fig. 6b) and Yangtze River Delta (YRD) (Fig. 6c) regions. By contrast, our model performs relatively poorly in the Pearl River Delta (PRD) region (Fig. 6d), possibly due to the significant reduction in the number of data samples caused by frequent, long-term cloud cover in southern China. Note that there are some differences in the uncertainty of hourly PM2.5 estimates mainly because of varying levels of air pollution. The pollution level in the BTH region is about 3 times higher than that in the PRD region.

Figure 7 shows the accuracy of the STLG model at each monitoring station across China. At the individual site scale, the number of data samples, which averages 997 per site in China, gradually decreases from northern China to southern China mainly due to increasing cloud contamination. Except for several scattered monitoring stations in western China, the STLG model has a high performance and adaptability and estimates hourly PM2.5 concentrations well at most monitoring stations (e.g., average CV-R2 = 0.78, RMSE = 12.21 µg m−3, and MAE = 8.17 µg m−3). In general, approximately 76 %, 79 %, and 82 % of monitoring stations show high accuracy, with out-of-sample CV-R2 values > 0.7, RMSE values < 15 µg m−3, and MAE values < 10 µg m−3, respectively, in hourly PM2.5 estimates, especially for those located in central and northern China.

Temporal-scale performance
We first quantified the time series of the bias in hourly PM2.5 estimates during the day in China (Fig. 8). There is a slight temporal dependence in that the PM2.5 bias increases gradually with increasing standard deviation, reaching a maximum around 11:00 LT and subsequently decreasing. This seems to be closely related to the diurnal variation in PM2.5 concentrations […] 2019c) because machine learning is not sensitive to the systematic bias of aerosol retrievals (Wei et al., 2021c). Nevertheless, our model is generally robust and can accurately estimate PM2.5 concentrations with small mean (median) biases of 0.05-0.08 (0.63-0.99) µg m−3 during different hours throughout the day.
We also compared Himawari-8-derived and ground-based PM2.5 diurnal variations from all available monitoring stations in China and three typical urban clusters (Fig. 9). Hourly PM2.5 concentrations derived from the satellite are highly consistent with ground-based measurements, with small differences within ±0.10, ±0.11, ±0.13, and ±0.11 µg m−3 in China and in each region. Moreover, the same diurnal variations in PM2.5 pollution are seen during the day; i.e., concentrations reach their maximum values at 10:00 or 11:00 LT and are lower at sunrise and sunset. These results illustrate that the diurnal PM2.5 variations derived from Himawari-8 are reasonable compared to ground-based measurements.
We investigated the time series of the daily performance of the STLG model in estimating hourly PM2.5 concentrations in China. The number of data samples varies on a daily basis, with an average of 3975 d−1 and with more than 83 % of all days having more than 2000 (Fig. 10). The large gap in the number of data samples is mainly caused […]55, 9.63, 11.83, and 17.57 µg m−3 in spring, summer, autumn, and winter, respectively (Fig. 11).
In general, the overall uncertainty of PM2.5 estimates increases at the beginning and at the end of the year, likely due to the harsher environmental conditions (e.g., low humidity and less precipitation) and more intense human activities (e.g., coal heating and straw burning) in winter and spring.

We have evaluated temporally synthesized PM2.5 data from the hourly data samples at each monitoring station for the year 2018 (Fig. 12). Daily mean PM2.5 estimates are highly correlated to those calculated from surface observations (R2 = 0.91), and the average RMSE (MAE) value is 10.11 (6.39) µg m−3. This suggests that the STLG model can capture daily PM2.5 variations accurately. Note that daily synthetic PM2.5 data derived from geostationary satellites have a higher temporal frequency than data derived from sun-synchronous satellites. In general, PM2.5 synthetic values also have high accuracies and low estimation uncertainties (e.g., R2 = 0.98, RMSE = 1.6-3.3 µg m−3, MAE = 1.1-2.3 µg m−3) from monthly to annual scales, allowing for a better description of spatiotemporal distributions and variations in PM2.5 pollution across China.

Figure 13 shows Himawari-8-derived hourly mean near-surface PM2.5 concentrations from 08:00 to 17:00 LT in 2018 across mainland China. They do not cover western Xinjiang and Tibet due to the limitation of satellite scanning. PM2.5 pollution varies diurnally across China, being at an overall low level at sunrise (∼ 29.94 ± 10.91 µg m−3). With the increase in human activities, air pollution becomes more severe over time, reaching a peak at around 10:00-11:00 LT in China (∼ 36 ± 13 µg m−3). These high levels of pollution can last several hours. As the day progresses, human activities subside, and atmospheric fine particles settle on surfaces. PM2.5 concentrations thus decrease towards sunset in most areas in China (∼ 23.21 ± 9.73 µg m−3).
In general, air pollution in the morning (i.e., 08:00-12:00 LT) is much more severe than in the afternoon (i.e., 13:00-17:00 LT) in China, with morning PM2.5 concentrations about 1.3 times higher than afternoon levels. This is related to the influence of varying BLHs (Z. Su et al., 2018). Table 2 summarizes the diurnal PM2.5 variations in eastern China and three typical urban agglomerations. PM2.5 pollution levels in eastern China are generally higher than the national level at each hour of the day due to the dense human population and intensive human activities. In the BTH region, PM2.5 pollution varies greatly, with hourly PM2.5 concentrations ranging from 28.88 ± 10.16 µg m−3 (16:00 LT) to 49.31 ± 15.03 µg m−3 (10:00 LT) and with differences exceeding 20 µg m−3. PM2.5 pollution remained at a high level (> 42 µg m−3) before 12:00 LT and dropped to a lower level (< 29 µg m−3) after 16:00 LT. This is closely related to people's daily activities and the production and life cycle of PM2.5 during the day, as well as the change in boundary-layer mixing over the course of the day (Lennartson et al., 2018; Wang and Christopher, 2003). Similar patterns and PM2.5 pollution levels are seen in the YRD region. In general, the YRD region is less polluted in the morning but more severely polluted in the afternoon than the BTH region. Compared with the BTH and YRD regions, PM2.5 pollution in the PRD region is much lower and shows a smaller diurnal difference, with hourly PM2.5 values ranging from 29.49 ± 5.97 µg m−3 (11:00 LT) to 36.36 ± 5.76 µg m−3 (08:00 LT). Better natural conditions and fewer pollutant emissions mainly explain this.

Diurnal variations
In general, our satellite-derived diurnal variations in PM2.5 pollution agree well with ground-based observations at both national and regional levels but with generally lower PM2.5 concentrations (Fig. 9). The reason is that the PM2.5 monitoring stations are unevenly distributed, and the number of stations varies greatly at the regional scale. Also, most sites are located in urban areas, leading to inevitable overestimations due to urban-rural differences. However, satellite remote sensing can cope with this deficiency by generating spatially continuous PM2.5 maps, providing more accurate information about the distribution of and variations in PM2.5 pollution.

Seasonal and annual variations
Seasonal PM2.5 maps are synthesized from daily PM2.5 maps from 2018 across China according to our previous approach (Wei et al., 2019a). Our results illustrate that PM2.5 pollution varies greatly on a seasonal scale (Fig. 14). Pollution levels are generally low and show similar spatial patterns in summer (∼ 22.86 ± 7.05 µg m−3) and autumn (∼ 23.76 ± 10.97 µg m−3) across China (Table 3). By contrast, pollution is much more severe in spring (∼ 32.84 ± 11.49 µg m−3) and winter (∼ 39.04 ± 16.32 µg m−3) across China, especially in the BTH and YRD regions in winter. The main reasons are the frequent sandstorms and the long-range transport of sand and dust in spring and the burning of coal and fossil fuels for heating in winter, which leads to more pollutant emissions in northern China.

PM2.5 pollution also shows significant spatial heterogeneities across China (Fig. 15), with an annual mean PM2.5 concentration of 28.99 ± 10.31 µg m−3 in 2018 (Table 3). High pollution levels are consistently observed in the Hebei, Shandong, Jiangsu, Anhui, Henan, Hubei, and Sichuan provinces. Interactions between intensive human activities, adverse stagnant weather (e.g., low BLHs and low winds), and special terrain (e.g., basin) can increase anthropogenic aerosols (Chen et al., 2008; Wang et al., 2018). By contrast, PM2.5 pollution is relatively light in the northeast (e.g., Heilongjiang and Jilin provinces), the southwest (e.g., Tibet and Yunnan provinces), and the eastern coastal areas of China (e.g., Zhejiang and Fujian provinces). These provinces are sparsely populated or experience meteorological conditions favorable for dispersing pollution.

Comparison with traditional models
We first compared results from the STLG model with results from five widely used statistical regression models employed for estimating PM2.5 in China using the same input dataset (Table 4). The multivariate linear regression (MLR) model performs the worst due to the complex nonlinear PM2.5-AOD relationship. The GWR model performs better because it takes into account the spatial characteristics of PM2.5 pollution. The generalized additive model (GAM) and the LME model show overall improved performances with decreasing estimation uncertainties because of their nonlinear characteristics and stronger data regression abilities. The two-stage model outperforms the GAM and LME models with higher CV-R2 values and smaller estimation uncertainties by combining the advantages of the GWR and LME models. Our model performs better than all of the traditional statistical regression models considered mainly due to its stronger data-mining ability.
The first six rows of Table 5 show the accuracies and efficiencies of six tree-based machine-learning models when estimating PM2.5 in China using the same input dataset. The decision tree (DT; Quinlan, 1986) is a traditional, frequently used, supervised learning classification method. Although its training speed is the fastest and its memory consumption is the least, it has the worst performance because of the simple single classifier. The model performances of ensemble-learning approaches, i.e., GBDT (Friedman, 2001), RF (Breiman, 2001), extremely randomized trees (ERTs; Geurts et al., 2006), and XGBoost (Chen and Guestrin, 2016), can be significantly improved by combining several weak classifiers into a strong classifier. Among them, the ERT model yields a higher estimation accuracy and a stronger spatial prediction ability than other ensemble-learning models. The LightGBM model (Ke et al., 2017) performs the best with the highest accuracy and smallest uncertainty among all tree-based machine-learning approaches considered.
The model efficiency differs among these models due to the large differences in the algorithm design frameworks. These tree-based machine-learning models can be divided into two categories. The DT, RF, and ERT models fall into the "bagging" category, which synthesizes multiple independent and unrelated weak classifiers into a strong classifier. This allows for work in parallel, which can save much time but may need more computer memory. The GBDT, XGBoost, and LightGBM models fall into the "boosting" category, which synthesizes multiple interdependent and related weak classifiers into a strong classifier. They can only work in serial, which may take much time but not too much memory. In general, the STGB model is the most time-consuming, while the STET model is the most memory-consuming. By contrast, the LightGBM model runs very fast and consumes very little computer memory, benefiting from a series of algorithm optimizations (Ke et al., 2017).

After considering spatiotemporal variations, all the newly defined space-time DT, GBDT, XGBoost, RF, ERT, and LightGBM models (i.e., STDT, STGB, STXB, STRF, STET, and STLG) show significant improvements over their original models in both overall estimation accuracy and spatial prediction ability when estimating hourly PM2.5 concentrations. This further illustrates the importance of including spatiotemporal information when constructing PM2.5-AOD relationships. More importantly, the training speed of these models did not decrease much, and the memory consumption did not increase much either. In general, the STLG model shows the best performance with a high efficiency (i.e., training time = 46 s, memory usage = 0.60 GB) among all the space-time tree-based machine-learning models. Therefore, our new STLG model is highly valuable for accurate and fast air pollution monitoring, in particular for our future study extended to the global scale.

Comparison with related studies
We compared Himawari-8-based hourly PM2.5 estimates at regional and national scales in China with previous related studies (Table 6). Local hourly PM2.5 concentrations retrieved from our national-scale model are more accurate than those derived from the models developed separately in local areas, e.g., the LME model, the GWR, SVR, RF, and DNN models in the BTH region, and the two-stage RF and DNN models in the YRD region (Fan et al., 2020; Tang et al., 2019). Our model also outperforms most of the statistical regression models and machine-learning models focused on the entirety of China, e.g., the I-LME, IGTWR, RF, AdaBoost, XGBoost, and their stacked models in China (Liu et al., 2019; Xue et al., 2020). This is due to the model's stronger data-mining ability, its use of key spatial and temporal information about air pollution (ignored in previous studies), and the inclusion of more comprehensive factors that affect PM2.5 pollution (e.g., emission inventories).

Summary and conclusion
PM2.5 has a great impact on the atmospheric environment and is also used as a key indicator in environmental health studies. It varies diurnally, affected by both natural and human factors. Previous studies have been based on data from sun-synchronous satellites, which can only monitor air pollution at coarse temporal scales (i.e., daily), while accurate, high-temporal-resolution information on PM2.5 is needed. In this study, the Himawari-8/AHI hourly AOD product is employed to address this issue. Moreover, considering the large volume of input data and the large errors in PM2.5 estimation using traditional methods, an efficient and accurate space-time Light Gradient Boosting Machine (i.e., STLG) model has been developed. It utilizes meteorological, human, land use, and topographical parameters and is implemented at a 5 km resolution and an hourly timescale to generate PM2.5 information over China. The hourly PM2.5 estimates are evaluated against surface observations, and PM2.5 spatiotemporal variations are also investigated. The STLG model predicts hourly PM2.5 values accurately, with high out-of-sample (out-of-station) CV-R2 values of ∼ 0.81-0.85 (∼ 0.76-0.81) and low RMSE values of ∼ 11.24-15.56 (∼ 12.49-17.61) µg m−3 throughout the day. The model can also produce daily (e.g., R2 = 0.91, RMSE = 10.11 µg m−3), monthly, seasonal, and annual mean PM2.5 values (e.g., R2 = 0.98, RMSE = 1.6-3.3 µg m−3). PM2.5 varies diurnally in most areas of mainland China, where PM2.5 concentrations reach a maximum at 10:00 LT and are generally low at sunrise and sunset on a given day. PM2.5 also varies greatly on a seasonal basis, with winter and summer experiencing the highest and lowest air pollution levels, respectively. Comparison results suggest that the proposed model is more accurate than traditional statistical regression models, other tree-based machine-learning models, and various models developed in previous studies.
Overall, the STLG model is more efficient, having faster training speed and less memory consumption. These results illustrate that this algorithm can be useful for real-time monitoring of PM2.5 pollution in China.

Data availability. PM2.5 measurements are available at http://www.cnemc.cn (CNEMC, 2020), the Himawari-8 AOD product is available at https://www.eorc.jaxa.jp/ptree/ (JAXA Himawari Monitor, 2020), ERA5 reanalysis products are available at https://cds.climate.copernicus.eu/ (CDS, 2020), the MODIS product is available at https://search.earthdata.nasa.gov/ (NASA, 2020), and the LandScan™ product is available at https://landscan.ornl.gov/ (ORNL, 2020). The ChinaHighPM2.5 dataset is available at https://weijing-rs.github.io/product.html (Wei, 2020).
Author contributions. JiW designed the research and wrote the initial draft of this manuscript. ZL, RTP, JuW, and LS reviewed and edited the paper. RL and WX helped to process the data. MC copyedited the article. All authors made substantial contributions to this work.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "Satellite and ground-based remote sensing of aerosol optical, physical, and chemical properties over China". It is not associated with a conference.