Unleashing the potential of geostationary satellite observations in air quality forecasting through artificial intelligence techniques

Zhang, Chengxin; Niu, Xinhan; Wu, Hongyu; Ding, Zhipeng; Chan, Ka Lok; Kim, Jhoon; Wagner, Thomas; Liu, Cheng

doi:https://doi.org/10.5194/acp-25-759-2025

Articles | Volume 25, issue 2

Research article

21 Jan 2025

Abstract

Air quality forecasting plays a critical role in mitigating air pollution. However, current physics-based air pollution predictions encounter challenges in accuracy and spatiotemporal resolution due to limitations in the understanding of atmospheric physical mechanisms, observational constraints, and computational capacity. The world's first geostationary satellite UV–Vis spectrometer, i.e., the Geostationary Environment Monitoring Spectrometer (GEMS), offers hourly measurements of atmospheric trace gas pollutants at high spatial resolution over East Asia. In this study, we incorporate geostationary satellite observations into a neural network model (GeoNet) to forecast full-coverage surface nitrogen dioxide (NO₂) concentrations over eastern China at 4 h intervals for the next 24 h. GeoNet leverages spatiotemporal series of satellite NO₂ observations to capture the intricate relationships among air quality, meteorology, and emissions in both temporal and spatial domains. Evaluation against ground-based measurements demonstrates that GeoNet accurately predicts diurnal variations and spatial distribution details of next-day NO₂ pollution, yielding a coefficient of determination of 0.68 and a root mean square error of 12.31 µg m⁻³, significantly surpassing traditional air quality model forecasts. Interpretability analysis of the model reveals that geostationary satellite observations improve NO₂ forecast capability more than the other input features, especially over polluted regions. Our findings demonstrate the significant potential of geostationary satellite observations in artificial-intelligence-based air quality forecasting, with implications for early warning of air pollution events and human health exposure.

Received: 20 Aug 2024 – Discussion started: 30 Aug 2024 – Revised: 30 Oct 2024 – Accepted: 19 Nov 2024 – Published: 21 Jan 2025

1 Introduction

Since the industrial revolution, numerous countries worldwide have encountered severe air pollution issues such as photochemical ozone smog and haze pollution (Hong et al., 2019), which significantly affect human health, crop yields, and the global environment (Manisalidis et al., 2020; Sathe et al., 2021; Guarin et al., 2024). Recent studies have shown that both long-term and short-term exposure to air pollutants such as nitrogen dioxide (NO₂) can significantly affect human health, especially the respiratory system (Meng et al., 2021). Accurate and high-spatial-resolution predictions of air pollutant concentrations can provide critical information that helps sensitive persons mitigate health risks. Meanwhile, air quality health index (AQHI) forecasts and corresponding response recommendations need to be communicated to the public promptly through public facilities (Tang et al., 2024; Fino et al., 2021). In recent decades, the advancement of atmospheric monitoring and modeling has enabled significant progress in air quality forecasting based on our understanding of atmospheric physics and chemistry (Peuch et al., 2022). Air pollution forecasting not only facilitates responses to environmental health risks but also improves the accuracy of climate and weather simulations (Makar et al., 2015). However, due to our still-limited understanding of atmospheric mechanisms and to observational and emission constraints, existing air quality forecasts based on physical or statistical models still face challenges in temporal resolution, spatial resolution, and accuracy (Campbell et al., 2022; Zhong et al., 2021).

Artificial intelligence (AI) technology has made breakthroughs in the field of Earth science (Zhong et al., 2021; Boukabara et al., 2020), particularly excelling in addressing complex problems that are challenging for traditional physical paradigms to simulate (Irrgang et al., 2021), such as weather and climate forecasting (Andersson et al., 2021). For meteorological data, some large-scale deep learning models have surpassed the predictive capabilities of existing numerical weather models to some extent. Examples include ClimaX (Nguyen et al., 2023), Pangu-Weather (Bi et al., 2023), and GraphCast (Lam et al., 2023). Despite the significant progress and impressive performance achieved in forecasting meteorological variables with AI methods, there are still limitations in predicting atmospheric pollutant compositions. Compared to meteorological parameters, air pollutant concentrations are affected by synoptic meteorology, chemistry, and anthropogenic emission activity, usually with more complex driving mechanisms and associated uncertainties. Current AI-based air quality forecasts often involve time series predictions at a limited number of observation stations rather than full-coverage predictions over the entire spatial domain (Du et al., 2021). This is primarily due to the lack of effective air quality observations that offer high temporal and high spatial resolution simultaneously.

While past polar-orbiting satellite instruments such as the Ozone Monitoring Instrument (OMI) and the TROPOspheric Monitoring Instrument (TROPOMI) have provided extensive coverage of atmospheric pollutant distributions such as nitrogen dioxide (NO₂), sulfur dioxide (SO₂), ozone (O₃), and aerosols, they are limited to once-daily overpasses and are usually affected by clouds (van Geffen et al., 2022; Chan et al., 2023). This revisit interval usually exceeds the chemical lifetimes of many reactive-gas pollutants like NO₂, making it challenging to offer effective observational constraints for short-term air quality forecasting with machine learning (Shah et al., 2020). Moreover, observations at a fixed daily overpass time can hardly support the prediction of atmospheric trace gas concentrations at other times of the day under different meteorological conditions. In February 2020, the world's first geostationary satellite payload for air pollution monitoring, the Geostationary Environment Monitoring Spectrometer (GEMS), began to provide high-coverage and high-precision air quality observations at an hourly rate for the East Asian region (Kim et al., 2020). The dynamic processes of air pollutants, including emission, transformation, and transport, can be observed by the geostationary satellite during the daytime. This monitoring capability may advance data-driven air quality forecasting, such as machine learning techniques, by offering unprecedented observational constraints with high spatial and temporal coverage. Recent observing system simulation experiments (OSSEs) indicate that assimilating trace gas observations by geostationary satellites into chemical models can effectively improve surface ozone simulations (Shu et al., 2023) and nitrogen oxide (NO_x) emission estimates (Hsu et al., 2024).

Here, based on the unprecedented temporal and spatial resolution and coverage of the GEMS satellite (Kim et al., 2020), we incorporated geostationary satellite remote sensing of tropospheric NO₂ column densities (refer to Sect. 2 for details) into a neural network model (GeoNet) to forecast full-coverage surface NO₂ concentration over the next day from the current time t (i.e., up to t+24 h). Unlike previous air quality forecasting based on the simulation of atmospheric physics and chemistry, possibly combined with data assimilation approaches, GeoNet relies solely on geostationary satellite measurements and ancillary meteorological data. GeoNet addresses the complex, nonlinear relationships between future short-term air quality and current satellite observations, as well as temporally adjacent meteorological variables (Zhang et al., 2022). The method employs satellite and meteorological variables within the spatial vicinity of individual air quality monitoring sites as input features, with site observations serving as labels for model training. The resulting model achieves accurate air quality predictions across the entire domain over eastern China, which is a significant achievement given that past machine learning approaches have relied on only a few stations or on polar-orbiting satellite observations.

2 Materials and methods

2.1 Geostationary satellite observations of atmospheric NO₂

GEMS is the first UV–Vis spectrometer in a geostationary satellite orbit, measuring atmospheric pollutants such as NO₂, SO₂, O₃, and HCHO over East Asia at a spatial resolution of 3.5 km × 7.5 km at nadir and a temporal resolution of 1 h during the daytime (Kim et al., 2020). Based on the unique spectral absorption of trace gases, the atmospheric NO₂ column can be retrieved in visible wavelengths from the spectra of back-scattered sunlight. The details of the GEMS NO₂ retrieval can be found in the Algorithm Theoretical Basis Document (available at https://nesc.nier.go.kr/ko/html/satellite/doc/doc.do, last access: 1 June 2023). In this study, we used the tropospheric NO₂ column from the GEMS NO₂ version 2.0 product, as well as the cloud fraction for each satellite ground pixel. Overall, GEMS NO₂ measurements correlate well with ground-based remote sensing instruments, with correlation coefficients (R) between 0.69 and 0.81 and root mean square errors (RMSEs) between 3.2 × 10¹⁵ and 4.9 × 10¹⁵ molec. cm⁻² (Kim et al., 2023). Our previous validation results indicated that GEMS NO₂ retrievals generally agreed well with ground-based MAX-DOAS measurements from six sites in China, with correlation coefficients ranging between 0.69 and 0.92 (Li et al., 2023).

2.2 Ancillary datasets

Other input information, including meteorological datasets, is necessary to better constrain the prediction of future NO₂ pollution. Here, both the ERA5 meteorology reanalysis (Hersbach et al., 2020) and the CAMS forecast (Peuch et al., 2022) were used to provide meteorological parameters such as zonal and meridional wind (U wind and V wind), temperature (Temp), relative humidity (RH), and precipitation (Precip). In addition, the cloud fraction available from the satellite NO₂ datasets was also considered. To fill gaps in the satellite NO₂ measurements, we use both the NO₂ concentrations from the WRF-Chem model (Zhang et al., 2022) and the CAMS forecast of atmospheric composition. Note that the reanalysis datasets are typically updated with a delay of 1 week from real time, while the forecast datasets provide meteorology for the next 7 d from the current time. Therefore, the latency of input datasets affects the operational prediction of the GeoNet model. Surface NO₂ measurements were used as the ground-truth label in the model training phase, available from over 1000 national air quality sites via the China National Environmental Monitoring Centre (CNEMC) (Kong et al., 2021).

The preprocessing steps of model input datasets, including outlier detection, missing value handling, resampling, and normalization, are described in Sect. S1 in the Supplement.

2.3 The GeoNet model

Figure 1 illustrates the structure and methodology of the artificial intelligence air quality forecasting model established in this study. Because air quality data are spatiotemporal sequences, predictions must consider not only temporal relationships but also spatial correlations. The deep learning model employed in this research uses convolutional long short-term memory (ConvLSTM) as its kernel, a variant of the LSTM model designed for time series forecasting (Lin et al., 2020). It incorporates a convolutional network structure to capture spatial features of three-dimensional inputs. Both input-to-state and state-to-state transitions involve convolutional structures. ConvLSTM determines the future state of a unit within a grid based on inputs from its local neighbors and past states, allowing it to model the spatiotemporal dynamics of air quality. The ConvLSTM kernel structure employed in training is illustrated in Fig. 1a. Here, X_t represents the input at time t, H_t and H_t−1 denote the outputs at times t and t−1, and C_t and C_t−1 represent the cell states at times t and t−1. The computational process is as follows:

\begin{align}
i_{t} &= \sigma (X_{t} * w_{xi} + H_{t-1} * w_{hi} + b_{i}) \tag{1} \\
f_{t} &= \sigma (X_{t} * w_{xf} + H_{t-1} * w_{hf} + b_{f}) \tag{2} \\
o_{t} &= \sigma (X_{t} * w_{xo} + H_{t-1} * w_{ho} + b_{o}) \tag{3} \\
g_{t} &= \tanh (X_{t} * w_{xg} + H_{t-1} * w_{hg} + b_{g}) \tag{4} \\
C_{t} &= f_{t} \times C_{t-1} + i_{t} \times g_{t} \tag{5} \\
H_{t} &= o_{t} \times \tanh (C_{t}), \tag{6}
\end{align}

where the asterisk (*) represents the convolution operator, the multiplication sign (×) denotes the element-wise product, w is the convolution kernel, b is the bias, tanh is the hyperbolic tangent function, and σ is the sigmoid activation function.
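As an illustration of Eqs. (1)–(6), the following is a minimal NumPy sketch of a single ConvLSTM update. It is not the GeoNet implementation: the helper names (`conv2d_same`, `convlstm_step`, `init_params`), the zero-padded "same" convolution, the kernel size, and the channel counts are all assumptions made for this example.

```python
import numpy as np

def conv2d_same(x, w):
    """Zero-padded 'same' 2D convolution in the deep learning sense
    (cross-correlation). x: (C_in, H, W); w: (C_out, C_in, k, k), odd k."""
    c_out, _, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for dy in range(k):
        for dx in range(k):
            patch = xp[:, dy:dy + h, dx:dx + wd]  # shifted view, (C_in, H, W)
            out += np.einsum("oc,chw->ohw", w[:, :, dy, dx], patch)
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x_t, h_prev, c_prev, params):
    """One ConvLSTM update: gates from Eqs. (1)-(4), states from Eqs. (5)-(6)."""
    def gate(name, act):
        w_x, w_h, b = params[name]
        return act(conv2d_same(x_t, w_x) + conv2d_same(h_prev, w_h)
                   + b[:, None, None])
    i_t = gate("i", sigmoid)          # Eq. (1), input gate
    f_t = gate("f", sigmoid)          # Eq. (2), forget gate
    o_t = gate("o", sigmoid)          # Eq. (3), output gate
    g_t = gate("g", np.tanh)          # Eq. (4), candidate state
    c_t = f_t * c_prev + i_t * g_t    # Eq. (5), element-wise products
    h_t = o_t * np.tanh(c_t)          # Eq. (6)
    return h_t, c_t

def init_params(c_in, c_hidden, k, rng):
    """Small random kernels and zero biases (hypothetical initialization)."""
    return {name: (rng.normal(0.0, 0.1, (c_hidden, c_in, k, k)),
                   rng.normal(0.0, 0.1, (c_hidden, c_hidden, k, k)),
                   np.zeros(c_hidden))
            for name in "ifog"}
```

Calling `convlstm_step` once per hour over an input sequence, while carrying `h_t` and `c_t` forward, gives the recurrent behavior described above; deep learning frameworks provide optimized, trainable equivalents.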

Figure 1. The framework of predicting full-coverage surface NO₂ concentration based on geostationary satellite measurements and a ConvLSTM neural network model (GeoNet). (a) The structure of the ConvLSTM block; (b) a diagram of the GeoNet model structure with input and output; (c) an illustration of the model input parameters, including meteorological variables and hourly NO₂ measurements by the geostationary satellite; (d) the input data cube of different features for a single training batch, which is centered at an air quality site.

The model primarily consists of three components: an encoder, a decoder, and fully connected layers. Tropospheric NO₂ observations from the GEMS satellite for consecutive hours within a day, along with the corresponding meteorological forecast fields, serve as input features for model training. The encoder processes the spatiotemporal sequences of input features for the preceding 8 h (t−7 h, t−6 h, …, t), which are then decoded by the decoder. The final output, representing NO₂ concentrations at 4 h intervals for the next 24 h (t+4 h, t+8 h, t+12 h, …, t+24 h), is produced through fully connected layers. The mean square error (MSE) loss is calculated by comparing the model output with the actual values from station observations, and the model is trained iteratively. In the training task for a single station sample, the model uses consecutive hourly images of all variables within the spatiotemporal vicinity of the station as input (see Fig. 1c–d). This accounts for the intricate correlations in time and space between air quality, satellite observations, and meteorological input features. We trained the GeoNet model with input features covering the whole year of 2021. A randomly selected 75 % of all samples formed the training set, while the remaining 25 % were used as the validation set.
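The sample layout described above (8 hourly input steps, six forecast leads, and a random 75 %/25 % split) can be sketched as follows. This is only an illustration under simplifying assumptions: a gap-free hourly series at one site is assumed, and the function names `make_samples` and `random_split` are hypothetical rather than taken from the GeoNet code.

```python
import numpy as np

N_IN = 8                          # encoder input hours: t-7 h, ..., t
LEADS = (4, 8, 12, 16, 20, 24)    # forecast lead times in hours

def make_samples(features, labels):
    """Cut one site's hourly series into (input window, target) pairs.

    features: (T, C, H, W) hourly feature cube centered on the site
    labels:   (T,) hourly surface NO2 at the site
    returns   X: (N, 8, C, H, W) and y: (N, 6)
    """
    n_hours = features.shape[0]
    xs, ys = [], []
    # t must leave room for 7 earlier hours and for the t+24 h label
    for t in range(N_IN - 1, n_hours - max(LEADS)):
        xs.append(features[t - N_IN + 1:t + 1])
        ys.append([labels[t + lead] for lead in LEADS])
    return np.stack(xs), np.array(ys)

def random_split(n_samples, train_frac=0.75, seed=0):
    """Random index split into training and validation subsets."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(round(train_frac * n_samples))
    return idx[:n_train], idx[n_train:]
```

A purely random split places temporally adjacent windows in both subsets, so validation scores from such a split describe interpolation within the sampled period rather than performance on an unseen period.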

2.4 The model configuration and optimization

The model configurations and hyperparameters, such as the optimizer, loss function, L1 or L2 regularization, dropout, training steps, and epochs, can make a difference in the model performance, including the prediction accuracy and generalizability. Performance metrics such as the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE) were used to diagnose the model (see definitions in Sect. S2). Several scenarios of model hyperparameters were therefore tested during the model training phase. The model accuracy on the validation datasets and the learning curve were used to diagnose the model hyperparameters. The structural parameters of the model mainly include the number of layers and the dimensions of the hidden layers; both control the model's capacity. If the model capacity is too small, underfitting may occur; if it is too large, overfitting may occur. Therefore, selecting an appropriate model capacity is crucial for improving model performance. During the pre-training process, the model is trained with different combinations of the number of layers and the dimension of the hidden layers. The mean square error (MSE) loss is recorded for each training iteration, and a heatmap is generated as shown in Fig. S2. The heatmap shows that the model achieves the minimum MSE loss when the number of layers is 2 and the dimension of the hidden layer is 256. Figure S3 shows the sensitivity of the model loss to different batch size settings, indicating that a batch size of 64 is optimal. Based on the model's MSE loss under different hyperparameter configurations, the best-fitting model can be selected.
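For reference, the four metrics named above can be computed as in the sketch below. It uses the common textbook definitions; the exact definitions adopted in the study are those of Sect. S2 and may differ in detail (for example, R² is sometimes reported as the squared correlation coefficient). The function name `forecast_metrics` is hypothetical.

```python
import numpy as np

def forecast_metrics(obs, pred):
    """R2, RMSE, MAE, and MAPE (%) under common textbook definitions."""
    obs = np.asarray(obs, dtype=float)
    pred = np.asarray(pred, dtype=float)
    err = pred - obs
    ss_res = np.sum(err ** 2)                   # residual sum of squares
    ss_tot = np.sum((obs - obs.mean()) ** 2)    # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "MAE": float(np.mean(np.abs(err))),
        # MAPE is undefined where obs == 0; such samples need masking first
        "MAPE": float(100.0 * np.mean(np.abs(err) / np.abs(obs))),
    }
```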

The learning rate is controlled by the Adam optimization algorithm, which maintains an independent adaptive learning rate for each parameter. The three initialization parameters ϵ, ρ1, and ρ2 of the Adam algorithm are set to 0.0001, 0.9, and 0.99, respectively. The number of epochs is controlled by the early stopping method, which monitors the model's loss on the validation set during training and stops training once the validation loss starts to increase. Because the loss function fluctuates, a threshold ρ is set for the early stopping method in practice: when the validation loss of the model increases for ρ consecutive epochs, the model is rolled back to the state with the lowest validation loss, and training is stopped. The threshold ρ is set to 10 in this paper. Figure S4 shows a typical learning curve of the MSE loss on the training and validation datasets over the learning steps when training an optimal model. Such diagnostics help to avoid model overfitting.
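The early stopping rule with a patience threshold can be sketched as below. This follows the widespread patience convention, which counts consecutive epochs without a new lowest validation loss; whether GeoNet counts epochs in exactly this way is an assumption, and the function name `early_stopping` is hypothetical.

```python
def early_stopping(val_losses, patience=10):
    """Scan per-epoch validation losses and decide where training stops.

    Returns (best_epoch, stop_epoch): the epoch whose weights should be
    restored and the epoch at which training halts.
    """
    best_loss = float("inf")
    best_epoch = -1
    bad_epochs = 0                  # consecutive epochs without a new best
    last_epoch = -1
    for epoch, loss in enumerate(val_losses):
        last_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break               # patience exhausted: stop training
    return best_epoch, last_epoch
```

Because the losses are consumed lazily, `val_losses` can be a generator that trains one epoch and then yields the validation loss, so breaking out of the loop also ends training.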

2.5 The importance of the model input feature

Permutation feature importance is a technique used to assess the significance of each input feature in a machine learning model (Altmann et al., 2010). The core idea is to evaluate the impact of each feature on model performance by randomly shuffling its values and observing the resulting change in the model's accuracy. In this study, for each input feature of GeoNet, we shuffle its values independently while keeping the other features unchanged and then observe the model prediction with the modified input. The difference in the model prediction performance between the original and the shuffled input quantifies the feature's importance. Here, we measure the relative importance of each input feature using the metric 1−R², because it is standardized and indicative (Zhang et al., 2022). Generally, a larger performance drop indicates greater importance, as the model relies heavily on that feature for predictions. Conversely, smaller drops, or even improvements, suggest that the feature may be less crucial or redundant. By permuting the input feature array within different spatial and temporal domains, we can gain a deeper understanding of how feature importance varies spatially and temporally. For example, the relative importance of one meteorological variable may vary over diurnal, weekly, and monthly cycles, revealing the variability in its impact on the predicted NO₂ levels.
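The procedure can be sketched in a few lines of NumPy. The sketch makes several assumptions: the input array has the layout (samples, time, channel, height, width), a feature is shuffled across samples as a whole, and importance is reported as the increase in 1−R² relative to the unshuffled input. The names `r2` and `permutation_importance` and the argument `predict` (any callable mapping inputs to predictions) are hypothetical.

```python
import numpy as np

def r2(obs, pred):
    """Coefficient of determination."""
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def permutation_importance(predict, X, y, n_repeats=5, seed=0):
    """Mean increase in 1 - R2 when one input channel is shuffled.

    predict: callable mapping X of shape (N, T, C, H, W) to predictions (N,)
    returns: array of length C, one importance score per channel
    """
    rng = np.random.default_rng(seed)
    baseline = 1.0 - r2(y, predict(X))
    scores = np.zeros(X.shape[2])
    for c in range(X.shape[2]):
        for _ in range(n_repeats):
            X_perm = X.copy()
            # shuffle channel c across samples; other channels stay intact
            X_perm[:, :, c] = X[rng.permutation(len(X)), :, c]
            scores[c] += (1.0 - r2(y, predict(X_perm))) - baseline
    return scores / n_repeats
```

Restricting the shuffle to samples from one region, month, or hour of day gives the spatially and temporally resolved importance discussed above.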

3 Results and discussion

3.1 Model performance

Based on the GeoNet model and the necessary input data (refer to Sect. 2), we have achieved preliminary predictions of near-surface NO₂ concentration with full spatial coverage and a spatial resolution of 0.1° over eastern China at 4 h intervals over the next 24 h. In this study, we first tested how the model predictions are affected by the choice between reanalysis and forecast meteorology datasets and by the filling of missing values in the satellite observation data. The reanalysis datasets usually have higher precision than the forecast. Previous studies revealed that the accuracy of the information on meteorology and chemical composition significantly affects the performance of machine learning models in estimating air pollutant concentrations (Zuo et al., 2023; Wang et al., 2024). Due to the shielding effect of clouds, a considerable proportion of satellite NO₂ observations may be missing. Recent big-data research on air quality has usually required the gap-filling of missing satellite data, by either spatial interpolation or regression techniques, before inputting them into the machine learning model (Kim et al., 2021). We tested three methods for handling missing data: setting them to a fill value of zero, replacing them with real-time CAMS-simulated NO₂, or replacing them with WRF-Chem-simulated NO₂ (not real-time, but with higher precision).
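The three gap-handling options reduce to one masking operation, sketched below. The sketch assumes that missing retrievals are encoded as NaN and that the model NO₂ field (CAMS or WRF-Chem) has already been regridded to the satellite grid; the function name `fill_gaps` is hypothetical.

```python
import numpy as np

def fill_gaps(sat_no2, model_no2=None, fill_value=0.0):
    """Fill missing satellite NO2 pixels (NaN) and return the gap mask.

    sat_no2:   satellite NO2 field with NaN at missing pixels
    model_no2: optional model field on the same grid (e.g., CAMS or
               WRF-Chem NO2); when None, gaps receive `fill_value`
    """
    missing = np.isnan(sat_no2)
    replacement = fill_value if model_no2 is None else model_no2
    filled = np.where(missing, replacement, sat_no2)
    return filled, missing
```

The returned mask can be kept as an additional input channel so that a network can distinguish filled pixels from genuine observations.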

The comparison results for the validation datasets indicate that the scenario using CAMS meteorology datasets and replacing missing satellite NO₂ data with fill values (Fig. 2c) yields a modest NO₂ prediction performance with R² = 0.68 and RMSE = 12.26 µg m⁻³. In contrast, the configuration scenario using ERA5 reanalysis meteorology and imputing with WRF-Chem simulations (Fig. 2a) yields the best prediction performance of R² = 0.69 and RMSE = 11.88 µg m⁻³. This may indicate that the cloud mask inputs diminish the importance of imputing missing satellite data, especially since the model can extract informative features from spatially and temporally neighboring inputs. As a compromise between real-time availability and accuracy, we selected the configuration scenario using CAMS meteorology and imputing with CAMS NO₂ (Fig. 2d) for subsequent discussion and operational forecasting, with R² = 0.68 and RMSE = 12.31 µg m⁻³. In summary, using higher-precision meteorology and filling missing NO₂ data enhances the model's prediction accuracy on the validation dataset, but only to a rather limited extent. This suggests that, unlike previous machine learning techniques, GeoNet can adapt to three-dimensional inputs of varying accuracy and type, exploit the spatiotemporal correlation of data features, and generalize well.

Figure 2. The GeoNet prediction performance of the surface NO₂ concentration compared to the validation samples, based on different input datasets of meteorology and atmospheric composition: (a) use ERA5 meteorology and fill satellite measurement gaps with WRF-Chem-simulated NO₂; (b) use ERA5 meteorology and a NO₂ fill value of zero for gaps; (c) use CAMS meteorology and a NO₂ fill value of zero for gaps; (d) use CAMS meteorology and CAMS NO₂. The left plot shows the scatter comparisons between GeoNet predictions and site observations, while the right plot shows the bias distribution between the two.

Figures S5–S8 provide an overview of the major metrics (e.g., R², RMSE, MAE, and MAPE) of GeoNet prediction performance varying with prediction hours from t+4 h to t+24 h in different months. The results indicate that the NO₂ forecasts correlate better with observations during the spring and winter seasons than during the summer, while the RMSE shows the opposite trend. This could be attributed to much higher NO₂ pollution levels in winter months. Additionally, GeoNet's NO₂ prediction errors gradually increase during the next 24 h, particularly after t+20 h. This is primarily due to the short lifetime of atmospheric NO₂, which leads to a diminishing constraint from historical observational data on future NO₂ predictions. Similar phenomena are also observed in machine learning or model-assisted weather forecasts (Andersson et al., 2021).

To assess the GeoNet model's performance for short-term pollution events, we compared it with near-surface NO₂ from CAMS forecasts and in situ observations from CNEMC ground stations. Figure S9 illustrates the daily time series of t+4 h NO₂ from GeoNet, CAMS, and CNEMC for three typical sites in Beijing, Shanghai, and Guangzhou in 2021. As the plot shows, NO₂ predictions by both GeoNet and CAMS generally agree with the variation trends of CNEMC measurements. However, CAMS forecasts systematically overestimate the surface NO₂ concentration by 100 %, possibly resulting from biases in the NO_x emission inventory (Douros et al., 2023). Compared to CAMS, the GeoNet prediction aligns more closely with the ground-truth observations at CNEMC sites over eastern China, with an overall R² > 0.5 and mean bias < 5 µg m⁻³ for polluted regions (see Figs. S10 and S11, respectively).

Figure 3. (a) The overall relative importance of different input features such as wind, surface pressure, satellite NO₂, and cloud mask in GeoNet NO₂ forecasting, varying with different hour steps from t+4 h to t+24 h. (b) The spatial distribution of the relative importance of satellite NO₂ measurements in the GeoNet NO₂ forecast in 2021.

3.2 Main factors in NO₂ forecasting and their implications

Previous physics-based numerical models of air quality prediction, e.g., the CAMS global forecast model and the regional WRF-CMAQ model (Liu et al., 2023; Kumar et al., 2021; Kuhn et al., 2024), can simulate the physical and chemical atmospheric processes (such as advection, diffusion, deposition, and chemical reactions) by solving the atmospheric equations. Recent data assimilation techniques further take real-time monitoring data from satellite and ground-based platforms as model constraints to better predict air quality variables (Inness et al., 2022). Compared with physics-based models, “black box” models such as deep learning techniques usually lack interpretability and explainability (Zhang and Zhu, 2018). This hinders the understanding of predictions of air quality variables such as NO₂ and of their implications. Here, we measure the relative importance of each input feature for the NO₂ forecast accuracy by iteratively permuting the input array and observing its influence on the model prediction.

Figure 3a presents the relative importance (1−R²) of different input features varying with prediction hour steps from t+4 h to t+24 h. The geostationary satellite NO₂ measurements play the biggest role in predicting surface NO₂ levels of the next day, although their importance declines after t+8 h. Other meteorological input features also show a major impact on NO₂ prediction performance. The significance of the different input variables remained generally consistent across seasons, with minor variations (as shown in Fig. S12). Figure 3b shows the spatial distribution of the relative importance of geostationary satellite NO₂ for the prediction performance, derived by permuting the input array for each ground pixel. Overall, satellite NO₂ has a higher impact in densely populated areas experiencing severe air pollution, such as the Pearl River Delta, Yangtze River Delta, and Jianghuai Plain, than in western China. Such results highlight the underappreciated role of satellite NO₂ measurements with high spatial and temporal coverage in air pollution forecasts.

3.3 NO₂ pollution episodes and health exposure forecast

Beyond its prediction accuracy, GeoNet exhibits a pronounced advantage in spatial coverage and resolution, allowing finer-scale details in the pollutant distribution to be captured. As illustrated in Fig. 4, GeoNet performs well in predicting spatial nuances of NO₂ pollution when compared with ground-based and satellite observations. During a typical winter NO₂ pollution event (as shown in Fig. 5), GeoNet accurately simulates a significant decrease in concentrations at 11:00 and 15:00 local solar time, probably caused by intense photochemical activity in the daytime and consistent with ground-based observations. It also outperforms CAMS in predicting NO₂ variations throughout the day. The GeoNet model also retains the distributional differences in NO₂ concentrations between urban and rural areas, consistent with emission source characteristics and satellite observations. The suboptimal performance of CAMS predictions can be attributed to insufficient observational constraints and the use of outdated emission inventories (Douros et al., 2023). In the European region, the assimilation of TROPOMI observations into CAMS forecasts significantly improves the simulation accuracy of near-surface NO₂ concentrations and tropospheric column densities (Inness et al., 2019). Neural network methods similar to GeoNet could be used to correct and downscale the forecast results of existing models (Baghanam et al., 2024). This approach holds promise for achieving operational air quality forecasts that balance efficiency and accuracy.

Figure 4. The comparisons of annual surface NO₂ concentrations from GeoNet, CAMS, and CNEMC, respectively (a–c), as well as the tropospheric NO₂ column observations from GEMS and TROPOMI over eastern China in 2021 (d–e).

Figure 5. The spatial distribution comparisons of surface NO₂ concentration between (a) GeoNet prediction at the original resolution of 0.1°, (b) GeoNet prediction resampled to the CAMS resolution of 0.4°, (c) CAMS prediction, and (d) ground-based CNEMC site measurements. Note that the results are presented for successive local hours (labeled in each subplot) on 23 November 2021.

In this study, we used a simplified linearized risk model for the short-term NO₂ exposure (Meng et al., 2021; Zhang et al., 2022) to calculate the distribution of all-cause mortality risks based on GeoNet NO₂ predictions (see Fig. 6). Short-term NO₂ exposure leads to remarkable regional differences in all-cause mortality, which are mainly concentrated in highly polluted and densely populated urban areas. For both urban and suburban locations in Beijing (see Fig. 6c–d), GeoNet-based NO₂ pollution exposure predictions are more consistent with actual in situ observations than the CAMS forecasts. Current air quality health index forecasting based on limited station data has significant gaps, making it difficult to meet the refined needs for different populations in urban, suburban, and rural areas. Integrating GeoNet forecasts based on hourly geostationary satellite observations can support spatially comprehensive and fine-scale air quality health risk prediction. This, in turn, guides the management of the risks of air-pollution-exposure-related diseases in sensitive populations and communities.

Figure 6. Mortality risk of short-term NO₂ exposure based on the GeoNet prediction on 23 November 2021. (a) Mean mortality due to the predicted NO₂ exposure in eastern China, (b) a zoom-in map over Beijing and its neighboring area. Panels (c) and (d) are comparisons of mortality estimation over Beijing's urban and rural regions (the rectangular areas presented in b), respectively, based on different NO₂ exposure predictions among GeoNet, CAMS, and CNEMC.

4 Conclusion

The GeoNet model utilizes the unprecedented hourly air quality observations from geostationary satellites and resolves nonlinear associations in spatiotemporal proximity across multiple data sources. It achieves seamless short-term regional air quality predictions, exhibiting significant performance advantages over existing machine learning air quality prediction models. To strike a balance between real-time and accuracy requirements, we evaluated the impact of using reanalysis- and forecast-based meteorology datasets, as well as imputing the missing values of satellite NO₂. The findings reveal that the GeoNet model demonstrates robust generalization across diverse datasets, with minimal fluctuations in prediction performance. Overall, the model achieves an RMSE of 12.31 µg m⁻³ and an R² of 0.68 when predicting NO₂ concentrations every 4 h for the next 24 h. However, validation accuracy notably diminishes after t+16 h within the next 24 h, with stronger predictive correlations observed in seasons characterized by severe pollution, such as spring and winter, compared to summer. The variation in the model forecasting performance also shows that accurate prediction for longer time windows and heavy-pollution events is still a major difficulty. This may be due to the high level of uncertainty in emissions and meteorology. In the future, a combination of higher resolution and more accurate multi-source data constraints, as well as machine learning models coupled with physical atmospheric mechanisms, may be needed to improve the existing forecasts.

Compared to traditional chemical model forecasts and data assimilation predictions, the GeoNet model handles various data sources, including meteorological simulations and air quality observations, and more accurately captures the spatial intricacies of air pollution evolution. The GeoNet framework presented in this study forecasts short-term, near-surface NO₂ concentrations and shows transfer learning potential for predicting other pollutants. This work also has important implications for the prediction of near-surface O₃ and particulate matter. For example, integrating vertical O₃ profiles from the GEMS satellite, in particular near-surface layer concentrations, with its joint observations of important O₃ precursors, including NO₂ and HCHO, is expected to significantly reduce the uncertainty in existing estimates of near-surface air pollution. This study underscores the pivotal role of next-generation geostationary satellite observations of air pollution constituents in air quality forecasting, with the potential to advance operational air quality forecasting and mitigate associated health risks by integrating machine learning technologies.

Code and data availability

The GEMS NO₂ v2.0 data are available from the National Institute of Environmental Research (NIER) of South Korea (https://nesc.nier.go.kr/en/html/index.do, last access: 10 September 2023, NIER, 2023). The surface NO₂ measurements were downloaded from the CNEMC real-time air quality platform (https://air.cnemc.cn:18007/, last access: 8 June 2023, CNEMC, 2023). ERA5 reanalysis meteorological data were obtained from the European Centre for Medium-Range Weather Forecasts (https://doi.org/10.24381/cds.adbb2d47, Hersbach et al., 2023). CAMS forecasts of meteorological and atmospheric NO₂ datasets were retrieved from the CAMS Atmosphere Data Store (https://ads.atmosphere.copernicus.eu/datasets/cams-global-atmospheric-composition-forecasts?tab=download, Copernicus Atmosphere Monitoring Service, 2023). The source code of the GeoNet model, the surface NO₂ predictions, and the necessary input data can be obtained from Chengxin Zhang (zcx2011@ustc.edu.cn) upon reasonable request.

Supplement

The supplement related to this article is available online at: https://doi.org/10.5194/acp-25-759-2025-supplement.

Author contributions

CZ implemented the GeoNet model and analyzed the data. CL supervised the study. CZ wrote the manuscript with input from all co-authors.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

Acknowledgements

The authors acknowledge the advanced computing resources provided by the Supercomputing Center of the University of Science and Technology of China.

Financial support

This study was supported by the National Natural Science Foundation of China (grant nos. 42225504, 62305322, and 42375120), the National Key Research and Development Program of China (grant nos. 2022YFC3700100 and 2023YFC3706104), the Fundamental Research Funds for the Central Universities (grant nos. YD2090002021 and WK2090000038), and the New Cornerstone Science Foundation through the XPLORER PRIZE (grant no. 2023-1033).

Review statement

This paper was edited by Carl Percival and reviewed by two anonymous referees.

References

Altmann, A., Tolosi, L., Sander, O., and Lengauer, T.: Permutation importance: a corrected feature importance measure, Bioinformatics, 26, 1340–1347, https://doi.org/10.1093/bioinformatics/btq134, 2010.

Andersson, T. R., Hosking, J. S., Pérez-Ortiz, M., Paige, B., Elliott, A., Russell, C., Law, S., Jones, D. C., Wilkinson, J., and Phillips, T.: Seasonal Arctic sea ice forecasting with probabilistic deep learning, Nat. Commun., 12, 5124, https://doi.org/10.1038/s41467-021-25257-4, 2021.

Baghanam, A. H., Nourani, V., Bejani, M., Pourali, H., Kantoush, S. A., and Zhang, Y.: A systematic review of predictor screening methods for downscaling of numerical climate models, Earth-Sci. Rev., 253, 104773, https://doi.org/10.1016/j.earscirev.2024.104773, 2024.

Bi, K., Xie, L., Zhang, H., Chen, X., Gu, X., and Tian, Q.: Accurate medium-range global weather forecasting with 3D neural networks, Nature, 619, 533–538, 2023.

Boukabara, S.-A., Krasnopolsky, V., Penny, S. G., Stewart, J. Q., McGovern, A., Hall, D., Ten Hoeve, J. E., Hickey, J., Allen Huang, H.-L., and Williams, J. K.: Outlook for exploiting artificial intelligence in the earth and environmental sciences, B. Am. Meteorol. Soc., 102, E1016–E1032, 2020.

Campbell, P. C., Tang, Y., Lee, P., Baker, B., Tong, D., Saylor, R., Stein, A., Huang, J., Huang, H.-C., Strobach, E., McQueen, J., Pan, L., Stajner, I., Sims, J., Tirado-Delgado, J., Jung, Y., Yang, F., Spero, T. L., and Gilliam, R. C.: Development and evaluation of an advanced National Air Quality Forecasting Capability using the NOAA Global Forecast System version 16, Geosci. Model Dev., 15, 3281–3313, https://doi.org/10.5194/gmd-15-3281-2022, 2022.

Chan, K. L., Valks, P., Heue, K.-P., Lutz, R., Hedelt, P., Loyola, D., Pinardi, G., Van Roozendael, M., Hendrick, F., Wagner, T., Kumar, V., Bais, A., Piters, A., Irie, H., Takashima, H., Kanaya, Y., Choi, Y., Park, K., Chong, J., Cede, A., Frieß, U., Richter, A., Ma, J., Benavent, N., Holla, R., Postylyakov, O., Rivera Cárdenas, C., and Wenig, M.: Global Ozone Monitoring Experiment-2 (GOME-2) daily and monthly level-3 products of atmospheric trace gas columns, Earth Syst. Sci. Data, 15, 1831–1870, https://doi.org/10.5194/essd-15-1831-2023, 2023.

China National Environmental Monitoring Centre (CNEMC): real-time air pollutants measurements dataset, CNEMC [data set], https://air.cnemc.cn:18007/, last access: 10 November 2023.

Copernicus Atmosphere Monitoring Service: CAMS global atmospheric composition forecasts, Atmosphere Data Store [data set], https://ads.atmosphere.copernicus.eu/datasets/cams-global-atmospheric-composition-forecasts?tab=download, last access: 22 June 2023.

Douros, J., Eskes, H., van Geffen, J., Boersma, K. F., Compernolle, S., Pinardi, G., Blechschmidt, A.-M., Peuch, V.-H., Colette, A., and Veefkind, P.: Comparing Sentinel-5P TROPOMI NO₂ column observations with the CAMS regional air quality ensemble, Geosci. Model Dev., 16, 509–534, https://doi.org/10.5194/gmd-16-509-2023, 2023.

Du, S., Li, T., Yang, Y., and Horng, S. J.: Deep Air Quality Forecasting Using Hybrid Deep Learning Framework, IEEE T. Knowl. Data En., 33, 2412–2424, https://doi.org/10.1109/TKDE.2019.2954510, 2021.

Fino, A., Vichi, F., Leonardi, C., and Mukhopadhyay, K.: An overview of experiences made and tools used to inform the public on ambient air quality, Atmosphere-Basel, 12, 1524, https://doi.org/10.3390/atmos12111524, 2021.

Guarin, J. R., Jägermeyr, J., Ainsworth, E. A., Oliveira, F. A. A., Asseng, S., Boote, K., Elliott, J., Emberson, L., Foster, I., Hoogenboom, G., Kelly, D., Ruane, A. C., and Sharps, K.: Modeling the effects of tropospheric ozone on the growth and yield of global staple crops with DSSAT v4.8.0, Geosci. Model Dev., 17, 2547–2567, https://doi.org/10.5194/gmd-17-2547-2024, 2024.

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., de Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.-N.: The ERA5 global reanalysis, Q. J. Roy. Meteor. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020.

Hersbach, H., Bell, B., Berrisford, P., Biavati, G., Horányi, A., Muñoz Sabater, J., Nicolas, J., Peubey, C., Radu, R., Rozum, I., Schepers, D., Simmons, A., Soci, C., Dee, D., and Thépaut, J.-N.: ERA5 hourly data on single levels from 1940 to present, Copernicus Climate Change Service (C3S) Climate Data Store (CDS) [data set], https://doi.org/10.24381/cds.adbb2d47, 2023.

Hong, C., Zhang, Q., Zhang, Y., Davis, S. J., Tong, D., Zheng, Y., Liu, Z., Guan, D., He, K., and Schellnhuber, H. J.: Impacts of climate change on future air quality and human health in China, P. Natl. Acad. Sci. USA, 116, 17193–17200, 2019.

Hsu, C. H., Henze, D. K., Mizzi, A. P., González Abad, G., He, J., Harkins, C., Naeger, A. R., Lyu, C., Liu, X., and Chan Miller, C.: An Observing System Simulation Experiment Analysis of How Well Geostationary Satellite Trace-Gas Observations Constrain NO_x Emissions in the US, J. Geophys. Res.-Atmos., 129, e2023JD039323, https://doi.org/10.1029/2023JD039323, 2024.

Inness, A., Flemming, J., Heue, K.-P., Lerot, C., Loyola, D., Ribas, R., Valks, P., van Roozendael, M., Xu, J., and Zimmer, W.: Monitoring and assimilation tests with TROPOMI data in the CAMS system: near-real-time total column ozone, Atmos. Chem. Phys., 19, 3939–3962, https://doi.org/10.5194/acp-19-3939-2019, 2019.

Inness, A., Aben, I., Ades, M., Borsdorff, T., Flemming, J., Jones, L., Landgraf, J., Langerock, B., Nedelec, P., Parrington, M., and Ribas, R.: Assimilation of S5P/TROPOMI carbon monoxide data with the global CAMS near-real-time system, Atmos. Chem. Phys., 22, 14355–14376, https://doi.org/10.5194/acp-22-14355-2022, 2022.

Irrgang, C., Boers, N., Sonnewald, M., Barnes, E. A., Kadow, C., Staneva, J., and Saynisch-Wagner, J.: Towards neural Earth system modelling by integrating artificial intelligence in Earth system science, Nature Machine Intelligence, 3, 667–674, 2021.

Kim, J., Jeong, U., Ahn, M.-H., Kim, J. H., Park, R. J., Lee, H., Song, C. H., Choi, Y.-S., Lee, K.-H., and Yoo, J.-M.: New era of air quality monitoring from space: Geostationary Environment Monitoring Spectrometer (GEMS), B. Am. Meteorol. Soc., 101, E1–E22, https://doi.org/10.1175/BAMS-D-18-0013.1, 2020.

Kim, M., Brunner, D., and Kuhlmann, G.: Importance of satellite observations for high-resolution mapping of near-surface NO₂ by machine learning, Remote Sens. Environ., 264, 112573, https://doi.org/10.1016/j.rse.2021.112573, 2021.

Kim, S., Kim, D., Hong, H., Chang, L.-S., Lee, H., Kim, D.-R., Kim, D., Yu, J.-A., Lee, D., Jeong, U., Song, C.-K., Kim, S.-W., Park, S. S., Kim, J., Hanisco, T. F., Park, J., Choi, W., and Lee, K.: First-time comparison between NO₂ vertical columns from Geostationary Environmental Monitoring Spectrometer (GEMS) and Pandora measurements, Atmos. Meas. Tech., 16, 3959–3972, https://doi.org/10.5194/amt-16-3959-2023, 2023.

Kong, L., Tang, X., Zhu, J., Wang, Z., Li, J., Wu, H., Wu, Q., Chen, H., Zhu, L., Wang, W., Liu, B., Wang, Q., Chen, D., Pan, Y., Song, T., Li, F., Zheng, H., Jia, G., Lu, M., Wu, L., and Carmichael, G. R.: A 6-year-long (2013–2018) high-resolution air quality reanalysis dataset in China based on the assimilation of surface observations from CNEMC, Earth Syst. Sci. Data, 13, 529–570, https://doi.org/10.5194/essd-13-529-2021, 2021.

Kuhn, L., Beirle, S., Kumar, V., Osipov, S., Pozzer, A., Bösch, T., Kumar, R., and Wagner, T.: On the influence of vertical mixing, boundary layer schemes, and temporal emission profiles on tropospheric NO₂ in WRF-Chem – comparisons to in situ, satellite, and MAX-DOAS observations, Atmos. Chem. Phys., 24, 185–217, https://doi.org/10.5194/acp-24-185-2024, 2024.

Kumar, V., Remmers, J., Beirle, S., Fallmann, J., Kerkweg, A., Lelieveld, J., Mertens, M., Pozzer, A., Steil, B., Barra, M., Tost, H., and Wagner, T.: Evaluation of the coupled high-resolution atmospheric chemistry model system MECO(n) using in situ and MAX-DOAS NO₂ measurements, Atmos. Meas. Tech., 14, 5241–5269, https://doi.org/10.5194/amt-14-5241-2021, 2021.

Lam, R., Sanchez-Gonzalez, A., Willson, M., Wirnsberger, P., Fortunato, M., Alet, F., Ravuri, S., Ewalds, T., Eaton-Rosen, Z., and Hu, W.: Learning skillful medium-range global weather forecasting, Science, 382, 1416–1421, 2023.

Li, Y., Xing, C., Peng, H., Song, Y., Zhang, C., Xue, J., Niu, X., and Liu, C.: Long-term observations of NO₂ using GEMS in China: Validations and regional transport, Sci. Total Environ., 904, 166762, https://doi.org/10.1016/j.scitotenv.2023.166762, 2023.

Lin, Z., Li, M., Zheng, Z., Cheng, Y., and Yuan, C.: Self-attention convlstm for spatiotemporal prediction, in: Proceedings of the AAAI Conference on Artificial Intelligence, New York, February 2020, vol. 34, 11531–11538, 2020.

Liu, C., Wu, C., Kang, X., Zhang, H., Fang, Q., Su, Y., Li, Z., Ye, Y., Chang, M., and Guo, J.: Evaluation of the prediction performance of air quality numerical forecast models in Shenzhen, Atmos. Environ., 314, 120058, https://doi.org/10.1016/j.atmosenv.2023.120058, 2023.

Makar, P., Gong, W., Milbrandt, J., Hogrefe, C., Zhang, Y., Curci, G., Žabkar, R., Im, U., Balzarini, A., and Baró, R.: Feedbacks between air pollution and weather, Part 1: Effects on weather, Atmos. Environ., 115, 442–469, 2015.

Manisalidis, I., Stavropoulou, E., Stavropoulos, A., and Bezirtzoglou, E.: Environmental and health impacts of air pollution: a review, Frontiers in Public Health, 8, 14, https://doi.org/10.3389/fpubh.2020.00014, 2020.

Meng, X., Liu, C., Chen, R., Sera, F., Vicedo-Cabrera, A. M., Milojevic, A. et al.: Short term associations of ambient nitrogen dioxide with daily total, cardiovascular, and respiratory mortality: multilocation analysis in 398 cities, BMJ, 372, n534, https://doi.org/10.1136/bmj.n534, 2021.

Nguyen, T., Brandstetter, J., Kapoor, A., Gupta, J. K., and Grover, A.: ClimaX: A foundation model for weather and climate, arXiv [preprint], https://doi.org/10.48550/arXiv.2301.10343, 2023.

National Institute of Environmental Research (NIER, South Korea): Geostationary Environment Monitoring Spectrometer (GEMS) NO₂ retrieval product, NIER [data set], https://nesc.nier.go.kr/en/html/index.do, last access: 10 September 2023.

Peuch, V.-H., Engelen, R., Rixen, M., Dee, D., Flemming, J., Suttie, M., Ades, M., Agustí-Panareda, A., Ananasso, C., and Andersson, E.: The Copernicus Atmosphere Monitoring Service: From Research to Operations, B. Am. Meteorol. Soc., 103, E2650–E2668, 2022.

Sathe, Y., Gupta, P., Bawase, M., Lamsal, L., Patadia, F., and Thipse, S.: Surface and satellite observations of air pollution in India during COVID-19 lockdown: Implication to air quality, Sustain. Cities Soc., 66, 102688, https://doi.org/10.1016/j.scs.2020.102688, 2021.

Shah, V., Jacob, D. J., Li, K., Silvern, R. F., Zhai, S., Liu, M., Lin, J., and Zhang, Q.: Effect of changing NO_x lifetime on the seasonality and long-term trends of satellite-observed tropospheric NO₂ columns over China, Atmos. Chem. Phys., 20, 1483–1495, https://doi.org/10.5194/acp-20-1483-2020, 2020.

Shu, L., Zhu, L., Bak, J., Zoogman, P., Han, H., Liu, S., Li, X., Sun, S., Li, J., Chen, Y., Pu, D., Zuo, X., Fu, W., Yang, X., and Fu, T.-M.: Improving ozone simulations in Asia via multisource data assimilation: results from an observing system simulation experiment with GEMS geostationary satellite observations, Atmos. Chem. Phys., 23, 3731–3748, https://doi.org/10.5194/acp-23-3731-2023, 2023.

Tang, K. T. J., Lin, C., Wang, Z., Pang, S. W., Wong, T.-W., Yu, I. T. S., Fung, W. W. Y., Hossain, M. S., and Lau, A. K.: Update of Air Quality Health Index (AQHI) and harmonization of health protection and climate mitigation, Atmos. Environ., 326, 120473, https://doi.org/10.1016/j.atmosenv.2024.120473, 2024.

van Geffen, J., Eskes, H., Compernolle, S., Pinardi, G., Verhoelst, T., Lambert, J.-C., Sneep, M., ter Linden, M., Ludewig, A., Boersma, K. F., and Veefkind, J. P.: Sentinel-5P TROPOMI NO₂ retrieval: impact of version v2.2 improvements and comparisons with OMI and ground-based data, Atmos. Meas. Tech., 15, 2037–2060, https://doi.org/10.5194/amt-15-2037-2022, 2022.

Wang, S., Zhang, M., Gao, Y., Wang, P., Fu, Q., and Zhang, H.: Diagnosing drivers of PM_2.5 simulation biases in China from meteorology, chemical composition, and emission sources using an efficient machine learning method, Geosci. Model Dev., 17, 3617–3629, https://doi.org/10.5194/gmd-17-3617-2024, 2024.

Zhang, C., Liu, C., Li, B., Zhao, F., and Zhao, C.: Spatiotemporal neural network for estimating surface NO₂ concentrations over north China and their human health impact, Environ. Pollut., 307, 119510, https://doi.org/10.1016/j.envpol.2022.119510, 2022.

Zhang, Q.-S. and Zhu, S.-C.: Visual interpretability for deep learning: a survey, Front. Inform. Tech. El., 19, 27–39, 2018.

Zhong, S., Zhang, K., Bagheri, M., Burken, J. G., Gu, A., Li, B., Ma, X., Marrone, B. L., Ren, Z. J., and Schrier, J.: Machine learning: new ideas and tools in environmental science and engineering, Environ. Sci. Technol., 55, 12741–12754, 2021.

Zuo, C., Chen, J., Zhang, Y., Jiang, Y., Liu, M., Liu, H., Zhao, W., and Yan, X.: Evaluation of four meteorological reanalysis datasets for satellite-based PM_2.5 retrieval over China, Atmos. Environ., 305, 119795, https://doi.org/10.1016/j.atmosenv.2023.119795, 2023.


Short summary

This research utilizes hourly air pollution observations from the world’s first geostationary satellite to develop a spatiotemporal neural network model for full-coverage surface NO₂ pollution prediction over the next 24 hours, achieving outstanding forecasting performance and efficacy. These results highlight the profound impact of geostationary satellite observations in advancing air quality forecasting models, thereby contributing to future models for health exposure to air pollution.
