06 Jan 2021

Review status: a revised version of this preprint is currently under review for the journal ACP.

Himawari-8-derived diurnal variations of ground-level PM2.5 pollution across China using a fast space-time Light Gradient Boosting Machine

Jing Wei1,2, Zhanqing Li2, Rachel T. Pinker2, Lin Sun3, Wenhao Xue1, and Runze Li1
  • 1State Key Laboratory of Remote Sensing Science, College of Global Change and Earth System Science, Beijing Normal University, Beijing, China
  • 2Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD, USA
  • 3College of Geodesy and Geomatics, Shandong University of Science and Technology, Qingdao, China

Abstract. PM2.5 is an important atmospheric environmental parameter, primarily because of its impact on human health. It is affected by both natural and anthropogenic factors, which usually have strong diurnal variations. Monitoring these variations helps us not only to understand the causes of air pollution but also to adapt to it. Most existing PM2.5 products have been derived from polar-orbiting satellites. This study exploits the next-generation geostationary meteorological satellite Himawari-8/AHI to reveal the diurnal variations of PM2.5. Given the huge volume of satellite data, we apply a highly efficient tree-based learning approach, the Light Gradient Boosting Machine (LightGBM), which is based on the idea of gradient boosting, and extend it with the spatiotemporal characteristics of air pollution; the result is named the space-time LightGBM (STLG) model. An hourly PM2.5 data set for China (i.e., ChinaHighPM2.5) at a 5 km spatial resolution is derived from the Himawari-8/AHI aerosol products together with other variables. The hourly PM2.5 estimates (N = 1,415,188) are well correlated with ground measurements across China (R² = 0.85), with an RMSE and MAE of 13.62 and 8.49 μg/m³, respectively. The model captures the PM2.5 diurnal variations well: pollution increases gradually in the morning, reaches a peak at about 10:00 a.m. local time, and then decreases steadily until sunset. The proposed approach outperforms most traditional statistical regression and tree-based machine learning models with a much lower computational burden in terms of speed and memory, making it particularly suitable for routine pollution monitoring.
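The abstract summarizes model accuracy with three statistics: R², RMSE, and MAE. As a minimal sketch of how such statistics can be computed from paired hourly estimates and ground measurements, the function below uses only the Python standard library. The function name `evaluate` and the toy values are hypothetical, and R² is assumed here to be the coefficient of determination; the paper may instead define it as the squared linear correlation.

```python
import math


def evaluate(estimates, observations):
    """Return (r2, rmse, mae) for paired estimates and ground observations.

    Assumption: R^2 is the coefficient of determination, 1 - SS_res / SS_tot.
    Values are taken to be PM2.5 concentrations in ug/m^3.
    """
    if len(estimates) != len(observations) or not observations:
        raise ValueError("need two non-empty sequences of equal length")

    n = len(observations)
    errors = [e - o for e, o in zip(estimates, observations)]

    # Residual sum of squares, shared by RMSE and R^2.
    ss_res = sum(err * err for err in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(err) for err in errors) / n

    # Total sum of squares around the observed mean.
    mean_obs = sum(observations) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observations)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observations are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, rmse, mae


# Hypothetical toy values, not data from the study.
toy_estimates = [10.0, 20.0, 30.0, 40.0]
toy_observations = [12.0, 18.0, 33.0, 37.0]
r2, rmse, mae = evaluate(toy_estimates, toy_observations)
print(f"R2={r2:.3f} RMSE={rmse:.3f} MAE={mae:.3f}")
```

RMSE penalizes large errors more strongly than MAE, which is consistent with the reported RMSE (13.62 μg/m³) exceeding the reported MAE (8.49 μg/m³).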

Status: final response (author comments only)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on acp-2020-1277', Anonymous Referee #1, 27 Jan 2021
  • RC2: 'Comment on acp-2020-1277', Anonymous Referee #2, 03 Feb 2021

Total article views: 398 (including HTML, PDF, and XML)
  • HTML: 289
  • PDF: 98
  • XML: 11
  • Total: 398
  • Supplement: 18
  • BibTeX: 1
  • EndNote: 3

Viewed (geographical distribution)

Total article views: 492 (including HTML, PDF, and XML) Thereof 486 with geography defined and 6 with unknown origin.
Latest update: 15 Apr 2021
Short summary
This study developed a space-time Light Gradient Boosting Machine (STLG) model and used it to derive a high-quality, high-temporal-resolution (1 hour) PM2.5 data set for China (i.e., ChinaHighPM2.5) at a 5 km spatial resolution from the Himawari-8/AHI aerosol products. The model outperforms most previous related studies with a much lower computational burden in terms of speed and memory, making it particularly suitable for real-time air pollution monitoring in China.