https://doi.org/10.5194/acp-2022-767
18 Nov 2022
Status: a revised version of this preprint was accepted for the journal ACP and is expected to appear here in due course.

Technical note: Improving the European air quality forecast of Copernicus Atmosphere Monitoring Service using machine learning techniques

Jean-Maxime Bertrand, Frédérik Meleux, Anthony Ung, Gaël Descombes, and Augustin Colette

Abstract. Model Output Statistics (MOS) approaches relying on machine learning algorithms were applied to downscale regional air quality forecasts produced by CAMS (Copernicus Atmosphere Monitoring Service) at hundreds of monitoring sites across Europe. Besides the CAMS forecast, the predictors in the MOS typically include meteorological variables as well as ancillary data. We first explored a “local” approach, in which a specific model is trained at each site. We also investigated an alternative “global” approach, in which a single model is trained with data from the whole geographical domain. In both cases, local predictors are used for a given station in predictive mode. Because of its global nature, the latter approach can capture a variety of meteorological situations within a very short training period and is therefore better suited to the operational constraints related to training the MOS (frequent upgrades of the modelling system, addition of new monitoring sites). Both approaches were implemented using a variety of machine learning algorithms: random forest, gradient boosting, and standard and regularized multi-linear models. The quality of the MOS predictions is evaluated for four key pollutants, namely particulate matter (PM10 and PM2.5), ozone (O3) and nitrogen dioxide (NO2), using scores based on the predictive errors and on the detection of pollution peaks (exceedances of the regulatory thresholds). Both the local and the global approaches significantly improve the performance of the raw Ensemble forecast. The most important result of this study is that the global approach competes with the local approach and can even outperform it in some cases. The global approach gives the best RMSE scores when relying on a random forest model, for the prediction of daily mean, daily maximum and hourly concentrations. By contrast, the gradient boosting model is better suited to detecting exceedances of the European Union regulated threshold values for O3 and PM10.
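
The difference between the "local" and "global" training strategies can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single predictor (the raw forecast) and an ordinary least-squares linear correction, whereas the study also uses meteorological and ancillary predictors and tree-based models. The station names, the synthetic data, the helper functions (fit_linear, make_station, predict_all, detection_counts) and the train/test split are all hypothetical. The 50 µg/m³ value is the EU daily limit value for PM10, used here only as an example threshold.

```python
import random
from statistics import mean

def fit_linear(x, y):
    """Ordinary least squares for y ~ a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def rmse(pred, obs):
    """Root-mean-square error between predictions and observations."""
    return (sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)) ** 0.5

def detection_counts(pred, obs, threshold):
    """Count hits, misses and false alarms for exceedances of a threshold."""
    hits = sum(1 for p, o in zip(pred, obs) if p > threshold and o > threshold)
    misses = sum(1 for p, o in zip(pred, obs) if p <= threshold and o > threshold)
    false_alarms = sum(1 for p, o in zip(pred, obs) if p > threshold and o <= threshold)
    return hits, misses, false_alarms

def make_station(rng, bias, scale, n):
    """Synthetic (raw forecast, observation) pairs for one hypothetical station."""
    raw = [rng.uniform(5.0, 80.0) for _ in range(n)]
    obs = [bias + scale * r + rng.gauss(0.0, 3.0) for r in raw]
    return raw, obs

rng = random.Random(0)
# Hypothetical stations: the raw forecast underestimates observations at each site.
stations = {
    "A": make_station(rng, 8.0, 1.2, 60),
    "B": make_station(rng, 10.0, 1.3, 60),
    "C": make_station(rng, 6.0, 1.1, 60),
}
n_train = 40  # assumed split: first 40 samples for training, the rest for testing

# Local approach: one model per station, trained on that station's data only.
local_models = {
    name: fit_linear(raw[:n_train], obs[:n_train])
    for name, (raw, obs) in stations.items()
}

# Global approach: a single model trained on data pooled over all stations.
pooled_raw = [r for raw, _ in stations.values() for r in raw[:n_train]]
pooled_obs = [o for _, obs in stations.values() for o in obs[:n_train]]
global_model = fit_linear(pooled_raw, pooled_obs)

def predict_all(correct):
    """Apply a correction function to the test part of every station."""
    preds, observed = [], []
    for name, (raw, obs) in stations.items():
        preds += [correct(name, r) for r in raw[n_train:]]
        observed += obs[n_train:]
    return preds, observed

raw_pred, observed = predict_all(lambda name, r: r)
local_pred, _ = predict_all(lambda name, r: local_models[name][0] + local_models[name][1] * r)
global_pred, _ = predict_all(lambda name, r: global_model[0] + global_model[1] * r)

rmse_raw = rmse(raw_pred, observed)
rmse_local = rmse(local_pred, observed)
rmse_global = rmse(global_pred, observed)

THRESHOLD = 50.0  # µg/m³, EU daily limit value for PM10 (illustrative use)
for label, pred in (("raw", raw_pred), ("local", local_pred), ("global", global_pred)):
    print(label, round(rmse(pred, observed), 2), detection_counts(pred, observed, THRESHOLD))
```

With only the raw forecast as a predictor, this toy global model cannot adapt to site-specific behaviour, so it should not be read as reproducing the study's comparison. In the study, the global model also receives local predictors for each station, which is what allows it to compete with per-site models.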

Status: closed

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on acp-2022-767', Anonymous Referee #1, 15 Dec 2022
  • RC2: 'Comment on acp-2022-767', Anonymous Referee #2, 19 Dec 2022

Latest update: 28 Mar 2023
Short summary
Post-processing methods based on machine learning algorithms were applied to refine the concentration forecasts of four key pollutants at monitoring sites across Europe. Performance improves significantly compared with the raw outputs of the deterministic model. Taking advantage of the large extent of the modelling domain, an innovative “global” approach is proposed to drastically reduce the period needed to train the models and thus facilitate implementation in an operational context.