Historical and future changes in air pollutants from CMIP6 models

Poor air quality is currently responsible for large impacts on human health across the world. In addition, the air pollutants ozone (O3) and particulate matter less than 2.5 microns in diameter (PM2.5) are also radiatively active in the atmosphere and can influence Earth’s climate. It is important to understand the effect of air quality and climate mitigation measures over the historical period and in different future scenarios to ascertain any impacts from air pollutants on both climate and human health. The 6th Coupled Model Intercomparison Project (CMIP6) presents an opportunity to analyse the change in air pollutants simulated by the current generation of climate and Earth system models that include a representation of chemistry and aerosols (particulate matter). The shared socio-economic pathways (SSPs) used within CMIP6 encompass a wide range of trajectories in precursor emissions and climate change, allowing for an improved analysis of future changes to air pollutants. Firstly, we conduct an evaluation of the available CMIP6 models against surface observations of O3 and PM2.5. CMIP6 models show a consistent overestimation of observed surface O3 concentrations across most regions and in most seasons, with a large diversity in simulated values over northern hemisphere continental regions. Conversely, observed surface PM2.5 concentrations are consistently underestimated by CMIP6 models, particularly for the northern hemisphere winter months, with the largest model diversity near natural emission source regions. Over the historical period (1850-2014) large increases in both surface O3 and PM2.5 are simulated by the CMIP6 models across all regions, particularly over the mid to late 20th Century when anthropogenic emissions increase markedly. Large regional historical changes are simulated for both pollutants across East and South Asia, with an increase of up to 40 ppb for O3 and 12 µg m-3 for PM2.5.
In future scenarios containing strong air quality and climate mitigation measures (ssp126), air pollutants are substantially reduced across all regions by up to 15 ppb for O3 and 12 µg m-3 for PM2.5. However, for scenarios that encompass weak action on mitigating climate and reducing air pollutant emissions (ssp370), increases of both surface O3 (up to 10 ppb) and PM2.5 (up to 8 µg m-3) are simulated across most regions, although small reductions in PM2.5 are simulated for regions like North America and Europe in this scenario. A comparison of simulated regional changes in both surface O3 and PM2.5 from individual CMIP6 models highlights important differences due to the interaction of aerosols, chemistry, climate and natural emission sources within models. The prediction of regional air pollutant concentrations from the latest climate and Earth system models used within CMIP6 shows that the particular future trajectory of climate and air quality mitigation measures could have important consequences for regional air quality, human health and near-term climate. Differences between individual models emphasise the importance of understanding how future Earth system feedbacks influence natural emission sources.

https://doi.org/10.5194/acp-2019-1211 Preprint. Discussion started: 21 January 2020 © Author(s) 2020. CC BY 4.0 License.

scenarios was also shown across East and South Asia due to differences in the carbonaceous and sulphur dioxide (SO2) emission trajectories (Fiore et al., 2012). Future PM2.5 concentrations over Africa and the Middle East were shown to be quite noisy due to the large meteorological variability that influences dust emissions over these regions.
The current set of experiments conducted for the 6th Coupled Model Intercomparison Project (CMIP6; Eyring et al., 2016) represents an opportunity to update the assessment of current and future levels of air pollutants using the latest generation of Earth system and climate models. A new set of future scenarios has been generated for CMIP6, the Shared Socio-economic Pathways (SSPs), which combine different trends in social, economic and environmental developments (O'Neill et al., 2014). Varying amounts of emission mitigation to NTCFs are applied on top of the baseline social and economic developments to meet predefined climate and air quality targets in the future, allowing for a wider range of future air pollutant trajectories to be assessed than occurred in CMIP5 (Rao et al., 2017; Riahi et al., 2017). Initial assessments have been made of future changes to air pollutants in the SSPs using simplified models. The sustainability pathway (SSP1) leads to improvements in both air quality and climate, whereas SSP3 (regional rivalry) is not compatible with achieving air quality and climate goals, and the conventional fuels (SSP5) pathway improves air quality at the expense of climate (Reis et al., 2018). Strong climate and air pollutant mitigation measures in SSP1 were shown to reduce global annual mean surface O3 concentrations by more than 3.5 ppb, whereas for SSP3 O3 concentrations over Asia were predicted to increase by 6 ppb. These studies highlighted the potentially large regional variability in the response of air pollutants to the different assumptions in the future pathways, and also the need for a full model assessment using the current generation of Earth System Models (ESMs) that take into account both changes in emissions and climate.
In this study, we use results from experiments conducted as part of CMIP6 to make a first assessment of historical and future changes in air pollutants. First, we assess the performance of CMIP6 models in simulating present day air pollutants by conducting an evaluation against observations of O3 and PM2.5. Regional changes in surface O3 and PM2.5 are computed over the historical period (1850-2014) to provide context for future changes. We are then able to show future projections of air pollutants over different world regions under the different SSPs used in the CMIP6 experiments. Finally, a comparison is made of individual CMIP6 models for a single future scenario to identify potential reasons for model discrepancies.

Air Pollutant Emissions
A new set of historical and future anthropogenic air pollutant emissions has been developed and used as part of CMIP6. The historical anthropogenic emissions are from the Community Emissions Data System (CEDS) and a new dataset was developed for biomass burning emissions, both of which provide information on emissions from 1750 to 2014 (van Marle et al., 2017; Hoesly et al., 2018). The SSPs used in future CMIP6 experiments represent an update from the RCPs used in CMIP5, as they combine pathways of socio-economic development with targets to achieve a certain level of climate mitigation (O'Neill et al., 2014; van Vuuren et al., 2014; Riahi et al., 2017). The SSPs are divided into the following 5 pathways depending on their social, economic and environmental development: SSP1 (sustainability), SSP2 (middle-of-the-road), SSP3 (regional rivalry), SSP4 (inequality) and SSP5 (fossil fuel development). An assumption about the degree of air pollution control (strong, medium or weak) is included on top of the baseline pathway, with stricter air pollution controls assumed to be tied to economic development (Rao et al., 2016). Weak air pollution controls occur in SSP3 and SSP4, with medium controls in SSP2 and strong air pollution controls in SSP1 and SSP5 (Gidden et al., 2019). A particular climate mitigation target, in terms of an anthropogenic radiative forcing by 2100, is included on top of each SSP and is achieved using a range of emissions mitigation measures appropriate to each SSP. Climate mitigation targets vary from a weak mitigation scenario with an anthropogenic radiative forcing of 8.5 W m-2 by 2100, comparable with a 5 °C temperature change (Riahi et al., 2017), to a strong mitigation scenario with a radiative forcing of 1.9 W m-2 by 2100, in accordance with the Paris Agreement for keeping temperatures below 2 °C (United Nations, 2016). Some climate mitigation targets are comparable with those of the RCPs used in CMIP5 (2.6, 4.5 and 6.0), whilst others are new, e.g. ssp534 is included as a delayed mitigation scenario. A scenario specific to the Aerosol and Chemistry Model Intercomparison Project (AerChemMIP), ssp370-lowNTCF, is also included to study the impact of mitigation measures that specifically control NTCFs on top of ssp370. Future biomass burning emissions vary in each scenario, depending on the particular land-use assumptions (Rao et al., 2017). Whilst future anthropogenic and biomass burning emissions are prescribed in each CMIP6 model from the same dataset, other natural emissions, e.g. dust and biogenic volatile organic compounds (BVOCs), will be different and depend on the individual model configuration.

Figure 1 shows the future changes in global total (anthropogenic and biomass) emissions of the major air pollutant precursors across all of the CMIP6 scenarios, provided as input to the CMIP6 models. The overriding feature is that global air pollutant emissions are predicted to decrease across the majority of scenarios by 2100. The exception to this is that global and regional emissions increase or remain at present day levels for ssp370 (Fig. 1 and Fig. 2). Some air pollutant emissions increase in the near-term in other scenarios, e.g. nitrogen oxides (NOx) in ssp585, but by 2100 these have been reduced. Future CH4 abundances show the largest diversity amongst the SSPs. Large increases in global CH4 abundances of more than 50% are predicted for the fossil fuel dominated pathways of ssp370 and ssp585, whereas large reductions are predicted to occur in the strong mitigation scenarios of SSP1.
Figure 1: Changes in annual total (anthropogenic and biomass) global air pollutant emissions (relative to 2015) of sulphur dioxide (SO2), organic carbon (OC), black carbon (BC), non-methane volatile organic compounds (NMVOCs), nitrogen oxides (NOx), carbon monoxide (CO) and global methane (CH4) abundances in the future CMIP6 scenarios used as input to CMIP6 models. The dashed black line represents the 2015 value.

For SO2, large reductions of more than 50% are shown for most scenarios and across most regions (Figure 2), apart from Africa and Asia in ssp370. Near-term (2050)  For all aerosol and aerosol precursors, a reduction of 80-100% (relative to 2015) in regional emissions is predicted by 2100 in the strong mitigation scenarios. Changes in the emissions of the O3 precursors, NOx, CO and NMVOCs, show a similar increase across most regions for ssp370 but a general decrease in other scenarios. The changes in these emissions are particularly diverse across the scenarios in South Asia, with large relative increases in ssp370 in contrast to the large decreases in ssp126.
Across East Asia there is an increase in NOx emissions for ssp370 in 2050 but a long-term reduction across all scenarios.

CMIP6 Simulations
Surface concentrations of O3 and PM2.5 have been obtained from all the CMIP6 models that made appropriate data available on the Earth System Grid Federation (ESGF) at the time of writing. To study changes in surface air pollutants over the industrial period, data has been obtained from the coupled historical simulations (Eyring et al., 2016). Data has also been obtained for the AerChemMIP-specific ssp370-lowNTCF scenario, which was only required to be conducted over the period 2015-2055 (Collins et al., 2017).

Concentrations of both pollutants at the surface have been obtained by extracting the lowest vertical level of the full 3D field output on the native horizontal and vertical grid of each model (the "AERmon" CMIP6 table ID). For O3, this is supplied as a separate diagnostic which can be used directly. However, not all models contributing to CMIP6 directly output PM2.5, and the calculation of PM2.5 is not consistent across individual models due to the different treatment of aerosols and their components. For example, only a few CMIP6 models include the simulation of ammonium nitrate in their aerosol scheme (currently, only GISS-E2-1-H and GFDL-ESM4 have provided nitrate mass mixing ratios on the ESGF database). Therefore it has been necessary to use a definition of PM2.5 that is consistent across all models and is calculated offline. In this study surface PM2.5 is defined as the sum of the individual dry aerosol mass mixing ratios of black carbon (BC), total organic aerosol (OA, from both primary and secondary sources), sulphate (SO4), sea salt (SS) and dust (DU) from the lowest model level extract of the full 3D model fields. All BC, OA and SO4 aerosol mass is assumed to be present in the fine size fraction (< 2.5 µm), whereas a factor of 0.25 for SS and 0.1 for DU has been used to calculate the approximate contribution from these components to the fine aerosol size fraction (Eq. 1).

$$\mathrm{PM_{2.5}} = \mathrm{BC} + \mathrm{OA} + \mathrm{SO_4} + (0.25 \times \mathrm{SS}) + (0.1 \times \mathrm{DU}) \qquad (1)$$
The factors used to calculate the contribution of SS and DU concentrations to the PM2.5 size fraction are likely to depend on the individual aerosol scheme and the simulated aerosol size distribution within a particular model. The calculation of an approximate PM2.5 concentration using Eq. (1) is therefore likely to introduce some errors, but it does provide an estimate that is consistent across models and also with that previously used in CMIP5 and ACCMIP (Fiore et al., 2012; Silva et al., 2013). For the CNRM-ESM2-1 model, anomalously large concentrations were obtained from the sea salt mass mixing ratios. Sensitivity tests with this model suggested that a much smaller factor of 0.01 was more appropriate to use for SS, which takes into account the non-dry nature of the sea salt aerosols and the large possible size range, up to 20 µm in diameter, of sea salt particles within the CNRM-ESM2-1 model (P Nabat 2019, personal communication, 27th November).
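The offline calculation in Eq. (1) can be illustrated with a minimal sketch. The function name, argument names and the NumPy-based layout are assumptions made here for illustration and are not taken from the study's own processing code; the inputs are assumed to be surface-level component fields already expressed in a common unit.

```python
import numpy as np

# Fine-fraction scaling factors quoted in the text for Eq. (1).
SS_FACTOR_DEFAULT = 0.25   # sea salt
DU_FACTOR_DEFAULT = 0.1    # dust
SS_FACTOR_CNRM = 0.01      # sea salt factor quoted for CNRM-ESM2-1


def approximate_pm25(bc, oa, so4, ss, du,
                     ss_factor=SS_FACTOR_DEFAULT,
                     du_factor=DU_FACTOR_DEFAULT):
    """Approximate PM2.5 following Eq. (1).

    Hypothetical helper: all inputs are surface-level fields (scalars or
    arrays of the same shape) in one consistent unit.
    """
    bc, oa, so4, ss, du = (np.asarray(x, dtype=float)
                           for x in (bc, oa, so4, ss, du))
    # BC, OA and SO4 are assumed to lie entirely in the fine fraction;
    # only a fraction of sea salt and dust mass is counted.
    return bc + oa + so4 + ss_factor * ss + du_factor * du


# Illustrative values only (not model output).
pm25_default = approximate_pm25(1.0, 2.0, 3.0, 4.0, 10.0)
pm25_cnrm = approximate_pm25(1.0, 2.0, 3.0, 4.0, 10.0,
                             ss_factor=SS_FACTOR_CNRM)
```

Keeping the sea salt and dust factors as keyword arguments makes a model-specific choice, such as the smaller sea salt factor used for CNRM-ESM2-1, explicit at the call site.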
Details of the data used in this study from different CMIP6 models, in both the historical and future scenarios, are presented below in Table 1. For the historical period, data was available from 5 different CMIP6 models for O3 and 10 models for PM2.5. The future scenario with the most data available was ssp370, with 4 models supplying data for O3 and 7 models for PM2.5. For the other Tier 1 CMIP6 scenarios (ssp126, ssp245 and ssp585), data was only available from 2 models for O3 and 4 for PM2.5 (all components). It was decided to focus the analysis on ssp370 and other Tier 1 scenarios due to the limited availability of model data for Tier 2 scenarios (ssp119, ssp434, ssp460 and ssp534). The results from an O3 parameterisation (Turnock et al., 2018) have also been included in the analysis of surface O3 from CMIP6 models for both the historical and future scenarios and are referred to in this study as HTAP_param. The O3 parameterisation does not take into account the effects of climate change on surface O3 concentrations and therefore provides an estimate of the emission-only driven changes to surface O3 against which to compare the climate and Earth System models.

Surface Observations
Present day surface O3 and PM2.5 simulated by all of the CMIP6 models are evaluated against surface observations to ascertain model biases and inter-model discrepancies. Surface O3 observations are obtained from the database of the Tropospheric Ozone Assessment Report (TOAR) (Schultz et al., 2017). The TOAR database provides a gridded product of surface O3 observations over the period 1970 to 2015. The majority of measurement sites are located in North America and Europe, with a smaller number of other sites in East Asia, Australia, New Zealand, South America, Southern Africa, Antarctica and remote ocean locations. Here we compile a monthly mean climatology of all available O3 observations over the period 2005-2014 from measurement locations that are classified as rural in the TOAR database (Schultz et al., 2017). The rural locations were selected to be representative of background (i.e. non-urban) O3 concentrations and are considered to be more appropriate for evaluating the simulated values obtained at the relatively coarse horizontal resolution of the global ESMs. Simulated surface O3 concentrations from the CMIP6 models are re-gridded onto the same resolution as the observational product (2° x 2°) for evaluation purposes. Surface PM2.5 observations have been obtained from all of the locations compiled in the database of the Global Aerosol Synthesis and Science Project (GASSP: http://gassp.org.uk/data/, Reddington et al., 2017) to evaluate CMIP6 models.
Background, non-urban, PM2.5 data is compiled in the GASSP database from three major networks: the Interagency Monitoring of Protected Visual Environments (IMPROVE) network in North America, the European Monitoring and Evaluation Programme (EMEP) and the Asia-Pacific Aerosol Database (A-PAD). As for O3, the networks and observations for PM2.5 were selected to be representative of non-urban environments, which are more appropriate for the evaluation of global ESMs.
With the exception of the IMPROVE network, most measurements of PM2.5 began after the year 2000. As for O3, we compile a monthly mean climatology of PM2.5, but now over the period 2000 to 2010, selected because the GASSP database contained the most observations within this period. Simulated surface PM2.5 was computed from CMIP6 models over the same time period as the observations and linearly interpolated to each measurement location. Whilst the surface observations measure total PM2.5 mass, the computed PM2.5 from CMIP6 models uses Eq. 1 and does not include all observable PM2.5 aerosol components (e.g. nitrate aerosol). Therefore it is anticipated that the CMIP6 models will underrepresent the PM2.5 observations in this comparison.
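The monthly mean climatology used for both O3 and PM2.5 can be sketched as follows. This is a minimal illustration that assumes a single site or grid cell with a one-dimensional series of monthly mean values and a matching array of calendar months; the function name and the handling of missing data (skipping NaN values) are assumptions made here and are not described in the text.

```python
import numpy as np


def monthly_climatology(values, months):
    """Average monthly means over all years into a 12-value climatology.

    Hypothetical helper: `values` is a 1-D series of monthly means and
    `months` gives the calendar month (1-12) of each entry. Missing
    observations are assumed to be stored as NaN and are skipped.
    """
    values = np.asarray(values, dtype=float)
    months = np.asarray(months, dtype=int)
    climatology = np.full(12, np.nan)
    for month in range(1, 13):
        selected = values[(months == month) & ~np.isnan(values)]
        if selected.size:
            climatology[month - 1] = selected.mean()
    return climatology


# Illustrative two-year series (not real observations).
example_months = list(range(1, 13)) * 2
example_values = [10.0] * 12 + [20.0] * 12
example_clim = monthly_climatology(example_values, example_months)
```

Months with no valid data remain NaN, so sites with incomplete records can still be included without biasing the other months.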
To address the anticipated disparity between the observed ground based PM2.5 and the approximate PM2.5 from CMIP6 models, a further comparison has been made between the CMIP6 models and the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), aerosol reanalysis product (Randles et al., 2017). The MERRA-2 aerosol product assimilates observations of Aerosol Optical Depth (AOD) from ground based and satellite remote sensing platforms into model simulations that use the GEOS-5 atmospheric model coupled to the GOCART aerosol module. The data assimilation used in MERRA-2 generally improves comparisons of PM2.5 with observations, but there are still overestimations due to dust and sea salt and underestimations over East Asia (Provençal et al., 2017). Separate mass mixing ratios for BC, OA, SO4, SS and DU aerosol components are provided from MERRA-2, which are then combined using the formula in Eq. 1 to make an approximate PM2.5. Monthly mean approximate PM2.5 concentrations are then computed over the period 2005-2014 from the MERRA-2 reanalysis product to provide a more direct comparison, with enhanced spatial coverage, against the approximate PM2.5 concentrations calculated from the CMIP6 models over the same time period.

Present-day Model Evaluation of Air Pollutants

Surface Ozone
The 5 CMIP6 models with data available for the historical experiments are evaluated against surface O3 observations from the TOAR database over the period 2005-2014. A long-term evaluation of surface O3 concentrations from CMIP6 models using observations compiled over the 20th Century is presented separately in Griffiths et al. (2019). Figure  when O3 formation is enhanced by increased photolytic activity and levels of oxidants, as well as larger biogenic emissions.
The hemispheric difference in surface O3 is smaller in December, January and February (DJF) when O3 production is less in the northern hemisphere but higher in the southern hemisphere. However, model diversity is larger in DJF (Fig. 3b) due to individual models simulating different seasonal cycles of O3, particularly UKESM1 which has the most pronounced seasonal cycle of all 5 models (Fig. S2). The multi-model mean of CMIP6 models overestimates surface O3 concentrations in both seasons when compared to observations from the TOAR database, although the models do capture the broad hemispheric gradient in O3 concentrations (Fig. 3c and 3f). These results are consistent with the previous evaluation of ACCMIP models (Young et al., 2018). The overestimation in the CMIP6 models analysed here could be due to the coarse resolution of the ESMs, an excess of O3 chemical production (potentially due to an overabundance of NOx and/or VOCs) and weak O3 deposition. Smaller model biases exist in DJF (<5 ppb) than in JJA (5-15 ppb), mostly attributed to the strong seasonal cycle simulated by UKESM1. In contrast to other models

The observed annual cycle in surface O3 averaged across measurement locations within different regions is compared to that simulated by CMIP6 models (Figure 4). Across most regions, the mean annual cycle from CMIP6 models compares relatively well to that observed. The overprediction of surface O3 values in JJA is evident across most regions, as is the strong seasonal cycle in UKESM1 for northern hemisphere continental regions. Additionally, the timing of peak O3 over continental northern hemisphere locations occurs earlier in the observations (springtime) than in the CMIP6 models (spring and summer), which is consistent with that from ACCMIP models (Young et al., 2018). At oceanic observation locations there is also a consistent overestimate of surface O3 by CMIP6 models across all seasons, indicating that the O3 deposition rate could be underestimated here. There is also a large overestimation (~20 ppb) in all models at the one observation location in South East Asia, potentially due to difficulty in simulating O3 in the maritime continental boundary layer using lower resolution global ESMs. In contrast to this, CMIP6 models tend to underpredict the observed surface O3 concentrations at locations in the South Pole region in JJA by ~5 ppb. This could be due to a lack of long range transport of O3 to these sites, inaccuracies in southern hemisphere precursor emissions, or because of the difficulty in simulating O3 concentrations at the appropriate elevation of measurement sites located on the Antarctic ice sheet.

Surface PM2.5

Ground Based Observations
A similar comparison is made for seasonal mean surface PM2.5 concentrations from CMIP6 models against ground based surface observations (Figure 5). The seasonal multi-model mean from CMIP6 models shows that elevated PM2.5 concentrations (>50 µg m-3) occur close to the large dust emission source regions of the Sahara and Middle East in both DJF and JJA over 2000-2010. These natural source regions are also one of the largest areas of diversity in PM2.5 concentrations (up to 20 µg m-3) across CMIP6 models due to the different contribution from anthropogenic PM2.5 components (Fig. S8-S10). Lower PM2.5 concentrations (<10 µg m-3) are predicted across both North America and Europe, with more agreement between CMIP6 models. Across the biomass burning regions of South America and Southern Africa, PM2.5 concentrations are elevated in JJA, with larger diversity in the CMIP6 models due to the differing contributions of the BC and OA components (Fig. S9 and S10). Relatively consistent PM2.5 concentrations of <10 µg m-3, with small model diversity (<5 µg m-3), are shown across oceanic regions, mainly from emissions of sea salt (Fig. S11). Apart from the natural sources of aerosol, which are subject to meteorological variability, the CMIP6 models are relatively consistent when simulating PM2.5 concentrations across most regions.
Compared to the ground based observations from the GASSP database, the CMIP6 multi-model mean underpredicts the observed PM2.5 values in both seasons, with a slightly larger underestimation in DJF than JJA. As discussed in section 2.3, an underestimation was anticipated from comparing approximate PM2.5 concentrations, derived from CMIP6 models, to observed values. Nevertheless, the evaluation highlights that fine particulate matter (PM2.5) is generally underrepresented in the CMIP6 models across North America, Europe and parts of Asia for which observations are available, a similar result to other global and regional models (Glotfelty et al., 2017; Solazzo et al., 2017). This could be due to uncertainties in emissions (e.g. local dust sources) or deposition (dry or wet), the coarse resolution of global models and the absence or underrepresentation of aerosol formation processes (e.g. nitrate aerosols or secondary organic aerosols).
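Model underestimation of this kind is commonly summarised with simple bias statistics. The sketch below shows one way this could be done for collocated model and observed values; the choice of metrics (mean bias, normalised mean bias and the ratio of observed to modelled means) and all names are assumptions for illustration, since the text does not specify which statistics were computed.

```python
import numpy as np


def bias_statistics(model, obs):
    """Summarise model-minus-observation bias for collocated values.

    Hypothetical helper: `model` and `obs` are same-length 1-D sequences
    in the same unit; pairs containing NaN are excluded.
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = ~np.isnan(model) & ~np.isnan(obs)
    m, o = model[valid], obs[valid]
    return {
        # Negative values indicate that the model is biased low.
        "mean_bias": float((m - o).mean()),
        "normalised_mean_bias": float((m - o).sum() / o.sum()),
        # A ratio above 1 gives the factor by which the model underestimates.
        "obs_to_model_ratio": float(o.mean() / m.mean()),
    }


# Illustrative values only (not taken from the evaluation).
example_stats = bias_statistics([5.0, 5.0], [10.0, 10.0])
```

Reporting the observed-to-modelled ratio alongside the mean bias makes statements such as an underestimation by a factor of about 2 directly comparable between regions with very different absolute concentrations.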

The difference between the multi-model mean and PM2.5 observations in c) DJF and f) JJA.

The simulated regional mean annual cycle in surface PM2.5 from different CMIP6 models against observations is shown in Figure 6. The underestimation of PM2.5 concentrations by the models is evident across all regions, except for the ocean. Across North America, the region with most observations, the annual cycle is simulated relatively well, with a peak in concentrations in JJA and a smaller model bias, although a larger model bias (factor of ~1.5 to 2) occurs in winter and spring. Across Europe, there is a larger underestimation of observed PM2.5 concentrations by CMIP6 models in DJF (factor > 2) than JJA. Nitrate aerosols are observed and modelled (Fig. S12) to contribute between 1 and 5 µg m-3 of the total aerosol mass over Europe (Fagerli and Aas, 2008; Pozzer et al., 2012), explaining part, but not all, of the model observational discrepancy here. Additionally, on

MERRA Reanalysis Product
An additional comparison of surface PM2.5 concentrations from the MERRA-2 aerosol reanalysis product is made with that simulated by the CMIP6 models to improve the spatial coverage and provide a more consistent evaluation of the approximate PM2.5 concentrations. Figure 7 shows the same comparison as in Fig. 5 but now using the approximate PM2.5 obtained from the MERRA-2 reanalysis product over the period 2005-2014. In comparison to MERRA-2, the CMIP6 models are shown to underpredict PM2.5 concentrations across North America, Europe and Eurasia. A similar seasonal cycle comparison is shown for Europe and North America (regions with most ground based observations) in both Fig. 6 and 8, providing confidence that the underestimation of PM2.5 by CMIP6 models is robust over these regions. Across all other regions, the MERRA-2 reanalysis product provides much greater spatial coverage for each region and therefore the features shown in the site-level comparison (Fig. 6) will not necessarily apply here. A large overestimation of the MERRA-2 reanalysis product by the CMIP6 multi-model mean is shown across East and South Asia. Figure 8 shows that on a regional mean basis most CMIP6 models are within the spread of the MERRA-2 concentrations for East Asia, although MERRA-2 was previously shown to underestimate PM2.5 concentrations across East Asia (Provençal et al., 2017; see also Fig. 6). CESM2-WACCM is the exception to this, with distinctly higher PM2.5 concentrations over East Asia, potentially due to larger OA concentrations and more dust aerosols within the western side of this region (Fig. S7 and S10). Across the South Asian region, CMIP6 models show a more consistent overestimation of MERRA-2, with UKESM1 and CESM2-WACCM showing particularly high PM2.5 concentrations, again due to dust and OA.
Across North Africa there is a lot of inter-regional variability, with CMIP6 models both under- and over-estimating the MERRA-2 PM2.5 concentrations, although this results in a relatively good regional mean representation (Fig. 7 and 8). The annual mean cycle in MERRA-2 PM2.5 concentrations across South America is well represented by the CMIP6 models, although the peak in the biomass burning season is underestimated in some models. A more pronounced annual cycle is exhibited by UKESM1 across Southern Africa, potentially due to the larger contributions from the OA fraction (Fig. S10) that result from enhanced biogenic emissions leading to secondary OA (SOA) formation. Across oceanic locations all of the CMIP6 models underestimate the MERRA-2 PM2.5 concentrations, although MERRA-2 was previously shown to overestimate sea-salt concentrations (Provençal et al., 2017), accounting for some of this discrepancy. Overall, comparisons of CMIP6 models with the MERRA-2 reanalysis product show biases across Europe and North America that are consistent with the comparison to ground-based observations. Additionally, similar comparisons are shown in annual mean cycles across other regions, for which appropriate ground based data is lacking.

The simulated changes in surface O3 across 5 CMIP6 models and the HTAP_param are shown in Figure 9 over the historical period of 1850 to 2014. The CMIP6 multi-model mean shows that global annual mean surface O3 has increased by 11.5 ± 2.2 ppb since 1850 (± 1 standard deviation), although the change could be as large as 14 ppb (from BCC-ESM1) or as little as 7 ppb (from UKESM1). The 1850 to 2000 multi-model mean change in surface O3 from the CMIP6 models of 10.6 ppb is in good agreement with the 10 ± 1.6 ppb simulated by the CMIP5 models used in ACCMIP. An evaluation of the long-term changes in surface O3 over the historical period simulated by the CMIP6 models at specific measurement locations is presented separately in Griffiths et al. (2019), which shows that the CMIP6 models are able to represent long-term changes in surface ozone since the 1960s.
A large diversity in the simulated historical changes is shown across the different regions analysed here, with UKESM1 tending to simulate the lowest historical change and GISS-E2-1-H or BCC-ESM1 the highest. Even though the surface response is small in UKESM1, it is shown to have larger tropospheric changes in O3 over the historical period compared to other CMIP6 models (Griffiths et al., 2019). South Asia is the region with the largest diversity in simulated historical changes in surface O3, of between 16 and 40 ppb. Surface O3 is simulated to have increased by between 10 and 30 ppb over the major northern anthropogenic source regions since 1850, driven mainly by the large increases in anthropogenic precursor emissions of CH4, NOx, CO, and NMVOCs over this period. A qualitative estimate of the influence of non-emission driven processes (chemistry and climate change) can be ascertained by comparing results from the HTAP_param, an emission-only driven model, to those of the CMIP6 models. Simulated historical changes in surface O3 from UKESM1 are similar to those from the HTAP_param, indicating that changes simulated by UKESM1 are strongly determined by precursor emissions. However, the global annual mean surface O3 response of 7.6 ± 0.7 ppb from HTAP_param over the historical period is 3.9 ppb lower than the CMIP6 multi-model mean, indicating that globally non-emission driven processes have contributed approximately 30% of the change in surface O3, although this contribution varies regionally. The different magnitude of response across models could be due to non-emission driven processes, e.g. from different chemistry schemes and climate change signals within models.
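The approximate 30% contribution can be checked with a short calculation using the global mean values quoted above. Expressing the contribution as the difference between the CMIP6 multi-model mean and the HTAP_param response, divided by the CMIP6 multi-model mean, is an assumption about how the figure was derived; the function name is hypothetical.

```python
def non_emission_fraction(total_change, emission_only_change):
    """Fraction of a change not explained by emissions alone.

    Hypothetical helper: both arguments are changes in the same unit
    (here ppb of surface O3 over 1850 to 2014).
    """
    return (total_change - emission_only_change) / total_change


cmip6_mean_change = 11.5   # ppb, CMIP6 multi-model mean quoted in the text
htap_param_change = 7.6    # ppb, emission-only HTAP_param value quoted in the text

fraction = non_emission_fraction(cmip6_mean_change, htap_param_change)
print(f"Non-emission driven share: {fraction:.0%}")
```

Under this assumption the result is a little over one third, which is consistent with the rounded value of approximately 30% given in the text.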
The simulated change in annual mean surface PM2.5 across 10 CMIP6 models is shown in Figure 10.

An analysis is now made of the future projections of air pollutants in the CMIP6 Tier 1 scenarios, including ssp370-lowNTCF. A comparison is made of the projected future changes by 2050 and 2100 in four CMIP6 models which had the most data available for the ssp370 scenario.

Surface Ozone
Global annual mean surface O3 is reduced by more than 4 +/-0.5 ppb (+/-1 standard deviation value of the multi-model mean) in the near-term (2050)  The global response in annual mean surface O3 concentrations to the different scenarios is also repeated across the different world regions, albeit with differing magnitudes. In ssp370 increases in annual mean surface O3 are predicted to occur across North America (+1.9 ppb), Europe (+4.8 ppb) and East Asia (+7.5 ppb), with the largest increase predicted in South Asia of 9.7 +/-3.7 ppb by 2100. Surface O3 increases across most world regions in this scenario can be attributed to the large increase in global CH4 abundances (80%) and the large predicted increase in surface temperatures (Figure S13), despite the reductions in O3 precursor emissions across North America, Europe and East Asia (Fig. 2). South Asia shows the largest increase in surface O3 as precursor emissions are anticipated to increase across this region on top of the large climate change signal and growth in CH4 abundance. Additionally, the largest diversity in predictions between the CMIP6 models is shown over South Asia, indicating that there is some disagreement between the models as to the magnitude and extent of changes over this region.
Surface O3 across oceanic regions (background) is predicted to remain at or near current values in ssp370 due to the increases in water vapour in a warming world leading to more O3 destruction. The impact of more aggressive are driven by larger precursor emission controls, a smaller climate change signal and controlling CH4 so that global abundances are just below 2015 values by 2100 (Fig. 1g). In ssp245 a near-term (up to 2040) increase in surface O3 is shown across Europe, East Asia and South Asia, which could be attributed to the peaking of global CH4 abundances at this point prior to then reducing.
The Tier 1 future scenario with the strongest climate and air pollutant mitigation measures, ssp126, shows substantial decreases in surface O3 concentrations across most regions due to the large reductions in precursor emissions and global CH4 abundances, and the small climate change signal. Reductions in surface O3 of more than 8 ppb are predicted across anthropogenic emission source regions of the northern hemisphere, with smaller reductions across southern hemisphere regions.
Predictions from the CMIP6 models show that to achieve global benefits for regional surface O3 it is important to control O3 precursor emissions (including CH4) in addition to limiting future climate change. However, scenarios with large increases in global CH4 abundances, a large climate change signal and limited control of precursor emissions fail to restrict regional increases in surface O3, leading to poor future air quality and potential human health impacts.

Figure 11 - Future global and regional changes in annual mean surface O3, relative to 2005-2014 mean, for the different SSPs used in CMIP6. Each line represents a multi-model mean across the region with shading representing the +/-1 standard deviation in the mean. See Table 1 for details of models contributing to each scenario. The multi-model regional mean value (+/-1 standard deviation) for the period 2005-2014 is shown in the top left corner of each panel.
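The multi-model mean and +/-1 standard deviation band described in the Figure 11 caption can be sketched as below. This is an illustration under stated assumptions, not the analysis code used in the study: the function name, the model labels and the input numbers are all hypothetical placeholders, and the use of the sample standard deviation (n - 1 denominator) is an assumption because the text does not say which form was used.

```python
from statistics import mean, stdev


def multi_model_summary(changes_by_model):
    """Return (mean, standard deviation) across models for one region and year."""
    values = list(changes_by_model.values())
    # stdev() is the sample standard deviation (n - 1 denominator); whether the
    # study used the sample or population form is not stated, so this is assumed.
    return mean(values), stdev(values)


# Placeholder numbers for illustration only, not results from any CMIP6 model.
example_changes_ppb = {"model_a": 8.0, "model_b": 10.0, "model_c": 12.0}

mm_mean, mm_std = multi_model_summary(example_changes_ppb)
lower, upper = mm_mean - mm_std, mm_mean + mm_std  # bounds of the shaded band

print(f"{mm_mean:.1f} +/- {mm_std:.1f} ppb (band {lower:.1f} to {upper:.1f} ppb)")
```

With only a handful of models contributing to some scenarios (Table 1), the choice between sample and population standard deviation changes the width of the band noticeably, which is why it is flagged as an assumption here.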
A more detailed comparison of future surface O3 predictions between CMIP6 models has been undertaken for ssp370, as it is the scenario with the largest number of available models (Table 1). The regional change in decadal mean surface O3, relative to 2005-2014, in 2050 (2045-2055 mean) and 2095 (2090-2100 mean) for ssp370 from four CMIP6 models and the HTAP_param is shown in Figure 12. Discrepancies in the simulated response of background O3 across the ocean region (also South Pole and Pacific, Australia and New Zealand) are noticeable between individual models, with UKESM1 predicting a decrease in surface O3 compared to the small increase from the HTAP_param and other models in both 2050 and 2095 (Figure S14). UKESM1 is a model with high equilibrium climate sensitivity (ECS, 5.4 K) compared to other CMIP6 models (Forster et al., 2019), and therefore will exhibit a larger climate response (surface temperature and water vapour), leading to enhanced background O3 destruction via water vapour and the hydroxyl radical (OH). Over the North Pole region all models show surface O3 increases that are larger than the HTAP_param, indicating that the large temperature response or changes to long-range transport could be an important driver over this region with comparatively low local emissions.
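The change in period-mean surface O3 relative to the 2005-2014 baseline, as used for the 2050 and 2095 comparisons above, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the year ranges are treated as inclusive exactly as written in the text, and the input series is synthetic (a simple linear trend), not model output.

```python
def period_mean(annual_values, start_year, end_year):
    """Mean of annual values over start_year..end_year (inclusive)."""
    selected = [v for year, v in annual_values.items()
                if start_year <= year <= end_year]
    if not selected:
        raise ValueError("no data in the requested period")
    return sum(selected) / len(selected)


def change_relative_to_baseline(annual_values, period, baseline=(2005, 2014)):
    """Period mean minus baseline mean, in the units of the input series."""
    return period_mean(annual_values, *period) - period_mean(annual_values, *baseline)


# Synthetic regional-mean series (ppb): a linear trend, for illustration only.
synthetic = {year: 30.0 + 0.1 * (year - 2005) for year in range(2005, 2101)}

change_2050 = change_relative_to_baseline(synthetic, (2045, 2055))
change_2095 = change_relative_to_baseline(synthetic, (2090, 2100))

print(f"2050 change: {change_2050:+.2f} ppb, 2095 change: {change_2095:+.2f} ppb")
```

Averaging over a window around each target year reduces the influence of interannual variability on the reported change, which matters most in regions where the forced signal is small.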
Differences in the predicted surface O3 between models exist across South Asia, where CESM2-WACCM (and BCC-ESM1 in 2050) predict a response that is twice as large as UKESM1 and GFDL-ESM4. The large increase in NOx emissions in this scenario over South Asia (~80%) has resulted in areas of NOx titration near the Indo-Gangetic plain in both UKESM1 and GFDL-ESM4, reducing surface O3 concentrations (Fig. S14). This feature of NOx titration of O3 is absent in both CESM2-WACCM and BCC-ESM1, resulting in larger O3 production over South Asia. The comparison in Fig. 12 shows how the O3 chemistry within models responds differently in a future scenario with a large climate change signal and over a region with large increases in local precursor emissions.

Over South America and Southern Africa, particularly the tropical areas (Fig. S14), larger future changes in surface O3 are predicted by GFDL-ESM4/UKESM1 than CESM2-WACCM. Over these regions, biogenic emissions (particularly isoprene) are an important contributor to O3 formation. Discrepancies in the magnitude of change in these emissions due to climate and land-use change could lead to the inter-model differences in surface O3. Total emissions of BVOCs (isoprene and monoterpenes) and their future change in ssp370 obtained from three models (Figure S15) show that CESM2-WACCM has larger emissions over the period 2005-2014, which increase in the future ssp370 scenario, whereas GFDL-ESM4 and UKESM1 have smaller increases in BVOC emissions, with some emissions reducing over parts of Africa in UKESM1. The BVOC emission changes appear to have affected the future O3 formation differently in the individual models over these regions and represent an important process to understand further.
Whilst there are disagreements between models over some regions, there is substantial consistency in the predicted increase to surface O3 in ssp370 over North America, Europe and East Asia, which is larger than that from HTAP_param. However, BCC-ESM1 tends to predict a larger increase than the other three models, potentially due to the coarser resolution of this ESM. As most anthropogenic precursor emissions are decreasing in this scenario across all these regions, changes in climate and global CH4 abundances seem to be the major driver of surface O3 increases.
The differences between the individual CMIP6 models highlight the importance of further understanding how future O3 chemistry is affected by changes to precursor emissions and climate. The predicted differences in models can be quite pronounced over regions like South Asia where changes in one model can be double that of another model, which could have important consequences for future regional air quality.
However, there is a degree of uncertainty associated with all of these future predictions indicated by the large diversity across the CMIP6 models. Some of the largest predicted increases in surface PM2.5 occur across South Asia in ssp370, a region that already has high present day PM2.5 concentrations. The increase in PM2.5 peaks in 2050 across this region, which coincides with the increase of SO2, BC and OC emissions, before declining to 2100 when emissions reduce. Over East Asia, annual mean PM2. Reductions in annual mean surface PM2.5 are simulated across all regions for ssp126, ssp245 and ssp585. Differences exist in the magnitude and timing of PM2.5 reductions across regions linked to the changes in emissions. The largest reductions in PM2.5 occur over South Asia in 2100 and range from 12.1 +/-1.9 µg m-3 in ssp126 to 9.1 +/-1.9 µg m-3 in ssp585, a substantial benefit to regional air quality. Similar benefits to PM2.5 are achieved over East Asia by 2100, although the more rapid improvements occur over this region in the first part of the 21st Century.
The response of PM2.5 concentrations is more variable, with a larger diversity across CMIP6 models within regions that are close to natural aerosol emission sources. This is particularly noticeable over North Africa where the variability across CMIP6 models in dust emissions from the Saharan source region (Fig. S7) results in an uncertain PM2.5 response across this region. A similar response is also exhibited across the Middle East and Central Asia. The potential influence of BVOCs on SOA formation (Fig. S15 and S18) could also be contributing to the diversity in the CMIP6 model responses across the South America and Southern Africa regions.

The CMIP6 models show that future reductions in aerosols and aerosol precursors will lead to a decrease in surface PM2.5 concentrations across most world regions and a benefit to regional air quality (and human health), consistent with that from CMIP5. However, if emissions are not controlled over economically developing regions such as South America, Asia and Africa then surface PM2.5 is anticipated to increase and worsen future regional air quality. Targeting emission reductions of NTCFs in the short-term shows the potential for rapid improvements in surface PM2.5 and air quality.

Figure 13 - Future global and regional changes in annual mean surface PM2.5, relative to 2005-2014 mean, for the different SSPs used in CMIP6. Each line represents a multi-model mean across the region with shading representing the +/-1 standard deviation in the mean. See Table 1 for details of models contributing to each scenario. The multi-model regional mean value (+/-1 standard deviation) for the period 2005-2014 is shown in the top left corner of each panel.

In a similar analysis to surface O3, a more detailed comparison has been undertaken of four CMIP6 models predicting changes

Disagreements in both the sign and magnitude of simulated future surface PM2.5 changes between CMIP6 models are also exhibited across East Asia. Small regional mean increases are predicted in 2050 for all models apart from GFDL-ESM4, attributed to a larger reduction in SO4 than other models across this region (Fig. S17). In 2095 most models, apart from CESM2-WACCM, simulate a reduction in PM2.5 concentrations across East Asia. All models simulate continual reductions out to 2100 for SO4 across this region, whereas BC increases in the near-term before decreasing out to 2100. For OA, CESM2-WACCM shows larger increases over East Asia in both 2050 and 2095 compared to the other models, which show a smaller increase in 2050 and a reduction by 2095 (Fig. S18). CESM2-WACCM includes a more complex treatment of SOA formation, showing a strong response to climate and historical trends in OA, which could explain the multi-model differences across East Asia. The discrepancies in CMIP6 models are not as obvious over South Asia as the effect of the increase in OA over South Asia in CESM2-WACCM is masked by coincident increases in other components across other models. CESM2-WACCM also shows larger simulated increases in PM2.5 over South America, Central America, Southern Africa and South East Asia than other models, which can be attributed to the larger increase in the OA fraction in this model. However, over Southern Africa UKESM1 shows a reduction in future PM2.5, in contrast to the other models. This can again be attributed to a reduction in the OA fraction in UKESM1 (Fig. S18), related to potential changes in land use and a reduction in biogenic emissions (monoterpenes) across Southern Africa in ssp370 (Fig. S15), the main precursor to SOA formation in this model (Mulcahy et al., 2019).
The decadal mean PM2.5 response is variable across individual CMIP6 models over regions close to natural sources of particulate matter (North Africa, Central Asia and Pacific, Australia and New Zealand). Over these regions there is a large range in both the sign and magnitude of the PM2.5 response, which can be mainly attributed to the dust fraction (Fig. S19) and the fact that this aerosol source has a large inter-annual variability in its emission strength. Interestingly, the CMIP6 models do not agree on the sign and magnitude of future changes to dust concentrations in ssp370 (Fig. S19).

Across the ocean and North Pole regions all the CMIP6 models tend to simulate a small increase in PM2.5 concentrations, which can be attributed to increases in sea salt concentrations (Fig. S20). A strong increase in all models is simulated across the Southern Ocean (and other oceans), potentially driven by changes to meteorological conditions which increase wind speed and sea salt emissions. As ssp370 is a scenario with a large climate change signal, the increases in PM2.5 across the North Pole can be attributed to the melting of sea ice increasing sea salt emissions. However, the magnitude of this response differs between the CMIP6 models due to the underlying ECS and the response of Arctic surface temperatures within each individual model.
The differences in the simulated future PM2.5 changes across the CMIP6 models in ssp370 highlight that it is important to consider how natural sources of aerosol respond in a future climate in addition to that from changes in anthropogenic emissions.
Particular differences between models have been shown for dust, sea salt and also organic (secondary) aerosols, which should be explored further. In addition, the different representations of aerosols within individual models, e.g. organic aerosols, are an important consideration as they can make a large difference to any future regional prediction of PM2.5.

Conclusions
In this study we have provided an initial analysis of the historical and future changes in air pollutants (O3 and PM2.5) from the latest generation of Earth system and climate models that have submitted results from experiments conducted as part of CMIP6. Data was available from the historical experiments of 5 CMIP6 models for surface O3 and 10 models for surface PM2.5.
Historical changes in regional concentrations of O3 and PM2.5 are presented over the period 1850 to 2014 using data from all models. A present day model evaluation of the CMIP6 models was conducted against surface observations of O3 and PM2.5 obtained from the TOAR and GASSP databases respectively. An additional comparison was performed for simulated PM2.5 concentrations against the MERRA-2 aerosol reanalysis product. An assessment is then made of the changes in surface O3 and PM2.5 simulated by the CMIP6 models across different future scenarios, ranging from weak to strong air pollutant and climate mitigation.

The 5 CMIP6 models simulate present day (2005-2014) surface O3 concentrations that are elevated in the Northern Hemisphere summer, with lower values throughout the year across the Southern Hemisphere. However, a large model diversity is shown across the continental Northern Hemisphere due to the large simulated seasonal cycles in certain models. Compared to surface O3 measurements, CMIP6 models consistently overpredict observed values in both summer and winter across most regions.
An exception to this is at observation locations across Antarctica where CMIP6 models tend to underpredict observed values.
Large surface PM2.5 concentrations are simulated in CMIP6 models near dust and anthropogenic emission source regions.
Model diversity across the CMIP6 models is largest near the dust source regions due to their sensitivity to meteorological variability, whereas across other regions the CMIP6 models are relatively similar in their simulation of PM2.5 concentrations. Evaluating the approximate PM2.5 calculated from CMIP6 models (excluding nitrate aerosols) against ground-based PM2.5 observations shows a consistent underprediction across most regions. The underestimation of observations by models is larger in the northern hemisphere winter than summer, in part due to the absence of nitrate aerosols within most CMIP6 models and also due to underrepresentation of other aerosol processes within the global models. To improve the spatial coverage and consistency of the PM2.5 evaluation with CMIP6 models an additional comparison was made to the MERRA-2 aerosol reanalysis product. A similar underestimation of PM2.5 concentrations over Europe and North America was found in the comparison of CMIP6 models and MERRA-2, providing confidence in this result from the ground-based comparison. CMIP6 models overestimated the PM2.5 concentrations in MERRA-2 over South and East Asia, contrary to the evaluation using ground-based observations. Annual mean cycles simulated by CMIP6 models and MERRA-2 tend to agree across other regions for which there are no suitable ground-based observations.

Across the historical period (1850-2014), the CMIP6 models simulated a global annual increase in surface O3 of between 7 and 14 ppb. A global multi-model mean increase of 11.5 +/-2.2 ppb was simulated by the CMIP6 models, which agrees well with the change previously simulated by CMIP5 models. A large diversity in the historical change of surface O3 was simulated by CMIP6 models across South Asia and other Northern Hemisphere regions.
CMIP6 models predicted larger historical changes in surface O3 than those from an emission-only driven parameterisation, indicating a potential climate change impact (Bloomer et al., 2009; Rasmussen et al., 2013; Colette et al., 2015) on surface O3 over the historical period. Small global increases in surface PM2.5 are simulated over the historical period by CMIP6 models, with larger regional changes of up to 12 µg m-3 across East and South Asia. The largest diversity in the response of CMIP6 models occurs over Asian regions, with large interannual variability near dust source regions. CMIP6 models simulate the peak in PM2.5 concentrations in the 1980s across Europe and North America, prior to the decline in concentrations to present day resulting from air pollutant emission controls over these regions.
The CMIP6 models predict surface O3 to increase across most regions in the weak mitigation scenarios (ssp370 and ssp585), particularly over South and East Asia (up to 10 ppb by 2100) due to a combination of increases in air pollutant emissions, increases in global CH4 abundances and climate change. Discrepancies exist in the regional surface O3 response in ssp370 between individual CMIP6 models due to differences in the future response of chemistry, climate and biogenic precursor emissions. Benefits to regional air quality from large reductions in surface O3 are possible across all regions for scenarios that contain strong climate and air pollutant mitigation measures, including those targeting CH4.
CMIP6 models predict surface PM2.5 concentrations to decrease across all regions in both the middle-of-the-road (ssp245) and strong mitigation scenarios (ssp126) by up to 12 µg m-3 due to the reduction in anthropogenic aerosols and aerosol precursor emissions, yielding a benefit to regional air quality. For the weak climate and air pollutant mitigation scenario (ssp370), by contrast, annual mean surface PM2.5 is simulated to increase across a number of regions. Implementing mitigation measures specifically targeting NTCFs on top of the ssp370 scenario shows immediate improvements in PM2.5 concentrations, restricting any changes to below present day values. The largest change in regional mean PM2.5 concentrations, and also largest diversity across CMIP6 models, is predicted in ssp370 across South Asia, an area with already poor air quality. Disagreements in the prediction of future changes to regional surface PM2.5 concentrations between individual CMIP6 models can mainly be attributed to differences in the aerosol schemes implemented within models, in particular the formation mechanisms of organic aerosols. Additionally, the strength of the climate change signal within models can have important impacts on natural aerosol emissions, leading to discrepancies between models.
The results from CMIP6 provide an opportunity to assess the simulation of historical and future changes in air pollutants within the latest generation of Earth system and climate models using up to date scenarios of future socio-economic development. Large changes in air pollutants were simulated over the historical period, primarily in response to changes in anthropogenic emissions. Future regional concentrations of air pollutants depend on the particular trajectory of climate and air pollutant mitigation that the world follows, with important consequences for regional air quality and human health. Substantial benefits can be achieved across most world regions by implementing measures to mitigate the extent of climate change, as well as from large reductions in air pollutant emissions, including CH4, which is particularly important for controlling O3. In future scenarios which do not mitigate climate change and air pollutant emissions, the regional concentrations of air pollutants are anticipated to increase. Important differences between individual CMIP6 models have been identified in terms of how they treat the interaction of chemistry, climate and natural precursor emissions in the future. Further research into, and understanding of, these processes is necessary to improve the robustness of regional predictions of air pollutants on climate change timescales (decadal to centennial).

Data Availability
CMIP6 data is archived at the Earth System Grid Federation and is freely available to download. A list of the model datasets used in this study is provided in Table 1.