Evaluation of SO 2 , SO 2 − 4 and an updated SO 2 dry deposition parameterization in UKESM 1

In this study we evaluate simulated surface SO2 and sulphate (SO2− 4 ) concentrations from the United Kingdom Earth System Model (UKESM1) against observations from ground based measurement networks in the USA and Europe for the period 1987 to 2014. We find that UKESM1 captures the historical trend for decreasing concentrations of atmospheric SO2 and SO2− 4 in both Europe and the USA over the period 1987 to 2014. However, in the polluted regions of the eastern USA and Europe, UKESM1 over-predicts surface SO2 concentrations by a factor of 3, while under-predicting surface SO2− 4 5 concentrations by 25-35%. In the cleaner western USA, the model over-predicts both surface SO2 and SO2− 4 concentrations by a factor of 12 and 1.5 respectively. We find that UKESM1’s bias in surface SO2 and SO2− 4 concentrations is variable according to region and season. We also evaluate UKESM1 against total column SO2 from the Ozone Monitoring Instrument (OMI) using an updated data product. This comparison provides information about the model’s global performance, finding that UKESM1 over predicts total column SO2 over much of the globe, including the large source regions of India, China, the 10 USA and Europe as well as over outflow regions. Finally, we assess the impact of a more realistic treatment of the model’s SO2 dry deposition parameterization. This change increases SO2 dry deposition to the land and ocean surfaces, thus reducing the atmospheric loading of SO2 and SO2− 4 . In comparison with the ground-based and satellite observations, we find that the modified parameterization reduces the model’s over prediction of surface SO2 concentrations and total column SO2. Relative to the ground-based observations the simulated surface SO2− 4 concentrations are also reduced, while the simulated SO2 dry 15 deposition fluxes increase. 1 https://doi.org/10.5194/acp-2021-238 Preprint. Discussion started: 28 April 2021 c © Author(s) 2021. CC BY 4.0 License.


Introduction
Anthropogenic sulphur dioxide (SO 2 ) emissions have been the main driver of the historical aerosol effective radiative forcing (ERF) since the mid 20th century (Boucher et al., 2013). SO 2 is emitted into the atmosphere from a number of anthropogenic 20 (and natural) sources and once in the atmosphere SO 2 can be oxidised to form sulphate (SO 2− 4 ) aerosol, which plays a key role in both acid deposition, atmospheric aerosol loading and cloud properties, thereby directly influencing the Earth's radiative balance. For Earth system models (ESMs) to have a good representation of the historical climate and thereby give us confidence in their future projections, it is extremely important that they can capture the sulphur cycle. The UK's Earth system model (UKESM1), in common with other ESMs, has a cold bias in the mid 20th century which looks to be associated with an 25 excessively negative aerosol ERF (Sellar et al., 2019;Seland et al., 2020). A key component of the analysis and development of UKESM1 focuses on the model's sulphur cycle and its link to historical aerosol forcing. Mulcahy et al. (2020) conducted an in-depth evaluation of the aerosol species in UKESM1 and its physical model component, HadGEM3-GC3.1, including SO 2− 4 and uncovered some interesting differences in the sulphur budget between these two models including differences in the SO 2 lifetimes and oxidant loading. We aim to extend their work by conducting a detailed evaluation of SO 2 and by probing deeper 30 into the process level uncertainty of the sulphur cycle.
Total global emissions of SO 2 increased to a peak value of approximately 180 Tg y −1 in the 1970s, but following emission 35 reduction policies to improve air quality and reduce acid deposition that were implemented in the 1980s (Hoesly et al., 2018), total global emissions had decreased to approximately 120 Tg y −1 by 2015 (Aas et al., 2019). This trend is captured in global models, but there is substantial temporal variation at the regional scale (Aas et al., 2019). Legislation has driven reductions in SO 2 emissions and subsequently SO 2− 4 aerosol across Europe (Torseth et al., 2012) and North America (Sickles II and Shadwick, 2015;Holland et al., 1998). In these regions reductions in SO 2 emissions have had important environmental and 40 health benefits as well as climate impacts. Turnock et al. (2015) found that between 1970 and 2010 surface SO 2− 4 aerosol reduced by about 70% in the observations and also in the simulations. For the same period, top of atmosphere (TOA) aerosol radiative forcing over this region increased by >3 W m −2 in response to these changes in anthropogenic emissions. Similarly Leibensperger et al. (2012) reported that over the USA aerosol raditive forcing decreased by 1.0 W m −2 in the period from 1990 to 2010. Emission reduction policies in China have been implemented since 2013, which has reduced anthropogenic SO 2 45 emissions (Aas et al., 2019;Zheng et al., 2018;Hoesly et al., 2018;Liu et al., 2018;Krotkov et al., 2016), and subsequently driven decreases in aerosol optical depth (AOD) (Zhao et al., 2017). However, SO 2 emissions from India continue to increase (Aas et al., 2019;Liu et al., 2018;Krotkov et al., 2016).
Good representation of the sulphur cycle in models is essential for constraining uncertainties associated with the impacts of 50 aerosols on the Earth system and thus understanding the global climate. The global atmospheric loading of SO 2 is controlled Monitoring Network in East Asia (EANET) in 2001 (e.g. Wang et al., 2008). Since the early 2000's satellite observations of near surface SO 2 have also become available. Of these, the satellite data sets with the best temporal resolution and spatial coverage for SO 2 are from the Ozone Monitoring Instrument aboard the NASA Earth Observing System Aura spacecraft . Although biases in the SO 2 retrieval from OMI limit its use at high and low latitudes in winter and over ares 125 with low atmospheric loading, they do provide valuable information over regions where there are no long term, or even any ground-based observations Levelt et al., 2018). This paper is configured as follows; the model (UKESM1), the model simulations, observation data sets and modifications to UKESM1's SO 2 dry deposition parameterization are described in Section 2. In Section 3 we evaluate UKESM1 against obser-130 vations of surface SO 2 and SO 2− 4 and total column SO 2 . In Section 4 we assess the impact of the modifications to UKESM1's SO 2 dry deposition parameterization. The discussion and conclusions are presented in Sections 5 and 6.

UKESM1
135 UKESM1 is the latest generation Earth System (ES) model to be developed in the UK. UKESM1 has HadGEM3-GC3.1 (Kuhlbrodt et al., 2018;Williams et al., 2018) as its physical-dynamical core. HadGEM3-GC3.1 is comprised of the Global Atmosphere 7.1 (GA7.1) configuration of the Unified Model (UM) (Walters et al., 2019;Mulcahy et al., 2018); the Nucleus for European Modelling of the Ocean (NEMO) model ; the Los Alamos Sea Ice Model (CICE, Ridley et al., 2018) and the Joint UK Land Environment Simulator (JULES) land surface model . The additional 140 ES process models include the stratospheric-tropospheric (StratTrop) version of the United Kingdom Chemistry and Aerosol (UKCA) model (Archibald et al., 2020), the Model of Ecosystem Dynamics, nutrient Utilisation, Sequestration and Acidification, (MEDUSA, Yool et al., 2013) and the terrestrial biogeochemistry component of JULES . UKESM1 is described in detail, along with its component models and the coupling between them, by Sellar et al. (2019). The aerosol scheme used in the HadGEM3-GC3.1 (GLOMAP-Mode, Mann et al., 2010)), including SO 2 emissions and chemistry, is de-145 scribed in detail by Mulcahy et al. (2020). In UKESM1 the land and atmosphere share a regular latitude-longitude grid with a resolution of 1.25 • × 1.875 • (approximately 135 km at the mid latitudes). There are 85 vertical levels on a terrain following hybrid height coordinate with a model lid at 85 km above sea level and 50 of these levels below 18 km. The ocean has a horizontal resolution of 1 • and 75 vertical levels. While the atmospheric time step of the model physics is 20 minutes, due to the inherent computational cost of the chemistry and aerosol components, both of these components are called once per hour.

150
In UKESM1 the SO 2 emissions, including anthropogenic sources are from the CMIP6 inventory (Feng et al., 2020). Large explosive volcanic sources and biomass burning sources are not interactively modelled, but prescribed using the CMIP6 stratospheric aerosol climatology (Sellar et al., 2019) and van Marle et al. (2017) emissions inventory respectively. Continuously degassing volcanic sources are also included as present-day, three-dimensional, temporally fixed (i.e. no seasonal variation) 155 fields (Dentener et al., 2006). Emissions from the energy and industrial sectors are all emitted into the first model layer. We summarize the loss of SO 2 from the atmosphere via oxidation, wet and dry deposition here, but for a detailed description of these processes in UKESM1 the reader is referred to Archibald et al. (2020) and Mulcahy et al. (2020). Gas phaser and aqueous phase oxidation of SO 2 to SO 2− 4 is represented by the reactions shown in Table 1 (Pham et al., 1996;Sander et al., 2003;Kreidenweis et al., 2003). Dry deposition of SO 2 is parameterized following the resistance in series approach originally developed 160 by Wesely (1989) (see Section 2.2.1). Loss via wet deposition is the SO 2 that is scavenged and subsequently converted to SO 2− 4 in rainwater. It is parameterized as a first-order loss rate, calculated as a function of UKESM1's three-dimensional convective and large-scale precipitation (Archibald et al., 2020;O'Connor et al., 2014). Sulphate aerosol is also removed from the atmosphere by dry and wet deposition (Mulcahy et al., 2020). The aerosol dry deposition and sedimentation are represented by a resistance in series approach similar to that used for gaseous species, but which also accounts for aerosol size (Mann 165 et al., 2010). Wet deposition is parameterized in UKESM1 by an in-cloud convective plume scavenging scheme following the approach described by Kipling et al. (2013) and by nucleation scavenging (Mulcahy et al., 2020).
In the "resistance in series" approach originally developed by Wesely (1989) dry deposition flux is proportional to the atmo-170 spheric concentration of a species multiplied by its deposition velocity (Eqn. 1). The dry deposition velocity is calculated by analogy with electrical resistance as shown in Eqn. 2 where R a is aerodynamic resistance to gas transport through the near surface turbulent layer, R b is resistance to gas transfer across a quasi-laminar sublayer surrounding the receptor surface and R c is structural resistance to deposition of the receptor surface itself. For a detailed description of this approach, see Wesely The surface or canopy resistance to deposition, R c , is the most difficult of the three resistances to parameterize as it is sen-180 sitive to biochemical details of the individual receptor surfaces. R c is typically a function of the following receptor-specific resistances: (i) R stom , canopy stomatal resistance, combined with the mesophyll resistance (R m ) of a given plant, (ii) R cut , canopy cuticle or external leaf resistance and (iii) soil resistance R soil , combined with an in-canopy resistance R inc , describing the turbulent transport of a gas through the plant foliage to the ground. For surfaces not covered with vegetation (e.g. open water, bare soil or snow covered surfaces), R c is made equal to one of; R water , R soil , R snow Zhang et al., 2003). In this study we investigate two changes to the SO 2 dry deposition parameterization in UKESM1. Firstly, we account for a key omission in the UKESM1 calculation of R cut and R soil in that no account is taken as to whether the receptor surface is wet or dry, nor of the near surface relative humidity or soil pH. Observational studies suggest that SO 2 dry deposition (through a 190 decrease in R c ) is significantly more efficient over wet surfaces compared to dry surfaces, as well as for increasing values of near surface relative humidity (e.g., Garland and Branson, 1977;Fowler, 1978;Erisman and Baldocchi, 1994) due to the high solubility of SO 2 in water. Based on these findings we extend the calculation of R cut for SO 2 in UKESM1 to be a function both of whether the model vegetation is wet or dry and to the near surface relative humidity. This change allows a surface to remain wet after rainfall for a period of three hours, where previously it would have been "dry" immediately after the rainfall 195 event. R soil for SO 2 is also made a function of near surface relative humidity. These changes are referred to as R surf -mod and will impact SO 2 dry deposition over land surfaces. Our approach closely follows that of Erisman and Baldocchi (1994);  and interested readers are referred to those two articles for more detail. Secondly, we change the the surface resistance term for SO 2 dry deposition to water (R water ) from an erroneously high value of 148 s m −1 to 1 s m −1 to better reflect the high solubility of SO 2 in water. While lower than the value of 20 s m −1 used by Zhang et al. (2003), it reflects 200 the small, observed value of 0.004 s m −1 from Garland and Branson (1977). This change is referred to as R water -mod and will impact SO 2 dry deposition predominantly over the ocean.
In addition to the primary changes to the SO 2 dry deposition parameterization (R water -mod and R surf -mod), we also include two secondary modifications. These are (1) Dyer (1974) to that described by Holtslag and Bruin (1988). We also reduce the reference height for dry deposition (z) from 50 m to 10 m in line with Holtslag and Bruin (1988

Model simulations
For this evaluation we initially use 4 simulations from the 19 member ensemble of historical simulations that were conducted 225 for UKESM1's contribution to CMIP6 (Sellar et al., 2019;Tang, 2019). The historical simulations cover the period from 1850 to the end of 2014, thus modelling the evolution of climate and composition since the pre-industrial era. These simulations are forced by transient external forcings of solar variability, land use, well-mixed greenhouse gases and other trace gas emissions and aerosols. The volcanic forcing due to the stratospheric injection of SO 2 from volcanic eruptions is prescribed as a zonal mean climatology of the stratospheric aerosol optical properties over the historical period. All forcings and how they are 230 implemented in UKESM1 are described fully in Sellar et al. (2019). Each historical ensemble member was initialised from a different date in the pre-industrial control simulation (Yool et al., 2020). We use monthly mean output for surface SO 2 and SO 2− 4 concentrations, and SO 2 dry deposition flux. We use a second four member ensemble of historical simulations to evaluate the impact of the changes to the SO 2 dry deposition parameterization described in Section 2.2.1, hereafter this ensemble is referred to as UKESM1-SO2. The UKESM1-SO2 historical simulations are set up and run as for the UKESM1 historical 235 simulations.

Ground based observations
We compare the modelled surface SO 2 and SO 2− 4 concentrations to observations from the Clean Air Status and Trends Network (CASTNet, http://epa.gov/castnet/javaweb/index.html, Finkelstein et al., 2000) and the European Monitoring and Evaluation 240 Program (EMEP, http://ebas.nilu.no/; Torseth et al., 2012). CASTNet provides surface observations of mean seasonal SO 2 and SO 2− 4 concentrations which are available from 1987 to the present at 97 sites situated in the United States of America (USA). In this study we used observations from the CASTNet sites designated as "western reference" or "eastern reference".
The reference sites have been reporting measurements since at least 1990 and are used for determining long term trends, (e.g., Clarke et al., 1997;Holland et al., 1998;MACTEC-Engineering and Consulting, 2005;Baumgardner et al., 2002). There are 245 16 western reference sites and 33 eastern reference sites which are located in the continental USA to the west and east of 100 • W respectively (MACTEC-Engineering and Consulting, 2005). The eastern region is significantly more polluted than the western region due to the larger number of SO 2 sources there. We therefore keep the western and eastern data sets separate to assess how UKESM1 performs in the two regions. Hereafter, we refer to the eastern and western USA regions as USA-E and USA-W, respectively. For this evaluation we used the mean seasonal surface concentrations for SO 2 and SO 2− 4 which 250 are determined from hourly measurement data. Details of the quality control procedures and of how the the mean seasonal concentrations are calculated are given in (Baumgardner et al., 2002).
We also evaluate the simulated SO 2 dry deposition flux from UKESM1 against observations from CASTNet, using the same eastern and western reference sites that were used to evaluate surface SO 2 and SO 2− 4 concentrations. The CASTNet deposition 255 fluxes are derived using modelled deposition velocities rather than directly measured fluxes, which are difficult to obtain due to the requirement for extensive instrumentation and technical resource. Direct measurements of SO 2 dry deposition flux are therefore temporally and spatially limited, and not suitable for evaluating long term trends. To derive the SO 2 dry deposition fluxes, measurements of SO 2 concentration are combined with routine meteorological measurements, information on the land use type and LAI at the measurement site. This data is then combined with modelled deposition velocities from the Multi Layer Model (MLM, Meyers et al., 1998;Saylor et al., 2014). The methodology used to derive the SO 2 dry deposition fluxes for CASTNet is described in Clarke et al. (1997); Baumgardner et al. (2002). While the modelled SO 2 dry deposition fluxes can be under-predicted by approximately 30% (Clarke et al., 1997), it is considered to be the best available approach to regional scale assessment of dry deposition (Finkelstein et al., 2000;Baumgardner et al., 2002;E. Sickles and Shadwick, 2007;Sickles and Shadwick, 2007). This approach has been used to determine SO 2 dry deposition fluxes for CASTNet since 1987 (e.g., Clarke 265 et al. , 1997;Baumgardner et al., 2002) and to assess global and regional scale models (Vet et al., 2014;Tan et al., 2018;Tang et al., 2018).
Surface SO 2 and SO 2− 4 concentrations have been monitored at EMEP sites for the period 1972 -present (Torseth et al., 2012). In this study we have used observations of surface concentration of SO 2 and SO 2− 4 from 144 and 99 sites respectively,

270
although not all sites have measurements over the full period. We use monthly mean observations for both species. No SO 2 dry deposition data were available from EMEP. The locations of the CASTNet and EMEP sites used in this study are shown in Figure 1.

Data processing 275
For this evaluation we calculated seasonal averages for the modelled surface SO 2 and SO 2− 4 concentrations and for the EMEP observational data. The seasonal periods were defined as December-January-February (DJF), March-April-May (MAM), June-July-August (JJA) and September-October-November (SON). The CASTNet data for all variables was available as seasonal averages for these periods. Model grid cell output was co-located with the CASTNet and EMEP measurement sites. In some cases, this resulted in model data from a particular grid cell being compared with more than one measurement site. For the 280 time series analysis, regional means and standard deviations were calculated across the sites in the USA-E, USA-W and EMEP regions. Although there is spatial variation in the surface SO 2 and SO 2− 4 concentrations across Europe, for example concentrations are relatively low in Scandinavia, but are much higher in South East Europe, it is less easy to classify "clean" and "polluted" regions at the global model scale. Therefore we classify Europe as a single region. For the spatial analysis and calculation of time series statistics, we calculate mean values over the whole time series, i.e. 1987-2014 and for two time 285 slices at the start (1990)(1991)(1992)(1993)(1994)(1995) and end of the time series (2009)(2010)(2011)(2012)(2013)(2014). We investigate the two time slices to assess the model's performance during the different pollution levels at the start and end of the the time series. We determine the rate of change (trend) in the surface concentrations by calculating the linear regression for 1987-2014, 1990-1995 and 2009-2014.

Satellite observations 290
Total column SO 2 (TCSO 2 ) measurements came from the Ozone Monitoring Instrument (OMI), which were obtained from the Goddard Earth Sciences Data and Information Services Centre (https://aura.gesdisc.eosdis.nasa.gov/data/Aura_OMI_Level2/ OMSO2.003) . OMI is situated on-board NASA's polar-orbiting Aura satellite launched in 2004 with a local overpass time of approximately 13:45. OMI has a nadir footprint of 13 km × 24 km and a spectral viewing range of 270 to 500 nm (Levelt et al., 2018). The TCSO 2 product is quality controlled for cloud radiation fraction > 0.0 and < 0.5, solar zenith  For a robust comparison between model simulations and satellite data, to reduce sampling (representation) errors, both data sets typically require spatio-temporal co-location. To achieve this, high temporal resolution (e.g. 3 hourly or 6 hourly) model output of 3D tracer and pressure fields are required over the analysis period to capture e.g. diurnal variability (Pope et al., 2016;Monks et al., 2017). However, this is difficult when using standard climate model simulations, including those used in this study, which typically output monthly means due to their long term climate focus. In this comparison we performed tests to 305 show that using the monthly mean model output was suitable given the relatively uniform diurnal cycle of SO 2 emissions. For these tests we made initial comparisons of model output and satellite data using 6 hourly and monthly mean output from the same UKESM1 simulation. We found that the temporal sampling of the model was not overly critical for SO 2 , i.e. modelled SO 2 has a sufficiently long lifetime to dampen the influence of diurnal sampling of the model. Further details on the TCSO 2 product, how it was processed to obtain TCSO 2 values and the assessment of the temporal resolution is given in Pope and  Tables 3 and 4. We find that UKESM1 captures the historical reduction in surface SO 2 and SO 2− 4 concentrations, but was less able to simulate the finer detail in the temporal trends. UKESM1 over-predicts surface SO 2 concentrations in all three regions, but the direction of the model's bias in surface SO 2− 4 concentrations is spatially variable. UKESM1 over-predicts the annual mean surface SO 2 concentrations in the polluted regions of Europe and USA-E by a factor of 3.2 to 3.4 over the period 1987-2014, although the absolute bias is higher USA-E (see Table 3). While the absolute magnitude of the bias in mean annual surface SO 2 concentration is less in USA-W compared with the polluted regions, pro-340 portionally it is much larger, with the model simulating surface SO 2 concentrations more than 10 times the observed values.
The absolute magnitude of the bias in mean annual surface SO 2 concentration decreased from 1990-1995 to 2009-2014 in all three regions (see Table 3) reflecting the model's more rapid decrease in surface SO 2 concentrations relative the observations. However, the NMB values were slightly higher in 2009-2014 compared with 1990-1995. 345 Figures 2e-f show that UKESM1 better captures both the magnitude and the trends in surface SO 2− 4 concentrations compared to surface SO 2 concentrations. The model simulated surface SO 2− 4 concentrations decreasing at a rate of 0.14 µg m −3 y −1 (USA-E) and 0.12 µg m −3 y −1 (Europe) compared with the observed trend of 0.16 µg m −3 y −1 (see Table 4). UKESM1 does under-predict mean annual surface SO 2− 4 concentrations in the polluted regions of USA-E and Europe, but the model bias is relatively small compared with the large over prediction of mean annual surface SO 2 concentration (see Tables 3 and 4). We 350 also note that there is a large range associated with the modelled and observed data, and that the mean surface SO 2− 4 concentrations lie within these ranges.  Table 4). The NMB was slightly higher for both regions in the later period, but the values remained low. The picture is different in USA-W where, in contrast to USA-E and Europe, UKESM1 over predicts 355 mean annual surface SO 2− 4 concentration by 150% over the period 1987-2014. Both the absolute model bias and the NMB is worse in 1990-1995 than in 2009-2014, which may be attributed to the model simulating a much faster decrease in mean annual surface SO 2− 4 concentrations compared to the observed trend for the later period (see Table 4).   Figure A1). These figures also show the localised versus dispersed nature 365 of the surface SO 2 and SO 2− 4 concentrations, with high SO 2 concentrations located within 2 -3 grid boxes (200 -400 km) of the emission sources (see Figure A1), while SO 2− 4 is distributed more widely. UKESM1 is also able to simulate the range    However, the divide between higher concentrations in USA-E compared with lower concentrations in USA-W is still apparent for both species in the later period.
In Europe the highest mean annual surface SO 2 and SO 2− 4 concentrations were observed in central and eastern Europe and the South East (SE) of England (see Figure 4). Lower concentrations of both species were observed in Scandinavia, the Iberian 375 peninsular, south west France and at Mace Head on the west coast of Ireland. Figure 4c shows that mean annual surface SO 2   The model's over-prediction of mean annual surface SO 2 concentration is largest close to the sources. For example, UKESM1 over-predicts surface SO 2 concentrations by up to 50 µg m −3 in the central USA-E area, but only by around 10 µg m −3 at the surrounding sites. In Europe the largest biases of around 10-20 µg m −3 tend to be distributed across the region due to the more distributed nature of the point sources and on average the model bias is lower than in USA-E. The model's tendency to over-400 predict SO 2 concentrations to a greater extent close to the sources is also shown in the plots of NMB (see Figures 5c and 6c) and is likely why there is a larger range in the modelled mean annual surface SO 2 concentrations averaged across the USA-E and European measurement sites compared with the observational values (see Figures 2b and c). Figures 5b and 6b show that UKESM1 generally under-predicts surface SO 2− 4 concentration across the USA-E and European sites, with model biases of -1 to -3 µg m −3 . We find that in USA-E the larger negative biases are not directly co-located with the largest positive biases  The results presented in Sections 3.1 and 3.2 show that UKESM1 consistently over-predicts mean annual surface SO 2 concentrations in the USA and Europe, however Figures 7a-c show that in the more polluted regions (USA-E and Europe), the magnitude of the bias is seasonal, although still with a large positive bias. UKESM1 is able to capture the seasonal cycle in 425 surface SO 2 concentrations over Europe, but the absolute model bias is larger in DJF compared with JJA (see Table 5). In     Figure 8 shows total column SO 2 (TCSO 2 ) from UKESM1 and OMI, and the difference between them for DJF and JJA.
Note that the quality control for solar zenith angle results in no data availability above 65 • N degrees or below 65 • S in the winter months and due to OMI's weaker sensitivity to retrieving SO 2 in remote regions we focus on comparing TCSO 2 over source and outflow regions . TCSO 2 values by 0.2 -0.5 DU. UKESM1 also has larger volcanic sources and associated outflow, which can be seen over central America, Sicily, Hawaii and Papua New Guinea, for example. This is likely due to the climatology that UKESM1 uses 450 for continuously degassing volcanoes. In agreement with the ground based observations, the satellite data shows an East-West divide in the USA, with greater TCSO 2 over USA-E compared with USA-W.  The observations of TCSO 2 and surface SO 2 concentrations over Europe both show the impact of emission control policies on keeping atmospheric SO 2 levels low and stable (see Figures 2c and 9b). For the USA, we investigate the surface SO 2 concentrations separately over the USA-E and USA-W regions, whereas the TCSO 2 is averaged over the whole continental USA. As a result, the trend for decreasing surface SO 2 concentrations in USA-E (see Figure 2b) Tables 3 and 6). However, the R values for surface SO 2 concentration (>0.8) are much better than those for TCSO 2 (0.26 -0.53), particularly over the USA. The low R value for the USA 480 reflects the poor seasonal agreement in TCSO 2 in this region. The comparison against both observational data sets shows that the modelled atmospheric SO 2 is too high, both at the surface and through the column. In addition, UKESM1 predicts trends in TCSO 2 and surface SO 2 , particularly prior to 2010, that with the exception of USA-E are not seen in the observations. Both observational data sets also show that UKESM1's over-prediction of atmospheric SO 2 is greater in the winter months com- The largest absolute reductions in mean annual surface SO 2 and SO 2− 4 concentrations are over the source regions, corresponding with the locations of the largest increases in SO 2 dry deposition. This is expected because dry deposition of SO 2 to the surface is directly proportional to the surface concentration (Equation 1). Figure 10e shows that mean annual surface corresponds to a percentage decrease of 30-50%. Note that this is similar to the percentage decrease in mean annual surface SO 2 concentration over remote and ocean regions, although the absolute fluxes are much larger over the source regions. With the exception of some areas in the Sahara and the Middle East, we do not see increases in mean annual surface SO 2 concentration where dry deposition fluxes decrease (albeit by very low amounts), such as the Arctic. We suggest that this is because 515 these areas are remote and contain no SO 2 sources and by reducing SO 2 in the source regions, we reduce overall atmospheric SO 2 loading and less is transported to remote areas. Figures 10h and i show that mean annual surface SO 2− 4 concentrations are also reduced over the main source regions, although the reductions over the USA are relatively small compared to the other large source regions (0.5 µg m −3 compared with up to 3 µg m −3 in central and eastern Europe and China). Proportionally, the reduction in mean annual surface SO 2− 4 concentration is smaller than that for mean annual SO 2 concentration, with decreases 520 generally less than 5% over most source regions.

Evaluation of UKESM1-SO2 against ground-based observations
In Figure 11 and Table 7 we evaluate UKESM1-SO2 against the ground-based observations of mean annual surface SO 2 and SO 2− 4 concentration for the USA and Europe regions over the period from 1987-2014. UKESM1-SO2 is also evaluated against 525 SO 2 dry deposition flux from CASTNet over the same period. Figures 11 a and b show that SO 2 dry deposition flux is increased in UKESM1-SO2 relative to UKESM1 with this increase being more pronounced over USA-E compared with USA-W (see also Table 7). The increase in SO 2 dry deposition in UKESM1-SO2 does enhance the model's over-prediction of this parameter relative to the CASTNet data, with NMB increasing from 1.15 in UKESM1 to 3.0 in UKESM1-SO2. Mean annual SO 2 dry deposition fluxes are very low over USA-W due to the much lower concentrations of SO 2 in this region. UKESM1 does 530 over-predict mean annual SO 2 dry deposition flux in this region too, but the absolute bias changes very little in UKESM1-SO2 (see Table 7). Figures 11 c -e show that model bias in mean annual surface SO 2 concentration is reduced in UKESM1-SO2 compared with UKESM1 in all three regions over the period 1987-2014. The largest absolute reduction is in USA-E where mean annual surface SO 2 concentration decreases from 20.36 µg m −3 to 13.97 µg m −3 , however, the largest reduction in NMB is in USA-W where it decreases from 11.57 to 8.77 (see Table 7). The model's over prediction of mean annual surface SO 2− 4 535 concentration is reduced over USA-W in UKESM1-SO2 compared with UKESM1, with NMB decreasing from 0.5 to 0.24. However, the model's under prediction of mean annual surface SO 2− 4 concentration over USA-E and Europe increases, with Table 7. Statistics for mean annual surface SO2 and SO 2− 4 concentrations and SO2 dry deposition flux for UKESM1 and UKESM1-SO2. The mean seasonal values are averaged over the period 1987-2014 and the time slice mean is in µg m −3 for SO2 and SO 2− 4 concentrations, and kg m −2 y −1 for SO2 dry deposition flux.  Table 7). We find that the changes to the surface concentration and dry deposition flux occur almost uniformly over the seasonal cycle and so do not change the patterns in seasonal bias that are described in Section 3.4 for surface SO 2 and SO 2− 4 concentrations. The impact of the modifications to the SO 2 dry deposition parameterization on TCSO 2 are shown in Figure 12 and Table 6. regions (e.g. off the USA eastern seaboard), TCSO 2 has reduced by 30-50% and over the source regions, this varies by 30-50% for South to North East Asia, 20-30% for Europe and 10-30% for the USA (see Table 6). However, Figures 12c and d also show that the inter-model differences are smaller than the existing model-satellite difference, i.e. the hatched regions are sporadic 550 with limited coverage.

Discussion
The evaluation of UKESM1 against ground-based observations of SO 2 and SO 2− 4 concentrations from the USA and Europe, as well as SO 2 dry deposition fluxes from the USA, shows that the model is able to represent recent historical changes in 555 these variables. UKESM1 is also able to capture the spatial patterns in surface SO 2 and SO 2− 4 concentrations and SO 2 dry deposition, simulating larger values close to the sources and lower values away from the sources. However, UKESM1 generally over-predicts surface SO 2 concentrations and dry deposition fluxes, while under-predicting surface SO 2− 4 concentrations over the period 1987-2014. Further, we find that UKESM1 over-predicts the rate at which the surface SO 2 concentrations decrease over this period. 560 We also make use of the updated TCSO 2 product from OMI to evaluate UKESM1, finding that the model captures spatial patterns in TCSO 2 at the global scale. Importantly, this evaluation allows us to identify model bias in regions without longterm ground-based networks, showing that UKESM1 over-predicts TCSO 2 over all source and outflow regions. We find that although the ground-based and satellite observations are subject to different uncertainties, UKESM1's relative over-prediction 565 of both surface SO 2 concentration and TCSO 2 is similar in the USA and Europe. This suggests that our findings of positive bias in modelled atmospheric SO 2 is robust. We have also demonstrated that a more realistic treatment of SO 2 dry deposition in UKESM1 reduces the model's atmospheric loading of SO 2 and SO 2− 4 . However, we find that UKESM1's under-prediction of surface SO 2− 4 concentrations and over-prediction of SO 2 dry deposition fluxes increases when the changes are included in the model, suggesting that there are further uncertainties UKESM1's representation of the complex sulphur cycle processes.

570
Additionally, the spatial and temporal differences in the model bias suggests that the drivers of model bias are regionally and seasonally dependant.
Broadly, model bias in atmospheric SO 2 can be driven by too little removal of SO 2 from the atmosphere (via deposition or oxidation) or too high emissions. UKESM1 uses SO 2 emissions from CMIP6 (Eyring et al., 2016). In comparing these 575 emissions with the HTAP-OMI  and EDGAR (Crippa et al., 2018) data sets, Pope and Chipperfield (2021) showed that the total SO 2 emissions in CMIP6 are moderately larger than the HTAP-OMI and EDGAR data sets. Further, comparing the spatial distribution showed that the CMIP6 emissions from almost all sources are larger than the HTAP-OMI emissions. Exceptions occur at a number of point locations, which are likely from OMI rather than HTAP. Emission injection height is also an important constraint on near surface SO 2 concentrations as demonstrated by Yang et al. (2019). This study 580 found that uncertainty in industrial emission height resulted in modeled near-surface SO 2 concentrations varying between 70% and 130% over most land regions, higher than the overall uncertainty of 8-14% attributed to SO 2 emission rates. The SO 2 injection height in UKESM1 was investigated by Mulcahy et al. (2020) who used injection heights prescribed as for HadGEM-GC3.1, where 50% of energy and industry sector emissions are injected in to the atmosphere at a height of 500 m. Mulcahy et al. (2020) showed that the introduction of a vertical profile for SO 2 emissions in UKESM1 had negligible impact on surface 585 SO 2− 4 concentrations at measurement sites in Europe and the USA, suggesting an important role for the aerosol chemistry in these regions. Emitting the SO 2 at higher altitudes will act to reduce surface SO 2 concentrations and therefore model's bias against surface observations of both SO 2 concentration and SO 2 dry deposition flux. However, Pope and Chipperfield (2021) showed that model bias in TCSO 2 increased when emission injection heights were increased. This finding further suggests that the CMIP6 emissions are too high and that using a vertical profile for the emissions to some extent shifts the model's bias in 590 SO 2 to higher altitudes. We suggest that model experiments with different emissions inventories and injection height profiles should be able to cast some light on the role of SO 2 emissions in model bias in the sulphur cycle.
The two main removal pathways for SO 2 are oxidation to sulphate and dry deposition to the Earth's surface. In this study we have evaluated a more realistic treatment of SO 2 dry deposition in UKESM1 that accounts for the high solubility of SO 2 595 in water, finding that it reduces the models' atmospheric loading of SO 2 . This reduces positive model bias in surface SO 2 concentrations in the USA and Europe, and in TCSO 2 across most of the globe. The changes have the largest impact over source regions because dry deposition flux is directly proportional to the atmospheric concentration of SO 2 (Equation 1). However, while we see a direct reduction in UKESM1's bias in atmospheric SO 2 loading, the true impact of the changes to the dry deposition parameterization are confounded by model uncertainty in other aspects of the complex sulphur cycle as well as the inherent difficulties associated with evaluating a global model against point observations. Dry deposition is a highly parameterized process and often poorly represented, particularly in global models. Similarly to UKESM1, Vet et al. (2014) showed that SO 2 dry deposition fluxes were over-predicted by the 23 member model ensemble used for the TF-HTAP exercise relative to inferential data sets from measurement networks (including CASTNet). A key un-605 certainty highlighted in this study was that associated with the inferred dry deposition fluxes; from CASTNet these could be up to 30% lower than direct observations of SO 2 dry deposition flux (Baumgardner et al., 2002). However, the fluxes simulated by UKESM1 are a factor of 2 to 10 higher than the inferred data from USA-E and USA-W, indicating that the modelled deposition fluxes are almost certainly too large. Additional sources of uncertainty in model simulations of SO 2 dry deposition may include land surface cover, changes in the atmospheric SO 2 :NH 3 ratio and the ratio between wet and dry deposition of 610 SO 2 . Mulcahy et al. (2020) showed that wet deposition, and to a lesser extent dry deposition, of SO 2 was considerably lower in UKESM1 compared with HadGEM-GC3.1. Paulot et al. (2017) also report that poor representation of wet deposition likely contributed to bias in modelled surface SO 2− 4 concentrations. Too low wet deposition would also contribute to the model's over-prediction of surface SO 2 concentration. Dry deposition is sensitive to land surface type, which may not be well captured in global models. In this study the UKESM1 configuration uses 13 land cover classes including 11 plant functional types 615 (Archibald et al., 2020). This is reasonable for a global model, but inevitably detail is lost. Vet et al. (2014) also suggest that SO 2 dry deposition may depend on the atmospheric NH 3 loading. Long term measurements at a UK site showed that SO 2 dry deposition velocity has increased with time, which was attributed to changing ratios of SO 2 :NH 3 as SO 2 concentrations have decreased faster than NH 3 concentrations (Vet et al., 2014;Fowler et al., 2009). Currently nitrate chemistry is not represented in UKESM1, although it is planned for future model versions, and NH 3 has not been evaluated in the model, so it is unknown concentrations (Archibald et al., 2020). Mulcahy et al. (2020) showed that there are global and regional differences in oxidant concentrations and in the SO 2 lifetime between UKESM1 and the HadGEMGC3.1 model, with the latter better able to capture surface SO 2− 4 concentrations. This highlights the requirement for a more detailed investigation of SO 2 oxidation in UKESM1, particularly in polluted regions.
Further, uncertainty in UKESM1's sulphur chemistry appears to be seasonally dependant. In USA-E and Europe UKESM1 over-predicts SO 2 (surface concentration and TCSO 2 ) and under-predicts surface SO 2− 4 at all times of the year, but there is seasonal variation in these biases. In Europe, the model bias is largest in DJF which drives a stronger seasonal cycle relative to the observations. This is in agreement with Turnock et al. (2015)  in SO 2 and SO 2− 4 as the higher average latitude of the USA-E sites mean that the O 3 oxidation pathway is less important in this region and Paulot et al. (2017) showed that SO 2− 4 concentrations are sensitive to both pathways in winter. From this study it is not clear what is driving the relatively large positive bias in SO 2 in JJA over USA-E and we stress the need for closer examination of the sulphur chemistry in UKESM1.

650
Recent studies have also shown that cloud water pH may be an important factor in the aqueous phase oxidation of SO 2 to American and East Asian locations reported values between 3.34 and 5.29 (Guo et al., 2012). Turnock et al. (2019) showed that varying this value in the HadGEM3-UKCA model can have a large impact on SO 2 and SO 2− 4 concentrations. Over source regions, including Europe and North America, increasing the cloud water pH by 1.0 reduced the annual mean global SO 2 column burden by approximately 50% as more SO 2 was oxidized in cloud droplets, and consequently there were small increases in the annual mean sulphate column burden over these regions. Conversely, outside of polluted regions increasing the cloud 660 water pH reduced the sulphate column burden by 10% to 40% globally. These results indicate that having a more realistic treatment of cloud water pH could reduce UKESM1's biases in TCSO 2 , and potentially in SO 2 and SO 2− 4 concentrations remote from source regions. However, is unlikely to be a dominant removal pathway at the surface and any impact on surface SO 2 concentrations, especially close to sources would be expected to be minimal.

680
Model resolution is also likely to be an important source of model bias in this study. In evaluating UKESM1 against the CASTNet and EMEP data sets we are comparing a simulated value generated from a model grid box approximately 200-300 km with a point observation. This may be a particular problem for the surface SO 2 concentrations and SO 2 dry deposition fluxes because in reality a large fraction (20-40%) of SO 2 emitted from point sources is lost in the first 100 km, which is sub-grid scale relative to the model grid boxes (Smith and Jeffrey, 1975;Wys et al., 1978). In UKESM1 all the SO 2 is evenly 685 emitted across the grid box and then the deposition is calculated. In reality in large fraction of the emitted SO 2 never makes it to the grid scale, driving overall large model biases compared with the ground based observations which are not necessarily close to the point sources. In addition, the model resolution can not capture complex orography, meaning that transport of SO 2 may not be well represented. This could be a particular problem in mountainous areas of USA-W VanCuren and Gustin (2015).
High resolution model studies would beneficial to address the both importance of orography and to investigate the SO 2 losses 690 close to sources. We also suggest that evaluating UKESM1 against chemical re-analysis fields of SO 2 could reduce some of the bias that occurs with using using point observations, but there are uncertainties associated with this approach too (Ukhov et al., 2020).

695
We evaluate UKESM1 against ground-based and satellite observations of selected sulphur species over the recent historical period. We find that UKESM1 is able to capture the temporal and spatial patterns in surface SO 2 , surface sulphate concentration, SO 2 dry deposition and TCSO 2 . However, compared to observations we find that the model is biased, depending on the variable, region and species. We address one possible source of bias by introducing a more realistic treatment of SO 2 dry deposition, a key loss process for this species. This change reduces model bias in surface SO 2 concentrations and TCSO 2 .
However, it is apparent that other biases exits within the complex sulphur cycle and we highlight some key areas for further investigation to better understand these and target areas for development.
Our evaluation suggests uncertainty in UKESM1's sulphur chemistry is also an important driver of the biases seen in this study, in particular over polluted regions. Two priorities for further investigation into the oxidation of SO 2 in UKESM1 are (i) 705 an evaluation of the CRI-Strat scheme and (ii) a more realistic treatment of cloud water pH. The model's necessarily limited DMS chemistry may also contribute to bias in atmospheric sulphur loading over remote ocean areas. While testing the available, more detailed representations of DMS chemistry in the fully coupled UKESM1 model may not be feasible, their impact at the global scale could be assessed in UKCA. The impact of the nitrate scheme currently in development for UKESM1 will also be investigated in relation to the sulphur cycle. Another aspects of UKESM1's sulphur cycle that would benefit from more 710 detailed analysis is the ratio between wet and dry deposition and how this compares with observations. We also suggest that high resolution studies to investigate SO 2 deposition close to sources would be beneficial for a better understanding of these processes. Finally, we suggest that the SO 2 emissions may be too high through a possible combination of too high emissions in the CMIP6 inventory and injection of the emissions into the surface layer only.

715
The sulphur cycle is a key area of analysis and development for UKESM1 given its importance as a driver of historical aerosol forcing. UKESM1 is relatively unique amongst models in CMIP6 in that it has a fully interactive atmospheric chemistry scheme coupled to a two-moment (mass and number) aerosol scheme. Given the complexity of the model's chemistry-aerosol treatment within the ES framework, the model's performance here is encouraging and provides confidence in UKESM1, particularly in regard to capturing the historical trend. However there is always space for improvement and to the more realistic treatment of 720 SO 2 dry deposition will therefore be incorporated into the planned release of UKESM1.1. This latest model version will be documented in an forthcoming publication which will also address the impact of the SO 2 dry deposition changes on aerosol loading and climate.
Code availability. The UM is the source code for the atmosphere-land-ocean-sea ice components of the UKESM1 physical model, including the NEMO and CICE modules for oceans and sea ice, respectively. The UM source code also houses the GLOMAP-Mode and UKCA modules. JULES is the source code for land and terrestrial biogeochemistry components. MEDUSA is the source code for the ocean biogeochemistry. Due to intellectual property rights restrictions, we cannot provide either the source code or documentation papers for the UM or JULES. Obtaining the UM. The Met Office Unified Model is available for use under licence. A number of research organisations and national meteorological services use the UM in collaboration with the Met Office to undertake basic atmospheric process research, produce forecasts, develop the UM code and build and evaluate Earth system models. For further information on how to apply for a licence, see http://www.metoffice.gov.uk/research/modelling-systems/unified-model (last access: 17 March 2021). Obtaining JULES. JULES is available under licence, free of charge. For further information on how to gain permission to use JULES for research purposes, see http://juleslsm.github.io/access_req/JULE_access.html (last access: 17 March 2021). Information about the UKESM1 release and its components and the prerequisites for using it can be found here: http://cms.ncas.ac.uk/wiki/UM/Configurations/UKESM. Briefly, UKESM1 is distributed and run as a Rose suite on the Archer2 and Monsoon computing platforms administered by UK Research Innovation and the Met Office/Natural Environment Research Council, respectively. Rose is a framework for developing and running meteorological applications and is described in more detail here: http://cms.ncas.ac.uk/wiki/RoseCylc. Data availability. The simulation data used in this study are archived on the Earth System Grid Federation (ESGF) node (https://esgfnode.llnl.gov/projects/cmip6/, last access: 17 March 2021). The model source ID is UKESM1-0-LL for UKESM1. UKESM1 historical simulations are identified by the following variant labels: r1i1p1f2, r2i1p1f2, r8i1p1f2 and r9i1p1f2, (https://doi.org/10.22033/ESGF/CMIP6.6113; Tang (2019)). We acknowledge the use of the CASTNet data base (https://www.epa.gov/castnet, last access: 17 March 2021). Information on the EMEP network can be found in Tørseth et al. (2012). OMI total column SO2 data was obtained from NASA's Goddard Earth Sciences Data and Information Services Center (GES DISC, https://disc.gsfc.nasa.gov/, last access: 17 March 2021).
Author contributions. CGJ, JPM, STT and CH contributed to the conceptualization of this study CGJ and STR developed and tested the code changes for the revised dry deposition parameterization. STR ran the UKESM1 and UKESM1-SO2 historical simulations. CH led the analysis and did the data visualization. RP with support from CL evaluated UKESM1 against the OMI SO2 product. CH led the preparation of the manuscript, with important text contributions from JPM, CGJ and RP. All co-authors contributed invaluable comments in reviewing and editing the manuscript.

755
The authors declare that they have no conflict of interest.  Acknowledgements. For making their measurement data available to be used in this study, we would like to acknowledge the EMEP and CASTNet measurement networks along with any data managers involved in data collection. The EBAS database has largely been funded 760 by the UN-ECE CLRTAP (EMEP), AMAP and through NILU internal resources. Specific developments have been possible due to projects like EUSAAR (EU-FP5)(EBAS web interface), EBAS-Online (Norwegian Research Council INFRA) (upgrading of database platform) and HTAP (European Commission DG-ENV)(import and export routines to build a secondary repository in support of www.htap.org). The US Environmental Protection Agency is the primary funding source for CASTNet, with contracting and research support from the National Park Service. This study used JASMIN, the UK's collaborative data analysis environment (http://jasmin.ac.uk) (Lawrence et al., 2013). CH made 765 extensive use of the Iris and Cartopy libraries for the analysis and data visualisation in this study (Met Office, 2010.