The value of remote marine aerosol measurements for constraining radiative forcing uncertainty

Aerosol measurements over the Southern Ocean are used to constrain aerosol–cloud interaction radiative forcing (RFaci) uncertainty in a global climate model. Forcing uncertainty is quantified using 1 million climate model variants that sample the uncertainty in nearly 30 model parameters. Measurements of cloud condensation nuclei and other aerosol properties from an Antarctic circumnavigation expedition strongly constrain natural aerosol emissions: default sea spray emissions need to be increased by around a factor of 3 to be consistent with measurements. Forcing uncertainty is reduced by around 7 % using this set of several hundred measurements, which is comparable to the 8 % reduction achieved using a diverse and extensive set of over 9000 predominantly Northern Hemisphere measurements. When Southern Ocean and Northern Hemisphere measurements are combined, uncertainty in RFaci is reduced by 21 %, and the strongest 20 % of forcing values are ruled out as implausible. In this combined constraint, observationally plausible RFaci is around 0.17 W m−2 weaker (less negative) with 95 % credible values ranging from −2.51 to −1.17 W m−2 (−2.18 to −1.46 W m−2 when using 1 standard deviation to quantify the uncertainty). The Southern Ocean and Northern Hemisphere measurement datasets are complementary because they constrain different processes. These results highlight the value of remote marine aerosol measurements.


Introduction
The uncertainty in the magnitude of the effective radiative forcing caused by aerosol-cloud interactions (ERFaci) due to changing emissions over the industrial period is around twice that for CO2. It is essential to reduce this uncertainty if global climate models are to be used to robustly predict near-term changes in climate (Andreae et al., 2005; Myhre et al., 2013; Collins et al., 2013; Tett et al., 2013; Seinfeld et al., 2016).
Aerosol forcing uncertainty has persisted in climate models since the 1990s partly because there are no measurements covering the industrial period that can be used to directly constrain simulations of long-term changes in aerosol and cloud properties (Gryspeerdt et al., 2017; McCoy et al., 2017). Estimates of aerosol forcing over the industrial period therefore rely on models that have been evaluated against measurements made in the present-day atmosphere. However, it is known that the aerosol forcing (in particular the component caused by aerosol-cloud interactions) depends sensitively on the state of aerosols in the pre-industrial period (Carslaw et al., 2013; Wilcox et al., 2015) when natural aerosols were dominant (Carslaw et al., 2017). Observations of natural aerosols in the present-day atmosphere are therefore expected to help constrain the simulated forcing unless there have been significant changes in natural aerosol processes over the industrial period, for which there is little evidence (Carslaw et al., 2010).
In this paper we address the following questions: (i) to what extent can measurements of aerosols in pristine (natural) environments help to constrain model simulations and thereby reduce the large uncertainty in aerosol forcing? (ii) What is the relative importance of measurements in remote and polluted environments for constraining the forcing uncertainty? It is known that the abundance of natural aerosols affects the magnitude of forcing in a model (Carslaw et al., 2013). However, to assess the effect on the uncertainty in forcing it is necessary to explore how the spread of predictions of a set of models changes when constrained by measurements. The Coupled Model Intercomparison Project Phase 5 is inadequate for this purpose because of insufficient aerosol diagnostics (Wilcox et al., 2015). Here we use large perturbed parameter ensembles (PPEs) of the UK Hadley Centre General Environment Model HadGEM3 (Hewitt et al., 2011). The PPEs were created by systematically perturbing numerous model parameters related to natural and anthropogenic emissions and physical processes (Yoshioka et al., 2019). The simulated aerosol forcings have uncertainty ranges that exceed those of multi-model ensembles (Yoshioka et al., 2019; Johnson et al., 2020). Instantaneous radiative forcing (RF) is quantified using the 26-parameter AER PPE in which just aerosol-related parameters were varied, and the effective radiative forcing (ERF) is quantified using the 27-parameter AER-ATM PPE in which aerosol and physical atmosphere parameters were varied (Yoshioka et al., 2019). We use these PPEs to quantify how the constraint provided by pristine aerosol measurements affects the spread of aerosol forcings simulated by the ensembles.
Previous analysis of HadGEM3 PPEs showed that aerosol measurements in polluted regions help to constrain the uncertainty in aerosol-radiation interaction forcing (RFari) but not the component due to aerosol-cloud interactions (RFaci). A dataset of over 9000 (predominantly Northern Hemisphere) aerosol measurements reduced the uncertainty in global, annual mean aerosol RFari by 35 %, but RFaci uncertainty was reduced by only 7 %. These measurements reduce the uncertainty in a small number of parameters related to anthropogenic emissions and aerosol processing in polluted environments. However, important causes of uncertainty in RFaci, such as natural aerosol emission fluxes, were largely unconstrained.
The Southern Ocean is one of the few regions on Earth (along with some boreal forests) in which the same processes are expected to affect cloud-active aerosol concentrations in the present-day and early-industrial atmospheres (Hamilton et al., 2014). In this study we make use of aerosol measurements from the Antarctic Circumnavigation Expedition: Study of Preindustrial-like Aerosols and Their Climate Effects (ACE-SPACE) campaign (Schmale et al., 2019). They offer a unique opportunity to constrain the early-industrial aspects of aerosol forcing uncertainty because the Southern Ocean is a source of natural aerosols that are relevant at the global scale and remains largely unaffected by anthropogenic aerosol and precursor emissions.
We use near-surface measurements of cloud condensation nuclei concentrations at 0.2 % and 1.0 % supersaturations (CCN0.2 and CCN1.0), as well as mass concentrations of non-sea-salt sulfate particles with dry aerodynamic diameters less than 10 µm and number concentrations of particles with dry aerodynamic diameter larger than 700 nm (N700; corresponds to volume equivalent diameter larger than around 500 to 570 nm; Schmale et al., 2019a). The measurements are compared to output from 1 million variants of the HadGEM3 model that sample combinations of parameter settings in the model. These model variants are used to represent aerosol forcing uncertainty in our model using probability density functions (pdf's) and were generated by sampling from Gaussian process emulators that were trained on the PPE model outputs (see Sect. S1 in the Supplement). Model variants that were judged to be observationally implausible against the measurements were rejected, resulting in a set of plausible variants from which the uncertainty in aerosol forcing could be computed (see Sect. S1). In the results shown below, we retained approximately 3 % of model variants (following Johnson et al., 2020) that best match all four measured aerosol properties.
Figure 1 shows the CCN0.2 mean and standard deviation from the unconstrained and constrained model variants to exemplify the effect of constraint on model output. The mean concentrations in the unconstrained sample are much smaller than measured concentrations. However, the range of CCN0.2 values in the unconstrained sample spans the measurements in most locations (Fig. 1b). The measurement constraint increases CCN0.2 concentrations (more than double the unconstrained mean in many locations; Fig. 1c) and greatly reduces the CCN0.2 uncertainty (by more than half everywhere to less than 50 cm−3; Fig. 1d).
Figure 2 shows pdf's of the output from the model for the four variables used as constraints, calculated as means over the locations where measurements were taken. The constraint reduces the uncertainty in all measurement types (narrower pdf's), and the central tendency of the pdf's is closer to the regional mean of measurements after constraint. Rejecting around 97 % of model variants as implausible compared to measurements greatly improves the model-measurement comparison.
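The rejection of implausible variants described above can be illustrated with a short sketch. It is not the procedure of Sect. S1: the toy ensemble, the `tolerance` values, the choice of a worst-case normalised distance as the misfit, and all names are assumptions made for illustration. Only the idea of retaining roughly the 3 % of variants that best match every measured property is taken from the text.

```python
import random

def retain_best(variants, observed, tolerance, keep_fraction=0.03):
    """Return the variants whose outputs best match the observations.

    variants  : list of dicts mapping variable name -> simulated value
    observed  : dict mapping variable name -> measured value
    tolerance : dict mapping variable name -> assumed uncertainty (same units)
    """
    def misfit(variant):
        # Largest normalised model-measurement distance over all variables,
        # so a retained variant has to match every measured property.
        return max(abs(variant[k] - observed[k]) / tolerance[k] for k in observed)

    ranked = sorted(variants, key=misfit)
    n_keep = max(1, round(keep_fraction * len(ranked)))
    return ranked[:n_keep]

# Toy ensemble: hypothetical values, not ACE-SPACE data or emulator output.
rng = random.Random(0)
variants = [{"ccn02": rng.uniform(20, 300), "n700": rng.uniform(1, 30)}
            for _ in range(10_000)]
observed = {"ccn02": 150.0, "n700": 10.0}
tolerance = {"ccn02": 30.0, "n700": 3.0}

plausible = retain_best(variants, observed, tolerance)
print(len(plausible), "of", len(variants), "variants retained")
```

In this toy case the retained sample spans a much narrower range of each variable than the full ensemble, which is the narrowing of the pdf's that the constraint is meant to produce.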

Results
After constraint, the remaining model variants inhabit specific parts of the 26-dimensional parameter uncertainty space used to quantify the model uncertainty. We explore the effect of constraints on parameter values using one-dimensional marginal probability distributions (Fig. 3). The magnitude of the marginal probability distribution after constraint reflects the number of ways in which a particular value of a parameter can be combined with settings of all the other parameters to produce an observationally plausible model. The white space in the marginal pdf's shows where parameter value density has decreased. The relative simplicity of aerosol emissions and processes over the Southern Ocean (compared to polluted continental regions) means that measurements can be used to tightly constrain uncertainty in the associated parameters. Two parameters (sea spray emissions and dry deposition velocity) are tightly constrained such that some parameter values are ruled out as implausible even when combined with uncertainties in all other parameters. Several other parameters (related to cloud droplet pH, dimethylsulfide (DMS) emissions and wet deposition) are more modestly constrained. These joint constraints (see also Fig. S3) suggest the model-measurement comparison is improved when aerosol number concentrations and mass are relatively high.
Sea spray emissions are tightly constrained to be around 3 times larger than the default model value. Observationally plausible values of the sea spray scaling parameter range from around 1.6 to 5.1 and all other values (including the default emission calculated in the model) are ruled out as implausible. This suggests that sea spray emissions in our model need to be significantly higher than those calculated using the wind-speed-dependent Gong (2003) parameterisation in the Southern Hemisphere summer. The higher flux is consistent with Revell et al. (2019), who showed that a more recent version of our model simulates cloud droplet concentrations and aerosol optical depth values that are lower than observed over the summertime Southern Ocean. However, sea spray emissions had to be reduced to achieve agreement with aerosol optical depth measurements in winter. Hence, our constraint on sea spray emission fluxes may only be appropriate for Southern Hemisphere summer when wind speeds are relatively low. We do not make any assumptions about the composition of these additional summertime sea spray particles. They may be rich in organic material as proposed by Gantt et al. (2011), which would alter the CCN activity of emitted particles. However, the consistency of constraint of CCN0.2 and N700 towards higher values (Fig. 2, Table S3) implies that a general scaling of the existing sea spray flux is consistent with the measurements from December to April, without the need for an additional source of fine-mode, organic-rich particles.
The dry deposition velocity of accumulation mode aerosols (Dry_Dep_Acc) has an 84 % likelihood of being lower than the default model value after applying the constraint. Furthermore, deposition velocities larger than around 3 times the default value are effectively ruled out. This constraint is consistent with the higher aerosol concentrations implied by constraint of the sea spray emission parameter.
Other parameters are more modestly constrained. The constraint on the aerosol precursor DMS emission flux scale factor is two-sided, reducing the credible range of DMS emission scalings from 0.5 to 2.0 down to 0.54 to 1.9. This constraint suggests the default surface sea water concentration (Kettle and Andreae, 2000) and emission parameterisations (Nightingale et al., 2000) are consistent with measurements (including aerosol sulfate) and do not benefit from being scaled. Furthermore, ACE-SPACE measurements are consistent with less-efficient aerosol scavenging (55 % likelihood of Rain_Frac, the parameter that controls the fractional area of the cloudy part of model grid boxes where rain occurs, being below the unconstrained median value 0.5) and less aqueous phase sulfate production (pH of cloud droplets has a 62 % likelihood of being lower than the unconstrained median value). These combined constraints suggest, in agreement with sea spray and deposition parameter constraints, higher aerosol number and mass concentrations are consistent with measurements.
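Likelihood statements such as "an 84 % likelihood of being lower than the default model value" can be read as the fraction of retained variants whose parameter value lies below a reference value. The sketch below illustrates that reading under assumed inputs: the function name and the toy sample are hypothetical and do not reproduce the values reported here.

```python
def fraction_below(values, reference):
    """Fraction of sampled parameter values lying below a reference value."""
    values = list(values)
    if not values:
        raise ValueError("need at least one retained variant")
    return sum(v < reference for v in values) / len(values)

# Hypothetical retained values of a scaled parameter whose default is 1.0.
retained = [0.4, 0.6, 0.7, 0.9, 1.2, 0.8, 0.5, 1.5, 0.3, 0.95]
share = fraction_below(retained, reference=1.0)
print(f"{100 * share:.0f} % of retained variants lie below the default")
```

The same function applies when the reference is the unconstrained median rather than the default, as for Rain_Frac and cloud droplet pH.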
The effects of measurement constraint on pdf's of RFaci and ERFaci are shown in Fig. 4. Removing implausible model variants has reduced the uncertainty in several parameters including natural aerosol emission fluxes, which translates into a reduction in RFaci uncertainty (Carslaw et al., 2013). The measurement constraints have two important effects on aerosol forcing. Firstly, the magnitude of median RFaci weakens from −1.99 to −1.88 W m−2 (−1.64 to −1.49 W m−2 for ERFaci). A weaker forcing is consistent with higher natural aerosol emissions, increased aerosol load, and higher cloud droplet number concentrations (see Table 1) in the early-industrial period. Secondly, the constrained forcing pdf's are approximately symmetric but have shorter tails (lower kurtosis). This suggests the constraints are selectively ruling out model variants that are outliers. The 95 % credible range of RFaci values is reduced by around 9 % (from −2.84 to −1.15 W m−2 down to −2.64 to −1.10 W m−2) and around 9 % for ERFaci (from −2.69 to −0.62 W m−2 down to −2.43 to −0.54 W m−2). The consistency of forcing constraint across two distinct PPEs suggests the results are insensitive to differences in meteorology, parameters perturbed in the PPEs, and the inclusion of rapid atmospheric adjustments. These results are also insensitive to additional constraint to ensure energy balance at the top of the atmosphere (Fig. S5).
Johnson et al. (2020) reduced the global, annual mean RFaci uncertainty by constraining multiple anthropogenic emission and model process parameters (as well as some natural aerosol parameters) using over 9000 predominantly Northern Hemisphere measurements of aerosol optical depth, PM2.5, particle number concentrations, and mass concentrations of organic carbon and sulfate. When we combine our Southern Ocean constraint with the Johnson et al. (2020) constraint, we retain around 700 observationally plausible model variants (0.07 %).
Although this is a small percentage of the original sample, 700 observationally plausible model variants is far more than are typically used to quantify model uncertainty or multi-model diversity (e.g. around 30 for CMIP6). The marginal parameter pdf's from this 700-member sample are shown in Fig. 5 (see also Table S3). The two measurement datasets constrain distinct groups of parameters. There are a few cases where the same parameters are constrained by both datasets, and in these cases the parameter values are constrained consistently (e.g. cloud droplet pH) or more strongly through ACE-SPACE (e.g. sea spray emissions). The complementary nature of these constraints means that the combined constraint marginal parameter pdf's (Fig. 5) are remarkably similar to those in our Fig. 3e (for sea spray and DMS emission fluxes, as well as deposition and pH parameters) and in Fig. 6 of Johnson et al. (2020) for other parameters.
The Johnson et al. (2020) constraint reduced the RFaci uncertainty by around 6 % and our ACE-SPACE measurement constraint reduced the uncertainty by around 9 %. However, the RFaci uncertainty is reduced by around 21 % (Fig. 6a) after applying both constraints, meaning the combined constraint is stronger than the sum of individual constraints.
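One way to picture a combined constraint is as the set of variants that survive every individual constraint. The sketch below assumes each constraint yields a set of retained variant identifiers. The identifiers and the function name are hypothetical, and the procedure in Sect. S1 may apply the measurements jointly rather than by intersecting sets.

```python
def combine_constraints(*plausible_sets):
    """Variants that remain plausible under every supplied constraint."""
    if not plausible_sets:
        raise ValueError("need at least one set of plausible variants")
    return set.intersection(*(set(s) for s in plausible_sets))

# Hypothetical variant identifiers retained by each measurement dataset.
southern_ocean = {1, 2, 3, 5, 8, 13}
northern_hemisphere = {2, 3, 4, 8, 16}
both = combine_constraints(southern_ocean, northern_hemisphere)
print(sorted(both))

# Retained share quoted in the text: around 700 of 1 million variants.
print(f"{100 * 700 / 1_000_000:.2f} % of variants retained")
```

Because the two datasets constrain largely different parameters, few variants pass both tests, which is why the retained share falls well below that of either constraint alone.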
The Johnson et al. (2020) constraint strengthened the RFaci by around 0.3 W m−2 (more negative) because the largest sea spray emission flux scaling and largest new particle formation rates were ruled out. Our ACE-SPACE constraint rules out the same large sea spray emission fluxes but also rules out all emission flux scale factors lower than around 1.6 (Fig. 3), which increases the baseline aerosol concentration in the early-industrial atmosphere. The ACE-SPACE measurements also constrain several other parameters that collectively weaken the median RFaci by around 0.18 W m−2. Therefore, using the combined measurement dataset, the strongest RFaci values have been ruled out as implausible and the credible range of observationally plausible RFaci values is reduced to around −2.51 to −1.17 W m−2 (−2.18 to −1.46 W m−2, when using 1 standard deviation to quantify the uncertainty). Uncertainty in RFari is reduced by around 48 % with observationally plausible values ranging from −0.27 to −0.09 W m−2 (−0.23 to −0.13 W m−2, when using 1 standard deviation), because the strongest RFari values are ruled out as observationally implausible.
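The percentage reductions quoted for the 95 % credible ranges can be checked with simple arithmetic. The sketch below assumes that "reduction in uncertainty" means the relative change in the width of the 95 % credible interval. That definition and the function names are assumptions, while the interval bounds are the values quoted in the text.

```python
def width(lower, upper):
    """Width of a credible interval given its two bounds (W m-2)."""
    return upper - lower

def percent_reduction(before, after):
    """Percentage by which the interval width shrinks after constraint."""
    return 100.0 * (width(*before) - width(*after)) / width(*before)

# 95 % credible ranges quoted in the text (W m-2).
rfaci_unconstrained = (-2.84, -1.15)
rfaci_ace_space = (-2.64, -1.10)
rfaci_combined = (-2.51, -1.17)
erfaci_unconstrained = (-2.69, -0.62)
erfaci_ace_space = (-2.43, -0.54)

print(round(percent_reduction(rfaci_unconstrained, rfaci_ace_space)))    # 9
print(round(percent_reduction(erfaci_unconstrained, erfaci_ace_space)))  # 9
print(round(percent_reduction(rfaci_unconstrained, rfaci_combined)))     # 21
```

Under this assumed definition the quoted ranges give reductions of about 9 % for both RFaci and ERFaci under the ACE-SPACE constraint, and about 21 % for RFaci under the combined constraint.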

Discussion
Our results show, as hypothesised from previous sensitivity analyses, that remote marine measurements are valuable for constraining the natural aerosol state of the atmosphere (Carslaw et al., 2013; Regayre et al., 2014, 2018). They also provide new information about plausible model behaviour because they are closely related to model emissions and processes that measurements in polluted environments do not constrain. For the first time we have achieved a meaningful reduction of 21 % in the RFaci uncertainty by constraining the aerosol properties in the model. The reduction in forcing uncertainty could be improved further by using measurements of cloud properties and cloud-aerosol relations, particularly if these span polluted and pristine conditions (McCoy et al., 2020). An important factor that continues to limit observational constraint is the existence of many compensating parameter effects, even within the considerably reduced volume of multidimensional parameter space (Fig. S3). These limit the constraint on individual parameter ranges (Lee et al., 2016; Regayre et al., 2018), but some of the effects could be greatly reduced by perturbing uncertain emissions regionally rather than globally as we do here.
Our results are based on uncertainty in a single climate model. The model is structurally consistent in our experiments and so neglects uncertainty caused by choice of microphysical and atmospheric process representations. Our model also neglects some potentially important sources of remote marine aerosol, such as primary marine organic aerosol and methane-sulfonic acid (Schmale et al., 2019; Hodshire et al., 2019; Revell et al., 2019). Model inter-comparison projects (such as CMIP6) can be used to quantify the diversity of RF (or ERF) output from models, but they lack information about single model uncertainty.
Ideally, multi-model ensembles would contain a perturbed parameter component, so that model diversity and single model uncertainty could be quantified simultaneously.
Studies like ours fill an important gap by quantifying the remaining uncertainty in aerosol forcing across a set of single-model variants that are plausible when compared with multiple measurement types. This knowledge can be used to form a more complete understanding of the importance of historical and near-term aerosol radiative forcing, which would contribute to reducing the diversity in equilibrium climate sensitivity across models.
Data availability. Simulation output data for both AER and AER-ATM PPEs are available on the JASMIN data infrastructure (http://www.jasmin.ac.uk, last access: August 2020) (Yoshioka et al., 2019). Some of the climate-relevant fields are derived and stored in netCDF files (.nc) containing data for all ensemble members and made available as a community research tool as described in Yoshioka et al. (2019). All data needed to recreate figures in this article are available at https://doi.org/10.5281/zenodo.3988476 (Regayre, 2020). Additional model data and analysis code can be made available from the corresponding author upon request.
Author contributions. LAR applied the statistical methodology and generated results. LAR and MY created the PPEs. LAR and JSJ designed the experiments and elicited probability density functions of all aerosol parameters. KSC and MY participated in the formal elicitation process. JS, AB, MGB, CT, SH, and FS collected and processed the ACE-SPACE measurements. DPG processed the cloud droplet number concentration data. LAR, KSC, JS, and JSJ analysed the results. LAR and KSC wrote the manuscript with contributions from all authors.
Acknowledgements. We thank Andre Welti and Markus Hartmann for CCN measurement support provided during the ACE-SPACE campaign. Julia Schmale holds the Ingvar Kamprad Chair sponsored by Ferring Pharmaceuticals. Ken S. Carslaw was a Royal Society Wolfson Merit Award holder during this research. The authors would like to thank the editor Johannes Quaas and the anonymous reviewers for their constructive advice.
Financial support. This research has been supported by the Natural Environment Research Council AEROS, ACID-PRUF, GASSP and A-CURE projects (grant nos. NE/J024252/1, NE/I020059/1, and NE/P013406 and a doctoral training grant), a UK Hadley Centre Met Office CASE studentship, the UK-China Research & Innovation Partnership Fund through the Met Office Climate Science for Service Partnership (CSSP) China as part of the Newton Fund (grant nos. DN321511 and CHN19/3), the European Union ACTRIS-2 project (grant no. 262254), the National Centre for Atmospheric Science ACSIS programme (grant no. NE/N018001/1), the Swiss Federal Institute of Technology Lausanne (EPFL) (ACE project no. 7), the Swiss Polar Institute (ACE project no. 7), Ferring Pharmaceuticals (ACE project no. 7), European FP7 BACCHUS (grant no. 49603445), the Swiss National Science Foundation Grant (grant no. 200021_169090), and the Deutsche Forschungsgemeinschaft in the framework of the priority programme "Antarctic Research with comparative investigations in Arctic ice areas" SPP 1158 (grant no. STR 453/12-1). This work used the ARCHER UK National Supercomputing Service (http://www.archer.ac.uk, last access: August 2020) (project allocations n02-chem, n02-NEJ024252, n02-FREEPPE and the Leadership Project allocation n02-CCPPE).
Review statement. This paper was edited by Johannes Quaas and reviewed by two anonymous referees.