In-situ constraints on the vertical distribution of global aerosol

Duncan Watson-Parris1, Nick Schutgens2, Carly Reddington3, Kirsty J. Pringle3, Dantong Liu4, James D. Allan5,6, Hugh Coe5, Ken S. Carslaw3, Philip Stier1
1Atmospheric, Oceanic and Planetary Physics, Department of Physics, University of Oxford, Oxford, UK
2Earth Sciences, Faculty of Science, Vrije Universiteit Amsterdam
3School of Earth and Environment, University of Leeds, Leeds, UK
4Department of Atmospheric Sciences, School of Earth Sciences, Zhejiang University, Hangzhou, Zhejiang, China
5Centre for Atmospheric Science, SEAES, University of Manchester, Manchester, UK
6National Centre for Atmospheric Science, University of Manchester, Manchester, UK
Correspondence to: Duncan Watson-Parris (duncan.watson-parris@physics.ox.ac.uk)

Abstract. Despite ongoing efforts, the vertical distribution of aerosols globally is poorly understood. This in turn leads to large uncertainties in the contributions of the direct and indirect aerosol forcing on climate. Using the Global Aerosol Synthesis and Science Project (GASSP) database, the largest synthesised collection of in-situ aircraft measurements currently available, with more than 1000 flights from 37 campaigns from around the world, we investigate the vertical structure of sub-micron aerosols across a wide range of regions and environments. The application of this unique dataset to assess the vertical distributions of the number size distribution and Cloud Condensation Nuclei (CCN) in the global aerosol-climate model ECHAM-HAM reveals that the model underestimates accumulation mode particles in the upper troposphere, especially in remote regions. The processes underlying this discrepancy are explored using different aerosol microphysical schemes and a process sensitivity analysis. These show that the biases are predominantly related to aerosol ageing and removal rather than emissions.

Plain Language Summary
The vertical distribution of aerosol in the atmosphere affects its ability to act as cloud condensation nuclei, and changes the amount of sunlight it absorbs or reflects. Common global measurements of aerosol, including integrated properties such as Aerosol Optical Depth (AOD), provide no information about this vertical distribution. Using a global collection of in-situ aircraft measurements to compare with an aerosol-climate model (ECHAM-HAM), we explore the key processes controlling this distribution and find that wet removal (e.g. by precipitation) plays a key role.

(CCN), which in turn depends on the hygroscopicity and size distribution at the altitude of cloud droplet activation, which is mostly around cloud base at altitudes of 1-3 km. Hence constraining the global aerosol size distribution is a necessary (albeit insufficient) requirement for constraining both the direct and indirect aerosol forcing. In particular, the vertical distribution of aerosol, both natural and anthropogenic, can affect the magnitude of both of these effects (Samset et al. 2013; Marinescu et al. 2017).
Measurements of aerosol microphysical properties with good spatial coverage and reliability are vital for constraining the simulated aerosol properties in general circulation models (GCMs). However, currently available in-situ measurement datasets have limited global representativeness: they do not equally sample all of the relevant aerosol regimes. The Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP; Winker et al. 2009) space-borne lidar provides unique information about the vertical distribution of cloud and aerosol globally and has been used in previous model evaluation studies (Koffi et al. 2012; Koffi et al. 2016), but it is not possible to infer aerosol size information from the retrievals. Design constraints also mean that CALIOP is unable to detect background aerosol in the free troposphere because of the insufficient signal-to-noise ratio (Winker et al. 2013; Watson-Parris et al. 2018). The European Aerosol Research Lidar Network (EARLINET; Pappalardo et al. 2014) and NASA Micro-Pulse Lidar Network (MPLNET; Berkoff, Welton, and Campbell 2004) ground station networks provide continent-scale lidar measurements and have been used in model evaluations (Ganguly et al. 2009; Satheesh, Vinoj, and Moorthy 2006), but these are unable to constrain remote aerosol conditions.
In-situ aircraft measurements provide important direct measurements of aerosol chemical composition, size distributions and radiative properties anywhere in the troposphere. These measurements have been used extensively to investigate the representation of Black Carbon (BC) in GCMs (e.g. Koch et al. 2009; Schwarz et al. 2010; Kipling et al. 2013), particle number (e.g. Spracklen et al. 2007; Yu et al. 2008; Mann et al. 2014; Dunne et al. 2016), organic aerosol (Heald et al. 2011) and also aerosol size distribution (Ekman et al. 2012). However, with some notable exceptions (e.g. Clarke and Kapustin 2002), exploitation of aircraft measurements for global model evaluation has been restricted to a very small fraction of the available datasets, primarily because of the lack of easy access and a common data format. When only a selection of campaigns is used, it is also unlikely that these accurately represent the different global aerosol regimes. The Global Aerosol Synthesis and Science Project (GASSP) dataset (Reddington et al. 2017) brings together measurements from more than 1000 flights across 37 campaigns from around the world in a consistent, synthesised format. Using this combination of aircraft datasets we are able to make more extensive evaluations of global climate models. In this paper we use GASSP to evaluate the sub-micron aerosol and CCN distribution in ECHAM-HAM, an aerosol-climate model which includes explicit treatment of the aerosol size distribution and aerosol-cloud interactions.
The first focus of the paper is to illustrate the usefulness of a global aircraft dataset in evaluating aerosol in a GCM, and some of the caveats and issues in doing so. While a large collection of aircraft measurements can provide extremely valuable information about aerosol microphysical properties, there are difficulties in using such data to evaluate a GCM. For example, aircraft measurements represent a single point in space and time, whereas typical GCM output represents an average over a large (~100 km) region and often days or months (Schutgens et al. 2016b). We show that these sampling errors can be kept small compared to model errors when the measurements are averaged over time and the high-temporal-resolution 4-D model fields are interpolated onto the measurement locations. The Community Intercomparison Suite (CIS) makes these interpolations straightforward even for fields on a hybrid sigma coordinate system (Watson-Parris et al. 2016).
The second focus is on characterising the vertical distribution of aerosol particles globally by combining these measurements with a GCM. We use one-at-a-time sensitivity tests of ECHAM-HAM model simulations and employ both the M7 modal and Sectional Aerosol module for Large Scale Applications (SALSA) bin microphysics (Kokkola et al. 2008) schemes to explore the processes controlling these distributions. We find that ECHAM-HAM represents the aerosol size distribution well in the boundary layer, but that it appears to underestimate accumulation mode particles in the free troposphere, which is also reflected in the CCN distribution. Wet deposition and ageing by sulphate condensational growth are both shown to play a crucial role in these biases.
In Section 2 we describe the GASSP dataset and ECHAM-HAM model, before discussing the evaluation and sampling strategies in Section 3. We present the measurements and results from the evaluation in Section 4 and discuss their implications for constraining the global aerosol particle distribution in Section 5.

The GASSP dataset
The GASSP dataset provides a global collection of in-situ aerosol measurements from a large number of platforms in a single self-describing data format (Reddington et al. 2017). It includes measurements from more than 1000 flights and across 37 campaigns around the world, representing remote, continental and Arctic regions in the largest collection of data of its kind.
All of the campaigns that included measurements of number size distribution or CCN are included in this evaluation, as detailed in Table 1 and shown in Figure 1.

Table 1 footnotes:
[...] (Zhang et al. 1995) with particle sizing set at 0.01-0.25 µm using a TSI Model 3010 with 22°C saturator temperature difference for lowered detection limit and with thermal analysis similar to the OPC
3 Laser optical particle counter (OPC) (Particle Measurement Systems LAS-X, Boulder, Colorado with customized electronics) effectively sizes particles between 0.100 and 14 µm with a resolution of 112 logarithmically spaced channels per decade (Clarke 1991)
4 Desert Research Institute (DRI) instantaneous CCN spectrometer (Hudson 1989). Parallel plate thermal gradient diffusion cloud chamber with streamwise supersaturation gradient (each plate is divided into eight temperature controlled zones).
5 The Georgia Institute of Technology particle-into-liquid sampling (PILS) system
6 NCAR radial differential mobility analyzer system (NCAR rDMA) measured the size and number of particles between 0.007 and 0.150 µm with a resolution of 54 channels per [...] (Russell et al. 1996)
[...] (Roberts and Nenes, 2005; Lance et al., 2006)
9 Two custom DMAs (TDMA 0.01-0.20 µm, LDMA 0.010-0.50 µm mobility diameter)
10 Optical particle counter (0.15-8.0 µm optical diameter)
11 High-resolution time-of-flight aerosol mass spectrometer (HR-ToF-AMS) (Canagaratna et al. 2007; DeCarlo et al. 2006).
15 University of Wyoming CCN instrument consists of a static thermal-gradient chamber and an optical detection system (Snider et al. 2006)

Model description
In this study we use the ECHAM-HAMMOZ model as an example of a modern, well-characterized global aerosol-climate model in order to demonstrate the value of the GASSP dataset. While other models will likely show different behaviours and biases, it is hoped that this initial evaluation will inform an extended analysis across a large number of models.
The recently released ECHAM6.3-HAM2.3 version is used, which includes improved sea-salt and dust emission parameterisations (Tegen et al. 2018). The model is run at T63 resolution with 31 vertical levels and nudged to ERA-Interim reanalysis for 2008 (Dee et al. 2011), using the ACCMIP interpolated emission dataset (Lamarque, Bond, and Eyring 2010). Although the GASSP data span many years, the inter-annual variability in aerosol burden, away from the main biomass burning regions, is small (Li et al. 2013).
In order to explore the uncertainty in the vertical structure of the aerosol size distribution we perform a set of sensitivity simulations in which key parameters are scaled up and down one at a time. There are a number of model parameterisations that influence the aerosol growth, removal and vertical transport, and we apply a simple high/low perturbation over their likely range of uncertainty in order to determine their relative contribution to any model-measurement differences, as outlined in Table 2 and described in detail below. One-at-a-time sensitivity tests neglect the important effects of combinations of parameter perturbations that are captured by Latin hypercube perturbed parameter ensemble studies (e.g. Lee et al. 2013; Regayre et al. 2018). However, they allow an assessment of how individual process parameter uncertainties contribute to the vertical profile of the aerosol size distribution and can be used as a screening test to determine important model processes for further analysis.
Condensational ageing. In the default HAM setup a single monolayer of sulphate is assumed to be required to transfer insoluble particles to the corresponding soluble/mixed particle mode, following Vignati, Wilson, and Stier (2004). There is considerable uncertainty in this simple approximation. We therefore vary the number of monolayers required from 0.3 to 5, matching the range used by Lee et al. (2013).
Wet deposition. HAM2 includes wet deposition removal of aerosol via in-cloud nucleation and impaction scavenging as well as below-cloud impaction scavenging by rain and snow. In-cloud nucleation is the most important of these mechanisms and, because it primarily occurs at the top of the boundary layer, it has a large effect on the vertical distribution of aerosol (e.g. Kipling et al. 2016; Mahmood 2016). Here we scale the total in-cloud and below-cloud mixing-ratio removal tendencies in each grid cell by a constant factor. Because wet deposition is one of the primary aerosol removal mechanisms globally, the aerosol burdens are very sensitive to this scaling; so while there are large uncertainties in the precipitation and scavenging rates, the range of scalings used is smaller than in the other perturbations. Initial scaling values of 10, 5 and even 3 led to implausible global aerosol burdens.

Vertical flux in convection.
Convection is one of the dominant mechanisms for transporting aerosol and trace gases from the boundary layer into the free troposphere globally (Park and Allen 2015). There are large uncertainties in the aerosol entrainment and detrainment rates for convective clouds. Here we scale the total convective tracer mass flux in each grid cell to sample this uncertainty. The large range in scale factors was chosen to reflect the large uncertainties in the fluxes and the relative insensitivity of the aerosol to this parameter.
Coagulation. The inter- and intra-modal components of the standard coagulation kernel within the M7 aerosol scheme can be scaled independently to represent uncertainty in the assumptions used to calculate it, such as using only the median mode diameter in calculating the terms, and uncertainties in the effects of turbulence and electrostatics. In this work, since we are interested only in the broad uncertainties, we scale the whole kernel by the same scale factor.
Aerosol dry deposition. Lee et al. (2013) showed that uncertainties in the dry deposition process provided the largest contribution to the uncertainty in CCN globally in the HadGEM-GLOMAP GCM. Here we scale the dry deposition velocities for the Aitken and accumulation modes across the same range as in their study.
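The one-at-a-time design described above can be sketched as a small run generator. This is only an illustration under stated assumptions: the parameter labels are hypothetical and are not ECHAM-HAM namelist settings, and the low/high values are those quoted in the text (0.3 and 5 monolayers for ageing, a factor of 2 for wet deposition and coagulation, a factor of 10 for convection). Dry deposition is omitted because its range is given only by reference to Lee et al. (2013).

```python
# Hypothetical parameter labels (not actual model namelist names).
DEFAULTS = {
    "ageing_monolayers": 1.0,       # sulphate monolayers required for ageing
    "wet_deposition_scale": 1.0,    # scaling of in- and below-cloud removal
    "convective_flux_scale": 1.0,   # scaling of convective tracer mass flux
    "coagulation_scale": 1.0,       # scaling of the whole coagulation kernel
}

# (low, high) settings as quoted in the text.
PERTURBATIONS = {
    "ageing_monolayers": (0.3, 5.0),
    "wet_deposition_scale": (0.5, 2.0),
    "convective_flux_scale": (0.1, 10.0),
    "coagulation_scale": (0.5, 2.0),
}


def one_at_a_time(defaults, perturbations):
    """Return (run_name, settings) pairs with one parameter changed per run."""
    runs = [("control", dict(defaults))]
    for name, (low, high) in perturbations.items():
        for tag, value in (("low", low), ("high", high)):
            settings = dict(defaults)   # start from the control settings
            settings[name] = value      # perturb exactly one parameter
            runs.append((f"{name}_{tag}", settings))
    return runs


runs = one_at_a_time(DEFAULTS, PERTURBATIONS)
```

Each perturbed run differs from the control in a single parameter, which is what makes the design a screening test rather than a full perturbed parameter ensemble.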

Evaluation strategy
The GASSP aircraft database provides valuable measurements with which to constrain global climate models. However, these near-instantaneous point measurements represent something quite different to typical model output fields, which are often temporal averages of a grid cell which itself represents some (usually undefined) average over a large spatial region, typically ~100 km in extent. The question of how to compare these two datasets consistently is the subject of this section.
Schutgens et al. (2016a) show the importance of collocating measurements with high-temporal-resolution model fields in order to reduce the large temporal sampling artefacts which would otherwise be present. This has also been noted in previous model evaluation work using aircraft data (Ekman et al. 2012). Further work (Schutgens et al. 2016b) showed the importance of averaging these collocated measurements over as long a period as possible in order to remove spatial sampling biases. While higher spatial-resolution models would reduce this particular form of sampling bias, they would face the associated problem of small transport differences leading to large biases (Fast et al. 2016). Some campaigns include sampling biases by design, due to the particular objectives of the mission. For example, the MIRAGE campaign was flown specifically to measure the pollution downwind of Mexico City and hence over-represents the mean aerosol loading in the (wider) region. As noted in Table 1, this campaign was not included in the subsequent analysis. Some recent flight campaigns such as ORACLES (Zuidema et al. 2016) fly routine tracks several times during the campaign specifically to build up representative spatial statistics for comparison with models; however, most historic campaigns have not.
A further complication in the use of the combined measurements from a variety of campaigns, aircraft and even instruments is that the measurements themselves will have different sampling rates and systematic biases, for example due to the different inlets and piping used to bring the sampled air inside the aircraft and to the instrumentation. Because these biases will generally be uncorrelated across campaigns, we assume that the large number of campaigns used will remove any systematic bias in the reported average size distributions.
In order to remove the most high-frequency variability in the measurements (which we would not expect the GCM to reproduce), to bring the measurements onto a common temporal sampling, and to provide at least some temporal aggregation, we down-sample the measurements to 2-minute averages. Typical aircraft in the GASSP database, such as the NOAA P3-B and the FAAM BAE-146, have cruise speeds of 600-800 km/h, so this averaging corresponds to a distance of 10-15 km. Detailed investigations of spatial variability of aircraft aerosol measurements in the ARCTAS campaign (Shinozuka and Redemann 2011) show that this length scale will average out local emission sources while still maintaining the long-range variability which we hope the GCM will reproduce.
Using CIS (www.cistools.net; Watson-Parris et al. 2016) to linearly interpolate the model fields of interest onto these temporally averaged measurements, we should minimise the associated sampling errors. The question remains, however, of what model output frequency is required. The storage requirements for the 3-D model fields we wish to interpolate quickly become prohibitive for daily and sub-daily model output. [...] simulator provides a powerful diagnostic capability, we use model data interpolated from 3-hourly output fields for the results presented in Section 4 as a compromise between the introduced sampling bias and convenience in analysis. As shown in Figure 2, the bias and RMSE introduced are negligible.
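The temporal down-sampling described above can be illustrated with a short sketch. It assumes measurement times in seconds and fixed, non-overlapping 2-minute windows; the function name and the handling of missing values are illustrative choices, not the processing actually applied to the GASSP data.

```python
from collections import defaultdict
from statistics import mean


def downsample(times_s, values, window_s=120.0):
    """Average samples into fixed windows.

    Returns a list of (window_centre_s, mean_value), sorted in time.
    Missing samples (None or NaN) are skipped.
    """
    bins = defaultdict(list)
    for t, v in zip(times_s, values):
        if v is None or v != v:          # v != v is True only for NaN
            continue
        bins[int(t // window_s)].append(v)
    return [((k + 0.5) * window_s, mean(vs)) for k, vs in sorted(bins.items())]


# Five samples falling into two 2-minute windows.
averaged = downsample([0, 60, 119, 120, 200], [1.0, 2.0, 3.0, 10.0, 20.0])
```

Windows with no valid samples are simply absent from the result, so gaps in the flight record are not filled.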
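As an illustration of the collocation step, the sketch below linearly interpolates a model time series (for example 3-hourly output at a fixed location) to a measurement time. It is a minimal stand-alone example and does not use or reproduce the CIS interface; the function name is hypothetical, and the same linear weighting would be applied along each spatial dimension in a full 4-D interpolation.

```python
from bisect import bisect_right


def interp_time(model_times_h, model_values, obs_time_h):
    """Linearly interpolate model output to an observation time (hours).

    model_times_h must be sorted in ascending order.
    Returns None if the observation lies outside the model period.
    """
    if not model_times_h:
        return None
    if obs_time_h < model_times_h[0] or obs_time_h > model_times_h[-1]:
        return None
    i = bisect_right(model_times_h, obs_time_h) - 1
    if i == len(model_times_h) - 1:      # exactly on the final output time
        return model_values[-1]
    t0, t1 = model_times_h[i], model_times_h[i + 1]
    w = (obs_time_h - t0) / (t1 - t0)    # weight of the later output time
    return (1.0 - w) * model_values[i] + w * model_values[i + 1]


# 3-hourly model values sampled at a measurement made at t = 1 h.
value = interp_time([0.0, 3.0, 6.0], [100.0, 130.0, 100.0], 1.0)
```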

Aerosol size distribution
The aerosol size distribution can be characterised in several ways, for example by aerosol number, surface area or volume.
The Aitken and accumulation modes are most important for constraining the indirect effect, so we focus on the aerosol number size distribution (NSD):

$n_N(\ln D_p)\,\mathrm{d}\ln D_p$ = number of (dry) particles per unit volume of (ambient) air in the size range $\ln D_p$ to $\ln D_p + \mathrm{d}\ln D_p$

[...] northern extra-tropics, apart from a clear increase in Aitken mode aerosol at 8 km, presumably due to nucleation. The aerosol distribution in the southern extra-tropics is noticeably smaller than in the tropics or northern extra-tropics, with more aerosol residing in the Aitken mode. The lower number of accumulation mode aerosol in the Southern Hemisphere has been observed before (e.g. Minikin et al. 2003) and is attributed to the lack of anthropogenic aerosol and gaseous precursor sources in the Southern Hemisphere.
For comparison between the modelled (modal) and observed (binned) aerosol distribution it is useful to reduce this distribution to a single number representing the integrated number above some lower threshold:

$N_d = \int_{\ln d}^{\infty} n_N(\ln D_p)\,\mathrm{d}\ln D_p$

While $D_p$ will often be plotted in units of µm, in this paper $N_d$ is always an integrated number above diameter $d$ in nm. We can interpolate the model mode number and radius fields onto the measurements and calculate $N_d$ across a range of sizes for each point. While the integrated number concentrations at smaller size cut-offs will include the number of larger particles, the smallest particles will dominate the number. Figure 4a [...] The model performs well in the lower free troposphere with a near-zero bias around 1-3 km, but it underestimates number concentrations in all aerosol size ranges in the planetary boundary layer (PBL). In the upper free troposphere (FT), above 4 km, the model reproduces the number of particles smaller than around 20 nm very well, but there is a clear low bias of up to 100% in the number of larger (sub-micron) particles.
It should be noted that with the DMA-based instruments there is a potential for uncertainties associated with the assumed particle charging model. As part of their inversions, these instruments must assume a probability that particles of a given size achieve the specified charge states when subjected to the bipolar charge field at the instrument's inlet (Liu and Pui, 1974). The most common method used is the parameterisation of Wiedensohler (1988), which has proven to be robust in most applications (Wiedensohler et al., 2012). More recently, fundamental modelling studies have investigated how much this function depends on conditions such as particle composition, temperature and pressure and have suggested these effects may be important in some situations (López-Yglesias and Flagan, 2013). In principle, if the charging function were to vary with pressure, this could be responsible for systematic artefacts in the vertical profiles of particle concentrations presented here. Leppä et al. (2017) presented a case study at 10 km altitude and suggested that the number concentrations of particles greater than 10 nm diameter would be under-reported by between 5 and 33% depending on the ambient size distribution and other technical details such as the polarity of the instrument. However, at the time of writing, we are not in a position to use this result as the basis for a correction for our data because the altitude dependency case studies of López-Yglesias and Flagan (2013) and Leppä et al. (2017) varied both temperature and pressure simultaneously according to typical ambient conditions. In contrast, the instruments whose data are being used here charged the aerosols at aircraft cabin temperature rather than ambient, so the actual effect that pressure variations may be having on the data is currently uncertain.
Because applying a systematic correction would be both technically challenging and computationally expensive, this is deemed outside the scope of this work; however, should a generalised correction method be developed in the future, this issue should be revisited. Taken at face value, however, this would mean that our measured concentrations of particles larger than 10 nm would at worst be biased low by a few tens of percent at altitude, which would in turn only make the reported model biases more significant. Note also that this issue only affects the counting and not the sizing of particles; the effects of variations in pressure and temperature on DMA sizing are already well established and accounted for (Knutson and Whitby, 1975).
In order to understand the source of these global model biases we can split the data into measurements made over land or ocean, roughly analogous to near to and far from major sources respectively, and so separate the roles of emissions and removal in the model bias. Figure 5 shows the fractional bias in modelled aerosol number over land and ocean. The vertical profile of the bias over land shows that the model consistently underestimates aerosol across all sizes in these regions. This bias is largest near the surface, where biases due to local emission sources not resolved by the coarse model resolution are likely to be dominant. The bias in the smaller particles improves with altitude and is near-zero above 6 km. The larger particles, however, show the same bias as in the global mean above 4 km. The ocean profiles generally show a much better agreement with the measurements throughout the troposphere, although the bias in the number concentration of large particles in the free troposphere remains. In these generally more remote regions the model recreates the overall aerosol number well, but underestimates the number of larger particles aloft, suggesting over-efficient removal of these particles or insufficient growth. Figure 5 also shows a strong low bias in the near-surface aerosol number over the ocean, for all aerosol sizes. This is due to insufficient SO4 in the ocean boundary layer, as shown in Figure 13, presumably due to insufficient DMS emission.
Atmos. Chem. Phys. Discuss., https://doi.org/10.5194/acp-2018-1337 Manuscript under review for journal Atmos. Chem. Phys. Discussion started: 7 February 2019 © Author(s) 2019. CC BY 4.0 License.
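For a modal scheme, the integrated number above a threshold diameter follows analytically from the log-normal form of each mode: the fraction of a mode above diameter d is 0.5 erfc(ln(d / D_g) / (sqrt(2) ln sigma_g)), where D_g is the number median diameter and sigma_g the geometric standard deviation. The sketch below applies this to a list of modes. It assumes that the model radius field is the number median radius of each mode; the mode values in the example are illustrative and are not model output.

```python
from math import erfc, log, sqrt


def integrated_number(modes, d_nm):
    """Number concentration of particles with dry diameter above d_nm.

    modes: iterable of (number, median_radius_nm, sigma_g), one per
    log-normal mode. Modes with no particles are skipped.
    """
    total = 0.0
    for number, median_radius_nm, sigma_g in modes:
        if number <= 0.0 or median_radius_nm <= 0.0:
            continue
        d_g = 2.0 * median_radius_nm     # median diameter from median radius
        x = log(d_nm / d_g) / (sqrt(2.0) * log(sigma_g))
        total += number * 0.5 * erfc(x)  # fraction of this mode above d_nm
    return total


# Illustrative modes only: (number per cm3, median radius in nm, sigma_g).
example_modes = [(800.0, 20.0, 1.59), (200.0, 80.0, 1.59)]
profile = {d: integrated_number(example_modes, d) for d in (10.0, 50.0, 100.0)}
```

Evaluating the same function at several cut-off diameters for each collocated point gives the set of integrated numbers that can be compared directly with the binned measurements.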

Figure 5: The vertical profile of fractional bias in modelled aerosol number at different size cut-offs for Land (a) and Ocean (b) measurements.
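A profile of this kind could be computed from collocated model and measurement pairs as sketched below. The definition of fractional bias used here, (model - observation) / observation, the use of the median within each altitude bin, and the 1 km bin width are all assumptions made for illustration, since the definition is not stated explicitly in this section.

```python
from collections import defaultdict
from statistics import median


def fractional_bias_profile(alt_km, model, obs, bin_km=1.0):
    """Median of (model - obs) / obs in altitude bins.

    Returns {bin_lower_edge_km: median_fractional_bias}, sorted by altitude.
    Pairs with missing or non-positive observations are skipped.
    """
    bins = defaultdict(list)
    for z, m, o in zip(alt_km, model, obs):
        if m is None or o is None or o <= 0.0:
            continue
        edge = int(z // bin_km) * bin_km   # lower edge of the altitude bin
        bins[edge].append((m - o) / o)
    return {edge: median(b) for edge, b in sorted(bins.items())}


# Three collocated points: two near the surface, one at 4.2 km.
bias = fractional_bias_profile(
    [0.5, 0.7, 4.2], [80.0, 120.0, 50.0], [100.0, 100.0, 100.0]
)
```

With this definition a value of -1 (that is, -100%) corresponds to the model simulating no particles in a size range where particles were observed.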
It is instructive to stratify by latitude. Figure 6 shows the fractional bias profiles for flights in the Tropics and Northern and Southern extratropics. The Northern extratropical profiles show similar biases to the over-land biases shown in Figure 5, since these include many of the same flights. Similarly, the tropical profiles are mostly over ocean. However, the most remote dataset in the Southern extratropics (ACE1) has the strongest bias in the number of large particles.

Condensational ageing
In HAM-M7 condensational ageing is the primary mechanism by which aerosol (mass and number) is transferred from the insoluble Aitken and accumulation modes into their soluble equivalents (Schutgens and Stier 2014), where they become available for removal by nucleation scavenging. The current assumption in this model is to require one monolayer of sulphate condensed on a particle to transfer it from the insoluble to soluble modes. Figure 7 shows profiles of fractional bias in each of the latitudinal ranges shown in Figure 6 but with a reduced and increased amount of sulphate required, leading to faster and slower condensational ageing respectively. Changing the condensational ageing rate between the values chosen for this study has a minimal effect in the tropical and southern extratropical regions. In the northern extratropics, however, requiring more sulphate to age the aerosol (slower condensational ageing) reduces the negative bias in the number of larger particles, at the expense of an increase in the negative bias for smaller particles. Nearer the large anthropogenic sulphate sources in the Northern Hemisphere, ageing timescales are much shorter (Schutgens and Stier 2014) and hence more sensitive to these perturbations. It should be noted that ECHAM-HAM does not simulate the effects of nitrate, which would be expected to contribute to the ageing of aerosols.

Wet deposition
Wet deposition is the primary removal mechanism of aerosol, both in general (Textor et al., 2006) and in ECHAM-HAM (Stier et al. 2005), and as such should have a strong effect on its vertical profile. There are several uncertainties associated with the removal rates, from the raindrop-aerosol collision efficiency (Seinfeld and Pandis 2016) to the sub-grid co-variability between precipitation and aerosol (Gryspeerdt et al. 2015). In order to explore the effect of these uncertainties, we scale the in- and below-cloud removal tendencies up and down by a factor of 2. Figure 8 shows profiles of fractional bias for increased and decreased wet deposition removal rates. The importance of wet deposition in controlling the vertical distribution of the aerosol is immediately apparent. While increasing the wet deposition rates leads to much stronger low biases in all cases, reducing the wet deposition leads to a dramatic improvement. In the southern extratropics the size bias is virtually eliminated, although some smaller biases do remain. In the tropics the bias is also reduced, although in the boundary layer the larger aerosol is now overestimated. The large near-surface bias in the tropics remains unchanged, further suggesting that this bias is due to insufficient sources rather than over-efficient removal. In the northern extratropics the bias in larger particles is virtually eliminated in the free troposphere, but at the expense of the smaller particles, which now show a low bias, presumably due to the reduction in nucleation caused by the additional condensational sink. This suggests that wet deposition is generally over-efficient in HAM, but that this is not the only source of the biases shown in Figure 4c, and that a simple global scaling of the removal rate would be unphysical and probably not an effective solution.
While biases in the precipitation rate could explain some of the aerosol biases, impaction scavenging is a relatively inefficient removal mechanism and ECHAM-HAM reproduces global patterns of precipitation reasonably well (e.g. Kipling et al. 2017). Biases in the in-cloud nucleation scavenging, which is a far more efficient mechanism, are therefore the most likely cause.

Vertical flux in convection
In order to determine the importance of convection for the vertical distribution of aerosol we scale the convective tracer entrainment by a factor of 10 up and down. This large perturbation causes a relatively small response in the vertical number size distribution, as shown in Figure 9. The largest effect is in the Tropics, where reducing the tracer entrainment leads to a reduced bias in the free troposphere for both small and larger particles. The reduced entrainment does, however, lead to a positive bias for larger particles in the boundary layer. There are also small improvements in the extra-tropics. Increasing the tracer entrainment leads to increased biases throughout. The reduced entrainment leads to a lower concentration of N10 in the upper troposphere (UT), corroborating previous work which showed that the MIT-CAM model had too large a transport of CN into the upper troposphere (Ekman et al. 2012).

Coagulation
Inter-mode coagulation provides another mechanism by which aerosol can be transferred from insoluble to soluble modes (by coagulating with a particle already in a soluble mode), and intra-mode coagulation provides a key growth pathway for Aitken and accumulation mode aerosol. Both of these will affect the vertical aerosol size distribution, and here we scale both intra- and inter-mode coagulation by a factor of two. Figure 10 shows that increasing the coagulation rates reduces the low model bias throughout the troposphere in the tropics and southern extra-tropics for both larger and smaller particles. The largest improvement is in the tropical free troposphere, where coagulation is a dominant mechanism for transfer of number from the nucleation to Aitken and Aitken to accumulation modes, due to the high number densities here (Schutgens and Stier 2014).
Generally, decreasing the coagulation rates increases the model bias, apart from the boundary layer in the northern extratropics where the decrease leads to a small reduction in the bias.

Dry deposition
Uncertainty in dry deposition has been shown to provide one of the largest contributions to uncertainty in the surface distribution of CCN in HadGEM-GLOMAP (Lee et al. 2013). However, despite scaling the dry deposition rates for both the Aitken and accumulation modes over the same ranges, we see no significant change in the distribution of aerosol compared to the aircraft measurements (see figures in Appendix A). This could be due to the treatment of dry deposition as the lower boundary of the vertical diffusion scheme in ECHAM-HAM (Stier et al. 2005), which minimises spurious surface effects.

Comparison with SALSA
When aerosols grow due to various processes they can move between modes, e.g. from Aitken to accumulation. Another possible reason for the biases observed in the upper FT is that M7, the default aerosol scheme in ECHAM-HAM, has to perform a redistribution of number between modes, in order to avoid numerical diffusion, often referred to as 'mode merging'. This can 10 result in 'stiff' modes which do not grow or shrink as efficiently as they should. However the SALSA bin scheme (H. Kokkola et al. 2008) is also available to use in ECHAM-HAM. Rather than representing the aerosol population as 7 log-normal modes as in M7, SALSA uses 20 bins in the standard configuration.
The model aerosol fields are interpolated onto the observational points as with M7 and the integrated number can be calculated directly by summing the appropriate aerosol bins. The median fractional bias in the integrated number as a function of altitude is shown in Figure 11. Interestingly, the SALSA aerosol scheme shows a similar negative bias in the large-particle concentration in the upper FT, which suggests the mode merging in M7 is not the cause of the bias in ECHAM-HAM. SALSA also has a small positive bias in smaller particles not present in M7, which may be due to differences in the wet deposition scheme (Bergman et al. 2012). This result suggests that microphysical details can be of secondary importance compared to other physical processes, in particular wet deposition, when it comes to accurately representing the aerosol size distribution. A similar conclusion was reached when investigating the difference between bin and modal schemes in the GLOMAP model (Mann et al. 2012).
Atmos. Chem. Phys. Discuss., https://doi.org/10.5194/acp-2018-1337 Manuscript under review for journal Atmos. Chem. Phys.
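As an illustration of how a bias profile of this kind can be assembled from collocated model and observed values, the sketch below groups points into altitude bins and takes the median fractional bias in each. It assumes the fractional bias is defined as (model − observation) / observation; that definition, the function name and the 1 km bin width are assumptions made for this example and may differ from those used for Figure 11.

```python
from collections import defaultdict
from statistics import median


def median_fractional_bias_profile(altitudes_m, model, obs, bin_width_m=1000.0):
    """Return {altitude bin lower edge (m): median fractional bias}.

    Assumed definition: fractional bias = (model - obs) / obs.
    Points with non-positive observations are skipped.
    """
    grouped = defaultdict(list)
    for z, m, o in zip(altitudes_m, model, obs):
        if o <= 0:
            continue  # cannot normalise by a zero or negative observation
        level = int(z // bin_width_m)
        grouped[level].append((m - o) / o)
    return {
        level * bin_width_m: median(values)
        for level, values in sorted(grouped.items())
    }


# Hypothetical collocated samples (altitude in m, concentrations in cm-3).
profile = median_fractional_bias_profile(
    altitudes_m=[500.0, 800.0, 1500.0, 1700.0, 1900.0],
    model=[90.0, 110.0, 50.0, 80.0, 100.0],
    obs=[100.0, 100.0, 100.0, 100.0, 100.0],
)
```

The median is used here, as in the text, because it is less sensitive than the mean to a few strongly biased samples within an altitude bin.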

CCN
Many of the aircraft included in the NSD analysis above also carried a CCN counter, which is able to measure CCN either at specific supersaturations or across a range of supersaturations (see Table 1 for details). By taking all of the measurements at each supersaturation and comparing with the model CCN at the same supersaturation we are able to create profiles of the fractional bias in CCN at a range of frequently measured supersaturations, as shown in Figure 12.
The CCN profiles contain fewer measurements since not all the flights carried a CCNC and some of these instruments 'scanned' across supersaturations and hence only measured at any given supersaturation a fraction of the time. As discussed in the introduction, the CCN spectra also depend on both the aerosol size distribution and hygroscopicity, but it can be clearly seen that the bias in the NSD shown in Section 4.1 manifests itself in a low bias in the CCN at lower supersaturation (mostly larger particles) in the FT. The same processes identified through the sensitivity analysis as being important for influencing the vertical size distribution, namely wet deposition and condensational growth, control the vertical CCN spectra. Although many cloud regimes are updraft limited rather than CCN limited (Reutter et al. 2009), this low bias in low-supersaturation CCN is likely to have an important impact on the forcing in those CCN-limited regimes.
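The per-supersaturation comparison described in this section can be sketched as follows. Each measurement is assigned to the nearest of a set of target supersaturations, discarded if it falls outside a tolerance (as many samples from a scanning instrument would be), and the median fractional bias and sample count are reported for each target. The target values, the tolerance, the function name and the fractional bias definition (model − observation) / observation are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def ccn_bias_by_supersaturation(records, targets, tol=0.02):
    """Group CCN comparisons by target supersaturation.

    records : iterable of (supersaturation_percent, model_ccn, observed_ccn)
    targets : supersaturations (percent) at which to build the comparison
    tol     : maximum allowed distance (percent) from the nearest target

    Returns {target: (median fractional bias, number of samples)}.
    """
    grouped = defaultdict(list)
    for ss, model_ccn, obs_ccn in records:
        if obs_ccn <= 0:
            continue  # skip samples that cannot be normalised
        nearest = min(targets, key=lambda t: abs(t - ss))
        if abs(nearest - ss) <= tol:
            grouped[nearest].append((model_ccn - obs_ccn) / obs_ccn)
    return {t: (median(v), len(v)) for t, v in sorted(grouped.items())}


# Hypothetical samples; the 0.25 % sample matches no target and is dropped.
example = ccn_bias_by_supersaturation(
    records=[(0.10, 50.0, 100.0), (0.11, 80.0, 100.0),
             (0.40, 120.0, 100.0), (0.25, 10.0, 100.0)],
    targets=[0.1, 0.4],
)
```

Returning the sample count alongside the median makes the reduced data coverage at each supersaturation explicit when interpreting the profiles.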

Discussion and conclusions
We have evaluated the vertical size distribution of sub-micron aerosol particles in ECHAM-HAM using a dataset of in-situ aircraft measurements that covers large parts of the globe. The model generally performs well but shows a negative bias in accumulation mode aerosol in the mid- to upper troposphere. By comparing the bias over land and ocean we show that this bias could result from errors in the ageing and removal processes rather than in the emissions. This bias in particle concentrations translates into a negative bias for low-supersaturation CCN at similar altitudes. The model also underestimates marine sulphate in the boundary layer, likely due to an under-representation of DMS emissions. A similar bias, which contributed to an overly large aerosol forcing, has been seen in UKESM1 (Mulcahy et al. 2018) and will be explored in other models in future work.
We also performed a simple one-at-a-time parameter perturbation study which showed that wet deposition, a key aerosol removal process in ECHAM-HAM, is probably over-efficient, particularly in the southern extra-tropics. One potential reason for this over-efficiency is the assumption (common in GCMs) that aerosol mixes instantaneously across a grid box (Gryspeerdt et al. 2014). Both tracer entrainment and coagulation are shown to be important mechanisms controlling the vertical distribution of aerosol in the tropics, but with limited impact elsewhere. The modelled aerosol size distribution shows reduced bias compared to the aircraft measurements when these processes are tuned up and down respectively. Condensational growth is particularly important in the northern extra-tropics where slower ageing (requiring an increased amount of sulphate) reduces the model bias. A similar sensitivity analysis for HadGEM-UKCA (Kipling et al. 2016) showed qualitatively similar results: reduced condensational growth led to fewer small particles and more large particles in the upper troposphere, and coagulation had the greatest effect on particle concentrations in the tropics. In this study, however, ECHAM-HAM does not show the pronounced effect of convective entrainment or dry deposition seen in HadGEM-UKCA. This could be due to the treatment of dry deposition as a lower boundary of the vertical diffusion scheme in ECHAM-HAM, which minimises spurious surface effects.
These simple perturbations do not allow us to explore the complex interactions between these processes, but they do demonstrate the magnitude of the single effects, and they highlight the value of these measurements in evaluating them. By performing a full sampling of these parameterisations and combining the constraints developed in this work with other remote-sensing datasets it will be possible to significantly improve our confidence in the representation of aerosol in ECHAM-HAM.
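The structure of a one-at-a-time design, and why it cannot capture interactions, can be seen in a minimal sketch such as the one below. Each run changes exactly one parameter from its baseline value while all others are held fixed. The parameter names, the baseline values of 1.0 and the scale factors are hypothetical placeholders, not the settings used in this study.

```python
def one_at_a_time(baseline, factors):
    """Build a one-at-a-time perturbation design.

    baseline : dict of parameter name -> baseline scaling value
    factors  : multiplicative factors applied to one parameter at a time

    Returns a list of (perturbed parameter, factor, full parameter dict).
    """
    runs = []
    for name in baseline:
        for factor in factors:
            perturbed = dict(baseline)  # copy so the baseline is untouched
            perturbed[name] = baseline[name] * factor
            runs.append((name, factor, perturbed))
    return runs


# Hypothetical process scalings, each perturbed down and up by a factor of two.
baseline = {"wet_deposition": 1.0, "coagulation": 1.0, "dry_deposition": 1.0}
design = one_at_a_time(baseline, factors=(0.5, 2.0))
```

The number of runs grows linearly with the number of parameters, which keeps the study cheap, but no run ever varies two processes together, so their combined effect is never sampled.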
One important process which has been shown to affect the size distribution of aerosol in the UT (e.g. Heald et al. 2011), but which is not explored in this paper, is the representation of Secondary Organic Aerosol (SOA). The default setup for ECHAM6.3-HAM2.3-M7 and -SALSA is to use prescribed SOAs, but this results in a lower SOA burden compared to online calculation (Tegen et al. 2018). This has also been shown to have a large effect on the vertical distribution of organic aerosol in other models (e.g. Shrivastava et al. 2015; Tsigaridis et al. 2014), although it is not clear that this will have a large impact on the Aitken and accumulation mode number considered in this work.
The increasing availability of aircraft datasets measuring the vertical distribution of aerosol, particularly in the UT, provides valuable constraints for GCMs, with implications for improving our representation of aerosol direct and indirect effects in these models. A multi-model experiment to apply these constraints within the AeroCom framework and explore inter-model biases and diversity is currently underway.

Acknowledgements
The authors thank Jens Redemann, Jamie Trenbeth, Harri Kokkola and Dan Partridge for useful discussions.