A global model perturbed parameter ensemble study of secondary organic aerosol formation

A global model perturbed parameter ensemble of 60 simulations was used to explore how combinations of six parameters related to secondary organic aerosol (SOA) formation affect particle number concentrations and organic aerosol mass. The parameters represent the formation of organic compounds with different volatilities from biogenic and anthropogenic sources. The most plausible parameter combinations were determined by comparing the simulations against observations of the number concentration of particles larger than 5 3 nm diameter (N3), the number concentration of particles larger than 50 nm diameter (N50), and the organic aerosol (OA) mass concentration. The simulations expose a high degree of model equifinality in which the skill of widely different parameter combinations cannot be distinguished against observations. We therefore conclude that, based on the observations we have used, a 6-parameter SOA scheme is under-determined. Nevertheless, the model skill in simulating N3 and N50 is clearly determined by the low and extremely low volatility compounds 10 that affect new particle formation and growth, and the skill in simulating OA mass is determined by the low volatility and semi-volatile compounds. The biogenic low volatility class of compounds that grow nucleated clusters and condense on all particles is found to have the strongest effect on the model skill in simulating N3, N50 and OA. The simulations also expose potential structural deficiencies in the model: we find that parameter combinations that are best for N3 and N50 are worst for OA mass, and the ensemble exaggerates the observed 15 seasonal cycle of particle concentrations − a deficiency that we conclude requires an additional anthropogenic source of either primary or secondary particles. 1 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c © Author(s) 2020. CC BY 4.0 License.


INTRODUCTION
; Spracklen et al. (2011) show that models have consistently and significantly underestimated SOA concentrations in different parts of the atmosphere. Tsigaridis et al. (2014) shows models largely underestimate the amount of organic aerosol present in the atmosphere with the underestimation being strongest in urban regions based on a study involving 31 global models. 5 It is well-established that atmospheric organic molecules strongly affect the number concentrations of climaterelevant sized particles by condensing on and growing aerosol particles (Riipinen et al., 2011) or by promoting particle formation (O'Dowd et al., 2002;Zhang et al., 2004;Metzger et al., 2010). Recent studies (Ehn et al., 2014;Jokinen et al., 2015;Kirkby et al., 2016;Tröstl et al., 2016;Bianchi et al., 2019) have established the importance of atmospheric highly oxygenated organic molecules in the formation and growth of new aerosol 10 particles. Subsequent model simulations based on these experimental data have shown that new particle formation involving organic molecules could explain the seasonal cycle of particle concentrations (Riccobono et al., 2014) and provide a source of aerosol in clean pre-industrial environments that is important for climate . It has been estimated that global cloud condensation nuclei (at 0.2% supersaturation) would be about one-quarter lower without biogenic VOCs and about three-quarters of this effect is caused by the role of 15 organic molecules in nucleation and early growth (Gordon et al., 2017).
The importance of SOA for climate means that large-scale models need to simulate the contributions of these highly oxygenated molecules in nucleation or subsequent particle growth. Several modelling studies (such as Farina et al., 2010;Pye and Seinfeld, 2010;Jathar et al., 2011) have implemented the Volatility Basis Set (VBS) 20 framework proposed by Donahue et al. (2012) for the description of organic partitioning and chemical ageing.
Other SOA modelling frameworks have been proposed by Odum et al. (1996); Camredon et al. (2007) ;Parikh et al. (2011). However increased model complexity inevitably requires more computational resources and more runtime, both of which need to be considered carefully by modellers. Shrivastava et al. (2011) found comparable predictions of observed OA between a 9-species VBS approach and a 2-species simplified VBS approach, the 25 latter being a factor 2 lower in computational cost. Shrivastava et al. (2011) concluded the 2-species approach is well-suited to represent the complex evolution of atmospheric organic aerosols. Riipinen et al. (2011) andScott et al. (2015) propose simplified SOA formation scheme representing species of two volatilities. Tsigaridis et al. (2014) shows global model skill does not increase with model complexity with regard to organic aerosol mass concentrations.
There is the risk that added complexities in models that are not well-constrained by experimental data could increase model uncertainty and thereby introduce more uncertainty in the quantification of the effect of anthro-5 pogenic aerosols (Lee et al., 2016). Although model complexity is increased in order to improve the representation of the physical, chemical or optical properties of SOA (Tsigaridis et al., 2014), a more-complex model that matches some observations may not have lower uncertainty in making predictions because of the increased likelihood of compensating errors, often called model equifinality (Beven, 2006). The simulated particle number (and anthropogenic aerosol forcing) is affected by several model parameters and their associated uncertainties, 10 some of which may be compensating for each other. Tuning any one parameter within the model (e.g. the nucleation rate) to improve the model performance against observations of one aspect of the atmospheric aerosol distribution (e.g. the total particle number concentration), may adversely affect the model performance in other outputs (e.g. total aerosol mass). Such tuned observationally constrained models give the impression of low aerosol uncertainty and model robustness but still predict a large range of aerosol forcing in Lee et al. (2016). 15 In this paper we use a perturbed parameter ensemble of 60 simulations of a global aerosol model to examine the effect of six uncertain parameters in SOA formation on simulated organic aerosol mass and particle number concentrations. We compare three model outputs against observations: the number concentration of particles larger than 3 nm dry diameter (N3) and 50 nm dry diameter (N50), and the mass concentration of 20 organic aerosol (OA). Our primary aim is to understand the sensitivity of N3, N50 and OA to the combinations of model input parameters that control the formation of SOA. We identify parts of parameter space that result in the best agreement with observations of N3, N50 and OA. Further, we identify parameter combinations that produce models that are indistinguishable in terms of their simulations of number and mass concentrations. 4 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.

GLOMAP global aerosol model
We use the GLOMAP (GLObal Model of Aerosol Processes) modal aerosol microphysics model (Mann et al., 2010), which is an extension to the TOMCAT 3-D chemical transport model (Chipperfield, 2006). The model has a horizontal resolution of 2.8 • x 2.8 • , with 31 hybrid σ-pressure levels from surface to 10 hPa. Large-scale atmospheric transport in the model for 2008 is driven by ERA-Interim reanalysis produced by the European 5 Centre for Medium-Range Weather Forecasts (ECMWF) at 6 hourly intervals. The aerosol distribution is simulated using four hydrophilic modes (nucleation, Aitken, accumulation and coarse) and one non-hydrophilic Aitken mode. The aerosol phase has four components: sulphate, sea salt, black carbon and organic carbon.
Where more than one species is contained in a mode we assume internal mixing (Mann et al., 2010).
2.2 SOA scheme 10 The model described in Mann et al. (2010) includes one SOA species produced from oxidation of monoterpenes only. In this study we produce six SOA species from oxygenated organic compounds derived from the oxidation of monoterpene, isoprene and anthropogenic sources. Monthly monoterpene and isoprene emissions used in the model are generated from the Community Land Model by Sarah Monks (MEGANv2.1;Guenther et al., 2012) Emission of anthropogenic VOC is parameterised using the approach in Spracklen et al. (2011) − we use CO emissions from the MACCity inventory and assume a SOA/OC mass ratio of 1.4. Atmospheric oxidants (OH . , 3 ) are taken from 6-hourly monthly mean values calculated offline from a TOMCAT simulation and interpolated to the model chemical time step (Monks et al., 2017). 5 Figure 1 shows a schematic of the treatment of SOA in this study. Gas-phase oxygenated organic compounds (ox-VOC) are represented by three classes: Extremely Low Volatility (ELVOC), Low Volatility (LVOC) and Semi-volatile (SVOC) organic compounds. 5 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.
2.2 SOA scheme 2 METHODS Figure 1. Schematic showing the SOA formation scheme in GLOMAP-mode and the six oxidised VOCs (products of photochemical oxidation of emitted VOCs that eventually produce SOA) whose concentrations are perturbed in this study. The six oxidised VOCs represent three volatility categories -extremely low volatility organic compounds (ELVOC, blue), low volatility organic compounds (LVOC, green) and semi-volatile organic compounds (SVOC, red). Prefix 'A' indicates precursor gases of anthropogenic origin and 'B' indicates biogenic origin. The schematic shows the precursor gases and oxidants that react to produce these ox-VOCs, their relative volatility (ELVOC<LVOC<SVOC) and the mechanism (nucleation for ELVOC, kinetic condensation for LVOCs and mass-based partitioning for SVOCs) by which they add to the condensed phase (represented here by the five modes:-nucleation soluble, Aitken soluble, accumulation soluble, coarse soluble and Aitken insoluble modes).
The ELVOCs are assumed to derive only from biogenic sources and nucleate to form new particles that either grow or are lost to pre-existing larger particles . The LVOCs are assumed to condense kinet-5 ically and irreversibly (i.e., with zero vapour pressure) on all particles, and the SVOCs are assumed to partition into all particles, except those in the nucleation mode, in proportion to the pre-existing organic mass in the mode (Scott et al., 2015). The LVOC and SVOC are further divided into biogenic (prefix B) and anthropogenic (prefix 6 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.

Perturbed parameter ensemble
2 METHODS A). The biogenic precursors are split into monoterpenes (suffix M) and isoprene (suffix I), with the monoterpenes allowed to form both LVOC and SVOC and the isoprene forming only SVOC. The precursors, oxidants 10 and role of each ox-VOC in SOA formation are defined in Table 1. The ox-VOCs in this new scheme of SOA formation produced from bimolecular oxidation reactions of VOCs are, B_ELVOC, B_LVOC, B_SVOC_M, B_SVOC_I, A_LVOC and A_SVOC. The six perturbed parameters in this study are scaling factors or reaction yields that control the concentrations of these six ox-VOCs.

A_SVOC
Semi anthropogenic mass based partition anthropogenic VOCs OH . Table 1. List of oxidised VOCs implemented in this study, the volatility class they represent, whether produced from biogenic or anthropogenic sources, how they take part in atmospheric SOA formation, parent VOC, and oxidants that react to produce each ox-VOC. MT stands for monoterpene, IP stands for isoprene and CO stands for carbon monoxide indicating anthropogenically sourced VOC.

Perturbed parameter ensemble
Simultaneous perturbation of six model parameters within each of their chosen range forms a 6-D parameter space within which we explored the competing and compensating effects of these parameters on model outputs. 5 An ensemble of 60 simulations was produced, each with a different combination of the six parameters. Simulations were run for the year 2008. To ensure good coverage of the 6-D parameter uncertainty space, the maximin Latin hypercube sampling method (McKay et al., 1979) was used to choose the combinations of parameters 7 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.

Perturbed parameter ensemble
2 METHODS (Lee et al., 2011(Lee et al., , 2013. The relative variation of the ox-VOCs in each simulation is shown in Figure 2. 10 We perturb the concentrations of SOA-producing ox-VOCs by perturbing the yields of bimolecular oxidation reactions for B_LVOC, B_SVOC_M, B_SVOC_I, A_LVOC and A_SVOC. The baseline molar yields for each of these ox-VOCs before perturbation were: 13% for B_LVOC, 13% for B_SVOC_M, 3% for B_SVOC_I producing approximately 40 Tg yr −1 of SOA from biogenic sources. The total anthropogenic ox-VOCs are split equally between A_LVOC and A_SVOC (effectively a 50% yield of the total anthropogenic each) together pro- 15 ducing approximately 63 Tg yr −1 of SOA. Within the ensemble the yield of each of the above is perturbed from 0 to 20 times the baseline for biogenic ox-VOCs (B_LVOC, B_SVOC_M, B_SVOC_I) and from 0 to 5 times the baseline for anthropogenic ox-VOCs (A_LVOC and A_SVOC). We perturb the concentrations of ox-VOCs taking part in the model nucleation scheme, B_ELVOC, using a scaling factor between 0 and 10, where 0 is equivalent to sulphuric acid-only nucleation and 10 being equivalent to ELVOC concentrations about a factor of 20 10 higher than reported in Kirkby et al. (2016). The ranges (Table 2) were chosen to encompass a wide range of uncertainties and simplifications in the model. These include VOC emission uncertainties; structural uncertainty such as neglected precursor gases (e.g., sesquiterpenes); uncertainty in yields of the ox-VOCs from oxidation reactions; simplifications to the oxidation pathways (in GLOMAP-mode only single-stage oxidation products are represented); and uncertainty in SOA due to neglecting the volatility distribution and re-evaporation of SOA (Donahue et al., 2011(Donahue et al., , 2012. Changing the ox-VOC concentrations by changing the yields of chemical reactions has the same effect in the model as perturbing the emissions of the parent VOCs. A yield above 100% in the bimolecular reactions should therefore be interpreted as an increase in the total concentration of reactants. 5 The advantage of varying the yield of ox-VOCs rather than the emissions of VOCs is that a perturbation applied to one ox-VOC does not affect the production or loss of the other ox-VOCs when they have the same parent VOC (i.e., they have uncorrelated effects across the 6-D parameter space).  Table 2. Perturbed parameters and ranges. Also listed are the precursor VOC gases that produce the ox-VOCs, the value of ox-VOC yields in the default model version (unperturbed), how the perturbation for each parameter is implemented in the model ('absolute' for replacing the default yield value by the perturbation, 'scaled' for scaling the default yield value by the perturbation), and the amount of SOA produced from each parameter in the default model version.  N3 observations cover 34 ground stations worldwide as used in . Measurements of N3 were made between 1994 and 2009 using either condensation particle counters (CPCs), scanning mobility par- 15 ticle sizers (SMPS), differential mobility particle sizers (DMPS) or Diffusion Aerosol Spectroscopes (DAS).
The N3 data set is fully described in  and has been used in previous studies such as Riccobono et al. (2014), Gordon et al. (2016)  All station data were averaged to create monthly mean values. Station heights were matched to model pressure levels for each month using barometric altitude. Stations cover a wide range of atmospheric conditions 5 such as continental boundary layer (CBL e.g. Hyytiala, Harwell, Botsalano), marine boundary layer (MBL e.g. Mace Head, Trinidad Head, Sable Island) and free tropospheric (FT e.g. Nepal, Jungfraujoch, Pico Espejo, Mauna Loa) sites. Errors in measurements are estimated to be around 30% on average, depending strongly on the spatial heterogeneity of sources (Reddington et al., 2017). 11 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. To examine the performance of the ensemble members we use statistical metrics including correlation coef-10 ficient, normalised mean bias factor and Taylor Skill Score or TSS (Taylor, 2001), in the following sections. 12 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.

Observations 2 METHODS
3.1 Global and regional aerosol mass and number 3 RESULTS NMBF is an unbiased and symmetric metric with a range from -∞ to +∞, with 0 corresponding to exact agreement. It is calculated using the following equation:  The Taylor Skill Score (TSS) is calculated as: whereσ is the normalised standard deviation (σ model /σ observation ) and R 0 is the maximum correlation attainable by the model, given the internal variability in the system (here R 0 is assumed to be 1). As the model variance approaches the variance in the observations and R approaches R 0 , TSS approaches unity. As the model 10 variance approaches zero or the correlation becomes more and more negative, TSS approaches zero.

Global and regional aerosol mass and number
The global mass of SOA produced in the model simulations ranges from 220 to 850 Tg yr −1 across the 6-D  Figure 4 shows that the ensemble members produce large regional variations in OA, even when they predict similar global values. For example, simulations 9 (subplot 3,4) and 36 (subplot 3,5) have similar global mean OA concentrations (6.47 and 6.49 µg m −3 , respectively) but they simulate very different OA over the highly 25 polluted regions of South Asia. Such regional variations are dependent on the parameter settings of the ox-VOCs. In this case simulations 9 and 36 particularly differ in the contribution from B_SVOC_M and A_LVOC.
Modelling efforts need to focus on capturing the competing and compensating effects of ox-VOCs contributing to different stages of particle formation and growth, rather than detailed representation of any one contributions.
This also emphasises the need to compare model outputs with regional as well as global metrics and observations to determine if the model is performing well. 5 Simulations 45 (subplot 9,2) and 49 (subplot 10,5) demonstrate the role of anthropogenic VOCs on the simulated aerosol size distribution and OA mass. Global mean N3, N50 and OA in simulations 45 and 49 are all in the upper quartile of the ensemble's output distribution for these quantities. Both of these simulations have high concentrations of B_ELVOC, which promote nucleation, and moderate to high B_LVOC, which promotes the survival of nucleated particles, contributing to the high global mean particle number and mass concentrations. In Therefore in the simulation with low anthropogenic SOA more smaller particles survive but fewer particles hold 5 substantial mass leading to lower OA mass concentrations. This is also supported by the considerably higher N3 but only slightly higher N50 number concentrations in the SE Asian region in simulation 49 compared to simulation 45. Anthropogenic sources of SOA affect the pre-industrial and present-day atmospheres differently in global models (Carslaw et al., 2013). Under-represented anthropogenic sources in global models therefore have greater climatic implications than under-represented biogenic SOA sources. Figures 5 and 6 show the global distribution of N50 and N3. The panels in each figure are ordered according to increasing global mean N3. There is a general increase in N50 with increasing N3, but eleven of the simulations clearly show low N50 concentrations despite high N3 (simulations 14, 39, 47, 56, 50, 38, 9, 15, 15, 59, 15 51 in Figure 6). All of these simulations have one aspect in common − very low yields of B_LVOC (Figure 2).
High concentrations of nucleating B_ELVOC in these simulations initially facilitates the formation of particles, but low yields of B_LVOC, especially when coupled with high B_SVOC_I and A_LVOC, suppresses particle growth up to 50 nm. 20 In contrast simulations 13, 24, 35 and 46 have low concentrations of B_ELVOC but relatively high B_LVOC.
Despite the low B_ELVOC concentrations, which produces fewer nucleated particles, the relatively high concentration of B_LVOC ensures that more of the nucleated particles reach 50 nm diameter in these simulations.
Consequently for these simulations the simulated global mean N3 concentrations are in the lower quartile within the ensemble but global mean N50 concentrations are in the interquartile range within the ensemble. B_LVOC 25 can compensate for B_ELVOC to some extent and clearly stands out as the most important controller of climaterelevant particle number concentrations. tions of ox-VOC parameters and with both high or low particle number concentrations. The figure is consistent with the challenge faced by state-of-the-art global models -despite simulating the particle number concentrations (N3 and N50) reasonably well models consistently under-predict OA mass concentrations (Kanakidou et al., 2005;Tsigaridis et al., 2014). The challenge in predicting both particle number concentrations and OA mass is re-visited in later sections. 3.1 Global and regional aerosol mass and number 3 RESULTS (Kirkby et al., 2016) used in the model, above which there is more scatter in N50 caused by the other model processes and parameters. N50 is also related to B_LVOC, but with more scatter than for B_ELVOC. These relationships show that N50 concentrations are strongly controlled in part by the production of B_ELVOC, which 15 causes nucleation, and by the production of B_LVOC, which grows the nucleated clusters via kinetic condensation. There is no clear relationship between N50 and A_LVOC production. The likely reason for this is that anthropogenic VOCs are not spatially co-located with the biogenically-produced B_ELVOC, so there are fewer nuclei in polluted regions and hence much less effect of the A_LVOC on the growth of nuclei to larger sizes. 20 Including new, more accurate nucleation pathways into models is unlikely to improve the model performance with respect to N50 (a highly relevant model output for estimation of climate relevant aerosol-cloud interactions) unless the models also include adequate representation of B_LVOCs (Figure 7). Several studies have investigated nucleation capability and nucleation pathways of atmospheric molecules (Kulmala et al., 1998(Kulmala et al., , 2004Kirkby et al., 2011;Kurtén et al., 2008;Almeida et al., 2013;Riccobono et al., 2014;Kirkby et al., 2016) 25 whereas the contribution of organic molecules to sub-3 nm cluster growth is relatively recent knowledge and the molecules involved are largely unidentified (Tröstl et al., 2016). The significance of B_LVOC is explored and established further in later sections. Figure 7 are found to be unrelated to B_ELVOC concentration showing that new particle formation has little effect on simulated OA mass in our model. Increases in all the other ox-VOC parameters generally increase OA, although there is a lot of scatter, particularly with the anthropogenic parameters, indicating a strong multi-variate relationship for simulated OA. Global mean OA shows the strongest dependence 5 on B_LVOC and B_SVOC_M and the highest global mean OA (darkest red in Figure 7) is simulated when B_LVOC and B_SVOC_M yields are more than 7.5 times the baseline yield of 13% (above 100% in Figure 7; producing over 113 Tg yr −1 each).

OA concentrations in
We make two more observations from Figure 7. The simulated OA mass concentrations seem to have a stronger 10 relationship with SVOCs than LVOCs, because SVOCs partition to larger particles which already hold substantial mass thereby having a greater impact on OA mass. Secondly OA concentrations appear to have a steeper in-20 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.
3.1 Global and regional aerosol mass and number 3 RESULTS crease with increases in the biogenic ox-VOCs than their anthropogenic counterparts. This is because A_LVOCs or A_SVOCs grow fewer particles than their biogenic counterparts and as a result changes in their concentrations have a lesser impact on simulated OA mass. The involvement of anthropogenic precursors in particle formation and cluster growth is likely to change this picture (Molteni et al., 2018).
Overall we find that when particle number concentrations are low, the main difference between simulations 5 that produce high amount of OA (simulations 16, 23, 35, 46 in Figure 4) and those that do not (see simulations 34, 55, 58 in Figure 4), is the relative concentrations of B_LVOC which grows freshly nucleated clusters before they can be scavenged by coagulation (see Figure 2 for parameter combinations). When particle number concentrations are high (due to high B_ELVOC or B_LVOC or both), the mass of OA produced is determined by the combined effects of all other ox-VOCs. Parameter combinations such as in simulations 45 and 49 produce some 10 of the highest global mean N3, N50 and OA within the ensemble. Parameter combinations when B_ELVOC and/or B_LVOC dominate significantly over the the rest of the ox-VOCs (such as simulations 8, 21, 41, 54 and 57) the increased competition between small particles for growth cause the available high volatility ox-VOCs to distribute on more particles, causing smaller increase in the particle mass. model. Here we find the winter-time underestimation continues even when B_ELVOC or B_LVOC ox-VOC parameters are at their highest settings. In addition to underestimation of modelled concentrations in winter, some ensemble members overestimate particle concentration in the summer (for example Aspvreten in Figure   8). These combined biases mean that the model overestimates the strength of the seasonal cycle compared to observations. 25 Figure 9 shows the monthly-mean timeseries of N50 at 31 ground sites. The 60-member ensemble is able to encompass the observations in all months at only 2 out of 31 sites. Like N3, the model bias for N50 in each of the 60 ensemble members range between a factor of 3 underestimation to a factor of 2 overestimation, with underestimation being more prevalent. The correlation coefficients (calculated for each ensemble member at each location using monthly mean simulated and observed N50 concentrations) in 22 out of 31 sites are higher than 0.5 (Figure not shown). The best correlation coefficients are observed in non-urban sites and the maxi-5 mum underestimation and poorest correlation coefficients for N50 are observed at the polluted sites of Ispra and Marikana (Asmi et al., 2011;Vakkari et al., 2013). In contrast the ensemble performs significantly better at Hohenpeissenberg and Zugspitz, both high-altitude sites free from nearby anthropogenic influence only about 458 km from Ispra, and at Botsalano, representing a semi-clean environment, which is about 150 km from Marikana. 5 We also find that as the normalised mean bias factor (calculated between each simulation and observed values) reduces, the calculated correlation coefficient weakens. This implies that the model has a structural deficiency that cannot be resolved by perturbing the model parameters. The strong link between B_ELVOC and N3 indicates that the weakening of the correlation coefficient with the improvement of normalised mean bias factor is 23 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.
3.1 Global and regional aerosol mass and number 3 RESULTS related to nucleation: higher ELVOC production rates increase annual mean N3 concentrations, but the summer concentrations are affected much more than the winter concentrations, which weakens the correlation. We suggest that a missing particle source such as anthropogenic pollutants − which are maximum in the winter

Model skill across the 6-D parameter space
We now explore how the model skill in simulating observed N3, N50 and OA varies across the 6-D ox-VOC 10 parameter space (Figures 10, 11 and 12). Because it is a 6-D space, it is important to note that for each subplot showing the relationship between two ox-VOCs, the other four parameters are varying randomly across each plane. A weak dependency between an ox-VOC and model skill does not imply that the contribution of the ox-VOC to OA and particle number concentration is unimportant. Rather, it implies that within the current modelling framework its contribution can be compensated by changes in other ox-VOCs. 15 25 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.

Model skill across the 6-D parameter space 3 RESULTS
Some clear patterns in model skill are apparent across the 6 dimensions. N3 skill depends strongly on B_ELVOC, B_LVOC and A_LVOC ( Figure 10). The skill is generally lower for B_ELVOC yields less than 6.4% from ozonolysis of α-pinene (i.e.twice the baseline yield), irrespective of the value of other parameters (left column in Figure 10). The N3 skill is also generally low for values of A_LVOC greater than about 95 Tg yr −1 (yield 20 corresponding to 150% in Figure 10), although there are a few simulations that have reasonable skill above this value (second row from bottom). We note two regions in the 6-D space dominated by low model skill in N3 − where B_ELVOC yield is greater than 19.8% (6 times baseline) and B_LVOC is less than 113 Tg yr −1 (corresponding to a yield of 100% in Figure 10 first column, second row from top) and where the sum of anthropogenic LVOC and SVOC is greater than 127 Tg yr −1 (200% yield in Figure 10 fifth column, sixth row). 25 There is also a general increase in skill for high values of B_LVOC (second column).
N50 skill has the strongest dependency on B_LVOC. The model is most skilful for B_LVOC greater than about 113 Tg yr −1 (corresponding to a yield of 100%, Figure 11 second column) with a general increase in skill for higher values of B_LVOC. N50 skill is generally lower for B_ELVOC yields less than twice the baseline yield or 6.4%, although other parameter values and particularly high B_LVOC (Figure 11 first column, second row) improve model skill in some cases. The dependence of N50 skill on A_LVOC is much weaker than for N3, with high and low model skills spread across the entire parameter range. In contrast the N50 skill tends to be low for 5 A_SVOCs greater than about 95 Tg yr −1 (150% yield, Figure 11 bottom row). Figures 10 and 11 show that model simulation of N3 and N50 is most skilful when B_ELVOC production is a factor of 2 to 8 higher than the baseline model B_ELVOC yields of 3.2% and 1.2% from O 3 and OH . oxidation reactions respectively (Kirkby et al., 2016). With less than 6.4% B_ELVOC yields from O 3 and OH .

10
oxidation reactions, the model−observation match is consistently poor (shades of blue in Figure 10 or Figure   13). The best estimate of B_ELVOC yield to obtain reasonable agreement with observed N3, N50 and OA in our model (denoted by shades of red in Figure 13) is about a factor of 4 higher than the baseline B_ELVOC yield from ozonolysis of α-pinene (Kirkby et al., 2016). Together, the variations in skill for N3, N50 and OA concentrations across the six dimensions show that the parameter space for a high N3 or N50 skill score does not overlap with the parameter space for a high OA skill score ( Figures 10, 11 and 12). This gives an insight as to why models that are fine-tuned to simulate particle number concentrations underestimate OA mass concentrations in the atmosphere − also identified as a persis-5 tent challenge for state-of-the-art global models (Kanakidou et al., 2005;Spracklen et al., 2011).
Our results reveal the problem of model equifinality, highlighted for the whole aerosol model by Lee et al. (2016). Equifinality means that there are multiple ways (i.e., parameter combinations) of achieving the same model skill against observations, which makes it difficult to identify the best model (Beven, 2006). An impor-10 tant consequence of equifinality is that fine-tuning any one aspect of the model (for example, the nucleation mechanism) to achieve best model-observation agreement for any one variable (e.g., the particle number concentration) can be achieved with a wide range of settings of other parameters (e.g., parameters controlling overall OA mass production). While this may not affect the overall model skill in the particular evaluation, the various parts of equally plausible parameter space may result in very different model behaviour in, say, climate 15 projections. 27 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. Figure 10. Taylor Skill Score for model simulations against N3 observations across the 6-D parameter space. The x-and y-axes for a subplot show the total range of reaction yields (in%) over which each of the two parameters (as indicated by the plot labels at the top and right for each subplot respectively) is perturbed in the ensemble. Each triangle in a subplot represents a simulation and the color of the triangle indicates its Taylor skill score for N3. Darker shades of blue indicate low/poor Taylor skill score and darker shades of red represent high/good Taylor skill score. Figure A1 shows the same plot with ensemble members numbered.
Note: Axis for B_ELVOC shows scaling factor for B_ELVOC yields. Axis for the rest show corresponding ox-VOC yields. 28 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. Figure 11. Taylor Skill Score for model simulations against N50 observations across the 6-D parameter space. The x-and y-axes for a subplot show the total range of reaction yields (in%) over which each of the two parameters (as indicated by the plot labels at the top and right for each subplot respectively) is perturbed in the ensemble. Each triangle in a subplot represents a simulation and the color of the triangle indicates its Taylor skill score for N50. Darker shades of blue indicate low/poor Taylor skill score and darker shades of red represent high/good Taylor skill score. Figure A2 shows the same plot with ensemble members numbered.
Note: Axis for B_ELVOC shows scaling factor for B_ELVOC yields. Axis for the rest show corresponding ox-VOC yields. 29 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. Figure 12. Taylor Skill Score for model simulations against OA observations across the 6-D parameter space. The x-and y-axes for a subplot show the total range of reaction yields (in%) over which each of the two parameters (as indicated by the plot labels at the top and right for each subplot respectively) is perturbed in the ensemble. Each triangle in a subplot represents a simulation and the color of the triangle indicates its Taylor skill score for OA. Darker shades of blue indicate low/poor Taylor skill score and darker shades of red represent high/good Taylor skill score. Figure A3 shows the same plot with ensemble members numbered.
Note: Axis for B_ELVOC shows scaling factor for B_ELVOC yields. Axis for the rest show corresponding ox-VOC yields.

30
https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. Figure 13 summarises Figures 10, 11 and 12 showing the model skill score in all three model outputs across the entire parameters space for all five ox-VOCs in 1-D. We find five simulations that are shaded red for all three model outputs across the 6 parameters: simulations 8, 17, 19, 21 Figure 13. The parameter combinations used to produce the above ox-VOCs in the ensemble are listed in Table A2.
One ensemble member, simulation 41, scores reasonably well (Taylor Skill Scores of 0.28, 0.11 and 0.14 for N3, N50 and OA) in simulating observed mass and number of particles. It is the only simulation for which Taylor Skill Score in each of the three outputs are among the top 10 highest scores within the ensemble (see Table A1). The score is highest for N3, second highest for N50 (0.12 being the highest score) and the sixth  Note: Axis for B_ELVOC shows scaling factor for B_ELVOC yields. Axis for the rest show corresponding ox-VOC yields.

CONCLUSIONS Conclusions
We have used a perturbed parameter ensemble of 60 model simulations to explore how uncertainty in six biogenic and anthropogenic precursors affects organic aerosol mass and particle number concentrations. The ranges for each parameter were chosen to encompass maximum uncertainty associated with organic compounds that 20 affect three different stages of SOA formation − nucleation, cluster growth and particle growth. Simultaneous perturbations of the six parameters using a Latin Hypercube Sampling technique allow the effects of parametercombinations rather than just individual parameters on model outputs to be explored. Three model outputs, the number concentration of particles larger than 3 nm diameter (N3), the number concentration of particles larger than 50 nm diameter (N50), and the organic aerosol (OA) mass concentration, were compared against observa-25 tions and the model skill score was then used to determine the skilful parts of parameter space.
The results expose a high degree of equifinality in the SOA model in which there are multiple ways of generating similar outputs (particle concentrations and OA mass). This is to be expected in a system with six free parameters and only three output variables of interest -that is, our six-component SOA model is underdetermined. Equifinality, or compensating parameter effects, limit the extent to which the best set of parameters can be identified by comparing the simulations against observations. Our results suggest that the effects of three cat-5 egories of volatile organic compounds can be detected by comparing the 60 simulations against observations: ELVOC, LVOC and SVOC. B_ELVOC is crucial for the formation of particles via nucleation. Thereafter contributions from LVOCs and SVOCs contribute to the growth of freshly-nucleated particles to produce a realistic N50 concentration and SOA mass.
10 B_ELVOC strongly influences model skill scores in N3 and to a lesser extent in N50. When B_ELVOC is low (< twice the baseline yield of 3.2%), the ensemble consistently underestimates N3 and N50 number concentrations, irrespective of the availability of other ox-VOCs. We find the best model skills scores in N3, N50 and OA are achieved when the ELVOC yield from precursor VOCs is between 6−26%, with the most plausible ELVOC yield estimate being aound 12.8%. Previously reported ELVOC yields from α-pinene ozonolysis are 15 at the lower end of this constrained range (3.2% with uncertainty range of +100%/-60% reported by Kirkby skill score) is accompanied by low B_SVOC_M such that the sum of B_LVOC + B_SVOC_M does not exceed about 226 Tg yr −1 (favouring a good OA skill score, due to the joint dependency of OA skill score on these two ox-VOCs).

5
SVOCs (from monoterpene/isoprene/anthropogenic sources) are important to simulate realistic number concentrations of climate-relevant sized particles such as N50 and OA mass concentrations. SVOCs generated from α-pinene in the model are spatially co-located with nucleated particles and hence have a strong effect on their growth to larger sizes. OA skills in particular show a strong dependence on SVOC_M as discussed above. 10 We cannot determine the plausible or implausible parameter space for those ox-VOC parameters which show a weak relationship with the model skill score. Model skill in N3, N50 and OA have the weakest relationship with B_SVOC_I. This is because in the current model setup the role of B_SVOC_I in growing particles by mass-based partitioning can be easily compensated for by other oxidised VOCs. The relationship between 15 anthropogenic oxidised VOCs and model skill score in OA are also weak, although A_LVOC shows a strong re-34 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. lationship with N3 skill score and A_SVOC with N50 skill score. The weak relationship between anthropogenic SVOCs with simulated OA mass is because the clusters produced by biogenic nucleation are not spatially colocalated with the anthropogenic LVOCs or SVOCs. We expect the relationship between anthropogenic VOCs and model skill scores in N3, N50 and OA to strengthen when the role of anthropogenic oxidised VOCs in nucleation and cluster growth Molteni et al. (2018) are represented.
Our results point to a structural deficiency in the model. The perturbed parameter ensemble tends to exaggerate the observed seasonal cycle of particle concentrations, overestimating summer and underestimating winter-time particle concentrations. We suggest this is due to SOA sources in models being predominantly biogenic. In-5 creases in the concentration of nucleating biogenic aerosols or in the complexities of nucleation mechanisms will not improve the model's ability to replicate the observed seasonal cycle. We expect the inclusion of anthropogenic sources in nucleation and in initial cluster growth for SOA particles in models would significantly improve the simulated aerosol seasonal cycle. It is important to explore the anthropogenic contribution to clustergrowth because such pathways could considerably impact model estimates of anthropogenic aerosol forcing by 10 differentially affecting the present-day and pre-industrial atmospheres (Carslaw et al., 2013).

5
Data availability. Measurement data, model data used in this paper can be made available upon request from the corre-15 sponding author. 35 https://doi.org /10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License.  Table A2. Scaling factor for B_ELVOC and yields of other ox-VOCs with the corresponding Taylor skill scores for 5 ensemble members that are shaded red for all three outputs N3, N50 and OA in Figure 13. 37 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. Figure A1. Taylor Skill Score for model simulations against N3 observations across the 6-D parameter space. The x-and y-axes for a subplot show the total range of reaction yields (in%) over which each of the two parameters (as indicated by the plot labels at the top and right for each subplot respectively) is perturbed in the ensemble. Each triangle in a subplot represents a simulation (labelled 1-60) and the color of the triangle indicates its Taylor skill score for N3. Darker shades of blue indicate low/poor Taylor skill score and darker shades of red represent high/good Taylor skill score.
Note: Axis for B_ELVOC shows scaling factor for B_ELVOC yields. Axis for the rest show corresponding ox-VOC yields.
38 https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. Figure A2. Taylor Skill Score for model simulations against N50 observations across the 6-D parameter space. The x-and y-axes for a subplot show the total range of reaction yields (in%) over which each of the two parameters (as indicated by the plot labels at the top and right for each subplot respectively) is perturbed in the ensemble. Each triangle in a subplot represents a simulation (labelled 1-60) and the color of the triangle indicates its Taylor skill score for N50. Darker shades of blue indicate low/poor Taylor skill score and darker shades of red represent high/good Taylor skill score.
Note: Axis for B_ELVOC shows scaling factor for B_ELVOC yields. Axis for the rest show corresponding ox-VOC yields.

39
https://doi.org/10.5194/acp-2020-756 Preprint. Discussion started: 12 August 2020 c Author(s) 2020. CC BY 4.0 License. Figure A3. Taylor Skill Score for model simulations against OA observations across the 6-D parameter space. The x-and y-axes for a subplot show the total range of reaction yields (in%) over which each of the two parameters (as indicated by the plot labels at the top and right for each subplot respectively) is perturbed in the ensemble. Each triangle in a subplot represents a simulation (labelled 1-60) and the color of the triangle indicates its Taylor skill score for OA. Darker shades of blue indicate low/poor Taylor skill score and darker shades of red represent high/good Taylor skill score.