Simulating organic aerosol in Delhi with WRF-Chem using the volatility-basis-set approach: exploring model uncertainty with a Gaussian process emulator

The nature and origin of organic aerosol in the atmosphere remain unclear. The gas–particle partitioning of semi-volatile organic compounds (SVOCs) that constitute primary organic aerosols (POAs) and the multigenerational chemical aging of SVOCs are particularly poorly understood. The volatility basis set (VBS) approach, implemented in air quality models such as WRF-Chem (Weather Research and Forecasting model with Chemistry), can be a useful tool to describe emissions of POA and its chemical evolution. However, the evaluation of model uncertainty and the optimal model parameterization may be expensive to probe using only WRF-Chem simulations. Gaussian process emulators, trained on relatively few WRF-Chem simulations, are capable of reproducing model results and estimating the sources of model uncertainty within a defined range of model parameters. In this study, a WRF-Chem VBS parameterization is proposed; we then generate a perturbed parameter ensemble of 111 model runs, perturbing 10 parameters of the WRF-Chem model relating to organic aerosol emissions and the VBS oxidation reactions. This allowed us to cover the model’s


S1. WRF-Chem setup.
Table S1 is based on Table 2 of Tsimpidi et al. (2010) and shows the aerosol yields for the high- and low-NOx parameterisations of the SOA products included in this model, along with the CRI-v2R5 precursors used in this study. For full details of the scheme, please refer to Tsimpidi et al. (2010).

Table S1: SOA yield scenarios using a four-product basis set with saturation concentrations of 1, 10, 100, and 1000 µg m^-3 at 298 K.

Table S2 gives the mapping of WACCM6 chemical species to CRI-v2R5 and MOSAIC chemical species. The mapping of the inorganic aerosol components, and of aerosol number, is taken from the mapping advice provided by Emmons, Pfister and Hodzic (2019) and is not repeated here. The mapping of OA to VBS compounds has been estimated from our preliminary modelling studies and is designed to provide 'aged' VBS aerosol at the boundaries of the domain. Only VBS volatility bins 1-3 (log10(C*) values of -2 to 0) are used, and a 'non-oxygen' to oxygen ratio of 1.925 has been adopted (also estimated from our preliminary modelling studies). We add the VBS compounds as gases and allow the model to partition them dynamically to the condensed phase as needed. The WRF-Chem variable names for each of these are given in Table S2.
Table S2: Species mapping between WACCM6 and CRI-v2R5 chemical schemes. The scaling factor is included where it is not 1. The scaling factor for the gas-phase VBS compounds includes the kg kg^-1 to ppm conversion factor of 1.1588 × 10^5 (calculated using the same molar mass of 250 g mol^-1 that is used within WRF-Chem for these compounds).
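The conversion factor quoted in the Table S2 caption can be checked with a few lines of Python. This is a minimal sketch, assuming a mean molar mass of dry air of 28.97 g mol^-1 (a value not stated in the source, chosen because it reproduces the quoted factor) and interpreting ppm as a mole mixing ratio. The function name `kgkg_to_ppm` is illustrative and not part of WRF-Chem.

```python
# Assumed mean molar mass of dry air (g/mol); NOT stated in the source.
M_AIR = 28.97
# Molar mass used for the VBS compounds (g/mol), as stated in the Table S2 caption.
M_VBS = 250.0


def kgkg_to_ppm(mass_mixing_ratio, m_species=M_VBS, m_air=M_AIR):
    """Convert a mass mixing ratio (kg/kg) to a mole mixing ratio in ppm."""
    # Moles of species per mole of air, scaled to parts per million.
    return mass_mixing_ratio * (m_air / m_species) * 1.0e6


# The factor applied to a unit mass mixing ratio is the conversion factor itself.
conversion_factor = kgkg_to_ppm(1.0)
print(f"{conversion_factor:.4e}")  # prints 1.1588e+05
```

Under these assumptions the result agrees with the 1.1588 × 10^5 given in the caption.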
Table S3 shows an example of the parameters used to control the VBS scheme when building the model setup for anthropogenic and biomass burning sources. A total of 111 namelist.input files were designed using the ranges of the 10 parameters in Table 2. The parameters are:
Ageing rate: the reaction rate for all VBS reactions in that scheme (in cm^3 molec^-1 s^-1).
Oxidation rate: the fractional increase in oxidation of the VBS compounds per reaction step.
FRAC[1-9]: the multiplier applied to the POA mass in the emission database to give the emitted mass of the VBS component in that volatility bin. Volatility bin 1 has a log10(C*) value of -2; this increases by one decade per bin for bins 2-8 (as illustrated in Figure S1), reaching a log10(C*) value of 6 for bin 9.
scaling: a scaling factor applied to all FRAC[1-9] values for that scheme, usually with the aim of ensuring that the condensed VBS mass at the time of emission is roughly equivalent to the involatile POA mass in the emission database used.
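To illustrate how the FRAC[1-9] multipliers and the scaling factor relate emitted VBS mass to condensed mass, the following Python sketch maps the nine bins to their log10(C*) values and applies the standard equilibrium partitioning expression, particle fraction = 1 / (1 + C*/C_OA). Several things here are assumptions rather than part of the source: C* is taken to be in µg m^-3, the ambient organic aerosol concentration `c_oa` is supplied as a fixed input instead of being solved for self-consistently, and all function names are hypothetical. This is not the WRF-Chem implementation, which partitions the compounds dynamically.

```python
import numpy as np


def vbs_log10_cstar(n_bins=9, first_bin=-2):
    """log10(C*) for bins 1..n_bins, increasing by one decade per bin."""
    return np.arange(first_bin, first_bin + n_bins)


def particle_fraction(log10_cstar, c_oa):
    """Equilibrium particle-phase fraction per bin for a given OA loading.

    c_oa is the absorbing organic aerosol concentration (assumed ug m^-3).
    """
    cstar = 10.0 ** np.asarray(log10_cstar, dtype=float)
    return 1.0 / (1.0 + cstar / c_oa)


def condensed_mass(poa, frac, scaling, c_oa):
    """Condensed VBS mass at emission for one source scheme.

    Emitted mass in each bin is poa * scaling * FRAC[bin]; the condensed
    part is that mass multiplied by the bin's particle-phase fraction.
    """
    frac = np.asarray(frac, dtype=float)
    emitted = poa * scaling * frac
    xi = particle_fraction(vbs_log10_cstar(len(frac)), c_oa)
    return float(np.sum(emitted * xi))


bins = vbs_log10_cstar()  # array from -2 to 6, one value per bin
```

With a set of FRAC values and an assumed OA loading, `condensed_mass` could be compared against the involatile POA mass to judge whether a given scaling value meets the aim described above.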

S3. WRF-Chem model evaluation
The parameters used for model evaluation were calculated with the OpenAir package (Carslaw and Ropkins, 2012). The following equations were extracted from the OpenAir manual, where O_i represents the ith observed value and M_i represents the ith modelled value, for a total of n observations.

Fraction of predictions within a factor of two (FAC2).
FAC2 is the fraction of modelled values within a factor of two of the observations, i.e. the fraction of pairs that satisfy:

$$ 0.5 \le \frac{M_i}{O_i} \le 2.0 $$

Mean bias (MB).
MB indicates the mean overestimate or underestimate of the predictions; it has the same units as the quantities being considered:

$$ \mathrm{MB} = \frac{1}{n} \sum_{i=1}^{n} \left( M_i - O_i \right) $$

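Index of agreement (IOA).
The IOA is commonly used in model evaluation (Willmott et al., 2012). It ranges between -1 and +1, with values close to +1 representing better model performance. An IOA of 0.5 indicates that the sum of the error magnitudes is one-half of the sum of the perfect-model-deviation and observed-deviation magnitudes, that is, equal to the sum of the observed-deviation magnitudes when c = 2. IOA, with c = 2, is defined as:

$$
\mathrm{IOA} =
\begin{cases}
1 - \dfrac{\sum_{i=1}^{n} |M_i - O_i|}{c \sum_{i=1}^{n} |O_i - \bar{O}|}, & \text{if } \sum_{i=1}^{n} |M_i - O_i| \le c \sum_{i=1}^{n} |O_i - \bar{O}| \\[2ex]
\dfrac{c \sum_{i=1}^{n} |O_i - \bar{O}|}{\sum_{i=1}^{n} |M_i - O_i|} - 1, & \text{otherwise}
\end{cases}
$$

where \bar{O} is the mean of the observed values.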
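The three evaluation statistics in this section (FAC2, MB and IOA) can be sketched in Python as follows. This is an illustrative re-implementation based on our reading of the OpenAir definitions, not the OpenAir code itself (which is an R package). The function names are hypothetical, and skipping pairs with a zero observation in FAC2 is an assumption made here to avoid division by zero.

```python
import numpy as np


def fac2(obs, mod):
    """Fraction of modelled values within a factor of two of observations."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    valid = obs != 0  # assumption: skip pairs where the ratio is undefined
    ratio = mod[valid] / obs[valid]
    return float(np.mean((ratio >= 0.5) & (ratio <= 2.0)))


def mean_bias(obs, mod):
    """Mean of (modelled - observed); same units as the inputs."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    return float(np.mean(mod - obs))


def index_of_agreement(obs, mod, c=2.0):
    """Refined index of agreement (Willmott et al., 2012), range -1 to +1."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    err = np.sum(np.abs(mod - obs))                # sum of error magnitudes
    dev = c * np.sum(np.abs(obs - np.mean(obs)))   # scaled observed deviations
    if err <= dev:
        return float(1.0 - err / dev)
    return float(dev / err - 1.0)


# Small made-up example (not data from this study).
obs = [10.0, 12.0, 14.0, 16.0]
mod = [11.0, 12.0, 30.0, 4.0]
stats = (fac2(obs, mod), mean_bias(obs, mod), index_of_agreement(obs, mod))
```

In this made-up example two of the four modelled values fall within a factor of two of the observations, so FAC2 is 0.5, while the two large errors push the IOA below zero. The sketch does not handle the degenerate case where all observations are identical, for which the IOA denominator is zero.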

Figure S1. Diurnal fraction of seven activities used in WRF-Chem. Taken from Olivier et al. (2003).

Figure S4. Wind roses and diurnal cycles of temperature, relative humidity (RH), wind speed (ws) and planetary boundary layer height (PBLH). May -2018. Circles highlight the mean.

Figure S6. Analysis of the 111 model runs for the 2-4 pm period. Mean OC_ratio coloured by mean Total_OM (top) and mean Total_OM coloured by mean OC_ratio (bottom). The red line highlights the mean and SD of the AMS observations (O:C top and OA bottom). The mean AMS values are O:C = 0.67 and OA = 12.20 µg m^-3.

Figure S7. Relative variation (%) of the 5 anthropogenic PPE parameters (1-5) for the full period. Each pentagon represents the 5-D parameter space, and the positions of the dots connected with lines show the position of each parameter within its range for that specific ensemble member. The filled area within the dots represents the parameter space explored in each ensemble member. Anticlockwise from the top are the five anthropogenic parameters: VBS_AGERATE (P1), SVOC_VOLDIST (P2), SVOC_OXRATE (P3), IVOC_SC (P4) and SVOC_SC (P5). The values of the five parameters have been normalised by dividing by their respective maximum values, hence their values in this plot range from 0 to 1. The colour of the lines and dots represents the FAC2 values from the O:C analysis, and the fill colour represents the FAC2 values from the OA analysis. Red = 0-0.2, orange = 0.2-0.4, yellow = 0.4-0.6, green = 0.6-0.8, light blue = 0.8-0.9 and blue = 0.9-1.0.

Figure S9. Validation of the four tested emulators for Org concentrations. Circles are the original 81 runs. Squares with error bars in blue are the new 30 runs with low settings of the anthropogenic SVOC scaling parameter (which has led to low aerosol mass). Runs where the actual model output lies outside the 95% prediction interval of the emulator are shown in red.

Figure S10. Validation of the four tested emulators for O:C ratios. Circles are the original 81 runs. Squares with error bars in blue are the new 30 runs with low aerosol mass. Runs that are not within the 95% CI of the emulator prediction are shown in red.

Figure S12. Spread of the total O:C ratio for the 111 model runs vs the 10 parameters for the 2-4 pm period. Red = 20 VALIDATE runs. Black = 61 TRAIN runs. Blue = 30 new TRAIN runs.

Table S3: Example of a namelist.input file with parameters to control the VBS scheme.

Table S5: Evaluation of the ensemble of 111 model runs ordered from high to low FAC2 values with O:C and OA for the 2-4 pm period.
Table S6: Evaluation of the ensemble of 111 model runs ordered from high to low FAC2 values with O:C and OA for the full period.