Mapping the drivers of uncertainty in atmospheric selenium deposition with global sensitivity analysis

Abstract. An estimated 0.5–1 billion people globally have inadequate intakes of selenium (Se), due to a lack of bioavailable Se in agricultural soils. Deposition from the atmosphere, especially through precipitation, is an important source of Se to soils. However, very little is known about the atmospheric cycling of Se. It has therefore been difficult to predict how far Se travels in the atmosphere and where it deposits. To answer these questions, we have built the first global atmospheric Se model by implementing Se chemistry in an aerosol–chemistry–climate model, SOCOL-AER (modeling tools for studies of SOlar Climate Ozone Links – aerosol). In the model, we include information from the literature about the emissions, speciation, and chemical transformation of atmospheric Se. Natural processes and anthropogenic activities emit volatile Se compounds, which oxidize quickly and partition to the particulate phase. Our model tracks the transport and deposition of Se in seven gas-phase species and 41 aerosol tracers. However, there are large uncertainties associated with many of the model's input parameters. In order to identify which model uncertainties are the most important for understanding the atmospheric Se cycle, we conducted a global sensitivity analysis with 34 input parameters related to Se chemistry, Se emissions, and the interaction of Se with aerosols. In the first bottom-up estimate of its kind, we have calculated a median global atmospheric lifetime of 4.4 d (days), ranging from 2.9 to 6.4 d (2nd–98th percentile range) given the uncertainties of the input parameters. The uncertainty in the Se lifetime is mainly driven by the uncertainty in the carbonyl selenide (OCSe) oxidation rate and the lack of tropospheric aerosol species other than sulfate aerosols in SOCOL-AER. In contrast to uncertainties in Se lifetime, the uncertainty in deposition flux maps are governed by Se emission factors, with all four Se sources (volcanic, marine biosphere, terrestrial biosphere, and anthropogenic emissions) contributing equally to the uncertainty in deposition over agricultural areas. We evaluated the simulated Se wet deposition fluxes from SOCOL-AER with a compiled database of rainwater Se measurements, since wet deposition contributes around 80 % of total Se deposition. Despite difficulties in comparing a global, coarse-resolution model with local measurements from a range of time periods, past Se wet deposition measurements are within the range of the model's 2nd–98th percentiles at 79 % of background sites. This agreement validates the application of the SOCOL-AER model to identifying regions which are at risk of low atmospheric Se inputs. In order to constrain the uncertainty in Se deposition fluxes over agricultural soils, we should prioritize field campaigns measuring Se emissions, rather than laboratory measurements of Se rate constants.


implementation of Se chemistry in the SOCOL-AER model. The SOCOL-AER model is a suitable tool to model the Se cycle, since it successfully describes the major properties of the atmospheric S cycle (Sheng et al., 2015;Feinberg et al., 2019b).
The statistical methods that we use to conduct the sensitivity analysis are discussed in Sect. 3. Section 4 details the methods used to compile a database of measured wet Se deposition fluxes, which we use to evaluate the model. The results of the sensitivity analyses are presented in Sect. 5.1 for the atmospheric Se lifetime and Sect. 5.2 for the Se deposition patterns. 15 Section 5.3 illustrates the comparison between the compiled Se deposition measurements and simulated results. A discussion of both sensitivity analyses follows in Sect. 6, and the paper is concluded in Sect. 7.
2 Model Description 2.1 SOCOL-AER SOCOL-AERv2 is a global CCM that includes a coupled sulfate aerosol microphysical scheme. The base CCM, SOCOLv3 20 (Stenke et al., 2013), is a combination of the general circulation model ECHAM5 (Roeckner et al., 2003) and the chemical model MEZON (Egorova et al., 2003). The MEZON submodel comprises a comprehensive atmospheric chemistry scheme, with 89 gas-phase chemical species, 60 photolysis reactions, 239 gas-phase reactions, and 16 heterogeneous reactions. Chemical tracers are advected in the model using the Lin and Rood (1996) semi-Lagrangian method. Photolysis rates are calculated using a look-up table approach based on the simulated overhead ozone and oxygen columns. The MEZON model solves 25 the system of differential equations representing chemical reactions with a Newton-Raphson iterative method for short-lived chemical species and an Euler method for long-lived species. The sulfate aerosol model AER (Weisenstein et al., 1997) was first coupled to SOCOL by Sheng et al. (2015), and it was later updated by Feinberg et al. (2019b). SOCOL-AER includes gas-phase S chemistry and 40 sulfate aerosol tracers, ranging in dry radius size from 0.39 nm to 3.2 µm. SOCOL-AER simulates microphysical processes that affect the aerosol 30 size distribution, including binary homogeneous nucleation, condensation of H 2 SO 4 and H 2 O, coagulation, evaporation, and sedimentation. SOCOL-AER was extended in Feinberg et al. (2019b) to include interactive wet and dry deposition schemes.
The wet deposition scheme, based on Tost et al. (2006), calculates scavenging of gas-phase species depending on their Henry's law coefficients and aerosol species depending on the particle diameter. The wet removal of tracers is coupled to the grid cell simulated properties of clouds and precipitation. The dry deposition scheme is based on Kerkweg et al. (2006Kerkweg et al. ( , 2009, which uses the surface resistance approach of Wesely (1989). In addition to surface type and meteorology, the calculated dry 5 deposition velocities depend on reactivity and solubility for gas-phase compounds and size for aerosol species. SOCOL-AER uses an operator splitting approach, wherein the model time step is 2 hours for chemistry and radiation and 15 minutes for dynamics and deposition. Aerosol microphysical routines use a sub-time step of 6 minutes.
For the simulations in this study we use boundary conditions for the year 2000. Sea ice coverage and sea surface temperatures are prescribed from the Hadley Centre dataset (Rayner et al., 2003). Year 2000 concentrations of the most relevant greenhouse 10 gases (CO 2 , CH 4 , and N 2 O) derive from NOAA observations (Eyring et al., 2008). Anthropogenic CO and NO x emissions are based on the RETRO dataset (Schultz and Rast, 2007), while natural emissions are taken from Horowitz et al. (2003). Sulfur dioxide (SO 2 ) emissions from anthropogenic sources follow the year 2000 inventory from Lamarque et al. (2010);Smith et al. (2011). Volcanic degassing SO 2 emissions are assigned to surface grid boxes where volcanoes are located (Andres and Kasgnoc, 1998;Dentener et al., 2006). Atmospheric emissions of dimethyl sulfide (DMS) are calculated using a wind-based 15 parametrization (Nightingale et al., 2000) and a marine DMS climatology (Lana et al., 2011). To represent mean conditions for photolysis, the look-up table for photolysis rates is averaged over two solar cycles .

Selenium species overview
We included the Se cycle in SOCOL-AER ( Fig. 1) based on the existing literature on atmospheric Se (Ross, 1985;Wen and 20 Carignan, 2007). Seven Se gas-phase tracers have been added to SOCOL-AER (Table 1): carbonyl selenide (OCSe), thiocarbonyl selenide (CSSe), carbon diselenide (CSe 2 ), dimethyl selenide (DMSe), hydrogen selenide (H 2 Se), oxidized inorganic Se (OX_Se_IN), and oxidized organic Se (OX_Se_OR). The oxidized inorganic Se tracer represents species such as elemental Se, selenium dioxide (SeO 2 ), selenous acid (H 2 SeO 3 ), and selenic acid (H 2 SeO 4 ). Very little is known about the kinetics of interconversion between the oxidized inorganic Se species (Wen and Carignan, 2007), and therefore in our model they are treated as 25 one tracer. However, these species all have very low vapor pressures under atmospheric conditions (Rumble, 2017) and likely partition to the particulate phase. Oxidized organic Se species include dimethyl selenoxide (DMSeO) and methylseleninic acid (CH 3 SeO 2 H), which form after oxidation of DMSe (Atkinson et al., 1990;Rael et al., 1996). Similar to the oxidized inorganic Se compounds, oxidized organic Se species also partition to the particulate phase due to their low volatilities (Rael et al., 1996). 30 To determine which Se compounds are emitted by the different sources, we have reviewed studies that investigated the speciation of Se emissions. Thermodynamic modeling and in situ measurements of combustion exhaust gases have detected the Figure 1. Scheme of the atmospheric Se cycle in SOCOL-AER, based on information in Ross (1985); Wen and Carignan (2007). For simplicity the two oxidized Se tracers in SOCOL-AER are represented with a single box. The impact on agriculture and human health is also shown, since it motivates the study of atmospheric Se.
Atmospheric S emissions have been measured more extensively than Se emissions, so we scale inventories of S emissions to yield the spatial distribution of emitted Se. For the sensitivity analysis we assume that the Se emissions have the same distribution as S emissions, and we focus on the uncertainties in the global scaling factors for each source. The range in scaling 10 factors will be discussed in Sect. 3.1.4. The spatial distribution of anthropogenic and biomass burning Se emissions comes from the SO 2 inventory for the year 2000 (Lamarque et al., 2010;Smith et al., 2011). For sea surface DMSe concentrations, we scale the DMS climatology calculated by Lana et al. (2011), and calculate DMSe emissions using a wind-driven parametrization (Nightingale et al., 2000). The locations and strength of background degassing volcanic emissions are taken from the GEIA inventory (Andres and Kasgnoc, 1998;Dentener et al., 2006). Since little is known about both terrestrial biogenic Se emissions 15 and terrestrial S emissions (Pham et al., 1995), we assume that terrestrial Se is emitted in all land surface grid boxes, excluding glaciated locations.  (Rumble, 2017). In addition, there are reactions that are known to occur for the analogous S compound, but have never been studied for the Se compound (OCSe + OH, CSSe + OH, CSe 2 + OH, and H 2 Se + OH). These reaction rate constants were estimated in Fig. 2, which shows the ratio of the Se compounds' rates to analogous S rates (i.e., an enhancement factor for replacing an S atom with Se) plotted versus the S reaction rate. For S reactions which have a fast reaction rate (e.g., DMS + Cl, k = 1.8 × 10 −10 cm 3 molec −1 s −1 ), replacing S with Se does not yield a large difference in measured rates (DMSe + Cl, 10 k = 5.0 × 10 −10 cm 3 molec −1 s −1 ). This is because these reactions are already close to the collision-controlled limit, and thus lowering the activation energy by substituting a Se atom for S has little impact on the overall rate. On the other hand, slow reactions like DMS + O 3 are sped up by more than four orders of magnitude when Se is substituted for S. We used the log-log relationship in Fig. 2 to predict the reaction rates for OCSe, CSSe, and H 2 Se with OH (blue stars). The CSe 2 reaction with OH is calculated from the CSSe reaction, assuming a similar enhancement for the substitution of a second Se atom as between the 15 measured CSe 2 + O and CSSe + O reaction rates (Li et al., 2005). The branching ratio for the CSSe + OH reaction products was assumed to be 30% OCSe and 70% OX_Se_IN, the same as the measured CSSe + O branching ratio (Li et al., 2005).
We recognize that these estimates are inherently uncertain, and therefore address these uncertainties in our sensitivity analysis (Sect. 3.1.2).
The photolysis of gas-phase Se compounds was included using absorption cross sections of H 2 Se (Goodeve and Stein, 1931) 20 and OCSe (Finn and King, 1975). The absorption cross section of CSe 2 (King and Srikameswaran, 1969) has been measured, however in too low resolution to be incorporated into the model. Therefore, we assume that CSe 2 and CSSe have the same Table 2. Rate constants of Se compound gas-phase reactions at around 298 K and the corresponding rate constant of the analogous S compound. All S reaction rates are from Burkholder et al. (2015), except the DMS + O3 reaction rate which is taken from Wang et al. (2007). No corresponding rate for CSe2 reactions is listed, since CSe2 is obtained from doubly substituting Se in CS2.

Condensation of Se on preexisting aerosol particles
As nonvolatile species, oxidized inorganic and organic Se would condense on available atmospheric surfaces. In the SOCOL-AER model, the uptake of these oxidized Se species by sulfate aerosols is calculated similarly to the existing scheme of 5 gas-phase H 2 SO 4 uptake on sulfate particles (Sheng et al., 2015). We track the size distribution of Se in the aerosol phase with 40 tracers, one for each sulfate aerosol size bin. The sulfate aerosol size distribution changes through processes like growth, evaporation, and coagulation. We track how these microphysical processes change the size distribution of condensed the Se stored in that bin as gas-phase inorganic oxidized Se. Sedimentation and deposition of the host sulfate particles are concurrent with sedimentation and deposition of the condensed Se tracers. Gas-phase oxidized Se tracers are also removed by dry and wet deposition, with the assumption that they have the same Henry's law constant as gas-phase H 2 SO 4 .
One limitation of using SOCOL-AER for the Se cycle is that the model only includes online sulfate aerosols. This means that the transport of Se on other aerosols, including dust, sea salt, and organic aerosols, would be neglected. This may not be 5 a poor assumption, since Se and S are often co-emitted and have been found to be highly correlated in atmospheric aerosol measurements (e.g., Eldred, 1997;Weller et al., 2008). Nevertheless, we included a "dummy" aerosol tracer to test the effect of missing aerosol species in SOCOL-AER. The dummy aerosol tracer represents monodisperse particles that are emitted in a latitudinal band in the model and undergo Se uptake, sedimentation, and wet and dry deposition. This dummy aerosol tracer is clearly a simplification of true atmospheric processes, as in reality other aerosols are distributed in size and can coagulate with 10 sulfate aerosol particles. However, by varying the radius, location, and emission magnitude of these particles (Sect. 3.1.7), we can determine whether missing aerosols affect atmospheric Se cycling.

Statistical methods
To conduct the sensitivity analysis of our Se model, we first need to select the input parameters that would be included in the sensitivity analysis. The probability distributions of these input parameters' uncertainties were determined by reviewing 15 literature sources and using our best judgement. Variance-based sensitivity analysis methods usually require 10 4 to 10 6 model runs, which would be prohibitively expensive for the full SOCOL-AER model. We therefore replace the SOCOL-AER model with a surrogate PCE model. The SOCOL-AER model is run at carefully selected points within the parameter space, creating a set of "training" runs. The training runs are used to produce a surrogate PCE model, which approximates the outputs of the full SOCOL-AER model throughout the input parameter space. Sensitivity indices can then be derived from the surrogate model. All statistical methods presented in this section are available in UQLAB, an open-source MATLAB-based framework for uncertainty quantification (Marelli and Sudret, 2014).

Uncertainty ranges of input parameters
We restricted the scope of our sensitivity analysis to parameters that have been implemented in the model as part of the Se cycle. We neglect other model uncertainties, including those related to the wet deposition parameterization or the emissions of sulfate aerosol precursors. The focus of our sensitivity analysis is to prioritize which Se-related parameters should be further constrained. Since we do not vary all other model parameters, the uncertainties of output quantities may be underestimated. 10 However, given the large dimension of our parameter space with 34 Se-related input parameters, including additional non-Serelated parameters would be challenging. In the following section, we will discuss the uncertainty distributions for each of the 34 input parameters included in our study. Due to the lack of detailed information available in literature about the parameter distributions, we chose loguniform or uniform distributions for all but one of the parameters. This follows the conservative approach recommended by the Maximum Entropy Principle, as the uniform and loguniform distributions maximize entropy 15 while fulfilling the data constraints (Kapur, 1989). The uncertainty distributions of all input parameters are listed in Table 3.
3.1.1 Measured rate constants (k 1 -k 12 ) The Se reactions studied in the literature have each only been measured by one laboratory group (Table 2). Since only one measurement technique has been applied, the reported measurement uncertainties may underestimate the true uncertainties of these rate constants. To approximate an uncertainty distribution for these rate constants, we reviewed S compound reactions 20 that have been studied by multiple research groups. The reaction that had the largest spread in reported rate constants at ∼298 K was OCS + OH, which has been measured in six studies (Atkinson et al., 1978;Kurylo, 1978;Cox and Sheppard, 1980;Leu and Smith, 1981;Cheng and Lee, 1986;Wahner and Ravishankara, 1987;Burkholder et al., 2015). The measured reaction rate constant varied over multiple orders of magnitude; therefore, we calculated its variability on a logarithmic scale. The coefficient of variation (standard deviation divided by mean) of this reaction rate in logarithmic space was around 6%. We assumed that 25 this maximum S coefficient of variation would apply to the measured Se reaction rates. The bounds were calculated as 88% and 112% (±2 standard deviations) of the available measured rate constant in logarithmic space, i.e.: where k is the measured rate constant and "Bounds" are its upper and lower bounds, all expressed in cm 3 molec −1 s −1 . The maximum upper bound was set to 1.0 × 10 −9 cm 3 molec −1 s −1 , since at this order of magnitude the Se reaction rates are 30 collision-limited (Seinfeld and Pandis, 2016). Five Se reaction rate constants have not been measured previously in the laboratory and were estimated based on their relationship with analogous S rate constants (Fig. 2). We calculated the uncertainty bounds of estimated rate constants using the 95% error interval of prediction with a linear regression (Wackerly et al., 2014): where x is the logarithm (to the base 10) of the corresponding S rate constant,Ŷ is the logarithm of the predicted ratio of the Se rate to the S rate, n is the number of data points in the regression, t 0.025 is the 2.5 th percentile value of the Student's t distribution for n − 2 degrees of freedom,x is the mean of the logarithm of S rate constants in Fig. 2, SSE is the sum of squares of the residuals, and S xx is the variance of the S rate constants in Fig. 2. All rate constants are in units of cm 3 molec −1 s −1 .

Photolysis rate scaling factors
10 Uncertainties in our calculated Se photolysis rates arise from uncertainties in the measured OCSe and H 2 Se cross sections, the assumption that CSe 2 and CSSe have the same cross section as CS 2 , the quantum yields of photolysis reactions, and the look-up table approach that SOCOL-AER applies to calculate photolysis rates. Given the lack of specific information about these uncertainties, we apply a scaling factor on the calculated photolysis rates ranging from 0 to 2 (Table 3).

15
For the sensitivity analysis, we do not alter the spatial distribution of Se emissions from anthropogenic, marine biogenic, terrestrial biogenic, and volcanic sources (Fig. 3). The parameters that we varied are the scaling factors for each map, i.e., the global total emissions from each source. We reviewed atmospheric Se budget estimates to determine the range in total emissions for different sources. The estimates for global DMSe emissions ranged from the lower limit value of 0.4 Gg Se yr −1 in Nriagu (1989) to 35 Gg Se yr −1 in Amouroux et al. (2001). DMSe emissions are calculated online in the model from a marine DMSe 20 concentration map. Based on the results of a previous simulation, we normalize the marine DMSe concentration map in Fig. 3a so that it leads to 1 Gg Se yr −1 emissions globally. All other maps are also normalized to 1 Gg Se yr −1 emissions, so that we can directly apply the total global emissions as a scaling factor. The widest uncertainty range of terrestrial emissions was given by Nriagu (1989), from 0. 15-5.25 Gg Se yr −1 . Total global anthropogenic Se emissions in 1983 were estimated between 3.0 and 9.6 Gg Se yr −1 (Nriagu and Pacyna, 1988). We applied the same uncertainty range for the total emissions in the year  (Table S1). There is a high degree of variability in the emitted Se to S ratios between different volcanoes, and even temporally and spatially for the same volcano . The Se to S ratio in volcanic emissions ranges from 6 × 10 −6 for White Island, New Zealand  to 3.9 × 10 −3 for Merapi 30 Volcano, Indonesia (Symonds et al., 1987). By multiplying this range in ratios with the global mean total degassing SO 2 Emissions ( g Se yr -1 ) Figure 3. Spatial distribution of Se sources. Each source map is normalized so that 1 Gg Se yr −1 would be emitted globally. This is scaled during the sensitivity analysis.
emissions from Andres and Kasgnoc (1998) and Dentener et al. (2006), 12.6 Tg S yr −1 , we calculate an uncertainty range for total volcanic Se emissions, 0.076-49.1 Gg Se yr −1 . Loguniform distributions were used for the source types whose total emission ranges vary by more than 2 orders of magnitude (volcanic and marine biogenic), whereas uniform distributions were used for the terrestrial and anthropogenic Se emissions.

5
The available speciation information for Se emissions is largely qualitative; possible emission species have been identified but not quantified (Sect. 2.2.2). To estimate the uncertainty range for Se emission speciation, we use estimates of speciation from an atmospheric S budget (Watts, 2000). The second most important anthropogenic S species after SO 2 is H 2 S. Watts of total S emissions. Since OCSe, CSe 2 , CSSe, and H 2 Se are in general less stable than the analogous S species (Table 2), we consider 6% to be a maximum value for the mass fraction of anthropogenic Se emissions that come from each of these 13 species. The anthropogenic speciation fractions for OCSe, CSe 2 , CSSe, and H 2 Se are varied between 0 and 6%. The rest of the anthropogenic emissions, 76-100%, are attributed to OX_Se_IN, representing species such as Se and SeO 2 .
Regarding the speciation of volcanic S emissions, Watts (2000) estimates that 0.99 ± 0.88 Tg S yr −1 is in the form of H 2 S.
Comparing this to the estimate for volcanic SO 2 emissions, 12.6 Tg S yr −1 (Andres and Kasgnoc, 1998;Dentener et al., 2006), H 2 S contributes at most 13% of volcanic S emissions. Therefore, in our sensitivity analysis the percentage of volcanic 5 Se emissions in the form of H 2 Se ranges from 0-13%. Conversely, the percentage of OX_Se_IN in volcanic emissions ranges from 87-100%.

Accommodation coefficient of oxidized Se
The accommodation coefficient represents the probability that a gas-phase oxidized Se molecule will stick to an aerosol upon collision. No laboratory studies have investigated the accommodation coefficient of oxidized Se on aerosol surfaces. However, 10 review papers suggest that Se efficiently partitions to the aerosol phase upon oxidation (Mosher and Duce, 1987;Wen and Carignan, 2007), indicating a high accommodation coefficient. Due to the lack of further information, we assume an uncertainty range of 0.02-1 for the accommodation coefficient, as selected by Lee et al. (2011) for H 2 SO 4 . This accommodation coefficient applies to uptake of Se on sulfate and dummy aerosols.

15
SOCOL-AER only includes sulfate aerosol, lacking other aerosol types (e.g., dust, sea salt, organic aerosol, etc.) that may also transport Se. To test how these other aerosol types might affect Se transport and deposition, in the sensitivity analysis we vary the emission location, emission magnitude, and particle radius of the dummy aerosol tracer, defined in Sect. 2.2.4. These input parameters are different from other uncertainties, in that they are intended to investigate a lack in model completeness rather than uncertainties in measurable quantities. In our experiments the aerosols are emitted in one of 18 latitude bands, ranging from 20 90°-80°S to 80°-90°N. The latitude parameter can demonstrate whether Se deposition is affected by these missing aerosol types only in certain latitudinal bands. The emission radius affects the transport of Se on these missing particles, since the atmospheric lifetime of these particles depends on radius (Seinfeld and Pandis, 2016). For example, additional coarse particles (radius > 1 µm) could inhibit far-range Se transport, since they sediment and are removed quicker than the accumulation mode sulfate particles. We vary the emission radius between 0.01 and 3 µm, as this covers the range of atmospherically relevant 25 particle sizes.
To determine a reasonable range for the emission magnitude of additional aerosol particles, we analyzed the particle emission inventories from the AEROCOM I project (Dentener et al., 2006). Aerosol types are segregated into classes based on their effective radius, and the emission magnitude for 10°latitude bands is calculated for each size class. The mass emission of particles correlates with the particle size, with generally larger mass emissions for larger particle sizes (Fig. S1). Therefore, the 30 value of emission magnitude in our experiments depends lognormally on the particle radius (r), with mean µ = 2.5r + 19 and standard deviation σ = 2.88. We acknowledge that in the real atmosphere particles are emitted globally as size distributions, not monodisperse particles in one latitudinal band. Nevertheless, by including these input parameters as part of the sensitivity analysis, we can identify whether the lack of tropospheric aerosols other than sulfate impacts the simulated Se deposition maps in SOCOL-AER.

Experimental setup and model outputs
The creation of surrogate models requires a set of training runs with the full SOCOL-AER model. The values of the input parameters are varied simultaneously between training runs, so that interactions between parameters can also be detected.

5
A Latin hypercube design is used to draw N samples from an M -dimensional input parameter space. In Latin hypercube sampling, each parameter range is divided into N equally probable sub-intervals. N values for each parameter are drawn randomly from within each sub-interval. The selected values for all input parameters are then matched randomly with each other, to yield N points that cover the parameter space better than purely random Monte Carlo sampling (McKay et al., 1979).
The general rule of thumb is to select around 10M training simulations, to adequately cover the sample space (Loeppky et al., Se species. Since the model is run in T42 horizontal resolution, there are 8192 surface grid boxes representing geographical coordinates on the globe. We therefore have 8192 model outputs for Se deposition and we construct a PCE model and conduct a sensitivity analysis for each one. It can be argued that the computational cost can further be reduced by dimensionality reduction techniques (e.g., Sudret, 2011b, 2013;Ryan et al., 2018), like building the PCE models on a reduced 30 output set coming from a principal component analysis (PCA). We did not consider such an approach in this work because the cost of building the 8192 PCE models is marginal compared to that of evaluating the full SOCOL-AER model. In summary, the 400 SOCOL-AER training runs yield 400 values for atmospheric Se lifetimes and 400 × 8192 values for deposition fluxes.

Surrogate modeling with PCE
Sudret (2008) provided a detailed description of using PCE to build surrogate models, which was updated by Le Gratiet et al.
(2017). We will summarize the most important features of the approach here. For this study, a certain output of the SOCOL-AER model (Y ) can be thought of as a finite variance random variable which is a function of the M = 34 input parameters The input X ∈ R M is modeled by a joint probability density function (PDF) f X whose marginals are assumed independent, In polynomial chaos decomposition, the output variable Y is decomposed into the following infinite series (Ghanem and Spanos, 2003): where y α are coefficients to be determined and α is a multi-index set that defines the degree of the multivariate orthonormal The latter belongs to a family of polynomials that are orthogonal with respect to the marginal PDF f Xi . For example, univariate uniform probability distributions correspond to the family of Legendre polynomials and Gaussian probability distributions correspond to the family of Hermite polynomials (Xiu and Karniadakis, 2002). 15 Multivariate orthogonal polynomials can be constructed through multiplying the relevant univariate polynomials. The order of a polynomial term is defined as the total number of variables included in the polynomial term. The degree of a polynomial term represents the sum of the exponents of all variables appearing in the term. The number of coefficients to estimate grows exponentially with both the dimension and the degree. To allow calculation of the coefficients, the terms in Eq. 4 are truncated by restricting the maximum degree of the polynomials. Advanced truncation schemes can also be used to reduce the number 20 of terms and thus the computational budget. Here we consider hyperbolic truncation which removes high order interaction terms from the PCE, while retaining high degree polynomials of a single variable (Blatman and Sudret, 2011a). Hyperbolic truncation is based on selecting only the terms that satisfy: where q is a value selected between 0 and 1, and p is a value selected as the maximum degree of the PCE. Setting q = 1 25 corresponds to the standard truncation scheme where all terms with degree below p are selected. The lower the value of q, the more higher order interaction terms are removed from the PCE. In our case, we selected a q value of 0.75.
When the dimension is large, regression techniques that allow for sparsity, i.e., by forcing some coefficients to be zero, are favored. In this work, we consider least-angle regression as proposed in Blatman and Sudret (2011a) and follow the implementation in the UQLAB PCE module (Marelli et al., 2019). 5 The accuracy of the PCE in representing the full SOCOL-AER model is evaluated with a cross-validation metric named the leave-one-out (LOO) error, LOO (Blatman and Sudret, 2010a). The cross-validation approach removes the need for an independent validation dataset, saving computational expense. The leave-one-out procedure consists of creating a PCE using all but one of the training simulations, M PC\i . This PCE is then used to predict the output value of the model at the removed training simulation point, x (i) . The process is repeated for all N training points so that a residual sum of squares can be 10 calculated. This residual is then divided by the output variance in the training dataset, yielding the LOO error: whereμ Y is the sample mean of the model output in the training dataset. In practice, the LOO error is estimated using a single PCE model considering the entire experimental design (Marelli and Sudret, 2014;Blatman and Sudret, 2010b), to avoid the procedure of creating N PCE models. Equation 7 is evaluated as: where h i is the i th component of the vector defined by: and A is the experimental matrix whose components read:

20
where P ≡ cardA is the number of terms in the PCE. In our study, we use a degree-adaptive scheme to construct the PCE models sequentially from degree 1 to maximum degree 13. If the LOO error does not decrease over two steps in degree, the algorithm is stopped and the maximum degree is selected as the one with the lowest LOO error. This method reduces the risk of overfitting the training set. The maximum degree of 13 was selected because the PCE becomes too computationally expensive to calculate above this degree. In any case, almost all PCEs calculated in this study are below degree 10.

25
To improve the accuracy of the approximation, we applied post-processing steps to the construction of PCE models (Table   4). The global atmospheric Se lifetime is a function of the atmospheric Se burden and the total Se deposition. We found improved overall accuracy when separate PCE models were created for the atmospheric burden and total deposition, rather However, 873 of the 8192 grid boxes showed LOO errors higher than 0.05, which was selected as the acceptable threshold for this study. Improved LOO errors were achieved when the simulated deposition fluxes were log-transformed before constructing 5 the PCE model. However, after log-transformation the sensitivity indices for deposition cannot be extracted analytically from the PCE (Sect. 3.4), increasing computational expense. Therefore, we only log-transformed the data from these 873 grid boxes before creating their PCE models. Surrogate models for deposition fluxes in these 873 grid boxes are calculated by taking the exponential of the PCE model of log-transformed data.

Sensitivity analysis
10 A Sobol' sensitivity index represents the fraction of model variance caused by the parametric uncertainty of a certain input variable or the interaction between multiple variables (Sobol, 1993;Marelli et al., 2019). It is a global measure, meaning that the sensitivity index considers the entire parametric uncertainty space. Let us consider a model M whose inputs, defined over a domain D x , are assumed independent. It can be shown that it admits the following so-called Sobol' decomposition: 15 where M 0 is a constant and the other summands are defined such that their integrals with respect to any of their arguments is 0, i.e.: Following this decomposition, it can be shown that the variance of the random variable Y = M(X) can be cast as (Saltelli et al., 2008): where a partial variance with respect to several variables X i1 , . . . , X is can be calculated as: The Sobol' indices (S i1,...,is ) for a subset of input parameters are eventually obtained by normalizing the corresponding variance, i.e.: The It therefore becomes computationally demanding in our case with M = 34 input parameters to calculate sensitivity indices with order higher than 2. Furthermore, by the sparsity-of-effect principle (Goupy and Creighton, 2006), it is expected that the model is primarily driven by first order effects and low order interactions. We therefore only calculate individually the first and second order Sobol' indices, which is common practice in global sensitivity analysis.
The total Sobol' index (S T i ) summarizes all sensitivity indices which include the effect of a given input variable: In practice, it is possible to calculate the total Sobol' index without individually calculating all of the higher order indices (Marelli et al., 2019). The total Sobol' indices can be used to rank the influence of input variables on the variability of the model output. It should be noted that the sum of all total Sobol' indices will be greater than 1 if the model is non-additive, since interaction terms would be counted multiple times. 25 Other studies have emphasized the computational expense of conducting a sensitivity analysis for all grid boxes in a chemistry-climate model (Ryan et al., 2018). By using PCE as the surrogate model, we can reduce the computational expense since it is possible to compute the Sobol' sensitivity indices analytically. As shown by Sudret (2008), Sobol' sensitivity indices are a function of the calculated coefficients in the PCE model (Eq. 4). However, this is no longer possible when postprocessing steps are applied to the original PCE models to create surrogate models. For example: 1) the surrogate model for the atmospheric lifetime is calculated from two PCE models, one for total Se burden and one for total Se deposition; 2) Se deposition in 873 grid boxes is calculated using the exponential of a PCE model (see Table 4). In these cases, it is still possible to calculate the Sobol' indices through Monte Carlo estimation (Marelli et al., 2019). To accomplish this, the surrogate models 5 are sampled around 10 6 times, which remains tractable.
To summarize categories of input variables, we aggregate the Sobol' indices of several input parameters to yield a total Sobol' index for that category. For example, we summarize the total dummy aerosol effect by summing the total Sobol' indices of the dummy aerosol radius, emission magnitude, and latitude. It would anyways be difficult to separate the effects of the dummy aerosol input parameters since they are correlated inputs in the experimental design. The second order indices involving two 10 dummy aerosol input parameters may be double-counted with this method. However, since these indices are small (< 0.05) we do not expect a large error in the aggregated total Sobol' index (Sect. 5.1).

Resampling of surrogate models
In order to estimate distribution statistics (mean, standard deviation, quantiles) of the output variable, we resample each surrogate model 40 000 times. We also use these 40 000 samples to calculate relationships between input parameters and output 15 variables. To visualize marginal relationships between a certain input parameter and the output, we order samples in ascending values of that input parameter and calculate a moving average and standard deviation with window length 2000. This procedure yields smoothed univariate relationships between the model output and input parameters.

Compilation of Se wet deposition flux data
We decided to compare SOCOL-AER results with measurements of Se wet deposition, since wet deposition contributes an 20 estimated 80% of total deposition (Wen and Carignan, 2007). We conducted a systematic literature review to assemble a dataset of measured wet Se deposition fluxes, extending an earlier review from Conde and Sanz Alaejos (1997). We searched in Web of Science (Clarivate Analytics) for combinations of the following criteria: "Selenium", "Se", "rain", "precipitation", "wet deposition", "trace element". The last search was completed in July 2019, yielding a total of 672 search results. We screened these search results for studies that measured Se concentrations in rainwater for at least one month, neglecting studies that 25 measured bulk (wet and dry) deposition or that extrapolated wet deposition fluxes from aerosol measurements. The compiled dataset, which is available in the Supplement, consists of 29 papers and data from two measurement networks, the European Monitoring and Evaluation Programme (EMEP) and Canadian National Atmospheric Chemistry (NAtChem) database, for a total of 73 measurement sites (Table 5). From these studies, we extracted the annual mean Se deposition fluxes, if available, or we used rainwater volume-mean weighted Se concentrations combined with the annual precipitation depths to calculate the 30 deposition flux. If the paper did not provide the annual precipitation depth, we calculated the mean annual precipitation depth for the time period and location of the study from the historical Multi-Source Weighted-Ensemble Precipitation (MSWEP)    (Table 3), the uncertainty of the simulated atmospheric lifetime is less than one order of magnitude.
In order to identify the input parameters that drive the variability of the simulated Se lifetime, we apportion the variance into contributions from the most important parameters (Fig. 5). The most important variable is k 13 , which is the rate constant for the OCSe + OH reaction, followed by the dummy aerosol input parameters. Nonlinearities are also clearly important for the 10 global Se lifetime, since all first order terms only account for 53% of the variance in the Se lifetime, with interaction terms accounting for the other 47%. However, the interaction contribution is made up of many small individual interaction terms.
Only two interaction terms account for more than 5% of the total variance in the Se lifetime: the interaction between k 1 and k 13 (5.3% of variance) and the interaction between the dummy aerosol radius and dummy aerosol emissions (5.0% of variance).
Through resampling the surrogate model for the Se lifetime, we investigate the relationships between the Se lifetime and input 15 parameters in Figs. 6 and 7.
Several of the most influential input parameters for the Se lifetime are related to OCSe processes (Fig. 6a-c). Given the median estimates for the reaction rate constants in Table 2 rates of OCSe with OH (k 13 ) and O (k 1 ) lead to longer Se lifetimes. Since OH is more prevalent in the lower atmosphere than O, the dependence of the Se lifetime on k 13 is stronger than the dependence of the lifetime on k 1 . The influence of k 13 on the Se lifetime is mostly saturated above 10 −12 cm 3 molec −1 s −1 . Above this threshold in k 13 , the OCSe burden becomes a minor part of the total Se burden (accounting for less than 2% of total Se), and therefore the Se lifetime is not affected by OCSe processes. The second order interaction between k 13 and k 1 is also shown in Fig. 6b. When the reaction of OCSe with OH is fast 5 (k 13 > 10 −12.5 cm 3 molec −1 s −1 ), the value of k 1 is not important for the Se lifetime since almost all OCSe will react with OH. In cases when the OCSe + OH reaction is slow (k 13 < 10 −12.5 cm 3 molec −1 s −1 ), the value of k 1 has a stronger effect on the Se lifetime since not all OCSe has reacted with OH. The Se lifetime increases for higher fractions of anthropogenic Se emitted as OCSe, again showing that higher OCSe burdens prolong the atmospheric Se lifetime. OCSe has been associated with anthropogenic emissions by only one study, which inferred its presence from a similarity in boiling point with a detected 10 Se species (Pavageau et al., 2002). OCSe has never been identified in the ambient atmosphere; on the other hand, SOCOL-AER predicts maximum OCSe concentrations of sub-ppt levels, which would be difficult to measure analytically. Still, processes related to OCSe, a highly speculative atmospheric Se compound, contribute to the model's variability in the Se lifetime.
After OCSe, the most important input parameters for the atmospheric Se lifetime are related to dummy aerosols. It is difficult to interpret the univariate effects of aerosol emissions and radius (Fig. 7a, b), since by design the two inputs are correlated. which when divided by dummy aerosol radius yields a metric that is proportional to the emitted surface area (r 2 ). The Se lifetime generally increases with the surface area of emitted dummy aerosols, although the response is not linear (Fig. 7c). We compare this response with the range of available surface area of sulfate aerosol for 10 • latitude bands, shaded in yellow. For lower dummy aerosol surface areas, the available sulfate aerosol dominates the uptake of gas-phase Se and additional dummy However, the effect of dummy aerosols is not drastic, only increasing the mean Se lifetime from 4.3 to 4.8 d (Fig. 7c).

The other inputs have minor impacts on the global Se lifetime. Stronger emissions of volcanic Se lead to shorter overall global
Se lifetimes, while emissions of marine and terrestrial biogenic Se lead to longer Se lifetimes (Fig. 6d-f). Biogenic sources emit 5 only DMSe in SOCOL-AER, which is not removed by wet and dry deposition, while volcanic sources emit mainly oxidized Se species, which can be removed by wet and dry deposition. Biogenic emissions must first be oxidized before deposition can occur, which can prolong the Se lifetime. The influence of the two other terms with S T i > 0.05, the reaction rate constant of CSSe + OH (k 15 ) and the fraction of volcanic emissions emitted as H 2 Se, is mainly through interaction terms, and therefore the Se lifetime responds only weakly to univariate variations in these variables (Fig. 6g, h). 10

Spatial patterns of Se deposition
Surrogate models for Se deposition in all surface grid boxes were calculated according to Sect. 3.3. After log-transforming Se deposition in 873 grid boxes, the LOO error is below 0.1 in all grid boxes, but still above 0.05 in 354 grid boxes, mainly in polar regions (Fig. S2). However, due to limitations in computational time to run more training runs, we proceeded with the sensitivity analyses of all grid boxes. By resampling the surrogate models, we calculate maps of mean and coefficient of 15 variation of Se deposition over all the input uncertainties (Fig. 8). Selenium deposition is highest in areas close to anthropogenic and volcanic emissions (Fig. 3c, d), which are point sources. The deposition pattern is clearly affected by precipitation: dry areas (e.g., eastern portion of ocean basins, polar regions, and the Sahara Desert) show the lowest Se deposition fluxes globally. Other interesting features of the Se deposition patterns, for example identifying the regions influenced by long-range transport, will be investigated in upcoming studies. In this study, we focus on determining the most important sources of uncertainty to the simulated deposition maps. The relative uncertainty in simulated deposition, illustrated by the coefficient of variation (Fig. 8b), is highest in areas affected by marine and volcanic emissions because these emission sources have wider uncertainty ranges compared to anthropogenic and terrestrial emissions (Table 3). In the following global sensitivity analysis we can identify the 5 input factors that contribute to the variation in Se deposition in each grid box.
Figures 9 and 10 illustrate the spatial variation in the importance of input parameters. We chose example grid boxes (indicated by blue circles) to illustrate the marginal relationships between Se deposition and the input parameters. Overall, the most important input parameters are the total emissions from each source. In the example grid boxes shown in Fig. 9a-h, Se deposition increases linearly with increasing the emissions from the different source categories. The linear relationship is logical since 10 deposition balances emission in the steady state. In areas that are more remote from emission regions (Sahara, Antarctic, and Arctic), other factors become more important but are still minor compared to the emission inputs. The aerosol accommodation coefficient affects areas where the precipitation is very low, for example in the Saharan Desert (Fig. 9i). In these dry regions, total Se deposition is dominated by dry deposition. When the model is run with low accommodation coefficients, less oxidized Se partitions to the particulate phase and more remains in the gas phase. Dry deposition of particles in the 0.1-1 µm diameter 15 size range, within the range of sulfate and dummy aerosols, is slower than gas compounds due to the slower Brownian motion of particles (Seinfeld and Pandis, 2016). Chemical reaction rate constants, specifically the reaction rates of DMSe, impact Se deposition in polar regions. Slower reaction rates of DMSe with OH (k 6 ) and O 3 (k 8 ) enhance deposition over the example Antarctic grid box (Fig. 10a-c). Longer DMSe lifetimes allow more marine Se emissions to reach polar regions, which have little local Se emission sources. The chemical rate coefficients are more important in the Antarctic than the Arctic since there 20 is more O 3 and OH in the Northern Hemisphere than the Southern Hemisphere, meaning that the DMSe lifetime is longer in A blue circle indicates the grid box where the total Sobol' index is maximum. The relationship between total Se deposition and the input parameters in that grid box, calculated by resampling the surrogate model for deposition (right column). Note that the magnitude of deposition (y-axis) varies in each plot, depending on the grid box shown. the Southern Hemisphere than the Northern Hemisphere. Dummy aerosol parameters are only important for Se deposition in the Arctic (Fig. 10d-f). We calculate the relationship between Se deposition and dummy aerosols using the same quantity for emitted surface area of the dummy aerosols as in Fig. 7c. With increasing dummy aerosol surface area, the Se deposition in the Arctic increases, but only after surpassing the threshold of the available sulfate aerosol surface area in that latitude band (Fig. 10e). The dummy aerosols also have a stronger effect on Se deposition when they are emitted in a latitude band closer to the example grid box at 76.7°N (Fig. 10f). Attachment of oxidized Se to dummy aerosols increases the overall lifetime of Se (Sect. 5.1), leading to enhanced transport of Se to the Arctic region. The transport of Se on dummy aerosols does not lead to higher deposition in the Antarctic, perhaps because wet deposition in the Antarctic circumpolar storm track impedes the transport of aerosol poleward.
It is also important to note which input parameters do not influence Se deposition in any of the grid boxes. Variations in the 5 speciation of emissions, photolysis rates, and 15 of the Se reaction rate constants have a negligible influence on deposition.
Although several other parameters may play a role in certain grid boxes, the emission parameters are most important on the global scale, evidenced by their higher mean total Sobol' index ( Fig. 11). Figure 9 illustrates which regions are affected by different emission sources. Variations of marine emissions impact the most grid boxes; however, their influence is mainly confined to the oceans, coastal areas, and Southern Hemispheric continents. Since the motivation of studying Se deposition 10 is to understand its impact on agricultural soils, we also calculated the mean influence of parameters in pasture and cropland areas (Fig. 11), using maps from Ramankutty et al. (2008). The importance of anthropogenic emissions increases when looking only at pasture and cropland areas, since agricultural areas coincide with human settlement. All four emission parameters show similar levels of importance for agricultural regions. Therefore, further work in understanding any of these emission processes would be valuable to reducing the uncertainty in deposition fluxes.

Comparison with deposition flux measurements
With surrogate models of wet Se deposition, we can estimate the modeled wet Se deposition throughout the parametric uncertainty space. Comparisons of these modeled distributions of wet deposition with observed values could help to constrain the input parameters to which deposition is sensitive. However, we do not attempt at this stage to calibrate the parameters to existing measurements because of several challenges in comparing the compiled measurement dataset with our simulations. Se 5 is a difficult element to measure at environmental concentrations, which might lead to inaccurate reported deposition fluxes.
The most popular analytical method is inductively coupled plasma mass spectrometry (ICP-MS), which was used for 58 of the 73 sites in the database. However, it is difficult to measure Se with ICP-MS due to the low ionization of Se, the Se signal being split on the five stable isotopes, and especially mass interferences (Winkel et al., 2012). Several studies reported that Se concentrations in rainwater samples were often below the detection limit of the analytical method (e.g., Arimoto et al., 10 1987;Gratz et al., 2013). Unfortunately, other studies often do not explicitly report the detection limit, the fraction of samples under the detection limit, and how these samples are treated statistically. An additional issue is that many of the measurement sites were located in urban locations close to point-source emissions. Due its coarse resolution (   With the results from the deposition sensitivity analysis (Fig. 9), we categorize each measurement location by the input parameter that predominates the uncertainty in modeled deposition, indicated by the symbol in Fig. 12. The compiled data and measurements show good agreement at the lower end of deposition values, where the measurement sites are more remote from point-source emissions. The agreement worsens at higher values of observed Se deposition, which correspond to more urban measurement sites. As discussed before, it is not surprising that the model has difficulty matching the measurements for

6 Discussion
Through our consideration of model uncertainties related to Se cycling, we derived a median atmospheric Se lifetime of 4.4 d. This is the first bottom-up estimate for Se made with a mechanistic global atmospheric model. Our estimate for the Se lifetime matches the global arsenic lifetime calculated in the global model of Wai et al. (2016), namely 4.5 d. This agreement is due to both elements attaching to submicrometer aerosol particles; therefore, their lifetime is determined by the lifetime 5 of these particles. Since Se and arsenic have similar atmospheric sources as well (e.g., metal smelting, coal combustion, volcanoes), it may be possible to draw analogies between their atmospheric cycles. The range of previously estimated Se lifetimes from global atmospheric budgets is between 0.8 and 6 d, similar to our result (Ross, 1985;Mosher and Duce, 1987).
The recent value from Mason et al. (2018) of a 0.15 yr (55 d) Se lifetime seems overestimated compared to our results and past budgets, especially since Mason et al. (2018) only consider gas-phase Se in their model, which tends to be shorter lived 10 in the atmosphere than aerosol-bound Se. According to our sensitivity analysis results, the atmospheric Se lifetime could be further constrained by measuring the OCSe + OH reaction rates, and in general knowing more about whether OCSe is present in the atmosphere. Since dummy aerosols also impact the Se lifetime in our model, implementing a more complex tropospheric aerosol parametrization in SOCOL-AER would also further constrain the atmospheric lifetime of Se. However, since the main interest in Se is its atmospheric input to agricultural soils, it may be a higher priority to constrain the input parameters that 15 affect the deposition of Se in agricultural regions rather than the Se lifetime The results of the sensitivity analyses raise an obvious question: why do the input parameters that influence the atmospheric Se lifetime not appear as important for the Se deposition fluxes? One would expect that Se deposition fluxes close to areas of high emissions would be dominated by the magnitude of these emissions. One would also expect that, if anywhere, variation in the Se lifetime would play a role over remote regions, where the amount of locally emitted Se is low and thus the amount that 20 can be transported from emission regions has a larger effect on deposition. However, the range in the atmospheric Se lifetime in our simulations is relatively narrow, between 2.9 and 6.4 d if we consider the 2 nd percentile and 98 th percentile bounds (Fig.   4). On the other hand, emissions of various Se species can vary by orders of magnitude (Table 3). These larger variations in the amount of emitted Se have a larger impact on deposition than smaller variations in the Se lifetime, even in many remote places. Only in extremely remote areas, for example in the Arctic, do some of the parameters that affect the Se lifetime show 25 up as important, like the dummy aerosols. Parameters with regional rather than global importance for the Se lifetime, like the DMSe reaction parameters, impact deposition of Se in the Antarctic by controlling the amount of transported Se. It is not surprising that the parameter that has the largest impact on lifetime, the OCSe + OH reaction rate constant, has little impact on deposition fluxes, since emissions of OCSe are assumed to be a minor flux of Se (maximum 6% of the anthropogenic emissions flux). Like all sensitivity analyses, the results are dependent on the choice of uncertainty ranges for the different parameters; 30 if we had selected narrower uncertainties for the Se emission sources, the uncertainties of parameters that affect Se lifetime (e.g., chemical reaction rates, dummy aerosols, etc.) may have been more important in remote regions. However, the choice of wide uncertainty ranges for the Se emissions is justified, given the variability in natural emission processes and a lack of field campaigns assessing Se emission fluxes (Sect. 3.1.4). The different results for the two types of sensitivity analyses (lifetime and deposition fluxes) highlight that the "important" parameters to constrain depend on the choice of research question.
The global sensitivity analyses in this paper provides clear next steps for atmospheric Se research. The magnitude and spatial distribution of Se emissions remain the most important uncertainty to constrain, in order to improve the predictions of Se deposition patterns. Further investigations of chemical reactivity of Se species or the speciation of emissions are a lower 5 priority, although measuring the speciation of emissions can give mechanistic insights into emission processes. The emission uncertainties could be constrained by conducting field campaigns that either measure emission fluxes of Se close to sources (e.g., Amouroux et al., 2001) or separate Se source contributions at an ambient measurement site through trajectory modeling and/or speciation measurements (e.g., Suess et al., 2019). Our model results can help identify ambient locations that would be interesting to study for field campaigns, by mapping the contribution of the Se emission sources to deposition in different 10 regions (Fig. 9). We can also make use of already collected data to evaluate estimates of emission fluxes by compiling and reanalyzing past measurements from the literature, as was done in this paper for rainfall Se measurements. Bayesian inverse modeling techniques (e.g., Stohl et al., 2009) could be employed in conjunction with the SOCOL-AER Se model to provide posterior estimates for Se emission fluxes. Global sensitivity analysis is an invaluable first step before such model calibration techniques, since the parameter dimensionality can be reduced by neglecting non-influential parameters. As shown in Sect. 15 5.3, the heterogeneity of compiled literature data represents a challenge to comparing models and measurements. Therefore, standardized measurement techniques and adequate reporting of sampling, analytical, and post-processing methods are required so that the model is not calibrated to an errant measurement.

Conclusions
Now that it includes Se cycling, the SOCOL-AER model can be used to predict Se transport and deposition globally. We 20 created surrogate PCE-based models that are able to predict the output of the model throughout the uncertainty space of the input parameters. With these surrogate models, we determined that the atmospheric Se lifetime is around 4.4 d, similar to the lifetime of submicron aerosol particles in the atmosphere. Assuming that longitudinal wind speeds are around 10 m s −1 (Jacob, 1999), the likely Se lifetime range of 2.9-6.4 d corresponds to a distance of 2500-5000 km that Se is transported in the atmosphere. The global sensitivity analysis of Se deposition fluxes shows that reducing uncertainties in Se emissions would 25 lead to the biggest reductions in the uncertainty of deposition maps. Field measurements that elucidate and quantify Se emission processes should be prioritized, so that model predictions of Se deposition maps can be improved. Available measurements of Se in rainwater are within the likely range of model results at 79% of background sites; remaining discrepancies may be due to the time period of the simulations in this study, the coarse resolution of the model, and analytical challenges leading to measurement inaccuracies. In a future study, SOCOL-AER can be applied to different time periods to investigate how Se 30 deposition has changed due to variations in anthropogenic emissions and climate.
Code and data availability. The SOCOL-AER code is available upon request from the authors, after users have signed the ECHAM5 license agreement http://www.mpimet.mpg.de/en/science/models/license/. The relevant simulation data, along with the experimental design of the training runs, is available at: https://doi.org/10.3929/ethz-b-000357105 (Feinberg et al., 2019a). The compiled Se precipitation database is available in the Supplement. The NAtChem precipitation database is available online at: http://donnees.ec.gc.ca/data/air/monitor/ monitoring-of-atmospheric-precipitation-chemistry/metals-in-precipitation/. Selenium in precipitation measured by EMEP was extracted 5 from annual reports on heavy metals and POP measurements available at: https://projects.nilu.no//ccc/reports.html. UQLAB is freely available for Academic and degree-granting institutions by registering under https://www.uqlab.com/register. All of the scientific source code is available under the BSD 3-clause license and can be downloaded from: https://www.uqlab.com/obtain-the-sources.
Author contributions. AS, TP, and LW initiated the project of studying the atmospheric Se cycle using a chemistry-climate model. AF implemented Se into SOCOL-AER with assistance from AS, conducted all simulations, analyzed the results, and wrote the paper with contributions from all authors during the revision process. AF was directly supervised by AS, TP, and LW during the project. MM and BS guided the statistical analysis during the project. MM developed the scripts to conduct the global sensitivity analysis on the Se model and aided with writing the statistical methods section.
Competing interests. The authors declare that they have no conflict of interest. Environmental Conservation. Thanks to A. Meharg for providing additional data of Se in precipitation. We acknowledge all scientists who contributed to measurements that were compiled into the Se precipitation datasets. Thanks to the developers of the UQLAB for providing the software used in this study. Thanks to T. Meschini for the design of the Se cycle schematic. Color tables from ColorBrewer 2.0 (www. colorbrewer2.org) were used for the figures in this paper.

S4
S3 Leave-one-out error of PCE 0 0.02 0.04 0.06 0.08 0.1 Leave-one-out error of surrogate model Figure S2: Leave-one-out error of PCE models used to represent the total Se deposition fluxes from SOCOL-AER. The colorbar is chosen to highlight values that are above 0.05, which are shown in shades of red.