New particle formation from sulfuric acid and ammonia : nucleation and growth model based on thermodynamics derived from CLOUD measurements for a wide range of conditions

Understanding new particle formation and growth is important because of the strong impact of these processes on climate and air quality. Measurements to elucidate the main new particle formation mechanisms are essential; however, these mechanisms have to be implemented in models to estimate their impact on the regional and global scale. Parameterizations are computationally cheap ways of implementing nucleation schemes in models, but they have their limitations, as they do not necessarily include all relevant parameters. Process models using sophisticated nucleation schemes can be useful for the generation of look-up tables in largescale models or for the analysis of individual new particle formation events. In addition, some other important properties can be derived from a process model that implicitly calculates the evolution of the full aerosol size distribution, e.g., the particle growth rates. Within this study, a model (SANTIAGO – Sulfuric acid Ammonia NucleaTIon And GrOwth model) is constructed that simulates new particle formation starting from the monomer of sulfuric acid up to a particle size of several hundred nanometers. The smallest sulfuric acid clusters containing one to four acid molecules and a varying amount of base (ammonia) are allowed to evaporate in the model, whereas growth beyond the pentamer (five sulfuric acid molecules) is assumed to be entirely collisioncontrolled. The main goal of the present study is to derive appropriate thermodynamic data needed to calculate the cluster evaporation rates as a function of temperature. These data are derived numerically from CLOUD (Cosmics Leaving OUtdoor Droplets) chamber new particle formation rates for neutral sulfuric acid–water–ammonia nucleation at temperatures between 208 and 292 K. The numeric methods include an optimization scheme to derive the best estimates for the thermodynamic data (dH and dS) and a Monte Carlo method to derive their probability density functions. The derived data are compared to literature values. Using different data sets for dH and dS in SANTIAGO detailed comparison between model results and measured CLOUD new particle formation rates is discussed.


Introduction
The formation of new aerosol particles from the gas phase (nucleation) is the most important source of cloud condensation nuclei (CCN) in the free and upper troposphere (Dunne et al., 2016;Gordon et al., 2017).Binary new particle formation (NPF) from sulfuric acid and water is thought to be an important mechanism at cold conditions that can be enhanced by ions (Lee et al., 2003;Kirkby et al., 2011;Duplissy et al., 2016).The ternary system involving ammonia besides sulfuric acid and water can yield significantly enhanced NPF rates (Ball et al., 1999;Benson et al., 2009;Glasoe et al., 2015;Kirkby et al., 2011;Kürten et al., 2016).The addition of only a few parts per trillion by volume of ammonia can increase NPF rates by several orders of magnitude compared with the pure binary system (Kürten et al., 2016).The importance of ammonia in terms of NPF is highlighted by recent modeling studies, where a large fraction of CCN originates from ternary H 2 SO 4 -H 2 O-NH 3 nucleation (Dunne et al., 2016;Gordon et al., 2017).The detection of ammonia above several parts per trillion by volume in the upper troposphere by recent satellite measurements supports these findings (Höpfner et al., 2016).Furthermore, an aircraft campaign up to ∼ 5 km altitude measured elevated NH 3 Published by Copernicus Publications on behalf of the European Geosciences Union.concentrations over Texas (Nowak et al., 2010).Therefore, it is likely that ammonia plays an important role in new particle formation in the free troposphere.An expected future increase in the anthropogenic ammonia emissions could even increase the significance of ammonia in terms of NPF (Clarisse et al., 2009).
At cold conditions, NPF from H 2 SO 4 -H 2 O-NH 3 is efficient enough to explain NPF at atmospherically relevant concentrations of sulfuric acid and ammonia (Kirkby et al., 2011;Dunne et al., 2016;Kürten et al., 2016).However, the involvement of ammonia in the formation of new particles at the relatively warm conditions close to the surface is not clear yet.A recent study indicates that ion-induced ternary nucleation can explain some new particle formation events in the boreal forest in Finland (Yan et al., 2018); evidence that NH 3 is important in polluted boundary layer environments has been presented earlier (Chen et al., 2012).Most recently, Jokinen et al. (2018) showed that ion-induced ternary nucleation is important in coastal Antarctica.The importance of ammonia in enhancing boundary layer nucleation in the presence of highly oxygenated molecules (HOMs) from monoterpenes and sulfuric acid has recently been described (Lehtipalo et al., 2018).
In order to model nucleation, knowledge about cluster evaporation rates is required.This can either be gained by measurements in a flow tube (Hanson and Eisele, 2002;Jen et al., 2014Jen et al., , 2016;;Hanson et al., 2017) or in a chamber such as CLOUD (Cosmics Leaving OUtdoor Droplets; Kürten et al., 2015a).Another possibility is to apply quantum chemical (QC) calculations (Kurtén et al., 2007;Nadykto and Yu, 2007;Ortega et al., 2012;Elm et al., 2013;Elm and Kristensen, 2017;Yu et al., 2018).Comparison between experimental data measured at the CLOUD chamber and modeled formation rates using the ACDC (Atmospheric Cluster Dynamics Code) model (McGrath et al., 2012) with evaporation rates from quantum chemistry (Ortega et al., 2012) yielded good agreement for some conditions (208 and 223 K).For higher temperatures (≥ 248 K), the model generally overestimated the formation rates up to several orders of magnitude.A more recently developed nucleation model, also relying on evaporation rates from QC calculations, yields good agreement with the CLOUD data for some conditions (Yu et al., 2018).
For the global modeling studies by Dunne et al. (2016) and Gordon et al. (2017), CLOUD data have been parameterized to yield nucleation rates for four different channels (binary neutral and ion-induced and ternary neutral and ioninduced).The parameterization works well and describes the nucleation rates over a wide range of conditions (Dunne et al., 2016), but it also has its limitations.First, it does not give any insight into the stability of individual sulfuric acidammonia clusters.Second, the influence of other parameters on nucleation (e.g., the condensation sink) cannot be tested, while the model by Yu et al. (2018) considers the effect of the condensation sink on the nucleation rate.Third, the parame-terization provides only the nucleation rate, while a full nucleation model utilizing size bins over a wide diameter range can also yield the particle growth rates (Li and McMurry, 2018).
In the present study a model covering the aerosol size distribution over a wide size range, i.e., from the monomer of sulfuric acid up to several hundred nanometers, is constructed.The model simulates acid-base nucleation and considers evaporation rates for the clusters containing one to four sulfuric acid molecules and variable number of base molecules.The model allows for calculating new particle formation and growth rates at different sizes and considers sinks like wall loss, dilution and coagulation.SANTIAGO (Sulfuric acid Ammonia NucleaTIon And GrOwth model) is an extension of a previous simpler model version used to simulate acid-base nucleation involving dimethylamine (Kürten et al., 2014(Kürten et al., , 2018)).The model extension in the present study is a prerequisite for the main goal of deriving the thermochemical parameters (dH and dS) for the sulfuric acid-ammonia system from CLOUD chamber data (Dunne et al., 2016;Kürten et al., 2016).The data cover electrically neutral conditions for the clusters up to the tetramer (containing four sulfuric acid molecules and up to four ammonia molecules).First, a model has been developed that uses molecular and geometric size bins to cover a wide particle size range (starting with the monomer of sulfuric acid).Second, two numeric algorithms yield a best fit for the dH and dS values and their probability density functions (pdf's).The pdf's are obtained by using a Monte Carlo method introduced by Kupiainen-Määttä (2016).In total, CLOUD data from 125 experiments are considered; these cover the range from 208 to 292 K and a wide range of atmospherically relevant sulfuric acid and ammonia concentrations.The results of the model are compared to the measured CLOUD data, and further comparison regarding the thermochemical data from literature (Ortega et al., 2012;Hanson et al., 2017;Yu et al., 2018) is presented.

Methods
The aim of the present study is to find values for dH and dS of selected clusters (11 different clusters) such that modeled new particle formation (NPF) rates represent measured NPF rates from the CLOUD experiment with a small error.In order to perform this task, a model has been developed that calculates the NPF rates based on given concentrations of sulfuric acid and ammonia, relative humidity (RH), and temperature (T ; Sect.2.2).The data set from Kürten et al. (2016) for 125 neutral NPF rates is used to derive dH and dS.A best-fit thermodynamic data set is obtained by using an optimization method (Sect.2.4).Moreover, the distributions of the probability density functions for each cluster are explored with a Monte Carlo method (Kupiainen-Määttä, 2016, and Sect. 2.5).The thermodynamic parameters (enthalpy change dH and entropy change dS due to the addition or removal of a molecule) are required in order to obtain the evaporation rate of a cluster.The mathematical relationship between dH , dS, and the evaporation rate are provided in the supplementary information (Supplement Sect.S2).

Experimental data from the CLOUD experiment
The experimental data used to develop the model were taken at the CLOUD chamber at CERN (European Organization for Nuclear Research).The 26.1 m 3 stainless steel chamber allows for conducting nucleation and growth experiments under atmospherically relevant conditions regarding the trace gas concentrations, temperature, relative humidity, and ion concentrations (Kirkby et al., 2011).The chamber and the results for different chemical systems have been described elsewhere in the literature (e.g., Kirkby et al. 2011;Almeida et al., 2013;Duplissy et al., 2016).In the present study no new data are presented from CLOUD; instead the data from the Dunne et al. (2016) and Kürten et al. (2016) studies are used.Whereas in the previous publications the influence of the ion concentration on nucleation was also discussed, this study focuses on neutral nucleation only.The parameter space covers temperatures between 208 and 292 K (five different temperatures) and a wide range of atmospherically relevant sulfuric acid and ammonia concentrations.No systematic investigation of the relative humidity was carried out; for most experiments, the relative humidity was at 38 %.The new particle formation rates are reported for a mobility diameter of 1.7 nm (1.4 nm geometric diameter; see Ku and Fernandez de la Mora, 2009).All data shown in the figures of the present study are available for download (see Kürten, 2019).

Acid-base model
The model used in the present study solves a set of differential equations describing the concentrations of clusters and particles (McMurry, 1980;Kürten et al., 2014Kürten et al., , 2015aKürten et al., , 2018;;McMurry and Li, 2017).The model from Kürten et al. (2018) describes nucleation for the system of sulfuric acid and dimethylamine, where the formed clusters are stable against evaporation at a temperature of 278 K.For this reason, the sulfuric acid-dimethylamine system can be treated as quasi-unary, and the kinetic approach (all cluster evaporation rates equal zero) yields very good agreement between modeled and measured particle concentrations and formation rates over a wide range of particle diameters.The model treats the smallest clusters in molecular size bins, based on the number of sulfuric acid molecules in a cluster, while geometric size bins are used for the larger particles (Kürten et al., 2018).In the present study 12 molecular bins and 25 geometric bins with a geometric growth factor of 1.25 result in a maximum particle diameter of 295 nm.Choosing a larger number of bins and/or geometric factor would result in a larger upper size limit, which was, however, not necessary in the present study.Compared with the earlier study by Kürten et al. (2018) the number of bins is reduced in order to reduce computation time; the simulation of one new particle formation event (several hours of nucleation) takes ∼ 0.1 s on a personal computer with a 3.4 GHz processor.
While the approach of using a quasi-unary system with zero evaporation works well for sulfuric aciddimethylamine, this assumption cannot be used for sulfuric acid and ammonia because some small clusters evaporate rapidly (Nadykto and Yu, 2007;Ortega et al., 2012;Jen et al., 2014).In the following, the number of sulfuric acid molecules denotes the clusters as monomers (one sulfuric acid), dimers (two sulfuric acids), trimers (three sulfuric acids), etc.The clusters from the monomer to the tetramer can contain different numbers of ammonia molecules, where the maximum number of ammonia molecules is not allowed to exceed the number of acid molecules.The assumption that no clusters are allowed that contain more base than acid is based on fast evaporation rates that have been found for such clusters from quantum chemical calculations (Schobesberger et al., 2015;Elm et al., 2017;Yu et al., 2018); the assumption is further supported by mass spectrometric measurements that could not identify such clusters (Kirkby et al., 2011;Schobesberger et al., 2015).This results in the acidbase reaction scheme shown in Fig. 1, where A 1 denotes the sulfuric acid monomer concentration and B 1 the ammonia concentration.For the larger clusters and particles (starting with the pentamer), no differentiation regarding the base content is applied.The full set of differential equations used in SANTIAGO is listed in Sect.S1.Compared with its previous version SANTIAGO can more accurately describe nucleation from sulfuric acid and ammonia because of the consideration of clusters with different amounts of acid and base that are allowed to evaporate.
While a mixed acid-base cluster can in principle lose either acid or base, the following rule was implemented in the model: clusters containing more acid than base can only evaporate an acid molecule, while clusters containing equal numbers of acid and base can lose a base molecule only.While this is a simplification of the reality, quantum chemical calculations support that this assumption generally considers the dominant evaporation processes (Yu et al., 2018).In principle, acid and base evaporation could be implemented for each cluster in the model, but this would increase the number of free parameters from 22 (with the simplification) to 40 (with all possible evaporations), which would probably not lead to better results but would increase the computation time significantly.The existence of clusters containing more base than acid is excluded in SANTIAGO, which is also supported by quantum chemical calculations (Ortega et al., 2012;Yu et al., 2018).
The thermodynamic parameters for the two smallest pure acid clusters (A 2 and A 3 ) are taken from a study where the parameters were derived from flow tube measurements (Hanson and Lovejoy, 2006).Ehrhart et al. (2016) showed that a numeric model for sulfuric acid-water binary nucleation us-Figure 1. Acid-base scheme implemented in SANTIAGO (Sulfuric acid Ammonia NucleaTIon And GrOwth model).A x B y denotes a cluster of sulfuric acid and ammonia with x sulfuric acid molecules and y ammonia molecules.The arrows indicate the considered evaporation rates.Red colors mark the evaporation channels optimized with numeric methods in the present study.Evaporation rates for the channels marked with green arrows were taken from Hanson and Lovejoy (2006).Forward reactions are not shown, but the model considers all possible collisions, i.e., cluster-cluster collisions and not just the additions of monomers.Clusters and/or particles beyond the pentamer (with concentration N 5 ) are not allowed to evaporate; for these larger clusters, the base content is not considered.
ing those data can replicate new particle formation rates measured at CLOUD well.In their study, Hanson and Lovejoy report dependencies of the dimer and trimer evaporation rates regarding the relative humidity, which are also adopted in the present study (evaporation rate proportional to (20 %/RH) 0.5 for the dimer and (20 %/RH) 1.5 for the trimer).The same dependency was used here, and the evaporation rate for the pure tetramer (A 4 ) was scaled by the same RH-dependent factor as for the pure acid trimer.Further humidity effects are not applied; therefore, the results for the thermodynamic data can be interpreted as a weighted average over the range of the different water contents for each cluster.The equations for calculating an evaporation rate from dH and dS are given in Sect.S2 (see also Ortega et al., 2012).In general, slower evaporation rates result from more negative values of dH and from less negative values of dS; the evaporation rate varies exponentially with dH and dS.How strong the evaporation rate varies with temperature is determined by the value of dH .
Forward reaction rates are calculated based on the equations for the collision frequency function by Chan and Mozurkewich (2001), with a value of 6.4 × 10 −20 J for the Hamaker constant (Hamaker, 1937).An enhanced collisionrate between small clusters and particles due to van der Waals forces was reported in recent CLOUD publications (Kürten et al., 2014(Kürten et al., , 2018;;Lehtipalo et al., 2016).SANTI-AGO takes dilution and wall loss into account, which are relevant loss processes in the CLOUD chamber (Kirkby et al., 2011;Kürten et al., 2015a;Sect. S1).The value of the modeled new particle formation rate, J model , is taken for the nonamer (Kürten et al., 2015b): (1) The nonamer (m = 9) has approximately a mobility diameter of 1.7 nm for which CLOUD new particle formation rates are derived (Kirkby et al., 2011;Dunne et al., 2016).The formation rate calculation takes into account that the collision of two smaller clusters with concentrations N i and N j yield a particle equal or larger than the nonamer.The differential equations are integrated over the same time that each of the 125 individual CLOUD runs lasted; this time varied between roughly half an hour and several hours, depending on the gas concentrations.The latest value of the calculated nucleation rate defines the modeled NPF rate.Further details regarding the model can be found in Kürten et al. (2015bKürten et al. ( , 2018) ) and in Sect.S1.The particle growth rate (GR) can be calculated using the monomer and cluster concentrations in SANTIAGO: ( The increase in diameter depends on the particle diameter for which the growth rate is determined, d p,m , and the colliding cluster and/or particle diameter, d p,i (Nieminen et al., 2010).Note that Eq. ( 2) does not only consider the growth due to monomer additions (i = 1) but also the gain due to collisions with all clusters and/or particles smaller than the considered diameter.Lehtipalo et al. (2016) highlighted the importance of such cluster-cluster or cluster-particle collisions, especially for systems containing high cluster concentrations like the sulfuric acid-dimethylamine system.In the present study the GR is calculated for a mobility diameter of 2.4 nm.

Metric for average error of the model
In order to optimize the thermodynamic parameters, it is necessary to define a criterion that describes the overall deviation between the 125 measured and modeled new particle formation rates.Since the NPF rates span a large range (from roughly 10 −3 to 10 2 cm −3 s −1 ), it is reasonable to compare the ratios between modeled and measured rates rather than the absolute differences.This way, mainly the high values of the NPF rates being brought into agreement is avoided.In addition, it is taken into account that the data cover five different temperatures (208, 223, 248, 278, and 292 K) with different numbers of experiments conducted at each of the temperatures.In order to weigh each of the temperatures equally and not to bias the error calculation towards the temperature at which most of the experiments were conducted, the follow-ing error function, f , is defined: In this equation the values n 1 to n 5 indicate the number of experiments at each temperature.

Optimization method
The optimization method used was introduced by Steihaug (1983) and uses an approximation for the function, f , that should be minimized.A quadratic model (second-order Taylor expansion) approximates the function In this study, the point x k is the current set of thermodynamic parameters (11 dH and 11 dS values, i.e., 22 parameters in total), and s k is the vector that moves the point to a new position that ideally yields a smaller error (i.e., a smaller value for f ).The gradient vector is denoted by g k and the Hessian matrix by H k .Steihaug's conjugated gradient method finds the s k that minimizes M (Steihaug, 1983; Nocedal and Wright, 2006).The algorithm takes into account that the length of the vector s k stays within a certain trust region, k (i.e., ||s k || ≤ k ).The following value is used to decide whether k can be increased, stays unchanged, or should be reduced after each iteration, k: The empirical factor η 1 is used to determine, after each iteration, whether a step should be taken or not: The trust region radius is updated by using the following rules: where the empirical parameters η 2 , η 3 , t 1 , and t 2 are used.

Monte Carlo method
With the Monte Carlo method (Differential Evolution-Markov Chain algorithm -DE-MC; see Ter Braak, 2006;Ter Braak and Vrugt, 2008;Kupiainen-Määttä, 2016) the pdf's of the thermodynamic parameters are explored.The pdf's give information on the uncertainties of the parameters found by the optimization algorithm, as it is very likely that the optimized values represent a local minimum in the parameter space that is just one possible solution out of many others.
The DE-MC algorithm aims at finding the most probable values for the parameters instead of finding the optimal values (Kupiainen-Määttä, 2016).Therefore, the Monte Carlo solutions can be used to evaluate if the optimized values are within the range of the most probable solutions.

Initialization for generating the prior distributions
At the start of the Monte Carlo simulation, the parameters dH and dS are initialized, where each value is randomly selected from a range of possible values.In this study, this range was defined by the values from Ortega et al. ( 2012), ±10 kcal mol −1 for dH and ±10 cal mol −1 K −1 for dS.For these randomly selected thermodynamic parameters, the initial error (Eq. 3) is calculated.

Main loop
Within the main loop (iterated 5000 times), the first step involves the random variation of the parameters.The value for each dH and dS is updated with a probability of 0.2.Given that 22 parameters are used, this means that on average, 4.4 parameters changed during each iteration.If, however, the situation occurs that no update for any of the parameters is requested, the selection process is repeated until at least one thermodynamic parameter is updated (Kupiainen-Määttä, 2016).If a value should be updated, its step width is chosen from a normal distribution with a standard deviation of 0.05 times the width of the allowed range (i.e., 20 kcal mol −1 for dH and 20 cal mol −1 K −1 for dS).If a step would lead to the crossing of the upper or lower bound for any of the parameters, a new random value is chosen until the updated value stays within its allowed range.With the new set of parameters, the new error, f ( then the new set of parameters is accepted.However, even if f (x k +s k ) is larger than f (x k ), the step might still be accepted with the probability where a σ of 0.2 has been chosen (same as by Kupiainen-Määttä, 2016).This means that even steps in the "wrong" direction (making the error larger) have a chance of being taken.This can avoid the parameters becoming trapped in a local minimum, which can, for example, be the case with minimization methods.In any case, x k+1 is set to x k + s k if a step is taken before a new iteration starts.The error and the full set of parameters are recorded after each iteration.

Generation of the prior distribution
In total 20 data sets (each containing 5000 steps) are generated with the methods described in Sect.2.5.1 and 2.5.2.
From each of the 20 data sets the average error was determined from the last 2500 points.Whenever the error for one data set is smaller than the geometric mean from all 20 errors, the data set was selected (Kupiainen-Määttä, 2016).All selected data sets combined and thinned to 5000 data points represent the prior distribution, Z 0 .For each parameter, the standard deviation σ ini is determined.

DE-MC algorithm
In the DE-MC algorithm, five Markov chains are run in parallel, where each of the chains starts from a random point of the joint history, Z 0 (Ter Braak, 2006;Ter Braak and Vrugt, 2008;Kupiainen-Määttä, 2016).In the algorithm, the probability to jump from an old point, x old , to a new point, x new , should be the same as moving from x new to x old .This is achieved by calculating the new position vector according to where x 1 and x 2 are randomly selected points from the joint history, Z 0 .The factor γ is taken as (Ter Braak, 2006) or 0.98 (at every fifth step).Each individual dH and dS value for the new point is updated with a probability of 0.2 (see Sect. 2.3.2). δ is drawn from a normal distribution with σ = 0.05×σ ini (calculated from the prior distribution; see above).
The decision process of whether a step should be accepted or not is the same as in Sect.2.5.2 (Eq.8).
The points from the five chains are appended to the joint history, Z 0 , and the new points in the following iterations are drawn from the updated history.This way, eventually convergence should be reached after many iterations, resulting in the posterior distributions (probability density functions) for all parameters.The metric indicating convergence is given by the following (Kupiainen-Määttä, 2016): with the parameter k indicating the step index; the number of chains is c = 5.The variance of the means for each parameter, b, is calculated from where μ is the average of a parameter over all chains and µ l is the average for each of the chains, l.The mean of the variances, W , is calculated from where Var l is the variance for each parameter in one of the chains.Convergence is assumed when R (for each of the 22 parameters) reaches a value of < 1.1.In the present study, this was the case after more than 10 5 iterations.

Thermodynamic data
The results for the thermodynamic parameters are shown in An overall comparison between modeled and measured NPF rates is shown in Fig. 3. SANTIAGO uses the thermodynamic data from Steihaug's optimization method.The maximum ratio for the deviation between the modeled and measured nucleation rates is below a factor of 10, with only a few exceptions.The average deviation is a factor of ∼ 4. Some of the cases where the ratio deviates by more than a factor of 10 correspond to the lowest temperature (208 K) binary experiments where the model overestimates the measured NPF rates (Sect.3.2).As intended (Sect.2.3), the data in Fig. 3 do not indicate an apparent bias.

Comparison between modeled and experimental
data: To further evaluate the performance of SANTIAGO the calculated NPF rates are shown together with the measured rates as a function of the sulfuric acid concentration for the five different temperatures (Fig. 4).The color code represents the ammonia mixing ratio, while grey symbols indicate pure binary nucleation (see Kürten et al., 2016;Duplissy et al., 2016).Again, as in Fig. 3, the agreement between modeled and measured data is good.The same applies to the parameterization; in some cases, the parameterization yields even better agreement compared with the model.This is the case, for example, for the binary nucleation at 208 K and the data at 278 and 292 K for the lowest ammonia mixing ratios.However, one clear advantage of SANTIAGO is that it describes the functional behavior of the system more accurately.At a temperature of 208 K for the high ammonia mixing ratio, the model line shows a pronounced curvature, whereas the parameterization yields a straight line on the log-log plot.The curvature is due to the fact that the survival probability of subcritical clusters (i.e., clusters below the nonamer) can be strongly affected by losses to walls or pre-existing particles (Ehrhart and Curtius, 2013).This effect is most strongly pronounced when the concentration of the nucleating vapor is relatively low, which results in slow cluster and/or particle growth rates.Other thermodynamic data sets can be used to generate model curves similar to the ones in Fig.   SO 4 ) 1 (NH 3 ) 1 + H 2 SO 4 ⇔ (H 2 SO 4 ) 2 (NH 3 ) 1 27.8 a , 29

Comparison between modeled and experimental data: J 1.7 nm vs. [NH 3 ]
SANTIAGO can yield the dependency of the NPF rates for varying ammonia concentrations at fixed sulfuric acid concentration.Figure 5 shows these data for five different temperatures over a wide range of NH 3 concentrations.The modeled data agree overall very well with the experimental CLOUD data.The data points indicated in Fig. 5 are obtained by normalizing the CLOUD data to one sulfuric acid concentration for each of the temperatures (see Kürten et al., 2016); the sulfuric acid concentrations for the normalization are indicated in the figure annotation.
For the lowest temperature (208 K) the new particle formation rates show almost no increase with [NH 3 ] when ammonia is present at low concentrations (≤ 10 6 cm −3 ); this indicates that NPF is dominated by the pure binary channel.The data points for pure binary conditions are placed at the estimated NH 3 background concentrations for 208 and 223 K in Fig. 5 (Kürten et al., 2016).However, in the model for generating the lines at pure binary conditions (Fig. 4), zero NH 3 is assumed.For larger [NH 3 ] the NPF rates increase until they reach a plateau at ≥ 10 9 cm −3 .In this case new particle formation is only limited by the availability of sulfuric acid; evaporating ammonia molecules from clusters are, however, rapidly replaced because the arrival rate of ammonia is similar or faster than the ammonia evaporation rate.For the data at 223 K the situation is very similar.The plateau values agree very well with the calculated values for collision-controlled new particle formation (Kürten et al., 2018), which can be seen as a validation of SANTIAGO.
For both temperatures (208 and 223 K) the experimental pure binary new particle formation rates are represented well by the model.At 248 K and above, the modeled rates at low [NH 3 ] very likely overestimate the NPF rates (dashed sections of the curves; see discussion in Sect.4) because the model considers only evaporation up to the sulfuric acid tetramer, which is not sufficient to accurately model binary nucleation at these conditions.However, the slow rates of < 1 × 10 −3 or 1 × 10 −4 s −1 are not atmospherically relevant near the ground in most cases.Beyond the regions where binary nucleation dominates, the rates increase steeply with [NH 3 ].Although the slopes of the curves flatten somewhat towards high ammonia concentrations, no plateau is reached, even at concentrations of 10 11 cm −3 (approximately 4 ppbv).(Kürten et al., 2016).The lines show calculated NPF rates from the model using the thermodynamic data from the optimization method (Table 1).The dashed sections (for 248, 278, and 292 K) indicate regions of the parameter space where the model does not give accurate results, as the true binary rates are expected to be lower (Ehrhart et al., 2016).

Particle growth rates
Figure 6 shows calculated growth rates as a function of the sulfuric acid concentration according to Eq. ( 2).Additionally, a curve from the equations given by Nieminen et al. (2010) is included.The model results from the present study show a linear increase in the GR as a function of the sulfuric acid monomer concentration as expected (Nieminen et al., 2010).The higher values from SANTIAGO can be explained by the different methods for calculating the collision rate constant that include the van der Waals enhancement for the model of the present study (see Kürten et al., 2018).The increase in GR at a low temperature (208 K) is not intuitive, as the collision rates decrease somewhat with temperature, which should lead to slower GR.However, the clusters are more stable at low temperature and their elevated concentrations can contribute to particle growth (Lehtipalo et al., 2016).This effect is pronounced at 208 K with some ammonia, which indicates that considering only the condensation of monomers is not sufficient for some conditions.Not only can growth be effected by coagulation but also new particle formation rates; therefore, the implementation of a full coagulation scheme (Sect.S1) is important for the present study.The possibility of deriving growth rates with the model is an important feature that is not included in the parameterization for the CLOUD new particle formation rates by Dunne et al. (2016).The modeled growth rates enable further com- The posterior distributions with the median values for dH and dS for all clusters are shown in Fig. 2. For comparison, the values from Steihaug's optimization method are also shown.For the dS values, the medians and the optimized values agree very well.However, the distributions are rather flat, indicating that there is a wide possible range of entropies that lead to reasonable agreement between modeled and measured NPF rates.This is also reflected in Table 1 when comparing the dS to the Ortega et al. (2012) data.These were used to initialize the optimization method.However, no large differences can be found between the initialized and optimized values, which can be interpreted such that the quantum chemical calculations yield accurate results for dS.
The distributions for the dH values show more structure.However, the only cluster where a clear peak can be found is the A 2 B 2 cluster (for the B evaporation).The median value of the distribution is somewhat lower (by ∼ 2 kcal mol −1 ) compared with the optimized value, but it is well within the half-width of the distribution.For most dH values flat regions of the probability density function exist, for example, for the A 2 B 1 cluster (A evaporation) between −28 and −39 kcal mol −1 .In this range the evaporation rate varies between 5 × 10 −3 and 1 × 10 −11 s −1 (at 278 K and dS = −43 cal mol −1 K −1 ; Sect.S2).In practice, it does not matter which one of these evaporation rates is used; the magnitude of the evaporation rate in this range has essentially no effect on the outcome because the cluster is stable on the considered timescale (Kupiainen-Määttä, 2016).
For some clusters, limits seem to exist for dH .For example, the dH value for the A 4 is below −15 kcal mol −1 , and for the A 4 B 3 clusters (A evaporation), the upper limit is approximately −19 kcal mol −1 .The pdf's for the A 1 B 1 and the A 2 B 1 clusters show local maxima, which indicate elevated probability densities around −16.5 and −23 kcal mol −1 .

Comparison of dH and dS to literature data
For most of the clusters, the agreement between the Ortega et al. ( 2012) data and the data from the present study is quite good.One exception is the A 4 cluster, where the pdf indicates a median value of −23.1 kcal mol −1 for dH (−19.7 kcal mol −1 from the optimization method) in contrast to −16.78 kcal mol −1 by the Ortega et al. ( 2012) study.The much lower value found in the present study is reasonable, since Ortega et al. (2012) did not include water vapor in their calculations.The available water in the CLOUD experiment can lead to significantly slower evaporation rates indicated by the lower dH value.The difference from the Hanson et al. (2017) data is generally much larger.Especially, the trimer and tetramer with one ammonia (A 3 B 1 and A 4 B 1 ) evaporate significantly slower for the Hanson et al. (2017) data.This might explain the much higher NPF rates observed at the warm temperatures for the Hanson et al. (2017) predictions compared with the CLOUD data (Fig. S4).Yu et al. (2018) report dG values (Table 1) in their study.While the agreement between their model and CLOUD data is generally good for ion-induced conditions, the agreement for neutral conditions is only good for low temperature conditions.At temperatures ≥ 248 K the Yu et al. ( 2018) model underestimates the measurements by up to many orders of magnitude.This can at least partly be explained by the significantly higher dG values for some clusters (e.g., A 2 B 1 and A 4 B 1 ) in comparison to the other literature data and the values from the present study.

Uncertainties and limitations of SANTIAGO
One limitation of the model from the present study is that the effect of water vapor is not taken into account explicitly, i.e., no clusters containing different amounts of water molecules are considered.However, for the clusters containing no ammonia, to some extent humidity effects are included.This is achieved by scaling the evaporation rates of the sulfuric acid dimer, trimer and tetramer by a factor (20 %/RH) p with p = 0.5 for the dimer and 1.5 for the trimer and tetramer.The first two values for the parameter p are from Hanson and Lovejoy (2006).For the tetramer the same dependency as for the trimer is assumed, which introduces uncertainty.The reported dH and dS values for the sulfuric acid tetramer are therefore derived for a relative humidity of 20 % in order to be consistent with the Hanson and Lovejoy (2006) data.In Fig. 4 the agreement between the modeled and measured pure binary data (at 208 and 223 K) is relatively good, especially for the 223 K data.For the 208 K data SANTI-AGO overestimates the measured data.It has to be noted that the model calculations assume an average RH (33 % at 208 K and 28 % at 223 K), whereas the measurement conditions cover varying relative humidities (12 % to 57 % at 208 K and 11 % to 52 % at 223 K).This can explain some of the scatter in the measured data but not the systematic overestimation for the 208 K data by the model.However, the general agreement between model and measurement at ≤ 223 K is considered good for both ternary and binary conditions.For the warmer temperatures (≥ 248 K) the pure binary conditions can currently not be accurately represented by the model.This can be seen in Fig. 5 for the dashed sections of the curves, which approximately mark the limit of the parameter space regarding the allowed NH 3 concentrations.For the very low NH 3 concentrations, the modeled NPF rates approach the "pure" binary conditions.However, comparison with the data by Ehrhart et al. (2016) who simulated pure binary nucleation for the CLOUD chamber with the SAWNUC (Sulfuric Acid Water NUCleation) model indicates that the apparent binary data in Fig. 5 are significantly overestimating the true binary NPF rates.For 248 K the overestimation seems to be within a factor of 10, but for 278 and 292 K, the overestimation amounts to many orders of magnitude (Ehrhart et al., 2016).For this reason, the solid line sections for 248 K and warmer have been defined such that the contribution from the overestimated binary conditions is in any case less than 10 %.This means that SANTIAGO can be applied, e.g., at 292 K for NH 3 concentrations above ca. 1 × 10 7 cm −3 (≈ 0.4 pptv at 292 K and 1 bar).It can be seen that NH 3 has a large effect even at these tiny concentrations, which are below the measurable range of ammonia in the atmosphere.
The effect of water vapor on particle growth rates needs to be studied in the future.Comparison between measured and modeled growth rates at small diameters (2 nm) in the acid-base system (sulfuric acid-dimethylamine and sulfuric acid-ammonia) indicates that water has no significant effect on particle growth (Lehtipalo et al., 2016).The same can be concluded for the sulfuric acid-ammonia system at larger diameters (∼ 10 nm; see Chen et al., 2018).
The fact that no larger clusters than the tetramers can evaporate in SANTIAGO apparently leads to truncation errors as discussed before for the binary conditions.This truncation leads to the overestimation of new particle formation rates for the pure binary conditions at the warm temperatures.The extent to which truncation affects the ternary new particle formation can be discussed based on the cluster evaporation rates for the tetramers at the warmest temperature (292 K).The evaporation rates are ∼ 3000 s −1 (A 4 B 1 ), ∼ 75 s −1 (A 4 B 2 ), and ∼ 0.02 s −1 (A 4 B 3 ) when using the thermodynamic parameters from Table 1 (first columns) and the equations to convert dH and dS to an evaporation rate (see Sect.S2).This indicates that new particle formation proceeds most efficiently via the clusters containing at least three base molecules.For this cluster the forward reaction rate is larger than the evaporation rate when the total sulfuric acid concentration is larger than ∼ 2 × 10 7 cm −3 .If the A 4 B 3 and A 4 B 4 clusters are the dominant ones, this indicates that even if a pentamer with a small number of base molecules evaporates rapidly, it is probably not very important in terms of contributing to the new particle formation rates, as the main nucleation pathway will follow the clusters with high ammonia content.If truncation nevertheless plays a role, it can lead to an overestimation of evaporation for a smaller cluster, thereby compensating for the missing evaporation of the larger clusters.Therefore, it is possible that some evaporation rates in the present study could be overestimated.However, the data that are shown in Table 1 for a comparison have been derived from similar methods, where the effect of evaporation is also considered only up to a certain cluster size limit.Truncation effects are discussed in detail by Hanson et al. (2017).
Similarly, to truncation the negligence of evaporation of either acid or base for all considered clusters can potentially lead to errors (see Sect. 2.2).The model includes only the cluster evaporation rates, which seem to be most relevant (see Fig. 1; Ortega et al., 2012;Yu et al., 2018).For each cluster, one evaporation rate is included (either acid or base).This means that the negligence of the second evaporation channel can lead to an overestimation of the cluster concentration.However, in case the omitted evaporation rate is smaller than the considered one, this effect is very likely small.The selection of the considered evaporation rates are guided by the literature data on QC calculations (Ortega et al., 2012;Yu et al., 2018).This does, however, not rule out that important evaporation channels could be neglected.On the other hand, increasing the number of free parameters does not necessarily improve the accuracy of the model but only its complexity and the computational demands for the optimization and Monte Carlo calculations.

Implementation of literature data in SANTIAGO
The previous study by Kürten et al. (2016) compares the CLOUD data with ACDC (McGrath et al., 2012) model calculations using the thermodynamic data from Ortega et al. (2012).Using the same data, Fig. S3 shows this comparison using the model from the present study.Surprisingly the agreement between the model and measurement is better than in the study by Kürten et al. (2016).One difference between the two studies is that the ACDC model used the formation rate for neutral clusters containing six sulfuric acid molecules instead of nine in the present study.This difference was tested with the present model, but it only leads to a very small change in the simulated formation rates.An effect that can, however, explain the discrepancy is that the ACDC model calculations did not consider a wide range of particle sizes.This could lead to inaccuracies regarding the coagulation sink for the formed clusters.Especially at high acid concentrations when growth and nucleation rates are large, the particles can create a significant sink that can reach a similar magnitude to the wall loss rate in the CLOUD chamber (Kürten et al., 2015b).Neglecting the full size distribution can lead to an overestimation of cluster concentrations and formation rates (Sect.S1).This effect needs to be studied in more detail in the future.In any case, taking into account particles over a wide size range should improve the accuracy of a model due to the described effect.
The comparison between the CLOUD data and SANTI-AGO using the Hanson et al. (2017) data is shown in Fig. S4.Hanson et al. (2017) base their data on flow tube measurements performed at rather warm temperatures (∼ 295 K).The agreement between the modeled and measured data is good, however, mostly at the low temperatures (208 and 223 K); for the warmer temperatures, the model using the literature data significantly overestimates the NPF rates.This can partly be due to the fact that the model does not include all possible evaporation effects (acid and base for each cluster).Hanson et al. (2017) derived their data, however, by including many more possible evaporation channels.Their negligence shifts the new particle formation rates to higher values.It is likely that this effect is stronger at warm temperatures because at very cold conditions the evaporation rates for the clusters are generally very low except for the A 1 B 1 cluster.For this cluster only one possible evaporation channel exists that is included in the model.By including the new particle formation rates reported by Hanson et al. (2017) for 278 K at CLOUD chamber conditions (additional symbols in Fig. S4 at 278 K), the agreement is somewhat better but still significantly higher than the CLOUD data.Therefore, the missing evaporation channels in this study cannot explain the full extent of the discrepancy.

Summary and conclusions
The model SANTIAGO (Sulfuric acid Ammonia Nucle-aTIon And GrOwth model) describes new particle formation and growth from the reactions between sulfuric acid and ammonia.The effect of water vapor is taken into account, but the capability of simulating binary nucleation is limited to low temperatures (≤ 223 K) because cluster evaporation rates are only considered up to the tetramer; at warmer temperatures evaporation of larger pure acid clusters becomes important.SANTIAGO implements evaporation of the smallest clusters, containing one to four sulfuric acid molecules and a variable number of ammonia molecules.The thermodynamic data (dH and dS) for 11 different channels are used to calculate evaporation rates as a function of temperature.Two numeric methods have been applied to find the best set of parameters (Steihaug algorithm) and their probability density functions (Differential Evolution-Markov Chain algorithm -DE-MC).This is achieved by comparing the model output to the CLOUD data set for neutral nucleation in the ternary system of sulfuric acid-water-ammonia (Dunne et al., 2016;Kürten et al., 2016).The average ratio between modeled and measured data is found to be as small as a factor of ∼ 4 (mean error) for a wide range of conditions (208 to 292 K, sulfuric acid at atmospherically relevant concentrations, e.g., ≥ 5×10 5 cm −3 at 208 K and ≤ 2×10 9 at 292 K) when using the best-fit parameters.SANTIAGO can represent the neutral measured CLOUD data very well for all tested conditions.This means that even binary neutral nucleation at the lowest temperatures (208 and 223 K) can be described well.
The optimization and the Monte Carlo method were successfully applied to explore the landscape of the cluster thermodynamics for the nucleating system of sulfuric acid and ammonia.However, the probability density functions from the DE-MC algorithm do not yield a very clear picture of the most likely values for dH and dS, as the derived probability density functions are rather flat and indicate a wide range of probable values.Therefore, the parameters reported in the present study have a rather high uncertainty.Future experiments and quantum chemical calculations are necessary to narrow down these uncertainties.
Implementation of the literature data in the model indicates that the Ortega et al. (2012) thermodynamic data describes the CLOUD data better than previously thought (Kürten et al., 2016).This could be because of the negligence of large particles in the previous study.It seems essential to include the larger nucleated particles in the model, as these contribute to the sink for the small nucleating clusters and particles.The Hanson et al. (2017) data overestimate the new particle formation rates for the warm temperatures (278 and 292 K).No direct comparison to the Yu et al. ( 2018) is possible, as no temperature-dependent evaporation rates can be calculated from their reported dG values at 298 K.
SANTIAGO allows for calculating new particle formation rates for a wide range of experimental conditions (T , RH, sulfuric acid, and ammonia concentration).In contrast to the parameterization from Dunne et al. (2016) for the CLOUD data, it is also capable of considering different external sinks (e.g., due to chamber and/or flow tube walls in laboratory experiments or the presence of pre-existing particles in the atmosphere) that can affect nucleation and particle growth (Kerminen and Kulmala, 2002;Ehrhart and Curtius, 2013).With the model, growth rates can also be determined.
Finally, the strong dependence on [NH 3 ] regarding NPF even at levels below 1 pptv highlights the need for improved instrumentation when one wants to understand the impact of ammonia on nucleation, as no available technique can measure such low atmospheric ammonia concentrations in real time.
Data availability.Data can be downloaded from https://doi.org/10.5281/zenodo.2634985(Kürten, 2019) and are also available upon request by sending an email to the corresponding author.Competing interests.The author declares that there is no conflict of interest.
Special issue statement.This article is part of the special issue "The CERN CLOUD experiment (ACP/AMT inter-journal SI)".It is not associated with a conference.

Fig. 2 .
This figure indicates the results from the optimization method (dashed lines) and the pdf's (solid lines) along with their medians (dotted lines) for the 11 different clusters.A comparison between the pdf's and the values fromOrtega et  al. (2012)  andHanson et al. (2017) is shown in Supplement Fig.S1.The pdf's result from generating histograms of the values from Z 0 , where the first 5000 points are neglected (see Sect. 2.5.4).Discussion on the thermodynamic data follows in Sect. 4.

Figure 2 .
Figure 2. Probability density functions of dH and dS values for 11 clusters in the acid-base system (A x B y is cluster of sulfuric acid and ammonia with x sulfuric acid molecules and y ammonia molecules).The vertical lines indicate the values from the optimization method (dashed lines) and the medians of the probability density functions (dotted lines).
4. Using the data fromOrtega et al. (2012) andHanson et al. (2017) generates Figs.S3 and S4.FigureS2shows the model curves using dH and dS from the medians of the Monte Carlo simulation.The medians also give good results, except for an overestimation at 248 and 278 K at the lowest NH 3 concentration.This is probably due to comparatively low dG values for the sulfuric acid tetramer (

Figure 3 .
Figure3.Calculated new particle formation (NPF) rates vs. measured NPF rates (fromKürten et al., 2016).The color code indicates the temperature (between 208 and 292 K).The calculated values are from the model using the thermodynamic data from Steihaug's optimization method.The solid line indicates the one-to-one correspondence, while the dashed lines indicate a deviation by a factor of 10 from the one-to-one line.The error bars include the uncertainty of the [H 2 SO 4 ] (factor of 2) and the [NH 3 ] (seeKürten et al., 2016).

Figure 4 .
Figure 4. Comparison between simulated and measured new particle formation rates for five different temperatures.The color code indicates the ammonia mixing ratio (for the respective temperatures indicated in the figure panels and a pressure of 1 bar); the grey symbols indicate pure binary conditions.The model (solid lines) uses thermodynamic data from the optimization scheme according toSteihaug (1983,  Sect.2.4).The average ratio for the deviation is ∼ 4. In comparison, the results from the parameterization are also shown (dashed lines;Gordon et al., 2017).

Figure 5 .
Figure5.New particle formation rates as a function of the ammonia concentration.The triangles show the neutral formation rates from the CLOUD experiment normalized to the indicated sulfuric acid concentration for five different temperatures(Kürten et al., 2016).The lines show calculated NPF rates from the model using the thermodynamic data from the optimization method (Table1).The dashed sections (for 248, 278, and 292 K) indicate regions of the parameter space where the model does not give accurate results, as the true binary rates are expected to be lower(Ehrhart et  al., 2016).

Figure 6 .
Figure 6.Particle growth rates as a function of the sulfuric acid monomer concentration.The black line indicates the theoretical curve from Nieminen et al. (2010) for a temperature of 278 K and for sulfuric acid vapor.The other lines show the calculated particle growth rates at two different temperatures (indicated in the figure legend).The NH 3 concentration was set to 1 × 10 8 cm −3 (blue and red curve); for all calculations a density of 1615 kg m −3 and a particle mobility diameter of 2.4 nm was used, and the diameter of the particles was calculated assuming a molecular mass of 151 amu (two water molecules and one ammonia molecule per sulfuric acid molecule).

Table 1 .
Yu et al. (2018) from this study and from the literature.aOptimizationmethod.bMediansfrom Monte Carlo simulation.dGvalues at 298 K. c Data from Ortega et al. (2012).dDatafromHansonet al. (2017).eDatafromYu et al. (2018).f Value applies for cluster without involvement of water; with different amounts of water molecules, this value varies between 11.52 and 12.59 kcal mol −1 .g Value applies for cluster without involvement of water; with different amounts of water molecules, this value varies between 5.71 and 8.37 kcal mol −1 .NA stands for not available.+ NH 3 ⇔ (H 2 SO 4 ) 1 (NH 3 ) 1 16.7 a , 12.8 b (16.00) c (15.0) d 29.8 a , 30.0 b (28.14) c (21.8) d 7.8 a , 3.9 b (7.61) c (8.5) d (7.77) e (H 2 Table 1).Unfortunately, Yu et al. (2018) did not provide dH and dS values but only dG values at 298 K; therefore, their data set could not be tested.www.atmos-chem-phys.net/19/5033/2019/Atmos.Chem.Phys., 19, 5033-5050, 2019 www.atmos-chem-phys.net/19/5033/2019/Atmos.Chem.Phys., 19, 5033-5050, 2019 5046 A. Kürten : New particle formation from sulfuric acid and ammonia Appendix A: Nomenclature b Variance of the means for each parameter (dH or dS) B Hessian matrix of f regarding all dH and dS values c Number of chains d p Particle diameter dH Enthalpy for one of the reactions (see Table 1) dS Entropy for one of the reactions (see Table 1) f Average error for all modeled and measured particle formation rates g Gradient vector of f regarding all dH and dS values GR Number of experiments (n 1 for 208 K, n 2 for 223 K, n 3 for 248 K, n 4 for 278 K, and n 5 for 298 K) n coefs Total number of coefficients, i.e., all dH and dS values (n coefs = 22) N Cluster and/or particle number density p Power dependency of an evaporation rate regarding the relative humidity P Acceptance probability in Monte Carlo algorithm R Statistical metric to indicate convergence for the Monte Carlo simulation RH Relative humidity s Vector of step changes (all dH and dS values) in one iteration t Empirical parameter needed in Steihaug's optimization algorithm (t 1 , t 2 ) T Temperature Var Variance for a parameter in one of the chains W Mean of the variances over all chains for one parameter x Current vector of all dH and dS values (Monte Carlo simulation) x 1 , x 2 Drawn vectors of all dH and dS values from history (Monte Carlo simulation) x new New vector of all dH and dS values (Monte Carlo simulation) x old Old vector of all dH and dS values (Monte Carlo simulation) Z 0 Joint history for all chains in the Monte Carlo simulation δ Term in the calculation of the new vector in the Monte Carlo algorithm Radius of trust region in Steihaug's method max Maximum allowed radius of trust region in Steihaug's method γ Scaling factor in the calculation of the new vector in the Monte Carlo algorithm η Empirical parameter needed in Steihaug's optimization algorithm (η 1 , η 2 , η 3 ) µ Mean value for one parameter µ Mean value over all chains for one parameter ρ Ratio between actual and predicted function reduction in Steihaug's method σ Standard deviation σ ini Standard deviation of the parameters from the prior distribution Supplement.The supplement related to this article is available online at: https://doi.org/10.5194/acp-19-5033-2019-supplement.