Interactive comment on “A new balance formula to estimate new particle formation rate: reevaluating the effect of coagulation scavenging” by

A new method to estimate particle formation rates has been proposed by the authors, which improves on existing ones with respect to taking coagulation into account. This seems to be an important improvement when analysing new particle formation events in polluted conditions, such as Beijing, where the method is applied. Also, a nice comparison with some previous approaches is presented. The topic fits ACP well and deserves publication, but some major modifications are first necessary. In addition, the manuscript suffers from several grammatical errors as well as unclear writing (I have commented on some of these points below, but not all). A thor-


Introduction
New particle formation (NPF) is a frequently occurring phenomenon in the atmospheric environment. In a typical NPF event, gaseous precursors burst out into particles due to nucleation and lead to a rapid increase in the atmospheric aerosol population. Nucleated particles can grow quickly to increase the number concentration of cloud condensation nuclei (Kerminen et al., 2012; Kuang et al., 2009; Leng et al., 2014) and thus have indirect impacts on radiative forcing and global climate (Lohmann and Feichter, 2005). The continuous growth of nucleated particles also provides increasing aerosol surface area for heterogeneous physicochemical processes. NPF studies can be traced back to the early 20th century (Aitken, 1911), and NPF events have been observed in various atmospheric environments, e.g., from the city to the countryside, from desert (Misaki, 1964) to rain forest (Zhou et al., 2002), from the continent to the ocean (Covert et al., 1992), from the Equator (Clarke et al., 1998) to polar areas (Covert et al., 1996; Park et al., 2004), and from the troposphere to the stratosphere (Lee et al., 2003).
The formation rate at which the growth flux passes a certain diameter is a key parameter to quantitatively describe NPF events. Different formulae have been used to estimate new particle formation rate from measured aerosol size distributions, and they mainly originate from two approaches. One is from the definition of nucleation rate (Heisler and Friedlander, 1977; Weber et al., 1996) and the other is a population balance method (Kulmala et al., 2001, 2012). The consistency of these two approaches was tested using a numerically simulated NPF event, and a relative error of less than 20 % was reported (Vuollekoski et al., 2012). The simulated NPF event had a maximum formation rate of less than 1 cm$^{-3}$ s$^{-1}$. However, the reported formation rates in the atmosphere vary widely, e.g., from approximately $10^{-2}$ to $10^{4}$ cm$^{-3}$ s$^{-1}$ (Kulmala et al., 2004). Because of the assumptions made in these two approaches, their validity in describing NPF events with a high formation rate needs to be further explored. A high fraction of newly formed particles is scavenged by coagulation before they grow into larger sizes. Both approaches potentially underestimate the contribution of the coagulation scavenging effect when calculating the formation rate from measurement data. They perform well only in clean atmospheric environments in which nucleation intensity is not strong and aerosol concentration is relatively low; i.e., the coagulation scavenging effect is less important.
The effect of coagulation scavenging is more prominent when estimating the formation rate of sub-3 nm particles because of their high diffusivities and high concentrations during NPF events. Due to instrument limitations, aerosol size distributions of sub-3 nm particles were not available in many previous NPF field campaigns. Recent developments in diethylene glycol (DEG) condensation particle counters (CPCs; Iida et al., 2009; Vanhanen et al., 2011) have made it feasible to develop new scanning mobility particle spectrometers (SMPSs) for extending aerosol size distribution measurement from ∼3 nm down to ∼1 nm (Jiang et al., 2011a; Franchin et al., 2016). These new spectrometers were deployed in atmospheric observations (Jiang et al., 2011b) and in chamber measurements (Franchin et al., 2016) to study NPF. A miniature cylindrical differential mobility analyzer (mini-cyDMA; Cai et al., 2017) was developed to improve the performance of the DEG-SMPS.
In many locations in China, high emissions lead to both high concentrations of gaseous precursors and high atmospheric aerosol concentrations. NPF has frequently been observed, even in megacities such as Beijing and Shanghai (Wu et al., 2007; Kulmala et al., 2016; Wang et al., 2017). In most previous studies, the above population balance method was used to estimate new particle formation rates in China. The reported formation rates of 3 nm particles and larger ones are typically in the range of 1–10 cm$^{-3}$ s$^{-1}$ (Wang et al., 2013; Leng et al., 2014; An et al., 2015; Qi et al., 2015). One study in Shanghai reported a rate of 112.4 to 271.0 cm$^{-3}$ s$^{-1}$ for the formation of 1.5 nm particles inferred from a DEG-CPC (Xiao et al., 2015). For these intense NPF events, the above balance approach may underestimate the coagulation scavenging effect and thus lead to underestimation in the reported formation rate. In addition, applying new SMPSs to measure aerosol size distributions down to ∼1 nm will help to better quantify the formation rate and its governing factors in typical locations in China.
To estimate new particle formation rates, various particle size ranges were used in previous formulae. The definition approach tries to limit the size range towards the minimum detected diameter (Kuang et al., 2008; Weber et al., 1996), while studies with the population balance method have used various size ranges. Some studies used the aerosol size distributions from the minimum detected diameter up to 25 nm (Kulmala et al., 2001; Dal Maso et al., 2005; Wu et al., 2007; Wang et al., 2013). Kulmala et al. (2004) recommended the upper size bound as the maximum size that the critical cluster can reach during a short time interval of growth. There are also studies using narrower size ranges, such as from 3 to 6 nm (Sihto et al., 2006; Paasonen et al., 2009; Wang et al., 2011; Vuollekoski et al., 2012) and from 1.34 to 3 nm (Xiao et al., 2015). In principle, the estimated formation rates may vary when different particle size ranges are used. Assumptions made while deriving these formulae should be fully considered when proposing criteria to choose the particle size range.
In this study, a new population balance formula for estimating the new particle formation rate was derived from the aerosol general dynamic equation to properly account for the effect of coagulation scavenging, especially for analyzing intense NPF events. An NPF field campaign was carried out in Beijing. Aerosol size distributions down to ∼1 nm were measured using the DEG-SMPS equipped with the mini-cyDMA. Data from this campaign and from the literature are used to test the new formula and other widely used formulae. Different formulae are compared and their applicability in analyzing intense NPF events is addressed. Criteria to choose the particle size range for formation rate estimation are proposed and evaluated. The governing components of the new particle formation rate in Beijing are discussed and compared to those from other locations in the world.

The new balance formula to estimate formation rate
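The new formula, which is based on the definition of droplet current and the aerosol general dynamic equation (see Appendix A for its derivation), is shown in Eq. (1):

$$J_k = \frac{\mathrm{d}N_{[d_k,d_u)}}{\mathrm{d}t} + \mathrm{CoagSnk} - \mathrm{CoagSrc} + n_u \cdot \mathrm{GR}_u, \tag{1}$$

where $J_k$ is the formation rate of particles at size $d_k$; $N$ is the particle number concentration, with $N_{[d_k,d_u)}$ covering the size range from $d_k$ to $d_u$; CoagSnk and CoagSrc are the coagulation sink term and the coagulation source term for this size range; $n_u$ is the particle size distribution function at $d_u$; and $\mathrm{GR}_u$ is the growth rate at $d_u$. Equation (1) is obtained by adding the population balance equations of the single sizes from $d_k$ to $d_u$, converting the sum from the discrete form into the continuous form, and approximating $J_u$ with the product of measured $n_u$ and $\mathrm{GR}_u$. Note that Eq. (1) is still an approximate formula for particle formation rate because CoagSnk and CoagSrc are calculated by using size bins and the coagulation effect of particles smaller than $d_{\min}$ is not accounted for. For rigorous mathematical derivation and detailed illustration, please refer to Appendix A.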
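As a rough illustration of how the four terms of Eq. (1) might be evaluated from binned data, the Python sketch below assumes that CoagSnk sums $\beta N_i N_g$ over all collision partners $i$ for every bin $g$ between $d_k$ and $d_u$, and that CoagSrc is half the sum over bin pairs whose merged volume falls between $d_k$ and $d_u$. The function name, the use of lower bin edges as representative diameters, and the averaging of two consecutive scans are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def formation_rate(d, n_prev, n_now, dt, beta, k, u, n_u, gr_u):
    """Sketch of J_k = dN/dt + CoagSnk - CoagSrc + n_u * GR_u.

    d              lower edges of the size bins (nm), ascending; d[0] acts as d_min
    n_prev, n_now  number concentration per bin (cm^-3) at two consecutive scans
    dt             time between the two scans (s)
    beta           beta[i, g], coagulation coefficient between bins i and g (cm^3 s^-1)
    k, u           indices of the bins whose lower edges are d_k and d_u (u < len(d))
    n_u, gr_u      size distribution function (cm^-3 nm^-1) and growth rate (nm s^-1) at d_u
    """
    d = np.asarray(d, dtype=float)
    n_prev = np.asarray(n_prev, dtype=float)
    n_now = np.asarray(n_now, dtype=float)
    beta = np.asarray(beta, dtype=float)
    # Assumption: coagulation terms use the mean of the two scans.
    n_mid = 0.5 * (n_prev + n_now)

    # Time rate of change of the number concentration between d_k and d_u.
    dn_dt = (n_now[k:u].sum() - n_prev[k:u].sum()) / dt

    # pair[i, g] = beta[i, g] * N_i * N_g
    pair = beta * np.outer(n_mid, n_mid)

    # Sink: particles between d_k and d_u colliding with any other particle.
    coag_snk = pair[:, k:u].sum()

    # Source: pairs below d_u whose merged volume lands between d_k and d_u.
    vol = d[:u, None] ** 3 + d[None, :u] ** 3
    lands_in_range = (vol >= d[k] ** 3) & (vol < d[u] ** 3)
    coag_src = 0.5 * (pair[:u, :u] * lands_in_range).sum()

    j_k = dn_dt + coag_snk - coag_src + n_u * gr_u
    return j_k, dn_dt, coag_snk, coag_src
```

Returning the individual terms alongside $J_k$ makes it easy to inspect which term dominates for a given event.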

Previous approaches to estimate formation rate
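The population balance method proposed in previous studies is shown in Eq. (2) (Kulmala et al., 2001, 2012):

$$J_k = \frac{\mathrm{d}N_{[d_k,d_u)}}{\mathrm{d}t} + \mathrm{CoagS}_m \cdot N_{[d_k,d_u)} + \frac{\mathrm{GR}}{d_u - d_k} \cdot N_{[d_k,d_u)}, \tag{2}$$

where coagulation sink, $\mathrm{CoagS}_m$, is defined as in Eq. (3).

$$\mathrm{CoagS}_m = \sum_{d_i = d_{\min}}^{+\infty} \beta_{(i,m)} N_{[d_i,d_{i+1})} \tag{3}$$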
The subscript $m$ corresponds to the representative diameter, $d_m$, for particles ranging from $d_k$ to $d_u$; $d_m$ is often estimated as the geometric mean diameter of $d_k$ and $d_u$. Equations (1) and (2) look similar because they are both derived from the general dynamic equation, while their detailed differences are illustrated in Appendix B.
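For illustration, the sketch below evaluates a balance of this kind, assuming the three-term form commonly attributed to Kulmala et al. (2012): the time derivative of the number concentration, a coagulation loss $\mathrm{CoagS}_m \cdot N$, and a growth loss $\mathrm{GR}/(d_u - d_k) \cdot N$. The function names and the numbers in the usage lines are hypothetical.

```python
import math

def geometric_mean_diameter(d_k, d_u):
    """Representative diameter d_m taken as the geometric mean of d_k and d_u."""
    return math.sqrt(d_k * d_u)

def balance_formation_rate(dn_dt, n, coag_s_m, gr, d_k, d_u):
    """Three-term population balance (assumed form).

    dn_dt     time derivative of N between d_k and d_u (cm^-3 s^-1)
    n         number concentration between d_k and d_u (cm^-3)
    coag_s_m  coagulation sink at the representative diameter d_m (s^-1)
    gr        growth rate (nm s^-1)
    d_k, d_u  lower and upper size bounds (nm)
    """
    return dn_dt + coag_s_m * n + gr / (d_u - d_k) * n

# Hypothetical numbers, only to show the call pattern.
d_m = geometric_mean_diameter(3.0, 6.0)  # about 4.2 nm
j = balance_formation_rate(1.0, 1000.0, 1e-3, 3.0 / 3600.0, 3.0, 6.0)
```

Because a single $\mathrm{CoagS}_m$ stands in for the whole size range, the result depends on how representative $d_m$ is of the coagulation loss across that range.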
Equation (4) focuses on the flux into $d_k$ and is theoretically correct in the continuous space of particle diameter. However, when applying Eq. (4) in practice, the size distribution of particles smaller than $d_k$ is required, which is difficult to obtain (see Appendix B). Usually diameter bins larger than $d_k$ are used to estimate the particle formation rate when using the practical expression of Eq. (4) (e.g., Eq. 9 as defined in Sect. 4.3). As illustrated in Fig. 1, such approximation essentially neglects the first three terms on the RHS of Eq. (1) and may lead to underestimation of the particle formation rate because of neglecting the coagulation scavenging effect, especially when intense NPF events are analyzed.

Previous formulae for comparison
Equation (5) is a widely used balance formula to estimate the formation rate in previous studies (Kulmala et al., 2001; Dal Maso et al., 2005; Wu et al., 2007; Shen et al., 2011; Wang et al., 2013): where $N_i$ is the number concentration of size bin $i$. Corresponding to Eq. (2), $d_u$ is 25 nm and $d_m$ is 8 nm in Eq. (5). By comparing Eq. (5) with Eq. (1), it can be concluded that Eq. (5) estimates CoagSnk using a representative $\mathrm{CoagS}_m$ and neglects CoagSrc. The growth rates in all formulae in Sect. 2.2 were estimated using the mode-fitting method suggested in Kulmala et al. (2012).
When calculating $\mathrm{CoagS}_m$, particles smaller than $d_m$ (Kulmala et al., 2012) or even $d_u$ are neglected in some previous studies. The corresponding formulae are shown in Eqs. (6) and (7). The only difference among Eqs. (5), (6), and (7) is the lower bound when calculating $\mathrm{CoagS}_m$ in the second term on the RHS of these equations.
It should be clarified that $d_k$ in Eqs. (5)–(8) was usually 3 nm in previous studies due to the absence of sub-3 nm particle size distributions, and $d_m$ in Eq. (8) was 4 nm rather than 3 nm in previous studies because 4 nm is almost the geometric mean diameter of 3 and 6 nm. Particles smaller than 6 nm were neglected when estimating the coagulation sink term in some studies, although the resulting uncertainties will not be discussed here. The expression of the condensational growth term, i.e., the third term on the RHS of Eq. (8), varies among studies; however, it does not influence the generality of the following discussion.
In previous studies, several size bins larger than $d_k$, typically 3 nm, were adopted when using the practical formula of the definition approach (Weber et al., 1996; Kuang et al., 2008), while here the size range from 1.5 to 2.5 nm is applied to estimate $J_{1.5}$ as shown in Eq. (9).

3 Experiment
An NPF field campaign was carried out in Beijing. The observation period was from 7 March to 7 April 2016. The monitoring site is located on the top floor of a four-storey building in the center of the campus of Tsinghua University. Tsinghua is situated in the northwestern urban area of Beijing and the fourth-ring road is ∼2 km to the south of the monitoring site. The site has been a PM$_{2.5}$ monitoring station since 1999 (He et al., 2001; Cao et al., 2014) and there are no tall buildings nearby. Potential pollution sources in the area are the three cafeterias on campus, which may produce cooking aerosol during meal times, located ∼170 m to the northeast, ∼170 m to the north, and ∼350 m to the northwest.
A DEG-SMPS equipped with a mini-cyDMA specially designed for the classification of sub-3 nm particles was deployed to measure particles in the size range of 1–5 nm (Cai et al., 2017). A particle size distribution system, including an SMPS with a TSI nano-DMA, an SMPS with a TSI long DMA, and an aerodynamic particle sizer, was used to measure particles in the size range of 3 nm to 10 µm in parallel (Liu et al., 2016). Other instruments that produced data not used in this analysis are not listed here.
A C++ program was used to invert the particle size distribution from raw counts while incorporating diffusion losses inside the sampling tube, diffusion losses and charging efficiencies of the bipolar neutralizers, penetration efficiencies and transfer functions of DMAs, and detection efficiencies of CPCs (Hagen and Alofs, 1983; Jiang et al., 2011a). The particle density was assumed to be 1.6 g cm$^{-3}$ according to local observation results (Hu et al., 2012). The mass accommodation coefficient was assumed to be 1.0, and temperature was assumed to be constant at 285 K, the average temperature during the observation period.

Results and discussion
4.1 Upper size bound for formation rate calculation
New particle formation rates using different upper size bounds, $d_u$, of 3, 6, 10, and 25 nm were calculated. Since the maximum size that new particles formed by nucleation have reached varies with time, the upper size bound should not be a constant value to minimize the interference of background particles. A varying upper size bound, $d_b$, was visually determined as the largest size bin in the size range from 3 to 25 nm with a frequency density (particle size distribution), $\mathrm{d}N/\mathrm{d}\log d_p$, larger than $2.8 \times 10^{4}$ cm$^{-3}$. Here, 28 000 was determined visually according to the measured intensity plot of particle size distributions as an approximate boundary for newly formed particles and background particles. The value should be campaign specific or even event specific. Figure 2a indicates that $d_b$ is almost the boundary for particles formed due to nucleation. Estimated $J_{1.5}$ using $2.0 \times 10^{4}$ cm$^{-3}$ as the boundary differed little from that using $2.8 \times 10^{4}$ cm$^{-3}$, indicating that the estimated $J_{1.5}$ is insensitive to the value for the boundary. It is reasonable to regard $d_b$ as a relatively credible value when compared to others. Note that when using $d_b$ as the upper size bound, the $\mathrm{d}N/\mathrm{d}t$ term of newly formed particles in Eq. (1) is approximated by that of sub-25 nm particles to avoid the potential influence of varying size range on particle number concentration.
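The selection rule for the varying upper bound can be sketched in a few lines of Python. This is only an interpretation of the description above: it picks the largest bin between 3 and 25 nm whose $\mathrm{d}N/\mathrm{d}\log d_p$ exceeds the threshold, and returns `None` when no bin qualifies. The function name and the example spectrum are hypothetical.

```python
import numpy as np

def varying_upper_bound(d_nm, dn_dlogdp, threshold=2.8e4, lo=3.0, hi=25.0):
    """Largest bin diameter within [lo, hi] nm whose dN/dlogdp exceeds threshold.

    d_nm       representative bin diameters (nm)
    dn_dlogdp  size distribution dN/dlogdp per bin (cm^-3)
    threshold  campaign- or event-specific boundary value (cm^-3)
    """
    d_nm = np.asarray(d_nm, dtype=float)
    dn_dlogdp = np.asarray(dn_dlogdp, dtype=float)
    selected = (d_nm >= lo) & (d_nm <= hi) & (dn_dlogdp > threshold)
    if not selected.any():
        return None
    return float(d_nm[selected].max())

# Hypothetical spectrum at one time step.
d_b = varying_upper_bound([2, 3, 5, 8, 12, 20, 30],
                          [9e4, 8e4, 5e4, 3e4, 1e4, 3.5e4, 9e4])
```

Re-running the same call with a different `threshold` (for example `2.0e4`) is a quick way to repeat the sensitivity check described in the text.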
As shown in Fig. 2b, estimated $J_{1.5}$ values using $d_b$ and a constant value of 25 nm as the upper bounds are almost the same (the mean relative error is 2.2 %). The maximum difference between these two choices is ∼10 %, which appears before 08:00 when $d_b$ is less than 5 nm and the number concentration of sub-25 nm particles is ∼2 times that of sub-6 nm particles and ∼3 times that of sub-3 nm particles. This indicates that the influence of non-freshly nucleated particles in estimating $J_{1.5}$ is not important because of their comparatively low diffusivities even though their concentration is comparatively high at the beginning of NPF events. Estimated $J_{1.5}$ values using $d_u$ of 6 and 10 nm are consistent with those using $d_b$ before 10:00 (the mean relative errors are 4.8 and 2.6 %, respectively). However, when particles formed by nucleation grow beyond the upper size bound, calculated $J_{1.5}$ is underestimated when using 6 and 10 nm as the upper bound. For example, the mean relative errors of estimated $J_{1.5}$ using $d_u$ values of 6 and 10 nm between 10:30 and 15:00 are 18.6 and 12.8 %, respectively. When calculating $J_{1.5}$ using 3 nm as $d_u$, an average 47 % underestimation was found for this event.
The reason for underestimation when using smaller $d_u$ can be illustrated by Fig. 2c. $J_u$ is estimated by $n_u \cdot \mathrm{GR}_u$ in Eq. (1). This estimation may not be accurate when $d_u$ is small because the assumption that the net coagulation between any particle larger than $d_u$ and other particles is negligible may be violated. As illustrated in the derivation of Eq. (1) in Appendix A, a nearly zero $J_u$ is preferred when using Eq. (1). However, as shown in Fig. 2c, estimated $J_3$ is still a large fraction compared to $J_{1.5}$, while $J_6$ and $J_{10}$ are 27.8 and 17.6 % of $J_{1.5}$ on average between 10:30 and 15:00, respectively. Although $J_u$ is approximated by $n_u \cdot \mathrm{GR}_u$ rather than simply neglected, this approximation may still lead to uncertainties.
Since $J_{1.5}$ values estimated from the varying $d_b$ and a constant value of 25 nm are almost the same with an acceptable relative error even under the interference of non-freshly nucleated particles, 25 nm was adopted as the upper bound for calculating $J$ in this study. It is reasonable to neglect $J_u$ for simplicity when $d_u$ is determined according to the two criteria. It should be clarified that 25 nm is not necessarily valid for all other studies because the upper bound should be determined by the two criteria and can be campaign specific. However, it can be concluded that a very small upper bound, such as 3 nm, is not recommended because particles formed by nucleation surely grow larger than 3 nm in a typical NPF event, while the intense primary emission of particles around 3 nm is rarely observed in the atmosphere (unless near the emission sources).

Comparison with previous formulae
Estimated $J_{1.5}$ values using Eqs. (1) and (5)–(9) on 13 March are shown in Fig. 3, and $d_k$, $d_u$, and $d_{\min}$ are 1.5, 25, and 1.3 nm, respectively, when using Eq. (1). It can be concluded that, except for Eq. (8), the other formulae significantly underestimate $J_{1.5}$ compared to Eq. (1). By comparing the contribution of each term on the RHS of Eqs. (1) and (5)–(9), it was found that the underestimation of formation rates is mainly caused by the underestimation of CoagSnk. Equation (9) simply neglects CoagSnk and other terms ($\mathrm{d}N/\mathrm{d}t$ and CoagSrc) compared to Eq. (1), so its result is the lowest among the six formulae. Equation (5) estimates CoagSnk using an average $\mathrm{CoagS}_m$, which leads to underestimation because CoagS at 8 nm happens to be smaller than at most other diameters in the size range from 1.5 to 25 nm, as illustrated in Appendix B. Equations (6) and (7) neglect particles smaller than 8 and 25 nm, respectively, when $\mathrm{CoagS}_m$ is calculated. Such a simplification may be reasonable for a relatively clean atmosphere in which nucleation intensity is not strong; however, these approximations are not suitable for analyzing typical NPF events in Beijing during which coagulation among nucleation mode particles is a major proportion of CoagSnk. $J_{1.5}$ estimated using Eq. (8) agrees well with that estimated using Eq. (1); however, it does not mean that 6 nm serves as a better upper size bound than 25 nm. The agreement between the results estimated using Eqs. (1) and (8) is due to the more accurate estimation of CoagSnk when using an average $\mathrm{CoagS}_m$ in a narrower size range. In addition, in this case the underestimation of CoagSnk when using Eq. (8) is coincidentally canceled out by the overestimation of the formation rate caused by neglecting CoagSrc.
The importance of coagulation scavenging among newly formed particles due to nucleation is illustrated in Fig. 4. Scavenging due to coagulation with particles smaller than $d_p$ is neglected, as mathematically defined in the formula in Fig. 4a. CoagSnk increases rapidly with the decrease in $d_p$ rather than maintaining an approximately constant value during NPF events, indicating that coagulation among nucleated particles contributes a considerable fraction to CoagSnk in Beijing. The necessity of a sub-3 nm particle size distribution is also demonstrated, which means that estimated $J_3$ may also be underestimated due to the absence of sub-3 nm data, as illustrated in Appendix B. Approximation of CoagSnk estimated using a representative $\mathrm{CoagS}_m$ is also shown in Fig. 4b, indicating the underestimation of the new particle formation rate when applying Eq. (5) to analyze NPF events in Beijing. However, calculated CoagSnk on a non-NPF event day and in non-NPF periods on NPF days is almost unaffected by the coagulation scavenging effect of particles in nucleation mode (smaller than 25 nm) because the number concentration of nucleation mode particles at non-NPF times is comparatively low.

Characteristics of estimated formation rate in Beijing
For the NPF events observed in the Beijing campaign, CoagSnk is a governing component of the estimated $J_{1.5}$. The estimated formation rate on 13 March and the four terms on the RHS of Eq. (1), i.e., $\mathrm{d}N/\mathrm{d}t$, CoagSnk, CoagSrc, and the condensational growth term, are shown in Fig. 5. CoagSnk is almost the same as the estimated $J_{1.5}$ in Beijing, while the difference between them is mainly due to $\mathrm{d}N/\mathrm{d}t$ with an absolute value that is comparatively higher at the beginning and the end of the NPF event. The condensational growth term, $n_u \cdot \mathrm{GR}_u$, is negligible compared to other terms, which is reasonable since $J_u$ is assumed to be unimportant when determining $d_u$ in Eq. (1). The governing role of CoagSnk in estimated formation rates in Beijing emphasizes the importance of fully considering the coagulation scavenging effect among particles formed by nucleation. Equations (5)–(9) may fit well in relatively clean atmospheric environments in which the new particle formation rate is comparatively low, such as in Hyytiälä, and the agreement of Eqs. (8) and (9) has been reported in a numerically simulated NPF event in which $J_3$ is less than 1 cm$^{-3}$ s$^{-1}$ (Vuollekoski et al., 2012). However, problems appear when applying them in urban Beijing because of underestimating the governing fraction of estimated $J_{1.5}$, i.e., CoagSnk.
Coagulation sink, CoagS, is not the major reason for the governing role of CoagSnk in Beijing. It is generally considered that the atmosphere in a typical urban area in China, such as Beijing, is comparatively polluted. However, observed NPF events mainly occur on clean days when the air mass comes from the north or northwest of Beijing. The mean PM$_{2.5}$ mass concentration reported by the nearest national monitoring station, Wanliu station, was 10.4 µg m$^{-3}$ during all NPF events in this campaign. The aerosol surface area concentration is characterized by the Fuchs surface area, $A_{\mathrm{Fuchs}}$ (McMurry, 1983), and the condensation sink, CS (Kulmala et al., 2001), which are often used to examine the coagulation scavenging effect. The positive correlation between $A_{\mathrm{Fuchs}}$ and CS is illustrated in McMurry et al. (2005), while CS can be regarded as the CoagS of sulfuric acid molecules. Figure 6a shows the comparison of $A_{\mathrm{Fuchs}}$ and CS in Beijing to those in other locations around the world. $A_{\mathrm{Fuchs}}$ and CS during NPF events in this study are higher than those in Hyytiälä, similar to those observed in Boulder, and lower than those in Atlanta, Mexico City, and New Delhi. This indicates that coagulation sink in urban Beijing on NPF days is in a common range rather than higher than most other places around the world.
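For readers who want to compute a condensation sink from a binned size distribution, the sketch below uses the widely used discrete form $\mathrm{CS} = 2\pi D \sum_i \beta_{M,i}\, d_{p,i}\, N_i$ with the Fuchs-Sutugin transition-regime correction $\beta_M$. The vapor diffusion coefficient and mean free path are left as explicit inputs because their values are not given in the text; the numbers in the usage line are illustrative assumptions only.

```python
import math

def fuchs_sutugin(kn, alpha=1.0):
    """Fuchs-Sutugin transition-regime correction for Knudsen number kn.

    alpha is the mass accommodation coefficient (1.0 assumed here).
    """
    return (kn + 1.0) / (0.377 * kn + 1.0 + 4.0 / (3.0 * alpha) * (kn ** 2 + kn))

def condensation_sink(dp_cm, n_cm3, diff_cm2_s, mfp_cm, alpha=1.0):
    """Condensation sink (s^-1) from binned data.

    dp_cm       particle diameters per bin (cm)
    n_cm3       number concentrations per bin (cm^-3)
    diff_cm2_s  diffusion coefficient of the condensing vapor (cm^2 s^-1), assumed input
    mfp_cm      mean free path of the condensing vapor (cm), assumed input
    """
    total = 0.0
    for dp, n in zip(dp_cm, n_cm3):
        kn = 2.0 * mfp_cm / dp  # Knudsen number based on particle radius
        total += fuchs_sutugin(kn, alpha) * dp * n
    return 2.0 * math.pi * diff_cm2_s * total

# Illustrative values only: one bin of 100 nm particles at 1000 cm^-3.
cs = condensation_sink([1e-5], [1000.0], diff_cm2_s=0.1, mfp_cm=1e-5)
```

In the continuum limit (Knudsen number approaching zero) the correction tends to one, so large particles contribute in proportion to their diameter.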
As shown in Eq. (1), CoagSnk is approximately proportional to the square of the particle number concentration. Nucleation intensity in urban Beijing, characterized by the number concentration of particles larger than 3 nm during typical NPF event periods, is found to be higher than in Hyytiälä and Atlanta (as shown in Fig. 6b). The number concentration of sub-3 nm particles is not accounted for to maintain comparability. Although $A_{\mathrm{Fuchs}}$ and CoagS represent the relative importance of the coagulation scavenging effect (McMurry, 1983; Kulmala et al., 2001), it is the CoagSnk that reflects the number of particles lost due to coagulation scavenging in the size range of $d_k$ to $d_u$. This explains the governing status of CoagSnk in estimated formation rates in urban Beijing with intense NPF events.
Figure caption (fragment): (Wu et al., 2007). Condensation sink on NPF days in New Delhi was reported by Kulmala et al. (2005). ANARChE (McMurry et al., 2005) and MILAGRO (Iida et al., 2008) were conducted in Atlanta and Tecamac, respectively, while EUCCARI (Manninen et al., 2009), QUEST II (Sihto et al., 2006), and QUEST IV (Riipinen et al., 2007)
Figure 7 further illustrates the underestimation in new particle formation rates in China due to previously used formulae, especially for Eq. (7), which neglects coagulation among sub-25 nm particles, and Eq. (9), which simply neglects the net coagulation effect. The mean $J_{1.5}$ values estimated in this study using Eq. (1) are 1.2, 2.4, and 6.4 times those estimated using Eqs. (5), (7), and (9), respectively. The mean $J_3$ values estimated in this study using Eq. (1) are 1.2, 2.0, and 3.3 times those estimated using Eqs. (5), (7), and (9), respectively. The $J_3$ values reported in previous studies in urban Beijing (Wu et al., 2007; Yue et al., 2009; Wang et al., 2013, 2015), Shanghai (Xiao et al., 2015), and Shangdianzi, the regional background station of the North China Plain (Shen et al., 2011; Wang et al., 2013), are also shown in Fig. 7. Higher formation rates are anticipated if the coagulation scavenging effect is fully considered when analyzing these NPF events. Note that sub-3 nm particles are also accounted for when calculating $J_3$ in this study, while they are not in previous studies except for the campaign in Shanghai.

Conclusions
A new balance formula to estimate new particle formation rate derived from the aerosol general dynamic equation was proposed. The new formula estimates the effect of coagulation scavenging better compared to previous ones. Two criteria in determining the upper bound for calculation were proposed. An NPF campaign in urban Beijing was carried out in spring 2016. Aerosol size distributions down to ∼1 nm were measured and used to test the new formula and those widely used in previous studies. It was found that formation rates in urban Beijing are underestimated to different extents in previously used formulae, and the underestimation of the coagulation scavenging effect (corresponding to the coagulation sink term) is the major reason. Coagulation among particles in nucleation mode was found to be important when estimating the coagulation scavenging effect in urban Beijing. The estimated formation rates of 1.5 nm particles in this campaign using the new formula were 1.3–4.3 times those estimated using the formula neglecting coagulation among particles in the nucleation mode. The coagulation sink term is the governing component of the estimated formation rate in urban Beijing. Although higher than in a relatively clean atmosphere, such as in Hyytiälä, coagulation sink (expressed in the form of Fuchs surface area and condensation sink) in urban Beijing on NPF days is lower than reported in Atlanta and Mexico City. However, the number concentration of particles formed due to nucleation in urban Beijing is comparatively high, which leads to high coagulation loss. The formulae used in previous studies may perform well when describing relatively weak NPF events in a clean atmosphere, but they underestimate the coagulation scavenging effect when intense NPF events are analyzed. The formation rates reported in previous studies for urban Beijing and other locations with intense NPF events might be underestimated because of their underestimation or neglect of the coagulation scavenging effect.

Appendix A: Derivation of nucleation rate from aerosol general dynamic equation
The nucleation rate is the rate at which clusters grow to produce the critical cluster (nuclei). However, a more specific and microscopic definition of nucleation rate is needed for any further calculation, and it should be easily and unambiguously transferred into a mathematical expression. Here we adopt the definition based on droplet current (Eq. 10.1, Friedlander, 2000):

$$J_g = \beta_{(1,g-1)} N_1 N_{g-1} - \alpha_g s_g N_g. \tag{A1}$$

Formation rate, $J_g$, is the excess rate of the passage from $g-1$ (cluster or particle with $g-1$ molecules) to $g$ by condensation over the passage from $g$ to $g-1$ by evaporation. If $g$ is the size of the critical cluster, $J_g$ is defined as the nucleation rate, $I$. $N_g$ is the number concentration of cluster $g$; $\beta_{(i,j)}$ is the coagulation coefficient of $i$ and $j$, and it can be theoretically estimated from the diameter of $i$ and $j$ (Eq. 13.56, Seinfeld and Pandis, 2006); $\alpha_g$ is the monomer evaporation flux from $g$; and $s_g$ is the effective surface area of $g$ for evaporation.
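As an illustration of how a coagulation coefficient can be estimated from two particle diameters, the sketch below implements a Fuchs-type interpolation between the free-molecular and continuum regimes. It is not claimed to reproduce the exact expression used by the authors. The default temperature (285 K) and particle density (1.6 g cm$^{-3}$) follow the values stated in the experiment section, whereas the air viscosity and mean free path defaults are assumed round numbers.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J K^-1)

def slip_correction(dp, mfp):
    """Cunningham slip correction for diameter dp and air mean free path mfp (m)."""
    ratio = 2.0 * mfp / dp
    return 1.0 + ratio * (1.257 + 0.4 * math.exp(-1.1 / ratio))

def _particle_properties(dp, temp, rho, mu, mfp):
    """Diffusivity, mean thermal speed, and Fuchs distance g for one particle."""
    mass = rho * math.pi / 6.0 * dp ** 3
    diff = K_B * temp * slip_correction(dp, mfp) / (3.0 * math.pi * mu * dp)
    speed = math.sqrt(8.0 * K_B * temp / (math.pi * mass))
    ell = 8.0 * diff / (math.pi * speed)
    g = ((dp + ell) ** 3 - (dp ** 2 + ell ** 2) ** 1.5) / (3.0 * dp * ell) - dp
    return diff, speed, g

def fuchs_beta(dp1, dp2, temp=285.0, rho=1600.0, mu=1.78e-5, mfp=6.5e-8):
    """Brownian coagulation coefficient (m^3 s^-1) for diameters dp1, dp2 (m).

    mu (Pa s) and mfp (m) are assumed values for air near ambient conditions.
    """
    d1, c1, g1 = _particle_properties(dp1, temp, rho, mu, mfp)
    d2, c2, g2 = _particle_properties(dp2, temp, rho, mu, mfp)
    size_sum = dp1 + dp2
    diff_sum = d1 + d2
    correction = (size_sum / (size_sum + 2.0 * math.sqrt(g1 ** 2 + g2 ** 2))
                  + 8.0 * diff_sum / (math.sqrt(c1 ** 2 + c2 ** 2) * size_sum))
    return 2.0 * math.pi * diff_sum * size_sum / correction

# Multiply by 1e6 to convert m^3 s^-1 to cm^3 s^-1.
beta_10nm = fuchs_beta(10e-9, 10e-9)
```

The coefficient is largest for collisions between very small and much larger particles, which is why freshly nucleated particles are scavenged efficiently.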
Only formation due to condensational growth is considered in the definition of Eq. (A1), while formation due to the coagulation of smaller clusters is not taken into account. This is based on the assumption that critical clusters are mainly formed due to the condensational growth of sulfuric acid and other chemical species. The formation of a critical cluster by coagulation does not influence the generality of the following derivation and can be readily incorporated; it will be clarified at the end of Appendix A.
The other basic equation for the derivation is the general dynamic equation in the discrete form (Eq. 11.3, Friedlander, 2000):

$$\frac{\mathrm{d}N_g}{\mathrm{d}t} = \frac{1}{2}\sum_{i=2}^{g-2} \beta_{(i,g-i)} N_i N_{g-i} - N_g \sum_{i=2}^{+\infty} \beta_{(i,g)} N_i + \beta_{(1,g-1)} N_1 N_{g-1} - \beta_{(1,g)} N_1 N_g - \alpha_g s_g N_g + \alpha_{g+1} s_{g+1} N_{g+1}. \tag{A2}$$

As shown in Eq. (A2), the time rate of change of a cluster or particle number concentration, $\mathrm{d}N_g/\mathrm{d}t$ on the left-hand side (LHS), is determined by formation due to the coagulation of smaller clusters and (or) particles, coagulation scavenging with preexisting clusters and particles, condensational growth from $g-1$ and to $g+1$, and evaporation to $g-1$ and from $g+1$, corresponding to the six terms on the right-hand side (RHS) of Eq. (A2). The evaporation terms (corresponding to the fifth and sixth terms on the RHS) may be zero or nearly zero when $g$ is large; however, their exact values have no influence on the derivation. An important assumption to be noted is that meteorological transport, dilution, primary emission of $g$, and other losses (e.g., wall loss) are not included in Eq. (A2).
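Note that the last four terms on the RHS of Eq. (A2) are equal to $J_g - J_{g+1}$ by substituting Eq. (A1). By replacing the subscript $g$ with the critical cluster size, $k$, we have

$$J_k = \frac{\mathrm{d}N_k}{\mathrm{d}t} + N_k \sum_{i=2}^{+\infty} \beta_{(i,k)} N_i - \frac{1}{2}\sum_{i=2}^{k-2} \beta_{(i,k-i)} N_i N_{k-i} + J_{k+1}. \tag{A3}$$

The expression of Eq. (A3) is similar to Eq. (A6) in Kuang et al. (2012), which was also obtained using the balance method. $J_{k+1}$ is usually a relatively large term in Eq. (A3), and it can be accounted for by iteration. Equation (A5) is obtained by summing Eq. (A3) from subscript $k$ to $u-1$ as shown in Eq. (A4), where $u$ is the particle size at the upper bound of the concerned size range.

$$\sum_{g=k}^{u-1} J_g = \sum_{g=k}^{u-1} \left( \frac{\mathrm{d}N_g}{\mathrm{d}t} + N_g \sum_{i=2}^{+\infty} \beta_{(i,g)} N_i - \frac{1}{2}\sum_{i=2}^{g-2} \beta_{(i,g-i)} N_i N_{g-i} + J_{g+1} \right) \tag{A4}$$

$$J_k = \frac{\mathrm{d}}{\mathrm{d}t}\sum_{g=k}^{u-1} N_g + \sum_{g=k}^{u-1}\sum_{i=2}^{+\infty} \beta_{(i,g)} N_i N_g - \frac{1}{2}\sum_{g=k}^{u-1}\sum_{i=2}^{g-2} \beta_{(i,g-i)} N_i N_{g-i} + J_u \tag{A5}$$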
On the RHS of Eq. (A5) are the time rate of change of the particle concentration, the coagulation sink term, the coagulation source term, and the condensational growth term. Note that when particle $u$ is large enough, $J_u$ is nearly zero, i.e., $\lim_{u \to \infty} J_u = 0$, because of the negligible condensational growth and low number concentration compared to those of freshly nucleated small particles. Equation (A6) is obtained by replacing the upper bound, $u$, with infinity and further simplified by combining the second and third terms on the RHS of Eq. (A5).
Theoretically, Eq. (A6) can be used to estimate $I$ since each term on the RHS can be calculated. However, the validity of Eq. (A6) faces a higher risk of violation when applied in a real atmosphere due to non-negligible primary emission sources. This is because Eq. (A6) is a balance equation for the whole aerosol population rather than a limited size range of the nucleation mode. It is both more cautious and efficient to use Eq. (A5) with a proper particle size $u$ and a reasonable estimation of $J_u$.
When using measured particle size distribution to estimate $I$, Eq. (A5) has to be converted from the discrete form into the continuous form. For the third term on the RHS of Eq. (A5), i.e., the coagulation source term, its summation sequence can be rearranged as

$$\frac{1}{2}\sum_{g=k}^{u-1}\sum_{i=2}^{g-2} \beta_{(i,g-i)} N_i N_{g-i} = \frac{1}{2}\sum_{\substack{i \ge 2,\ j \ge 2 \\ k \le i+j \le u-1}} \beta_{(i,j)} N_i N_j = \frac{1}{2}\sum_{g=2}^{u-1}\ \sum_{i=\max(2,\,k-g)}^{u-1-g} \beta_{(i,g)} N_i N_g. \tag{A7}$$

The formulae on both the far LHS and the far RHS of Eq. (A7) are equally accurate to estimate the coagulation source term. However, simply substituting the continuous particle diameter (e.g., $d_g$) for the discrete size (e.g., $g$) on the far LHS of Eq. (A7) will result in uncertainties when the size bins do not increase linearly in the particle volume space.
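The rearrangement of the coagulation source term can be checked numerically. The sketch below evaluates the sum in two orders: once by product size $g$ and collision partner $i$, and once over collision pairs $(i, g)$ whose combined size lies between $k$ and $u-1$. Starting the indices at 2 assumes that collisions with monomers are handled by the condensation terms; that choice, the function names, and the random test data are assumptions of this example.

```python
import numpy as np

def source_by_product_size(beta, n, k, u):
    """Sum over product size g = k..u-1 and partner i = 2..g-2."""
    total = 0.0
    for g in range(k, u):
        for i in range(2, g - 1):
            total += beta[i, g - i] * n[i] * n[g - i]
    return 0.5 * total

def source_by_collision_pair(beta, n, k, u):
    """Sum over pairs (i, g) with k <= i + g <= u - 1 and i, g >= 2."""
    total = 0.0
    for g in range(2, u):
        for i in range(max(2, k - g), u - g):
            total += beta[i, g] * n[i] * n[g]
    return 0.5 * total

# Random symmetric coefficients and concentrations, indexed by cluster size.
rng = np.random.default_rng(0)
raw = rng.random((12, 12))
beta = 0.5 * (raw + raw.T)
n = rng.random(12)
lhs = source_by_product_size(beta, n, k=5, u=12)
rhs = source_by_collision_pair(beta, n, k=5, u=12)
```

The second ordering is the one that carries over to measured size bins, because it only requires the sizes of the two colliding particles rather than bins that are evenly spaced in volume.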
As indicated in Fig. A1, substituting the continuous particle diameter for the discrete size on the far RHS of Eq. (A7) is independent of the bin structure for d g and d i. Thus, Eq. (A5) can be rewritten as Eq. (A8), where d min is theoretically the minimum cluster size. Note that, because measured size bins are finite, Eq. (A8) is expressed in the summation form rather than the integration form. Practically, Eq. (A8) is only an estimation of Eq. (A5) because coagulation is calculated by using size bins, while the particle sizes in each size bin are not exactly the same as the representative diameter, d g.

The upper size bound, d u, is a "properly large" size at which J u is negligible compared to the sum of the other three terms on the RHS of Eq. (A8). Properly large is defined by the following two criteria: d u should not be so large that the calculated nucleation rate is non-negligibly affected by transport or primary emissions, and d u should not be so small that the calculated nucleation rate is underestimated because J u is still too large to be neglected or to be estimated by growth rate (as illustrated in the following paragraph). These two criteria seem to be contradictory; however, as illustrated in Fig. 2b, the calculated nucleation rate is usually not sensitive to the upper bound because J u decreases rapidly with the increase in d u, since the freshly nucleated particles are usually in a relatively narrow size range, especially during strong NPF events.

The fourth term on the RHS of Eq. (A8), J u, is usually so small that it can be simply neglected when d u is properly large. However, an approximate term is recommended for better estimation. Here we introduce a sufficient but possibly unnecessary condition that the net coagulation effect between any particle larger than d u and other particles can be neglected when estimating the GR term. Define N [d u ,d u + d) t as the number concentration of particles in a narrow size range from d u to d u + d at time t. After a very short time dt, these particles grow into the size range from d u + dd to d u + d + dd, which is based on the assumption that diameter growth is equal for different particles in such a narrow size and time range, while the number concentration remains the same since there is no particle loss. Particles in the size range from d u + d to +∞ at time t grow to the size range from d u + d + dd to +∞. Since the size range is narrow enough, it is reasonable to assume that the concentration of particles is equally distributed in the size range from d u to d u + d, which leads to Eq. (A9). The particle size distribution function, n, and growth rate, GR, are defined as Eqs. (A10) and (A11), respectively. Equation (A12) is obtained by combining Eqs. (A6), (A9), (A10), and (A11).
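As a minimal sketch of the growth term, assume that J u can be approximated by the product of the size distribution function and the growth rate at d u (n u GR u), with n expressed as dN/dd p. The snippet below converts a measured dN/dlogd p value into dN/dd p and evaluates the flux. The function names and unit choices (nm, nm h −1, cm −3) are illustrative assumptions, and the input numbers are the n 25 and GR 25 values quoted in Appendix B.

```python
import math


def dn_ddp_from_dn_dlogdp(dn_dlogdp, dp_nm):
    """Convert dN/dlog10(dp) [cm^-3] to dN/d(dp) [cm^-3 nm^-1]."""
    # d(log10 dp) = d(dp) / (dp * ln 10)
    return dn_dlogdp / (dp_nm * math.log(10.0))


def growth_flux(dn_dlogdp_u, du_nm, gr_nm_per_h):
    """Approximate J_u = n_u * GR_u in cm^-3 s^-1 (assumed form of the growth term)."""
    n_u = dn_ddp_from_dn_dlogdp(dn_dlogdp_u, du_nm)
    return n_u * gr_nm_per_h / 3600.0  # convert h^-1 to s^-1


# Illustrative input: dN/dlogdp = 164 cm^-3 at 25 nm, GR = 10.86 nm h^-1
j_u = growth_flux(164.0, 25.0, 10.86)
print(f"J_u is roughly {j_u:.4f} cm^-3 s^-1")
```

With these inputs the growth flux is of the order of 10 −2 cm −3 s −1, which illustrates why the term can often be neglected when d u is properly large.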
Finally, by combining Eqs. (A8) and (A12), we can obtain the equation to estimate the nucleation rate as Eq. (A13). The first term on the RHS of Eq. (A13) is the change in the number concentration of particles ranging from d k to d u. The second and third terms are particle loss to coagulation scavenging and particle formation by coagulation, named the coagulation sink term (CoagSnk) and the coagulation source term (CoagSrc), respectively (Kuang et al., 2012). The fourth term is the condensational growth term, which is an approximation of the formation rate, J u. This balance formula derived from the aerosol general dynamic equation can also be expressed as Eq. (A14).

When applying Eq. (A13) in practice, d k is usually the assumed size of the critical nuclei (or the lowest size limit of the instrument, corresponding to the formation rate, J k, rather than the nucleation rate, I). The dN / dt term can be obtained either by differencing adjacent time bins or by fitting over a continuous time period. CoagSnk and CoagSrc can be directly calculated from the particle size distribution, where d min is the minimum detected particle diameter. If formation by the coagulation of smaller clusters is also included in the definition of nucleation rate, the calculation of CoagSrc (the third term on the RHS of Eq. A12) should begin with d k+1 instead of d k, which usually has little effect since the difference is only one size bin and CoagSrc is usually a minor term of J in the atmospheric environment. The growth rate can be estimated by using different methods (Weber et al., 1996, 1997; Kulmala et al., 2012; Lehtipalo et al., 2014), or the growth term can simply be neglected when d u is properly large.

www.atmos-chem-phys.net/17/12659/2017/
It should be clarified that the formation rate calculated using Eq. (A13) may be underestimated because coagulation scavenging by particles and clusters smaller than d min is neglected due to the limitations of measuring instruments. As illustrated in Fig. 6a, CoagSnk calculated using d p larger than 3 nm is ∼ 89.1 % of that using d p larger than 1.5 nm. It can be inferred that the calculated J 3 was slightly underestimated in previous studies lacking size distributions for sub-3 nm particles. In this study, measured particles down to 1.3 nm are accounted for when calculating J 1.5 and J 3. Neglecting coagulation between clusters may also have a non-negligible effect on the calculated results (McMurry, 1983), which calls for the measurement of major molecular clusters participating in nucleation if a more accurate formation rate is to be obtained.
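The balance described in this appendix can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the exact form of Eq. (A13): dN / dt is obtained by differencing two adjacent scans, CoagSnk sums the collisions of every particle in the range from d k to d u with all measured particles, and CoagSrc counts (with a factor of one-half for double counting) every pair whose merged diameter falls in that range. The function names, the bin values, and the constant coagulation coefficient are hypothetical; in practice β depends on both diameters.

```python
def merged_diameter(d1, d2):
    """Diameter of the particle formed when two spheres coagulate (volume is conserved)."""
    return (d1 ** 3 + d2 ** 3) ** (1.0 / 3.0)


def formation_rate(dp, n_prev, n_now, dt_s, beta, d_k, d_u, growth_term=0.0):
    """Estimate J_k (cm^-3 s^-1) as dN/dt + CoagSnk - CoagSrc + growth term.

    dp          : representative bin diameters (nm), starting at d_min
    n_prev      : bin number concentrations (cm^-3) at the previous scan
    n_now       : bin number concentrations (cm^-3) at the current scan
    dt_s        : time between the two scans (s)
    beta        : function beta(d1, d2) -> coagulation coefficient (cm^3 s^-1)
    growth_term : approximation of J_u (cm^-3 s^-1), zero if neglected
    """
    in_range = [d_k <= d < d_u for d in dp]

    # dN/dt for particles between d_k and d_u, by differencing adjacent scans
    n_range_prev = sum(n for n, ok in zip(n_prev, in_range) if ok)
    n_range_now = sum(n for n, ok in zip(n_now, in_range) if ok)
    dn_dt = (n_range_now - n_range_prev) / dt_s

    coag_snk = 0.0
    coag_src = 0.0
    for g, d_g in enumerate(dp):
        for i, d_i in enumerate(dp):
            rate = beta(d_g, d_i) * n_now[g] * n_now[i]
            if in_range[g]:
                coag_snk += rate  # loss of particle g to any other particle
            if d_k <= merged_diameter(d_g, d_i) < d_u:
                coag_src += 0.5 * rate  # each pair is visited twice
    return dn_dt + coag_snk - coag_src + growth_term


# Hypothetical three-bin example with a constant coagulation coefficient
dp_example = [2.0, 3.0, 10.0]
j_example = formation_rate(
    dp_example,
    [100.0, 50.0, 10.0],
    [130.0, 50.0, 10.0],
    60.0,
    lambda d1, d2: 1e-6,
    d_k=1.5,
    d_u=5.0,
)
print(j_example)
```

Passing β as a function keeps the sketch independent of any particular coagulation kernel; a size-dependent kernel can be substituted without changing the balance itself.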

Appendix B: Relationships with previous approaches
Since the new balance approach proposed in this study is based on the aerosol general dynamic equation with a reasonable assumption that the net coagulation of any particle larger than the properly large upper bound, d u , and other particles can be neglected, its inter-relationships with former approaches can be elucidated by making additional assumptions and approximations.
The formation rate is defined as the growth flux of particles past a given size and can be expressed as Eq. (B1), where k is the number of molecules contained in the particle (Heisler and Friedlander, 1977; Weber et al., 1996; Kuang et al., 2008, 2012). Note that Eq. (B1) is valid only in the continuous space of particle diameter, while a more accurate expression in the discrete form is shown as Eq. (B2).
Equation (B2) is believed to be theoretically correct since the only condensational flux into d k is the growth of smaller clusters or particles with diameter d k−1. Although similar in expression to Eq. (A12), Eq. (B2) focuses on the flux into rather than out of the size bin for calculation, and there is no need to account for coagulation scavenging, as illustrated in Fig. 1. A theoretical expression of GR proposed in a previous study is shown as Eq. (B3), where α is herein the coagulation efficiency (fraction of collisions that successfully result in coagulation), V 1 is the volume increment when adding a single gaseous precursor, and v is the mean thermal velocity of the gaseous precursor (Weber et al., 1996). Here we update the equation by considering different chemical species and describing coagulation by β, as shown in Eq. (B4). The subscript c denotes different chemical species of monomers participating in the condensational growth of cluster k − 1, and N 1c is their corresponding number concentration. Coagulation efficiency is included in each β (1c,k) (Eq. 13.56, Seinfeld and Pandis, 2006).
Equation (B2) is theoretically correct; however, it is difficult to apply in practice since n k−1 is obtained by approximation over a size range around d k rather than as the true frequency density at cluster k − 1, dN k−1 / dd k−1. Moreover, because the size distribution below d k is difficult to obtain, the size range for estimation is usually larger than d k.
For example, the formula to estimate J 3 using nano-SMPS data in Kuang et al. (2008) is shown as Eq. (B5). Although Eq. (B5) seems to be an estimation of Eq. (B2), they are essentially two different equations. This is because the measured particle number concentration in the size range for calculation, i.e., N 3-6 in Eq. (B5), has been affected by coagulation. By comparing with Eq. (A14), it can be concluded that dN / dt, CoagSnk, and CoagSrc are simply neglected in Eq. (B5), while Eq. (B2) is not subject to this problem by its definition.
There are also problems in estimating GR k−1. Equation (B4) is only a theoretical formula since it is nearly impossible to determine all the chemical species contributing to nucleation and their corresponding coagulation coefficients in the complicated atmospheric environment. GR calculated from sulfuric acid using Eq. (B3) may lead to underestimation (Kuang et al., 2010), while uncertainties also exist in the approaches that fit particle size distributions to obtain GR (Kulmala et al., 2012; Lehtipalo et al., 2014) because the effect of coagulation on the measured size distribution is also neglected. In conclusion, Eq. (B2) is considered to be theoretically correct; however, it is not recommended for analyzing NPF events with high coagulation scavenging.
The other approach is a balance method based on a macroscopic point of view, shown as Eq. (B6) (Kulmala et al., 2001, 2004), and we adopt the equation in the most recent paper (Kulmala et al., 2012). Usually d m is the geometric mean diameter of d k and d u. However, coagulation between any particle smaller than d m or even d u and another particle (of any size) is sometimes neglected in the calculation, as in the formula suggested in Kulmala et al. (2012) and shown as Eq. (B7).
Equation (B6) appears similar to Eq. (A14) since they both originate from the population balance method; however, there are some differences between them. First, the upper bound for particle size in Eq. (B6), d u, lacks a strict definition and discussion. As discussed in Appendix A, d u should be decided by the two criteria that the effects of transport and primary emissions are negligible and the condensational growth term, J u, is relatively small compared to J k. The upper bound of 25 nm is usually reasonable since a high concentration of particles formed by nucleation dominates the coagulation sink term during strong new particle formation times, while the upper bound of 6 nm may lead to underestimation when freshly formed particles grow larger, as discussed in the main text.
Second, scavenging by coagulation with particles smaller than d m is not included if Eq. (B7) is used to calculate CoagS. As shown in Fig. B1, CoagS as defined in Eq. (B8) is always larger than that calculated using Eq. (B7), and their difference increases as d m increases. At 8 nm, CoagS from Eq. (B7) is ∼ 31 % of that from Eq. (B8), indicating a large underestimation when using Eq. (B7). Note that Eq. (3) and the approximation formula (estimated with condensation sink) proposed by Lehtinen et al. (2007) do not suffer from this problem.
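The effect of excluding particles smaller than d m can be illustrated with a short Python sketch. It assumes that the coagulation sink of particles at d m is the sum of β (d m ,d i) N i over the size bins and that the truncated version simply skips bins with d i smaller than d m. The function name, the four-bin distribution, and the constant β are hypothetical and chosen only to show the mechanics; they do not reproduce the ∼ 31 % value reported for Beijing.

```python
def coag_sink(d_m, dp, n, beta, include_smaller=True):
    """Coagulation sink (s^-1) of particles with diameter d_m.

    dp              : representative bin diameters (nm)
    n               : bin number concentrations (cm^-3)
    beta            : function beta(d1, d2) -> coagulation coefficient (cm^3 s^-1)
    include_smaller : if False, bins with d_i < d_m are skipped
    """
    total = 0.0
    for d_i, n_i in zip(dp, n):
        if include_smaller or d_i >= d_m:
            total += beta(d_m, d_i) * n_i
    return total


# Hypothetical distribution dominated by freshly nucleated particles
dp_bins = [2.0, 4.0, 8.0, 50.0]
n_bins = [1.0e5, 5.0e4, 1.0e4, 1.0e3]


def constant_beta(d1, d2):
    """Placeholder kernel; a real kernel varies strongly with both diameters."""
    return 1.0e-9


full_sink = coag_sink(8.0, dp_bins, n_bins, constant_beta)
truncated_sink = coag_sink(8.0, dp_bins, n_bins, constant_beta, include_smaller=False)
ratio = truncated_sink / full_sink
print(f"truncated / full = {ratio:.3f}")
```

The truncated sink can never exceed the full sink, and the gap widens as more of the number concentration sits below d m, which is the situation during strong NPF events.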
Third, the second term on the RHS of Eq. (B6) is not always a reasonable approximation of CoagSnk in Eqs. (A13) and (A14). Theoretically, the relationship between CoagSnk and CoagS is shown as Eq. (B8), while CoagS m is chosen as the representative value when estimating J using Eq. (B6).
However, CoagS is not a relatively constant value versus particle diameter, and CoagS m is not the mean value of CoagS in the calculated size range from d k to d u. As illustrated in Fig. B1, the coagulation coefficient with 8 nm particles decreases rapidly with the increase in d i when d i is smaller than 8 nm. The minimum value of β (d i ,8 nm) appears at d i around 8 nm because particles with similar thermal velocities are less likely to collide with each other. The CoagS calculated using Eq. (B7) during a strong NPF event on 27 March 2016 decreases monotonically with the increase in d m, while the CoagS calculated using Eq. (B8) has a minimum at 6.7 nm because CoagS is mainly attributed to nucleation mode particles during NPF events. In this example, the 8 nm values of CoagS from Eqs. (B8) and (B7) are ∼ 22.6 and ∼ 7.2 % of CoagS 1.5 nm, respectively, indicating a non-negligible underestimation of the coagulation sink term and nucleation rate when using a constant CoagS m instead of a varying value (as a function of particle diameter).

Fourth, particle formation by coagulation is neglected in Eq. (B6). The absence of CoagSrc will lead to an overestimation of the nucleation rate. However, it sometimes coincidentally cancels out the underestimation caused by using CoagS m to approximate CoagSnk, as discussed in the main text.
Fifth, the growth term in Eq. (B6) is estimated over the whole size range from d k to d u, while in Eq. (A13) it is mathematically restricted to the upper bound, d u; n u is usually smaller than the mean value in the size range from d k to d u during an NPF event, and recent work has revealed that the observed GR is size dependent (Kuang et al., 2012; Kulmala et al., 2013; Xiao et al., 2015). For example, as shown in Fig. B2, GR varies with time in the NPF event on 3 April 2016 and was linearly fitted in different diameter ranges. The mean GR of particles in the size range from 2 to 25 nm is ∼ 7.47 nm h −1, while GR 25 is ∼ 10.86 nm h −1. At 11:30 on 3 April, n 25 (dN / dlogd p at 25 nm) is 164 cm −3, while the mean n of particles in the size range from 2 to 25 nm is 4755 cm −3. The calculated condensational growth term in Eq. (B6) is ∼ 20 times that in Eq. (A13).
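The factor of ∼ 20 can be checked directly from the numbers quoted above, assuming that each growth term is proportional to the product of the growth rate and the size distribution value it uses, with the same proportionality constant in both cases. The variable names below are illustrative.

```python
# Values quoted for the NPF event on 3 April 2016
mean_gr = 7.47   # mean GR from 2 to 25 nm, nm h^-1
mean_n = 4755.0  # mean dN/dlogdp from 2 to 25 nm, cm^-3
gr_25 = 10.86    # GR at 25 nm, nm h^-1
n_25 = 164.0     # dN/dlogdp at 25 nm, cm^-3

# Ratio of the growth term evaluated over the whole range to that at d_u
growth_term_ratio = (mean_gr * mean_n) / (gr_25 * n_25)
print(f"ratio = {growth_term_ratio:.1f}")
```

The higher growth rate at 25 nm is far outweighed by the much lower particle concentration there, so evaluating the term over the whole size range inflates it by more than an order of magnitude.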
In a relatively clean environment with weak NPF events, Eq. (B6) may work well since the calculated J k is dominated by dN / dt. However, when the number concentration of aerosol formed by nucleation and (or) background aerosol is high, i.e., when CoagSnk is the major component of J k, Eq. (B6) underestimates the formation rate (and nucleation rate) due to underestimation of the coagulation scavenging effect.

Figure 1. Schematic of the general dynamic equation.

Figure 2. Comparison of formation rates estimated using different upper bounds, d u. (a) A typical new particle formation event. Dashed gray lines represent different d u values in Eq. (1). Solid black lines correspond to d b, i.e., the varying upper bound determined by dN/dlogd p. (b) Estimated formation rates with different upper bounds, d u, using Eq. (1). (c) Estimated formation rates with different d k values using Eq. (1); d u equals 25 nm and d min equals 1.3 nm in the four scatter plots.

Figure 3. Comparison of formation rates estimated by different formulae.

Figure 4. (a) CoagSnk as a function of d p, where d p is the minimum diameter accounted for when calculating CoagS g for particles at all different d g values, and scavenging due to coagulation with particles smaller than d p is neglected, as defined by the formula in panel (a). The dashed line corresponding to CoagSnk on a non-NPF day also decreases monotonically with the increase in d min, but with a negligible slope. (b) Time evolution of CoagSnk on an NPF day (13 March) and a non-NPF day (12 March); d p is defined the same as in panel (a). N is the number concentration of particles in the size range from 1.5 to 25 nm, while CoagS 8 nm is calculated using Eq. (3).

Figure 5. Contribution of each term to the estimated formation rate; dN / dt is obtained by fitting and shown as an absolute value, with solid and dashed lines corresponding to positive and negative parts, respectively. Note that the upper bound, d u, equals d b as defined in Sect. 4.1 for better accuracy; however, it does not affect the generality of the result.

Figure 6. (a) Comparison of Fuchs surface area and condensation sink in Beijing (when NPF events occurred) with those in other locations. NPF days were classified by condensation sink in urban Beijing in 2004 (Wu et al., 2007). Condensation sink on NPF days in New Delhi was reported by Kulmala et al. (2005). ANARChE (McMurry et al., 2005) and MILAGRO (Iida et al., 2008) were conducted in Atlanta and Tecamac, respectively, while EUCCARI (Manninen et al., 2009), QUEST II (Sihto et al., 2006), and QUEST IV (Riipinen et al., 2007) were conducted at SMEAR II (Dal Maso et al., 2005) in Hyytiälä. A Fuchs data from MILAGRO, ANARChE, Boulder, EUCCARI, QUEST II, and QUEST IV were published in Kuang et al. (2010). The ends of the colored rectangles correspond to quartiles, while the error bars represent the 10th and 90th percentiles. (b) Comparison of peak number concentration of particles larger than 3 nm during NPF events in this study with those in Atlanta and other published data. Note that the published values (light orange points) in previous studies are not necessarily the mean values of the entire campaign periods.

Figure 7. Estimated J 1.5 and J 3 using different equations. Previously reported J 3 values in China are included for comparison. The ends of the colored rectangles correspond to the minimum and maximum values, respectively. J * 3: the upper size bound to estimate formation rate, d u, is 6 nm (rather than 25 nm) in Wang et al. (2015) and Xiao et al. (2015).

Figure A1. Schematic for two different summation sequences to estimate the coagulation source term. Equations in panels (a) and (b) correspond to the continuous forms of the far LHS and the far RHS formulae in Eq. (A7), respectively. The coagulation source term is denoted by half the area of the triangle (since the particles at the same diameter are accounted for twice). The colored areas are the estimated area using the two equations. The summation terms corresponding to the same particle volume, v g, are shown in the same color. The coagulation source term is underestimated in panel (a) because v g increases nonlinearly in this case, whereas the estimated coagulation source term is independent of the bin structures for d g and d i in panel (b).
The size bin from d u−1 to d u is denoted by the subscript u − 1, so the upper bound of the size range for calculation is d u. The discrete upper sizes, u − 1 in Eq. (A5) and u − 3 in Eq. (A7), are approximated by d u in Eq. (A8). N [d k ,d u ) is defined as the number concentration in the size range from d k to d u (particles with diameters of d u are not accounted for), corresponding to the sum of N g over g = k to u − 1 in the discrete form.

Figure B1. Coagulation coefficient and calculated coagulation sink during a typical NPF event. The two CoagS curves are calculated using Eqs. (B7) and (B8), respectively, and d m in this figure is treated as a variable rather than a constant value. The upper and lower stars denote CoagS at 8 nm calculated using Eqs. (B8) and (B7), which are used in the second term on the RHS of Eqs. (5) and (6), respectively, to approximate CoagSnk.

Figure B2. Size- and time-dependent growth rate on an NPF day observed in Beijing. Representative diameters are obtained by lognormal fitting of nucleation mode particles in each time bin, and GR is linearly fitted in each section.