Towards understanding the mechanisms of new particle formation

Institute for Atmospheric and Earth System Research (INAR) / Physics, Faculty of Science, University of Helsinki, P.O. Box 64, Helsinki, 00014, Finland
Climate & Atmosphere Research Centre (CARE-C), The Cyprus Institute, P.O. Box 27456, Nicosia, CY1

Availability of hourly data (%) from the three particle measuring instruments.

PSM core sampling inlet
The PSM inlet design was first introduced by Kangasluoma et al. (2016). It is a simple design consisting of a 6-mm tube fitted inside a 10-mm tube using a Swagelok T-piece (Figure S1). Under normal operating conditions, the third outlet of the T-piece is connected to a vacuum, which draws a higher flow through the 10-mm tube than the PSM flow. The PSM therefore samples from the middle of this flow, which minimizes losses caused by diffusion to the inlet walls (Figure S1a). During background measurements, the third outlet is connected to particle-free pressurized air at a flow rate high enough that the PSM samples only this particle-free air (Figure S1b).

Figure S1. A schematic of the PSM core sampling inlet during normal operation (a) and during background measurements (b).

PSM diluter
We used a prototype diluter that was designed at the University of Helsinki and later commercialized by Airmodus under the name "Airmodus nanoparticle diluter" (AND). The diluter is cylindrical and consists of three modules. The first module, on the air-sampling side, is a switchable ion filter that removes ions and charged particles up to a certain size, allowing the measurement of neutral particles only. In this study the ion filter was turned off. The second module is a core sampling piece radially connected to a vacuum source, which draws 5 lpm of excess flow from the sampled air. The third module is the dilution module, where clean dry air is introduced radially into the sampled air flow. The differential pressure across the dilution unit is continuously monitored and kept constant by a feedback loop to a PID-controlled proportional valve, which sets the dilution flow required to keep the dilution ratio constant. The diluter was made as compact as possible to reduce losses and optimize penetration efficiency. Additionally, the dilution flow was monitored with a TSI flow meter and was used, along with the pressure measurements, to determine and correct for the real-time dilution factor.
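As a rough illustration of the dilution correction described above, the following sketch computes a dilution factor from two flows and applies it to a measured concentration. It assumes simple volumetric mixing of sample and dilution air. The function names and flow values are hypothetical, and the pressure-based part of the actual correction is not reproduced here.

```python
def dilution_factor(sample_flow_lpm: float, dilution_flow_lpm: float) -> float:
    """Ratio of total flow after mixing to the undiluted sample flow.

    Assumes ideal volumetric mixing (an assumption of this sketch).
    """
    if sample_flow_lpm <= 0:
        raise ValueError("sample flow must be positive")
    if dilution_flow_lpm < 0:
        raise ValueError("dilution flow cannot be negative")
    return (sample_flow_lpm + dilution_flow_lpm) / sample_flow_lpm


def correct_concentration(measured_cm3: float,
                          sample_flow_lpm: float,
                          dilution_flow_lpm: float) -> float:
    """Scale a diluted concentration back to the undiluted value."""
    return measured_cm3 * dilution_factor(sample_flow_lpm, dilution_flow_lpm)
```

With a real-time dilution flow reading, the factor would be recomputed for every sample rather than fixed once.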

nCNC (PSM+CPC) inversion
In principle, the PSM is a mixing-type condensation particle counter without the measuring optics. It uses diethylene glycol (DEG) to grow nano-sized particles (~1-3 nm) up to around 90 nm. These particles then enter the CPC, where they are grown further with butanol to sizes measurable by the CPC optical detector. In the first stage, the mixing ratio of DEG vapour to sample flow is scanned by continuously incrementing and then decrementing the saturator flow between 0.1 and 1.3 liters per minute (lpm) while keeping the sample flow constant. Varying the mixing ratio changes the particle cut-off size (i.e., at a higher mixing ratio, smaller particles are activated and grown, so a lower cut-off is achieved). The nCNC therefore measures the total particle concentration above a certain diameter, and inversion algorithms are required to retrieve the size distribution below 3 nm. The two most popular methods to invert PSM data are the kernel function method and the step inversion method. The expectation-maximization (EM) method has recently been recommended over the kernel method because it is less sensitive to random errors (Cai et al., 2018; Chan et al., 2020). Here, we compare the kernel method and the EM method using PSM data from the whole measurement period. Data pretreatment before inversion was the same for the two methods and included:
1) A diagnostic check that identifies and removes erroneous data based on instrument diagnostics and flags.
2) Background subtraction: the instrumental background of the PSM was continuously monitored with daily automated random background (zero) checks. The background was subtracted from the measured data, except in cases where the background was very high (> 10% of the measured concentrations); the corresponding data were then deemed unusable until the background decreased to normal levels.
3) Correction for the time delay between the PSM and the CPC, which is typically ~5 seconds.
4) Noise filtering by applying a 6th-order median filter to the one-second resolution data.
5) Quality check using the method suggested by Chan et al. (2020).
6) Reduction of the inversion matrix using a saturator flow inversion window of 0.08 lpm, which reduced the number of saturator flows (corresponding to cut-off diameters) from ~120 to 16 per one-directional scan.
7) No pre-averaging: although pre-averaging before the inversion step is recommended for noisy data, we did not pre-average in order to capture the fast variations in the data.
8) Differentiation of the reduced cut-off matrix to retrieve the concentration in each size bin, which is the input for the kernel inversion method. This step is not necessary for the EM method, which takes the cut-off matrix as input (the varying total particle concentration at each saturator flow rate). Further explanation of the theoretical approach of each inversion method can be found in Cai et al. (2018).

During the inversion step, four kernels corresponding to four size channels (dp) with diameters of 1.1 nm, 1.3 nm, 1.5 nm, and 2.4 nm were used with the kernel inversion method, whereas 50 kernels between 1.1 nm and 2.4 nm were used for the EM inversion method. The kernels are Gaussian-shaped and represent the derivative of the laboratory-derived detection efficiency curves with respect to the saturator flow rate. The median (µ) of the kernel function at each dp is equal to the saturator flow at which the detection efficiency reaches half of its maximum at this diameter, whereas the width, i.e. the standard deviation (σ), is equal to p1/(dp+q1), where p1 and q1 are fitting parameters derived from the calibration curve. An example of PSM calibration curve data is shown in Figure 1 of Cai et al. (2018).
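Some of the pretreatment steps listed above (time-delay correction, median filtering, reduction to a 0.08 lpm grid, and differentiation) can be sketched as follows. This is a minimal illustration on 1 Hz data, not the processing code used in the study. All function names are hypothetical, the median-filter window length is a free parameter because the mapping from "6th order" to a window size is not specified here, and the reduction averages readings around 16 assumed flow nodes spaced 0.08 lpm apart between 0.1 and 1.3 lpm.

```python
from statistics import mean, median


def align(sat_flow, conc, delay_s=5):
    """Pair each saturator flow with the CPC reading taken delay_s samples later."""
    if delay_s <= 0:
        return list(sat_flow), list(conc)
    return list(sat_flow[:-delay_s]), list(conc[delay_s:])


def median_filter(values, window=7):
    """Running median; the window shrinks at both ends of the series."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def reduce_scan(sat_flow, conc, lo=0.1, hi=1.3, step=0.08):
    """Average concentrations around the flow nodes lo, lo + step, ..., hi."""
    n_nodes = round((hi - lo) / step) + 1  # 16 nodes for the default values
    groups = [[] for _ in range(n_nodes)]
    for q, c in zip(sat_flow, conc):
        # Assign each reading to its nearest node, clamped to the valid range.
        k = min(max(round((q - lo) / step), 0), n_nodes - 1)
        groups[k].append(c)
    nodes = [lo + k * step for k in range(n_nodes)]
    return nodes, [mean(g) if g else float("nan") for g in groups]


def differentiate(reduced_conc):
    """Concentration gained between consecutive flow nodes (one value per size bin)."""
    return [b - a for a, b in zip(reduced_conc, reduced_conc[1:])]
```

A one-directional scan would pass through `align`, `median_filter` and `reduce_scan` in that order; `differentiate` is only needed for the kernel method, since the EM method takes the reduced cut-off data directly.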
Note that the actual input to the EM method is the detection efficiency curves rather than the kernels.

After the inversion step, the inverted data were transformed from dN/ddp to dN/dlogdp and averaged over longer times: five minutes and one hour. The inversion methods were compared using the total dN/dlogdp concentration obtained from each method. The two methods were reasonably comparable at one-hour resolution (Figure S2), although there is some scatter at low total concentrations, and the 5-min averaged data sometimes revealed considerable deviations. Because we mainly use 1-hour resolution data for the presented analysis, we chose the data from the kernel inversion method, which gave better uniformity for the particle size distribution below 3 nm.

Figure S2. Comparison of total dN/dlogdp concentrations (cm⁻³) between 1.1 and 2.4 nm computed from PSM data using the kernel inversion method and the EM method. Each data point represents one hour of data. Blue points represent data with global radiation lower than 50 W m⁻² (night-time data). Green points represent data with global radiation higher than 50 W m⁻² (day-time data). The red line represents the 1:1 line.
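The transformation from dN/ddp to dN/dlogdp mentioned in this section follows from the chain rule, dN/dlog10(dp) = ln(10) · dp · dN/ddp. The sketch below applies it and sums a binned dN/dlogdp distribution into a total concentration. The function names and the bin-edge representation are assumptions made for illustration.

```python
import math


def dn_dlogdp_from_dn_ddp(dp_nm: float, dn_ddp: float) -> float:
    """Convert dN/ddp (cm^-3 nm^-1) at diameter dp_nm to dN/dlog10(dp) (cm^-3)."""
    return math.log(10.0) * dp_nm * dn_ddp


def total_concentration(dp_edges_nm, dn_dlogdp) -> float:
    """Sum dN/dlogdp times the log10 width of each bin, given the bin edges."""
    if len(dp_edges_nm) != len(dn_dlogdp) + 1:
        raise ValueError("need one more edge than bins")
    total = 0.0
    for lo, hi, value in zip(dp_edges_nm, dp_edges_nm[1:], dn_dlogdp):
        total += value * (math.log10(hi) - math.log10(lo))
    return total
```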

SMPS hygroscopicity corrections
The "ambient" SMPS particle size distribution was back-calculated from the dry distribution using the hygroscopicity model of Petters and Kreidenweis (2007). This model relies on Köhler theory, which describes the equilibrium between the droplet phase and the vapor phase. The traditional Köhler equation (Eq. S1) links the equilibrium size of the growing aerosol particle, its chemical composition and its water content to the ambient water vapor saturation ratio (S) (Köhler, 1936).

Petters and Kreidenweis (2007) introduced a single hygroscopicity parameter (κ), which describes the water activity (aw) and the difference in the densities and molar masses of water and the dry material.

Assuming additive volumes, the Köhler equation can be reformulated as the κ-Köhler equation, which can also be written in terms of the hygroscopic growth factor (HGF), defined as the ratio between the wet particle diameter and the dry particle diameter.
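For reference, the standard forms of these relations as given by Petters and Kreidenweis (2007) are reproduced below. The notation is an assumption and may differ from that of the equations referred to in the text: $D$ is the wet diameter, $D_d$ the dry diameter, $\sigma_{s/a}$ the surface tension of the solution/air interface, $M_w$ and $\rho_w$ the molar mass and density of water, $V_s$ and $V_w$ the volumes of dry material and water, $R$ the gas constant and $T$ the temperature.

```latex
% Traditional Koehler equation
S = a_w \exp\!\left( \frac{4\,\sigma_{s/a}\,M_w}{R\,T\,\rho_w\,D} \right)

% Definition of the hygroscopicity parameter kappa
\frac{1}{a_w} = 1 + \kappa\,\frac{V_s}{V_w}

% kappa-Koehler equation (additive volumes, V_s \propto D_d^3, V_w \propto D^3 - D_d^3)
S(D) = \frac{D^3 - D_d^3}{D^3 - D_d^3\,(1-\kappa)}
       \exp\!\left( \frac{4\,\sigma_{s/a}\,M_w}{R\,T\,\rho_w\,D} \right)

% Growth-factor form when the Kelvin (exponential) term is neglected, so that S \approx a_w
\mathrm{HGF} = \frac{D}{D_d} = \left( 1 + \kappa\,\frac{a_w}{1 - a_w} \right)^{1/3}
```

The last line follows from the definition of κ: $V_w/V_s = \kappa\,a_w/(1-a_w)$, and $\mathrm{HGF}^3 = (V_s + V_w)/V_s$.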

In this study, average seasonal values of κ were retrieved from hygroscopic tandem differential mobility analyzer (HTDMA) measurements performed in parallel to our study (Table S2). The κ values for each SMPS size bin were extrapolated from the size-resolved HTDMA measurements by linear regression. The particle size distribution at ambient RH conditions was then calculated using equation S3, by incorporating the respective κ value per size bin and the measured size distribution at dry conditions. To do so, the ambient (real) particle diameter was calculated from the dry diameter by solving equation S3 and was then used to calculate the real particle size distribution (before drying).

As an example of the effect of the humidity-corrected particle size distribution on NPF-related parameters, we compared the dry condensation sink to that calculated when the particle sizes were assumed to be equilibrated to the ambient RH. This comparison shows that the actual condensation sink is sometimes up to 3.5 times higher than the dry condensation sink, but on average it is between 1.1 and 1.3 times higher (Figure S4).

Figure S4. The top panel shows the effect of the particle hygroscopic growth factor (GF) on condensation sink (CS) calculations, presented as the ratio between the condensation sink calculated from the "ambient" distribution and the condensation sink calculated from the "dry" distribution. The bottom and top edges of each box represent the 25th and 75th percentiles. The whiskers extend to the most extreme data points not considered outliers, and the outliers are plotted individually using the '+' symbol. The bottom panel shows the median RH (%) with 25th and 75th percentiles.
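The back-calculation of the ambient diameter from the dry diameter described in this section can be sketched by solving the κ-Köhler equation of Petters and Kreidenweis (2007) numerically. This is an illustration under stated assumptions, not the code used in the study: the surface tension of pure water, a temperature of 298.15 K, sub-saturated conditions (RH < 100%) and all function names are assumptions.

```python
import math

# Assumed constants (not stated in the text): pure-water surface tension, 298.15 K.
SIGMA = 0.072    # surface tension, J m^-2
M_W = 0.018015   # molar mass of water, kg mol^-1
RHO_W = 997.0    # density of water, kg m^-3
R_GAS = 8.314    # gas constant, J mol^-1 K^-1
T_REF = 298.15   # temperature, K


def kelvin_length_nm(temp_k=T_REF):
    """Kelvin term coefficient 4*sigma*M_w / (R*T*rho_w), in nm."""
    return 4.0 * SIGMA * M_W / (R_GAS * temp_k * RHO_W) * 1e9


def hgf_no_kelvin(kappa, rh_percent):
    """Growth factor when the Kelvin term is neglected (a_w = RH / 100)."""
    a_w = rh_percent / 100.0
    return (1.0 + kappa * a_w / (1.0 - a_w)) ** (1.0 / 3.0)


def saturation_ratio(d_wet_nm, d_dry_nm, kappa, temp_k=T_REF):
    """kappa-Koehler saturation ratio for a given wet and dry diameter."""
    d3, dd3 = d_wet_nm ** 3, d_dry_nm ** 3
    a_w = (d3 - dd3) / (d3 - dd3 * (1.0 - kappa))
    return a_w * math.exp(kelvin_length_nm(temp_k) / d_wet_nm)


def ambient_diameter_nm(d_dry_nm, kappa, rh_percent, temp_k=T_REF):
    """Wet diameter in equilibrium with rh_percent, found by bisection."""
    if not 0.0 <= rh_percent < 100.0:
        raise ValueError("this sketch only handles 0 <= RH < 100 %")
    if kappa <= 0.0 or rh_percent == 0.0:
        return d_dry_nm  # non-hygroscopic particle or dry air: no growth
    target = rh_percent / 100.0
    lo, hi = d_dry_nm, 100.0 * d_dry_nm
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if saturation_ratio(mid, d_dry_nm, kappa, temp_k) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Applying `ambient_diameter_nm` to each SMPS bin diameter, with the κ value assigned to that bin, shifts the dry size distribution to ambient sizes. The Kelvin term makes the growth slightly smaller than `hgf_no_kelvin` predicts, most noticeably for the smallest particles.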

Identification of days with high dust loading
The method proposed by Drinovec et al. (2020) permits the calculation of mineral dust concentrations with high time resolution using the following equation [...] Scientific, USA) coupled to a virtual impactor (VI), b_abs,PM1 is the absorption coefficient (at 370 nm) measured by a second AE33 Aethalometer sampling through a PM1 sharp-cut cyclone, EF is the enhancement factor of the VI, and MAC is the mass absorption cross section for dust. The last two coefficients were used as determined experimentally by Drinovec et al. (2020), where additional information about the method and the instruments used can be found.

From the daily mineral dust time series, we defined a daily threshold above which a day is considered to have high dust loading (Table S3). When aethalometer measurements were not available, the coarse particle mass loading (PM10 - PM2.5), determined by a Tapered Element Oscillating Microbalance (TEOM), was used to identify dust days. Additional information about the TEOM used can be found in Pikridas et al. (2018). The threshold for coarse PM was defined based on the linear regression between coarse PM and mineral dust concentration.

Table S3. List of dates with high dust loading

Figure S8. Comparison of growth rates measured in this study to growth rates measured at 12 European sites (Manninen et al., 2010).

Figure S9. The median (a) and mean (b) diurnal cycles of the size-segregated condensation sink (s⁻¹) computed over the whole measurement period of this study.

Figure S10. The monthly diurnal cycle of the condensation sink (s⁻¹) during event (blue) and non-event (green) days. The shaded areas represent the 25th to 75th percentiles, while the solid line represents the median.
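The dust calculation and daily thresholding described under "Identification of days with high dust loading" can be sketched as follows. The equation itself is not reproduced in this text, so the form used here (the difference between the VI and PM1 absorption coefficients, divided by EF and MAC) is an assumption inferred from the variable definitions. The function names are hypothetical, and the numerical values in the usage note are placeholders, not the coefficients of Drinovec et al. (2020) or the thresholds of this study.

```python
def mineral_dust_ug_m3(b_abs_vi: float, b_abs_pm1: float,
                       ef: float, mac: float) -> float:
    """Assumed form: (b_abs_VI - b_abs_PM1) / (EF * MAC).

    With absorption coefficients in Mm^-1 and MAC in m^2 g^-1,
    the result is in micrograms per cubic metre.
    """
    if ef <= 0 or mac <= 0:
        raise ValueError("EF and MAC must be positive")
    return (b_abs_vi - b_abs_pm1) / (ef * mac)


def high_dust_days(daily_dust: dict, threshold: float) -> list:
    """Return the dates whose daily dust concentration exceeds the threshold."""
    return sorted(day for day, value in daily_dust.items() if value > threshold)
```

For example, with placeholder values, `mineral_dust_ug_m3(30.0, 6.0, 10.0, 0.24)` evaluates (30 - 6) / (10 × 0.24), and `high_dust_days` then flags every date whose daily value lies above the chosen threshold.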

The relation between some parameters and NPF events
Figure S11. Monthly wind roses during event and non-event days.

[...] and 75th percentiles, respectively. The central mark indicates the median. The whiskers extend to the most extreme data points not considered outliers, and the outliers are plotted individually using the '+' symbol. Data presented have daily time resolution.