Physical and chemical properties of urban aerosols in São Paulo, Brazil: Links between composition and size distribution of submicron particles.

In this work, the relationships between size and composition of submicron particles (PM1) were analyzed at an urban site in the Metropolitan Area of São Paulo (MASP), a megacity with about 21 million inhabitants. The measurements were carried out from 20 December 2016 to 15 March 2017. The chemical composition was measured with an Aerodyne Aerosol Chemical Speciation Monitor (ACSM) and the size distribution with a TSI Scanning Mobility Particle Sizer 3082. The average PM1 mass concentration in the MASP was 11.4 µg m−3. Organic aerosol (OA) dominated the PM1 composition (56%), followed by sulfate (15%) and equivalent black carbon (eBC, 13%). Four OA classes were identified using Positive Matrix Factorization: oxygenated organic aerosol (OOA, 40% of OA), biomass burning organic aerosol (BBOA, 13%), and two hydrocarbon-like OA components (a typical HOA related to vehicular emissions (16%), and a second HOA (21%) representing a mix of anthropogenic sources). Particle number concentrations averaged 12100 ± 6900 cm−3, dominated by the Aitken mode. The accumulation mode increased under relatively high PM1 concentrations, suggesting enhanced secondary organic aerosol (SOA) production. Conversely, the contribution of nucleation mode particles was less dependent on PM1 levels, consistent with vehicular emissions. The relationship between aerosol size modes and PM1 composition was assessed by multilinear regression (MLR) models. Mass loading in the nucleation mode was associated mostly with eBC, HOA and OOA, suggesting the contribution of primary and secondary particles from traffic sources. Secondary inorganic aerosols were partitioned between Aitken and accumulation modes, related to condensational particle growth processes. Submicron mass loading in the accumulation mode was mostly associated with highly oxidized OOA and also with traffic-related emissions.
To the authors' knowledge, this is the first work that uses the MLR methodology to estimate the chemical composition of the different aerosol size modes. The results emphasize the relevance of vehicular emissions to air quality in the MASP and highlight the key role of secondary processes in the ambient PM1 concentrations in the region, since 56% of the PM1 mass loading was attributed to SOA and secondary inorganic aerosol.


Identification of OA components with Positive Matrix Factorization (PMF)
Positive Matrix Factorization (PMF) is a statistical model that uses weighted least-squares fitting for factor analysis (Paatero and Tapper, 1994; Paatero, 1997). It uses a bilinear factor analytic model defined in matrix notation as:

$$X = GF + E$$

where X denotes the matrix of the measured values, G and F are matrices computed by the model that represent the scores and loadings, respectively, and E is the residual matrix, made up of the elements $e_{ij}$. For the ACSM and AMS (Aerosol Mass Spectrometer) data, the measured organic mass spectra are apportioned in terms of source/process-related components. In this case, the columns j in X are the m/z's and each row i represents a single mass spectrum, G represents the time series and F the profile mass spectrum for the p factors computed by the algorithm. The model adjusts G and F using a least-squares algorithm that iteratively minimizes the quantity Q, defined as the sum of the squared residuals weighted by their respective uncertainties:

$$Q = \sum_{i}\sum_{j}\left(\frac{e_{ij}}{\sigma_{ij}}\right)^{2}$$

where $\sigma_{ij}$ is the uncertainty for each element in the matrix X. An IGOR™-based Source Finder (SoFi; Canonaco et al., 2013) with the multilinear engine algorithm (ME-2; Paatero, 1999) was used to prepare the data and error estimates, execute the analysis and evaluate the results.
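As an illustration of the quantity Q minimized by PMF, the following minimal Python sketch evaluates the uncertainty-weighted sum of squared residuals for a given pair of factor matrices. It is not the ME-2 solver used in this study. The function name `pmf_q`, the toy matrices and the uniform uncertainty of 0.1 are hypothetical choices made only for this example.

```python
import numpy as np

def pmf_q(X, G, F, sigma):
    """Return Q = sum_ij (e_ij / sigma_ij)^2 for the bilinear model X = G F + E."""
    E = X - G @ F                      # residual matrix, one row per mass spectrum
    return float(np.sum((E / sigma) ** 2))

# Hypothetical toy data: 4 spectra (rows), 3 m/z channels (columns), p = 2 factors.
G = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 2.0],
              [1.5, 1.0]])             # factor time series (scores)
F = np.array([[0.6, 0.3, 0.1],
              [0.1, 0.2, 0.7]])        # factor profiles (loadings)
sigma = np.full((4, 3), 0.1)           # assumed uniform measurement uncertainty

X_exact = G @ F                        # data reproduced exactly by the factors
q_exact = pmf_q(X_exact, G, F, sigma)

X_offset = X_exact + 0.1               # every element is off by one sigma
q_offset = pmf_q(X_offset, G, F, sigma)
```

When every residual equals its uncertainty, each of the 12 matrix elements contributes one unit to Q, which is why Q values close to the number of data points are commonly read as residuals comparable to the measurement uncertainties.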

Multilinear regression (MLR) model
An analysis of the relationship between chemical composition and size distribution was performed using a multilinear regression (MLR) model. Previous studies have applied the MLR model to estimate aerosol mass scattering and extinction efficiencies, and for source apportionment of optical properties (Ealo et al., 2018). A linear regression model describes the relationship between a dependent variable, y, and one or more independent variables, x. The dependent variable is also called the response variable, and the independent variables are also called explanatory or predictor variables. The MLR model is:

$$y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i$$

where $y_i$ is the i-th observation of the response variable, $\beta_j$ is the j-th coefficient, $\beta_0$ is the constant term in the model, $x_{ij}$ is the i-th observation of the j-th predictor variable, j = 1, ..., p, and $\varepsilon_i$ is the i-th error term.
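The MLR model above can be fitted by ordinary least squares. The sketch below is a minimal illustration with synthetic data, not the code used in this work. The helper `fit_mlr`, the three synthetic predictors and the "true" coefficients are assumptions introduced only for the example.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares for y_i = b0 + sum_j b_j x_ij + eps_i.

    Returns (b0, b), where b holds one coefficient per column of X.
    """
    A = np.column_stack([np.ones(len(y)), X])   # prepend the constant term
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 5.0, size=(200, 3))        # three synthetic predictors
true_b0, true_b = 0.5, np.array([1.2, 0.4, 2.0])
y = true_b0 + X @ true_b + rng.normal(0.0, 0.05, size=200)

b0, b = fit_mlr(X, y)   # estimates should lie close to true_b0 and true_b
```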
For the MLR model, the time series of PM1 chemical constituents (i.e., eBC, inorganic species and OA PMF-derived chemical classes) were used as dependent variables, and the volume concentrations of the particle size modes (i.e., nucleation, Aitken and accumulation) were taken as predictors. Volume size distribution was used in all MLR calculations since it represents better the accumulation

Near real-time submicron mass concentration (PM1) can be obtained from the non-refractory PM1 (NR-PM1) and eBC measurements (Table 2). The average PM1 ± standard deviation during the campaign was 11.4 ± 7.8 µg m−3. On average, organic aerosols dominated the composition, contributing 55%, followed by sulfate (15%) and eBC (14%). Ammonium (9%), nitrate (6%) and chloride (1%) made smaller contributions to the mass loading. The time series of the PM1 chemical species and their relative contributions to the total mass concentration in the submicron size range are shown in Figure 3.

Components of OA were identified using PMF analysis following the procedure described by Ulbrich et al. (2009) Table 4), and secondary inorganic species, such as sulfate, nitrate and ammonium. Moreover, the OOA mass concentration increases significantly in the afternoon, similarly to the ozone diel cycle (Figure 5), indicating that its formation is partially driven by photochemistry. Considering the sum of secondary inorganic aerosols (sulfate, nitrate and ammonium) and SOA (OOA) as a lower limit for the contribution of secondary aerosols to total PM1, at least 56% of the submicron particle mass loading is estimated to result from secondary production.

The BBOA component has a mass spectrum dominated by the m/z's 29, 60 and 73 (Figure 4). The signal at m/z 60 is associ- et al., 2007) and correlates with levoglucosan and similar anhydrosugar species (mannosan, galactosan) that result from the pyrolysis of cellulose. The BBOA mass spectrum presents a strong correlation with the standard AMS database BBOA spectrum (R=0.90, Table 3). The diurnal variability of BBOA (Figure 5), with an average concentration almost three times higher during nighttime, seems to be modulated by atmospheric dynamics, such as the evolution of the boundary layer height. The boundary layer height decreases during nighttime, trapping freshly emitted smoke particles. The time series of BBOA correlates moderately with eBC (R=0.47), nitrate (R=0.48) and chloride (R=0.56) (Table 4). The average contribution of BBOA is 13% of total OA and almost 7% of PM1, significantly lower than reported by Pereira et al. (2017), who found considerable biomass burning contributions (approximately 18% of PM2.5) associated with long-range transport from regional sugarcane burning in São Paulo during wintertime, in addition to local emission sources.

Both HOA components present mass spectra characterized by hydrocarbon-like structures typical of alkanes, alkenes and cycloalkanes (m/z's 27, 29, 41, 43, 55, 57, 67, 69, 71, 81, 83, 85) related to anthropogenic primary emissions (Canagaratna et al., 2004; Zhang et al., 2005). Both mass spectra correlate with the AMS standard HOA mass spectrum (R=0.90 and R=0.89 for HOA I and HOA II, respectively). Although both factors are HOA related, it is not reasonable to interpret them as a split of the same source. The HOA I factor presents an elevated signal at m/z 55, which has been related to cooking OA (COA), an important source of primary organic aerosol (POA) in urban environments (Mohr et al., 2012). The HOA II factor presents a higher signal at m/z 57 than at m/z 55 and a higher correlation with eBC (R=0.69 for HOA II versus R=0.45 for HOA I); eBC is related to vehicular emissions in the MASP, mostly from heavy-duty vehicles (de Miranda et al., 2018). The results indicate that HOA II is more consistent with traffic, while HOA I appears to be a mixture of anthropogenic sources. Together, the HOA factors present an average contribution of 37% to OA (21% from HOA I and 16% from HOA II). For both HOA factors, the diurnal profiles of mass concentrations (Figure 5) increase during the morning traffic rush hour, 6h-8h (local time). However, HOA I also shows a peak between 12h and 14h (local time), probably associated with local cooking activities.
Although the sampling site is located in an industrialized region, a distinct industrial-related OA factor could not be identified in this study. As a matter of comparison, Bozzetti et al. (2017)

tion of nucleation, Aitken and accumulation modes explains the measured total particle number concentration (slope=0.98 and R²=0.99). The Aitken mode dominated the PNSD with an average concentration ± standard deviation of 6900 ± 4600 cm−3 (56% of total number concentration), followed by the nucleation mode, with an average particle number concentration of 2800 ± 2100 cm−3. The contribution of the accumulation mode is the lowest in terms of particle number (19% of total number concentration), but the highest in terms of particle volume concentration. The nucleation mode presented a peak concentration in the morning rush hour, similar to eBC, HOA II and NO2 (Fig. 5).

pollution events tend to occur when condensation processes produce significant number concentrations of accumulation mode particles.
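The description of the PNSD as a sum of nucleation, Aitken and accumulation lognormal modes can be illustrated with a short Python sketch. It evaluates each mode, integrates it to recover the number concentration, and weights it by particle volume to show why a mode with few particles can dominate the volume. Only the nucleation and Aitken number concentrations (2800 and 6900 cm−3) come from the text. The accumulation number concentration (roughly 19% of the total), all geometric mean diameters and all geometric standard deviations are hypothetical values chosen for illustration.

```python
import numpy as np

def lognormal_mode(dp, n_total, dpg, sigma_g):
    """dN/dlog10(Dp) of a single lognormal mode.

    n_total: number concentration (cm^-3); dpg: geometric mean diameter (nm);
    sigma_g: geometric standard deviation (dimensionless).
    """
    log_s = np.log10(sigma_g)
    return (n_total / (np.sqrt(2.0 * np.pi) * log_s)
            * np.exp(-(np.log10(dp) - np.log10(dpg)) ** 2 / (2.0 * log_s ** 2)))

def integrate(y, x):
    """Trapezoidal integral of y over x."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# (n_total, dpg, sigma_g); diameters and widths are assumed, not fitted values.
modes = {
    "nucleation":   (2800.0, 15.0, 1.6),
    "aitken":       (6900.0, 50.0, 1.7),
    "accumulation": (2300.0, 150.0, 1.6),
}

dp = np.logspace(0, 4, 2000)     # 1 nm to 10 um, wide enough to contain each mode
log_dp = np.log10(dp)

# Integrating dN/dlogDp over logDp recovers the number concentration of each mode.
number = {k: integrate(lognormal_mode(dp, *p), log_dp) for k, p in modes.items()}

# Volume per particle is pi/6 * Dp^3 (nm^3); weighting gives dV/dlogDp.
volume = {k: integrate(lognormal_mode(dp, *p) * np.pi / 6.0 * dp ** 3, log_dp)
          for k, p in modes.items()}
```

With these assumed parameters the accumulation mode holds the fewest particles but by far the largest volume, because particle volume scales with the cube of the diameter.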

The strong occurrence of accumulation mode particles under polluted conditions can be explained by the fact that a larger surface area of pre-existing particles favors coagulation processes. Consequently, nucleation is suppressed by coagulation loss and particles become larger. Moreover, the submicron aerosol size distribution is strongly influenced by the competition between nucleation of new particles and condensation of gas-phase components onto pre-existing particles (Rodríguez et al., 2005). Under polluted conditions, the aerosol surface area is sufficient to favor the condensation of vapors onto pre-existing particles, inhibiting nucleation and resulting in particle growth. During low PM1 conditions, the available aerosol surface area is low, decreasing both condensation and coagulation rates, which favors homogeneous nucleation.

Relationships between particle size and chemical composition of submicron particles
The contribution of aerosol size modes to the ambient concentrations of the PM1 chemical species was assessed by applying a multilinear regression (MLR) model. In the MLR model, the time series of volume concentration in the nucleation, Aitken and accumulation modes were used as predictors. PM1 components were used as species of interest. Results of the MLR are summarized in Table 5. The model explained more than 90% of the average measured concentrations for the PM1 species. For the predictors used in the MLR, the calculated variance inflation factor (VIF) was in the range of 1.11 to 2.16. In general, VIFs below regression coefficients (Table 5) and average volume concentrations. Their confidence intervals were calculated according to the confidence intervals of the regression coefficients. PM1 mass loadings were reconstructed as the sum of the partial contributions determined for each size mode (Fig. 9), i.e. nucleation (0.57 µg m−3), Aitken (1.25 µg m−3) and accumulation modes (6.23 µg m−3). The reconstruction explained 75% of the mean measured PM1. It is important to emphasize that the dependent variables in the MLR are the masses measured by the ACSM and MAAP. However, the particle size range measured by the ACSM goes from 70 to 900 nm, so most of the nucleation mode particle composition cannot be directly measured by the ACSM. This can result in larger uncertainties for the reconstructed mass in the nucleation mode; however, since this mode represents a small fraction of the total mass, the resulting uncertainty is likely small. Secondary inorganic species (ammonium, nitrate and sulfate) are partitioned between Aitken and accumulation modes.
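The collinearity check and the mass reconstruction described above can be sketched in Python as follows. The VIF of predictor j is computed as 1/(1 − R_j²), where R_j² is obtained by regressing that predictor on the remaining ones. The sketch assumes that each partial contribution is the regression coefficient multiplied by the average volume concentration of the corresponding mode. The synthetic volume time series and the coefficients in `beta` are hypothetical and are not the values of Table 5.

```python
import numpy as np

def vif(X):
    """Variance inflation factor of each column: VIF_j = 1 / (1 - R_j^2),
    with R_j^2 from regressing column j on the remaining columns."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ coef
        r2 = 1.0 - resid.var() / X[:, j].var()
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(1)
# Synthetic stand-ins for nucleation, Aitken and accumulation volume time series.
v_nuc = rng.uniform(0.1, 1.0, 500)
v_ait = 0.5 * v_nuc + rng.uniform(0.5, 2.0, 500)   # mildly correlated with v_nuc
v_acc = rng.uniform(2.0, 10.0, 500)
V = np.column_stack([v_nuc, v_ait, v_acc])

vifs = vif(V)   # values near 1 indicate weak collinearity among predictors

# Partial contribution of each mode, assumed here to be coefficient x mean volume.
beta = np.array([0.8, 0.9, 1.1])          # hypothetical regression coefficients
partial = beta * V.mean(axis=0)
reconstructed = partial.sum()             # reconstructed mean mass loading
```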
The partitioning of secondary inorganic species between Aitken and accumulation modes is similar to the results of Rodríguez et al. (2007). The authors observed that ambient concentrations of ammonium nitrate and ammonium sulfate correlated better with the accumulation mode, attributing it to condensation mechanisms and particle

In this study, a detailed characterization of submicron particles was performed at an urban site in the MASP. The results show PM1 mass concentrations in close agreement with other megacities, with an average mass concentration of 11.4 µg m−3. As expected, the chemical composition was dominated by organic aerosols (56%), with significant contributions of sulfate (15%) and black carbon (13%). Using PMF analysis, it was possible to identify four OA classes: oxygenated organic aerosol (OOA), biomass burning organic aerosol (BBOA), and two hydrocarbon-like OA components (a typical HOA related to vehicular emissions, and an HOA associated with a mix of anthropogenic sources). Considering the sum of secondary inorganic aerosols and SOA as a lower limit, more than 50% of the PM1 mass loading was estimated to result from secondary production.
Nucleation, Aitken and accumulation lognormal size modes were fitted to the measured PNSD. The Aitken mode dominated the total number concentration with an average concentration of 6900 cm−3, and the submicron aerosol size distribution was strongly influenced by the PM1 levels. The accumulation mode shows a large increase from low PM1 conditions to high PM1 conditions, when the aerosol surface area is sufficient to favor the condensation of vapors onto pre-existing particles, inhibiting nucleation and resulting in particle growth. Conversely, the contribution of particles from the nucleation mode to the total number concentration is higher during low PM1 conditions, when the available aerosol surface area is low, decreasing both condensation and coagulation rates and favoring homogeneous nucleation. Because of the high contribution of nucleation mode particles under low PM1 loadings, PM2.5 and PM10 (parameters frequently used in air quality indices) may be insufficient to assess human PM exposure in urban areas.
The relationships between size modes and chemical constituents of PM1 were assessed by applying an MLR model. Mass loading in the nucleation mode was attributed to fresh particles from traffic (HOA and eBC) and photochemically induced secondary particle formation (OOA). Secondary inorganic species (ammonium, sulfate and nitrate) were partitioned between Aitken and accumulation modes and related to condensational particle growth processes. Submicron mass loading in the accumulation mode included aged secondary organic aerosol and vehicular emissions.
The results presented here emphasize the well-established impact of traffic-related sources in the MASP and make clear the need to reduce emission rates in the region by applying new technologies such as those required by the EURO VI emission standard. It is also essential to expand mass transportation systems, since the metro system in São Paulo is heavily underdeveloped; its expansion would provide a better transportation system for 20 million people. Additionally, encouraging alternative transportation, implementing strong incentives for electric vehicles and restricting passenger car circulation can significantly improve air quality in urban areas. Although the implementation of regulatory programs to control stationary and mobile sources in the MASP over the last decades has been successful in reducing primary emissions, secondary processes have been recognized as critical to air quality in the region. The findings presented here provide new insights into the association between sources and processes governing the physicochemical properties of atmospheric aerosol, and highlight the key role of SOA formation in the ambient PM1 concentrations of a megacity largely impacted by traffic emissions and extensive biofuel usage.