Positive matrix factorization of PM2.5 – eliminating the effects of gas/particle partitioning of semivolatile organic compounds

Gas-phase concentrations of semi-volatile organic compounds (SVOCs) were calculated from gas/particle (G/P) partitioning theory using their measured particle-phase concentrations. The particle-phase data were obtained from an existing filter measurement campaign (27 January 2003–2 October 2005) as a part of the Denver Aerosol Sources and Health (DASH) study, including 970 observations of 71 SVOCs (Xie et al., 2013). In each compound class of SVOCs, the lighter species (e.g. docosane in n-alkanes, fluoranthene in PAHs) had higher total concentrations (gas + particle phase) and lower particle-phase fractions. The total SVOC concentrations were analyzed using positive matrix factorization (PMF). Then the results were compared with source apportionment results where only particle-phase SVOC concentrations were used (filter-based study; Xie et al., 2013). For the filter-based PMF analysis, the factors primarily associated with primary or secondary sources (n-alkane, EC/sterane and inorganic ion factors) exhibit similar contribution time series (r = 0.92–0.98) with their corresponding factors (n-alkane, sterane and nitrate + sulfate factors) in the current work. Three other factors (light n-alkane/PAH, PAH and summer/odd n-alkane factors) are linked with pollution sources influenced by atmospheric processes (e.g. G/P partitioning, photochemical reaction), and were less correlated (r = 0.69–0.84) with their corresponding factors (light SVOC, PAH and bulk carbon factors) in the current work, suggesting that the source apportionment results derived from filter-based SVOC data could be affected by atmospheric processes. PMF analysis was also performed on three temperature-stratified subsets of the total SVOC data, representing ambient sampling during cold (daily average temperature < 10 °C), warm (≥ 10 °C and ≤ 20 °C) and hot (> 20 °C) periods. Unlike the filter-based study, in this work the factor characterized by the low molecular weight (MW) compounds (light SVOC factor) exhibited strong correlations (r = 0.82–0.98) between the full data set and each sub-data set solution, indicating that the impacts of G/P partitioning on receptor-based source apportionment could be eliminated by using total SVOC concentrations.

This discussion paper is/has been under review for the journal Atmospheric Chemistry and Physics (ACP). Please refer to the corresponding final paper in ACP if available.

Introduction
The Denver Aerosol Sources and Health (DASH) study was designed to explore the associations between short-term exposure to individual PM2.5 components, sources and negative health effects (Vedal et al., 2009). Daily 24-h PM2.5 sampling was conducted from mid-2002 to the end of 2008. Speciation of PM2.5 has been carried out for gravimetric mass, inorganic ionic compounds (sulfate, nitrate and ammonium) and carbonaceous components, including elemental carbon (EC), organic carbon (OC) and a large array of semi-volatile organic compounds (SVOCs). Kim et al. (2012) investigated the lag structure of the association between PM2.5 constituents and hospital admissions by disease using the 5-yr bulk speciation data set of the DASH study (nitrate, sulfate, EC and OC). They found that the estimated short-term effects of PM2.5 bulk components, especially those of EC and OC, were more immediate for cardiovascular diseases and more delayed for respiratory diseases. Future work will focus on the association between specific PM2.5 sources and health outcomes.
To develop control strategies for PM2.5, receptor-based models (e.g. positive matrix factorization, chemical mass balance) have been applied to quantitatively apportion PM2.5 to sources that are detrimental to human health (Laden et al., 2000; Mar et al., 2005; Ito et al., 2006). One basic assumption of receptor-based models is that source profiles are constant over the period of ambient and source sampling (Chen et al., 2011). However, the output factors of a receptor model are not necessarily emission sources, and could be affected by atmospheric processes like photochemical reaction or gas/particle (G/P) partitioning (May et al., 2012). The influence of atmospheric processes on certain output factors can change with meteorological conditions (e.g. solar irradiance, ambient temperature). Thus, the assumption of constant source profiles does not hold for all output factors, especially for long time series studies.
from theory (Eq. 2):

$$K_{p,\mathrm{OM}} = \frac{K_p}{f_{\mathrm{OM}}} = \frac{F/M_{\mathrm{OM}}}{A} \tag{1}$$

$$K_{p,\mathrm{OM}} = \frac{R\,T}{10^{6}\,\mathrm{MW}_{\mathrm{OM}}\,\zeta_{\mathrm{OM}}\,p^{\mathrm{o}}_{\mathrm{L}}} \tag{2}$$

where it is assumed that particle-phase organic material (OM) is primarily responsible for the absorptive uptake. Thus, it is meaningful to normalize the G/P partitioning constant (K_p, m^3 µg^-1) by the weight fraction of the absorptive OM phase (f_OM) in the total PM phase (Eq. 1), so as to obtain K_p,OM (m^3 µg^-1). F (ng m^-3) is the mass concentration of each compound associated with the particle phase; A (ng m^-3) is the mass concentration of each compound in the gas phase; M_OM (µg m^-3) is the mass concentration of the particle-phase OM; R (m^3 atm K^-1 mol^-1) is the ideal gas constant; T (K) is the ambient temperature; MW_OM (g mol^-1) is the mean molecular weight (MW) of the absorbing OM phase; ζ_OM is the mole fraction scale activity coefficient of each compound in the absorbing OM phase; and p_L^o (atm) is the vapor pressure of each pure compound. For a given SVOC and a single OM phase, the G/P partitioning is controlled only by ambient temperature (Eq. 2). The mass fraction of the total SVOC in the atmosphere that contributes to the particle phase can thus change with ambient temperature. As such, the source profiles of particle-phase SVOCs are expected to vary due to the influence of G/P partitioning, especially for those sources primarily contributing light SVOCs (e.g. docosane, fluoranthene). Therefore, when using a long time series of speciated PM2.5 data as input for receptor model analysis, the light SVOC-related sources/factors for a sub-period of observation might be obscured by the influence of G/P partitioning, which will subsequently affect the health effect estimation of specific PM2.5 sources.
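As a rough numerical illustration of the partitioning relations described above, the following Python sketch evaluates K_p,OM and the resulting particle-phase fraction. It assumes Eq. (2) takes the standard absorptive form K_p,OM = RT / (10^6 MW_OM ζ_OM p_L^o) and that Eq. (1) gives A = F / (K_p,OM M_OM). The function names and the example vapor pressure and OM loading are hypothetical and are not taken from the DASH data.

```python
# Illustrative sketch only; numeric inputs below are hypothetical.
R = 8.2e-5  # ideal gas constant, m^3 atm K^-1 mol^-1


def kp_om(temp_k, p_l_atm, mw_om=200.0, zeta_om=1.0):
    """Absorptive partitioning constant K_p,OM (m^3 ug^-1), assumed form of Eq. (2)."""
    return R * temp_k / (1.0e6 * mw_om * zeta_om * p_l_atm)


def particle_fraction(kp, m_om):
    """Fraction F / (F + A) implied by Eq. (1), with M_OM in ug m^-3."""
    x = kp * m_om
    return x / (1.0 + x)


# Hypothetical compound: vapor pressure 1e-9 atm at 298 K, M_OM = 5 ug m^-3.
kp_example = kp_om(298.0, 1.0e-9)
print(kp_example, particle_fraction(kp_example, 5.0))
```

In this sketch the vapor pressure is passed in directly, so any temperature dependence of p_L^o has to be applied by the caller before the function is used.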
In this study, gas-phase SVOC concentrations were estimated using their particle-phase concentrations based on Eq. (1). The particle-phase concentrations of SVOCs were obtained from an existing 32-month series of daily PM2.5 speciation, which has been used for source apportionment in a previous study (Xie et al., 2013). In order to eliminate the influence of G/P partitioning on source apportionment, the total concentrations of gas- and particle-phase SVOCs were used as inputs for PMF analysis. The PMF2 model (Paatero, 1998a, b), coupled with a stationary block bootstrap technique quantifying errors due to random sampling (Hemann et al., 2009), was the primary source apportionment tool. Moreover, the 32-month data set of total SVOCs was divided into three sub-data sets by daily average temperature for source apportionment using the identical method. The smaller sub-data sets are used as inputs to verify the elimination of G/P partitioning influence from the total SVOC-based PMF analysis.

Particle phase measurements
Daily PM2.5 samples were collected on the top of a two-story elementary school building in urban Denver. Details of the sampling site, setup, protocols and chemical analysis have been published by Vedal et al. (2009) and Dutton et al. (2009a, b). Daily aver- Concentrations of inorganic ions, bulk elemental carbon (EC) and organic carbon (OC) were also measured for the same study period. The pointwise, blank-corrected concentration uncertainties of each species were estimated by using the root sum of squares (RSS) method (Dutton et al., 2009a, b). The concentration and uncertainty data sets have been used as inputs for a filter-based source apportionment in a previous study (Xie et al., 2013). The meteorological (temperature, relative humidity and solar irradiance) and trace gas (ozone, nitrogen oxides (NOx) and CO) data used in this study were also obtained from Xie et al. (2013).

Gas phase concentration and uncertainty estimation

The K_p,OM value for each species on each day was calculated by Eq. (2). Here four parameters are required, including T, MW_OM, ζ_OM and p_L^o. For this application T is the measured daily average temperature. Based on smog chamber and ambient studies (Odum et al., 1996; Hallquist et al., 2009), 150–250 g mol^-1 is a reasonable range for the average MW of the particulate OM phase; here we assume the MW_OM to be 200 g mol^-1 for all samples, as is used in previous work (Barsanti and Pankow, 2004; Williams et al., 2010). Values of ζ_OM were assumed to be unity for all species in each sample. Values of p_L^o were estimated using the group contribution methods (GCMs) SPARC (Hilal et al., 1995; http://archemcalc.com/sparc/test/) and SIMPOL (Pankow and Asher, 2008 Table S1.

Gas-phase concentrations of each SVOC were calculated by Eq. (1). The values of F for each SVOC in Eq. (1) were obtained from existing PM2.5 measurements (Xie et al., 2013); M_OM was estimated by multiplying the OC concentrations by a scaling factor of 1.53, which resulted in optimum mass closure of PM2.5 in a previous DASH study (Dutton et al., 2009a). The total concentration of each SVOC (S, gas + particle phase) on each day is then obtained by Eq. (4),

$$S = F + A = F\left(1 + \frac{1}{K_{p,\mathrm{OM}}\,M_{\mathrm{OM}}}\right) \tag{4}$$

The uncertainty associated with S estimation was also calculated using the RSS method,

$$\delta S = \sqrt{\left(\frac{\partial S}{\partial F}\,\delta F\right)^{2} + \left(\frac{\partial S}{\partial M_{\mathrm{OM}}}\,\delta M_{\mathrm{OM}}\right)^{2}} \tag{5}$$

where δS is the propagated uncertainty in S; δF and δM_OM are the propagated uncertainties associated with particle-phase SVOC and M_OM measurements, and could be obtained from the uncertainty data sets introduced in Sect. 2.1. The K_p,OM value uncertainty was not estimated in the current work. Statistics for the total concentration of each SVOC are given in Table S1, including the mean and median concentrations, mean particle-phase fractions, signal to noise ratios (S/N = mean concentration/mean uncertainty) and coefficients of variation (CV = standard deviation/mean concentration). Table S1 also lists statistics of particulate bulk components (mass, nitrate, sulfate, ammonium, EC and OC). The OC concentrations are shown in 5 fractions (OC1-4 and PC), representing the carbon measured at four distinct temperature steps (340, 500, 615 and 900 °C) with a pyrolyzed carbon adjustment in the first heating cycle of the NIOSH 5040 thermal optical transmission (TOT) method (NIOSH, 2003; Schauer et al., 2003).
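The calculation of total concentration and its propagated uncertainty can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: Eq. (1) is rearranged as A = F / (K_p,OM M_OM), the total is S = F + A, and the RSS uncertainty propagates only δF and δM_OM through the partial derivatives of S (the K_p,OM uncertainty is ignored, as stated in the text). Function names and the example numbers are hypothetical.

```python
import math


def m_om_from_oc(oc, om_oc_ratio=1.53):
    """Particle-phase OM (ug m^-3) from OC, using the OM/OC scaling factor of 1.53."""
    return om_oc_ratio * oc


def total_svoc(f, kp, m_om):
    """Total concentration S = F + A, with A = F / (K_p,OM * M_OM)."""
    a = f / (kp * m_om)
    return f + a


def total_svoc_uncertainty(f, kp, m_om, d_f, d_m_om):
    """Root-sum-of-squares propagation of dF and dM_OM into S (K_p,OM treated as exact)."""
    ds_df = 1.0 + 1.0 / (kp * m_om)   # partial derivative of S with respect to F
    ds_dm = -f / (kp * m_om ** 2)     # partial derivative of S with respect to M_OM
    return math.sqrt((ds_df * d_f) ** 2 + (ds_dm * d_m_om) ** 2)


# Hypothetical day: F = 2 ng m^-3, K_p,OM = 0.1 m^3 ug^-1, M_OM = 5 ug m^-3.
print(total_svoc(2.0, 0.1, 5.0), total_svoc_uncertainty(2.0, 0.1, 5.0, 0.2, 0.5))
```

Because the derivative with respect to F grows as K_p,OM M_OM shrinks, the propagated uncertainty in this sketch is largest for the light, mostly gas-phase species.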

PMF2 (Paatero, 1998a, b), a multivariate receptor model, was used for source apportionment in this study. It is the primary source apportionment tool applied in the DASH project, and is discussed in detail by Dutton et al. (2010). PMF uses an uncertainty-weighted least-squares fitting approach to identify distinct factor profiles and quantify factor contributions from a time series of observations. The bias and variability in factor profiles and contributions due to random sampling error were estimated by applying a method from Hemann et al. (2009). One thousand replicate data sets were generated from the original data set using a stationary block bootstrap technique and each was analyzed with PMF. Because the ordering of factors may differ across solutions on bootstrap replicate data sets (e.g. factor i in one solution may correspond to factor j in another), multilayer feed-forward neural networks were trained to sort and align the factor profiles from each PMF bootstrap solution to those of the original solution based on the observed data (known as the base case). A PMF bootstrap solution was recorded only when each factor of that solution could be uniquely matched to a base case factor. The measurement days resampled in each recorded solution were tracked to examine the bias and variability in the contribution of each factor on each day, which could then be used to assess the variability of the PMF model fit. In this work, the factor number was determined based on the interpretability of different PMF solutions (5–9 factors) as well as stability across bootstrap replicate data sets, as represented by the factor matching rate.
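Two ingredients of this procedure can be illustrated with a short Python sketch: the uncertainty-weighted least-squares objective that PMF minimizes, and a generic stationary block bootstrap for resampling measurement days. The sketch is not the PMF2 program or the Hemann et al. (2009) implementation. The resampling follows the usual stationary bootstrap scheme (blocks of random, geometrically distributed length with wrap-around), and the mean block length is a hypothetical parameter, since the value used in the DASH analysis is not stated here.

```python
import random


def pmf_q(x, g, f, sigma):
    """Sum of squared, uncertainty-scaled residuals for X ~ G F.

    x[i][j]: observation of species j on day i; g[i][k]: contribution of
    factor k on day i; f[k][j]: profile of factor k; sigma[i][j]: uncertainty.
    """
    q = 0.0
    for i, row in enumerate(x):
        for j, x_ij in enumerate(row):
            fitted = sum(g[i][k] * f[k][j] for k in range(len(f)))
            q += ((x_ij - fitted) / sigma[i][j]) ** 2
    return q


def stationary_bootstrap_indices(n, mean_block_len, rng):
    """Resampled day indices: random-length blocks of consecutive days, wrapping at the end."""
    if mean_block_len < 1:
        raise ValueError("mean_block_len must be at least 1")
    p_new_block = 1.0 / mean_block_len
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < p_new_block:
            idx.append(rng.randrange(n))      # start a new block at a random day
        else:
            idx.append((idx[-1] + 1) % n)     # continue the current block
    return idx


# One hypothetical replicate of a 970-day series with mean block length 10.
replicate_days = stationary_bootstrap_indices(970, 10.0, random.Random(0))
print(replicate_days[:15])
```

Keeping the resampled day indices, as done here, is what allows the bias and variability of each factor contribution to be tracked by day across replicates.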

Preparation of PMF input data set
Fifty-one SVOCs and four bulk species were selected from all species with 970 daily observations for filter-based PM2.5 source apportionment (Xie et al., 2013). The species screening was based on the percentage of missing values and observations below detection limit (BDL), S/N ratios and the stability of the PMF solution. In this work, the candidate SVOCs for source apportionment were selected from the fifty-one species used in the previous study. Bulk species were selected from nitrate, sulfate, EC and the five OC fractions. Interpretability and factor matching rate (> 50 %) of the PMF solution were the criteria for species screening. Among the five OC fractions, the OC1 concentration was measured at the lowest temperature (340 °C) and is the most likely to be influenced by G/P partitioning; the gas-phase concentration of OC1 could not be estimated due to its complex composition. The OC4 concentration was very low, with a low S/N ratio. Thus OC1 and OC4 were excluded from PMF analysis. The other three fractions (OC2, OC3, PC) were assumed to be less volatile or non-volatile and were included in PMF analysis. Finally, the six bulk species with 970 daily observations and forty-six SVOCs with 970 estimated total concentrations constituted the primary PMF input data set.

Similarly to the previous Xie et al. (2013) study, PMF analysis was also performed for three temperature-stratified subsets of the original 970 samples. The three sub-data sets consisted of sampling days with daily average temperature less than 10 °C (N = 364), between 10 °C and 20 °C (N = 318), and greater than 20 °C (N = 288), respectively. The sampling periods of these three sub-data sets were defined as cold, warm and hot. The statistics of total SVOCs during each of these three periods are shown in Tables S2-S4. PMF input species screening for each sub-data set was conducted in the same manner as for the full data set.
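The temperature stratification can be expressed as a simple rule on the daily average temperature. The sketch below uses the thresholds stated for the cold (< 10 °C), warm (≥ 10 °C and ≤ 20 °C) and hot (> 20 °C) periods; the function names and the example temperatures are hypothetical.

```python
def temperature_period(daily_mean_temp_c):
    """Classify one sampling day by its daily average temperature (deg C)."""
    if daily_mean_temp_c < 10.0:
        return "cold"
    if daily_mean_temp_c <= 20.0:
        return "warm"
    return "hot"


def stratify(daily_temps_c):
    """Return the day indices belonging to each temperature-stratified subset."""
    subsets = {"cold": [], "warm": [], "hot": []}
    for day_index, temp in enumerate(daily_temps_c):
        subsets[temperature_period(temp)].append(day_index)
    return subsets


# Hypothetical daily average temperatures for five sampling days.
print(stratify([-5.0, 10.0, 20.0, 25.3, 9.9]))
```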

Results and discussion
3.1 Total SVOCs and their particle-phase fractions

Except for steranes, the low MW species have the highest total concentrations and the lowest particle-phase fractions in each class of SVOCs (Table S1). For example, docosane and fluoranthene are the most abundant species in n-alkanes and PAHs, with mean concentrations of 32.8 ng m^-3 and 11.2 ng m^-3 respectively, one to two orders of magnitude higher than those of high MW species in their chemical classes. In this study, the total concentrations of light n-alkane (e.g. docosane to pentacosane) and PAH (e.g. MW = 202) species increased by more than 100 % from the cold to the hot periods (Tables S2-S4), possibly due to the evaporation of fossil fuels (Nahir, 1999) and increases in biogenic VOC emissions with increasing temperature. The average particle-phase fraction of each SVOC was calculated for the cold, warm and hot periods and is shown in Fig. 1. The particle-phase fractions are highest in cold periods and lowest in hot periods, especially for the light SVOCs (e.g. docosane, fluoranthene), indicating a change in G/P partitioning behavior across different temperatures. Long chain n-alkanes (chain length > 27), heavy PAHs (MW > 252), steranes, hopanes, and sterols are mostly in the particle phase (> 75 %) for all periods and less subject to evaporation (or partitioning to the gas phase) under higher temperatures. In Table S5, the estimated particle-phase fractions of selected SVOCs (n-alkanes, PAHs, sterane and hopanes) in hot periods are more comparable with those observed by Fraser et al. (1997, 1998) in summer in Los Angeles than with those observed in summer in Athens (Greece) (Mandalakis et al., 2002). Average fractions of particulate PAHs for the whole period are similar to the annual averages measured by Tsapkis and Stephanou (2005) in Heraklion (Greece). In contrast, large differences were observed for the particle-phase fractions of light PAHs (MW < 252) in cold and hot periods compared with those measured in urban Chicago (Simcik et al., 1997, 1998). Overall, these comparisons indicate that the estimations of G/P distributions of the SVOCs in this work are reasonable, keeping in mind that the differences may be influenced by parameters other than T, such as MW_OM, ζ_OM and M_OM in Eqs. (1) and (2).

Sensitivity of total SVOC estimation based on G/P partitioning theory
Based on G/P partitioning theory, changes in ambient temperature lead to the evaporation or condensation of SVOCs; the extent of such changes with temperature depends in part on the values of MW_OM and ζ_OM, here assumed to be 200 g mol^-1 and unity, respectively. However, MW_OM and ζ_OM are highly dependent on the composition of PM, which is complex in an urban area and mostly unknown. The MW_OM values are typically based on MW of organic compounds detected in laboratory and field studies, but in some cases (e.g. under high relative humidity, RH) need to be adjusted downward for the presence of water in the particulate OM phase (Pankow and Chang, 2008; varied amounts of polar and non-polar organic compounds and water) (Pankow and Chang, 2008; Pun, 2008). The uncertainties in these two parameters, as well as the OM/OC ratio, could affect the estimation of total SVOC concentration as described in Sect. 3.1. Combining Eqs. (2) and (4), the equation for total SVOC calculation can be re-written as:

$$S = F\left(1 + \frac{10^{6}\,\mathrm{MW}_{\mathrm{OM}}\,\zeta_{\mathrm{OM}}\,p^{\mathrm{o}}_{\mathrm{L}}}{R\,T\,M_{\mathrm{OM}}}\right) \tag{6}$$

from which we can infer that the estimation of total concentration (S value) for a specific SVOC is primarily determined by the following term:

$$z = \frac{10^{6}\,\mathrm{MW}_{\mathrm{OM}}\,\zeta_{\mathrm{OM}}\,p^{\mathrm{o}}_{\mathrm{L}}}{R\,T\,M_{\mathrm{OM}}} \tag{7}$$

If z is close to 0, then most of the target SVOC is in the particle phase; if z is close to or higher than 1, then the target SVOC is strongly subject to G/P partitioning. The sensitivity of total SVOC estimation (S value) to T, ζ_OM, OM/OC ratio and MW_OM can be evaluated as the changes of z value to these uncertain parameters in Eq. (7) Bae et al. (2006), were used to test the sensitivity of z value (or S value) calculation. The values of the above parameters investigated are listed in Table 1. In Fig. 2, the sensitivities of z value to T, ζ_OM, OM/OC ratio and MW_OM are shown in nine mesh plots. Each mesh plot exhibits the changes of z value to varied M_OM and MW_OM is much larger than that of M_OM, so the effect of MW_OM on the calculation of z value seems more important than that of the OM/OC ratio. However, if M_OM and MW_OM have similar variations (e.g. OM/OC ranges from 1.2 to 2.0, and MW_OM ranges from 150 to 250 g mol^-1), then these two parameters should have similar effects on the calculation of z value (or S value).

As demonstrated by the sensitivity study, the estimation of total SVOC concentration is most sensitive to ambient temperature. In this work, the sensitivity of G/P partitioning to ambient temperature is largely accounted for by adjusting the vapor pressure of each SVOC according to the daily average temperature. However, the total SVOC concentration estimated in the current work might be subject to considerable uncertainty due to the variations of ζ_OM, MW_OM and OM/OC ratio across the sampling period.
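The kind of sensitivity test described in this section can be mimicked with a small parameter scan. The Python sketch below assumes that the term in Eq. (7) has the form z = 10^6 MW_OM ζ_OM p_L^o / (R T M_OM), with M_OM obtained from OC and the OM/OC ratio; it scans the example ranges mentioned above (OM/OC from 1.2 to 2.0, MW_OM from 150 to 250 g mol^-1). The vapor pressure, temperature and OC concentration are hypothetical values chosen only for illustration, and the temperature dependence of p_L^o is not modeled.

```python
R = 8.2e-5  # ideal gas constant, m^3 atm K^-1 mol^-1


def z_value(temp_k, p_l_atm, oc, om_oc_ratio=1.53, mw_om=200.0, zeta_om=1.0):
    """Gas-to-particle ratio A/F (assumed form of the z term), with OC in ug m^-3."""
    m_om = om_oc_ratio * oc
    return 1.0e6 * mw_om * zeta_om * p_l_atm / (R * temp_k * m_om)


# Hypothetical compound (p_L^o = 1e-9 atm) at 298 K with OC = 3 ug m^-3.
for mw_om in (150.0, 200.0, 250.0):
    for ratio in (1.2, 1.53, 2.0):
        print(mw_om, ratio, round(z_value(298.0, 1.0e-9, 3.0, ratio, mw_om), 3))
```

Since z is proportional to MW_OM and inversely proportional to the OM/OC ratio in this form, both parameters span a factor of about 1.7 across the scanned ranges, which is consistent with the statement that similar relative variations give similar effects.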

PMF results for the full data set
A 7-factor solution was determined for the full data set using total SVOC concentration due to the most readily interpretable resulting factors and a relatively high factor matching rate of 79.9 % between bootstrapped and base case solutions ( with one standard deviation from bootstrapped PMF solutions, which represent the variability of the PMF solution due to random sampling error. The factor contributions are also summarized by day of the week in boxplots (Fig. S3). The factor profiles have been normalized by

$$F^{*}_{kj} = \frac{F_{kj}}{\sum_{k'} F_{k'j}}$$

where F*_kj is the relative weighting of species j in factor k to all other factors. The median factor contributions in Fig. S2 are expressed as reconstructed PM2.5 mass, the sum of nitrate, sulfate, EC and straight OC fractions contributed by each factor. The contribution time series were divided into three periods (cold, warm and hot) and are shown as the average contributions to major PM2.5 components (nitrate, sulfate, EC and OC; Table 3). The sum of factor contributions to each component can be compared with the observed average concentration (Table 3). The sampling variability of factor contributions is represented by the median CVs (CV = standard deviation/median factor contribution). In addition, the factor contributions during each period were linearly regressed against meteorological and trace gas measurements in the same manner as discussed in the previous Xie et al. (2013) study, so as to understand the association between each factor and pollution sources/processes. The resulting correlation coefficients are given in Table S6.
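The profile normalization can be illustrated with a short Python sketch. It assumes that "relative weighting of species j in factor k to all other factors" means dividing each loading by the sum of that species' loadings over all factors, so that each species column sums to one; the function name and the toy profile matrix are hypothetical.

```python
def normalize_profiles(f):
    """Scale f[k][j] (factor k, species j) so each species sums to 1 across factors."""
    n_factors = len(f)
    n_species = len(f[0])
    col_sums = [sum(f[k][j] for k in range(n_factors)) for j in range(n_species)]
    return [
        [f[k][j] / col_sums[j] if col_sums[j] > 0 else 0.0 for j in range(n_species)]
        for k in range(n_factors)
    ]


# Toy example with two factors and two species.
print(normalize_profiles([[1.0, 3.0], [3.0, 1.0]]))
```

With this scaling, a value near one means the species is apportioned almost entirely to a single factor, which makes marker species easy to spot in a profile plot.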
In Table 3, the nitrate and sulfate concentrations are dominated by the nitrate (average 59.4 %–97.4 %) and sulfate (79.5 %–96.0 %) factors in all periods. In cold periods, the PAH factor (39.9 %) had the highest contribution to EC concentrations, followed by the sterane (25.2 %) and bulk carbon (23.0 %) factors; in warm and hot periods, the bulk carbon factor contributed most of the EC concentrations (warm, 53.3 %; hot, 76.5 %). The bulk carbon factor also has the highest contribution to OC (36.6 %–67.9 %) in all periods. Here the OC consists of the three less volatile or non-volatile OC fractions (OC2, OC3 and PC) that were used for source apportionment. The factors with small contributions to reconstructed PM2.5 are prone to high variability, as shown by their higher CVs (e.g. n-alkane, sterane and PAH factors). In each period, the sum of factor contributions to each major PM2.5 component is close to the observed average concentration.

Comparison to filter-based source apportionment
In the previous Xie et al. (2013) study, an 8-factor solution was determined with factors labeled as inorganic ion, n-alkanes, EC/sterane, light n-alkane/PAH, medium alkane/alkanoic acid, PAH, winter/methoxyphenol and summer/odd n-alkane. The medium alkane/alkanoic acid and winter/methoxyphenol factors only contributed a small part (0.41 %–1.10 % and 0.16 %–4.21 %, respectively) of reconstructed PM2.5 mass and were not resolved in this study. The 7 factors resolved in the current work could be matched with the remaining 6 factors in the filter-based solution after combining the nitrate and sulfate factors. Correlations of factor contributions between the matched pairs of factors are shown in Fig. 3. The factors characterized by inorganic ions, heavy n-alkanes and steranes exhibit strong correlations (r = 0.92–0.98) between the filter-based and total SVOC-based PMF solutions (Fig. 3). These strong correlations arise because these factors are primarily linked with secondary formation or primary emission, and the heavy n-alkanes and steranes are mostly distributed in the particle phase (Fig. 1). The light n-alkane/PAH and PAH factors from the filter-based solution are less correlated with the light SVOC (r = 0.73) and PAH (r = 0.84) factors from the total SVOC-based solution (Fig. 3). This is because these factors contain a significant fraction of light organic compounds, which are more strongly subject to G/P partitioning. In Fig. 4a, the light SVOC factor shows an increase in contribution when the temperature rises, supporting the association of this factor with fossil fuel evaporation and biogenic emissions. In contrast, the light n-alkane/PAH factor from the filter-based solution exhibits low contributions in mid-summer when the temperature is the highest of the year and small peaks in winter when the temperature is low (Fig. 4b) the partitioning of gas-phase organics to the particle phase. In addition, the high ozone concentrations in mid-summer could also be responsible for the decrease in factor contribution, since negative correlations have been observed between ozone concentration and the two matched factors (light SVOC: −0.48, Table S6; light n-alkane/PAH: −0.52, Xie et al., 2013) from both solutions during hot periods. No obvious difference in contribution time series was observed for the PAH factor between the filter-based and total SVOC-based PMF solutions, since the PAH factor was mostly characterized by medium and high MW PAHs (MW ≥ 226; Fig. S1f).
The bulk carbon factor in the current work contains the largest percentages of EC and OC fractions (Fig. S1g), and has maximum contributions in summer (Fig. S2g). This factor should be influenced by both secondary organic aerosols (SOA), as supported by the correlation between the factor contribution and ozone concentrations in hot periods (r = 0.36; Table S6), and primary emissions from motor vehicles, as supported by the weekend decrease in factor contribution (Fig. S3g) and the correlations between the factor contribution and NOx and CO concentrations (Table S6). The summer/odd n-alkane factor from the filter-based solution was primarily associated with SOA formation, which led to a moderate correlation (r = 0.69; Fig. 3f) with the bulk carbon factor in the current work. Except for the inorganic ion factor, all carbonaceous factors from the filter-based solution show higher contributions than their matched factors from the total SVOC-based solution, as illustrated by the regression slopes ranging from 1.3 to 2.7 (Fig. 3). This can mostly be attributed to the fact that the OC1 fraction, which accounted for 47.6 % of the total OC on average, was not included for source apportionment in the current study.

PMF results for temperature-stratified sub-data sets
Statistics of PMF simulations for the three temperature-stratified sub-data sets are given in Table 2. As for the full data set, the same species and factor number were chosen for PMF analysis of the cold and warm period sub-data sets. The factor matching rates are 88.6 % and 77.2 %, respectively (Table 2). For the hot period and can be compared to those from the full data set solution. Median CVs of factor contributions are also included in Table 3 to reflect the variability from random sampling error. In addition, the correlations between factor contributions and meteorological and trace gas measurements are given in Table S7. Similarly to the full data set solution, the nitrate and sulfate concentrations are mostly accounted for by the nitrate (average 93.9 %–94.7 %) and sulfate (85.2 %–87.9 %) factors (Table 3). The EC and OC concentrations are mostly apportioned to the bulk carbon factor (EC, 48.9 %–64.9 %; OC, 32.9 %–50.7 %) for all periods.

Comparison to PMF results of the full data set
The factors from the analysis of each temperature-stratified sub-data set were matched to those from the full data set based on factor profiles. The linear regressions of factor contributions between matched pairs of factors are given in Table 4, so as to verify that the influence of G/P partitioning was eliminated from the PMF analysis by using the total SVOC data set. However, we cannot rule out the impacts of other atmospheric processes like photochemical reactions.
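The comparison of matched factors reduces to a correlation coefficient and a regression slope between two contribution time series on the same days. The Python sketch below computes both with ordinary least squares; which solution is treated as the independent variable is an assumption here, and the function name and example series are hypothetical.

```python
import math


def pearson_r_and_slope(x, y):
    """Pearson r and least-squares slope of y on x for two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy), sxy / sxx


# Hypothetical contributions of one matched factor on the same four days:
# x from the full data set solution, y from a sub-data set solution.
r, slope = pearson_r_and_slope([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(r, slope)
```

A high r with a slope far from unity would indicate that two solutions agree on the timing of a factor but not on its magnitude, which is why both quantities are worth reporting together.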

Cold period
All the factors resolved by using the cold period sub-data set show factor profiles similar to those of their corresponding factors from the full data set solution (Figs. S1 and S4). The EC concentration is more strongly apportioned to the bulk carbon factor from the cold period solution (average 63.8 %) than to that from the full data set solution (22.2 %; Table 3). Moreover, strong correlations were observed between the bulk carbon factor from the cold period solution and NOx (r = 0.76) and CO (r = 0.76; Table S7) concentrations. As such, the bulk carbon factor from the cold period solution should be mainly associated with primary emissions (e.g. gasoline and diesel vehicles). The full data set solution assumes constant co-influence of primary and secondary sources throughout the sampling period, which leads to a moderate correlation (r = 0.54; Table 4) of the bulk carbon factor between the full data set and cold period solutions. For other factors, relatively strong correlations (r = 0.96–1.00; Table 4) were observed between the two solutions, indicating that these matched pairs of factors could be linked to similar pollution sources/processes. Among all the factors, the light SVOC factor is the most likely to be influenced by G/P partitioning when only the filter measurement data are used for source apportionment. The influence of G/P partitioning should differ across periods due to the distinct temperature ranges, while the filter-based full data set solution assumes constant G/P partitioning influence. In Fig. 5a, d, the light n-alkane/PAH factor from the filter-based PMF analysis was more poorly correlated (r = 0.41) between the cold period and the full data set solutions (Xie et al., 2013) than the light SVOC factor from the total SVOC-based PMF analysis (r = 0.96). These results suggest that the G/P partitioning influence was removed from PMF analysis by using the total SVOC data set as input.

Warm period
The factors resolved by using the warm period sub-data set also have factor profiles similar to those from the full data set solution (Figs. S1 and S5). Moreover, the factor contributions of the warm period and full data set solutions are relatively strongly correlated (r = 0.96–0.99) with regression slopes close to unity (0.73–1.30; Table 4). Such consistency between the warm period and full data set solutions was also observed in the previous Xie et al. (2013) study. One explanation is that the PMF model is solved by minimizing the sum of the squared, scaled residuals, and thus requires the mean concentrations of most species to be fit well. The average concentrations of most SVOCs in warm periods are closer to the averages of the whole period than those during cold and hot periods. Thus, the factor contributions of the warm period solution are more consistent with those of the full data set solution.

Hot period
For the hot period, the nitrate measurements were not included for source apportionment due to the high percentages of missing and BDL observations, resulting in the omission of the nitrate factor. Meanwhile, a new factor was resolved and labeled as median n-alkane. It contains a significant fraction of n-alkanes with chain lengths ranging from 22 to 29 (Fig. S6g). The factor contribution was moderately correlated with ambient temperature (r = 0.59) and anti-correlated with relative humidity (r = −0.45; Table S7). The median n-alkane factor might therefore be linked with temperature-dependent summertime emissions, with a contribution time series opposite to that of relative humidity. The median n-alkane factor was also identified by using the filter-based sub-data set for hot periods (Xie et al., 2013), and was well correlated (r = 0.80) with that identified in this work. The other factors were matched to those from the full data set solution with strong correlations (r = 0.79–0.99; Table 4). However, the regression plot for the light SVOC factor in hot periods (Fig. 5f) is more scattered than those in cold and warm periods (Fig. 5d, e), and from the cold to hot periods the light SVOC factor becomes less correlated with ambient temperature (r, 0.61 → 0.07; Table S7). These observations could be caused by the increased photochemical reactions during hot periods, as supported by the negative correlation (r = −0.46) between the light SVOC factor and ozone concentration.

Conclusions
The gas-phase concentrations of 71 SVOCs were estimated using particle-phase measurements by G/P partitioning theory. In order to eliminate the impacts of G/P partitioning on PMF analysis, the gas-phase concentrations of all SVOCs were added to their ent sampling during the cold, warm and hot periods, were also analyzed using PMF. Unlike the light n-alkane/PAH factor from the filter-based study, the light SVOC factor from the total SVOC-based PMF solution exhibited strong correlations (r = 0.82–0.98) between the full data set and each sub-data set solution. These results suggest that the influences of G/P partitioning on PMF analysis could be removed by using total SVOC (gas + particle phase) data. However, the impact of photochemical processes has not been ruled out in this work, as illustrated by the moderate correlation (r = 0.54) between the bulk carbon factor of the full data set solution and that of the cold period solution.
This study is our first step in improving SVOC-based PMF analysis by removing the impacts of G/P partitioning. However, the assumptions (e.g. MW_OM and ζ_OM values) made for the calculation of gas-phase SVOC concentrations need to be verified, and if necessary refined, by comparing with field measurements. Additionally, more source markers are required to further apportion the bulk carbon factor. Finally, gas-phase OC data are needed to further understand the ambient OC sources. All of the above will be considered in our subsequent work.