Measurement report: Spatiotemporal and policy-related variations of PM2.5 compositions and sources during 2015-2019 at multisite of a Chinese megacity

A thorough understanding of the relationship between urbanization and PM2.5 (fine particulate matter with 10 aerodynamic diameter less than 2.5 μm) variation is crucial for researchers and policymakers to study health effects and improve air quality. In this study, we selected a fast-developing Chinese megacity as the studied area to investigate the spatiotemporal and policy-related variations of PM2.5 compositions and sources based on a long-term observation at multisite. A total of 836 samples were collected at 19 sites in wintertime of 2015-2019. According to the specific characteristics, 19 sampling sites were assigned into three layers. Layer 1 was the most urbanized area referred to the core zone of Chengdu, layer 15 2 was located in the outside circle of layer 1, and layer 3 belonged to the outer-most zone with the lowest urbanization level. The averaged PM2.5 concentrations for five years were in the order of layer 2 (133 μg m) > layer 1 (126 μg m) > layer 3 (121μg m). And for each year, the spatial clustering of chemical compositions at sampling sites was generally consistent with the classification of layers. PM2.5 compositions for layer 3 in 2019 were found to be similar to that for other layers two or three years ago, implying that the urbanization levels had a strong effect on air quality. During the sampled period, a decreasing 20 trend was observed for the annual averaged PM2.5 concentrations, especially at sampling sites in layer 1, which was caused by the more strict control policies implemented in layer 1. The SO4/NO3 mass ratio at most sites exceeded 1 in 2015 but dropped less than 1 since 2016, reflecting decreasing coal combustion and increasing traffic impacts in Chengdu. The positive matrix factorization (PMF) model was applied to quantify PM2.5 sources. A total of five sources were identified with the average contributions of 15.5% (traffic emission), 19.7% (coal and biomass combustion), 8.8% (industrial emission), 39.7% (secondary 25 particles) and 16.2% (resuspended dust), respectively. From 2015 to 2019, dramatical decline was observed in the average percentage contributions of coal and biomass combustion, but traffic emission source showed an increasing trend. For spatial variations, coal and biomass combustion and industrial emission showed the stronger distribution patterns. High contributions of resuspended dust were occurred at sites with intensive construction activities such as subway and airport constructions. Combining the PMF results, we developed the source weighted potential source contribution function (SWPSCF) method for 30 source localization, this new method highlighted the influences of spatial distribution for source contributions, and the effectiveness of the SWPSCF method was well-evaluated.

conducted by governments to alleviate the pollution (Yan et al., 2018;Cai et al., 2017). The urbanization stage and emphasis 40 of policies vary greatly in both time and space Gurjar et al., 2016;Seto et al., 2017), causing the significant spatiotemporal heterogeneity in the PM2.5 distribution. Thus, a thorough understanding of the spatiotemporal and policy-related variations of PM2.5 is necessary to investigate the relationship between urbanization and PM2.5. Previous studies have investigated the spatiotemporal variability of PM2.5 with the impact of urbanization (Li et al., 2016;Timmermans et al., 2017;Zhang et al., 2019;Yang et al., 2020;Seto et al., 2017), among which a small number of literatures devoted to the analysis of 45 PM2.5 compositions and sources (Lin et al., 2014;Yan et al., 2018). However, there is a lack of research on multisite and longterm sampling for PM2.5 compositions over a city-size area (Dai et al., 2020;Xu et al., 2020a;Fang et al., 2020). Systematic measurement based on multisite and long-term observation can provide valuable data for the comprehensive understanding of PM2.5 characteristics and variations. Related studies are critical for promulgating targeted control policies from the perspective of urbanization. 50 In a city-size area, there exist a large number of natural and anthropogenic emission sources, such as soil or road dust, vehicle exhaust, biomass combustion, sea salt, forest fires, and they have great spatiotemporal variations (Zhang et al., 2015;Zhang et al., 2013;Mirowsky et al., 2013;Yang et al., 2018b). It is essential to identify and apportion PM2.5 sources for providing targeted control policies. To date, receptor models have been applied in a number of source apportionment studies of PM2.5, 55 including factor analysis models (like PCA-MLR, PMF, UNMIX, and ME2) and chemical mass balance (CMB) techniques (Shi et al., 2009;Choi et al., 2015;Hasheminassab et al., 2014;Liu et al., 2015). These receptor models have been proved to be the effective methods of identifying and apportioning sources. Furthermore, to identify the likely source regions for a receptor site, a number of trajectory statistical methods have been widely applied, including concentration field (CF), concentration weighted trajectory (CWT), potential source contribution function (PSCF) and so on (Chen et al., 2011;Gebhart 60 et al., 2011;Riuttanen et al., 2013;Kulshrestha et al., 2009b). For PSCF method, due to the sources showed discrepant spatial distribution patterns over the studied region, when trajectories passed over the grid cell in which a source category showed high local contributions, the probability of potential contribution for this grid cell should be relatively high in theory, which have been ignored during traditional PSCF modelling. Thus, the source weighted PSCF (SWPSCF) method would be developed in this work which combines PMF with PSCF and takes account of the spatial distribution of contributions for each 65 source category. The SWPSCF can be employed as a valuable tool to obtain more precise hint on potential source areas.
In China, megacities have experienced frequent air pollution events in response to rapid economic growth and urbanization (Li et al., 2016;Luo et al., 2018), which promoted governments to take various measures to improve air quality. Chengdu, one of typical megacities in China, can represent an illustrative example of urbanization in a metropolitan region. Since the 70 epidemiological studies. And it is of vital importance for further formulating emission reduction policies in China and in other developing and polluting countries.

Sampling sites and sampling
We collected PM2.5 samples in Chengdu (102° E to 104° E, 30° N to 31° N), which is in the southwest of China with a population of 16.33 million and the area of 14605 km 2 . As the important metropolitan region in western China, Chengdu is undergoing rapid urbanization and is also attracting more and more people living here. At the same time, much attention was paid on the pollution of PM. To improve air quality, Chengdu government adopted several measures including limiting the 90 driving area and time interval of highly polluted vehicles, adjusting industrial structures and implementing energy substitution.
Considering the heterogeneous spatial distribution of population, economic, industry and construction activities, there exists great difference in urbanization and air quality in Chengdu, and emphasis on corresponding policies also varies over the city.
As is shown in Fig. 1, the sampling was conducted at a total of 19 sites in Chengdu. Detailed information of sampling sites can be seen in Table S1. Based on the specific characteristics, 19 sampling sites were clustered in different zones for the 95 convenience of discussion. Environment Protection Building (QY1), Chengdu University of Technology (CH1) and botanical garden (JN1) have similarities of high population density and high traffic. They are located in the core zone of Chengdu, and developed earlier in the urbanization process. Combining the city structure and evolution of urbanization level, Chengdu citizens are used to define regions surrounded by the third circle road as "layer 1", and the location of QY1, CH1 and JN1 are in accordance with the extent of layer 1. Sampling sites including Qingbaijiang (QBJ2), Xindu (XD2), Pidu (PD2), Wenjiang 100 (WJ2), Shuangliu (SL2), Tianfu (TF2) and Longquanyi (LQY2) are located in the outside circle of layer 1. The circle developed later than layer 1 and were clustered as the second zone named as layer 2. Among the sampling sites in layer 2, QBJ2, XD2, WJ2 and SL2 are featured by intensive industrial factories, and TF2 has frequent construction activities. The remaining 9 sites (Jintang (JT3), Pengzhou (PZ3), Dujiangyan (DJY3), Chongzhou (CZ3), Dayi (DY3), Qionglai (QL3), Pujiang (PJ3), Xinjin (XJ3) and Jianyang (JY3)) are located in the outer-most zone of Chengdu, which belongs to layer 3. The urbanization level of 105 layer 3 is lower than layer 1 and layer 2. In addition, because the air pollution is usually heavy in winter, the sampling campaign was conducted in winter from 2015 to 2019, lasting about 15 days each year. The detailed sampling periods for sampling sites in 2015-2019 are listed in Table S2. Although several selected sampling sites may not fully consistent in each year, this small difference will not influence the reflection of spatiotemporal variations in Chengdu. A total of 836 PM2.5 samples were collected for analysis. 110 The sampling campaign was conducted by two medium-volume air samplers (TH-150C; Wuhan Tianhong Ltd., China) with the airflow rate of 100 min L -1 were used at each site. One sampler placed quartz filters to collect PM2.5 for analysing organic (OC), elemental carbon (EC) and ions. The other sampler placed polypropylene filters for analysing elements in PM2.5. PM2.5 samples were daily collected for 22h at 19 sites. Information of average temperature (℃), cumulative volume (L) and standard 115 volume (L) were recorded. Collected samples were stored in a layer of aluminium foil in a freezer at -20℃ until weighing and analysis. The mass of PM2.5 was determined by weight difference of the filter before and after sampling. Before sampling, blank quartz filters and blank polypropylene filters were baked at 600 ℃ for 4h and 60 ℃ for 3h, respectively. For the process of weighing, filters were weighted at a temperature of 20±1℃ and a humidity of 40±5% for 48h. The weights of filters can be obtained using a microbalance with a sensitivity level of 0.01 mg. Each filter was weighted twice, and the final weight equals 120 to the average of two values (the difference was less than 0.05 mg). https://doi.org/10.5194/acp-2021-311 Preprint. Discussion started: 9 June 2021 c Author(s) 2021. CC BY 4.0 License.

Chemical analysis and quality assurance/ quality control (QA/QC)
The OC, EC, ions and elements were detected by the thermal/optical carbon aerosol analyser (DRI model 2001A;Desert Research Institute, USA), ion chromatograph system (ICS-900; DIONEX, USA) and Inductively Coupled Plasma Atomic Emission Spectrometer (ICAP 7400 ICP-AES; Thermo Fisher Scientific, USA), respectively. 125 OC and EC were analysed based on a hole with 0.588 cm 2 of the quartz filter. The thermal/optical carbon aerosol analyser orderly detected OC1, OC2, OC3 and OC4 in a pure helium atmosphere at the temperature of 140℃, 280℃, 480℃, and 580℃, respectively. Likewise, the oven increased the temperature to 540℃, 780℃, and 840℃ for EC1, EC2 and EC3 analysis, respectively, in a 2% O2 atmosphere. The organic pyrolyzed carbon (OPC) would also be detected after adding the oxygen. 130 Finally, the OC and EC concentrations were calculated as Eq. (1) and Eq. (2), respectively. QA/QC was conducted by the calibration process. The analyser will be calibrated before and after analysing to make sure the analytic accuracy within 2%.
= 1 + 2 + 3 + 4 + (1) Ions such as Cl -, SO4 2-, NO3and NH4 + were measured on a one-eighth sample. Samples were cut up into small pieces and ultrasonically extracted with 8mL deionized water for 20 minutes. Tubes that were used during extracting had been cleaned three times by an ultrasonic cleaner. After extracting, the supernatant was injected into a vial through two 0.22μm filters for analysis. Relative standard deviations had to be calculated more than three times to hold the value at a lower level.

140
As for elements analysis, the microwave acid digestion method was applied for detecting Al, Fe, Mg, Ca, Na, K, V, Cd, Pb, Si, Zn, Cu, Cr, As, Ni, Co, Mn and Ti. 10mL mixed digestion solution was added to digest one-eighth sample pieces, the digestion process was conducted by four-stage microwave digestion procedure of the microwave-accelerated reaction system (MARS; CEM Corporation, USA). Afterward, the digestion solution would be transferred into a PET bottle, and diluted the solution to 25ml by deionized water for further analysis. 145

Positive Matrix Factorization (PMF)
The PMF model is a widely used bilinear receptor model. The goal of this model is to identify and quantify the source contribution of contaminants by solving the following equation (Eq. (3)): where i, j and p are the number of samples, chemical compositions, and factors, respectively; is the concentration of the 150 ℎ species in the ℎ sample; is the contribution of the ℎ source to the ℎ sample; is the concentration of the ℎ species from the ℎ source; and is the residual for each sample/species (Paatero, 1997;Paatero and Tapper, 1994).
We input the measured speciated data as the matrix X of i by j dimensions, then the PMF model can divide it into two matrixes: factor contributions (G) and factor profiles (F). The non-negativity constraint is also introduced to ensure the positive value 155 for each source contribution. In the process of decomposition, the model is run several times applying the least square method to minimize the objective function Q (Eq. (4)), and in this case, the obtained solution of G and F is considered the most optimal: where is the uncertainty of the ℎ chemical composition of the ℎ sample. The model required both concentration data and uncertainty of species in each sample. The equation-based uncertainty is calculated as follows (Eq. (5)): 160 https://doi.org/10.5194/acp-2021-311 Preprint. Discussion started: 9 June 2021 c Author(s) 2021. CC BY 4.0 License.
where is the concentration of chemical compositions of each sample, MDL is the method detection limit for each component.
In this study, the EPA PMF 5.0 was applied for the source apportionment of 2.5 , and total of 22 chemical compositions of 165 836 samples at 19 sites from 2015 to 2019 were simulated. The detailed source apportionment results are reported in 3.3 and more information on the PMF model is described in the PMF 5.0 User Guide.

Source Weighted Potential Source Contribution Function (SWPSCF)
The PSCF model is a conditional probability that was applied to identify the source regions of PM2.5 masses to the receptor site. In this study, the backward trajectories were modelled by the hybrid single-particle Lagrangian Integrated Trajectory 170 (HYSPLIT 4.9 version), which is available at http://www.arl.noaa.gov/ready/hysplit4.html. Required meteorological data can where is the total number of trajectory endpoints that fall into the grid cell (i, j), and is the number of trajectory endpoints when their corresponding contributions exceed the criteria value. is a weight function (Eq. (7)) used for reducing uncertainty when specific grid cells have small numbers of trajectory endpoints (Polissar et al., 2001;Lee and Hopke, 2006): where is the average number of endpoints in each grid cell.
When trajectories passed over a grid cell in which a certain source category showed high local contribution, the probability of potential contribution for this grid cell should be relatively high. Thus, we introduced another weighted function that represents the ratio of source contribution in grid cell (i, j) to average contribution in the whole study area. The is 190 calculated as Eq. (8). The source weighted PSCF (SWPSCF) value is expressed as Eq. (9).
where is the source contribution of each source category in grid cell (i, j), and is available using the Kriging interpolation algorithm; is the average source contribution of this source category of all sampling sites in the whole study area. 195 https://doi.org/10.5194/acp-2021-311 Preprint. Discussion started: 9 June 2021 c Author(s) 2021. CC BY 4.0 License.

Hierarchical cluster analysis (HCA)
The similarity analysis of PM2.5 compositions among the 19 sampling sites from 2015 to 2019 was conducted using the method of hierarchical cluster analysis. Cluster analysis, a technique used for identifying groups that have similar characteristics, can be broadly classified as hierarchical and non-hierarchical (Govender and Sivakumar, 2020; Saxena et al., 2017). By recursively finding nested clusters, hierarchical clustering repeatedly combines the two closest groups into one larger group (Xu et al., 200 2020b), and finally generates a dendrogram. The algorithm is implemented mainly by the following steps (Govender and Sivakumar, 2020): Step 1: Determine each observation as the initial cluster.
Step 2: Measure the distance between clusters for quantifying the similarity between objects.
Step 3: The closest pairs of clusters are merged into a single cluster, and re-calculate the distance matrix. 205 Step 4: Repeat step 2 and 3 until all observations are integrated into a single cluster.
In this study, the HCA was conducted based on the cosine distance and average linkage using IBM SPSS Statistics 25. By cutting the dendrogram at an appropriate distance, PM2.5 samples that have similarities in chemical species can be grouped into the same cluster.  Table S3. Due to the slight difference of the selected sampling sites in layer 2 and layer 3 in each year, both layers and sites were discussed for a better understanding of the PM2.5 variability. For spatial 215 distribution, the average PM2.5 concentrations of five years were 126 µg m -3 , 133 µg m -3 and 121 µg m -3 for layer 1, layer 2 and layer 3, respectively. Layer 1, the most urbanized area in Chengdu, suffered severe traffic pollution, however, more strict control policies were conducted by local government in this area. The high PM2.5 concentration in layer 2 may be caused by strong industrial activities and extensive construction activities at QBJ2, XD2, WJ2, SL2 and TF2. Layer 3 was characterized by the lowest urbanization level in Chengdu, although weak emissions of old chemical industries and small coal-fired boilers 220 were observed at XJ3, PZ3, CZ3 and DY3, there existed less vehicles than layer 1 and less factories than layer 2, explaining the relatively low PM2.5 level of the area.  Table S4, reflecting the similar meteorological conditions in the studied years, which highlighted the importance of the source variations for the clustering result. There existed a special case that sites of layer 3 in 2019 belonged to C2 rather than C3, indicating PM2.5 compositions for layer 3 in 2019 were more similar to that for other layers two or three years ago. This can be explained by the fact that the urbanization levels varied between layers in Chengdu. As the outer-most zone of Chengdu, layer 3 lagged 250 behind layer 1 and layer 2 in the urbanization, which contributed to the similar characteristics in air quality between current layer 3 and previous other layers. The HCA results indicated an incredible need to investigate the variations of PM2.5 compositions in both time and space.

Spatial variations of chemical compositions
To investigate the spatial similarities and differences of chemical compositions, the HCA was also applied based on chemical 255 compositions (%) at sampling sites for each year, and finally cluster results and their averaged species fractions are listed in  come from fuel combustion, such as coal, gasoline, diesel, biomass, and so on (Wang et al., 2020). In Chengdu, coal is one of the important fuels for the industry but has been strongly reduced by the government in recent years. Gasoline and diesel are mainly used for vehicles. The decrease of OC and EC fractions from 2015 to 2018 may due to the decline of coal use for industries, which was consistent with the strict coal-burning ban in these years, however, as the vehicles become more 280 important contributor, the OC and EC fractions increased in 2019. Publications have reported the SO4 2and Clas the coalburning markers (Tian et al., 2014;Vassura et al., 2014). In this study, the fractions of SO4 2and Clgenerally showed a decreasing trend, especially in 2016. However, the fractions of NO3showed a general increasing trend from 2015 to 2019, which might be attributed to the gradually enhanced contribution of vehicles and use of natural gas. We also analyzed the SO4 2-/NO3mass ratio, a qualitative indicator of sulfur versus nitrogen sources (Gao et al., 2015;Arimoto et al., 1996), and the 285 summary is listed in Fig. S2. Ratios at most sites exceeded 1 in 2015, dropped less than 1 in 2016 and then declined steadily, also indicating decreasing coal combustion and increasing traffic emissions in Chengdu. The result was consistent with the slow reduction in NOX and the sharp decline in SO2 emissions in China Wang et al., 2018b). For crustal elements, the temporal variations were found to have close relationship with the construction activities in Chengdu in 2015-

Spatiotemporal variations of sources
PMF was used to quantify the source contributions in the studied areas and finally five categories were selected with distinctively related source characteristics. Five sources were identified as traffic emission, coal and biomass combustion, industrial emission, secondary particles and resuspended dust, respectively. The estimated source profiles in the form of species concentrations (μg m -3 ) and percentages of species sum (%) are shown in Fig. 6. Factor 1 contributed 15.5% of PM2.5 and had 295 high fractions of EC (70.0% of total EC) and OC (51.8% of total OC), which can be identified as traffic emission (Xu et al., 2016). The relatively high NO3further revealed factor 1 as the traffic emission source. The moderate fractions of Al, Si, Cu, Ni and As in this factor may be associated with traffic activities including resuspension of road dust, tire and brake wear, and oil burning (Kulshrestha et al., 2009a;Almeida et al., 2005;Amato and Hopke, 2012). Factor 2 was determined as coal and biomass combustion source. Coal combustion generally plays an important role in Chinese energy structure. Identified as 300 markers of coal combustion source, OC, EC, Cd and SO4 2exhibited high loadings in factor 2, with fractions of 25.8%, 20.3%, 61.9% and 26.7%, respectively (Tian et al., 2016). The existence of biomass burning was indicated by the high fraction of K + in this factor (Amil et al., 2016;Richard et al., 2011). Factor 2 accounted for 19.7% of the total PM2.5 mass concentration.
According to previous studies, NO3 -, SO4 2and NH4 + are indicative species of secondary reactions (Richard et al., 2011;Wu et al., 2021). Consequently, factor 4 represented the secondary particle source, contributing 39.7% of PM2.5. Factor 5 was identified as resuspended dust accounting for 16.2% of PM2.5. The top three fractions of species were Al (84.2%), Ca (79.5%) 310 and Si (56.5%), which were typical indicatory components for resuspended dust (Pant and Harrison, 2012). Figure 7 shows the source contribution at each site from 2015 to 2019 to investigate their spatial variations. The CV (Coefficient of Variation), which is defined as the standard deviation divided by the mean, was used to investigate the spatial difference of each source category. As shown in Table S5, CV values in this study indicate that coal and biomass combustion 315 and industrial emission showed stronger spatial variations. For coal and biomass combustion source, the percentage contributions were higher at CZ3 of layer 3 and QBJ2 of layer 2 than at other sites. And the high contributions of industry https://doi.org/10.5194/acp-2021-311 Preprint. Discussion started: 9 June 2021 c Author(s) 2021. CC BY 4.0 License. source mainly occurred in layer 2 including QBJ2, WJ2, PD2, SL2 and XD2, with fractions from 8.9% to 12.9%. Among the sampling sites mentioned above, CZ3 was characterized by intensive coal-fired boilers. QBJ2 contains large-scale iron, steel and chemical plants. WJ2, PD2, SL2 and XD2 were in great development and also had large factories of glass, food and 320 furniture, respectively. Therefore, the spatial distributions of PM2.5 from coal and biomass combustion and industrial emission were strongly associated with industrial manufacturing plants. Additionally, the contributions of traffic emission were higher in layer 1 and layer 2, with the percentage contributions in 2015-2019 ranging from 13.9% to 16.3% in layer 1 and from 11.6% to 17.5% in layer 2. The secondary particles had higher contributions in layer 3. Fractions of secondary particles at QY1 and LQY2 also presented relatively high values of 44.5% and 49.9%, respectively. For resuspended dust, the spatial distribution 325 varied with human activities. The contributions were relatively higher at layer 1 in 2015-2018, which resulted from the construction of the urban subway. At JY3, high contributions from resuspended dust were attributed to the fact that Chengdu Tianfu International Airport was under construction. Overall, the spatial distributions of source contributions were exactly in accordance with the characteristics and urbanization level of sites, highlighting the importance of site-specific and urbanization research in pollutant emissions control. 330

Spatial variations of source contributions
To better consider the spatial distribution of contributions for each source category, the SWPSCF method was applied for identifying the source regions to the receptor site based on a source contribution weight. In this study, we selected QY1 as the  Fig. 8(a)), the potential source regions were observed to concentrate to CZ3 after source weighted, and the SWPSCF values around QBJ2 were higher than PSCF values, reflecting a strengthened influence of coal and biomass combustion source at CZ3 and QBJ2. For traffic emission source in 2019 ( Fig. 8(b)), the identified potential source regions moved toward layer 340 1 after source weighted, which was in agreement with the spatial distribution of traffic emission contributions.

Temporal variations of source contributions
Temporal variations of source contributions at each site are also summarized in Fig. 7 layer 3 in 2019 were found to be similar to that for layers two or three years ago, which indicated the great impact of differences in urbanization to air quality. Coal and biomass combustion and industrial emission showed the stronger spatial distribution patterns in Chengdu, the high percentage contributions of which usually occurred at sites with large-scale industrial factories and coal-fired boilers. Frequent construction activities in developing areas can considerably elevate the percentage contributions of resuspended dust. The SWPSCF results were found to have great differences with PSCF results. The changes 380 for identified potential source regions after source weighted were in agreement with the spatial distribution of each source contributions. This study presented a perspective on the relationship between PM2.5 and urbanization. Sampling activities that conducted based on a five-year measurement at 19 sites in different urbanization levels provided particular valuable data for researchers. The results can be useful for further policy formulation in most developing and polluted countries, and supply basic information for future epidemiological studies. 385 Data availability. The hybrid single-particle Lagrangian Integrated Trajectory (HYSPLIT 4.9 version) is available at http://www.arl.noaa.gov/ready/hysplit4.html (last access February 28 th , 2021). Required meteorological data during SWPSCF modelling can be obtained from National Oceanic and Atmospheric Administration (NOAA) website, ftp://arlftp.arlhq.noaa.gov/pub/archives/reanalysis (last access February 28 th , 2021).  Table S1.