Urban organic aerosol composition in eastern China differs from north to south: molecular insight from a liquid chromatography–mass spectrometry (Orbitrap) study

Air pollution by particulate matter in China affects human health, the ecosystem and the climate. However, the chemical composition of particulate aerosol, especially of the organic fraction, is still not well understood. In this study, particulate aerosol samples with a diameter of ≤ 2.5 μm (PM2.5) were collected in January 2014 in three cities located in northeast, east and southeast China, namely Changchun, Shanghai and Guangzhou. Organic aerosol (OA) in the PM2.5 samples was analyzed by an ultrahigh-performance liquid chromatograph (UHPLC) coupled to a high-resolution Orbitrap mass spectrometer in both negative (ESI-) and positive (ESI+) electrospray ionization modes. After nontarget screening including the assignment of molecular formulas, the compounds were classified into five groups based on their elemental composition, i.e., CHO, CHON, CHN, CHOS and CHONS. The CHO, CHON and CHN groups account for the dominant signal abundances of 81 %–99.7 % in the mass spectra, and the majority of these compounds were assigned to mono- and polyaromatics, suggesting that anthropogenic emissions are a major source of urban OA in all three cities. However, the chemical characteristics of these compounds varied between the different cities. The degree of aromaticity and the number of polyaromatic compounds were substantially higher in samples from Changchun, which could be attributed to the large emissions from residential heating (i.e., coal combustion) during wintertime in northeast China. Moreover, the ESI- analysis showed higher H/C and O/C ratios for organic compounds in Shanghai and Guangzhou compared to samples from Changchun, indicating that OA undergoes more intense photochemical oxidation processes in lower-latitude regions of China and/or is affected to a larger degree by biogenic sources. The majority of sulfur-containing compounds (CHOS and CHONS) in all cities were assigned to aliphatic compounds with low degrees of unsaturation and aromaticity.
Here again, samples from Shanghai and Guangzhou show a greater chemical similarity but differ largely from those from Changchun. It should be noted that the conclusions drawn in this study are mainly based on comparison of molecular formulas weighted by peak abundance and thus are associated with inherent uncertainties due to different ionization efficiencies for different organic species.
Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
In the last decades, China has experienced rapid industrialization and urbanization accompanied by severe and persistent particulate air pollution (Sun et al., 2014; Ding et al., 2016; Song et al., 2018; Shi et al., 2019; Xu et al., 2019). These particulate air pollution extremes not only influence regional air quality and human health in China but also contribute to a global environmental problem through long-distance transport of pollutants. To better understand the effects of air pollution on air quality and human health, chemical characterization of fine particles (particulate matter with an aerodynamic diameter of less than 2.5 µm, or PM2.5) is crucial. However, the chemical composition of PM2.5 in China is still poorly understood due to a wide variety of natural and anthropogenic sources as well as complex multiphase chemical reactions (Lin et al., 2012a; Huang et al., 2014; Ding et al., 2016; Wang et al., 2017, 2018, 2019a; An et al., 2019; Tong et al., 2019). In particular, compared to the fairly well understood inorganic fraction of aerosol, the organic fraction, also named organic aerosol (OA), is considerably less understood in terms of chemical composition, corresponding precursors, sources and formation mechanisms (Huang et al., 2017).
During pollution events in China, OA can account for more than 50 % of the total mass of fine particles. Chemical compounds in OA span a wide range of species including alcohols, aldehydes, carboxylic acids, imidazoles, organosulfates, organonitrates and polycyclic aromatic hydrocarbons (PAHs) (Lin et al., 2012a; Rincón et al., 2012; Kourtchev et al., 2014; Wang et al., 2018, 2019a; Elzein et al., 2019). Because of this complexity, traditional analytical techniques have limited capacity to identify the compounds in OA, and the majority (> 70 %) of OA has not yet been identified as specific compounds (Hoffmann et al., 2011). The insufficient knowledge of the chemical composition of OA hinders a better understanding of the sources, formation and atmospheric processes of air pollution in China.
Recently, ultrahigh-resolution mass spectrometry (UHRMS), such as Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) and the Orbitrap MS, coupled with soft ionization sources (e.g., electrospray ionization, ESI, and atmospheric pressure chemical ionization, APCI), has been introduced to elucidate the molecular composition of OA (Nizkorodov et al., 2011; Lin et al., 2012a, b; Rincón et al., 2012; Noziere et al., 2015; Kourtchev et al., 2016; Tong et al., 2016; Tu et al., 2016; Brüggemann et al., 2017, 2019; Wang et al., 2017, 2018, 2019a; Fleming et al., 2018; Laskin et al., 2018; Song et al., 2018; Daellenbach et al., 2019; Ning et al., 2019). Owing to its high resolving power and high mass accuracy, UHRMS can give precise elemental compositions of individual organic compounds. However, UHRMS studies on Chinese urban OA are very limited. Wang et al. (2017) characterized OA in Shanghai and showed variations in chemical composition among different months and between daytime and nighttime. Our recent Orbitrap MS study showed that wintertime OA in PM2.5 collected in Beijing, China, and Mainz, Germany, was very different in terms of chemical composition. In contrast, for summertime OA from Germany and China, Brüggemann et al. (2019) found similar compounds and concentrations of terpenoid organosulfates in PM10, demonstrating that biogenic emissions can significantly affect OA composition at both locations. Ning et al. (2019) analyzed the OA collected in a coastal Chinese city (Dalian) and found that more organic compounds were identified on haze days than on non-haze days. Nonetheless, since severe particulate pollution in China occurs on a large scale, more UHRMS studies are needed to fully elucidate the chemical composition of OA in different Chinese cities.
In this study, PM2.5 aerosol samples were collected in three Chinese cities, i.e., Changchun, Shanghai and Guangzhou, and their organic fraction was analyzed using an ultrahigh-performance liquid chromatograph (UHPLC) coupled with an Orbitrap MS. Changchun, Shanghai and Guangzhou are located in the northeast, east and southeast of China, which are major populated regions, and have populations of 7.5, 24 and 15 million, respectively. The three cities span a wide latitude range from 23.12 to 43.53° N, resulting in different meteorological conditions, including intensity and duration of sunlight, average daily temperature, and monsoon climate. In addition, the industrial structure, energy consumption and energy sources in these three cities are different; for example, there is much more heavy industry (e.g., coal chemical industry and steelworks) in northeast China (Zhang, 2008), which can cause differences in anthropogenic emissions and can therefore influence the chemical composition of urban OA. Moreover, OA is strongly affected by residential coal combustion during winter in northeast China (An et al., 2019). Therefore, this study presents a comprehensive overview of the chemical composition of OA in three representative Chinese cities during pollution episodes, which can eventually improve our understanding of OA effects on climate and public health and also provide a chemical database for haze mitigation strategies in China.
Figure 1. Mass spectra of detected organic compounds reconstructed from extracted ion chromatograms in ESI- and ESI+. The horizontal axis refers to the molecular mass (Da) of the identified species. The vertical axis refers to the relative peak abundance of each individual compound compared to the compound with the greatest peak abundance. The pie charts show the percentage of each organic compound subgroup (i.e., CHO, CHON, CHOS, CHONS and CHN) in each sample in terms of peak abundance. The map in the lower right corner shows the locations of these three megacities in China.

PM2.5 samples
Three 24 h integrated urban PM2.5 samples were collected during severe haze pollution events with daily average PM2.5 mass concentrations higher than 115 µg m−3 in each of the three Chinese cities: Changchun (43.54° N, 125.13° E, 1.5 m above the ground), Shanghai (31.30° N, 121.50° E, 20 m above the ground) and Guangzhou (23.07° N, 113.21° E, 53 m above the ground), which are located in the northeast, east and southeast regions of China, respectively (see Fig. 1). Samples in Changchun were collected on 4, 24 and 29 January 2014 with PM2.5 mass concentrations of 185–222 µg m−3; samples in Shanghai were collected on 1, 19 and 20 January 2014 with PM2.5 mass concentrations of 159–172 µg m−3; and samples in Guangzhou were collected on 5, 6 and 11 January 2014 with PM2.5 mass concentrations of 138–152 µg m−3. Further details (e.g., the daily average concentrations of PM2.5, SO2, NO2, CO and O3, the average temperature, and the daily solar radiation value during the sampling dates) are presented in Table S1, and the 48 h back trajectories of air arriving at the three sampling sites during the sampling periods are shown in Fig. S1 in the Supplement. All PM2.5 samples were collected on prebaked quartz-fiber filters (20.3 cm × 25.4 cm) using a high-volume PM2.5 sampler (Tisch Environmental, USA) at a flow rate of 1.05 m3 min−1, and field blanks were taken at each sampling site. After sample collection, filters were stored at −20 °C until analysis.

Sample analysis
A detailed description of the filter sample extraction and UHPLC-Orbitrap MS analysis can be found in our previous studies (Wang et al., 2019a). Briefly, a part of each filter (around 1.13 cm2, corresponding to about 600 µg particle mass in each extracted filter) was extracted three times with 1.0–1.5 mL of acetonitrile–water (8/2, v/v) in an ultrasonic bath. The extracts were combined, filtered through a 0.2 µm Teflon syringe filter and evaporated almost to dryness under a gentle nitrogen stream. Finally, the residue was redissolved in 1000 µL of acetonitrile–water (1/9, v/v) to reach a total particulate mass concentration of around 600 µg mL−1 for the following analysis.
In contrast to the direct infusion method applied in other UHRMS studies (Lin et al., 2012a, b; Rincón et al., 2012; Kourtchev et al., 2016; Fleming et al., 2018), the UHPLC technique was used in this study. It separates and concentrates the compounds before they enter the ion source, reducing ionization suppression and increasing the sensitivity of the measurement. In addition, it provides retention times for the compounds, which are useful for compound identification and for the separation of isomers. The analytes were separated using a Hypersil GOLD column (C18, 50 × 2.0 mm, 1.9 µm particle size) with mobile phases consisting of (A) 0.04 % formic acid and 2 % acetonitrile in Milli-Q water and (B) 2 % water in acetonitrile. Gradient elution was applied with the A and B mixture at a flow rate of 500 µL min−1 as follows: 0–1.5 min 2 % B, 1.5–2.5 min from 2 % to 20 % B (linear), 2.5–5.5 min 20 % B, 5.5–6.5 min from 20 % to 30 % B (linear), 6.5–7.5 min from 30 % to 50 % B (linear), 7.5–8.5 min from 50 % to 98 % B (linear), 8.5–11.0 min 98 % B, 11.0–11.05 min from 98 % to 2 % B (linear), and 11.05–11.1 min 2 % B. The Q Exactive hybrid quadrupole-Orbitrap MS was equipped with a heated ESI source at 120 °C, applying a spray voltage of −3.3 kV and 4.0 kV for negative ESI mode (ESI-) and positive ESI mode (ESI+), respectively. The mass scanning range was set from m/z 50 to 500 with a resolving power of 70 000 at m/z 200. The Orbitrap MS was externally calibrated before each measurement sequence using an Ultramark 1621 solution (Sigma-Aldrich, Germany), providing a mass accuracy better than 3 ppm. Each sample was measured in triplicate with an injection volume of 10 µL.
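The gradient program described above can be written down as a table of breakpoints. The following is a minimal Python sketch, not part of the original method: the names GRADIENT and percent_b are hypothetical, and linear interpolation between breakpoints is assumed for every segment (isocratic segments are simply flat lines).

```python
# Gradient breakpoints as (time in min, % B), transcribed from the text above.
GRADIENT = [
    (0.0, 2.0), (1.5, 2.0), (2.5, 20.0), (5.5, 20.0), (6.5, 30.0),
    (7.5, 50.0), (8.5, 98.0), (11.0, 98.0), (11.05, 2.0), (11.1, 2.0),
]


def percent_b(t_min: float) -> float:
    """Return the % B of the mobile phase at time t_min (hypothetical helper)."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time lies outside the gradient program")
    # Walk consecutive breakpoint pairs and interpolate linearly within the
    # segment that contains t_min.
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole range")
```

For instance, percent_b(2.0) falls halfway through the 2 % to 20 % ramp, and percent_b(4.0) lies on the 20 % B plateau.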

Data processing
Non-target peak picking software (SIEVE®, Thermo Fisher Scientific, Germany) was used to find significant peaks in the LC-MS dataset and to calculate all mathematically possible chemical formulas for ion signals with a sample-to-blank abundance ratio ≥ 10 using a mass tolerance of ±2 ppm. The permitted maximum number of atoms of each element was set as follows: 12C (39), 1H (72), 16O (20), 14N (7), 32S (4), 35Cl (2) and 23Na (1) (Kind and Fiehn, 2007; Lin et al., 2012a; Wang et al., 2018). To remove chemically unreasonable formulas, a further constraint was applied by setting the H/C, O/C, N/C, S/C and Cl/C ratios in the ranges of 0.3–3, 0–3, 0–1.3, 0–0.8 and 0–0.8, respectively (Kind and Fiehn, 2007; Lin et al., 2012a; Rincón et al., 2012; Wang et al., 2018; Zielinski et al., 2018). For a chemical formula CcHhOoNnSsClx, the double bond equivalent (DBE) was calculated by the equation DBE = (2c + 2 − h − x + n)/2. The aromaticity equivalent (X_C), a modified index for aromatic compounds, was obtained using the equation X_C = [3(DBE − (p × o + q × s)) − 2] / [DBE − (p × o + q × s)], where p and q refer to the fraction of oxygen and sulfur atoms, respectively, involved in the π-bond structure of a compound. As such, the values of p and q vary between compound categories (Yassine et al., 2014). For example, carboxylic acids and esters are characterized using p = q = 0.5, while p = q = 1 and p = q = 0 are used for carbonyl and hydroxyl, respectively. Since it is impossible to identify the structures of the hundreds of formulas observed in this study, we cannot know the exact values of p and q for an individual compound. Therefore, in this study, p = q = 0.5 was applied for compounds detected in ESI-, as carboxylic compounds are preferably ionized in negative mode. However, because of the high complexity of the mass spectra in ESI+, p = q = 1 was used in ESI+ to avoid an overestimation of the amount of aromatics. Moreover, for DBE ≤ (p × o + q × s) or X_C ≤ 0, X_C was defined as zero. Furthermore, in ESI-, for odd numbers of oxygen or sulfur atoms in molecular formulas, the value of (p × o + q × s) was rounded down to the lower integer. X_C ≥ 2.50 and X_C ≥ 2.71 have been suggested as unambiguous minimum criteria for the presence of monoaromatics and polyaromatics, respectively (Yassine et al., 2014).
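The DBE and X_C rules above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' processing code: the function names (dbe, aromaticity_equivalent, classify) and the round_down switch are hypothetical, and the label "aliphatic" for X_C < 2.50 follows the usage later in the text.

```python
import math


def dbe(c, h, n=0, x=0):
    """Double bond equivalent, DBE = (2c + 2 - h - x + n) / 2."""
    return (2 * c + 2 - h - x + n) / 2


def aromaticity_equivalent(c, h, o=0, n=0, s=0, x=0, p=0.5, q=0.5,
                           round_down=True):
    """Aromaticity equivalent X_C for a formula CcHhOoNnSsClx.

    p and q are the assumed fractions of O and S atoms in pi bonds
    (0.5 for ESI-, 1 for ESI+ in the text). round_down floors the
    term (p*o + q*s), as described for odd O or S counts in ESI-.
    """
    hetero = p * o + q * s
    if round_down:
        hetero = math.floor(hetero)
    reduced = dbe(c, h, n, x) - hetero
    if reduced <= 0:
        return 0.0  # X_C is defined as zero when DBE <= (p*o + q*s)
    return max((3 * reduced - 2) / reduced, 0.0)


def classify(xc):
    """Map X_C to the categories used in the text (thresholds 2.50 and 2.71)."""
    if xc >= 2.71:
        return "polyaromatic"
    if xc >= 2.50:
        return "monoaromatic"
    return "aliphatic"
```

As a usage example, C8H6O4 with p = q = 0.5 gives DBE = 6 and X_C = 2.5 (monoaromatic), whereas C4H6O4 gives DBE = 2, so DBE − (p × o) = 0 and X_C is set to zero.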
Peak abundances have been compared in recent UHRMS studies (Fleming et al., 2018; Song et al., 2018; Ning et al., 2019) to illustrate the relative importance of specific types of compounds. However, it should be noted that different organic compounds have different signal responses in the mass spectrometer due to differences in ionization and transmission efficiencies (Schmidt et al., 2006; Leito et al., 2008; Perry et al., 2008; Kruve et al., 2014). Therefore, uncertainties may exist when comparing the peak areas among compounds. In this work, we assume that all organic compounds have the same peak abundance response in the mass spectrometer. The peak abundance-weighted average molecular mass (MM), elemental ratios, DBE and X_C for the formula CcHhOoNnSsClx were calculated using the following equations.
Here A_i is the peak abundance for each individual compound i.
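The weighting equations themselves are not reproduced in this text. As a hedged illustration only, the sketch below assumes the common form of an abundance-weighted mean, i.e., the sum of A_i times the property of compound i, divided by the sum of A_i. The function name weighted_average is hypothetical, and the authors' exact definition (for example, for elemental ratios) may differ.

```python
def weighted_average(values, abundances):
    """Peak abundance-weighted mean of a per-compound property.

    Assumed form: sum(A_i * value_i) / sum(A_i). This is an illustrative
    assumption, not a transcription of the paper's equations.
    """
    values = list(values)
    abundances = list(abundances)
    if len(values) != len(abundances):
        raise ValueError("values and abundances must have the same length")
    total = sum(abundances)
    if total <= 0:
        raise ValueError("total peak abundance must be positive")
    return sum(v * a for v, a in zip(values, abundances)) / total
```

With O/C values of 0.5 and 1.0 and peak abundances of 3 and 1, this returns 0.625, which shows how the more abundant compound dominates the average.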

General characteristics
The main purpose of this study was to tentatively identify and compare the chemical composition of organic compounds in the PM2.5 samples collected in the three Chinese cities Changchun, Shanghai and Guangzhou during pollution episodes. To reduce the uncertainty caused by the variability between the samples collected at each location, only organic compounds measured in all three samples of each city are used for intercity comparison. The number of organic compounds and molecular formulas detected in each city, the peak abundance-weighted average values (including the standard deviations of peak abundance of the three samples from each city) of molecular mass (MM_avg), elemental ratios, DBE, X_C, and the isomer number fraction (meaning the percentage of formula numbers that have isomers among all assigned formulas) for each subgroup are listed in Table 1.
It should be noted that in this study we focus solely on organic compounds with elevated signal abundances and thus presumably rather high concentrations. In contrast to our previous study, compounds with low concentrations were excluded by increasing the reconstitution volume from 500 to 1000 µL, reducing the sample injection volume from 20 to 10 µL and increasing the sample-to-blank ratio from 3 to 10 during data processing.
Overall, 416–769 (assigned to 272–415 molecular formulas) and 687–2943 (assigned to 383–679 molecular formulas) organic compounds were determined in the different city samples in ESI- and ESI+, respectively. The largest number of organic compounds was observed in Changchun samples in both ESI- and ESI+, indicating that OA collected during the winter season in northeast China was more complex than urban OA in east and southeast China. This increased number of compounds can possibly be explained by the large residential coal combustion emissions in winter in north China (Song et al., 2018), which is consistent with the observation of a higher average concentration of organic carbon in Changchun (46 ± 20 µg m−3) than in Shanghai (24 ± 8 µg m−3) and Guangzhou (25 ± 2 µg m−3), as shown in Table S2. In addition, ambient temperatures were lowest during the sampling period in Changchun (i.e., −14 to −9 °C, Table S1), which likely led to a decreased boundary layer height and therefore enhanced accumulation of pollutants and enhanced formation of secondary organic aerosol through, for example, gas-to-particle partitioning.
As shown in Table 1, the abundance-weighted average values of MM_avg and O/C ratio of the total assigned formulas for Changchun samples detected in negative mode (Changchun-) are 169 and 0.58, respectively, which are lower than those for Shanghai- (MM_avg = 176 and O/C = 0.69) and for Guangzhou- (MM_avg = 183 and O/C = 0.74). In contrast, the aromaticity equivalent X_C for organics detected in Changchun-, X_C (Changchun-) = 2.13, is higher than that for Shanghai-, X_C (Shanghai-) = 1.92, and Guangzhou-, X_C (Guangzhou-) = 1.65. Furthermore, the relative peak abundance fraction of compounds with O/C ≥ 0.6, which are considered to be highly oxidized compounds (Tu et al., 2016), is 31 % in Changchun- and higher in Shanghai- (46 %) and Guangzhou- (51 %). These observations indicate that urban OA in northeast China features a lower degree of oxidation and a higher degree of aromaticity compared to urban OA in east and southeast China. The different chemical composition of the samples is probably caused by the rather low ambient temperatures and decreased photochemical processing of organic compounds in northeast China (indicated by the lower solar radiation in northeast China; see Table S1), which slow down oxidation processes and lead to a larger number of PAHs, which are mainly emitted from coal burning (Song et al., 2018), or by different biogenic/anthropogenic precursors. Nitrate is mainly formed by photochemical oxidation, and the average concentration of nitrate (see Table S2) was lower in particle samples from Changchun (15.5 ± 8.5 µg m−3) compared to Shanghai (28.2 ± 9.4 µg m−3) and Guangzhou (24.6 ± 0.9 µg m−3), again indicating less photochemical processing in northeast China. In addition, long-range transport of air masses (see the 48 h back trajectories in Fig. S1) may have a certain effect on the chemical properties of aerosol samples collected in the three cities.
Table 1. Number of organic compounds and molecular formulas in each subgroup and the peak abundance-weighted average values of molecular mass (MM_avg), elemental ratios, double bond equivalent (DBE), aromaticity equivalent (X_C) and isomer number fraction (meaning the percentage of formula numbers that have isomers among all assigned formulas) for detected organic compounds in ESI- and ESI+ in the three Chinese cities.
Figure 1 shows the reconstructed mass spectra of organic compounds detected in ESI- and ESI+. A major fraction of organic species detected in ESI- are attributed to CHO- and CHON-, accounting for 30 %–42 % and 39 %–55 % in terms of peak abundance, respectively, and comprising 39 %–45 % and 23 %–33 % in terms of peak numbers, respectively. This is consistent with previous studies on Chinese urban OA by Wang et al. (2017, 2018) and Brüggemann et al. (2019). Comparing the organic compounds detected in ESI- for the three cities, 120 formulas were observed in all cities as common formulas (i.e., compounds detected in all cities with the same molecular formula and the same retention time; retention time difference ≤ 0.1 min) (Fig. 2a), accounting for 29 %–44 % and 57 %–71 % of all assigned formulas in terms of formula numbers and peak abundance, respectively. Despite the abovementioned differences in chemical composition for OA from Changchun compared to OA from Shanghai and Guangzhou, these results demonstrate that a large number of common organic compounds still exist in Chinese urban OA collected in different cities, in particular for organics with higher signal abundances. Furthermore, as shown by the pie chart in Fig. 2b, these common formulas are dominated by CHON- and CHO-, accounting for 62 % and 30 % of the total common formulas in terms of peak abundance, respectively. As is commonly known, ESI exhibits different ionization mechanisms in negative and positive ionization modes. While ESI- is especially sensitive to deprotonatable compounds (e.g., organic acids), ESI+ is more sensitive to protonatable compounds (e.g., organic amines) (Ho et al., 2003). Due to the different ionization mechanisms, clear differences were observed in the mass spectra (Fig. 1) and chemical characteristics (Table 1) from ESI- and ESI+ measurements. For example, CHO compounds were preferentially detected in ESI-, accounting for a relatively larger fraction of 30 %–42 % of all detected compounds in terms of peak abundance, compared to merely 4 %–13 % for such CHO compounds in ESI+. In contrast, CHN compounds were only observed in ESI+, yielding a rather large peak abundance fraction of 40 %–71 %. In particular, as can be seen in Fig. 1, several peaks of CHN+ compounds in Shanghai+ and Guangzhou+ have much higher abundance compared to other organic species, probably due to their high concentrations and/or high ionization efficiencies in the positive mode. This observation indicates that most CHO compounds with high concentrations are probably organic acids, whereas the majority of CHN compounds likely belong to the group of organic amines, which is in good agreement with previous studies (Lin et al., 2012a; Wang et al., 2017, 2018).
Organic compounds in ESI+ are dominated by CHN+ and CHON+ compounds in terms of both peak numbers and peak abundance, and these compounds are characterized by rather high H/C ratios and low O/C ratios (Table 1), indicating a low degree of oxidation. The Venn diagram presented for ESI+ measurements in Fig. 2a shows that out of a total of 383–679 formulas, 129 formulas were found in samples from all three cities. Such common formulas thus account for 19 %–34 % and 30 %–75 % of all assigned formulas in terms of formula numbers and peak abundance, respectively. Among these common formulas, CHN+ and CHON+ exhibit the highest abundance fractions of 72 % and 26 %, respectively (Fig. 2b).
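The notion of common formulas used for the Venn diagrams (same molecular formula and a retention time difference ≤ 0.1 min in all cities) can be sketched as follows. This is a simplified illustration, not the authors' workflow: the function name common_formulas is hypothetical, the first city is used as the reference for the retention time comparison, and the retention times in the usage note are made up.

```python
def common_formulas(cities, rt_tol=0.1):
    """Return formulas found in every city at matching retention times.

    cities: list of per-city lists of (formula, retention_time_min) tuples.
    A formula counts as common if, for some peak in the first (reference)
    city, every other city has a peak with the same formula whose retention
    time differs by at most rt_tol minutes.
    """
    reference, *others = cities
    common = set()
    for formula, rt in reference:
        if all(
            any(f == formula and abs(r - rt) <= rt_tol for f, r in other)
            for other in others
        ):
            common.add(formula)
    return common
```

For example, with hypothetical peak lists in which C8H6O4 elutes at 3.20, 3.25 and 3.15 min in the three cities while C7H6O2 elutes at 4.10, 4.50 and 4.12 min, only C8H6O4 is returned as a common formula.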
Figure 3. [...] The molecular formula represents the abundance-weighted average CHO- formula, and the area of the circles is proportional to the fourth root of the peak abundance of an individual compound (a diagram with circle areas related to the absolute peak abundances is presented in Fig. S2). The color bar denotes the aromaticity equivalent (gray with X_C < 2.50, purple with 2.50 ≤ X_C < 2.70 and red with X_C ≥ 2.70). The pie charts show the percentage of each X_C category (i.e., gray color-coded compounds, purple color-coded compounds and red color-coded compounds) in each sample in terms of peak abundance.
In the following, we compare and discuss the chemical properties in detail for the three cities, including the degrees of oxidation, unsaturation and aromaticity of each organic compound class (i.e., CHO, CHON, CHN, CHOS and CHONS). It should be noted that chlorine-containing compounds are not discussed in this study due to their very low MS signal abundance. In addition, since peak abundances of the formulas can vary by orders of magnitude, the area of the circles presented in Figs. 3 and 5–7 is proportional to the fourth root of the peak abundance of each formula to reduce the size difference of the circles. For a more detailed comparison, figures with the circle size related to the absolute peak abundances are presented in the Supplement.
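The effect of the fourth-root scaling can be illustrated with a short sketch. The function name circle_area and the scale parameter are hypothetical and only serve to show how strongly the scaling compresses the dynamic range.

```python
def circle_area(abundance, scale=1.0):
    """Marker area proportional to the fourth root of the peak abundance.

    scale is an arbitrary plotting factor (an assumption for illustration).
    """
    if abundance < 0:
        raise ValueError("peak abundance must be non-negative")
    return scale * abundance ** 0.25
```

Two peaks whose abundances differ by four orders of magnitude (e.g., 1e4 and 1e8) are drawn with circle areas that differ only by a factor of 10.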

CHO compounds
CHO compounds have been widely observed in urban OA, accounting for a substantial fraction (8 %–67 %) of OA (Rincón et al., 2012; Tao et al., 2014; Wang et al., 2017, 2018). Previous studies have shown that a large fraction of CHO compounds in urban OA are organic acids containing deprotonatable carboxyl functional groups, which are detected preferentially in negative ionization mode when using ESI-MS. As shown in Table 1, a total of 346, 164 and 196 CHO- compounds were detected in ESI- in the OA samples collected in Changchun, Shanghai and Guangzhou, accounting for 30 %, 40 % and 42 % of the overall peak abundance in each sample, respectively. Out of all assigned formulas, 47 common CHO- formulas were observed for all cities, accounting for 35 %–52 % and 42 %–68 % of all identified CHO- formulas in terms of formula numbers and peak abundance, respectively. Despite this similarity, OA samples from Changchun- (i.e., in negative ionization mode) exhibit certain differences compared to samples from Shanghai- and Guangzhou-. The average H/C values for CHO- compounds are in a similar range for the three locations (i.e., 0.96–1.10); however, the average O/C ratios for Shanghai- (0.59) and Guangzhou- (0.65) are rather high compared to that for Changchun- (0.41). Furthermore, the relative peak abundance fraction of CHO- compounds with O/C ≥ 0.6, which are considered to be highly oxidized compounds (Tu et al., 2016), is 14 % in Changchun- and considerably higher in Shanghai- (34 %) and Guangzhou- (45 %). Altogether, these results indicate that CHO- compounds in urban OA from east and southeast China experienced more intense oxidation and aging processes and/or were affected to a larger degree by biogenic sources.
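The fraction of highly oxidized compounds quoted above is a ratio of summed peak abundances. A minimal sketch, assuming each compound is given as a hypothetical (carbon count, oxygen count, peak abundance) tuple and using the O/C ≥ 0.6 threshold from the text:

```python
def highly_oxidized_fraction(compounds, threshold=0.6):
    """Peak abundance fraction of compounds with O/C >= threshold.

    compounds: list of (c, o, abundance) tuples; the tuple layout is an
    assumption made for this illustration.
    """
    compounds = list(compounds)
    total = sum(a for _, _, a in compounds)
    if total <= 0:
        raise ValueError("total peak abundance must be positive")
    high = sum(a for c, o, a in compounds if o / c >= threshold)
    return high / total
```

For three compounds with O/C of 0.5, about 1.17 and about 0.29 and abundances of 50, 30 and 20, only the second compound exceeds the threshold, so the function returns 0.3.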
Similarly, as shown in Fig. 3 (MM_avg (Guangzhou-) = 172), respectively. Again, these average formulas show that CHO- compounds in Shanghai- and Guangzhou- experienced more intense oxidation processes and/or were affected to a larger degree by biogenic precursors, as indicated by the larger abundance-weighted MM_avg with a higher degree of oxygenation. In contrast, CHO- compounds from OA samples in Changchun- exhibit a lower abundance-weighted MM_avg with a decreased oxygen content.
Besides oxygenation, the aromaticity of the detected CHO- compounds exhibits remarkable differences between the three cities. In all cities, the CHO- compounds with high peak abundance were mainly assigned to monoaromatics with 2.5 ≤ X_C < 2.7 (purple circles in Fig. 3) in the region of 7–12 carbon atoms per compound and DBE values of 5–7. The relative peak abundance fraction of monoaromatics in total CHO- compounds is 67 % in Changchun, which is higher than the 64 % in Shanghai and 49 % in Guangzhou. In addition, 14 % of CHO- compounds in Changchun were identified as polyaromatic compounds with X_C ≥ 2.7 (red circles in Fig. 3), which is higher than the 8 % in Shanghai and 4 % in Guangzhou. These observations indicate that CHO- compounds in the three Chinese cities are highly affected by aromatic precursors (e.g., benzene, toluene and naphthalene), in particular for the Changchun aerosol samples.
Besides the monoaromatics and polyaromatics, the rest of the detected CHO- compounds were assigned to aliphatic compounds with an X_C lower than 2.5 (gray circles in Fig. 3). Interestingly, these aliphatic compounds account for about 47 % of all CHO- compounds for Guangzhou- samples in terms of peak abundance, whereas samples from Changchun- and Shanghai- exhibit only rather small fractions of such CHO- compounds, i.e., 19 % and 28 %, respectively. Such aliphatic compounds are commonly derived from biogenic precursors and vehicle emissions (Tao et al., 2014; Wang et al., 2017) and/or generated by intense oxidation processes of aromatic precursors, indicating different biogenic and anthropogenic emission sources and chemical reaction processes for OA in the three cities.
In addition, through the analysis of individual formulas, we find that for the Changchun- samples, the formulas C8H6O4, C7H6O2, C7H6O3, C8H8O2 and C8H8O3 with DBE values of 6, 5, 5, 5 and 5 dominate the assigned CHO formulas with respect to peak abundance. According to previous studies, C8H6O4, C7H6O2 and C7H6O3 are suggested to be phthalic acid, benzoic acid and monohydroxy benzoic acid, respectively, which are derived from naphthalene (Kautzman et al., 2010; Riva et al., 2015; Wang et al., 2017; He et al., 2018; Huang et al., 2019). C8H8O2 is likely 4-hydroxyacetophenone, which could be derived from estragole (Pereira et al., 2014), while C8H8O3 is suggested to be either 4-methoxybenzoic acid generated from estragole (Pereira et al., 2014) or vanillin emitted from biomass burning. For the Shanghai- samples, besides C8H6O4, C7H6O3 and C7H6O2, the formulas C6H8O7 and C9H8O4 with DBE values of 3 and 6 were observed with high peak abundances. C6H8O7 was identified as citric acid in pollen and mountain particle samples in previous studies (Fu et al., 2008; Wang et al., 2009; Jung and Kawamura, 2011), and C9H8O4 is probably homophthalic acid derived from estragole (Pereira et al., 2014). For the Guangzhou- samples, besides the formulas C8H6O4 and C6H8O7 discussed above, C4H6O4 and C4H6O5 with low DBE values of 2 were detected with high abundances and are suggested to be succinic acid and malic acid, respectively (Claeys et al., 2004; Wang et al., 2017).

CHON compounds
A large number of nitrogen-containing organic compounds were detected in these three cities, accounting for 39 %–55 % and 25 %–47 % of the total peak abundance detected in ESI- and ESI+, respectively. Out of all assigned formulas, 45 common CHON- and 62 common CHON+ formulas were observed in all cities, accounting for 65 %–82 % and 25 %–44 % of all CHON compounds detected in ESI- and ESI+ in terms of peak abundance, respectively. This indicates that a large number of CHON compounds in all three Chinese cities show similar chemical composition.
The CHON compounds were further classified into different subgroups according to their O/N ratios (Fig. 4 for CHON- and Fig. S3 for CHON+) or according to the number of nitrogen atoms in their molecular formulas (see Fig. S4 for CHON- and Fig. S5 for CHON+). As shown in Fig. 4, the majority (84 %–96 % in terms of peak abundance) of CHON- compounds exhibited O/N ratios ≥ 3, allowing the assignment of one nitro (-NO2) or nitrooxy (-ONO2) group for these formulas, which are preferentially ionized in ESI- mode (Lin et al., 2012b; Wang et al., 2017, 2018; Song et al., 2018). CHON- formulas with O/N ratios ≥ 4 suggest the presence of further oxygenated functional groups, such as a hydroxyl group (-OH) or a carbonyl group (C=O). In terms of peak abundance, 59 % of CHON- compounds observed in Guangzhou- exhibited formulas with O/N ratios ≥ 4, which is higher than the 51 % in Changchun- and 45 % in Shanghai-, indicating that CHON- compounds in southeast China show a higher degree of oxidation compared to those in northeast and east China. Not surprisingly, CHON+ compounds generally exhibit lower O/N ratios (Fig. S3), as they probably contain a reduced nitrogen functional group (e.g., amines), which is preferably detected in ESI+. As shown in Fig. S3, CHON+ compounds with an O/N ratio of 1 are dominant in Changchun+, whereas CHON+ compounds in Shanghai+ and Guangzhou+ show a broader range of O/N ratios from 1 to 3. Moreover, the average O/C ratios (0.27–0.45) in Shanghai+ and Guangzhou+ (Table 1) are much greater than those (0.19) in Changchun+. Consistent with the observations for CHO compounds, these results indicate again that CHON+ compounds in the OA of east and southeast China experienced more intensive photooxidation and/or were affected to a larger degree by biogenic precursors.
Figure 5 shows the DBE versus C number of CHON- compounds for the three cities. The majority of CHON- compounds lie in the region of 5–15 C atoms and 3–10 DBEs.
A total of 67 % of CHON- compounds in terms of peak abundance were assigned to mono- or polyaromatics in Shanghai-, which is higher than 52 % in Guangzhou- and 55 % in Changchun-. This indicates that CHON- compounds are dominated by aromatic compounds in all cities, while a relatively higher peak abundance-weighted fraction of aromatic CHON- compounds was observed in Shanghai. The peak abundance-weighted average molecular formulas for CHON- compounds in Changchun-, Shanghai- and Guangzhou- are C7.10H6.76O3.56N1.03, C7.07H6.03O3.80N1.24 and C7.12H6.36O3.99N1.24, respectively, showing that CHON- formulas in Shanghai- and Guangzhou- contain more O and N atoms on average than those for Changchun-. Formulas of C6H5O3N1, C6H5O4N1, C7H7O3N1, C7H7O4N1, C8H9O3N1 and C8H9O4N1 were detected with the highest abundance in all cities. These molecular formulas are in line with nitrophenol or nitrocatechol analogs, which have been identified in a previous urban OA study. Furthermore, these nitrooxy-aromatic compounds were shown to enhance light-absorbing properties of OA (Lin et al., 2015).

Figure 5. Double bond equivalent (DBE) versus carbon number for all CHON- compounds for all sample locations. The molecular formula represents the abundance-weighted average CHON- formula, and the area of circles is proportional to the fourth root of the peak abundance of an individual compound (a diagram with circle areas related to absolute peak abundances is presented in Fig. S6). The color bar denotes the aromaticity equivalent (gray with X_C < 2.50, purple with 2.50 ≤ X_C < 2.70 and red with X_C ≥ 2.70). The pie charts show the percentage of each X_C category (i.e., gray color-coded compounds, purple color-coded compounds and red color-coded compounds) in each sample in terms of peak abundance.
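A peak abundance-weighted average molecular formula of the kind quoted above can be computed as sketched below. This assumes that the weighting is a plain abundance-weighted mean of the atom counts; the function name and the two-peak example are hypothetical and are not the data of this study.

```python
def weighted_average_formula(peaks):
    """Abundance-weighted mean atom count per element.

    peaks: list of (formula, abundance) pairs, where formula is a dict such as
    {"C": 6, "H": 5, "O": 3, "N": 1} (hypothetical layout).
    """
    total = sum(abundance for _, abundance in peaks)
    if total <= 0:
        raise ValueError("total peak abundance must be positive")
    elements = sorted({element for formula, _ in peaks for element in formula})
    return {
        element: sum(formula.get(element, 0) * abundance for formula, abundance in peaks) / total
        for element in elements
    }


# Invented example: C6H5O3N with abundance 1 and C7H7O4N with abundance 3.
example = [
    ({"C": 6, "H": 5, "O": 3, "N": 1}, 1.0),
    ({"C": 7, "H": 7, "O": 4, "N": 1}, 3.0),
]
average = weighted_average_formula(example)
```

For this invented input the result corresponds to C6.75H6.50O3.75N1.00, which shows how a few abundant peaks dominate the average.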
In addition, it should be noted that the X_C values for C6H5O4N1, C7H7O4N1 and C8H9O4N1 were calculated to be lower than 2.5, suggesting that the fraction of aromatics in CHON- compounds was underestimated. This is because, for nitrocatechol analogs with formulas of C6H5O4N1, C7H7O4N1 and C8H9O4N1, only one oxygen atom is involved in the π-bond structure, corresponding to a p value of 0.25 in the X_C calculation equation, which is lower than the p value of 0.5 applied for the X_C calculation in this study. The diagram of DBE versus C number for CHON+ compounds observed in the three locations (presented in Fig. S7 in the Supplement) shows that more aromatic CHON+ compounds with a relatively lower degree of oxidation were assigned in Changchun+ samples compared to Shanghai+ and Guangzhou+ samples.
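The sensitivity of X_C to the chosen p value can be checked with a short sketch. The X_C equation is not reproduced in this section, so the code assumes the commonly used aromaticity-equivalent form X_C = [3(DBE - (p·N_O + q·N_S)) - 2] / [DBE - (p·N_O + q·N_S)], with X_C = 0 when the denominator is not positive, and DBE = (2C + 2 + N - H)/2. The parameter q for sulfur, the function names and the omission of any rounding rule for p·N_O are assumptions of this sketch.

```python
def dbe(c, h, n=0):
    """Double bond equivalent for a CHONS formula (O and S do not contribute)."""
    return (2 * c + 2 + n - h) / 2


def x_c(c, h, o=0, n=0, s=0, p=0.5, q=0.5):
    """Aromaticity equivalent under the assumed definition.

    p and q are the assumed fractions of O and S atoms involved in pi bonds.
    No rounding of p*o + q*s is applied in this sketch.
    """
    reduced = dbe(c, h, n) - (p * o + q * s)
    if reduced <= 0:
        return 0.0
    return (3 * reduced - 2) / reduced


# C6H5O4N1 (nitrocatechol-type formula) with the two p values discussed in the text.
xc_p_half = x_c(6, 5, o=4, n=1, p=0.5)
xc_p_quarter = x_c(6, 5, o=4, n=1, p=0.25)
```

Under these assumptions C6H5O4N1 has DBE = 5 and falls below the 2.5 threshold with p = 0.5, but reaches it with p = 0.25, in line with the underestimation argument above.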

CHN+ compounds
A total of 696 CHN+ compounds were detected in Changchun+ samples in ESI+, which is higher than in Shanghai+ (253) and Guangzhou+ (205). These CHN+ compounds are likely assignable to amines according to previous studies (Rincón et al., 2012; Wang et al., 2017, 2018). The number of CHN+ compounds accounts for 24 %, 36 % and 30 % of the total organic compounds in Changchun+, Shanghai+ and Guangzhou+, respectively, whereas the peak abundance of these compounds accounts for 40 %, 71 % and 62 %, respectively. The majority (> 97 % in terms of peak abundance) of CHN+ compounds have one or two nitrogen atoms in their molecular formulas (see Fig. S9). Comparing the CHN+ compounds for the three cities, 51 common CHN+ formulas were observed in all cities, which contribute as much as 43 %-89 % of the total abundance of CHN+ formulas. This large percentage indicates that CHN+ compounds with presumably high concentrations in Changchun+, Shanghai+ and Guangzhou+ exhibit similar chemical composition. However, OA samples from Changchun again show some distinct differences from samples from Guangzhou and Shanghai.
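The comparison of common formulas across cities amounts to a set intersection followed by an abundance sum. The sketch below illustrates this with invented data; the function name, the site labels and the abundances are hypothetical, and formulas are assumed to be compared as exact strings.

```python
def common_formula_share(samples):
    """Return the formulas present in every sample and their abundance share per sample.

    samples: dict mapping a sample label to {formula_string: peak_abundance}.
    """
    common = set.intersection(*(set(peaks) for peaks in samples.values()))
    shares = {
        label: sum(peaks[formula] for formula in common) / sum(peaks.values())
        for label, peaks in samples.items()
    }
    return common, shares


# Invented two-site example, not the measured data.
example_samples = {
    "site_a": {"C10H14N2": 60.0, "C10H9N": 30.0, "C9H7N": 10.0},
    "site_b": {"C10H14N2": 80.0, "C10H9N": 10.0, "C5H5N": 10.0},
}
common, shares = common_formula_share(example_samples)
```

A small number of common formulas can still account for a large abundance share, which is why the text reports both formula counts and peak abundance percentages.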
A van Krevelen diagram of CHN+ compounds detected in the three samples is shown in Fig. 6, illustrating H/C ratios as a function of N/C ratio. In this plot, major parts of the CHN+ compounds are found in a region that is constrained by H/C ratios between 0.5 and 2 and N/C ratios lower than 0.5. Moreover, the pie charts show that the majority (83 %-87 % in terms of peak abundance and 72 %-90 % in terms of peak numbers) of these CHN+ compounds can be assigned to mono- and polyaromatics with X_C ≥ 2.5. In addition, as shown in Table 1, the average DBE and X_C values of CHN+ compounds are the highest among all organic species. These observations imply that CHN+ compounds exhibit the highest degree of aromaticity of all organics in the Chinese urban OA samples, which is consistent with previous studies (Lin et al., 2012b; Rincón et al., 2012; Wang et al., 2018). Polyaromatic compounds with X_C ≥ 2.7 are displayed in the lower left corner of the van Krevelen diagram, accounting for 41 % in terms of peak abundance (48 % in terms of peak numbers) of CHN+ compounds detected in Changchun+, but merely for 9 %-10 % in terms of peak abundance (27 %-31 % in terms of peak numbers) in Shanghai+ and Guangzhou+.

Figure 6. Van Krevelen diagrams for CHN+ compounds in Changchun, Shanghai and Guangzhou samples. The area of circles is proportional to the fourth root of the peak abundance of an individual compound (a diagram with circle areas related to absolute peak abundances is presented in Fig. S10) and the color bar denotes the aromaticity equivalent (gray with X_C < 2.50, purple with 2.50 ≤ X_C < 2.70 and red with X_C ≥ 2.70). The pie charts show the percentage of each X_C category (i.e., gray color-coded compounds, purple color-coded compounds and red color-coded compounds) in each sample in terms of peak abundance.
For example, formulas of C11H11N1 (X_C = 2.7), C10H9N1 (X_C = 2.7) and C12H13N1 (X_C = 2.7), which are assigned to be naphthalene core structure-containing compounds, have relatively higher abundance in Changchun+ than in Shanghai+ and Guangzhou+. Moreover, the average DBE and X_C values of CHN+ compounds (see Table 1) in Changchun+ are higher than those in Shanghai+ and Guangzhou+, further indicating that CHN+ compounds in Changchun+ show a higher degree of aromaticity, which may be caused by large coal combustion emissions in the winter in Changchun. Remarkably, as can be seen in Fig. 6, the abundance of CHN+ compounds in Changchun+ is distributed evenly among different individual CHN+ compounds, while in Shanghai+ and Guangzhou+ they are dominated by the formula of C10H14N2 (the biggest purple circle in Fig. 6) with a DBE value of 5, which probably has a high concentration and/or high ionization efficiency in the positive ESI mode. According to a previous smog chamber study (Laskin et al., 2010), most CHN+ aromatics are probably generated from biomass burning through the addition of reduced nitrogen (e.g., NH3) to the organic molecules via imine formation reaction, indicating that biomass burning probably made a certain contribution to the formation of CHN+ compounds observed in the three urban OA samples in our study.
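The van Krevelen coordinates and aromaticity categories of the CHN+ formulas named above can be reproduced with a short calculation. The sketch assumes DBE = (2C + 2 + N - H)/2 and, because CHN formulas contain no O or S, the simplified form X_C = (3·DBE - 2)/DBE of the commonly used aromaticity-equivalent definition. The function name and the category labels are hypothetical; the thresholds (2.5 and 2.7) follow the text.

```python
def chn_metrics(c, h, n):
    """H/C, N/C, DBE, X_C and an aromaticity category for a CHN formula."""
    dbe = (2 * c + 2 + n - h) / 2
    x_c = (3 * dbe - 2) / dbe if dbe > 0 else 0.0
    if x_c >= 2.7:
        category = "polyaromatic"
    elif x_c >= 2.5:
        category = "monoaromatic"
    else:
        category = "non-aromatic"
    return {"H/C": h / c, "N/C": n / c, "DBE": dbe, "X_C": x_c, "category": category}


naphthalene_type = chn_metrics(10, 9, 1)    # C10H9N1
dominant_formula = chn_metrics(10, 14, 2)   # C10H14N2
```

Under these assumptions C10H9N1 gives DBE = 7 and X_C of about 2.71, while C10H14N2 gives DBE = 5 and X_C = 2.6, which places it in the purple (2.50 ≤ X_C < 2.70) category of Fig. 6.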

CHOS- compounds
In this study, 75-155 CHOS- compounds were observed, accounting for 10 %, 12 % and 14 % of the total peak abundance of all organics in Changchun-, Shanghai- and Guangzhou-, respectively. Around 89 %-96 % of these CHOS- compounds were found to fulfill the O/S ≥ 4 criterion, allowing the assignment of at least one -OSO3H functional group and thus a tentative classification as organosulfates (OSs) (Lin et al., 2012a, b; Tao et al., 2014; Wang et al., 2016, 2017, 2018, 2019a). OSs were shown to affect the surface activity and hygroscopic properties of the aerosol particles, leading to potential impacts on climate (Hansen et al., 2015; Wang et al., 2019a). Out of all formulas, 23 common CHOS- formulas were detected for the three sample locations, accounting for 28 %, 58 % and 52 % of the CHOS- peak abundance in Changchun-, Shanghai- and Guangzhou-, respectively. However, 40 common CHOS- formulas were found between Shanghai- and Guangzhou-, accounting for 60 %-65 % and 78 %-81 % in terms of the CHOS- formula numbers and peak abundance, respectively. This indicates that the chemical composition of the major CHOS- compounds of Shanghai- and Guangzhou- is quite similar, while it differs substantially from that of the samples from Changchun-.

Figure 7 shows the DBEs as a function of carbon number for all CHOS- compounds detected for the three cities. The CHOS- compounds exhibit a DBE range from 0 to 10 and a carbon number range of 2-15. However, the majority of CHOS- compounds with elevated peak abundances are concentrated in a region with rather low DBE values of 0-5. The average H/C ratios of CHOS- compounds are in the range of 1.56-1.85 and thus higher than for any other compound class, whereas the average DBE values of 1.71-2.55 are the lowest among all classes. This indicates that CHOS- compounds in the OA from the three Chinese cities are characterized by a low degree of unsaturation. Moreover, the pie charts in Fig. 7 show that aliphatic compounds with X_C < 2.5 are dominant in CHOS- compounds with a fraction of 96 %-99 % in terms of peak abundance, which is substantially higher than that (13 %-48 %) for CHO, CHON and CHN species. Aliphatic CHOS- compounds with C ≤ 10 can be formed from biogenic and/or anthropogenic precursors (Hansen et al., 2014; Glasius et al., 2018; Wang et al., 2019a), such as C2H4O6S1 (derived from glyoxal) (Lim et al., 2010; McNeill et al., 2012), C3H6O6S1 (derived from isoprene) (Surratt et al., 2007) and C8H16O4S1 (derived from α-pinene). However, more CHOS- compounds with C > 10 and with DBEs lower than 1 are observed in Changchun-, such as C14H28O5S1, C13H26O5S1, C12H24O5S1, C11H22O5S1 and C11H20O6S1. These high-carbon-number-containing CHOS- compounds are likely formed from long-alkyl-chain compounds with fewer oxygenated functional groups, which were previously suggested to be emitted from traffic (Tao et al., 2014) or derived from sesquiterpene emissions (Brüggemann et al., 2019). However, as sesquiterpene emissions can be expected to be very low in wintertime at Changchun, the presence of these compounds further underlines the strong impact of anthropogenic emissions on CHOS- formation in Changchun-.

Figure 7. Double bond equivalent (DBE) versus carbon number for all CHOS- compounds for all sample locations. The molecular formula represents the abundance-weighted average CHOS- formula, and the area of circles is proportional to the fourth root of the peak abundance of an individual compound (a diagram with circle areas related to absolute peak abundances is presented in Fig. S11). The color bar denotes the aromaticity equivalent (gray with X_C < 2.50, purple with 2.50 ≤ X_C < 2.70 and red with X_C ≥ 2.70). The pie charts show the percentage of each X_C category (i.e., gray color-coded compounds, purple color-coded compounds and red color-coded compounds) in each sample in terms of peak abundance.
In this study, the (O-3S)/C ratio was used instead of the traditional O/C ratio to present the oxidation state of CHOS- compounds, since the sulfate functional group contains three more oxygen atoms than common oxygen-containing groups (e.g., hydroxyl and carbonyl), which make no contribution to the oxidation state of the carbon backbone of the CHOS- compounds. Comparing average values for H/C, (O-3S)/C and DBEs of CHOS- for the three sample locations (see Table 1), we find that the H/C ratios (1.85) and (O-3S)/C ratios (0.61-0.71) for Shanghai- and Guangzhou- samples are larger than those for Changchun- samples (H/C = 1.56 and (O-3S)/C = 0.52), whereas the DBE values (1.71-1.79) in Shanghai- and Guangzhou- are lower than those for Changchun- (2.55). These observations indicate that CHOS- compounds in urban OA from northeast China are less oxidized but more unsaturated compared to those in east and southeast China, likely due to enhanced emissions from residential heating during winter in north China.
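The effect of replacing O/C by (O-3S)/C can be seen in a minimal worked example. The function names are hypothetical; the two formulas are taken from the CHOS- discussion above.

```python
def o_to_c(c, o):
    """Conventional O/C ratio."""
    return o / c


def o_minus_3s_to_c(c, o, s):
    """(O-3S)/C ratio: three O atoms per S atom are attributed to the sulfate group."""
    return (o - 3 * s) / c


# C2H4O6S1 (glyoxal-derived) and C14H28O5S1 (long-chain compound seen in Changchun-).
glyoxal_os = (o_to_c(2, 6), o_minus_3s_to_c(2, 6, 1))
long_chain_os = (o_to_c(14, 5), o_minus_3s_to_c(14, 5, 1))
```

For C2H4O6S1 the ratio drops from 3.0 to 1.5, and for C14H28O5S1 from about 0.36 to about 0.14, so the correction matters most for small, sulfate-dominated formulas.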

CHONS compounds
A total of 4 %-5 % of the total organics detected in ESI- were identified as CHONS- compounds in terms of peak abundance. In contrast, CHONS+ compounds account merely for 0.3 %-1 % of all organics detected in ESI+. The MMavg of the CHONS- compounds for the three sample locations ranges from 214 to 293 Da, generally showing larger molecular masses than compounds of any other class because of the likely presence of both nitrate and sulfate functional groups. In total, only five common CHONS- formulas were detected for all three sample locations, accounting for 4 %, 21 % and 20 % of the CHONS- peak abundance in Changchun-, Shanghai- and Guangzhou-, respectively. As already observed for other compound classes, these percentages imply that the CHONS- compounds in urban OA of Shanghai- and Guangzhou- exhibit a rather similar chemical composition, whereas such compounds are different for Changchun-.
In the OA samples of Shanghai- and Guangzhou-, 78 %-87 % of CHONS- compounds in terms of peak abundance have seven or more O atoms in their formulas, allowing the assignment of one -OSO3H and one -NO3 functional group in the molecular structures, thus classifying them as potential nitrooxy-organosulfates. In contrast to Shanghai- and Guangzhou-, only 26 % of CHONS- compounds were assigned to such nitrooxy-organosulfates for Changchun-, indicating that most of the N atoms in the CHONS- compounds are present in a reduced oxidation state, e.g., in the form of amines. The average DBE and X_C values of CHONS- compounds in Shanghai- and Guangzhou- are 3.3-3.45 and 0.43-0.44, respectively. Again these values differ for the Changchun- samples with an increased average DBE of 3.75 and an average X_C of 1.06, indicating that CHONS- compounds in Changchun- possess on average a higher degree of unsaturation and aromaticity compared to such compounds in Shanghai- and Guangzhou- samples. Interestingly, the compound with the formula C10H17O7NS has the highest relative peak abundance (32 %) in Shanghai- and Guangzhou-, whereas in Changchun- the compound with the formula C2H3O4NS is dominant. C10H17O7NS has previously been identified as mononitrate organosulfate generated from α/β-pinene (Iinuma et al., 2007; Surratt et al., 2008; Lin et al., 2012b; Wang et al., 2017), while C2H3O4NS may be assigned as a cyano-group-containing sulfate. This observation is comparable to our previous study (Wang et al., 2019a), which found that C10H17O7NS was dominant for CHONS- compounds in low-concentration aerosol samples collected in Beijing (China) and Mainz (Germany). Consistently, a C2H3O4NS compound had the highest abundance among CHONS- compounds in polluted Beijing aerosol samples.
This agreement can be explained by the adjacent locations of Beijing (39.99° N, 116.39° E) and Changchun (43.54° N, 125.13° E) and similar residential heating patterns by coal combustion during wintertime. In conclusion, these results further demonstrate that the precursors for CHONS- compounds in Shanghai- and Guangzhou- are different from those in Changchun-, which is probably due to differences in anthropogenic emissions.
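The oxygen-count criterion used above for tentative nitrooxy-organosulfate assignment can be written as a simple formula check. The function name is hypothetical, and the sketch applies the criterion literally as stated in the text (at least one N, at least one S and seven or more O atoms); it says nothing about the actual molecular structure.

```python
def is_potential_nitrooxy_organosulfate(o, n, s):
    """True if a CHONS formula has enough O atoms for one -OSO3H and one -NO3 group."""
    return n >= 1 and s >= 1 and o >= 7


# The two dominant CHONS- formulas discussed in the text.
pinene_derived = is_potential_nitrooxy_organosulfate(o=7, n=1, s=1)   # C10H17O7NS
changchun_major = is_potential_nitrooxy_organosulfate(o=4, n=1, s=1)  # C2H3O4NS
```

C10H17O7NS passes the check, whereas C2H3O4NS does not, consistent with its tentative assignment to a compound carrying reduced nitrogen.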

Limitations
In this study, we used the peak abundance-weighted method to illustrate the difference in chemical formulas assigned by the Orbitrap MS. This comparison was made based on the assumption that the measured organic compounds have the same peak abundance response in the mass spectrometer. However, this assumption introduces uncertainty because the ionization efficiencies vary between different compounds (Schmidt et al., 2006; Leito et al., 2008; Perry et al., 2008; Kruve et al., 2014). For example, the ionization efficiencies of nitrophenol species detected in negative ESI mode can vary by a large degree depending on the position of the substituents at the nitrobenzene ring (Schmidt et al., 2006; Kruve et al., 2014), and the ionization efficiencies of carboxylic acids can also vary by several orders of magnitude depending on the structures (Kruve et al., 2014). Nonetheless, it is a challenging analytical task to identify and quantify all compounds in ambient OA due to the high chemical complexity of OA and the limited availability of authentic standards. Despite the inherent uncertainties, the peak abundance-weighted comparison of molecular formulas provides an overview of the difference in chemical composition of OA in these three representative Chinese cities. In particular, the chemical formulas assigned in this study can be validated in future studies by authentic standards, and the difference in ionization efficiencies can be further evaluated.

Conclusions
The molecular composition of the organic fraction of PM2.5 samples collected in three Chinese megacities (Changchun, Shanghai and Guangzhou) was investigated using a UHPLC-Orbitrap mass spectrometer. In total, 416-769 (ESI-) and 687-2943 (ESI+) organic compounds were observed and separated into five subgroups: CHO, CHN, CHON, CHOS and CHONS. Specifically, 120 common formulas were detected in ESI- and 129 common formulas in ESI+ for all sample locations, accounting for 57 %-71 % and 30 %-75 % in terms of peak abundance, respectively. Overall, we found that urban OA in Changchun, Shanghai and Guangzhou shows a quite similar chemical composition for organic compounds of high concentrations. The majority of these organic species were assigned to monoaromatic or polyaromatic compounds, indicating that anthropogenic emissions are the major source for urban OA in all three cities.
Despite the chemical similarity of the three sample locations for organic compounds in urban OA, remarkable differences were found in chemical composition of the remaining particle constituents, in particular for OA samples from Changchun. In general, a larger number of polyaromatics were observed for Changchun samples, most likely due to emissions from coal combustion during the wintertime residential heating period. Moreover, the peak abundance-weighted average DBE and average X_C values of the total organic compounds in Changchun were found to be larger than those for Shanghai and Guangzhou, showing that organic compounds in Changchun possess a higher degree of unsaturation and aromaticity. For average H/C and O/C ratios a similar trend was observed. While average H/C and O/C ratios detected in ESI- were found to be highest for Guangzhou samples, relatively lower values were observed for Shanghai and Changchun samples, indicating that OA collected in lower-latitude regions of China experiences more intense photochemical oxidation processes and/or is affected to a larger degree by biogenic sources.
Data availability. All relevant data have been included in this paper in the form of tables and figures. Specific data requests can be addressed by email to the corresponding authors.
Author contributions. RJH, TH and KW conducted the study design. LY, HN, JG and MW collected the PM 2.5 filter samples. KW and YZ carried out the experimental work and data analysis. KW wrote the manuscript. KW, TH, RJH, MaB, YZ, JH, MB and MG interpreted data and edited the manuscript. All authors commented on and discussed the manuscript.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. This study was supported by the Na-
Review statement. This paper was edited by Frank Keutsch and reviewed by three anonymous referees.