Measurement report: Underestimated reactive organic gases from residential combustion – insights from a near-complete speciation

Reactive organic gases (ROGs), as important precursors of secondary pollutants, are not well resolved because their chemical complexity has challenged their quantification in many studies. Here, a near-complete speciation of ROG emissions from residential combustion was developed by combining proton transfer reaction time-of-flight mass spectrometry (PTR-ToF-MS) with a gas chromatography system equipped with a mass spectrometer and a flame ionization detector (GC-MS/FID), covering 1049 species in all. Among them, 125 identified species, accounting for ∼ 90 % of the total ROG mass, were used to evaluate emission characteristics through real combustion sampling in rural households of China. The study revealed that, because 55 species (mainly oxygenated species, higher hydrocarbons with ≥ 8 carbon atoms, and nitrogen-containing species) were previously un- and under-characterized, ROG emissions from residential coal and biomass combustion were underestimated by 44.3 % ± 11.8 % and 22.7 % ± 3.9 %, respectively, which further amplified the underestimation of secondary organic aerosol formation potential (SOAP) to as high as 70.3 % ± 1.6 % and 89.2 % ± 1.0 %, respectively. The hydroxyl radical reactivity (OHR) of ROG emissions was also significantly underestimated. The study provides a feasible method for the near-complete speciation of ROGs in the atmosphere and highlights the importance of acquiring completely speciated measurements of ROGs from residential emissions, as well as from other processes.


Introduction
Published by Copernicus Publications on behalf of the European Geosciences Union.

Residential combustion, dominated by approximately 89 % solid fuels in China, is responsible for ∼ 23 % and ∼ 71 % of the outdoor and indoor PM2.5 concentrations and ∼ 67 % of PM2.5-induced premature deaths (Yun et al., 2020). Reactive organic gases (ROGs), organic gases other than methane, from residential combustion have been shown to serve as key precursors for secondary organic aerosols (SOAs) (Huo et al., 2021a) and ozone formation (Heald and Kroll, 2020). ROG emissions from residential combustion, especially biomass combustion, have been widely studied due to their large contribution to global ROGs and the complexity of their composition. Studies focusing on ROG speciation for residential combustion can generally be divided into three categories according to the measurement methods, as listed in Table S1 in the Supplement. The first category used whole-air sampling with offline analysis by a one-dimensional gas chromatography system equipped with a mass spectrometer and/or a flame ionization detector (GC-MS/FID), which mainly focused on hydrocarbons (< C12) (Mo et al., 2016; Wang et al., 2013; Liu et al., 2008). With the development of advanced instruments, the second category of studies gave more attention to polar species such as oxygenated ROGs, which could be detected online through the whole combustion process, mainly by proton transfer reaction time-of-flight mass spectrometry (H3O+ PTR-ToF-MS) owing to its high mass resolution and sensitivity (Cai et al., 2019; Bruns et al., 2017; Stockwell et al., 2015; Koss et al., 2018; Akherati et al., 2020; Wu et al., 2022). Species with intermediate volatility, which made up a considerable share (approximately 6 %-24 %) of residential ROG emissions, were identified as large contributors to SOA (Cai et al., 2019; Koss et al., 2018).
Thirdly, because H3O+ PTR-ToF-MS cannot separate isomers and residential ROG emissions contain a considerable number of oxygenated ROGs with intermediate volatility, increasing interest has been expressed in the application and comparison of multiple instruments for the detailed identification of ROG species (Hatch et al., 2017). More than 150 PTR ion masses were identified using a combination of techniques including GC pre-separation, a two-dimensional GC system (GC × GC), Fourier transform infrared spectroscopy (FTIR), and NO+ chemical ionization mass spectrometry (NO+ CIMS); these ions contributed ∼ 90 % of the ROG mass detected by H3O+ PTR-ToF-MS in biomass combustion emissions. The comparisons demonstrated that H3O+ PTR-ToF-MS might be the most suitable technique for detecting the lowest-volatility and most polar species; it covered the largest share (50 %-79 %) of species among the instruments in a combined ROG measurement of more than 500 species (Hatch et al., 2017).
Recently, the higher alkanes (≥ C8), a considerable class of species in residential combustion emissions (Huo et al., 2021b; Li et al., 2023) that was not included in the comprehensive measurements of previous studies (Hatch et al., 2017), have been shown to be well measured by PTR-ToF-MS with NO+ ion chemistry (Koss et al., 2016). Thus, PTR-ToF-MS might be a preferential and promising method for developing a near-complete ROG speciation relevant for residential combustion, but it needs to be combined with GC-MS/FID for the complementary measurement of aliphatic hydrocarbons. Therefore, the present study focused on (1) developing the near-complete ROG speciation by quantifying all signals from H3O+ PTR-ToF-MS and supplementing C2-C22 aliphatic hydrocarbons by GC-MS/FID and NO+ PTR-ToF-MS and (2) characterizing the composition of ROG emissions through real combustion sampling in rural households of China. Finally, the near-complete ROG speciation further supported the estimation of ROG emissions from residential combustion in China as well as their hydroxyl radical reactivity and SOA formation potential. Residential combustion was taken as the example for developing the near-complete ROG speciation mainly because of the large complexity of combustion-relevant ROG speciation and because comprehensive measurements of residential combustion were available from previous studies, so that the present results could be further confirmed using overlapping species.

Sampling
ROG samples from the combustion of four typical biomass fuels (wood, corncob, bean straw and corn straw) and two typical types of coal (anthracite and briquette coal) (see Fig. S1 in the Supplement) were collected from the stack nozzles of household stoves using evacuated SUMMA canisters (Entech Inc., 3.2 L). During canister sampling, the combustion in the stoves was visually in the flaming stage, which is the common condition for heating or cooking in rural areas of northern China. Particles were removed by a Teflon (PTFE) filter with a 5.0 µm pore size. Temperatures (37 ± 17 °C) of flue gas were monitored at the sampling location by a flue gas analyzer (Testo 350). Wood and straw burned at an average temperature of 394 °C (in the range of 231-567 °C) and 353 °C (334-371 °C), respectively, while residential coal burned at a higher temperature (514 °C, 411-581 °C), as measured by an infrared thermometer in the stove. A total of 23 samples were collected, as shown in Fig. S1. All the samples stored in SUMMA canisters were analyzed with the combination of H3O+/NO+ PTR-ToF-MS and GC-MS/FID within 9 d.

Analysis by H3O+ PTR-ToF
H3O+ PTR-ToF-MS (Vocus 2R, Tofwerk AG, Switzerland) served as the primary instrument for the complete speciation of ROGs because it detects most ROGs with relatively complete species coverage (Li et al., 2020; Krechmer et al., 2018) and its sensitivity can be computed theoretically. All valid signals in the raw mass spectrum were identified, quantified and analyzed for uncertainty, following the data treatment process shown in Fig. 1.

Identification
A total of 991 valid ion masses were extracted from the original mass spectrum through multi-peak fitting (Stark et al., 2015), of which 130 were well identified as specific species, while 861 signals remained unknown according to literature surveys (Pagonis et al., 2019; Koss et al., 2018; Stockwell et al., 2015).

Extraction (991 masses)
The raw mass spectrum data of gaseous ROGs measured by H3O+ PTR-ToF-MS were mainly distributed in the range of m/z 40-200, with a mass resolution (m/Δm) of 10 000. Multiple peaks were fitted at each nominal mass below m/z 200, resulting in a peak list of 991 valid masses in the Tofware software package v3.2.3 (Tofwerk Inc.).

Speciation (130 specific ROGs and 861 unknown masses)
Based on the accurate m/z, the isotope pattern, a published PTR library from a review of 49 publications (containing 226 molecules and nearly 1000 species) (Pagonis et al., 2019) and a local PTR library (Yuan et al., 2022; Gao et al., 2022), the molecular information of each individual mass was calculated and matched. Due to the complexity of isomers, the recommended names of the various species were further determined from the combustion-relevant reports of Koss et al. (2018) and Stockwell et al. (2015). Finally, 130 masses were matched to proposed species, while the remaining 861 peaks were defined as unknown masses, either with a molecular formula but without a recommended species or with only an accurate m/z and no calculated elemental composition.
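The matching of a fitted peak to candidate formulas by accurate m/z can be illustrated with a minimal sketch. It is not the Tofware workflow used in the study: the two-entry library, the 20 ppm tolerance and the function names are assumptions made for illustration, and only the monoisotopic masses are standard values.

```python
# Monoisotopic atomic masses in unified atomic mass units (standard values, rounded).
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.99491462}
ELECTRON_MASS = 0.00054858

def protonated_mz(formula):
    """Return the m/z of the singly charged [M+H]+ ion for {element: count}."""
    neutral = sum(ATOM_MASS[element] * count for element, count in formula.items())
    # Adding a proton equals adding a hydrogen atom and removing one electron.
    return neutral + ATOM_MASS["H"] - ELECTRON_MASS

def match_peak(measured_mz, library, tol_ppm=20.0):
    """Return library entries whose [M+H]+ m/z lies within tol_ppm of the peak."""
    hits = []
    for name, formula in library.items():
        mz = protonated_mz(formula)
        if abs(measured_mz - mz) / mz * 1e6 <= tol_ppm:
            hits.append(name)
    return hits

# Hypothetical two-entry library; a real PTR library holds hundreds of formulas.
LIBRARY = {
    "C3H6O (e.g. acetone)": {"C": 3, "H": 6, "O": 1},
    "C2H3N (e.g. acetonitrile)": {"C": 2, "H": 3, "N": 1},
}

print(match_peak(59.0491, LIBRARY))
print(match_peak(42.0338, LIBRARY))
```

A formula match alone does not separate isomers (acetone and propanal share C3H6O), which is why the species names in the text were assigned from combustion-specific literature.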

Quantification
Sensitivities of all the masses in this work were obtained by (1) calibration with authentic standards for species with available standard gases, (2) theoretical calculation from the relationship between sensitivity and kinetic rate constants (k) for identified species without commercial standards, or (3) estimation using the sensitivity of acetone as a proxy for unknown masses.

Calculation method (112 species)
For other well-identified species without available standards, we used the method proposed by Sekimoto et al. (2017) to theoretically calculate the concentrations from the relationship between ROG sensitivities and kinetic rate constants (k_kinetic) for proton transfer reactions of H3O+ with ROGs (Fig. S2), which was established using the calibrated species above.

Estimation method (861 masses)
For unknown masses, it is difficult to estimate k_kinetic, which depends on molecular properties, so the sensitivity of acetone was used as a proxy (Cai et al., 2019).
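The three quantification tiers described above (calibration, calculation from k_kinetic, and the acetone proxy) can be sketched as follows. This is an illustrative sketch, not the authors' processing code: all sensitivities and the species labelled A to C are hypothetical, only the acetone k_kinetic of 3.23 (in units of 10−9 cm3 mol−1 s−1) is taken from the text, and a plain least-squares line stands in for the fitted relationship of Fig. S2.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def sensitivity(species, calibrated, k_kinetic, fit, proxy):
    """Pick the sensitivity by tier: calibrated, calculated from k, or proxy."""
    if species in calibrated:
        return calibrated[species]
    if species in k_kinetic:
        slope, intercept = fit
        return slope * k_kinetic[species] + intercept
    return proxy

def concentration(signal_cps, sens_cps_per_ppbv):
    """Convert an ion signal (cps) to a mixing ratio (ppbv)."""
    return signal_cps / sens_cps_per_ppbv

# Hypothetical calibrated sensitivities (cps per ppbv) and k_kinetic values
# (units of 1e-9 cm3 mol-1 s-1); only the acetone k value comes from the text.
calibrated = {"acetone": 3230.0, "species_A": 2000.0, "species_B": 4000.0}
k_kinetic = {"acetone": 3.23, "species_A": 2.0, "species_B": 4.0, "species_C": 2.5}

fit = fit_line([k_kinetic[s] for s in calibrated], [calibrated[s] for s in calibrated])
proxy = calibrated["acetone"]

for name in ("acetone", "species_C", "unknown_m/z_123.456"):
    s = sensitivity(name, calibrated, k_kinetic, fit, proxy)
    print(name, round(s, 1), round(concentration(5000.0, s), 3))
```

Keeping the tier selection in one function makes it straightforward to attach a different uncertainty to each tier afterwards.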

Uncertainty analysis
The measurement uncertainty of a specific ROG is mainly from the analysis and storage. The analysis uncertainty depending on the quantification method in Sect. 2.2.2 was estimated following the method of Sekimoto et al. (2017). The effect of storage inside the canister was evaluated using the standard samples with different storage duration in the laboratory. Accordingly, the total measurement uncertainties of different ROG species were estimated, as listed in Table S2.

Analysis uncertainty
The analysis uncertainty depends on which of the three quantification methods was used. Firstly, the experimentally calibrated species have an upper-bound analysis uncertainty of 15 %, including the contributions of analysis precision and calibration factors (Gao et al., 2022), as shown in Table S2. Secondly, for species quantified by theoretical calculation, the uncertainty was mainly attributed to the estimated value of k_kinetic and the linear regression slope (∼ 20 % from fitting bias in this study) between sensitivity and k_kinetic, and was within 50 %. Thirdly, for unknown masses, the uncertainty arose mainly from the difference between their "real" sensitivities and the assumed sensitivity of acetone, which depends on the difference in their k_kinetic. Generally, oxygenated species (CxHyOz) have higher k_kinetic (∼ 2.5-3.5 × 10−9 cm3 mol−1 s−1) than aliphatic hydrocarbons (CxHy, ∼ 2 × 10−9 cm3 mol−1 s−1), and the k_kinetic of acetone is 3.23 × 10−9 cm3 mol−1 s−1 (Cappellin et al., 2012). Thus, the maximum uncertainty of the estimated concentrations of unknown masses caused by this assumption is estimated to be 65 %, even when the "real" k_kinetic is as low as 2 × 10−9 cm3 mol−1 s−1.
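One way to read the bound for unknown masses is sketched below, under the assumption (not stated explicitly in the text) that sensitivity scales linearly with k_kinetic, so that the ratio of true to proxy-estimated concentration equals the ratio of the acetone k_kinetic to the true k_kinetic. The function name is hypothetical.

```python
K_ACETONE = 3.23  # acetone k_kinetic from the text, in 1e-9 cm3 mol-1 s-1

def proxy_bias(k_true, k_proxy=K_ACETONE):
    """Relative deviation of the true concentration from the proxy estimate.

    Assumes sensitivity is proportional to k_kinetic, so the true
    concentration equals the estimate multiplied by k_proxy / k_true.
    """
    return k_proxy / k_true - 1.0

# Range of k_kinetic quoted in the text for hydrocarbons and oxygenated species.
for k in (2.0, 2.5, 3.5):
    print(k, round(proxy_bias(k), 3))
```

Under this assumption the deviation for k_kinetic = 2 is about 0.62, which sits below the 65 % upper bound quoted above.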

Loss in storage
To evaluate the possible artifacts caused by SUMMA canister storage, sensitivity tests of the storage duration were carried out using standard samples in the laboratory. Standard samples including 85 species, with a known concentration (5 ppbv in this study), were prepared in clean evacuated SUMMA canisters and analyzed within 2 h as well as on days 1, 2, 4, 7, 10 and 14 after preparation. The loss fraction obtained here was taken as an upper limit because the mixing ratios of most observed ROGs in all samples were normally higher than 5 ppbv. The detailed species and deviations are shown in Fig. S3. The storage loss on day 10 was used, considering that samples were analyzed within 9 d after sampling in this work (Table S2). According to the experimental results, the loss of some carbonyls (-CHO and RCO-), furans and nitrogenous species (-CN) was within 20 %, while the loss of several alcohols (-OH) exceeded 50 %. The results for carbonyls were comparable with previous studies in which the half-lives of aldehydes were 18 d in the canister (Batterman et al., 1998). The higher polarity of alcohols is a possible reason for their larger loss, as polar species are preferentially adsorbed on surface sorption sites of the inner walls of SUMMA canisters (Batterman et al., 1998). Although volatility is another potential factor, the loss proportion showed no significant dependence on volatility in the current 10 d experimental results (Fig. S4).
For the other carbonyls, furans and nitrogenous species without standards, the loss during storage was assumed to be the largest measured loss fraction in the same category. For phenols, an important category in ROG emissions from residential combustion (Bruns et al., 2016), the loss was believed to be predictable, considering that phenols have been mentioned as being suitable for canister sampling in US EPA method TO-15 (EPA/625/R-96/010b, 1999), and was estimated as 20 % with reference to the largest loss fraction of the measured carbonyls.
For other categories such as acids (-COOH), species with more than two oxygen atoms (usually two functional groups) and components containing -NOn, no standards were available to evaluate the loss during storage. Overall, 63 specific ROGs with potentially large uncertainty (> 50 %) and 861 unknown masses with uncertain loss fractions were only used to quantify the total ROGs and were excluded from the further discussion of ROG composition.

Fragment interference
Aliphatic hydrocarbon ions such as C3H6H+ and C4H8H+ are subject to interference from fragments of other species. In this work, 12 such ions were excluded from the results to avoid this uncertainty, because well-characterized hydrocarbon data were provided by GC-MS/FID.
In summary, except for the 12 aliphatic hydrocarbon fragments, the other 979 masses detected by H3O+ PTR-ToF-MS were used in this study. Among them, 55 species with uncertainties ranging from 1 % to 44 % were used in the further discussion of ROG composition, and the remaining masses, with potentially larger (> 50 %) or unknown uncertainty, were only used to quantify the total ROGs in Fig. 2.

Limitation
H3O+ PTR-ToF-MS can detect ROGs as long as their proton affinity is greater than that of water (691 kJ mol−1), giving relatively complete species coverage. It may be the preferred method for complete ROG measurement because most ROGs can be detected (Li et al., 2020; Krechmer et al., 2018) and the sensitivity for a given ROG can be calculated theoretically. Despite these advantages, H3O+ PTR-ToF-MS has two limitations related to the reagent ion chemistry. Firstly, the technique is insensitive to C2-C7 alkanes, ethene and acetylene, which have lower proton affinities than water (Jobson et al., 2010). Secondly, higher alkanes (≥ C8), an important component of fuel combustion emissions (Huo et al., 2021b), are difficult to quantify by H3O+ PTR-ToF-MS because of the fragments produced during ionization. Formaldehyde, a special case in PTR measurement, was not detected in this study because of two effects: (1) mass discrimination and (2) reversible reaction. Firstly, for the PTR-ToF-MS deployed in this study, the transmission curve (Fig. S2) shows that the mass transmission efficiency of protonated formaldehyde at m/z 31 is close to 0, leading to an almost negligible sensitivity for formaldehyde. The decisive factors for the transmission efficiency of PTR-ToF-MS are the radio frequency (RF) amplitude and the frequency of the big segmented quadrupole (BSQ), which can be set up as an ion filter. A similar situation has been reported by Yang et al. (2022), in which formaldehyde was not measured by the PTR-ToF-MS (Vocus 2R, Aerodyne Research Inc.). Secondly, non-negligible back reactions between protonated formaldehyde and water vapor can reduce the sensitivity for formaldehyde (Spanel and Smith, 2008; Cui et al., 2016).
For the Vocus instrument deployed in the current work, the water vapor flow was increased from the previous level of 4-8 sccm (cm3 min−1 at 10^5 Pa and 273.15 K) (De Gouw et al., 2004) to 20-30 sccm. This change brings the great advantage that the humidity dependence of the instrument response to most species can be ignored. However, a reduced sensitivity for formaldehyde is to be expected because the much higher water vapor concentration favors the back reactions.

Species' combination
To develop a near-complete speciation of ROGs in this study, NO+ PTR-ToF-MS (Vocus 2R, Tofwerk AG, Switzerland) and GC-MS/FID (TH-300, Wuhan Tianhong Instruments, China) were deployed additionally to address the limitations of H3O+ PTR-ToF-MS discussed above. These instruments were highly complementary, each covering a unique and important range of compositional space. When combining species for the near-complete speciation of ROGs, overlapping measurements of the same species were counted only once.

NO+ PTR-ToF-MS
NO+ PTR-ToF-MS has been demonstrated to provide a supplementary measurement of higher alkanes (Koss et al., 2016). Unlike H3O+ PTR-ToF-MS, for which the sensitivity for a given ROG can be calculated theoretically even without a calibration standard, NO+ PTR-ToF-MS requires authentic standards for quantification, which limited the characterization of the mass spectrum ionized by NO+. The difficulty of predicting the ionized ROG products and of interpreting the mass spectrum unambiguously further limited its application in ROG speciation, because NO+ has three common reaction mechanisms with ROGs: charge transfer, hydride abstraction and cluster formation (Koss et al., 2016). Therefore, NO+ PTR-ToF-MS was only used in this study for a supplementary measurement of higher alkanes (≥ C8) with a well-established quantitative method.
C8-C15 alkanes were calibrated using a custom cylinder (Linde Gas North America LLC, USA), and the sensitivities of C16-C21 alkanes were assumed to be the same as the sensitivity of C15 n-alkane. The error caused by this assumption was considered to be minimal because the degree of fragmentation, a parameter inversely proportional to the sensitivity of higher alkanes, was similar between C16-C21 alkanes and C15 alkanes (∼ 20 %).

GC-MS/FID
To address the limitations of PTR technologies (Arnold et al., 1998; Gueneron et al., 2015), GC-MS/FID with a cryogen-free preconcentration device was also deployed. A total of 57 hydrocarbons and 12 oxygenated organic compounds were measured by GC-MS/FID and quantified using gas standards (Linde Gas North America LLC, USA).

Overlapping species
In total, 32 species measured by GC-MS/FID overlapped with 14 protonated ions measured by H3O+ PTR-ToF-MS (the pink shaded bar in Fig. 2), including 16 aromatic hydrocarbons, 10 carbonyls and 6 biogenic volatile organic compounds (BVOCs), denoting the species generally emitted and formed from natural sources in the atmosphere. The concentrations of overlapping species in all samples were carefully compared and showed good consistency between the two instruments (slope = 1.00 ± 0.15, 0.94 < r < 0.99) (Fig. S5). Considering the better performance of GC-MS/FID in isomer differentiation, GC-MS/FID data were given precedence.

[Figure 2 caption, partially recovered: ... (Cai et al., 2019; Stockwell et al., 2015). All the quantified species were calculated by standard matter. The theoretically quantified species were measured by H3O+ PTR-ToF-MS using the calculation method proposed by Sekimoto et al. (2017) to determine the relationship between ROG sensitivity and kinetic rate constants for proton transfer reactions of H3O+ with ROGs. Among the theoretically quantified species, alcohols and acids (-OH and -COOH), species with more than two oxygen atoms (usually two functional groups), and components containing -NOn had relatively large uncertainty due to potential loss during storage. The unknown species were detected by H3O+ PTR-ToF-MS and were semi-quantified using the sensitivity of acetone.]
Although C8-C10 alkanes were measured by both GC-MS/FID and NO+ PTR-ToF-MS, the two measurements represent different quantities. The concentration of higher alkanes from NO+ PTR-ToF-MS should be regarded as the summed concentration of n-alkanes and branched alkanes that share the same chemical formula. The C8-C10 alkanes measured by GC-MS/FID were mainly normal alkanes plus a few branched alkanes. Following the principle of preserving as many species as possible, the GC data were used to retain the isomer speciation. Meanwhile, after subtracting the concentrations of the relevant GC species, the remaining NO+ PTR-ToF-MS signal was also retained as an independent species.
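The overlap rules described in this section can be summarized in a short sketch: GC-MS/FID species take precedence over overlapping H3O+ ions, and the NO+ alkane sum at each carbon number keeps only what the GC-resolved isomers do not already account for. All names and concentrations are hypothetical, and clipping negative residuals to zero is an assumption of this sketch rather than a step stated in the text.

```python
def merge_measurements(gc, ptr_h3o, overlap_ions, no_alkane_sums, gc_alkanes):
    """Combine three instrument data sets without double counting.

    gc:             {species: concentration} from GC-MS/FID
    ptr_h3o:        {ion: concentration} from H3O+ PTR-ToF-MS
    overlap_ions:   H3O+ ions already covered by GC species
    no_alkane_sums: {carbon number: summed alkane concentration} from NO+ PTR-ToF-MS
    gc_alkanes:     {carbon number: [GC species names with that formula]}
    """
    merged = dict(gc)  # GC data take precedence for overlapping species
    for ion, conc in ptr_h3o.items():
        if ion not in overlap_ions:
            merged[ion] = conc
    for carbon, total in no_alkane_sums.items():
        resolved = sum(gc.get(name, 0.0) for name in gc_alkanes.get(carbon, []))
        # Assumption: negative residuals from measurement noise are set to zero.
        merged[f"C{carbon} alkanes (other isomers)"] = max(total - resolved, 0.0)
    return merged

# Hypothetical concentrations in ppbv.
gc = {"benzene": 2.0, "n-octane": 0.5}
ptr_h3o = {"C6H6H+": 2.1, "C4H4OH+": 1.0}
merged = merge_measurements(
    gc, ptr_h3o,
    overlap_ions={"C6H6H+"},
    no_alkane_sums={8: 0.8, 11: 0.3},
    gc_alkanes={8: ["n-octane"]},
)
for name, conc in merged.items():
    print(name, round(conc, 3))
```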

Formaldehyde
Formaldehyde, a very important species from combustion (Zarzana et al., 2017) that was not measured in this study (Sect. 2.2.4), was assumed to account for ∼ 7 % and ∼ 5 % of all discussed species for coal and biomass combustion, respectively, according to previously reported results (Cai et al., 2019; Stockwell et al., 2015, 2016). To include the contribution of formaldehyde in the near-complete speciation of ROGs, the emission ratio of formaldehyde to a reference species in the emissions was used. Generally, any of the overlapping species measured in this and the previous studies could serve as the reference species because the relative contributions of all the overlapping species agreed well between the current study and the previous studies, as presented in Fig. S6. Benzene was chosen for normalization because, in the emissions from all types of fuel combustion, benzene was one of the most abundant aromatics, which were the major overlapping species between the current and previous studies. Specifically, the emission ratios of formaldehyde to benzene (ER_HCHO,ref) from anthracite coal (2.13 g g−1 benzene) and straw (2.85 g g−1 benzene) combustion were obtained from Cai et al. (2019) and Stockwell et al. (2016). Considering that the emission ratios of the species covered by both the previous studies and the current work were in good agreement, as shown in Fig. S6, the mass fraction of formaldehyde (f_HCHO,cal) in ROG emissions from combustion in this work was calculated using the previously reported ER_HCHO,ref and the currently measured mass fraction of benzene (f_benzene) as follows:

$$f_{\mathrm{HCHO,cal}} = \mathrm{ER}_{\mathrm{HCHO,ref}} \times f_{\mathrm{benzene}} \qquad (1)$$

Finally, all the species included in this study are sketched in Fig. 2, together with their mass fractions in ROG emissions from residential combustion, which are discussed in detail below.
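The formaldehyde scaling described in this section can be illustrated numerically. The sketch assumes that the formaldehyde mass fraction is the product of the reference emission ratio and the measured benzene mass fraction; the emission ratios are the values quoted in the text, whereas the benzene mass fraction of 0.03 is a hypothetical input chosen only for illustration.

```python
# Emission ratios of formaldehyde to benzene (g per g benzene) quoted in the text.
ER_HCHO_REF = {"anthracite": 2.13, "straw": 2.85}

def formaldehyde_fraction(fuel, f_benzene):
    """Estimate the formaldehyde mass fraction from the benzene mass fraction."""
    return ER_HCHO_REF[fuel] * f_benzene

# Hypothetical benzene mass fraction of 3 % in the ROG emissions.
print(round(formaldehyde_fraction("anthracite", 0.03), 4))
print(round(formaldehyde_fraction("straw", 0.03), 4))
```

With this hypothetical input the anthracite result is about 6 %, the same order as the ∼ 7 % share assumed for coal combustion in the text.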
Combining 965 unique masses detected by H3O+ PTR-ToF-MS with 69 species detected by GC-MS/FID, 14 higher alkanes detected by NO+ PTR-ToF-MS, and formaldehyde from references, a total of 1049 species were used in this study, including 27 alkanes, 9 alkenes, 1 alkyne, 16 aromatics, 6 BVOCs, 9 polycyclic aromatic hydrocarbons (PAHs), 14 higher alkanes, 16 carbonyls, 10 furans (furan and its homologues and derivatives), 9 phenols (phenol and its homologues and derivatives), 8 nitrogen-containing species and 924 other species with large uncertainty. Among them, only the 125 species with uncertainty below 50 % were included in the speciation of ROGs from residential combustion emissions. Table S2 lists the 125 species and their uncertainties. They contributed 89 % ± 20 % and 92 % ± 34 % of the total ROGs from residential coal and biomass combustion, respectively. More details for each kind of sample are shown in Fig. S7.
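The species counts quoted in the Methods can be cross-checked with simple bookkeeping. Every number below is taken from the text; the only inference is that the 965 unique H3O+ masses correspond to the 979 retained masses minus the 14 ions that overlap with GC-MS/FID species.

```python
# Counts quoted in the text.
h3o_valid_masses = 991       # peaks extracted from the H3O+ mass spectrum
fragment_ions = 12           # aliphatic hydrocarbon fragments excluded
overlap_ions = 14            # H3O+ ions already covered by GC-MS/FID species
gc_species = 69              # 57 hydrocarbons + 12 oxygenated compounds
no_plus_alkanes = 14         # higher alkanes from NO+ PTR-ToF-MS
formaldehyde = 1             # taken from references

h3o_used = h3o_valid_masses - fragment_ions      # masses retained after exclusion
h3o_unique = h3o_used - overlap_ions             # inferred: masses not duplicated by GC
total_species = h3o_unique + gc_species + no_plus_alkanes + formaldehyde

categories = {
    "alkanes": 27, "alkenes": 9, "alkyne": 1, "aromatics": 16, "BVOCs": 6,
    "PAHs": 9, "higher alkanes": 14, "carbonyls": 16, "furans": 10,
    "phenols": 9, "nitrogen-containing": 8, "other (large uncertainty)": 924,
}
well_constrained = sum(n for name, n in categories.items()
                       if name != "other (large uncertainty)")

print(h3o_used, h3o_unique, total_species, well_constrained)
```

The category sum equals the stated total of 1049, the well-constrained categories add up to the 125 species used for the profiles, and 924 equals the 63 high-uncertainty species plus the 861 unknown masses.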

Results and discussions
A near-complete speciation of ROGs from residential combustion
ROG compositions emitted from typical residential combustion using two types of coal (anthracite and briquettes) and four types of biomass (wood, corncob, corn straw and bean straw) are shown in Fig. 3. The measured ROG profiles for each kind of fuel were well correlated (R > 0.6) (Fig. S8), and the average result is used here. Generally, ROGs emitted from residential combustion can be divided into three groups based on elemental composition: hydrocarbons, oxygenated species and nitrogen-containing species. Differing from previous studies, which mainly stressed the dominant role of hydrocarbons (Mo et al., 2016; Stockwell et al., 2015), the contribution of oxygenated species (36.8 %-56.8 %) was comparable with that of hydrocarbons (40.8 %-48.7 %) in this study. This was expected because 24 more oxygenated species, mainly furans, phenols and carbonyls, were measured by H3O+ PTR-ToF-MS in our study; these species were un- and under-characterized in previous studies using GC methods. Besides hydrocarbons and oxygenated species, nitrogen-containing species, mainly acetonitrile and acrylonitrile, also played a considerable role in ROG emissions from residential combustion, with proportions ranging from 5.7 % to 14.5 %, as previously reported (Cai et al., 2019).
Here, species that were previously un- and under-characterized by GC methods and could be measured only by H3O+/NO+ PTR-ToF-MS in this study are defined as newly identified species; as a result, 55 of the 125 species were newly identified. As shown in Fig. 3, these newly identified species, mainly furans and phenols, contributed 44.3 % ± 11.8 % of the total ROGs for coal emissions and 22.7 % ± 3.9 % of the total ROGs for biomass emissions. We also compared our results with previous reports; for comparison, the previous speciation was scaled by the total fraction of the previously reported species in the total ROGs measured in this study. As shown in Figs. 3 and S6, the fractions of the species reported in previous studies were comparable to the present result. In particular, as Fig. 3c shows, the present composition of ROGs from residential wood combustion was close to that of Black Spruce combustion simulated in the laboratory and measured by multiple advanced trace-gas instruments, which reported 464-551 species (∼ 173 molecules) in all (Hatch et al., 2017). This further confirmed that the ROG characterization obtained with the combination of H3O+/NO+ PTR-ToF-MS and GC-MS/FID was nearly complete. Our study underscores the importance of completely speciated measurements of ROG emissions from residential combustion, especially coal combustion.
Y. Gao et al.: Underestimated reactive organic gases from residential combustion

A large difference was observed between the ROG speciation of coal and biomass combustion, but the differences among different types of coal or biomass were not significant, as shown in Fig. 3. Specifically, the hydrocarbons emitted from biomass combustion were dominated by alkenes, mainly ethylene and propene, while those from coal combustion were mostly alkanes. In particular, coal combustion emitted considerable amounts of higher alkanes with 8-21 carbon atoms and gaseous PAHs (mainly with two to three benzene rings), primarily generated by pyrolysis of the volatile matter in coal (Du et al., 2020); these accounted for 8.3 %-14.8 % of ROGs, much higher than the minor fractions (0.4 %-2.4 %) in biomass combustion emissions. In terms of oxygenated species, coal combustion emitted considerable furans (16.8 % ± 3.2 %) and phenols (6.1 % ± 1.5 %), mainly formed through pyrolysis of polymers in coal (Morgan and Kandiyoti, 2014), which together played a role comparable to that of carbonyls (26.9 % ± 6.8 %) in ROGs. In comparison, carbonyls (40.6 % ± 6.6 %) were the dominant oxygenated species in ROG emissions from biomass combustion, mainly originating from products of biomass pyrolysis and pyrosynthesis (Morgan and Kandiyoti, 2014). A slightly higher proportion of phenols and furans was observed from wood combustion (12.8 % ± 3.0 %) than from straw combustion (8.7 % ± 2.8 %), possibly resulting from the higher lignin content of wood (Collard and Blin, 2014). Considerable terpenes were also observed in ROG emissions from residential coal (1.5 % ± 0.2 %) and biomass (3.2 % ± 0.5 %) combustion.
The ROG composition and the proportions of individual species in the source profiles obtained in the present study were mainly derived from measurements during the flaming stage. Considering the difference between the flaming stage and the whole combustion cycle, the potential bias of the present results should be further discussed. A re-analysis of the data obtained from the authors of Cai et al. (2019) showed that the ROG composition from coal combustion in the flaming stage agreed well with that of the whole cycle, which was expected given the small changes in ROG composition throughout the first three stages, which emitted 96 % of ROGs (Fig. S9). Similar results (Fig. S10) were obtained from the re-analysis of the emission data from biomass combustion reported by Koss et al. (2018). Furthermore, the proportions of individual species in the flaming stage deviated from those of the whole cycle by −50 % to 22 % for coal and straw combustion (Fig. S11). Gilman et al. (2015) carefully compared discrete emission ratios (ERs) during flaming and smoldering combustion with fire-integrated ERs of the whole cycle, and the average slope and standard deviation of discrete versus fire-integrated ERs for select ROGs from 56 biomass burns in the USA was 1.2 ± 0.2. In summary, the bias of the fractions of species categorized by functional group from both coal and biomass combustion obtained in our study was negligible, and the bias of the proportions of individual species from both coal and biomass was generally estimated to be within 50 %.

SOAP and OHR underestimation from newly identified species
To further understand the role of residential-combustion-emitted ROGs in atmospheric chemistry, the hydroxyl radical reactivity (OHR) and secondary organic aerosol formation potential (SOAP) per unit mass (or concentration) of ROGs emitted from residential combustion were calculated based on the source profiles. OHR is defined here as the sum of the hydroxyl radical (OH) reactivity of each species, calculated as the product of the weight percentage of each ROG species (W_ROGi) in emissions from residential combustion and the corresponding OH reaction rate constant (k_OH+ROGi) (Carter, 2008; Koss et al., 2018), as presented in Eq. (2):

$$\mathrm{OHR} = \sum_i W_{\mathrm{ROG}_i} \times k_{\mathrm{OH+ROG}_i} \qquad (2)$$

SOAP is the sum of the SOAP of each species, calculated by multiplying the proportion of each ROG species by its respective SOA yield (Y_ROGi), as presented in Eq. (3):

$$\mathrm{SOAP} = \sum_i W_{\mathrm{ROG}_i} \times Y_{\mathrm{ROG}_i} \qquad (3)$$
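Both quantities defined above are weighted sums over the source profile, which the following sketch makes concrete. The three-species profile, the rate constants and the yields are hypothetical placeholders in arbitrary units; a real calculation would use the 125-species profile, literature OH rate constants with a conversion between mass and number concentration, and the corrected SOA yields of Table S3.

```python
def weighted_sum(weights, coefficients):
    """Sum of weight * coefficient over all species in the profile."""
    missing = set(weights) - set(coefficients)
    if missing:
        raise KeyError(f"no coefficient for: {sorted(missing)}")
    return sum(w * coefficients[name] for name, w in weights.items())

# Hypothetical source profile (mass fractions summing to 1).
profile = {"species_A": 0.5, "species_B": 0.3, "species_C": 0.2}

# Hypothetical per-species OH reactivities (arbitrary units) and SOA yields.
k_oh = {"species_A": 1.0, "species_B": 2.0, "species_C": 5.0}
soa_yield = {"species_A": 0.0, "species_B": 0.1, "species_C": 0.4}

ohr = weighted_sum(profile, k_oh)
soap = weighted_sum(profile, soa_yield)
print(round(ohr, 3), round(soap, 3))
```

Raising an error for a species without a coefficient, rather than silently skipping it, avoids quietly underestimating either sum when a yield or rate constant is missing.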
Among the 80 potential SOA precursors in Table S3, SOA yields from previous chamber studies have been published for 44 species, while the SOA yields of nearly half of the potential precursors are still unknown. SOA yields in the real atmosphere depend on the level of nitrogen oxides (NOx), the total organic aerosol (OA) mass loading, temperature and other factors, which modulate the chemical reaction pathways and phase partitioning. The SOA yields from previous chamber studies, mainly measured under high-NOx conditions ([NOx] > 1 ppb) except for benzenediols (C6H6O2) and C2 phenols (C8H10O), were scaled to ambient conditions ([OA] = 15.0 µg m−3, T = 25 °C) (Gao et al., 2019) based on the two-product model (Ng et al., 2007; Li et al., 2016) and further corrected for vapor wall losses (Zhang et al., 2014). Table S3 summarizes the corrected SOA yields applied in this study and some details of the chamber experiments (e.g., the chamber yields and the numbers of experiments). Potential precursors with unknown SOA yields include furans, phenols, three-ring PAHs, terpenes except for α-pinene, and alkanes with more than six carbon atoms, especially branched alkanes. The yields of alkanes containing 13, 14 and 16 carbon atoms were estimated using the reported two-product parameters (Presto et al., 2010), which were derived from the experimental yields of C12 alkanes and C17 alkanes. SOA yields for other potential precursors were assumed to equal the corrected SOA yield of species with a similar structure or the same number of carbon atoms applied in this study. The overall uncertainty of the estimated SOAP is related to the uncertainty of the SOA yields and of the species' proportions in the source profiles. The yield uncertainty for corrected SOA yields from publications with at least two experiments was estimated to be within 11 % according to the bias between the two-product model fitting results and the experimental yields.
For other species with a published SOA yield from only a single experiment or with assumed SOA yields, the yield uncertainty has been estimated as ∼ 50 % (Bruns et al., 2016), and this value was adopted in this study. The uncertainty of the species' proportions was 20 % and 34 % in the coal and biomass combustion profiles, respectively, as mentioned in Sect. 2.3.4. Thus, the total uncertainty of SOAP, calculated using an error propagation function, was 32 % and 41 % for coal and biomass combustion, respectively.
Figure 4 shows the OHR and SOAP per unit mass (or concentration) of ROG emissions. The OHR per unit mass was quite similar for coal and biomass emissions (0.14-0.16 s−1 µg−1 m3) but differed in composition, as expected from the differences in their ROG compositions. The OHR was dominated by oxygenated species (39.6 %-73.7 %) and alkenes (15.0 %-48.2 %), and the contribution of other species was within the range of 9.3 %-17.1 %. The newly identified ROG species dominated the OHR of coal combustion, with fractions of 64.2 % ± 7.8 % and 54.6 % ± 9.3 % for anthracite and briquette combustion, respectively, due to the large contribution of furans, phenols, PAHs and higher alkanes in ROGs. In comparison, the previously reported species contributed more to the OHR of biomass combustion than the newly identified species did. The ratio of OHR between newly identified and previously reported ROGs was 1.20-1.80 for coal and 0.22-0.51 for biomass, much higher than the ratios of their emissions (0.79-0.81 for coal and 0.20-0.36 for biomass). This means that, without the newly identified species, the OHR of ROG emissions from residential coal and biomass combustion would be underestimated by 59.4 % ± 4.8 % and 26.2 % ± 6.8 %, respectively.
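The step from a ratio of newly identified to previously reported species to an underestimation percentage is simple arithmetic: if the ratio is r, the share of the total that is missed is r / (1 + r). The sketch below applies this to the OHR ratio ranges quoted above; the function name is illustrative and not taken from the study.

```python
def missed_fraction(ratio_new_to_reported: float) -> float:
    """Fraction of the total missed when only previously reported species are counted.

    If new / reported = r, then new / (new + reported) = r / (1 + r).
    """
    if ratio_new_to_reported < 0:
        raise ValueError("ratio must be non-negative")
    return ratio_new_to_reported / (1.0 + ratio_new_to_reported)


# OHR ratio ranges (newly identified : previously reported) quoted in the text.
ohr_ratio_ranges = {"coal": (1.20, 1.80), "biomass": (0.22, 0.51)}

for fuel, (low, high) in ohr_ratio_ranges.items():
    print(f"{fuel}: {missed_fraction(low):.1%} to {missed_fraction(high):.1%} of OHR missed")
```

For coal, the two bounds come out at roughly 55 % and 64 %, in line with the briquette and anthracite fractions reported above. The same conversion can be applied to the emission and SOAP ratios.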
The SOAP per unit mass of ROG emissions from coal combustion was 0.078-0.085 µg µg−1, much higher than that from biomass combustion (0.016-0.035 µg µg−1). Nevertheless, for all samples, newly identified ROGs accounted for over 70 % of the SOAP. SOAP was dominated by newly identified oxygenated species like phenols, which contributed 47.6 % ± 12.4 % and 56.7 % ± 7.0 % to the SOAP of emissions from coal and biomass combustion, respectively; higher alkanes and PAHs also played important roles in the SOAP of emissions from coal combustion. The ratios of the SOAP derived from newly identified ROGs to that from previously reported ROGs were 7.8-8.8 and 2.2-2.7 for coal and biomass, respectively, much higher than the corresponding ratios of mass percentages. These results indicate that, for both coal and biomass combustion, measuring the newly identified ROGs would greatly affect SOA estimation. A field study has found that newly identified ROGs like higher alkanes and PAHs contributed more than 60 % of SOA formation from measured precursors in ambient air (C. . Our study of ROG emissions could contribute to explaining the high SOA formation in the atmosphere to some extent. In other words, failure to include newly identified ROGs in emission inventories and SOA models could lead to significant underestimation of the residential contribution to SOA production.

ROG emissions from residential combustion in China
ROG emissions can generally be calculated by multiplying the activity data by the emission factors, which were not measured in this study. However, the emission factors of most of the common ROGs, like hydrocarbons, in emissions from residential combustion have been reported in previous studies (Cai et al., 2019; Stockwell et al., 2015). Hence, the emission factors (EFs) of the newly identified ROG species could be estimated from the reported EF of a previously reported species combined with the ERs from residential combustion obtained in this study. Here, benzene and its reported EF were used for this purpose, as benzene was the major overlapping species in all the studies, with a high abundance and a relatively low uncertainty in combustion emissions. Major studies of the EF of benzene from residential combustion were further reviewed and are listed in Table S4. Considering that the coal samples tested in this study were anthracite and briquette coal, the present study adopted the latest reported EF of benzene from anthracite coal combustion by Cai et al. (2019), which agreed with the other reported values for anthracite/briquette coal combustion (Tsai et al., 2003) and was 1-2 orders of magnitude lower than those for bituminous coal combustion (C. Liu et al., , 2015; Cai et al., 2019; Tsai et al., 2003; Wang et al., 2013). For straw combustion, the present study used the median value of the reported EFs of benzene from straw combustion in China, 284 mg kg−1, which was derived from the simulated real-world combustion in the FLAME-4 laboratory campaign (Stockwell et al., 2015). Further considerations regarding the selection of the reported EF of benzene are described in the Supplement. We also tested other species with reported EFs (Fig. S12).
The estimated EFs of the newly identified ROGs did not differ significantly among the different tests (−39 % to 4 % for straw and 6 % to 26 % for coal), which further confirmed that our results were comparable with the previous studies but with more ROG species measured, as shown in Fig. S12. To relate the ROG EFs, especially those of previously unmeasured or rarely measured species, to the benzene EF, the ER of each ROG species to benzene was taken as the ratio of their concentrations in the sample, and the average ER over the different samples of each fuel type was used in this study, as listed in Table S5. The key assumption in this approach was that the ERs obtained in this study were consistent with those of the previous studies. This consistency was confirmed by the good correlation (R = 0.73 for straw combustion and R = 0.82 for coal combustion) of the ERs of the overlapping species between our study and the previous studies, as shown in Fig. S6a and e.
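The benzene-anchored scaling described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: it assumes the ERs are mass-concentration ratios to benzene (ratios of mixing ratios would additionally need scaling by molecular weight), and the species ER values are hypothetical placeholders rather than entries from Table S5. Only the straw benzene EF of 284 mg kg−1 comes from the text.

```python
from typing import Dict


def estimate_efs(ef_benzene_mg_per_kg: float,
                 er_to_benzene: Dict[str, float]) -> Dict[str, float]:
    """Scale mass-based emission ratios (species : benzene) by the benzene EF.

    Returns an EF in mg per kg of fuel for every species in er_to_benzene.
    """
    if ef_benzene_mg_per_kg < 0:
        raise ValueError("benzene EF must be non-negative")
    return {species: er * ef_benzene_mg_per_kg
            for species, er in er_to_benzene.items()}


EF_BENZENE_STRAW = 284.0  # mg kg-1, the straw value adopted in the text

# Hypothetical ERs, for illustration only (not measured values).
hypothetical_ers = {"phenol": 0.50, "furfural": 0.25, "naphthalene": 0.10}

for species, ef in estimate_efs(EF_BENZENE_STRAW, hypothetical_ers).items():
    print(f"{species}: {ef:.1f} mg/kg")
```

Because every EF inherits the benzene EF as a common multiplier, any bias in the chosen benzene value propagates proportionally to all estimated species, which is why the choice of reference EF matters.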
The estimated EFs of anthracite and straw combustion were used below to estimate the ROG emissions of residential coal and straw combustion in mainland China. Notably, the available data on the contributions of bituminous and anthracite coal were from the rural energy survey conducted about 10 years ago (2013-2014), which indicated that bituminous coal contributed 97 % and 55 % of the residential coal consumption in Baoding (Zhi et al., 2017, 2015) and Beijing (Zhao et al., 2015), respectively. Since 2013, China has been carrying out its toughest ever clean energy substitution and vigorously replacing bituminous coal with anthracite in response to the cleaning action plan in the residential sector. This was further strengthened from 2017 to 2020 during the 3-year battle against air pollution (e.g. Action Plan for Clean and Efficient Utilization of Coal (2015-2020), http://zfxxgk.nea.gov.cn/, last access: 30 April 2023). The use and sale of bituminous coal were generally not allowed (Luo, 2019). Thus, a large decrease in the use of bituminous coal in the residential sector in China can be expected, although no updated statistical data are available. This study therefore assumed that anthracite is the main residential coal type in order to roughly estimate ROG emissions in China.
Accordingly, the national ROG emissions from residential combustion were estimated by combining the EFs with the residential coal consumption and crop straw combustion data in China. Specifically, the data on residential coal consumption from 2010 to 2019 were from the China Energy Statistical Yearbook (National Bureau of Statistics, 2010-2022), which included the data for each province in mainland China in each year. The data on crop straw combustion from 2010 to 2019 (Table S6) were from the Report of Prospects and Investment Strategy Planning Analysis on China Straw Refuse Treatment Industry (2022-2027) (Qianzhan Industrial Research Institute, 2022), which only reported the total amount for the whole of mainland China and included both household and field combustion. The provincial data on crop straw combustion in 2017 (Table S7), used to study the spatial distribution in this study, were from the Second National Pollution Source Census Bulletin (Ministry of Ecology and Environment of the People's Republic of China et al., 2020).
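The inventory calculation described above is the product of activity data and an emission factor, summed over regions. The following minimal sketch shows the unit bookkeeping only; the province names, fuel amounts and the total ROG EF are hypothetical placeholders, not values from the yearbook, Table S6 or Table S7.

```python
from typing import Dict


def emissions_tonnes(fuel_burned_kt: float, ef_g_per_kg: float) -> float:
    """Emissions in tonnes from fuel burned (kt) and an emission factor (g per kg).

    1 kt of fuel is 1e6 kg and 1 t of emissions is 1e6 g, so kt x g/kg gives t.
    """
    if fuel_burned_kt < 0 or ef_g_per_kg < 0:
        raise ValueError("activity and EF must be non-negative")
    return fuel_burned_kt * ef_g_per_kg


# Hypothetical activity data (kt of fuel per year) and total ROG EF (g per kg).
hypothetical_activity_kt: Dict[str, float] = {"Province A": 1200.0, "Province B": 450.0}
HYPOTHETICAL_TOTAL_ROG_EF = 2.0

by_province_t = {name: emissions_tonnes(kt, HYPOTHETICAL_TOTAL_ROG_EF)
                 for name, kt in hypothetical_activity_kt.items()}
national_kt = sum(by_province_t.values()) / 1000.0

print(by_province_t)
print(f"national total: {national_kt:.2f} kt")
```

Keeping the provincial values before summing allows the same numbers to serve both the spatial distribution and the national total.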
The spatial and temporal distributions of ROG emissions from residential combustion are presented in Fig. 5. The total emissions of ROGs from residential coal combustion and crop straw combustion were 14 and 4384 kt in 2019, respectively, and, as expected, these values were underestimated by 44.3 % ± 11.8 % and 22.7 % ± 3.9 %, respectively, because fewer species were included previously. Unexpectedly, the ROG emissions from crop straw combustion, which included those from both household and field combustion, were 2 orders of magnitude higher than those from coal combustion. Even if the emission factors of bituminous coal were applied, the ROG emissions from residential coal combustion in China would increase by approximately 1-2 orders of magnitude and would still be lower than the ROG emissions from biomass combustion. More refined and up-to-date energy consumption statistics are needed as China's energy structure is adjusted. Notably, the straw combustion emissions were stable after 2017 compared with the gradual decrease from 2010 to 2017, mainly due to the limitation of straw utilization. In comparison, ROG emissions from residential coal combustion began to decrease after 2017, benefiting from the clean heating action in the north of China (National Development and Reform Commission, 2017). Spatially, the hotspots of ROG emissions from residential coal combustion were mainly in the North China Plain (NCP), while those of straw combustion were mainly in the main food-production bases of China like northeastern China; the southern NCP; and Jiangsu, Anhui and Hunan provinces.

Conclusions
In this study, a near-complete chemical description of ROGs emitted from residential combustion in rural households of China was developed by quantifying all signals with H3O+ PTR-ToF-MS and supplementing C2-C22 aliphatic hydrocarbons with GC-MS/FID and NO+ PTR-ToF-MS. Within this near-complete description of ROGs, 55 species that were un- and under-characterized in previous studies using GC methods were analyzed intensively by H3O+/NO+ PTR-ToF-MS, mainly including oxygenated species (carbonyls, furans, phenols) and higher hydrocarbons (PAHs, higher alkanes) with more than eight carbon atoms, as well as nitrogen-containing compounds. Compared with previous comprehensive measurements that relied on more instruments, the combination of PTR-ToF-MS and GC-MS/FID was labor-saving and could further reduce the measurement uncertainties arising from the synthesis of measurement data, as fewer kinds of instruments were involved.
For the nearly complete ROGs, divided into three categories by elemental composition, oxygenated species played a major role similar to that of hydrocarbons, and nitrogen-containing species, dominated by acetonitrile and acrylonitrile, were also considerable in ROG emissions from residential combustion. In particular, coal combustion emitted considerable amounts of higher alkanes with 8-21 carbon atoms, gaseous PAHs (mainly with two to three benzene rings), furans and phenols; in contrast, biomass combustion emitted more carbonyls and terpenes.
Considering the newly discovered species, the ROG emissions from coal and biomass combustion were found to have been underestimated by approximately half and a quarter, respectively. Combined with the spatial-temporal consumption of residential coal and biomass in China, ROG emissions of residential combustion were estimated. The ROG emissions from straw combustion were 2 orders of magnitude higher than those from coal combustion, with a negligible decline in recent years owing to the limited straw utilization ratio, which suggested that biomass combustion would be the only important residential emission source with the continuous replacement of residential coal in rural China. Given that the newly identified species are more reactive or have higher SOA yields, amplified underestimation of OHR and SOAP was observed for both coal combustion (59.4 % ± 4.8 % and 89.2 % ± 1.0 %) and biomass combustion (26.2 % ± 6.8 % and 70.3 % ± 1.6 %). These results highlighted the importance of the completely speciated measurement of the ROG emissions from residential combustion.
[Figure caption fragment: The blank parts on the map indicate the provinces with missing data. The units "t" and "kt" refer to tons and kilotons, respectively.]
Author contributions. All authors contributed to the manuscript and have given approval of the final version. HW and CH designed the study. HW and YG performed the data analyses and wrote the manuscript. YG, LY and SJ conducted the experiment. BY, GS, YL, QW, DDH, SZ and SL contributed to the interpretation of results. LZ and AK revised the manuscript. ST assisted with sampling.
Competing interests. The contact author has declared that none of the authors has any competing interests.

Disclaimer.
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Financial support. This work was supported by the National Natural Science Foundation of China (grant no. 42175135), the National Key R&D Program of China (grant no. 2022YFE0136200) and the Science and Technology Commission of the Shanghai Municipality (grant no. 20ZR1447800).
Review statement. This paper was edited by Theodora Nah and reviewed by two anonymous referees.