Elucidating the critical oligomeric steps in secondary organic aerosol and brown carbon formation

. Small α-dicarbonyls represent the major precursors for secondary organic aerosol (SOA) and brown carbon (BrC) in the atmosphere, but the chemical mechanisms leading to their formation remain unclear. Here we elucidate the fundamental kinetics and mechanisms for aqueous-phase oligomerization of glyoxal (GL) using quantum chemical and 15 kinetic rate calculations. Our results identify several essential isomeric processes for GL, including protonation to yield diol/tetrol and carbenium ions, nucleophilic addition of carbenium ions to diol/tetrol as well as to free methylamine/ammonia (MA/AM), and deprotonation to propagate oligomers and N-heterocycles. Both protonation and nucleophilic addition occur without activation barriers and are dominantly driven by electrostatic attraction. Deprotonation proceeds readily via water molecules in the absence of MA/AM but corresponds to the rate-limiting step for N-containing cationic intermediates to 20 yield N-heterocycles. On the other hand, the latter occurs readily via a catalytic process by acidic anions (eg., SO 42- ). A carbenium ion-mediated reaction rate of GL is 4.62 × 10 -3 s -1 under atmospheric conditions, in good agreement with the experimental data. Our results provide essential mechanistic and kinetic data for accurate assessment of the role of small α-dicarbonyls in SOA and BrC formation.

Glyoxal (GL), an important and simple carbonyl compound, is originated from the gas-phase oxidation of biogenic isoprene and anthropogenic aromatics (Altieri et al., 2006;Volkamer et al., 2001;Gomez Alvarez et al., 2007). The global source of GL is estimated to be 45-56 Tg yr -1 (Fu et al., 2008;Myriokefalitakis et al., 2008), with a predicted contribution of 40 2.6 Tg C yr -1 to global SOA mass (Fu et al., 2008). GL and methylglyoxal contribute to 16% of SOA mass during a severe haze episode in Hebei province, China (Li et al., 2021a). In Mexico City, GL contributes at least 15% of SOA mass (Volkamer et al., 2007), and in the Pearl River Delta, China, 21% of SOA formation originates from the heterogeneous reactions of GL and methylglyoxal (Ling et al., 2020). Hence, GL is a primary contributor to the rapid and efficient formation of SOA under urban environments. The aqueous-phase reaction of GL starts with hydration, subsequently forming 45 several high-molecular-weight oligomers, such as dimers, trimers, tetramers, and pentamers (Hastings et al., 2005;Loeffler et al., 2006;Gomez et al., 2015;Kua et al., 2008;Avzianova and Brooks, 2013). Previous theoretical studies have suggested that the direct hydration of GL is thermodynamically and kinetically unfeasible (Shi et al., 2020;Ji et al., 2020;Li et al., 2021b). The rapid growth of SOA was observed in sulfuric acid nanoparticles (Jang et al., 2002;Huang et al., 2016;Surratt et al., 2007;Liggio et al., 2005), while several other studies have revealed little effect of acids (such as sulfuric acid) on the 50 formation of GL oligomers (Loeffler et al., 2006;Kroll et al., 2005;Peltier et al., 2007). Hence, the aqueous-phase chemical reaction mechanism of GL and its role in SOA formation are still unclear.
On the other hand, brown carbon (BrC) is also generated by the aqueous-phase reaction of GL in the presence of amines or ammonium in the troposphere (Li et al., 2021b;Shapiro et al., 2009;Galloway et al., 2009;Lee et al., 2013;Maxut et al., 2015;Marrero-Ortiz et al., 2019;Tuguldurova et al., 2019;Kua et al., 2011). Zhang and co-workers have revealed a slight 55 amines contributes to the formation of imidazole in cloud processing (Lian et al., 2020). Most previous studies have shown that imidazole and high-molecular-weight light-absorbing C-N compounds are produced by different atmospheric chemical reactions of GL with amines/ammonium, but the reaction mechanism is yet to be clarified.
In this work, the aqueous-phase chemistry of GL in the absence and presence of methylamine/ammonia (MA/AM) was systematically investigated using quantum chemical calculations. The fundamental chemical mechanisms of the formation of 70 oligomers and N-heterocycles were investigated. The chemical composition and the product distribution were also estimated and characterized by conventional transition state theory considering solvent cage and diffusion-limited effects (Methods).
The aerosol growth rate for the heterogeneous chemistry of GL was also evaluated under different atmospheric conditions. We also paid a special attention on the key factors in the formation of SOA and BrC from GL to provide insight into the important role of the aqueous-phase chemistry of GL. 75

Methods
All quantum chemical calculations were performed be means of Gaussian 09 program (Frisch et al., 2009). Geometry optimization of all stationary points (SPs) such as reactants, transition states (TSs), intermediates, and products, was calculated using the M06-2X functional (Zhao and Truhlar, 2008) with the 6-311G(d,p) basis set (Ji et al., 2017;Ji et al., 2020), i.e., the M06-2X/6-311G(d,p) level, which has shown good performance in describing the geometrical optimization of 80 the heterogeneous reactions of small α-dicarbonyls (Ji et al., 2020;Shi et al., 2020). Thermodynamic contributions and harmonic vibrational frequencies were calculated at the same level as that for geometry optimization to identify all SPs as either a TS (exactly with only one imaginary frequency) or the minima (zero imaginary frequency). Intrinsic reaction coordinate (IRC) calculations were implemented to construct the minimum energy pathway (MEP), verifying that each TS accurately connected the corresponding reactants and products. At the same level, TS was searched by examining the SP 85 using the TS keyword in geometry optimization, while the absence of a TS was confirmed if no energy exceeded the bond dissociation energy along the reaction coordinate (Ji et al., 2020). The TSs for four deprotonation pathways in MG+MA/AM reaction systems were identified at the M06-2X/6-311G(d) level because none of them were searched at the M06-2X/6-311G(d,p) level (detailed discussion in Supporting Information). The pointwise potential curve (PPC) scanning was performed to further confirm a barrierless process at the M06-2X/6-311G(d,p) level (Hazra and Sinha, 2011). For this 90 method, all other geometric parameters were fully optimized, except for fixing the internal breaking or forming bond length (detailed in Supporting Information).
For simplicity, hereinafter they were denoted as the M06-2X//M06-2X level, where a SPE calculation at the M06-2X/6-95 311+G(3df,3pd) level was carried out for the geometry optimized at the M06-2X/6-311G(d,p) level. To further evaluate the results at the M06-2X//M06-2X level, a higher-level calculation using the CCSD(T) method with the flexible 6- 311+G(2df,2p) basis set was performed to refine the PESs. The CCSD(T) method, i.e., coupled cluster approach with single and double substitutions including a perturbative estimation of connected triples substitutions, corresponds to a higher electronic correlation method. As discussed in Supporting Information (SI), the M06-2X//M06-2X level is suitable to predict 100 energies and kinetics, and also represents a compromise between computational efficiency and accuracy.
The rate constants (k) of the pathways with TSs were calculated using the conventional transition state theory (TST) (Gao et al., 2014;Galano and Alvarez-Idaboy, 2009) based on the above PES information: where h and kB are the Planck and Boltzmann constants, respectively, ∆G ‡ represents the activation barrier energy with the 105 thermodynamic contribution corrections and solvent cage effects, and σ is the reaction path degeneracy. To simulate realistic conditions in the solution, the rate constants are refined by using solvent cage effects (Okuno, 1997) and diffusion-limited effects (Collins and Kimball, 1949). For some of the reactions with low free energy barriers, the rate constants are found to close to the diffusion-limit, which is calculated using the Collins-Kimball theory (Gao et al., 2014). The k values of the pathways without TSs are controlled by the diffusion-limit effect and thereby equal to the diffusion-limited rate constants: 110 where R is the reaction distance, NA denotes the Avogadro number, and DAB represents the mutual diffusion coefficient of reactants. The branching ratio (г) was determined by the following equation: where the k corresponds to the rate constant of each pathway, and the ktotal is the sum of the k value for each parallel pathway. 115 The detailed description of kinetics is displayed in SI.

Ion-mediated initial reaction of GL
The ion-mediated initial reactions of GL proceed via either proton-mediated (RH+1) or hydroxyl ion (OH -)-mediated (ROH-1) hydration (Figs. 1a and S1a), yielding cationic (CIs) or anionic intermediates (AIs). As discussed in SI, no TS is identified by 120 PPC scanning for all pathways in ion-mediated reactions (Fig. S2a). That is, all ion-mediated pathways are barrierless processes unless otherwise stated. The PESs of possible pathways in the ion-mediated reaction of GL are presented in Figs. 1a and S1. For protonmeditated pathways (RH+1), protonation of carbonyl O-atom (RH+11) is largely exothermic with a reaction energy (∆Gr) of -96.9 kcal mo1 -1 , to form the CI11. Alternatively, protonation of GL can be also initiated by the hydronium ion (H3O + ) with a barrierless process. As shown in Fig. S1, water protonation can provide the corresponding ∆Gr value of -111.0 kcal mol -1 , 130 and thus for the convenience of discussion, the following proton-mediated pathways refer specially to the hydrogen ions (H + ) initiated reactions. The nucleophilic attack of GL by OH -(ROH-11) is also exothermic with the ∆Gr value of -13.4 kcal mo1 -1 , to yield the AI11. The small exothermicity of ROH-11 implies thermodynamically unfavorable formation of AI11. Table S1 lists the ks values of RH+11 and ROH-11 pathways as well as the ktotal of R1. Herein, the ktotal value of the ion-initiated reactions (the sum of the k values of RH+11 and ROH-11 pathways) is 6.02 × 10 9 M -1 s -1 , and the k of the ROH-11 pathway 135 contributes 30% to the ktotal. Further, considering the moderately acidic condition inside fine particles (Liu et al., 2017;Guo et al., 2012), proton-mediated reaction of GL is of major significance in the troposphere, and thus, the subsequent reactions of CIs are mainly considered in the following part.
As shown in Fig. S3, the length of the C−O bond in CI11 is elongated to 1.24 Å, facilitating the subsequent hydration reaction (RH+12). CI11 reacts via hydration and dehydration to yield diol (DL) with successive increasing in ∆Gr values 140 ranging from -8.0 to -1.0 kcal mol -1 . Subsequently, DL protonation occurs at the carbonyl (RH+21-1) or hydroxyl (RH+22-1) O-atom, leading to the formation of tetrol (TL) and the first-generation carbenium ion (1 st -CB1) (Figs. 1a and S1b). The RH+21-1 and RH+22-1 pathways to form CI21-1 and CI22-1 are also strongly exothermic with the ∆Gr values of -103.0 and -104.4 kcal mol -1 , respectively. The pathway for CI21-1 to TL processes via hydration and deprotonation with successive increasing in ∆Gr values, and CI22-1 to 1 st -CB1 reaction via deprotonation corresponds to a slightly increasing ∆Gr value, 145 suggesting that both TL and 1 st -CB1 are the dominant intermediates. Table S1, the k values of RH+21-1 and RH+22-1 pathways are all 4.14 × 10 9 M -1 s -1 and their half-lives (t1/2) are lower than ~10 -4 s. It implies the rapid conversion from DL to TL and 1 st -CB1, in line with the experimental results that the abundance of GL monohydrate is lower than 2% in acidic conditions (Malik and Joens, 2000). As discussed in SI, it further confirms that the direct hydration (RH2O1 and RH2O2) and OH --mediated hydration (ROH-1 and ROH-2) are kinetically 150 and thermodynamically hindered. Hence, the cation-mediated initial reaction of GL, as the dominant route in the aqueous phase, is mainly focused on and explored in the following study.
As shown in Fig. 2, the natural charge of C-atom in 1 st -CB2 is more positive than that in 1 st -CB1, implying a stronger electrostatic attraction between 1 st -CB2 and TL. However, the ∆Gr value of RTL51 is -3.1 kcal mol -1 , which is higher than that of RTL41 ( Fig. S4a-b). It indicates that the reactivity of the positive charge centers in carbenium ions is affected by both 170 electrostatic attraction and steric effect.
The nucleophilic addition reaction of 1 st -CBs with DL is also illuminated and presented in Figs. 1b and S4a-b, although DL is not the most dominant products in the aqueous-phase reaction of GL. Both ROD2 and ROD1 are generated from DL, respectively, through the analogous pathways to RTL4 and RTL5. As shown in Fig. 2, the hydroxyl O-atom in TL has a larger negative natural charge than that in DL, implying that there is a stronger electrostatic attraction between 1 st -CBs and TL. In 175 addition, the ∆Gr values of association reactions of 1 st -CBs with DL are -5.3 (for 1 st -CB1) and -1.8 (for 1 st -CB2) kcal mol -1 , respectively, which are less negative than those with TL. It indicates that the most abundant dimers correspond to the oligomeric pathways arising from 1 st -CBs with TL in weakly acidic condition.

Trimerization and oligomerization
Similar to the formation of 1 st -CBs, as shown in Fig. 1b, the ring-opening dimers then repeat protonation and dehydration to 180 form six second-generation carbenium ions (2 nd -CBs) (Fig. S5), which further engage in the formation of cyclic dimers (CDs) or ring-opening trimers (ROTs). Figs. S6-S7 present the schematic energy diagram for the formation of CDs and ROTs, and ROTs are produced and identified via R8-R20 routes and the overall schematic diagram is shown in Fig. 1b. For example, protonation (RH+81-1) and dehydration (RH+81-2) of ROD1 yield 2 nd -CB1, which undergoes intramolecular isomerization 185 pathway to produce CI111 (R111). CI111 is further hydrolyzed (R112) and deprotonated (R113) to yield CD1. Alternatively, 2 nd -CB1 association with TL and DL yields CITL151 and CIDL151, respectively, and their subsequent pathways are similar to the pathways of the formation of RODs, resulting in the formation of ROT1 and ROT2 (Fig. S7a) isomeric conversion can also occur among the nine ROTs via protonation, hydration, and deprotonation processes (R21-R27 190 in Fig. S9).
Similar to dimerization pathways, the nine ROTs further engage in protonation and dehydration reactions to produce twenty-five third-generation carbenium ions (3 rd -CBs) (R28-R36 in Fig. 3). In this study, the configurations of oligomers with the lowest energies are applied because many isomers of dimers and trimers can be yielded. Subsequently, twenty-five 3 rd -CBs undergo intramolecular isomerization, hydration, and deprotonation to further yield twelve cyclic trimers (CTs) (Fig. 195 S10 and Table S2). Based on kinetic rate calculations, the k values of dimer and trimer formation are ~10 9 M -1 s -1 in the aqueous phase, limited by liquid-phase diffusion. As discussed above, the formation of various ring-opening/cyclic dimers and trimers is initiated by protonation and subsequently propagated via the electrostatic attraction, ultimately contributing to SOA formation.

Oligomerization mechanisms without methylamine/ammonia 200
As shown in Fig. 2, a strong electrostatic attraction exists between 1 st -CBs and AM or MA because N-atoms exhibit large negative natural charges (-0.875 e for MA and -1.078 e for AM). Hence, the carbenium ion-mediated oligomerization of GL with MA/AM to form N-oligomers is simulated and presented in Fig. 4. Figs. S11-S12 depict the optimized geometries of key SPs. Also, the involved cation-mediated pathways default to barrierless processes unless stated (Fig. S2b).

The nucleophilic addition of 1 st -CBs with MA 205
Attack of 1 st -CBs by MA results in four N-containing ring-opening dimer (N-ROD) formation. For example, such multistep processes of 1 st -CB1 with MA are shown as the following (Fig. 4a- For the association pathway of 1 st -CB1 with MA (RMA601), the ∆Gr value is -43.2 kcal mol -1 . In CIMA601, the length of 210 the formed C−N bond is 1.50 Å (Fig. S12a). Unlike the oligomerization without MA, the subsequent deprotonation of CIMA601 is difficult to be initiated by H2O (Fig. S13, as discussed in SI). It implies that deprotonation from N-containing CIs needs to be initiated by species with larger electronegativity than H2O. Taking into account the real atmospheric conditions, in this study, deprotonation of the amino group in CIMA601 is initiated by SO4 2-(RMA602). As shown in Fig. S14, the natural charge center of CIMA601 is located at the amino H-atom, which is readily abstracted by SO4 2to form N-RODMA1. The value of -5.7 kcal mol -1 , to form CIMA612. Similar to the subsequent reaction of CIMA601, deprotonation of CIMA612 is also initiated by SO4 2-(RMA613), to form the other N-ROD (N-RODMA2), with the ∆G ‡ value of -7.3 kcal mol -1 . In addition, there also exist an intermolecular isomerization pathway from N-RODMA2 to N-RODMA3 via protonation, hydration, and deprotonation (Fig. 4c) investigated and discussed in SI via the similar stepwise pathways of 1 st -CB1 with MA (Figs. S16-S17), yielding a Ncontaining cyclic tetramer (N-CTMA2). The intermolecular isomerization reaction from N-CTMA1 to N-CTMA2 is also observed (Fig. S18a). All N-containing dimers, trimers and tetramers subsequently contribute to N-heterocycles, which are 245 the important precursors of BrC aerosols. Because all deprotonation reactions proceed via the corresponding TS in the presence of MA and their rate constants fall in the range of (1.17-1.30) × 10 9 M -1 s -1 (Table S3), deprotonation is the ratelimiting step to propagate N-heterocycles. It implies that N-heterocycle formation is more dependent on the content of inorganic compounds or inorganic salts in aerosol rather than particle acidity.

The nucleophilic addition of 1 st -CBs with AM 250
The carbenium ion-mediated reactions to N-heterocycles in the presence of AM involves the stepwise processes (Figs. 4 and S15-S17), similar to the nucleophilic addition of 1 st -CBs with MA. That is, the formation of N-heterocycles from GL with AM involves three vital steps (Fig. S19): (1) the nucleophilic reaction of AM with CBs to form N-RODs, (2) protonation and dehydration of N-RODs to yield N-containing CBs, and (3) the formation and propagation of N-heterocycles by the association reactions of N-containing CBs with AM. Alternatively, GL can be attacked by ammonium ion (NH4 + ) to produce 255 CIAM601 due to the equilibrium reaction between AM and NH4 + in solution, with the ∆Gr value of -0.9 kcal mol -1 (Fig. 4a). (2) and (3)  Hence, the carbenium ion-mediated mechanism also provides a key pathway for the formation of BrC from GL in the presence of ammonia salts.

Estimation of the heterogeneous GL reaction rates and growth rates of SOA and BrC formation
To evaluate the atmospheric regions where the heterogeneous reaction of GL will have significance, the heterogeneous GL 265 reaction rates and the growth rates of SOA and BrC formation (GRSOA and GRBrC) are estimated under rural, remote, and urban conditions using the predicted carbenium ion-mediated reaction mechanism of GL mentioned above. First, the heterogeneous GL reaction rates without or with MA/AM were driven by the expression as follows, where Cg is the gas phase concentration under rural, remote and urban conditions, k is the calculated rate constant of 270 protonation reaction (4.20 × 10 9 M -1 s -1 ) in the absence of MA/AM or deprotonation (1.17/1.32 × 10 9 M -1 s -1 ) in the presence of MA/AM, and γGL is uptake coefficient of GL under the different conditions.
The heterogeneous GL reaction rate [krate(total)] is the sum of the rates without and with MA/AM. Table 1 lists the krate without and with MA/AM and krate(total) under rural, remote, and urban conditions as well as the available experimental data (Liggio, 2005). The krate(total) values are 4.62 × 10 -3 , 9.25 × 10 -4 , and 1.85× 10 -3 s -1 under above mentioned three conditions. 275 The krate value under urban condition almost agrees with that of the experimental data and is slightly larger than those of the experimental data under other conditions (Liggio, 2005). However, estimating other atmospheric conditions are not unreasonable because the γGL used here is more suitable for the urban condition (Liggio, 2005). Second, the growth rate (GRSOA) (Ji et al., 2020) is expressed as:  Table 1), and [H + ] is the concentration of the hydrogen ion (10 -6 M) in the weakly acidic solution, and AWC is the aerosol water content (1.1 × 10 -10 ) (Ji et al., 2020). The GRBrC is also calculated using the formula  Table 1. The GRSOA is 1.41 μg m -3 h -1 under urban condition, which is larger than the GRBrC (0.44 μg m -3 h -1 for MA and 0.45 μg m -3 h -1 for AM). The total growth rate (the sum of the GRSOA and GRBrC values) is also consistent 290 with the experimental data (1.44 μg m -3 h -1 ) (Liggio, 2005). Hence, it is reasonable to expect the occurrence of carbenium ion-mediated mechanism in ambient aerosols as a means of SOA and BrC formation.

Conclusions
This study provides a valuable insight into the aqueous chemistry of GL and also reveals the rate-limiting steps in the absence and presence of amines/ammonia. In the absence of amines/ammonia (Fig. 5), the cation-mediated oligomerization 295 is characterized by barrierless pathways and strong electrostatic attraction as follows: (I) protonation, hydration, and deprotonation of GL to yield DL and TL, (II) the protonation and dehydration to yield CBs, and (III) the formation of dimers from the association reactions of CBs with DL and TL. Each dimer repeats steps (II) and (III) to propagate the oligomerization. In the presence of amines/ammonia, the step (III) starts from nucleophilic addition of CBs with amines/ammonia rather than with DL/TL due to the stronger electrostatic attraction between CBs and amines/ammonia (Fig.  300 2). However, the key mechanistic step in the propagation of N-heterocycles is deprotonation of N-containing cationic intermediates. Our results of two distinct mechanisms indicate that BrC formation is more dependent on the aerosol content of inorganic compounds or inorganic salts rather than particle acidity, compared with the formation of SOA.
On the other hand, there exist the competing pathways with the initial protonation pathway for the cationic oligomerization of GL without and with amines/ammonia, (i.e., the association reactions of CIs with OH -). These competing 305 pathways lead to CIs returning back the corresponding reactants and affect the fate of GL. Fig. S20 and Table S1 present the ∆Gr and k values of the association reactions for some key CIs with OH -. Generally, the fate of CIs at each step is dominantly determined by the initial protonation rather than the reaction with OH -. For example, the ∆Gr value of the association pathway of CI11 with OHis -66.8 kcal mol -1 , and its k value is 1.47 × 10 9 M -1 s -1 . Compared with the protonation of GL to CI11, this reaction is thermodynamically and kinetically unfavorable. The branching ratios show that 70% of GL proceeds protonation pathway to finally form DL. Because protonation is favorable in the acidic aerosol, the cation-mediated oligomerization of GL without and with amines/ammonia can efficiently proceed to contribute to SOA and BrC under the atmospheric conditions. Using our predicted heterogeneous GL reaction rates, the heterogeneous lifetime (τ) of GL is estimated to be 4.43 min under urban conditions, somewhat smaller than that of experimental data (5.0 min) (Liggio, 2005). However, the τ values are 315 89 and 61 min under rural and remote conditions due to low GL level, respectively (Liggio, 2005). It indicates a more important role of heterogenous reaction of GL in urban air quality compared with other conditions. On the other hand, the τ values determined here for rural, remote, and urban conditions are all lower than those of the photolysis (211 min) and photooxidation of GL (300 min). Especially, the τ value under urban condition is significantly shorter than the gas-phase lifetime. The results indicate that even under relatively clean conditions, the heterogeneous GL loss rates are faster than the 320 loss rates due to other gas phase processes and are much significantly rapid in polluted regions. Given that GL contributes to 6.9% of the total radical production at midday (Aiello, 2003), the heterogeneous GL loss to particle implies the reduction of HOx and demands further study. Our work reveals the fundamental chemical mechanism of SOA and BrC formation from small α-dicarbonyls and also provides the kinetic and mechanistic data for atmospheric modeling to assess the budget of SOA and BrC formation on urban, regional, and global scales. 325 Data availability. All raw data can be provided by the corresponding authors upon request.

Supplement.
The supplement related to this article is available on the EGU Publications website.    (Cerqueira et al., 2003;Lawson et al., 2015;Qian et al., 2019). b The values are the rates of the oligomerization without and with MA/AM (in s -1 ). c krate (