Structure – activity relationship for the estimation of OH-oxidation rate constants of carbonyl compounds in the aqueous phase

In the atmosphere, one important class of reactions occurs in the aqueous phase, in which organic compounds are known to undergo oxidation by a number of radicals, among which OH radicals are the most reactive oxidants. In 2008, Monod and Doussin proposed a new structure–activity relationship (SAR) to calculate OH-oxidation rate constants in the aqueous phase. This estimation method is based on the group-additivity principle and was until now limited to alkanes, alcohols, acids, bases and related polyfunctional compounds. In this work, the initial SAR is extended to carbonyl compounds, including aldehydes, ketones, dicarbonyls, hydroxy carbonyls, acidic carbonyls, their conjugated bases, and the hydrated forms of all these compounds. To do so, only five descriptors were added and none of the previously attributed descriptors were modified. The extended SAR is now based on a database of 102 distinct compounds for which 252 experimental kinetic rate constants have been gathered and reviewed. With this updated SAR, 58 % of the rate constants could be calculated within ± 20 % of the experimental data and 76 % within ± 40 % (41 % and 72 %, respectively, for the carbonyl compounds alone).


Introduction
In the atmosphere, one important class of condensed phase chemical reactions occurs in the aqueous phase, which can be found at various ionic strengths in deliquescent particles, activated particles or in the droplets of clouds, fog and rain. In these media, organic compounds are known to undergo oxidation by a number of radicals, among which OH radicals are the most reactive oxidants (Herrmann et al., 2010). This reactivity initiates chain reactions that are related to atmospherically important issues such as the oxidizing capacity of the atmosphere (Monod and Carlier, 1999; Monod et al., 2007; Poulain et al., 2010; Ervens et al., 2013), the fate of organic compounds (Blando and Turpin, 2000; Monod et al., 2005) and the formation of secondary organic aerosol (SOA) (Altieri et al., 2006; Carlton et al., 2007; Volkamer et al., 2009; Ervens and Volkamer, 2010; Tan et al., 2010, 2012; Lim et al., 2010; Ervens et al., 2011).
Among the thousands of organic species involved in this chemistry, carbonyl compounds play a major role in the atmosphere. Aldehydes and ketones are not only major species directly emitted into the atmosphere; the carbonyl function is also systematically formed with high yields in the gas phase photooxidation of volatile organic compounds (VOCs) (Carlier et al., 1986; Finlayson-Pitts and Pitts, 2000). As shown by Aumont et al. (2005) through explicit modelling, a major fraction of the products formed during the atmospheric oxidation of VOCs are polyfunctional, and the ketone and aldehyde functionalities together represent the major part of the resulting chemical functions. Additionally, it was recently shown that the heterogeneous and multiphase reactivity of polyfunctional carbonyl molecules (glyoxal, methylglyoxal, glycolaldehyde, pyruvic acid, methacrolein, methylvinylketone, etc.) could lead to important amounts of oligomers, representing a possible substantial source of humic-like substances (HULIS) and/or SOA in the atmosphere, especially in the presence of water (Loeffler et al., 2006; Altieri et al., 2006; Carlton et al., 2007; Volkamer et al., 2009; Perri et al., 2009; Ervens and Volkamer, 2010; Tan et al., 2010; Lee et al., 2012; Ervens et al., 2011; Liu et al., 2012; Ortiz-Montalvo et al., 2012).
The oligomeric condensation processes of carbonyl compounds in the atmosphere are only partially understood, and a large effort currently concentrates on the determination of the oligomerization mechanisms. It is believed that these processes need the presence of water. In the aqueous phase, it has recently been suggested that radical polymerization, initiated by OH-oxidation of partially (or fully) hydrated carbonyl compounds, is responsible for the formation of oligomers (Lim et al., 2010; Tan et al., 2012; Renard et al., 2013). While the database of experimental kinetic parameters relevant to the atmospheric aqueous phase is still limited, the number of compounds potentially involved in such processes is tremendous. It is thus of primary importance to develop tools to accurately predict the initiating OH-oxidation step for both carbonyl compounds and their corresponding hydrated forms. The need for such tools is further strengthened by the development of box models to accurately simulate laboratory aqueous phase experiments (Lim et al., 2010), of larger multiphase atmospheric models comprising several hundreds of aqueous phase reactions (Herrmann et al., 2005), and of explicit models comprising thousands of aqueous phase reactions (Mouchel-Vallon et al., 2013).
Monod and Doussin (2008) have proposed a new structure-activity relationship (SAR) for the OH-oxidation (by H-abstraction) of aliphatic organic compounds in the aqueous phase. The studied organic compounds included aliphatic alkanes, alcohols, organic acids, bases and polyfunctional compounds containing at least two of these functions. The methodology was based on Atkinson's group-additive SAR for gas phase reactions (Atkinson, 1987; Kwok and Atkinson, 1995) with, as a major difference, the β-position effects taken into account. With this method, 60 % of the estimated values were found within ± 20 % of the experimental values (Monod and Doussin, 2008). Minakata et al. (2009), using a similar group contribution approach, have proposed another structure-activity method covering both H-abstraction and OH-addition on C=C double bonds. For saturated species, this alternative relationship is applicable to a larger number of chemical families (it includes ethers, esters, halides, nitriles, amines, amides, sulfides, sulfoxides, thiols, nitro compounds, nitroso compounds and phosphate-containing compounds, in addition to those modelled by Monod and Doussin, 2008), but it involves only a parameterization using the α-position effect and no group contribution factor for k(OH) rate constants. This latter choice makes this SAR simpler to implement but lowers its prediction performance. This may explain why, in their thorough comparison of the robustness of the two SARs for alkanes, mono-alcohols, poly-alcohols, and carboxylic acids, Herrmann et al. (2010) found that the SAR proposed by Monod and Doussin (2008) performed better; it was therefore eventually implemented as an extension of the CAPRAM 3.0i mechanism whenever possible (Herrmann, 2012).
Published by Copernicus Publications on behalf of the European Geosciences Union.
More recently, Minakata and Crittenden (2011) have demonstrated linear free energy relationships between aqueous phase hydroxyl radical rate constants and the free energy of activation (Ea), bringing a more robust basis to SARs and opening the way for other types of parameterization involving Ea. Nevertheless, these parameterizations require extensive quantum mechanical calculations for each molecule considered, which makes them still too cumbersome (and imprecise for larger molecules) for any automatic implementation in explicit models.
One of the major drawbacks of the SAR proposed by Monod and Doussin (2008) was the lack of parameterization for the carbonyl function. The difficulty with carbonyl compounds is due to their well-known ability to undergo hydration in the aqueous phase, leading to equilibrium with the corresponding gem-diol forms (R1).

>C=O + H₂O ⇌ >C(OH)₂   (R1)

The value of the equilibrium constant K_hyd can vary by several orders of magnitude depending on the chemical structure of the considered molecule, while the reactivity towards OH can differ by a factor of 3 from the carbonyl to the corresponding gem-diol form (Schuchmann and von Sonntag, 1988). The combination of these two effects leads to major difficulties in the reliable parameterization of rate constants, which were often experimentally determined when both forms were co-existing. Minakata et al. (2009) considered carbonyl-bearing molecules as being either totally hydrated (formaldehyde and glyoxal, for which K_hyd > 100) or totally non-hydrated (K_hyd < 0.01), with an exception for acetaldehyde, which was treated in its two forms as its K_hyd was known to be close to 1 (K_hyd = 1.2 for acetaldehyde at 298 K). However, K_hyd is also intermediate for many other carbonyl compounds such as propionaldehyde, butyraldehyde, valeraldehyde, isobutyraldehyde, pyruvic acid, or biacetyl (see Table 1), which requires a careful OH-oxidation rate constant estimation for both forms in order to correlate estimated and experimental values. While significant, the K_hyd database is incomplete, which was a major difficulty for the development of a SAR taking into account the hydration ratio. Meanwhile, a specific structure-activity relationship dedicated to K_hyd was proposed by Raventos-Duran et al. (2010), allowing the possibility of extending the Monod and Doussin (2008) SAR to carbonyl compounds.
Considering both the importance of carbonyl multiphase chemistry in the atmosphere and the new possibilities for their SAR parameterization, the present work aims at extending the Monod and Doussin (2008) SAR to carbonyl compounds, including aldehydes, ketones, dicarbonyls, hydroxy-carbonyls, acidic carbonyls, their conjugated bases, and all their corresponding gem-diol forms.

Structure-activity relationship principles
Considering that the aim of this work is to extend the structure-activity relationship proposed by Monod and Doussin (2008), the principle of the estimation remains unchanged. It is based on the assumption that the overall rate constant for the OH radical induced H-abstraction is equal to the sum of the kinetic rates of each reactive site. These partial kinetic rates are determined by taking into account the chemical environment of the function along the carbon skeleton. Each -CH₃, -CH₂-, -CH<, -OH and -CHO function of the molecule is associated with a group kinetic rate constant, k(group) (Eq. 1). To take into account both field and resonance effects, the rate constants k associated with each H-bearing function are modulated by both the α-neighbouring effect (represented by the F parameters) and the β-neighbouring effect (represented by the G parameters).
Furthermore, the effects of cyclic structures with 4 to 7 carbon atoms are taken into account by the addition of specific descriptors.
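Equation (1) is not reproduced in the text above. A form consistent with the description (one term per H-bearing group, each modulated by the F factors of its α-neighbours and the G factors of its β-neighbours) would be the following. This is a reconstruction from the prose, not necessarily the authors' exact notation, and the cyclic-structure descriptors are left out for brevity:

```latex
k = \sum_{i} \left[ k(\mathrm{group}_i)
      \times \prod_{\alpha\text{-neighbours of } i} F(\alpha\text{-group})
      \times \prod_{\beta\text{-neighbours of } i} G(\beta\text{-group}) \right]
```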

Hydration equilibrium
One of the major difficulties of this work is due to the hydration equilibrium (R1) of the carbonyl in aqueous solution.
As already mentioned, carbonyl species undergo hydration and can reach equilibrium with their parent gem-diol species with an equilibrium constant, K_hyd, which can be defined, when one considers water activity as unity, as in Eq. (2),
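Equation (2) did not survive in this text. From the definitions given around it (water activity taken as unity, forward and reverse rate constants of R1), it presumably takes the following form; the concentration ratio is an assumption based on the usual definition of a hydration constant:

```latex
K_{\mathrm{hyd}} = \frac{[\text{gem-diol}]}{[\text{carbonyl}]}
                 = \frac{k_{\mathrm{hydration}}}{k_{\mathrm{dehydration}}}
```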
where k_hydration and k_dehydration are respectively the forward and the reverse kinetic rate constants of R1.
Typically, the reactivity of these partner molecules toward the hydroxyl radical is significantly larger for the carbonyl form than for the gem-diol form (Schuchmann and von Sonntag, 1988). Furthermore, K_hyd can differ by orders of magnitude from one species to another. Indeed, the position of the equilibrium is greatly dependent on the structure of the hydrate. As an example, formaldehyde in water at 20 °C exists 99.99 % in the hydrated form, while for acetaldehyde this figure is 58 %, and for acetone the hydrate concentration is negligible (see Table 1). It was, hence, necessary to take this equilibrium into account when performing the SAR calculation. To do so, descriptors were proposed to calculate both the carbonyl + OH and gem-diol + OH rate constants, and an overall rate constant was calculated as follows (Eq. 3) to be compared with experimental data,
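Equation (3) is missing from this text. If the overall rate constant is the average of the two forms weighted by their equilibrium fractions, 1/(1 + K_hyd) for the carbonyl and K_hyd/(1 + K_hyd) for the gem-diol, it can be written as below. This is a reconstruction under that assumption:

```latex
k_{\mathrm{overall}} =
  \frac{k_{\mathrm{carbonyl}} + K_{\mathrm{hyd}}\, k_{\text{gem-diol}}}{1 + K_{\mathrm{hyd}}}
```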
where k_overall is the overall rate constant for the OH-oxidation and k_gem-diol and k_carbonyl are the calculated rate constants for the related species. This approach assumes that, unless mentioned, the hydration equilibrium was always reached in the experimental set-ups used to determine the related overall rate constants.
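The weighting by the hydration equilibrium can be sketched in a few lines of Python. This is a minimal illustration assuming that the overall rate constant is the equilibrium-weighted average of the two forms; the function names and the two rate constants in the example are hypothetical, and only K_hyd = 1.2 (acetaldehyde at 298 K) is taken from the text.

```python
def hydrated_fraction(k_hyd: float) -> float:
    """Fraction of the compound present as gem-diol at equilibrium."""
    if k_hyd < 0:
        raise ValueError("K_hyd must be non-negative")
    return k_hyd / (1.0 + k_hyd)


def overall_rate_constant(k_carbonyl: float, k_gem_diol: float, k_hyd: float) -> float:
    """Equilibrium-weighted OH rate constant (M^-1 s^-1) of a carbonyl/gem-diol pair."""
    f_diol = hydrated_fraction(k_hyd)
    return (1.0 - f_diol) * k_carbonyl + f_diol * k_gem_diol


# Hypothetical rate constants differing by a factor of 3, with K_hyd = 1.2.
k_example = overall_rate_constant(k_carbonyl=3.0e9, k_gem_diol=1.0e9, k_hyd=1.2)
print(f"hydrated fraction = {hydrated_fraction(1.2):.3f}")
print(f"k_overall = {k_example:.3e} M^-1 s^-1")
```

For a fully non-hydrated compound (K_hyd = 0) the function returns the carbonyl rate constant, and for very large K_hyd it tends to the gem-diol rate constant, as expected for the two limiting cases.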
To provide a substantial basis for this assumption, we have collected rate constants for the hydration and dehydration processes from the literature. Both processes are generally acid-base catalysed (Bell et al., 1956; Betterton and Hoffmann, 1987); hence, in the absence of any other catalyst, a pseudo-first order hydration rate constant may be expressed as the sum of an uncatalysed term, an acid-catalysed term and a base-catalysed term (Eq. 4). This rate constant is therefore extremely dependent on pH, which explains why k_hydration reaches a minimum value close to neutrality.
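The pH dependence can be illustrated with a short sketch. It assumes the usual form for acid-base catalysis, k_obs = k0 + k_acid·[H⁺] + k_base·[OH⁻], which is not written out in this text; the three catalytic constants in the example are hypothetical placeholders chosen only to show the shape of the curve, not values from Table 2.

```python
import math

KW = 1.0e-14  # ionic product of water near 25 degC (M^2)


def k_hydration_obs(ph: float, k0: float, k_acid: float, k_base: float) -> float:
    """Pseudo-first order hydration rate constant (s^-1) at a given pH."""
    h = 10.0 ** (-ph)   # [H+] in M
    oh = KW / h         # [OH-] in M
    return k0 + k_acid * h + k_base * oh


def ph_of_minimum(k_acid: float, k_base: float) -> float:
    """pH at which the acid and base terms are equal, i.e. where k_obs is minimal."""
    return 7.0 - 0.5 * math.log10(k_base / k_acid)


# Hypothetical constants: k0 in s^-1, k_acid and k_base in M^-1 s^-1.
K0, K_ACID, K_BASE = 1.0e-2, 1.0e2, 1.0e4
for ph in (2, 4, 6, 7, 8, 10, 12):
    print(ph, f"{k_hydration_obs(ph, K0, K_ACID, K_BASE):.4g}")
print("minimum at pH", ph_of_minimum(K_ACID, K_BASE))
```

With these placeholder constants the rate constant is lowest within about one pH unit of neutrality and rises steeply on both the acidic and the basic side, which is the behaviour described above.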
Pseudo-first order rate constants for these processes close to neutrality are gathered in Table 2. Hydration rate constants are often, but not always, quite large, and the equilibrium is then reached within a few seconds. Contrary to common belief, some of these processes are rather slow, as is the case for acetaldehyde and other larger aliphatic aldehydes such as propionaldehyde and isobutyraldehyde (Pocker and Dikerson, 1969). At room temperature, it may take a few tens of seconds to reach the equilibrium while, near 0 °C, it may take several hundreds of seconds.
This observation led us to carefully review our OH reaction rate constant experimental database (see Database, Sect. 2.4) to verify that equilibrium was always achieved during the experiments, by checking that hydration processes were always much faster than the studied reaction. It is important to note here that the values shown in Table 2 are true minima, as they are taken near pH = 7 and without taking into account any other possible catalysts. In many experimental studies, pH was far from neutrality. Monod et al. (2005), for example, used the Fenton reaction at pH = 2 for aliphatic aldehydes, which allowed a hydration rate constant in the range of 1 s⁻¹ (Pocker and Dikerson, 1969). Other studies used pulsed OH generation systems coupled with fast detection of reactants, such as pulsed radiolysis (Schuchmann and von Sonntag, 1988) or flash photolysis (Gligorovski and Hermann, 2004), and looked at the OH-oxidation processes over timescales of a few milliseconds (or less), which were considered to be sufficiently short to neglect any significant equilibrium feedback, at least at room temperature, which is the only temperature investigated in the present study.
www.atmos-chem-phys.net/13/11625/2013/ Atmos. Chem. Phys., 13, 11625-11641, 2013

Carbonyl compounds and gem-diols descriptors
In this work, five descriptors were added to the SAR from Monod and Doussin (2008). These new parameters are specific to carbonyl compounds and gem-diol chemistry. They are k(-CHO), F'(-OH), F(C=O), G(gemOH) and G(C=O). Among these descriptors, k(-CHO) is the only one directly related to the abstraction of an H on a new function; F(C=O) and G(C=O) represent the α- and β-effects of the carbonyl function for both aldehydes and ketones, while F'(-OH) represents the α-effect of a second OH borne by the C of the gem-diol function. It must be noted that the α-effect of the first OH was previously determined in Monod and Doussin (2008) and has not been modified here. Finally, G(gemOH) represents the β-effect of the second OH on the H-abstraction on the first OH in the gem-diol function.
Again, the k(OH) value was taken as previously determined.Two examples are given in Table 3 to illustrate the use of this new relationship: one for 2-butanone, the other for propionaldehyde, and their corresponding gem-diols.
The new descriptors k(-CHO), F'(-OH), F(C=O), G(gemOH) and G(C=O) were varied simultaneously using the Microsoft® Excel® Solver routine in order to minimize the sum of the normalized square differences between calculated and experimental values. The parameters previously determined by Monod and Doussin (2008) remained unchanged.
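The quantity being minimized can be sketched as follows. This is only an illustration of the objective function, assuming that each difference is normalized by the experimental value; it does not reproduce the Solver optimization of the five descriptors. The helper `best_scale` is hypothetical and shows the closed-form minimum for the simplest possible case, a single multiplicative factor applied to all calculated values.

```python
def normalized_sse(k_calc, k_exp):
    """Sum of squared differences, each normalized by the experimental value."""
    if len(k_calc) != len(k_exp):
        raise ValueError("k_calc and k_exp must have the same length")
    return sum(((c - e) / e) ** 2 for c, e in zip(k_calc, k_exp))


def best_scale(k_calc, k_exp):
    """Factor s minimizing normalized_sse([s * c for c in k_calc], k_exp).

    With r_i = c_i / e_i the objective is sum((s * r_i - 1)^2), whose minimum
    is at s = sum(r_i) / sum(r_i^2).
    """
    ratios = [c / e for c, e in zip(k_calc, k_exp)]
    return sum(ratios) / sum(r * r for r in ratios)


# Hypothetical data: calculated values are uniformly half the experimental ones.
calc = [1.0e9, 2.0e9]
exp = [2.0e9, 4.0e9]
s = best_scale(calc, exp)
print(normalized_sse(calc, exp), s, normalized_sse([s * c for c in calc], exp))
```

Normalizing each term keeps slow and fast reactions on an equal footing, which an unnormalized sum of squares over rate constants spanning orders of magnitude would not.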

Database
This work is based on the data given in Table 1 and in Table 4, which respectively gather experimental, theoretical, estimated and preferred values for the hydration equilibrium constants and for the rate constants for OH-oxidation of ketones, aldehydes and polyfunctional carbonyl compounds. The kinetic database comprises room temperature rate constants for 31 distinct species, among which one counts 8 mono-functional aldehydes, 9 ketones, and 14 polyfunctional species including α-dicarbonyls, hydroxycarbonyls and oxoacids. These kinetic data, arising from an exhaustive literature search, are displayed in Table 4. In contrast, Table 1 is not intended as an exhaustive list of experimental data for the hydration equilibrium, as this work has already been performed by Gomez-Bombarelli et al. (2009). It is hence limited to the species included in Table 4.
Most of the rate constants of OH-oxidation of organic compounds shown in Table 4 were determined using the relative rate kinetics method. These rate constants were re-calculated taking into account updated values for the reference compounds. For the latter, the values recommended by Buxton et al. (1988) were chosen in most cases; however, when no recommendation was available in the literature, or when more recent studies had been published, average values were calculated. Considering that no significant changes for these reference rate constants have been published since the paper of Monod and Doussin (2008), and for the sake of the SAR homogeneity, these reference values were kept unchanged; they are given in Table 2 of Monod and Doussin (2008).

Results
The optimized descriptors are given in Table 5. Figure 1 focuses on carbonyl-containing compounds. For each compound, the calculated value has been plotted as a function of all existing experimental values, which results in some dispersion when experimental values do not agree. Nevertheless, as can be seen, the additional SAR parameters lead to an efficient structure-activity relationship. A fair linearity of the correlation curve is found, and no significant bias can be deduced from the regression, as the slope is not significantly different from unity and no significant intercept can be observed. Six data points are located outside the "factor of 2 zone" in Fig. 1. It is important to note that for each of these outliers (mainly polyfunctionals) some other experimental data are well reproduced by the SAR.
The normalized difference between the calculated and experimental rate constants (defined in Eq. 5) can be used to evaluate the efficiency of the SAR.
The statistical distribution of this normalized difference is given in Fig. 2, from which we determined that for 41 % of the carbonyl compounds, the SAR was able to reproduce the experimental values within a range of ± 20 %, and for 72 % within a range of ± 40 % (respectively called "80 % efficiency" and "60 % efficiency" in Monod and Doussin, 2008). These values are slightly less satisfactory than for the overall SAR described by Monod and Doussin (2008), for which the 80 % efficiency level was reached for 60 % of the compounds as opposed to the 41 % here. This reduced efficiency can be explained by various factors, among which is the fact that for carbonyl compounds, our calculations merge two levels of uncertainty: (i) the efficiency of the SAR and (ii) the uncertainties associated with the determination of the hydration equilibrium constant. In addition, the limited number of available data and the limited diversity of structures must be pointed out. It can also be seen in Fig. 2 that our SAR performs well for ketones and aldehydes and that the results are slightly less satisfactory for polyfunctional species. Two classes of compounds have been excluded from this statistical analysis because their reactivity was considered not to be relevant to the H-abstraction process, which is the purpose of this SAR. Hence, α-carbonyl bases (pyruvate, ketomalonates) and β-diketones (acetylacetone) were not considered in the efficiency evaluation, even when the calculated values were not too far from the experimental values.
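The efficiency statistics quoted above can be computed as sketched below. The sketch assumes that the normalized difference is (k_calc − k_exp)/k_exp, since Eq. (5) is not reproduced in this text, and the four data pairs are hypothetical values used only to exercise the functions.

```python
def normalized_difference(k_calc: float, k_exp: float) -> float:
    """Relative deviation of the calculated from the experimental rate constant."""
    return (k_calc - k_exp) / k_exp


def fraction_within(pairs, tolerance: float) -> float:
    """Share of (k_calc, k_exp) pairs whose |normalized difference| <= tolerance."""
    hits = sum(1 for c, e in pairs if abs(normalized_difference(c, e)) <= tolerance)
    return hits / len(pairs)


def within_factor(k_calc: float, k_exp: float, factor: float = 2.0) -> bool:
    """True if the calculated value lies within a given factor of the experiment."""
    return k_exp / factor <= k_calc <= k_exp * factor


# Hypothetical (k_calc, k_exp) pairs in M^-1 s^-1.
pairs = [(1.1e9, 1.0e9), (0.7e9, 1.0e9), (2.5e9, 1.0e9), (0.95e9, 1.0e9)]
print("within +/-20 %:", fraction_within(pairs, 0.20))
print("within +/-40 %:", fraction_within(pairs, 0.40))
print("outside factor of 2:", [p for p in pairs if not within_factor(*p)])
```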
Similarly to what has been shown by Karpel Vel Leitner and Doré (1997) for oxalate, α-carbonyl bases can undergo electron transfer following R2 (Huie, 1995; Ervens et al., 2003; Schaefer et al., 2012). For the pyruvate ion, our calculated value is 1.2 × 10⁸ M⁻¹ s⁻¹, while the most recent values range between 3.8 × 10⁸ and 7.1 × 10⁸ M⁻¹ s⁻¹ (Schaefer et al., 2012). For ketomalonates, the situation is even worse, as no H available for abstraction can be identified in the structure (the main form being the dehydrated form according to the calculation of Raventos-Duran, 2008). In this case, our approach would lead to a negligible rate constant, which is clearly not the case (see Table 1). Obviously, our SAR fails to capture this reactivity, as it arises from a different chemistry. It must be pointed out here that the introduction of a k(C(O)COO⁻) = 2.1 × 10⁸ M⁻¹ s⁻¹ for R2 and the corresponding use of the factors F(CH₃), F(COOH) and F(COO⁻) would reconcile the calculated and the experimental data for pyruvate, ketomalonate and ketomalonate dianion, respectively. This value is close to 3.9 × 10⁸ M⁻¹ s⁻¹, which is the rate constant for a very similar reaction (Buxton et al., 1988), i.e. the electron transfer between the carbonate ion and the OH radical. Nevertheless, we decided not to extend our SAR to electron transfer processes, considering the little information available and the very limited number of molecules in our database which could be affected.
The other reaction which appears to be irrelevant from the point of view of H-abstraction involves acetylacetone. It is well known that β-diketones are particularly prone to form stable enols or enolates because of the conjugation of the enol or enolate with the other carbonyl group, with additional stability gained by forming an H-bonded six-membered ring. In this case, the OH radical addition to the enol form is likely to be the dominant process. For acetylacetone, the use of our SAR leads to 2.2 × 10⁸ M⁻¹ s⁻¹, while the experimental value is more than 40 times larger.

SAR parameter values
The five descriptors determined here (Table 5) to extend the existing SAR to carbonyl compounds are not meaningless.On the contrary, they carry significant chemical information and the fitted values can be rationalized as shown below.
Indeed, the value found for the rate constant of the aldehydic function (k(CHO) = 1.86 × 10⁹ M⁻¹ s⁻¹) is significantly higher than any of the other base abstraction rate constants (respectively 3.5 × 10⁸, 6.5 × 10⁸ and 4.7 × 10⁸ M⁻¹ s⁻¹ for k(CH₃), k(CH₂) and k(CH), and 6.9 × 10⁸ M⁻¹ s⁻¹ for k(OH)). This result reflects the well-known enhanced reactivity of the aldehyde function, which has also been observed in the gas phase (Kwok and Atkinson, 1995) and which is further confirmed by the corresponding bond dissociation energies (BDE). For acetaldehyde in the gas phase, Blanksby and Ellison (2003) report BDE values of 89.4 ± 0.3 kcal mol⁻¹ and 94 ± 2 kcal mol⁻¹ for the C-H bond dissociation in the aldehydic function and in the CH₃ group, respectively (these values do not take into account solvation enthalpies). It is also in good agreement with the fact that aldehydes are more reactive than their corresponding gem-diols, as was experimentally demonstrated by Schuchmann and von Sonntag (1988) for acetaldehyde and ethylgemdiol, which differ by a factor of 3 (Table 4). Nevertheless, this ratio cannot be extrapolated to all carbonyl compounds. Indeed, as can be seen in Fig. 3, the rate constants for gem-diols linearly correlate with the carbonyl compound rate constants with a slope close to unity for both ketones and aldehydes. The two correlation lines are just offset by a value which reflects (i) the difference of reactivity between the aldehyde function and the -CH(OH)₂ group (+1.3 × 10⁹ M⁻¹ s⁻¹) and (ii) the negative effect of the ketone group on the neighbouring abstractable H and the reactivity of the >C(OH)₂ group (−4.4 × 10⁸ M⁻¹ s⁻¹).
The parameter G(gemOH) also directly reflects the reactivity of the specific function (>C(OH)₂). Indeed, this parameter, which expresses the β-inductive effect of the hydroxyl group on the H-abstraction of the other -OH of the gem-diol function, is solely used as k(OH) × G(gemOH) to provide a rate constant for each of the OH groups. Actually, it would have been mathematically equivalent to provide a k(>C(OH)₂) rate constant, which would have been equal to 2 × k(OH) × G(gemOH) = 3.3 × 10⁸ M⁻¹ s⁻¹.
Table 6. Resonance (R) and field (F) values as defined by Swain and Lupton (1968) for the chemical groups involved in the present SAR (taken from Swain et al., 1983).
Concerning the α-position effects, the carbonyl function exhibits, very classically, a significant deactivating behaviour (F(C=O) = 0.22). This can be explained by the mesomeric withdrawing effect of the carbonyl function. From a more quantitative point of view, if one considers the field (F) and resonance (R) parameters defined by Swain and Lupton (1968) (Table 6), it is surprising to obtain an F(C=O) value closer to F(COO⁻) (0.24) than to F(COOH) (0.07), as one would have expected a stronger deactivating effect. In Monod and Doussin (2008), it was shown that the effect of the OH function in the α-position is strongly activating: F(OH) = 2.1. In this work, it has been found impossible to parameterize the gem-diol reactivity with an equal value for the second OH, i.e. by raising the F(OH) factor to the power of 2. This would have led to unreasonably high rate constants. As a consequence, an F'(OH) factor was introduced. This factor is hence only relevant for the hydrated forms of carbonyl compounds and is only used as follows to calculate the reactivity of the C-H in the gem-diol function: k(CH) × F(OH) × F'(OH). F'(OH) has been found to be deactivating, with a value equal to 0.65. While the overall effect of the two OH remains activating (as 0.65 × 2.1 = 1.4), it is not clear why two OH would be less activating than one. One can hypothesize that some steric effects of the two OH groups and their potential H-bonded water molecules could make the C-H less accessible. A similar hypothesis was previously considered to explain a k(CH) value lower than that of k(CH₂) (Monod and Doussin, 2008).
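The arithmetic behind these gem-diol descriptors can be checked with a short script. All inputs are values quoted in the text; the value of G(gemOH) is back-calculated here from 2 × k(OH) × G(gemOH) = 3.3 × 10⁸ M⁻¹ s⁻¹ and is therefore only approximate (the fitted value is in Table 5, which is not reproduced here). The C-H contribution omits the F and G factors of any other neighbouring group, so it is not a full SAR estimate for a real molecule.

```python
# Descriptor values quoted in the text (rate constants in M^-1 s^-1).
K_OH = 6.9e8               # k(OH)
K_CH = 4.7e8               # k(CH)
F_OH = 2.1                 # alpha-effect of the first OH
F_PRIME_OH = 0.65          # alpha-effect of the second OH of the gem-diol
K_GEM_DIOL_OH_PAIR = 3.3e8  # 2 x k(OH) x G(gemOH)

# Back-calculated beta-effect of one OH on the other OH of the gem-diol.
g_gem_oh = K_GEM_DIOL_OH_PAIR / (2.0 * K_OH)

# Combined alpha-effect of the two OH groups, and what F(OH)^2 would have given.
combined_alpha = F_OH * F_PRIME_OH
squared_alpha = F_OH ** 2

# C-H contribution of the gem-diol carbon, other neighbours ignored.
k_ch_gem_diol = K_CH * combined_alpha

print(f"G(gemOH) ~ {g_gem_oh:.2f}")
print(f"F(OH) x F'(OH) = {combined_alpha:.3f}  vs  F(OH)^2 = {squared_alpha:.2f}")
print(f"k(CH) x F(OH) x F'(OH) = {k_ch_gem_diol:.2e} M^-1 s^-1")
```

The comparison makes the point of the paragraph concrete: squaring F(OH) would more than quadruple the C-H rate constant, whereas the fitted pair of factors raises it by less than 40 %.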
Finally, the carbonyl function was found to be deactivating in the β-position, as G(C=O) = 0.90. This value is in good agreement with the electron withdrawing field effect of the carbonyl function, which exhibits in the acetyl function a Swain-Lupton F value of +0.50 (Table 6). This value is close to the value for COOH (+0.44), and the two corresponding G parameter values are also in good agreement: respectively 0.90 (Table 5) and 0.73 (Monod and Doussin, 2008).

Comparison of the performance of the extended SAR with previously proposed estimation methods
To perform reliable comparisons, the rate constant values calculated by other estimation methods were taken directly from the corresponding papers or recalculated.Whenever possible, the rate constants for both aldehydes and the corresponding gem-diols were calculated, and the hydration equilibrium constant was taken into account to calculate the overall rate constant value which was plotted against our experimental data set (Table 4).The results are shown in Fig. 4 and the regression parameters are given in Table 7.
As already pointed out in Monod and Doussin (2008), it can be observed that the BDE correlation method proposed by Ervens et al. (2003) performs poorly. In spite of its great interest, this method suffers from the fact that the BDE is generally only approximately estimated. Here, it is interesting to see that this correlation leads to a systematic underestimation of the values by almost an order of magnitude. It can also be seen that the SAR proposed by Monod et al. (2005) exhibits a very poor correlation and generally leads to an overestimation by an order of magnitude. It has already been discussed (Monod and Doussin, 2008) that this SAR was probably over-parameterized, since it was the solution of only 8 equations with 8 variables based on an extremely limited database of 8 values, which all correspond to alcohols.
On the other hand, the SAR proposed by Minakata et al. (2009) exhibits quite a good correlation, as the r² correlation factor is only slightly smaller than the one arising from our work. This result is interesting, as this SAR only takes into account effects in the α-position. It must be pointed out that this characteristic does not make this SAR significantly simpler; to simulate the same chemical families (alkanes, acids, bases, alcohols, aldehydes and ketones) they propose the use of 14 parameters, while 18 are necessary here, taking into account both α- and β-position effects. The main difference in performance with our work is that the correlation study performed with the values calculated following Minakata et al. (2009) shows, on average, a systematic underestimation of 20 % (see slope values in Table 7). The reason for this bias is not clear, but it is surprising to notice that among the group rate constants that Minakata et al. (2009) proposed, no direct k(CHO) was considered, while they were very careful to assign a very small value of 7 × 10⁵ M⁻¹ s⁻¹ to k(COOH). This reactivity representation for aldehydes is probably one of the key reasons for the systematic bias found. Indeed, when one focuses on the aldehyde correlation only, the slope decreases from 0.80 to 0.69.

Conclusion
In this work, we have extended an already existing structure-activity relationship for the OH radical reaction in the aqueous phase, published by Monod and Doussin in 2008, to an extremely atmospherically relevant class of chemical compounds, i.e. the carbonyl compounds. This required significant preparatory work to take into account the hydration equilibrium, which strongly affects the reactivity of both aldehydes and ketones. It is worth noting that only five descriptors were added and that none of the values of the previously proposed descriptors were modified. The performance of the updated SAR is quite satisfactory. The linear regression of the correlation plot (Fig. 5) yields y = (1.008 ± 0.022)·x + (1.2 ± 6.9) × 10⁷ with r² = 0.86 and n = 248, which indicates that neither bias nor significant offset could be detected. This analysis also led us to evaluate that 58 % of the rate constants could be calculated within ± 20 % of the experimental data and 76 % within ± 40 %.
In the future, it will be very interesting to extend this SAR to halogenated compounds in general and to chlorides in particular, as the database for these species is quite rich and they are of great interest for surface water chemistry. For the purposes of aqueous phase atmospheric chemistry, it will also be necessary to extend this SAR to ethers, hydroperoxides and organosulfates, which are atmospherically very relevant; however, for the latter two classes of compounds, the experimental rate constant database is very limited. In addition, it might be very interesting to apply the present approach to derive structure-activity relationships for other radicals such as SO₄•⁻ or NO₃•. Indeed, these radicals are extremely relevant in certain atmospheric conditions, and the elucidation of the fate of organic materials in the aqueous phase is often dependent on our ability to predict their rate constants.
It has to be kept in mind that, however efficient a prediction method may be, it must be used with care, and the critical rate constants will therefore always have to be investigated experimentally. Indeed, any parameterization is based on a limited number of descriptors, which only reflect a very simplified view of the complexity of chemical dynamics. More complex effects such as peculiar mechanisms (see the example of electron transfer), long distance electronic effects, steric or cage effects, and cyclic transition states can lead to strong differences between predicted and correctly measured rate constants. Therefore, for atmospheric purposes, a significant amount of additional experimental data is required to continue the validation of the aqueous phase SARs and to allow for their reliable extension. Their future performance relies fully on further experimental determinations, which are essential.

Fig. 1. Correlation plot between the calculated and the experimental rate constants of aqueous phase OH-oxidation of carbonyl compounds in linear (left) and logarithmic scales (right). The straight line is the linear regression curve: $y = (1.01 \pm 0.06)\,x - (0.4 \pm 1.4) \times 10^{8}$ with $r^2 = 0.77$. The dashed lines define the limits where the rate constants are calculated within a factor of 2 from the experimental data.

Fig. 4. Correlation plot between experimental rate constants and calculated rate constants for carbonyl compounds according to previously proposed estimation methods. The lines are the linear regression lines, whose parameters are given in Table 7.

Fig. 5. Correlation plot between experimental and calculated rate constants for all compounds: alkanes, alcohols, acids, bases (Monod and Doussin, 2008), aldehydes, ketones (this work) and polyfunctionals (both works). The solid line is the linear regression line $y = 1.008 \times x$ and the dashed lines define the limits where the rate constants are calculated within a factor of 2 from the experimental data. The inset is the distribution of the factor (see Eq. 2) for the whole data set.

Table 1. Hydration equilibrium constants for the compounds used to define the SAR descriptors.

Table 2. Pseudo-first order rate constants and first order rate constants for the hydration/dehydration reactions as reported in the literature.
* Recalculated using the preferred value for the equilibrium constant as defined in Table 1.

Table 3. Example of the calculation of the estimated aqueous phase rate constants of OH-oxidation of 2-butanone and propionaldehyde, and their corresponding gem-diols.

Table 4. Database for the rate constants of OH-oxidation of aldehydes, ketones and polyfunctional carbonyl compounds.