Articles | Volume 25, issue 20
https://doi.org/10.5194/acp-25-13379-2025
© Author(s) 2025. This work is distributed under the Creative Commons Attribution 4.0 License.
Implications of VOC oxidation in atmospheric chemistry: development of a comprehensive AI model for predicting reaction rate constants
Download
- Final revised paper (published on 22 Oct 2025)
- Supplement to the final revised paper
- Preprint (discussion started on 11 Apr 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-1241', Gianluca Armeli, 19 May 2025
- AC1: 'Reply on RC1', Xian Liu, 22 May 2025
  - RC2: 'Reply on AC1', Gianluca Armeli, 22 May 2025
- RC3: 'Comment on egusphere-2025-1241', Anonymous Referee #2, 10 Jun 2025
  - AC2: 'Reply on RC3', Xian Liu, 19 Jun 2025
  - AC3: 'Reply on RC3', Xian Liu, 19 Jun 2025
Peer review completion
AR: Author's response | RR: Referee report | ED: Editor decision | EF: Editorial file upload
AR by Xian Liu on behalf of the Authors (23 Jun 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (26 Jun 2025) by Thomas Berkemeier
RR by Anonymous Referee #1 (30 Jun 2025)
ED: Publish subject to minor revisions (review by editor) (09 Jul 2025) by Thomas Berkemeier
AR by Xian Liu on behalf of the Authors (17 Jul 2025)
Author's response
Author's tracked changes
Manuscript
ED: Publish as is (17 Jul 2025) by Thomas Berkemeier
AR by Xian Liu on behalf of the Authors (18 Jul 2025)
Author's response
Manuscript
General comments
The study presents a new model for predicting reaction rate constants of volatile organic compounds (VOCs). The authors used the reaction rate constant dataset by McGillen et al. to train a Siamese message passing neural network (MPNN) to predict these rate constants. The resulting model, named “Vreact”, was shown to outperform existing models for reaction rate constant prediction.
The dataset used in this study comprises 2802 gas-phase reaction rate constants for 1586 VOCs and 4 oxidants (·OH, ·Cl, ·NO3 and O3). The authors highlight this diversity of oxidants as an advantage over previous models, each of which covers only a single oxidant. Because the rate constants span a wide range of values, they were log-transformed. Vreact takes the SMILES strings of the VOC and the oxidant as inputs, which is an established and modern approach in cheminformatics. Graph representations are generated from these inputs and fed to the neural network, which creates the molecular feature tensors A and B. Further mathematical operations are applied to account for the effects of molecular interactions. Finally, the reaction rate constant is predicted.
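The log transformation mentioned above can be illustrated with a minimal sketch. It assumes a base-10 logarithm (the base is not stated here), and the function names and example magnitudes are hypothetical, chosen only to show how values spanning many orders of magnitude collapse into a narrow numeric range.

```python
import math

def to_log10(k: float) -> float:
    """Map a positive rate constant to log10 space (base 10 is an assumption)."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return math.log10(k)

def from_log10(log_k: float) -> float:
    """Invert the transform to recover the rate constant."""
    return 10.0 ** log_k

# Illustrative magnitudes only, not values taken from the dataset.
example_ks = [1e-10, 1e-12, 1e-17]
log_ks = [to_log10(k) for k in example_ks]

# Seven orders of magnitude become a spread of about 7 units in log space.
spread = max(log_ks) - min(log_ks)
```

A model trained on such targets predicts the logarithm of the rate constant, so its output has to be passed through the inverse transform before being compared with measured values.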
Moreover, the authors evaluate how Vreact can contribute to the understanding of aerosol formation mechanisms. They showcase the oxidation of 2-methyl-4-penten-2-ol, discussing different reaction pathways and how the interaction layer of Vreact can be used to interpret them. Furthermore, the authors gathered additional data from 2020 onwards, termed the ‘post-2020 test set’, to assess the extrapolation ability of Vreact, with satisfactory results. In addition, further insights into the reaction rates of specific chemical classes are provided.
All in all, the article presents a modern and sustainable study. The Vreact model, the key component of this work, is built on well-established methods and principles and delivers convincing performance overall. Vreact’s advantages and improvements over other models are outlined clearly and comprehensibly. The study was conducted in a scientifically sound manner with no obvious shortcomings. Although the topic is rather data-science oriented, its atmospheric relevance is evident. The illustrations are helpful and supportive. The supplementary material contains further details on the model architecture and is useful for a deeper understanding. Another valuable resource is the web tool version of Vreact, which reinforces reproducibility and open data.
Specific comments
After presenting the results on the test set, the authors provide more extensive evaluations and showcases of the model’s abilities. First, they draw a more detailed comparison between Vreact and the existing single-oxidant models. To this end, they use two independent approaches: 1) using the pre-trained Vreact to predict the test sets from the literature and 2) retraining Vreact on the original train/test splits from the literature. Approach 2) is a robust method that isolates the model’s predictive capability and delivers a fair comparison. Approach 1) has the potential problem that the literature test sets may contain data points that are also part of Vreact’s training set. This would be problematic because machine learning models generally perform significantly better on data they have already seen, which would result in an unfair comparison. It would be appreciated if the authors could address this issue briefly, since it is not mentioned in the text so far.
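The overlap concern raised for approach 1) can be checked with a simple set comparison. The following is a minimal sketch, not the authors' procedure: it assumes each data point is identified by a (VOC SMILES, oxidant SMILES) pair and that all SMILES strings have already been canonicalized in the same way. The function names and the toy data are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# A data point is identified by (VOC SMILES, oxidant SMILES).
# Assumption: both sets use identically canonicalized SMILES strings.
Pair = Tuple[str, str]

def find_overlap(train: Iterable[Pair], test: Iterable[Pair]) -> Set[Pair]:
    """Return the pairs that occur in both the training and the test set."""
    return set(train) & set(test)

def remove_seen(test: Iterable[Pair], train: Iterable[Pair]) -> List[Pair]:
    """Keep only test pairs that never appeared in training, preserving order."""
    seen = set(train)
    return [pair for pair in test if pair not in seen]

# Hypothetical toy data for illustration only.
train_pairs = [("CCO", "[OH]"), ("C=C", "[O-][O+]=O"), ("CC=O", "[OH]")]
literature_test_pairs = [("CCO", "[OH]"), ("CCC", "[OH]"), ("C=C", "[Cl]")]

overlap = find_overlap(train_pairs, literature_test_pairs)
unseen_test_pairs = remove_seen(literature_test_pairs, train_pairs)
```

Reporting the size of the overlap, or evaluating only on the unseen pairs, would show how much of the performance in approach 1) could be attributed to previously seen data.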
Technical corrections
No typing errors or other technical problems were found.