Articles | Volume 26, issue 7
https://doi.org/10.5194/acp-26-4771-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
Technical note: Hybrid machine learning model for bias correction of UTLS relative humidity against IAGOS observations in ERA5 reanalysis
Download
- Final revised paper (published on 10 Apr 2026)
- Supplement to the final revised paper
- Preprint (discussion started on 10 Dec 2025)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on egusphere-2025-4529', Anonymous Referee #1, 23 Dec 2025
  - AC1: 'Reply on RC1', Jérémie Juvin-Quarroz, 14 Jan 2026
- RC2: 'Comment on egusphere-2025-4529', Anonymous Referee #2, 07 Jan 2026
  - AC2: 'Reply on RC2', Jérémie Juvin-Quarroz, 14 Jan 2026
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Jérémie Juvin-Quarroz on behalf of the Authors (22 Jan 2026)
Author's response
Author's tracked changes
Manuscript
EF by Polina Shvedko (22 Jan 2026)
Supplement
ED: Referee Nomination & Report Request started (24 Feb 2026) by Franziska Aemisegger
RR by Anonymous Referee #1 (07 Mar 2026)
ED: Publish subject to technical corrections (25 Mar 2026) by Franziska Aemisegger
AR by Jérémie Juvin-Quarroz on behalf of the Authors (30 Mar 2026)
Manuscript
## Overall
This is a nice technical note expanding on the work of Wang et al 2025. I think the manuscript deserves to be published after improving the clarity of the presentation and considering the major comments below. Ideally, the manuscript would be accompanied by example training code in the authors' language of choice.
## Major Comments
- L3 & L46: Many publications on this topic often make some form of the statement "[There are] considerable errors in RH_i estimates" which makes "accurate forecast of ISSRs [difficult]." I'm interested to see more analysis on the type and distribution of errors to better understand how ISSR forecast errors will result in ineffective (or inefficient) avoidance measures. In our experience, RH_i (and ISSRs) have high pointwise error, but overall ISSR regions are (generally) spatially and temporally correlated with ISSR forecasts.
- L253: What are the requirements to support effective contrail avoidance strategies?
- L51: Wang et al 2025 published an ANN humidity correction methodology. This publication adds an XGBoost regression for RH_i < 85%, and a different training/validation data split. Given the similarities, this line deserves a whole paragraph describing the differences with Wang 2025, and how this methodology aims to improve on the previous work.
- L74: What kind of biases in the weather might this domain selection introduce? Have you tested how well your models apply outside this domain?
- L83: Did you consider model levels? It may be worth exploring if the higher vertical resolution would improve your results.
- L117-121: How did you interpolate the values for T and q? Linear interpolation in q introduces bias when working with coarse pressure levels.
- Table 1: Teoh et al 2024 introduced a latitude correction for the humidity correction. Should latitude be a feature?
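To make the L117-121 concern concrete, here is a minimal sketch of how the choice of vertical interpolation can bias q between coarse pressure levels. It does not reproduce the authors' processing. The power-law profile `q_synthetic`, its exponent, and the level set are assumptions chosen for illustration, not ERA5 or IAGOS data.

```python
import numpy as np


def q_synthetic(p_hpa, q_ref=1.0e-4, p_ref=300.0, exponent=3.0):
    """Hypothetical power-law humidity profile q(p) in kg/kg (an assumption, not ERA5 data)."""
    return q_ref * (np.asarray(p_hpa, dtype=float) / p_ref) ** exponent


def interp_linear(p_target, p_levels, q_levels):
    """Interpolate q linearly in pressure."""
    return np.interp(p_target, p_levels, q_levels)


def interp_loglog(p_target, p_levels, q_levels):
    """Interpolate ln(q) linearly in ln(p), then transform back."""
    return np.exp(np.interp(np.log(p_target), np.log(p_levels), np.log(q_levels)))


# Coarse pressure levels in hPa, in ascending order as np.interp requires.
p_levels = np.array([200.0, 225.0, 250.0, 300.0, 350.0])
q_levels = q_synthetic(p_levels)

# A cruise-altitude pressure halfway between two levels.
p_target = 275.0
q_exact = q_synthetic(p_target)

rel_err_linear = float(interp_linear(p_target, p_levels, q_levels) / q_exact - 1.0)
rel_err_loglog = float(interp_loglog(p_target, p_levels, q_levels) / q_exact - 1.0)

print(f"linear in p:  {100 * rel_err_linear:+.2f} % error in q")
print(f"log-log in p: {100 * rel_err_loglog:+.2f} % error in q")
```

For this synthetic convex profile, linear interpolation overestimates q between levels, while log-log interpolation recovers it exactly by construction. At fixed T and p, RH_i scales approximately linearly with q, so a relative error in q carries over to RH_i. Real profiles will differ, which is why the interpolation scheme should be stated in the manuscript.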
## Minor Comments
- L31: "are spending" -> "spend"
- L33: Suggest using stats from more recent Teoh, R. et al. (2024) “Global aviation contrail climate effects from 2019 to 2021,” Atmospheric Chemistry and Physics, 24(10), pp. 6071–6093. Available at: https://doi.org/10.5194/acp-24-6071-2024.
- L37: It's worth motivating why we need to detect ISSRs. It's presumed that the reader knows "to meteorologically forecast ISSRs with enough accuracy" we need ISSR detections. May want to add context, e.g. "Global ISSR forecasts are generally derived from numerical weather forecasting systems, or nowcast from in situ measurements or inferred from remote sensing. Both approaches rely on accurate detections of ISSRs, in the first case to validate models, or in the second through measurements"
- L44: Not just ERA5 - any numerical weather prediction system. I'd flip this around - numerical weather prediction models provide a comprehensive prediction across the global atmosphere. ERA5 is a highly trusted source of numerical weather prediction.
- L45: Define what a dry-bias means
- L48: Other publications with humidity correction: (constant) Schumann, 2012; Schumann et al., 2015; Teoh et al., 2020; Schumann et al., 2021; (piecewise function) Teoh et al 2022; Teoh et al 2024; (quantile mapping) Platt et al 2024
- L55: This sentence sounds like an LLM. I'd move L59-L61 up front, remove this sentence, and then have L57-58. Can you be more specific as to why you chose the hybrid model? From this description it sounds like you used XGBoost for compute performance reasons rather than accuracy.
- L94: How long is the "longer period"?
- L104: Just confirming: is IAGOS accuracy a function of RH_i or of absolute humidity? I had remembered that humidity sensor accuracy was a function of absolute humidity.
- L126: How does this compare to Wang 2025?
- L156: Is it possible the ANN is overfitting these engineered features? You acknowledge the proper data split, but could you use additional data outside the domain to gain confidence?
- L160: This criterion sounds more like "No existing cirrus" rather than "clear sky." Could also look at the IAGOS ice crystal measurements to judge pre-existing cirrus (Petzoldt 2025)
- L170: (Re)Introduce acronym MAE
- L182: Add citation? Where does this baseline come from?
- L186-188: It's not clear to me why "structured input data" ~ drier regimes. It's clearer to me that "high humidity conditions" ~ complex non-linear dependencies.
- L221-222: This is the first clear explanation of why XGBoost is preferable to ANN for the drier regimes. L230 - L233 is also great. Bring this language up front!
- L223: Repeats the previous line
- Table 3 is super helpful - It would be helpful to use this language up front when describing the benefits of the hybrid architecture.
- Table 3, Table 4: How do these results compare with Wolf et al 2025 or Platt et al 2024 (quantile mapping)?
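For the quantile-mapping comparison requested above, a baseline can be sketched in a few lines. This is a generic empirical quantile mapping, not the implementation of Platt et al 2024, and the data are synthetic: the 15 % multiplicative dry bias, the noise level, and the uniform RH_i distribution are assumptions for illustration only. The sketch also makes the L45 and L170 terms explicit: mean bias is model minus observation, so a negative value is a dry bias, and MAE is the mean absolute error.

```python
import numpy as np


def fit_quantile_map(model_rhi, obs_rhi, n_quantiles=101):
    """Return matching empirical quantiles of the model and observed RH_i."""
    probs = np.linspace(0.0, 1.0, n_quantiles)
    return np.quantile(model_rhi, probs), np.quantile(obs_rhi, probs)


def apply_quantile_map(model_rhi, model_q, obs_q):
    """Map model RH_i onto the observed distribution.

    Values outside the training range are clamped to the end quantiles by np.interp.
    """
    return np.interp(model_rhi, model_q, obs_q)


def mean_bias(pred, obs):
    """Mean of (prediction - observation); negative means the prediction is too dry."""
    return float(np.mean(pred - obs))


def mae(pred, obs):
    """Mean absolute error."""
    return float(np.mean(np.abs(pred - obs)))


# Synthetic RH_i in percent (assumed values, not IAGOS or ERA5 data).
rng = np.random.default_rng(0)
n = 20_000
obs = rng.uniform(20.0, 130.0, size=n)
model = 0.85 * obs + rng.normal(0.0, 5.0, size=n)  # assumed dry bias plus noise

# Fit the mapping on one half and evaluate on the other half.
train, test = slice(0, n // 2), slice(n // 2, n)
model_q, obs_q = fit_quantile_map(model[train], obs[train])
corrected = apply_quantile_map(model[test], model_q, obs_q)

bias_before = mean_bias(model[test], obs[test])
bias_after = mean_bias(corrected, obs[test])
mae_before = mae(model[test], obs[test])
mae_after = mae(corrected, obs[test])

print(f"mean bias: {bias_before:+.2f} -> {bias_after:+.2f} % RH_i")
print(f"MAE:       {mae_before:.2f} -> {mae_after:.2f} % RH_i")
```

Quantile mapping matches distributions rather than individual points, so it removes the systematic bias but cannot reduce pointwise noise. Reporting it alongside the hybrid model in Tables 3 and 4 would show how much of the improvement comes from the additional predictors.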