Articles | Volume 26, issue 8
https://doi.org/10.5194/acp-26-5447-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
CloudViT: exploring cloud type classification with vision transformers in global satellite data
Download
- Final revised paper (published on 22 Apr 2026)
- Preprint (discussion started on 02 Oct 2024)
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- CC1: 'Comment on egusphere-2024-2724', Chen Zhou, 30 Oct 2024
  - AC1: 'Reply on RC1', Julien Lenhardt, 25 Feb 2025
- RC1: 'Comment on egusphere-2024-2724', Anonymous Referee #1, 17 Nov 2024
  - AC1: 'Reply on RC1', Julien Lenhardt, 25 Feb 2025
- RC2: 'Comment on egusphere-2024-2724', Anonymous Referee #2, 20 Dec 2024
  - AC1: 'Reply on RC1', Julien Lenhardt, 25 Feb 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Julien Lenhardt on behalf of the Authors (25 Feb 2025)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (06 Mar 2025) by Minghuai Wang
RR by Anonymous Referee #1 (25 Mar 2025)
RR by Anonymous Referee #2 (10 Apr 2025)
ED: Reconsider after major revisions (21 Apr 2025) by Minghuai Wang
AR by Julien Lenhardt on behalf of the Authors (13 Jul 2025)
Author's response
Author's tracked changes
EF by Katja Gänger (01 Aug 2025)
Manuscript
ED: Referee Nomination & Report Request started (02 Aug 2025) by Minghuai Wang
RR by Anonymous Referee #2 (09 Nov 2025)
ED: Reconsider after major revisions (08 Dec 2025) by Minghuai Wang
AR by Julien Lenhardt on behalf of the Authors (19 Jan 2026)
Author's response
Author's tracked changes
Manuscript
ED: Referee Nomination & Report Request started (20 Jan 2026) by Minghuai Wang
RR by Anonymous Referee #2 (21 Feb 2026)
ED: Publish subject to minor revisions (review by editor) (02 Mar 2026) by Minghuai Wang
AR by Julien Lenhardt on behalf of the Authors (06 Mar 2026)
Author's response
Author's tracked changes
Manuscript
ED: Publish as is (15 Mar 2026) by Minghuai Wang
AR by Julien Lenhardt on behalf of the Authors (20 Mar 2026)
Manuscript
This paper presents CloudViT, a novel cloud classification method based on Vision Transformers (ViTs) and cloud properties derived from MODIS satellite data. The authors aim to classify cloud types across global datasets using spatial patterns of cloud properties such as cloud top height (CTH), cloud optical thickness (COT), and cloud water path (CWP). The method is evaluated on co-located ground-based observations and satellite data, producing accurate classifications of different cloud types. The approach is further tested with applications to General Circulation Models (GCMs), notably ICON-Sapphire, showcasing CloudViT's ability to generalize cloud type retrievals at kilometer-scale resolution.
CloudViT leverages self-supervised learning for pretraining and contrastive learning to overcome the limited number of labeled cloud observations. The method is robust, showing competitive performance when compared to traditional methods and CNN-based approaches, and effectively captures global cloud distributions, including complex cloud types like cumuliform and stratiform clouds. I think the paper is suitable for acceptance with minor revisions.
Minor Comments:
L142: Change "retrieved" to the verb form "retrieve."
L177: Replace "requires" with "require" to agree with the plural subject.
L209: In "this type of model, alongside CNNs, are," replace "are" with "is" to agree with the singular subject "this type of model."
L323: Change "cardinal" to "cardinality" to correctly refer to the size or number of elements in a set.
L587-L593: I believe it would be beneficial to discuss limitations such as the following:
Since MODIS data is collected through near-nadir scanning, observations in high-latitude regions become oblique, leading to distortions and errors in cloud property retrievals, such as cloud top height and optical thickness. This could potentially affect the model’s performance in polar regions.