Automated detection and monitoring of methane super-emitters using satellite data
Joannes D. Maasakkers
Pieter Bijl
Gourav Mahapatra
Anne-Wil van den Berg
Sudhanshu Pandey
Alba Lorente
Tobias Borsdorff
Sander Houweling
Daniel J. Varon
Jason McKeever
Dylan Jervis
Marianne Girard
Itziar Irakulis-Loitxate
Javier Gorroño
Luis Guanter
Daniel H. Cusworth
Ilse Aben
Download
- Final revised paper (published on 19 Sep 2023)
- Preprint (discussion started on 26 Jan 2023)
Interactive discussion
Status: closed
- RC1: 'Comment on acp-2022-862', Anonymous Referee #1, 21 Feb 2023
This paper describes a two-step machine learning approach that uses a Convolutional Neural Network (CNN) to detect plume-like structures in TROPOMI methane data and then applies a Support Vector Classifier (SVC) to distinguish emission plumes from retrieval artefacts. The CNN is trained using hand-selected scenes from 2018-2020 and then applied to 2021 observations. This is an important topic because TROPOMI collects millions of measurements over the globe each day, and future missions will collect even more. Automated approaches are therefore needed to process these data and reliably identify emission plumes. In general, this manuscript represents a substantial contribution. However, the methods section (Sections 2.2 and 2.3) needs a substantial revision to make it more understandable to the average reader of Atmospheric Chemistry and Physics, who may not be familiar with these machine learning techniques.
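For readers less familiar with this kind of pipeline, a minimal runnable sketch of the two-step approach is given below. All shapes, feature counts, and hyperparameters are illustrative assumptions, not the authors' configuration; the synthetic data stand in for real TROPOMI scenes and engineered features.

```python
# Sketch of the two-stage approach: a CNN scores 32x32 scenes for
# plume-like structure, then an SVC combines that score with engineered
# features to separate genuine plumes from retrieval artefacts.
# Everything here is illustrative, not the authors' implementation.
import numpy as np
from tensorflow import keras
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
scenes = rng.normal(size=(n, 32, 32, 1)).astype("float32")  # synthetic XCH4 scenes
labels = rng.integers(0, 2, size=n)                         # plume / no plume

# Stage 1: CNN (untrained here; a training sketch follows further below).
cnn = keras.Sequential([
    keras.Input(shape=(32, 32, 1)),
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(2, activation="softmax"),
])
cnn_score = cnn.predict(scenes, verbose=0)[:, 1]            # P(plume-like)

# Stage 2: SVC on engineered features plus the CNN score (a 41-element
# vector in the paper; 40 synthetic features stand in for winds, albedo,
# inventory proximity, etc.).
features = rng.normal(size=(n, 40))
X = np.column_stack([features, cnn_score])
svc = SVC(kernel="rbf", probability=True).fit(X, labels)
print(svc.predict_proba(X[:3]))
```

The point is only the division of labour: the CNN sees the image, while the SVC sees scalar features that include the CNN's score.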
The description of the choice and configuration of the CNN is very short. Four reasons are cited to justify the choice of this particular machine-learning method (L131-136), but readers not familiar with these methods might not know that CNNs are commonly chosen for image and pattern recognition. We learn that “the same convolutional kernel scans the entire image”, but we are never told what this kernel is, where it comes from, or what a “pooling layer” (Figure 2) is or does.
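To make these two terms concrete: the "convolutional kernel" is a small matrix of learned weights that is slid over every position of the image (weight sharing), and a "pooling layer" downsamples the resulting feature map. A plain-NumPy sketch, with a fixed, hand-written kernel purely for illustration:

```python
# What "the same convolutional kernel scans the entire image" and a
# "pooling layer" do, stripped down to plain NumPy (illustration only).
import numpy as np

image = np.random.default_rng(1).normal(size=(32, 32))  # one synthetic scene
kernel = np.array([[0., 1., 0.],
                   [1., -4., 1.],   # a single 3x3 weight matrix; in a CNN
                   [0., 1., 0.]])   # these weights are learned, not fixed

# Convolution: slide the SAME 3x3 kernel over every position of the
# image, producing one feature-map value per position.
fmap = np.zeros((30, 30))
for i in range(30):
    for j in range(30):
        fmap[i, j] = np.sum(image[i:i + 3, j:j + 3] * kernel)

# Max pooling: keep only the largest value in each 2x2 block, shrinking
# the map while preserving strong local activations.
pooled = fmap.reshape(15, 2, 15, 2).max(axis=(1, 3))
print(fmap.shape, pooled.shape)  # (30, 30) -> (15, 15)
```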
The description of the CNN training process is even more obscure and confusing. We are told that “For the training process, the class weight parameter is set to the ratio between the number of plumes (828 positives) and negatives (2242), …”, but we are never told what the class weight parameter is or how sensitive the solution might be to this setting. The paragraph that follows (L148-163) makes the training process look more like black magic, where the user utters a few magic words (Keras, ReLU, ADAM, softmax) and wondrous things happen. All of these terms are used without reference to refereed scientific papers. Instead, the reader is sent to a web page (Chollet et al., 2015) with a sales pitch and code, and then a GitHub site (O’Malley et al., 2019). What part of these code distributions is used here? All of them? The only real reference in this paragraph is Li et al. (2018), which describes one of two approaches used for optimizing hyperparameters. Neural network training is a major component of this paper. Additional insight into these methods is essential to gain the acceptance and understanding of this Earth Science audience. At a minimum, we need to understand the specific inputs and outputs of these methods and how the results are validated against standards. A few additional figures illustrating these topics would be great.
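To make the quoted ingredients less magical, here is a minimal runnable Keras sketch. The class-weight ratio uses the quoted counts (828 positives, 2242 negatives); the architecture and learning rate are guesses, not the authors' settings.

```python
# Minimal training sketch showing what class_weight, ReLU, ADAM, and
# softmax each contribute (synthetic data; architecture is a guess).
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 32, 32, 1)).astype("float32")
y = rng.integers(0, 2, size=100)

# class_weight: each rarer positive example counts ~2.7x in the loss, so
# the network is not rewarded for always predicting "no plume".
class_weight = {0: 1.0, 1: 2242 / 828}

model = keras.Sequential([
    keras.Input(shape=(32, 32, 1)),
    keras.layers.Conv2D(32, 3, activation="relu"),  # ReLU: max(0, x)
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(2, activation="softmax"),    # softmax: class probabilities
])
model.compile(optimizer=keras.optimizers.Adam(1e-3),  # ADAM: adaptive-step SGD
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=1, class_weight=class_weight, verbose=0)
```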
The discussion of feature engineering (Section 2.3, L193-199) and the list of features in Table C1 is more helpful, but still unnecessarily confusing. I was surprised that the feature vector includes the CNN score (0, 1) as a feature that is apparently no more or less important than any other. We are told that the algorithm operates on 32x32 pixel scenes using a 41 x 1 feature vector. However, we then learn (L205) that “In our binary classification problem, the CAM visualizes which regions of the deepest feature maps (the 8x8, deepest max-pooling layer in Figure 2) lead to an activation of the plume class.” Figure 2 shows only two pooling layers and does not indicate where the 8x8, deepest max-pooling layer is. We are then told (L207-208) that “This spatial activation is calculated using the gradients between all internal 64 feature maps and the fully-connected layer.” Where did we learn about the 64 internal feature maps? At this point, I was totally lost.
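For reference, the computation being described appears to be a class activation map in the Grad-CAM style: gradients of the plume-class score with respect to the deepest feature maps are averaged into per-map weights, and the weighted sum of the maps shows which 8x8 regions drive the decision. A compact sketch, with a stand-in architecture rather than the paper's exact network:

```python
# Grad-CAM-style class activation map (stand-in architecture, not the
# paper's network): which 8x8 regions of the deepest feature maps
# activate the plume class.
import tensorflow as tf
from tensorflow import keras

inp = keras.Input(shape=(32, 32, 1))
x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(inp)
x = keras.layers.MaxPooling2D()(x)                      # 16x16
x = keras.layers.Conv2D(64, 3, padding="same", activation="relu")(x)
fmaps = keras.layers.MaxPooling2D()(x)                  # 8x8, 64 feature maps
out = keras.layers.Dense(2, activation="softmax")(keras.layers.Flatten()(fmaps))
model = keras.Model(inp, [fmaps, out])

scene = tf.random.normal((1, 32, 32, 1))                # synthetic input
with tf.GradientTape() as tape:
    maps, probs = model(scene)
    plume_score = probs[:, 1]                           # "plume" class activation
grads = tape.gradient(plume_score, maps)                # d(score)/d(feature maps)
weights = tf.reduce_mean(grads, axis=(1, 2))            # one weight per map (64)
cam = tf.nn.relu(tf.reduce_sum(maps * weights[:, None, None, :], axis=-1))
print(cam.shape)  # (1, 8, 8): spatial activation of the plume class
```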
Smaller issues, concerns and editorial suggestions:
L66: “As such the hyperspectral …” → “As such, the hyperspectral …”
L78: “atmospheric conditions and identify plume signatures” → “atmospheric conditions to identify plume signatures …”?
L85: “we target three high-resolution satellite instruments …” The word “target” is ambiguous here, because it sometimes means pointing a satellite instrument (i.e., GHGSat) at a target. From the context, I believe you mean “we use data from three high-resolution satellite instruments …” Is that correct?
L90: “We use two machine learning models in sequence to detect plumes in the TROPOMI methane data. First we apply a Convolutional Neural Network to detect plume-like structures in TROPOMI methane atmospheric mixing ratio data, then we use additional atmospheric parameters and supporting data to further distinguish between genuine methane plumes and retrieval artefacts. We then use (targeted) high-resolution methane observations to pinpoint the responsible sources.”
- These two sentences are largely redundant with the last few sentences of the paragraph above. It would be better to replace them with a brief overview of the next few subsections.
Sections 2.2 – 2.4 – see comments above.
L314: “allows multiple close by plumes” → “allows multiple nearby plumes”
- Also, somewhere in this paragraph, it would be good to specify the minimum number of enhanced pixels needed to define a "plume".
L345, L358, L371: It appears that the same 10 m wind field is used in the analysis of GHGSat, PRISMA, and Sentinel-2, but two different notations are used. For GHGSat (L345), it is called “U10 winds from GEOS-FP”, while for the other two satellites it is called “GEOS-FP 10m wind data”. It would be good to use consistent nomenclature.
L351: “hyperspectral 30x30 km2 images at a spatial resolution of 30x30 m …” If the pixels and images are nearly square, it would be better to describe their dimensions as 30 km x 30 km and 30 m x 30 m, respectively.
L352: “The minimum revisit time can be up to 7 days with ±20% across-track pointing.”
- Do you mean "up to" or "as short as" 7 days? For example, can PRISMA sometimes have repeat cycles as short as one day or are they almost always longer than 7 days?
L356: “location of interest on a future moment in time.” → “location of interest in the future.”
L361: “capable of the detection of methane” → “capable of detecting methane”
L362: “with a pixel resolution of 20 m” → is this “with a pixel resolution of 20 m x 20 m”?
L369: “that similarly to Varon et al. (2021)” → “that, like Varon et al. (2021), uses …”
L375: “identifies 26,444 scenes (3.3 %) as containing” → “identifies 26,444 scenes (3.3 %) that contain”
Citation: https://doi.org/10.5194/acp-2022-862-RC1
- RC2: 'Comment on acp-2022-862', Anonymous Referee #2, 06 Mar 2023
General comments
This study by Schuit et al. proposes a machine-learning-based approach to automatically detect super-emitters of methane from high-resolution TROPOMI retrievals. This is important work because it provides opportunities to make the most of the huge amount of measurements obtained by TROPOMI and other instruments in the future. However, I found the methodology section hard to follow, as it involves different models (e.g., CNN, SVC, IME, the Weather Research and Forecasting model coupled with a Chemistry module, version 4.1.5) and different datasets (e.g., TROPOMI, emission inventories, GEOS-FP, GHGSat, PRISMA and Sentinel-2), and the relationships between these models and datasets are complicated.
- A flow chart of the full methodology framework is needed, describing each step of the approach in detail, including its purpose, its input and output data, and the model involved. Such a figure would help readers better understand the whole procedure of this complicated approach. For example, it could explain the relationship between the CNN and the SVC model and describe how data are passed between them.
- The methods section should then be reorganized according to this flow chart.
- Another important piece of information that should be reflected in the flow chart is whether each step is automatic or manual. I have noticed that some processes in this method need manual inspection, and this kind of information should be summarized.
- The evaluation of the estimated emission source rates is conducted through comparison with GHGSat, PRISMA, and Sentinel-2 detections. Such comparisons were conducted for several locations as examples. I suggest the authors clarify the representativeness of these locations. Meanwhile, a comparison of all the available source rates estimated from the different methods is encouraged, including the correlation coefficient, mean bias, etc. (a minimal computation of these metrics is sketched below).
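As a minimal illustration of the requested metrics (numbers are synthetic; only the computation matters):

```python
# Correlation coefficient and mean bias between TROPOMI-based source
# rates and high-resolution estimates (hypothetical values).
import numpy as np

tropomi_q = np.array([12.0, 25.0, 8.0, 31.0, 18.0])  # t/h, hypothetical
highres_q = np.array([10.0, 28.0, 9.0, 26.0, 20.0])  # t/h, hypothetical

r = np.corrcoef(tropomi_q, highres_q)[0, 1]
mean_bias = np.mean(tropomi_q - highres_q)
print(f"r = {r:.2f}, mean bias = {mean_bias:+.1f} t/h")
```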
Other specific comments:
- The only third-level heading under Sect. 2.3 is Sect. 2.3.1. Besides, does “source rate quantification” belong to “feature engineering”? I think that the source rate quantification step should come after the SVC model, once artefacts have been excluded (see the IME sketch after this list for what that step computes).
- Line 322: How did the authors define the dominant source type in a specific grid cell if it contains mixed emission sources?
- Line 332: What is the inversion-based method? Please provide a brief description of this method and compare it with the new method developed in this study.
- Is the method able to capture day-to-day variations in emissions for a specific detected source?
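For reference on the source rate quantification step mentioned above, here is a minimal sketch of the integrated mass enhancement (IME) method of Varon et al. (2018), a commonly used approach for quantifying such plumes (not necessarily the authors' exact implementation); all values are synthetic.

```python
# Integrated mass enhancement (IME) sketch after Varon et al. (2018):
# Q = U_eff * IME / L, with L the plume length scale. Synthetic values.
import numpy as np

pixel_area = 5500.0 * 7000.0                          # m^2, ~5.5 x 7 km TROPOMI pixel
delta_omega = np.array([2e-4, 3e-4, 1.5e-4, 2.5e-4])  # kg CH4 m^-2 column excess
                                                      # in the masked plume pixels
ime = delta_omega.sum() * pixel_area                  # kg of excess methane
L = np.sqrt(delta_omega.size * pixel_area)            # plume length scale, m
u_eff = 4.0                                           # m/s, effective wind speed
                                                      # derived from 10 m winds
Q = u_eff * ime / L                                   # kg/s
print(f"Q = {Q * 3.6:.1f} t/h")                       # ~40 t/h for these numbers
```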
Citation: https://doi.org/10.5194/acp-2022-862-RC2
- AC1: 'Comment on acp-2022-862', Berend Schuit, 21 Apr 2023