The Information Content of Dense Carbon Dioxide Measurements from Space: A High-Resolution Inversion Approach with Synthetic Data from the OCO-3 Instrument
Abstract. Bottom-up accounting methods of carbon dioxide (CO2) emissions can provide high-resolution emissions estimates at a global scale; however, the necessary in situ observations to verify these emissions are limited in coverage. Space-based observations of CO2 in the Earth’s atmosphere expand this coverage to a near-global scale to inform carbon cycle science and record emission trends. This work applied an observing system simulation experiment (OSSE) to characterize the flux information contained in “Snapshot Area Map” (SAM) CO2 measurements from the Orbiting Carbon Observatory-3 (OCO-3). Unlike previous space-based carbon-observing systems, OCO-3 SAMs provide spatially dense observations of CO2 over targeted urban areas at unprecedented coverage. A Bayesian inversion using synthetic data was applied to these SAMs to explore their effectiveness in optimizing estimates of fossil fuel CO2 (FFCO2) emissions from the Los Angeles Basin. Results demonstrated that errors in the locations of large point sources diminished the inversion’s ability to reduce errors at the sub-city-level. Furthermore, reductions in atmospheric transport error exacerbated these issues. Only after geolocation errors in large point source locations were removed and atmospheric transport error was reduced did individual SAM observations provide modest corrections to prior flux estimates. The aggregation of multiple SAMs proved to be effective in reducing systematic errors in manufacturing- and transportation-related estimates, demonstrating the need for similar measurements in future space-based missions.
Dustin Roten et al.
Status: final response (author comments only)
- RC1: 'Comment on acp-2022-315', Anonymous Referee #1, 17 Jun 2022
- RC2: 'Comment on acp-2022-315', Anonymous Referee #2, 18 Jul 2022
- AC1: 'Comment on acp-2022-315', Dustin D. Roten, 13 Sep 2022
Vulcan 3.0 https://daac.ornl.gov/cgi-bin/dsviewer.pl?ds_id=1741
Model code and software
Bayesian Inversion Code https://doi.org/10.5281/zenodo.2655990
The manuscript "The Information Content of Dense Carbon Dioxide Measurements from Space: A High-Resolution Inversion Approach with Synthetic Data from the OCO-3 Instrument" by Roten et al. assesses the potential of XCO2 "Snapshot Area Maps" (SAMs) from OCO-3 for the quantification of the CO2 emissions from large cities based on a classical Bayesian atmospheric inversion framework. It relies on tests with pseudo XCO2 data over Los Angeles.
Inversions of anthropogenic CO2 emissions from cities based on satellite data are receiving growing interest, driven by the analysis of OCO-2 and OCO-3 city plume transects or images, the preparation of new generations of satellite XCO2 imagers, and the development of dedicated inversion systems. A wide range of studies have been published on this topic using either OCO-2/3 or synthetic XCO2 data. The authors of this manuscript have developed material (tools, simulations, experimental protocols, diagnostics) that can support new insights in this field. Their experiments bring some interesting results.
However, 1) the manuscript requires a major general rewrite and 2) I have concerns regarding the specific configuration of the experiments and the conclusions drawn from the results.
1) The manuscript is laborious to read because of inappropriate or vague wording and notation, and because of a lack of rigor and precision. Effort, careful reasoning and a good knowledge of atmospheric inversion are often needed to unravel the meaning of the text.
The abstract and introduction contain many meaningless and random statements. The abstract is hardly informative because its statements lack context. The introduction alternates between general considerations on CO2 atmospheric inversions and remarks that apply to city scale applications only. It is sometimes difficult to connect a statement to the corresponding reference to a past publication. The discussions on the ground based networks and on the "increased spatiotemporal" coverage of OCO-3 compared to OCO-2 are a bit misleading. The justification for the use of pseudo data experiments ("the use of synthetic data from OCO-3 eliminated the potential for systematic biases from local CO2 emissions reductions during the COVID-19 pandemic lockdowns and biases in preliminary data from OCO-3") is a bit puzzling and highlights the need for a clearer rationale for the specific analysis conducted in this new study.
Retrieving the critical assumptions and parameters of the modeling and inversion configurations from section 2 (e.g. regarding the set-up of the control vector as a function of the test cases, or regarding the precise set-up and iterative process of test case 4) is laborious. The end of section 2.4 is particularly confusing. The information is not properly organized. The presentation of the diagnostics in section 3 lacks clarity and contains many small missteps. The title of the manuscript itself could be rethought to be more informative about the purpose of the study (e.g. about the focus on the monitoring of anthropogenic CO2 emissions from cities).
2) The experiments and the analysis raise many questions. Given the current state of the manuscript, I focus on general concerns, in connection with the three main conclusions given in the abstract:
a) A major result from the experiments is the increase of errors in the emission estimates from the inversion in test cases 1 and 2 (not only at "sub city level" as suggested by the abstract, but also at the city level). These test cases are those for which the differences between the true and prior estimates of the emissions are appropriately characterized by the difference between two inventories that are widely used in the community. The analysis reveals that this increase in errors is due to the poor match between the spatial correlations in the Q matrix and the actual discrepancies between the two inventories. Indeed, when considering anthropogenic emissions within a city, exponentially decaying spatial correlations (inherited from large scale inversion practices) hardly make sense, and the results confirm it. Such correlations may make more sense if the control vector is split between the different sectors of activity, with no correlation of uncertainties across sectors. But even in that case, the size of cities and the dynamics of the emissions hardly justify correlations of uncertainties that decrease in space in an isotropic way. More details about the diagnostics of the correlation lengths may feed this discussion, but this computation seems to be based on a single realization of the error map (lines 300-305 are not really clear), which can be misleading here: the significant correlations in space probably arise from the areas of the map where the different sectors are relatively well mixed, and the test does not seem to account for the fact that the correlation may rise again at larger distances between areas that host similar emission sectors.
As briefly envisaged in section 4, rather than building a hybrid ("custom") prior estimate of the emissions to overcome the problem, I believe that the authors should have improved the set-up of the control vector and of the corresponding matrix Q, especially since results from test case 4 reveal that the use of this custom prior does not really resolve the lack of improvement in the emission estimates. The implicit conclusion suggested by the manuscript, that the large point sources within the city should be correctly geolocated to get good estimates of the city total emissions, cannot rely on the current set of results.
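To make the concern about isotropic correlations concrete, the following is a minimal sketch of an exponentially decaying prior covariance of the kind commonly used in larger-scale inversions. It is not the authors' code: the function name, the four-cell transect, the uniform uncertainties and the 5 km correlation length are all hypothetical choices made for illustration.

```python
import numpy as np

def exponential_prior_covariance(coords_km, sigma, corr_length_km):
    """Build Q[i, j] = sigma[i] * sigma[j] * exp(-d_ij / L) (isotropic decay)."""
    coords_km = np.asarray(coords_km, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    # Pairwise Euclidean distances between grid-cell centres, in km.
    diff = coords_km[:, None, :] - coords_km[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # The correlation depends on distance only, not on the emitting sector.
    corr = np.exp(-dist / corr_length_km)
    return np.outer(sigma, sigma) * corr

# Hypothetical transect of four cells spaced 5 km apart. Assume cells 0 and 3
# host the same sector (e.g. power plants) and cells 1 and 2 host roads.
coords = [[0.0, 0.0], [5.0, 0.0], [10.0, 0.0], [15.0, 0.0]]
sigma = [1.0, 1.0, 1.0, 1.0]
Q = exponential_prior_covariance(coords, sigma, corr_length_km=5.0)
print(np.round(Q, 3))
```

In this toy set-up the two cells assumed to share a sector (0 and 3) receive the weakest correlation, exp(-3) or about 0.05, while neighbouring cells from different sectors receive exp(-1) or about 0.37. A distance-only model cannot represent sector-driven error structure, which is the mismatch described above.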
b) My understanding is that the perturbations applied to the "true" XCO2 field in order to generate pseudo data are not consistent with the set-up of the R matrix in the inversion system. This could provide insights into the skill of the inversion when the inversion configuration does not properly characterize the statistics of the actual errors in the model vs. data misfits. But this needs to be properly handled, analyzed and discussed. Here, the manuscript ignores this lack of consistency and draws conclusions that can be highly misleading, in particular regarding the impact of "decreasing the model error" (actually, of decreasing R, while the "true" model errors are null in these experiments). If there are no transport model errors in the model vs. data misfits, then decreasing or increasing R simply leads to fitting the data more or less closely. If the data drive the inversion in the wrong direction (which is the case in test case 1), decreasing R will worsen the problem (which is the case in test case 2). That does not readily say anything about what would happen if there were small or large transport errors in the model vs. data misfits.
This point (b) connects to the previous one (a) and the lack of consistency between the assumption by the inversion system that prior uncertainties follow the distribution N(0,Q) and the actual differences between ODIAC and Vulcan.
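The statement in (b) that changing R only changes how closely the data are fitted can be illustrated with the standard linear Gaussian update. This is a sketch under stated assumptions, not the manuscript's configuration: the random footprint matrix H, the flux vectors, the identity Q and the three R variances are all hypothetical, and the pseudo-data are noise-free with a perfect transport operator.

```python
import numpy as np

def bayesian_update(x_prior, Q, H, R, y):
    """Posterior mean: x_prior + Q H^T (H Q H^T + R)^-1 (y - H x_prior)."""
    S = H @ Q @ H.T + R              # covariance of the prior data misfit
    innovation = y - H @ x_prior     # prior model vs. data misfit
    return x_prior + Q @ H.T @ np.linalg.solve(S, innovation)

rng = np.random.default_rng(0)
n_flux, n_obs = 4, 12
H = rng.uniform(0.0, 1.0, size=(n_obs, n_flux))  # hypothetical footprints
x_true = np.array([4.0, 1.0, 1.0, 1.0])          # hypothetical "true" fluxes
x_prior = np.array([1.0, 1.0, 1.0, 4.0])         # hypothetical prior fluxes
Q = np.eye(n_flux)                               # assumed prior covariance
y = H @ x_true                                   # perfect transport, no noise

results = {}
for r_var in (10.0, 1.0, 0.01):
    R = r_var * np.eye(n_obs)
    x_post = bayesian_update(x_prior, Q, H, R, y)
    # Norm of the posterior data misfit for this choice of R.
    results[r_var] = float(np.linalg.norm(y - H @ x_post))
    print(f"R variance {r_var:>5}: posterior data misfit {results[r_var]:.4f}")
```

With R proportional to the identity, the posterior data misfit shrinks as the R variance is reduced. The sketch shows only this data-fit behaviour; whether the flux errors shrink or grow depends on how well Q and H describe the actual prior errors, which is the point connecting (b) to (a).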
c) It is hard to understand why the analysis of the average emission estimates based on the joint use of multiple SAMs in test case 4 relies on an experimental set-up that is completely different from the other ones. Why consider a huge bias (a factor of 4, which hardly applies to city scale inventories?) and no other errors in the prior estimate of the emissions here? The results from this test case are difficult to connect to the others, and the huge error in the prior estimate of the emissions prevents this experiment from being convincing about the potential of the SAMs. More generally, the analysis hardly provides a quantitative assessment of the typical precision of the emission estimates from the inversions, which is a key indicator of this potential.
d) A few other points:
- Los Angeles appears to be a very complex case for city scale inversions due to the surrounding topography and ocean. The modeling of the CO2 transport over such a city is challenging (from this point of view, lines 214-217 can be misleading). By using the same transport model to simulate the pseudo data and for the inversion (i.e. by using a perfect transport model for the inversion), the study avoids this issue. This should be properly discussed.
- The characterization of the "background" field underlying the CO2 emitted by the city appears to be a critical source of uncertainty for city scale inversions (in general, it does not seem to be as easy as suggested by lines 271-272). The size of OCO-3 SAMs may actually be limiting for the characterization of the background of cities such as Los Angeles. This should also be appropriately taken into account when discussing the optimal spatial sampling in section 4.
=> Despite ignoring these two critical sources of errors (in addition to the "bias" in the real retrievals of XCO2 data, such as those mentioned at line 77), the inversion hardly provides convincing results for the estimate of the city total emissions (improvements at city scale in test case 3 are nearly negligible, and see my concerns regarding test case 4). Contrary to the last statement of the abstract (and to those of the final lines of the introduction), it does not really demonstrate the need for such measurements.
- A point regarding the set-up of R: first, lines 267-268 are misleading. Prior XCO2 vs. data misfits in the XCO2-space include the transport of the errors in the prior estimate of the fluxes. Then, the misfits between the pseudo data and the prior XCO2 concentration seem not to include transport model, background and biosphere errors. Finally, by construction, there is no spatial correlation between the instrumental errors that were used to perturb the pseudo-data. Therefore, it seems that the derivation of the spatial correlations for R based on the comparison between the pseudo-data and the prior XCO2 (line 286) does not make sense.
- Lines 340-343: I do not agree. The level of improvement of the fit to the data is not an index of the potential to reduce the error in the emission estimates, or at least not at scales larger than the resolution of the control vector (especially when the R and B matrices are not consistent with the actual errors, as is the case here). Controlling emissions at higher spatial resolution provides more degrees of freedom to fit the data, along with the capability to decrease errors in emission estimates at this higher resolution. But one cannot say much more about this with the diagnostics provided there. This concern propagates to lines 347-348 and subsequent considerations until the end of section 3.1.1.
- Regarding the discussion at lines 541-548: what does Figure 9 say about it (either the posterior estimates of the emissions or the uncertainties in these estimates)?
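The point above about the set-up of R can be written out under simple assumptions. Suppose the pseudo-data are the transported true fluxes plus a perturbation, y = H x_true + ε, where H denotes the transport (footprint) operator, the prior flux error is zero-mean with covariance Q_true, and ε is zero-mean with covariance R_ε and independent of the flux error. This notation is assumed here for illustration and is not taken from the manuscript.

```latex
\begin{aligned}
d &= y - H x_{\mathrm{prior}} = H\,(x_{\mathrm{true}} - x_{\mathrm{prior}}) + \varepsilon, \\
\operatorname{Cov}(d) &= H\, Q_{\mathrm{true}}\, H^{\mathsf{T}} + R_{\varepsilon}.
\end{aligned}
```

If the perturbations are spatially uncorrelated, R_ε is diagonal, so any spatial correlation found in the prior misfit d comes from the transported flux errors H Q_true H^T rather than from observation errors.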