Airborne measurements of CO

As widely recognized at the international level, anthropogenic emissions need to be reduced (IPCC, 2014). This in turn requires reliable climate predictions from atmospheric models so that policymakers can make informed decisions. Unfortunately, current climate predictions are hampered by large uncertainties; for example, intercomparisons of different models show important differences in their predictions (Friedlingstein et al., 2006). This makes it difficult to identify the most effective environmental policies to implement. Because most biogenic fluxes in Europe are influenced by human activities, with 22 % of Europe's land dedicated to agriculture (FAO, 2013) and 45 % covered by forests, of which 80 % is managed for wood supply (UNECE, FAO, 2011), understanding and managing these biogenic fluxes must also be a component of any policy to reduce anthropogenic emissions.

A commonly used approach to estimate carbon budgets by teasing apart sources
and sinks in a given spatial domain is the atmospheric Bayesian inversion.
Atmospheric inversions combine prior knowledge from emission inventories with
atmospheric observations acting as a top-down constraint to produce better
posterior knowledge. As the main goal of this study is to assess the benefit
of inter-species correlations in reducing the uncertainty of the posterior
state space, we are particularly interested in the effects of such
correlations on the uncertainty reduction, defined as the difference between
prior and posterior uncertainty normalized by the prior. The vast majority of
published papers on atmospheric inversions investigate the budget of a single
species, usually a long-lived greenhouse gas like CO

So far, most studies of atmospheric inversions have used both continuous
in situ and flask measurements from ground-based
observational networks of tall towers (e.g. Kadygrov et al., 2015;
Sasakawa et al., 2010).
However, as profiles collected from aircraft easily exceed the height of
towers, airborne data may also offer an interesting option. This alternative was tested in some recent studies that made
use of aircraft profiles alone or in combination with other data sources
(e.g. Brioude et al., 2013; Gourdji et al., 2012). Ways to improve the
cost-effectiveness of airborne data include the use of unmanned aircraft
(drones) and commercial airliners. The latter, in particular, allow
data to be collected on a regular basis without requiring a particularly small or
light sensor. The most important projects making use of commercial airliners
are CONTRAIL (Comprehensive Observation Network for TRace gases by AIrLiner; Machida et al.,
2008) and MOZAIC/IAGOS (Measurements of Ozone and water vapour by in-service
AIrbus aircraft/In-service Aircraft for a Global Observing System; Marenco et al., 1998; Petzold et al., 2015). Both projects have been running for more
than two decades and have produced extensive datasets that have proven to be
important in the fields of atmospheric modelling and satellite calibration
and validation (Zbinden et al., 2013; Sawa et al., 2012). Regarding
carbonaceous species, CONTRAIL has so far collected CO

This paper is focused on investigating the benefits
of such a multi-species inversion on uncertainty reduction in comparison with a single-species
inversion. To achieve this goal, we set up a synthetic experiment using
the measurement times and locations from the IAGOS project in the
year 2011. The present paper is intended to pave the way for future studies
making use of multi-species IAGOS datasets when they become available. A
receptor-oriented framework was set up to derive flux interactions between
the atmosphere and the biosphere using IAGOS data. The modelling framework is
composed of a Lagrangian particle dispersion model (LPDM, specifically the
STILT model), a diagnostic biosphere–atmosphere exchange model (the VPRM
model), gridded emission inventories, global tracer transport model output
that provides the tracer boundary conditions for the regional domain, and a
Bayesian inversion scheme. The present work is based on the modelling
framework used in Boschetti et al. (2015) and builds upon it by adding other
species and using a formal Bayesian inversion. A multi-species inversion
was carried out in order to exploit the correlations in uncertainties
between CO

Specific emission sectors accounted for in the state vector and aggregated categories as used in Fig. 8.

Before describing the different models composing the modelling framework, we
introduce some specific terminology to reduce ambiguity in Sect. 2.1.1–2.1.6. Quantities that can be observed are termed
“species” or “trace gases”, corresponding
in this case to total CO

Specific fuel types accounted for in the state vector and aggregated categories as used in Fig. 8.

In this study the modelled profiles have identical structure to those collected from the IAGOS fleet of commercial airliners. More precisely, the spatial and temporal coordinates of different observations will be used as input for the modelling framework, whereas the observed values of atmospheric mixing ratios of CO and meteorological parameters themselves will play a role in calibrating the modelling framework.

Central to this work is the concept of the mixed layer (ML), the lower part
of the troposphere in which trace gases are well mixed by turbulent
convection on a timescale of an hour or less, and in which the effect of
regional surface–atmosphere fluxes is the strongest. As input to the
inversion we use the enhancement of the species' mixing ratio within the
mixed layer relative to that in the free troposphere (FT), similar to the
approach described in Boschetti et al. (2015). This mixed layer enhancement best
reflects the influence of regional fluxes. To compute this, we take the
average mixing ratio within the mixed layer and subtract the value taken at
2 km above the mixed layer top (

The modelling framework is composed of a regional transport model (STILT), the EDGAR (Emission Database for Global Atmospheric Research) emission inventory to model anthropogenic emissions, VPRM (Vegetation Photosynthesis and Respiration Model) to model emissions from the biosphere, and output from global transport models for lateral boundary conditions for the different modelled species. The expressions “anthropogenic emissions” and “fossil fuel emissions” are considered synonymous in this paper and are used to indicate the sum of fossil fuel and biofuel emissions, without including contributions from LULUCF (Land Use, Land-Use Change and Forestry).

For regional transport we use the LPDM STILT (Stochastic Time-Inverted Lagrangian Transport; Lin et al., 2003) to derive the sensitivity of the atmospheric mixing ratio measurement to upstream surface–atmosphere fluxes, the so-called “footprints”. Briefly, for each measurement location and time (also called a receptor point), the model releases an ensemble of virtual particles that are transported backward in time using wind fields from ECMWF, with turbulence represented as a stochastic process. The residence time of the particles within the lower half of the mixed layer determines the potential contribution from surface fluxes, and the cumulative sum of these contributions yields the footprint, which identifies the part of the domain that influences a single receptor point. To represent the mixed layer enhancements, the footprints for receptors within the boundary layer are averaged, and the footprint for the free tropospheric receptor is subtracted from this average, resulting in a footprint for the mixed layer enhancement. This footprint is then matrix-multiplied with an emission map from an emission inventory, resulting in a simulated mixing ratio enhancement corresponding to the regional contribution at the measurement location.
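The footprint arithmetic described above can be illustrated with a minimal sketch. It assumes that footprints and emissions are already on the same grid and that the time dimension has been collapsed; the function names, array shapes, and random test values are hypothetical and are not taken from STILT or from the code used in this study.

```python
import numpy as np

def ml_enhancement_footprint(ml_footprints, ft_footprint):
    """Average the footprints of the receptors inside the mixed layer and
    subtract the footprint of the free-tropospheric receptor.

    ml_footprints: array of shape (n_ml_receptors, ny, nx)
    ft_footprint:  array of shape (ny, nx)
    """
    return ml_footprints.mean(axis=0) - ft_footprint

def simulated_enhancement(footprint, emissions):
    """Multiply the footprint with the emission map cell by cell and sum
    over the grid, giving one simulated mixing ratio enhancement."""
    return float(np.sum(footprint * emissions))

# Synthetic illustration only: three ML receptors, one FT receptor, 4 x 5 grid.
rng = np.random.default_rng(0)
ml_fp = rng.random((3, 4, 5))
ft_fp = 0.1 * rng.random((4, 5))
emis = rng.random((4, 5))

fp = ml_enhancement_footprint(ml_fp, ft_fp)
enh = simulated_enhancement(fp, emis)
```

Because every step is linear, the same enhancement is obtained by simulating each receptor separately and then averaging and subtracting the resulting mixing ratios.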

A detailed description of STILT is given in Lin et al. (2003) and Gerbig et
al. (2003). We use STILT coupled with emission models for both anthropogenic
(EDGAR) and biosphere (VPRM) fluxes on a regional domain that covers most of
Europe (33 to 72

For fossil fuel emissions, we use a model based on the EDGAR v4.3.1 emission
inventory (European Commission, 2016) modified
following the same approach taken for COFFEE (CO

STILT transport is driven by meteorological fields from the ECMWF IFS (12 h forecasts twice daily at 3-hourly temporal resolution), which have a
spatial resolution of 0.25

Atmospheric inversions provide an estimate of the distribution of sources
and sinks over the domain's surface from available concentration
measurements (top-down approach). This can be formalized in the
following linear relation:

Bayesian inversion combines observations (IAGOS profiles) with a priori
information (scaling factors and their a priori uncertainties) to
reconstruct the most probable state vector. Optimum posterior estimates of
the scaling factors are obtained by minimizing the following cost function

The targeted quantities of this study are the aggregated emissions over a
specific area at a specific timescale (e.g. month); those quantities can be
derived from the prior and posterior state through a spatiotemporal
aggregation operator

Cumulative sum of the ML footprints for all the flights into or out of Frankfurt airport (FRA) in the year 2011. The grey line delineates the 50 % footprint.

To quantitatively assess the information provided by the inversion, the
reduction of uncertainty in the posterior compared to the prior estimate is
a useful measure. The uncertainty reduction (UR) is defined as
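As a minimal illustration of the verbal definition given in the introduction (the difference between prior and posterior uncertainty, normalized by the prior), the quantity can be computed as below. The function name and the example numbers are hypothetical.

```python
import numpy as np

def uncertainty_reduction(sigma_prior, sigma_post):
    """Return (sigma_prior - sigma_post) / sigma_prior, element-wise.

    A value of 0 means the observations added no information;
    a value of 1 means the posterior uncertainty vanished.
    """
    sigma_prior = np.asarray(sigma_prior, dtype=float)
    sigma_post = np.asarray(sigma_post, dtype=float)
    return (sigma_prior - sigma_post) / sigma_prior

# Hypothetical monthly uncertainties for two targeted quantities.
ur = uncertainty_reduction([0.5, 0.4], [0.25, 0.3])
```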

As in this study a multi-species inversion with CO, CO

Prior error correlation matrix

We used a single year (2011) dataset restricted to the vertical profiles centred at the Frankfurt airport (FRA) and restricted to daytime during well-mixed atmospheric conditions (10:30 to 17:30 CET). The dataset contains 1098 pseudo-observations, 366 for each of the three observable species, whereas the state vector contains the scaling factors for 2604 flux categories, each equal to one in the prior.
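For a linear forward model with Gaussian errors, the posterior state and its error covariance have a closed form. The sketch below shows the standard textbook update for that case; it is not the inversion code used in this study, and the symbols (K for the Jacobian, B for the prior error covariance, R for the model–data mismatch covariance) as well as the toy dimensions are assumptions made for illustration. The study itself uses 1098 pseudo-observations and 2604 scaling factors.

```python
import numpy as np

def bayesian_update(K, x_prior, B, y, R):
    """Analytic linear-Gaussian update.

    K: Jacobian, shape (n_obs, n_state)
    x_prior: prior state (scaling factors), shape (n_state,)
    B: prior error covariance, shape (n_state, n_state)
    y: observations, shape (n_obs,)
    R: model-data mismatch covariance, shape (n_obs, n_obs)
    """
    S = K @ B @ K.T + R                  # covariance of the prior residuals
    G = np.linalg.solve(S, K @ B).T      # gain matrix B K^T S^-1 (B, S symmetric)
    x_post = x_prior + G @ (y - K @ x_prior)
    A = B - G @ K @ B                    # posterior error covariance
    return x_post, A

# Toy problem: 6 observations, 10 scaling factors, all equal to one in the prior.
rng = np.random.default_rng(1)
n_obs, n_state = 6, 10
K = rng.random((n_obs, n_state))
x_prior = np.ones(n_state)
B = 0.25 * np.eye(n_state)
R = 0.01 * np.eye(n_obs)
x_true = x_prior + rng.normal(0.0, 0.5, n_state)   # synthetic "truth"
y = K @ x_true                                     # noise-free pseudo-observations

x_post, A = bayesian_update(K, x_prior, B, y, R)
```

With fewer observations than unknowns, as in the study, only some combinations of scaling factors are constrained, which is why the results are evaluated on aggregated quantities rather than on individual state vector elements.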

The prior error covariance matrix can be expressed as follows:

between different anthropogenic modelled species;

between GEE and respiration;

between different emission sectors;

between different fuel types.

After having specified the prior error correlation matrix

The posterior of each Bayesian inversion depends on its specific prior. As
the multi- and single-species inversions have different prior uncertainty
structures, the uncertainty reduction for targeted quantities cannot be
directly compared (Eq. 4). To be able to compare the two inversions, we
require that the a priori aggregated uncertainty of the targeted quantities
remains the same, and distribute it differently each time; the prior
rescaling matrix

Relative uncertainty of the prior fluxes aggregated domain-wide and annually for the different cases.
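The requirement that the aggregated prior uncertainty stays the same across inversions can be illustrated with a simple sketch. The rescaling matrix used in this study is not reproduced here; the example below assumes the simplest possible choice, a single scalar factor applied to the whole covariance matrix, and all names and numbers are hypothetical.

```python
import numpy as np

def aggregated_sigma(B, a):
    """Uncertainty of the aggregated quantity a^T x for covariance B."""
    return float(np.sqrt(a @ B @ a))

def rescale_to_target(B, a, sigma_target):
    """Scale B by one scalar so that the aggregated uncertainty equals
    sigma_target. Correlations are left unchanged by this operation."""
    return B * (sigma_target / aggregated_sigma(B, a)) ** 2

# Three state elements with hypothetical prior standard deviations.
sig = np.array([0.3, 0.5, 0.4])
B_single = np.diag(sig ** 2)              # no correlations
C = np.full((3, 3), 0.7)                  # correlation of 0.7 between elements
np.fill_diagonal(C, 1.0)
B_multi = np.outer(sig, sig) * C          # correlated prior

a = np.ones(3)                            # aggregation: plain sum
target = aggregated_sigma(B_single, a)
B_multi_rescaled = rescale_to_target(B_multi, a, target)
```

Positive correlations inflate the aggregated uncertainty, so without such a rescaling the correlated prior would start from a larger aggregated uncertainty and the two uncertainty reductions would not be comparable.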

In an atmospheric inversion, the model–data mismatch from every uncertainty
source (such as measurement uncertainty, transport model uncertainty,
spatial representation error due to limited model resolution, and boundary
condition inaccuracies) needs to be taken into account. In our inversion
scheme, we parameterize both the transport model uncertainty and the
measurement uncertainty, with the latter playing a minor role. The
model–data mismatch covariance matrix (

Model–data mismatch correlation matrix

The assumed measurement uncertainty is 1 ppm for CO

In the multi-species inversion, the transport error correlation across species is 0.7 (Fig. 3b), while in the single-species inversion this is set to zero. Time correlation is assumed to decay exponentially with an exponential constant of 12 h. The between-species correlation for model–data mismatch related to transport uncertainty reflects the fact that species are partially co-emitted and share the same atmospheric transport (and its related uncertainty).
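A minimal sketch of such a correlation structure is given below. It assumes that the species and time components are separable (combined through a Kronecker product) and that observations are ordered species by species; both are assumptions of this illustration and are not stated in the text. The function and parameter names are hypothetical.

```python
import numpy as np

def mdm_correlation(times_h, n_species, rho_species=0.7, tau_h=12.0):
    """Model-data mismatch correlation matrix.

    times_h: observation times in hours (same times for every species)
    rho_species: correlation between different species (0 for single-species)
    tau_h: exponential decay constant of the time correlation, in hours
    """
    t = np.asarray(times_h, dtype=float)
    c_time = np.exp(-np.abs(t[:, None] - t[None, :]) / tau_h)
    c_spec = np.full((n_species, n_species), rho_species)
    np.fill_diagonal(c_spec, 1.0)
    # Species-major ordering: block (i, j) holds c_spec[i, j] * c_time.
    return np.kron(c_spec, c_time)

# Three species observed at 0, 12 and 24 h.
C_mdm = mdm_correlation([0.0, 12.0, 24.0], n_species=3)
```

Setting `rho_species=0.0` gives the block-diagonal matrix of the single-species case, in which each species keeps only its own time correlation.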

As explained in the introduction, in situ measurements are not available for
all of the three trace gases of interest, but only for CO. For this reason
this paper aims to evaluate the benefits of a multi-species inversion over a
corresponding single-species inversion by performing a synthetic experiment, using
pseudo-observations derived by perturbation of the model outputs based on a
priori state vector values. More precisely, the pseudo-observation vector is
obtained by matrix multiplication between the Jacobian matrix

The final rescaling matrix

Mean daily enhancement of mixed layer vs. free tropospheric mole fractions. Modelled mixing ratios are shown as black lines, while the observed CO is shown as a blue line. Note that the modelled values for CO have been multiplied by a factor of 2.8, corresponding to the mean ratio between observed and modelled CO enhancements, to match the observed values.

Before evaluating the performance of the inversion scheme in reducing the uncertainty of the state space, a closer look at the ability of the modelling framework to reproduce the enhancements is necessary. Unfortunately, this can be done only for CO, as actual measurements are not available for the other species. Figure 5 shows the mean daily enhancement of the three fossil fuel species for both observations and model outputs using prior emissions. A common feature of the three trace gases is that lower values tend to occur during summer due to better mixing of the atmosphere. Conversely, enhancement values tend to be higher during winter, reflecting the more stratified atmosphere of the cold months.
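The kind of statistics used for such a model–observation comparison can be sketched as follows. The data are synthetic, the function name is hypothetical, and the scaling factor is computed as the ratio of the mean enhancements, which is only one possible reading of "mean ratio between observed and modelled CO enhancements".

```python
import numpy as np

def compare_enhancements(obs, mod):
    """Return the scaling factor, squared correlation coefficient, and
    standard deviation of the residuals after scaling the model."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    scale = obs.mean() / mod.mean()               # ratio of mean enhancements
    r2 = np.corrcoef(obs, mod)[0, 1] ** 2         # squared Pearson correlation
    resid_sd = np.std(obs - scale * mod, ddof=1)  # spread of corrected residuals
    return scale, r2, resid_sd

# Synthetic enhancements (ppb): the "model" underestimates by a factor of 2.8.
rng = np.random.default_rng(2)
mod = rng.gamma(2.0, 20.0, size=200)
obs = 2.8 * mod + rng.normal(0.0, 10.0, size=200)

scale, r2, resid_sd = compare_enhancements(obs, mod)
```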

In Fig. 5 the modelled CO plot was multiplied by a factor of 2.8,
corresponding to the mean ratio between observed and modelled CO
enhancements, similar to what was found in Boschetti et al. (2015). Mixing ratio
values are highly variable, but the model provides a good indication of the
temporal variation of the ML enhancement; the squared correlation
coefficient between observed and modelled CO enhancements is 0.62, while the
standard deviation of corrected model and observation residuals is 85 ppb;
note that by not accounting for the

Figure 6 shows the prior and posterior error covariance matrices for the
base multi-species inversion. Note that CO

Prior error covariance matrix

Figures 7 and 8 show a priori, a posteriori, and “true” fluxes related to
different aggregated fuel types and to different emission categories as
described in Tables 1 and 2 for the months of July and December. Figure 8
also shows the biospheric contribution (as absolute values) scaled down by a
factor of 10. As is to be expected, the biospheric contributions show strong
differences according to the seasonal cycle, while anthropogenic emissions
remain rather stable. However, it is worth pointing out that while the
fossil fuel prior is similar for both months, the assumed truth can be
rather different due to the random assignment of the prior error realization.
In most cases, the posterior adapts and is therefore closer to the truth
than the prior; the posterior uncertainty is also visibly reduced, as
expected. Regarding the different tracers, CO

Our modelling framework is currently not well suited to account for
unreported sources of CH

Prior, posterior, and true (pseudo-data) fluxes in physical units
aggregated for different fuel types. Note that as the true fluxes are the
result of a random perturbation of the prior, they do not describe an actual
situation in the physical world. So, for example, the fact that the true
value of CH

Prior, posterior, and true (pseudo-data) fluxes in physical units
aggregated for different emission sectors for CO

In general, the absence of some emission sources in an inventory is
equivalent to the assumption of having point sources not included in the
emission map, but still contributing to the measurements. The inversion
scheme would typically react to this by attributing such point sources to
some other sector or to another fuel type. As a result, the posterior enhancements would
be biased low in proximity to those point sources, and (slightly) biased high
for influences from other regions with the same sector or fuel type. This
issue should be considered in future studies making use of actual
CO, CO

Note that our modelling framework does not allow for simulating biogenic CO fluxes during the growing season. On warm summer days, vegetation emits large amounts of biogenic volatile organic compounds (VOCs), which produce CO at non-negligible levels. According to Hudman et al. (2008), anthropogenic emissions account for only 31 % of CO emissions in the USA during summer. Conversely, according to estimates from EDGAR, anthropogenic CO emissions during summer are about 18 % of the annual anthropogenic emissions. Combining these two results, one could conclude that CO production from biogenic sources accounts for roughly 42 % of total annual CO emissions.

CO

Contrary to CO

The contribution to CO

As further assessment of the inversion performance, we tested the ability of
the inversion scheme to capture the truth compared with a perturbed version
of the prior. Such a perturbed version is obtained by adding a random
distribution with mean and standard deviation equal to one to the prior state
space, similar to how the truth is obtained. For each simulated species we
calculated the total annual fluxes for prior, posterior, truth, and
perturbed prior. From these total fluxes we then derive the overall residual
between prior and truth, posterior and truth, and perturbed prior and truth.
It is clear from Table 4 that while the overall bias between posterior and
truth is lower than the prior–truth bias, the bias between perturbed prior
and truth is much higher, implying that the performance of the inversion is
not an artifact of the pseudo-data generation. In addition, it was found
that the truth–posterior bias of the multi-species inversion is mostly
slightly lower compared to the single-species inversion. The difference is
between

Improper characterization of the error correlation may result in systematic
bias in the posterior estimate. As mentioned in Sect. 2.1.6, inter-species
correlation, the correlation between different fuel types and the
correlation between different emission sectors in

Overall bias for different species between the prior and both posterior and perturbed prior. The percentage values in parentheses are the corresponding prior–truth bias.

Residuals between total annual posterior fluxes (post) and total
annual true fluxes (truth) for the five simulated species (in MtC y

In addition, the posterior–truth biases are always lower than the
prior–truth biases. The posterior uncertainty values (1

Before investigating the benefits of correlations between different tracers, it is worthwhile to evaluate the uncertainty reduction in the monthly budgets for all five modelled species (Fig. 9, based on targeted spatial domain in Fig. 1). The first thing to note is that for all of the five trace gases the posterior uncertainty is lower than the prior one, as it should be. In addition, prior uncertainty varies through the year, reflecting modulation in emission fluxes obtained by adding activity factors to describe the seasonal, weekly, and daily cycle.

Comparison between prior and posterior monthly uncertainties for the five tracers. The posterior uncertainty is plotted for both the multi-species inversion, accounting for inter-species correlations, and the single-species inversion, in which all of the species are independent. Both prior and posterior uncertainty are expressed in physical units. The spike in the prior methane uncertainty estimate for the month of March depends on the emission inventory and is related to the cycle of agricultural activities.

Prior uncertainty assumes values around 0.4–0.6 MtC month

In addition, note that in this case, the posterior uncertainties for single-
and multi-species inversions are similar for the modelled species, with the
exception of the CO

All of the species experience a reduction in the posterior uncertainty ratio
due to the addition of inter-species correlation; said reduction is up to
20 % for fossil fuel CO

Benefit of a multi-species inversion over the corresponding
single-species inversion (dotted line) for each species and month of
the year. The benefit has been tested for the three different cases of
Table 3. Note that CO

Benefit of a multi-species inversion over the corresponding single-species inversion (dotted line) for each species and month. The benefit has been tested for a “normal” inversion featuring both prior and model–data mismatch (mdm) correlation between different species (black) or only one of these two components (red and orange). Results refer to Case 1 of Table 3 (black line of Fig. 10). Values derived from Palmer et al. (2006) for the month of March are indicated with a diamond. Note that “unc.” stands for uncertainty.

In order to assess the contribution of inter-species correlation in the prior
uncertainty vs. that of model–data mismatch uncertainty, Fig. 11 also shows
the resulting posterior uncertainty ratios for Case 1 (Table 3) from
inversions only using prior or model–data mismatch correlation. For the
anthropogenic component of CO

Palmer et al. (2006; in the following referred to as P06) studied the importance
of inter-species correlation to improve inverse analysis using airborne data
from the TRACE-P mission conducted in March/April 2001 over the western
region of the Pacific Ocean. P06 derived a prior error correlation lower
than 0.2 by analysing the uncertainty of emission factors from an
Asia-specific emission inventory (Streets, 2003), which is significantly
smaller than the correlation of 0.7 assumed in the present study. P06 deemed
CO

From this comparison we can see that the estimates of the benefit of
including inter-species correlation in atmospheric inversions in P06 and in
this paper are of the same order of magnitude for anthropogenic CO

This paper presents a synthetic experiment aiming to evaluate the effects of exploiting correlations between different trace gases in an atmospheric inversion. We quantitatively described the capability of the modelling framework to reproduce observations, the performance of the inversion scheme in reducing the uncertainty of the different trace gases, and the benefits of multi-species inversions compared to corresponding single-species inversions. We also described a method to rescale different prior uncertainty covariance matrices so that the corresponding posterior uncertainties are actually comparable.

Where possible, we compared model outputs with available observations.
Such comparison, possible only for CO, showed a good degree of agreement
between the model and observations with an overall correlation of roughly
0.75; modelled values for CO enhancement underestimate the observed ones by a
factor of roughly 2.8, compatible with what was found in Boschetti et al. (2015).
It is found that posterior uncertainty is much lower than the prior for all
of the five simulated species. The mean uncertainty reduction for CO

The present paper paves the way for future studies using simultaneous
measurements of different trace gases. This will be especially important in
the context of the upcoming routine measurements of CO

IAGOS and MOZAIC data for carbon monoxide mole fraction
measurements are available at the IAGOS database under

The authors declare that they have no conflict of interest.

The research leading to these results was supported by the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 312311 (IGAS).

The authors acknowledge the strong support of the European Commission, Airbus, and the airlines (Lufthansa, Air France, Austrian, Air Namibia, Cathay Pacific, Iberia, and China Airlines so far) who carry the MOZAIC or IAGOS equipment and have performed maintenance since 1994. In its last 10 years of operations, MOZAIC has been funded by INSU-CNRS (France), Météo-France, Université Paul Sabatier (Toulouse, France), and Forschungszentrum Jülich (FZJ, Jülich, Germany). IAGOS has been additionally funded by the EU projects IAGOS-DS and IAGOS-ERI. The MOZAIC–IAGOS database is supported by Aeris (CNES and INSU-CNRS).

Edited by: Martyn Chipperfield
Reviewed by: three anonymous referees