Introduction
Atmospheric measurements of CO2 show that on average half of the
anthropogenic emissions of CO2 are taken up each year by the land and
oceans. Allocating this global sink to specific
regions, or even partitioning it between land and oceans, has proved
challenging. Understanding the mechanisms behind this
allocation, and their response to climate variability, is crucial for
accurately estimating the carbon cycle impact on future climate scenarios.
Current approaches to quantify the spatial
distribution and temporal variation of carbon sources and sinks can be
broadly classified into two categories, “top down” and “bottom up”.
Bottom-up methods, such as biosphere models and ocean biogeochemistry models,
calculate the surface exchange of CO2 between two reservoirs by
modeling the physical processes in the reservoirs that lead to such
exchanges. Top-down methods, generally speaking, infer surface fluxes of
CO2 from measured spatiotemporal gradients in tracer concentrations
in either reservoir.
The most common top-down method for estimating surface fluxes of CO2
from atmospheric measurements is an atmospheric inversion. An inversion
infers surface fluxes from observed spatiotemporal gradients of CO2
in the atmosphere by simulating atmospheric transport to connect the two.
Most inversions are Bayesian in nature, in that they calculate corrections
from a prior flux scenario (typically from bottom-up models) under
constraints of assumed errors in the prior fluxes and atmospheric
measurements. The flux estimates from an inversion, therefore, are subject to
the assumed prior flux map and its error structure, the atmospheric transport
model, the set of atmospheric observations assimilated and the assimilation
technique. Due to the diversity of each of these elements in the current
suite of atmospheric inversions, estimates of CO2 fluxes from biomes
and ocean basins vary widely across inversions, even though they agree on the
global CO2 budget, as would be expected from
mass balance considerations.
showed that the northern extra-tropical sink was
fairly consistent across inversions of in situ CO2 data, but the
partitioning between the tropics and the southern extra-tropics was more
variable. The tropics were found to be responsible for most of the
interannual variability of the global CO2 growth rate, and northern
Asia was found to be responsible for an increasing northern land carbon
uptake between 1990 and 2008. However, the tropics and northern Asia were
also the regions most severely undersampled by the surface CO2
observation network used by the inversions in .
Therefore, it remained an open question whether their conclusions were real
or artifacts of insufficient observational constraints.
Satellite estimates of atmospheric CO2 mole fraction, in principle,
can add observational constraints over remote areas that are difficult to
sample with surface sampling sites, such as the tropics, boreal Eurasia and
much of the oceans. This was the chief motivation behind the Greenhouse gases
Observing SATellite (GOSAT), launched in 2009. GOSAT
near-infrared (NIR) spectra of reflected sunlight have been analyzed to estimate
the column average CO2 mole fraction under its orbit. It was hoped
that these column averages – hereafter called XCO2 – assimilated by
atmospheric inversions would help constrain the CO2 flux over
regions such as the tropics and northern Asia. showed
that assimilating GOSAT XCO2 indeed reduced the spread in tropical
land flux estimates across a suite of atmospheric inversions. However, the
year-round coverage of GOSAT did not extend beyond ±36∘
latitude, limiting its ability to draw conclusions about high-latitude
fluxes. Over the tropics, despite the year-round coverage, GOSAT retrievals
were sparse due to cloud cover and high aerosol loading from biomass burning,
also limiting its ability to constrain tropical fluxes. The balance between
tropical and temperate fluxes estimated from GOSAT soundings was also
inconsistent with information from independent aircraft profiles, raising
questions about its validity.
In 2014, the next CO2 observing satellite, Orbiting Carbon
Observatory 2 (OCO-2), was launched. Compared
to GOSAT, OCO-2 has more extensive spatial coverage, both in the density of
soundings and in their latitudinal extent. Its higher measurement signal-to-noise
ratio allows for higher-precision retrievals of XCO2, and higher
spatial sampling density enables easier validation with the ground-based
Total Carbon Column Observing Network, or TCCON.
OCO-2 also has a smaller footprint than GOSAT, potentially enabling
more retrievals over the tropics by looking through gaps in clouds, over
scenes that GOSAT might have treated as cloud-contaminated. Due to the more
extended spatial coverage, higher sampling density, higher precision and
better validation opportunity, OCO-2 can potentially provide better
constraints on surface CO2 fluxes than what has hitherto been
possible from the surface network and GOSAT. Several inverse modeling groups
are currently engaged in investigating this potential.
One of the key problems in estimating CO2 fluxes from GOSAT
retrievals is the presence of small but spatially coherent biases in the
retrievals arising from, for example, a dependence of the retrieved XCO2 on
aerosols or surface albedo.
Some synthetic data studies warned that
such sub-parts-per-million (sub-ppm) biases might significantly reduce the utility of satellite
XCO2 retrievals, but most earlier studies either did not consider
this complication or claimed that it
was easily fixable. In practice, these biases were found
to strongly affect estimated fluxes in atmospheric inversions of GOSAT data.
Initial analyses suggest
that OCO-2 estimates of XCO2 likely suffer from similar biases,
although they can be better characterized due to
the increased density of soundings. Efforts are underway to characterize and
remove such biases through improvements in the radiative transfer and surface
reflectance models. Current validation strategies for satellite XCO2
have their own limits, since their truth metrics (e.g., TCCON XCO2)
may not be sufficiently accurate. Therefore, as satellite
retrieval algorithms achieve higher accuracy, they will need better
validation strategies in the future. It is likely that with further progress
in those directions, XCO2 biases will decrease to the point where they
no longer limit our ability to infer regional CO2 fluxes.
Even with completely unbiased XCO2 retrievals, surface flux estimates
would still be subject to uncertainties related to the atmospheric transport
model, the optimization technique employed, and the balance between data and
prior flux errors. At present, it is not clear whether the divergence in flux
estimates seen in intercomparisons is driven
primarily by the variety of XCO2 retrievals assimilated or the other
factors mentioned above, although more limited intercomparisons suggest that
those other factors may be at least as important as the differences in
XCO2 assimilated. It is possible that the
uncertainty in a regional flux estimate stemming from factors specific to the
inverse modeling setup is larger than what we can tolerate for detecting,
say, the climate impact on those fluxes. In that case, even perfectly
accurate estimates of satellite-based XCO2 will not enable us to
answer the carbon cycle questions we hope to answer with current and future
CO2 sensing satellite missions. It is therefore crucial that we
quantify the impact of factors specific to an inverse modeling setup on the
uncertainty of inferred surface fluxes.
In this study, we consider one of those factors, namely the atmospheric
transport model. Using a series of observing system simulation experiments
(OSSEs), we quantify the uncertainty in flux estimates due to differences
between present-day state-of-the-art atmospheric transport models. The
approach is similar to that used in earlier work:
From a common set of surface fluxes (henceforth called “true” fluxes), we use a suite of different
atmospheric transport models to produce a suite of time-varying three-dimensional atmospheric CO2 fields.
We sample these fields to produce synthetic observations of CO2 at in situ and OCO-2 sampling locations.
We assimilate these synthetic observations in a single data assimilation system with a single transport model.
For a given data stream (e.g., in situ observations or OCO-2 land nadir), the spread in the
posterior fluxes is an estimate of the uncertainty driven by transport model differences.
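The experimental design listed above can be illustrated with a toy linear problem. The following is a minimal sketch, not the TM5 4DVAR system: the transport operators are small random matrices, and every name and number (`make_transport`, `invert`, the dimensions, the 5 % perturbation, the error values) is a hypothetical choice made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_flux, n_obs, n_models = 4, 30, 5

def make_transport(base, perturbation, rng):
    """Toy transport operator: a shared base plus a model-specific perturbation."""
    return base + perturbation * rng.standard_normal(base.shape)

def invert(K, y, x_prior, S_a, S_eps):
    """Analytical Bayesian posterior flux for a linear operator K."""
    gain = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_eps)
    return x_prior + gain @ (y - K @ x_prior)

base = rng.uniform(0.0, 1.0, size=(n_obs, n_flux))
models = [make_transport(base, 0.05, rng) for _ in range(n_models)]
K_assim = models[0]  # the single transport model used in the assimilation

x_true = np.array([1.0, -2.0, 0.5, -0.5])  # common "true" fluxes
x_prior = np.zeros(n_flux)
S_a = np.eye(n_flux)
S_eps = 0.1**2 * np.eye(n_obs)

# Each model turns the same true fluxes into its own synthetic observations.
pseudo_obs = [K @ x_true for K in models]
# Every set of synthetic observations is assimilated with the same system.
posteriors = np.array([invert(K_assim, y, x_prior, S_a, S_eps) for y in pseudo_obs])
# The spread across posteriors is the transport-driven flux uncertainty.
spread = posteriors.std(axis=0, ddof=1)
print(spread)
```

In this sketch the first posterior recovers the true fluxes almost exactly, because its synthetic observations were produced by the assimilating model itself; the other four differ only because of the transport perturbations.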
Earlier analyses of this kind were performed for the
GOSAT instrument and for the (planned)
A-SCOPE active sensor. Our methodology is closest to that of an earlier study
that estimated the transport-model-driven uncertainty of
CH4 fluxes by assimilating only surface layer data. In our analysis, we
try to answer two specific questions:
For atmospheric inversions assimilating OCO-2 XCO2
retrievals, what are the uncertainties in posterior flux estimates – at
different spatiotemporal scales – that arise due to the divergence of
present-day state-of-the-art atmospheric tracer transport models?
Are the uncertainties larger or smaller if we assimilate only in situ
measurements of CO2? In other words, does assimilating space-based
total column XCO2 such as OCO-2 XCO2 magnify or diminish
transport-model-related uncertainties in the flux estimates?
The second question stems from a long-standing hypothesis that simulating
XCO2 in a model is less sensitive to transport errors such as errors
in the modeled planetary boundary layer (PBL), making XCO2
assimilations less sensitive to transport errors than PBL CO2
assimilations. This is plausible, since modeling
convection and the formation of the PBL are leading-order uncertainties in
present-day transport models. Any error in modeling
the exact PBL height and vertical mass flow translates into an error in
estimated fluxes if the primary assimilated data for an inversion are PBL
CO2 mole fractions. On the other hand, the column average
XCO2 is relatively insensitive to convective transport errors and the
exact PBL height, so those types of transport errors may have less influence
on estimated fluxes if the primary data are XCO2. However, the
spatiotemporal variations in XCO2 due to surface fluxes are smaller
than corresponding variations in PBL CO2. Therefore, XCO2
inversions starting from biased priors (true for most if not all current
inversions) may be less accurate than PBL CO2 inversions. On balance,
it is not clear whether lower transport errors in modeled XCO2 can
compensate for lower flux signals to give us more accurate fluxes.
Data and methodology
As described earlier, we ran a suite of transport models with the same
boundary conditions (initial mole fraction field and surface fluxes), sampled
them to produce a suite of synthetic observations and then assimilated those
observations in the same inversion framework to come up with an estimate of
flux uncertainty due to transport model differences. We describe the
individual elements of this process below.
“True” fluxes
Synoptic differences between transport models are likely correlated with
surface fluxes, since they are influenced by common drivers such as
temperature, precipitation and insolation. Therefore, it is important to use
realistic fluxes to generate the true scenario. We produce the true surface
fluxes by assimilating CO2 data from the National Oceanic and
Atmospheric Administration's (NOAA) Global Greenhouse Gas Reference Network
(GGGRN) and the TCCON in a TM5 4DVAR atmospheric inversion (described later
in the “Inversion framework” section). The inversion spanned 1 June 2014 to
1 April 2016. This ensured that the true fluxes had realistic land and ocean
sinks consistent with the observed global CO2 growth rate. At the end
of the optimization, TM5 4DVAR wrote out global 1∘ × 1∘ 3-hourly total CO2 fluxes for transport models to ingest in the next
step.
Generation of CO2 fields
We ran a suite of transport models between 1 June 2014 and 1 April 2016
with the true fluxes produced earlier, starting from the same initial
CO2 mole fraction field as the inversion used to produce the true
fluxes. The suite consisted of TM5, LMDZ, ACTM, PCTM and GEOS-Chem. Details
of the individual models can be found in the respective references in
Table . It is important to note here that this suite of
models spans the range of transport models currently being used by various
members of the OCO-2 Science Team to assimilate OCO-2 XCO2
retrievals. Moreover, these models are driven by four different
meteorological reanalysis products: ECMWF ERA-Interim (TM5, LMDZ), MERRA
(PCTM), GEOS FP (GEOS-Chem) and JRA-55 (ACTM). These four products span the
gamut of meteorological fields used by most atmospheric inversions today.
Therefore, the divergence of flux estimates seen in this study can be taken
to be a reasonable measure of the divergence expected in real data inversions
with these transport models.
The transport models produced hourly (PCTM) or 3-hourly (TM5, LMDZ, ACTM,
GEOS-Chem) CO2 fields at their individual lateral and vertical
resolutions, which are listed in Table . Note that the
temporal granularity listed is the time step at which the CO2 mole
fraction field was written out; the time step of the models for calculating
transport is usually smaller. The models also wrote out the geopotential
heights and atmospheric pressures at the vertical layer edges. As a first
check, we verified that global average CO2 mole fractions from the
different models, calculated from their own pressure and CO2 fields,
closely matched the expected time series from the true fluxes diluting into
an atmosphere of 5.123 × 10^18 kg, the total
dry air mass of TM5. Figure shows these time series.
It is evident that, while all the colored lines are very close to the dashed
black line, there are small differences that are seasonally coherent. These
differences arise from differences in the molar mass of carbon assumed by the
models (e.g., 12 g mol^-1 vs. 12.01115 g mol^-1), small
differences in the air mass between different models and the handling of
water vapor in the model atmosphere. Rather than standardize the models to
remove these small differences, we decided to keep them since they reflect
legitimate differences between the models that would express themselves in
real data inversions.
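The size of the molar mass effect can be estimated with a short calculation. The sketch below converts a carbon mass added to the atmosphere into a global mean mole fraction increment, using the dry air mass quoted above. The molar mass of dry air (28.9644 g mol^-1) and the function name `ppm_per_pgc` are assumptions introduced here and do not come from any of the models.

```python
M_AIR = 28.9644e-3    # kg per mol of dry air (assumed standard value)
ATM_MASS = 5.123e18   # kg, total dry air mass quoted in the text

def ppm_per_pgc(molar_mass_c):
    """Global mean CO2 mole fraction increase (ppm) per PgC added to the atmosphere."""
    mol_air = ATM_MASS / M_AIR
    mol_c = 1.0e12 / molar_mass_c  # 1 PgC = 1e12 kg of carbon
    return 1.0e6 * mol_c / mol_air

a = ppm_per_pgc(12.0e-3)       # carbon molar mass of 12 g/mol
b = ppm_per_pgc(12.01115e-3)   # carbon molar mass of 12.01115 g/mol
print(f"{a:.4f} vs {b:.4f} ppm per PgC, relative difference {a / b - 1:.2e}")
```

The two choices differ by slightly less than 0.1 % in the mole fraction increment produced by a given flux, which is consistent with small model-to-model offsets of the kind described above.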
The different atmospheric transport models run in this study to produce CO2 fields.
Model      Resolution (lon × lat)  Vertical layers  Temporal granularity  Meteorology  Reference
TM5        3∘ × 2∘                 25               3 h                   ERA-Interim
LMDZ       3.75∘ × 1.875∘          39               3 h                   ERA-Interim
ACTM       1.125∘ × 1.125∘         32               3 h                   JRA-55
PCTM       1.25∘ × 1∘              40               1 h                   MERRA
GEOS-Chem  5∘ × 4∘                 47               3 h                   GEOS FP
Time series of the global average
CO2 mole fraction expected from the true flux scenario (bold black
dashed line) and calculated from the individual model outputs (colored
lines). The flux scenario only provides increments of the mole fraction, so
these increments were added to the initial mole fraction of TM5 to calculate
the black line.
Generation of synthetic data
The five different modeled dry air mole fraction CO2 fields were
sampled with the same code to produce synthetic observations of CO2
from in situ and satellite platforms. Table gives the
number of samples per year from each data stream, and the generation of
pseudo-data is described below.
Number of pseudo-observations per year from
the different observing systems and sampling strategies.
Data stream  Observations/year
MBL          37 558
IS           107 963
LN           49 311
LG           46 103
OG           163 452
In situ sampling
Synthetic in situ samples corresponded to the times and locations of
CO2 measurements at network sites maintained by NOAA and partner
agencies, as contained in ObsPack versions GV 2.1 (Cooperative Global
Atmospheric Data Integration Project, 2016) and NRT 3.2.2 (NOAA Carbon Cycle
Group ObsPack Team, 2017). The following data filtering was applied:
Campaign data from aircraft – such as CALNEX, SONGNEX and ORCAS – were excluded. In situ CO2 data from the CONTRAIL program were also excluded.
At low-altitude sites, only mid-afternoon hourly averages were used.
At mountaintop sites, only late-night hourly averages were used.
For coastal sites, where the sampling protocol differentiated between background and non-background air, only background samples were used.
Bi-weekly to monthly NOAA aircraft profiles, mostly over North America, were included.
Flask CO2 data from the CONTRAIL program were also included.
Note that these filters were applied to come up with a set of sampling
coordinates (locations and times) to represent realistic sampling frequency
and density for real data inversions. No actual CO2 measurements were
used from either ObsPack version. In addition, the sampling times and
locations corresponding to mid-afternoon CO2 samples from six towers
belonging to the Japan–Russia Siberian Tall Tower Inland Observation Network
(JR-STATION) were also included in our in situ network.
Each model CO2 field was sampled at these sampling coordinates,
adhering as closely as possible to the sampling protocol that a model would use
in a real data inversion. For example, if a site's elevation places it in the
lowermost model layer, TM5 samples it one layer above to avoid surface
effects, while the other four models sample it in the surface layer. This
distinction was kept while sampling the five models. The set of synthetic
observations generated with this sampling, and corresponding flux estimates,
will be referred to as “IS” in the rest of this paper. During this
work, we discovered an artifact in our version of PCTM at the South Pole,
which was fixed by moving the South Pole site 2∘ N along
0∘ longitude (details in the Appendix).
In addition, we considered a subset of the IS samples that corresponded
closely to the network used in the TRANSCOM 3 model intercomparison
experiment. That network chiefly consisted of marine
boundary layer and background sites, suitable for assimilation in
coarse-resolution flux estimation systems of the time. Since then, many continental
sites have come online. These sites are located closer to terrestrial fluxes
and therefore have larger flux-induced variations in the CO2 mole
fraction. However, modeling these variations accurately depends on modeling
the continental boundary layer accurately, which is one of the most uncertain
aspects of atmospheric transport modeling. By comparing the spread in our IS
flux estimates to that from assimilating a more limited set of mostly
background sites comparable to the TRANSCOM 3 network, we sought to answer the
question of whether the cost of increased model uncertainty in the
continental PBL outweighed the benefit of more measurements from the
non-background sites.
We constructed this limited subset of IS, henceforth referred to as “MBL” (short for marine boundary layer),
as follows. We subselected our IS data set for sites that were used in
TRANSCOM 3. Three of those sites – namely CMN, GSN and
HAT – did not exist in our IS data set and therefore were not used. ITN and JBN
were replaced by SCT (Beech Island, South Carolina) and
DRP (Drake Passage), respectively, the currently operational sites (cruises in
the case of DRP) geographically nearest to the discontinued ITN and JBN. The
resulting MBL network corresponded as closely as possible to the mostly
background network used in TRANSCOM 3, while also reflecting changes in
the CO2 sampling network since then.
OCO-2 sampling
The five different model CO2 fields
were sampled at the locations and times of OCO-2 retrievals from the ACOS
version 7r algorithm, as archived at
https://disc.gsfc.nasa.gov/datasets/OCO2_L2_Lite_FP_7r/summary (last access: 23 March 2017).
Real data inversions of OCO-2 typically only use retrievals of “good”
quality, selected by xco2_quality_flag=0. We performed the same
selection of the sounding locations to mimic realistic spatiotemporal
coverage. The vertical profiles of CO2 from all the models were
convolved with the OCO-2 column averaging kernels and prior profiles of the
corresponding real retrievals to produce sets of synthetic OCO-2
XCO2. These synthetic XCO2 were classified according to
sounding mode and surface type of the original soundings, to come up with
land nadir (LN), land glint (LG) and ocean glint (OG) synthetic OCO-2
XCO2 for each transport model.
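The convolution of a model profile with the retrieval's averaging kernel and prior profile can be sketched as follows. This assumes the commonly used form in which the column average is the pressure-weighted sum of the prior profile plus the kernel-weighted departure of the model from the prior, and it assumes that the model profile has already been interpolated to the retrieval levels. The function name `synthetic_xco2`, the argument names and the four-level example values are hypothetical.

```python
import numpy as np

def synthetic_xco2(model_profile, prior_profile, avg_kernel, pressure_weight):
    """Column average a retrieval would report for a model profile.

    All inputs are 1-D sequences on the same vertical levels;
    pressure_weight is assumed to sum to 1.
    """
    model_profile, prior_profile, avg_kernel, pressure_weight = (
        np.asarray(v, dtype=float)
        for v in (model_profile, prior_profile, avg_kernel, pressure_weight)
    )
    # The kernel scales how much of the model-minus-prior signal is seen per level.
    smoothed = prior_profile + avg_kernel * (model_profile - prior_profile)
    return float(np.sum(pressure_weight * smoothed))

# Hypothetical 4-level example (mole fractions in ppm, surface level first).
model = [405.0, 402.0, 400.0, 398.0]
prior = [400.0, 400.0, 400.0, 400.0]
kernel = [1.0, 1.0, 0.9, 0.6]
weights = [0.25, 0.25, 0.25, 0.25]
print(synthetic_xco2(model, prior, kernel, weights))
```

With a kernel of one at every level the result reduces to the pressure-weighted mean of the model profile, and with a kernel of zero it reduces to the prior column average.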
OCO-2 takes 24 samples every second, which span ∼ 7 km along
track. Column average CO2 is expected to be highly correlated over
these short length scales, and therefore these 24
retrievals do not provide independent information about XCO2.
However, most trace gas inversions – including TM5 4DVAR – treat all
measurements as independent. Moreover, most global transport models have grid
cells hundreds of kilometers in size and therefore cannot model or interpret the
small-spatial-scale XCO2 variations seen by OCO-2. To avoid highly
correlated measurements being treated as independent measurements in our
assimilation, and to bring the spatial resolution of the retrievals more in
line with the resolution of transport models used in most global inversions,
we average the synthetic XCO2 in 10 s bins along orbit, which
results in one value per orbit per ∼ 70 km bin along track. The
averaging is done in two steps. First, retrievals are averaged over 1 s bins, with weights inversely proportional to the square of the
posterior retrieval uncertainty for each retrieval. Next, over a 10 s
interval, all 1 s bins with at least one valid retrieval are averaged
to create a 10 s average. This two-step averaging is done to avoid
weighting the 10 s average disproportionately towards one part of the ∼ 70 km track which might have a lot of retrievals. Soundings of
different modes (LN, LG or OG) are averaged separately to create different
10 s averages for each mode. OCO-2 averaging kernels and prior
profiles are similarly averaged to create 10 s mean averaging kernels
and prior profiles.
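The two-step averaging described above can be sketched as below. The first step follows the text (inverse-variance weights within each 1 s bin). For the second step the text does not state the weights, so an unweighted mean over the non-empty 1 s bins is assumed here. The function name and argument layout are hypothetical, and the sketch handles a single 10 s interval of a single sounding mode.

```python
import numpy as np

def ten_second_average(times, xco2, sigma, t0):
    """Average soundings with times in [t0, t0 + 10) seconds into one value.

    Step 1: inverse-variance weighted mean within each 1 s bin.
    Step 2: unweighted mean of the 1 s bins holding at least one sounding
            (an assumption; the text does not specify these weights).
    Returns NaN if no sounding falls in the interval.
    """
    times, xco2, sigma = (np.asarray(v, dtype=float) for v in (times, xco2, sigma))
    one_second_means = []
    for k in range(10):
        in_bin = (times >= t0 + k) & (times < t0 + k + 1)
        if not in_bin.any():
            continue
        w = 1.0 / sigma[in_bin] ** 2
        one_second_means.append(np.sum(w * xco2[in_bin]) / np.sum(w))
    return float(np.mean(one_second_means)) if one_second_means else float("nan")

# Three soundings in the first second and one in the sixth second.
avg = ten_second_average([0.1, 0.2, 0.3, 5.5], [400.0, 400.0, 400.0, 402.0],
                         [1.0, 1.0, 1.0, 1.0], t0=0.0)
print(avg)
```

In the example a pooled mean of all four soundings would give 400.5 ppm, while the two-step average gives equal weight to the two occupied seconds, which is the behavior the two-step design is meant to achieve.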
In situ sampling at OCO-2 sounding locations
The differences between OCO-2 and in situ
samples are twofold: (i) the former is a column measurement, while the latter
is a point measurement, and (ii) the spatiotemporal coverages of the two
systems are vastly different. Differences between OCO-2 and in situ
inversions conflate the two effects and therefore cannot be used to test the
hypothesis that inversions of column data are less
sensitive to transport model errors than inversions of in situ data. To test
this hypothesis, we devised two purely theoretical in situ networks, called
“IS-LNLG” and “IS-OG”. The IS-LNLG (IS-OG) network consists of PBL
samples of the CO2 mole fraction at locations and times of all OCO-2
land (ocean) soundings described in the “OCO-2 sampling” section. The five
different model fields were sampled at the IS-LNLG and IS-OG networks, 30 m
above ground level as defined by the 1 arcmin global relief model ETOPO1.
The difference in fluxes between inversions of IS-LNLG
(IS-OG) in situ pseudo-data and LNLG (OG) OCO-2 pseudo-data can be expected
to reflect the difference between PBL and column sampling over land (ocean),
and not differences in spatiotemporal coverage between actual in situ and
OCO-2 samples.
Inversion framework
TM5 4DVAR is a state-of-the-art variational inversion
system that has been used to estimate surface fluxes of CO2,
CO, CH4 and N2O. Given a set
of prior fluxes $x_a$ with their error covariance $S_a$, a set of
measurements $y$ with their error covariance $S_\epsilon$ and a
transport model $K$ connecting fluxes to measurements, a Bayesian flux
estimation system tries to minimize the cost function $J$:
$$J = \frac{1}{2}(Kx - y)^T S_\epsilon^{-1}(Kx - y) + \frac{1}{2}(x - x_a)^T S_a^{-1}(x - x_a).$$
The posterior estimate of $x$, usually denoted $\hat{x}$, is given by
$$\hat{x} = x_a + S_a K^T \left(K S_a K^T + S_\epsilon\right)^{-1}(y - K x_a) = x_a + G\,(y - K x_a),$$
where $G = S_a K^T \left(K S_a K^T + S_\epsilon\right)^{-1}$ is called the
Kalman gain matrix and determines the weighting between prior information and
observations. Details about TM5 4DVAR have been documented
elsewhere. In this work we use the ability of TM5 4DVAR to
assimilate in situ and total column CO2 measurements.
We run the TM5 transport model (K in the equation
above) at global 3∘ × 2∘ × 25-layer resolution and
solve for ocean and land fluxes at 3∘ × 2∘ globally. We
have already described our method for constructing the synthetic observations
y. Below we describe the remaining elements of this inversion, namely
Sa, Sϵ and xa.
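For a small linear problem, the relation between the cost function and the Kalman gain form of the posterior can be checked numerically. The sketch below is only an illustration with random toy matrices and hypothetical names. Variational systems generally minimize J iteratively rather than forming the gain matrix explicitly, so this is not how TM5 4DVAR computes its solution.

```python
import numpy as np

def cost(x, K, y, x_a, S_a, S_eps):
    """Bayesian cost function J for flux vector x."""
    r, d = K @ x - y, x - x_a
    return 0.5 * r @ np.linalg.solve(S_eps, r) + 0.5 * d @ np.linalg.solve(S_a, d)

def posterior(K, y, x_a, S_a, S_eps):
    """Posterior flux from the explicit Kalman gain matrix."""
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_eps)
    return x_a + G @ (y - K @ x_a)

rng = np.random.default_rng(1)
K = rng.standard_normal((6, 3))          # toy transport operator
x_a = np.zeros(3)                        # prior fluxes
S_a = np.diag([1.0, 2.0, 0.5])           # prior error covariance
S_eps = 0.2**2 * np.eye(6)               # data error covariance
y = K @ np.array([0.5, -1.0, 0.3])       # noise-free synthetic observations

x_hat = posterior(K, y, x_a, S_a, S_eps)

# The gradient of J vanishes at the minimum.
grad = K.T @ np.linalg.solve(S_eps, K @ x_hat - y) + np.linalg.solve(S_a, x_hat - x_a)
print(np.abs(grad).max())
```

Because J is quadratic with positive definite curvature, a vanishing gradient at the gain-form estimate confirms that this estimate is the unique minimizer.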
Prior flux (xa) and covariance (Sa)
Prior ocean and land fluxes were constructed as the multi-year (2000–2015)
mean of CarbonTracker 2016 posterior fluxes
(https://www.esrl.noaa.gov/gmd/ccgg/carbontracker/, last access: 20 October 2017). Hence, the prior
did not have any interannual variability but did have a land sink consistent
with the decadal trend of atmospheric CO2 growth rate. Fossil fuel
emissions, for both the true and prior fluxes, were taken from the ODIAC
inventory and not optimized. Both the land and ocean
fluxes were optimized on a weekly timescale, on a global
3∘ × 2∘ grid. Ocean and land fluxes had 3-hourly
variations within each week, which were not optimized. The fossil fuel flux
had prescribed daily and hourly variations. Errors in
the weekly prior ocean fluxes were assumed to be 1.57 times the absolute flux
in each grid cell, with a spatial correlation of 1000 km and a
temporal correlation of 3 weeks. Errors in the weekly prior terrestrial
fluxes were assumed to be half the heterotrophic respiration in each grid
cell from the CASA biosphere model, with a spatial
correlation of 250 km and a temporal correlation of 1 week. The grid
scale uncertainty in terrestrial fluxes thus constructed was typically an
order of magnitude higher than for ocean fluxes. However, due to the shorter
error correlation lengths and times assumed for terrestrial fluxes, the
uncertainties in the global totals for 2015 were of the same order of
magnitude, 0.44 PgC yr-1 for oceans and 0.53 PgC yr-1 for land.
The ocean uncertainty constructed this way corresponds roughly to the
uncertainty in the ocean sink imposed by decadal measurements of the
atmospheric O2 / N2 ratio, while the land flux
uncertainty is large enough to allow sufficient summertime uptake over North
America and Eurasia.
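The effect of the error correlation length on the uncertainty of an aggregated flux can be illustrated with a toy one-dimensional example. The exponential decay of correlation with distance, the row of 50 grid cells and the unit grid-scale uncertainty are all assumptions made for this sketch; the text does not specify the functional form of the correlations, and the function name is hypothetical.

```python
import numpy as np

def total_uncertainty(sigma, positions_km, corr_length_km):
    """1-sigma uncertainty of a summed flux with exponentially decaying error correlations."""
    dist = np.abs(positions_km[:, None] - positions_km[None, :])
    cov = np.outer(sigma, sigma) * np.exp(-dist / corr_length_km)
    # Variance of a sum is the sum of all covariance matrix elements.
    return float(np.sqrt(cov.sum()))

positions = np.arange(0.0, 10000.0, 200.0)  # hypothetical 1-D row of grid cells
sigma = np.full(positions.size, 1.0)        # identical grid-scale uncertainty

long_corr = total_uncertainty(sigma, positions, 1000.0)  # ocean-like length
short_corr = total_uncertainty(sigma, positions, 250.0)  # land-like length
print(long_corr, short_corr)
```

For the same grid-scale uncertainty, the shorter correlation length yields a smaller aggregate uncertainty, which illustrates how a much larger grid-scale land uncertainty can still produce a global total comparable to that of the oceans.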
Data error (Sϵ)
The analytical error of a flask-air or continuous in situ measurement of
CO2 is very small, typically 0.1–0.2 ppm. However, even with
perfect fluxes and an unbiased transport model, we do not expect to fit all
observations to that precision, because a coarse-resolution transport model
cannot adequately represent sub-grid-scale variations that lead to the
measured mole fraction at a point. Therefore Sϵ also contains the
representativeness error of the transport model, which can be considered to
be a random error contributed by the model. This representativeness error is
computed by evaluating the norm of the spatial gradient of the modeled
CO2 mole fraction at the scale of TM5's lateral resolution at each
sampling time and location. The total error in Sϵ is the quadrature
sum of this model error and an analytical error of 0.2 ppm.
Figure shows the total and analytical errors at three
example sites at times when CO2 samples were taken. Tutuila, American
Samoa (SMO), is a remote marine boundary layer site with little model
variability, with a model error of ∼ 1 ppm. Niwot Ridge
(NWR) is a background mountaintop site within the continental US and therefore
has higher model variability. Finally, Beech Island (SCT) is a tall tower in
the southeastern US where seasonally coherent transport variability is
convolved with strong local fluxes. It should be noted here that the numbers
in Fig. are somewhat smaller than typical values in
the literature. Therefore, our
estimate of the transport uncertainty for in situ CO2 inversions is
likely to be on the higher side.
Analytical (blue) and total (red)
uncertainty of in situ measurements in the Sϵ matrix at three
example sites, at times of actual CO2 measurements. SMO is a remote
marine boundary layer site with little model variability, while LEF and WKT
are continental sites with significant model variability.
The formal reported uncertainty of OCO-2 XCO2 retrievals is an
underestimate. Therefore, the errors estimated for the
10 s averages are likely underestimates as well. Moreover,
Sϵ in the cost function J is not just the
measurement error but the covariance of the model–observation mismatch.
Therefore, we construct the data error for XCO2 as the sum of two
components, $\sigma_\mathrm{10s}^2 = \sigma_\mathrm{meas}^2 + \sigma_\mathrm{model}^2$.
The measurement part, $\sigma_\mathrm{meas}^2$, is calculated in two steps.
First, the variance of each 1 s average is calculated as the inverse of the sum of the inverse
variances of all the soundings in that average, as reported by the retrieval
algorithm. A lower threshold of $\varepsilon_\mathrm{base}^2/N_\mathrm{ret}$
is set on that variance, where $N_\mathrm{ret}$ is the number of retrievals
in the 1 s average, and $\varepsilon_\mathrm{base}$ is an error floor that
is 0.8 ppm over land and 0.5 ppm over oceans. If the 1 s variance
calculated this way is denoted $\sigma_\mathrm{1s}^2$, then the variance of
the 10 s average is calculated as $\sigma_\mathrm{meas}^{-2} = (1/10)\sum \sigma_\mathrm{1s}^{-2}$, where the sum goes over the 1 s bins in the
10 s average. Note that, because of the factor 1/10 in front, the final variance $\sigma_\mathrm{meas}^2$
is not reduced by a factor of 10 relative to the 1 s variances.
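The measurement error calculation can be sketched as follows. This is one reading of the description above, in which the 1 s variance is the inverse of the summed inverse variances before the floor is applied. The function name and the input layout (one array of reported uncertainties per non-empty 1 s bin) are hypothetical.

```python
import numpy as np

def sigma_meas_10s(sigmas_per_second, eps_base):
    """Measurement uncertainty (ppm) of a 10 s average.

    sigmas_per_second: one array per non-empty 1 s bin, holding the
        retrieval-reported uncertainties (ppm) of the soundings in that bin.
    eps_base: error floor in ppm (0.8 over land, 0.5 over oceans in the text).
    """
    var_1s = []
    for s in sigmas_per_second:
        s = np.asarray(s, dtype=float)
        var = 1.0 / np.sum(1.0 / s**2)                 # variance of the weighted 1 s mean
        var_1s.append(max(var, eps_base**2 / s.size))  # apply the lower threshold
    # The factor 1/10 keeps the 10 s variance from shrinking with the number of bins.
    inv_var_10s = np.sum(1.0 / np.asarray(var_1s)) / 10.0
    return float(1.0 / np.sqrt(inv_var_10s))

# Ten full 1 s bins of 24 soundings each, all reported at 0.5 ppm, over land.
full = sigma_meas_10s([[0.5] * 24] * 10, eps_base=0.8)
# Only five of the ten 1 s bins contain soundings.
half = sigma_meas_10s([[0.5] * 24] * 5, eps_base=0.8)
print(full, half)
```

In the first case the floor is active in every bin and the 10 s uncertainty equals the 1 s value; in the second case the uncertainty grows by a factor of the square root of two because only half of the 1 s bins contribute.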
The model part, σmodel, is calculated by considering a suite
of inverse models optimized against in situ data and calculating their
difference with OCO-2 XCO2 retrievals. The differences are binned by
latitude band, month and OCO-2 sounding mode, and averaged. For each
month/latitude/mode bin, the cross-model spread in the average differences is
taken to be 2×σmodel for that bin. While there is no
unique way of deriving a σmodel, this algorithm creates a
σmodel that includes model variability across multiple
state-of-the-art transport models driven by realistic fluxes. In practice,
σmodel is usually larger than σmeas for most
10 s averages. On average, σ10s is ∼ 1.5 ppm and ∼ 0.9 ppm for land and ocean
soundings, respectively.
One final point to note is that in OSSEs random perturbations are often
added to the data to simulate random measurement error.
That is relevant when the goal is
to get an accurate estimate of the analytical posterior uncertainty of the
flux. In this work, however, the goal is to estimate the spread in flux
estimates due to the relative bias between different transport models.
Moreover, inversion groups assimilating real OCO-2 and surface data do not
add random error to those measurements, so differences in flux estimates
between different groups have no contribution from this kind of added random
measurement error. Therefore, in this work we have not added any
perturbations to our synthetic measurements.
Note about the impact of transport models
If two different transport models ($K_1$ and $K_2$) are used to assimilate
data $y$ starting from the same prior $x_a$ and with the same
error matrices $S_a$ and $S_\epsilon$, then their respective posterior flux
estimates will be
$$\hat{x}_i = x_a + \left(I - \hat{S}_i S_a^{-1}\right)(x_t - x_a), \qquad \hat{S}_i = \left(S_a^{-1} + K_i^T S_\epsilon^{-1} K_i\right)^{-1},$$
where $x_t$ is the true flux. Therefore the difference between the two flux estimates will be
$$\hat{x}_1 - \hat{x}_2 = \left(\hat{S}_2 - \hat{S}_1\right) S_a^{-1}(x_t - x_a).$$
That is, the transport-related flux difference depends on the distance from
the prior to the true flux, as well as $\hat{S}_i$, which is determined by
the interaction between the error matrices and the transport model $K_i$.
However, this expression makes a crucial assumption, namely that
both transport models are unbiased, or $y = K_i x_t + \epsilon$, where $\epsilon$ is the random error of $y$. In
practice, this is never the case, and for flux inversions the error due to a
transport model is usually because the transport model is biased with respect
to true atmospheric transport, at spatiotemporal scales of interest. In our
experiment, we mimic this by letting “nature” be each of five transport
models (TM5, PCTM, LMDZ, ACTM, GEOS-Chem) in turn. As long as these models
span the range of transport in nature, the
uncertainty in fluxes coming out of our experiment will be a reasonable
estimate of the uncertainty due to the difference between modeled and true
atmospheric transport. In our experiment, the difference between two flux
estimates from pseudo-data produced by two different transport models $K_1$
and $K_2$ is
$$\hat{x}_1 - \hat{x}_2 = \hat{S} K^T S_\epsilon^{-1} (K_1 - K_2)\, x_t,$$
where $x_t$ are the true fluxes in our OSSE, and $\hat{x}_i$ is
the flux estimate when synthetic observations produced by model $K_i$ are
assimilated in TM5 4DVAR. $K$ represents the transport and observation
operator of TM5, while $\hat{S}$ depends on $K$, $S_a$ and $S_\epsilon$. In a
real data inversion, flux estimates from two different inversion frameworks
that happen to use transport models $K_1$ and $K_2$ will not necessarily
differ by the amount given in Eq. (), because of other
choices made in setting up the inversion systems. Rather,
Eq. () can be thought of as the range of flux
estimates possible in a typical flux inversion (TM5 4DVAR in our case) if
$K_1$ and $K_2$ span the range of possible real atmospheric transport. It
should be noted that the range as expressed in Eq. ()
does not depend on the flux prior $x_a$, but only on the prior
uncertainty $S_a$ through its influence on $\hat{S}$.
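The two expressions for the flux difference can be checked numerically on a toy linear Gaussian problem. The sketch below is only an illustration: the matrix sizes, the random operators and the helper names (`posterior_cov`, `invert`) are assumptions made here, and noise-free pseudo-data are used so that the expected posterior is obtained directly. It is not the TM5 4DVAR system.

```python
import numpy as np

rng = np.random.default_rng(0)
n_flux, n_obs = 4, 12                      # toy dimensions, purely illustrative

x_t = rng.normal(size=n_flux)              # "true" fluxes
x_a = rng.normal(size=n_flux)              # prior fluxes
S_a = np.diag(rng.uniform(0.5, 2.0, n_flux))   # prior error covariance
S_e = np.diag(rng.uniform(0.1, 0.5, n_obs))    # observation error covariance

# Hypothetical transport operators: K is the inversion model,
# K1 and K2 are two slightly different "nature" models.
K = rng.normal(size=(n_obs, n_flux))
K1 = K + 0.1 * rng.normal(size=(n_obs, n_flux))
K2 = K + 0.1 * rng.normal(size=(n_obs, n_flux))


def posterior_cov(op):
    """S_hat = (S_a^-1 + op^T S_e^-1 op)^-1."""
    return np.linalg.inv(np.linalg.inv(S_a) + op.T @ np.linalg.inv(S_e) @ op)


def invert(op, y, prior):
    """Posterior mean of a linear Gaussian inversion using operator `op`."""
    gain = posterior_cov(op) @ op.T @ np.linalg.inv(S_e)
    return prior + gain @ (y - op @ prior)


# Case A: each model is unbiased with respect to the data it assimilates.
xa1 = invert(K1, K1 @ x_t, x_a)
xa2 = invert(K2, K2 @ x_t, x_a)
diff_unbiased = (posterior_cov(K2) - posterior_cov(K1)) @ np.linalg.inv(S_a) @ (x_t - x_a)

# Case B (OSSE): pseudo-data from K1 and K2, both assimilated with K.
xb1 = invert(K, K1 @ x_t, x_a)
xb2 = invert(K, K2 @ x_t, x_a)
diff_osse = posterior_cov(K) @ K.T @ np.linalg.inv(S_e) @ (K1 - K2) @ x_t

# Repeating case B with a shifted prior leaves the difference unchanged.
xb1_alt = invert(K, K1 @ x_t, x_a + 1.0)
xb2_alt = invert(K, K2 @ x_t, x_a + 1.0)

print(np.allclose(xa1 - xa2, diff_unbiased))
print(np.allclose(xb1 - xb2, diff_osse))
print(np.allclose(xb1_alt - xb2_alt, diff_osse))
```

In this toy setup the OSSE difference is unchanged when the prior is shifted, in line with the statement that the range does not depend on the flux prior.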
Difference between transport models
OCO-2 has a local overpass time of 13:30, and most surface measurements
assimilated in flux inversions – except for mountaintop sites – are from
the afternoon once a fully mixed PBL has formed.
Therefore, the mid-afternoon CO2 mole fraction difference between
models, both in the PBL and in the total column, would contribute to flux
differences in our experiment. The zonal average of those differences between
1 Dec 2014 and 1 Mar 2016 is plotted in Fig. ,
where the lowest 150 hPa is an approximation for the mid-afternoon
PBL depth. Maps of these differences for summer, winter and the annual
average are shown in Figs.
and in the Appendix. For each grid cell, the
median CO2 mole fraction of all five models was subtracted from each
model to highlight model differences instead of large-scale features common
to all models. All modeled CO2 fields were mapped to a global
1° × 1° grid while conserving mass. Since the models had
varying resolutions and grid registrations, this resulted in unavoidable
checkered patterns in the differences in Fig. .
That, however, did not impact the large-scale model-to-model differences
shown.
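The median-subtraction step described above can be sketched as follows. This is a minimal illustration with randomly generated placeholder fields on a 1° × 1° grid; the array names and the synthetic values are assumptions made here, and the mass-conserving regridding that precedes this step in the text is not reproduced.

```python
import numpy as np

models = ["ACTM", "GEOS-Chem", "LMDZ", "PCTM", "TM5"]
rng = np.random.default_rng(1)

# Placeholder CO2 fields (ppm), one per model, already on a common
# 180 x 360 grid. Real fields would come from the regridded model output.
common = 400.0 + rng.normal(scale=2.0, size=(180, 360))
fields = np.stack(
    [common + rng.normal(scale=0.3, size=common.shape) for _ in models]
)                                           # shape: (model, lat, lon)

# Per-grid-cell median across the five models.
cross_model_median = np.median(fields, axis=0)

# Subtracting the median removes features common to all models
# and leaves the model-to-model differences.
anomalies = fields - cross_model_median

# Zonal average of the differences. On a regular grid all cells along
# one latitude have the same area, so a plain mean over longitude suffices.
zonal_mean_anomaly = anomalies.mean(axis=2)  # shape: (model, lat)

for name, profile in zip(models, zonal_mean_anomaly):
    print(name, float(profile.mean()))
```

With an odd number of models, one model coincides with the median in each grid cell, so the median of the anomalies is zero everywhere by construction.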
In Fig. , the agreement across models is generally
better over the Southern Hemisphere (SH) than over the north. This is
primarily driven by larger ocean masses in the south than in the north,
since,
as Figs. and show, the
agreement across models is generally higher over oceans than over land. This
is expected because (a) vertical transport, one of the major axes of
variability across models, is stronger over land than over oceans, and (b) surface flux variability is also higher over land than over oceans,
amplifying the difference between transport models when viewed in the
CO2 concentration space. Models driven by the same parent meteorology
do not necessarily show the same features in the modeled CO2 field.
In the Northern Hemisphere (NH) summer, LMDZ shows faster exchange between
the continental PBL and the free troposphere (FT) than TM5, evidenced by
higher CO2 mole fractions in the continental PBL in
Fig. . By similar logic, PCTM shows much slower
PBL–FT exchange than GEOS-Chem. In the NH winter, contrary to summertime, at
the northern temperate latitudes PCTM and TM5 exhibit faster PBL–FT exchange
than GEOS-Chem and LMDZ, respectively. The two models driven by
GEOS-derived winds (GEOS-Chem and PCTM) are significantly different in the
PBL over North and South America, East Asia and tropical Africa throughout
the year. The corresponding differences between the two models driven by ERA-Interim
winds (LMDZ and TM5) are smaller. ACTM has an overall low bias of ∼ 0.5 ppm in the PBL, which shows up to a lesser extent in the
total column (Fig. ) and the total atmospheric
CO2 mass (Fig. ). However, such an overall
bias should not affect fluxes estimated from ACTM pseudo-observations (henceforth “pseudo-obs”). ACTM
also appears to trap more (compared to the model median) of the wintertime
respiration signal from boreal Eurasia in the PBL
(Fig. ), which should have implications for boreal
flux estimates.
The zonal average difference between each
model (ACTM, GEOS-Chem, LMDZ, PCTM and TM5) and the cross-model median at
13:30 local time, in ppm CO2, between 1 December 2014
(2014.915) and 1 March 2016 (2016.164). (a) depicts differences
in the lowest 150 hPa, which is an approximation for the PBL. (b) depicts differences in column averaged CO2. Each column
has its own color bar. Since transport differences in the total column are
smaller than in the PBL, the dynamic range of (b) is half that
of (a).
In the total column, GEOS-Chem and PCTM look very different in the NH summer,
with PCTM trapping more of the NH summertime uptake and SH wintertime
respiration signals in the respective hemispheres. In the NH winter,
GEOS-Chem displays the tropical Asian biomass burning signal more strongly in
the total column than PCTM, while the East Asian fossil fuel enhancement is
higher in the GEOS-Chem XCO2 throughout the year. In the NH summer,
LMDZ appears to transport more of the temperate and boreal uptake signal to
the south than TM5, leading to slightly higher XCO2 values in
the north. In the NH winter, conversely, TM5 appears to transport more of the
northern respiration signal to the south.
Results
Figure shows the range of the annual CO2
flux from assimilating synthetic observations produced by the five different
transport models. For each region, the black horizontal line denotes the
estimate from assimilating pseudo-obs generated by TM5; i.e., it is the
“perfect-transport” OSSE. The other four models are not distinguished here
for visual clarity, but Fig. in
Appendix marks them separately. The range of
the annual flux estimates across the five forward models in
Fig. , which is a measure of the transport model
uncertainty in the flux estimates, is tabulated for all regions and data
streams in Table in the Appendix.
Real satellite retrievals of XCO2 have spatially coherent and
sampling-mode-dependent biases due to interfering species such as aerosols
and water, surface effects such as albedo and elevation, and geometric
effects such as the solar zenith angle. However, synthetic data generated by
the five transport models, which serve as the input in our inversions, do not
have such biases. Hence the range of flux estimates from different data sets
is purely determined by the coverage difference between different sampling
modes and the type of measurement (total column versus near-surface point),
while the differences between the flux estimates from pseudo-obs generated by
different models (horizontal lines within each color bar in
Fig. ) is a measure of the inter-model transport
difference as sampled by a particular observing mode/network. In this
context, the horizontal black lines in Fig.
represent perfect-transport inversions, meaning the synthetic
observations were generated and assimilated with the same transport model.
Therefore, the difference between those lines (TM5) and true fluxes (white
circles) in the figure represents the balance between Sa and Sϵ
in our setup of TM5 4DVAR, and a smaller difference from a different model
(any other horizontal line) should not be interpreted as significant. It
should also be noted that our goal is not to rank models according to their
proximity to true fluxes in Figs.
and , but rather to quantify the spread across
different models used to generate the synthetic data and how that spread
varies with sampling and coverage.
Annual flux estimates from
land (a) and ocean (b) regions and zonal
bands (c). For each region, the prior and true fluxes are shown by a
gray diamond and a white circle, respectively. The different color bars
correspond to different synthetic data streams assimilated: IS stands for in
situ; LN, LG and OG stand for OCO-2 land nadir, land glint and ocean glint, respectively; and LNLG = LN + LG (all
land soundings). The data streams IS-LNLG and IS-OG are theoretical PBL
sampling networks at OCO-2 sounding locations and times, described in
Sect. . For each color, the vertical extent of the
bar denotes the range (minimum to maximum) of the flux estimates from
pseudo-data produced by the five transport models for that data stream. The
black horizontal line through each bar denotes the estimate from TM5
pseudo-obs, while the fainter horizontal lines denote the estimates from the
pseudo-obs produced by the other four models. The individual models are not
distinguished here for visual clarity but are marked separately in
Fig. in
Appendix .
Figures and show
the range of monthly fluxes from TRANSCOM-like land and ocean regions for
each type of synthetic data stream assimilated. For visual clarity, only the
range across the five models has been shown instead of individual flux
estimates. The land regions in Fig. are
identical to the TRANSCOM regions except that Africa has been partitioned
into Saharan and sub-Saharan Africa instead of north and south of the
Equator.
Monthly flux estimates from
TRANSCOM-like land regions and global total land. The different colors
correspond to different synthetic data streams assimilated, as in
Fig. . The different models used to generate the
synthetic data have not been distinguished here to minimize visual clutter.
The theoretical PBL networks IS-LNLG and IS-OG have also been omitted for the
same reason. Plots of seasonal fluxes over many more regions, with the models
distinguished, are included in the Supplement.
Same as Fig.
except over TRANSCOM ocean regions and global total ocean.
Discussion
Global budget
All five models were run from the same initial CO2 field with the
same surface fluxes. The resulting global burdens of CO2 in the models
were close but slightly different, as shown in Fig. .
The increase in the global average CO2 mole fraction between
1 January 2015 and 1 January 2016 ranged from 2.89 ppm (TM5) to 2.97 ppm
(LMDZ). That 0.08 ppm range in the mole fraction, given the dry air
mass of TM5, corresponds to a range of 0.16 PgC in the change in the
global CO2 burden over 2015. Therefore, even if our pseudo-data
inversions reproduce the global CO2 budget for 2015 exactly, we can expect
a variation of up to 0.16 PgC in that budget owing to the small
model-to-model differences in Fig. .
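The conversion from a mole fraction range to a carbon mass range can be sketched as below. The dry air mass and molar masses are generic assumed values, not TM5's actual dry air mass, so the result only approximates the 0.16 PgC quoted above; the function name is hypothetical.

```python
# Assumed constants (generic textbook values, not taken from TM5).
DRY_AIR_MASS_KG = 5.13e18        # total mass of dry air in the atmosphere
MOLAR_MASS_AIR = 28.9644e-3      # kg per mole of dry air
MOLAR_MASS_C = 12.011e-3         # kg per mole of carbon
KG_PER_PG = 1.0e12               # 1 Pg = 1e15 g = 1e12 kg


def ppm_to_pgc(delta_ppm, dry_air_mass_kg=DRY_AIR_MASS_KG):
    """Convert a change in global mean CO2 mole fraction (ppm) to PgC."""
    moles_air = dry_air_mass_kg / MOLAR_MASS_AIR
    moles_c = delta_ppm * 1.0e-6 * moles_air
    return moles_c * MOLAR_MASS_C / KG_PER_PG


# Growth in the global mean mole fraction over 2015 (from the text above).
growth_tm5_ppm = 2.89
growth_lmdz_ppm = 2.97

burden_range_pgc = ppm_to_pgc(growth_lmdz_ppm - growth_tm5_ppm)
print(f"PgC per ppm: {ppm_to_pgc(1.0):.2f}")
print(f"Range in 2015 burden change: {burden_range_pgc:.2f} PgC")
```

With these assumed constants the factor is a little above 2.1 PgC per ppm, giving a range of roughly 0.17 PgC. The small difference from 0.16 PgC is expected given the rounding of the mole fractions and the model-specific dry air mass.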
The global total CO2 flux in Fig. shows
a spread of ∼1.5 PgC yr-1 for in situ inversions, which is larger
than the spread seen in earlier inverse model intercomparisons such as
. This is because intercomparisons such as
typically report the constraint on the multi-year
average global growth rate, while here we are looking at the constraint on a
single year's growth rate from in situ samples.
compared eight different inverse models of a single year using in situ data
and found a spread of 1.73 PgC yr-1 across models for the annual growth
rate, with a standard deviation of 0.5 PgC yr-1. The inversions in
were less controlled than our setup, since they
used different flux and measurement covariances as well as different
transport models. Therefore, in our more controlled experiment, a spread of
1.5 PgC yr-1 is reasonable among the different in situ data streams. It
is noteworthy that the spread in the global total flux in
Fig. for the OCO-2 pseudo-data inversions is
∼ 0.25 PgC yr-1, close to the previously calculated limit of
0.16 PgC yr-1. This reduction in the spread from in situ to OCO-2
inversions is primarily due to the more spatially extensive sampling of OCO-2
and not because of OCO-2's sensitivity to the total column (as opposed to the
surface layer), evidenced by the ∼ 0.25 PgC yr-1 spread in the
global CO2 flux from IS-LNLG and IS-OG inversions in
Fig. . This suggests that, compared to the
current in situ network, a more spatially extensive sampling strategy,
whether total column or PBL, can provide a stricter constraint on the global
CO2 budget that is less sensitive to transport model specifics.
Large-scale partitioning of the global budget
The partitioning of the 2015 global
CO2 sink into two geographical domains, with the tropics being
defined as 23.5° N and S latitudes. Each color represents one
type of synthetic data assimilated, while each symbol shape represents one
model used to generate the synthetic data. The diagonal gray line represents
the 2015 global sink of 3.64 PgC yr-1 in the true fluxes used to
generate the synthetic data, while the large plus sign denotes their
partitioning. The scales are identical across all four panels, but not the
origins.
The partitioning of the 2015 global
CO2 sink into two geographical domains, with the tropics being
defined as 23.5° N and S latitudes. This is similar to
Fig. , except that we have compared two real OCO-2 and
one real in situ sampling schemes (LNLG, OG, IS) with the two theoretical in
situ ones (IS-LNLG, IS-OG) of Sect. . The scales
are identical across three of the four figures and the same as
Fig. ; the partitioning of global and ocean fluxes had
significantly more spread and required a different scale.
The spread in the flux partitioning
across five models from the assimilation of different pseudo-data streams.
This is a tabulated summary of the information in Figs.
and . For each pseudo-data stream (e.g., MBL) and
each partitioning (e.g., 23.5° N, which is the dividing line between
the northern extra-tropics and the rest), the table contains the spread
across five models of the sum and difference of the fluxes between the two
partitions. All numbers are in PgC yr-1.
Partitioning  MBL         IS          LN          LG          LNLG        OG          IS-LNLG     IS-OG
              sum   diff  sum   diff  sum   diff  sum   diff  sum   diff  sum   diff  sum   diff  sum   diff
Equator       1.71  2.27  1.51  3.02  0.22  1.44  0.24  1.59  0.24  1.49  0.29  1.81  0.33  1.66  0.29  2.13
Land–ocean    1.71  3.74  1.51  2.49  0.22  1.99  0.24  1.86  0.24  2.35  0.29  0.75  0.33  9.71  0.29  1.92
23.5° N       1.71  1.67  1.51  2.08  0.22  1.80  0.24  1.59  0.24  2.03  0.29  2.20  0.33  1.62  0.29  1.65
23.5° S       1.71  2.37  1.51  2.12  0.22  1.44  0.24  1.59  0.24  1.46  0.29  1.95  0.33  1.61  0.29  1.84
The global atmospheric growth rate of CO2 (denoted $C$ below) is
determined by the fossil fuel ($F_\mathrm{ff}$) emissions and the global sink
from the land biosphere ($F_\mathrm{bio}$) and oceans
($F_\mathrm{oce}$):
$$\frac{dC}{dt} = F_\mathrm{ff} + F_\mathrm{bio} + F_\mathrm{oce},$$
where $F_\mathrm{bio}$ includes fire emissions. CO2 inversions
typically assume a known $F_\mathrm{ff}$ and estimate $F_\mathrm{bio}$ and
$F_\mathrm{oce}$ from atmospheric observations of CO2. Therefore, in a
suite of inversions assuming the same $F_\mathrm{ff}$, the global total sink
$F_\mathrm{bio} + F_\mathrm{oce}$ is constrained to a number whose uncertainty is
determined by how well the global CO2 budget is determined by the
CO2 observations assimilated. A plot of the estimated $F_\mathrm{oce}$
vs. $F_\mathrm{bio}$ from the suite should therefore be clustered around a
straight line with a slope of -1. The same logic applies for any other
two-way partitioning of the global sink, such as Northern versus Southern
Hemisphere. Figures and
show four different two-way partitionings of the global total CO2
sink from our ensemble of inversions of synthetic data. The straight line
with slope -1 corresponds to the global total sink of
-3.64 PgC yr-1 in our true fluxes used to generate the observations.
For each inversion estimate, the distance from that straight line is a
measure of how much the estimated global budget deviates from the true global
budget for 2015, while the position along the line is an indication of how
the inversion splits the global budget into the two partitions.
Table contains summary statistics from
Figs. and . For each data
stream (e.g., MBL) and partitioning (e.g., Equator, which corresponds to the
partitioning between the Northern and Southern Hemisphere), the table
contains the spread in the sum and difference of fluxes between the two
partitions. The spread in the sum is a measure of the uncertainty in the
global budget as constrained by that data stream, while the spread in the
difference is indicative of the uncertainty in the partitioning.
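The sum and difference statistics can be illustrated with a short sketch. The flux values below are made up for illustration and are not results from this study; only the true global sink of -3.64 PgC yr-1 is taken from the text, and the helper name `spread` is an assumption.

```python
import numpy as np

TRUE_GLOBAL_SINK = -3.64   # PgC/yr, global total sink in the true fluxes

# Hypothetical flux estimates (PgC/yr) for one data stream, one value per
# forward model. These numbers are invented purely for illustration.
f_land = np.array([-2.1, -1.6, -2.4, -1.9, -2.0])
f_ocean = np.array([-1.6, -2.0, -1.3, -1.7, -1.6])


def spread(values):
    """Range (maximum minus minimum) across the model ensemble."""
    return float(np.max(values) - np.min(values))


total = f_land + f_ocean

# Spread of the sum: how tightly the global budget is constrained.
sum_spread = spread(total)

# Spread of the difference: how uncertain the two-way partitioning is.
diff_spread = spread(f_land - f_ocean)

# Perpendicular distance of each estimate from the slope -1 line
# f_land + f_ocean = TRUE_GLOBAL_SINK in the (f_land, f_ocean) plane.
distance_from_line = (total - TRUE_GLOBAL_SINK) / np.sqrt(2.0)

print(f"spread of sum:  {sum_spread:.2f} PgC/yr")
print(f"spread of diff: {diff_spread:.2f} PgC/yr")
print("distance from true-budget line:", np.round(distance_from_line, 3))
```

In this invented example the sums cluster tightly while the differences vary widely, which is the pattern of points scattered along, but close to, the slope -1 line.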
The global budget for a single year is constrained poorly by inversions with
IS and MBL pseudo-data, evidenced by the large spread of the global sum in
Table and the scatter of the IS and MBL points
around the -3.64 PgC yr-1 straight line in
Figs. and . This is
consistent with the larger spread in the global sink estimate of inversions
with IS and MBL data in Fig. . Among the models,
PCTM pseudo-obs seem to demand a higher CO2 flux consistently, while
ACTM and GEOS-Chem pseudo-obs demand slightly lower CO2 fluxes. Since
growth in the atmospheric CO2 burden was nearly the same for all the models
in 2015 (Fig. ), these differences are due to
large-scale transport differences sampled by the in situ network.
Since the OCO-2 pseudo-obs in this OSSE are bias free, differences in the
partitioning from different sounding modes (LN, LG, OG and land or LNLG) are
purely due to sampling differences. This includes the obvious difference of
sampling the atmosphere over land and ocean surfaces, and also a more subtle
difference in the timing of the samples, coming from the fact that during the
early part of the OCO-2 record up to July 2015 the satellite operated
continuously for 16 days in nadir (glint) mode before switching to glint
(nadir). As a result, land nadir and land glint samples over the same
location could be separated by up to 16 days. Since CO2 fluxes can
change significantly over 16 days, this may give rise to differences in LN
and LG derived flux estimates. The impact of spatiotemporal differences in
sampling is evident in Fig. . Among assimilations of
OCO-2 pseudo-obs (LN, LG, LNLG, OG) simulated by a single forward model,
there can be a ∼ 0.5 PgC yr-1 spread in the partitioning across a
latitude, whether the Equator or one of the tropics, while the land–ocean
partitioning is more uncertain, with a spread of up to
∼1.5 PgC yr-1. Interestingly, the land–ocean partitioning seems to
be better pinned down by OCO-2 ocean soundings than land soundings, evidenced
by the smaller inter-model spread when assimilating OG pseudo-obs than when
assimilating LN, LG or LNLG pseudo-obs. The same does not appear to hold for
any latitudinal partitioning.
Finally, we contrast the partitioning from LNLG (OG) with that from IS-LNLG
(IS-OG) to gauge the impact of transport error on PBL versus total column
measurements. The IS-LNLG (IS-OG) network, which has spatially extensive PBL
sampling only over land (ocean), has a much larger spread in the land–ocean
partitioning than the LNLG (OG) network of column samples. This
suggests that, if the goal is to partition land and ocean fluxes, PBL sampling
can amplify differences across transport models, which are larger in the PBL
than in the total column . Moreover, comparing the spreads
of IS-LNLG and IS-OG inversions suggests that these transport differences are
larger over land than over ocean. If the goal, however, is to partition the
global budget across a latitude (i.e., the other three partitionings in
Table ), column sampling does not appear to have
an obvious advantage over PBL sampling. This is likely because of the fast
zonal mixing of the CO2 flux signal; i.e., the flux signal missed by
PBL samples at one location due to incorrectly modeled vertical mixing will
be seen by PBL sites downstream within the same zonal band.
Annual fluxes at zonal, continental and TRANSCOM scales
The spread in flux estimates across the five forward models, or the
transport-driven uncertainty, is very similar in
Fig. and Table
between IS and MBL data streams for most regions. Over some land regions that
have seen a significant increase in measurement density since
, such as North America and Europe, the additional
measurements in IS result in a smaller uncertainty compared to MBL. Over land
regions where the coverage of IS and MBL are almost identical, such as Africa
and tropical Asia, the uncertainties are (not surprisingly) comparable
between IS and MBL. Over ocean regions, the IS and MBL uncertainties are very
similar, except over the Pacific, where the increased coverage in IS on the
west coast of North America is likely responsible for the reduction in
uncertainty. The uncertainties in the global uptake and in the global land and
ocean fluxes are slightly smaller for the IS network than for the MBL
network. However, for most other zonal regions the IS and MBL uncertainties
are roughly equal, likely because of the fast zonal mixing in the atmosphere.
The regional annual flux estimates of Fig. show
that the spread among land flux estimates when assimilating OCO-2 pseudo-data
over land (LN, LG and LNLG) is often smaller than when assimilating in situ
data (IS, MBL). This could be a combination of the total column nature of
OCO-2 pseudo-data and its increased spatial homogeneity of coverage. To
separate the two effects, we look at IS-LNLG, which has the same coverage as
LNLG but only PBL samples instead of total columns. Over certain regions,
such as temperate North and South America, and temperate Eurasia, the IS-LNLG
spread is larger than the IS spread, which is larger than the LNLG spread.
This suggests that over those regions the transport model error – relative
to the flux signal – in the total column is smaller than in the PBL, leading
to lower transport-driven uncertainty in total column CO2
assimilations than in situ CO2 assimilation. Sampling the PBL more
densely over those regions is likely to increase transport-driven uncertainty
in fluxes. This is consistent with the hypothesis of .
However, over some other regions, such as boreal Eurasia and tropical South
America, the IS-LNLG spread is much smaller than the IS spread, suggesting
that over those regions, the reduction in uncertainty going from IS to LNLG
is primarily due to the more uniform spatial coverage and not due to total
column sampling. In fact, over tropical South America the IS-LNLG spread is
smaller than the LNLG spread, suggesting that the transport error in the
total column is larger than that in the PBL. Finally, over regions such as
Europe, the ordering of IS, IS-LNLG and LNLG uncertainties suggests that the
reduction in uncertainty in going from IS to LNLG is partly due to the more
spatially uniform coverage and partly due to total column sampling.
Over some ocean regions such as the temperate North Pacific and South
Atlantic, the IS-OG spread is larger than the OG spread, suggesting that
modeling the PBL is more uncertain than modeling the total column over those
regions. However, the opposite is true over several other ocean regions, such
as the temperate North Atlantic and South Pacific. Thus, the hypothesis of
cannot be said to hold over most ocean regions. Finally,
one striking feature of ocean fluxes in Fig.
is worth pointing out here. The transport-derived uncertainty for IS-LNLG
estimates is often the largest among all data streams, which leads to a large
uncertainty in the global land–ocean partitioning using the IS-LNLG network.
This suggests that increasing the PBL sampling only over land – where the
transport models disagree more – is likely to worsen ocean flux estimates in
the presence of imperfect transport models.
The global uptake and its partitioning between land and ocean, or the Northern
and Southern Hemisphere, are less uncertain for XCO2 assimilations
than for in situ CO2 assimilations (IS and MBL). Looking at the IS-LNLG
and IS-OG inversions, we conclude that the improvement in the global budget
and its north–south partitioning is likely due to a more uniform spatial
coverage, while the improvement in land–ocean partitioning is likely due to
the total column nature of the OCO-2 pseudo-data. Partitioning the budget in
zonal bands – i.e., northern extra-tropics, tropics and southern
extra-tropics – has (roughly) the same uncertainty across all inversions. This is likely due
to the fast zonal flow in the free troposphere, which ensures that surface
flux signals missed by one set of measurements – perhaps due to imperfect
transport – are seen by other measurements in the same zonal band.
Traditionally, inversions of surface CO2 data have had larger
uncertainty in tropical flux estimates than in northern temperate
regions, stemming from the sparse observational coverage in the tropics
. The larger interannual variability of the tropical
flux, seen by several inversion studies including and
, is also ascribed partly to the higher uncertainty in
tropical flux estimates. In contrast, the uncertainty in flux estimates
stemming from uncertainties in modeled transport does not have the same
correlation with observational coverage. For inversions with in situ data,
the relatively well-covered regions of temperate North America and Europe
show the same transport-derived uncertainty as the poorly covered regions of
temperate South America and tropical Asia
(Fig. ). In general, we do not find that the
uncertainties in flux estimates due to transport model errors are lower over
the northern temperate latitudes than over less measured tropical and
southern temperate areas.
One final noteworthy aspect of the flux estimates of
Fig. is that for some regions (such as
temperate South America, Atlantic tropics, Southern Ocean, south Indian
temperate, tropical oceans, Indian Ocean, the southern extra-tropics and
southern extra-tropical land) the range of in situ flux estimates does not
overlap with the range of LN, LG or LNLG (and sometimes OG) flux estimates.
For some other regions such as the Indian Ocean and the Southern Ocean, there
is no overlap between the OCO-2 land (LN, LG, LNLG) and ocean (OG) estimates.
Since there are no biases between the IS, OCO-2 land and ocean pseudo-data,
these flux differences suggest that spatiotemporal coverage differences
between different observation networks and OCO-2 sampling modes can lead to
flux differences that are larger than uncertainties due to transport.
Monthly fluxes
Figures and show
the monthly flux estimates for 2015 from TRANSCOM-like land and ocean
regions. As before, only the spread across the pseudo-data generated by the
five transport models is shown for visual clarity. The reduced sensitivity of
OCO-2 pseudo-data inversions to transport model uncertainty is obvious for
most months over both land and ocean regions. As before, this reduced
sensitivity is from a combination of two factors: (a) spatially uniform
coverage of OCO-2 compared to the in situ network and (b) the assimilation
of column average XCO2 as opposed to PBL CO2. The relative
importance of the two factors – as gauged by the relative sizes of the bars
between the OCO-2 (LN, LG, LNLG, OG), real in situ (IS) and hypothetical in
situ (IS-LNLG, IS-OG) data streams in Figs.
and – varies by region and season. For example,
in October in sub-Saharan Africa, going from the sparse IS network to the
more uniform IS-LNLG network reduces the flux uncertainty significantly, but
going from PBL measurements (IS-LNLG) to the total column (LNLG) does not
reduce the uncertainty further. In contrast, over the same region in
December, the increased PBL sampling of the IS-LNLG network inflates the flux
uncertainty compared to the IS network, while going from PBL sampling
(IS-LNLG) to the total column (LNLG) brings that uncertainty down
significantly. In general, over most land regions and most months, given
OCO-2's spatiotemporal sampling, assimilating total column CO2 (LNLG)
results in equal or lower transport-driven uncertainty than assimilating PBL
CO2 (IS-LNLG). The same relationship holds between IS-OG and OG
inversions over ocean regions with a few exceptions (e.g., south Indian
temperate in June and July). However, the relationship between the real (IS)
and hypothetical (IS-LNLG, IS-OG) networks is less general and reflects the
impact of different sampling.
The transport-derived uncertainty in monthly fluxes has clear seasonality
over most land and ocean regions. In general, over temperate and boreal land
regions, the uncertainty is higher in the summer than in the winter, likely
due to stronger convective transport and higher horizontal wind shear in the
summer months. Temperate oceans sometimes display the opposite behavior
(e.g., temperate North Atlantic and North Pacific), whereby transport-driven
uncertainty is lower in the summer and higher in the winter. This is likely
because advective, and not convective, transport uncertainty is the dominant
uncertainty over oceans. Over the tropics the distinction is less clear cut,
with no clear commonality between tropical Asia and tropical South America.
Over the tropical Indian Ocean, the uncertainty is lowest in the last third
of the year, whereas in the tropical Pacific the uncertainty is lowest in
the middle of the year.
Over certain ocean regions (e.g., Atlantic tropics, east Pacific tropics,
south Indian temperate, Southern Ocean), the range of monthly fluxes obtained
from synthetic XCO2 over land (LN, LG and LNLG) often does not
overlap at all with the range obtained from either the ocean data (OG) or in
situ data (IS). Sometimes, the OCO-2 land pseudo-data inversions overlap with
the ocean pseudo-data inversions but not with the true fluxes (e.g.,
temperate North Atlantic and North Pacific). Since there are no coherent
biases in OCO-2 pseudo-data in these synthetic data experiments, the
differences between land and ocean XCO2 inversions, or between either
set and the true fluxes, can only be due to differences in sampling the same
CO2 field with different sets of sampling times and locations. These
sampling differences can lead to flux differences that are larger than the
transport-driven uncertainty in fluxes. As noted earlier, this implies that
in real data inversions biases can appear between land and ocean XCO2
inversions, or between OCO-2 and in situ inversions, purely due to an
imperfect transport model sampling the same field according to different
sampling patterns. This can, for example, lead to biased flux estimates when
ocean fluxes are inferred using OCO-2 land soundings, even when the
retrievals are unbiased.
Conclusions
In this work, we have used five different transport models in an
OSSE to estimate the uncertainty in inversion-derived flux estimates due to
the uncertainty of the modeled transport in flux inversions. The five
transport models were driven by four different state-of-the-art reanalyzed
meteorological data sets that are commonly used in the flux inversion
community and therefore could be expected to span the spectrum of transport
model behavior. In the OSSE, we created synthetic in situ and column
CO2 measurements by running the five transport models forward with
the same boundary conditions and then assimilated those measurements in a
single flux inversion system. The spread in the flux estimates was therefore
purely due to the spread among the five transport models. We tested this
setup for different sampling protocols: (a) an in situ set corresponding to
NOAA's present-day cooperative air sampling network; (b) an in situ set of
mostly background sites corresponding to the network used by
for the TRANSCOM 3 model intercomparison experiment; (c) a set of
XCO2 measurements corresponding to OCO-2 land nadir, land glint and
ocean glint soundings, convolved with corresponding OCO-2 averaging kernels
and priors; and (d) a set of in situ samples within the PBL at the times and
locations of OCO-2 land and ocean soundings. This allowed us to test the
interaction of imperfect transport, observational coverage and the
assimilation of column versus PBL mole fractions. Our use of the OCO-2 data
– both the temporal averaging and the errors in those averages – followed
the current protocol used by OCO-2 flux modelers, and therefore our results
should be directly usable by the modelers to draw conclusions about their
real data inversions. There are four important take-home messages from this
work that we would like to convey.
MBL vs. IS
A comparison of the spread of flux estimates from the MBL and IS inversions
suggests that the added coverage from mostly continental sites on top of the
mostly background network considered by can reduce
transport-induced uncertainty over land regions, despite the uncertainty in
transport over continents. This is likely due to the added observations
averaging out some of the transport variability. The added coverage has
minimal or even negative benefit in reducing the transport-induced uncertainty
of ocean flux estimates and of estimates over zonal bands, except for the
Pacific Ocean and its temperate and tropical subdivisions.
Geographical distribution of transport uncertainty
For inversions of in situ data, flux estimates over the tropics have been
historically less certain than estimates over the northern temperate regions,
owing to lower observational coverage over the former. In previous work, the
uncertainty of fluxes purely due to transport was also found to be slightly
higher over tropical regions than over extra-tropical regions
. However, in this work, we see that this demarcation does
not hold for flux uncertainty stemming from transport model uncertainty. For
example, the spread among IS inversions over temperate North America or
Europe in Fig. is as large as their spread over
tropical Asia or temperate South America, respectively, despite the first two
being much better covered with CO2 samples.
Column vs. PBL CO2
hypothesized that inversions of column average
CO2 may be less sensitive to vertical transport errors than PBL
CO2, since redistribution of CO2 in the vertical does not
change the column average. However, the variation of column CO2 due
to fluxes is also much smaller than in the PBL. The transport model
sensitivity of column CO2 inversions depends on the balance between
this smaller flux signal and smaller transport error. In our experiments, we
see that over TRANSCOM-scale and larger land regions (except tropical South
America) inversions using column CO2 data over land (LNLG) are
indeed less sensitive to transport errors than inversions using PBL
CO2 at the same locations and times (IS-LNLG). Over TRANSCOM-scale
ocean regions, however, the picture is more ambiguous, as several regions
(e.g., Atlantic Ocean, South Pacific temperate, North Atlantic temperate,
Southern Ocean) display a smaller uncertainty when assimilating PBL
CO2 (IS-OG) than column CO2 (OG). This is likely because the
uncertainty in convective transport is smaller over oceans than over land. The
global budget and the partitioning across zonal bands are constrained equally
well by column and PBL CO2 samples, provided they have the same
spatiotemporal coverage. The partitioning across land–ocean boundaries is
noticeably more uncertain when using PBL samples over land than when using
column samples, likely because vertical transport differences near the surface
are larger over land than over oceans.
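
The trade-off described above, between a smaller flux signal and a smaller transport error in the column, can be illustrated with a toy calculation. The sketch below is not part of the OSSE; it assumes a hypothetical 10-layer profile with equal pressure thickness per layer, a uniform averaging kernel (unlike real OCO-2 retrievals), and two made-up "models" that differ only in how deeply they mix a surface enhancement.

```python
def column_average(profile, weights):
    """Pressure-weighted column average of a mole-fraction profile."""
    total = sum(weights)
    return sum(x * w for x, w in zip(profile, weights)) / total


def mix_lowest_layers(profile, weights, n_layers):
    """Return a copy of `profile` with the lowest `n_layers` fully mixed.

    Mixing conserves tracer mass: the mixed layers all take the
    pressure-weighted mean of their original values.
    """
    mixed_value = column_average(profile[:n_layers], weights[:n_layers])
    return [mixed_value] * n_layers + list(profile[n_layers:])


# Hypothetical values (ppm), chosen only for illustration.
# A surface flux has raised the lowest layer by 10 ppm over a 400 ppm background.
weights = [1.0] * 10
profile = [410.0] + [400.0] * 9

shallow = mix_lowest_layers(profile, weights, 2)  # toy model with shallow mixing
deep = mix_lowest_layers(profile, weights, 5)     # toy model with deep mixing

# Near-surface mole fraction differs between the two toy models ...
pbl_shallow, pbl_deep = shallow[0], deep[0]
# ... while the column average is identical, but carries a much smaller signal.
col_shallow = column_average(shallow, weights)
col_deep = column_average(deep, weights)

print("PBL:   ", pbl_shallow, pbl_deep)
print("Column:", col_shallow, col_deep)
```

In this idealized setting the two mixing depths give different near-surface values but the same column average, and the flux signal in the column is a fraction of the signal in the lowest layer. Horizontal transport differences and non-uniform averaging kernels, both ignored here, would break the exact invariance.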
The low sensitivity of column measurements to
PBL CO2 variations is often considered a weakness, since surface flux
signals are largest in the PBL. Efforts are currently underway to
construct active remote-sensing instruments that are preferentially sensitive
to the lower troposphere . Our OSSEs suggest that,
were such an instrument to be deployed, the uncertainty of surface flux
estimates derived from that instrument might very well be larger than from an
OCO-2-like column CO2 instrument due to transport model uncertainty
near the surface. In the long term, significant improvement in transport
modeling will be needed to benefit from a remote-sensing instrument
preferentially sensitive to near-surface CO2.
Impact of coverage
In our synthetic data inversions, the difference between the fluxes inferred
from the same forward-model run but different sampling strategies is purely
due to the interaction between non-ideal transport and data coverage, and not
because of biases between the different samples. Despite this lack of bias,
there are several regions where the entire spread of flux estimates across
the five forward models for one type of data has no overlap with the spread
for another type of data, or with the truth. For example, LN, LG and LNLG
annual flux estimates for the Indian Ocean have no overlap with the IS
estimates, the OG estimates or the truth, while
XCO2 estimates of temperate South American fluxes are completely
detached from all IS estimates. This effect is even more pronounced for
monthly flux estimates. This suggests that, in the presence of imperfect
transport and no measurement bias, different coverage and sampling can
generate biases in flux estimates that are larger than their uncertainty due
to transport. We should therefore avoid inferring, say, oceanic fluxes by
using only OCO-2 land soundings.
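
The overlap criterion used in this comparison can be written down as a simple interval test. The following is a minimal sketch, assuming the spread for each data type is summarized by the minimum and maximum flux estimate across the five transport models; the function names are hypothetical and the numbers are placeholders in arbitrary flux units, not results from our inversions.

```python
def spread(estimates):
    """Return (min, max) of flux estimates across transport models."""
    return min(estimates), max(estimates)


def ranges_overlap(a, b):
    """True if the closed intervals a=(lo, hi) and b=(lo, hi) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def contains(rng, value):
    """True if `value` lies inside the closed interval `rng`."""
    return rng[0] <= value <= rng[1]


# Placeholder values only, NOT taken from the paper:
# one regional flux estimate per transport model, for two data types.
truth = -0.5
fluxes = {
    "IS": [-0.6, -0.4, -0.55, -0.45, -0.5],
    "LNLG": [-0.1, 0.0, 0.1, 0.05, -0.05],
}
ranges = {name: spread(vals) for name, vals in fluxes.items()}

for name, rng in ranges.items():
    print(name, rng, "contains truth:", contains(rng, truth))
print("IS and LNLG overlap:", ranges_overlap(ranges["IS"], ranges["LNLG"]))
```

With these placeholder numbers the two ranges are disjoint and only one of them contains the truth, which is the situation described above: a coverage-induced bias larger than the transport-induced spread.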