Biases in Atmospheric CO 2 Estimates from Correlated Meteorology Modeling Errors

Estimates of CO 2 fluxes that are based on atmospheric measurements rely upon a meteorology model to simulate atmospheric transport. These models provide a quantitative link between the surface fluxes and CO 2 measurements taken downwind. Errors in the meteorology can therefore cause errors in the estimated CO 2 fluxes. Meteorology errors that correlate or covary across time and/or space are particularly worrisome; they can cause biases in modeled atmospheric CO 2 that are easily confused with the CO 2 signal from surface fluxes, and they are difficult to characterize. In this paper, we leverage an ensemble of global meteorology model outputs combined with a data assimilation system to estimate these biases in modeled atmospheric CO 2. In one case study, we estimate the magnitude of month-long CO 2 biases relative to CO 2 boundary layer enhancements and quantify how that answer changes if we either include or remove error correlations or covariances. In a second case study, we investigate which meteorological conditions are associated with these CO 2 biases. In the first case study, we estimate uncertainties of 0.5–7 ppm in monthly-averaged CO 2 concentrations, depending upon location (95 % confidence interval). These uncertainties correspond to 13–150 % of the mean afternoon CO 2 boundary layer enhancement at individual observation sites. When we remove error covariances, however, this range drops to 2–22 %. Top-down studies that ignore these covariances could therefore underestimate the uncertainties and/or propagate transport errors into the flux estimate. In the second case study, we find that these month-long errors in atmospheric transport are anti-correlated with temperature and planetary boundary layer (PBL) height over terrestrial regions. In marine environments, by contrast, these errors are more strongly associated with weak zonal winds.
Many errors, however, are not correlated with a single meteorological parameter, suggesting that a single meteorological proxy is not sufficient to characterize uncertainties in atmospheric CO 2. Together, these two case studies provide information to improve the setup of future top-down inverse modeling studies, preventing unforeseen biases in estimated CO 2 fluxes.


Introduction
Scientists increasingly use atmospheric CO 2 observations to estimate CO 2 fluxes at Earth's surface (e.g., Gurney et al., 2002; Michalak et al., 2004; Peters et al., 2007; Gourdji et al., 2012). This "top-down" approach contrasts with "bottom-up" studies that rely primarily on expert knowledge of biological processes (e.g., Huntzinger et al., 2012; Raczka et al., 2013). In order to estimate the fluxes, top-down studies typically require a meteorology model to link fluxes at the surface with measurements taken downwind. Using this link, one can estimate the fluxes even if the atmospheric measurements do not themselves directly measure the fluxes.
However, both the accuracy and effective resolution of the flux estimate hinge upon the accuracy of the meteorological model. Errors in the meteorological model may (or may not) bias estimated CO 2 fluxes depending upon the error characteristics and the space/timescales of interest.
Published by Copernicus Publications on behalf of the European Geosciences Union. S. M. Miller et al.: Biases in atmospheric CO 2 from meteorology errors
More specifically, the effect of CO 2 transport errors on the estimated fluxes depends upon two important factors. First, the flux estimate becomes more uncertain as the CO 2 transport error variance (or standard deviation) increases. Top-down studies that use Bayesian statistics will explicitly account for these variances when estimating fluxes (e.g., Enting, 2002; Tarantola, 2005); before estimating the fluxes, the modeler first estimates the total variance due to an array of model or data errors, such as imperfect atmospheric transport or imperfect measurements, among many other sources of error (e.g., Gerbig et al., 2003; Michalak et al., 2005; Ciais et al., 2011).
Second, the flux estimate becomes more uncertain as the temporal and/or spatial error covariances increase. As the covariances increase, each CO 2 measurement effectively provides less and less independent information to constrain the surface fluxes. Furthermore, these temporally and/or spatially correlated errors can bias the flux estimate over a region or over the entire geographic area of interest (e.g., Stephens et al., 2007).
Quantification of this complex cause-and-effect relationship between meteorological errors and errors in estimated CO 2 fluxes represents an ongoing research challenge, and a number of existing studies have characterized different aspects of these uncertainties. For example, a series of studies known as TransCom (Atmospheric Tracer Transport Model Intercomparison) represents one of the first coordinated projects on CO 2 transport uncertainties (Gurney et al., 2002; Baker et al., 2006). These early studies used 13 different global atmospheric models and compared differences in top-down CO 2 budgets due to atmospheric model differences. Subsequent to the TransCom project, a number of studies have focused on the effects of changing vertical mixing and/or planetary boundary layer height (PBLH) (Gerbig et al., 2008; Williams et al., 2011; Kretschmer et al., 2012, 2014; Parazoo et al., 2012; Pino et al., 2012). In general, those papers found that uncertainties in PBLH can lead to biases of ∼ 3 ppm in modeled daytime CO 2. Another paper examined the effect of uncertain horizontal winds (Lin and Gerbig, 2005). The authors applied a particle-trajectory model at a measurement site in Wisconsin and found that uncertainties in the horizontal winds contributed ∼ 6 ppm (standard deviation) to the overall CO 2 transport uncertainty. In summary, a number of previous studies have either perturbed individual meteorological parameters or, in the case of TransCom, sampled transport uncertainties using 13 preselected atmospheric models.
The present study is particularly concerned with temporal and/or spatial error covariances in atmospheric CO 2 transport. To what extent do CO 2 transport errors covary in space and time? How large are these covariances relative to the magnitude of the surface CO 2 fluxes, and which meteorological factors drive large error covariances? These covariances are often difficult to characterize (e.g., Lin and Gerbig, 2005; Lauvaux et al., 2009) and are omitted from most existing top-down efforts.
We explore several facets of these questions using a global meteorology model ensemble and a meteorology data assimilation system: the Community Atmosphere Model (CAM) and a Local Ensemble Transform Kalman Filter (LETKF) (Hunt et al., 2007). Efforts by Liu et al. (2011) and Liu et al. (2012) extended this meteorological framework to model uncertainties in atmospheric CO 2.
This framework systematically estimates meteorology and CO 2 transport uncertainties to an extent not previously possible; CAM-LETKF explicitly represents the CO 2 transport uncertainties that remain after assimilating several hundred thousand meteorology observations at each 6 h model time step. To accomplish this task, CAM-LETKF uses an ensemble of weather forecasts and optimizes the ensemble to match available meteorological observations. Furthermore, CAM-LETKF adjusts the variance of the weather ensemble at each time step to match the modeling uncertainties implied by the meteorological observations.
Using this toolkit, we construct several case studies to understand both the possible magnitude and drivers of CO 2 transport error covariances, i.e., errors that persist over many time steps and/or across large regions. The next section describes CAM-LETKF and these case studies in greater detail.

The meteorology and CO 2 model
The first component of CAM-LETKF is the meteorological model. We simulate global meteorology using CAM and the Community Land Model (CLM, version 3.5) run in weather forecast mode (not climate mode) (Collins et al., 2006; Oleson et al., 2008; Chen et al., 2010). Model simulations in this study have a spatial resolution of 2.5° longitude by 1.9° latitude with 26 vertical model levels. In most regions, there are three vertical model levels within the lowest kilometer of the atmosphere. These model levels are centered at 929.6, 970.6, and 992.6 hPa over regions where the land/water surface is at sea level.
We save the global model output at 6 h time increments. Furthermore, we run the model for two time periods: January-February 2009 and May-July 2009. The first month of each run serves as an initial spin-up for the model-data assimilation system. The next section describes this assimilation in greater detail.

The meteorological model-data assimilation framework
The second component of CAM-LETKF is the data assimilation and model optimization framework. This framework serves two purposes. First, the LETKF optimizes modeled meteorology (CAM-CLM) to match available observations. Second, the LETKF uses an ensemble of model forecasts to represent model uncertainties that remain after data assimilation (Hunt et al., 2004, 2007). We define each ensemble member and the mean of the entire ensemble as follows:

\[ x_i = \bar{x} + X_i, \qquad \bar{x} = \frac{1}{k} \sum_{i=1}^{k} x_i, \]

where x_i (m × 1) is a single model ensemble member, x̄ (m × 1) is the mean of the model ensemble, and X_i refers to the ith column of the matrix X (m × k) that defines the ensemble spread. In this paper, the variable m refers to the total number of model parameters, that is, the model estimate for a variety of meteorological variables, concatenated across the globe and across all 6-hourly time steps in a given model run. Furthermore, we use k = 64 total ensemble members in this setup, as was done in Liu et al. (2011) and Liu et al. (2012).
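The split of each ensemble member into an ensemble mean plus a perturbation can be sketched in a few lines of plain Python. This is an illustration with toy numbers only; the function name `split_ensemble` and the list-of-lists layout are assumptions made for this sketch and are not part of CAM-LETKF.

```python
def split_ensemble(members):
    """Split k ensemble members (each a list of m values) into mean + perturbations.

    Returns (mean, perturbations), where
    perturbations[i][j] = members[i][j] - mean[j].
    """
    k = len(members)
    m = len(members[0])
    mean = [sum(member[j] for member in members) / k for j in range(m)]
    perturbations = [[member[j] - mean[j] for j in range(m)] for member in members]
    return mean, perturbations


# Toy example: k = 3 members, m = 2 model parameters (hypothetical values).
members = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0]]
mean, perturbations = split_ensemble(members)
```

By construction, the perturbations of each parameter sum to zero across the ensemble, so the spread matrix carries only the uncertainty information and the mean carries the best estimate.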
Using this ensemble, CAM-LETKF steps through time in sequential 6 h intervals. First, the model ensemble at time t is optimized to match meteorological data (Hunt et al., 2007). To this end, we assimilate the same meteorological observations used in the National Centers for Environmental Prediction-Department of Energy reanalysis 2 (Kanamitsu et al., 2002): temperature (in situ and satellite), zonal wind (in situ and satellite), meridional wind (in situ and satellite), surface pressure (in situ), and specific humidity (in situ). At each 6 h model time step, we assimilate between ∼ 180 000 and 330 000 observations globally. At that juncture, the ensemble mean associated with time t, x̄(t), represents the model best guess and the ensemble members, x̄(t) + X(t), collectively represent the uncertainties in the modeled meteorology (i.e., posterior variances and covariances). Second, we run 6 h CAM-CLM forecasts using these realizations as initial conditions, for a total of 64 model forecasts. The 6 h cycle of data assimilation and model forecast then begins again.
This model ensemble, by design, is guaranteed to reflect actual uncertainties in modeled meteorology; at each 6 h model time step, we adjust the ensemble variance so that it matches the model-data residuals (Li et al., 2009; Miyoshi, 2011). The Supplement describes this procedure, known as adaptive covariance inflation.
The model ensemble also accounts for both spatial and temporal covariances in modeled meteorological uncertainties; meteorological errors within one ensemble member can easily persist over many time steps. This continuity occurs because the optimized ensemble members from one time step become the initial conditions for the weather forecast at the next time step. For example, if the PBL height in one ensemble member is lower than the ensemble average at a given time step, it will likely be lower than average at the next time step. Similarly, if the PBL height in one ensemble member is lower than average over one grid box, it will likely also be lower than the average over an adjacent grid box.
Certain meteorological uncertainties, however, may not always be captured by the assimilation system, particularly uncertainties that do not manifest in the model-data residuals. For example, CAM-LETKF will not fully characterize uncertainties due to different PBL schemes (e.g., Yonsei versus Mellor-Yamada-Janjic) or due to other structural model differences. Furthermore, LETKF cannot spatially resolve uncertainties that occur at sub-grid scale (e.g., turbulent eddies or numerical diffusion). For further technical detail on the LETKF and adaptive covariance inflation, refer to the Supplement, Hunt et al. (2004, 2007), Li et al. (2009), Liu et al. (2011), or Miyoshi (2011).

CO 2 transport error variances and covariances
The CAM-LETKF system described above estimates not only meteorological uncertainties but also uncertainties in CO 2 transport. In this study, CO 2 is a passive tracer that is not part of the data assimilation, so any uncertainties in CO 2 concentrations are solely due to uncertainties in atmospheric transport.
We drive all model simulations with a published CO 2 flux estimate from CarbonTracker (CT), version CT2011oi (Fig. 1; Peters et al., 2007, http by the US National Oceanic and Atmospheric Administration (NOAA). NOAA scientists optimize CT fluxes to match atmospheric CO 2 data, so the flux estimate is consistent with actual observations (Peters et al., 2007). The original CT fluxes have a temporal resolution of 3 h. We average these fluxes to a 6 h resolution for all of the CAM simulations in this study.
We subsequently estimate 6-hourly CO 2 transport uncertainties using this setup. These uncertainties are defined as the difference between the top and bottom of the 95 % confidence interval, computed from the 64 model realizations (e.g., Fig. 2). To make this estimate, we calculate the 2.5th and 97.5th percentiles of each row in X [CO 2 ], where the subscript [CO 2 ] refers to the atmospheric CO 2 concentrations estimated by the ensemble. The remainder of the methods section applies this CO 2 and meteorology modeling framework to two case studies.
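The percentile-based uncertainty described above can be sketched as follows. This is a minimal illustration, assuming that one row of X [CO 2 ] is available as a plain list of ensemble values; the helper names `percentile` and `ci_width` and the linear-interpolation rule are assumptions of this sketch, since the text does not state which percentile estimator was used.

```python
def percentile(values, q):
    """Percentile of a list (q in 0..100) using linear interpolation
    between the two nearest order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def ci_width(row, lower=2.5, upper=97.5):
    """Width of the central 95 % interval for one row (one time and place),
    where the row holds one value per ensemble member."""
    return percentile(row, upper) - percentile(row, lower)


# Hypothetical row of 101 evenly spaced ensemble values, in ppm.
example_row = [float(v) for v in range(101)]
example_width = ci_width(example_row)
```

Applying `ci_width` to every row gives one uncertainty value per model time step and grid location.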

Case study 1: the magnitude of temporally and spatially covarying atmospheric transport errors relative to a CO 2 flux estimate
This case study explores the importance of persistent, covarying transport errors and the magnitude of those errors relative to the CO 2 fluxes. In particular, we estimate uncertainties in monthly mean, afternoon, modeled CO 2 concentrations at a number of in situ atmospheric observation sites.
In one case, we include temporal and/or spatial covariances in the atmospheric transport errors, and in another case we remove these covariances. We then compare these uncertainties against the modeled afternoon CO 2 boundary layer enhancement to understand the magnitude of these errors relative to the surface fluxes.
The uncertainty in monthly-averaged CO 2 concentrations serves as a measure of how transport errors correlate or covary across time. Uncorrelated transport errors will average out, to a large degree, over many model time steps, but temporal error covariances prevent the errors from averaging down over time. Furthermore, CO 2 budgets are often reported in month-long increments (e.g., Gourdji et al., 2012, and CT), so this time window is a relevant benchmark with respect to inverse modeling studies.
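The averaging argument can be made concrete with a standard identity for the variance of a mean. The expression below is an idealized illustration rather than a result from this study: it assumes n errors e_i that share a common variance σ² and a common pairwise correlation ρ, which is simpler than the error structure estimated by CAM-LETKF.

```latex
\operatorname{Var}(\bar{e})
  = \frac{1}{n^{2}} \sum_{i=1}^{n} \sum_{j=1}^{n} \operatorname{Cov}(e_i, e_j)
  = \frac{\sigma^{2}}{n} \left[ 1 + (n - 1)\,\rho \right]
```

With ρ = 0 the variance of the mean falls as σ²/n, whereas as ρ approaches 1 it stays near σ² no matter how many time steps are averaged.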
We calculate uncertainties in the monthly-averaged model output (including error covariances) via several steps. First, we select out the rows of X [CO 2 ] that correspond to afternoon observations (13:00-19:00 LT) for a given month at an in situ CO 2 observation site. Second, we calculate the mean of each column in X [CO 2 ]. Each column corresponds to a different ensemble member. The resulting vector of length 64 is the difference between each ensemble member and the best estimate (x̄), averaged at the monthly scale. Lastly, we use this vector to compute a confidence interval in monthly-averaged, modeled CO 2 (the 97.5th minus 2.5th percentiles).
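These steps can be sketched as follows, assuming the afternoon rows for one site and month have already been selected and are stored as a list of rows with one column per ensemble member. The names `percentile` and `monthly_mean_ci` and the toy values are hypothetical.

```python
def percentile(values, q):
    """Percentile of a list (q in 0..100) using linear interpolation."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def monthly_mean_ci(x_afternoon):
    """95 % interval width of the monthly mean.

    x_afternoon: list of rows (afternoon time steps); each row holds one
    perturbation value per ensemble member (columns).
    """
    n_steps = len(x_afternoon)
    k = len(x_afternoon[0])
    # Average each column (ensemble member) over the month.
    column_means = [sum(row[j] for row in x_afternoon) / n_steps for j in range(k)]
    return percentile(column_means, 97.5) - percentile(column_means, 2.5)


# Toy case: three members whose errors persist unchanged over four time steps.
toy = [[-1.0, 0.0, 1.0] for _ in range(4)]
toy_ci = monthly_mean_ci(toy)
```

Because each column is averaged before the percentiles are taken, an error that persists within one member survives the averaging and widens the interval.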
We subsequently remove covariances in the CO 2 transport errors and recalculate uncertainties in the monthly-averaged CO 2 concentrations. As described in Sect. 2.2, errors in one ensemble member can persist over many steps and can persist across a large geographic region. However, we can remove these error covariances by randomly reshuffling the elements of each individual row in X [CO 2 ]. The variance in modeled concentrations in any row or at any given time step will remain the same. However, each column will no longer represent a single ensemble member. Rather, each column will represent a random assortment of different ensemble members, and the errors in each column will no longer covary from one time step to another or one geographic location to another.
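The reshuffling can be illustrated with a synthetic ensemble. The sketch below uses made-up numbers (64 members, 120 afternoon time steps, perfectly persistent errors) and hypothetical helper names; it is meant only to show that permuting each row leaves the row-wise spread intact while shrinking the spread of the monthly mean.

```python
import random


def percentile(values, q):
    """Percentile of a list (q in 0..100) using linear interpolation."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def monthly_mean_ci(x):
    """95 % interval width of the column means (rows = time, columns = members)."""
    n_steps, k = len(x), len(x[0])
    column_means = [sum(row[j] for row in x) / n_steps for j in range(k)]
    return percentile(column_means, 97.5) - percentile(column_means, 2.5)


def shuffle_rows(x, seed=0):
    """Independently permute the entries of every row, which breaks the link
    between a column and a single ensemble member."""
    rng = random.Random(seed)
    shuffled = []
    for row in x:
        new_row = list(row)
        rng.shuffle(new_row)
        shuffled.append(new_row)
    return shuffled


# Synthetic ensemble: each member keeps the same error at every time step.
k, n_steps = 64, 120
offsets = [0.1 * (j - (k - 1) / 2) for j in range(k)]
x = [list(offsets) for _ in range(n_steps)]

ci_with_cov = monthly_mean_ci(x)
ci_without_cov = monthly_mean_ci(shuffle_rows(x))
```

In this toy setup `ci_without_cov` comes out much smaller than `ci_with_cov`, even though every shuffled row contains exactly the same values as the original row.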
We conduct this analysis at a representative selection of observation sites in North America, Asia, and Europe. This setup indicates how errors covary with time at the monthly scale. In addition, we also conduct the analysis using multiple observation sites; we estimate monthly-averaged uncertainties at the eco-region scale and include all observation sites that lie within the given eco-region. This latter approach indicates how errors covary spatially across multiple sites at the regional scale.
These monthly-averaged uncertainties can then be compared against the afternoon, modeled CO 2 increment from regional surface fluxes. To estimate this increment, we subtract modeled free troposphere, "clean air" concentrations at 600 hPa from concentrations modeled at the surface using CT fluxes. The concentrations at 600 hPa are not necessarily a perfect measure of clean air concentrations. Rather, this approach is an approximation similar to that used by inverse modeling studies in the literature (e.g., Gerbig et al., 2003; Gourdji et al., 2012).
In summary, case study one explores the magnitude of persistent atmospheric CO 2 transport errors or error covariances relative to the afternoon CO 2 signal from surface fluxes. The next case study, in contrast, explores the meteorological conditions under which these persistent CO 2 transport errors may be more likely to occur.
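A short sketch of this comparison is given below. The function names, the use of the absolute value of the enhancement (so that a net drawdown still yields a positive percentage), and all numbers are assumptions made for illustration; they are not values or conventions taken from the study.

```python
def boundary_layer_enhancement(surface_ppm, free_troposphere_ppm):
    """Mean afternoon enhancement: surface concentration minus the
    600 hPa 'clean air' concentration, averaged over the time steps."""
    diffs = [s - f for s, f in zip(surface_ppm, free_troposphere_ppm)]
    return sum(diffs) / len(diffs)


def relative_uncertainty_percent(ci_width_ppm, enhancement_ppm):
    """Transport uncertainty expressed as a percentage of the enhancement.
    The absolute value is an assumption of this sketch."""
    return 100.0 * ci_width_ppm / abs(enhancement_ppm)


# Hypothetical afternoon values (ppm) at one site.
surface = [392.0, 394.0]
free_troposphere = [388.0, 390.0]
enhancement = boundary_layer_enhancement(surface, free_troposphere)
percent = relative_uncertainty_percent(2.0, enhancement)
```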

Case study 2: which meteorological factors may be associated with month-long transport biases?
We create a synthetic experiment to explore the meteorological conditions under which month-long model biases in atmospheric transport may occur. The spatial patterns in the CO 2 transport uncertainties are heavily influenced by spatial patterns in the CO 2 fluxes (Fig. 2). In other words, regions with large fluxes or large diurnal flux variability also show higher CO 2 transport uncertainties. As a result, it is difficult to disentangle the effect of different meteorological parameters on CO 2 transport uncertainties. Instead, we create a synthetic tracer with constant global emissions in both space and time. This experiment serves as a lens to explore the possible effects of different meteorological parameters independent of the spatiotemporal variability in CO 2 fluxes.

[Figure panel title: 6-hourly uncertainties (95 % confidence interval)]

To this end, we initialize CAM-LETKF runs with zero atmospheric concentration of this synthetic tracer and then run CAM-LETKF forward for 1 month using constant global emissions (e.g., for both February and July 2009). Any uncertainties in the atmospheric distribution of this tracer are solely due to meteorological parameters, not due to the spatial distribution of the underlying fluxes.
Next, we calculate the coefficient of variation (CV) associated with the monthly-averaged surface concentrations. The CV is an inverted signal-to-noise ratio; it measures the uncertainty in modeled surface concentrations relative to the average surface concentration (σ/µ). For example, an uncertainty of 1 ppm in modeled concentrations is most problematic if the signal from surface fluxes is weak, and a 1 ppm uncertainty is less problematic if the signal from surface sources and/or sinks is strong.
For this setup, the CV equals the standard deviation in the monthly-averaged surface concentrations divided by the monthly surface concentration averaged across all 64 realizations. We then plot the tracer CV against monthly-averaged meteorological parameters and their associated uncertainties from CAM-LETKF. These relationships give insight into the meteorological conditions or meteorological uncertainties that are associated with month-long biases in the modeled synthetic tracer.
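The CV calculation for a single grid cell can be sketched as below. The sketch assumes the monthly-averaged tracer concentration of each ensemble member is already available as a list; whether the study used the sample or the population standard deviation is not stated, so the sample form used here is an assumption, as are the function name and the toy values.

```python
import statistics


def coefficient_of_variation(monthly_means):
    """CV = standard deviation / mean across ensemble members.

    monthly_means: one monthly-averaged surface concentration per member.
    Uses the sample standard deviation (an assumption of this sketch).
    """
    return statistics.stdev(monthly_means) / statistics.mean(monthly_means)


# Hypothetical monthly means for three ensemble members (arbitrary units).
cv_example = coefficient_of_variation([9.0, 10.0, 11.0])
```

Dividing by the mean makes the quantity unitless, so grid cells with very different tracer burdens can be compared on the same scale.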

Uncertainties in the 6-hourly modeled CO 2 concentrations
Before examining the two case studies in detail, we first provide context on the CO 2 transport uncertainties estimated with CT fluxes and CAM-LETKF. Figure 2a and b visually summarize the average 6-hourly CO 2 transport uncertainties in the model surface layer (the difference between the top and bottom of the 95 % confidence intervals). These figures show how CO 2 transport uncertainties vary across the globe, from 0.6 to 26 ppm, depending on location. Furthermore, the transport uncertainties in Fig. 2a and b show several distinctive features. The largest uncertainties are localized to regions where either the magnitude or the diurnal cycle of the CT fluxes is largest (e.g., the US Eastern Seaboard and southern Siberia during summertime, the Amazon, the Congo, and eastern China). CO 2 transport uncertainties in the eastern US and eastern Asia bleed, to a smaller degree, over the adjacent ocean where surface fluxes are small.
Figure 3 [...] Furthermore, the comparison illustrates the magnitude of the CO 2 transport uncertainties relative to the diurnal cycle in CO 2 concentrations. For example, the uncertainties at AMT in July are ∼ 30 % of the diurnal range in the CO 2 measurements. Overall, the model ensemble depicted in these plots usually encapsulates the hourly-averaged measurements. CT fluxes are estimated using these CO 2 observations and the TM5 transport model (Tracer Model, version 5) (Peters et al., 2007), so one might expect the CAM model to fit the CO 2 observations relatively well. In the instances when the model ensemble does not encapsulate the hourly-averaged CO 2 measurements, one of the many other non-transport error types could be to blame; the ensemble spread only encompasses transport errors and does not include measurement errors, errors due to finite-model resolution, or errors in the fluxes. Furthermore, these instances could be due to structural differences between CAM and TM5, including differences in model resolution. The Supplement provides more example CO 2 model-data comparisons, meteorology model validation, and data assimilation diagnostics.

CO 2 transport uncertainties at longer timescales
The uncertainty in monthly-averaged CO 2 concentrations provides one measure of how transport errors persist over time, as discussed in Sect. 2.4. Figure 2c and d display uncertainties in the month-long average surface concentrations for February and July 2009. In contrast to the 6-hourly uncertainties, these uncertainties are far more spatially distributed. This result implies that CO 2 transport errors covary over longer periods of time in remote regions, compared to regions with large surface fluxes. Observation sites that are far from large fluxes are therefore more likely to produce a biased CO 2 budget than sites near to large surface fluxes. These "remote" sites see a lower CO 2 signal from surface fluxes, and the transport errors at these locations generally covary over longer periods of time.
A number of factors may explain these relatively large error covariances in remote regions. CO 2 transport over remote or oceanic regions is likely dominated by synoptic-scale weather patterns that evolve over multi-day time periods. When CO 2 is transported across the oceans or remote areas from source/sink regions, atmospheric CO 2 transport errors would likely covary at timescales characteristic of this synoptic-scale air flow. Over large CO 2 source/sink regions, by contrast, atmospheric concentrations are likely influenced more strongly by processes that occur over smaller time periods, such as grid-scale winds or boundary layer mixing. In addition, sustained transport errors over regions of large biosphere flux would be more likely to cancel out at longer timescales due to the diurnal cycle of biosphere CO 2 uptake and release.
In addition to remote and ocean regions, month-long transport uncertainties are also large across the entire Northern Hemisphere during February. Section 3.4 explores possible reasons why these month-long biases occur.

Case study 1: the magnitude of temporally and spatially covarying atmospheric transport errors relative to a CO 2 flux estimate
We construct a case study to understand the importance of temporal and spatial error covariances relative to the magnitude of CO 2 surface fluxes. Figure 4 displays the results of this analysis for a selection of representative global CO 2 observation sites from Asia, Europe, and North America. The y axis of each bar plot indicates the difference between the top and bottom of the 95 % confidence interval in monthly mean modeled concentrations. We first consider the results when covariances in atmospheric CO 2 transport errors are included in the analysis (dark blue bars) and then compare those results to a setup in which we remove these error covariances (light blue bars). At this selection of sites, uncertainties in the monthly mean afternoon concentrations range from 1.6 to 2.8 ppm (dark blue bars). These uncertainties are lower at marine sites like RYO and TTA (see definitions in Fig. 4) and are higher at continental sites located near large biospheric fluxes, sites like FSD and WBI. Note that this analysis only considers estimated uncertainties due to meteorology. The capabilities of the atmospheric observations would deteriorate if other errors were included, such as those due to imperfect measurements or due to finite-model resolution (e.g., Gerbig et al., 2003; Masarie et al., 2011).
[Figure 4 caption, partial: Ochsenkopf, Germany (OXK); Tall Tower Angus, UK (TTA); East Trout Lake, Saskatchewan, Canada (ETL); Fraserdale, Ontario, Canada (FSD); and West Branch, Iowa, USA (WBI). For more information on these observation sites, refer to Table S1 in the Supplement.]
[Figure axis label: Uncertainty in monthly mean modeled concentrations as a percentage of the CO 2 increment from surface fluxes]
We subsequently remove temporal covariances in the errors to identify the role that these covariances play in CO 2 transport uncertainties at the monthly scale. These results are displayed as light blue bars in Fig. 4. When we remove the covariances, the monthly-scale uncertainties are much smaller, by a factor of 5-20 at the individual observation sites. If CO 2 transport errors were temporally independent, then errors of opposite sign and different magnitude would cancel out to a degree when averaged over 1 month (light blue bars). Instead, the transport errors estimated by CAM-LETKF covary in time, and this covariance prevents the errors from averaging down (dark blue bars).
A multi-site comparison in Fig. 4 additionally indicates the role of spatial covariances in the transport errors; the figure shows the uncertainties in CO 2 concentrations when averaged across multiple observation sites within an eco-region.
We compute the monthly-average afternoon concentration across multiple sites for a given ensemble member. We then estimate a confidence interval based upon the distribution of the 64 ensemble members.
The results indicate a large degree of spatial covariance in the atmospheric CO 2 errors. If the errors had no spatial covariance, these errors would average down as more and more observation sites were added to the analysis. However, the dark blue bars in Fig. 4 have a similar magnitude irrespective of whether the analysis was conducted on an individual site or on a collection of many sites from an eco-region; the errors must therefore covary in space. In contrast, the light blue bars (i.e., error covariances removed) do decrease in magnitude at the eco-region scale relative to individual observation sites. In that case, the errors do average out when more and more sites are included in the analysis.
Figure 5 places the results of case study one in the context of the surface fluxes. This figure displays the uncertainties in atmospheric CO 2 transport (the dark blue bars in Fig. 4) as a fraction of the mean afternoon CO 2 boundary layer enhancement. As discussed in Sect. 2.4, this enhancement approximates the CO 2 increment due to regional surface fluxes, and a similar CO 2 increment is used by a number of top-down studies to estimate the surface fluxes. At the individual observation sites, the uncertainty in atmospheric CO 2 constitutes 13-150 % of the average boundary layer CO 2 enhancement. This percentage is highest at marine sites like RYO and TTA that see a relatively small boundary layer enhancement, and the relative magnitude of the uncertainties is smallest at sites that see a very large enhancement due to large summertime vegetation fluxes (e.g., at the WBI site). The uncertainties due to atmospheric transport are substantial relative to the fluxes but only when we include covariances in transport error. When we remove these covariances, the uncertainty in monthly-average afternoon concentrations drops to only 2-22 % of the boundary layer enhancement.
The results of this analysis hold several implications for future atmospheric inverse models and/or top-down studies that optimize CO 2 fluxes. Most existing inverse models account for atmospheric CO 2 transport errors in their statistical setup. In a Bayesian synthesis or geostatistical inverse model, for example, this information is incorporated into a covariance matrix, and that covariance matrix is used as an input to the equation that optimizes the CO 2 fluxes (e.g., Enting, 2002; Michalak et al., 2004; Ciais et al., 2011). However, the majority of existing studies assume that this covariance matrix is diagonal (i.e., no error covariances), in part because these temporal and spatial covariances are challenging to estimate (e.g., Lin and Gerbig, 2005; Lauvaux et al., 2009). The present study, in contrast, indicates that both temporal and spatial error covariances play an important role in monthly-scale errors in atmospheric transport.
Ignoring these error covariances could lead to numerous challenges. When we add more data at an observation site or add more sites to the analysis, the actual errors do not average down to the extent that uncorrelated errors would. Rather, adding more data or more observation sites provides more limited gains in accuracy. As a result, an inverse model that overlooks the error covariances will estimate uncertainties in the CO 2 fluxes that are too small, and/or the inverse model may erroneously map atmospheric transport errors onto the surface fluxes (e.g., Stephens et al., 2007). Future inverse modeling studies could better account for these uncertainties by including off-diagonal terms in one of the covariance matrices used by the inverse model.
The next case study (Sect. 3.4) explores the meteorological factors that may be associated with these persistent atmospheric transport errors.
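One way to picture the effect of off-diagonal terms is to build a small covariance matrix and compute the variance of a time average with and without them. The exponential decay model, the decorrelation timescale `tau`, and the function names below are illustrative assumptions; the study does not prescribe a particular covariance form.

```python
import math


def exponential_covariance(times, sigma, tau):
    """Covariance matrix with entries sigma^2 * exp(-|t_i - t_j| / tau).
    The exponential form is a common modeling choice, assumed here."""
    return [[sigma ** 2 * math.exp(-abs(ti - tj) / tau) for tj in times]
            for ti in times]


def variance_of_mean(cov):
    """Variance of the simple average of the errors described by cov."""
    n = len(cov)
    return sum(sum(row) for row in cov) / n ** 2


# Hypothetical setup: 30 daily afternoon values, 1 ppm standard deviation,
# 5-day decorrelation timescale.
times = list(range(30))
cov = exponential_covariance(times, sigma=1.0, tau=5.0)

var_correlated = variance_of_mean(cov)
var_diagonal = 1.0 ** 2 / len(times)  # what a diagonal matrix would imply
```

In this toy setup `var_correlated` is several times larger than `var_diagonal`, which mirrors the point above: a diagonal matrix understates the uncertainty of a monthly average when errors persist in time.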

Case study 2: which meteorological factors are associated with month-long atmospheric transport biases?
In this case study, we use a synthetic tracer experiment (Sect. 2.5) to uncover possible drivers of atmospheric transport biases at month-long timescales. The previous section (Sect. 3.3) explored the importance of covariances in atmospheric CO 2 transport errors, and this section investigates the meteorological conditions associated with these persistent errors.
Figure 6 displays the CV for monthly-averaged surface concentrations of the synthetic tracer. The CV, a unitless quantity, does not just indicate where the uncertainties are largest. Rather, the CV indicates the magnitude of these uncertainties relative to the mean modeled tracer concentration. Arguably, this noise-to-signal ratio measures the influence of transport uncertainties more effectively than a simple standard deviation.
This coefficient shows a number of distinctive seasonal and spatial patterns. Like the uncertainties in monthly-averaged CO2 (Fig. 2c, d), the CV in Fig. 6 is highest in terrestrial boreal and arctic regions of the Northern Hemisphere during winter. The CV is lowest over Europe, Australia, and the Amazon during all seasons.
The CV in Fig. 6 exhibits different spatial patterns over land and ocean regions, and these respective patterns correlate with different sets of meteorological variables. Over the oceans, for example, high CV values in Fig. 6a are clustered in zonal bands along the Equator and along 40° S. In contrast, high CV values do not cluster into zonal bands to the same degree over terrestrial regions. Rather, CV values are often high when temperatures are low (e.g., over Canada or Russia in February).
We plot the synthetic tracer CV against numerous modeled meteorological parameters to further understand the possible drivers behind atmospheric transport uncertainties averaged over these monthly timescales. To this end, we examine correlations between the tracer CV and 60 different meteorological parameters, including the uncertainties in the meteorological variables. Figure 7 displays the two variables that correlate most strongly with the tracer CV over land regions and over ocean regions, respectively.
Over land regions, meteorological conditions that lead to high atmospheric stability and low energy are most closely associated with atmospheric transport errors. For example, a high tracer CV is associated with low temperatures (R² = 0.45) and low specific humidity (R² = 0.40). Similarly, a high tracer CV is correlated with low net solar flux (R² = 0.35), low planetary boundary layer height (R² = 0.33), and low vertical diffusion diffusivity (R² = 0.31). Note that many of these meteorological variables are closely related to one another, so the individual correlations listed above are all interrelated.
In addition, several of the meteorological variables exhibit a nonlinear relationship with the tracer CV, and the potential for bias in modeled atmospheric transport increases more quickly in stable atmospheric conditions. For example, the CV increases more quickly when planetary boundary layer heights are low.
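The following sketch illustrates how a correlation of this kind could be computed, and how a nonlinear relationship shows up as a low linear R². It fits a standard major axis (SMA) regression, whose slope is sign(r) · sd(y) / sd(x). As a simple stand-in for a nonlinear least squares fit, it then repeats the fit against 1 / x; that transformation is swapped in for brevity and is not the method used in this study. The function names and the PBL height and CV values are hypothetical.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def sma_regression(x, y):
    """Standard major axis fit; returns (slope, intercept, r_squared)."""
    r = pearson_r(x, y)
    slope = math.copysign(statistics.stdev(y) / statistics.stdev(x), r)
    intercept = statistics.fmean(y) - slope * statistics.fmean(x)
    return slope, intercept, r * r


# Hypothetical monthly-mean PBL heights (m) and tracer CV values.
pbl_height = [200.0, 400.0, 600.0, 800.0, 1000.0]
tracer_cv = [0.30, 0.18, 0.12, 0.10, 0.08]

_, _, r2_linear = sma_regression(pbl_height, tracer_cv)
inverse_height = [1.0 / h for h in pbl_height]
_, _, r2_inverse = sma_regression(inverse_height, tracer_cv)

# Report whichever relationship has the higher correlation coefficient.
best = "inverse" if r2_inverse > r2_linear else "linear"
print(f"linear R^2 = {r2_linear:.2f}, inverse R^2 = {r2_inverse:.2f}, "
      f"better fit: {best}")
```

In this made-up example the CV rises steeply at low PBL heights, so the fit against 1 / x has the higher R².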
In contrast to land regions, the tracer CV over the oceans is most closely associated with low zonal wind speeds (R² = 0.29, Fig. 7). Over land regions, that correlation is zero. Uncertainties in atmospheric transport over the oceans are also associated with low PBL heights (R² = 0.25). These two meteorological variables explain different patterns in the tracer CV; PBL heights and zonal wind speeds over the ocean are not correlated with one another (R² = 0), so these two parameters may indicate different processes underlying the atmospheric transport errors.
These differences between land and ocean regions may reflect differences in synoptic-scale circulation. Over the oceans, high CV values are clustered in zonal bands, and these clusters often occur at the transition between distinctive synoptic flow patterns. Modeled atmospheric tracer transport is more uncertain in these transition regions: at the transition between southern westerlies and southern trade winds and at the transition between the North Atlantic trade winds and the westerlies. Zonal winds over the continents are often more variable than over the oceans (Fig. S17 in the Supplement), and atmospheric transport uncertainties do not cluster into the same distinctive zonal bands.
The results of this synthetic tracer experiment hold a number of potential applications to top-down CO2 flux estimation. The danger of obtaining a biased CO2 budget is likely higher in regions with consistently low energy and limited vertical mixing. A number of existing studies indicate that uncertainties in PBLH and vertical mixing are closely tied to uncertainties in estimated trace gas transport or in estimated trace gas fluxes (e.g., Stephens et al., 2007; Williams et al., 2011; Miller et al., 2012; Pino et al., 2012; Kretschmer et al., 2012). This study further suggests that sustained transport errors due to PBLH are more likely in regions or at times when PBL heights and mixing are consistently low. The meteorological model ensemble is not necessarily more uncertain in these regions (see Figs. S15-S16 in the Supplement). Rather, the extent to which meteorological uncertainties translate into tracer transport uncertainties appears to depend, at least in part, on the stability and net energy input associated with the boundary layer.
In summary, boundary layer energy and height explain some of the patterns in the estimated transport errors, but other patterns are associated with uncertainties in synoptic flow and are not related to a single meteorological parameter. In fact, over both terrestrial and oceanic regions, individual meteorological parameters only explain a maximum of 29-45 % of the variability in the tracer CV. This result stresses the utility of a meteorological model to calculate the variances and covariances in atmospheric transport errors rather than relying on a single meteorological proxy.
Note that this study does not account for uncertainties in bottom-up, biogeochemical flux models due to uncertainties in driving meteorological variables. For example, process-based, biogeochemical models of CO2 typically require estimates of meteorological variables like humidity, temperature, or precipitation to compute the surface fluxes. A number of existing studies have used atmospheric data and/or atmospheric models to explore the meteorological variables that drive CO2 flux models. For example, Lin et al. (2011) explored how uncertainties in flux model drivers affected fluxes estimated for Canadian boreal forests. They found that uncertainties in downward shortwave radiation contributed to the largest uncertainties in the simulated fluxes. Similarly, numerous studies indicate that both air temperature and humidity are drivers of CO2 fluxes (e.g., Law et al., 2002; Gourdji et al., 2012). These meteorological variables (e.g., downward shortwave radiation, temperature, and specific humidity) correlate with the persistent atmospheric transport uncertainties discussed earlier in this section. A future study could connect these uncertainties (in the biogeochemical model and in atmospheric transport) to gain an even broader picture of how meteorological uncertainties affect CO2 flux modeling and ultimately top-down CO2 flux estimates.

Conclusions
We use CAM-LETKF to explore the characteristics of correlated or covarying atmospheric CO2 transport errors and the implications of those errors for CO2 flux estimates. The first case study examines the relative magnitude of these errors at the monthly timescale. At this scale, error covariances play a critical role in the uncertainties in modeled atmospheric CO2; we find that uncertainties increase by a factor of 5-20 at individual CO2 observation sites when we include the error covariances in the analysis. These monthly-scale errors correspond to 13-150 % of the afternoon CO2 boundary layer enhancement, depending on the site in question.
Existing top-down studies often overlook these covariances, and these results imply that atmospheric CO2 measurements contain less information about the fluxes than is often assumed. As a result, existing inverse models may underestimate the uncertainties in estimated CO2 fluxes and/or may be vulnerable to unforeseen biases in the estimated fluxes. Accounting for these correlated errors can be as simple as modifying one of the covariance matrix inputs in a Bayesian inverse model.
In a subsequent case study, we investigate the meteorological factors associated with month-long biases in atmospheric transport. The largest short-term CO2 transport errors correlate strongly with the location of the largest surface fluxes, but month-long biases in atmospheric transport are not only localized to regions with large fluxes. Rather, these biases may be more likely to occur at observation sites that are far from large fluxes and in regions with high atmospheric stability and low net radiation. Over the oceans, biases in atmospheric transport are also associated with weak zonal winds. Existing top-down flux studies may be more likely to estimate inaccurate regional fluxes under those conditions. However, a large fraction of the estimated atmospheric transport errors cannot be described by a single meteorological parameter. This result indicates the utility of a meteorological modeling system, like CAM-LETKF, to estimate errors in atmospheric CO2 transport. Through this framework, we can better understand the connections between uncertain atmospheric transport and uncertainties in CO2 budgets estimated from atmospheric data.
The Supplement related to this article is available online at doi:10.5194/acp-15-2903-2015-supplement.

Figure 2. The top panels display average 6-hourly CO2 transport uncertainties estimated by CAM-LETKF. The uncertainties (95 % confidence intervals) are for the surface model layer for (a) February and (b) July 2009. The bottom panels (c and d), in contrast, display the uncertainties in month-long averaged surface CO2 concentrations. Note that these plots include model output from all 24 h of each day. The Supplement provides analogous figures for daytime- or nighttime-only model output.

Figure 4. The uncertainties in monthly-averaged, afternoon atmospheric CO2 (Sects. 2.4, 3.3) at a selection of representative, global CO2 observation sites. Panels (a) and (b) show the results at each site for February and July 2009, respectively. Dark blue bars indicate the difference between the top and bottom of the 95 % confidence interval when we include error covariances. The light blue bars indicate the results when we remove these covariances in atmospheric transport errors. Observation sites in the figure include Ryori, Japan (RYO); Ochsenkopf, Germany (OXK); Tall Tower Angus, UK (TTA); East Trout Lake, Saskatchewan, Canada (ETL); Fraserdale, Ontario, Canada (FSD); and West Branch, Iowa, USA (WBI). For more information on these observation sites, refer to Table S1 in the Supplement.

Figure 5. Uncertainty in monthly-averaged afternoon CO2 concentrations as a percentage of the average afternoon CO2 boundary layer enhancement. This figure places the uncertainties from Fig. 4 (dark blue bars) in the context of the afternoon CO2 increment from surface fluxes. Larger percentages indicate greater potential for bias in monthly CO2 budgets estimated from atmospheric data.

Figure 6. The coefficient of variation (CV, unitless) for the monthly-averaged model surface layer. The results plotted here are for the synthetic tracer simulation (Sects. 2.5, 3.4). In that simulation, the synthetic fluxes have a constant spatial distribution. The resulting CV (σ / µ) shows the distribution of month-long, surface-level transport uncertainties independent of the spatial distribution in the fluxes.

Figure 7. Each panel shows the relationship between the synthetic tracer CV (Fig. 6) and various monthly-averaged meteorological parameters estimated by CAM-LETKF. The top row (a) shows the results for terrestrial regions while the bottom row (b) displays the results for ocean/marine regions. Darker colors in each panel indicate a higher density of points. We test the correlation with 60 different parameters (Table S2 in the Supplement) and plot the two parameters that correlate most closely with the tracer CV over terrestrial and marine regions, respectively. In all cases, we fit both a standard major axis regression and nonlinear least squares and plot the regression with the higher correlation coefficient.

places these transport uncertainties in context of CO2 data measured at two observation sites in the United States. These time series plots validate the model's capacity to simulate daily variations in CO2 concentrations. Further-

www.atmos-chem-phys.net/15/2903/2015/ Atmos. Chem. Phys., 15, 2903-2914, 2015

Figure 3. Hourly-averaged CO2 measurements at (a) Moody, Texas, and (b) Argyle, Maine, compared against the CAM-LETKF model ensemble. Measurements are from the top inlet height at each location. In this figure, the model ensemble represents uncertainties due to atmospheric transport but not other errors (e.g., due to the fluxes and model resolution).