Evaluating a 3-D transport model of atmospheric CO2 using ground-based, aircraft, and space-borne data

We evaluate the GEOS-Chem atmospheric transport model (v8-02-01) of CO 2 over 2003–2006, driven by GEOS-4 and GEOS-5 meteorology from the NASA Goddard Global Modeling and Assimilation Office, using surface, aircraft and space-borne concentration measurements of CO 2. We use an established ensemble Kalman Filter to estimate a posteriori biospheric+biomass burning (BS + BB) and oceanic (OC) CO 2 fluxes from 22 geographical regions, following the TransCom-3 protocol, using boundary


Introduction
Atmospheric transport models have played a central role in the interpretation of atmospheric CO 2 concentrations. They have been used in the forward mode to assess whether a priori flux inventories can reproduce observed atmospheric CO 2 concentration variations (e.g., Gurney et al., 2003), and in the inverse mode to adjust surface CO 2 fluxes in order to minimize the discrepancy between observed and model concentrations (e.g., Gurney et al., 2002, Rödenbeck et al., 2003, Gurney et al., 2004, Stephens et al., 2007). Model evaluation is therefore a critical step in developing robust flux estimates using the inverse model.
A substantial amount of previous work assessing atmospheric transport models of CO 2 has been coordinated by an atmospheric tracer transport model intercomparison project (TransCom; e.g., Gurney et al., 2003 and Gurney et al., 2004). These studies have in particular assessed the sensitivity of CO 2 flux estimation to atmospheric transport by quantifying the variation among several independent transport models. Until now, the GEOS-Chem global 3-D transport model has not participated in this project; however, the model has been extensively evaluated using a wide range of ground-based, aircraft, and satellite measurements of CO 2, CO, HCN, and CH 3 CN (e.g., Li et al., 2003, Heald et al., 2004, Palmer et al., 2008, Li et al., 2009). Previous work attempted to evaluate the GEOS-Chem model using the SCanning Imaging Absorption SpectroMeter for Atmospheric CHartography (SCIAMACHY) CO 2 columns from 2003, but the results were inconclusive because there was also substantial unexplained bias in the satellite data (Palmer et al., 2008). Within that study, we performed a limited evaluation of model CO 2 columns at Park Falls, USA, and found that the model could not reproduce the magnitude of the minima during the growing season, consistent with previous studies (Yang et al., 2007). We also showed that the model reproduced GLOBALVIEW surface concentration data over North America. A preliminary study using data from the 2003 CO 2 Budget and Rectification Airborne experiment (COBRA) (Bakwin et al., 2003) showed that the model had a positive bias of 2 ± 3.5 ppm throughout the boundary layer, suggesting that model vertical mixing was too weak. The same study found a relatively small model bias in the free troposphere (2-6 km), where surface flux signatures are relatively weak, increasing to a positive bias of 2.3 ± 1.8 ppm at 8-10 km that was attributed to a possible error in describing stratosphere-troposphere exchange (Shia et al., 2006).
Here, we perform a more comprehensive evaluation of the GEOS-Chem global 3-D transport model simulation of CO 2 during 2003-2006 using surface, aircraft and satellite data that span the depth of the troposphere. We are especially looking for unexplained biases that could compromise the ability of this model to inform the carbon cycle community on changes in the magnitude and distribution of CO 2 sources and sinks. In Sect. 2, we describe the GEOS-Chem model and the surface flux inventories. In Sect. 3, we describe the ground-based, aircraft and satellite data we use to evaluate model CO 2 concentrations, and to infer the magnitude and distribution of surface sources and sinks. In Sect. 4, we describe the ensemble Kalman Filter, which is used to optimally fit surface fluxes to minimize the discrepancy between observed and model ground-based data. We present in Sect. 5 the a posteriori flux estimates for the terrestrial biosphere and biomass burning, and ocean biosphere from 2003-2006. In Sect. 6 we evaluate the model, driven by a priori and a posteriori flux estimates, using surface, aircraft and satellite data that focus on the boundary layer, free troposphere, and upper troposphere. We conclude the paper in Sect. 7.

The GEOS-Chem model of atmospheric CO 2
We use the GEOS-Chem global 3-D chemistry transport model (v8-02-01) to relate prescribed CO 2 surface fluxes to atmospheric CO 2 concentrations, driven separately by GEOS-4 (Bloom et al., 2005) and GEOS-5 (Rienecker et al., 2008) assimilated meteorology data from the Global Modeling and Assimilation Office Global Circulation Model based at NASA Goddard Space Flight Center. The resulting model calculations using GEOS-4 and GEOS-5 meteorology are denoted as G4 and G5, respectively.
Using different meteorological fields offers us an opportunity to test the sensitivity of our results to differences in atmospheric transport. These 3-D meteorological data are updated every six hours, and the mixing depths and surface fields are updated every three hours. We use these data at a horizontal resolution of 2° latitude × 2.5° longitude. GEOS-4 (GEOS-5) meteorology has 30 (47) hybrid vertical levels ranging from the surface to the mesosphere, 20 (30) of which are below 12 km. We find significant differences between GEOS-4 and GEOS-5 meteorological fields that appear to be related to the different convection parametrisations used in the GEOS-4 and GEOS-5 analysis approaches, which have consequences for model CO 2 distributions. GEOS-5 uses the relaxed Arakawa-Schubert (Moorthi and Suarez, 1992) convection scheme to describe wet convection, while GEOS-4 distinguishes between deep and shallow convection following the schemes developed by Zhang and McFarlane (1995) and Hack (1994). Impacts of these two different convection schemes on tropospheric ozone have been previously reported by Wu et al. (2007) using GEOS-3 (with relaxed Arakawa-Schubert scheme) and GEOS-4 data sets. Figure 1 shows, for example, differences between G4 and G5 prior atmospheric CO 2 columns in April and August, 2004. These model atmospheric CO 2 columns are simulated using the same (1) initial distribution on 1 January 2004; and (2) a priori CO 2 surface fluxes. However, the differences between their monthly mean CO 2 columns can be as large as 1.0 ppm over tropical lands. These differences are reflected in the top-down flux estimates.
[Fig. 1 caption, partially recovered] ... August, 2004. Apart from the differences in the meteorological fields, both model simulations were run at a horizontal resolution of 2° × 2.5°, with the same (1) initial distribution on 1 January 2004; and (2) a priori CO 2 surface fluxes.
We use a version of the GEOS-Chem transport model that accounts for CO 2 concentration contributions from geographical regions to the total atmospheric concentration. Figure 2 shows the 22 geographical regions we consider, which are based on the TransCom-3 (T3) study (Gurney et al., 2002). The CO 2 simulation is based on previous work (Suntharalingam et al., 2005; Palmer et al., 2006, 2008) with updates described below. We include a priori surface estimates for fossil fuel, biofuel, biomass burning, and surface fluxes from the ocean and terrestrial biosphere. We use a spatial pattern of annual fossil fuel emissions based on work for 1995 (Suntharalingam et al., 2005, Brenkert, 1998), and scale fluxes to 2003-2006 based on global total fossil fuel emissions, including emissions from the top 20 emitting countries, from the Carbon Dioxide Information Analysis Centre (Marland et al., 2007). Resulting annual global fossil fuel emissions are 7.29, 7.67, 7.97, and 8.23 PgC for the years 2003 to 2006, respectively. We ignore temporal variation of fossil fuel emission on timescales less than a year. Other studies show that including this additional temporal variability can be important, but associated uncertainties are substantial (Erickson et al., 2008).
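The scaling of a fixed spatial pattern to annual global totals can be illustrated with a minimal sketch. It assumes a single global scale factor per year and uses a random array as a hypothetical stand-in for the 1995 pattern; the country-level treatment of the top 20 emitters mentioned above is not reproduced. The function and variable names are illustrative only.

```python
import numpy as np

# Annual global fossil fuel totals quoted in the text (PgC).
GLOBAL_TOTALS_PGC = {2003: 7.29, 2004: 7.67, 2005: 7.97, 2006: 8.23}


def scale_pattern(pattern, target_total):
    """Rescale a non-negative emission map so that it sums to target_total."""
    pattern = np.asarray(pattern, dtype=float)
    total = pattern.sum()
    if total <= 0.0:
        raise ValueError("reference pattern must have a positive total")
    return pattern * (target_total / total)


# Hypothetical stand-in for the 1995 spatial pattern (assumption, not real data).
rng = np.random.default_rng(0)
pattern_1995 = rng.random((91, 144))

# One scaled map per year; the spatial distribution is unchanged by construction.
scaled = {
    year: scale_pattern(pattern_1995, total)
    for year, total in GLOBAL_TOTALS_PGC.items()
}
```

Because only the global total changes, the relative distribution between grid boxes is identical in every year, which is the simplification made in this sketch.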
We use a climatological biofuel emission estimate (Yevich and Logan, 2003), which has an annual emission of 0.75 PgC yr −1 with 0.34 PgC yr −1 from the northern continents. This additional anthropogenic emission has not been included as part of the standard prior used in the TransCom experiments.
Monthly biomass burning emissions are taken from the second version of the Global Fire Emission Database (GFEDv2; van der Werf et al., 2006), which are derived from ground-based and satellite observations of land-surface properties. We prescribe monthly ocean fluxes that have been determined from sea-surface pCO 2 observations (?), and have an annual net uptake of 1.4 PgC.
We use the CASA biosphere model constrained by observed GEOS meteorology and Normalized Difference Vegetation Index (NDVI) data to prescribe atmospheric CO 2 exchange with the terrestrial biosphere. CASA is spun up for several hundred years using the multi-annual mean monthly meteorology and NDVI for the simulation period. This results in a nearly annually-balanced biosphere. Specific monthly CASA fluxes are derived using monthly weather and NDVI data with variations on shorter timescales determined by 3-h G4/G5 meteorology analyses (Olsen and Randerson, 2004). This produces flux distributions with diurnal to interannual variability, but no long-term trend and a mean annual net flux very near zero. We initialise our model run in January 2002 using a previous model run (Palmer et al., 2006), which we integrate forward to January 2003. Due to the unavailability of GEOS-5 meteorology data, the initial G5 CO 2 distribution on 1 January 2004 is constructed from the G4 model simulation that starts from January 2003. We include an additional initialization to correct the model bias introduced by not accounting for the net uptake of CO 2 from the terrestrial biosphere. We make this downward correction by comparing GLOBALVIEW CO 2 data (GLOBALVIEW-CO2) with model concentrations over the Pacific during January 2003. Differences range from 1 to 4 ppm with a median of 3.5 ppm, and we subtract this value globally, following Suntharalingam et al. (2005).
To improve the model latitude gradient of CO 2, we fitted the initial atmospheric CO 2 concentrations over the Southern Hemisphere, described by three 30° latitude bands, to the zonal mean of the co-located GLOBALVIEW CO 2 measurements during the first month of 2003. We acknowledge that the resulting atmospheric CO 2 distribution, in particular its vertical structure, will still include error and consequently will affect the estimation of surface CO 2 fluxes. However, we anticipate most of this error is absorbed in the 2003 flux estimates after fitting the model to CO 2 observations. This is supported by the good agreement between our a posteriori flux estimates and results from other long-term inversion experiments, and independent atmospheric CO 2 observations. We conclude that using a longer spin-up time to determine the initial distributions would not significantly alter the major conclusions of this paper.

Data used to infer CO 2 flux estimates and to evaluate GEOS-Chem
We use independent data to estimate surface fluxes and to evaluate resulting model atmospheric CO 2 concentrations.

GLOBALVIEW CO 2 data
We use the GLOBALVIEW smoothed CO 2 data set to infer surface CO 2 flux estimates. This is a data product representing a (smooth) statistical fit to over 200 time series from a global ground-based flask and continuous observation network. The smoothed values are extracted from a curve fitted to measurements that are thought to represent large well-mixed air parcels. GLOBALVIEW also provides an extended dataset with 48 pseudo-weekly synchronous CO 2 values per year from an extrapolation procedure used to fill gaps in the observation record at individual sites (Masarie and Tans, 1995). Figure 2 shows the geographical distribution of the 277 observation time series available during 2003-2006. Nearly one-third of available stations are located around North America and Europe, with little coverage over the tropics. We sample the model at the nearest grid box to the station location and average the data over 48 pseudo weeks. For stations that straddle ocean/land model grid boxes we sample the model at the nearest windward ocean grid boxes, as suggested by the TransCom 3 protocol (Gurney et al., 2003).
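Nearest-grid-box sampling can be sketched as follows. The grid definition is an assumption made for illustration (regularly spaced box centres at 2° × 2.5°, with longitude wrapping at the date line); the actual model grid may treat the polar boxes differently. The windward-ocean rule for coastal stations is not implemented here, and the station coordinates in the usage lines are approximate values chosen for the example.

```python
import numpy as np

# Assumed box centres for a 2 x 2.5 degree grid (illustrative only).
LAT_CENTRES = np.arange(-90.0, 90.0 + 1e-9, 2.0)   # 91 latitudes
LON_CENTRES = np.arange(-180.0, 180.0, 2.5)        # 144 longitudes


def nearest_box(lat, lon):
    """Return (lat_index, lon_index) of the grid box centre closest to a point."""
    lon = (lon + 180.0) % 360.0 - 180.0             # map longitude into [-180, 180)
    i = int(np.argmin(np.abs(LAT_CENTRES - lat)))
    dlon = np.abs(LON_CENTRES - lon)
    dlon = np.minimum(dlon, 360.0 - dlon)           # account for date-line wrapping
    j = int(np.argmin(dlon))
    return i, j


def sample_model(field, lat, lon):
    """Sample a (..., lat, lon) model field at the box nearest to a station."""
    i, j = nearest_box(lat, lon)
    return field[..., i, j]


# Usage with an approximate mid-Pacific station location (hypothetical input).
i, j = nearest_box(19.54, -155.58)
box_centre = (LAT_CENTRES[i], LON_CENTRES[j])       # (20.0, -155.0)
```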

Aircraft data
To help evaluate the model vertical distribution of CO 2 throughout the troposphere we use aircraft data from the Comprehensive Observation Network for TRace gases by AIrLiner (CONTRAIL, Matsueda et al., 2002); Intercontinental Chemical Transport Experiment North America (INTEX-NA, Singh et al., 2006); the COBRA campaign (Bakwin et al., 2003); and Airborne Extensive Regional Observations in Siberia (YAK-AEROSIB, Paris et al., 2008, 2010). Table 1 provides a summary of these campaigns; for the sake of brevity, we refer the reader to the dedicated campaign literature, as cited above, for further details of each dataset. We sample the model at the appropriate time and location of each observation.

Atmospheric infrared sounder satellite data
The Atmospheric Infrared Sounder (AIRS), aboard the NASA Aqua satellite, was launched into a sun-synchronous near-polar orbit in 2002. AIRS measures atmospheric thermal infrared radiation between 3.74 µm and 15.4 µm using 2378 channels. CO 2 columns are retrieved from selected CO 2 channels in the 15 µm band using the Vanishing Partial Derivatives (VPD) algorithm, which does not rely on a priori information (Chahine et al., 2008). These thermal IR channels are most sensitive to CO 2 at 450 hPa, with a full width at half maximum spanning 200-700 hPa. The horizontal resolution of the AIRS CO 2 data is 90 km × 90 km. Previous work has shown the retrieved mid-tropospheric AIRS CO 2 data are within 2 ppm of aircraft measurements at 8-13 km (Chahine et al., 2008). The AIRS CO 2 global trend, determined by a linear least-squares fit to monthly means described using 2° latitude bins over 60° S-60° N from January 2003 to December 2008, is 2.02 ± 0.08 ppm yr −1.
We use the gridded monthly mean level-3 AIRS CO 2 product. For each gridded AIRS measurement, we sample the model at the nearest 2° × 2.5° grid box, convolve the resulting vertical profile with the AIRS vertical weighting functions, which account for the vertical sensitivity of the instrument and air mass at different pressures as a function of latitude (Chahine et al., 2008), and calculate the monthly mean. We acknowledge that using level-2 AIRS CO 2 products, including reported averaging kernels, is more appropriate for more detailed model-observation comparisons. However, level-2 data were not fully available at the time when most of our comparisons were made.
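As an illustration of convolving a model profile with a vertical weighting function, the sketch below forms a normalised weighted mean over model levels. The weighting function used here is a hypothetical Gaussian in pressure, chosen only to peak at 450 hPa with a full width at half maximum of 500 hPa; it is not the AIRS weighting function of Chahine et al. (2008), and the pressure levels are arbitrary.

```python
import numpy as np


def weighted_column(profile_ppm, weights):
    """Normalised weighted mean of a CO2 profile (ppm) over vertical levels."""
    profile_ppm = np.asarray(profile_ppm, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(np.sum(weights * profile_ppm) / np.sum(weights))


# Hypothetical pressure levels and Gaussian weighting function (assumptions).
pressure_hpa = np.linspace(1000.0, 100.0, 19)
fwhm_hpa = 500.0
sigma_hpa = fwhm_hpa / (2.0 * np.sqrt(2.0 * np.log(2.0)))
weights = np.exp(-0.5 * ((pressure_hpa - 450.0) / sigma_hpa) ** 2)

# Illustrative profile: higher CO2 near the surface, lower aloft.
profile_ppm = 378.0 + 4.0 * (pressure_hpa - 100.0) / 900.0
smoothed_ppm = weighted_column(profile_ppm, weights)
```

Because the weights peak in the mid-troposphere, the smoothed value is dominated by concentrations near 450 hPa and is only weakly sensitive to the boundary layer.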

The ensemble Kalman Filter inverse model
We optimally fit prescribed a priori surface fluxes $S_0(x,y,t)$, via the GEOS-Chem forward model, to observed ground-based GLOBALVIEW CO 2 data at selected stations (denoted by white circles and red triangles in Fig. 2), similar to the T3 study (Gurney et al., 2002). A priori surface fluxes include those from combustion of fossil (FF) and bio- (BF) fuels, biomass burning (BB), and the terrestrial (BS) and ocean (OC) biospheres (Sect. 2). The adjustment takes the following form:
$$S(x,y,t) = S_0(x,y,t) + \sum_{i=1}^{22} \sum_{m} \lambda_i^m \, \Phi_i^m(x,y),$$
where each monthly basis function $\Phi_i^m(x,y)$ represents a pulsed emission of 1 PgC yr −1 from each of the 22 individual T3 regions $i$ during month $m$. For ocean regions, we assume $\Phi_i^m(x,y)$ has a uniform spatial distribution. For land regions, the spatial distribution is informed by the annual-mean net primary production from the CASA model (Gurney et al., 2003, 2008). We use an Ensemble Kalman Filter (EnKF) (Feng et al., 2009) to estimate the coefficients $\lambda_i^m$, which we assume have initial values of zero.
The state vector $\mathbf{x}$ comprises the monthly values of $\lambda_i^m$ for each T3 region (Fig. 2). We evaluate the resulting a posteriori surface fluxes, $S$. For the purpose of this calculation we assume perfect knowledge of FF and BF, and report BS + BB and OC flux estimates; in practice, any adjustment to $\lambda_i^m$ will also reflect errors in FF and BF. We express our a posteriori monthly flux estimates as the equivalent annual flux (PgC yr −1), following Gurney et al. (2004, 2008); for clarity, we also present our results as PgC/month. For the EnKF, uncertainties associated with $\lambda_i^m$ are represented by an ensemble of perturbation states $\Delta\mathbf{X}$ so that the a priori error covariance matrix $\mathbf{P}$ is approximated by $\mathbf{P} = \Delta\mathbf{X}(\Delta\mathbf{X})^T$. We use the full matrix representation of the EnKF, i.e., an ensemble of the same size as the state vector dimension. The perturbation states are projected into the observation space as perturbations to the mean atmospheric CO 2 concentrations by using the GEOS-Chem 3-D atmospheric transport model. To reduce computational costs, we introduce a lag window of 8 months to reduce the number of variables (and hence the size of the ensemble) to estimate at each assimilation step. The current lag window is longer than we adopted previously for assimilating satellite measurements, reflecting the sparse spatial coverage of ground-based data. We find that, once model transport error is accounted for, the observations do not strongly constrain fluxes older than the lag window. Consequently, a much longer lag window will not dramatically reduce the flux uncertainties presented here. As a result, at each assimilation step of one month, we need to estimate 176 values (8 months × 22 regions) of $\lambda_i^m$.
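The ensemble representation of the prior covariance and the size of the lagged state vector can be sketched as below. The sketch assumes a diagonal prior covariance and builds one perturbation per state element, so that the ensemble size equals the state dimension; the prior uncertainty of 0.5 assigned to every coefficient is a placeholder, not a value from this study.

```python
import numpy as np

N_REGIONS = 22      # TransCom-3 regions
LAG_MONTHS = 8      # lag window length
n_state = N_REGIONS * LAG_MONTHS   # coefficients estimated per assimilation step

# Placeholder prior uncertainties for the coefficients (assumption).
prior_sigma = np.full(n_state, 0.5)

# Full-rank ensemble: each column perturbs exactly one state element.
dX = np.diag(prior_sigma)

# Ensemble approximation of the a priori error covariance, P = dX dX^T.
P = dX @ dX.T
```

With this choice the ensemble reproduces the diagonal covariance exactly, which is the sense in which an ensemble as large as the state vector gives a full matrix representation.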
We optimally estimate the a posteriori state vector $\mathbf{x}_a$ using:
$$\mathbf{x}_a = \mathbf{x}_f + \mathbf{K}_e \left[ \mathbf{y}_{obs} - H(\mathbf{x}_f) \right],$$
where $\mathbf{x}_f$ and $\mathbf{x}_a$ are the a priori and a posteriori state vectors; the observation vector $\mathbf{y}_{obs}$ represents the atmospheric CO 2 concentrations (ppm); and $H(\mathbf{x}_f)$ are the model observations (ppm), where $H$ is the observation operator that describes the relationship between the state vector and the observations. $H$ accounts for global atmospheric CO 2 transport and surface emissions/sinks during each assimilation lag window, and for interpolation of the resulting 3-D CO 2 fields to the observation locations. We have ignored the feedbacks of the perturbed CO 2 concentrations on atmospheric dynamics, and hence the observation operator $H$ is a linear function of the state vector (i.e., of the coefficients $\lambda_i^m$ for the regional flux adjustments).
We calculate the ensemble gain matrix $\mathbf{K}_e$ (ppm −1) using:
$$\mathbf{K}_e = \Delta\mathbf{X}_f (\Delta\mathbf{Y})^T \left[ \Delta\mathbf{Y} (\Delta\mathbf{Y})^T + \mathbf{R} \right]^{-1},$$
where $\mathbf{R}$ is the observation error covariance, and $\Delta\mathbf{Y}$ is defined as $\Delta\mathbf{Y} = H(\Delta\mathbf{X}_f)$. To calculate $\Delta\mathbf{Y}$, we introduce model tracers to describe the effect of the surface flux perturbations, $\Delta\mathbf{X}$, on the variability of observed CO 2 concentrations (Palmer et al., 2006, 2008). We assume an a priori uncertainty $c_i^m$ for values of $\lambda_i^m$ over land region $i$ that depends on $BS_i^m$ plus an offset of 1.0, where $BS_i^m$ represents the monthly BS flux (PgC yr −1); adding 1.0 avoids artificially small uncertainties where the prior BS flux is weak. The resulting uncertainty for each regional land surface flux is close to 50% of the a priori estimate, similar to values used in previous studies (see for example, Gurney et al., 2008). We find that our a posteriori flux estimates are relatively insensitive to $c_i^m$ (not shown). We use a similar approach to describe the uncertainty of ocean regions, with $OC_i^m$ the monthly mean ocean surface flux. We use a smaller offset value (0.6) for ocean fluxes, reflecting the smaller, less uncertain seasonal variation compared to the terrestrial fluxes.
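The gain and state update described above can be sketched for a linear observation operator. In this sketch the operator is a small explicit matrix `H`, a hypothetical stand-in for the transport model and interpolation step; the function name and the toy dimensions are illustrative, and the update of the ensemble perturbations themselves is not shown.

```python
import numpy as np


def enkf_update(x_f, dX_f, H, y_obs, R):
    """One ensemble Kalman update of the state mean for a linear operator H.

    x_f   : (n,) a priori state
    dX_f  : (n, n_ens) a priori perturbations, with P = dX_f @ dX_f.T
    H     : (m, n) linear observation operator (stand-in for the transport model)
    y_obs : (m,) observations
    R     : (m, m) observation error covariance
    """
    dY = H @ dX_f                              # perturbations in observation space
    S = dY @ dY.T + R                          # innovation covariance (symmetric)
    K = np.linalg.solve(S, dY @ dX_f.T).T      # K = dX_f dY^T S^{-1}
    x_a = x_f + K @ (y_obs - H @ x_f)
    return x_a, K


# Toy problem: 3 flux coefficients, 4 observations (all values illustrative).
rng = np.random.default_rng(1)
H = rng.normal(size=(4, 3))
dX_f = np.diag([0.5, 0.5, 0.3])
R = np.eye(4)
x_true = np.array([0.4, -0.2, 0.1])
x_a, K = enkf_update(np.zeros(3), dX_f, H, H @ x_true, R)
```

Solving the linear system instead of forming an explicit inverse is a numerical convenience in this sketch; for a full-rank ensemble the resulting gain matches the standard Kalman gain built from `P = dX_f @ dX_f.T`.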
The observation vector, $\mathbf{y}_{obs}$, includes data from GLOBALVIEW stations, which are used to infer the monthly surface fluxes for 2003-2006. These stations, chosen based on the measurement availability during 2003-2006, are marked as white and red dots in Fig. 2; additional details of each station can be found at http://www.esrl.noaa.gov/.
Because changes in data availability may introduce artificial noise into flux estimates, we assimilated only GLOBALVIEW surface data with relative weights (taken from the GLOBALVIEW auxiliary files named with an extension of 'wts') larger than 4.0. The relative weights reflect how many real measurements are available at a particular site for each year (Masarie and Tans, 1995). Table 2 shows a list of the CO 2 time series we used in our flux inversions. We estimate an observation uncertainty for each GLOBALVIEW station by using the standard deviation of the weekly residual between observations and the fitted curve as provided by GLOBALVIEW (Gurney et al., 2004). We limit the minimum observation uncertainties to be 0.25 ppm, and also enlarge the uncertainties for co-located stations (Gurney et al., 2004). To account for model transport (and representation) error, we assume a uniform 1.0 ppm uncertainty. We assume the observation and a priori errors are uncorrelated in time and space, resulting in diagonal matrices for $\mathbf{P}$ and $\mathbf{R}$.
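The data selection and the construction of a diagonal observation error covariance can be sketched as follows. Two points are assumptions of this sketch rather than statements from the text: the measurement and transport uncertainties are combined in quadrature, and the enlargement of uncertainties for co-located stations is omitted because the factor is not given here. Function and constant names are illustrative.

```python
import numpy as np

MIN_OBS_SIGMA_PPM = 0.25      # floor on the observation uncertainty
TRANSPORT_SIGMA_PPM = 1.0     # uniform transport/representation uncertainty
MIN_RELATIVE_WEIGHT = 4.0     # threshold on the GLOBALVIEW relative weight


def build_obs_error(residual_std_ppm, relative_weight):
    """Select usable records and return (mask, diagonal R in ppm^2)."""
    residual_std_ppm = np.asarray(residual_std_ppm, dtype=float)
    relative_weight = np.asarray(relative_weight, dtype=float)
    keep = relative_weight > MIN_RELATIVE_WEIGHT
    sigma_obs = np.maximum(residual_std_ppm[keep], MIN_OBS_SIGMA_PPM)
    # Assumption: independent error sources added in quadrature.
    variance = sigma_obs ** 2 + TRANSPORT_SIGMA_PPM ** 2
    return keep, np.diag(variance)


# Three hypothetical records: the second is rejected by its low weight.
keep, R = build_obs_error([0.1, 0.5, 2.0], [5.0, 3.0, 10.0])
```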

A posteriori continental and oceanic CO 2 fluxes
Global annual a posteriori CO 2 flux estimates over 2004-2006 for the G4 (G5) model are −4.4 ± 0.9 (−4.2 ± 0.9), −3.9 ± 0.9 (−4.5 ± 0.9), and −5.2 ± 0.9 (−4.9 ± 0.9) PgC yr −1, respectively. These estimated fluxes using the G4/G5 meteorology are generally similar. However, in 2005 the G5 estimated net sink is larger than the G4 sink by 0.6 PgC. This discrepancy is thought to be associated with different model vertical transport (see Fig. 1). Table 3 compares our global net fluxes (after anthropogenic fossil and bio-fuel emissions have been included) with three previously reported estimates (e.g., Rödenbeck et al., 2006). Our results, in particular the 3-yr totals, are in good agreement with these previously reported results that are determined using much higher spatial resolutions. Figure 3 shows a priori and a posteriori fluxes over three T3 land aggregates: North continents (Boreal North America, Temperate North America, Europe, Boreal Eurasia, Temperate Eurasia); Tropical continents (Northern Africa, Tropical Asia, Tropical America); and South continents (Southern Africa, Australia, South America) (Gurney et al., 2003). In general, the assimilation process reduces the uncertainties associated with the estimated BS + BB surface fluxes over North continents, and to a lesser extent over the South continents; a posteriori uncertainties over the Tropics are similar to the prior values. These error reductions reflect the efficacy of the constraints provided by GLOBALVIEW data. Resulting regional G4/G5 a posteriori fluxes follow the temporal changes of the prior, but have much stronger uptake during the boreal growing seasons. Table 4 shows that our results are generally consistent with previous T3 experiments for 1992-1996 (Gurney et al., 2003).
Our global annual G4 and G5 a posteriori estimates are much stronger sinks over northern continents during 2004-2006 (−3.00 and −3.65 PgC yr −1 , respectively) compared to mean T3 estimates for 1992-1996, which may reflect a number of factors: increased activity of the terrestrial biosphere, an overestimate of prescribed anthropogenic CO 2 emissions, or a negative (slower) model bias in boundary layer mixing (e.g., Stephens et al., 2007).
There are also large discrepancies between the estimated natural fluxes over northern continents determined by different groups: our G4 estimates (−3.0 PgC yr −1) are in good agreement with JENA S99 v3.2 (−2.8 PgC yr −1; Rödenbeck et al., 2006), but much stronger than LSCE v1.0 (−2.07 PgC yr −1; Chevallier et al., 2010), partially due to our additional biofuel emissions of 0.34 PgC yr −1 from northern continents. The G5 a posteriori estimates have much stronger sinks over northern continents than the G4 results, which is related to differences in model transport. Figure 4 shows the a priori and a posteriori BS + BB CO 2 fluxes over continental Europe, Temperate North America, Boreal Eurasia, and Temperate Eurasia. A posteriori estimates based on GEOS-5 meteorological data show a larger sink over northern extra-tropical continents during 2004-2006 than G4 runs. The largest discrepancies are over Temperate Eurasia, where peak G4/G5 a posteriori CO 2 uptake can be more than twice the a priori value. There are also shifts (up to 1 month) in the peak CO 2 uptake periods over these regions. Over Europe and Temperate North America, the net emission during winter months is smaller than the prior values. The stronger uptake during the growing seasons and the smaller emission during the winters represent a substantial departure from the annually-balanced CASA model, and reflect possible overestimation of biospheric respiration by the CASA model (Gurney et al., 2004), and errors in the prescribed fossil fuel emissions (Erickson et al., 2008).
Fluctuations in the a posteriori fluxes, leading to short periods of weak negative fluxes during winter months, are likely to be an artifact due to errors in source attribution from a limited number of observations. Figure 4 compares the a priori and a posteriori BS + BB CO 2 fluxes over T3 Tropical South America, Northern Africa, and Tropical Asia. Tropical land fluxes have weaker seasonal cycles than those in the extratropics. The differences between a posteriori and a priori estimates (G4 and G5) are usually insignificant, reflecting the small number of observations available to constrain these continental fluxes. For example, the CASA biosphere model and GFEDv2 biomass burning emission estimates predict a net emission from Tropical America in August 2005; for that region and month in other years there is a net sink. Without additional data, we cannot comment on whether the model generates a realistic flux response to the drought conditions over the Amazon basin during 2005 (Phillips et al., 2009). Figure 5 shows the ocean CO 2 fluxes for the corresponding period, which have been aggregated as (a) North ocean (North Pacific, Northern Ocean, North Atlantic); (b) Tropical ocean (West Pacific, East Pacific, Tropical Atlantic, Tropical Indian); and (c) South ocean (South Pacific, South Atlantic, South Indian, Southern Ocean). The differences between the a posteriori and a priori annual ocean fluxes are generally less than 0.2 PgC yr −1 except over southern extra-tropical oceans where a posteriori annual fluxes have a negative shift of 0.3 (0.5) PgC yr −1. Large seasonal variations in the a posteriori aggregated South ocean flux are correlated with the observed changes in atmospheric CO 2 at southern high latitudes.
We find that the data assimilation process introduces extra variability to the a priori values, which may partially be caused by mis-allocation of continental CO 2 sources/sinks to oceans, due to the inability of the measurements to adequately constrain ocean fluxes. We find that G4 ocean CO 2 uptake is stronger than G5 uptake (by 0.3 PgC yr −1) over the North ocean, and also that G4 seasonal flux variations over the southern extra-tropical oceans are generally larger than those of G5.

Model evaluation
We use surface, aircraft and satellite data to help evaluate the GEOS-Chem G4 and G5 models driven by a priori and corresponding a posteriori flux estimates. First, we use the campaign-based aircraft data to help evaluate vertical profiles of CO 2 in the troposphere. Second, we use surface, aircraft, and satellite data to test how well the model can reproduce the observed seasonal cycle and trend of CO 2 from 2003 to 2006. We acknowledge some circularity in using a selection of ground-based data to infer fluxes and then using all stations (smoothed data) to evaluate model atmospheric concentrations resulting from the a posteriori fluxes, but this approach still provides a gross measure of the model fit to the surface data.
Fig. 5. Same as Fig. 3, but averaged over the northern extratropical oceans, the tropical oceans, and the southern extratropical oceans.

Vertical distribution
We use aircraft data from the CO 2 Budget and Regional Airborne Study during May-August, 2004 over North America; from INTEX-NA that measured North American continental outflow during 2004; and from the YAK-AEROSIB campaign during 2006 (Table 1). For these campaigns we sample the model at the time and location of each measurement. Figure 6 shows that the G4 and G5 model averages are typically within 2 ppm of the COBRA CO 2 observations in the free troposphere. The variability of model and observed boundary layer concentrations is similar in magnitude, and larger than in the free troposphere. The model is able to reproduce the sharp CO 2 vertical gradient in the boundary layer during June and July, but has a positive bias of 5 ppm in the early (May) growing season and a negative bias of 3.5 ppm in the late (August) growing season. Table 5 shows that G4 and G5 have a similar level of skill at reproducing the mean observed profiles over the campaign. Table 5 also shows the mean model minus measurement statistics for INTEX-NA and YAK-AEROSIB. Generally, the model is within 1.5 ppm of the measurements above the boundary layer with a standard deviation close to 3 ppm. The bias and standard deviation are typically higher for boundary layer measurements. For INTEX-NA and YAK-AEROSIB data, G4 and G5 show comparable performance. On the basis of this comparison there is no conclusive evidence that the model is suffering from a significant error in stratosphere-troposphere exchange, as previously suggested by Palmer et al. (2008).
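Model minus measurement statistics of the kind summarised above can be computed per altitude interval, as in the following sketch. It assumes that model values have already been sampled at the time and location of each observation, and it uses 500 m bins and a population standard deviation as illustrative choices; the function name and the toy numbers are hypothetical.

```python
import numpy as np


def binned_bias(alt_m, model_ppm, obs_ppm, bin_width_m=500.0):
    """Mean and standard deviation of (model - observation) per altitude bin.

    Returns a dict mapping the lower edge of each bin (m) to
    (mean bias, standard deviation, number of samples).
    """
    alt = np.asarray(alt_m, dtype=float)
    diff = np.asarray(model_ppm, dtype=float) - np.asarray(obs_ppm, dtype=float)
    index = np.floor(alt / bin_width_m).astype(int)
    stats = {}
    for k in np.unique(index):
        d = diff[index == k]
        stats[float(k * bin_width_m)] = (float(d.mean()), float(d.std()), int(d.size))
    return stats


# Toy co-located samples (illustrative values only).
stats = binned_bias(
    alt_m=[100.0, 400.0, 600.0, 900.0],
    model_ppm=[382.0, 384.0, 380.0, 381.0],
    obs_ppm=[380.0, 380.0, 380.0, 380.0],
)
```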

Trend and seasonal variations of tropospheric CO 2
We use data from GLOBALVIEW, the AIRS space-borne sensor, and from the CONTRAIL aircraft campaign (Table 1) to assess how well the model reproduces observed large-scale trends and latitude variability of CO 2.
[Figure caption, partially recovered] The GEOS-Chem model, described at a horizontal resolution of 2° × 2.5°, has been sampled at the time and location of each measurement. Data and model concentrations have been averaged over 500 m intervals. Monthly mean observations are denoted by the black lines, with the grey envelope representing the 1-standard deviation about that mean. Blue and red lines denote the monthly mean CO 2 concentrations corresponding to the G4 and G5 a posteriori flux estimates, respectively. The horizontal lines about a posteriori concentrations denote the 1-standard deviation about the monthly mean.

Boundary layer
Figure 7 shows the GLOBALVIEW and model CO 2 concentration record from 2003 to 2006, inclusive, averaged over 30° latitude bins. To extract the trend and the seasonal cycle from a surface CO 2 time series $f(t)$, we decompose $f(t)$ into polynomial and harmonic functions (Thoning et al., 1989) after smoothing with an 8-week moving average:
$$f(t) = a_0 + a_1 t + a_2 \sin(2\pi t) + a_3 \cos(2\pi t) + a_4 \sin(4\pi t) + a_5 \cos(4\pi t),$$
where t runs from 0 to 3 yr (i.e., from 2004 to 2006). The coefficient $a_0$ represents the mean, and $a_1$ is the annual trend. The amplitude of the seasonal cycle, $a_s$, is calculated as $a_s = \sqrt{a_2^2 + a_3^2}$. Table 6 shows the GLOBALVIEW and G4/G5 model trends and seasonal cycles. For comparison, we also include the results for the G4 model using the a priori surface flux estimates. For model evaluation we use GLOBALVIEW data from all 277 time series for which observations below 3 km are available. The G4 model driven by a priori fluxes overestimates the trend by more than 100%. We generally find that the a posteriori fluxes are more consistent with the observed seasonal cycle, with differences typically less than 20%. We find that for all latitudes, the G4 and G5 models generally underestimate the annual trend by 4-10%, mainly due to the possibly overestimated a posteriori terrestrial sink (as described above).

Figure 8 shows that the latitudinal gradient of 2004 GLOBALVIEW surface CO 2 data, binned at 10° latitude intervals, is about 4 ppm (0.033 ppm/° latitude) over 60°S-60°N. The G4 and G5 model gradients for the same latitude range are 0.033 ppm/° latitude and 0.036 ppm/° latitude, respectively. G4 model zonal means agree with the GLOBALVIEW data to within 1 ppm at all extratropical latitudes; the difference increases to 1.5 ppm over the tropics, where observations are sparse. G4 and G5 model zonal means are similar except between 30°N-50°N, where the G5 model has a bias of 1 ppm. The results for 2005 and 2006 (not shown) are similar.

Figure 9 shows a time series of level-3 monthly mean AIRS data, averaged over 30° latitude bins. The G4 and G5 models have been sampled at the appropriate time and location of each gridded AIRS measurement, and convolved with a latitude-dependent AIRS weighting function (Chahine et al., 2008). AIRS CO 2 concentrations show a global trend of 2.21-2.63 ppm yr −1 while the G4/G5 models have a trend of 1.95-2.19 ppm yr −1 .

[Figure caption] The GEOS-Chem model, described at a horizontal resolution of 2° × 2.5°, has been sampled at the time and location of each measurement. Red and blue lines denote the model weekly mean concentrations using a posteriori fluxes inferred using GEOS-4 (G4) and GEOS-5 (G5) meteorological fields, respectively.
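The trend and seasonal-cycle amplitude used in the GLOBALVIEW comparison can be estimated by ordinary least squares. The sketch below is illustrative only: it assumes the fitted function is a mean plus a linear trend plus a single annual harmonic, a0 + a1 t + a2 sin(2πt) + a3 cos(2πt), with t in years. That functional form, the function name `fit_trend_and_cycle`, and the synthetic weekly series are assumptions made for the example; they are not taken from the model code or from the observations.

```python
import math

def fit_trend_and_cycle(times, values):
    """Least-squares fit of a0 + a1*t + a2*sin(2*pi*t) + a3*cos(2*pi*t).

    times are in years. Returns ([a0, a1, a2, a3], seasonal_amplitude),
    where seasonal_amplitude = sqrt(a2**2 + a3**2).
    """
    rows = [[1.0, t, math.sin(2 * math.pi * t), math.cos(2 * math.pi * t)]
            for t in times]
    n = 4
    # Normal equations (A^T A) x = A^T y, stored as an augmented 4x5 matrix.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
           + [sum(r[i] * y for r, y in zip(rows, values))]
           for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(col + 1, n):
            factor = aug[k][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[k][j] -= factor * aug[col][j]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    amplitude = math.hypot(coeffs[2], coeffs[3])
    return coeffs, amplitude

# Synthetic weekly series over 3 years (illustrative values, not observations).
times = [k / 52.0 for k in range(156)]
values = [377.0 + 2.0 * t
          + 3.0 * math.sin(2 * math.pi * t)
          + 4.0 * math.cos(2 * math.pi * t) for t in times]

coeffs, amplitude = fit_trend_and_cycle(times, values)
print(f"trend = {coeffs[1]:.2f} ppm/yr, seasonal amplitude = {amplitude:.2f} ppm")
```

Applying the same fit to the observed and model series at each site would give trends and amplitudes that can be compared as percentage differences, which is how the 4-10% and 20% figures above are expressed.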

Free troposphere
Over southern high latitudes, AIRS data are not available; the model values have only a weak seasonal cycle, as expected. Over southern middle latitudes the model has a smaller seasonal cycle and lower concentrations than observed by AIRS, suggesting possible errors in the fluxes and/or atmospheric transport. We acknowledge that few independent data are available to validate AIRS retrievals over southern middle latitudes.
Over northern tropical latitudes, the a posteriori model seasonal cycle is in good agreement with AIRS, but has an amplitude much smaller than the sparse GLOBALVIEW aircraft data that span 5-8 km. When we sampled the models at the same time and location as these GLOBALVIEW aircraft measurements, the models agreed better with the observations, suggesting smearing effects in the monthly zonal mean data from vertical weighting functions (as well as from horizontal and temporal averaging). We still find that the model seasonal cycles are smaller than the observations. We did not observe this difference in seasonal cycle with the ground-based GLOBALVIEW data, suggesting that incorrect model vertical transport plays an important role in the discrepancy between the model and data. Over northern mid-latitudes, the model and AIRS seasonal cycles are of comparable magnitude, but there is a phase shift, with the model leading by 1-2 months; this is consistent with the sparse GLOBALVIEW aircraft observations that span 5-8 km. Previous work has also reported GEOS-Chem model bias in the seasonal cycle of CO 2 (Palmer et al., 2008), which has been attributed to deficiencies in modeling vertical transport in the free troposphere. We do not reproduce the AIRS seasonal cycle at northern high latitudes, with the model more consistent with the GLOBALVIEW data.

[Figure caption] The GEOS-Chem model, described at a horizontal resolution of 2° × 2.5°, has been sampled at the time and location of each AIRS level-3 CO 2 scene, weighted by the observation numbers, and convolved using the vertical weighting functions from Chahine et al. (2008). The grey envelope denotes the 1-standard deviation about the zonal mean CO 2 observations in the latitude band. The green crosses are GLOBALVIEW aircraft measurements in the vertical range 5-8 km, and the cyan dots represent G4 a posteriori model CO 2 concentrations sampled at the same time and location as each GLOBALVIEW aircraft measurement.
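The smearing effect of a broad vertical weighting function can be illustrated with a small numerical sketch. The layer boundaries, the two model profiles, and the weights below are hypothetical values chosen for illustration; they are not the AIRS weighting functions of Chahine et al. (2008) and not GEOS-Chem output. The function `apply_weighting_function` is likewise a name introduced here.

```python
def apply_weighting_function(profile_ppm, weights):
    """Return the weighted-mean CO2 (ppm) of a profile for the given layer weights."""
    if len(profile_ppm) != len(weights):
        raise ValueError("profile and weights must have the same number of layers")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(c * w for c, w in zip(profile_ppm, weights)) / total

# Hypothetical layers: 2-5 km, 5-8 km, 8-12 km, above 12 km.
peak_profile = [384.0, 383.0, 381.0, 379.0]    # seasonal maximum (ppm)
trough_profile = [376.0, 378.0, 379.5, 379.0]  # seasonal minimum (ppm)
weights = [0.10, 0.40, 0.35, 0.15]             # hypothetical, peaks in mid-troposphere

# Seasonal range seen in the 5-8 km layer alone (what an aircraft would sample).
layer_range = peak_profile[1] - trough_profile[1]

# Seasonal range after applying the vertical weighting to both profiles.
weighted_range = (apply_weighting_function(peak_profile, weights)
                  - apply_weighting_function(trough_profile, weights))

print(f"5-8 km layer range: {layer_range:.3f} ppm")
print(f"weighted range:     {weighted_range:.3f} ppm")
```

In this constructed case the weighted seasonal range is smaller than the range in the 5-8 km layer because the weights also draw on higher layers where the seasonal signal is weak, which is the kind of smearing discussed above.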
Figure 10a and b show CONTRAIL and model CO 2 concentrations during 2003-2006. We sample the models at the time and location of each CONTRAIL measurement, and bin them into two latitude bands, 35°S-0° and 0°-35°N, at 8-12 km. The resulting model and observed trends are similar (2.00-2.15 ppm yr −1 ) in both latitude bands. The models also capture the observed magnitude and phase of the seasonal cycle in both hemispheres. Figure 10c shows the CONTRAIL and model latitude gradient of CO 2 concentrations. Observed variations about the annual mean mainly reflect the seasonal cycles at 8-12 km, which are slightly underestimated by our models. Coarse model horizontal resolution can also smear out small spatial variations seen in neighbouring observations. At latitudes 15°N-30°N, model concentrations show much less variation than the observations: at 25°N, the observed variation is 2.4 ppm, while the G4 (G5) model variation is only 1.4 ppm (1.1 ppm), partially due to transport deficiencies and coarse spatial resolution. The G5 model has less variation than G4, suggesting that the G5 model has slower vertical mixing.
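The binning step described above can be sketched as a simple filter on latitude and altitude followed by an average. The sample records, field names, and the function `band_mean` are hypothetical and serve only to show the selection logic; they do not reproduce the CONTRAIL data format or the processing actually used.

```python
from statistics import mean

def band_mean(samples, lat_min, lat_max, alt_min_km=8.0, alt_max_km=12.0):
    """Mean CO2 (ppm) of samples with lat_min <= lat < lat_max inside the altitude range.

    Returns None when no sample falls inside the bin.
    """
    selected = [s["co2"] for s in samples
                if lat_min <= s["lat"] < lat_max
                and alt_min_km <= s["alt_km"] <= alt_max_km]
    return mean(selected) if selected else None

# Hypothetical co-located samples (latitude in degrees north, altitude in km).
samples = [
    {"lat": -20.0, "alt_km": 10.5, "co2": 377.0},
    {"lat": -5.0, "alt_km": 11.0, "co2": 378.0},
    {"lat": 12.0, "alt_km": 9.0, "co2": 379.0},
    {"lat": 25.0, "alt_km": 10.0, "co2": 381.0},
    {"lat": 25.0, "alt_km": 6.0, "co2": 385.0},   # below 8 km, excluded
    {"lat": 40.0, "alt_km": 10.0, "co2": 382.0},  # outside both bands, excluded
]

south = band_mean(samples, -35.0, 0.0)  # 35S-0 band
north = band_mean(samples, 0.0, 35.0)   # 0-35N band
print(f"35S-0: {south} ppm, 0-35N: {north} ppm")
```

Running the same selection on the observations and on the model values sampled at the same times and places keeps the two averages directly comparable.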

Conclusions
We have evaluated the GEOS-Chem model of atmospheric CO 2 using surface, aircraft and space-borne data. We have driven the model using GEOS-4 and GEOS-5 meteorology, which offers us an opportunity to assess the sensitivity of a posteriori fluxes to atmospheric transport, a priori fluxes of fossil fuel, biofuel, biomass burning, and the terrestrial and ocean biospheres. Model analyses that used GEOS-4 and GEOS-5 meteorology are denoted by G4 and G5, respectively.
The sign and magnitude of regional a posteriori CO 2 fluxes are in broad agreement with TransCom-3 flux estimates for 1992-1996, but our model has a larger sink over northern and southern continents. Our larger estimated sink over northern continents is partially due to including biofuel emissions as part of our prior flux estimates.
The stronger drawdown during the growing season and weaker source during the rest of the year represent a substantial departure from the annually-balanced CASA model, possibly reflecting one or a combination of factors found by previous studies, e.g., overestimated prior biospheric respiration, errors in prescribed fossil fuel emissions, and errors in boundary layer transport.
We evaluated the a posteriori model vertical CO 2 profile against aircraft campaign data from COBRA 2004 (May-August), INTEX-NA (July-August), and YAK AEROSIB (April, September). The G4 and G5 models reproduced the mean observed concentrations in the free troposphere and upper troposphere generally within 1.5 ppm, with substantial variations that reflect sub-grid variability. However, we found that the model had difficulty in capturing boundary layer concentrations observed by COBRA during the early (May) and late (August) growing season over North America. The a posteriori G4 and G5 surface concentration trend is 4-10% lower than GLOBALVIEW data, and the model seasonal cycles are within 20% of GLOBALVIEW. The observed latitude gradient of CO 2 over 60°S-60°N (0.033 ppm/° latitude) is well reproduced by the G4 and G5 models.
The model has a small negative bias in the free troposphere CO 2 trend (1.95-2.19 ppm yr −1 ) compared to AIRS data, which have a trend of 2.21-2.63 ppm yr −1 , consistent with surface data. Over southern middle and tropical latitudes the model underestimates the seasonal cycle observed by AIRS. Over northern tropical latitudes the model seasonal cycle is in good agreement with AIRS. Over northern mid-latitudes the observed and model seasonal cycles are of comparable magnitude, but the model leads AIRS by 1-2 months.
Model CO 2 concentrations in the upper troposphere reproduce the trend of about 2.0 ppm yr −1 over 2003-2006 observed by CONTRAIL. The models also capture the observed mean latitude gradient, but both the CONTRAIL observations and the models show significant variation about that mean, particularly at latitudes greater than 10°N.
Based on our (limited) model evaluation we find no significant bias in GEOS model transport that would necessarily impede progress in quantitatively understanding major processes in the carbon cycle. However, we acknowledge that once we start evaluating model CO 2 concentrations above the boundary layer, the available data quickly become sparse in time and space. Global space-borne tropospheric column measurements of CO 2 , with the accuracy and precision required for surface CO 2 flux estimation, are fast becoming a reality. To establish and maintain confidence in these column measurements, we must start to strengthen column and in situ measurement capabilities that facilitate regular access to the free and upper troposphere over continents and over the remote troposphere, without the constraints imposed by commercial air corridors. This can be, and is being, achieved using vehicles such as the Gulfstream V and the Global Hawk UAV, which are capable of long-duration flight in the free and upper troposphere.