A framework for comparing remotely sensed and in-situ CO2 concentrations



Abstract.
A framework has been developed that allows validating CO2 column averaged volume mixing ratios (VMRs) retrieved from ground-based solar absorption measurements using Fourier transform infrared spectrometry (FTS) against measurements made in-situ (such as from aircraft and tall towers). Since in-situ measurements are made frequently and at high accuracy on the global calibration scale, linking this scale with FTS total column retrievals ultimately provides a calibration scale for remote sensing. FTS, tower and aircraft data were analyzed from measurements during the CarboEurope Regional Experiment Strategy (CERES) from May to June 2005 in Biscarrosse, France. Carbon dioxide VMRs from the MetAir Dimona aircraft, the TM3 global transport model and Observations of the Middle Stratosphere (OMS) balloon-based experiments were combined and integrated to compare with the FTS measurements. The comparison allows for calibrating the retrieved carbon dioxide VMRs from the FTS. The Stochastic Time Inverted Lagrangian Transport (STILT) model was then utilized to identify differences in surface influence regions, or footprints, between the FTS and the aircraft CO2 concentrations. Additionally, the STILT model was used to compare carbon dioxide concentrations from a tall tower situated in close proximity to the FTS station. The STILT model was then modified to produce column concentrations of CO2 to facilitate comparison with the FTS data. These comparisons were additionally verified by using the Weather Research and Forecasting - Vegetation Photosynthesis and Respiration Model (WRF-VPRM). The differences between the model-tower and the model-FTS comparisons were then used to calculate an effective bias of approximately −2.5 ppm between the FTS and the tower. This bias is attributed to the scaling factor used in the FTS CO2 data, which was to a large extent derived from the aircraft measurements made within a 50 km distance from the FTS station: spatial heterogeneity of carbon dioxide in the coastal area caused a low bias in the FTS calibration. Using STILT for comparing remotely sensed CO2 data with tower measurements of carbon dioxide, and quantifying this comparison by means of an effective bias, provided a framework or a "transfer standard" that allowed validating the FTS retrievals versus measurements made in-situ.
Correspondence to: R. Macatangay (macatang@iup.physik.uni-bremen.de)

Introduction
There has been much evidence that increasing global temperatures over the past 50 years can be attributed to human activity and that anthropogenic influence will continue to change the composition of the atmosphere in the coming years. Due to man's insatiable need for energy and industrialization, carbon dioxide (CO2), a by-product of fossil fuel combustion and biomass burning (brought about by land use change), has become the most significant anthropogenic greenhouse gas (IPCC, 2005). As a result, much attention is being given to the absorption characteristics of CO2 as well as to its contribution to possible climate changes due to its increased concentration in the atmosphere (McCartney, 1983).
Currently, global transport models utilize in-situ measurements of carbon dioxide from a global network of surface sites for analyzing, estimating and predicting its concentrations (Carbon Cycle Greenhouse Gases Group (CCGG), 2003) as well as determining regional scale exchanges of CO2 (Rödenbeck et al., 2006; Peylin et al., 2005; Peters et al., 2007). These in-situ surface measurements have the advantage that they are highly accurate. However, they have a limited spatial coverage, and an increasing number of measurements are performed within the proximity of local sources and sinks with networks of tall tower observatories over the continents. The limited spatial coverage and the proximity to local sources and sinks make model estimates susceptible to transport errors, such as errors in vertical transport processes (moist convection and turbulent mixing in the boundary layer), especially for continental regions (Washenfelder et al., 2006; Gerbig et al., 2008). This, in turn, introduces uncertainties in the geographic (spatial) and temporal distributions of CO2 sources and sinks (Dufour et al., 2004; Gerbig et al., 2008). These uncertainties make it difficult to predict the response of carbon dioxide to climate and land-use changes (Yang et al., 2002), as well as to project the future rate of increase of atmospheric CO2 (Dufour et al., 2004).
Space-borne or satellite measurements, such as the Orbiting Carbon Observatory (OCO) (whose planned launch is on 15 December 2008) (Crisp et al., 2004), the Scanning Imaging Absorption Spectrometer for Atmospheric Chartography (SCIAMACHY) (Burrows et al., 1990) and the Greenhouse Gases Observing Satellite (GOSAT), may offer the solution to the problem of sparse spatial and temporal distributions of carbon dioxide sources and sinks by providing global column measurements of CO 2 (Yang et al., 2002). To supplement and validate the satellite data, ground-based solar absorption spectroscopy in the infrared or Fourier transform infrared (FTIR) spectrometry is employed (Warneke et al., 2005). It measures the same quantity (column concentrations) as the satellite and exhibits less spatial variability as compared to in-situ data while retaining information about the surface fluxes and the diurnal behavior of carbon dioxide. It also complements existing in-situ networks and provides information about CO 2 exchange on regional scales (Washenfelder et al., 2006). The Total Carbon Column Observing Network (TCCON), which is a system of high-resolution ground-based FTIR spectrometers, provides this capability (http://www.tccon.caltech.edu/).
In this paper, CO2 column abundances from solar absorption FTIR measurements during the CarboEurope Regional Experiment Strategy (CERES) in Biscarrosse, France are presented, as well as a method to calibrate these measurements against aircraft data. To provide a "transfer standard" between measurement techniques that cannot be compared directly, such as in-situ tower data and column concentrations from FTIR measurements, the Stochastic Time Inverted Lagrangian Transport (STILT) model was utilized. The study is not about showing the full capability of solar absorption FTIR measurements for column retrievals of CO2, since the instrument used in CERES has a spectral response which is not yet fully understood, unlike the ones targeted and in operation for TCCON. The main aim is to provide a framework that allows validating the FTIR retrievals against measurements made in-situ from aircraft as well as from tall towers. Such in-situ measurements are made regularly with high accuracy on an internationally accepted calibration scale (WMO scale), and linking this scale with FTIR retrievals ultimately provides a calibration scale for remote sensing.

Determining CO2 concentrations
Fourier transform infrared (FTIR) spectroscopy measurements were performed during the CarboEurope Regional Experiment Strategy (CERES) from May to June 2005. CERES aims to build a comprehensive database of atmospheric CO2 concentrations, fluxes, and meteorological parameters at the regional scale. An overview of the experiment is given in Dolman et al. (2006). The experiment area is a 250 km × 150 km region in southwestern France, bounded to the west by the Atlantic Ocean, with an almost rectilinear shoreline along a north-northeast orientation. The Les Landes forest dominates the western half of the domain with 80% incorporated in the regional experiment area. It is mainly composed of maritime pines and contains clearings of different sizes, which consist of agricultural land (mainly crops) as well as grassland and pasture. Historically, the plantation forest was planted in the area to drain the marshlands. Now, the region is managed as a commercial forest with regular harvests and crop rotations (Dolman et al., 2006).
During the measurement campaign, carbon dioxide was analyzed in the near infrared region of the electromagnetic spectrum (1.597-1.618 µm or 6180-6260 cm⁻¹ band centered at 1.607 µm or 6220 cm⁻¹) because this band lies close to the maximum of the solar Planck function, which maximizes the signal-to-noise ratio. Atmospheric oxygen was also retrieved to provide a means to determine the dry air mixing ratio, avoiding uncertainties from the surface pressure and the water vapor column. The Fourier transform spectrometer (FTS) was stationed in Biscarrosse, France at 44°22′40″ N latitude, 1°13′52″ W longitude and 67.6 m (above sea level) altitude. A total of 4908 spectra were analyzed during the CarboEurope regional experiment, encompassing measurements from 8 May 2005 to 26 June 2005. The Bruker 120 M (Mobile) Fourier transform spectrometer was utilized during the campaign. A maximum optical path difference of 30 cm was employed and a resolution of 0.03 cm⁻¹ was used. The 120 M has a focal length of 220 mm, and an aperture size of 0.5 mm was used during the dates mentioned. This produces a field of view of 2.3 mrad. Forward and backward scans were taken, totaling an average acquisition time of 24.0 s for each spectrum.
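The instrument numbers quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the quoted aperture size is a diameter and that the field of view is the small-angle ratio of aperture to focal length; the function names are illustrative and do not come from the campaign software.

```python
# Illustrative consistency check of the band limits and field of view.
# Assumption: "aperture size" is a diameter, and FOV = aperture / focal length.

def wavenumber_to_micron(wavenumber_cm: float) -> float:
    """Convert a wavenumber in cm^-1 to a wavelength in micrometres."""
    return 1.0e4 / wavenumber_cm


def field_of_view_mrad(aperture_mm: float, focal_length_mm: float) -> float:
    """Small-angle field of view in milliradians."""
    return aperture_mm / focal_length_mm * 1.0e3


# CO2 band limits and centre quoted in the text (cm^-1).
band_edges_um = [wavenumber_to_micron(nu) for nu in (6260.0, 6180.0)]
band_center_um = wavenumber_to_micron(6220.0)

# Bruker 120 M settings quoted in the text.
fov_mrad = field_of_view_mrad(aperture_mm=0.5, focal_length_mm=220.0)

print(f"CO2 band: {band_edges_um[0]:.3f}-{band_edges_um[1]:.3f} um")
print(f"band centre: {band_center_um:.4f} um")
print(f"field of view: {fov_mrad:.2f} mrad")
```

Under these assumptions the ratio 0.5 mm / 220 mm gives about 2.27 mrad, consistent with the quoted 2.3 mrad.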
Beside the FTS station is a tower instrumented by the Laboratoire des Sciences du Climat et de l'Environnement (LSCE). It houses a continuous in-situ monitoring station called CARIBOU, which includes a LICOR analyzer that measures CO2 concentrations with a ±0.5 ppm precision. The tower is located at a latitude of 44°22′40.6″ N and a longitude of 1°13′52.5″ W, with the inlet at 114.71 m (above sea level). It also houses a pressure sensor located at 106.81 m (above sea level) (Galdemard et al., 2006). Several aircraft measurements were also performed during the regional experiment. Among the aircraft is the METAIR Dimona (Dimona), a touring motor glider (TMG), in which CO2 is measured onboard using a combination of a fast, open path LICOR 7500, a slower, more precise closed path LICOR 6262 (Neininger et al., 2001), and flask samples that are analyzed for CO2 in the laboratory at the Max Planck Institute for Biogeochemistry (MPI-BGC) in Jena, Germany with an accuracy of 0.1 ppm. The overall precision of the combined CO2 dataset (the fast open path LICOR 7500, the slower closed path LICOR 6262 and the flask samples) at 1 Hz is 0.5 ppm.
To aid in the interpretation of the data and to serve as a "transfer standard" between different kinds of measurements that cannot be compared directly, such as the FTS and the in-situ tower data, the Stochastic Time Inverted Lagrangian Transport (STILT) model was utilized. It is based on the HYSPLIT model (Draxler and Hess, 1997; Draxler and Hess, 1998), using a similar mean advection scheme but employing a different turbulence module. It has been further modified to use winds, surface sensible heat and momentum fluxes, and computed convective mass fluxes from ECMWF assimilated meteorological fields (Gerbig et al., 2008). Being originally designed for comparisons with in-situ measurements (single receptors or single measurement locations), STILT was modified for comparisons with column measurements (multiple receptors). The multiple receptor scheme is depicted in Fig. 1. For each receptor location, x_r, representative particles were released at a time t_r, giving rise to particle densities, ρ(x_r, t_r | x, t), at x and time t. From the particle densities, the surface influence or footprint, S(x, t), which relates surface fluxes (sources or sinks) to the concentration, C(x_r, t), at the measurement location (the receptor), can be determined. The initial boundary tracer conditions are taken from the TM3 global transport model (Heimann and Körner, 2003). For more details on the STILT model, refer to Lin et al. (2003) and Gerbig et al. (2003). The model was run at a 0.125° latitude × 0.083° longitude resolution and 3 days backward in time.
The CO2 concentration output from the model (in ppm) is determined by

$$\mathrm{CO}_2 = \mathrm{CO}_{2,\mathrm{background}} + \mathrm{CO}_{2,\mathrm{fossil\ fuel}} - \mathrm{CO}_{2,\mathrm{photosynthetic\ uptake}} + \mathrm{CO}_{2,\mathrm{respiration}} \quad (1)$$

where CO2,background is the background carbon dioxide obtained from the TM3 global transport model boundary fields, CO2,fossil fuel comes from fossil fuel emissions due to combustion, estimated using the recent greenhouse gas emissions inventory from the Institute of Economics and the Rational Use of Energy (IER), University of Stuttgart (http://carboeurope.ier.uni-stuttgart.de/), CO2,photosynthetic uptake is the carbon dioxide concentration taken up by the vegetation and CO2,respiration is the amount of CO2 released by plants.
The biospheric exchange is based on the diagnostic model GSB (greatly simplified biosphere), using light and temperature response and 3 vegetation classes, namely forests, shrubs and crops.
Fig. 1. STILT applied to column measurements. Receptor points were placed at equal intervals along the vertical column for each altitude range. Altitude ranges are from 1-500 m, 500-3000 m, 3-6 km, 6-11 km and 11-18 km. The released particles give rise to particle densities at certain locations wherein influences can be calculated.
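The multiple-receptor layout can be illustrated with a short sketch. The altitude ranges are those listed for the column scheme; the number of receptors per range and their placement at the midpoints of equal sub-layers are assumptions made only for illustration, since the text states only that receptors were equally spaced within each range.

```python
import numpy as np

# Altitude ranges of the column scheme, in metres (from the text).
ALTITUDE_RANGES_M = [
    (1.0, 500.0),
    (500.0, 3000.0),
    (3000.0, 6000.0),
    (6000.0, 11000.0),
    (11000.0, 18000.0),
]


def receptor_heights(ranges, n_per_range):
    """Return equally spaced receptor heights for each altitude range.

    Hypothetical choice: each range is split into n_per_range equal
    sub-layers and one receptor is placed at the midpoint of each.
    """
    heights = {}
    for lower, upper in ranges:
        edges = np.linspace(lower, upper, n_per_range + 1)
        heights[(lower, upper)] = 0.5 * (edges[:-1] + edges[1:])
    return heights


# n_per_range=5 is an arbitrary illustrative value, not taken from the paper.
receptors = receptor_heights(ALTITUDE_RANGES_M, n_per_range=5)
for (lower, upper), z in receptors.items():
    print(f"{lower:7.0f}-{upper:7.0f} m: {np.round(z, 1)}")
```

Each receptor height would then be passed to the transport model as a separate release point, and the resulting signals combined into a column value.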

Results
The next sections discuss results from the FTIR retrievals, comparisons with the MetAir Dimona aircraft, results from the STILT model and the effect of clouds on the retrieved O 2 and CO 2 columns.

Retrieval
CO2 and O2 vertical columns were retrieved using the GFIT nonlinear least squares spectral fitting algorithm (version 2.40.2) developed by NASA/JPL (Toon et al., 1992). O2 was analyzed in the 1.25-1.29 µm or 7765-8005 cm⁻¹ band centered at 1.27 µm or 7885 cm⁻¹, with H2O as an interfering gas. CO2 was retrieved in the 1.597-1.618 µm or 6180-6260 cm⁻¹ band centered at 1.607 µm or 6220 cm⁻¹. Interfering gases in the 6220 cm⁻¹ CO2 band are H2O, HDO and CH4. The retrieved O2 column was compared to 20.95% of the total dry pressure column, P_dry,column. The dry pressure column was determined using

$$P_{\mathrm{dry,column}} = \frac{P_{\mathrm{obs}}}{m_{\mathrm{dry}}\, g} - \mathrm{H_2O}_{\mathrm{column}}\, \frac{m_{\mathrm{H_2O}}}{m_{\mathrm{dry}}} \quad (2)$$

where P_obs is the observed surface pressure, m_dry is the mean molecular mass of dry air, m_H2O is the mean molecular mass of water vapor, g is the surface acceleration due to gravity and H2O_column is the water vapor column retrieved in the O2 window (Washenfelder et al., 2006). From this, a linear fit with zero intercept was done, from which the slope (1.0432) was used to scale down the O2 column to make it correspond to 20.95% of the dry pressure column.
The upper limit of the precision of the O2 VMR was determined from its diurnal variation as shown in Fig. 4. The O2 diurnal variation is given by

$$\Delta \mathrm{O}_2 = \frac{\mathrm{O}_{2,\mathrm{VMR}} - \overline{\mathrm{O}_{2,\mathrm{VMR}}}}{\overline{\mathrm{O}_{2,\mathrm{VMR}}}} \times 100\% \quad (3)$$

where $\overline{\mathrm{O}_{2,\mathrm{VMR}}}$ is the daily mean of the volume mixing ratio of oxygen. One way of estimating the upper limit of the precision of CO2 is to also use the diurnal variation, this time for the CO2 column average VMRs (Yang et al., 2002) shown in Fig. 5. However, since there is a natural variability in the CO2 column average volume mixing ratio over the course of the day due to diurnally varying surface sources and sinks (mostly biospheric), this method only gives an upper limit of the precision. The CO2 diurnal variation is given as

$$\Delta \mathrm{CO}_2 = \frac{\mathrm{CO}_{2,\mathrm{VMR}} - \overline{\mathrm{CO}_{2,\mathrm{VMR}}}}{\overline{\mathrm{CO}_{2,\mathrm{VMR}}}} \times 100\% \quad (4)$$

where CO2,VMR is defined as 20.95% of the CO2/O2 column ratio for individual measurements and $\overline{\mathrm{CO}_{2,\mathrm{VMR}}}$ is the mean of the day.
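The dry-column bookkeeping described above can be sketched numerically. This is a minimal illustration, assuming the dry column takes the form P_obs/(m_dry g) minus the water column weighted by m_H2O/m_dry (the form implied by the variables defined in the text), a constant g of 9.81 m s⁻², and invented input values. None of the numbers below are campaign data.

```python
# Sketch of the dry-air column and the CO2/O2 ratio method.
# All inputs are hypothetical; constants are standard physical values.

AVOGADRO = 6.02214076e23          # molecules per mole
M_DRY = 28.964e-3 / AVOGADRO      # mean mass of a dry-air molecule, kg
M_H2O = 18.015e-3 / AVOGADRO      # mean mass of a water molecule, kg
G = 9.81                          # surface gravity, m s^-2 (assumed constant)
O2_FRACTION = 0.2095              # dry-air O2 mole fraction used in the text


def dry_air_column(p_obs_pa: float, h2o_column: float) -> float:
    """Dry-air column in molecules m^-2 from surface pressure and H2O column."""
    return p_obs_pa / (M_DRY * G) - h2o_column * M_H2O / M_DRY


def co2_vmr_from_ratio(co2_column: float, o2_column: float) -> float:
    """Column-averaged dry-air CO2 VMR as 20.95% of the CO2/O2 column ratio."""
    return O2_FRACTION * co2_column / o2_column


# Hypothetical inputs: 1013.25 hPa surface pressure, 5e26 molecules m^-2 of H2O.
dry_column = dry_air_column(101325.0, 5.0e26)
expected_o2 = O2_FRACTION * dry_column

# Hypothetical retrieved columns; the O2 column is scaled by the slope 1.0432.
retrieved_o2 = expected_o2 * 1.0432
scaled_o2 = retrieved_o2 / 1.0432
xco2 = co2_vmr_from_ratio(380.0e-6 * dry_column, scaled_o2)

print(f"dry-air column: {dry_column:.3e} molecules m^-2")
print(f"column-averaged CO2: {xco2 * 1e6:.2f} ppm")
```

Because the ratio uses the simultaneously retrieved O2 column, errors common to both columns cancel to first order, which is the motivation given in the text for the CO2/O2 approach.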
The CO2/O2 column ratio minimizes systematic errors, such as errors present in the pressure and in the instrumental line shape (Warneke et al., 2005), while retaining the diurnal source/sink signals. Quantiles were used to quantitatively assess the diurnal variations, specifically the quartiles and the central 90% range. These statistics are summarized in Table 1. Approximately 50% of the measured data have diurnal variations within ±0.2091% and ±0.2072% for O2 and CO2, respectively, and approximately 90% of the measured data have diurnal variations within ±0.5577% and ±0.5265% for O2 and CO2, respectively. The outliers in the diurnal variations result from influences of clouds (Warneke et al., 2006).
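The quantile statistics can be reproduced in outline as follows. This is a sketch under the assumption that the diurnal variation is the percent deviation of each measurement from its daily mean; the sample values are synthetic and only illustrate the calculation.

```python
import numpy as np


def diurnal_variation_percent(vmr, day):
    """Percent deviation of each measurement from the mean of its own day."""
    vmr = np.asarray(vmr, dtype=float)
    day = np.asarray(day)
    variation = np.empty_like(vmr)
    for d in np.unique(day):
        mask = day == d
        daily_mean = vmr[mask].mean()
        variation[mask] = 100.0 * (vmr[mask] - daily_mean) / daily_mean
    return variation


def central_range(values, fraction):
    """Lower and upper bounds enclosing the central `fraction` of the data."""
    lower_pct = 50.0 * (1.0 - fraction)
    return np.percentile(values, [lower_pct, 100.0 - lower_pct])


# Synthetic column-averaged VMRs (ppm) on two days; not campaign data.
vmr = [380.0, 382.0, 378.0, 381.0, 379.0]
day = [1, 1, 1, 2, 2]

variation = diurnal_variation_percent(vmr, day)
quartile_range = central_range(variation, 0.50)   # interquartile range
ninety_range = central_range(variation, 0.90)     # central 90%

print(np.round(variation, 4))
print("central 50%:", np.round(quartile_range, 4))
print("central 90%:", np.round(ninety_range, 4))
```

Since natural source and sink signals also contribute to the spread, ranges computed this way bound the precision from above rather than measure it directly.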

Aircraft comparison
The accuracy of the CO 2 retrievals was determined by comparing the FTS CO 2 VMRs with integrated aircraft carbon dioxide volume mixing ratios. Of the mentioned measurement dates, simultaneous Dimona and FTS measurements were available during five days, 25, 26, and 27 May and 6 and 14 June. During these days, only those data from the aircraft that fell within a 50 km distance from the FTS station were selected. From this, seven instances were identified. These instances are shown in Fig. 6 together with the FTS location and its pointing directions as well as the flight paths and the maximum altitude of the MetAir Dimona.
The Dimona reached a maximum altitude of approximately 3 km during the CarboEurope experiment. It was thus necessary to append CO2 profiles above the aircraft ceiling. For the free troposphere portion of the profile, data were taken from the TM3 global transport model, which was coupled to surface fluxes from fossil fuel emissions as well as to the BIOME-BGC model to include biospheric exchange (Heimann and Körner, 2003). For the stratospheric part of the profile, in-situ balloon data from the Observations of the Middle Stratosphere (OMS) experiment performed in Fort Sumner, New Mexico (35° N, 104° W) on 17 September 2004 were utilized. Since the balloon measurements were not performed during the same period as CERES, the balloon profile was corrected for age using the annual increase rate of CO2. Since the balloon measurements were also not made in Biscarrosse, France, a coordinate transformation is necessary. Measurements of potential temperature during the balloon flight were used for this purpose, since potential temperature is approximately a conserved quantity in the stratosphere. The potential temperature was then converted to altitude using the equation formulated by Knox (Knox, 1998) where θ is the potential temperature in Kelvin and z is the altitude in km. It was then converted back to pressure using NCEP altitude-pressure-temperature profiles for Biscarrosse, France during the specific aircraft overpass dates, and the CO2 concentration values were then interpolated. A ±0.75 ppm uncertainty was assigned based on the precision of the balloon data and on the 0.5 year uncertainty in the mean age of the air in the stratosphere. A 0.5 year uncertainty in the stratosphere translates into approximately 0.75 ppm uncertainty in the carbon dioxide concentration when one considers the 1.4 ppm year⁻¹ annual increase rate of CO2. The CO2 concentrations for the aircraft have an uncertainty of ±0.5 ppm.
For the model, a pressure dependent uncertainty in the CO 2 profile was assigned ranging from ±0.5 ppm at the aircraft ceiling increasing to a maximum of ±0.75 ppm at the tropopause.
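The age correction of the balloon profile can be sketched as a simple shift by the annual growth rate. The text does not spell out the exact procedure, so a linear shift by 1.4 ppm per year over the elapsed time is an assumption here, and the profile values and the chosen overpass date are illustrative only.

```python
from datetime import date

GROWTH_RATE_PPM_PER_YEAR = 1.4  # annual CO2 increase rate quoted in the text


def elapsed_years(measured_on: date, target: date) -> float:
    """Time between the balloon flight and the comparison date, in years."""
    return (target - measured_on).days / 365.25


def age_correct(profile_ppm, measured_on, target,
                rate=GROWTH_RATE_PPM_PER_YEAR):
    """Shift every level by rate * elapsed time (assumed linear correction)."""
    shift = rate * elapsed_years(measured_on, target)
    return [value + shift for value in profile_ppm]


def age_uncertainty_ppm(age_uncertainty_years,
                        rate=GROWTH_RATE_PPM_PER_YEAR):
    """CO2 uncertainty implied by an uncertainty in the mean age of air."""
    return rate * age_uncertainty_years


# Balloon flight date from the text; 27 May 2005 is one of the overpass days.
flight_date = date(2004, 9, 17)
overpass_date = date(2005, 5, 27)

# Hypothetical stratospheric profile in ppm, not OMS data.
balloon_profile = [376.0, 374.5, 373.0]
corrected = age_correct(balloon_profile, flight_date, overpass_date)
shift_ppm = corrected[0] - balloon_profile[0]

print(f"shift applied: {shift_ppm:.2f} ppm")
print(f"0.5 yr age uncertainty: {age_uncertainty_ppm(0.5):.2f} ppm")
```

With this linear assumption, a 0.5 year age uncertainty corresponds to 0.7 ppm, in line with the approximate value used for the balloon portion of the profile.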
To compare the combined (aircraft, model and balloon) carbon dioxide concentrations with the FTS data, it is necessary to consider the different characteristics of the observing systems. Derived quantities, such as total columns, may then be compared properly among different measurement platforms. In this case, the combined (aircraft, model and balloon) data is said to be "simulated" by the FTS retrievals by using the FTS a priori CO2 VMR and by weighting the combined (aircraft, model and balloon) CO2 concentrations with the FTS column averaging kernels (Rodgers et al., 2003). This procedure is summarized in the following equation:

$$\mathrm{CO}_{2,\mathrm{simulated}} = \mathrm{CO}_{2,\mathrm{a\ priori}} + A\,\left(\mathrm{CO}_{2,\mathrm{aircraft+MODEL+balloon}} - \mathrm{CO}_{2,\mathrm{a\ priori}}\right) \quad (6)$$

where CO2,a priori is the a priori CO2 profile used in the retrieval, A is the column averaging kernel (shown in Fig. 7 (left) for instance 7) and CO2,aircraft+MODEL+balloon is the aircraft data appended with the model and balloon data. The "simulated" CO2 profile for instance 7, as shown in Fig. 7 (right), was then additionally weighted with a pressure dependent gravitational acceleration and integrated with respect to pressure using a trapezoidal numerical integration. The result was then divided by the mean molecular mass of dry air to determine the column CO2. The column averaged volume mixing ratio is then determined by dividing the column CO2 by the dry pressure column. A similar procedure was performed for the carbon dioxide column uncertainties, with an additional error propagation done on the uncertainties in the profile. The uncertainties in each pressure level were squared, integrated with respect to the square of the pressure, and the square root of the integrated value was calculated. Figure 8 shows the comparison of the averaged (retrieval error weighted) FTS CO2 VMR (20.95% of the CO2/O2 column ratio) for the aforementioned instances to the integrated (combined "simulated" aircraft, model and balloon) CO2 VMRs.
The CO2 columns were reduced by a factor of 1.0291, where the scaling factor was determined from the slope of a zero-intercept linear fit. The correlation coefficient is 0.67 and the residuals vary approximately between ±1 ppm. Two instances, 4 and 6, deviated more than expected from the one-to-one line due to differences in the surface influence regions between the FTS and the Dimona (see Discussion).
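The chain from an in-situ profile to a calibration factor can be outlined in a few functions. This is a simplified sketch: the averaging kernel is applied level by level, gravity is taken as constant so that it cancels in the pressure-weighted average (the paper uses a pressure dependent value), and the regression takes the in-situ column as the independent variable, which is an assumption. All profile values are invented.

```python
import numpy as np


def simulate_with_kernel(a_priori, kernel, profile):
    """Apply the a priori and column averaging kernel to a measured profile."""
    a_priori = np.asarray(a_priori, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    profile = np.asarray(profile, dtype=float)
    return a_priori + kernel * (profile - a_priori)


def trapezoid(y, x):
    """Trapezoidal integration of y with respect to x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def column_average_vmr(vmr, pressure):
    """Pressure-weighted column average, assuming constant gravity."""
    pressure = np.asarray(pressure, dtype=float)
    return trapezoid(vmr, pressure) / trapezoid(np.ones_like(pressure), pressure)


def zero_intercept_slope(x, y):
    """Least-squares slope of y = slope * x with the intercept fixed at zero."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))


# Hypothetical five-level profile from the surface to the stratosphere.
pressure_hpa = np.array([1000.0, 700.0, 300.0, 100.0, 10.0])
a_priori = np.full(5, 375.0)
kernel = np.array([1.1, 1.0, 0.9, 0.8, 0.7])
in_situ = np.array([378.0, 380.0, 381.0, 379.0, 376.0])

simulated = simulate_with_kernel(a_priori, kernel, in_situ)
simulated_column = column_average_vmr(simulated, pressure_hpa)
print(f"simulated column average: {simulated_column:.2f} ppm")

# Scaling factor from paired column averages (synthetic values).
in_situ_columns = [378.0, 379.0, 380.0]
fts_columns = [1.0291 * value for value in in_situ_columns]
print(f"scaling factor: {zero_intercept_slope(in_situ_columns, fts_columns):.4f}")
```

Dividing the FTS columns by the fitted slope then places them on the scale of the in-situ data, which is the role the factor 1.0291 plays in the text.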

Measurement and model comparisons
The Stochastic Time Inverted Lagrangian Transport (STILT) model was used for comparison of carbon dioxide concentration time series from the Biscarrosse tower data, using a single receptor placed at the same latitude and longitude as the tower with an above ground level height of 47 m. Figure 9 (upper panel) shows the time series comparisons between these datasets. Tower and STILT data simultaneous to the FTS measurements were evaluated. In addition, days prior to the period with enhanced biospheric activity due to changes in phenology (prior to 16 June 2005) were considered. The tower measurements were also compared with the Weather Research and Forecasting - Vegetation Photosynthesis and Respiration Model (WRF-VPRM) modeling system, as shown in Fig. 9 (lower panel). WRF-VPRM is a coupled modeling system designed to simulate high-resolution atmospheric CO2 concentration fields. Here, WRF is a state-of-the-art mesoscale meteorological model and it is coupled to the diagnostic biospheric model VPRM. VPRM produces biospheric CO2 fluxes and passes these to WRF, which performs the atmospheric CO2 tracer transport simulation. The modeling system also takes into account anthropogenic CO2 fluxes.
Fig. 7. The a priori and the column averaging kernel used in the FTS retrieval were applied to the combined (aircraft, model and balloon) data to make a comparison with the CO2 concentrations retrieved from the Fourier transform spectrometer using Eq. (6). (Right) "Simulated" CO2 profile. CO2 concentration data for the aircraft have an uncertainty of ±0.5 ppm. Above the aircraft ceiling, the modeled CO2 data was assigned a pressure dependent uncertainty varying from ±0.5 ppm to ±0.75 ppm. The uncertainty in the balloon data was estimated to be ±0.75 ppm based upon the variability of the measured CO2 data and the uncertainty in the mean age of the air in the stratosphere.
The comprehensive description of the modeling system and setup can be found in Ahmadov et al. (2007). Statistics for the comparisons are shown in Table 2. A more detailed analysis of the comparison of WRF-VPRM and the Biscarrosse tower is currently being prepared by Ahmadov et al. (2007).
The STILT model was then extended for comparison to vertical column concentrations of CO2 using multiple receptors along the column (see Fig. 1). Similar to what was done with the aircraft profiles, OMS in-situ balloon data, corrected for age and transformed in coordinates, were appended above the STILT model. The modeled carbon dioxide profile is shown in Fig. 10 for instance 7 compared to the Dimona-TM3-OMS CO2 profile. The FTS retrieval a priori CO2 and its averaging kernel were also applied (Eq. 6) to the STILT modeled CO2 profiles before integrating the column. The column averaged VMRs of carbon dioxide from the STILT model and the FTIR data were then compared. The column averaged CO2 volume mixing ratios retrieved from the FTIR data were also compared with WRF-VPRM, similarly "simulated" with the FTS a priori CO2 VMR and with the FTS column averaging kernel. The comparisons are shown in Fig. 11 and the pertinent statistics are summarized in Table 2. Additionally, taking only afternoon values (3 p.m. to 8 p.m. local time), the standard deviation of the differences and the mean differences were calculated among the datasets. The effective bias, which is in effect the difference between the FTIR data and the tower data, was then computed as the difference between the model-tower and the model-FTIR mean differences. These are also noted in Table 2.
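The effective-bias calculation can be written out explicitly. The sketch below assumes that each mean difference is taken as model minus measurement and that the afternoon window is 15:00 to 20:00 local time inclusive; the time series are invented and chosen only so that the arithmetic is easy to follow.

```python
import numpy as np


def afternoon_mask(local_hours, start=15, end=20):
    """Select samples between 3 p.m. and 8 p.m. local time (inclusive)."""
    local_hours = np.asarray(local_hours)
    return (local_hours >= start) & (local_hours <= end)


def mean_difference(model, measurement):
    """Mean of (model - measurement), the sign convention assumed here."""
    return float(np.mean(np.asarray(model, float) - np.asarray(measurement, float)))


def effective_bias(model_at_tower, tower, model_column, ftir):
    """(model - tower) mean difference minus (model - FTIR) mean difference."""
    return mean_difference(model_at_tower, tower) - mean_difference(model_column, ftir)


# Synthetic values in ppm; hours are local time of each sample.
hours = np.array([10, 15, 17, 22])
tower = np.array([385.0, 380.0, 381.0, 390.0])
model_at_tower = np.array([383.0, 379.5, 380.5, 386.0])
ftir = np.array([378.5, 379.0, 380.0, 379.5])
model_column = np.array([380.0, 381.0, 382.0, 381.0])

keep = afternoon_mask(hours)
bias = effective_bias(model_at_tower[keep], tower[keep],
                      model_column[keep], ftir[keep])
print(f"effective bias (FTIR relative to tower): {bias:+.1f} ppm")
```

In this construction the model serves only as an intermediary between the two measurement types, so a model error that affects the tower and column comparisons equally cancels in the effective bias.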

Effect of clouds on O 2 and CO 2 precision
To quantitatively assess the effect of clouds on the precision of the retrieved O 2 and CO 2 VMRs, measurements from a clear day and a partly cloudy day during the campaign were compared. As shown in Fig. 12, 2.75-min averaged data were compared from measurements during a clear day (18 June

Discussion
Surface influence functions, or footprints, which quantify the contribution of surface fluxes to the concentration of the aircraft measurement as well as of the FTIR column, can be used to assess potential reasons for disagreement between the two types of measurements. The time integrated footprints shown in Fig. 13a have been determined using STILT. They show that the surface influences for instances 4 and 6 differ significantly between the FTS and the Dimona aircraft. For instance 4, where the CO2 column averaged VMR of the FTS is lower than that of the Dimona (see Fig. 8, instance 4), the FTS footprint has a discontinuity in the area of northern Spain. Surface fluxes in this region would therefore not affect the FTS measurements as they do the Dimona measurements, producing the mentioned difference. This discontinuity can be attributed to particles rising above the surface, hence producing no surface influence in that region. In addition, the aircraft is confined to a smaller region for this instance compared with the other instances (see Fig. 6, instance 4). This gives it a rather limited sampling area, in which other processes can influence the aircraft data as compared to the FTIR. For instance 6, the FTS column averaged VMR is higher than that of the Dimona (see Fig. 8, instance 6). The footprints of the Dimona show more influence from land than those of the FTS (see Fig. 13b), consistent with the flight track covering more vegetated areas (see Fig. 6, instance 6). Given that the land region at that time of the year is a much stronger sink for CO2 than the ocean due to the active land biosphere, this explains the lower CO2 observed by the aircraft (see Fig. 15). In Fig. 14, a decomposition of the STILT modeled CO2 concentrations for the different altitude ranges is shown. The lower altitude ranges (1-500 m and 500-3000 m) show significant influence of the biosphere on the CO2 concentrations.
These altitude ranges, which are well within the planetary boundary layer where significant turbulence is experienced (hence more vertical mixing), get more contributions from vegetation photosynthetic uptake and respiration. Higher up, from 3 km to 18 km, the carbon dioxide is dominated mostly by the background values with little variability due to vegetation.
Decomposition of the STILT modeled carbon dioxide concentrations by altitude range and sources/sinks is shown in Fig. 15 for instances 2, 3 and 6 at the location of the FTS. Instances 2 and 3 receive more biospheric influence because their footprints are inland, while instance 6 receives less influence from the biosphere, since its footprint originates mostly from the ocean (see Fig. 13a). This produces a higher CO2 value detected by the FTS than by the aircraft (see Fig. 8), which was sampling over vegetation (see Fig. 6).
Referring to Fig. 6, one can see that there are instances (instances 4 and 6) where the Dimona was taking samples in locations where the FTS was not pointing. One might say that this could be a potential source of disagreement between the FTIR spectrometer and the aircraft. However, looking at the FTS slant and vertical column averaged VMRs in Fig. 16, it becomes clear that taking slant or vertical column averaged VMRs does not matter. This was also verified with the WRF-VPRM also shown in Fig. 16.
The carbon dioxide column averaged VMRs measured with the FTIR spectrometer and the integrated (combined "simulated" aircraft, model and balloon) CO2 concentrations can be considered to be in agreement with each other, since the error bars overlap the one-to-one line (see Fig. 8). The most significant source of error for the FTS CO2 column averaged volume mixing ratio is the precision of the instrument (120 M) used in the CarboEurope experiment. For the integrated carbon dioxide VMR, the most significant source of uncertainty is the spatial heterogeneity of CO2 measured by the aircraft in the planetary boundary layer (see Fig. 7, right panel). The spatial heterogeneity is a result of taking aircraft data within a 50 km distance around the FTS station. Flying closer to the FTS station can therefore improve FTS validations with aircraft.
After validating the FTS carbon dioxide column averaged VMRs with the integrated (combined "simulated" aircraft, model and balloon) CO2 data, a meaningful next step would be to compare FTS measurements with in-situ tower data. The problem of directly comparing in-situ and remotely sensed data is that the quantities are different in nature to start with. One needs a tool to mediate between the two measuring techniques to assess whether the in-situ and FTS data are consistent. The STILT model provides this tool. WRF-VPRM is also used for additional verification. Pertinent statistics were calculated for days with FTS measurements and days prior to the period with enhanced biospheric activity due to changes in phenology (prior to 16 June 2005), since the greatly simplified biosphere (GSB) used in STILT simulates phenological changes with less certainty. Additional statistics were calculated using only afternoon values for the carbon dioxide concentrations of the datasets. Using only afternoon values reduces the uncertainties in the flux-concentration relationships due to a deeper boundary layer during these times compared to morning and night time hours. Models represent deeper boundary layers better than shallower ones due to limitations in their vertical resolution. Therefore, comparisons between modeled and measured data are more meaningful when only afternoon data are considered. For the tower comparisons, the statistics reveal that the models have difficulties capturing the variability in the in-situ data, as evidenced by the approximately 3-4.5 ppm standard deviation of the differences. The mean differences or biases, however, show smaller values (∼0.5-0.7 ppm). The differences between using STILT and using WRF-VPRM for the tower comparisons come from the dissimilar transport simulation and biosphere models that are employed.
For the FTIR comparisons, the models have less difficulty simulating the variability in the column (standard deviation of the differences ∼1 ppm). This is expected, since the column is less sensitive to local and synoptic changes in CO 2 concentrations. However, the mean differences or biases between the FTIR data and the models are larger (∼2 ppm) than with the tower. The differences between the model-tower and the model-FTIR comparisons were then used to calculate an effective bias of approximately −2.5 ppm between the FTIR and the tower. This bias comes from the scaling factor used in calibrating the FTIR data with the integrated (combined "simulated" aircraft, model and balloon) CO 2 data. The uncertainty in the applied scaling factor for the FTS columns results from spatial heterogeneity in the aircraft data used to scale the CO 2 columns (not evident in the modeled profile in Fig. 10). Additional information on this spatial heterogeneity will be available from the simulation of CO 2 along the flight track. This is beyond the scope of this paper and will be presented in a future publication focusing on the airborne data.
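The arithmetic behind an effective bias of this kind can be sketched as below. The sketch assumes that the model error is common to the tower and column comparisons, so that the model cancels when the two mean differences are subtracted. The function name, the sign convention and the numbers are illustrative assumptions and do not reproduce the values of this study.

```python
import numpy as np

def effective_bias(model_tower, tower, model_column, column):
    """Estimate the column-minus-tower bias (ppm) through a shared model.

    mean(model - tower) - mean(model - column) reduces to a
    column-minus-tower offset if the model error is common to both
    comparisons (the transfer-standard assumption).
    """
    bias_tower = np.mean(np.asarray(model_tower, float) - np.asarray(tower, float))
    bias_column = np.mean(np.asarray(model_column, float) - np.asarray(column, float))
    return float(bias_tower - bias_column)

# Synthetic numbers for illustration only (not the CERES values)
result = effective_bias(
    model_tower=[380.0, 381.0], tower=[379.0, 380.0],    # model - tower = +1.0
    model_column=[378.0, 378.5], column=[375.0, 375.5],  # model - column = +3.0
)
```

With these placeholder inputs the column reads 2 ppm lower than the tower relative to the common model.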

Conclusions
Ground-based solar absorption measurements using Fourier transform infrared spectrometry (FTS) were performed during the CarboEurope Regional Experiment Strategy (CERES) from May to June 2005 in Biscarrosse, France. Near-infrared spectra from a Bruker 120 M Fourier transform infrared (FTIR) spectrometer were then analyzed to retrieve carbon dioxide (CO 2 ) concentrations using a non-linear least squares fitting algorithm developed by NASA JPL (GFIT). To facilitate the comparison of the FTIR CO 2 retrievals to simultaneous in-situ measurements made from a tall tower and aboard an aircraft, the Stochastic Time Inverted Lagrangian Transport (STILT) model was utilized.
Fig. 14. Decomposition of the STILT modeled CO 2 by altitude range. The CO 2 multiple receptor signal is decomposed into the different altitude ranges of 1-500 m, 500-3000 m, 3-6 km, 6-11 km and 11-18 km. The lower altitude ranges (1-500 m and 500-3000 m) show significant influence of the biosphere in the CO 2 concentrations.
To represent the dry air volume mixing ratio (VMR), O 2 was retrieved and compared to 20.95% of the dry pressure column, resulting in a reduction factor of 4.32% for the retrieved oxygen. For the retrieved O 2 and CO 2 , the diurnal variation was used to estimate the upper limit of the precision. As a result, ninety percent of the data fell within ±0.56% and ±0.53% of the diurnal variations for O 2 and CO 2 , respectively. The retrieved carbon dioxide column averaged volume mixing ratios were then calibrated using data from the MetAir Dimona aircraft, with TM3 model values appended above the aircraft ceiling for the free troposphere portion of the profile and OMS balloon measurements added for the stratospheric part of the column. The profiles were then "simulated" using the a priori and the averaging kernels used in the FTS retrievals and were then integrated to obtain the column concentrations. The CO 2 columns were then reduced by 2.91%. Two instances (4 and 6) deviated more than expected from the one-to-one line; their FTS and Dimona footprints differ more in terms of influence regions than those of the other instances. As slant and vertical column differences may have been the cause of discrepancies between the FTS and the Dimona, this disparity was analyzed and verified with the Weather Research and Forecasting - Vegetation Photosynthesis and Respiration Model (WRF-VPRM). The difference between the slant and vertical column averaged volume mixing ratios turned out to be negligible.
For future FTS validation experiments using aircraft, this means that vertical profiles should be flown in close proximity to the FTS station given the spatial heterogeneity of carbon dioxide, but there is no need to adopt a slanted aircraft profile.
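The step of "simulating" a combined profile with the retrieval a priori and averaging kernel before integrating it can be illustrated with the commonly used column averaging kernel form, c_sim = c_a + Σ h_i a_i (x_i − x_a,i), where h_i are pressure weights. The sketch below assumes this form and a simple pressure-thickness weighting; it is not the procedure implemented for this study, and the layer grid, kernel and concentrations are synthetic.

```python
import numpy as np

def smoothed_column_vmr(profile, a_priori, col_avg_kernel, layer_dp):
    """Column averaged VMR a retrieval would report for a given profile.

    c_sim = c_a + sum_i h_i * a_i * (x_i - x_a_i), where h_i are
    pressure-thickness weights normalised to sum to one.
    """
    x = np.asarray(profile, dtype=float)
    xa = np.asarray(a_priori, dtype=float)
    a = np.asarray(col_avg_kernel, dtype=float)
    h = np.asarray(layer_dp, dtype=float)
    h = h / h.sum()
    return float(np.sum(h * xa) + np.sum(h * a * (x - xa)))

# Synthetic four-layer example (surface to stratosphere), illustration only
layer_dp = [300.0, 400.0, 200.0, 100.0]  # hPa thickness of each layer
a_priori = [380.0, 378.0, 377.0, 375.0]  # ppm
profile = [372.0, 379.0, 378.0, 374.0]   # ppm, e.g. aircraft + model + balloon
kernel = [1.0, 1.0, 1.0, 1.0]            # ideal kernel: plain pressure-weighted mean

xco2_sim = smoothed_column_vmr(profile, a_priori, kernel, layer_dp)
xco2_prior = smoothed_column_vmr(profile, a_priori, [0.0] * 4, layer_dp)
```

An ideal kernel of ones returns the pressure-weighted mean of the profile, while a kernel of zeros returns the a priori column; a real kernel lies between these limits and varies with altitude and solar zenith angle.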
Time series of carbon dioxide concentrations from the single receptor STILT model and from WRF-VPRM were then compared to the in-situ tower data. The models had difficulties capturing the variability in the in-situ data but had relatively small biases. The difference between the two models when their outputs were compared to the tower data comes from their use of different transport simulations and biosphere models.
Fig. 15. Decomposition of the STILT modeled CO 2 by altitude range and sources/sinks for instances 2, 3 (upper panel) and instance 6 (lower panel) at the FTS location. Instances 2 and 3 get more biospheric influences because their footprints are inland. Instance 6, on the other hand, receives less influence from the biosphere, since its footprint originates mostly from the ocean, producing a higher CO 2 value detected by the FTS than the aircraft (sampling over vegetation).
Using similar model parameters, the integrated multiple receptor STILT model and the WRF-VPRM column outputs were compared to the FTS column averaged volume mixing ratios of CO 2 . The models simulated the variability in the column better, which is expected since the column is less sensitive to local and synoptic changes in CO 2 concentrations than the tower data. However, the biases are larger. These biases are attributed to the scaling factor used in calibrating the FTIR data with the integrated (combined "simulated" aircraft, model and balloon) CO 2 data. The scaling factor was derived to a large extent from aircraft measurements that sampled within a 50 km distance from the FTS, and this introduces spatial heterogeneity in the carbon dioxide volume mixing ratios around the FTS.
Since identical model parameters were used for land-atmosphere fluxes when STILT was compared with in-situ tower data (single receptor) and with column measurements from the FTS (multiple receptors), STILT can be used as a "transfer standard". Using STILT to compare remotely sensed CO 2 data with tower measurements of carbon dioxide, and quantifying this comparison by means of the effective bias, provided a framework that allowed validating the FTIR retrievals against measurements made in-situ. Since these in-situ measurements are done frequently and at high accuracy on the global calibration scale, linking this scale with FTIR retrievals ultimately provides a calibration scale for remote sensing.