Calibration of TCCON column-averaged CO2: the first aircraft campaign over European TCCON sites

The Total Carbon Column Observing Network (TCCON) is a ground-based network of Fourier Transform Spectrometer (FTS) sites around the globe, where the column abundances of CO2, CH4, N2O, CO and O2 are measured. CO2 is constrained with a precision better than 0.25 % (1σ). To achieve a similarly high accuracy, calibration to World Meteorological Organization (WMO) standards is required. This paper introduces the first aircraft calibration campaign of five European TCCON sites and a mobile FTS instrument. A series of in-situ profiles traceable to WMO standards were obtained over European TCCON sites via aircraft and compared with retrievals of CO2 column amounts from the TCCON instruments. The results of the campaign show that the FTS measurements are consistently biased 1.1 % ± 0.2 % low with respect to WMO standards, in agreement with previous TCCON calibration campaigns. The standard a priori profile for the TCCON FTS retrievals is shown not to add a bias: the same calibration factor is obtained using aircraft profiles as a priori and using the TCCON standard a priori. With a calibration to WMO standards, the highly precise TCCON CO2 measurements of total column concentrations provide a suitable database for the calibration and validation of nadir-viewing satellites.

Correspondence to: J. Messerschmidt (messerschmidt@iup.physik.uni-bremen.de)


Introduction
Carbon dioxide (CO 2 ) is the most abundant anthropogenic greenhouse gas (GHG), and its increase is driving global climate change. To understand climate change, both the monitoring and the prediction of CO 2 abundances are important. Monitoring is necessary to improve our understanding of processes governing the CO 2 cycle, and it is also of major interest for measuring the success or failure of emission reduction or sequestration schemes. Prediction will become an even more important factor as the consequences of climate change will increasingly affect human and natural systems.
Currently the sources and sinks of CO 2 are determined by two different approaches: bottom-up and top-down. The former estimates the carbon budget by starting with process information at the scale of a few square meters, requiring upscaling to provide information at regional scales. The latter uses atmospheric inverse transport modeling to derive surface flux distributions from atmospheric concentration measurements. Until recently the top-down approach was solely based on a network of in-situ boundary layer measurement stations. This approach is limited by the sparse spatial coverage of the sampling sites (Marquis and Tans, 2008), but also by the dependence and sensitivity of sink estimates to the assumed vertical model transport (Baker et al., 2006;Stephens et al., 2007).
To improve the constraint on carbon cycle processes and for a global coverage, the space agencies JAXA, ESA, and NASA have launched an ambitious effort to map the integrated column of CO 2 and CH 4 by satellite observations (e.g. GOSAT, CarbonSat, OCO-2). The space-based observations can significantly improve the source-sink estimates by improving the description of the CO 2 distribution, provided they are sufficiently precise and accurate (Rayner and O'Brien, 2001).
TCCON is a worldwide network of ground-based FTSs that was founded in 2004. It has been widely used as a calibration and validation resource for satellite measurements (e.g. Reuter et al., 2011; Morino et al., 2011), but also provides insights into carbon cycle science (e.g. Yang et al., 2007; Keppel-Aleks et al., 2011). The individual TCCON sites are operated by various institutions around the world (e.g. Washenfelder et al., 2006; Wunch et al., 2011; Deutscher et al., 2010; Geibel et al., 2010). TCCON data products are column-averaged dry-air mole fractions, e.g. XCO2, XCH4, XN2O, XCO. TCCON measurements for CO2 show a precision better than 0.25 % (∼1 ppm) (1-σ). Under clear sky conditions, precisions of even 0.1 % (1-σ) can be achieved (Washenfelder et al., 2006; Messerschmidt et al., 2010; Deutscher et al., 2010). With its sufficiently precise measurements of total columns of greenhouse gases, FTIR spectrometry is currently the most suitable measurement technique to validate and calibrate satellite total column measurements.
To provide the link between satellite measurements and the ground-based in-situ network, a sufficiently accurate constraint of trace gas abundances is of critical importance. Referencing the TCCON measurements to the WMO calibration scale is achieved using aircraft and balloon profiling above the FTS stations.
The first calibration campaign of a TCCON site was described by Washenfelder et al. (2006). The calibration to the WMO reference scales revealed a bias of 2 % for the Park Falls site, and showed an excellent correlation. Deutscher et al. (2010) describe the calibration campaign of the TCCON site in Darwin, Australia, and found a low bias of about 1 % with respect to WMO standards. Additionally, the agreement between the Darwin data and the first calibration campaign data was shown. Wunch et al. (2010) included further calibration campaigns of TCCON sites in the United States of America, Japan, and New Zealand and harmonized the calibration method for all sites. All calibration campaigns consistently yield a single calibration factor of 0.989 ± 0.002 (2-σ) for CO2.
This paper introduces the first aircraft calibration campaign of European TCCON sites with the same TCCON data retrieval as used by Wunch et al. (2010). During the campaign, in-situ profiles to high altitude were obtained with an aircraft above five European TCCON sites and a mobile FTS system in Jena, Germany. An overview of the campaign and the results for CO2 is presented in this paper. The results show a calibration factor for CO2 at the European TCCON sites of 0.989 ± 0.002 (2-σ), consistent with other TCCON sites.

The IMECC campaign
The EU project, Infrastructure for Measurement of the European Carbon Cycle (IMECC), is an Integrated Infrastructure Initiative within the Sixth Framework Programme of the European Commission. The aim of the IMECC project is to build the infrastructure necessary for the characterization of the European carbon balance. Thirty partners within 15 countries are contributing for four years (2007-2011) in three main initiatives. The first focuses on the improved comparability of European CO2 measurements. The second targets the establishment of a broad, co-ordinated and accessible European CO2 database. The implementation of new measurement approaches is supported in the third initiative.
The first airborne campaign to calibrate FTS sites in Europe with respect to WMO standards (Zhao and Tans, 2006) was funded by the IMECC project. Organization of the flight tracks, the aircraft instrumentation and the post-flight analysis of the aircraft in-situ data was undertaken by the Max Planck Institute for Biogeochemistry (MPI-BGC). The main purpose of the campaign was the calibration of the following European TCCON sites: Bialystok (Poland), Bremen (Germany), Garmisch (Germany), Karlsruhe (Germany), and Orléans (France), and the mobile FTS system located in Jena (Germany), which was built to be deployed at Ascension Island. Figure 1 shows the locations of the calibrated sites and the airbase of the IMECC campaign in Hohn, Germany.
The calibration flights took place between 28 September and 9 October 2009. The in-situ instrumentation was on board a Learjet 35A, operated by enviscope GmbH (Frankfurt a. M., Germany), with a flight ceiling of 13 km. Near the European TCCON sites, high altitude in-situ profiles were taken, typically from 500 m up to 13 000 m. The lowest 5 km were mostly flown in spirals, however, due to e.g. air traffic restrictions, this approach had to be modified at some sites. A typical aircraft profile is shown in Fig. 2. Additional dips were performed during the transfer flights from the airbase. Overall, eight flights were made on four days. In about 20 flight hours, 16 vertical profiles were generated over the European TCCON sites at solar zenith angles (SZAs) ranging from 51 to 84 degrees. The flight overpasses are listed in Table 1. During all flights, in-situ data were taken for CO 2 , CH 4 , H 2 O, CO, N 2 O, H 2 , SF 6 .
The FTS sites were operated at the time of the campaign by the individual responsible working groups. In the following section, the different sites are described in detail.

Calibrated European TCCON sites
Bialystok, Poland. The FTS facility in Bialystok is operated by the Institute of Environmental Physics (IUP), Bremen, Germany in close cooperation with AeroMeteoService, Bialystok, Poland. Bialystok represents the easternmost measurement site within the European Union. An on-site tall tower (300 m) provides boundary layer in-situ measurements. Bialystok and Orléans, France are the only sites with co-located FTS and tall tower measurements in Europe. Additionally, CO 2 profiles up to 2.5 km altitude are measured from small aircraft regularly. The FTS instrument was funded by the EU-projects GEOmon (Global Earth Observation and Monitoring) and IMECC and has been in operation since March 2009. The FTS in Bialystok is fully automated, and is controlled via remote access (Messerschmidt et al., 2010).

Aircraft instrumentation
For continuous measurements of CO 2 , CH 4 and H 2 O, the aircraft was equipped with a Wavelength-Scanned Cavity Ring Down Spectrometer (CRDS) (model G1301-m, Picarro Inc., Sunnyvale, CA), providing mixing ratio data recorded at ∼0.5 Hz intervals. The analyzer was calibrated against WMO reference gases in the laboratory before and after the airborne campaign, providing an accuracy of 0.1 ppm for CO 2 and 2 ppb for CH 4 . Measurements were made in wet air, and dry-air mixing ratios were derived following the method of Chen et al. (2010). CO data were measured with an Aero-Laser instrument (model AL5002), which was calibrated during flight using WMO traceable standards. The instrument provides dry-air mixing ratios at 1 Hz frequency with an accuracy of 2 ppb (Gerbig et al., 1999). Additionally, up to eight flasks per profile were taken at different altitude levels, from which CO 2 , CH 4 , N 2 O, CO, H 2 and SF 6 were analyzed, validating the quality of the continuous measurements. The flasks were analyzed post-flight at the MPI-BGC's gas analysis lab. Supplemental meteorological data (pressure, temperature, latitude, longitude, altitude, distance to site, and time) were also recorded.

European TCCON data
In the FTS instruments JEN, KAR, GAR and BRE the Optics User Software (OPUS version 6.5), a program provided by Bruker, was used to record the interferograms. In BIK and ORL, the raw interferogram data were obtained directly from the embedded web server inside the instruments. To calculate the spectra from the interferograms, we used the Interferogram Processing Package (IPP), which was developed at the Jet Propulsion Laboratory (JPL) (Pasadena, USA) within the framework of TCCON. In the former case, OPUS-IPP (version 20100123) and in the latter case, SLICE-IPP (version 20100123) was used. Both software packages perform the same Fast Fourier Transformation; the different names only indicate the different formats of the interferograms. Additionally, they correct the spectra for solar intensity variations, caused e.g. by passing clouds. GFIT (version 4.4.10), a nonlinear least-squares spectral fitting algorithm, developed by G. Toon (JPL), was used for the retrieval of the trace gas column amounts from the measured spectra. The tropospheric portion of the a priori CO2 profile, used in GFIT, is based on an empirical model fitting GLOBALVIEW CO2 data (GLOBALVIEW-CO2, 2010). The tropopause height is determined from the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis. The stratospheric a priori CO2 decreases with altitude above the tropopause height, depending on the age of the air, based on measurements by Andrews et al. (2001). In order to eliminate a potential bias introduced by the a priori profiles used in the standard TCCON retrieval, the assembled aircraft profiles were used as GFIT a priori. The column-averaged dry-air mole fraction (DMF) of the measured gases, e.g. XCO2, can be calculated from the retrieved column amounts by

$$X_{\mathrm{CO_2}} = 0.2095 \cdot \frac{\mathrm{column}_{\mathrm{CO_2}}}{\mathrm{column}_{\mathrm{O_2}}} \qquad (1)$$

The units of XCO2 are µmol mol−1, commonly expressed as parts per million (ppm).
Taking the ratio of the atmospheric CO 2 and O 2 columns minimizes some systematic and correlated errors present in both retrieved CO 2 and O 2 columns (e.g. pressure errors, influence of the instrumental line shape, tracker pointing errors, Washenfelder et al., 2006;Wunch et al., 2011). The CO 2 column is retrieved for two CO 2 bands centered at 6228 cm −1 and 6348 cm −1 , and the RMS-error weighted mean is used to calculate X CO 2 . Column O 2 is retrieved from the electronic band centered at 7882 cm −1 . A correction to the airmass dependence, supplied with GFIT and described in Wunch et al. (2011) and Deutscher et al. (2010), was added. Data outside the ranges [0.20-0.22] for the dry air mole fraction of O 2 , as well as outside the range [350 ppm-400 ppm] for CO 2 are regarded as outliers in the TCCON standard retrieval and discarded. For the IMECC campaign, the variation of the FTS measurements during the time of the overpasses was used as a filter. Only individual FTS measurements were considered that had a standard deviation about the mean X CO 2 less than the standard TCCON precision of 0.25 %. Fewer than 10 % of the data points were removed by this filter.
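The column ratio and the outlier filter described above can be illustrated with a minimal Python sketch. It is not part of GFIT. The function names, the dry-air O2 mole fraction of 0.2095, the inverse-variance form chosen for the "RMS-error weighted mean", and the example column values are assumptions made for illustration only.

```python
# Assumed dry-air mole fraction of O2 used to scale the column ratio.
O2_DRY_AIR_MOLE_FRACTION = 0.2095


def xco2_from_columns(column_co2, column_o2):
    """Return XCO2 in ppm from CO2 and O2 columns given in the same units."""
    return O2_DRY_AIR_MOLE_FRACTION * column_co2 / column_o2 * 1e6


def error_weighted_mean(values, errors):
    """Combine retrievals from several bands, weighting each by 1/error^2.

    The exact weighting used by GFIT is not given in the text; inverse
    variance weighting is an assumption of this sketch.
    """
    weights = [1.0 / (e * e) for e in errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def passes_standard_filter(xco2_ppm, xo2):
    """Apply the outlier ranges quoted in the text for XCO2 and the O2 DMF."""
    return 350.0 <= xco2_ppm <= 400.0 and 0.20 <= xo2 <= 0.22


if __name__ == "__main__":
    # Hypothetical columns in molecules per cm^2.
    x = xco2_from_columns(8.0e21, 4.4e24)
    print(round(x, 2), passes_standard_filter(x, 0.2095))
```

In this sketch the O2 DMF is passed in directly; in practice it would be derived from the retrieved O2 column and an independent dry-air column.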

On-site in-situ measurements at European TCCON sites
At Bremen, Garmisch, and Karlsruhe no on-site in-situ measurements were available. At the other three sites, Bialystok, Jena, and Orléans on-site in-situ facilities are installed and used in the campaign.
Bialystok. With a height of more than 300 m, the tall tower located at the Bialystok site is one of the tallest in Europe. A variety of atmospheric trace gases have been sampled at five levels (4 m, 23 m, 90 m, 180 m, 300 m) quasi-continuously since 2005. CO 2 volume mixing ratio is measured with a LI-7000 non-dispersive infrared (NDIR) gas analyzer from LI-COR, traceable to WMO standards with an accuracy of 0.02 ppm. Further details on additional instruments can be found in Popa et al. (2010).

Jena. A LI-COR LI-6262 NDIR gas analyzer is mounted on the weather station on the roof of the MPI-BGC in Jena, providing continuous CO2 measurements, traceable to WMO standards with an accuracy of 0.5 ppm.
Orléans. The FTS site near Orléans is located next to the Trainou tall tower observatory, a 180 m tall tower that provides quasi-continuous in-situ measurements of CO2 and other trace gases from three levels (50 m, 80 m, 180 m). CO2 is measured with a LI-6252 NDIR gas analyzer from LI-COR.

Aircraft in-situ data
The airborne in-situ data have been merged in the MPI-BGC labs with the flask analysis data using weighting functions corresponding to the flow rate during flask sampling. The averaged concentrations agree within WMO recommendations for CO2, CH4 and CO (WMO, 2009); the mean difference is 0.06 ppm for CO2. The exceptions are the first two profiles in Bialystok and the first profile in Orléans. Above 8 km, these flights were affected by a small leak in the pump that provided sample gas to the CRDS, causing the CO2 and H2O measurements to be contaminated by cabin air. CO2 for those portions of the profiles was taken from the flask data. CH4 measurements by CRDS showed no significant difference during those periods. For each profile the quality-controlled and corrected data were averaged within pressure intervals of 5 hPa. The uncertainties given for the mixing ratios encompass the uncertainty due to interpolation across missing values (e.g. due to instrument calibration periods), and also include the statistical uncertainty from sampling only a limited number of seconds at each pressure interval. In addition, an uncertainty is added which is related to the calibration of the standard gases against WMO primary gases (0.1 ppm for CO2, 2 ppb for CH4, and 2 ppb for CO). Given the aircraft ceiling of 13 km, the aircraft measurements covered roughly 80 % of the total column in terms of pressure measured from the ground.
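The averaging within 5 hPa pressure intervals can be sketched as follows. This is an illustrative Python example and not the MPI-BGC post-flight processing; the function name, the bin-edge convention (bins aligned to multiples of the bin width) and the sample values are assumptions.

```python
from collections import defaultdict


def bin_by_pressure(pressures_hpa, values, width_hpa=5.0):
    """Average samples within fixed-width pressure intervals.

    Returns a list of (bin centre in hPa, mean value, number of samples),
    ordered from the highest pressure (lowest altitude) upwards.
    """
    bins = defaultdict(list)
    for p, v in zip(pressures_hpa, values):
        # Bin index: which multiple of width_hpa the sample falls into.
        bins[int(p // width_hpa)].append(v)

    profile = []
    for index in sorted(bins, reverse=True):
        samples = bins[index]
        centre = (index + 0.5) * width_hpa
        profile.append((centre, sum(samples) / len(samples), len(samples)))
    return profile


if __name__ == "__main__":
    # Hypothetical CO2 samples (ppm) at four pressures (hPa).
    for row in bin_by_pressure([1001, 1003, 996, 998], [390, 392, 388, 386]):
        print(row)
```

The sample count per bin is returned because the text notes that the statistical uncertainty depends on how many seconds of data fall into each pressure interval.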

Completing the in-situ profiles
In order to compare the FTS data with the high altitude in-situ profiles, the aircraft data have to be extended down to the ground and up into the stratosphere to cover the CO2 total column. Therefore the airborne in-situ data are combined with on-site in-situ measurements, if provided (BIK, ORL, JEN), or extrapolated to the surface with the lowest aircraft measurement (BRE, KAR, GAR). To estimate the stratospheric CO2 decrease with the age of air (Sect. 3.1), the standard GFIT a priori CO2 profiles are used for the extension in the stratosphere. The tropopause height is determined from the NCEP reanalysis, supplied in the TCCON standard retrieval. For aircraft profiles that were measured higher than the tropopause, the standard GFIT a priori CO2 profile was attached to the highest aircraft measurement. For aircraft profiles that were not measured up to the tropopause pressure, the aircraft profile was extended with the most contemporary GFIT a priori profile. At the Karlsruhe site, only one overpass was carried out, and the upper troposphere was filled in with the highest aircraft measurement. All assembled in-situ profiles are shown for each site in Figs. A1-A4. The aircraft measurements are given in red. The GFIT a priori profiles fitted in CO2 to the aircraft measurements are shown in blue. Extensions for missing measurements in the upper troposphere are shown in black, and the contemporary GFIT a priori profiles used are shown in green. The NCEP tropopause height is indicated by a thin red line. The original GFIT a priori profiles are shown with a thin dotted black line. The resulting uncertainties are discussed in Sect. 3.6.

Integration of the assembled in-situ profiles
The completed in-situ profiles over the European TCCON sites can be compared with the FTS DMF when they are integrated to compute the column-averaged CO2 DMF. Rodgers and Connor (2003) introduced a method to compare two instruments, of which one has much higher vertical resolution than the other. This approach was adapted to the TCCON retrieval setup by Wunch et al. (2010) and is replicated here. The averaging kernels are needed for the comparison between two instruments. The averaging kernel matrix represents the changes in a retrieved profile at one level i due to a perturbation to the true profile at another level j. Since GFIT does a profile scaling retrieval (PSR), the averaging kernel matrix reduces to a vector representing the sensitivity of the retrieved total column to perturbations of the partial columns at the various atmospheric levels. In GFIT the averaging kernels are calculated with the scaled profiles, therefore the FTS retrieval scaling factor, γ, has to be taken into account:

$$c_s = \gamma\, c_a + a^T \left( x_h - \gamma\, x_a \right)$$

with c_s: smoothed DMF of the aircraft, γ: FTS retrieval scaling factor, c_a: FTS a priori DMF, a^T: FTS column averaging kernel, x_h: aircraft profile, x_a: FTS a priori profile.

The derivation of the equation of the column-averaged aircraft CO2 DMF can be found in Wunch et al. (2010) and yields:

$$X_{\mathrm{CO_2}}^{\text{in-situ}} = \frac{\gamma \cdot VC^{\text{a priori}}_{\mathrm{CO_2}} + \left( VC^{\text{aircraft}}_{\mathrm{CO_2},ak} - \gamma \cdot VC^{\text{a priori}}_{\mathrm{CO_2},ak} \right)}{VC_{\text{air}}}$$

with γ: FTS retrieval scaling factor, VC_air: total column of dry air, VC^{a priori}_{CO2}: total vertical a priori column of CO2, VC^{aircraft}_{CO2,ak}: column averaging kernel-weighted vertical column of the aircraft, VC^{a priori}_{CO2,ak}: column averaging kernel-weighted vertical a priori column.
Variability in the averaging kernels is primarily driven by changing solar zenith angles. Therefore the averaging kernel from the FTS measurement nearest in time to the central time of the overpass was used for the smoothing. This averaging kernel is the mean over both CO 2 retrieval windows. The column averaging kernel vectors used for the integration of the 16 in-situ profiles during the IMECC campaign are shown in Fig. 3.
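The kernel smoothing described in this section can be sketched in Python under simplifying assumptions. The layer weights are taken to be proportional to the pressure thickness of each layer, which ignores water vapour and the altitude dependence of gravity. The function names and the two-layer example are hypothetical and are not the GFIT implementation.

```python
def pressure_weights(edges_hpa):
    """Fraction of the total column in each layer.

    edges_hpa lists the layer edge pressures from the surface upwards.
    Assumption: the dry-air column in a layer is proportional to its
    pressure thickness.
    """
    total = edges_hpa[0] - edges_hpa[-1]
    return [
        (edges_hpa[i] - edges_hpa[i + 1]) / total
        for i in range(len(edges_hpa) - 1)
    ]


def smoothed_dmf(gamma, x_apriori, x_aircraft, kernel, weights):
    """Column-averaged DMF of the aircraft profile as seen by the FTS.

    Implements c_s = gamma * c_a + a^T (x_h - gamma * x_a), where the
    product with the column averaging kernel includes the layer weights.
    """
    c_a = sum(w * xa for w, xa in zip(weights, x_apriori))
    correction = sum(
        w * a * (xh - gamma * xa)
        for w, a, xh, xa in zip(weights, kernel, x_aircraft, x_apriori)
    )
    return gamma * c_a + correction


if __name__ == "__main__":
    w = pressure_weights([1000.0, 500.0, 0.0])
    print(smoothed_dmf(1.01, [385.0, 385.0], [390.0, 380.0], [1.0, 1.0], w))
```

Two limiting cases serve as a check on the sketch: with a kernel of one at every level the result reduces to the plain pressure-weighted average of the aircraft profile, and when the aircraft profile equals the scaled a priori the result is γ times the a priori DMF regardless of the kernel.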

Uncertainty discussion
Measurements are affected by three types of error sources: random effects, known systematic effects and unknown systematic effects. Random effects result in a measurement-to-measurement variability and can be quantified by the standard deviation. Known systematic effects should not simply be encompassed by increasing the estimated uncertainty, according to the Joint Committee for Guides in Metrology (JCGM, 2008) recommendations. Rather, they should be corrected and the uncertainty in the correction included in the total uncertainty of the corrected quantity. The total uncertainty is calculated as the sum in quadrature of random effects and the uncertainty of the corrected known systematic effects. Unknown systematic effects are revealed by comparison with independent measurements and can be corrected against a calibration standard, such as the WMO standards, which is the objective of this paper.

Fig. 3. All column averaging kernels for CO2 used for the integration during the IMECC campaign. The colors indicate the associated site at which the FTS measurements were taken. Due to different solar zenith angles (SZAs), the averaging kernels vary for the various sites and overpass times. The SZAs are given in Table 5.

Uncertainty of FTS-derived DMFs
FTS measurements are known to be mainly affected by the following systematic effects. Firstly, the a priori profiles can be wrong due to false estimations of the temperature, pressure or water vapour profiles. Furthermore, the shape of the a priori volume mixing ratio profiles can be wrong. Secondly, the sun tracker pointing at the middle of the sun can be offset. Thirdly, the instrumental line shape (ILS) can be distorted due to shear or angular misalignment of the instrument or field of view (FOV) failure. The calculation of XCO2 by Eq. (1) reduces some of the effects that are common to both gases (solar tracking pointing errors, zero level offsets, ILS errors or surface pressure measurement errors). Furthermore, it is known that XCO2 exhibits an airmass dependency, resulting in 1 % larger XCO2 at low solar zenith angles (SZA) than at high SZA. This dependency is removed in the standard GFIT retrieval by a single empirical correction (Sect. 3.1). A quantification of realistic perturbations of the a priori profile, the tracking and the ILS was done by Wunch et al. (2011). It was estimated that XCO2 in total would be affected by 0.18 % for low SZA (20°) and 0.13 % for high SZA (70°).
Within the IMECC campaign, potential systematic effects introduced by the a priori profiles were eliminated by using the assembled aircraft profiles as a priori profiles (Sect. 3.1). Concerning the quality of the solar tracking, a suitable indicator is the pointing error, which is the deviation from pointing at the middle of the sun and can be estimated from the Doppler shift. The ILS is regularly monitored in all TCCON FTS instruments (Sect. 3.1), and misalignments could further be seen as characteristic artifacts in the fitting residuals. All FTS instruments and their solar trackers were optimized prior to the IMECC campaign, and hence systematic effects due to the pointing error and the ILS were minimized.
One known source of systematic effects on FTS measurements was neither diminished prior to the campaign nor accounted for in the retrieval. Messerschmidt et al. (2010) showed that collocated FTS instruments agree within 0.07 %, but only after correcting for a systematic effect introduced by a mis-sampling of the internal reference laser provided in the commercially available FTSs. Briefly, a periodic laser mis-sampling leads to so-called ghosts (artificial spectral lines), which are mirror images of the original spectral lines. The influence of the ghosts on the retrieved XCO2 was quantified as a function of the ratio of the ghost and parent line intensities, called the ghost/parent line ratio (GPR). For a typical GPR, the retrieved XCO2 is affected by about 1 ppm. Therefore, a correction scheme was introduced for solar measurements afflicted with ghosts (Messerschmidt et al., 2010). The effect on the retrieved XCO2 was quantified and this correction applied to all measurements during the IMECC campaign.
The Messerschmidt et al. (2010) correction scheme does not predict the sign of the ghosts, which means that it is ambiguous as to whether the ghosts lead to an over- or an underestimation of the retrieved XCO2. For three of the FTS instruments (BIK, BRE, ORL), this sign was inferred from the side-by-side measurements detailed by Messerschmidt et al. (2010). For the Garmisch and Karlsruhe FTS instruments, the ghosts were minimized prior to the aircraft campaign and did not introduce a large systematic effect. The Jena instrument could not be corrected prior to the aircraft campaign, and had significant ghosts, which affected the retrievals. The results suggest an overestimation of XCO2. However, as we cannot be sure of the sign, we investigate two "worst-case" scenarios in calculating the scaling factors for the FTS relative to the in-situ profile in Sect. 4. These correspond to all ghosts (Table 3) leading to (a) an under- and (b) an overestimation of the retrieved XCO2. The difference between these scenarios is used to check the correction of the systematic effect by the ghost correction scheme in the calculation of scaling factors.
One further source led to systematic effects: due to poor weather in Jena and Bremen, not all overpasses could be carried out at the same time as the FTS data were measured (BRE_1, JEN_3, JEN_4). In all three cases the delay was two hours, and the expected variation due to the diurnal CO2 cycle was accounted for as a systematic effect. At both sites, the magnitude of the diurnal cycle was estimated from the trend of the FTS measurements on the same day. The diurnal cycle was calculated for BRE_1 from the trend of the FTS data taken for a 2 h time period prior to the overpass, and for JEN_3 and JEN_4 from the trend of the FTS data measured for a 2.5 h time period after the overpass. The trends were estimated with the FTS data that met the filter criteria introduced in Sect. 3.1 and extrapolated to the overpass time. On-site in-situ measurements in Jena showed a variability of ±0.5 ppm for the extrapolated time period and no significant trend that would indicate further influences, e.g. from local pollution or changing meteorological conditions. For Bremen no on-site in-situ measurements exist. The BRE_1, JEN_3 and JEN_4 data are not included in the calculation of the calibration factor, due to the remaining lack of information during the overpasses, but the results will be discussed in Sect. 4.2.

Table 3. Systematic effects due to ghosts and a time delay between the overpass and FTS measurements, and the uncertainty sources contributing to the total uncertainty of the FTS measurements. The total uncertainty accounts for the variability of the FTS measurements during the overpasses, an uncertainty in the estimation of the expected variation due to the diurnal cycle, and the uncertainty in the ghost estimation, according to Messerschmidt et al. (2010).
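The extrapolation of the FTS trend to the overpass time can be illustrated with a short Python sketch. The text does not state the functional form of the trend; a straight-line least-squares fit is assumed here, and the function name and the example values are hypothetical.

```python
def extrapolate_linear_trend(times_h, xco2_ppm, target_time_h):
    """Fit a straight line to XCO2 versus time and evaluate it elsewhere.

    Returns (extrapolated XCO2 in ppm, slope in ppm per hour).
    Assumption: the diurnal variation over the window is linear in time.
    """
    n = len(times_h)
    t_mean = sum(times_h) / n
    x_mean = sum(xco2_ppm) / n
    covariance = sum(
        (t - t_mean) * (x - x_mean) for t, x in zip(times_h, xco2_ppm)
    )
    variance = sum((t - t_mean) ** 2 for t in times_h)
    slope = covariance / variance
    return x_mean + slope * (target_time_h - t_mean), slope


if __name__ == "__main__":
    # Hypothetical FTS data taken before a delayed overpass at 14:00.
    value, slope = extrapolate_linear_trend(
        [10.0, 11.0, 12.0], [385.0, 384.8, 384.6], 14.0
    )
    print(round(value, 2), round(slope, 2))
```

The difference between the extrapolated value and the mean of the measured window would be the systematic correction in this sketch, with its own uncertainty entering the total uncertainty.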
Random effects, such as noise and variations in the solar tracker and instrument performance, are quantified by the measurement to measurement variability during the overpasses.
The total uncertainty for the FTS data is the sum in quadrature of the contributing standard uncertainties: the standard deviation about the mean during the overpass, the standard uncertainty of the ghost estimation and the standard uncertainty of the diurnal cycle estimation. Table 3 summarizes the magnitude of the systematic corrections, the uncertainties and the total uncertainty for all overpasses.
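As a minimal sketch of this combination, the following Python example computes the overpass mean and the total uncertainty as the sum in quadrature of the three contributions named above. The function name and the numerical values are hypothetical and are not taken from Table 3.

```python
import math
import statistics


def fts_overpass_summary(xco2_ppm, ghost_unc_ppm=0.0, diurnal_unc_ppm=0.0):
    """Return (mean XCO2, total standard uncertainty) for one overpass.

    The random contribution is the sample standard deviation about the
    mean; it is combined in quadrature with the uncertainties of the
    ghost and diurnal-cycle estimations.
    """
    mean = statistics.mean(xco2_ppm)
    spread = statistics.stdev(xco2_ppm)
    total = math.sqrt(spread**2 + ghost_unc_ppm**2 + diurnal_unc_ppm**2)
    return mean, total


if __name__ == "__main__":
    # Hypothetical FTS measurements (ppm) during one overpass.
    mean, total = fts_overpass_summary([384.0, 384.4, 384.8], ghost_unc_ppm=0.3)
    print(round(mean, 2), round(total, 2))
```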

Uncertainty of the assembled in-situ data
The uncertainty of the assembled in-situ data is derived from the uncertainty of the aircraft measurements, the uncertainties in extrapolating the profiles and the usage of contemporary profiles (Table 4).
The GFIT a priori CO2 profiles are used to extend the in-situ data above the tropopause, as explained in Sect. 3.4. Thus a typical profile of mean age (Andrews et al., 2001) above the local tropopause is used to calculate the lag of stratospheric CO2 values with respect to mean tropospheric values. Furthermore, a decrease of the seasonal cycle with altitude is taken into account. Seasonally resolved aircraft measurements during the SPURT project (Engel et al., 2006) revealed that the seasonal cycle in the lowermost stratosphere (i.e. the region of the stratosphere between the local tropopause and the 380 K isentrope) is not only attenuated with increasing vertical distance to the local tropopause but is also shifted with respect to the troposphere (Hoor et al., 2004; Bönisch et al., 2008, 2009; Hintsa et al., 1998). The seasonal cycle magnitude can be as large as 3 ppm at the mid-latitude tropopause and decreases to about half of that value at about 50 K potential temperature above the local tropopause. The amplitude and timing of the seasonal cycle at the tropopause are captured quite well in the a priori profiles, with a maximum in May. The variability in this area is, however, very high, especially when using pressure coordinates. Therefore a conservative uncertainty estimate is used by assuming that the CO2 seasonal cycle in the lowermost stratosphere cannot be correctly represented and that this seasonal cycle leads to an additional uncertainty of the CO2 a priori profile of about 2 ppm, which is a typical amplitude of the seasonal cycle in the lowermost stratosphere. This uncertainty is independent of contributions from the absolute uncertainty of the mean age profile, which is estimated to be about 0.3 ppm. The total uncertainty of the stratospheric CO2 values is thus estimated as the sum in quadrature and is on the order of 2.02 ppm.
For some overpasses, the profiles could not be measured up to the tropopause. If no contemporary aircraft profile was available, the upper troposphere was filled with the highest aircraft measurement; e.g. as clearly seen in Fig. A2. The CO 2 variability in the upper troposphere, measured at the European TCCON sites, is within 2 ppm and applied as uncertainty for the filling. If a contemporary aircraft profile was available, it was used to estimate the profile above the last aircraft measurement (Figs. A1, A3, A4). It is assumed that the profile can therewith be better estimated than by using the highest aircraft measurement and an uncertainty of 1.5 ppm is assigned.
For the aircraft data, the standard uncertainty provided by the post-flight analysis at the MPI-BGC's lab was applied. The uncertainties given for the mixing ratios contain uncertainties from extension with the lowest aircraft measurement to the surface pressure, as well as from interpolation across missing values (e.g. due to instrument calibration periods). Also included is the statistical uncertainty from sampling only a limited number of seconds at each pressure interval. In addition, an uncertainty related to the calibration of the standard gases (working tanks) against WMO primary gases is added. The mean standard deviation for the IMECC campaign aircraft profiles is 0.11 ppm. The total uncertainty is calculated from the sum in quadrature of these contributing uncertainties weighted by their relative contribution to the completed profile in terms of pressure. Due to poor weather conditions a profile was not flown above the Karlsruhe TCCON site. Aircraft measurements were, however, recorded during a stop-over 50 km to the south of the site. The Karlsruhe data are therefore treated similarly to the other overflights, but because of these exceptional circumstances, they are not included in the calculation of the calibration factor. They will be discussed in Sect. 4.2.

Table 4. Contributing uncertainties to the total uncertainty of the assembled in-situ data. The total uncertainty is calculated by the sum in quadrature of the weighted fraction in terms of pressure with respect to the completed in-situ profile.

Uncertainty source                 Uncertainty [ppm]
stratospheric extrapolation        2.02
missing tropospheric values        2.00
usage of contemporary profile      1.50
mean aircraft profile              0.11
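One possible reading of this pressure-weighted sum in quadrature is sketched below in Python. The interpretation (each contribution scaled by the fraction of the column it covers before squaring), the function name and the 80 %/20 % split in the example are assumptions; only the individual uncertainties of 0.11 ppm, 2 ppm and 0.3 ppm come from the text.

```python
import math


def weighted_quadrature_sum(contributions):
    """Combine (pressure fraction, uncertainty in ppm) pairs in quadrature.

    Assumption: each uncertainty is scaled by the fraction of the total
    column, in terms of pressure, to which it applies.
    """
    return math.sqrt(sum((f * u) ** 2 for f, u in contributions))


if __name__ == "__main__":
    # Stratospheric uncertainty: seasonal cycle (2 ppm) and mean age (0.3 ppm).
    stratosphere = math.sqrt(2.0**2 + 0.3**2)
    print(round(stratosphere, 2))

    # Hypothetical profile: aircraft data cover 80 % of the column,
    # the stratospheric extension the remaining 20 %.
    total = weighted_quadrature_sum([(0.8, 0.11), (0.2, stratosphere)])
    print(round(total, 2))
```

The first printed value reproduces the 2.02 ppm quoted for the stratospheric extrapolation; the second depends on the assumed pressure fractions and is shown only to illustrate how a small weight reduces the impact of a large uncertainty.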
The resulting uncertainties for the FTS measurements and for the integrated column-averaged assembled aircraft CO2 profiles are listed for all overpasses in Table 5.
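As an illustration of the pressure-weighted sum in quadrature, the following Python sketch combines per-section uncertainties into a total. It assumes that each uncertainty is multiplied by its pressure fraction before squaring, which is one reading of the description. The pressure fractions in the example are hypothetical; only the ppm values are taken from Table 4.

```python
import math

def total_uncertainty(contributions):
    """Pressure-weighted sum in quadrature.

    contributions: list of (sigma_ppm, pressure_fraction) pairs, where the
    pressure fractions describe how much of the completed profile each
    section covers and must add up to 1.
    """
    fractions = [fraction for _, fraction in contributions]
    if abs(sum(fractions) - 1.0) > 1e-6:
        raise ValueError("pressure fractions must sum to 1")
    # Assumption: weight each uncertainty by its fraction, then add in quadrature.
    return math.sqrt(sum((fraction * sigma) ** 2 for sigma, fraction in contributions))

# Hypothetical split of the column; only the ppm values come from Table 4.
example = [
    (2.02, 0.20),  # stratospheric extrapolation
    (2.00, 0.10),  # missing tropospheric values
    (0.11, 0.70),  # section measured by the aircraft
]
sigma_total = total_uncertainty(example)
```

With these illustrative fractions the extrapolated stratosphere dominates the total, even though it covers only a fifth of the column in terms of pressure.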

Comparison to previous TCCON calibrations
The IMECC results can be compared with the previous TCCON calibrations published in Wunch et al. (2010) by assuming a linear relationship with zero intercept. The results are plotted together with the previous TCCON calibrations in Fig. 4. The IMECC data are shown in red and the previous TCCON calibrations in green. The best fit to the IMECC data is calculated by considering errors on both the x- and y-axes (York et al., 2004) and is indicated with a red line. The fit for the previous TCCON calibrations is shown with a green line. The thin blue lines show the best fits under the worst-case ghost scenarios. The resulting scale factors are reported as the slope of the best fit ± two standard deviations. The scale factor, the best fit uncertainty and the scale factor uncertainty are listed in comparison to the previous TCCON calibrations in Table 6. The worst-case ghost scenarios yield scale factors that lie within the uncertainty of the IMECC calibration scale factor, which implies that the ghost correction scheme correctly eliminates the systematic effect. The larger difference for the upper bound (maximum overestimation) is mostly due to the large ghosts found in the Jena instrument (XCO2 + 1.63 ppm).

Fig. 4 (caption fragment). [...] Wunch et al. (2010). The scaling factors agree within their uncertainties. This suggests that one global scaling factor can be used for CO2 for all TCCON sites worldwide. The thin blue lines indicate the best fits to the worst-case ghost scenarios.
The IMECC calibration scale factor, calculated here to be 0.989 ± 0.002 (2σ), agrees with the Wunch et al. (2010) calibration (0.989 ± 0.002, 2σ). The IMECC calibration supports a single calibration scale factor, which can be applied independent of site and season. However, the previous TCCON calibrations did not include a correction of potential ghosts in the FTS spectra.
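A straight-line fit through the origin with uncertainties on both axes can be sketched as follows. This is not the authors' fitting code: it is a minimal iteration of the condition that the York et al. (2004) equations reduce to when the errors are uncorrelated and the intercept is fixed at zero. The function name, the slope-uncertainty approximation, and the sample data are assumptions made for illustration.

```python
import numpy as np

def fit_scale_factor(x, y, sx, sy, tol=1e-12, max_iter=100):
    """Fit y = b * x with uncertainties sx, sy on both axes (zero intercept).

    Minimises sum((y - b*x)**2 / (sy**2 + b**2 * sx**2)) by fixed-point iteration.
    Returns the slope b and an approximate 1-sigma slope uncertainty.
    """
    x, y, sx, sy = (np.asarray(v, dtype=float) for v in (x, y, sx, sy))
    b = np.sum(x * y) / np.sum(x * x)  # unweighted starting guess
    for _ in range(max_iter):
        w = 1.0 / (sy**2 + b**2 * sx**2)        # combined weight per point
        beta = w * (x * sy**2 + b * y * sx**2)  # York-type adjusted abscissa
        b_new = np.sum(w * beta * y) / np.sum(w * beta * x)
        converged = abs(b_new - b) < tol
        b = b_new
        if converged:
            break
    w = 1.0 / (sy**2 + b**2 * sx**2)
    # Approximation: uses the observed x values rather than the adjusted ones.
    sigma_b = 1.0 / np.sqrt(np.sum(w * x**2))
    return b, sigma_b

# Synthetic, noise-free data (hypothetical): FTS = 0.989 * aircraft.
aircraft = np.array([385.0, 387.0, 389.0, 391.0])
fts = 0.989 * aircraft
slope, slope_sigma = fit_scale_factor(
    aircraft, fts, sx=np.full(4, 0.4), sy=np.full(4, 0.3)
)
```

In this convention the aircraft columns are on the x-axis and the FTS columns on the y-axis, so the slope plays the role of the scale factor, and the text's "± two standard deviations" would correspond to `2 * slope_sigma`.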

Calibration of the TCCON standard XCO2 product
FTS data collected during the IMECC campaign were also fitted using the standard GFIT a priori profiles in order to analyze the spectra in a way consistent with the standard TCCON retrieval. This approach allows the quality of TCCON CO2 data products obtained with the standard GFIT a priori profiles to be estimated. The means of all results for the FTS data and the integrated in-situ profiles are listed in comparison with the former retrieval approach in Table 7. The differences in XCO2 are calculated as the FTS retrieval with the TCCON standard a priori minus the FTS retrieval with the aircraft a priori.

The scale factor was estimated following Sect. 4.1: a linear relationship with zero intercept was assumed, the best fit was estimated with the York et al. (2004) fitting method, and the KAR_1, BRE_1, JEN_3, and JEN_4 data were excluded from the fitting procedure. The IMECC data retrieved with the standard GFIT a priori are shown in Fig. 5, and the results are given in Table 8.

The KAR_1, BRE_1, JEN_3, and JEN_4 data were excluded because information about the exact atmospheric profile during the FTS measurements is missing. In the case of the KAR_1 data, the recorded aircraft profile was displaced from the site, and in the case of the BRE_1, JEN_3, and JEN_4 data the aircraft profiles were not contemporary with the FTS measurements. The latter profiles were corrected for the systematic effect of a diurnal cycle, which is of the order of the FTS measurement precision (Table 3). Including the KAR_1, BRE_1, JEN_3, and JEN_4 data yields, within the uncertainties, the same scale factor as excluding them. The Karlsruhe data, however, exhibit an overestimation with respect to the best fit that cannot be investigated due to the lack of information. Model simulations could help to assess the potential influence of pollution from nearby emissions at the Karlsruhe site.

GFIT retrievals use an a priori profile that is based on dry-air mole fractions. In reality, the FTS observes a profile shape with respect to pressure that is described by the wet-air mole fractions. We investigated the effect of this assumption by comparing retrievals that use the aircraft dry-air profile as a priori with retrievals that use a wet-air a priori profile created from the co-measured H2O profile. The FTS-retrieved XCO2 values differ on average by 0.1 µmol mol−1, with the wet-air profile yielding higher columns. However, the application of the averaging kernel and a priori dependent smoothing to the in-situ profile means that these are similarly affected, and individual ratios of aircraft/FTS XCO2 do not change. The FTS retrieval is therefore insensitive to the a priori profile shape in comparison studies with other measurements (or models). This confirms that the a priori profiles used in GFIT do not add any systematic biases to the results of comparisons between FTS XCO2 and other measurements.
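The role of the a priori in the smoothed in-situ column can be illustrated with a short Python sketch. It assumes the commonly used smoothing form, in which the smoothed column equals the a priori column plus the kernel-weighted difference between the in-situ and a priori profiles. The function name, the layer weights, and all numbers are hypothetical, and this is not GFIT code.

```python
import numpy as np

def smoothed_column(x_insitu, x_apriori, averaging_kernel, pressure_weights):
    """Column-averaged mole fraction of an in-situ profile as the FTS would see it.

    Assumed form: X_s = sum(h * x_a) + sum(h * a * (x - x_a)), where h are
    normalised pressure weights and a is the column averaging kernel.
    """
    x, xa, a, h = (np.asarray(v, dtype=float)
                   for v in (x_insitu, x_apriori, averaging_kernel, pressure_weights))
    h = h / h.sum()  # normalise so the weights add up to 1
    return float(np.sum(h * xa) + np.sum(h * a * (x - xa)))

# Hypothetical three-layer example (ppm); numbers are illustrative only.
h = [0.5, 0.3, 0.2]
x_true = [392.0, 388.0, 384.0]
prior_1 = [390.0, 389.0, 386.0]
prior_2 = [395.0, 385.0, 380.0]

# With a kernel of exactly 1 the a priori cancels out completely.
ideal_1 = smoothed_column(x_true, prior_1, [1.0, 1.0, 1.0], h)
ideal_2 = smoothed_column(x_true, prior_2, [1.0, 1.0, 1.0], h)

# With a kernel different from 1 the result depends on the chosen a priori.
real_1 = smoothed_column(x_true, prior_1, [0.8, 1.0, 1.2], h)
real_2 = smoothed_column(x_true, prior_2, [0.8, 1.0, 1.2], h)
```

Under this assumed form the a priori enters the smoothed in-situ column as soon as the kernel departs from 1. Smoothing the in-situ profile with the same a priori as the retrieval is therefore what keeps the two sides of the comparison consistent.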

Summary and outlook
The IMECC campaign results show a negative bias of 1.1 % ± 0.2 % (2σ) of the FTS XCO2 measurements with respect to WMO standards. Because the aircraft profiles were used as a priori profiles, the negative bias is likely due to spectroscopic inaccuracies. The results from the IMECC campaign are in very good agreement with previous TCCON calibrations, and the findings confirm the TCCON calibration published in Wunch et al. (2010) for five new European TCCON sites.
The IMECC campaign spectra were also analyzed with the standard GFIT CO2 a priori. The standard GFIT CO2 a priori does not add a bias, and the results agree with those obtained with the aircraft profiles as a priori. The findings show that the TCCON standard XCO2 product, retrieved with the standard GFIT a priori profiles, has a bias of 1.1 % ± 0.2 % with respect to WMO standards and a precision of 0.25 % (1σ). With calibrated, high-precision FTS measurements, TCCON provides an ideal resource for the calibration and validation of satellite measurements, as it measures the same quantity as satellites but with a higher precision and accuracy. The accuracy of the European TCCON standard XCO2 product can be estimated to be 0.8 ppm (400 ppm · 0.2 %).
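The arithmetic behind these numbers can be written out as a short Python sketch. It assumes that the scale factor is the slope of FTS versus aircraft XCO2, so a retrieved value is calibrated by dividing by 0.989. The function names and the 395.6 ppm input are hypothetical.

```python
def calibrate_xco2(x_fts_ppm, scale_factor=0.989):
    """Remove the 1.1 % low bias by dividing by the calibration scale factor."""
    return x_fts_ppm / scale_factor

def accuracy_ppm(x_ppm=400.0, relative_uncertainty=0.002):
    """Absolute accuracy implied by a relative scale-factor uncertainty."""
    return x_ppm * relative_uncertainty

# Hypothetical uncalibrated retrieval of 395.6 ppm.
x_calibrated = calibrate_xco2(395.6)  # close to 400 ppm
accuracy = accuracy_ppm()             # 400 ppm * 0.2 % = 0.8 ppm
```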
The uncertainty could be improved in two ways. Firstly, potential ghosts could be minimized prior to a calibration campaign, and the ghost sign could be determined reliably in the analysis. Secondly, the uncertainty in the in-situ profile is dominated by the sections of the atmosphere not measured by the aircraft. With a jet aircraft flying at maximum flight altitude, roughly 80 % of the total column in terms of pressure can be sampled. The very accurate in-situ measurements have to be extrapolated in the stratosphere, and this extrapolation contributes a large part of the uncertainty. This could be improved by extending the in-situ measurements to higher altitudes, for example with balloon or AirCore measurements (Karion et al., 2010), to constrain the calibration factor more accurately.

Appendix figure caption (fragment): The upper tropospheric portions in ORL_2 and ORL_4 are substituted by the measurements of ORL_1 and ORL_3. All aircraft profiles were taken on one day, two at low solar angle and two at higher solar angle around noon. Color description is given in Fig. A1.