Satellite-inferred European carbon sink larger than expected

Institute of Environmental Physics (IUP), University of Bremen, Bremen, Germany; Department of Physics and Astronomy, University of Leicester, Leicester, UK; IMK-ASF, Karlsruhe Institute of Technology, Karlsruhe, Germany; Netherlands Institute for Space Research, Utrecht, The Netherlands; Colorado State University, Fort Collins, CO, USA; National Institute for Environmental Studies, Tsukuba, Japan; Max Planck Institute for Biogeochemistry, Jena, Germany; Atmospheric and Environmental Research Inc., Lexington, USA; Arctic Research Center, Finnish Meteorological Institute, Sodankylä, Finland; IMK-IFU, Karlsruhe Institute of Technology, Garmisch-Partenkirchen, Germany; Meteorological Research Institute, Tsukuba, Japan


Introduction
Global anthropogenic CO₂ emissions are estimated to be 9.3 ± 0.6 GtC a⁻¹ (2002–2011), of which 4.3 ± 0.1 GtC a⁻¹ remain in the atmosphere with the difference being taken up by land (2.6 ± 0.8 GtC a⁻¹) and ocean (2.5 ± 0.5 GtC a⁻¹) (Le Quéré et al., 2013). However, large uncertainties remain in our knowledge of the distribution of the land sink. This arises, for example, from the sparseness of the surface measurements assimilated by inverse models to infer the surface fluxes (Peters et al., 2010; Bruhwiler et al., 2011). This is also the case in Europe, where surface stations are limited to member states of the EU (EU28) and prevailing westerly winds poorly constrain the carbon fluxes in the largest part, i.e. the Russian part of Europe up to the Urals.
Inverse models are optimised to derive global or regional surface fluxes of CO₂ from surface-based in situ measurements. The high accuracy of these measurements allows the models to analyse small gradients over long distances and to apply strict mass conservation so that (to some extent) information about surface fluxes can be inferred in regions remote from measurement sites. Such models are particularly sensitive to long-range transport errors and large-scale biases of the measurements (Miller et al., 2007; Chevallier et al., 2014); for example, measurement biases in North Africa can corrupt the inferred fluxes in Europe or elsewhere.
Satellite measurements have entirely different strengths and weaknesses compared to surface in situ measurements. They have lower accuracy and precision, but much better spatial coverage. However, regional biases of the satellite-retrieved dry-air column-average mole fraction of CO₂ (XCO₂) of a few tenths of a ppm can already hamper an inversion with mass-conserving global inversion models (Miller et al., 2007; Chevallier et al., 2007). Achieving this accuracy is challenging for current satellite retrievals (Reuter et al., 2010, 2013; Buchwitz et al., 2013b) and even for ground-based validation measurements (Wunch et al., 2011).
Spatial gradients in the satellite data are more reliable over small scales than over large scales because potential retrieval biases are minimal when similar meteorology, surface characteristics, and observation geometry exist. By allowing global land-sea XCO₂ biases (Basu et al., 2013) or monthly, latitudinally varying XCH₄ biases (Bergamaschi et al., 2013), attempts have been made to adapt the inverse models to the characteristics of the satellite data, but mass conservation can still transport errors over long distances.

Inversion technique and satellite data sets
In this study we perform a regional surface flux inversion using only satellite measurements within the European TRANSCOM (Gurney et al., 2002) region (from the Atlantic to the Urals, area = 1.0 × 10¹³ m², Fig. 2a), thus ensuring that any potential retrieval biases in other regions do not impact on the results. Taking the satellites' averaging kernels into account, we analyse the differences between CarbonTracker CT2011_oi (Peters et al., 2007) model simulations and five independent satellite XCO₂ retrievals (BESD v02.00.08 2003–2010, ACOS v3.4r03 2010, UoL-FP v4.0 2010, RemoTeC v2.11 2010, and NIES v02.xx 2010) of two different instruments (GOSAT and SCIAMACHY). For each sounding (17 400 a⁻¹ for BESD, 4000 for ACOS, 4900 for UoL-FP, 3100 for RemoTeC, and 3800 for NIES), we calculate the accumulated European surface influence function (Jacobian) by using the Stochastic Time-Inverted Lagrangian Transport model (STILT). Potential issues arising from long-range transport are reduced because air masses leave the analysis region typically within a few days. If the XCO₂ difference to CarbonTracker depends on the European surface influence, we infer by how much the CarbonTracker fluxes (being the basis for CarbonTracker's concentrations) would have to be modified in order to bring measurement and model in better agreement. As an example, if the XCO₂ difference (satellite − model) decreases with increasing surface influence, the model fluxes are assumed to be too large (Fig. 2b). A systematic offset is interpreted as retrieval (or model) bias. This results in the inversion being solely dependent on regional (medium-scale) gradients, which is a strength of the satellite retrievals. The inversion yields monthly optimised fluxes and utilises the optimal estimation formalism with CarbonTracker fluxes as a priori and first guess. As CarbonTracker assimilates surface in situ measurements, our inversion can be considered to be a stepwise inversion of satellite and surface in situ measurements. For more details see Appendix A.
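The core idea of this paragraph can be illustrated with a minimal sketch. It is not the inversion used in this study: it replaces the optimal estimation formalism with an ordinary least-squares fit for a single month, and all numbers and names (`fit_flux_correction`, the synthetic soundings) are invented for illustration. The slope of the XCO₂ difference against the European surface influence is read as a flux correction, and the intercept as a region-wide bias.

```python
import random

def fit_flux_correction(influence, delta_xco2):
    """Least-squares fit of delta_xco2 = bias + correction * influence.

    influence  : European surface influence per sounding, ppm (GtC/a)^-1
    delta_xco2 : satellite minus model XCO2 per sounding, ppm
    Returns (flux correction in GtC/a, bias in ppm).
    """
    n = len(influence)
    mean_k = sum(influence) / n
    mean_d = sum(delta_xco2) / n
    cov = sum((k - mean_k) * (d - mean_d) for k, d in zip(influence, delta_xco2))
    var = sum((k - mean_k) ** 2 for k in influence)
    correction = cov / var
    bias = mean_d - correction * mean_k
    return correction, bias

# Synthetic soundings; every number below is invented for illustration.
rng = random.Random(0)
true_correction = -0.5  # GtC/a: model fluxes too large, i.e. sink too weak
true_bias = 0.8         # ppm: region-wide retrieval (or model) offset
influence = [rng.uniform(0.0, 2.0) for _ in range(5000)]
delta = [true_bias + true_correction * k + rng.gauss(0.0, 2.0) for k in influence]

correction, bias = fit_flux_correction(influence, delta)
print(f"flux correction: {correction:.2f} GtC/a, bias: {bias:.2f} ppm")
```

A negative slope corresponds to the example in the text: the satellite-minus-model difference decreases with increasing surface influence, so the model fluxes are too large, while the constant offset is absorbed by the bias term and does not affect the inferred flux.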

Error analysis and ensemble set-up
By means of an ensemble of five different inversion set-ups (25 ensemble members in total) and a comprehensive error analysis, we can find no indications that (i) the background model used to provide reference concentrations and a priori fluxes (CarbonTracker), (ii) the convection scheme, (iii) the meteorology, (iv) aggregation errors, or (v) persistent, inner-European retrieval biases in mean wind direction explain the observed carbon sink. Additionally, it seems unlikely that five independently developed retrieval algorithms optimised for two different sensors produce consistent erroneous surface fluxes. The analysed potential uncertainties add up to a total uncertainty of 0.30 GtC a⁻¹ for the annual fluxes. If not otherwise noted, all uncertainty estimates of annual fluxes within this paper include this additional uncertainty; monthly uncertainty estimates correspond to unmodified a posteriori error estimates. More details about the error analysis and the specific error components can be found in Appendix B.

Results
Figure 1 (top) shows the annual European biospheric surface fluxes (land excluding fossil and fire) of CarbonTracker and five satellite data inversions. Note that all uncertainties within this publication correspond to 1σ and that CarbonTracker uncertainties have been scaled (see Appendix A). CarbonTracker fluxes, here representing current knowledge, imply that the European carbon sink is 0.41 ± 0.36 GtC a⁻¹ (multi-year average, Fig. 1, top). For some of the years (especially 2003–2005) it cannot be concluded with high confidence whether Europe is a sink or a source. The flux inversion with BESD using SCIAMACHY data indicates that Europe's biosphere is indeed likely a sink in all analysed years (2003–2010) and very likely in the period 2004–2010. The satellite-derived multi-year average of the European sink is 0.95 ± 0.33 GtC a⁻¹.
The year-to-year variations are somewhat larger for BESD than for CarbonTracker, which may imply a larger ecosystem sensitivity. In this context, note that Schneising et al. (2014) found that the seasonal cycle amplitude tends to have a larger temperature sensitivity in the Northern Hemisphere compared with CarbonTracker, even though the difference is not statistically significant. The smallest sink is found in 2003 (due to the European heat wave and drought; Ciais et al., 2005) and the largest sink in 2006.
It is interesting to note that the results reported by Nassar et al. (2011) support a strong European sink in 2006, which they derived from global inversions of TES (Tropospheric Emission Spectrometer) satellite measurements. The TES CO₂ retrieval conceptually differs from SCIAMACHY or GOSAT XCO₂ retrievals because the instrument measures thermal infrared radiation and averaging kernels for CO₂ peak in the mid-troposphere. In the study, solely soundings above oceans between 40° S and 40° N were used. Remapping their results yields for the European TRANSCOM region 1.33 ± 0.20 GtC a⁻¹ (Nassar et al., 2014), which agrees well with our result for 2006 (1.33 ± 0.33 GtC a⁻¹).
From the perspective of the European carbon sink, 2010 was an average year (CarbonTracker: 0.38 ± 0.35 GtC a⁻¹) and it was the first complete year in which GOSAT data were gathered. Analysing an ensemble of five independently developed satellite retrieval algorithms and five different inversion set-ups, our best estimate of the European carbon sink in 2010 is 1.02 ± 0.30 GtC a⁻¹ (see Appendix B). All 25 ensemble members agree reasonably well and suggest a larger carbon sink than estimated by CarbonTracker. The best agreement among the satellite retrievals is found for the baseline inversion (see Appendix B). The largest deviations from the baseline inversion are observed when changing the background model or the meteorology. Peylin et al. (2013) performed an inter-comparison study of an ensemble of eleven global inversion models which showed that European CarbonTracker fluxes (0.30 GtC a⁻¹ for 2001–2004, assuming an area of 1.0 × 10¹³ m²) are similar to the ensemble mean (0.40 GtC a⁻¹). However, the ensemble spread is 0.42 GtC a⁻¹ (1σ) and individual models estimate the European biospheric carbon sink to be of the order of 1 GtC a⁻¹, which is similar to our findings. It should also be noted that the analysed period (2001–2004) includes 2003 with little uptake.
The middle panel of Fig. 1 compares monthly surface fluxes and shows that BESD's larger annual sink originates mainly from stronger CO₂ uptake during the growing season and to a lesser extent from weaker CO₂ release during the dormant season. This pattern is stable over the years and consistent among the satellite-derived fluxes in 2010, with the exception of the RemoTeC fluxes, which are similar to CarbonTracker fluxes in June and July but are the lowest in January–April. As a result, the annual fluxes for RemoTeC show the weakest sink (0.74 ± 0.33 GtC a⁻¹). Note that RemoTeC also has the lowest number of soundings, which can result in sparsely sampled regions. All satellite retrievals gain the largest uncertainty reduction during the growing season (Fig. 1, bottom) when CarbonTracker uncertainties are largest. A relatively large fraction of satellite observations are made within this period because of advantageous solar zenith angles and cloud conditions. The poor sampling during the dormant season does not allow for a larger error reduction, and it cannot be completely excluded that CarbonTracker underestimates respiration and/or decomposition within this period, which would result in a weaker annual average sink. However, it should be noted that this is, in principle, accounted for by error propagation into the uncertainty of the annual averages assuming that the a priori fluxes in the dormant season are unbiased. Due to the lower activity of the biosphere during the dormant season over Europe, the a priori flux uncertainties in this season are smaller, which is consistent with results from an ensemble study of global inversion models showing the smallest inter-model spread in this season (Peylin et al., 2013).
The phase of the seasonal cycle seems consistent among the satellite inversions and CarbonTracker, but this agreement should not be overinterpreted given the monthly resolution and the month-to-month correlations regularising the inversion (see Appendix A). By means of a regional inversion study using 15 CarboEurope stations, Broquet et al. (2011) estimated the European summer uptake (June–September) to be in the range of about 1.4–3.6 GtC a⁻¹ for 2003–2007. Our results indicate an uptake of 3.5–5.0 GtC a⁻¹ within the summer months for the same period. As the cited study concentrated on a much smaller domain in western Europe of area 3.9 × 10¹² m², we have scaled their results to 1.0 × 10¹³ m².
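The area scaling mentioned above amounts to multiplying by the ratio of the two domain areas, which implicitly assumes the same average flux density in both domains. The short sketch below only reproduces this arithmetic with the two areas given in the text; the function name is hypothetical, and the back-calculated unscaled range is implied by the scaled numbers rather than quoted from the cited study.

```python
AREA_TRANSCOM_EUROPE_M2 = 1.0e13  # European TRANSCOM region, from the text
AREA_WESTERN_DOMAIN_M2 = 3.9e12   # domain of the cited regional study, from the text

def scale_by_area(flux, from_area, to_area):
    """Rescale a flux (GtC/a) assuming equal flux density in both domains."""
    return flux * to_area / from_area

# Factor applied to the regional results to make them comparable.
factor = AREA_TRANSCOM_EUROPE_M2 / AREA_WESTERN_DOMAIN_M2

# Range implied for the smaller domain by the scaled values 1.4-3.6 GtC/a.
implied_unscaled = tuple(
    scale_by_area(f, AREA_TRANSCOM_EUROPE_M2, AREA_WESTERN_DOMAIN_M2)
    for f in (1.4, 3.6)
)
print(f"scaling factor: {factor:.2f}")
print("implied unscaled range: %.2f-%.2f GtC/a" % implied_unscaled)
```

The factor is about 2.6, so differences in flux density between western Europe and the full TRANSCOM region directly limit how far this comparison can be taken.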

Discussion
In order to investigate the origin of the difference between the satellite (0.90 ± 0.33 GtC a⁻¹, BESD baseline inversion in 2010) and CarbonTracker (0.38 ± 0.35 GtC a⁻¹ for 2010) fluxes, we limited our analysis to satellite (BESD) soundings falling within 350 km of any surface station assimilated by CarbonTracker (Fig. 2a). In this case we estimate a European carbon sink of 0.58 ± 0.37 GtC a⁻¹, which agrees with CarbonTracker (within the error bars). The satellites retrieve column averages in contrast to the surface sites performing point measurements in the boundary layer. Therefore, the satellite measurements still represent a larger part of Europe because footprints are more widespread; that is, a one-to-one comparison with surface sites is not appropriate. Nevertheless, this experiment indicates that the underestimation of the European carbon sink in some inverse models may result from the sparse sampling of surface sites with no stations outside EU-28 countries. This is consistent with the inversion studies of Bruhwiler et al. (2011), who showed that the derived European carbon budget and its seasonal cycle can critically depend on the spatial coverage of the surface sites. Some inversion models, assimilating only surface in situ measurements, find a large sink in the Eurasian boreal TRANSCOM region and a weaker sink in Europe. The models shift the sink towards Europe when assimilating satellite XCO₂ measurements (Basu et al., 2013). A potential explanation is that the sparseness of surface sites hinders the models from discriminating between the European and Eurasian regions. This hypothesis is supported by a study of Schneising et al. (2011) using satellite XCO₂ data suggesting that the Eurasian boreal forests are a weaker sink than expected from CarbonTracker.
For two of the aggregation experiments (see Appendix B) we divided the European domain into two equally large parts (i) at 27.4° E and (ii) at 52.3° N. The inferred fluxes indicate that eastern Europe (53 ± 31 %) may contribute more than western Europe (47 ± 30 %) and northern Europe (66 ± 31 %) more than southern Europe (34 ± 35 %) to the overall European carbon sink. However, the error estimates, which consider a posteriori error and ensemble spread, show that the differences are not significant.
Inverse modelling studies are the focus of this paper. However, recent findings reveal that carbon accumulation increases continuously with tree size (Stephenson et al., 2014). This potentially contributes to explaining the discrepancy with bottom-up inventories. In this context, it should also be noted that the flux estimates of Schulze et al. (2009) concentrated on the period 2000–2005 including years (e.g. 2003) with little uptake in Europe (Ciais et al., 2005).

Validation
In order to validate our results, we use the optimised fluxes of the baseline inversion of BESD satellite data to simulate optimised concentrations. These concentrations as well as CarbonTracker concentrations are then compared with independent measurements, which have been inverted neither by CarbonTracker nor by us. For more details see Appendix C. The comparison with TCCON (Total Carbon Column Observing Network; Wunch et al., 2011) ground-based FTS (Fourier transform spectrometer) column measurements shows that the optimised concentrations slightly improve the standard deviation of the difference and the seasonal cycle amplitude (Fig. 5). This is also the case when comparing with CONTRAIL (Comprehensive Observation Network for TRace gases by AIrLiner; Machida et al., 2008) in situ measurements aboard commercial aircraft. In an altitude range corresponding to 700–300 hPa, optimised and CarbonTracker concentrations are similar because the European surface influence is small. However, we observe an improvement of the seasonal cycle in the lower troposphere at 950–700 hPa with differences of up to 0.86 ppm between optimised and CarbonTracker concentrations during the growing season (Fig. 6).
A comparison with ground-based in situ measurements of NOAA's cooperative air sampling network cannot be considered a validation because these measurements have been assimilated into CarbonTracker so that improved agreement cannot be expected. Nevertheless, this comparison is valuable to assess potential inconsistencies between the assimilated data sets. The overall agreement with the surface measurements slightly degrades. The seasonal cycle of the optimised concentrations shows marginal improvements in most of the months but also the largest discrepancy in one of the months (Fig. 7).

Conclusions
In summary, our study reveals that the European terrestrial carbon sink appears considerably larger than expected from bottom-up estimates and the majority of inverse models assimilating in situ CO₂ atmospheric concentration measurements (Schulze et al., 2009; Peylin et al., 2013; Peters et al., 2010; Chevallier et al., 2014). The addition of surface measurement sites in the eastern part of Europe may in the future help to confirm our findings with global mass-conserving inversion models. In addition, these models have the potential to identify the origin of the carbon absorbed in Europe. New satellite missions with more measurements, higher spatial resolution, precision, and accuracy (e.g. OCO-2 or CarbonSat/CarbonSat Constellation; Crisp et al., 2004; Bovensmann et al., 2010; Buchwitz et al., 2013a) have the potential to reduce the remaining uncertainties, especially during the dormant season. They will enable flux estimations at high spatial resolution and contribute to improved process understanding on local, regional, and global scales.

Appendix A: Inversion technique
This section describes our baseline inversion set-up corresponding to the bars and solid lines within Fig. 1. Deviations from the baseline are described in Appendix B. We use the optimal estimation formalism to infer optimised surface fluxes for the European TRANSCOM region (from the Atlantic to the Urals, area = 1.0 × 10¹³ m², Fig. 2a) during the period 2003–2010 on a monthly basis by minimising the cost function

$$\chi^2 = (y - F(x))^T S_\epsilon^{-1} (y - F(x)) + (x - x_a)^T S_a^{-1} (x - x_a). \tag{A1}$$

The state vector $x$ (containing the parameters of interest) consists of 96 monthly fluxes at European scale and 96 monthly biases (2003–2010). The bias elements ensure that the flux information is coming solely from inner-regional gradients and is insensitive to seasonal, region-wide (constant) biases. We use monthly CarbonTracker CT2011_oi (Peters et al., 2007) fluxes (resulting from an inversion assimilating surface in situ measurements) as flux a priori and zero as a priori biases ($x_a$). Usually, monthly averages cannot be obtained directly because of large regional (e.g. plant type) and temporal (e.g. day/night) flux variations. Such variations are modelled by CarbonTracker so that it is sufficient for our application to derive the monthly average flux deviation from CarbonTracker.
The a priori error covariance matrix $S_a$ is constructed from (scaled) monthly CarbonTracker flux uncertainties and a 100 ppm uncertainty for the monthly biases (rendering them basically unconstrained). CarbonTracker uses a Kalman filter technique with a 5-week assimilation window which results in monthly flux uncertainties considered unrealistically large. Consequently, we apply a scaling of 1/3 so that the uncertainties of CarbonTracker's annual averages become similar to uncertainties estimated by Basu et al. (2013) and Chevallier et al. (2014) inverting surface in situ measurements. The resulting monthly uncertainties (Fig. 1, bottom), with lowest values during the dormant season and largest values during the growing season, agree reasonably well with the inter-model spread of an ensemble of atmospheric CO₂ inversions (Peylin et al., 2013). Potential seasonal biases are assumed to vary only slowly during the year, and we assume an error correlation length $l$ of 3 months between the bias elements of $S_a$ (the correlation between two months $i$ and $j$ is computed by $e^{-|i-j|/l}$). In order to better constrain the inversion, we add the same correlations to that part of $S_a$ corresponding to the fluxes. A priori error correlations between the bias and flux elements are not assumed.
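The structure of the a priori covariance described above can be sketched as follows. This is a minimal illustration, assuming the state vector is ordered as 96 fluxes followed by 96 biases; the unscaled monthly flux uncertainties (`ct_sigma`) are invented placeholders with a seasonal shape, not CarbonTracker values, and the function name is hypothetical.

```python
import numpy as np

def exp_correlated_covariance(sigma, corr_length=3.0):
    """Covariance with correlation exp(-|i-j|/l) between months i and j."""
    sigma = np.asarray(sigma, dtype=float)
    idx = np.arange(sigma.size)
    corr = np.exp(-np.abs(idx[:, None] - idx[None, :]) / corr_length)
    return corr * np.outer(sigma, sigma)

n_months = 96  # 2003-2010
month_of_year = np.arange(n_months) % 12

# Invented monthly flux uncertainties (GtC/a), largest in the growing season.
ct_sigma = 1.5 + 1.5 * np.sin(np.pi * month_of_year / 11.0)
flux_sigma = ct_sigma / 3.0             # scaling of 1/3, as in the text
bias_sigma = np.full(n_months, 100.0)   # 100 ppm, basically unconstrained

# Block-diagonal S_a: no correlations between flux and bias elements.
S_a = np.zeros((2 * n_months, 2 * n_months))
S_a[:n_months, :n_months] = exp_correlated_covariance(flux_sigma)
S_a[n_months:, n_months:] = exp_correlated_covariance(bias_sigma)
```

With a correlation length of 3 months, neighbouring months are correlated with exp(−1/3) ≈ 0.72, which is what regularises the monthly fluxes against each other.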
The measurement vector $y$ comprises all XCO₂ measurements falling in the time period 2003–2010 and in the European TRANSCOM region (Fig. 2a). Averaging kernels have been applied to the measurements using co-located CarbonTracker profiles (Reuter et al., 2013). The forward model $F$ is represented by the co-located CarbonTracker XCO₂ values.
The measurement error covariance matrix ($S_\epsilon$) is constructed from the retrieval uncertainties. However, as some GOSAT data sets include unrealistically small errors, we scale them to match (on average) the single measurement precision determined from a validation exercise using the European TCCON sites shown in Fig. 5. We use the same validation approach as that of Reuter et al. (2013) and find the following scaling factors (single measurement precision from the validation/average reported error, both in ppm): BESD 2.25/2.39, ACOS 1.98/1.00, UoL-FP 2.50/1.06, RemoTeC 2.16/0.77, and NIES 2.19/0.88. GOSAT measurements are regularly sampled with 150 km (260 km since 08/2010) distance and we assume no error correlations between them. In contrast, SCIAMACHY measurements can have direct neighbours; thus error correlations become more likely (e.g. as a result of similar meteorological conditions). In order to make the error characteristics comparable with those of GOSAT measurements, we assume an error correlation length of 200 km (GOSAT's approximate average sampling distance in 2010) for measurements within one orbit. This introduces off-diagonal elements in $S_\epsilon$, a matrix which can have more than 1.5 × 10⁵ × 1.5 × 10⁵ elements. Therefore, sparse matrix arithmetic and block-wise inversion are required for the calculations.
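The scaling factors and the within-orbit correlations can be made concrete with a short sketch. The precision pairs are taken from the text. The covariance helper is an assumption: the text gives only the correlation length, so an exponential decay (the form used for the monthly a priori correlations) and a one-dimensional along-track distance are assumed here, and the function name is hypothetical.

```python
import math

# (validation precision, average reported error) in ppm, as listed in the text.
PRECISION_PPM = {
    "BESD": (2.25, 2.39),
    "ACOS": (1.98, 1.00),
    "UoL-FP": (2.50, 1.06),
    "RemoTeC": (2.16, 0.77),
    "NIES": (2.19, 0.88),
}

# Factor by which reported errors are multiplied to match the validation.
scaling = {name: round(validated / reported, 2)
           for name, (validated, reported) in PRECISION_PPM.items()}

def orbit_covariance(along_track_km, sigma_ppm, corr_length_km=200.0):
    """Covariance block for one orbit, assuming exp(-distance/length) decay."""
    n = len(along_track_km)
    return [[sigma_ppm[i] * sigma_ppm[j]
             * math.exp(-abs(along_track_km[i] - along_track_km[j]) / corr_length_km)
             for j in range(n)]
            for i in range(n)]

print(scaling)
block = orbit_covariance([0.0, 60.0, 200.0], [2.3, 2.3, 2.3])
```

Only the BESD errors are scaled down slightly (factor below one); the reported GOSAT errors are inflated by roughly a factor of two to three. Because correlations are confined to single orbits, the full matrix is block diagonal, which is what makes sparse, block-wise inversion practical.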
Within this study, we concentrate on European fluxes, so that correlations with fluxes in other regions will not affect our results. Additionally, we can assume that the CarbonTracker fluxes already provide a good estimate. For these reasons, we use CarbonTracker fluxes as a priori and first guess and assume linearity of the forward model. The a posteriori solution $\hat{x}$ minimising the cost function then results from a single correction to the first guess:

$$\hat{x} = x_a + \hat{S} K^T S_\epsilon^{-1} (y - F(x_a)), \qquad \hat{S} = (K^T S_\epsilon^{-1} K + S_a^{-1})^{-1}. \tag{A2}$$

Here $\hat{S}$ includes the a posteriori errors and their correlations, and $K$ is the Jacobian matrix which is calculated with the STILT (Gerbig et al., 2003; Lin et al., 2003) Lagrangian particle dispersion model. The XCO₂ increment expected from the optimised fluxes can be calculated from $K(\hat{x} - x_a)$ (Fig. 2c).
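For illustration, the single-step linear optimal estimation update can be written in a few lines of NumPy. This is a sketch under simplifying assumptions, not the processing chain used here: it uses dense matrices, a diagonal measurement error covariance, a toy state with one flux and one bias, and synthetic soundings. The function name and all numbers are invented.

```python
import numpy as np

def linear_optimal_estimation(y, f_xa, K, s_eps_diag, x_a, S_a):
    """One-step linear optimal estimation update for a diagonal S_eps.

    S_hat = (K^T S_eps^-1 K + S_a^-1)^-1
    x_hat = x_a + S_hat K^T S_eps^-1 (y - F(x_a))
    """
    Kt_Sinv = K.T / s_eps_diag  # K^T S_eps^-1 (columns scaled by 1/variance)
    S_hat = np.linalg.inv(Kt_Sinv @ K + np.linalg.inv(S_a))
    x_hat = x_a + S_hat @ Kt_Sinv @ (y - f_xa)
    return x_hat, S_hat

# Toy problem: state = [flux (GtC/a), bias (ppm)] for a single month.
rng = np.random.default_rng(1)
m = 4000
influence = rng.uniform(0.0, 2.0, m)              # ppm (GtC/a)^-1, invented
K = np.column_stack([influence, np.ones(m)])      # bias acts as constant offset
x_true = np.array([-1.0, 0.5])
x_a = np.array([-0.4, 0.0])                       # prior flux, zero prior bias
S_a = np.diag([0.35 ** 2, 100.0 ** 2])
sigma = 2.0                                       # ppm single-sounding precision
f_xa = np.full(m, 390.0)                          # model XCO2 at the soundings
y = f_xa + K @ (x_true - x_a) + rng.normal(0.0, sigma, m)

x_hat, S_hat = linear_optimal_estimation(y, f_xa, K, np.full(m, sigma ** 2), x_a, S_a)
print("flux: %.2f +/- %.2f GtC/a" % (x_hat[0], np.sqrt(S_hat[0, 0])))
```

Because the bias prior is very loose, the offset between measurements and model is absorbed by the bias element, and the flux is determined by how the difference varies with surface influence.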
We use STILT to calculate global column-average footprints $R$ (surface sensitivities in ppm (µmolC m⁻² s⁻¹)⁻¹) with a resolution of 0.5° × 0.5° for each satellite sounding (Fig. 3a–d show an example). For this purpose, we split the atmospheric column into 40 sub-layers with variable (pressure) width $\Delta p$, of which about 25 are placed in the planetary boundary layer, 12 in the free troposphere, and 3 in the stratosphere. One STILT receptor is placed at the centre of each layer. Each receptor is the starting point of 25 particles for which back-trajectories of 480 h are calculated, forming the basis for the receptor's footprint $R_i$. In order to avoid unnecessary calculations, we terminate particles leaving a 10° bounding box around Europe. Given the surface pressure $p_0$ and the column averaging kernel $a$, vertical integration is performed by

$$R = \frac{1}{p_0} \sum_{i=1}^{40} a_i \, R_i \, \Delta p_i. \tag{A3}$$

The European influence of a sounding, i.e. the corresponding element of the Jacobian matrix $K$ (in ppm (GtC a⁻¹)⁻¹), is calculated by integration over the European TRANSCOM region and unit conversion (Fig. 3e shows an example). We use NCEP/NCAR reanalysis as driving meteorological fields.
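The vertical integration can be sketched for a single grid cell as follows. The sketch assumes that the column footprint is the pressure-weighted, averaging-kernel-weighted sum of the layer footprints normalised by the surface pressure; the layer widths, kernel values, and footprints are invented, and the function name is hypothetical.

```python
def column_footprint(layer_footprints, averaging_kernel, layer_widths, surface_pressure):
    """Pressure-weighted column footprint: (1/p0) * sum_i a_i * R_i * dp_i."""
    weighted = sum(a * r * dp
                   for r, a, dp in zip(layer_footprints, averaging_kernel, layer_widths))
    return weighted / surface_pressure

# 40 layers: 25 in the boundary layer, 12 in the free troposphere,
# 3 in the stratosphere. Widths in hPa are invented and sum to 1013 hPa.
layer_widths = [8.0] * 25 + [50.0] * 12 + [71.0] * 3
surface_pressure = sum(layer_widths)
averaging_kernel = [1.0] * 40  # idealised kernel, invented

# Invented layer footprints for one grid cell, ppm (umolC m^-2 s^-1)^-1:
# strong near the surface, weak aloft, none in the stratosphere.
layer_footprints = [0.05] * 25 + [0.005] * 12 + [0.0] * 3

R = column_footprint(layer_footprints, averaging_kernel, layer_widths, surface_pressure)
print(f"column footprint: {R:.4f} ppm (umolC m^-2 s^-1)^-1")
```

The many thin boundary-layer receptors dominate the sum even though they cover a small pressure range, which is why most of the 40 layers are placed there.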
The annual flux average of year $y$ is calculated from the inversion results by $x_y = w^T \hat{x}$. Here the elements of the weighting vector $w$ are 1/12 for the 12 monthly fluxes corresponding to $y$ and zero elsewhere.

Appendix B: Error analysis and ensemble set-up
A posteriori error. The a posteriori error estimates are calculated from Eq. (A2) accounting for the error estimates of the satellite retrievals $S_\epsilon$ and the a priori flux uncertainties $S_a$. The uncertainty of the annual flux average is given by $\sigma_{x_y} = (w^T \hat{S} w)^{1/2}$ and amounts to 0.15 GtC a⁻¹ for BESD (multi-year average), 0.14 GtC a⁻¹ for ACOS (2010), 0.15 GtC a⁻¹ for UoL-FP (2010), 0.14 GtC a⁻¹ for RemoTeC (2010), and 0.19 GtC a⁻¹ for NIES (2010). However, this uncertainty estimate is incomplete and additional error terms are considered via an ensemble approach as explained in the following.
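The annual average and its a posteriori uncertainty follow directly from the weighting vector. The sketch below assumes a state vector ordered as 96 monthly fluxes followed by 96 monthly biases and uses an invented, uncorrelated a posteriori covariance purely to make the arithmetic easy to check; the function name is hypothetical.

```python
import numpy as np

def annual_mean_and_sigma(x_hat, S_hat, year_index):
    """Annual flux average w^T x_hat and its uncertainty sqrt(w^T S_hat w)."""
    w = np.zeros(x_hat.size)
    w[12 * year_index: 12 * year_index + 12] = 1.0 / 12.0
    return w @ x_hat, np.sqrt(w @ S_hat @ w)

n_months = 96
x_hat = np.concatenate([np.full(n_months, -1.0),  # invented monthly fluxes, GtC/a
                        np.zeros(n_months)])      # invented monthly biases, ppm
S_hat = np.eye(2 * n_months) * 0.6 ** 2           # invented, no correlations

mean_2003, sigma_2003 = annual_mean_and_sigma(x_hat, S_hat, year_index=0)
print(f"annual mean: {mean_2003:.2f} GtC/a, sigma: {sigma_2003:.3f} GtC/a")
```

Without correlations the annual uncertainty shrinks as 1/√12 relative to the monthly one. The month-to-month correlations of the actual inversion make the off-diagonal terms of the a posteriori covariance matter, so this reduction does not carry over unchanged.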
Background model. Even though the inversion solely relies on inner-European gradients, the choice of the background model (CarbonTracker) may introduce potential uncertainties to the inferred fluxes. To investigate this issue, we derived fluxes for all five satellite retrievals in 2010 using the MACC (Chevallier et al., 2014) model (version 11.2) for reference concentrations and a priori fluxes (Fig. 1, MACC background). The resulting annual fluxes are consistent with the results based on CarbonTracker, from which they have a root mean square difference (RMSD) of 0.22 GtC a⁻¹.
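The RMSD used throughout this appendix compares the annual fluxes of the five retrievals in a modified set-up with those of the baseline. A minimal sketch, with invented flux values standing in for the five retrievals and a hypothetical function name:

```python
import math

def rmsd(baseline, variant):
    """Root mean square difference between two sets of annual fluxes (GtC/a)."""
    if len(baseline) != len(variant):
        raise ValueError("both set-ups need one flux per retrieval")
    squared = [(b - v) ** 2 for b, v in zip(baseline, variant)]
    return math.sqrt(sum(squared) / len(squared))

# Invented annual sinks for five retrievals in two inversion set-ups (GtC/a).
baseline_fluxes = [0.90, 0.95, 1.00, 0.74, 0.92]
variant_fluxes = [1.10, 1.05, 1.30, 0.90, 1.15]
print(f"RMSD: {rmsd(baseline_fluxes, variant_fluxes):.2f} GtC/a")
```

Note that the RMSD measures the typical size of the departure but not its sign, so it says how sensitive the result is to a set-up change rather than in which direction the sink moves.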
Convection. Inaccuracies in the parametrisation of convection in STILT are an additional potential error source. In order to assess this potential uncertainty, we modified the baseline set-up and used the convective available potential energy (CAPE) parametrisation instead of the modelled vertical wind speeds and re-processed the footprints for all five satellite retrievals in 2010 (Fig. 1, CAPE convect.). The resulting annual fluxes deviate by an RMSD of 0.09 GtC a⁻¹. Thus, convection is unlikely to explain the observed carbon sink.
STILT set-up. Other STILT set-up parameters may also influence the results. Experiments using finer receptor grids and/or more particles showed negligible influence on the accumulated European surface influence. The integration time used also seems to be sufficient because flux results level off well before 480 h (Fig. 4a). This justifies the assumption that the vast majority of particles have left Europe within this time.
Meteorology. Global inversion studies showed that the atmospheric transport model can significantly impact the inferred surface fluxes (Gurney et al., 2002). We expect that our regional set-up does not critically depend on long-range transport errors for two main reasons: (i) air masses leave the analysis region typically within a few days, and (ii) the analysis relies on accumulated European surface influences; that is, the exact pattern of the surface influence is less important. The inversion results of all five satellite retrievals in 2010 using the ERA Interim reanalysis (Fig. 1, ERA Interim met.) have an RMSD of 0.32 GtC a⁻¹ compared to the results based on the NCEP/NCAR reanalysis.
Aggregation error. As described before, we derive Europe-wide, monthly flux increments. This could be interpreted as a "hard constraint" possibly resulting in temporal and/or spatial aggregation errors (Kaminski et al., 2001). Engelen et al. (2002) estimated this effect for the European TRANSCOM region to be 0.13–0.31 GtC a⁻¹ depending on the flux fields used when inverting sparsely sampled in situ measurements. We expect that aggregation errors are less pronounced in our case because (i) spatio-temporal patterns of the a priori fluxes used (CarbonTracker) are assumed to be relatively realistic, and (ii) the inverted satellite data are considerably more densely sampled than in situ measurement sites. Nevertheless, we analysed this potential error component by undertaking three experiments, each of which divided the domain into two equally large parts: (i) at 27.4° E, (ii) at 52.3° N, and (iii) at the middle of each month. Within each sub-domain, we derived the surface flux and the bias and aggregated the sub-domains afterwards. The a priori error covariance has been apportioned accordingly so that the a posteriori error statistics of the aggregated domains remained similar. The inversion results differ from the baseline by 0.12 GtC a⁻¹ (RMSD, longitude split), 0.06 GtC a⁻¹ (RMSD, latitude split), and 0.04 GtC a⁻¹ (RMSD, temporal split), respectively. The average of the three experiments differs by an RMSD of 0.08 GtC a⁻¹ from the baseline (Fig. 1, aggreg. exp.).
Regional biases. Even though our regional inversion scheme is insensitive to retrieval biases outside Europe, it could in principle still suffer from retrieval biases within Europe arising, for example, from persistent aerosol or cloud patterns, surface albedo, or chlorophyll fluorescence. However, this would only be the case if biases were correlated with the surface sensitivity. Europe's weather is complex and characterised by alternating high- and low-pressure systems induced by Rossby waves with west winds dominating on average. This means that measurements in eastern Europe will tend to have a larger sensitivity to the European surface flux. Therefore, a hypothetical retrieval bias with an east-west gradient would correlate with the surface sensitivity and affect the inversion results. A retrieval bias producing larger XCO₂ values in eastern Europe could be misinterpreted as a CO₂ source for west wind conditions. Conversely, it would be misinterpreted as a sink under east wind conditions. Therefore, we built two sub-samples of the BESD data set containing only measurements with planetary boundary layer winds in west or east direction, respectively. The resulting annual average surface fluxes of both sub-samples have only a small difference of 0.06 ± 0.19 GtC a⁻¹. Sub-sampling for boundary layer winds in north and south direction results in a small difference of 0.03 ± 0.19 GtC a⁻¹. The uncertainty estimates represent the standard deviation over the analysed years. In case of persistent bias patterns, the sign of the flux error changes with wind direction so that it appears twice in the calculated differences. We estimate that the annual surface flux error due to persistent biases amounts to 0.02 GtC a⁻¹ ≈ 1/4 (0.03 GtC a⁻¹ + 0.06 GtC a⁻¹). Therefore, retrieval biases in mean wind direction are unlikely to explain the observed carbon sink.
Differences of the retrieval algorithms. Additionally, it is valuable to recall the differences in the retrieval algorithms used for this study: (i) the retrievals use different spectral fitting windows and spectroscopy; (ii) different cloud and aerosol screening techniques are used, resulting in different samplings; (iii) light-scattering-related errors are corrected differently by the retrievals' full physics schemes being optimised for clouds and/or aerosols; (iv) the surface albedo is handled differently; (v) chlorophyll fluorescence is explicitly accounted for by only one of the retrievals (ACOS); (vi) one retrieval uses no empirical bias correction (NIES). See also Reuter et al. (2013) for a summary of differences. Overall it seems unlikely that five independently developed retrieval algorithms optimised for two different sensors produce the same bias patterns.
Statistical set-up. Inherent to Bayesian inversion systems, the statistical set-up (a priori errors and their covariances) influences the a posteriori solution. We generate the a priori error statistics by scaling monthly CarbonTracker flux uncertainties with a factor of 1/3 and assuming error correlation lengths (for bias and fluxes) of 3 months. The resulting seasonal cycle flux uncertainty and the annual average flux uncertainty agree reasonably well with values found by Basu et al. (2013), Chevallier et al. (2014), and Peylin et al. (2013) (see discussion in Appendix A). Nevertheless, we examined the influence of error scaling (Fig. 4b) and correlation lengths (Fig. 4c) on the results of the baseline inversion.
The analysed error scaling factors range from 0.20 to 0.45 so that the corresponding a priori uncertainties vary over a relatively large range from about ±0.2 GtC a −1 to ±0.5 GtC a −1 .All inverted annual fluxes deviate by less than ±0.1 GtC a −1 from the baseline.The same is true for the vast majority of fluxes inferred for error correlation lengths which range from 1 to 5 months.
Combining the error contributions. Figure 1 (top) shows the results of an ensemble of different inversion set-ups. The ensemble members quantify the expected departures of the annual fluxes from the baseline inversion with respect to those properties which are believed to have the largest influence on the inversion results (i.e. background model, meteorology, convection parametrisation, and aggregation set-up). The individual aspects are analysed separately so that the ensemble spread may not be a reliable measure for the uncertainty because permutations are missing. We approximate the missing permutations (16 permutations times 5 satellite retrievals in total) by linear combinations of the individual departures and calculate the median carbon sink (1.02 GtC a⁻¹) and its standard deviation (0.28 GtC a⁻¹). Using only the 25 ensemble members (and ignoring all permutations) would result in an average of 0.93 GtC a⁻¹ and a standard deviation of 0.19 GtC a⁻¹. Additional error components such as regional biases, statistical set-up, etc., are assumed to contribute to a lesser extent. For convenience, we consider them by adding 0.10 GtC a⁻¹ (via summation of variances) so that the overall uncertainty of the ensemble median is estimated to be 0.30 GtC a⁻¹ (Fig. 1, dotted area). The uncertainty due to different samplings of the satellite retrievals is implicitly accounted for because it is part of the inter-algorithm differences.
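A minimal sketch of the two steps described above, assuming that "linear combinations of the individual departures" means adding the departures of all properties that differ from the baseline. The baseline flux and the departures below are hypothetical placeholders for one retrieval, not values from the ensemble; only the final quadrature sum uses the numbers quoted in the text.

```python
import itertools
import math
import statistics


def permuted_fluxes(baseline, departures):
    """Approximate all set-up permutations by summing individual departures."""
    fluxes = []
    # Each property is either at its baseline setting (0) or switched to
    # its alternative setting (1), giving 2**n permutations.
    for choice in itertools.product((0, 1), repeat=len(departures)):
        total_departure = sum(c * d for c, d in zip(choice, departures))
        fluxes.append(baseline + total_departure)
    return fluxes


def combine_in_quadrature(*sigmas):
    """Combine independent 1-sigma uncertainties by summing their variances."""
    return math.sqrt(sum(s * s for s in sigmas))


# Hypothetical annual sink (GtC/a) of one retrieval and its departures for
# background model, meteorology, convection, and aggregation set-up.
baseline_flux = 1.0
departures = [0.10, -0.05, 0.20, 0.02]

fluxes = permuted_fluxes(baseline_flux, departures)
print(len(fluxes), statistics.median(fluxes), statistics.stdev(fluxes))

# Values quoted in the text: ensemble spread plus additional components.
overall = combine_in_quadrature(0.28, 0.10)
print(f"{overall:.2f} GtC/a")
```

Four properties with two settings each give the 16 permutations per retrieval mentioned in the text, and the quadrature sum of 0.28 and 0.10 GtC a⁻¹ rounds to the stated 0.30 GtC a⁻¹.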
If not otherwise noted, all uncertainty estimates of annual fluxes within this paper include this additional uncertainty. For example, the annual flux uncertainties of the individual satellite algorithms (Fig. 1, error bars) include the a posteriori error and the estimated additional uncertainty. Monthly uncertainty estimates correspond to unmodified a posteriori error estimates.

Appendix C: Validation
Based on the optimised fluxes f̂ of the baseline inversion of BESD satellite data, we simulate optimised concentrations ĉ and compare them and CarbonTracker concentrations c_a with independent measurements, which have been inverted neither by CarbonTracker nor by us. For this purpose, we calculate the European influence k (i.e. the Jacobian for each measurement) and multiply it with the derived flux increment Δf = f̂ − f_a, where f_a represents CarbonTracker fluxes.
Except for a potential offset, the simulated concentrations ideally agree better with the measurements than the CarbonTracker concentrations. Vertical integration (in case of column measurements) is performed as described in Appendix A.
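The simulation step can be sketched as ĉ = c_a + k · Δf for each measurement. The function and all numbers below are hypothetical illustrations, assuming one Jacobian row per measurement and one column per flux element; they are not the actual footprints or fluxes.

```python
def optimised_concentrations(c_prior, jacobian, f_opt, f_prior):
    """Return c_hat_i = c_a_i + sum_j k_ij * (f_hat_j - f_a_j)."""
    # Flux increment: optimised minus a priori (CarbonTracker) fluxes.
    increment = [fo - fa for fo, fa in zip(f_opt, f_prior)]
    c_hat = []
    for c_a, k_row in zip(c_prior, jacobian):
        # Concentration change caused by the flux increment.
        delta_c = sum(k * df for k, df in zip(k_row, increment))
        c_hat.append(c_a + delta_c)
    return c_hat


# Hypothetical example: two measurements, two flux elements.
c_prior = [390.0, 391.0]            # a priori concentrations (ppm)
jacobian = [[0.5, 0.0],             # European influence per measurement
            [1.0, 2.0]]             # (ppm per flux unit)
f_prior = [0.0, 0.0]                # a priori fluxes
f_opt = [-1.0, 0.5]                 # optimised fluxes

c_hat = optimised_concentrations(c_prior, jacobian, f_opt, f_prior)
print(c_hat)
```

A measurement without European surface influence (a row of zeros in the Jacobian) keeps its a priori concentration, which is why high-altitude data are only weakly affected by the flux increment.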
The Total Carbon Column Observing Network (Wunch et al., 2011, TCCON) uses a ground-based FTS to derive the dry-air column-average mole fraction of CO₂ (and other gases). Six TCCON sites are located within the European TRANSCOM region (Fig. 5g): Białystok (Poland), Bremen (Germany), Garmisch-Partenkirchen (Germany), Karlsruhe (Germany), Orléans (France), and Sodankylä (Finland). We limit the validation period to 2010 because four of the six sites started operation in 2009. In order to reduce the number of computationally expensive footprint calculations, we only use the one TCCON measurement per day and site that is closest to 12:00 UTC. Figures 5a-f show these measurements overlaid by CarbonTracker and optimised concentrations. The differences between both are generally small (0.22 ppm standard deviation) and largest in summer when the flux increment is maximal. The overall standard deviation of the difference to TCCON marginally improves from 1.13 ppm to 1.11 ppm. Station-to-station biases (Fig. 5h) tend to improve, but the differences are below 0.1 ppm, which is smaller than TCCON's network accuracy of 0.4 ppm (Wunch et al., 2011). Consistent with the studies of Reuter et al. (2011) and Keppel-Aleks et al. (2012), CarbonTracker underestimates the seasonal cycle amplitude at TCCON sites in northern midlatitudes (Fig. 5i). The seasonal cycle of the optimised concentrations is in slightly better agreement with TCCON: 11 months show an improvement by about 0.2 ppm (difference to CarbonTracker).
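The selection of one measurement per day and site can be sketched as below. The record layout (site, UTC timestamp, XCO₂ value) and the function name are assumptions made for illustration; they do not reflect the TCCON file format.

```python
from datetime import datetime


def closest_to_noon(measurements):
    """Keep, per site and day, the measurement closest to 12:00 UTC.

    measurements: iterable of (site, timestamp, xco2) tuples, where the
    timestamps are assumed to be given in UTC.
    """
    selected = {}
    for site, when, xco2 in measurements:
        noon = when.replace(hour=12, minute=0, second=0, microsecond=0)
        distance = abs((when - noon).total_seconds())
        key = (site, when.date())
        # Replace the stored record only if this one is closer to noon.
        if key not in selected or distance < selected[key][0]:
            selected[key] = (distance, when, xco2)
    return {key: (when, xco2) for key, (_, when, xco2) in selected.items()}


# Hypothetical records for two sites on one day.
records = [
    ("Bremen", datetime(2010, 6, 1, 9, 30), 389.1),
    ("Bremen", datetime(2010, 6, 1, 11, 50), 388.7),
    ("Bremen", datetime(2010, 6, 1, 14, 0), 388.9),
    ("Orleans", datetime(2010, 6, 1, 13, 15), 388.2),
]
daily = closest_to_noon(records)
print(len(daily))
```

With at most one record per site and day, the number of footprint calculations is bounded by the number of sites times the number of measurement days.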
The Comprehensive Observation Network for TRace gases by AIrLiner (Machida et al., 2008, CONTRAIL) uses commercial aircraft as a platform for highly accurate and precise in situ measurements of atmospheric CO₂ and other species. In order to protect the instruments from damage caused by polluted air masses, no measurements are performed below approximately 2000 ft (610 m). Within the period 2003-2010, the cities of London, Milan, Moscow, and Paris were frequent destinations in Europe. Level flights are typically performed at altitudes above 10 km. The footprints of measurements at high altitudes are widespread, and back-trajectories usually do not collect significant European surface influence within 480 h. Therefore, we concentrate on ascents and descents; Fig. 6i shows the position of all analysed measurements. Figures 6a-d compare CarbonTracker as well as optimised concentrations with CONTRAIL measurements near London, Milan, Moscow, and Paris within an altitude range corresponding to 700-300 hPa. The optimised concentrations agree well with CarbonTracker concentrations and CONTRAIL concentrations (0.10 ppm standard deviation). This is not surprising because the European surface influence is already small at these altitudes. Lower in the atmosphere, the differences between optimised and CarbonTracker concentrations become larger (Fig. 6e-h, 0.42 ppm standard deviation). Station-to-station biases (Fig. 6j) as well as the seasonal cycle in 700-300 hPa (Fig. 6l) are very similar, with differences always below 0.07 ppm. However, we observe a distinct improvement of the seasonal cycle in the lower troposphere at 950-700 hPa (Fig. 6k) with differences of up to 0.86 ppm between optimised and CarbonTracker concentrations during the growing season.
NOAA's cooperative air sampling network performs highly accurate and precise ground-based in situ measurements of atmospheric CO₂ (and other species), which are the backbone of the CarbonTracker assimilation system. During the inversion procedure, CarbonTracker modifies its surface fluxes so that an optimal (Bayesian) fit is achieved between simulated and measured atmospheric concentrations. Any additional constraints (e.g. the satellite data) have only the potential to degrade the agreement with the surface concentrations; an improvement is almost impossible. Therefore, a fair validation is only possible with independent measurements, which have not been assimilated. Nevertheless, a comparison with in situ measurements is valuable to assess potential inconsistencies between the assimilated data sets. Figure 7i shows the position of the European measurement sites assimilated in CarbonTracker in 2010. Compared to the relatively large seasonal cycle amplitude in Fig. 7a-h, the difference between optimised and CarbonTracker concentrations is small (0.80 ppm standard deviation). The largest concentration increments can be found at sites with relatively large European surface influence (e.g. Hegyhátsál, Hungary, Fig. 7e) while the smallest increments are found upwind (e.g. Mace Head, Ireland). As expected, the overall agreement with the surface measurements slightly degrades (5.88 ppm vs. 5.96 ppm, standard deviation of the difference). Station-to-station biases of the optimised and the CarbonTracker concentrations are very similar, and the most pronounced feature is the large underestimation at the Black Sea site in Romania (Fig. 7j). The seasonal cycle at this site seems to be overlaid by frequent positive outliers which may explain its discontinuation (Fig. 7b). The average seasonal cycle of the optimised concentrations is in marginally (0.05-0.66 ppm, difference to CarbonTracker) better agreement with the surface sites during 8 months (Fig. 7k). However, the remaining months are in poorer agreement and the largest departure from CarbonTracker concentrations (1.22 ppm) is found in June.
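The comparison statistics used throughout this appendix (constant offset, standard deviation of the difference, and standard error of the mean as defined in the captions of Figs. 5-7) can be sketched as follows. The function name and sample values are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption.

```python
import math
import statistics


def comparison_statistics(simulated, measured):
    """Return (offset, std_of_difference, standard_error) in ppm."""
    differences = [s - m for s, m in zip(simulated, measured)]
    # Constant offset: the mean difference, which is subtracted before plotting.
    offset = statistics.fmean(differences)
    # The standard deviation of the difference is unaffected by the offset.
    std = statistics.stdev(differences)
    # Standard error of the mean: std divided by sqrt(number of measurements).
    standard_error = std / math.sqrt(len(differences))
    return offset, std, standard_error


# Hypothetical simulated and measured concentrations (ppm).
simulated = [391.0, 392.0, 393.0, 394.0]
measured = [390.0, 390.5, 392.5, 393.0]
offset, std, standard_error = comparison_statistics(simulated, measured)
print(offset, std, standard_error)
```

Because the offset is removed, these statistics assess how well the variability is reproduced rather than the absolute concentration level.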
provided ground-based in situ flask measurements. The European TCCON groups involved in this study acknowledge financial support by the EU infrastructure project InGOS. The University of Bremen acknowledges financial support of the Białystok and Orléans TCCON sites from the Senate of Bremen and EU projects IMECC, GEOmon and ICOS-INWIRE, as well as maintenance and logistical work provided by AeroMeteo Service (Białystok) and the RAMCES team at LSCE (Gif-sur-Yvette, France) and additional operational funding from the National Institute for Environmental Studies (NIES, Japan).

Figure 1. European biospheric surface fluxes (land excluding fossil and fire) from CarbonTracker and an ensemble of five satellite inversions (BESD, ACOS, UoL-FP, RemoTeC, and NIES) and five different inversion set-ups. The baseline uses CT2011_oi as background model, modelled vertical wind speeds as convection, NCEP/NCAR reanalysis meteorology, and an aggregation area which equals the European TRANSCOM region (Appendix A); each other inversion set-up differs by one of these properties (Appendix B). (top) Annual averages and 1σ uncertainties (a posteriori and additional uncertainties, see Appendix B) as well as the ensemble median and its uncertainty (dotted area, see Appendix B). (middle) Monthly averages of the baseline inversions; (bottom) monthly uncertainties of the baseline inversion (1σ, as derived from the inversion scheme not including additional error components). Note that CarbonTracker uncertainties have been scaled (see Appendix A).


Figure 2. BESD multi-year (2003-2010) seasonal (June-August) statistics. (a) Average satellite-retrieved XCO₂ calculated from annual seasonal anomalies in order to minimise effects due to different annual samplings. Gridded to a regular 2° × 2° grid and smoothed with a Hann function with 5° × 5° effective width. Circles mark the European measurement sites of NOAA's air sampling network assimilated in CarbonTracker in 2010 and a 350 km surrounding. Areas in colour or dark grey represent the European TRANSCOM region, medium grey other land surfaces, and sea surfaces are light grey. (b) Average relationship between European surface flux influence calculated with STILT and ΔXCO₂ (BESD − CarbonTracker). Black dots correspond to bin averages. Note that individual years and months may differ. (c) Average XCO₂ increment anomaly expected from the optimised fluxes and calculated by K(x − x_a) (see Appendix A).

Figure 5. Validation with TCCON ground-based FTS measurements. TCCON XCO₂ measurements and corresponding optimised (CT2011_oi + BESD increment) and CarbonTracker (CT2011_oi) concentrations at the European TCCON sites (a) Białystok (Poland), (b) Bremen (Germany), (c) Garmisch-Partenkirchen (Germany), (d) Karlsruhe (Germany), (e) Orléans (France), and (f) Sodankylä (Finland). A constant offset correction (by subtracting the mean difference) has been applied to the optimised and the CarbonTracker concentrations. (g) Position of TCCON sites (green) within the European TRANSCOM region (dark grey). (h) Station-to-station biases of the optimised (red) and the CarbonTracker (black) concentrations. (i) Average seasonal biases of the optimised and the CarbonTracker concentrations. Error bars represent the standard error of the mean, i.e. the standard deviation of the difference divided by the square root of the number of measurements.

Figure 6. Validation with CONTRAIL aircraft-based in situ measurements. CONTRAIL CO₂ measurements and corresponding optimised (CT2011_oi + BESD increment) and CarbonTracker (CT2011_oi) concentrations near the European cities (a, e) London, (b, f) Milan, (c, g) Moscow, and (d, h) Paris in the altitude range (a-d) 700-300 hPa and (e-h) 950-700 hPa. A constant offset correction (by subtracting the mean difference) has been applied to the optimised and the CarbonTracker concentrations. (i) Position of CONTRAIL measurements (green) within the European TRANSCOM region (dark grey). (j) Station-to-station biases of the optimised (red) and the CarbonTracker (black) concentrations. (k) Average seasonal biases of the optimised and the CarbonTracker concentrations for the altitude range 950-700 hPa. (l) Average seasonal biases of the optimised and the CarbonTracker concentrations for the altitude range 700-300 hPa. Error bars represent the standard error of the mean, i.e. the standard deviation of the difference divided by the square root of the number of measurements.

Figure 7. Comparison to ground-based in situ measurements of NOAA's cooperative air sampling network. Surface flask CO₂ measurements and corresponding optimised (CT2011_oi + BESD increment) and CarbonTracker (CT2011_oi) concentrations at the European air sampling sites (a) Baltic Sea (Poland), (b) Black Sea (Romania), (c) Centro de Investigación de la Baja Atmósfera (Spain), (d) Hohenpeißenberg (Germany), (e) Hegyhátsál (Hungary), (f) Mace Head (Ireland), (g) Ochsenkopf (Germany), and (h) Pallas-Sammaltunturi (Finland). A constant offset correction (by subtracting the mean difference) has been applied to the optimised (red) and the CarbonTracker (black) concentrations. (i) Position of air sampling sites (green) within the European TRANSCOM region (dark grey). (j) Station-to-station biases of the optimised and the CarbonTracker concentrations. (k) Average seasonal biases of the optimised and the CarbonTracker concentrations. Error bars represent the standard error of the mean, i.e. the standard deviation of the difference divided by the square root of the number of measurements.