Atmospheric Chemistry and Physics Inverse modeling of European CH4 emissions: sensitivity to the observational network

Abstract. Inverse modeling is widely employed to provide "top-down" emission estimates using atmospheric measurements. Here, we analyze the dependence of derived CH4 emissions on the sampling frequency and density of the observational surface network, using the TM5-4DVAR inverse modeling system and synthetic observations. This sensitivity study focuses on Europe. The synthetic observations are created by TM5 forward model simulations. The inversions of these synthetic observations are performed using virtually no knowledge on the a priori spatial and temporal distribution of emissions, i.e. the emissions are derived mainly from the atmospheric signal detected by the measurement network. Using the European network of stations for which continuous or weekly flask measurements are available for 2001, the synthetic experiments can retrieve the "true" annual total emissions for single countries such as France within 20%, and for all North West European countries together within ~5%. However, larger deviations are obtained for South and East European countries due to the scarcity of stations in the measurement network. Upgrading flask sites to stations with continuous measurements leads to an improvement for central Europe in emission estimates. For realistic emission estimates over the whole European domain, however, a major extension of the number of stations in the existing network is required. We demonstrate the potential of an extended network of a total of ~60 European stations to provide realistic emission estimates over the whole European domain.


Introduction
Inverse modeling of atmospheric CH4 provides "top-down" emission estimates and is an important tool for analyzing the global CH4 budget (Bergamaschi et al., 2009; Chen and Prinn, 2006; Bousquet et al., 2005; Mikaloff Fletcher et al., 2004a, b; Houweling et al., 1999; Hein et al., 1997). In recent years, inverse modeling efforts have also been extended to the regional scale (e.g. the spatial scale of single countries), using high-resolution models and better measurement coverage (Bergamaschi et al., 2005; Manning et al., 2005). Such regional top-down estimates could potentially be used to verify international agreements on emission reductions, such as the Kyoto Protocol, which however requires constant monitoring in a dense network (Bergamaschi, 2007a; IPCC, 2000).
The four-dimensional variational inverse modeling system TM5-4DVAR, based on the atmospheric transport model TM5 (Krol et al., 2005), is designed to infer emissions from atmospheric observations. In particular, it allows emissions to be optimized at the scale of model grid cells (rather than over the larger geographical regions optimized in previously used synthesis inversions). At the same time, large observational data sets can be used, such as high-frequency in situ measurements, which constrain monthly emissions through concentration variations at synoptic time scales.
The TM5-4DVAR system as currently implemented (Bergamaschi et al., 2009; Meirink et al., 2008a, b) can be applied simultaneously to surface observations and to satellite-based measurements made by sensors such as SCIAMACHY (Scanning Imaging Absorption Spectrometer for Atmospheric CHartographY) on ENVISAT. The existing surface stations mainly monitor the atmospheric background concentrations of greenhouse gases (e.g. the NOAA/CMDL and AGAGE networks; Dlugokencky et al., 1994, 2003; Prinn et al., 1990), and in recent years the network has been extended with more regional stations (especially over Europe and North America). Nevertheless, the surface monitoring network is still sparse. The additional inclusion of satellite data from SCIAMACHY was recently reported in several studies (Bergamaschi et al., 2009; Meirink et al., 2008b; Frankenberg et al., 2008). While satellite data provide almost global coverage of the methane concentration distribution, surface observations remain essential because of their higher accuracy and, for continuous in situ measurements, their better temporal resolution.
In this work we investigate the information content available from ground-based observational networks. We explore the sensitivity of the emissions derived with the TM5-4DVAR system to the observational network. The zooming capability of the TM5-4DVAR system makes it possible to resolve a specific domain at higher spatial resolution within a consistent global inversion framework. Here, the focus is on the observational network in the European domain.
Several studies have already addressed the impact of the observational network on inversion results for greenhouse gases, based on synthesis inversions or mass balance approaches and focusing mainly on continental-scale aggregated regions (e.g. Law et al., 2002, 2003, 2004; Law and Vohralik, 2001; Rayner et al., 1996). Only recently have first sensitivity studies also been presented for regional scales (e.g. Carouge et al., 2008a, b).
To test the sensitivity of our inverse modeling framework, we present experiments that use synthetic observations, similar to observing system simulation experiments (OSSEs) (e.g. Carouge et al., 2008a, b; Meirink et al., 2006; Chevallier, 2007). The advantage of this approach is that generating pseudo-observations allows us to test the accuracy of the solution, and the impact of different assumptions about the network, by directly comparing the derived CH4 emissions to the "true" emissions that were used to generate the pseudo-data. Given this simple setting, in which the observations are generated with the same model that is used for the inversion, we do not expect this study to provide results directly applicable to real cases, but we can gain knowledge of the limits and potential of our model framework.
The goal of this study is to (i) analyze the information content of single measurement stations, and (ii) investigate the effect of different observational networks on the retrieved emissions, i.e. flask versus continuous sampling, and the size of the network. Our overall aim is to gain insight into the accuracy with which emissions can be derived at the country scale in Europe, and into how this accuracy depends on the network properties.

TM5 forward runs
Synthetic observations are generated by TM5 forward simulations. TM5 is a global offline chemistry-transport model (Krol et al., 2005), driven by meteorological fields (hours 03-12 for datasets with a 12-hourly cycle are used) from the European Centre for Medium-Range Weather Forecasts (ECMWF) operational Integrated Forecast System (IFS). Key processes simulated in TM5 include mass-conserving tracer advection, convection, and boundary layer mixing. TM5 has a two-way nested zooming capability, which allows the model to perform higher horizontal resolution simulations in specified 3    degree resolution over Europe (Fig. 1). In the vertical direction, 25 layers are used, defined as a subset of the 60 layers used operationally in the ECMWF IFS model until 2006.
We apply the CH4 tracer version as described in Bergamaschi et al. (2009). Chemical destruction of CH4 by OH radicals in the troposphere is simulated using pre-calculated OH fields based on Carbon Bond Mechanism 4 (CBM-4) chemistry and optimized with methyl chloroform (Bergamaschi et al., 2005; Houweling et al., 1998). Chemical destruction of CH4 by OH, Cl, and O(1D) in the stratosphere is based on the 2-D photochemical Max-Planck-Institute (MPI) model (Brühl and Crutzen, 1993).
The emission inventories used in the TM5 forward runs represent the "true" emissions in our experiments and are applied to generate the pseudo-observations. The emissions are based on current "state-of-the-art" bottom-up inventories as used in Bergamaschi et al. (2009) (see Table 1), and include the major natural and anthropogenic CH4 source categories. Emissions from wetlands, rice paddies, and biomass burning have seasonal variations, while the emissions from all other source categories are assumed to be constant. Over the European domain, anthropogenic emissions from ruminants, waste handling, and fossil fuels (coal mining, oil and gas production and distribution) are important. For the Scandinavian countries, emissions from wetlands also play a major role. The spatial distribution of total annual mean emissions is shown in Fig. 2. The emission distribution for Europe is shown in more detail in Fig. 2b. Note that the emission hot spots located in the North Sea are related to oil and gas production.
Global CH4 mixing ratios at the start of the simulations (1 January 2001) have been initialized using 3-D fields from inversions constrained by real observations for 2000-2001.
For simplicity's sake, we assume the error used to perturb the synthetic observations to be random and uncorrelated. For this purpose, we apply a random function with an arbitrary magnitude of 50% of the estimated "model representativeness error". The model representativeness error is based on the spatial gradient of modelled CH4 mixing ratios at the monitoring sites, using all (horizontally and vertically) adjacent model grid cells (Bergamaschi et al., 2005). It also takes into account a 3 ppb measurement error. As in Bergamaschi et al. (2009), we applied a new scheme, which includes estimates of the impact of the subgrid-scale variability of emissions on simulated mixing ratios for stations in the boundary layer.
The estimated model representativeness error is highly variable in time (depending on meteorological conditions) and can reach values of several hundred ppb. Close to emission sources, the subgrid-scale variability of emissions, based on the spatial concentration gradient calculated in the forward run, generally outweighs the measurement error. Systematic biases in the observations are not taken into account in this study.
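The perturbation step described above can be sketched as follows. This is a minimal illustration, not the TM5-4DVAR code: the function name, the array values, and the choice of a Gaussian random function are assumptions (the text only states that the perturbation is random, uncorrelated, and scaled to 50% of the model representativeness error).

```python
import numpy as np

def perturb_observations(y_true, repr_error, fraction=0.5, seed=0):
    """Add uncorrelated random noise to synthetic mixing ratios (ppb).

    Assumption: the noise is Gaussian with a standard deviation equal to
    `fraction` times the model representativeness error of each observation.
    """
    rng = np.random.default_rng(seed)
    sigma = fraction * np.asarray(repr_error, dtype=float)
    noise = rng.normal(loc=0.0, scale=sigma)  # one independent draw per observation
    return np.asarray(y_true, dtype=float) + noise

# Hypothetical values: forward-model mixing ratios and representativeness errors (ppb).
y_true = np.array([1800.0, 1850.0, 1920.0, 1795.0])
repr_error = np.array([5.0, 20.0, 120.0, 3.0])
y_pert = perturb_observations(y_true, repr_error)
```

Because each draw is independent, observations close to strong sources (large representativeness error) receive the largest perturbations.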

Atmospheric networks
The synthetic observations are created for global and European monitoring stations for which real observations are available for the year 2001: 37 remote sites and 17 European stations (Fig. 1 and Table 2). These stations are denoted "current stations" (CS) and include sites at which flask samples are collected (at a typical sampling frequency of ∼1 per week), and sites with continuous measurements (with a time resolution of 1 h or better). Remote stations use flask samples. Of the 17 European stations, 11 are flask-sampled and 6 are continuously sampled. We generate synthetic flask samples (weekly) and continuous measurements (hourly) accordingly.
In addition to the current station network (CS), we consider the following extensions of the network: (1) in a first upgrading step, the current 11 European flask sampling sites are converted into sites with continuous measurements (denoted "current stations continuous measurements", CS-CM).
(2) In a second step, further stations (50 sites in total) are added in Europe (denoted "extended network", EXT). These include 16 stations for which real measurements started after 2001 (e.g. various tall towers from the CHIOTTO network, Continuous High precision Tall Tower Observations of greenhouse gases) or which have recently been put forward in research proposals.
An additional 34 stations are added, primarily in Southern and Eastern Europe, to achieve comprehensive coverage of the European domain. For simplicity, in many cases the latter stations are placed on the outskirts of major cities, where possible close to facilities such as airports or research centers. In the entirely synthetic framework of this study, this approach should be suitable for illustrating the potential benefits of such an extended network. However, these sites are not meant to evaluate the optimal locations and density of a network, nor to be concrete proposals for new stations. For the extension of the real network, many aspects have to be considered, such as how representative a site is of a larger region (e.g. absence of important local sources), which requires a detailed analysis of the region of influence for each potential new site.
The atmospheric stations used in this study are compiled in Table 2 and shown in Fig. 1.

TM5-4DVAR inverse modeling system
We employ the TM5-4DVAR inverse modeling system, based on the TM5 model and its adjoint (Krol et al., 2008), and a four-dimensional variational optimization technique.
The main parts of the system are described in detail in Meirink et al. (2008a), and subsequent improvements in Bergamaschi et al. (2009). Briefly, the TM5-4DVAR system iteratively minimizes the cost function to find an optimal set of model parameters (control vector x):

$$ J(x) = \frac{1}{2} (x - x_B)^T B^{-1} (x - x_B) + \frac{1}{2} \sum_{i=1}^{n} \left( H_i(x) - y_{OBS,i} \right)^T R_i^{-1} \left( H_i(x) - y_{OBS,i} \right) \qquad (1) $$

where x_B is the a priori estimate of x, and B the parameter error covariance matrix (containing the uncertainties of the parameters and their correlations in space and time). The variable y_OBS,i denotes the set of observational data at time i, R_i their corresponding error covariance matrix, and H_i(x) the simulated concentrations corresponding to the observations.
In our case, the control vector x can be written as x = (s^T, c^T)^T. It consists of the monthly-mean surface emissions for each model grid cell, s, and the three-dimensional concentration field at the start of the inversion period, c. In contrast to inversions that apply detailed a priori emission inventories (e.g. Bergamaschi et al., 2009) and optimize different source categories independently, we optimize here only the total emissions for each grid cell.
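To make the structure of the cost function concrete, the sketch below evaluates a background term weighted by B plus an observation term weighted by R_i for a toy problem. It assumes the standard 4D-Var quadratic form with linear observation operators given as matrices; the function name and all numbers are hypothetical and unrelated to the actual TM5-4DVAR implementation, which uses the TM5 model and its adjoint rather than explicit matrices.

```python
import numpy as np

def cost_function(x, x_b, B, obs, R_list, H_list):
    """Toy 4D-Var cost: background term plus summed observation terms.

    Assumption: each observation operator H_i is linear and given as a matrix.
    """
    dx = x - x_b
    j_background = 0.5 * dx @ np.linalg.solve(B, dx)
    j_obs = 0.0
    for y_i, R_i, H_i in zip(obs, R_list, H_list):
        d = H_i @ x - y_i                      # model-minus-observation mismatch
        j_obs += 0.5 * d @ np.linalg.solve(R_i, d)
    return j_background + j_obs

# Two emission parameters, one observation time with a single measurement.
x_b = np.array([1.0, 1.0])                     # a priori parameters
B = np.diag([9.0, 9.0])                        # prior variances (sigma = 3, i.e. 300%)
H_list = [np.array([[1.0, 0.5]])]              # hypothetical sensitivities
obs = [np.array([2.5])]
R_list = [np.array([[0.25]])]

j_prior = cost_function(x_b, x_b, B, obs, R_list, H_list)
j_fit = cost_function(np.array([2.0, 1.0]), x_b, B, obs, R_list, H_list)
```

In this toy case, moving the first parameter from 1.0 to 2.0 removes the observation mismatch entirely at a small background penalty, so `j_fit` is far below `j_prior`; this is the trade-off the minimization balances.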
The parameter error covariance matrix B is split into spatial and temporal correlation matrices (Meirink et al., 2008a). Spatial correlations are modeled as Gaussian functions of the distance between grid cells, and temporal correlations as exponential functions of the time difference. In the spatial correlation function, corr is the spatial correlation between grid cells, D is the distance in km, and L_c is the correlation length in km. Vertical correlations of errors in the initial concentration field have been estimated using the National Meteorological Center (NMC) method as outlined in Meirink et al. (2006).

Table 2. List of the stations used in the synthetic experiments. Global flask measurement sites are named "RM". Over Europe the current observational network stations are denoted as "CS". The sites added to the CS network to form the extended network are called "EXT". Latitudes and longitudes are expressed in degrees. Altitudes are in meters. Sites have either continuous (CM) or flask (FM) measurements. Symbols "DY" and "NI" denote daytime and nighttime sampling, respectively.

In this study we apply a "semi-exponential" description of the probability density function (PDF) of a priori emission errors (Bergamaschi et al., 2009) to avoid negative a posteriori emissions. Because this "semi-exponential" approach is non-linear, we use a system with an outer loop for evaluation of the non-linear model and an inner loop for incremental optimization of the linearized model (for details see Bergamaschi et al., 2009). In a set of sensitivity experiments, we also evaluate the effect of using the "semi-exponential" PDF compared to a regular Gaussian PDF.

Inversion set-up
As a starting point of our inversions, we create an a priori emission inventory in which emissions are distributed homogeneously over land (except Antarctica), using an annual total of 500 Tg CH4/yr. In addition, we account for a homogeneously distributed total of 17 Tg CH4/yr over the ocean. Prior emissions on a global scale are shown in Fig. 2c. These homogeneous emissions are assumed to be constant in time. The a priori uncertainty of these emissions is set to large values (300% of the a priori grid-cell emissions).
This inversion set-up turns out to be consistent. This is supported by the mean relative difference between true and a priori emissions calculated over Europe at the grid scale (about 200%) and, above all, by studies of the frequency distribution of the true emissions versus the a priori in the European domain. Over land, the percentage of the true emissions lying within 300% of the prior emissions is up to 95% for the EU27 countries plus Norway, Switzerland, and former Yugoslavia. In this context, we should expect a posteriori frequency distributions in which emissions much smaller or much larger than the prior are less dominant. In particular, hot-spot regions (e.g. with emission values five times the prior) will hardly be retrieved by the system.
The choice of a homogeneous a priori distribution is motivated by our aim to test the model setup in a very challenging situation, in which we know very little about the a priori emissions. In first experiments, not shown in this paper, we started from more realistic distributions (e.g. prior emissions with regional patterns more closely resembling the true distribution both in emission values and in their spatial distribution), which produced a posteriori emissions unrealistically close to the true values, as expected from the simplified setup of these experiments.
The spatial correlation length for the emissions is set to a rather small value (50 km) to give the inverse modeling system a large degree of freedom to optimize the spatial emission patterns. A correlation length of 50 km implies that emissions in neighboring grid cells are significantly correlated over areas as wide as large metropolitan and industrial locations. Emissions in neighboring grid cells are correlated with correlation coefficients (values of the B matrix described in Sect. 2.1) equal to about 0.37. Furthermore, we assume a temporal error correlation time of one month. The inversions are run over a 12-month period (from 1 January 2001 until 1 January 2002), during which the monthly emissions are optimized. Each inversion, performed on a single processor of the IBM Power 5 cluster at the European Centre for Medium-Range Weather Forecasts (ECMWF), requires approximately 20 days of computational time.
We apply here the same sampling scheme for the atmospheric observations as in the previous inversions described by Bergamaschi et al. (2009). This implies that continuous measurements are sampled only once per day, to prevent the continuous measurements from over-constraining the inversion. Stations in the boundary layer are generally sampled during daytime (12:00-15:00 local time, LT), while mountain stations are sampled during nighttime (00:00-03:00 LT). This strategy avoids sampling in the shallow nighttime boundary layer and sampling during upslope transport for mountain stations.
The inversions are generally performed in two cycles. After a first inversion, we reject observations that differ by more than three sigma (overall observational and model representativeness error calculated during the inversion) from the a posteriori model simulation. When true atmospheric observations are used (e.g. Bergamaschi et al., 2009), such large deviations are normally caused by the inability of the model to represent the observation (e.g. due to local emissions or local circulation processes), and about 1% of the observations lie outside the three-sigma range. In the case of synthetic measurements, as used in this work, we should expect much higher data rejection. The reason is that, if the sampling site is located in an area with large true emissions, the synthetic observations will show high concentrations and the perturbations to these observations can potentially be large. Since the homogeneous a priori emissions are much lower in these areas, the inversion may have problems reproducing these observations. If we do not perturb the observations, we would expect less discrepancy between modeled and observed data. In this case, we should also not reject any synthetic data, as they all provide reliable information on the underlying emissions.
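The construction of a homogeneous a priori inventory can be sketched as follows: a fixed annual total is spread over land so that every land cell has the same flux per unit area. The grid, the rectangular land mask, and the function names are hypothetical stand-ins; a real mask would also set Antarctica to zero.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6

def cell_areas(lat_edges_deg, dlon_deg):
    """Area (m^2) of one grid cell per latitude band on a regular lat-lon grid."""
    lat = np.deg2rad(lat_edges_deg)
    return EARTH_RADIUS_M**2 * np.deg2rad(dlon_deg) * (np.sin(lat[1:]) - np.sin(lat[:-1]))

def homogeneous_prior(land_fraction, areas, total_tg_per_yr):
    """Distribute an annual total so that the flux per unit land area is uniform."""
    land_area = land_fraction * areas[:, None]
    return total_tg_per_yr * land_area / land_area.sum()

# Hypothetical 1x1 degree global grid with a toy rectangular "continent".
areas = cell_areas(np.linspace(-90.0, 90.0, 181), 1.0)
land_fraction = np.zeros((180, 360))
land_fraction[100:150, 180:240] = 1.0

prior = homogeneous_prior(land_fraction, areas, total_tg_per_yr=500.0)
flux = prior / areas[:, None]   # Tg CH4 per yr per m^2, identical in every land cell
```

Per-cell emissions still vary with latitude because cell areas shrink toward the poles; only the flux density is homogeneous.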
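As a rough numerical illustration of what a 50 km correlation length means, the sketch below evaluates a Gaussian correlation function at a few distances. The specific form corr = exp(-(D/L_c)^2) is an assumption: the text states only that the spatial correlations are Gaussian in distance. Under this assumed form the quoted value of about 0.37 equals e^-1, reached at D = L_c; a different normalization (for example a factor of 2 in the denominator) would place that value at a different distance.

```python
import numpy as np

def gaussian_corr(distance_km, length_km):
    """Assumed Gaussian spatial correlation: exp(-(D / L_c)^2)."""
    d = np.asarray(distance_km, dtype=float)
    return np.exp(-(d / length_km) ** 2)

L_C_KM = 50.0
distances_km = np.array([0.0, 25.0, 50.0, 100.0])
correlations = gaussian_corr(distances_km, L_C_KM)

for d, c in zip(distances_km, correlations):
    print(f"D = {d:5.1f} km  corr = {c:.3f}")
```

With this form the correlation is already close to zero at twice the correlation length, which is why such a short length leaves the inversion free to adjust nearby cells almost independently.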
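The once-per-day sampling of continuous records can be sketched as below. Averaging the hourly values inside the window is an assumption made for illustration (the text states only that continuous measurements are sampled once per day within the window); the function name and input layout are hypothetical, and times are taken to be whole hours in local time.

```python
import numpy as np

def daily_window_samples(hours_since_start, values, start_hour, end_hour):
    """Reduce an hourly record to one value per day.

    Assumption: the daily value is the mean of all hourly values whose
    local hour of day lies in [start_hour, end_hour).
    """
    hours = np.asarray(hours_since_start)
    values = np.asarray(values, dtype=float)
    day = hours // 24
    hour_of_day = hours % 24
    in_window = (hour_of_day >= start_hour) & (hour_of_day < end_hour)
    days = np.unique(day[in_window])
    means = np.array([values[in_window & (day == d)].mean() for d in days])
    return days, means

# Two days of hypothetical hourly mixing ratios (ppb).
hours = np.arange(48)
ch4 = 1800.0 + hours

bl_days, bl_samples = daily_window_samples(hours, ch4, 12, 15)   # boundary layer site
mt_days, mt_samples = daily_window_samples(hours, ch4, 0, 3)     # mountain site
```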
A second inversion cycle is performed using the reduced observational data set.The percentage of measurements that are rejected is generally less than 10%.The advantage of this two-step inversion should be seen in the light of "real world" inversion, where certain datasets cannot be represented by the model and need to be discarded.
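The rejection step between the two inversion cycles amounts to a simple three-sigma filter on the a posteriori residuals. The sketch below uses hypothetical arrays; in the real system, sigma would be the combined observational and model representativeness error calculated during the first inversion.

```python
import numpy as np

def keep_mask(y_obs, y_posterior, sigma, n_sigma=3.0):
    """True for observations within n_sigma of the a posteriori simulation."""
    residual = np.abs(np.asarray(y_obs, dtype=float) - np.asarray(y_posterior, dtype=float))
    return residual <= n_sigma * np.asarray(sigma, dtype=float)

# Hypothetical observations, first-cycle a posteriori values, and errors (ppb).
y_obs = np.array([1850.0, 1900.0, 2100.0, 1805.0])
y_post = np.array([1845.0, 1890.0, 1950.0, 1800.0])
sigma = np.array([10.0, 15.0, 40.0, 5.0])

keep = keep_mask(y_obs, y_post, sigma)
rejected_percent = 100.0 * (~keep).mean()
y_obs_second_cycle = y_obs[keep]   # reduced data set for the second inversion
```

In this toy case only the third observation (a 150 ppb residual against a 120 ppb threshold) is dropped.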

Inversion experiments
The inversions performed in this study are compiled in Table 3.
To investigate the constraining effect of different station types, we perform inversions where only observations of a single European station are used (but maintaining the global background stations).We select (i) Mace Head (MHD, 25 m a.s.l.), a station which samples the marine background, but is also frequently influenced by air masses from the UK and Ireland, and partly also from continental Europe; (ii) the tall tower at Cabauw (CB4, 200 m a.s.l.), a typical boundary layer station during daytime; (iii) Schauinsland (SIL, 1205 m a.s.l.) a mountain station of medium altitude.We apply continuous observations for these single sites (but weekly flask samples for the global background stations).The three inversions are denoted I1, I2, and I3 for MHD, CB4 and SIL respectively (Table 3).In addition, we calculate directly the sensitivities of these three sites to emissions, using "backplume" simulations based on the TM5 adjoint model (Krol et al., 2008).
In our main set of inversions (S1, S2, S3), we analyze the impact of the different observational networks on the retrieved emissions. As outlined in Sect. 2.1.2, these networks are denoted S1 (current stations), S2 (current stations continuous measurements), and S3 (extended network).
For S1, we describe additional sensitivity experiments (scenarios S1.1, S1.2, S1.3, S1.4) in the Appendix. The aim of these experiments is to test the influence of the assumptions on a priori uncertainty and correlation length for the emissions (S1.1 and S1.2), and the impact of using perturbed synthetic observations in our TM5-4DVAR system (S1.3 and S1.4).
In Sect. 4.1 and 4.2, we present some additional sensitivity studies, S1.a/b, S2.a/b, and S3.a/b, to investigate some bias encountered in inversions S1-S3. In these experiments the ocean emissions are set to zero, and both the semi-linear (a: using the semi-exponential PDF) and linear (b: using the regular Gaussian PDF) versions of the TM5-4DVAR system are applied. Furthermore, no perturbation of the synthetic observations is applied in these scenarios.

Derived CH4 emissions with inversion of single stations
We first analyze the potential of single stations to retrieve CH4 emissions over Europe. Figure 3 shows the European annual mean CH4 emission distributions resulting from the inversions (Fig. 3a, b, and c) together with footprints (Fig. 3d, e, and f) from single-station back-plume simulations. The footprints represent the sensitivity of the measurements to CH4 emissions (calculated similarly to Krol et al., 2008, in units of ppb/(kg/s)).
We observe that the inversions retain the a priori emission value over those European areas where the annual mean sensitivities are small.This suggests that air masses from these regions rarely reach the station location, and measurements at the specific site contain little information on emissions from areas with low calculated sensitivities.
Mace Head, MHD, a boundary layer/marine background station, has a footprint that covers mainly Ireland, the UK, and the upwind ocean sector.Furthermore, MHD is also partly sensitive to emissions from the north-west continental European region.We note that the single-station inversion wrongly assigns high emissions to areas southeast of the MHD station.
The Cabauw tall tower, CB4, a boundary layer station, has a strong sensitivity to emissions within a radius of ∼300 km around the station. Therefore, areas of high emissions are reasonably well retrieved over the Benelux countries, and partly over the UK, France and Germany. Over Northern France, CH4 emissions are somewhat overestimated in comparison to the true emissions.
Schauinsland, SIL, a mountain station at ∼1200 m altitude, is less sensitive than Cabauw and Mace Head to regional emissions. It retrieves a fairly homogeneous emission pattern over Germany and Northern France, in line with the back-plume calculation. The Schauinsland station thus provides weak constraints on the regional scale, but could potentially provide emission information for larger scales than those detected by boundary layer stations (due to the smaller influence of local sources).
In our modeling framework, the results discussed above illustrate that a single station is not sufficient to retrieve a reliable emission distribution over Europe when little a priori knowledge of emissions is assumed. However, a single station can be effective for small regions close to its location.

Scenarios S1, S2, and S3
Figure 4 shows the annual mean CH4 emission distribution over Europe obtained from scenarios S1 (current stations), S2 (all current stations with continuous measurements), and S3 (extended network). The true emission distribution is shown in Fig. 4a. In addition, Fig. 5 and Table 4 report annual total CH4 emissions calculated for several European countries.
The current observational network captures the spatial pattern of the CH4 emission distribution over Europe reasonably well, considering that the inversion was started with a uniform a priori distribution. CH4 emissions are adequately retrieved for the UK, Ireland, France, Germany and the Benelux, hereafter named the North West European countries (NWE). In these countries, major hot spots are retrieved (e.g. over the Benelux and the UK), consistent with the true distribution. Areas with high CH4 emissions are also visible in Eastern Europe, on the borders of Poland, the Czech Republic and Slovakia. However, they have a lower magnitude and are spread over larger areas.
The current network retrieves total emissions from the NWE countries within 5% of the true values (within 20% for single countries, e.g. France). As we start from a priori emissions for the NWE sector that are 45% lower than the true values (Table 4), this demonstrates the strong constraints that the observations place on the emissions of this sector.
Conversely, CH4 emissions from Scandinavian regions (Norway, Sweden and Finland), Southern Europe (Italy in particular) and Eastern Europe (Poland) are not adequately captured. In some cases (e.g. Spain) retrieved methane emissions are close to the true values, but this is mainly because the homogeneous a priori emissions were already close to the true emission totals.
In scenario S2 (current stations continuous measurements), there is an improvement in the regional spatial patterns, especially for the emission hot spot over Poland. Furthermore, in this scenario, the total emissions from the Scandinavian countries are better quantified. In Southern and Eastern Europe, however, CH4 emissions are still poorly retrieved.
In scenario S3 (extended network), a major improvement of derived emissions is achieved compared to S1 and S2. For the Scandinavian countries, country totals are closer to the true values for Sweden (difference of less than 7%) and Finland (about 32%). In the UK and Germany, CH4 emissions are retrieved to within about 7% and 2%, respectively. Major improvements are also seen in Eastern Europe (e.g. for Poland the difference between retrieved and true emissions is about 8%) and Southern Europe (e.g. Italy). However, there are also some countries (e.g. Norway, Denmark, and some Eastern European countries) for which derived total emissions are actually slightly worse than in scenario S2, with a positive bias of S3 compared to the true values. We will investigate these discrepancies further in Sect. 3.2.2.
To demonstrate the general improvement of the retrieved a posteriori methane emission pattern over land with an improved and extended network, we present in Table 5 the correlation coefficient (r) and the linear regression coefficients (offset and slope, b) of the derived monthly CH4 emissions versus the true values (more than 3000 data points). These values have been calculated for the land pixels representing the "enlarged" EU27 countries (EU27 with Norway, Switzerland, and former Yugoslavia), denoted here as En-EU27. Table 5 shows a substantial improvement in the correlation coefficient (0.47 for S1 and 0.74 for S3). Moreover, the offset decreases and the slope gets closer to 1.
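Statistics of the kind reported in Table 5 can be computed as in the sketch below, which regresses derived emissions on true emissions over all grid cells and months. The arrays are synthetic placeholders, not values from the paper, and the function name is hypothetical.

```python
import numpy as np

def compare_to_truth(true_emis, derived_emis):
    """Correlation coefficient and linear fit of derived versus true emissions."""
    t = np.ravel(true_emis).astype(float)
    d = np.ravel(derived_emis).astype(float)
    r = np.corrcoef(t, d)[0, 1]
    slope, offset = np.polyfit(t, d, deg=1)   # fit d = slope * t + offset
    return r, offset, slope

# Placeholder data: a retrieval that damps the true pattern and adds a constant.
true_emis = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
derived_emis = 0.5 * true_emis + 1.0

r, offset, slope = compare_to_truth(true_emis, derived_emis)
```

A perfect retrieval would give r = 1, an offset of 0 and a slope of 1; a slope below 1 with a positive offset, as in this placeholder, indicates that the retrieved field is pulled toward a uniform value.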

Sensitivity scenarios S1.1-S1.4
In Appendix A1 and A2, we discuss the sensitivity of our results (for scenario S1) to changes in the correlation length and in the a priori errors of the emissions. Moreover, we analyze inversions in which we used unperturbed pseudo-observations (S1.3) and pseudo-observations with smaller perturbations (S1.4). Results from scenarios S1.1 and S1.2 suggest that the choice of our experimental framework (with a correlation length of 50 km and an error on the prior emissions of 300%) produces the best agreement between derived and true emissions over the NWE sector (e.g. Table 4), a region where the CS observational network provides good constraints. Furthermore, from experiments S1.3 and S1.4 we conclude that our system is not very sensitive to the random perturbation of the pseudo-observations. However, there are several aspects of the a posteriori emission distributions calculated in scenarios S1, S2 and S3 that require further analysis:
1. All scenarios fail to retrieve the substantial CH4 emissions from gas and oil production in the North Sea that are present in the true emission distribution (Fig. 4).
2. For some countries, the a posteriori annual total emissions of scenario S3 (our "best network") appear to be biased high compared to the true value.The overestimate provided by S3 is often larger than for scenarios S1 and S2 (e.g. for En-EU27, scenario S3 provides annual total emissions overestimated by 11%; S2 by 7%, and S1 by 5%).
3. The a posteriori emissions in grid cells close to the station locations are in many cases biased high compared to the true emissions.
These issues will be discussed in the next sections.

Emissions from the North Sea
In scenarios S1, S2 and S3, CH4 emissions from the area over the North Sea between the UK and the Scandinavian countries (located between 0° and 5° E, and 54° and 62° N, and denoted as "GBout" in Table 4) are not properly retrieved. This is mainly due to the very small a priori emissions over oceans, and partly to the lack of constraints over this region (there are few observational sites located close to the GBout area). As a result of the very small a priori emissions over oceans, the absolute a priori errors (chosen as 300% of the a priori emissions for both land and sea pixels) are relatively small. Therefore, the inversion system will be inclined to assign the emissions to the UK or the Scandinavian countries. This is confirmed by sensitivity experiments S1.a/b, S2.a/b, and S3.a/b, where the ocean emissions (including the oil and gas emissions in the North Sea) are removed from the true and a priori emission distributions. The next section discusses these scenarios in more detail.

Positive bias in scenario S3
Further experiments are conducted that can help to explain the positive bias found for some countries when the highest-density network is used. In these experiments, the ocean emissions (including oil and gas emissions over the North Sea) have been removed in the calculation of the pseudo-observations that are fed into the inversions. Furthermore, no perturbation of the synthetic observations is applied, i.e. overall more ideal settings are used. For these scenarios, we stopped after the first inversion to avoid rejecting potentially important unperturbed observations (an issue also mentioned in Sect. 2.3).
In the first set of experiments (S1.a, S2.a, S3.a), we apply the semi-linear inversion of the TM5-4DVAR system (as for scenarios S1-S3). In the second set of experiments (S1.b, S2.b, S3.b) we use the linear version, i.e. with normally distributed a priori error PDFs.
In Fig. 5, annual totals of the a posteriori CH4 emissions from S1.a, S2.a, and S3.a show lower emissions compared to S1-S3 for En-EU27 and in particular for Norway, Denmark, and the UK (Table 4). In each case the difference for En-EU27 (e.g. S3 versus S3.a) is approximately 0.5 Tg CH4/yr, which roughly corresponds to the true value of the total annual anthropogenic CH4 emissions over the North Sea in the offshore region ("GBout" in Table 4). This suggests that for scenarios S1, S2 and S3 the higher derived emissions from the Scandinavian countries and the UK are actually a result of the anthropogenic emissions from the North Sea, which the TM5-4DVAR system cannot retrieve properly and erroneously attributes to the areas near the stations located on the coastlines (see also Sect. 4.3). Nevertheless, the a posteriori emissions (EU15, En-EU27) are still slightly overestimated in e.g. S3.a. Scenarios S1.b, S2.b, and S3.b generally show a much better agreement with the true emissions for the annual country totals (Fig. 5). This suggests that the semi-linear version of the TM5-4DVAR system introduces a small positive bias. The semi-linear version was introduced to suppress negative values in the a posteriori emissions and to suppress strong dipole structures (Bergamaschi et al., 2009). The semi-linear scheme employs a skewed emission probability density function (PDF) with smaller probabilities towards zero emission. Emissions larger than the a priori value follow a normal Gaussian distribution. When the a posteriori emissions are aggregated to the country scale, this assumption leads to a small positive bias of about 5% (based on the annual totals for En-EU27). The linear version indeed produces very pronounced dipole structures with negative emissions in the a posteriori emission distributions at the grid scale (not shown). These dipole structures cancel out when aggregated to country-scale totals.
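The asymmetry behind this positive bias can be illustrated with a toy comparison of a linear and a semi-exponential emission mapping. The mapping used here (exponential for negative deviations from the prior, linear for positive ones) is an assumption chosen to match the description above; the exact formulation is given in Bergamaschi et al. (2009) and is not reproduced in this text. The sketch shows only the direction of the effect, not the roughly 5% magnitude found in the actual inversions.

```python
import numpy as np

def linear_emission(x, e_prior):
    """Gaussian (linear) case: emissions can become negative."""
    return e_prior * (1.0 + np.asarray(x, dtype=float))

def semi_exponential_emission(x, e_prior):
    """Assumed semi-exponential mapping: exp(x) below the prior, 1 + x above it."""
    x = np.asarray(x, dtype=float)
    below = np.exp(np.minimum(x, 0.0))   # always positive, tends to zero for x << 0
    return e_prior * np.where(x < 0.0, below, 1.0 + x)

# Hypothetical relative deviations drawn with sigma = 3 (a 300% prior uncertainty).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 3.0, size=100_000)

e_lin = linear_emission(x, e_prior=1.0)
e_semi = semi_exponential_emission(x, e_prior=1.0)
bias = e_semi.mean() - e_lin.mean()      # positive, since exp(x) >= 1 + x for all x
```

For the same underlying deviations, the semi-exponential mapping never goes negative and lies at or above the linear mapping everywhere, so any aggregate over many cells is shifted upward.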

Retrieved emission hot-spots near measurement stations
Grid points close to boundary layer stations show a clear tendency to overestimate a posteriori emissions (e.g. for Cabauw in the Netherlands, London in the UK, Saclay in France, and Quistello and San Pietro Capofiume in the Po Valley). This feature becomes even clearer in the extended network. In the areas that are not covered by the current network, the additional stations cause higher a posteriori emissions in the grid cells close to the stations. One possible explanation is that homogeneous a priori emissions around these stations are often lower than the true emissions. The inversion system selects the most efficient way to match the pseudo observations. Thus, increasing the emissions in one grid box close to the station results in a lower perturbation of the background term of the cost function (Eq. 1) than enhancing emissions over larger areas. As shown in Fig. A1, a larger correlation length has the effect of reducing these artificially large emissions at some stations, but, as discussed in Appendix A1, also smears out the retrieved peak values. We are currently investigating the cause of this tendency. At the moment, we conclude that emissions retrieved on individual grid cells should not be over-interpreted. Retrieved emissions should always be analyzed at scales larger than a few grid cells.
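The argument about the background term can be sketched with a toy problem, under strong simplifying assumptions: a hypothetical one-dimensional row of grid cells, a single station, an invented sensitivity that decays with distance from the station, no observation error, and an exponentially decaying a priori error correlation. None of these choices is taken from TM5-4DVAR. The increment that reproduces a given station signal dy at minimum background cost dx^T B^-1 dx is dx = B h dy / (h^T B h).

```python
import numpy as np

def min_cost_increment(h, corr_length, sigma=1.0, dy=1.0):
    """Emission increment that reproduces a signal dy at one station while
    minimizing the background term dx^T B^-1 dx (no observation error)."""
    idx = np.arange(h.size)
    # Assumed exponential decay of the a priori error correlation with distance
    # (distance and corr_length are both measured in grid cells here).
    B = sigma**2 * np.exp(-np.abs(idx[:, None] - idx[None, :]) / corr_length)
    Bh = B @ h
    return Bh * dy / (h @ Bh)

# Hypothetical 1-D row of 41 grid cells with a station in the central cell.
cells = np.arange(41)
station = 20
# Illustrative sensitivity of the station signal to the emissions of each cell.
h = np.exp(-np.abs(cells - station) / 2.0)

dx_short = min_cost_increment(h, corr_length=0.1)  # nearly uncorrelated cells
dx_long = min_cost_increment(h, corr_length=5.0)   # strongly correlated cells

print("peak cell (short correlation length):", int(dx_short.argmax()))
print("peak increment, short vs long: %.3f vs %.3f" % (dx_short.max(), dx_long.max()))
```

In this sketch the cheapest increment peaks in the station cell, and a longer correlation length lowers the peak while spreading the increment over neighbouring cells.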

Conclusions
Using Scandinavian countries. However, for realistic emission estimates over the whole European domain, a much larger extension of the existing network is required. We demonstrate the potential of an extended network of a total of ∼60 European stations to provide realistic emission estimates over the whole European domain. We note that in this work we did not attempt to design an "optimal network", which could derive realistic CH4 emissions over the whole European domain also with fewer than 60 observational sites. We leave this investigation for future studies.
It is important to realize that with the current observational network we cannot retrieve emissions from Southern and Eastern European countries properly. In the absence of observational sites, the knowledge of the a priori emission distribution becomes essential, as the optimized emissions will remain close to the prescribed a priori distribution.
Finally, we investigated some important aspects of the TM5-4DVAR system.
- We demonstrated that continuous atmospheric observations provide strong constraints on emissions at regional scale, and allow deriving the major features of their spatial distributions. Increasing the network density markedly improves the agreement between retrieved and true emission patterns (clearly visible in the correlation coefficients). However, derived emissions of the individual model grid cells show major differences compared to the true emissions. In particular, we need to further investigate the overestimated emissions attributed to areas close to boundary layer stations sampling in regions with high emissions.
- We did not use detailed a priori emission inventories for the inversions. Thus, we optimized emissions entirely from the atmospheric signal. The major a priori assumption in our model settings is that emissions are distributed mainly over land. While this assumption seems generally reasonable, it leads to some artifacts if large sources are located over the ocean. We showed that the system cannot retrieve the localized anthropogenic CH4 emissions over the North Sea properly, and allocates them to the surrounding countries where observational sites are present.
- The semi-linear TM5-4DVAR version provides generally consistent emission patterns for the European countries. However, it also introduces a small positive bias (about 5% higher derived CH4 emissions) compared to the linear version.
- We applied a large perturbation to our synthetic observations (50% of estimated representativeness error). However, since this perturbation was random, it had a relatively small effect on the derived country aggregated emissions. Future studies should also investigate the potential impact of systematic errors in more detail.

A1 Sensitivity tests on the a priori emission error and its spatial correlation length
To test the system with different choices for the a priori emission error and its spatial correlation length, we perform experiments S1.1 and S1.2. Results are tabulated in Table 3. In S1.1, the uncertainty of the a priori emissions is set to 1000% of the a priori emission values, and the spatial correlation length to 10 km. In S1.2, the uncertainty remains 300%, and the correlation length is set to 200 km. These settings imply the following. In S1.1, the system has many more degrees of freedom and can potentially deviate more strongly from the a priori emissions, with marked variation between grid cells due to the small spatial correlation length. In scenario S1.2, we may expect areas with high CH4 emissions with values of the same order as S1, but with stronger correlation at regional scale.
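The effect of the two correlation lengths can be made concrete with a short sketch. It assumes that the a priori error correlation decays exponentially with distance and that neighbouring grid cells are separated by a hypothetical 100 km; neither the functional form nor the separation is taken from the model description.

```python
import math

def prior_error_correlation(distance_km, corr_length_km):
    """Assumed exponential decay of the a priori error correlation with distance."""
    return math.exp(-distance_km / corr_length_km)

# Settings quoted in the text: (relative uncertainty, correlation length in km).
settings = {"S1.1": (10.0, 10.0), "S1.2": (3.0, 200.0)}

# Hypothetical separation between two neighbouring grid cells.
distance_km = 100.0

for name, (rel_unc, length_km) in settings.items():
    rho = prior_error_correlation(distance_km, length_km)
    print(f"{name}: uncertainty {rel_unc:.0%}, "
          f"correlation at {distance_km:.0f} km = {rho:.5f}")
```

Under these assumptions, neighbouring cells are practically independent in S1.1 and remain substantially correlated in S1.2, which is consistent with the expectations stated above.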
Figure A1 shows the a posteriori annual mean CH4 emission distributions for scenarios S1.1 and S1.2. The annual mean emission distribution derived for S1.1 has a more distinct spatial structure than S1 (e.g. for the UK, Eastern Europe, and the Benelux countries). Hot spot areas are more localized with higher values for the peaks compared to S1.
As expected, scenario S1.2 shows derived annual mean CH4 emissions with more homogeneously distributed spatial patterns. As an effect of the larger value for the spatial correlation length, emission hot spots are distributed over wider areas (e.g. UK, Benelux, Eastern Europe), and emission peak values are generally smaller compared to the S1 and S1.1 scenarios. In contrast, in some areas north-east of the HU1 site, the system "detects" large emission patterns with high peak values. This could be an artifact caused by the optimization process. Because the observations do not provide enough information (the CS network cannot properly constrain the Eastern European regions), higher emission values are distributed over neighbouring grid cells according to the correlation length value.
The annual totals at the country scale presented in Table 4 show that both S1.1 and S1.2 overestimate the derived CH4 emissions for En-EU27 (by about 10%) compared to S1. The settings chosen for S1 yield annual mean CH4 emissions closer to the true emissions (less than 5% difference for En-EU27).

A2 Sensitivity tests on the perturbations applied to synthetic observations
To assess the robustness of the a posteriori emissions, we tested our TM5-4DVAR system by using (i) unperturbed synthetic observations (scenario S1.3 in Table 3), and (ii) synthetic observations perturbed with random noise with amplitude equal to the measurement error of 3 ppb (scenario S1.4) instead of the 50% of the representativeness error estimated from the forward simulation. Firstly, we analyze the fraction of measurements that are rejected after the first optimization cycle. The fractions of measurements that are not used in the second cycle of the inversion are 7%, 2%, and 2% for scenarios S1, S1.3, and S1.4, respectively. This result is expected, since when measurements are not or are less strongly perturbed, the system is able to reproduce these pseudo-measurements better.
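The three sets of pseudo-observations can be sketched as follows. All numerical values (mixing ratios, representativeness errors, number of observations) are hypothetical placeholders rather than output of the TM5 forward simulation, and the noise is assumed to be zero-mean Gaussian.

```python
import numpy as np

rng = np.random.default_rng(42)
n_obs = 10_000

# Hypothetical unperturbed synthetic CH4 mixing ratios (ppb) from a forward run.
truth_ppb = 1800.0 + 50.0 * rng.random(n_obs)
# Hypothetical representativeness error of each observation (ppb).
repr_err_ppb = 5.0 + 25.0 * rng.random(n_obs)

def perturb(obs_ppb, sigma_ppb, rng):
    """Add zero-mean Gaussian noise; sigma_ppb may be a scalar or an array
    with one value per observation."""
    return obs_ppb + sigma_ppb * rng.standard_normal(obs_ppb.shape)

obs_s1 = perturb(truth_ppb, 0.5 * repr_err_ppb, rng)  # 50% of representativeness error
obs_s13 = truth_ppb.copy()                            # no perturbation
obs_s14 = perturb(truth_ppb, 3.0, rng)                # 3 ppb measurement error

for name, obs in [("S1", obs_s1), ("S1.3", obs_s13), ("S1.4", obs_s14)]:
    rms = np.sqrt(np.mean((obs - truth_ppb) ** 2))
    print(f"{name}: RMS perturbation = {rms:.2f} ppb")
```

With these placeholder values the S1-type perturbation is several times larger than the 3 ppb noise of S1.4, which mirrors the ordering of the perturbation strengths in the three scenarios.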
We found that the spatial patterns of the derived European annual mean emission distributions (not shown here) do not differ notably among the three scenarios. At the country scale, for the En-EU27 (Table 4), the total annual emissions for S1, S1.3, and S1.4 generally differ by less than 2%. This shows that the perturbation of the pseudo-measurements does not influence the results significantly.

Fig. 1. (a) TM5 global domain with the two embedded zoom regions. (b) Inner zoom region over Europe. The station locations are also shown (abbreviations as in Table 2). Remote stations constituting the global background network are shown as green triangles. Current observational network sites, CS, are represented by red dots (continuous measurements) and red triangles (flask measurements). Yellow dots represent the station sites chosen to extend the observational network over Europe.

Fig. 2. Annual mean methane emission distributions: global true methane emission distribution (a), European true methane emission distribution (b), and global a priori emission distribution (c). Black dots (CM sampling) and triangles (FM sampling) are the station locations for the CS network.

Fig. 5. Total annual emissions for selected EU countries. Horizontal bars represent the true value; grey vertical bars are the a priori values for each country. Colored vertical bars represent annual total emissions for scenarios in Table 4. Orange: solid is S1, vertical line pattern is S1.a, hatched pattern is S1.b. Light green: solid is S2, vertical line pattern is S2.a, hatched pattern is S2.b. Dark green: solid is S3, vertical line pattern is S3.a, hatched pattern is S3.b. Country abbreviations are as in Table 4.

Table 1. Bottom-up emission inventories used to generate the synthetic observations.

Table 3. List of the sensitivity tests discussed in this work.

Table 4. Annual total methane emissions for European countries (units in Tg CH4/yr).