uncertainties in atmospheric trace gas inversions using hierarchical Bayesian methods.

We present a hierarchical Bayesian method for atmospheric trace gas inversions. This method is used to estimate emissions of trace gases as well as “hyper-parameters” that characterize the probability density functions (PDFs) of the a priori emissions and model-measurement covariances. By exploring the space of “uncertainties in uncertainties”, we show that the hierarchical method results in a more complete estimation of emissions and their uncertainties than traditional Bayesian inversions, which rely heavily on expert judgment. We present an analysis that shows the effect of including hyper-parameters, which are themselves informed by the data, and show that this method can serve to reduce the effect of errors in assumptions made about the a priori emissions and model-measurement uncertainties. We then apply this method to the estimation of sulfur hexafluoride (SF6) emissions over 2012 for the regions surrounding four Advanced Global Atmospheric Gases Experiment (AGAGE) stations. We find that improper accounting of model representation uncertainties, in particular, can lead to the derivation of emissions and associated uncertainties that are unrealistic and show that those derived using the hierarchical method are likely to be more representative of the true uncertainties in the system. We demonstrate through this SF6 case study that this method is less sensitive to outliers in the data and to subjective assumptions about a priori emissions and model-measurement uncertainties than traditional methods.


1 Introduction
Inverse modeling is widely used to estimate sources and sinks of trace gas fluxes and their distributions using measurements of atmospheric mole fractions and chemical transport models (CTMs). The estimation of surface fluxes has been performed at a variety of spatial and temporal scales, ranging from regional (e.g., Stohl et al., 2009; Manning et al., 2011; Rigby et al., 2011) to global (e.g., Chen and Prinn, 2006; Bousquet et al., 2011) and for timescales of hours to years.
A. L. Ganesan et al.: Uncertainty quantification in trace gas inversions

Most inversions utilize a Bayesian framework and incorporate a priori information to condition the system, as shown by Eq. (1) (normalizing factors are not shown throughout this text for brevity) (Enting et al., 1995).

ρ(x|y) ∝ ρ(y|x)ρ(x)    (1)

The Bayesian framework with Gaussian probability density functions (PDFs) and linear models used in most trace gas inversions gives rise to the cost function shown in Eq. (2). The deviations between measurements, y, and model-simulated mole fractions, Hx, where H is a matrix that contains the sensitivities of atmospheric mole fractions to changes in emissions sources and x is a vector containing the emission sources, are weighted by the uncertainty covariance, R. Similarly, deviations between emissions and their a priori values, x prior, are weighted by the uncertainty covariance, P. This cost function is minimized with respect to x to find the "optimal" point that minimizes the total mismatch of the two terms (Enting, 2002; Tarantola, 2005).
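For reference, the cost function described above can be written in the following standard form. The symbol J and the omission of an overall factor of 1/2 are notational assumptions; neither affects the location of the minimum. The second line is the well-known closed-form minimizer for the linear Gaussian case with fixed R and P.

```latex
\begin{align}
J(\mathbf{x}) &= (\mathbf{y} - \mathbf{H}\mathbf{x})^{T}\,\mathbf{R}^{-1}\,(\mathbf{y} - \mathbf{H}\mathbf{x})
 + (\mathbf{x} - \mathbf{x}_{\mathrm{prior}})^{T}\,\mathbf{P}^{-1}\,(\mathbf{x} - \mathbf{x}_{\mathrm{prior}}) \\
\hat{\mathbf{x}} &= \mathbf{x}_{\mathrm{prior}}
 + \mathbf{P}\mathbf{H}^{T}\left(\mathbf{H}\mathbf{P}\mathbf{H}^{T} + \mathbf{R}\right)^{-1}
 \left(\mathbf{y} - \mathbf{H}\mathbf{x}_{\mathrm{prior}}\right)
\end{align}
```

Both R and P enter the solution directly, so any error in their specification carries through to the estimated emissions.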
This framework has generally been used because it is simple to solve. However, several limitations are present. Bayesian methods rely on knowledge of model-measurement (R) and a priori emissions (P) uncertainties, and the derived fluxes and associated uncertainties strongly depend on these parameters. Model-measurement uncertainties describe uncertainties in the instruments as well as uncertainties associated with the model's simulation of a measurement. The model error can be split into several components: structural errors within the CTM or meteorological model (Peylin et al., 2002; Thompson et al., 2011); model representation errors, which arise when a point measurement is used to represent a grid volume (Chen and Prinn, 2006); and aggregation errors, which result from averaging parameters over space and time and assuming fixed distributions within those domains (Kaminski et al., 2001; Thompson et al., 2011). Knowledge of these uncertainties is critical for robustly estimating posterior fluxes and their uncertainties; however, they are largely elicited through "expert judgment". The lack of an objective methodology for ascertaining these uncertainties has been identified in many studies as a major limitation of traditional inverse methods (Rayner et al., 1999; Peylin et al., 2002; Law et al., 2002; Kaminski et al., 1999).
Some studies have used atmospheric data to "tune" covariances and then used these optimized parameters to derive fluxes (Michalak et al., 2005; Berchet et al., 2013). Several issues exist in these methods: (1) the uncertainties in the tuned R and P cannot be propagated through to the estimated fluxes and their uncertainties; (2) the Bayesian statistics used to derive fluxes assume that each term of the cost function in Eq. (2) is independent, but because both the a priori emissions and observations were used to derive the tuned R and P, independence between the two terms can no longer be assumed and correlations between the two will exist that are not fully accounted for; and (3) these methods typically use Gaussian PDFs in setting up the cost function.
We present a hierarchical Bayesian method to estimate trace gas emissions and additional parameters, which we call "hyper-parameters", that describe the a priori emissions PDF and the model-measurement uncertainty PDF. Compared to traditional methods, the a priori information in the system is extended to include a set of hyper-parameters equipped with their own prior distributions, which we call "hyper-priors". Throughout the text, we refer to the emissions PDF that is characterized by these hyper-parameters as the "a priori emissions PDF".
We first describe the development of a hierarchical framework. Beginning with Bayes' theorem in Eq. (1), we seek to estimate the joint distribution of two parameters, x and θ, using data y (Eq. 3).
A joint distribution can be decomposed using the "probability chain rule" (Eq. 4).
Substitution of Eq. (4) into Eq. (3) leads to a hierarchical Bayesian model, in which both x and θ are informed by the data.
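Written out, the three steps above take the following form. This is a sketch reconstructed from the description, with θ standing for the set of hyper-parameters and normalizing factors omitted as elsewhere in the text.

```latex
\begin{align}
\rho(\mathbf{x}, \boldsymbol{\theta} \mid \mathbf{y}) &\propto
  \rho(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\theta})\,\rho(\mathbf{x}, \boldsymbol{\theta}) \tag{3} \\
\rho(\mathbf{x}, \boldsymbol{\theta}) &=
  \rho(\mathbf{x} \mid \boldsymbol{\theta})\,\rho(\boldsymbol{\theta}) \tag{4} \\
\rho(\mathbf{x}, \boldsymbol{\theta} \mid \mathbf{y}) &\propto
  \rho(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\theta})\,\rho(\mathbf{x} \mid \boldsymbol{\theta})\,\rho(\boldsymbol{\theta}) \tag{5}
\end{align}
```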
Hierarchical methods have been successfully applied in other fields (Riccio et al., 2006; Gelman and Hill, 2002; Lehuger et al., 2009). A general summary and the application to uncertainty analysis can be found in Cressie et al. (2009). In this application, in which we estimate hyper-parameters in addition to fluxes, a hierarchical approach allows us to explore the additional space of "uncertainties in uncertainties". The framework ensures that estimated parameters and their uncertainties and covariances are passed systematically through the inversion. The entire set of fluxes and hyper-parameters is updated in one step, therefore using the measurements only once.
Because there often does not exist an analytical solution to maximize the posterior PDF represented in Eq. (5), Markov chain Monte Carlo (MCMC) is used (Tarantola, 2005). MCMC samples the PDFs of a set of parameters by constructing a Markov chain that represents the posterior PDF after a large number of steps. MCMC has the additional advantage that it may be used on a broad class of models, which need not be Gaussian. For example, positive fluxes can be constrained through the use of an a priori emissions PDF with support only on the positive real axis (Rigby et al., 2011).
We show that the hierarchical method results in a more complete uncertainty characterization than traditional inverse methods in which a priori emissions and model-measurement covariances are based largely on expert judgment. We present an application of this method for inversions of trace gas emissions and explore the ways in which the method can be used to quantify uncertainties in these inversions. Finally, we utilize this method to estimate regional sulfur hexafluoride (SF6) emissions using measurements from the Advanced Global Atmospheric Gases Experiment (AGAGE) network and the UK Met Office Numerical Atmospheric-dispersion Modelling Environment v3 (NAME) transport model (Jones et al., 2007; Ryall and Maryon, 1998).
2 Application of hierarchical Bayesian modeling to emissions estimation

Theoretical framework
We are interested in estimating fluxes of trace gases and their uncertainties using measurements of atmospheric mole fractions. Fluxes and hyper-parameters could vary in space and time and are shown in this framework as vectors that could be estimated with spatial and temporal resolution. We apply the hierarchical Bayesian model to use data, y, to estimate x, a vector of emissions and boundary conditions to the inversion domain, as well as a set of hyper-parameters that govern the a priori emissions and model-measurement uncertainty PDFs. The hyper-parameters include the vectors µ x and σ x, which describe the log mean and log standard deviation of a lognormal a priori emissions PDF; the vector σ y, which describes the standard deviation of a Gaussian model-measurement uncertainty PDF; and the scalar τ, which is a model-measurement autocorrelation timescale. The model-measurement covariance matrix, R, is formed with diagonal terms given by the squares of σ y. Off-diagonal terms are computed through Eq. (6), where r ij is the covariance between measurements i and j, r ii and r jj are the variances of each measurement, t i,j is the time between measurements and τ is the autocorrelation timescale.
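The construction of R can be sketched in a few lines of Python. The exponential decay used for the off-diagonal terms, r_ij = sqrt(r_ii r_jj) exp(-|t_i - t_j|/τ), is an assumption chosen to be consistent with the symbols defined above; the function name `build_covariance` and its arguments are hypothetical.

```python
import math


def build_covariance(sigma_y, times, tau):
    """Build a model-measurement covariance matrix R as nested lists.

    sigma_y : standard deviation of each measurement
    times   : measurement times, in the same units as tau
    tau     : autocorrelation timescale; tau = 0 gives a diagonal R

    Assumption: off-diagonal terms decay exponentially with time separation.
    """
    n = len(sigma_y)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                # Diagonal terms are the variances (squares of sigma_y).
                R[i][j] = sigma_y[i] ** 2
            elif tau > 0.0:
                dt = abs(times[i] - times[j])
                # sqrt(r_ii * r_jj) equals sigma_y[i] * sigma_y[j].
                R[i][j] = sigma_y[i] * sigma_y[j] * math.exp(-dt / tau)
    return R
```

For example, `build_covariance([0.1, 0.2, 0.1], [0.0, 3.0, 6.0], tau=6.0)` gives a symmetric 3 × 3 matrix whose correlations weaken as the time between measurements grows.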
The joint distribution of x, µ x , σ x , σ y and τ is expressed through Eq. (7), following the framework developed above.
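Following Eq. (5), the joint posterior can be factorized as shown below. This is a sketch under two assumptions: the hyper-priors are mutually independent, and the likelihood depends on σ_y and τ only through R.

```latex
\begin{equation}
\rho(\mathbf{x}, \boldsymbol{\mu}_x, \boldsymbol{\sigma}_x, \boldsymbol{\sigma}_y, \tau \mid \mathbf{y})
\propto
\rho(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\sigma}_y, \tau)\,
\rho(\mathbf{x} \mid \boldsymbol{\mu}_x, \boldsymbol{\sigma}_x)\,
\rho(\boldsymbol{\mu}_x)\,\rho(\boldsymbol{\sigma}_x)\,
\rho(\boldsymbol{\sigma}_y)\,\rho(\tau)
\tag{7}
\end{equation}
```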
It is shown in Eq. (7) that each hyper-parameter requires a hyper-prior PDF to be specified. We have chosen to represent the PDF of each parameter by Eqs. (8)-(13). The lognormal distribution (LN) was used for emissions and PDF parameters because the distribution is skewed so that negative values are not defined but large values have small nonzero densities. An exponential PDF (EXP) was used for τ because the mode is zero, and in most inversions it is generally assumed that there is no model-measurement autocorrelation. Model-measurement uncertainties were assumed to be Gaussian (N) because we assume random errors in the instrument and model to be symmetric. The functional forms used in this application of the hierarchical Bayesian framework represent only one of many possible applications and can be reformulated to represent different assumptions. Model parametric uncertainties have not been explicitly considered here but have the potential to lead to biases in the outcome of the inversion.
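The three functional forms named above can be written as log-densities, which is the form typically needed by a sampler. This is an illustrative sketch; the function names are hypothetical, and the parameterization of the lognormal by its log mean and log standard deviation follows the description of µ x and σ x.

```python
import math


def log_lognormal(v, mu, sigma):
    """Log-density of LN(mu, sigma); negative values have zero density."""
    if v <= 0.0:
        return -math.inf
    return (-math.log(v * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(v) - mu) ** 2 / (2.0 * sigma ** 2))


def log_exponential(v, mean):
    """Log-density of an exponential PDF with the given mean; mode at zero."""
    if v < 0.0:
        return -math.inf
    return -math.log(mean) - v / mean


def log_normal(v, mean, sd):
    """Log-density of a Gaussian PDF, symmetric about its mean."""
    return (-math.log(sd * math.sqrt(2.0 * math.pi))
            - 0.5 * ((v - mean) / sd) ** 2)
```

The exponential density is largest at τ = 0, which matches the stated preference for no autocorrelation unless the data suggest otherwise.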
ρ(σ y ) = LN(σ y,prior , σ σ,y,prior )    (10)

The posterior joint distribution, ρ(x, µ x, σ x, σ y, τ|y), is estimated using MCMC with a Metropolis-Hastings algorithm (Rigby et al., 2011; Tarantola, 2005). The Metropolis-Hastings algorithm generates states from a proposal distribution and selectively accepts transitions so that the stationary distribution of the resulting chain represents the posterior distribution. A "burn-in" period of 25 000 iterations was discarded to remove any memory of the initial state, followed by 25 000 iterations to form the posterior PDFs. Convergence can be assessed using metrics such as Geweke's diagnostic (Geweke, 1992). One of the main advantages of this algorithm is that the normalization factor implied in Eq. (7) does not need to be computed. The chain is constructed such that each parameter has "knowledge" of the state of the other parameters, and, therefore, uncertainties and correlations between parameters are built into the chain. Using this hierarchical approach, posterior emissions and associated uncertainties, which are of primary interest, fully account for uncertainties in hyper-parameters. The set of parameters being estimated (x, µ x, σ x, σ y, τ) is denoted as X. At iteration n, the proposed set of parameters, X′, is accepted or rejected according to the Metropolis-Hastings acceptance criterion. The sizes of the proposal distributions were adjusted so that the resulting acceptance ratio for each parameter was between 0.25 and 0.5 to achieve optimal mixing (Roberts et al., 1997).
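The sampling procedure can be illustrated with a deliberately small toy problem. The sketch below is not the configuration used in this study: it estimates a single emission x and a single hyper-parameter σ_y from pseudo-data, keeps the emissions prior parameters fixed, and uses a symmetric Gaussian random-walk proposal so that the acceptance test reduces to a ratio of posterior densities. All function names, step sizes and data values are hypothetical.

```python
import math
import random


def log_lognormal(v, mu, sigma):
    """Log-density of a lognormal PDF with log-mean mu and log-sd sigma."""
    if v <= 0.0:
        return -math.inf
    return (-math.log(v * sigma * math.sqrt(2.0 * math.pi))
            - (math.log(v) - mu) ** 2 / (2.0 * sigma ** 2))


def log_posterior(x, sigma_y, y, h, hyper):
    """Unnormalized log-posterior of emission x and uncertainty sigma_y."""
    if x <= 0.0 or sigma_y <= 0.0:
        return -math.inf
    # Gaussian likelihood. The -log(sigma_y) term is kept because sigma_y
    # is itself estimated, so it changes the likelihood normalization.
    log_lik = sum(-math.log(sigma_y) - 0.5 * ((yi - hi * x) / sigma_y) ** 2
                  for yi, hi in zip(y, h))
    return (log_lik
            + log_lognormal(x, hyper["mu_x"], hyper["sigma_x"])
            + log_lognormal(sigma_y, hyper["mu_s"], hyper["sigma_s"]))


def metropolis(y, h, hyper, n_iter=6000, burn_in=2000,
               step_x=0.05, step_s=0.1, seed=0):
    """Random-walk Metropolis sampler for (x, sigma_y)."""
    rng = random.Random(seed)
    state = [1.0, 1.0]  # arbitrary starting values for (x, sigma_y)
    steps = [step_x, step_s]
    lp = log_posterior(state[0], state[1], y, h, hyper)
    samples, accepted = [], 0
    for it in range(n_iter):
        for k in range(2):  # propose a change to one parameter at a time
            proposal = list(state)
            proposal[k] += rng.gauss(0.0, steps[k])
            lp_new = log_posterior(proposal[0], proposal[1], y, h, hyper)
            delta = lp_new - lp
            # Accept with probability min(1, posterior ratio).
            if delta >= 0.0 or rng.random() < math.exp(delta):
                state, lp = proposal, lp_new
                accepted += 1
        if it >= burn_in:  # discard the burn-in period
            samples.append(tuple(state))
    return samples, accepted / (2 * n_iter)


def make_pseudo_data(x_true=2.0, sigma_true=0.5, n_obs=30, seed=1):
    """Pseudo-data y = h * x_true + Gaussian noise for a toy sensitivity h."""
    rng = random.Random(seed)
    h = [0.5 + 0.1 * i for i in range(n_obs)]
    y = [hi * x_true + rng.gauss(0.0, sigma_true) for hi in h]
    return y, h
```

Calling `metropolis(y, h, hyper)` with `y, h = make_pseudo_data()` and a dictionary of hyper-prior values returns the retained samples and the overall acceptance ratio. Because only differences of log-posterior values are used, the normalization factor of the posterior never has to be computed.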
The main computational differences between the hierarchical inversion and a non-hierarchical inversion performed using MCMC are as follows (in order of importance): (1) the hierarchical inversion solving for uncertainty parameters requires that the inverse and determinant of covariance matrices (e.g., R) be computed in every iteration, while a traditional inversion with a fixed uncertainty structure would require that this operation be performed only once; (2) the hierarchical inversion solves for a larger number of parameters, inducing a small additional computational cost. For our applications, the main computational burden was item (1). Based on traditional computational methods, as the size n of the observation and/or parameter space grows, the cost of computing the inverse and determinant will scale approximately as n³. Therefore, for very large problems, alternative computational approaches may be necessary. Several methods exist to dramatically reduce the computational cost of inverting large (covariance) matrices, if required, for higher-dimensional applications (see, for example, Sun et al., 2012). For an inverse problem similar to the one presented in this study, observational and/or parameter spaces of the order of a thousand elements can be readily solved with minimal code modification, on the order of minutes on a moderately powerful workstation. We anticipate that MCMC algorithms could be developed that could extend this to tens of thousands of elements, if required.
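To make item (1) concrete, the sketch below evaluates a Gaussian log-likelihood for a residual vector and a covariance matrix R. It uses a Cholesky factorization, which is an assumption for illustration rather than the method used in this study; the factorization supplies both the log-determinant and the quadratic form, and its cost grows with the cube of the matrix dimension. The function names are hypothetical.

```python
import math


def cholesky(A):
    """Lower-triangular L with A = L L^T, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def gaussian_loglik(resid, R):
    """Log-density of N(0, R) evaluated at the residual vector resid."""
    n = len(resid)
    L = cholesky(R)
    # Forward substitution: solve L z = resid, so that z.z = resid^T R^-1 resid.
    z = [0.0] * n
    for i in range(n):
        z[i] = (resid[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)
```

In a hierarchical inversion this function would be called with a new R at every iteration in which σ y or τ changes, whereas with a fixed R the factorization could be computed once and reused.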

Pseudo-data experiment
We are interested in investigating the effect of the hierarchical method on posterior emissions uncertainties and use a pseudo-data simulation to demonstrate the concept. In this simulation, 1000 ensemble members were randomly generated from a known emissions distribution, ρ(x|µ* x, σ* x), where µ* x and σ* x were known and fixed. Each emissions ensemble member was used to simulate mole fraction pseudo-data, which were then applied in both a hierarchical Bayesian inversion (HB) and a non-hierarchical Bayesian inversion (NHB) to infer emissions for cases in which the a priori emissions uncertainties were incorrectly specified. In this work we adopt the stance that uncertainty quantification is correct if, on average, a given realization, x*, sampled from ρ(x|µ* x, σ* x), is consistent with the marginal distribution ρ(x|y). If uncertainty is correctly captured, then 5 % of the time x* should lie below the 5th percentile of the posterior distribution, 10 % of the time below the 10th percentile and so on. We can thus use a quantile-quantile (Q-Q) plot to compare the theoretical quantiles with the empirical quantiles. If the empirical and theoretical quantiles lie on the 1 : 1 line, we can conclude that the inversion is correctly capturing system uncertainties. If the uncertainties used in the inversion are too tightly constrained, ensemble members will tend to consistently lie in the tails of the estimated posterior distribution so that the Q-Q plot resembles an inverted "S-curve" around the diagonal. If the uncertainties are too lax, then the posterior judgments are under-confident and the Q-Q plot follows an S-curve around the diagonal.
Gaussian distributions were chosen for this pseudo-data experiment for their simplicity and symmetry, but these results can be extrapolated to any distribution. Pseudo-data were generated from each realization, x*, using the NAME model for 1 month at Mace Head, Ireland, and include random Gaussian noise with standard deviation σ* y. For each inversion, with parameter and hyper-parameter PDFs shown through Eq. (16), we fixed the mean of the a priori emissions distribution to be µ* x and tested the effect of making incorrect assumptions in the inversion about the a priori emissions uncertainty. Two cases were investigated: one in which the a priori emissions uncertainty (σ x,prior) was one-half of σ* x and one in which it was twice σ* x. This is equivalent to an inversion where we incorrectly assumed the a priori emissions uncertainty to be smaller or larger than the "truth". In the NHB case, these values were fixed (i.e., there is no uncertainty in σ x,prior); in the HB case, some flexibility was allowed for the inversion to adjust these values. The uncertainty in the uncertainty, σ σ,x, was assumed to be 100 % of σ x,prior, and model-measurement uncertainty was specified exactly as σ* y. To generate the estimated quantiles, we tracked the quantile of x* in the posterior distribution of each inversion and determined how often each quantile was being sampled (a perfectly characterized system would result in uniform sampling of all quantiles, as explained above). Figure 1 shows Q-Q plots for the HB and NHB cases. When the assumed a priori emissions uncertainty was too constrained and not allowed to adjust in the inversion, the result was a greater sampling of the tails of the distribution. When the hyper-parameter was included, the estimated distribution shifted towards the 1 : 1 line, indicating a better representation of the true distribution.
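The calibration check described above can be tried on a scalar toy problem in which the non-hierarchical posterior is available in closed form. The sketch below is an illustration under strong assumptions (one unknown, direct observation of x with Gaussian noise, Gaussian prior with a fixed and possibly wrong standard deviation); it does not use NAME or the hierarchical sampler, and all names and default values are hypothetical.

```python
import random
from statistics import NormalDist


def posterior_quantiles(prior_sd_assumed, n_members=1000, mu=0.0,
                        sd_true=1.0, sd_obs=1.0, seed=0):
    """Quantile of each true value x* within its own Gaussian posterior."""
    rng = random.Random(seed)
    quantiles = []
    for _ in range(n_members):
        x_true = rng.gauss(mu, sd_true)          # draw x* from the known PDF
        y = x_true + rng.gauss(0.0, sd_obs)      # pseudo-observation of x*
        # Conjugate Gaussian update using the assumed prior uncertainty.
        precision = 1.0 / prior_sd_assumed ** 2 + 1.0 / sd_obs ** 2
        post_mean = (mu / prior_sd_assumed ** 2 + y / sd_obs ** 2) / precision
        post_sd = precision ** -0.5
        quantiles.append(NormalDist(post_mean, post_sd).cdf(x_true))
    return quantiles


def qq_points(quantiles, levels):
    """Empirical frequency with which x* falls below each theoretical level."""
    n = len(quantiles)
    return [sum(q <= level for q in quantiles) / n for level in levels]


def tail_fraction(quantiles, lower=0.1, upper=0.9):
    """Fraction of members in the outer tails; 0.2 if well calibrated."""
    return sum(q < lower or q > upper for q in quantiles) / len(quantiles)
```

With `prior_sd_assumed` equal to `sd_true`, the points returned by `qq_points` should fall close to the 1 : 1 line. Halving the assumed uncertainty pushes more members into the tails, and doubling it pushes fewer members there, which is the behavior the Q-Q plot is designed to reveal.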
A similar situation was observed when the assumed uncertainty was too large, reflecting a posterior distribution that sampled the middle of the distribution more frequently than expected. Inclusion of the hyper-parameter again resulted in a shift toward the 1 : 1 line and a better characterization of the true distribution. The results of this pseudo-data experiment illustrate the ways in which the hierarchical method can reduce the effect of errors in our assumptions about the uncertainties governing the system. Similar results were found in experiments testing the effect of incorrect assumptions about model-measurement uncertainty. An important feature of the HB framework is that the posterior emissions PDF is less sensitive to assumptions about the hyper-parameters governing the a priori emissions PDF than if direct assumptions were made about this PDF, as is the case in an NHB framework. This is because, in an HB framework, the parameters governing the a priori emissions PDF are themselves informed by the data.

3 Case study: regional SF6 emissions

Inversion setup
We use the above methodology to derive monthly regional SF6 emissions and boundary conditions for the regions around four AGAGE stations, monthly model-measurement uncertainties, monthly means and standard deviations of the a priori emissions PDFs and monthly autocorrelation timescales. Model-measurement uncertainties were calculated for two time periods each month: daytime (approx. 06:00-18:00 local time of each station) and nighttime (approx. 18:00-06:00), in order to investigate the common assumption that uncertainties at night are larger than those during the day. We have chosen a monthly estimation timescale, but in principle any resolution can be used.
High-frequency measurements used in this study are dry air SF6 mole fractions from the AGAGE stations at Mace Head, Ireland; Trinidad Head, USA; Cape Grim, Australia; and Gosan, South Korea, for the period of January-December 2012 (Prinn et al., 2000; Rigby et al., 2010). Gosan measurements from the summer period were excluded due to the complexities induced by frequently shifting sampling of both northern and southern hemispheric background air. All measurements were made on the Medusa GC-MS system and were calibrated on the Scripps Institution of Oceanography (SIO)-2005 scale. Measurements were 3-hourly averages, which is the temporal resolution of the model sensitivity output.
The UK Met Office's Lagrangian Particle Dispersion Model (LPDM), NAME, simulates atmospheric transport by following particles backwards in time from the measurement station. NAME has been used in previous studies for modeling trace gas transport at various sites (O'Doherty et al., 2004; Manning et al., 2011; Ganesan et al., 2013). NAME directly outputs the sensitivity of measurements to surface emissions by tracking the mass of particles and time spent in the lower 100 m of the model over the duration of the simulation. These model-derived sensitivities are referred to as "air histories". For each site modeled here, the Met Office's Unified Model (UM) was used at 0.352° × 0.234° horizontal resolution and at 3 h temporal resolution. Particles were followed backwards in time for 20 days. Computational domains were chosen to be large enough to model the transport of pollutants from important source regions to each site as well as to allow for the assumption that boundary conditions to the LPDM domain do not vary significantly over the 20-day period. All air histories were generated with the release of particles from a 100 m column over the model surface. Grid cells were aggregated into approximately 20 regions over each domain following the methodology of Rigby et al. (2011). Regions were aggregated from grid cells as a function of a priori emissions and average sensitivity so that grid cells with high a priori emissions and/or high sensitivity would be estimated at higher resolution than those with low a priori emissions and/or low sensitivity.
Various methods have been used to determine boundary conditions to LPDM domains (Stohl et al., 2009; Manning et al., 2011; Rigby et al., 2011). In this application, boundary conditions were assumed to be constant over each month and were solved in the inversion as the part of the simulated mole fractions that was not accounted for by the 20-day air histories (i.e., emissions from farther back in time).
We follow the hierarchical system outlined by Eqs. (7)-(13). In this setup, µ x,prior was chosen to be the natural logarithm of 2008 values from the EDGAR v4.2 (henceforth referred to as EDGAR) database extrapolated to 2012 based on the linear trend for the inversion domain from 2004 to 2008 (JRC/PBL, 2011). These trends were assumed to be 2.8 % per year growth in Europe, 0.6 % per year growth in North America, 3.4 % per year growth in Oceania and 11 % per year growth in East Asia. Prior a priori emissions uncertainty, σ x,prior, was chosen as the value that resulted in 68 % of the lognormal PDF contained between 50 and 150 % of µ x,prior (similar to regional uncertainties used in Rigby et al., 2010). Prior model-measurement uncertainty, σ y,prior, was chosen to be the natural logarithm of the sum of the instrument uncertainty and the uncertainty associated with propagating the calibration scale, each assumed to be 0.05 pmol mol⁻¹. We did not include a prior estimate of the model representation error in the hierarchical inversions and allowed the inversion the flexibility to deduce this uncertainty. In the comparable inversions performed without the hierarchical method, we investigated the effect of including or not including a model representation error equal to the standard deviation of daily measurements. The mean a priori autocorrelation timescale, τ prior, was assumed to be 7 days, which is an approximate timescale of synoptic-scale meteorological events. Prior baseline values were assumed to be the minimum measured value during the month with an uncertainty of 3 %.
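One way to turn the "68 % of the PDF between 50 and 150 %" criterion into a number is to solve for the log standard deviation numerically. The sketch below assumes that the 50 and 150 % bounds are multiples of the lognormal median, exp(µ); the text does not state this explicitly, so the interpretation and the function names are assumptions.

```python
import math
from statistics import NormalDist


def central_mass(sigma, lo=0.5, hi=1.5):
    """Probability that a lognormal variable lies between lo and hi times
    its median, for log standard deviation sigma."""
    std = NormalDist()
    return std.cdf(math.log(hi) / sigma) - std.cdf(math.log(lo) / sigma)


def solve_log_sd(target=0.68, lo=0.5, hi=1.5, tol=1e-10):
    """Bisection for the sigma at which central_mass equals the target.

    central_mass decreases monotonically as sigma grows, so the root
    is bracketed by a very small and a very large sigma.
    """
    a, b = 1e-3, 10.0
    while b - a > tol:
        mid = 0.5 * (a + b)
        if central_mass(mid, lo, hi) > target:
            a = mid  # PDF still too narrow, increase sigma
        else:
            b = mid  # PDF too wide, decrease sigma
    return 0.5 * (a + b)
```

Under this interpretation the result does not depend on the value of the median itself, so the same log standard deviation applies to every region.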
Uncertainties in these hyper-parameters, σ µ,x,prior , σ σ,x,prior and σ σ,y,prior , were calculated as the values that resulted in 68 % of the PDF contained between 50 and 150 % of µ x,prior , σ x,prior and σ y,prior , respectively.

Results and discussion
We present median 2012 SF6 emissions for the regions around four AGAGE stations, monthly boundary conditions, diurnal model-measurement uncertainty and autocorrelation timescales. Median posterior emissions are shown in Fig. 2 and are tabulated in Table 1. Deviations of these emissions from the prior are presented in Fig. 3. The domain for which emissions are presented is smaller than the inversion domain but represents the region to which the measurement station is most sensitive. Figure 4 presents national emissions derived from the HB inversion along with emissions derived from two NHB inversions that use the same PDFs and either include or exclude a model representation error. Uncertainties reflect the 16th to 84th percentiles of the posterior emissions PDFs to show consistency with previous studies citing 1σ uncertainties of Gaussian distributions. The results of the HB inversion show that (1) SF6 emissions from the UK, France and Germany have decreased from the scaled EDGAR emissions and are also smaller than 2008 EDGAR emissions; (2) East Asian SF6 emissions have decreased from scaled EDGAR values but have increased compared to 2008 EDGAR emissions; (3) emissions from the western coast of North America have largely decreased from 2008 EDGAR emissions; (4) Australian emissions have remained approximately the same as the scaled EDGAR emissions. In comparison to the 2007-2009 estimates made in Rigby et al. (2011), there are some statistically significant differences. In particular, the emissions from South Korea derived in this study are lower; however, emissions from Asia were shown through the sensitivity studies performed in Rigby et al. (2011) to be highly sensitive to inversion parameters, such as measurement averaging period.
To compare derived emissions and uncertainties between the HB and NHB methods, two NHB inversions were run for each site: one in which a model representation error was included and assumed to be the standard deviation of measurements each day and one in which no model representation error was used. Without including a model representation error, emissions become unrealistically large for East Asian countries due to the significantly elevated measurements that are not captured by the model and prior. Uncertainties on these emissions are likely too small, owing to the underestimated model-measurement uncertainty. In previous studies, methods such as statistical filtering have been used to remove measurements that cannot be resolved by the model prior to the inversion to prevent unrealistic emissions from being derived. When a model representation error was included in the NHB inversion, emissions substantially decreased in East Asia and became more consistent with those derived in previous East Asian studies, stressing the importance of properly accounting for the model-measurement uncertainty (Kim et al., 2010; Li et al., 2011; Rigby et al., 2011; Fang et al., 2013).
In the HB inversion, emissions uncertainties are generally larger than those derived in the NHB inversions. These results suggest that the HB method is able to account for a larger space of uncertainties in the inversion and derive emissions uncertainties that are likely more representative of the true uncertainties in the system.

[...] results obtained for the period 2006-2009, these studies produced statistically different emissions, likely indicating that the uncertainty analyses are not robust. These five studies used various methods by which they assessed uncertainties in emissions, making it difficult to compare uncertainties between each study. Vollmer et al. (2009) and Fang et al. (2013) derived emissions uncertainties by examining an ensemble of inversions with varying a priori emissions and a priori emissions uncertainties. These sensitivity tests were used to quantify the effect of incorrect assumptions about a priori emissions and a priori emissions uncertainties, as our study aims to do; however, each of these inversion ensemble members cannot be considered independent, and the resulting uncertainties may not be statistically robust or fully propagated through to emissions. Rigby et al. (2011) presented uncertainties derived solely from the Bayesian inversion, which is likely the cause of the much smaller uncertainties than in the other studies. The results of the HB inversion presented in this study show 2012 Chinese emissions to be 2.1 (1.7-2.8) Gg yr⁻¹ (derived value scaled by country fraction), which is statistically consistent with the recent results of Fang et al. (2013). While our derived uncertainties are similar to some of these studies, we propose that the uncertainty quantification outlined in this work is more statistically justifiable, complete and traceable than those presented elsewhere.
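A summary of this kind can be obtained directly from the retained MCMC samples of a parameter. The sketch below is illustrative only; the function name is hypothetical, and the "inclusive" interpolation rule is one of several reasonable percentile conventions.

```python
from statistics import median, quantiles


def summarize_posterior(samples):
    """Return the median and the 16th and 84th percentiles of the samples."""
    # quantiles(..., n=100) returns the 99 percentile cut points, so the
    # p-th percentile is stored at index p - 1.
    pct = quantiles(samples, n=100, method="inclusive")
    return median(samples), pct[15], pct[83]
```

For a skewed posterior such as a lognormal, the two percentiles are not symmetric about the median, which is why an interval is quoted instead of a single ± value.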
Derived model-measurement uncertainties for all stations are shown in Fig. 5 and simulated mole fractions and boundary conditions in Fig. 6. Uncertainties at Mace Head are generally lower than the a priori value used, suggesting that the model does well at representing this site. During July, a large pollution event is observed but is not captured by the model. The result is that measurements during this month have larger derived uncertainties and are therefore weighted less prominently in the inversion. Though a monthly timescale was used in this case study for deriving model-measurement uncertainties, solving for weekly or daily values would result in fewer observations being strongly de-weighted in the inversion. While some previous studies have filtered outlier measurements prior to the inversion through various methods, the HB method is less sensitive to these outliers. At Trinidad Head there is less consistency between the model and measurements than at Mace Head, leading to model-measurement uncertainties that are almost twice as large as those derived for Mace Head. These uncertainties are significantly higher in the latter part of the year (August-December), when more regional pollution is intercepted. At this time, the nighttime uncertainties are considerably larger than the daytime uncertainties, and one possible cause could be from errors in the model that have diurnal characteristics (e.g., sea breezes that are not captured). Uncertainties at Gosan are an order of magnitude larger than those derived at any other site. While the model captures the timing of many of the pollution events at this site, the sizes of the pollution events are considerably larger than predicted by the model and prior. This suggests that there could be emissions in close proximity to the station that are not captured by the model at the resolution used or that the model is under-representing the sensitivity to surface emissions. 
Uncertainties at Cape Grim are the smallest of all of the stations modeled here, owing to the measurement of mostly baseline air and very small pollution events. Notably, the derived model-measurement uncertainties at Cape Grim decreased in the second half of the year when improved instrumentation was installed, which resulted in better instrumental precision. For most of the sites, we have found that there was not a clear advantage to using daytime observations only. Derived autocorrelation timescales (not shown) were between 6 h and 1 day for each month and station and are similar to those presented in other studies (Berchet et al., 2013).

4 Conclusions
We present an application of a hierarchical Bayesian method for trace gas inversions. We show that the inclusion of "hyper-parameters" to represent the a priori emissions PDF, model-measurement uncertainty and measurement autocorrelation timescale results in a more complete quantification of emissions uncertainties than traditional inverse methods that rely heavily on expert judgment.
Using the hierarchical method, we have estimated emissions of SF6 for regions around four AGAGE stations and hyper-parameters for each site. The emissions uncertainties derived using the hierarchical method are generally larger than those derived in traditional inversions as they account for a broader space of uncertainties in the system, including random aggregation, representation and structural errors. We show that model error makes a significant contribution to model-measurement uncertainties at some sites, for example Gosan; without accounting for it, unrealistically large emissions would be derived with small uncertainties. The large discrepancy between the model and observations at this site results in the derivation of large model-measurement uncertainties and accordingly larger emissions uncertainties than those resulting from a standard Bayesian inversion. Similarly at Trinidad Head, derived uncertainties are larger than expected, owing to poor model fit, which results in larger emissions uncertainties than the comparable non-hierarchical inversion. In contrast, the generally good agreement between observations and model at Mace Head and Cape Grim results in a model-measurement uncertainty being derived that is smaller than the initial a priori value. Each of these findings is consistent with our expectations about the uncertainty characteristics of model performance at these sites but has been derived using minimal expert judgment.