Method for evaluating trends in greenhouse gases from ground-based remote FTIR measurements over Europe

This paper describes the statistical analysis of annual trends in long-term datasets of greenhouse gas measurements taken over ten or more years. The analysis technique employs a bootstrap resampling method to determine both the long-term and intra-annual variability of the datasets, together with the uncertainties on the trend values. The method has been applied to data from a European network of ground-based solar FTIR instruments to determine the trends in the tropospheric, stratospheric and total columns of ozone, nitrous oxide, carbon monoxide, methane, ethane and HCFC-22. The suitability of the method has been demonstrated through statistical validation of the technique, and comparison with ground-based in-situ measurements and 3-D atmospheric models.


Introduction
Global climate change is one of the most important environmental issues facing the world today. A key element of this issue is understanding the atmospheric behaviour of radiatively active gases (direct greenhouse gases), and also gases involved in the chemical production of greenhouse gases (indirect greenhouse gases). Long-term measurements of such gases provide the experimental data to study the evolution of these gases and the changing sources and sinks. These data are often expressed in terms of an annual trend in the amount of a particular gas. In order for these trend results to be used appropriately it is vital that the uncertainty associated with the trend value is properly quantified. An accurate determination of the trend value is challenging due to the influence of large seasonal variations and other effects reflected in the data (Oltmans et al., 1998).
This paper describes the development and implementation of a trend analysis method to determine the annual trend and associated uncertainties, based on a statistical model that makes minimal assumptions about uncertainty distributions associated with the raw data. The method has been applied to measurements of direct and indirect greenhouse gases measured by a network of six ground-based solar Fourier Transform Infrared (FTIR) sites across Europe. The outputs from the analysis are the annual trends in the total, tropospheric and stratospheric amount of each gas at each of the sites and their associated uncertainties.

EGU
Section 2, below, gives a short description of the measurement network and the derivation of tropospheric and stratospheric columns from the data. The trend analysis method is described in Sect. 3, while Sect. 4 covers the validation of the method. Section 5 gives the main results of the trend analysis, including comparison with in-situ trend measurements and atmospheric model results. The conclusions are given in Sect. 6.

The UFTIR remote sensing network
The work described in this paper was carried out as part of an EC Project on "Time Series of Upper Free Troposphere Observations from a European Ground-based FTIR Network" (UFTIR, http://www.nilu.no/uftir) (De Mazière et al., 2005). The UFTIR remote sensing network comprises six sites across Europe making solar absorption measurements using high-resolution FTIR spectrometers. Table 1 gives the location and altitude of these sites, which cover the latitude range from 28° N (Izana, Tenerife) to 79° N (Ny Ålesund, Spitzbergen). These sites have been making total column measurements of a range of atmospheric gases for many years, and the results from these measurements are held on the database of the Network for the

[...] cover the period from 1995 to 2005. A significant amount of effort was made during the course of the project to harmonise the data analysis procedures used by each group (De Mazière et al., 2005). The outputs of the FTIR measurements are time series of vertical profiles of partial columns for each species with a data point for each day on which a measurement was made. Where more than one measurement was made on a particular day, the daily mean profile has been taken.
The work described in this paper addresses the determination of the trends and the associated uncertainties for the UFTIR datasets, with a focus on calculating separate tropospheric and stratospheric trends for each species.

Tropospheric column determination
Since one of the primary objectives for the analysis was to determine separate tropospheric and stratospheric trends for the FTIR datasets, a key issue was how to quantify the tropospheric content of the atmospheric profile results. It was decided that the best option was to use tropopause altitude information from the NCEP meteorological database to determine appropriate tropopause heights and variabilities for each site. The average tropopause height varied from 10.14 km at Ny Ålesund to 14.85 km at Izana. The (1σ) variability of the tropopause was between 1.10 km (at the Jungfraujoch) and 1.55 km (at Izana).
The tropopause information was then used to produce a weighting function to apply to the partial column profile data. The tropospheric weighting function, W, is a sigmoid function of altitude of the form:

$$W(z) = \frac{1}{1 + \exp\left(a\,(z - z_T)\right)} \qquad (1)$$

where
z = mean layer altitude,
$z_T$ = mean tropopause altitude,
a = e/(standard deviation of tropopause altitudes).

The tropospheric partial column was then determined by integrating the weighted profile. The stratospheric partial column was then calculated as the difference between the total column and the tropospheric partial column. Separate trends were then calculated for the total, tropospheric and stratospheric columns.
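The column split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weighting is taken to be a logistic sigmoid that falls from 1 to 0 across the tropopause with steepness a = e/σ, the integration is approximated by a weighted sum over layer partial columns, and the function names and the example profile are hypothetical rather than taken from the UFTIR processing.

```python
import math

def tropospheric_weight(z_km, z_trop_km, sigma_km):
    """Sigmoid weight: near 1 well below the tropopause, near 0 well above it.

    Assumed logistic form with steepness a = e / sigma_km.
    """
    a = math.e / sigma_km
    return 1.0 / (1.0 + math.exp(a * (z_km - z_trop_km)))

def split_column(layer_alt_km, partial_columns, z_trop_km, sigma_km):
    """Return (total, tropospheric, stratospheric) columns for one profile."""
    total = sum(partial_columns)
    # Discrete stand-in for integrating the weighted profile.
    trop = sum(tropospheric_weight(z, z_trop_km, sigma_km) * c
               for z, c in zip(layer_alt_km, partial_columns))
    # The stratospheric part is the remainder of the total column.
    return total, trop, total - trop

# Hypothetical profile: layer mid-altitudes (km) and partial columns (arbitrary units).
alts = [1.0, 5.0, 9.0, 13.0, 17.0, 25.0]
cols = [4.0, 3.0, 2.0, 1.0, 0.5, 0.2]
total, trop, strat = split_column(alts, cols, z_trop_km=11.0, sigma_km=1.1)
```

Defining the stratospheric column as a difference, as in the text, guarantees that the two partial columns always add up to the total.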

Trend analysis requirements
The objective of the trend analysis method is to assess whether there are statistically significant long-term trends in the various datasets. The most straightforward approach to determining a trend from data is to fit a straight line to the data, using a least squares criterion for example. The gradient of the fitted line can then be used to indicate the long-term trend. In order to associate an uncertainty or confidence limits with the gradient it is necessary to estimate the contribution of random effects in the data to the likely variation in the slope estimate that would be observed if the data were gathered a number of times under identical conditions. If the random effects can be assumed to be independently and identically normally distributed, then it is straightforward to show that the gradient parameter is also associated with a normal (Gaussian) distribution, allowing confidence limits to be calculated easily. However, the FTIR measurements show significant intra-annual effects, so that the likely departure of the data points from a straight line fit has a significant time-dependent correlation, and hence is not independent. Secondly, even for measurements at the same site at the same period, the observed distribution of measurements can have significantly non-normal features. In order to determine valid estimates of the trends, it is necessary to take into account both the intra-annual variability and the potential non-normality of the distributions associated with the measurement data.
The approach described in this paper augments the basic linear trend model with an intra-annual function in order to represent the intra-annual effects, and uses least-squares regression in conjunction with a bootstrap resampling method in order to determine confidence limits associated with the trend estimates. The advantage of the approach is that it uses well-known least squares techniques without requiring an assumption of normality, while at the same time accounting for the significant intra-annual effects present in the data.

Intra-annual model
Since the intra-annual (seasonal) variability is of a periodic nature, it is appropriate to model these effects in terms of a Fourier series, B:

$$B(t, \mathbf{b}) = b_0 + b_1 \cos 2\pi t + b_2 \sin 2\pi t + b_3 \cos 4\pi t + b_4 \sin 4\pi t + \dots \qquad (2)$$

where t is measured in years, and $b_0$ to $b_n$ are the parameters of the Fourier series contained in the vector b. The total variation in measurements due to the trend and the intra-annual effects is then modelled by a function, F:

$$F(t, a, \mathbf{b}) = a\,t + B(t, \mathbf{b}) \qquad (3)$$

where a is the annual trend in the data. This model captures the underlying periodicity of the data and reduces the impact of sparse data. It also enables regular gaps in the data series, such as those during the winter months in high latitude sites, to be accommodated without causing discontinuities in the intra-annual function. See Sect. 6.2 for examples of the fitted intra-annual models.
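As an illustration of the model in this section, the following Python sketch evaluates a linear trend plus a Fourier series and builds one row of the corresponding least-squares design matrix. The ordering of the coefficients (constant, then cosine and sine for each harmonic) and the function names are assumptions made for this example.

```python
import math

def fourier_terms(t_years, order=3):
    """Basis values [1, cos(2*pi*t), sin(2*pi*t), cos(4*pi*t), sin(4*pi*t), ...]."""
    row = [1.0]
    for k in range(1, order + 1):
        row.append(math.cos(2.0 * math.pi * k * t_years))
        row.append(math.sin(2.0 * math.pi * k * t_years))
    return row

def model_value(t_years, a, b):
    """F(t, a, b) = a*t + B(t, b), where b holds 2*order + 1 Fourier coefficients."""
    order = (len(b) - 1) // 2
    seasonal = sum(bj * fj for bj, fj in zip(b, fourier_terms(t_years, order)))
    return a * t_years + seasonal

def design_row(t_years, order=3):
    """One design-matrix row: the trend column followed by the Fourier basis."""
    return [t_years] + fourier_terms(t_years, order)
```

Because F is linear in a and b, stacking `design_row` for every measurement time gives a matrix that any standard linear least-squares solver can use; gaps in the time series simply mean fewer rows.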

Bootstrap resampling
The technique of bootstrap resampling enables non-normally distributed data to be treated robustly (Gatz and Smith, 1995). It is based on the idea that the distribution associated with the random effects reflected in the data is best represented by the residual deviations of a model fit to the data. [...] demonstrated (Cox et al., 2002). In this technique, the model function F(t, a, b) is fitted to the data $(t_i, M_i)$ to determine estimates $a_0$ and $\mathbf{b}_0$ that minimise

$$\sum_i \left(M_i - F(t_i, a, \mathbf{b})\right)^2 \qquad (4)$$

with respect to a and b. Since the function F is a linear function of the parameters a and b, these estimates can be found using standard linear least squares methods (Lawson and Hansen, 1974).
Once the initial fit has been determined, the residual deviations

$$R_{i,0} = M_i - F(t_i, a_0, \mathbf{b}_0) \qquad (5)$$

are then regarded as a discrete representation of the distribution associated with the random effects reflected in the data. Given $R_{i,q}$ sampled at random from the set $\{R_{i,0}\}$, a new data set $(t_i, M_{i,q})$ is generated with:

$$M_{i,q} = F(t_i, a_0, \mathbf{b}_0) + R_{i,q} \qquad (6)$$

The model is refitted to this data set to give parameter estimates $a_q$ and $\mathbf{b}_q$. This procedure is repeated a large number of times, q = 1, ..., N, to generate the 1 by N matrix A containing the set of trend results $a_q$ and the n by N matrix B with the set of intra-annual variability parameters $\mathbf{b}_q$. Each row of these matrices contains a sample from the distribution for the corresponding parameter, and therefore provides a (discrete) approximation to this distribution. Since the elements of A form a sample from the distribution for the trend parameter, the 2.5 and 97.5 percentiles of this empirical distribution specify a 95% confidence interval associated with the value of the trend. Using standard matrix factorisation techniques, it is possible to organise the computation so that determination of the parameter fits can be done efficiently.
This method allows the uncertainty associated with any of the model parameters to be evaluated without making any assumptions about the statistical distribution of the residuals. It can therefore be applied generally to results for different species and sites. It can also be extended to combinations of parameters, and an example of this is described in Sect. 5.3, where methods for aggregating the results obtained from many sites are discussed.
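The residual bootstrap described in this section could be sketched with NumPy as follows. This is a minimal illustration rather than the code used for the UFTIR analysis: it assumes that residuals are drawn with replacement, it uses a pseudo-inverse computed once as a simple stand-in for the matrix factorisation mentioned above, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def design_matrix(t, order=3):
    """Columns: t (trend), constant, then a cos/sin pair for each harmonic."""
    t = np.asarray(t, dtype=float)
    cols = [t, np.ones_like(t)]
    for k in range(1, order + 1):
        cols.append(np.cos(2.0 * np.pi * k * t))
        cols.append(np.sin(2.0 * np.pi * k * t))
    return np.column_stack(cols)

def bootstrap_trend(t, m, order=3, n_boot=5000, seed=0):
    """Return (trend estimate, 95% interval, array of bootstrap trends)."""
    m = np.asarray(m, dtype=float)
    rng = np.random.default_rng(seed)
    X = design_matrix(t, order)
    pinv = np.linalg.pinv(X)        # computed once, reused for every refit
    p0 = pinv @ m                   # initial least-squares estimates
    fitted = X @ p0
    resid = m - fitted              # residuals of the initial fit
    # Assumption: residuals are resampled with replacement.
    idx = rng.integers(0, m.size, size=(n_boot, m.size))
    m_boot = fitted + resid[idx]    # one synthetic data set per row
    params = m_boot @ pinv.T        # refit all synthetic data sets at once
    trends = params[:, 0]
    lo, hi = np.percentile(trends, [2.5, 97.5])
    return float(p0[0]), (float(lo), float(hi)), trends

# Synthetic example (not UFTIR data): a trend of 0.5 per year, a seasonal
# cycle, and skewed non-Gaussian noise with zero mean.
demo_rng = np.random.default_rng(1)
t_demo = np.sort(demo_rng.uniform(0.0, 10.0, 300))
m_demo = (100.0 + 0.5 * t_demo + 3.0 * np.cos(2.0 * np.pi * t_demo)
          + demo_rng.exponential(1.0, 300) - 1.0)
a0, (ci_lo, ci_hi), boot = bootstrap_trend(t_demo, m_demo)
```

Measuring t relative to a reference year, as in the synthetic example, keeps the trend and constant columns well conditioned. The other columns of `params` hold the bootstrap samples of the intra-annual parameters.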

Validation of analysis method
As in many areas of experimental data analysis, tests must be carried out in order to demonstrate that the results obtained are valid and that the model chosen provides a satisfactory explanation of the data. In this application, it is necessary to choose an appropriate number of terms in the intra-annual model (i.e., the order of the Fourier series). The method used to evaluate the confidence intervals associated with the estimates relies on bootstrap resampling from the distribution of residual errors. This approach requires some further demonstration that there is no significant bias introduced and that the confidence intervals are reliable.
In addition to the statistical validation of the model, the results of the trend analysis of the data from the UFTIR sites can be compared to the results of ground-based in-situ monitoring networks, and the output from atmospheric models, to give external validation of the results.

Number of factors in the intra-annual model
The choice of the number of terms in the Fourier series has to balance the need to determine a faithful representation of the underlying periodic behaviour with that of avoiding over-fitting the data. An investigation was carried out to assess the appropriate order for the Fourier series by looking at the root-mean-square (RMS) residuals for different orders for each of the UFTIR species. The RMS residual was calculated for each order as the order was increased. The point at which the RMS value shows no significant reduction usually represents a good balance between faithfulness and economy of representation. This study showed that a 3rd order series with a total of 7 coefficients (a constant and 3 sine and 3 cosine components) provided the best overall representation of the typical intra-annual variability without over-fitting the data.
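The order-selection check can be illustrated with a short NumPy sketch that fits the trend-plus-Fourier model for increasing order and records the RMS residual. The function names and the synthetic data are hypothetical, and deciding where the RMS "shows no significant reduction" is left to inspection, as in the text.

```python
import numpy as np

def rms_residual(t, m, order):
    """RMS residual of a least-squares fit of trend plus Fourier series."""
    t = np.asarray(t, dtype=float)
    m = np.asarray(m, dtype=float)
    cols = [t, np.ones_like(t)]
    for k in range(1, order + 1):
        cols.append(np.cos(2.0 * np.pi * k * t))
        cols.append(np.sin(2.0 * np.pi * k * t))
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, m, rcond=None)
    return float(np.sqrt(np.mean((m - X @ coef) ** 2)))

def rms_by_order(t, m, max_order=6):
    """Map each Fourier order (0 = straight line only) to its RMS residual."""
    return {order: rms_residual(t, m, order) for order in range(max_order + 1)}

# Synthetic example (not UFTIR data) containing two harmonics plus noise.
rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 10.0, 400))
m = (10.0 + 0.2 * t + 3.0 * np.cos(2.0 * np.pi * t)
     + 1.5 * np.sin(4.0 * np.pi * t) + rng.normal(0.0, 0.3, 400))
rms = rms_by_order(t, m)
```

For this synthetic series the RMS drops sharply up to order 2 and then flattens, which is the kind of plateau the selection criterion looks for.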

Bias
It is possible for a bootstrap resampling process to introduce biased confidence intervals. Efron and Tibshirani (1993) describe how any bias in the analysis may be quantified and, if it is large, how bias-corrected intervals may be estimated. As shown by Efron and Tibshirani (1993), the bias correction, $z_0$, for any statistic is given by $z_0 = C^{-1}(r)$, where $C^{-1}$ indicates the inverse function of the standard normal cumulative distribution function and r is the proportion of bootstrap resample values less than the original estimate. A value of r = 1/2, giving $z_0 = 0$, implies that there is no bias.
When the bias check is applied to the bootstrap resample values of the trend estimated for each of the species and each of the sites, the values of r are all close to one half, ranging from 0.43 to 0.54 and scattered (apparently) randomly about one half, with a mean value of 0.49. It can therefore be concluded that the use of the bootstrap resampling method is not introducing significant bias, and a bias correction is not necessary.
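The bias check can be written in a few lines using only the Python standard library. This is a sketch with a hypothetical function name; `statistics.NormalDist().inv_cdf` provides the inverse of the standard normal cumulative distribution function.

```python
from statistics import NormalDist

def bias_correction(boot_values, original_estimate):
    """Return (r, z0) for a set of bootstrap values and the original estimate.

    r is the proportion of bootstrap values below the original estimate and
    z0 is the inverse standard normal CDF evaluated at r. inv_cdf raises an
    error if r is exactly 0 or 1, which would itself signal a severe bias.
    """
    r = sum(v < original_estimate for v in boot_values) / len(boot_values)
    z0 = NormalDist().inv_cdf(r)
    return r, z0
```

As a point of reference, the extreme values of r quoted above, 0.43 and 0.54, correspond to z0 of roughly -0.18 and +0.10.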

Reliability of confidence limits
The bootstrap resampling method has been used to estimate the non-parametric 95% confidence intervals associated with the underlying trends. However, it is necessary to assess the level of confidence that can reasonably be placed on these intervals for the finite number N of resamplings (N=5000 in our case) used. Berthouex and Brown (1994, p. 68) state that the precision of the quantiles estimated in the manner used here decreases rapidly as the estimates move towards the extreme tails of the distribution. They provide formulae for quantifying this precision, as follows. Let p denote the fractional quantile of interest (0.025 and 0.975 in our case). Then the 95% confidence intervals associated with the pth quantile are (for a large sample, as in our case):

$$p \pm 1.96\sqrt{\frac{p\,(1-p)}{N}} \qquad (7)$$

The corresponding values of the statistic of interest (the trend in our case) are then obtained immediately from the corresponding values in the empirical error distribution. Table 2 shows an example of the results of the confidence limit precision. In this case they are calculated for the tropospheric partial column trends from the Jungfraujoch dataset as a percentage of the average value in 2000. The table gives the lower (2.5% quantile) and upper (97.5% quantile) confidence limits for each species, together with the associated precision range calculated using Eq. (7).
The results shown in Table 2 indicate that the overall confidence intervals are not affected significantly when this "additional" uncertainty is included. We can therefore take the original 95% confidence interval as a reasonable estimate of the uncertainty in the trend value.
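The precision calculation can be illustrated with the sketch below. It assumes the large-sample normal approximation p ± 1.96 sqrt(p(1 - p)/N) for the fractional quantile and a simple nearest-rank lookup in the sorted bootstrap values; the function names are hypothetical.

```python
import math

def quantile_precision(p, n_boot, z=1.96):
    """95% range of the fractional quantile p estimated from n_boot resamples."""
    half_width = z * math.sqrt(p * (1.0 - p) / n_boot)
    return p - half_width, p + half_width

def precision_range(sorted_values, p, z=1.96):
    """Values of the statistic at the lower and upper ends of that range."""
    n = len(sorted_values)
    lo_frac, hi_frac = quantile_precision(p, n, z)

    def lookup(frac):
        # Nearest-rank lookup, clamped to the valid index range.
        idx = int(round(frac * (n - 1)))
        return sorted_values[min(n - 1, max(0, idx))]

    return lookup(lo_frac), lookup(hi_frac)
```

For p = 0.025 and N = 5000 the fractional quantile is determined to within about ±0.0043, i.e. the lower confidence limit lies somewhere between roughly the 2.07% and 2.93% points of the empirical distribution.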

Time series and intra-annual variability
The output of the bootstrap resampling analysis produces an estimate of the average trend and intra-annual variability parameters for a given dataset. The first step in the analysis of the results was to see how well these parameters captured the variability and trends in the measurements. The five panels of Fig. 1 show a series of examples of the measured time series, the results of the model function determination, and the underlying trend. The top two panels are the time series for tropospheric ethane for Harestua and Ny Ålesund, showing that the Fourier series provides a good fit for species with a large variability even when there are regular gaps in the data, as for Ny Ålesund where measurements are not possible during the Arctic winter. The third and fourth panels show the total column ozone time series for Kiruna and Izana, showing that the general structure is captured well even if the intra-annual behaviour is very different from site to site. The final panel shows that the method is equally appropriate in the cases where there is little intra-annual variability, as in this example of tropospheric nitrous oxide measured at the Jungfraujoch. In summary, the results shown in Fig. 1 demonstrate the suitability of the 3rd order Fourier series, discussed in Sect. 4.1, in capturing the range of intra-annual variability in the various datasets.

Trend results from individual sites
The null hypothesis to be tested for each analysis is that "there is no underlying straight-line trend over the time span of the data", i.e., the gradient of the underlying long-term trend in the regression model is zero. The sampling distribution of the gradient of the underlying straight-line trend term is determined empirically using bootstrap resampling. If the 95% confidence interval associated with the gradient, computed from this empirical distribution, does not contain zero then, in a formal statistical sense, there is reason to doubt the null hypothesis.
Table 3 shows the annual trends in the total, tropospheric, and stratospheric columns for each species and site, together with the associated uncertainties based on the 95% confidence limits of the bootstrap resampling. Also shown are the site latitudes and altitudes. All trends are reported as a percentage of the average value in the year 2000 for that particular parameter. Annual trends shown in bold identify those cases where the confidence interval does not contain zero and the null hypothesis is rejected, i.e., those cases where there is a statistically significant (positive or negative) trend.
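The reporting convention and the significance test can be expressed compactly, as in the sketch below. It assumes that the reference value (the average for the year 2000) is positive and has been computed separately; the function names are hypothetical.

```python
def percent_per_year(slope, ci, reference_value):
    """Express a trend and its confidence interval as %/yr of a reference value."""
    scale = 100.0 / reference_value  # assumes reference_value > 0
    return slope * scale, (ci[0] * scale, ci[1] * scale)

def is_significant(ci):
    """True when the confidence interval excludes zero."""
    lo, hi = ci
    return lo > 0.0 or hi < 0.0
```

For example, a slope of 2.0 units per year with interval (1.0, 3.0) and a reference value of 400 units gives 0.5 %/yr with interval (0.25, 0.75) %/yr, which would be flagged as significant.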
For comparative purposes the 95% confidence intervals associated with the estimates in Table 3 were also calculated under the (unsupported) assumption that Gaussian statistics apply. In most cases these were comparable to the bootstrap resampled results. A valid bootstrap approach can be expected to provide reliable confidence intervals that may be smaller than or greater than those obtained under the assumption that Gaussian statistics apply. Whether the intervals are smaller or greater depends on the nature of the sampling distribution, which the bootstrap resampling method estimates in an unbiased way. We concluded that our approach is valid because of the careful validation carried out on the results.

Estimating combined trends from all sites
In this section we consider how the results obtained for each site can be aggregated to evaluate the long-term trend over the whole network. There are many ways of combining the results from all six sites to obtain representative values for the whole network. In this paper, the selected approach is to take the mean of the individual site values. Computing this statistic is straightforward, but standard approaches to evaluating the associated uncertainties can be misleading because only a small number of data points are available (six in this case). We show here how bootstrap resampling can be used to overcome this problem (Efron, 1982; Efron and Tibshirani, 1993). As a result of the calculations described earlier, for each of the p sites, a row vector of length N that describes the sampling distribution of the gradient in gas amount was calculated. These vectors can be arranged into a single p by N matrix, G, that contains all N (=5000) bootstrap estimates from all six sites.
This array of bootstrap estimates can form the basis for estimating the confidence interval associated with any estimator formed by taking a function of the trends, in this case the arithmetic mean. For a specific estimator, we form its value for each column of the matrix G (i.e., for each bootstrap sample). The result is a 1 by N matrix whose elements estimate the sampling distribution for that estimator. The 2.5 and 97.5 percentiles of this empirical distribution specify a 95% confidence interval associated with the value of the estimator. Figure 2 shows the combined trend values for the total, tropospheric and stratospheric columns for each of the UFTIR species. The error bars on each trend show the associated 95% confidence intervals obtained from the estimated sampling distribution.
The results shown in Fig. 2 can be taken as estimates of the trends for each species over a large spatial scale. The results show broadly similar trends in troposphere and stratosphere for most species, except for ozone, and to a lesser extent carbon monoxide, which show different tropospheric and stratospheric behaviour. These results are in good agreement with the expected behaviour as determined by models of the atmosphere (see Sect. 5.4). The results of the combined trend behaviour are also in good agreement with the trends determined by long-term ground level monitoring. For example:
- the measured value for the tropospheric N2O trend of 0.245 (±0.044) %/yr compares to a global mean rate of 0.25 %/yr determined by the AGAGE and NOAA/CMDL network results (WMO, 2003);
- the measured tropospheric HCFC-22 trend of 3.18 (±0.24) %/yr can be compared to the GC/MS measurements from the AGAGE site at Mace Head in Ireland (53° N), which give a growth rate of 3.02 %/yr for the period 1999-2003.
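The aggregation over sites described in this section can be sketched with the Python standard library as follows. The function name is hypothetical, and the percentile interpolation used by `statistics.quantiles` is one of several reasonable conventions, not necessarily the one used for the published results.

```python
import statistics

def combined_trend(site_estimates, G):
    """Network mean trend and its 95% interval from per-site bootstrap samples.

    site_estimates: the p original trend estimates, one per site.
    G: p rows (one per site), each holding N bootstrap trend values.
    """
    n_boot = len(G[0])
    # Apply the estimator (here the arithmetic mean) to each column of G.
    column_means = [statistics.fmean(site[q] for site in G)
                    for q in range(n_boot)]
    # With n=40 the first and last cut points are the 2.5 and 97.5 percentiles.
    cuts = statistics.quantiles(column_means, n=40, method="inclusive")
    return statistics.fmean(site_estimates), (cuts[0], cuts[-1])
```

Replacing `statistics.fmean` in the column step with another function of the site trends, such as a median, gives the interval for that estimator without any other change.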

Comparison with the CTM Model
An alternative approach to looking at the combination of results from the different sites is to compare the measured trends to those predicted by 3-D atmospheric models. The advantage of this method is that it enables systematic differences in the behaviour at different sites (and latitudes) to be taken into account, and it also provides a useful validation tool for the long-term behaviour of the models themselves.
The model used within the UFTIR project was the Oslo CTM2 model developed within the Department of Geosciences at the University of Oslo (Isaksen et al., 2005; Gauss et al., 2006). This model is a global 3-D chemical transport model (CTM) driven by real meteorological data from the European Centre for Medium Range Weather Forecasting (ECMWF). The model was run with a 2.8° × 2.8° horizontal resolution and 40 vertical layers from the surface up to 10 hPa. The chemical scheme includes comprehensive stratospheric and tropospheric chemistry. The spatial and temporal variation in emissions has been included based on the EDGAR3.2 inventory.
The model was used to predict the vertical profiles for each species above each site over the trend analysis period. The tropospheric/stratospheric partial column functions (see Sect. 2.1) were then applied to the profiles, and the bootstrap resampling algorithm was applied as for the measured columns.
The detailed results of these analyses, and their scientific implications, will be discussed in other papers; however, a few examples are given here to demonstrate the comparability between the measured and modelled trends. Figures 3a and 3b show the tropospheric and stratospheric ozone trends, with generally good agreement between the measured and modelled results (given the trend uncertainties), including the differences from site to site and between troposphere and stratosphere. Figures 4a and 4b show the tropospheric trends for carbon monoxide and ethane, again with good agreement between the data sets, including the systematic difference between the Izana results and the other sites.

Discussion and conclusion
The ability to reliably determine trends in atmospheric datasets is an important element in the study of the long-term behaviour of the atmosphere and climate system. Conventional methods for estimating the uncertainties in the trends may give misleading results as they make unjustified assumptions about the statistical distribution of the data. We have established that the method described in this paper (bootstrap resampling with a low order Fourier series to capture the intra-annual variability) is a statistically robust method for determining the trends and uncertainties. A series of statistical and experimental validation tests has been carried out to demonstrate the suitability of the method. We have also shown how the method can be applied to the aggregated data from different sites to give an assessment of the long-term trends across a network of measurement sites.

The trend analysis method has been applied to long-term datasets of direct and indirect greenhouse gases measured by the UFTIR network of six ground-based solar FTIR sites. The output from these analyses gives the trends and associated uncertainties for the total, tropospheric and stratospheric columns of ozone, nitrous oxide, carbon monoxide, methane, ethane and HCFC-22. These results provide a valuable data resource for the study and modelling of the changing sources, sinks and dynamics for each species. Further papers will address the scientific interpretation of the results for the various species.

Fig. 2. Total column, tropospheric and stratospheric trends for each species combined over all sites.
The third and fourth panels show the total column ozone time series for Kiruna and Izana.

Table 3. Annual trend results for UFTIR measurements.

Fig. 1. Examples of measured (blue crosses) and fitted (red triangles) time series of vertical column amount (in molecules m$^{-2}$) including the underlying trend (red line).