The Impact of Orbital Sampling, Monthly Averaging and Vertical Resolution on Climate Chemistry Model Evaluation with Satellite Observations

Ensemble climate model simulations used for the Intergovernmental Panel on Climate Change (IPCC) assessments have become important tools for exploring the response of the Earth System to changes in anthropogenic and natural forcings. The systematic evaluation of these models through global satellite observations is a critical step in assessing the uncertainty of climate change projections. This paper presents the technical steps required for using nadir sun-synchronous infrared satellite observations for multi-model evaluation and the uncertainties associated with each step. This is motivated by the need to use satellite observations to evaluate climate models. We quantified the effect of satellite orbit and spatial coverage, the effect of variations in vertical sensitivity as quantified by the observation operator, and the impact of averaging the operators for use with monthly-mean model output. We calculated these biases in ozone, carbon monoxide, atmospheric temperature and water vapour by using the output from two global chemistry climate models (ECHAM5-MOZ and GISS-PUCCINI) and the observations from the Tropospheric Emission Spectrometer (TES) instrument on board the NASA-Aura satellite. The results show that sampling and monthly averaging of the observation operators produce zonal-mean biases of less than ±3 % for ozone and carbon monoxide throughout the entire troposphere in both models. Water vapour sampling zonal-mean biases were also within the insignificant range of ±3 % (that is, ±0.14 g kg−1) in both models. Sampling led to a temperature zonal-mean bias of ±0.3 K over the tropics and mid-latitudes in both models, and up to −1.4 K over the boundary layer in the higher latitudes. Using the monthly average of the temperature and water vapour operators leads to large biases over the boundary layer in the southern-hemispheric higher latitudes and in the upper troposphere, respectively.
Up to 8 % bias was calculated in the upper troposphere water vapour due to monthly-mean operators, which may impact the detection of water vapour feedback in response to global warming. Our results reveal the importance of using the averaging kernel and the a priori profiles to account for the limited vertical resolution and clouds of a nadir observation during model application. Neglecting the observation operators resulted in large biases, which are more than 60 % for ozone, ±30 % for carbon monoxide, and range between −1.5 K and 5 K for atmospheric temperature, and between −60 % and 100 % for water vapour.


Introduction
The ensemble climate model simulations have become important tools for exploring the response of the earth system to changes in anthropogenic and natural forcings. In the last three decades, a large volume of global satellite observations of atmospheric species has become available (Fishman et al., 2008, and references therein). These observational data are useful for the evaluation of numerical models (e.g. see Soden and Bretherton, 1994; Allen et al., 2004; Chin et al., 2004; Aghedo et al., 2011; Bodas-Salcedo et al., 2011), contribute to the understanding of processes controlling the distribution of trace species (e.g. Klein and Jakob et al., 1999; Voulgarakis et al., 2011; Bodas-Salcedo et al., 2011), and help to constrain radiative forcing calculations through the use of, for example,
Published by Copernicus Publications on behalf of the European Geosciences Union.
However, the observational data that can be used for model evaluation, for example in the framework of international projects such as the Coupled Model Intercomparison Project (CMIP) and the Atmospheric Chemistry and Climate Model Intercomparison Project (ACC-MIP, Shindell et al., 2009), especially towards the fifth assessment report of the Intergovernmental Panel on Climate Change (IPCC AR5), will need to be provided in a format that is quantitatively comparable with model output in terms of horizontal, vertical and temporal resolution, and data frequency (for example, see Bodas-Salcedo et al., 2011). Presenting satellite observations in a way that is comparable to numerical model simulations requires several technical steps. The steps include the assessment of:
- the adequacy of the limited spatial and temporal resolution of observations taken by nadir sounders to represent the magnitude and variability of species (Luo et al., 2002),
- the impact of averaging observations to the model horizontal and temporal resolution, and
- the influence of the observation operator, which accounts for the limited vertical resolution of nadir satellite observations.
Each of these steps presents a different challenge and introduces uncertainties that need to be quantified (e.g. Sayer et al., 2010). For example, Luo et al. (2002) used GEOS-Chem model output to show that the interpolated daily global maps generated from the TES synthetic data (sampled from the original model time series) were comparable to the original daily-mean model output, within a spatial error that is less than 10 % in more than 70 % of the cases for ozone, and less than 20 % in 80 %-90 % of the cases for CO. In particular, the possibility of averaging the observational data on a monthly-mean time scale will facilitate the comparison of model monthly means to observation monthly means, and reduce the effort required in data exchange and the cost of storing model time-series. This is of interest for the CMIP5 and ACC-MIP activities, where modelling groups prefer to provide monthly-mean model output to specified data archives. This paper quantifies the uncertainties listed above for ozone, carbon monoxide (CO), atmospheric temperature and water vapour by using two global chemistry climate models (ECHAM5-MOZ and GISS-PUCCINI) and data from the NASA-Aura Tropospheric Emission Spectrometer (TES). The optimal estimation approach used in the operational TES retrieval algorithm provides a step-by-step methodology for data validation. This methodology has been demonstrated in several publications on the evaluation of TES ozone (Worden et al., 2007; Richards et al., 2008; Osterman et al., 2008; Nassar et al., 2008), carbon monoxide (Luo et al., 2007; Ho et al., 2009), atmospheric temperature (Shephard et al., 2008a) and water vapour (Shephard et al., 2008b). This approach has also been applied to model evaluation and assimilation with TES data (e.g. Jones et al., 2003; Parrington et al., 2008; Jones et al., 2009; Worden et al., 2009; Nassar et al., 2011). We present the data in Sect. 2 and the technique for the application of nadir satellite data to model evaluation in Sect. 3. Sections 4 to 6 discuss the influence of orbital sampling, monthly averaging, and the application of satellite operators, respectively. The conclusions are provided in Sect. 7.

The ECHAM5-MOZ model output
ECHAM5-MOZ (Aghedo, 2007; Aghedo et al., 2007) is a tropospheric chemistry climate model containing the tropospheric chemistry of MOZART 2.4 (Horowitz et al., 2003), which is fully embedded in the general circulation model ECHAM5 (Roeckner et al., 2003). The setup used in this paper has a horizontal resolution of 2.8° latitude by 2.8° longitude (T42), and 31 hybrid sigma-pressure vertical levels from the surface to 10 hPa. The model temperature, vorticity, divergence, and surface pressure were constrained towards the operational forecast data of the European Centre for Medium-Range Weather Forecasts (ECMWF) through the nudging technique (Jeuken et al., 1996). We use the model output from January 2005 to December 2008, after an eighteen-month spin-up.
The tropospheric chemistry of MOZART 2.4 includes reactions involving NOx-HOx-Ox-CO-CH4 and other hydrocarbons, including oxygenated hydrocarbons. The heterogeneous reaction of N2O5 on sulphate aerosols is also included. The model includes both dry and wet deposition, which are formulated according to Ganzeveld (2001) and Stier et al. (2005), respectively. The upper boundary concentrations for ozone, NOx, HNO3, and N2O5 are fixed at the model levels above 30 hPa, and are prescribed based on the climatological zonal- and monthly-mean values described in Horowitz et al. (2003). The concentrations above the model tropopause are relaxed towards these climatological values with a constant relaxation time of 10 days. The photolysis rates are derived from tabulated values from the Tropospheric Ultraviolet and Visible radiation model (Madronich and Flocke, 1999), with an update for O(1D) from the photolysis of ozone as described in Horowitz et al. (2003). The full chemical scheme in the ECHAM5-MOZ model contains 168 chemical reactions and 63 transported species. We use the anthropogenic and biomass burning emissions of year 2000, which were created during the REanalysis of the TROpospheric chemical composition over the past 40 yr (RETRO) project (Schultz et al., 2007). Lightning NOx and vegetation emissions are calculated interactively within the model based on the parameterisation of Grewe et al. (2001) and the Model of Emissions of Gases and Aerosols from Nature (MEGAN, Guenther et al., 2006), respectively. This lightning parameterisation produces about 6.7 Tg(N) yr−1 of NOx emissions.

The GISS-PUCCINI model output
The GISS-PUCCINI model consists of the model for Physical Understanding of Composition-Climate INteractions and Impacts (PUCCINI) (Shindell et al., 2006b), which is fully embedded in the GISS climate model (Schmidt et al., 2006). The model contains both tropospheric and stratospheric chemistry. The model was run at 2° latitude by 2.5° longitude Cartesian horizontal resolution, with increased effective resolution for tracers obtained by carrying higher-order moments at each grid box. This configuration has 40 vertical hybrid sigma layers from the surface to 0.1 hPa (∼80 km). Simulations were performed using observed sea-surface temperatures (Rayner et al., 2003) and linear relaxation of winds toward the NCEP/NCAR reanalysis (Kalnay et al., 1996). We use the GISS-PUCCINI model output from January 2005 to December 2008.
Tropospheric chemistry includes basic NOx-HOx-Ox-CO-CH4 chemistry as well as PAN, isoprene, alkyl nitrates, aldehydes, alkenes, paraffins, and other hydrocarbons. The lumped hydrocarbon family scheme was derived from the Carbon Bond Mechanism-4 (Gery et al., 1989) and from the more extensive Regional Atmospheric Chemistry Model, following Houweling et al. (1998). The stratospheric chemistry includes chlorine- and bromine-containing compounds, and CFC and N2O source gases. The main changes relative to the previous versions are the addition of acetone to the hydrocarbons following Houweling et al. (1998), a dependence of polar stratospheric cloud formation on the abundance of nitric acid, water vapour and temperature (Hanson and Mauersberger, 1988), and the addition of a reaction pathway for HO2 + NO to yield HNO3 (Butkovskaya et al., 2007). Photolysis rates are calculated using the Fast-J2 scheme (Bian and Prather, 2002), whereas other chemical reaction rate coefficients are from Sander et al. (2000). Tracer transport uses a non-diffusive quadratic upstream scheme (Prather, 1986). The full scheme includes 156 chemical reactions among 50 species. Year 2000 emissions were used from the dataset assembled for the IPCC fifth assessment report simulations (Lamarque et al., 2010).

TES satellite data
The Tropospheric Emission Spectrometer (TES) is an infrared, high-resolution (0.1 cm−1) Fourier Transform spectrometer covering the spectral range from 650 to 3050 cm−1, with an average nadir footprint of about 5 km by 8 km (Beer et al., 2001). TES operates in a polar sun-synchronous orbit with a repeat cycle of 16 days. The spectral radiances measured by TES are used to retrieve the atmospheric profiles of trace species through a non-linear optimization algorithm that minimizes the difference between the observed radiances and those calculated with a radiative transfer model, subject to the condition that the solution is consistent with an a priori description of the atmosphere (Rodgers, 2000; Bowman et al., 2002, 2006). TES provides the vertical profiles of tropospheric ozone, carbon monoxide, water vapour and atmospheric temperature on a global scale. The analysis presented in this paper employs version 4 of the TES data from the standard global survey mode, collected from January 2005 to December 2008. The global survey mode includes both daytime and nighttime measurements, which cross the equator at about 01:45 a.m. and 01:45 p.m. local time.
The retrieved profile $\hat{x}$ of an atmospheric trace species is an estimate of the true atmospheric profile $x$, and it can be expressed as:
$$\hat{x} = x_a + A\,(x - x_a) + \epsilon \qquad (1)$$
where $x_a$ is the a priori profile, $A$ is the averaging kernel matrix, and $\epsilon$ is the observational error, whose covariance accounts for the random and systematic errors and the errors associated with the joint retrieval of dependent states (Worden et al., 2004).
The profiles $x_a$, $x$ and $\hat{x}$ are expressed as the natural logarithm of the volume mixing ratio for ozone, carbon monoxide and water vapour, whereas for atmospheric temperature they are expressed in Kelvin. TES profiles have 67 vertical levels from the surface to 0.01 hPa, with varying layer thickness in the boundary layer following the orography. These 67 vertical levels are a subset of the pressure levels of the TES radiative transfer forward model (Clough et al., 2002).

The technique for comparing nadir satellite data to model output
A number of steps are required to ensure a consistent comparison between a nadir infrared satellite observation (such as TES) and numerical models, due to the differences in horizontal, vertical and temporal resolution between the observations and the models. The steps include extracting co-located spatial and temporal points from the models by sampling, interpolating the extracted points along the vertical dimension to match the pressure levels of the observation, and adjusting the extracted and interpolated model points with the a priori profile and the averaging kernel (jointly referred to as the observation operator) to account for the limited vertical resolution of the observations and for clouds (Kulawik et al., 2006). These steps can be represented by a relation analogous to Eq. (1), given by:
$$\hat{x}^m_i = x^a_i + A_i\,(x^m_i - x^a_i) \qquad (2)$$
where the subscript $i$ denotes the time-varying horizontal location (i.e. latitude, longitude and time of sampling), and $x$ is the natural logarithm of the volume mixing ratio for atmospheric trace species such as ozone, carbon monoxide and water vapour. The sampled model output with the operators applied (for ozone, carbon monoxide and water vapour) is derived from Eq. (2), where:
$$\hat{y}^m_i = \exp\left(\hat{x}^m_i\right) \qquad (3)$$
For atmospheric temperature:
$$\hat{y}^m_i = \hat{x}^m_i \qquad (4)$$
Subsequently, we will use $y^r$ to denote the raw model output, $y^m$ to denote the sampled model output, and $\hat{y}^m$ to denote the sampled model output containing the application of TES operators according to Eqs. (2) and (3 or 4).
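The steps described above (vertical interpolation followed by the application of the a priori profile and averaging kernel) can be sketched for a single co-located profile as follows. This is a minimal illustration under stated assumptions, not the processing code used in this study: the function name `apply_tes_operator`, its arguments, and the choice of linear interpolation in log-pressure are all hypothetical choices made for the example.

```python
import numpy as np

def apply_tes_operator(y_model, p_model, p_tes, x_a, A, is_temperature=False):
    """Apply a nadir observation operator to one sampled model profile.

    y_model : model profile (volume mixing ratio, or K for temperature)
    p_model : model pressure levels in hPa (any order)
    p_tes   : retrieval pressure levels in hPa
    x_a     : a priori profile on p_tes (ln VMR, or K for temperature)
    A       : averaging kernel matrix, shape (len(p_tes), len(p_tes))
    """
    y_model = np.asarray(y_model, dtype=float)
    p_model = np.asarray(p_model, dtype=float)
    order = np.argsort(p_model)  # np.interp needs increasing abscissa
    # State vector: ln(VMR) for trace gases, Kelvin for temperature.
    x_model = y_model if is_temperature else np.log(y_model)
    # Assumed choice: linear interpolation in log-pressure. np.interp holds
    # the end values constant outside the model pressure range.
    x_m = np.interp(np.log(p_tes), np.log(p_model[order]), x_model[order])
    # Relation analogous to Eq. (1), without the error term.
    x_hat = np.asarray(x_a) + np.asarray(A) @ (x_m - np.asarray(x_a))
    # Back to physical units: exponentiate for trace gases only.
    return x_hat if is_temperature else np.exp(x_hat)

# Illustrative (made-up) ozone profile on three model levels.
p_model = np.array([1000.0, 500.0, 100.0])
ozone = np.array([40e-9, 60e-9, 200e-9])
p_tes = np.array([900.0, 300.0])
x_a = np.log(np.array([50e-9, 80e-9]))
A = np.array([[0.6, 0.1],
              [0.1, 0.7]])
print(apply_tes_operator(ozone, p_model, p_tes, x_a, A))
```

Two limiting cases are useful sanity checks: with an identity averaging kernel the function returns the interpolated model profile, and with a zero averaging kernel it returns the a priori profile.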
The execution of the steps leading to Eq. (2) requires model output time-series. However, most modelling groups submit only monthly-mean model output to the various archives set up for the IPCC assessment report. Archiving only monthly-mean output from models is necessary to reduce data volume and storage cost. In such a case, we need to quantify the limitations of such monthly-mean observational data, including the influence of the monthly averaging of the averaging kernel matrix. In particular, we quantify the uncertainty introduced through (1) orbital sampling, (2) monthly averaging, and (3) the application of satellite operators. Quantifying these uncertainties provides useful insight for the general application of space-based data to model evaluation.

The influence of orbital sampling
In this section, we investigate the bias introduced by sampling the data along the nadir sun-synchronous orbit (see the example of a TES orbit for a particular global survey in Fig. 1). For this purpose, we use the 3-hourly output from the ECHAM5-MOZ and GISS-PUCCINI models.

Without data screening
We use the TES maximum throughput spatio-temporal information to extract the co-located points from the 3-hourly model output. In using the TES maximum throughput, we assume that every measurement performed by TES has a good retrieval quality. This assumption ensures that the calculation of the sampling bias has a general application to nadir sun-synchronous satellite instruments and is not affected by the TES retrieval quality. We binned the extracted co-located points back to the original model grids for comparison. In a 30-day month, the TES maximum throughput contains about 51 537 individual nadir samples. Figure 2 shows the distribution of the samples binned to the ECHAM5-MOZ and GISS-PUCCINI grids of T42 and 2.5° × 2° (longitude × latitude), respectively, for a 30-day month. The visible bell-shaped pattern [...]

We denote the monthly mean of the original raw model time-series at a particular grid box $G$ as:
$$\bar{y}^r_G = \frac{1}{N}\sum_{i=1}^{N} y^r_i \qquad (5)$$
where $N$ is the total number of model points belonging to the grid box, which is equal to the number of model time steps in the month. Note that for the 3-hourly model output we used, $N$ is the same for all model grid boxes, and it is equal to 8 times the number of days in the month. In a like manner, we define the monthly mean of the sampled model output as:
$$\bar{y}^m_G = \frac{1}{n_G}\sum_{i=1}^{n_G} y^m_i \qquad (6)$$
where $n_G$ is the total number of sampled points belonging to the grid box $G$. For a 30-day month, $n_G$ is as shown in Fig. 2. Note that the sampled points are a subset of the $N$ model points, so that $n_G \le N$, as shown in the example presented in Fig. 3. We denote the absolute bias due to sampling by $S_G$, quantified as:
$$S_G = \bar{y}^m_G - \bar{y}^r_G \qquad (7)$$
and the percentage error due to sampling, $SP_G$, as:
$$SP_G = 100 \times \frac{\bar{y}^m_G - \bar{y}^r_G}{\bar{y}^r_G} \qquad (8)$$
Figure 3 shows the model 3-hourly (i.e. raw) time-series at the grid point $G$ corresponding to 20° E longitude and 18° S latitude in the month of July of 2005-2008 (see the grey lines). The figure also shows the points sampled from the time-series using the TES spatio-temporal information as black symbols for each year. We show the comparison of the sampled and the raw model time-series at the 550 hPa pressure level in the ECHAM5-MOZ model (left column) and at the 562 hPa pressure level in the GISS-PUCCINI model (right column) for ozone (first row), carbon monoxide (second row), atmospheric temperature (third row) and water vapour (fourth row). The grey and black symbols shown on the far right of each plot represent the monthly mean of the raw model time-series (grey) and of the sampled points (black), together with their corresponding standard error of the mean $\sigma_{mean}$, which is defined as:
$$\sigma^m_{mean} = \frac{\sigma^m}{\sqrt{n_G}} \qquad (9)$$
$$\sigma^r_{mean} = \frac{\sigma^r}{\sqrt{N}} \qquad (10)$$
where $\sigma$ is the standard deviation, and the superscripts $m$ and $r$ denote the sampled and the raw model time-series, respectively. That is, we calculated the standard error of the mean by dividing the standard deviation of the points by the square root of the total number of points.
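The sampling-bias definitions above can be illustrated for a single grid box with a short sketch. The time-series here is synthetic and the overpass indices are invented; the function name `sampling_bias` and the use of the sample standard deviation (`ddof=1`) are assumptions of this example rather than details taken from the study.

```python
import numpy as np

def sampling_bias(y_raw, sampled_idx):
    """Compare the mean of orbit-sampled points with the full time-series mean.

    y_raw       : raw model time-series at one grid box (length N)
    sampled_idx : indices of the time steps co-located with overpasses
    """
    y_raw = np.asarray(y_raw, dtype=float)
    y_m = y_raw[np.asarray(sampled_idx)]
    N, n_G = y_raw.size, y_m.size
    mean_r = y_raw.mean()            # monthly mean of raw series
    mean_m = y_m.mean()              # monthly mean of sampled series
    S = mean_m - mean_r              # absolute sampling bias
    SP = 100.0 * S / mean_r          # percentage sampling error
    # Standard error of the mean: standard deviation / sqrt(number of points).
    se_m = y_m.std(ddof=1) / np.sqrt(n_G)
    se_r = y_raw.std(ddof=1) / np.sqrt(N)
    return {"N": N, "n_G": n_G, "S": S, "SP": SP, "se_m": se_m, "se_r": se_r}

# Synthetic 3-hourly series for a 30-day month (N = 8 * 30 = 240 steps):
# a diurnal cycle plus noise around 50 ppbv. All values are made up.
rng = np.random.default_rng(0)
steps = np.arange(240)
series = 50.0 + 5.0 * np.sin(2 * np.pi * steps / 8) + rng.normal(0.0, 2.0, 240)
overpasses = np.arange(5, 240, 16)   # hypothetical overpass time steps
print(sampling_bias(series, overpasses))
```

Because the sampled mean rests on far fewer points than the raw mean, its standard error is expected to be correspondingly larger, which is the behaviour reported for the example grid point.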
Figure 3 shows that sampling along the TES nadir sun-synchronous orbit can adequately capture the magnitude of the concentration on a monthly-mean time-scale (as shown in the second and third columns of Table 1), despite the limited number of samples in a month (as shown in Fig. 3). Table 1 shows the range of differences between the mean of the sampled and the raw model time-series from the ECHAM5-MOZ (and the GISS-PUCCINI) model in the individual years in the second and third columns, respectively. The bias between the sampled and raw model time-series ranges from 0.1 to 6 ppbv (−0.7 to −3 ppbv) for ozone, −3.4 to 24 ppbv (−12 to 10 ppbv) for CO, 0.05 to 1.6 K (−0.5 to 0.7 K) for temperature, and −0.07 to 0.2 g kg−1 (−0.1 to 0.2 g kg−1) for water vapour in the two models, respectively. The 4-yr mean bias between the sampled and the raw model time-series is only 2.6 ppbv (−1.6 ppbv) for ozone, 4.8 ppbv (−4.3 ppbv) for CO, 0.7 K (0.2 K) for temperature, and 0.02 g kg−1 (−0.06 g kg−1) for water vapour in the ECHAM5-MOZ (and GISS-PUCCINI) model, indicating the suitability of TES in capturing decadal variability. The standard error of the 4-yr mean of the sampled series (Eq. 9) is an order of magnitude larger than that of the corresponding raw model output (Eq. 10) (see the fourth and fifth columns of Table 1). For example, the standard error of the mean in the sampled ozone from the ECHAM5-MOZ model is 2.23, while the standard error of the mean in the raw model series is only 0.26.
Figures 4a and 4b show the percentage and absolute error, respectively, due to sampling in the ECHAM5-MOZ (left column) and GISS-PUCCINI (right column) ozone (first row), carbon monoxide (second row), atmospheric temperature (third row) and water vapour (fourth row). The zonal-mean errors are calculated by summing up the absolute errors and dividing by the total number of points in the latitudinal zone at each level. The first row of Figs. 4a and 4b shows that the bias due to sampling is within ±1 % (±1 ppbv) for ozone over most of the troposphere in both models. Over limited regions within the boundary layer, and in the upper troposphere and lower stratosphere (UTLS), the sampling bias in ozone could be up to ±2 % (−8 ppbv to 6 ppbv) in the ECHAM5-MOZ model and ±2.5 % (maximum ±8 ppbv) in the GISS-PUCCINI model. The second row of Figs. 4a and 4b shows that the sampling bias for carbon monoxide is generally less than ±1.2 % (less than ±1.2 ppbv over the entire free troposphere, and varying between −2 ppbv and 5 ppbv within the boundary layer) in both models. These biases are smaller than the errors calculated by Luo et al. (2002) because of our longer time averages. These results show that the zonal-mean error due to sampling is negligible for ozone and carbon monoxide in both models, with the implication that observations by a nadir sun-synchronous satellite adequately capture the monthly-mean zonal-mean magnitude and the distribution of ozone and carbon monoxide.
The third row of Figs. 4a and 4b shows the influence of sampling on the atmospheric temperature in both models. We found a sampling bias of less than ±0.3 K throughout the tropics and mid-latitudes in the troposphere. The GISS-PUCCINI model, however, shows a sampling bias of up to −1.4 K over some parts of the southern-hemispheric higher latitudes. The fourth row of Figs. 4a and 4b shows the influence of sampling on water vapour in both models. As for temperature, the sampling biases are small: they lie within the range of ±3 % over the entire troposphere within the tropical and mid-latitudinal bands. The percentage biases are larger, beyond −5 %, over the southern-hemispheric higher latitudes of the GISS-PUCCINI model. This is probably due to the dry conditions over the southern-hemispheric higher latitudes, which may lead to a division by a very small number, as confirmed by the small absolute biases within the region (see the fourth row of Fig. 4b). The absolute biases of water vapour lie within the range of ±0.14 g kg−1 in both models, and are concentrated in the tropical and mid-latitude lower troposphere, where water vapour has the highest concentration.
Table 1. The range of bias $S_G$ between the mean of the sampled ($y^m$) and the raw ($y^r$) model time-series at the grid point $G$ corresponding to 20° E longitude and 18° S latitude for the individual years shown in Fig. 3 is given in the second and third columns for the ECHAM5-MOZ and GISS-PUCCINI models, respectively. The 4-yr mean bias between the two distributions is given in parentheses in the second and third columns. The fourth and fifth columns contain the standard error of the 4-yr mean of the sampled time-series $\sigma^m_{mean}$ (and of the raw model time-series $\sigma^r_{mean}$ in parentheses) as specified in Eq. (9) (and Eq. 10).
These results show that the influence of sampling is somewhat dependent on the model, but in the two global chemistry climate models we considered, sampling has no significant influence on the monthly-mean zonal-mean ozone, carbon monoxide and water vapour.However the biases in atmospheric temperature due to sampling may be important when using nadir sun-synchronous orbits to create a decadal representation of the atmospheric temperature over the higher latitudes.
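The zonal averaging of the error fields, and the sensitivity of percentage errors to very dry conditions, can be sketched as follows. The helper `zonal_mean`, the convention that grid boxes without samples are stored as NaN, and the numbers in the example are all assumptions made for illustration.

```python
import numpy as np

def zonal_mean(err):
    """Average an error field of shape (level, lat, lon) over longitude.

    Grid boxes that received no samples are expected to be NaN and are
    skipped: the sum of the valid errors is divided by the number of valid
    points in each latitude band at each level.
    """
    err = np.asarray(err, dtype=float)
    count = np.sum(~np.isnan(err), axis=-1)
    total = np.nansum(err, axis=-1)
    # Bands with no valid points stay NaN rather than becoming zero.
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)

# One level, two latitude bands, three longitudes (made-up values).
field = np.array([[[1.0, np.nan, 3.0],
                   [np.nan, np.nan, np.nan]]])
print(zonal_mean(field))

# The same absolute water vapour bias (g/kg) in a humid and a very dry box:
abs_bias = np.array([0.01, 0.01])
raw_mean = np.array([10.0, 0.1])
print(100.0 * abs_bias / raw_mean)   # percentage error is 100x larger when dry
```

In the second part, an identical absolute bias maps to roughly 0.1 % in the humid box and roughly 10 % in the dry box, which is why absolute and percentage errors are best read together at high latitudes.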

With data screening
Satellite measurements have instances of "bad" retrievals, which are flagged as a part of the operational processing (Osterman et al., 2009). The quality flags are provided in every TES metadata product. The total number of good points sampled by TES from 2005 through 2008 is shown in Fig. 5 for each species. Figure 5 shows that from January to May 2005, TES has less than 15 000 sampled points, and in June 2005, TES performed no measurements. In June-December 2008, the reduction in the total number of points sampled by TES is due to reduced latitudinal coverage, from ±82° to a range between 70° N and 50° S. Figure 6 provides an example of the geographical distribution of ozone concentrations in October 2006 at 562 hPa. It compares the monthly averages computed from the total number of screened points sampled along the TES orbit (left column), gridded to the model resolution, to the monthly mean of the model output computed from the 3-hourly time-series (left column) in the GISS-PUCCINI model. The figure shows that on a monthly-mean time-scale, observations by a nadir sun-synchronous satellite can adequately capture the magnitude and the large-scale distribution pattern due to transport of ozone and of carbon monoxide (plots not shown).
Figures 7a and 7b show the time versus latitudinal-mean curtain of the percentage error calculated for ozone and CO, respectively, when the sampled points are screened with the quality flags (top) and not screened (bottom). The plots show a similar pattern of error, with a slight increase in error when the sampled data are screened, due to the reduction in the total number of points. The higher than usual percentage errors calculated in January-May 2005 and in September 2005 suggest that a monthly average computed from less than 50 % of the total number of nadir points measured by global sun-synchronous observations with a latitudinal coverage of ±82° may not provide an adequate representation of the monthly mean.
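The screening step and the 50 % coverage consideration can be sketched as below. The boolean `good_flag` array stands in for whatever quality information accompanies the retrievals; it, the function name `screened_monthly_mean` and the `min_fraction` argument are hypothetical and do not correspond to actual TES product fields.

```python
import numpy as np

def screened_monthly_mean(values, good_flag, n_expected, min_fraction=0.5):
    """Monthly mean of quality-screened samples plus a coverage indicator.

    values       : sampled values for one month
    good_flag    : boolean array, True where the retrieval passed screening
    n_expected   : number of samples in an unscreened, full-coverage month
    min_fraction : coverage below which the mean is treated with caution
    """
    values = np.asarray(values, dtype=float)
    good = np.asarray(good_flag, dtype=bool)
    n_good = int(good.sum())
    mean = values[good].mean() if n_good > 0 else np.nan
    adequate = n_good >= min_fraction * n_expected
    return mean, n_good, adequate

# Made-up month: about 70 % of 1000 samples pass the screening.
rng = np.random.default_rng(1)
vals = rng.normal(60.0, 10.0, 1000)
flags = rng.random(1000) < 0.7
print(screened_monthly_mean(vals, flags, n_expected=1000))
```

Returning the coverage indicator alongside the mean, instead of discarding sparse months outright, leaves the decision on how to treat such months to the analysis.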

The influence of monthly averaging
In this section, we test the implication of the co-variability of the averaging kernel and the concentrations of the species on a monthly-mean time scale. We can write the expectation of the retrieved species in Eq. (1) as:
$$E[\hat{x}] = E[x_a] + E[A\,(x - x_a)] + E[\epsilon] \qquad (11)$$
however, $E[\epsilon] = 0$ if we assume a zero-mean spectral measurement error. We can therefore rewrite the second term on the right-hand side of Eq. (11) as:
$$\left(E[A\,(x - x_a)]\right)_j = \sum_{k=1}^{N} E\left[a_{jk}\,(x_k - x_{a,k})\right] \qquad (12)$$
where $a_{jk}$ are the elements of the $N$ by $N$ averaging kernel matrix $A$ for a particular target scene, $j$ is the row and $k$ is the column of the matrix.
If the variability of the elements of the averaging kernel $A$ is uncorrelated with the variability of the true state $x$ for all orbits belonging to the sampled grid points on a monthly-mean timescale, then we approximate Eq. (12) as:
$$\sum_{k=1}^{N} E\left[a_{jk}\,(x_k - x_{a,k})\right] \approx \sum_{k=1}^{N} E[a_{jk}]\,E[x_k - x_{a,k}] \qquad (13)$$
where the validity of the condition specified in Eq. (13) depends on the atmospheric species under consideration. The validity of the Eq. (13) approximation is a necessary condition for constructing the monthly mean of observations in a manner analogous to monthly-mean model output. If we now use Eq. (13) in Eq. (11), we derive the approximation:
$$E[\hat{x}] \approx E[x_a] + E[A]\,\left(E[x] - E[x_a]\right) \qquad (14)$$
The monthly mean of the sampled species with the application of the satellite operator derived from Eq. (2) can therefore be written as:
$$\overline{\hat{x}^m}_G = \frac{1}{n_G}\sum_{i=1}^{n_G}\left[x^a_i + A_i\,(x^m_i - x^a_i)\right] \qquad (15)$$
where $i$, $n_G$, and $G$ are as defined in Sect. 4. The monthly-mean approximation of Eq. (15) can therefore be constructed using Eq. (14), given by:
$$\overline{\hat{x}^m}_G \approx \bar{x}^a_G + \bar{A}_G\,\left(\bar{x}^m_G - \bar{x}^a_G\right) \qquad (16)$$
We apply the monthly mean of the a priori profile $x^a_i$ and the monthly mean of the averaging kernel matrix $A_i$ to the monthly average of the model profiles sampled along the TES nadir orbit $x^m_i$ to test the closeness of the approximation in Eq. (16) to Eq. (15) for ozone, CO, atmospheric temperature and water vapour in any given month. In particular, we quantify the bias due to averaging the operators by calculating the absolute error $V$ and the percentage error $VP$, taken as the difference between the approximation of Eq. (16) and the exact monthly mean of Eq. (15):
$$V = \left[\bar{x}^a_G + \bar{A}_G\,\left(\bar{x}^m_G - \bar{x}^a_G\right)\right] - \overline{\hat{x}^m}_G$$
$$VP = 100 \times \frac{V}{\overline{\hat{x}^m}_G}$$
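The comparison between applying each operator to its own profile and applying monthly-mean operators to the monthly-mean profile can be sketched numerically. The data below are synthetic, the averaging kernels are simple scaled identity matrices, and the sign convention (approximation minus exact) is an assumption of this example; none of it reproduces the actual TES operators.

```python
import numpy as np

def exact_monthly_mean(x_a, A, x_m):
    """Apply each operator to its own profile, then average (n profiles)."""
    x_hat = x_a + np.einsum("ijk,ik->ij", A, x_m - x_a)
    return x_hat.mean(axis=0)

def approx_monthly_mean(x_a, A, x_m):
    """Average a priori, kernels and profiles first, then apply once."""
    xa_bar, A_bar, xm_bar = x_a.mean(axis=0), A.mean(axis=0), x_m.mean(axis=0)
    return xa_bar + A_bar @ (xm_bar - xa_bar)

def averaging_error(x_a, A, x_m):
    """Absolute (V) and percentage (VP) error of the monthly-mean shortcut."""
    exact = exact_monthly_mean(x_a, A, x_m)
    V = approx_monthly_mean(x_a, A, x_m) - exact
    VP = 100.0 * V / exact
    return V, VP

# Synthetic month: n profiles on L levels, shapes (n, L) and (n, L, L).
rng = np.random.default_rng(2)
n, L = 40, 3
s = rng.uniform(0.2, 0.9, n)                     # scene-dependent sensitivity
A = s[:, None, None] * np.eye(L)
x_a = np.tile(np.log([8.0, 2.0, 0.2]), (n, 1))   # made-up a priori (ln g/kg)
# Profiles whose departure from the a priori co-varies with the sensitivity.
x_m = x_a + 0.5 * (s[:, None] - s.mean()) + rng.normal(0.0, 0.05, (n, L))
V, VP = averaging_error(x_a, A, x_m)
print(V, VP)
```

When the averaging kernel is the same for every profile the two calculations agree exactly, because the operator is linear; a non-zero error appears only when the kernel elements co-vary with the state.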
The percentage and absolute errors are presented in Figs. 8a and 8b, respectively. The zonal-mean errors are calculated by summing up the absolute errors and dividing by the total number of points in the latitudinal zone at each level. The errors for ozone, carbon monoxide, atmospheric temperature and water vapour are shown in the first, second, third and fourth rows of both figures, respectively. The first row of Figs. 8a and 8b shows that the error caused by averaging the ozone averaging kernel is only up to ±1 % (up to ±1 ppbv) in the lower and middle troposphere in both models. In the UTLS, the error can be up to 3 % (about 8 ppbv) in both models. In the second row of Fig. 8a, we see that using a monthly-mean CO averaging kernel could cause errors between −0.2 % and +0.7 % throughout the whole troposphere in both models. In absolute values, these CO errors due to using monthly-mean averaging kernels range only from −0.2 ppbv to 0.6 ppbv (see Fig. 8b). Again, the biases we calculated show no significant influence of using the monthly-mean averages of the operators for both ozone and carbon monoxide. This is especially interesting, since it shows that monthly means of TES ozone and carbon monoxide observations similar to model output are suitable for model evaluation projects such as the ACC-MIP project.
With the exception of the boundary layer over the southern-hemispheric high latitudes, the error due to averaging the temperature averaging kernels is less than ±0.08 K throughout the entire troposphere in both models (see the third row of Figs. 8a and 8b). Averaging the averaging kernel led to an atmospheric temperature error of up to 0.2 K in the boundary layer over the southern-hemispheric higher latitudes (third row of Fig. 8b), which is still small in comparison with typical model biases. Employing the water vapour monthly-mean averaging kernels causes an error that ranges from −1 % to 8 % within the entire troposphere in both models (fourth row of Fig. 8a). The absolute error within the lower troposphere varies from −0.04 to 0.16 g kg−1. However, in the middle and upper troposphere, the absolute water vapour error due to using monthly-mean operators only varies within ±0.04 g kg−1 (fourth row of Fig. 8b).
In summary, using the monthly average of the averaging kernels has no significant impact on the application of nadir satellite retrievals to models for ozone and carbon monoxide. Averaging the averaging kernel may slightly impact the boundary layer atmospheric temperature over the southern-hemispheric higher latitudes, where the error calculated is up to 0.2 K. The bias of up to 8 % calculated in the upper troposphere due to the use of water vapour monthly-mean operators may be significant for the water vapour feedback on the rate of global warming (e.g. Soden and Held, 2006).
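The size of the error introduced by monthly-mean operators can be related directly to the co-variability of the averaging kernel and the state. The identity below is a standard property of expectations and is offered as a reading aid rather than as part of the original derivation; the symbols follow the averaging kernel elements $a_{jk}$, the state $x_k$ and the a priori $x_{a,k}$ used in this section.

```latex
\begin{align}
E\!\left[a_{jk}\,(x_k - x_{a,k})\right]
  &= E[a_{jk}]\,E[x_k - x_{a,k}]
   + \mathrm{Cov}\!\left(a_{jk},\; x_k - x_{a,k}\right)
\end{align}
```

Replacing the expectation of the product by the product of the expectations therefore omits, for each retrieval level $j$, the sum over $k$ of these covariance terms. The approximation is exact when the kernel elements and the state are uncorrelated, and it degrades for species whose vertical sensitivity changes with the state itself, which is consistent with the larger errors found here for water vapour.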

The influence of satellite operators
The averaging kernel and the a priori profiles account for the limited vertical resolution of the satellite measurement, and are together called the satellite operators. The reduction of the sensitivity due to clouds is contained in the averaging kernel. When the averaging kernel is applied to the model, it accounts for the reduction in vertical sensitivity caused by the clouds. For TES data, the satellite operators are included in the retrievals and are provided as part of the data distribution. This section presents the error associated with neglecting these operators, that is, not following the techniques explained in Sect. 3 that lead to the evaluation of Eq. (2). We denote the monthly average of the sampled model time-series with the application of the TES operator as: where n_G and G are as defined in Sect. 4. The monthly mean of the sampled model output without the application of satellite operators is given by Eq. (6). We calculated the absolute error due to neglecting the application of the satellite operators as T: and the percentage error TP as: where y_m is the sampled model output, and ŷ_m denotes the sampled model output with the TES operators applied according to Eq. (2) and Eq. (3) or (4). Figures 9a and 9b show the 2005-2008 zonal-mean percentage and absolute errors, respectively, caused by not applying the operators. Note that the zonal-mean errors are calculated by summing up the absolute errors and dividing by the total number of points in the latitudinal zone at each level. For both models, the consequences of not using the operators for ozone, carbon monoxide, temperature and water vapour are shown in the first, second, third and fourth rows of Figs. 9a and 9b, respectively.
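A minimal Python sketch of how errors of this kind could be computed is given below. It is an illustration under stated assumptions, not a reproduction of the equations of this paper: the operator is assumed to take the linear form ŷ_m = y_a + A (y_m − y_a), T is assumed to be the monthly mean of y_m minus the monthly mean of ŷ_m, and TP is assumed to be T expressed as a percentage of the monthly mean of ŷ_m. The function name `neglect_operator_error` and the array layout are hypothetical.

```python
import numpy as np

def neglect_operator_error(Y, Ya, Ak):
    """Error from comparing raw sampled model output instead of smoothed output.

    Y:  (n, L) sampled model profiles y_m for one month and grid box.
    Ya: (n, L) a priori profiles.
    Ak: (n, L, L) averaging kernels.
    Returns (T, TP): absolute and percentage error per level, shape (L,).
    """
    # Apply each observation's operator: y_hat = y_a + A (y - y_a) (assumed form).
    smoothed = Ya + np.einsum("nij,nj->ni", Ak, Y - Ya)
    mean_raw = Y.mean(axis=0)              # monthly mean without operators
    mean_smoothed = smoothed.mean(axis=0)  # monthly mean with operators
    T = mean_raw - mean_smoothed           # sign convention is an assumption
    TP = 100.0 * T / mean_smoothed         # denominator is an assumption
    return T, TP
```

With identity kernels the smoothed and raw profiles coincide and both errors vanish. With kernels close to zero the smoothed profile relaxes to the a priori, so the error approaches the difference between the model and the a priori.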
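For ozone, the percentage error caused by not accounting for the limited vertical resolution of the nadir satellite observations ranges from −30 % to more than 60 %. In absolute terms, the biases vary from −10 ppbv to 25 ppbv within the lower and middle troposphere of both models. In the UTLS, the biases range from −90 ppbv to more than 50 ppbv in both models (see the first row of Figs. 9a and 9b). The ozone results show strong model dependence, and further show that the influence of the averaging kernel on a model output is a function of the distribution of ozone calculated by the model. For carbon monoxide, the percentage error of not applying the operators varies between ±30 % (that is −40 ppbv to 15 ppbv) in the ECHAM5-MOZ model, and lies between −30 % and 20 % (−35 to 30 ppbv) in the GISS-PUCCINI model (see the second row of Figs. 9a and 9b).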
In the third row of Figs. 9a and 9b, neglecting the TES atmospheric temperature operators causes an error ranging from −1.5 K to more than 5 K in both models. For water vapour, the percentage error for not applying the operators ranges from −10 % to 40 % throughout the entire troposphere, except over the boundary layer and the UTLS of the higher latitudes. In the ECHAM5-MOZ model, the percentage error in the boundary layer and the UTLS of the higher latitudes can be up to 100 %, while the respective errors are up to 80 % and −60 % in the GISS-PUCCINI model (see the fourth row of Fig. 9a). Absolute errors from neglecting the water vapour operators are concentrated in the lower and middle troposphere, and vary from −0.3 to 1.4 g kg−1 in the ECHAM5-MOZ model and from −0.4 to 2.5 g kg−1 in the GISS-PUCCINI model (as shown in Fig. 9b).
In comparison to the impact of sampling and averaging, the failure to account for the limited vertical resolution of a nadir satellite measurement yields the largest bias. This highlights the importance of the operators and of accounting for the differences between the vertical resolution of nadir satellites and models. A summary of the impact of sampling along the satellite orbit, averaging the observation operators, and accounting for the limited vertical resolution through the application of the operators is presented in Table 2 for both models.

Summary and discussion
This paper presented the technical steps required for using nadir sun-synchronous satellite observations for multi-model evaluation and the uncertainties associated with each step. We quantified the implications of sampling, the effect of averaging the observation operators (that is, the a priori and the averaging kernel profiles) for use with monthly-mean model output, and the impact of neglecting the observation operators. We calculated these biases in ozone, carbon monoxide, atmospheric temperature and water vapour by using the output from two global chemistry climate models (ECHAM5-MOZ and GISS-PUCCINI) and the observations from the TES instrument from January 2005 to December 2008.
The results show that sampling has no significant influence on ozone, carbon monoxide and water vapour throughout the entire troposphere in both models. We calculated 2005-2008 zonal-mean sampling biases no larger than ±2.5 %, with absolute values within ±8 ppbv, for ozone in both models. Carbon monoxide and water vapour sampling biases were also within the ranges of ±1.2 % (that is −2 to 5 ppbv) and ±3 % (that is ±0.14 g kg−1), respectively, in both models. We also found insignificant biases due to using the monthly averages of the ozone and carbon monoxide operators. The biases due to averaging the operators range from only −1 % to 3 % (that is −1 to 8 ppbv) for ozone and from −0.2 % to 0.7 % (that is −0.2 ppbv to 0.6 ppbv) for carbon monoxide in both models.
The influence of sampling on atmospheric temperature is within the range of ±0.3 K in the tropical and mid-latitudes in both models. However, the biases due to sampling become significant over the boundary layer in the higher latitudes, where they can be as large as −1.4 K, especially in the GISS-PUCCINI model. Even though the biases due to averaging the temperature averaging kernel and the a priori profiles were quite small (only ±0.08 K) in the tropical and mid-latitudes throughout the entire troposphere, they were as high as 0.2 K over the boundary layer in the southern-hemispheric higher latitudes. We also found a bias of up to 8 % in upper-tropospheric water vapour when the monthly-mean operators were used. This may be significant for the feedback of water vapour on the rate of global warming.
Our results show the importance of the averaging kernel and the a priori profiles in accounting for the limited vertical resolution of a nadir satellite measurement for model application. Neglecting the observation operators results in large biases: up to more than 60 % for ozone, ±30 % for carbon monoxide, −1.5 K to 5 K for atmospheric temperature, and −60 % to 100 % for water vapour. These high biases highlight that the reduction in sensitivity due to clouds and all other effects, which are captured by the averaging kernels, cannot be ignored when comparing satellite measurements with numerical models.
These results show that monthly averages constructed from points sampled by the nadir satellite sufficiently capture the magnitude and the large-scale distribution pattern due to transport of ozone and carbon monoxide on a monthly-mean time scale, and are adequate for model evaluation, provided that observation operators constructed from similar averages are applied to the model output.

Fig. 1. An example of the nadir orbit of the Tropospheric Emission Spectrometer for a particular global survey.


Fig. 2. The distribution of the maximum throughput of a nadir sun-synchronous sampling binned to the ECHAM5-MOZ and GISS-PUCCINI model grids of T42 and 2.5° × 2°, respectively, in a 30-day month.

Fig. 3. The time-series of model output at longitude 20° E and latitude 18° S in July 2005-2008 (grey lines). The points sampled along the model time-series using TES spatio-temporal information are shown as black symbols (diamonds, triangles, inverted triangles and squares) for each respective year. The ECHAM5-MOZ and GISS-PUCCINI models are shown in the left and right columns, respectively. The grey and black symbols on the far right of the plot are the means of the original time-series (grey) and of the sampled points (black). On each of the mean values, we show the standard error of the mean as defined in Eqs. (9) and (10) for the sampled and raw model time-series, respectively.

Fig. 5. The total number of good nadir points sampled by TES from 2005 through 2008 for ozone, carbon monoxide, temperature and water vapour. TES conducted no measurements in June 2005.

Fig. 6. Geographical distribution of the monthly average of ozone in October 2006 at 562 hPa, computed from the points sampled from the GISS-PUCCINI model using only the spatio-temporal locations of the "good" nadir points, i.e. screened (left), compared with the monthly-mean model output computed from the entire 3-hourly time-series, i.e. no screening (right).

Fig. 7a. The latitudinal-average time series of the percentage error in ozone calculated when the sampled points are screened with the quality flags (top) and not screened (bottom). The screened series has a gap because TES performed no measurements in June 2005.

Fig. 7b. The latitudinal-average time series of the percentage error in carbon monoxide calculated when the sampled points are screened with the quality flags (top) and not screened (bottom). The screened series has a gap because TES performed no measurements in June 2005.

Fig. 8a. The 2005-2008 monthly-mean zonal average of the percentage error introduced by using the monthly-mean satellite operator approximation in the ECHAM5-MOZ (left column) and GISS-PUCCINI (right column) models. The influence of the monthly-mean averaging kernel on ozone, carbon monoxide, atmospheric temperature and water vapour is shown in the first, second, third and fourth rows, respectively.

Fig. 9a. The 2005-2008 monthly-mean zonal average of the percentage error caused by failing to account for the limited vertical resolution of the nadir satellite. We show this impact on the ECHAM5-MOZ (left column) and the GISS-PUCCINI (right column) ozone, carbon monoxide, atmospheric temperature and water vapour in rows one, two, three and four, respectively.

Table 2. The 2005-2008 zonal-mean percentage and absolute errors due to the technical steps required to use sun-synchronous satellite observations in model evaluation. We show the biases due to (1) sampling the model along the TES spatio-temporal resolution, (2) using the monthly average of the satellite operators and (3) neglecting the application of the satellite operators. The list of acronyms used in the table is given below.