Stability of temperatures from TIMED/SABER v1.07 (2002–2009) and Aura/MLS v2.2 (2004–2009) compared with OH(6-2) temperatures observed at Davis Station, Antarctica

Temperature profiles from two satellite instruments, TIMED/SABER and Aura/MLS, have been used to calculate hydroxyl-layer equivalent temperatures for comparison with values measured from OH(6-2) emission lines observed by a ground-based spectrometer located at Davis Station, Antarctica (68° S, 78° E). The profile selection criteria (miss-distance <500 km from the ground station and solar zenith angles >97°) yielded a total of 2359 SABER profiles over 8 years (2002–2009) and 7407 MLS profiles over 5.5 years (2004–2009). The availability of simultaneous OH volume emission rate (VER) profiles from the SABER OH-B channel enabled an assessment of the impact of several different weighting functions in the calculation of OH-equivalent temperatures. The maximum difference between all derived hydroxyl-layer equivalent temperatures was less than 3 K. Restricting the miss-distance and miss-time criteria showed little effect on the bias, suggesting that the OH layer is relatively uniform over the spatial and temporal scales considered. However, a significant trend of ∼0.7 K/year was found in the bias between SABER and Davis OH over the 8-year period, with SABER becoming warmer compared with the Davis OH temperatures. In contrast, Aura/MLS exhibited a cold bias of 9.9 ± 0.4 K compared with Davis OH but, importantly, the bias remained constant over the 2004–2009 period examined. The difference in bias behaviour of the two satellites has significant implications for multi-annual and long-term studies using their data.

Correspondence to: W. J. R. French (john.french@aad.gov.au)


Introduction
The SABER (Sounding of the Atmosphere by Broadband Emission Radiometry) instrument on board NASA's TIMED (Thermosphere Ionosphere Mesosphere Electrodynamics) satellite has been providing temperature profiles as one of its level 2 products since early 2002 (Mertens et al., 2001). Several reports of comparisons between SABER temperatures and those measured by ground-based techniques and instruments have been published in the past few years (e.g., Mertens et al., 2004; Oberheide et al., 2006; Xu et al., 2006; López-González et al., 2007; Mulligan and Lowe, 2008; Remsberg et al., 2008; Smith et al., 2010). Each report provides an estimate of the offset or bias between SABER and the comparator dataset (see Table 2 in Supplement). All of these studies involve relatively short data runs, the longest of which was the three-year (2003–2006) study of Oberheide et al. (2006), and they all involve comparisons with data from Northern Hemisphere sites only. This work differs from previous studies in two respects: it uses ground-based temperature data from a Southern Hemisphere station, Davis Station, Antarctica (68° S, 78° E), and it includes the years 2002–2009, which enables us to examine the behaviour of the bias between SABER and the Davis dataset during this extended period. The instrument at Davis has been well documented (Greet et al., 1998; French et al., 2000) and has been providing well characterised temperature data in every Southern Hemisphere winter season since 1995 (Burns et al., 2003; French and Burns, 2004; French et al., 2005). We also compare our ground-based observations with temperature profiles from a second satellite instrument, the Microwave Limb Sounder (MLS) on board the EOS (Earth Observing System) Aura spacecraft (Schwartz et al., 2008), over the period 2004–2009 to help contextualise our SABER-Davis OH results.

Davis spectrometer
A 1.26 m f/9 Czerny-Turner scanning spectrometer has operated at Davis each year since 1995, recording the P-branch lines of the OH(6-2) band near 840 nm. Observations are made in the zenith with a 5.3° field-of-view and with an instrument resolution of ∼0.16 nm. A cooled GaAs photomultiplier, operated in pulse counting mode, detects the sky emission. Acquisition time is of the order of 7 min per spectrum. Further details of the instrument are contained in Greet et al. (1998) and French et al. (2000).
Transition probabilities taken from Langhoff et al. (1986) are used to derive rotational temperatures because they form a complete set for all bands and are closest to the experimentally determined ratios of French et al. (2000) for the OH(6-2) band. Sample temperatures are derived as a weighted average of temperatures from the three possible ratios from the P1(2), P1(4) and P1(5) emission lines. The weighting factor is the statistical counting error (based on the error in estimating each line intensity). P1(3) is not used due to contamination by an un-thermalised OH(5-1) P1(12) line (French et al., 2000). P1(2) is corrected for the ∼2% temperature-dependent contribution by Q1(5). Auroral activity is monitored via the atomic oxygen line at 844.6 nm. Backgrounds are selected to balance the small auroral contribution (from N2 1PG and N2+ Meinel bands) and solar Fraunhofer absorption for spectra acquired during moonlit conditions. Correction factors account for the difference in Λ-doubling between the P-branch lines; these are determined with knowledge of the instrument line shape from high-resolution scans of a frequency-stabilized laser. Further details of the rotational temperature analysis procedure are available in Burns et al. (2003) and French and Burns (2004).
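The weighted averaging of the three ratio temperatures can be illustrated with a minimal sketch. It assumes inverse-variance weights built from the counting errors, which is one common reading of "weighted by the counting error" and is not confirmed by the text; the function name and the example numbers are hypothetical.

```python
import math

def weighted_rotational_temperature(temps_k, errors_k):
    """Combine ratio temperatures with inverse-variance weights (an assumption)."""
    if len(temps_k) != len(errors_k) or not temps_k:
        raise ValueError("need matching, non-empty temperature and error lists")
    weights = [1.0 / (err ** 2) for err in errors_k]
    total = sum(weights)
    mean_k = sum(w * t for w, t in zip(weights, temps_k)) / total
    # Formal uncertainty of an inverse-variance weighted mean.
    sigma_k = math.sqrt(1.0 / total)
    return mean_k, sigma_k

# Hypothetical temperatures (K) from the P1(2)/P1(4), P1(2)/P1(5) and
# P1(4)/P1(5) ratios, with hypothetical counting errors (K).
example_temps = [205.0, 208.0, 212.0]
example_errors = [6.0, 4.0, 9.0]
mean_t, sigma_t = weighted_rotational_temperature(example_temps, example_errors)
```

With this choice the ratio with the smallest counting error dominates the sample temperature, which is consistent with the statement later in the paper that the P1(2)/P1(5) ratio contributes most to the weighted mean.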
Instrument spectral response calibration is maintained by reference to several Low Brightness Source (LBS) units, which are cross-referenced annually to Australian National Measurement Institute (ANMI) standards. A total of 863 scans of the LBS on the spectrometer and 356 cross-reference scans of the LBS against ANMI standards were made over the 2002 to 2009 data interval considered here. The instrument response correction has not varied by more than 1.1% in the P1(2)/P1(5) ratio (the most widely separated ratio used for a rotational temperature calculation) over this time, corresponding to a maximum temperature variation of 1.1 K due to the different response corrections applied for all years. The correction uncertainty is generally less than 0.3 K each year, with the exception of 2002 (1.2 K) due to detector cooling problems. Unlike previously published work (Burns et al., 2003; French and Burns, 2004; French et al., 2005), the data used in this study have been corrected for an issue with the orientation of the LBS: the LBS was measured in the vertical orientation at Davis but cross-referenced with measurements taken with the LBS in the horizontal orientation at ANMI. A measurable difference in the spectral radiance of the LBS was detected in 2007 as a result of the altered shape of the tungsten filament in the different orientations of the LBS. Correcting the ANMI calibrations to match the Davis vertical measurements results in cooler OH rotational temperatures by an average of 0.9 K.

SABER
The SABER instrument is a radiometer which measures Earth limb emission profiles over the spectral range 1.27–17 µm from the TIMED satellite, in circular orbit at 625 km inclined at 74° to the equator (Russell et al., 1999). The latitude coverage ranges from 54° S to 82° N or 82° S to 54° N depending on the yaw cycle. The satellite orbit precesses slowly to complete 24 h of local time in 60 days. Temperature is retrieved from 15 µm and 4.3 µm CO2 emissions over an altitude range of 10–105 km, with a vertical resolution of about 2 km and an along-track resolution of 400 km (Mertens et al., 2002). A summary of the evolution of SABER data releases is provided by Remsberg et al. (2008). Retrievals employ iterative algorithms in the upper mesosphere lower thermosphere (UMLT) region to account for non-local thermodynamic equilibrium (non-LTE) radiative transfer effects. An assessment of these algorithms has been made by Mertens et al. (2001, 2002, 2004) and further detail of the v1.07 non-LTE retrieval algorithm is given by García-Comas et al. (2008).
Errors in the retrieved temperatures in the 80–100 km region are estimated to be in the range ±1.5–5 K if the kinetic temperature profile does not have pronounced vertical structure (García-Comas et al., 2008). Errors are greater in summer polar conditions, but these are not of concern here since our study concentrates on Southern Hemisphere polar winter. In addition to kinetic temperature, volume emission rate (VER) profiles are derived simultaneously from the OH-B channel, sensitive in the range 1.56–1.72 µm, which includes mostly the OH(4-2) and OH(5-3) bands. While we compare temperatures from OH(6-2) band measurements at Davis in this study, altitude differences between all the vibrational levels are not expected to exceed 2 km (McDade, 1991). SABER calibration is maintained by reference to an internal blackbody (at 247 K) every 446 s. Between each up/down limb scan pair, views are also made of "cold space" and the instrument baffling as an estimate of internal stray light effects (Remsberg et al., 2008).

EOS Aura Microwave Limb Sounder
The EOS (Earth Observing System) Aura spacecraft was launched on 15 July 2004 into a near-polar, 705 km altitude, sun-synchronous orbit (Schwartz et al., 2008). It provides almost complete global coverage (82° S–82° N) with ∼14 orbits per day. The MLS (Microwave Limb Sounder) instrument on board Aura observes thermal microwave emissions in different regions from 115 GHz to 2.5 THz. The MLS field-of-view is in the direction of orbital motion, and it scans the Earth's limb vertically from ∼5–100 km every 24.7 s, with an along-track resolution of ∼165 km (increasing to 220 km in the UMLT region). Temperature and geopotential height profiles are produced on a fixed vertical pressure grid by the MLS level 2 processing algorithm (version 2.2 data are used in this study), which is applied to the thermal microwave emissions near the 118 GHz O2 and 234 GHz O18O spectral lines. It produces scientifically useful temperature profiles for geopotential heights corresponding to the range 316 hPa (∼8 km) to 0.001 hPa (∼97 km) (Schwartz et al., 2008). The vertical resolution, as defined by the full width at half maximum (FWHM) of the averaging kernels, varies from 5.3 km at 316 hPa to 9 km at 0.1 hPa and reaches 15 km at 0.001 hPa (Schwartz et al., 2008). The effect of the considerably lower resolution of the MLS data, compared with a SABER temperature profile, on the results of this study is discussed in greater detail in Sect. 4.3. Recent studies utilizing MLS temperature measurements in the MLT region include an investigation of summer planetary-scale oscillations by Meek and Manson (2009) and of the quasi 5-day wave by von Savigny et al. (2007).

Davis OH(6-2)
Hydroxyl observations at Davis are made when the sun is more than 7° below the horizon, which defines an observing season window between 18 February (day 049) and 23 October (day 296). This interval includes the warm winter mesopause period with partial coverage of the spring and autumn transition periods. No observations can be obtained during the cold summer mesopause period. Nightly averages are derived from all 7 min spectra that pass data quality selection criteria. Application of these criteria discards measurements with weighted standard deviation >15 K, counting errors >10 K, high backgrounds, large background slopes, and unusual changes of intensity between consecutive scans. Burns et al. (2003) have examined the influence of cloud, aurora and Fraunhofer absorption in moonlight in these data. A total of 1677 nightly averages are obtained for the 2002 to 2009 interval from nearly 125 000 individual spectra that pass the selection criteria. A minimum of 10, maximum of 163 (average of 74) samples contribute to each nightly average. Figure 1a shows these nightly averages with 1σ error bars in comparison with the MSISE-90 model temperatures for 87 km at 69° S. The model shows good agreement with the observations at Davis, and illustrates the full seasonal cycle at this latitude.
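A minimal sketch of the quality screening and nightly averaging described above follows. Only the two numeric thresholds quoted in the text are applied; the background and intensity-change tests are omitted because no thresholds are given. The dictionary keys are hypothetical, and treating the quoted minimum of 10 samples as a hard requirement is an assumption.

```python
from statistics import mean

# Thresholds quoted in the text; the field names below are hypothetical.
MAX_WEIGHTED_STD_K = 15.0
MAX_COUNTING_ERROR_K = 10.0
MIN_SAMPLES_PER_NIGHT = 10  # assumed to act as a minimum requirement

def passes_quality(spectrum):
    """Apply the two numeric rejection criteria to one 7 min spectrum."""
    return (spectrum["weighted_std_k"] <= MAX_WEIGHTED_STD_K
            and spectrum["counting_error_k"] <= MAX_COUNTING_ERROR_K)

def nightly_average(spectra):
    """Mean temperature of the accepted spectra, or None if too few remain."""
    good = [s["temperature_k"] for s in spectra if passes_quality(s)]
    if len(good) < MIN_SAMPLES_PER_NIGHT:
        return None
    return mean(good)
```

Returning None for sparse nights keeps them out of later bias statistics without raising an error in a long processing run.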

SABER
SABER version 1.07 data (available at http://saber.gats-inc.com) were used in this study. A total of 2547 profiles (typically 2–4 profiles/day) satisfied the selection criteria of tangent point within a 500 km radius of Davis and solar zenith angle >97° (night-time profiles corresponding to the spectrometer measurements) over the 2002–2009 interval. The satellite 60-day yaw cycle results in SABER observations over Davis in the same three time intervals each year (days 75–140, 196–262 and 323–014). However, the solar zenith angle criterion (>97°) completely rejects the summer interval (days 323–014). Temperature profiles were rejected if the VER was unusual, viz., the FWHM of the VER profile was >2σ from the mean profile width (4.8 km < FWHM < 11.5 km) or if the pressure interpolated to 87 km was >2σ from the mean pressure (0.0012 < pressure < 0.005 mb) at that altitude (188 profiles rejected, leaving 2359). Profiles were also rejected if their VER profiles departed significantly from a Gaussian shape (measured by the χ² fit value), which mostly eliminated double-peaked VER profiles, in a similar fashion to Winick et al. (2009). Following the application of these criteria, only those profiles that were within 8 h of the closest OH(6-2) nightly average were retained, giving a total of 2060 profiles.

www.atmos-chem-phys.net/10/11439/2010/ Atmos. Chem. Phys., 10, 11439–11446, 2010
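The spatial and solar zenith angle criteria can be sketched as follows. This is an illustration only: it assumes a spherical Earth with a mean radius of 6371 km and the haversine formula for the miss-distance, uses the rounded station coordinates quoted in the paper, and the profile field names are hypothetical. The 8 h time criterion and the VER-based rejections are not included.

```python
import math

EARTH_RADIUS_KM = 6371.0             # assumed mean Earth radius
DAVIS_LAT, DAVIS_LON = -68.0, 78.0   # rounded coordinates from the text

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2.0) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def is_coincident(profile, max_km=500.0, min_sza_deg=97.0):
    """True if the tangent point is within range and the profile is night-time."""
    dist = great_circle_km(DAVIS_LAT, DAVIS_LON, profile["lat"], profile["lon"])
    return dist < max_km and profile["sza_deg"] > min_sza_deg
```

At 68° S one degree of longitude spans only about 42 km, so a simple latitude/longitude box would be a poor substitute for a true distance test.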
In order to compare satellite temperatures with those derived from the ground-based instrument, we derive hydroxyl-layer equivalent temperatures from satellite profiles using a weighting function which is representative of the hydroxyl layer (e.g., Oberheide et al., 2006; López-González et al., 2007; Mulligan and Lowe, 2008). In the first part of this study, we concentrated on SABER profiles because of the availability of simultaneous OH 1.6 µm VER profiles. Five different weighting profiles were considered as follows:
- W_SL: the weighting function from She and Lowe (1998) of the form W_SL = (a/(1 … where the coefficients are a = 27; b = 87.133; c = 3.8682; d = 1.8235; e = 2.9306, calculated over the tangent point altitude (Z) range 70 < Z < 110 km.
- W_G87: a Gaussian centred at 87 km with FWHM 8 km, calculated over the range 70 < Z < 104 km.
- W_VER: the SABER (OH-B) VER profile, evaluated from the altitude of the peak OH VER ±20 km.
- W_GFIT: a Gaussian fitted to the altitude of the OH VER peak ±20 km, but then weighted to ±17 km.
- W_VERm: a profile derived from the mean of the 2359 VER profiles that comprise the initial selection set.
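As an illustration of how a weighting profile is turned into a layer-equivalent temperature, the sketch below implements the Gaussian centred at 87 km with FWHM 8 km from the list above and applies it as a normalised weighted average. The normalisation and the assumption of a uniformly spaced altitude grid are choices made for this sketch; the function names are hypothetical.

```python
import math

def gaussian_weight(z_km, centre_km=87.0, fwhm_km=8.0):
    """Gaussian weighting with unit peak value at the layer centre."""
    sigma = fwhm_km / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((z_km - centre_km) / sigma) ** 2)

def oh_equivalent_temperature(alt_km, temp_k, weight_fn=gaussian_weight,
                              z_min=70.0, z_max=104.0):
    """Weighted mean temperature over z_min < Z < z_max (uniform grid assumed)."""
    num = 0.0
    den = 0.0
    for z, t in zip(alt_km, temp_k):
        if z_min < z < z_max:
            w = weight_fn(z)
            num += w * t
            den += w
    if den == 0.0:
        raise ValueError("no profile levels inside the weighting range")
    return num / den
```

Passing a different `weight_fn`, for example one interpolating a measured VER profile, would give the VER-based variants without changing the averaging step.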
These weighting profiles are shown in Fig. 2 with the mean (and standard error) of the 2359 SABER temperature profile retrievals across the UMLT region. Multiplying each temperature profile by the corresponding weighting profile yields T_SL, T_G87, T_VER, T_GFIT and T_VERm. The SABER temperature interpolated to 87 km (T_Alt) was also calculated. T_GFIT temperatures derived in this way are shown in Fig. 1b. […] with a standard error of 0.27 K. Table 1 summarizes the mean differences for the 2060 individual profiles and the 847 nightly average comparisons. Overall, SABER and OH temperatures lie within 2 K of each other, no matter which weighting profile is chosen, and whether the comparison is made on an individual profile or nightly average basis. Profiles based on the SABER VER (T_VER, T_GFIT and T_VERm) generally lead to a warm bias, while weighting with the She and Lowe (1998) profile leads to a cool bias. The smallest standard deviation in the bias is obtained with the W_VERm profile.

Spatial and temporal variability may contribute to the differences between Davis OH and SABER temperatures since we have accepted all overpasses within 500 km in range and 8 h in time. The effect on bias has been examined with more restrictive spatio-temporal coincidence criteria, restricting the range of miss-distance acceptance in 100 km steps and miss-time acceptance from ±8 h to ±15 min. No systematic trend in the bias is observed with tighter restrictions on either criterion for any of the weighting functions considered (see Tables 3 and 4 of the Supplement). The consistency of biases over the span of spatial and temporal ranges considered indicates that the OH layer is largely uniform over these scales, or that differences are averaged out in the wide field-of-view of SABER. Burns et al. (2003) found good correlation between Davis OH and sodium lidar temperatures at Syowa station (69° S, 39° E) at a distance of 1500 km. The majority of the day-to-day variability is in response to large-scale planetary waves with spatial extents of several thousand km (French and Burns, 2004; French et al., 2005). Tidal amplitudes are small at this latitude (generally less than 2 K), which contributes to the lack of bias deviation over timescales up to 8 h (Hagan et al., 1999).
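The sensitivity test on the coincidence criteria can be sketched as a scan over progressively tighter miss-distance limits. The tuple layout, function names and the use of the standard error of the mean as the reported uncertainty are assumptions for this sketch, not details taken from the paper.

```python
import math
from statistics import mean, stdev

def bias_summary(pairs, max_km, max_hours):
    """Mean bias, standard error and count for pairs inside the limits.

    Each pair is (miss_km, miss_hours, t_saber_k, t_oh_k).
    """
    diffs = [t_sat - t_oh for km, hours, t_sat, t_oh in pairs
             if km <= max_km and abs(hours) <= max_hours]
    if len(diffs) < 2:
        return None  # too few coincidences for a standard error
    return mean(diffs), stdev(diffs) / math.sqrt(len(diffs)), len(diffs)

def scan_miss_distance(pairs, max_hours=8.0, step_km=100, limit_km=500):
    """Bias summary for each miss-distance limit from step_km to limit_km."""
    return {limit: bias_summary(pairs, limit, max_hours)
            for limit in range(step_km, limit_km + 1, step_km)}
```

A bias that stays flat, within its standard error, as the limit shrinks is what supports the interpretation that the layer is uniform over these scales.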

SABER-Davis OH bias trend
Our analysis thus far has treated the 2002–2009 comparison dataset as a whole. Returning to Fig. 1c, we now examine the trend in bias over the 8-year time span. Means for each year, and for each yaw cycle "campaign", are plotted in Fig. 3a.
A linear fit shows an apparent trend, with the bias increasing at ∼0.7 K/year (implying that SABER temperatures are getting warmer relative to the OH measurements); the trend is independent of the weighting function used. To our knowledge such a trend has not been reported in previous comparisons. A calibration drift in either instrument could produce such a trend. The total change in response correction for the Davis instrument over 2002–2009 equates to about 1.1%, or 1.1 K in temperature, from the P1(2)/P1(5) ratio (which contributes ∼57% on average to the weighted mean) and less for the other ratios. The uncertainty in the response calibration is ∼0.3 K. Thus, if attributed to a calibration drift in the Davis instrument, a change in bias of ∼0.7 K/year (a total of 5.6 K over 8 years) is conservatively at least 5 times the maximum change in response correction and at least 19 times the uncertainty.
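The trend estimate and the arithmetic behind the calibration argument can be reproduced with a short sketch. The least-squares fit is a generic implementation, and the yearly bias values are hypothetical numbers chosen only to give a slope near 0.7 K/year; they are not the values plotted in Fig. 3a.

```python
def linear_trend(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two matching points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean

# Hypothetical yearly mean SABER-OH biases (K), for illustration only.
years = [2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009]
bias_k = [-2.4, -1.9, -1.0, -0.5, 0.5, 1.0, 1.9, 2.4]
slope_k_per_year, intercept_k = linear_trend(years, bias_k)

# Arithmetic using the figures quoted in the text.
total_drift_k = 0.7 * 8                      # 5.6 K over 8 years
ratio_to_response = total_drift_k / 1.1      # about 5.1
ratio_to_uncertainty = total_drift_k / 0.3   # about 18.7, i.e. roughly 19
```

Both ratios are well above unity, which is the basis for arguing that the Davis response calibration alone is unlikely to account for a drift of this size.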
Another possibility for the source of a bias trend could be an inaccurate representation of long-term changes in CO2 or O concentrations in the SABER non-LTE retrieval algorithm, perhaps as a consequence of the decrease in solar flux over the life of the SABER mission (J.-H. Yee and M. López-Puertas, personal communication, 2010). If SABER is the source of the drift, it has important implications for the use of its temperatures in the assessment of long-term change. Further information on the bias trend may be gained over the next few years as solar activity increases, but for now the source of the trend in bias remains unresolved.

Aura/MLS-Davis OH bias and trend
As a check on the bias trend we have performed the same comparison exercise using available data from the Microwave Limb Sounder (MLS) on NASA's Aura satellite. A total of 7407 MLS level 2 (version 2.2) profiles between July 2004 and December 2009 satisfied the same acceptance range (<500 km) and solar zenith angle (>97°) criteria used for the SABER analysis. Since our SABER analysis demonstrated that OH-equivalent temperatures are relatively insensitive to the shape of the weighting function, we used only a simple Gaussian weighting profile centred on an altitude of 87 km with a FWHM of 8.7 km to calculate hydroxyl-layer equivalent temperatures. A significant cold bias of 9.9 ± 0.4 K (MLS colder than Davis OH) was obtained as the mean of 1034 nightly averages over the 5.5 years. Importantly, however, biases derived for each year do not show an increasing drift between 2004 and 2009, as shown in Fig. 3b.
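Whether a set of yearly biases shows a drift can be checked by comparing the fitted slope with its 95% confidence half-width. The sketch below assumes an ordinary least-squares fit with independent, equally weighted yearly means, which may differ from the fit actually used for Fig. 3; the function name is hypothetical, and the lookup table holds standard two-sided 95% Student-t critical values.

```python
import math

# Two-sided 95% Student-t critical values, keyed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776,
             5: 2.571, 6: 2.447, 7: 2.365, 8: 2.306}

def slope_confidence(x, y):
    """Return (slope, 95% half-width, significant?) for a straight-line fit."""
    n = len(x)
    dof = n - 2
    if n != len(y) or dof not in T_CRIT_95:
        raise ValueError("need 3 to 10 matching points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    resid_ss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    slope_se = math.sqrt(resid_ss / dof / sxx)
    half_width = T_CRIT_95[dof] * slope_se
    return slope, half_width, abs(slope) > half_width
```

With only six yearly means the critical value is large (2.776 for four degrees of freedom), so a short record can only rule out fairly strong drifts.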
One possible explanation for the difference in bias behaviour is the effect of the much lower vertical resolution of MLS (15 km FWHM) compared with SABER (2 km FWHM). This was investigated in the following manner. All of the SABER profiles that passed the selection criteria were convolved with averaging kernels of the altitude and width described in Schwartz et al. (2008) for MLS retrievals, thereby creating SABER profiles with the vertical resolution of an MLS profile (see Fig. 4 of the Supplement). These "MLS-like" SABER profiles were then weighted with a Gaussian profile centred at 87 km altitude with a FWHM of 8.7 km to calculate OH-equivalent temperatures comparable to SABER T_G87.
Although individual "MLS-like" profiles produced OH-equivalent temperatures as much as 30 K different from the SABER T_G87 values, averages for each year were within 2 K, and the bias drift compared with Davis OH measurements persisted.
Applying a relatively broad Gaussian weighting to both the higher resolution SABER profile and the lower resolution "MLS-like" profile tends to attenuate any differences between them. On the basis of this investigation, we conclude that the lower resolution of the Aura profiles does not significantly change the OH-equivalent temperatures or the bias stability reported.
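The resolution-degradation step can be approximated with a simple smoothing sketch. It assumes a single Gaussian kernel of 15 km FWHM applied at every level, renormalised near the profile edges; the real MLS averaging kernels vary with pressure and are not exactly Gaussian, so this is a stand-in rather than the procedure used in the study.

```python
import math

def gaussian_kernel_smooth(alt_km, temp_k, fwhm_km=15.0):
    """Smooth a profile with a Gaussian kernel of the given FWHM.

    Assumes a uniformly spaced altitude grid; weights are renormalised at
    each level so that the edges of the profile are not biased low.
    """
    sigma = fwhm_km / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    smoothed = []
    for z0 in alt_km:
        weights = [math.exp(-0.5 * ((z - z0) / sigma) ** 2) for z in alt_km]
        total = sum(weights)
        smoothed.append(sum(w * t for w, t in zip(weights, temp_k)) / total)
    return smoothed
```

Applying the 87 km Gaussian layer weighting after this smoothing averages over a comparable depth again, which is why individual profiles can change substantially while yearly means change little.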
- Profiles based on the SABER VER (T_VER, T_GFIT and T_VERm) generally lead to a warmer SABER-OH bias, while the She and Lowe (1998) and Gaussian at 87 km weighting profiles generally lead to a cooler bias.
- More restrictive spatio-temporal coincidence criteria have little effect on the biases obtained, indicating that the hydroxyl layer is largely uniform over the scales considered here.
- The SABER-OH bias appears to be increasing at ∼0.7 K per year over the comparison interval. To our knowledge such a trend has not been previously reported.
- This trend is not apparent in equivalent comparisons with Aura/MLS measurements.
Recently, Remsberg et al. (2008) have reported an assessment of the quality of the version 1.07 temperature-pressure profiles of the middle atmosphere from TIMED/SABER. They conclude that the SABER v1.07 temperature distributions can be used to generate the near-global, seasonal and interannual variations of T_k in the UMLT for the period 2002 to 2008. We agree that the SABER temperatures are a unique resource for researchers in the UMLT region. Our detailed comparison of SABER temperatures with OH temperatures recorded at Davis over an 8-year period (2002–2009) suggests that some caution may be needed when using SABER temperatures over an extended time period.

Fig. 1. (A) Davis hydroxyl nightly average temperatures (1677 nights) overplotted on MSISE-90 model temperatures for 69° S. Error bars for the Davis OH series are 1 standard error-in-the-mean. (B) SABER T_GFIT nightly averages (847 nights) with an estimate of the SABER error (5 K) plotted for comparison with the OH temperatures. (C) The SABER-DavisOH difference for coincident nights.

Fig. 2. SABER weighting profiles considered in this study (see text for descriptions), compared with the mean winter mesopause temperature profile derived from the 2359 SABER profiles within 500 km range.

Fig. 3. (A) The mean SABER-DavisOH bias for each year, and for each yaw-cycle campaign, from Fig. 1C. A straight-line fit to the latter shows a trend in the bias of ~0.7 K/annum. (B) MLS-DavisOH bias for each year. A straight-line fit to these data is not significantly different from zero trend at the 95% confidence level.

Table 1. A comparison of the mean temperatures and SABER-OH biases for each weighting profile for 2060 individual profiles and 847 nightly averages compared to the closest OH temperature measurement (T_OH) and closest OH nightly average (T_OHNA).