GOMOS ozone profile validation using ground-based and balloon sonde measurements

The validation of ozone profiles retrieved by satellite instruments through comparison with data from ground-based instruments is important to monitor the evolution of the satellite instrument, to assist algorithm development and to allow multi-mission trend analyses. In this study we compare ozone profiles derived from GOMOS night-time observations with measurements from lidar, microwave radiometer and balloon sonde. Collocated pairs are analysed for dependence on several geophysical and instrument observational parameters. Validation results are presented for the operational ESA level 2 data (GOMOS version 5.00) obtained during nearly seven years of observations; a comparison using a smaller dataset from the previous processor (version 4.02) is also included. The profiles obtained from dark limb measurements (solar zenith angle > 107°), when the provided processing flag is properly considered, match the ground-based measurements within ±2 percent over the altitude range 20 to 40 km. Outside this range, the pairs start to deviate more and there is a latitudinal dependence: in the polar region, where there is a higher amount of straylight contamination, differences start to occur lower in the mesosphere than in the tropics, whereas for the lower part of the stratosphere the opposite happens: the profiles in the tropics reach less far down, as the signal reduces faster because of the higher altitude at which the maximum ozone concentration is found compared to the mid and polar latitudes. Also, the bias shifts from mostly negative in the polar region to more positive in the tropics. Profiles measured under "twilight" conditions often match the ground-based measurements very well, but care has to be taken in all cases when dealing with "straylight" contaminated profiles.
For the selection criteria applied here (data within 800 km, 3 degrees in equivalent latitude, 20 h (5 h above 50 km) and a relative ozone error in the GOMOS data of 20% or less), no dependence was found on stellar magnitude, star temperature or the azimuth angle of the line of sight. No evidence of a temporal trend was seen in either the bias or the frequency of outliers, but a comparison applying less strict data selection criteria might show differently.
Correspondence to: J. A. E. van Gijsel (anne.van.gijsel@rivm.nl)
Published by Copernicus Publications on behalf of the European Geosciences Union.


Background
Ultraviolet (UV) light present in solar radiation can potentially threaten life on Earth, as UV radiation can cause alterations in DNA (Luchnik, 1975). Ozone in the Earth's atmosphere absorbs 97 to 99% of the UV, significantly reducing the harmful effects. Most of this absorption takes place in the so-called ozone layer, which is concentrated at altitudes between 15 and 35 km. A reduction of the ozone concentration and the associated increase of ultraviolet radiation is expected to result in a change of plant species composition and a possible reduction of agroecosystem production (Milchunas et al., 2004; Ballare et al., 1996; Koti et al., 2005). Animal experiments suggest that, through the increase of UV-B radiation (between 280 and 315 nm), the incidence of eye cataracts would rise by 0.5% for each 1% loss of ozone (van der Leun and de Gruijl, 1993), and epidemiological data indicate that the incidence of non-melanoma skin cancer can be expected to increase by 2% per percent ozone decrease (Urbach, 1997).
The catalytic destruction of ozone by chlorofluoromethanes was first described by Molina and Rowland (1974). The Montreal Protocol was designed to protect the ozone layer, and thereby life on Earth, from destruction by ozone-depleting substances. Although production of these substances has been significantly reduced, their long lifetime means that ozone destruction will continue for several decades, as can be seen from the appearance of the record-size ozone hole above Antarctica in 2006 (ESA, 2006).
The European Space Agency launched the ENVISAT satellite, dedicated to environmental research, in March 2002. ENVISAT carries three instruments dedicated to atmospheric studies: SCIAMACHY, MIPAS and GOMOS (see http://envisat.esa.int/instruments/). The main objective of the last instrument is to monitor ozone and its trends in the stratosphere. GOMOS stands for Global Ozone Monitoring by Occultation of Stars and, as its name states, this instrument uses stellar occultation to retrieve information on ozone and other trace gases from spectra in the ultraviolet, visible and near-infrared wavelengths. GOMOS is self-calibrating and, owing to its star tracking capabilities, has a very accurate altitude determination.
Approval has recently been given for the continuation of the ENVISAT mission beyond 2010 (its originally planned end-of-mission year). The current end of the mission is expected no later than August 2014, but the exact date depends on the available amount of fuel (EO-PE (PLSO and MAO teams), 2007). In order for the mission to continue, some orbital changes will take place in October 2010. These changes will reduce the altitude of the platform and reduce the repeat cycle from 35 to 30 days, but no major problems are foreseen for GOMOS acquisitions. However, comparison with long-term validation records is required to monitor the effects of these changes as well as the ageing of the platform and instrument, and to assess improvements in the GOMOS processing algorithms. In this respect, validation activities are essential to guarantee the stability of the quality of GOMOS and other remote sensor products (Dupuy et al., 2009; Brinksma et al., 2006).

Previous validation activities
The quality assessment of ozone profiles retrieved from satellite data can be carried out in three different ways: 1) using model studies/climatology; 2) using already validated alternative satellite products; or 3) using profiles collected with ground-based/airborne instruments. Bertaux et al. (2004) compared GOMOS ozone profiles of 4 days in 2002 with the Fortuin-Kelder ozone climatology and found an excellent agreement. Differences found were attributed to natural variation and the inclusion of daytime data in the climatology, whereas only night-time GOMOS measurements were taken for the comparison. They also compared two GOMOS measurements at the same location, but from two consecutive orbits and using distinct stars. The observed internal consistency was again referred to as "excellent". Kyrölä et al. (2006) built a climatology from the GOMOS measurements (prototype processor version 6.0a) consisting of monthly latitudinal distributions of the ozone number density and mixing ratio profiles. The generated stratospheric profiles were compared with the Fortuin-Kelder daytime ozone climatology. Large differences were observed in the polar region, which were found to be correlated with large increases of NO2. Around the equator GOMOS reported significantly less ozone than the Fortuin-Kelder climatology, but it was mentioned that the Fortuin-Kelder climatology was less reliable in this region due to the low number of data points used. In the upper stratosphere, ozone values from GOMOS were systematically larger than in the Fortuin-Kelder climatology, which was again attributed to the diurnal variation. In the middle and lower stratosphere, GOMOS reported a few percent less ozone than Fortuin-Kelder.
Verronen et al. (2005) compared night-time GOMOS ozone profiles with MIPAS measurements for individual cases as well as profile means for a limited number of profiles (1 day in 2002 and 1 day in 2003). Although MIPAS uses a different measurement technique from GOMOS (MIPAS is a mid-infrared limb sounder), good agreement (within 10-15%) was found for the stratosphere and lower mesosphere. Nevertheless, MIPAS persistently gives a higher estimate in this altitude region. Note also that two different GOMOS processor versions had been used for the two days. Comparing GOMOS version 5.00 ozone profiles to ACE-FTS (version 2.2 ozone update product), median differences between the collocated profiles were within 10% for the altitude range 15 to 40 km (Dupuy et al., 2009).
In Meijer et al. (2004), a comparison of approximately 2500 GOMOS version 4.02 ozone profiles with data from lidar, balloon sonde and microwave radiometer was presented. The authors illustrated that the quality of the GOMOS profiles strongly depended on the limb illumination conditions. For dark limb measurements, the GOMOS profiles agree well (bias < 7.5%) with the collocated data over the altitude range 14 to 64 km. No dependence on star temperature and magnitude or latitude was found, although the observed bias between 35 and 45 km was somewhat larger in the polar regions.
The ozone profiles delivered by GOMOS were compared with balloon sonde measurements acquired in 2003 at two locations by Tamminen et al. (2006). Their results indicated that the overall agreement between collocated measurements was good and that small scale structures could be detected with GOMOS' vertical resolution. Explanations for the differences between the two locations were sought in star brightness and strength of the polar vortex. Renard et al. (2008) found an excellent agreement between GOMOS ozone profiles and balloon-borne vertical columns in the middle stratosphere, with an accuracy of 10% for individual profiles.
For the tropical zone, several ground-based and satellite measurements including GOMOS have been compared with data from a balloon-based sensor (SAOZ UV-Vis spectrometer using solar occultation) circling the globe in three missions (Borchi and Pommereau, 2007). GOMOS prototype processor version 6.0b performed very well above 22 km (bias of 1-2.5%), but degraded strongly below this altitude. Even though the altitude registration of GOMOS was considered very precise, SAGE II and SAOZ were found to be more precise in terms of ozone: ∼2% compared to ∼6% for GOMOS above 22 km. Note, however, that both the latitudinal coverage and the number of data samples were very limited. Furthermore, it is suggested that remote sensing measurements have a systematic high bias in areas of oceanic convective clouds.
Also in this region, Mze et al. (2010) compared GOMOS version 5.00 ozone profiles to balloon sondes from eight stations in the SHADOZ network. They found a satisfactory agreement between 21 and 30 km, although site-dependent differences were observed. At lower altitudes, the GOMOS ozone profiles exhibited a large positive bias compared to the balloon sondes.

Outline
This article can be seen as a continuation of the work presented in Meijer et al. (2004), as the available GOMOS dataset is extended to seven years and a new processor version is available. The following sections describe the input data and the methodology. Section 3 presents the validation results: Sect. 3.1 compares the previous processor (version 4.02) with the current operational processor (version 5.00) for an overlapping dataset, and Sect. 3.2 gives the validation results of version 5.00 for the seven-year dataset. The conclusions can be found in Sect. 4.

GOMOS ozone profiles
The GOMOS data used in this study include the operational level 2 data from version 5.00 spanning the period August 2002 to August 2009. We also obtained a dataset processed with the previous algorithm version 4.02 for comparison purposes. This second set contains data from the period June 2004 to January 2005, complemented with a few measurements in August 2005. Note that, as the version 4.02 data do not cover the same time period as used in Meijer et al. (2004), the results presented here are not directly comparable. We do not intend to reproduce their results; we merely aim to point out differences between version 4.02 and 5.00 relative to the ground-/balloon-based measurements. Section 2.1.1 describes the implemented changes from the old (4.02) to the current (5.00) processor. All data were restricted to an estimated error in the ozone concentration of 20% or less.
The product confidence data (PCD) flags in the GOMOS products indicate the validity of the retrieval of the local density profiles. In addition, the GOMOS ozone profiles receive a quality flag based on the illumination conditions of the atmospheric limb. Five illumination conditions have been characterised: bright (solar zenith angle at the tangent point smaller than 97° [...]
Because of the orbit chosen for ENVISAT, no full-dark measurements can be taken over the Arctic region. Nevertheless, similar to Meijer et al. (2004), an alternative filtering using a solar zenith angle larger than 107° (astronomical twilight zenith angle) for the tangent points will be used here as well in order to get a picture for this region.
Besides the illumination condition, the data quality is also influenced by the characteristics of the observed star. Weak or dim stars have a lower signal-to-noise ratio and therefore a noisier transmission spectrum than strong/bright stars. Furthermore, the star temperature determines the wavelength of maximum intensity: for hot stars this is in the UV region, whereas for cold stars the maximum intensity is in the visible wavelengths and the transmission in the UV part is very noisy. As the UV wavelengths are used for the retrieval above 40 km, the usability of weak stars is strongly reduced there (Tamminen et al., 2010; European Space Agency, 2007).
The product quality disclaimer contains additional recommendations from the GOMOS quality working group for data selection (European Space Agency, 2006).

Changes from version 4.02 to 5.00
In addition to various corrections applied to the level 1 product, several level 2 processor changes have been implemented in IPF 5.00.
The atmospheric density profile is no longer retrieved in version 5.00; instead, a reference atmospheric density profile is derived from ECMWF data below 1 hPa and the MSIS90 model above. This profile is subsequently used in the version 5.00 retrieval. This was implemented because in version 4.02 a strong deviation from ECMWF data was observed below 25 km and above 40 km. The retrievals should especially improve at low altitudes, where ECMWF data are accurate.
Additional errors are reported for ozone, NO3 and aerosols. A quadratic aerosol law (αλ² + βλ + γ, where λ is the wavelength and α, β and γ are altitude dependent and derived from the GOMOS measurements) has been incorporated to describe the wavelength dependence of the aerosol extinction, whereas in version 4.02 an inverse wavelength dependence (1/λ) was assumed. A number of different aerosol models have been studied, of which the quadratic model showed the best performance in comparison to other satellite and ground-based measurements (GOMOS quality working group meeting 15, 2007); it also allows a more realistic description of the aerosol effective cross section than the 1/λ law. A different cross section was introduced for the retrieval of ozone in IPF 5.00: that of Bogumil et al. (2000) is used for both the UV and the visible wavelengths. More details on the GOMOS processing and the introduced changes are given by Bertaux et al. (2010) and in the GOMOS handbook (European Space Agency, 2007).

Ground-based measurements
The importance of ground-based measurements is slowly being recognised through initiatives like GAW (Global Atmosphere Watch), Geomon (Global Earth Observation and Monitoring of the atmosphere) and GMES (Global Monitoring for Environment and Security). Despite the fact that these measurements are essential for a global understanding of our climate, securing long-term funding to warrant their continuation is usually rather difficult (Nisbet, 2007). Although satellite observations can complete the picture through the spatial coverage of their measurements, we must ensure a careful validation of the derived information. It is important to realise that satellite-based instruments are complementary to the ground-based observations, as for instance the temporal and vertical resolutions of the latter are often higher and the errors of the products better characterised. Furthermore, the long-term background measurements by ground-based observations are required to overcome data gaps in between satellite missions and to quantify the introduced differences between sequential satellite-based instruments (McDermid et al., 1990; Jégou et al., 2008; Clerbaux et al., 2008).
Here we combine sonde, lidar and microwave radiometer data for the validation using the altitude ranges where each instrument has the largest added value and best performance.

Stratospheric ozone lidar data
In this study we make use of ozone profiles derived from differentially absorbed lidar signals emitted and recorded by stratospheric ozone lidar systems. Two light pulses are simultaneously emitted at different wavelengths with different ozone absorption cross sections. The difference in the returned backscatter can be related directly to the ozone concentration, which is derived as a function of the altitude based on the elapsed time since the pulse emission. The lidars mostly operate under night-time and clear-sky conditions.
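The differential absorption principle described above can be illustrated with the simplified textbook relation n(z) = -1/(2 Δσ) · d/dz ln(P_on/P_off), where P_on and P_off are the backscatter signals at the more and less strongly absorbed wavelengths. The sketch below is only an illustration under that assumption: it neglects the Rayleigh, aerosol and temperature corrections that operational NDACC retrievals apply, and the function name, argument names and the differential cross section used in the check are hypothetical.

```python
import math


def dial_ozone_number_density(altitudes_m, signal_on, signal_off, delta_sigma_m2):
    """Simplified DIAL retrieval (illustrative assumption, not the NDACC algorithm).

    Returns the ozone number density (molecules per m^3) for each layer between
    consecutive altitudes, from the slope of ln(P_on / P_off) with altitude.
    """
    log_ratio = [math.log(on / off) for on, off in zip(signal_on, signal_off)]
    densities = []
    for i in range(len(altitudes_m) - 1):
        dz = altitudes_m[i + 1] - altitudes_m[i]
        slope = (log_ratio[i + 1] - log_ratio[i]) / dz
        # Factor 2 accounts for the two-way path of the laser pulse.
        densities.append(-slope / (2.0 * delta_sigma_m2))
    return densities
```

With synthetic signals generated for a constant ozone density, the function returns that density for every layer, which is a quick sanity check of the sign and the two-way path factor.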
All of the eleven participating lidars are part of the Network for the Detection of Atmospheric Composition Change (NDACC). The lidar working group of NDACC has developed various protocols to ensure consistency between the different lidars, and high data quality is established through intercomparison and validation exercises with models and other instruments (McDermid et al., 1998; NDACC lidar working group, 2009).
The lidar network can be considered homogeneous within about 2 percent and, on average, precision of the ozone measurements is around 1% up to 30 km, 2 to 5% at 40 km and 10% at 45 km (Keckhut et al., 2004; Steinbrecht et al., 2009). On average, resolutions range between 1 and 2 km at low altitudes (below 20 km), increasing to 3-5 km at 40 km (Godin et al., 1999).

Balloon-borne ozone sonde data
Ozone sondes consist of an inert pump, an electrochemical cell facilitating a reaction between ozone and iodide, a detector for the small electric current generated by this reaction, and an interface to a radiosonde which additionally measures air temperature and pressure (Deshler et al., 2008). Data are provided as partial ozone pressure, which has been converted to number density using the air temperature and pressure that were measured simultaneously with ozone by the sonde. Ozone sondes have a precision of about 5% (Smit and Kley, 1998; Thompson et al., 2003a; Deshler et al., 2008).
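As an illustration of the conversion from partial pressure to number density, the sketch below applies the ideal gas law, n = p / (k_B T). This is an assumption about the form of the conversion rather than the exact processing used for the sonde data; the function name and the choice of mPa as input unit are hypothetical.

```python
BOLTZMANN_J_PER_K = 1.380649e-23  # exact SI value of the Boltzmann constant


def ozone_number_density_cm3(partial_pressure_mpa, temperature_k):
    """Ozone number density (molecules per cm^3) from partial pressure (mPa)
    and air temperature (K), assuming the ideal gas law."""
    pressure_pa = partial_pressure_mpa * 1e-3
    per_m3 = pressure_pa / (BOLTZMANN_J_PER_K * temperature_k)
    return per_m3 * 1e-6  # 1 m^3 = 1e6 cm^3
```

For example, an ozone partial pressure of 15 mPa at 220 K corresponds to roughly 4.9 × 10¹² molecules per cm³, a typical order of magnitude near the ozone maximum.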
In this study, balloon soundings have been used from the Ground-Based Measurement and Campaign Database (GBMCD) subgroup of the Atmospheric Chemistry and Validation Team (ACVT), with the addition of Southern Hemisphere Additional Ozonesondes (SHADOZ; see Thompson et al. (2003a, b, 2007) for a description of this initiative) to increase the coverage in the tropics.
Data from the SHADOZ sondes are re-binned to longer time intervals using a block average for a given time window size (e.g. 10 s). In order to deal with the non-linear behaviour of pressure with increasing altitude, the logarithm of the reported pressures was taken before averaging, followed by taking the inverse logarithm of this average to normalise.
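The block averaging of pressure in log space can be sketched as follows. The function and argument names are hypothetical, and binning by the integer part of time divided by the window size is an assumption about how the blocks are delimited.

```python
import math


def block_average_log_pressure(times_s, pressures_hpa, window_s=10.0):
    """Average pressures per time block in log space.

    Each sample is assigned to block floor(t / window_s); the result for a
    block is exp(mean(ln p)), i.e. the geometric mean of its pressures.
    """
    blocks = {}
    for t, p in zip(times_s, pressures_hpa):
        blocks.setdefault(int(t // window_s), []).append(math.log(p))
    return [math.exp(sum(logs) / len(logs)) for _, logs in sorted(blocks.items())]
```

Averaging the logarithm and transforming back is equivalent to a geometric mean, so a block containing 100 hPa and 25 hPa yields 50 hPa rather than the arithmetic mean of 62.5 hPa.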
All sonde data have been cut off at an altitude of 30 km and averaged over two kilometres (corresponding to the GOMOS resolution below 35 km) to avoid the introduction of local biases caused by the presence of small scale structures seen by the sonde, which would mainly enlarge the standard deviation of the differences. This 2-km averaging was done using a running mean.
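A minimal sketch of the cut-off and 2-km running mean is given below. It assumes a centred window (±1 km around each level) that is simply truncated at the profile edges; the source does not specify these details, and the function name is hypothetical.

```python
def running_mean_by_altitude(altitudes_km, values, window_km=2.0, max_altitude_km=30.0):
    """Discard levels above max_altitude_km, then replace each remaining value
    by the mean of all remaining values within +/- window_km / 2 of its altitude.

    Returns a list of (altitude_km, smoothed_value) pairs.
    """
    kept = [(z, v) for z, v in zip(altitudes_km, values) if z <= max_altitude_km]
    half = window_km / 2.0
    smoothed = []
    for z0, _ in kept:
        in_window = [v for z, v in kept if abs(z - z0) <= half]
        smoothed.append((z0, sum(in_window) / len(in_window)))
    return smoothed
```

Because the cut-off is applied before the averaging in this sketch, values above 30 km do not leak into the smoothed profile.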

Microwave radiometer data
As a third validation instrument we have used data from microwave radiometers. These instruments are often operated continuously during both day and night time. Although they have a coarse vertical resolution, the data are useful to study the stratosphere and especially the mesosphere, where lidar data are no longer available.
The vertical resolution (defined as the full width at half maximum of the averaging kernels) is in the range 6 to 10 km between 20 and 50 km and about 13 km at 64 km (Boyd et al., 2007; Hocke et al., 2007). Precision is typically about 5% between 20 and 55 km and degrades above (7% at 64 km). Compared to the ozone profiles provided by the Aura Microwave Limb Sounder (version 2.2), agreement with two NDACC microwave radiometers was within 5% (Boyd et al., 2007).
Data are restricted in this study to altitudes ranging between 30 and 70 km, with the condition that the reported error cannot exceed 30%.

Equivalent latitude data
Potential vorticity (PV) data on the 475 K potential temperature field were obtained from the ECMWF interim reanalysis (ERA-interim) data archive. Since it has been noted that the position of the vortex boundary derived from potential vorticity data may differ from that seen in observations (Greenblatt et al., 2002; Müller and Günther, 2003), which has been attributed to the availability of input data for the calculation of PV, it was decided not to interpolate the PV spatially and temporally nor to derive the vortex position. Instead, equivalent latitudes were derived for all GOMOS 5.00 data as well as for the ground-based measurements, and data were linked to the nearest grid cell (cell size of 1.5°) and closest time (PV data are computed for 8 h intervals). Subsequently, the relative equivalent latitude difference between the GOMOS and ground-based measurements was used to study the effect on the validation results.

Collocations and data treatment
Following Meijer et al. (2004), we have restricted all collocations to a maximum horizontal distance of 800 km and a maximum time difference of 20 h between measurements. For the full dataset comparison in Sect. 3.2, we also enforce a maximum difference in equivalent latitude of 3 degrees to avoid problems in the polar region related to observing different air masses. Above altitudes of 50 km, the maximum time difference is set to 5 h and the daylight conditions have to be the same, as mesospheric ozone is subject to diurnal variation.
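The collocation criteria above can be written as a simple predicate. The sketch below is illustrative only: the dictionary keys, the function names and the use of the haversine formula with a mean Earth radius of 6371 km are assumptions, not details given in the text.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumption for this sketch)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2.0) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_collocated(sat, ground, altitude_km, max_km=800.0, max_eqlat_deg=3.0):
    """Apply the distance, time and equivalent-latitude criteria at one altitude.

    sat and ground are dicts with the hypothetical keys 'lat', 'lon',
    'time_h' (hours), 'eqlat' (degrees) and 'is_daylight' (bool).
    """
    above_50_km = altitude_km > 50.0
    max_hours = 5.0 if above_50_km else 20.0
    distance = great_circle_km(sat["lat"], sat["lon"], ground["lat"], ground["lon"])
    ok = (
        distance <= max_km
        and abs(sat["time_h"] - ground["time_h"]) <= max_hours
        and abs(sat["eqlat"] - ground["eqlat"]) <= max_eqlat_deg
    )
    if above_50_km:
        # Mesospheric ozone has a diurnal cycle: require equal daylight conditions.
        ok = ok and sat["is_daylight"] == ground["is_daylight"]
    return ok
```

Because the time limit depends on altitude, the same pair of measurements can count as collocated at 30 km but not at 60 km.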
Both the validation and GOMOS datasets have been interpolated using a nearly linear spline to a common (200 m) altitude grid. As described before, the sonde data are averaged to the GOMOS resolution using a running mean. Differences in vertical resolution are not taken into account for the lidar data, because the effect is considered relatively small given the similar resolution of GOMOS. If we were to apply the averaging kernels and consider the a-priori information from the microwave radiometer data, the GOMOS data would be degraded and no longer independent from the microwave radiometer data (Meijer et al., 2003). Not taking this resolution difference into account should lead to an increased standard deviation of the differences between GOMOS and the microwave retrievals. Substantial differences would be expected in altitude regions where there are small scale features, which is less likely above 30 km. Here we have smoothed the GOMOS data that collocate with microwave radiometer measurements using a running mean of 10 km, taken as an average microwave radiometer resolution at 50 km (the middle of the range used for the validation). Note, however, that no large effects were observed when completely disregarding the differences in resolution.
Fig. 1. GOMOS 4.02 (top) and 5.00 (bottom) versus validation data. Left panels show the ozone number density as a function of altitude, with the GOMOS profiles in red and the validation data in blue (mean and standard deviation in thick and thin lines respectively). The ozone concentration is plotted on a log-scale for the upper 30 km. The middle panels show the difference between GOMOS and the validation data (with respect to the validation data) in percentage as a function of altitude. The green line shows the median difference profile, the black lines the mean (thick black line) plus/minus 1 standard deviation (thin lines) and the grey lines show the mean plus/minus 2 standard errors. On the right side of the middle panel the number of collocated pairs is shown, with the total number of used pairs at the bottom of the plot. The right panel shows the median difference (thick black line) together with the 16 and 84 percentiles (dark grey lines) and the 2.5 and 97.5 percentiles (light grey lines).
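The resampling of both datasets to a common 200 m altitude grid can be sketched as below. Note that this sketch substitutes plain piecewise-linear interpolation for the nearly linear spline mentioned in the text; the function names are hypothetical, and input altitudes are assumed to be strictly increasing.

```python
def make_altitude_grid(start_km, stop_km, step_km=0.2):
    """Regular altitude grid from start_km to stop_km inclusive (200 m default)."""
    n_steps = int(round((stop_km - start_km) / step_km))
    return [start_km + i * step_km for i in range(n_steps + 1)]


def interpolate_linear(altitudes_km, values, grid_km):
    """Piecewise-linear interpolation onto grid_km.

    Grid points outside the measured altitude range are returned as None,
    so that no extrapolated values enter the comparison.
    """
    result = []
    for z in grid_km:
        if z < altitudes_km[0] or z > altitudes_km[-1]:
            result.append(None)
            continue
        for i in range(len(altitudes_km) - 1):
            z0, z1 = altitudes_km[i], altitudes_km[i + 1]
            if z0 <= z <= z1:
                weight = (z - z0) / (z1 - z0)
                result.append(values[i] + weight * (values[i + 1] - values[i]))
                break
    return result
```

Once both profiles of a collocated pair are on the same grid, the differences can be computed level by level, skipping grid points where either profile is None.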
A complete validation should also consider the provided estimates of error in the ozone retrievals. In this study we have used the provided errors in the validation and GOMOS data only in the data selection process, as for GOMOS the estimated error is a subject of discussion in the quality working group (e.g. the scintillation correction is still an issue; Sofieva et al., 2009) and errors in the validation data are often not reported (sonde) or non-homogeneous (e.g. different definitions used in the lidar community). As a consequence, a full study could be dedicated to the comparison of errors and their uncertainties. We believe that, through the large numbers used in the analyses, these complications are dealt with in a different way, as the error in the data should correspond to the spread seen in a dataset for a large population. The improved error estimates in the next GOMOS processor version (IPF 6) are described in Tamminen et al. (2010) and suggestions for further improvements are given in Sofieva et al. (2010).
[...] data is shown on a log-scale from 50 km upward to enhance visibility. The middle plots show the difference between GOMOS and the validation, where the difference is calculated as (GOMOS − VALID) / VALID × 100. The green line shows the median difference, the thick black line corresponds to the mean difference, the thin black lines illustrate the mean ± 1 standard deviation and the thin grey lines show the mean ± 2 standard errors. The number of used collocated pairs for a given altitude is shown on the right side of the middle panel, whereas the total number of collocated pairs is shown at the bottom of the panel. The right panel shows the following quantiles of the differences (lines from left to right): 2.5%, 16%, 50% (median), 84% and 97.5%.
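The difference statistics plotted in the figures can be computed per altitude level as in the following sketch. The function names are hypothetical; the use of the sample standard deviation (n − 1 in the denominator) and of linearly interpolated percentiles are assumptions, as the text does not state which estimators were used.

```python
import math
import statistics


def relative_difference_percent(gomos, valid):
    """(GOMOS - VALID) / VALID * 100, relative to the validation measurement."""
    return (gomos - valid) / valid * 100.0


def percentile(sorted_values, q):
    """Percentile q (0 to 100) of an ascending list, with linear interpolation."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower, upper = math.floor(position), math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def summarise_differences(differences):
    """Median, mean, spread and quantiles of the differences at one altitude."""
    ordered = sorted(differences)
    n = len(ordered)
    std = statistics.stdev(ordered)  # sample standard deviation (assumption)
    summary = {
        "n": n,
        "median": statistics.median(ordered),
        "mean": statistics.mean(ordered),
        "std": std,
        "standard_error": std / math.sqrt(n),
    }
    for q in (2.5, 16.0, 84.0, 97.5):
        summary["p%g" % q] = percentile(ordered, q)
    return summary
```

A mean that departs clearly from the median at a given altitude is a sign of outlier profiles, which is how the two curves are read in the comparison that follows.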

Comparison between versions 4.02 and 5.00
The differences between the two analyses in total pairs and in the collocated pairs for some altitudes originate from the difference in the errors assigned to the datasets. In general, more data points in version 5.00 fulfil the criterion of a maximum error of 20%. Few outstanding differences between the two versions can be observed in the median profiles. The small negative bias from 20 to 50 km has shifted positively. With both versions, the standard deviation increases substantially below 30 km due to the presence of some outlier profiles. A large part of the deviation between the mean and median differences between 24 and 30 km can be attributed to comparisons with Dumont d'Urville (66.7° S), Thule (76.5° N) and Legionowo (52.4° N) soundings. A closer investigation at the latter two sites showed that some of these observations include straylight contamination. At Dumont d'Urville, however, the illumination condition is not the only factor involved, as fully dark observations still produce outlier ozone concentrations compared to the soundings. This can be attributed to the increasing spatial variability in this area as time progresses, given the fact that the June and July comparisons show good results. As ozone depletion can start already in mid-winter at the latitude of Dumont d'Urville (Roscoe et al., 1997), differences with measurements at other latitudes are likely to be found, which is what we observe in this case, given the relatively large distance between the (fully dark) satellite and sonde measurements. In addition, small scale structures are difficult to follow with GOMOS' resolution. As spring advances, so does the ozone hole formation, whereas the illumination conditions for GOMOS observations get worse. As a result, most collocations are with lower latitude measurements, which have a very different ozone distribution in this period. One future solution would be to optimise the collocation criteria and make them dependent on latitude and/or time of the year.
www.atmos-chem-phys.net/10/10473/2010/ Atmos. Chem. Phys., 10, 10473-10488, 2010
Figure 1 (see the GOMOS standard deviation in the left panel) also shows that a few additional outlier profiles are produced with version 5.00 around the ozone maximum. These can be filtered out by removing unrealistic profiles exceeding a concentration of 10¹³ molecules per cm³. Note that differences with the comparison carried out by Meijer et al. (2004) at the higher part of the profile (above 45 km, where only microwave data are available for comparison) in version 4.02 are caused by a difference in the time span of the datasets of Meijer et al. and our datasets: the current analysis only covers data from 2004 and 2005, resulting in fewer collocations with microwave radiometers and at fewer sites (e.g. no data are available for Lauder and Mauna Loa). In fact, the majority of these collocations are found at Payerne (80 to 100% depending on the altitude), making the top of the plot a (rather) local instead of global picture.
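The outlier filter described above amounts to rejecting any profile in which at least one level exceeds the threshold. A minimal sketch, assuming each profile is a list of number densities in molecules per cm³ (the function and constant names are hypothetical):

```python
MAX_REALISTIC_DENSITY_CM3 = 1e13  # threshold quoted in the text


def remove_unrealistic_profiles(profiles, threshold=MAX_REALISTIC_DENSITY_CM3):
    """Keep only profiles whose maximum number density does not exceed threshold."""
    return [profile for profile in profiles if max(profile) <= threshold]
```

The filter acts on whole profiles rather than on individual levels, in line with the description of removing "unrealistic profiles".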
Figure 2 shows the same picture as Fig. 1 but with the outlier profiles removed as described above. The median difference profiles are, as expected, virtually the same. The mean now follows the median from an altitude of about 20 to around 60 km. Outside this range we still detect outliers: at low altitudes these are due to a low signal-to-noise ratio and increased scintillation, whereas for the higher altitudes we will investigate with the longer and larger v5.00 dataset whether the observed behaviour is also seen at other locations.

Validation of the GOMOS v5.00 ozone profiles
In this section we present the validation results for all seven years. Note that more collocations are found in the early years, firstly because funding was then available for additional validation measurements and secondly because GOMOS had a larger spatial coverage in the beginning, as it could use a larger azimuth range for the line of sight.
We have split the main dataset into various subsets to identify possible dependencies on observation characteristics. Table 1 gives an overview of the ranges used for these parameters and Fig. 3 shows the locations of the GOMOS data together with the validation sites.

Illumination condition
Figure 4 shows the quality of the observations as a function of the illumination condition. The bright limb cases are presented in the left panel, showing that the retrieval with the current processor is still insufficient for these cases. At high altitudes there is a large negative bias and below 35 km the profiles contain many extreme values.
Under twilight conditions (middle panel), the results look a lot better. Compared to the full-dark limb cases (right panel), there are more high outliers, but a substantial amount of data can be used.
In our "dark" selection (solar zenith angle >107°), a part of the data has limb illumination flags (see Sect. 2.1) indicating twilight and/or straylight contamination (flags equal to 2, 3 or 4) of the profiles. These "light-contaminated" data have been compared to those flagged "dark" (flag equal to 0) in the latitude region 40° N to 50° N. This region was chosen to avoid a potential latitude bias: no dark-flagged collocations are found above 55° N and insufficient pairs were found in the southern hemisphere. The profiles that are flagged as light-contaminated give overall more negative differences than those flagged "dark", but these differences are not significant. However, note that the "dark" flagged cases consisted of at most 70 collocations at a given altitude; when more data become available with time, the differences might turn out to be significant.

Stellar properties
Observations of strong stars should result in profiles of higher quality as the signal is less noisy. Indeed, the 16% and 84% quantiles (Fig. 5) show a narrower distribution over a large part of the altitude range. However, the 97.5% quantile shows the presence of some high-value outliers. The number of collocations with strong stars is low in comparison to the weak star cases, making the difference profiles more variable. At altitudes above 45 km, the majority of the collocations are in the polar region (Ny Ålesund microwave radiometer), whereas for the weak star observations most of the collocations are located in the mid-latitude region. This explains why the difference profiles for the top appear worse for the strong star cases: when we consider only the polar cases, there is almost no difference between the two star magnitude groups.

Figure 4: Validation results for the different limb illumination conditions. Left panel: bright limb; middle panel: twilight limb; right panel: dark limb cases. All plots show the median difference (thick black line) between GOMOS and the validation data together with the 16 and 84 percentiles (thin dark grey lines) and the 2.5 and 97.5 percentiles (thin light grey lines). On the right side of each panel is the number of collocated pairs used for the corresponding altitude.
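The difference plots in this section summarise each altitude level by the median and by the 16/84 and 2.5/97.5 percentiles, which for a normal distribution correspond roughly to one and two standard deviations around the centre. A minimal sketch of such a summary is given below. It assumes that the relative difference is defined as (GOMOS − validation) / validation in percent and that percentiles are obtained by linear interpolation; both are assumptions made for illustration, and all function names and example values are hypothetical.

```python
def percentile(values, q):
    """Percentile q (0 to 100) using linear interpolation between sorted values."""
    data = sorted(values)
    if not data:
        raise ValueError("no values given")
    pos = (len(data) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(data) - 1)
    return data[lo] + (data[hi] - data[lo]) * (pos - lo)


def relative_difference(gomos, valid):
    """Relative difference in percent, assumed to be (GOMOS - VALID) / VALID."""
    return 100.0 * (gomos - valid) / valid


def summarise(differences):
    """Median plus the 16/84 and 2.5/97.5 percentiles of the differences."""
    return {q: percentile(differences, q) for q in (2.5, 16, 50, 84, 97.5)}


# Made-up collocated pairs (GOMOS, validation) at one altitude level.
pairs = [(4.1e12, 4.0e12), (3.9e12, 4.0e12), (4.0e12, 4.0e12),
         (4.4e12, 4.0e12), (3.6e12, 4.0e12)]
diffs = [relative_difference(g, v) for g, v in pairs]
stats = summarise(diffs)
```

Using the median and percentiles rather than the mean and standard deviation keeps the summary robust against the few extreme outlier profiles discussed earlier.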
With respect to the temperature of the observed stars, fewer collocations with cold stars are available than with hot stars, especially in the mesosphere where all collocations are with weak stars. The combination of weak and cold stars complicates the retrieval (Kyrölä et al., 2010a, b), which results in a higher error estimate. This is reflected in the decreasing number of available collocations with altitude as we filter on a maximum error of 20%. No significant influence of the star's temperature on the results is then observed. However, if we increase the maximum permitted error for GOMOS to 100%, we see an increase in the number of available profiles, but the higher half of the profile (roughly above 40 km) shows a strongly increased variability and the median differences increase with respect to the cases shown in Fig. 6 (e.g. at 55 km the data have a negative bias of 50% and at 70 km the bias equals about 30%; not shown). Note that the mentioned data are not flagged invalid.

Line of sight azimuth angle
Figure 7 shows the influence of the line of sight (LOS) azimuth angle during the time of observation. Most observations are found to be in slant viewing and quite a few (given the smaller azimuth range) are in the back LOS. The median difference profiles are very similar, but fewer outliers are observed in the back LOS configuration. In contrast to Meijer et al. (2004), an increased standard deviation is not (any longer) seen for the side LOS data.
GOMOS is currently (September 2010) operating in the range 17° to 47°, which corresponds mostly to the slant LOS. The past ranges are listed in the GOMOS monthly status reports, see http://earth.esa.int/pcs/envisat/gomos/reports/monthly.

Geographical area
For the analysis shown in Fig. 8 the dataset has been split into three geographical regions. Most collocations are found in the mid-latitude region (middle panel), where the majority of the validation stations are located. In the polar region (left panel) there are also many collocations: even though there are fewer stations, there are many GOMOS overpasses given the orbit of ENVISAT. This leads to several GOMOS measurements collocating with a single ground-based measurement. The collocating microwave data are from two stations: Ny Ålesund (largest contribution) and Kiruna (3 profiles above 55 km). The GOMOS profiles increasingly start to overestimate the ozone concentration above 50 km, which is likely an effect of the increasing uncertainties in the microwave radiometer data and the increasing straylight contamination. Perhaps the processor that is under development in the GOMOS bright limb project will improve the ozone retrieval as it does not depend on the weak star signals. In comparison to the other regions, the bias is more negative between 18 and 30 km, reaching up to −8%. As indicated in Sect. 3.2.1, it is possible that (part of) this more negative bias originates from twilight/straylight contamination of the profiles. A larger dataset is required to be conclusive.
In the tropical region (right panel), the fewest collocations are available. The effect of decreasing signal after having descended below the ozone maximum (which is at a higher altitude in the tropics) is clearly illustrated, as the variation increases with decreasing altitude. Likewise, the median shows an offset from the 0% difference at a higher altitude in the tropics than in the other areas.

Collocation criteria
Figure 9 confirms that the chosen collocation criteria are not introducing any biases. In fact, we could consider increasing the allowed difference in equivalent latitude, as for subsets of the data no clear deterioration was found when changing from 3 to 5 or 10 degrees. A more elaborate study focussing on the polar area is to be carried out in the future.
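The three nested collocation subsets compared in Fig. 9 (800 km and 20 h, 400 km and 10 h, 200 km and 5 h) can be sketched as a simple count over precomputed distances and time differences. The data layout, the function name and the example values are hypothetical and serve only to show how stricter criteria shrink the sample.

```python
# (maximum distance in km, maximum time difference in hours), loosest first.
CRITERIA = [(800.0, 20.0), (400.0, 10.0), (200.0, 5.0)]


def count_per_criterion(pairs, criteria=CRITERIA):
    """Count collocated pairs satisfying each criterion.

    pairs is a list of (distance_km, dt_hours) tuples; the time difference
    may be negative, so its absolute value is tested. The subsets are nested:
    every pair passing a stricter criterion also passes the looser ones.
    """
    return {
        (max_km, max_h): sum(
            1 for dist, dt in pairs if dist <= max_km and abs(dt) <= max_h
        )
        for max_km, max_h in criteria
    }


# Made-up example pairs, not measured data.
pairs = [(150.0, 3.0), (350.0, -8.0), (700.0, 19.0), (900.0, 2.0), (100.0, 12.0)]
counts = count_per_criterion(pairs)
```

Comparing the difference statistics of these nested subsets is one way to check whether the loosest criterion introduces a bias, at the cost of fewer pairs in the stricter subsets.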
Also, no evidence of a trend was observed when grouping the data by year (not shown), but perhaps this is masked by the flagging and/or the chosen error regime.

Conclusions
Ground- and balloon-based instruments can be used to bridge the gap between different satellite instruments, both in terms of technique and time. The ground-based observations often provide a long-term monitoring record with a high vertical resolution at a single location, whereas the satellite measurements are complementary as they can provide a global coverage with a limited life span. The comparison between data from satellite and ground-based instruments is a necessity to validate the retrievals and to monitor the performance of the instruments (Froidevaux et al., 2008; Hocke et al., 2007; Nardi et al., 2008; Jégou et al., 2008). The suite of ground-based and satellite retrievals together with models furthermore provides a unique tool to study atmospheric events and to detect trends (Ladstätter-Weißenmayer et al., 2007; Steinbrecht et al., 2006; Steinbrecht et al., 2009).
In this study we have first compared the ozone profiles from the current operational processor (version 5.00) with the previous version (4.02) by matching the datasets with ground- and balloon-based measurements. The validation results indicate that the two processing algorithms produce very similar results. The bias has improved in some areas, but a few more outliers are encountered. It was shown that some of the outlying data points can be removed by filtering the profiles on negative and exceptionally large values. Improved quality flagging in future processor versions may overcome this problem.
Additionally, we have compared seven years of version 5.00 GOMOS ozone profiles with balloon sonde, lidar and microwave radiometer ozone measurements. Data were collocated using a maximum difference of 800 km, 3 degrees in equivalent latitude and 20 h in time (5 h above an altitude of 50 km). Lidar and microwave radiometer data were restricted to a maximum uncertainty of 30%, while the GOMOS profiles were filtered to exclude measurement points with an error greater than 20% or reporting ozone number densities below 0 or above 10¹³ molecules/cm³. For the dark limb observations, this resulted in 1897 collocated pairs with balloon soundings, 576 collocations with lidar observations and 587 collocations with microwave radiometer data.
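As an illustration of the collocation criteria summarised above, the following sketch tests a single satellite and ground-based measurement pair. It assumes a great-circle (haversine) distance on a spherical Earth and applies the time window per altitude level; the dictionary layout, function names and example coordinates are hypothetical and do not reproduce the software used in this study.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def max_time_difference_h(altitude_km):
    """20 h in general, 5 h above 50 km, as stated in the text."""
    return 5.0 if altitude_km > 50.0 else 20.0


def is_collocated(sat, ground, altitude_km):
    """Apply the distance, equivalent latitude and time criteria to one pair.

    sat and ground are dicts with keys lat, lon, eq_lat (degrees) and
    time (datetime); this layout is an assumption for illustration.
    """
    dist = great_circle_km(sat["lat"], sat["lon"], ground["lat"], ground["lon"])
    dt_h = abs((sat["time"] - ground["time"]).total_seconds()) / 3600.0
    d_eq = abs(sat["eq_lat"] - ground["eq_lat"])
    return dist <= 800.0 and d_eq <= 3.0 and dt_h <= max_time_difference_h(altitude_km)


# Made-up example pair, 8 h apart in time.
sat = {"lat": 48.0, "lon": 10.0, "eq_lat": 47.0, "time": datetime(2005, 3, 1, 22, 0)}
ground = {"lat": 46.8, "lon": 6.9, "eq_lat": 45.5, "time": datetime(2005, 3, 1, 14, 0)}
ok_stratosphere = is_collocated(sat, ground, 30.0)
ok_mesosphere = is_collocated(sat, ground, 60.0)
```

With these example values the pair qualifies at 30 km but not at 60 km, because the 8 h time difference exceeds the stricter 5 h window that applies above 50 km.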
The comparison shows that GOMOS profiles obtained from dark limb measurements are of a high quality when the provided processing quality flag is properly taken into account. Profiles measured under twilight conditions are of similar quality to dark limb measurements, although the occurrence of outliers is higher. Care has to be taken in all cases when dealing with straylight contaminated profiles, which especially affect higher altitudes in the polar region. In the mid-latitudes we can also observe deviations from the validation data in the mesosphere. In the tropics there is a better match in the mesosphere between the validation instruments and the GOMOS measurements, but some large outliers are present. Overall, the ozone profiles are most similar (within a few percent) in the range 20 to 40 km. Going from the poles to the equator, the bias moves towards positive values and the lowest altitude with a good retrieval increases.
Theoretically, observations of strong stars (visual magnitude ranging between −2 and 1) should result in profiles of higher quality (less noise) than observations of weak stars (magnitude between 1 and 4). However, for the GOMOS data within the selected error range (0-20%), we did not see any clear distinction between these two groups, but possibly that is related to the selection criteria applied here. The same is valid for the distinction between hot and cold stars. For instance, when extending the allowed error range to 100%, we see a large increase of the bias for the profiles obtained with cold stars.
Comparing the different azimuth ranges for the line of sight (LOS), we can conclude that the median difference profiles are very similar and that the smallest number of outliers is observed using the back LOS configuration.
No evidence of a temporal trend was seen in the bias or occurrence of outliers, but it is likely that more profiles are rejected as the instrument ages. An analysis using a less strict data selection might be used to test this. The next GOMOS processor version is expected to deal better with the increased dark charge of the detectors, reducing the number of outliers and thus increasing the overall profile quality.

Figure 1: GOMOS 4.02 (top) and 5.00 (bottom) versus validation data. Left panels show the ozone number density as a function of altitude, with the GOMOS profiles in red and the validation data in blue.

Figure 1 shows the comparison between GOMOS versions 4.02 (top) and 5.00 (bottom) with the validation data (VALID). The left panels of both plots show the mean ozone profiles (thick lines) as a function of altitude together with the corresponding standard deviations (thin lines) for GOMOS (in red) and the validation data (in blue).

Figure 3: Global overview of collocated measurements available in this study. GOMOS measurements in black (dark limb observations), dark grey (twilight conditions) and light grey (bright limb observations) circles together with the validation sites plotted as blue asterisks.


Figure 9: As Figure 4, here showing the effect of making the collocation criteria stricter. Left panel: cases with a maximum difference of 800 km and 20 hours; middle panel: 400 km and 10 hours maximum difference; right panel: cases fulfilling a 200 km and 5 hours maximum difference.

Table 1. Overview of analysed data subsets per parameter.