Validation of OMI total ozone retrievals from the SAO ozone profile algorithm and three operational algorithms with Brewer measurements

The accuracy of total ozone computed from the Smithsonian Astrophysical Observatory (SAO) optimal estimation (OE) ozone profile algorithm (SOE) applied to the Ozone Monitoring Instrument (OMI) is assessed through comparisons with ground-based Brewer spectrometer measurements from 2005 to 2008. We also compare the three OMI operational ozone products, derived from the NASA Total Ozone Mapping Spectrometer (TOMS) algorithm, the KNMI (Royal Netherlands Meteorological Institute) differential optical absorption spectroscopy (DOAS) algorithm, and KNMI’s Optimal Estimation (KOE) algorithm. The best agreement is observed between SOE and Brewer, with a mean difference within 1 % at most individual stations. The KNMI OE algorithm systematically overestimates Brewer total ozone by 2 % at low and mid-latitudes and 5 % at high latitudes, while the TOMS and DOAS algorithms underestimate it by ∼ 1.65 % on average. Standard deviations of ∼ 1.8 % are calculated for both SOE and TOMS, but DOAS and KOE have higher values of 2.2 % and 2.6 %, respectively. The SOE algorithm is found to be stable, with insignificant dependence on viewing geometry, cloud parameters, or total ozone column. In comparison, the KOE–Brewer differences are significantly correlated with solar and viewing zenith angles and show significant deviations depending on cloud parameters and total ozone amount. The TOMS algorithm exhibits similar stability to SOE with respect to viewing geometry and total column ozone, but has a stronger cloud parameter dependence. The dependence of DOAS on observational geometry and geophysical conditions is marginal compared to KOE, but is distinct compared to the SOE and TOMS algorithms. Comparisons of all four OMI products with Brewer show no apparent long-term drift, but seasonal features are evident, especially for KOE and TOMS. The substantial differences in the KOE vs. SOE algorithm performance cannot be sufficiently explained by the use of soft calibration (in SOE) and the use of different a priori error covariance matrices; however, other algorithm details cause fitting residuals larger by a factor of 2–3 for KOE.


Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
The Dutch-Finnish Ozone Monitoring Instrument (OMI) (Levelt et al., 2006) aboard the NASA Aura satellite was launched on 15 July 2004 to continue the long-term record of satellite total ozone measurements, initiated in 1970 with the launch of the nadir-sounding Backscatter Ultraviolet instrument (BUV) aboard the Nimbus-4 spacecraft, and followed in 1978 with the launch of the Total Ozone Mapping Spectrometer (TOMS) and Solar Backscatter Ultraviolet (SBUV) instruments aboard Nimbus-7. There are two independent operational total ozone algorithms applied to OMI measurements to produce the standard OMI total column ozone products. The OMTO3 algorithm is based on the well-known TOMS method developed at NASA Goddard Space Flight Center (GSFC) (Bhartia and Wellemeyer, 2002). The algorithms used for OMDOAO3 and OMO3PR take advantage of the spectroscopic capability of the OMI instrument. These were both developed at the Royal Netherlands Meteorological Institute (KNMI). One is based on differential optical absorption spectroscopy (DOAS) (Veefkind et al., 2006) and the other on the optimal estimation (OE) inversion technique (KNMI OE, KOE) (van Oss et al., 2001; Kroon et al., 2011). The variety of OMI operational ozone data products offers a good opportunity to compare the total ozone retrieval performance among the different algorithms and to identify their strengths and shortcomings.
An independent OE-based ozone profile algorithm, referred to as SOE here, was developed at the Smithsonian Astrophysical Observatory (SAO) (Liu et al., 2010a). It was shown to be capable of capturing tropospheric ozone signals in OMI measurements that are perturbed by convection, biomass burning, anthropogenic pollution, and transport of pollution. In subsequent validation studies, good agreement was found between OMI SOE ozone profiles and high-resolution ozone profiles from satellite and ozonesonde measurements (Liu et al., 2010b; Wang et al., 2011). The SOE algorithm was also shown to capture the ozone variability in the extratropical tropopause region very well through comparison with aircraft and ozonesonde measurements (Pittman et al., 2009; Bak et al., 2013).
In Liu et al. (2010a), the profile of partial ozone columns is retrieved at 24 layers, and the total ozone column is simply the sum of the partial ozone columns over all layers. In principle, OE-based profile algorithms should have the potential to provide more accurate total ozone estimates than the two primary total ozone algorithms because they use a wider wavelength range (270-330 nm) than that used for total ozone (Bhartia and Wellemeyer, 2002; Veefkind et al., 2006). Liu et al. (2010a) indicated that the total ozone retrieval errors (root sum square of both random noise and smoothing error) from SOE are typically 1-2.0 DU on average at solar zenith angles < 80°. However, systematic errors due to systematic measurement errors and forward model and model parameter errors were not assessed. In addition, the total ozone retrieval performance has not been evaluated with independent ground-based observations.
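As a minimal sketch of the two quantities mentioned above, the following Python snippet sums layer partial columns into a total column and combines two error components by root sum square. The function names and the numerical inputs are illustrative assumptions, not values or code from the SOE algorithm.

```python
import math


def total_column(partial_columns_du):
    """Total ozone column (DU) as the sum of layer partial columns (DU)."""
    return sum(partial_columns_du)


def root_sum_square(*errors_du):
    """Combine independent error components (DU) by root sum square."""
    return math.sqrt(sum(e * e for e in errors_du))


# Hypothetical 24-layer profile with 12.5 DU in every layer (illustration only).
profile = [12.5] * 24
total = total_column(profile)

# Hypothetical random-noise and smoothing errors in DU (illustration only).
retrieval_error = root_sum_square(1.2, 0.9)
```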
The main objective of this study is to evaluate the retrieval performance in total ozone through comparison with 4 years (2005-2008) of Brewer observations over the Northern Hemisphere, collected from the World Ozone and Ultraviolet Radiation Data Centre (WOUDC) network and the Sodankylä Total Column Ozone Intercomparison (SAUNA) campaign. The dependence of SOE-Brewer differences on various algorithmic variables (solar zenith angle, cross-track position, cloud parameters, total ozone amount) is thoroughly examined to identify possible problems with the SOE algorithm under certain conditions. SOE total ozone columns are further evaluated for long-term stability and seasonal or daily variability. The evaluation of possible dependence on algorithmic variables and time will provide useful insights into the characteristics of this algorithm, which previous studies have not provided.
The same comparisons performed between SOE total ozone and Brewer measurements have been conducted for the three operational total ozone products. Both OMTO3 and OMDOAO3 were validated previously by several groups using various reference data (e.g., Balis et al., 2007; Kroon et al., 2008; McPeters et al., 2008; Antón et al., 2009; Antón and Loyola, 2011). However, total ozone from the OMO3PR product has not yet been thoroughly evaluated against ground-based measurements. This study will therefore contribute to the assessment of that product. Despite the potential of ozone profile algorithms for improving total ozone retrieval, the successful performance of spectroscopic profile retrieval algorithms can be accomplished only when accurate calibration and forward model simulations and good knowledge of measurement errors and a priori covariance matrices are available (Liu et al., 2005; Liu et al., 2010a). In this paper, one of our interests is to see how total ozone retrieval performance differs between SOE and KOE due to the different implementations of OE they employ.
This paper is organized as follows. Section 2 briefly describes the four satellite ozone retrieval algorithms and data sets, the ground-based total ozone data, and the comparison methodology. Section 3 provides the OMI validation results using WOUDC and SAUNA data. We discuss the effect of different implementations between SOE and KOE on total column ozone retrievals in Sect. 4. Section 5 summarizes our validation results.

Ozone Monitoring Instrument (OMI) and OMI ozone algorithms
OMI is a nadir-viewing, ultraviolet-visible (UV-VIS) spectrometer, measuring backscattered solar radiances and irradiances over a wavelength range of 270 nm to 500 nm with two spectral channels: UV 270-370 nm and VIS 350-500 nm (Levelt et al., 2006). The UV channel is further divided into two sub-channels, UV-1 and UV-2, at about 310 nm, to allow for a design that suppresses stray light. OMI provides daily global coverage with an approximately 2600-kilometer-wide ground swath. Each swath consists of 60 and 30 cross-track pixels for UV-2/VIS and UV-1 spectra, respectively. The ground pixel size at nadir is 24 km (UV-2/VIS) and 48 km (UV-1) in the across-track direction and 13 km in the flight direction.
A summary of the main characteristics of the four OMI ozone retrieval algorithms is presented in Table 1. The principle of the SAO and KNMI algorithms, SOE and KOE, is to find an OE-based solution that corresponds to a weighted average between measurement and a priori information, constrained by measurement and a priori error covariance matrices (Rodgers, 2000). Both algorithms derive ozone profile information from the OMI ultraviolet spectrum with a fitting window of ∼ 270-310 nm from the UV-1 channel and ∼ 310-330 nm from the UV-2 channel. Two adjacent spatial pixels across the track in UV-2 are combined to match the UV-1 spatial resolution. The OMI random noise errors reported in the level 1b radiance data are used to construct the measurement error covariance matrix. Ozone cross sections are from Brion-Daumont-Malicet (BDM) (Brion et al., 1993), which were recommended for use in ozone profile retrievals from UV measurements by Liu et al. (2007) and Liu et al. (2013). Despite these similarities, the two algorithms differ in many implementation details, including state and a priori components, radiative transfer model calculations, and radiometric and wavelength calibration treatments. Details about the SOE algorithm can be found in Liu et al. (2010a), with several updates described in Kim et al. (2013) to improve radiative transfer calculations and address the impacts of correcting the OMI L1b random-noise error overestimates (Braak, 2010) on the retrieval. Detailed information about the KOE algorithm can be found in Kroon et al. (2011).
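To illustrate the idea of an OE solution as a weighted average between measurement and a priori information, the sketch below solves the scalar linear case y = k·x + noise in one step. This is a textbook simplification in the spirit of Rodgers (2000), not the SOE or KOE implementation; the function name, the scalar forward model, and all numbers are assumptions made for illustration.

```python
def oe_scalar(y, k, x_a, var_a, var_e):
    """One-step optimal estimation for a scalar linear model y = k * x + noise.

    y      measurement
    k      weighting function (Jacobian) of the assumed linear forward model
    x_a    a priori state
    var_a  a priori error variance
    var_e  measurement error variance
    """
    # Gain: how strongly the measurement pulls the estimate away from x_a.
    gain = var_a * k / (k * k * var_a + var_e)
    x_hat = x_a + gain * (y - k * x_a)
    # Posterior (solution) error variance.
    var_hat = 1.0 / (1.0 / var_a + k * k / var_e)
    # Averaging kernel: 1 means measurement-dominated, 0 means a priori-dominated.
    averaging_kernel = gain * k
    return x_hat, var_hat, averaging_kernel


# Hypothetical numbers: equal a priori and measurement variances give an
# estimate halfway between the a priori (300) and the measurement (320).
x_hat, var_hat, a_kernel = oe_scalar(y=320.0, k=1.0, x_a=300.0,
                                     var_a=100.0, var_e=100.0)
```

With a smaller measurement variance the estimate moves toward the measurement, and with a smaller a priori variance it stays closer to the climatology, which is why the choice of covariance matrices matters for both algorithms.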
Adjustments based on comparisons of measured and simulated earthshine radiances for well-characterized geophysical reference conditions are popularly known as "soft" calibrations, in contrast to "hard" calibrations, in which radiometric adjustments are made solely using information from the instrument's on-board calibration hardware. A calibration adjustment is applied to OMI level 1b radiances in the SOE algorithm independent of space and time to correct possible calibration errors causing cross-track and wavelength-dependent biases and part of the stray light error (Liu et al., 2010a). This first-order correction is derived using the average percent difference between measured and simulated radiance derived from 2 days of Microwave Limb Sounder (MLS) data in the tropics as shown in Sect. 2.3 and Fig. 1 of Liu et al. (2010a). The a priori information (mean and error) for ozone is taken from a monthly and latitude-dependent ozone profile climatology constructed by McPeters et al. (2007), the "McPeters-Logan-Labow (LLM)" climatology. The retrieval variables in the state vector include ozone values at 24 layers from the surface to ∼ 0.087 hPa, surface albedo, cloud fraction, scaling parameters for the Ring effect, radiance/O3 cross-section wavelength shift, radiance/irradiance wavelength shift, and a scaling parameter for mean fitting residual.
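The first-order soft calibration described above can be sketched as follows: average the percent difference between measured and simulated radiances over a set of reference scenes, then divide that bias out of a measured spectrum. This is only a simplified illustration for a single cross-track position; the function names, data layout, and numbers are assumptions and do not reproduce the SOE code.

```python
def soft_calibration_correction(measured, simulated):
    """Mean percent difference per wavelength over reference scenes.

    measured, simulated: lists of scenes, each a list of radiances
    on the same wavelength grid (hypothetical layout).
    """
    n_scenes = len(measured)
    n_wavelengths = len(measured[0])
    correction = []
    for j in range(n_wavelengths):
        diffs = [
            (measured[i][j] - simulated[i][j]) / simulated[i][j] * 100.0
            for i in range(n_scenes)
        ]
        correction.append(sum(diffs) / n_scenes)
    return correction


def apply_correction(radiance, correction_percent):
    """Remove the mean percent bias from one measured spectrum."""
    return [r / (1.0 + c / 100.0) for r, c in zip(radiance, correction_percent)]


# Hypothetical reference scenes: +2 % bias at the first wavelength,
# -1 % bias at the second.
measured = [[1.02, 0.99], [1.02, 0.99]]
simulated = [[1.00, 1.00], [1.00, 1.00]]
corr = soft_calibration_correction(measured, simulated)
calibrated = apply_correction([1.02, 0.99], corr)
```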
The KOE algorithm does not perform radiometric calibration as done in the SOE algorithm, but does use a stray light correction estimated by minimizing the signatures of Fraunhofer features in the fitted residuals separately in the UV-1 and UV-2 channels. The a priori ozone mean state is defined from the LLM climatology, but a constant a priori ozone error of 20 % is assumed for all latitudes and altitudes except for ozone hole conditions. The retrieval variables include ozone profiles at 18 layers from the surface to 0.3 hPa, surface albedo, cloud albedo, and stray light correction parameters. The surface albedo and cloud albedo are switched on or off as state vector elements depending on the cloud fraction: for cloud fraction < 0.2, the surface albedo is fitted with a fixed cloud albedo of 0.8, whereas for cloud fraction > 0.2 the cloud albedo is fitted with the surface albedo fixed at its a priori value (Kroon et al., 2011).
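The cloud-fraction switch between fitting surface albedo and fitting cloud albedo can be written as a small selection rule. The sketch below is a hypothetical illustration, not KOE code; in particular, the text does not say what happens at a cloud fraction of exactly 0.2, so assigning that case to the cloudy branch is an assumption.

```python
def koe_albedo_state(cloud_fraction, threshold=0.2, fixed_cloud_albedo=0.8):
    """Return which albedo is fitted and which is held fixed.

    Assumption: cloud_fraction == threshold is treated as the cloudy case.
    """
    if cloud_fraction < threshold:
        # Mostly clear scene: fit the surface albedo, hold cloud albedo at 0.8.
        return {"fit": "surface_albedo",
                "fixed": {"cloud_albedo": fixed_cloud_albedo}}
    # Cloudier scene: fit the cloud albedo, hold surface albedo at its a priori.
    return {"fit": "cloud_albedo",
            "fixed": {"surface_albedo": "a priori"}}


clear_case = koe_albedo_state(0.05)
cloudy_case = koe_albedo_state(0.60)
```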
The OMI TOMS and OMI DOAS total ozone algorithms use UV-2 measurements, and thus retrievals are done at the higher UV-2 spatial resolution. The TOMS algorithm uses sun-normalized radiances at two wavelengths, 317.6 and 331.3 nm, to measure total ozone under most retrieval conditions. One wavelength is significantly absorbed by ozone and sensitive to the total column amount, and the other is insensitive to ozone. At large slant column densities, the retrieved total ozone is sensitive to the assumed a priori profile shape. Information from the 312.6 nm wavelength, which is sensitive to the ozone profile, is used to reduce this profile shape error (Wellemeyer et al., 1997). The algorithm is rather insensitive to calibration error that does not vary with wavelength, but it is more sensitive to wavelength-relative error (Bhartia and Wellemeyer, 2002). The TOMS algorithm uses ozone cross-section data based on Bass and Paur (1985). OMTO3 total ozone measurements largely rely on OMI's pre-launch radiometric calibration at nadir described by Dobber et al. (2006) and validated by Jaross and Warner (2008). Small residual errors in the collection 3 radiances (Dobber et al., 2008) are further reduced using soft-calibration techniques where biases and irregularities that vary with viewing angle and wavelength are estimated and reduced by comparing the measured radiances with forward model calculations. This approach is applied only to select data where the variability in ozone is low and therefore the radiances can be simulated reliably. The DOAS algorithm calculates the slant column density with a DOAS-based fitting of the measured spectrum in the spectral region 331.1-336 nm to the differential absorption cross sections of ozone using BDM cross sections, and then it estimates the vertical column density by dividing the slant column density by the air mass factor (AMF) (Veefkind et al., 2006).
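The final DOAS step, dividing the slant column by the air mass factor, is shown below as a short sketch. The geometric AMF used here (1/cos SZA + 1/cos VZA) is a common first-order approximation introduced only for illustration; it is an assumption and not the AMF computed by the operational OMDOAO3 algorithm. Function names and numbers are hypothetical.

```python
import math


def geometric_amf(sza_deg, vza_deg):
    """First-order geometric air mass factor (illustrative assumption)."""
    return (1.0 / math.cos(math.radians(sza_deg))
            + 1.0 / math.cos(math.radians(vza_deg)))


def vertical_column(slant_column_du, amf):
    """Vertical column density = slant column density / air mass factor."""
    return slant_column_du / amf


# Hypothetical overhead sun and nadir view: the light path crosses the
# atmosphere twice, so the AMF is 2.
amf = geometric_amf(0.0, 0.0)
vcd = vertical_column(600.0, amf)
```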
In all four OMI ozone algorithms, clouds are treated as Lambertian reflectors and partially cloudy scenes are treated using the independent pixel approximation or mixed Lambertian surfaces. SOE uses cloud pressures from the OMI O2-O2 algorithm (Acarreta et al., 2004) but derives the initial effective cloud fraction from 347 nm and further fits it in the retrieval. TOMS uses cloud pressures from the OMI rotational Raman cloud pressure algorithm, OMCLDRR (Joiner and Vasilkov, 2006), and derives the effective cloud fraction at 331.3 nm in most cases. Both KOE and OMDOAO3 use cloud information (effective cloud fraction and cloud pressure) from the OMI O2-O2 absorption cloud pressure algorithm, OMCLDO2 (Acarreta et al., 2004).
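The independent pixel approximation mentioned above mixes a clear-sky and a fully cloudy radiance weighted by the effective cloud fraction, and the same relation can be inverted to estimate that fraction from a measured radiance. The sketch below shows both directions; the function names, the clamping of the result to [0, 1], and the numbers are assumptions for illustration and are not taken from any of the four algorithms.

```python
def ipa_radiance(i_clear, i_cloud, cloud_fraction):
    """Scene radiance as a mix of clear and cloudy sub-pixel radiances."""
    return (1.0 - cloud_fraction) * i_clear + cloud_fraction * i_cloud


def effective_cloud_fraction(i_measured, i_clear, i_cloud):
    """Invert the mixing relation for the effective cloud fraction.

    Assumption: the result is clamped to the physical range [0, 1].
    """
    fraction = (i_measured - i_clear) / (i_cloud - i_clear)
    return min(1.0, max(0.0, fraction))


# Hypothetical normalized radiances at a weakly absorbing wavelength.
scene = ipa_radiance(i_clear=0.1, i_cloud=0.5, cloud_fraction=0.25)
recovered = effective_cloud_fraction(scene, i_clear=0.1, i_cloud=0.5)
```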
For SOE, we conduct retrievals only at the locations of KOE products that are collocated with Brewer measurements. It is a known issue that the effective cloud fraction is not written correctly to the output for values larger than 0.2 in the KOE v1.1.0 algorithm. Therefore, we replace cloud fraction values larger than 0.2 for KOE data before 2 January 2006 with the output of the SOE algorithm. Because the OE retrievals have coarser resolution (UV-1 vs. UV-2) and skip pixels along the track, they are on average less well collocated with (more distant from) ground measurements.

WOUDC Brewer total ozone data
The Brewer grating spectrometer has an improved optical design over the Dobson spectrometer and is fully automated. The Brewer can be operated in single or double monochromator configuration. The double monochromator (MK-III model) is known to better reduce the impact of stray light on the measurement than the single monochromator (MK-II or MK-IV) does (Kerr, 2002; Petropavlovskikh et al., 2011). Spectral irradiance measurements can be made by a well-maintained Brewer instrument with a precision of ∼ ± 0.1 % (Kerr, 2002). The Brewer instrument measures spectral irradiance at six wavelengths ranging from 303.2 to 320.1 nm. The measurement at 303.2 nm is only used to check the spectral wavelengths by means of internal Hg lamps. The channel at 305.3 nm is used to retrieve the sulfur dioxide (SO2) column, and the ozone column is retrieved from a combination of five longer wavelengths (306.3, 310.1, 313.5, 315.8, and 320.1 nm) (Schneider et al., 2008).
Absorption coefficients based on Bass and Paur (1985) data are used in the standard Brewer algorithm. In addition, the standard Brewer algorithm does not consider the temperature dependence of ozone cross sections and instead uses a fixed temperature of −45 °C. Several studies have evaluated the effects of using newer high-resolution ozone cross-section data sets and accounting for temperature dependence on Brewer total ozone retrievals and their consistency with retrievals from Dobson spectrometers (Fragkos et al., 2013; Redonas et al., 2014). The two newer cross-section data sets are the BDM data set (used in the SOE, KOE, and DOAS algorithms) and the data set by the Institute of Environmental Physics, Bremen University (IUP data set, Gorshelev et al., 2014; Serdyuchenko et al., 2014). Using either the BDM or the IUP data set removes the seasonality of the Dobson/Brewer differences after accounting for the temperature dependence. However, using the BDM data set produces Dobson/Brewer biases of ∼ 2-3 % as the Brewer total ozone is reduced by ∼ 3.2 % (Redonas et al., 2014), while using the IUP data set reduces the Dobson/Brewer differences to within 1 %. Therefore, the IUP data set has been recommended for ground-based Brewer and Dobson measurements. According to Fragkos et al. (2013), using the recommended IUP data set and accounting for its temperature dependence reduces the Brewer total ozone at a mid-latitude station (Thessaloniki, Greece) by ∼ 0.7 % on average, with a seasonal dependence of ∼ 0.2 % and a trend change on the order of 0.05 % per decade compared to the operational Brewer total ozone. These studies imply that the operational total ozone, despite the deficiencies in the standard Brewer algorithm, is close to that from the improved algorithm, with a positive bias of ∼ 0.7 % and a very small seasonal dependence of ∼ 0.2 %.
We use daily mean values derived from Brewer spectrometers that are publicly available from the World Ozone and Ultraviolet Radiation Data Centre (WOUDC) archive (http://woudc.org) because hourly data are available for every year from 2005 to 2008 for only 10 stations. Daily mean values are reported as the average of all direct sun (DS) measurements during the course of the day if one or more DS observations are available. Otherwise, the daily mean values are derived from other types of measurements, mostly from zenith sky (ZS) observations. This study only considers the DS measurements to ensure the most reliable accuracy. Thirty-five stations, listed in Table 2, have been initially selected from the WOUDC archive to be used for OMI validation. These stations have at least 100 days with DS measurements every year. Five stations are equipped with double Brewer instruments and the rest with single Brewer instruments; Uccle (50.8° N, 4.35° E) provides both single and double Brewer measurements.

SAUNA Campaign total ozone data
The main objective of the Sodankylä Total Column Ozone Intercomparison (SAUNA) campaign was to assess the performance of the ground-based instruments and algorithms used to measure total column ozone at large solar zenith angles and high total column ozone amounts (http://fmiarc.fmi.fi/SAUNA/). The SAUNA campaign was held in Sodankylä, Finland, located 120 km north of the Arctic Circle, in March/April of 2006. The early springtime at this high latitude provides the ideal large solar zenith angles for the mission, and total ozone is consistently higher than 400 DU over Sodankylä at this time of year. The ground-based total ozone data were collected in near real time, within 24 h, from single/double Brewer and Dobson instruments, including several regional- and world-standard instruments. The total ozone reference for the SAUNA campaign used in this validation work is derived from Brewer measurements, combining direct sun data from five instruments: double Brewers #185, #171, and #085, and single Brewers #037 and #039. The SAUNA data were not averaged daily for comparison; we use the individual observations closest to the OMI overpass time.

Comparison methodology
A portion of the OMI radiance measurements are affected by an instrument error termed the "row anomaly", which began in June of 2007. Loose thermal insulating material in front of the instrument's entrance slit is believed to both block and scatter light, causing measurement error. The anomaly affects radiance measurements at all wavelengths for specific cross-track viewing directions, which are imaged to the charge-coupled device (CCD) rows. Initially, the anomaly only affected a few rows (two positions in 2007, eight positions starting on 11 May 2008). Since January 2009, however, the anomaly has spread to other rows and begun to shift with time. While a large fraction of good measurements remain in the UV-2 and VIS channels used by OMTO3 and OMDOAO3, the effect of the anomaly on UV-1 measurements used by the SOE and KOE algorithms is more widespread and severe. Therefore, in this study, OMI data are only used from the period of 2005-2008, when the row anomaly did not substantially affect radiance data used by any of the four algorithms.
The criteria for collocating OMI with Brewer data are that the OMI pixel center must be within 150 km of the ground-based station location and that both measurements must be on the same day. We take only the closest match on a given day, not the average of the OMI pixels found. The location and overpass time of KOE and SOE (and, separately, of TOMS and DOAS) collocated at one ground point are exactly the same, whereas the locations differ slightly between SOE/KOE and TOMS/DOAS. The average distance between OMI and the ground stations is 10 ± 6 km for OMTO3 and OMDOAO3 products and 30 ± 14 km for KOE and SOE products. For simultaneous evaluation of four total ozone columns as a function of cross-track position, the cross-track position of UV-2 is re-mapped into positions across the track for UV-1 (e.g., 1-2 of UV-2 corresponds to 1 of UV-1; 3-4 of UV-2 corresponds to 2 of UV-1).
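The collocation rule and the UV-2 to UV-1 cross-track re-mapping can be sketched in a few lines. The great-circle (haversine) distance on a spherical Earth, the dictionary layout of the pixel records, and all names are assumptions for illustration; the paper does not state how distances were computed.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical Earth is an assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_pixel(station, pixels, max_km=150.0):
    """Return (distance_km, pixel) for the closest same-day pixel, or None."""
    best = None
    for px in pixels:
        if px["date"] != station["date"]:
            continue
        d = haversine_km(station["lat"], station["lon"], px["lat"], px["lon"])
        if d <= max_km and (best is None or d < best[0]):
            best = (d, px)
    return best


def uv2_to_uv1_position(uv2_position):
    """Map 1-based UV-2 positions (1..60) to UV-1 positions (1..30)."""
    return (uv2_position + 1) // 2


# Hypothetical station and pixel records.
station = {"lat": 50.8, "lon": 4.35, "date": "2006-04-01"}
pixels = [
    {"lat": 50.9, "lon": 4.35, "date": "2006-04-01"},   # about 11 km away
    {"lat": 51.5, "lon": 4.35, "date": "2006-04-01"},   # about 78 km away
    {"lat": 50.8, "lon": 4.35, "date": "2006-04-02"},   # wrong day
    {"lat": 60.0, "lon": 4.35, "date": "2006-04-01"},   # beyond 150 km
]
match = closest_pixel(station, pixels)
```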
Two statistical quantities, mean bias and 1σ standard deviation, are calculated from relative differences between OMI and Brewer total ozone columns, defined as $(\mathrm{OMI}_i - \mathrm{Brewer}_i)/\mathrm{Brewer}_i \times 100$. Note that relative differences derived under extreme conditions, such as solar zenith angles > 80°, cloud fractions > 0.8, and aerosol index values > 2, and the outliers (outside 3σ of the mean value) are excluded. The mean bias and 1σ standard deviation are presented for individual stations in Sect. 3.1. In Sect. 3.2 to 3.6 we have merged all collocated OMI and WOUDC data sets to examine the possible dependence of OMI/Brewer differences on OMI viewing geometries, cloud parameters, total ozone amount, and time.
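A minimal sketch of these statistics is given below: the relative difference in percent, the screening of extreme conditions, and the mean bias and standard deviation after removing 3σ outliers. Two details are assumptions because the text does not specify them: the outlier removal is done in a single pass, and the sample standard deviation (n − 1 denominator) is used. All names and numbers are hypothetical.

```python
import statistics


def relative_difference(omi_du, brewer_du):
    """Relative OMI - Brewer difference in percent."""
    return (omi_du - brewer_du) / brewer_du * 100.0


def passes_screening(sza_deg, cloud_fraction, aerosol_index):
    """Reject extreme conditions: SZA > 80 deg, cloud fraction > 0.8, AI > 2."""
    return sza_deg <= 80.0 and cloud_fraction <= 0.8 and aerosol_index <= 2.0


def bias_and_sigma(rel_diffs, n_sigma=3.0):
    """Mean bias and 1-sigma scatter after one pass of n-sigma outlier removal."""
    mean = statistics.mean(rel_diffs)
    sigma = statistics.stdev(rel_diffs)
    kept = [d for d in rel_diffs if abs(d - mean) <= n_sigma * sigma]
    return statistics.mean(kept), statistics.stdev(kept), len(kept)


# Hypothetical differences (%): twenty ordinary values and one gross outlier.
diffs = [0.0] * 10 + [1.0] * 10 + [50.0]
bias, sigma, n_kept = bias_and_sigma(diffs)
```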

Comparison at individual stations
There are 35 stations available from the WOUDC archive for this validation study, as mentioned in Sect. 2.2. Twenty-seven Brewer stations among them were identified as references using a selection procedure similar to that used by Balis et al. (2007). This selection procedure is described in the rest of this section.
Figure 1 shows the relative differences between OMI and Brewer total ozone at all 35 stations listed in Table 2. On average, both mean biases and 1σ standard deviations show smooth variations from station to station, with exceptions at Pohang (36.03° N, 129.38° E), Mt. Waliguan (36.29° N, 100.9° E), and Alert (82.45° N, 62.51° W). These three stations are excluded as references. A larger positive bias detected at Mt. Waliguan (elevation: 3820 m) might arise from the discrepancy between the actual station elevation and the average altitude of OMI ground pixels. The overall standard deviation values range from 1.5 % to 2.5 %, except for Pohang and Alert, where they exceed 3 %. This deviation could be caused by problems with ground-based data rather than with satellite data because satellite measurement characteristics change slowly (Fioletov et al., 2008). In addition, a large standard deviation at Alert could be attributed to uncertainties in the retrieval of ozone columns from satellite UV/VIS measurements at high solar zenith angles.
Among the four algorithms, the SOE data present the best agreement with Brewer data at most stations; the mean difference is typically within ± 1 %. TOMS and DOAS results present similar negative biases at tropical and mid-latitude stations, but DOAS biases are slightly smaller than TOMS biases at high-latitude stations. The worst agreement is found for KOE total ozone retrievals at all stations. The KOE data persistently overestimate Brewer total ozone measurements, with average biases of ∼ 2 % at latitudes below 43° and up to ∼ 5 % at high latitudes. The other OMI data, when they deviate, generally underestimate Brewer total ozone. The SOE and TOMS comparisons show similar standard deviations of 1.8 % on average. The DOAS comparison shows larger values, between 2 % and 2.5 %. The KOE-Brewer differences have the largest scatter at most stations, with standard deviations up to 3 %.
The correlations between OMI and Brewer data are shown in the left panel of Fig. 2. Two tropical stations (Paramaribo and Petaling Jaya) are excluded from comparisons because of their small correlation coefficients compared to the overall values of other stations. In addition, the Pohang, Mt. Waliguan, and Alert stations, where the mean differences deviate strongly, show inconsistencies from neighboring stations. Apart from these stations, the comparisons present high correlation coefficient values, between 0.95 and 1, depending on OMI algorithms and stations. The SOE and TOMS total ozone columns show the best correlations with Brewer data (R ∼ 0.99). The KOE data show the smallest correlations at most stations.
www.atmos-chem-phys.net/15/667/2015/ Atmos. Chem. Phys., 15, 667-683, 2015

Figure 2. Same as Fig. 1, but for correlation coefficient (R) and trends (% yr^-1). The correlation coefficient is calculated between OMI and Brewer total ozone columns. The trend is derived from the linear regression of the monthly differences between OMI and Brewer total ozone columns.

We derive the trend of the differences (% yr^-1) using the linear regression slope of 4 years of the monthly averaged relative differences, shown as a function of station in the right panel of Fig. 2. As a result of this trend analysis, we exclude three stations from comparisons, Marcus Island, Rome, and Edmonton, where all OMI retrievals show absolute trends of more than 0.4 % yr^-1. This leaves 27 stations selected as good references to be used for the validation of OMI total column ozone data sets. Comparison statistics are in Table 3. For all stations in the Northern Hemisphere (NH), the average difference between SOE and Brewer is 0.02 % (0.04 DU) with a standard deviation of 1.81 % (5.98 DU), which generally represents an improvement over other comparisons presented in this study as well as in previous validation studies for other spaceborne instruments (e.g., Antón and Loyola, 2011; Koukouli et al., 2012). Overall, the SOE algorithm also demonstrates the best agreement with Brewer among all four algorithms with respect to correlation coefficients and linear regression results for the NH, middle-latitude, and high-latitude regions. Despite the use of only two or three wavelengths, the TOMS algorithm shows similar standard deviations to the SOE algorithm (slightly smaller at mid-latitude stations, but slightly larger at high-latitude stations) except for some larger biases of up to −1.70 %. The slightly larger scatter of SOE (1.79 %) compared to that of TOMS (1.76 %) observed at mid-latitudes could be attributed to SOE's greater distance from ground stations rather than to the algorithm performance. We have examined how the SOE-Brewer standard deviations change when SOE total ozone is retrieved at the locations of TOMS measurements: they are reduced to 1.71 % in the mid-latitudes and 1.78 % in the high latitudes, which is less scatter than for TOMS measurements. The NH mean difference between DOAS and Brewer is −1.59 ± 2.18 % and between KOE and Brewer 2.76 ± 2.60 %. Compared to SOE and TOMS, both DOAS and KOE show larger differences in mean biases between middle and high latitudes. These are related to the solar zenith angle dependence, as discussed in the following section.
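The trend screening used to reject stations can be sketched as an ordinary least-squares slope of monthly mean differences against time in years, followed by the 0.4 % per year test applied to all algorithms at a station. The evenly spaced monthly time axis, the function names, and the numbers are assumptions for illustration.

```python
def linear_trend_per_year(monthly_means):
    """Least-squares slope (% per year) of evenly spaced monthly means."""
    n = len(monthly_means)
    t = [i / 12.0 for i in range(n)]  # time in years since the first month
    t_mean = sum(t) / n
    y_mean = sum(monthly_means) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, monthly_means))
    return sxy / sxx


def exclude_station(series_by_algorithm, max_abs_trend=0.4):
    """Exclude a station only if every algorithm exceeds the trend limit."""
    return all(abs(linear_trend_per_year(series)) > max_abs_trend
               for series in series_by_algorithm.values())


# Hypothetical 48-month series: a drifting station and a stable one.
drifting = [0.5 * i / 12.0 for i in range(48)]   # +0.5 % per year
stable = [1.0] * 48                              # no trend
drift_trend = linear_trend_per_year(drifting)
```

A station where only some products drift would be kept under this rule, which matches the statement that the three excluded stations show large trends in all OMI retrievals.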
In Fig. 3, both single and double Brewer measurements at Uccle station are compared with the four OMI data sets. The comparison with double Brewer measurements shows less scatter but only an insignificant solar zenith angle (SZA)-dependent reduction of OMI/Brewer differences, although it is known that the performance of single Brewer instruments has a distinct dependence on SZA, especially at large SZAs, due to the influence of stray light (Bais and Zerefos, 1996). In addition, comparisons at other double Brewer stations also show less scatter and an even smaller trend in the OMI/Brewer differences compared to latitudinally adjacent stations with single Brewer instruments (Figs. 1 and 2).
Figure 4 compares the daily time series of total ozone columns from OMI and SAUNA Brewer measurements at Sodankylä for April 2006, when solar zenith angles are above 50°. The Brewer measurements show large daily variability, which is in good agreement with OMI total ozone variations. The KOE total ozone is positively biased relative to SAUNA data, with the largest standard deviation. Both TOMS and DOAS are negatively biased by more than 2 %, with TOMS-SAUNA having the largest mean bias and the smallest standard deviation. The SOE-SAUNA differences are negatively biased, with the smallest mean bias among the comparisons and a slightly larger standard deviation than the TOMS-SAUNA differences. This standard deviation of SOE differences is reduced to 3.6 DU when SOE retrievals are done at the locations of TOMS products. The comparison with SAUNA data is generally consistent with results found in the comparison between OMI and WOUDC at high latitudes.

Solar zenith angle dependence
The solar zenith angle (SZA) of the polar-orbiting satellite changes dramatically from the tropics to the poles as well as seasonally from summer to winter. Tropospheric ozone information available from satellite UV measurements decreases at larger SZAs (Liu et al., 2005), and radiative transfer simulations lose accuracy for very high SZAs (Caudill et al., 1997). The possible dependence of retrieval algorithms on SZA can cause seasonally and latitudinally dependent retrieval biases. In Fig. 5a, the stability of each algorithm is assessed for SZA dependence between 20° and 80° (5° bins). The SOE and TOMS algorithms have a slight dependence on SZA; mean relative differences increase (or decrease) within 1 % over all bins. The DOAS differences show an obvious dependence, ranging from −2.2 % at an SZA of 22.5° to −0.6 % at an SZA of 77.5° (i.e., a bias change of 1.6 % or 5.3 DU), although the SZA dependence of this product, processed with v1.2.3.1 of the DOAS algorithm from collection 3 OMI level-1b data, has been significantly improved over the previous version of the data. For example, an increase in the mean bias of more than 2 % due to SZA was found in OMDOAO3 (v1.0.5, collection 3)-Brewer data (Koukouli et al., 2012), and the OMDOAO3 collection 2 product showed a much stronger SZA dependence of ∼ 4 % (Balis et al., 2007; McPeters et al., 2008). The overestimation of the KOE algorithm is negatively correlated with SZA for bins below 60°, but positively correlated for larger SZA bins.
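The binning behind this kind of dependence analysis can be sketched as follows: relative differences are grouped into 5° SZA bins between 20° and 80° and averaged per bin, with each bin labelled by its center (22.5°, 27.5°, ..., 77.5°). The half-open bin edges, the function name, and the sample values are assumptions for illustration.

```python
def bin_by_sza(samples, lo=20.0, hi=80.0, width=5.0):
    """Mean relative difference per SZA bin.

    samples: iterable of (sza_deg, rel_diff_percent) pairs.
    Returns {bin_center_deg: mean_rel_diff}; empty bins are omitted.
    Assumption: bins are half-open, [edge, edge + width).
    """
    n_bins = int((hi - lo) / width)
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for sza, diff in samples:
        if lo <= sza < hi:
            k = int((sza - lo) // width)
            sums[k] += diff
            counts[k] += 1
    return {lo + (k + 0.5) * width: sums[k] / counts[k]
            for k in range(n_bins) if counts[k]}


# Hypothetical samples falling in the first and last bins.
samples = [(21.0, -2.0), (24.0, -2.4), (78.0, -0.6), (85.0, 3.0)]
binned = bin_by_sza(samples)
bias_change = binned[77.5] - binned[22.5]
```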
As indicated in Koelemeijer and Stammes (1999) and Antón and Loyola (2011), it is important to evaluate the joint effects of satellite-viewing geometries and clouds on ozone retrievals. In Fig. 6, the SZA dependence is characterized for sub-groups of cloud fraction and OMI cross-track position, respectively. This outcome again demonstrates the stable performance of the SOE algorithm. On the other hand, the SZA dependence of OMI-Brewer differences derived from the other algorithms varies with cloud fraction, especially at SZAs below 60°. The SZA dependence of the DOAS algorithm becomes more evident with cloudiness, which is a usual characteristic of total column ozone data based on the DOAS technique, as shown in Antón and Loyola (2011). The negative SZA dependence of the TOMS algorithm also becomes apparent for cloudy conditions. In contrast, KOE presents a larger SZA dependence for clear-sky conditions. For high SZAs (> 60°) the dependence is similar between high and low cloud fraction groups, which is a common characteristic of all OMI ozone algorithms. Moreover, the SZA dependence for the DOAS algorithm is larger at nadir positions than at off-nadir positions. A systematic offset of 1 % between nadir and off-nadir positions is present in KOE differences for the whole SZA range, but the SZA dependence itself varies little with cross-track position. The SZA dependence of the TOMS algorithm is not affected by the OMI cross-track position.

Cross-track position dependence
The OMI swath contains 30 and 60 cross-track pixels for the UV-1 and UV-2 channels, respectively. The viewing angles range from near 0° at nadir to almost 70° at the extreme off-nadir position. In addition, OMI uses CCD detectors, and each cross-track position is measured with a different region on the detector. Liu et al. (2010a) found that the structures of the differences between OMI observations and simulations in the spectral range 270-350 nm depend remarkably on the cross-track position, especially at wavelengths shorter than 310 nm. Most of the OMI products are reported to have cross-track-dependent biases or striping. The performance of the OMI level 2 algorithms should therefore be assessed with respect to the cross-track position.
The dependence of OMI/Brewer biases on cross-track position is examined in Fig. 5b. It shows strong cross-track dependence in the KOE data, with maximum biases of ∼ 4 % at nadir and minimum biases of ∼ 1 % at extreme off-nadir positions. The smooth variation with cross-track position may indicate errors in the forward model simulations. The overall relative differences over all cross-track positions are ∼ −2 % in both the DOAS and TOMS comparisons. However, the DOAS relative differences fluctuate considerably with cross-track position, especially at positions 4, 16, 20, and 26, where the mean bias deviates significantly from the average value (−2 %) by up to ∼ ± 1 % or more. Similar results were reported in Antón et al. (2009), who showed no obvious dependence on viewing zenith angle in either the TOMS or DOAS total ozone, but more variability in the DOAS mean biases. To our knowledge, the DOAS and KOE algorithms do not apply any additional correction to OMI level 1b data. On the other hand, both the TOMS and SOE algorithms apply a correction to OMI radiance measurements to remove cross-track variability, which may result in less dependence on the cross-track position in the comparison with Brewer data. In Sect. 4, we will show the effect of soft calibration on SOE-Brewer differences to see whether this calibration can explain the large difference in the dependence on cross-track position between the SOE and KOE algorithms.
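As an illustration of how positions with anomalous biases (such as the DOAS positions noted above) might be flagged, the following sketch computes the mean relative difference per UV-1 cross-track position and lists positions that deviate from the overall mean by at least a chosen threshold. The function name, the 1-based position convention, and the 1 % threshold are assumptions for illustration only.

```python
import numpy as np

def crosstrack_bias(position, rel_diff_pct, n_positions=30, threshold_pct=1.0):
    """Per-position mean bias (%) and positions deviating from the overall mean.

    Hypothetical helper: `position` holds 1-based cross-track indices
    (1..n_positions) and `rel_diff_pct` the OMI-Brewer relative differences.
    """
    pos = np.asarray(position, dtype=int)
    rel = np.asarray(rel_diff_pct, dtype=float)

    # Sum and count per position; index 0 is unused because positions are 1-based.
    counts = np.bincount(pos, minlength=n_positions + 1)[1:n_positions + 1]
    sums = np.bincount(pos, weights=rel,
                       minlength=n_positions + 1)[1:n_positions + 1]

    mean = np.full(n_positions, np.nan)
    filled = counts > 0
    mean[filled] = sums[filled] / counts[filled]

    overall = rel.mean()
    deviation = np.zeros(n_positions)
    deviation[filled] = np.abs(mean[filled] - overall)
    flagged = np.flatnonzero(filled & (deviation >= threshold_pct)) + 1
    return mean, overall, flagged
```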

Cloud parameter dependence
The effect of clouds on trace-gas retrievals from satellite observations is well established in the literature (Antón and Loyola, 2011). OMI ozone algorithms use a Lambertian surface model for a cloud with a fixed albedo of 0.8, requiring the effective cloud-top pressure (or optical centroid pressure) and effective cloud fraction to model the cloud. The accuracy of ozone retrievals is sensitive to the uncertainties of cloud information and cloud treatment, and therefore the validation results should be examined with respect to the cloud parameters used in the retrieval algorithms (Koelemeijer and Stammes, 1999; Antón and Loyola, 2011). It was shown in Sect. 3.2 that the effect of cloudiness on validation results becomes more pronounced at smaller SZAs. Therefore, in order to clearly investigate the effect of clouds on the comparison, we show relative differences for SZAs smaller than 45° as a function of cloud parameters in Fig. 5c and d.

Table 4. Correlations (R) between OMI-Brewer monthly mean total ozone differences of the four products (rows 1-4) and monthly solar zenith angle (row 5).
Figure 5c shows the influence of cloud fraction on the OMI-Brewer comparisons. The DOAS and TOMS results present similar negative and stable biases for cloud fraction bins less than ∼ 0.3, but the difference between DOAS and TOMS biases becomes larger with increasing cloudiness because of their opposite dependencies on cloud fraction. The DOAS biases become more negative, from −1.5 % for low cloud fraction bins to −2.5 % for high cloud fraction bins, while the TOMS biases increase but remain within 1 %. The KOE biases are larger under partly cloudy conditions (0.2 < cloud fraction < 0.8) relative to clear-sky and overcast conditions, which could be related to a switch point in the algorithm between fitting the surface albedo and fitting the cloud albedo (J. P. Veefkind, personal communication, 2013). The SOE algorithm shows remarkable stability for both clear and cloudy conditions, with the mean biases within ± 0.5 % except for the bin of 0.95-1.0, where the mean bias is around −1.5 %. The standard deviations of the relative differences increase steadily with increasing cloudiness for all four OMI algorithms.
Figure 5d shows the influence of the cloud-top pressure on the OMI-Brewer comparisons. None of the four algorithms shows a significant dependence on cloud pressure except for high clouds (cloud-top pressure < ∼ 350 hPa), for which the average OMI-Brewer differences are larger by 1-2 % than those for middle and low clouds. Of the four algorithms, the SOE algorithm shows the least dependence on cloud pressure. The standard deviations increase smoothly from low to high clouds except for TOMS, where the standard deviations increase rapidly from 325 hPa to 275 hPa.

Total ozone column dependence
In Fig. 5e, the differences between OMI and Brewer measurements are plotted as a function of Brewer total ozone column in bins of 25 DU. The dependence on the total column ozone could be attributed to the sensitivity of retrieved total ozone to profile shape at high SZAs, due to the difference between actual and assumed a priori (climatological) ozone profiles, as indicated by Lamsal et al. (2007) and Antón et al. (2009). There is a ∼ 2 % difference in DOAS mean biases between low (< 325 DU) and high (> 425 DU) ozone amounts. This behavior could be explained partially by the positive dependence of the DOAS algorithm results on SZA, because high ozone values usually occur at high latitudes where SZAs are large. The KOE mean biases generally decrease from ∼ 3 % at low values to ∼ 1 % at high values, and its standard deviations range from 2.5 to 3.5 %, whereas the other comparisons have a standard deviation of ∼ 2 % over all the given bins. SOE and TOMS comparisons have a much smoother total ozone dependence. TOMS mean biases range from −2.1 % to −1.3 %, and SOE mean biases are less than ± 0.4 % over all the given bins except at the lowest total ozone value, where the mean bias is ∼ 1 %. Use of the improved tropopause-based ozone profile climatology presented by Bak et al. (2013) in the SOE algorithm further reduces the total ozone dependence slightly, in both mean biases at low ozone amounts and standard deviations at high ozone amounts (see the red dashed line in Fig. 5e).

Seasonal dependence
We examine the long-term stability and seasonal variation of the OMI total column ozone retrievals to evaluate the four OMI algorithms. Figure 7 shows the 4-year time series of the total ozone differences relative to Brewer in four latitude ranges between 30° N and 80° N. The blue line indicates the linear regression of these monthly relative differences. None of the algorithms shows significant long-term drift in OMI-Brewer comparisons except for the KOE algorithm at 50-58° N, where the trend is 0.31 % yr⁻¹. The monthly mean biases of the SOE-Brewer differences vary around the annual means within ± 0.4 %, and their seasonal dependence is quite small for the three latitude bands below 60° N. However, monthly mean biases at the high-latitude band (64-79° N) show a clear seasonally dependent signature with a maximum in winter and a minimum in summer. A similar seasonally dependent pattern is observed in the monthly mean biases of DOAS for all latitude bands, with a quite high correlation between DOAS and SOE temporal variations of the monthly mean biases, ranging from 0.70 to 0.89 (Table 4). For the two low-latitude bands, time series of the monthly mean differences between KOE and Brewer show a distinct annual variation with a winter minimum bias of 0-1 % and a summer maximum bias of ∼ 3.5 %, which is negatively correlated with the seasonal variation of SZA (Table 4; R = −0.66 to −0.81). This behavior could be explained by the negative dependence of KOE biases detected at small SZAs, as shown in Fig. 5a. In contrast, there is negligible correlation between the seasonal variation and SZA for the two high-latitude bands. TOMS monthly mean biases have a seasonally dependent pattern of a winter minimum bias and a summer maximum bias at the two latitude bands between 40° N and 58° N, where the biases and SZA are correlated, with coefficients of −0.54 to −0.65. This seasonally dependent pattern agrees well with the comparison of the Brewer data from Hradec Kralove with EP-TOMS v8 data presented in Vanicek (2006), which showed a −2 % difference during winter and a −1 % difference during summer.
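The two diagnostics used in this section, a linear trend of the monthly mean biases and their correlation with monthly SZA, can be sketched as follows. This is a minimal illustration using an ordinary least-squares fit and the Pearson correlation coefficient; the function name and the input layout (time in decimal years, one value per month) are assumptions, and the exact regression settings used for Fig. 7 and Table 4 are not specified here.

```python
import numpy as np

def trend_and_sza_correlation(t_years, monthly_bias_pct, monthly_sza_deg):
    """Linear trend (% per year) of monthly biases and their correlation with SZA.

    Hypothetical helper: all three inputs are 1-D sequences of equal length,
    one entry per month.
    """
    t = np.asarray(t_years, dtype=float)
    bias = np.asarray(monthly_bias_pct, dtype=float)
    sza = np.asarray(monthly_sza_deg, dtype=float)

    # Ordinary least-squares line: bias = slope * t + intercept.
    slope, intercept = np.polyfit(t, bias, 1)
    # Pearson correlation between monthly bias and monthly SZA.
    r = np.corrcoef(bias, sza)[0, 1]
    return slope, intercept, r
```

A negative `r` would correspond to the behavior described for KOE at the lower latitude bands, where biases peak in summer when SZA is smallest.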

Comparison between SAO and KNMI OE ozone profile algorithms
Although the SOE and KOE algorithms are similar, the SOE algorithm shows significantly better performance in retrieved total ozone. Two of the major algorithmic differences are the use of soft calibration and the use of an a priori error from the LLM climatology in the SOE algorithm vs. 20 % throughout the atmosphere in the KOE algorithm. In order to investigate whether the retrieval performance differences between the two algorithms are caused by these two algorithmic differences, we perform SOE retrieval experiments with modified implementations corresponding to KOE. First, we retrieve total ozone columns using the SAO algorithm with and without soft calibration and then compare both retrievals with Brewer measurements as a function of SZA and cross-track position in Fig. 8. The use of soft calibration slightly reduces the standard deviations, SZA dependence, and cross-track dependence for most positions, except for large reductions in mean biases by up to 2 % for the first two positions (UV-1 positions 2 and 3). Comparing the magnitudes and patterns of the reductions with the KOE/SOE differences in Fig. 5a and b, the KOE cross-track dependence at the left side of the OMI swath could be explained by the soft calibration, but the larger SZA and cross-track dependence (nadir to right off-nadir) cannot be explained.
Next, we examined the effect of using a 20 % a priori error relative to the mean a priori profile in the SAO total column ozone retrievals and found no significant differences from total column ozone retrievals using the natural a priori error in LLM (results not shown here). Therefore, we conclude that the large KOE/SOE differences are mainly caused by other implementation details, such as those in radiative transfer simulations and fitting of variables other than ozone, which will cause differences in fitting residuals.
Figure 9 compares the average fitting residuals in UV-1 and UV-2 channels for one orbit of retrievals on 1 June 2006 using SOE and KOE as a function of SZA. For the SAO fitting results shown in Fig. 9b, we turned off the soft calibration and the use of common mode (average fitting residuals derived from one orbit of retrievals). Both SOE and KOE fitting residuals show a strong SZA dependence, but the SAO residuals are smaller by a factor of 2-3. Moreover, the use of soft calibration in the SAO algorithm leads to much larger differences in fitting results between the two algorithms, especially in UV-2, where total and tropospheric ozone information mostly originates, by a factor of 2 (at larger SZAs) to 5 (at smaller SZAs), as shown in Fig. 9d and e. This implies significant differences in the retrieved total and tropospheric ozone columns between the two algorithms. In addition, the KOE fitting residuals in both UV-1 and UV-2 channels show a peak at SZAs of ∼ 20°, which are contaminated by sun glint (black symbols), whereas the impact of sun glint on the SAO fitting residuals is not apparent even without soft calibration.

Figure 9 (caption, continued): [...] 272.5, 274.7, 280.1, 282.5, 285.1, 287.0, 288.1, 290, 295, 300, 305, 313, 315, 317.5, 320, 322.5, 325, 327.5, 330 nm in UV-2 channel, corresponding to outputs of KOE. The sun-glint-contaminated pixels are indicated by the black symbol. The red line indicates the average in 5° SZA bins.

Conclusions and discussions
The OMI total column ozone data processed with SOE and the three OMI operational algorithms (KOE, TOMS, and DOAS) are evaluated using 4 years (2005-2008) of Brewer measurements at 27 stations identified as good references using a selection procedure similar to that of Balis et al. (2007). The SOE improvements to total ozone retrievals are distinct, with insignificant dependence of the total ozone differences on various algorithmic variables; even the SZA dependence is unaffected by both cloud fraction and cross-track position. However, the SOE biases show significant deviation for high-altitude clouds at ∼ 300 hPa, at high cloud fractions of ∼ 0.9, and at low ozone amounts of ∼ 250 DU. The dependence of the TOMS algorithm on viewing geometry is generally marginal, but the SZA dependence is enhanced under cloudy conditions. The DOAS algorithm has a positive dependence on SZA, which becomes more significant for cloudy conditions and for large cross-track positions. KOE biases increase negatively (positively) at SZAs smaller (larger) than 60° and depend strongly on the cross-track position, with a bias varying between ∼ 1 % and ∼ 4 %. The deviation of mean biases for high clouds compared to low- and mid-altitude clouds is commonly found in all four OMI comparisons, but with the smallest deviations in the comparison of SOE with Brewer. A positive (negative) correlation is found between TOMS (DOAS) mean biases and cloud fraction. KOE biases are larger at cloud fraction values between 0.2 and 0.8 than at other cloud fraction values. The SOE and TOMS algorithms exhibit a similar, weaker dependence on total ozone amount compared to DOAS and KOE.
A high correlation between SOE and DOAS monthly biases is identified. The common features of their seasonally dependent errors are a weak seasonal variation in mid-latitude bands and a distinct seasonal variation in high-latitude bands with winter maximum biases and summer minimum biases. The KOE monthly biases have significant seasonal variability for all latitude bands, and their seasonal dependences are highly correlated with the features of SZA-dependent biases at mid-latitudes. Comparable seasonal variability is found in TOMS differences at mid-latitudes. A comparison with the SAUNA campaign data shows that all four OMI total ozone columns represent the daily total ozone variations well.
Finally, we have demonstrated that the use of SAO soft calibration reduces the SZA and cross-track dependences of OMI-Brewer differences and fitting residuals, especially in UV-1 at smaller SZAs. However, this reduction cannot explain all of the differences in total ozone retrieval performance between the KOE and SOE algorithms. The use of different a priori error covariance matrices is immaterial to the retrieved total ozone. Other differing algorithm details, including radiative transfer simulations and fitting of variables other than ozone, cause significantly larger fitting residuals for KOE by a factor of 2-3.
It is important to discuss the possible impacts of cross sections on the evaluation of algorithm performance, as different cross sections are used in the OMI and Brewer algorithms. In 2009, the WMO/GAW (Global Atmosphere Watch)-IO3C (International Ozone Commission) established the ACSO (Absorption Cross Sections of Ozone, http://igaco-o3.fmi.fi/ACSO/) committee to review the current ozone cross sections and determine the impacts of changing ozone cross sections on retrievals from different satellite and ground-based instruments. According to the activities of ACSO members, switching from Bass and Paur (1985) to the newer BDM and IUP data sets has different impacts on retrievals from different instruments/retrieval algorithms, due to the use of different wavelengths/spectral regions and the quality of ozone cross sections in the wavelengths/spectral regions used. The BDM cross-section data set is recommended for use in our ozone profile retrieval algorithm and the TOMS algorithm (Liu et al., 2013; Bhartia, 2013, http://igaco-o3.fmi.fi/ACSO/presentations_2013/satellite/WS_2013_Bhartia.pdf) and is used in all OMI algorithms except for the TOMS algorithm. If it were used in the TOMS algorithm, the OMTO3 total ozone would increase by ∼ 1.5 %. However, using BDM reduces the Brewer total ozone by ∼ 3.2 % and produces Dobson/Brewer differences of 2-3 % (Fragkos et al., 2013; Redonas et al., 2014). On the other hand, the IUP data set is recommended for ground-based Dobson and Brewer measurements, as it minimizes the Dobson/Brewer differences to within 1 %; using the IUP data set and accounting for its temperature dependence changes the Brewer total ozone by ∼ −0.7 % with a small seasonal dependence (Fragkos et al., 2013). If one uses the recommended cross sections for the different algorithms (i.e., switching to the BDM data set for the TOMS algorithm and to the IUP data set for the Brewer algorithm), the SOE and TOMS total ozone may show positive biases of ∼ 0.5-0.7 %, DOAS total ozone may show negative biases of ∼ 1 %, and KOE total ozone may show positive biases of 3-4 %. Because of the very small changes in the seasonal dependence and trend of Brewer total ozone and in the systematic bias of TOMS total ozone, the evaluation of algorithm performance with respect to different geophysical variables should not change much. Overall, the main conclusions of this study are not affected much except for the mean OMI/Brewer biases.
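The arithmetic behind these adjusted biases can be checked with a short sketch. It assumes that a cross-section change scales the satellite and Brewer columns multiplicatively, and it uses the average TOMS and DOAS underestimation of ∼ 1.65 % quoted in the abstract as the starting bias; the function name and this simple scaling model are illustrative assumptions, not the procedure used to derive the numbers above.

```python
def adjusted_bias_pct(bias_pct, satellite_change_pct=0.0, brewer_change_pct=0.0):
    """Relative satellite-Brewer bias (%) after scaling both columns.

    Hypothetical helper: each change is a percentage change of the respective
    total ozone column caused by switching ozone cross sections.
    """
    ratio = ((1.0 + bias_pct / 100.0)
             * (1.0 + satellite_change_pct / 100.0)
             / (1.0 + brewer_change_pct / 100.0))
    return 100.0 * (ratio - 1.0)

# TOMS: starting bias -1.65 %, OMTO3 +1.5 % with BDM, Brewer -0.7 % with IUP.
toms = adjusted_bias_pct(-1.65, satellite_change_pct=1.5, brewer_change_pct=-0.7)
# DOAS: already uses BDM, so only the Brewer column changes.
doas = adjusted_bias_pct(-1.65, brewer_change_pct=-0.7)

print(f"TOMS adjusted bias: {toms:+.2f} %")
print(f"DOAS adjusted bias: {doas:+.2f} %")
```

Under these assumptions the TOMS bias becomes slightly positive and the DOAS bias stays close to −1 %, in line with the values stated above.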

Figure 1. Mean biases and 1σ standard deviations comparing OMI and Brewer total column ozone at the 35 Brewer stations listed in Table 2. The color coding indicates the comparisons for four total column ozone data sets derived through KOE, SOE, TOMS, and DOAS algorithms. The circle and triangle symbols indicate single and double Brewer stations, respectively. The filled and open symbols represent stations selected and rejected, respectively, through the reference selection procedure done in Sect. 3.1.

Figure 3. Comparison between OMI and Brewer total ozone measurements as a function of solar zenith angle at Uccle station with single (blue) and double (red) Brewer instruments, respectively. The mean relative biases and 1σ standard deviations are shown in the legend.

Figure 4. Time series of SAUNA data (Brewer reference) and OMI total column ozone for April 2006 (upper panel).Time series of the relative differences between OMI and SAUNA total ozone (lower panel).

Figure 5. Dependence of OMI-Brewer relative mean differences and 1σ standard deviations on (a) OMI solar zenith angle, (b) OMI cross-track position (UV-1-based), (c) effective cloud fraction, (d) effective cloud-top pressure, and (e) total ozone column. The calculations for (c) and (d) are done for correlated data sets with OMI solar zenith angle < 45°, in order to enhance the effect of cloud parameters on OMI retrievals. The red dashed line in (e) represents the SOE comparison with the use of the tropopause-dependent climatology presented in Bak et al. (2013).

Figure 6. Dependence of OMI-Brewer relative differences on solar zenith angle (right panels) for two groups of cloud fractions and (left panels) for three groups of OMI cross-track positions in UV-1 (left side of the positions, 1-10; nadir, 11-20; right, 21-30).

Figure 7. Time series (monthly) of relative differences (yellow circles) between OMI and Brewer total ozone columns over four selected latitude bands and the 1σ standard deviations (vertical bars).The blue dashed line indicates the linear regression with the linear trend shown at the bottom of each panel.The title of each panel indicates the overall mean bias and standard deviation.

Figure 8. Comparison between the SOE and Brewer total ozone columns with and without soft calibration as a function of solar zenith angle (left) and cross-track position (right).

Figure 9. Average fitting residuals in UV-1 and UV-2 channels for an orbit of retrievals (orbit 09987) on 1 June 2006 using (a) KOE, (b) SOE without soft calibration, and (d) SOE with soft calibration, as a function of solar zenith angle, with (c, e) the ratios of KOE to SOE fitting results.The average fitting residuals are defined as 1 n
* No official version; the first version is provided in Liu et al. (

Table 3. Comparison statistics* between OMI and Brewer total column ozone data for the Northern Hemisphere (NH), mid-latitudes, and high latitudes.
* Mean biases and 1σ standard deviations are in both DU (Dobson unit) and %. Correlation coefficients (R), slope, and offset are from the linear regression.
The standard deviations of KOE and DOAS biases are larger than 2 %. Those of TOMS and SOE biases are ∼ 1.8 % over the NH, but TOMS differences have slightly less scatter than SOE differences at mid-latitude stations. The standard deviations of SOE biases (SOE total ozone is retrieved at the locations of the KOE product) could be smaller than those of TOMS if SOE total ozone were retrieved at the locations of the TOMS product. The SOE- and TOMS-based total ozone columns show much better agreement with Brewer data than the KOE-based columns do at most stations. The correlation coefficient of DOAS with Brewer is better than that of KOE, but worse than those of SOE and TOMS.
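The statistics reported in Table 3 (mean bias and 1σ standard deviation in DU and %, correlation coefficient, slope, and offset) can be computed from collocated pairs along the following lines. This is a minimal sketch: the function name is hypothetical, the regression of OMI on Brewer and the population standard deviation are assumptions, and the actual collocation and filtering steps are not reproduced.

```python
import numpy as np

def comparison_statistics(omi_du, brewer_du):
    """Summary statistics for collocated OMI and Brewer total ozone (DU).

    Hypothetical helper: regresses OMI on Brewer (OMI = slope * Brewer + offset).
    """
    omi = np.asarray(omi_du, dtype=float)
    brewer = np.asarray(brewer_du, dtype=float)

    diff = omi - brewer                    # absolute difference in DU
    rel = 100.0 * diff / brewer            # relative difference in %
    slope, offset = np.polyfit(brewer, omi, 1)
    r = np.corrcoef(brewer, omi)[0, 1]

    return {
        "mean_bias_du": diff.mean(),
        "std_du": diff.std(),              # population standard deviation (ddof=0)
        "mean_bias_pct": rel.mean(),
        "std_pct": rel.std(),
        "r": r,
        "slope": slope,
        "offset_du": offset,
    }
```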