Copernicus Stratospheric Ozone Service, 2009–2012: Validation, System Intercomparison and Roles of Input Data Sets

Abstract. This paper evaluates and discusses the quality of the stratospheric ozone analyses delivered in near real time by the MACC (Monitoring Atmospheric Composition and Climate) project during the 3-year period between September 2009 and September 2012. Ozone analyses produced by four different chemical data assimilation (CDA) systems are examined and compared: the Integrated Forecast System coupled to the Model for OZone And Related chemical Tracers (IFS-MOZART); the Belgian Assimilation System for Chemical ObsErvations (BASCOE); the Synoptic Analysis of Chemical Constituents by Advanced Data Assimilation (SACADA); and the Data Assimilation Model based on Transport Model version 3 (TM3DAM). The assimilated satellite ozone retrievals differed for each system; SACADA and TM3DAM assimilated only total ozone observations, BASCOE assimilated profiles for ozone and some related species, while IFS-MOZART assimilated both types of ozone observations. All analyses deliver total column values that agree well with ground-based observations (biases < 5 %) and have a realistic seasonal cycle, except for BASCOE analyses, which underestimate total ozone in the tropics all year long by 7 to 10 %, and SACADA analyses, which overestimate total ozone in polar night regions by up to 30 %. The validation of the vertical distribution is based on independent observations from ozonesondes and the ACE-FTS (Atmospheric Chemistry Experiment – Fourier Transform Spectrometer) satellite instrument. It cannot be performed with TM3DAM, which is designed only to deliver analyses of total ozone columns. Vertically alternating positive and negative biases are found in the IFS-MOZART analyses as well as an overestimation of 30 to 60 % in the polar lower stratosphere during polar ozone depletion events. SACADA underestimates lower stratospheric ozone by up to 50 % during these events above the South Pole and overestimates it by approximately the same amount in the tropics. The three-dimensional (3-D) analyses delivered by BASCOE are found to have the best quality among the three systems resolving the vertical dimension, with biases not exceeding 10 % all year long, at all stratospheric levels and in all latitude bands, except in the tropical lowermost stratosphere.
The northern spring 2011 period is studied in more detail to evaluate the ability of the analyses to represent the exceptional ozone depletion event, which happened above the Arctic in March 2011. Offline sensitivity tests are performed during this month and indicate that the differences between the forward models or the assimilation algorithms are much less important than the characteristics of the assimilated data sets. They also show that IFS-MOZART is able to deliver realistic analyses of ozone both in the troposphere and in the strato-

Introduction
The presence of a high-altitude ozone layer in the atmosphere, which protects the Earth system against harmful ultraviolet (UV) light from the Sun, was first determined in the 1920s from observations of the solar UV spectrum. Systematic measurements of stratospheric ozone using ozonesondes started in the late 1950s (Solomon et al., 2005). At that time, the development of satellites had only just begun, with the first one (Sputnik) launched in 1957.
Systematic satellite measurements of ozone started in the late 1970s with the series of Total Ozone Mapping Spectrometer (TOMS) and Solar Backscatter Ultraviolet (SBUV) instruments. The discovery of the Antarctic ozone hole in 1985 (Farman et al., 1985) led to the development of improved satellite instruments to observe the composition and dynamics of the stratosphere. These instruments played a key role in the discovery of the physical processes responsible for the ozone hole (e.g. Solomon, 1999).
Data assimilation determines a best possible state for a system using observations and short-range forecasts. This process was first developed to enable numerical weather prediction (NWP; e.g. Lorenc, 1986). In view of the planned increase in the number and variety of sounders monitoring the ozone layer, the late 1980s saw the appearance of a new application for data assimilation: chemical data assimilation (CDA), or more properly, constituent data assimilation (Rood et al., 1989; Lahoz and Errera, 2010).
Satellite observations of stratospheric composition are retrieved with varying spatial and temporal resolutions, which depend on the instrument design, the retrieval strategy and the circumstances of its operational use. Data assimilation systems can process these data sets (Lahoz and Errera, 2010) to deliver, at regular time intervals, analyses which are meshed on a two-dimensional grid (total column) or on a three-dimensional (3-D) grid (vertically resolved field). The spatial and temporal gradients in these analyses are expected to reflect dynamical and chemical processes rather than the details of the observing system. This feature is exploited in several studies of the photochemistry of the middle atmosphere, especially in the polar regions (see Robichaud et al., 2010; Lahoz et al., 2011; Sagi et al., 2014).
Thanks to their gridded and instantaneous description of the atmospheric composition, chemical analyses enable short-range to middle-range forecasts (Flemming et al., 2011) and are much easier to use and to interpret than satellite observations. The resulting "snapshot" maps show stratospheric composition at a specific time and are routinely used to monitor the evolution of the ozone layer, e.g. above the Antarctic (Antarctic ozone bulletins distributed by WMO/GAW: http://www.wmo.int/pages/prog/arep/gaw/ozone/index.html).
For 10 years, the development of these monitoring and forecasting abilities has been the primary goal of a series of European projects. The European Union project MACC-II (Monitoring Atmospheric Composition and Climate - Interim Implementation) was the third in a series of projects funded since 2005 to build up the atmospheric service component of the Global Monitoring for Environment and Security (GMES)/Copernicus European programme (Peuch et al., 2014). In this paper, the term "MACC" refers to both the MACC and MACC-II projects. The final goal of MACC is to cover all aspects of atmospheric dynamics and chemistry with one global data assimilation system (DAS) based on an operational numerical weather prediction (NWP) system.
Two coupled systems were created in MACC: IFS-TM5 and IFS-MOZART (Flemming et al., 2009; Stein et al., 2013). These coupled dynamics-chemistry DAS are run at the European Centre for Medium-Range Weather Forecasts (ECMWF) in near real time (NRT) to monitor present and near-future atmospheric conditions up to 5 days ahead, through analyses and forecasts of carbon monoxide (CO), formaldehyde (HCHO), nitrogen oxides (NOx, i.e. NO + NO2), sulfur dioxide (SO2) and ozone (O3). Both were designed to deliver in one run a complete and self-consistent picture of atmospheric chemistry and dynamics, and both explicitly solve a complete set of photochemical reactions relevant to tropospheric chemistry. The description of photochemistry in IFS-MOZART also includes the halogen species, the reactions of interest in the stratosphere, and a parameterisation of the heterogeneous reactions responsible for ozone depletion in the polar lower stratosphere.
For European-scale analyses relevant to air quality applications, MACC successfully organised an ensemble of limited-area CDA systems (Gauss et al., 2013). A similar approach was adopted to deliver global analyses of stratospheric and total column ozone through the MACC stratospheric ozone service (http://www.copernicus-stratosphere.eu). Besides IFS-MOZART, this service uses three independent CDA systems in order to identify model weaknesses and aid in the improvement of the main system. These three systems are BASCOE (Errera et al., 2008; Viscardy et al., 2010), SACADA (Elbern et al., 2010) and TM3DAM (Eskes et al., 2003; van der A et al., 2010). They first delivered monitoring services for the programme PROMOTE (PROtocol MOniToring for the GMES Service Element Atmosphere, http://www.gse-promote.org), which was funded by the European Space Agency from 2004 until 2009. They are run at the centres where they were designed, use offline analyses of atmospheric dynamics, and have more relaxed operational constraints than the NRT runs of IFS-MOZART and IFS-TM5 at ECMWF. The TM3DAM system is specifically designed to generate a long-term ozone column data set: the ozone Multi-Sensor Reanalysis (MSR), which documents the day-to-day variability and allows studies of trends in total ozone over more than 30 years. Unlike IFS-MOZART, both BASCOE and SACADA were developed specifically to study and monitor stratospheric chemistry. Their adjoint models include photochemistry, allowing these 4D-Var systems to deliver multivariate analyses that should provide a more self-consistent chemical analysis of the stratosphere than is possible with IFS-MOZART. Until now, BASCOE and SACADA have each assimilated only one instrument at a time, and BASCOE has processed only vertical profiles from limb-scanning instruments. In view of SACADA's advanced modelling of transport and background error covariances, it was decided to assimilate only total ozone columns with this system. This sub-optimal configuration was meant to test the quality of 3-D ozone analyses by an advanced 4D-Var system in the absence of limb profilers.
In this paper we compare the ozone analyses delivered in NRT by these four systems over the 3-year period from September 2009 to September 2012, using as reference several data sets of independent observations: ground-based instruments; balloon soundings; and a solar occultation satellite instrument. We also explore the roles of the input data sets in the outcome of this exhaustive validation. Our study is similar to the intercomparison of ozone analyses realised in the Assimilation of Envisat Data (ASSET) project (Geer et al., 2006; Lahoz et al., 2007), with some major differences: here the DAS were configured primarily to satisfy operational constraints and deliver NRT products (and in the case of IFS-MOZART to deliver several tropospheric products in addition to stratospheric ozone); we assimilated a large variety of data sets while ASSET used only observations from Envisat (Environmental Satellite); and the investigated period is much longer (3 years instead of 5 months).
The next section describes the different analyses in the MACC stratospheric ozone service and the reference observations used for their validation. Section 3 contains the evaluation of the total ozone columns based on Brewer-Dobson observations, while the vertical distribution of ozone is assessed in Sects. 4 and 5, through comparison with ozonesondes and ACE-FTS satellite data, respectively. In Sect. 6, we assess the performance of the MACC analyses during an event of exceptional nature: the 2011 Arctic ozone hole (Manney et al., 2011). We additionally investigate the influence of the assimilated data set on the performance of the analyses for 1 month covered by this event: March 2011. The final section provides a summary and conclusions.

Data
The MACC stratospheric ozone service currently consists of four independent systems, running routinely on a daily basis, with a maximum delay of 4 days between data acquisition and delivery of the analyses: IFS-MOZART (1 day); BASCOE (4 days); SACADA (2 days); and TM3DAM (2 days). This section gives a detailed description of the analyses: the observations that were assimilated; the underlying atmospheric composition models; the applied data assimilation algorithms; and the way the different DAS deal with background error statistics. Table 1 summarises the satellite retrievals of ozone that were actively assimilated by the four DAS of the MACC stratospheric ozone service, while an overview of the system specifications can be found in Table 2. This section additionally includes a description of the data sets used in the validation of the four analyses.

Aura satellite: OMI total columns and MLS profiles
Aura is NASA's (National Aeronautics and Space Administration) third large Earth Observing System (EOS) mission. It has been flying in a sun-synchronous, nearly polar orbit since 9 August 2004 and aims to provide trace gas observations for climate and air pollution studies (Schoeberl et al., 2006). Thanks to its nearly polar orbit, Aura provides nearly global latitude coverage. It carries four instruments, including the Ozone Monitoring Instrument (OMI; Levelt et al., 2006) and the Microwave Limb Sounder (MLS; Waters et al., 2006), which provide complementary information.
The OMI instrument is a nadir-viewing imaging spectrometer, measuring the solar radiation backscattered by the Earth's atmosphere and surface in the ultraviolet to visible (UV-Vis) wavelength range, providing total ozone columns with a horizontal resolution of 13 km × 24 km at nadir. This data set is delivered in near real time and was validated using Brewer and Dobson spectrophotometer ground-based observations (Balis et al., 2007). While OMI also provides nadir ozone profiles, these have not been assimilated.
The MLS instrument is a limb-viewing microwave radiometer, providing some 3500 daily vertical profile measurements of several atmospheric parameters, such as ozone (O3), nitric acid (HNO3), water vapour (H2O), hydrochloric acid (HCl), hypochlorous acid (HOCl), and nitrous oxide (N2O) from about 8 to 80 km (215 hPa to 0.02 hPa) with a vertical resolution of about 3 km in the stratosphere and a horizontal resolution of 200-300 km (Waters et al., 2006). As a microwave remote sensing sounder, MLS also provides observations during the polar night, which has a positive impact on ozone analyses during the onset of the ozone hole.
Ozone data retrieved from MLS are delivered in near real time by NASA/JPL (Jet Propulsion Laboratory), with a latency of only 2 to 4 h, whereas a scientific data set, which additionally contains non-ozone species, is delivered with a delay of 4 days. The former data set is used for the assimilation in IFS-MOZART and the latter in BASCOE (see Table 1).
Table 1. Satellite retrievals of ozone that were actively assimilated by the four models of the MACC stratospheric ozone service. The Aura MLS data used by IFS-MOZART and BASCOE are not the same: IFS-MOZART used the MLS NRT retrievals of ozone only, while BASCOE used the standard scientific, offline retrievals including five other species. PC stands for partial columns, TC for total columns and PROF for profiles. When two references are provided, the first refers to the satellite sensor, the second one to the retrieval algorithm.
Froidevaux et al. (2008) estimated from comparisons with other instruments that the MLS v2 ozone profiles have an uncertainty of the order of 5 % in the stratosphere, with values closer to 10 % at the lowest stratospheric altitudes. These lower stratospheric biases mostly disappear with the improved MLS v3.4 data (Livesey et al., 2013b), which have a useful range of 261 to 0.1 hPa. Sensitivity tests were performed with IFS-MOZART, BASCOE and SACADA using the offline MLS v3 data set (see Sect. 6.2). The accuracy and precision of these retrievals (Livesey et al., 2013b) are very similar to those reported for MLS v2 (Livesey et al., 2013a), so the uncertainties of MLS v3 are expected to be at least as small as those reported for MLS v2.

Envisat satellite: SCIAMACHY total columns
The SCIAMACHY instrument (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) is a UV-Vis-NIR (near-infrared) imaging spectrometer onboard ESA's Environmental Satellite (Envisat), launched on 1 March 2002. SCIAMACHY observed earthshine radiance in limb and nadir viewing geometry, and solar and lunar light transmitted through the atmosphere in occultation viewing geometry. While microwave sounders such as MLS are able to provide ozone profiles over the poles throughout the year, UV-Vis instruments such as SCIAMACHY are limited to periods with sufficient solar radiation. On the other hand, they can attain much higher spatial resolution. SCIAMACHY total columns have a horizontal resolution of typically 32 km × 60 km and were extensively validated against ground-based measurements (Eskes et al., 2005).
After the satellite had operated 5 years beyond its planned mission lifetime of 5 years, all communication with Envisat was lost on 8 April 2012. IFS-MOZART assimilated SCIAMACHY total ozone columns until the last date (7 April 2012). To have a clean monthly mean, it was decided to reprocess TM3DAM for the first days of April using GOME-2 from 1 April 2012 onwards. Because GOME-2 achieves global coverage within 1 day (SCIAMACHY attains global coverage in 6 days), which leads to an improved performance, the official MACC NRT product for SACADA had already switched from SACADA-SCIAMACHY to SACADA-GOME2 on 28 October 2011.

MetOp-A satellite: GOME-2 total columns
The GOME-2 (Global Ozone Monitoring Experiment-2) instrument carried onboard EUMETSAT's (European Organisation for the Exploitation of Meteorological Satellites) Meteorological Operational Satellite MetOp-A (launched in October 2006) continues the long-term monitoring of atmospheric trace gases by ESA's (European Space Agency) ERS-2 (European Remote sensing Satellite-2) GOME. It is a nadir-viewing UV-Vis scanning spectrometer, which is able to achieve global coverage within 1 day (Munro et al., 2006). Total columns are provided with a horizontal resolution of 80 km × 40 km. GOME-2 total ozone columns are available about 2 h after sensing and were validated against ground-based measurements by Loyola et al. (2011). Ozone profiles are also retrieved from this instrument but this study used only the total columns.

NOAA satellite: SBUV-2 partial columns
SBUV/2 is a series of seven remote sensors on NOAA weather satellites (McPeters et al., 2013), of which three were assimilated by IFS-MOZART during the period investigated here (September 2009 to September 2012): NOAA-17 and NOAA-18 during the whole period; NOAA-19 after 22 June 2011. Bhartia et al. (2013) describe the two latest versions of the SBUV/2 retrievals: v8, which was available during the period investigated here, and v8.6, which was released more recently. While SBUV v8.6 includes the averaging kernels (AK) for each retrieved profile, these were not available in the v8 BUFR data used operationally at ECMWF. Hence we used the same procedure as first described for ERA-40 (Dethof and Hólm, 2004): in order to decrease unwanted vertical correlations between errors at different levels, the thirteen layers of the original SBUV v8 retrievals were combined at ECMWF into six thick layers (0.1-1 hPa, 1-1.6 hPa, 1.6-4.1 hPa, 4.1-6.4 hPa, 6.4-16 hPa, 16 hPa-surface). Among the resulting partial ozone columns, the last one contributes most to the total columns.
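Combining thin retrieval layers into thick ones amounts to summing partial columns over groups of adjacent layers. The following Python sketch illustrates this under stated assumptions: the thirteen fine-layer edges and the partial-column values are hypothetical (the actual SBUV v8 layer boundaries are not given here), the surface pressure of 1013 hPa is assumed, and the function name is chosen for this example only.

```python
# Thick-layer edges in hPa, top to bottom, as quoted in the text.
# The surface pressure value (1013.0) is an assumption for this sketch.
THICK_EDGES = [0.1, 1.0, 1.6, 4.1, 6.4, 16.0, 1013.0]


def merge_partial_columns(fine_edges, fine_columns, thick_edges):
    """Sum partial columns (e.g. in Dobson units) of fine layers into thick layers.

    fine_edges: increasing pressures (hPa) bounding the fine layers.
    fine_columns: one partial column per fine layer.
    Every thick edge must coincide with a fine edge, so no layer is split.
    """
    if len(fine_columns) != len(fine_edges) - 1:
        raise ValueError("need exactly one column per fine layer")
    if any(edge not in fine_edges for edge in thick_edges):
        raise ValueError("thick edges must be a subset of the fine edges")
    merged = []
    for top, bottom in zip(thick_edges[:-1], thick_edges[1:]):
        total = sum(
            column
            for upper, lower, column in zip(fine_edges[:-1], fine_edges[1:], fine_columns)
            if upper >= top and lower <= bottom
        )
        merged.append(total)
    return merged


# Hypothetical 13-layer grid and partial columns, for illustration only.
fine_edges = [0.1, 0.4, 1.0, 1.6, 2.5, 4.1, 6.4, 10.0, 16.0, 25.0, 40.0, 63.0, 100.0, 1013.0]
fine_columns = [0.5, 1.0, 1.5, 3.0, 6.0, 10.0, 15.0, 20.0, 30.0, 40.0, 45.0, 50.0, 80.0]
thick_columns = merge_partial_columns(fine_edges, fine_columns, THICK_EDGES)
print(thick_columns)
```

The merged columns add up to the same total as the fine columns, so the operation conserves the total ozone column while reducing the number of vertically correlated pieces of information.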

IFS-MOZART
Within the GEMS project, the Integrated Forecast System (IFS), operated by ECMWF, was extended to be able to simulate and assimilate the abundance of greenhouse gases (Engelen et al., 2009), aerosols (Morcrette et al., 2009; Benedetti et al., 2009), as well as tropospheric and stratospheric reactive gases (Flemming et al., 2009; Inness et al., 2009; Stein et al., 2012) from satellite retrieval products. Satellite observations for the following reactive gases can be assimilated: O3; nitrogen dioxide (NO2); carbon monoxide (CO); formaldehyde (HCHO); and sulfur dioxide (SO2); but only the first three were assimilated in the operational analysis discussed in this paper. The assimilation window of IFS-MOZART is 12 h. The version of IFS-MOZART used here was described in detail by Stein et al. (2013). To provide concentrations and chemical tendencies of the reactive gases, the IFS was coupled to a chemistry transport model (CTM) using the coupling software OASIS4 (Ocean Atmosphere Sea Ice Soil: Redler et al., 2010). The IFS computes only the transport of the aforementioned reactive gases, while the coupled CTM provides the chemical tendencies due to chemical conversion, deposition and emission.
The CTM selected to deliver analyses of stratospheric ozone for the MACC global monitoring and forecast system is MOZART-3 (Kinnison et al., 2007; Stein et al., 2012) because it simulates both tropospheric and stratospheric chemistry, including the catalytic destruction of ozone in the lower polar stratosphere. Inness et al. (2009) give a detailed description of the applied procedure for the assimilation of atmospheric constituents in IFS-MOZART.
During the period studied here, the IFS was run at T159L60, where T159 denotes an expansion to wave number 159 in the spherical-harmonic representation used by the model (corresponding to approximately 125 km horizontal resolution at the equator), and L60 denotes a vertical grid comprising 60 hybrid-pressure levels extending from 0.1 hPa down to the surface. This run uses IFS version ("cycle") 36R1. The CTM component, MOZART-3, used the same 60 vertical levels and a regular longitude-latitude grid with 1.875° × 1.875° horizontal resolution. Its chemical scheme includes 115 species interacting through 325 reactions (Stein et al., 2013).
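The quoted horizontal resolution can be checked with a short calculation. The Python sketch below assumes one common convention, in which the resolution of a triangular truncation T_N is taken as half the shortest resolved wavelength at the equator, i.e. πa/N with a the Earth radius; other conventions give somewhat different numbers. The function name and the radius value are choices made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; assumed value


def spectral_resolution_km(truncation):
    """Half the shortest resolved equatorial wavelength for truncation T<N>."""
    return math.pi * EARTH_RADIUS_KM / truncation


ifs_resolution_km = spectral_resolution_km(159)  # about 126 km
# Equatorial spacing of the 1.875 degree MOZART-3 grid, for comparison.
mozart_spacing_km = 2.0 * math.pi * EARTH_RADIUS_KM * 1.875 / 360.0
print(f"T159: {ifs_resolution_km:.0f} km, 1.875 deg: {mozart_spacing_km:.0f} km")
```

Under this convention the T159 value is consistent with the "approximately 125 km" stated in the text, while the regular CTM grid is coarser at the equator.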
The following satellite O3 data were simultaneously assimilated (see Table 1): partial columns by NOAA SBUV-2; total columns by Aura OMI and Envisat SCIAMACHY; and profiles by Aura MLS down to 68 hPa. Note that all ozone data assimilated in IFS-MOZART are NRT products. Hence the MLS data set used here (v2.2) is the product delivered 2 to 4 h after measurement, in contrast to the data assimilated by BASCOE (see Sect. 2.2.2).
The IFS-MOZART version described here was run daily (experiment f93i) from 1 September 2009 to 30 September 2012, which determined the period considered in this paper.

BASCOE
BASCOE (Errera et al., 2008) is a 4D-Var system developed at the Belgian Institute for Space Aeronomy, BIRA-IASB. Based on a stratospheric CTM, BASCOE assimilates satellite retrievals of O3, H2O, HNO3, HCl, HOCl, and N2O, gathered by MLS. The assimilation window is 24 h, while BASCOE produces output every 3 h. The CTM includes 57 species that interact through 143 gas-phase reactions, 48 photolysis reactions and 9 heterogeneous reactions.
Heterogeneous reactions on the surface of polar stratospheric cloud (PSC) particles are explicitly taken into account. The BASCOE version used here adopts a simple cold-point temperature parameterisation to represent the surface area available for these reactions: type Ia (Nitric Acid Trihydrate) PSCs are set to appear at temperatures between 186 and 194 K with a surface area density of 10⁻⁷ cm² cm⁻³. At grid points colder than 186 K they are replaced by type II PSCs (i.e. water ice particles) with a surface area density of 10⁻⁶ cm² cm⁻³.
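The temperature thresholds above can be written as a simple lookup. The Python sketch below is only an illustration of the rule as stated in the text, not the BASCOE code: the function and constant names are hypothetical, and it assumes that no PSC is present above 194 K and that the boundary temperatures (186 K and 194 K) belong to type Ia.

```python
# Surface area densities in cm2 cm-3, as given in the text.
SAD_TYPE_IA = 1e-7   # nitric acid trihydrate (NAT) particles
SAD_TYPE_II = 1e-6   # water ice particles
T_NAT_K = 194.0      # upper temperature limit for type Ia PSCs
T_ICE_K = 186.0      # below this temperature type II PSCs replace type Ia


def psc_surface_area_density(temperature_k):
    """Return (psc_type, surface_area_density) for one grid point.

    Assumptions of this sketch: no PSC above 194 K, and the two boundary
    temperatures (186 K and 194 K) are assigned to type Ia.
    """
    if temperature_k < T_ICE_K:
        return "II", SAD_TYPE_II
    if temperature_k <= T_NAT_K:
        return "Ia", SAD_TYPE_IA
    return None, 0.0


for temperature in (200.0, 190.0, 180.0):
    print(temperature, psc_surface_area_density(temperature))
```

Because the result depends only on the instantaneous local temperature, a rule of this kind keeps no record of the thermal history of an air mass.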
www.atmos-chem-phys.net/15/2269/2015/
When the BASCOE forward CTM is run with no constraining observations, the stratospheric ozone fields become less realistic after a few weeks or months, depending on the region. These results are similar to those found with IFS-MOZART by Flemming et al. (2011). In the case of the BASCOE CTM, this is due to the absence of tropospheric processes and surface emissions, which prevents proper exchanges with the troposphere; and to the parameterisation of PSC surface area density, which lacks any memory of the coldness experienced by polar air masses. This last issue was discussed by Lindenmaier et al. (2011) using the coupled model GEM-BACH that inherited its photochemistry and PSC parameterisation from BASCOE.
For the MACC stratospheric ozone service, the BASCOE DAS is driven by the ECMWF operational 6-hourly analyses (winds, temperature and surface pressure). BASCOE is run at a horizontal resolution of 3.75° longitude by 2.5° latitude and uses a vertical hybrid-pressure grid comprising 37 levels, most of them lying in the stratosphere. Like the driving meteorological analyses, this vertical grid extends from 0.01 hPa down to the surface. BASCOE does not include any tropospheric processes and is therefore not expected to produce a realistic chemical composition below the tropopause, resulting in larger systematic biases for the total columns and in the lower stratosphere.
Both BASCOE and IFS-MOZART analyses assimilate Aura MLS data, but while IFS-MOZART uses the NRT retrievals v2.2 of ozone only, BASCOE uses the standard scientific, offline retrievals (level-2) v2.2 including five other species, which are available with a delay of typically 4 days.BASCOE was configured to filter out ozone observations below 150 hPa.

SACADA
Within the project SACADA, a 4D-Var scheme was developed by the Rhenish Institute for Environmental Research at the University of Cologne and partners (Elbern et al., 2010), aiming at the assimilation of atmospheric Envisat data using state-of-the-art numerical methods. This system has been implemented for operational use at the Deutsches Zentrum für Luft- und Raumfahrt (DLR), which has routinely delivered daily (12 h UT) trace gas analyses based on Envisat SCIAMACHY ozone columns since March 2010. In parallel, another SACADA service has assimilated MetOp-A GOME-2 total column data since January 2008. In research mode, SACADA has been successfully applied to other satellite- and ground-based observations (Elbern et al., 2010; Schwinger and Elbern, 2010; Baier et al., 2013).
The SACADA system uses an icosahedral grid (i.e. a grid built from the 20 equilateral triangles of an icosahedron) on sigma-pressure levels with an approximate resolution of 250 km. The vertical grid consists of 32 model levels extending from 7 to 66 km altitude (440 to 0.1 hPa). The tropospheric ozone column is prescribed from the TOMS V8 climatology. Like IFS-MOZART and BASCOE, SACADA applies a comprehensive stratospheric chemistry scheme (see Table 2). The NRT service additionally provides information on the following unconstrained species: HNO3, H2O, and HCl. Unlike the other CDA systems used in MACC, SACADA is not driven directly by winds and temperature from the IFS NWP system; it takes these input fields from the meteorological forecast system GME (Majewski et al., 2001), run at DLR. GME is started from ECMWF analysis data daily at 00:00 UTC and provides its own 24 h forecasts. The SACADA 4D-Var assimilation uses an assimilation window of 24 h. Note that SACADA products are delivered on a standard latitude-longitude grid with 3.75° by 2.5° resolution from 147 to 0.3 hPa.
Here we investigate two independent SACADA NRT products for two consecutive time intervals (see Table 1).NRT delivery started on 4 March 2010 with SACADA 2.0 assimilating SCIAMACHY observations of total ozone columns (version 5).After 28 October 2011, SACADA was upgraded to version 2.4 and switched to the GOME-2 instrument (retrieval version GDP 4.1), which has a better daily data coverage than SCIAMACHY.

TM3DAM
The TM3DAM data assimilation system is based on the TM3/TM5 tracer transport model and is driven by operational 6-hourly meteorological fields from ECMWF. The main purpose of TM3DAM is the generation of 30-45 year reanalyses of total ozone based on all available satellite data sets (van der A et al., 2010), but in MACC it has also been operated to provide real-time analyses and forecasts. TM3 contains parameterised schemes for the description of stratospheric gas-phase and heterogeneous ozone chemistry.
The assimilation scheme in TM3DAM is based on a simplified Kalman-filter approach, with a time- and space-dependent error covariance, but with fixed correlations (Eskes et al., 2003), which considerably reduces the computational cost. The TM3DAM assimilation code has been updated as described in van der A et al. (2010). The system is run at a global horizontal resolution of 3° longitude by 2° latitude. It applies a vertical hybrid-pressure grid, consisting of 44 levels extending from 0 hPa to the surface (1013 hPa). From the upper troposphere upwards, the layers coincide with the 60-layer vertical grid used at ECMWF. TM3DAM assimilated near-real-time level-2 total ozone column data from Envisat/SCIAMACHY until the end of March 2012 and switched to MetOp-A/GOME-2 after all communication with the Envisat satellite was lost on 8 April 2012. NRT production of daily analyses (valid at 21 h UT) in the framework of MACC started on 16 March 2010. Only total columns are available. Besides the daily analyses, TM3DAM also generates daily forecasts for up to 9 days ahead. The Observation-minus-Forecast (OmF) statistics show that the bias of the system compared to the individual satellite measurements is typically less than 1 % for a forecast period of 1 day.
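An OmF bias of this kind can be computed from co-located observation and forecast values. The Python sketch below shows one possible definition, assuming that the percentage is taken relative to the mean observed column (the text does not state the normalisation); the function name and the sample values are hypothetical.

```python
def omf_relative_bias_percent(observations, forecasts):
    """Mean observation-minus-forecast difference, in percent of the mean observation."""
    if len(observations) != len(forecasts) or not observations:
        raise ValueError("need two non-empty sequences of equal length")
    mean_omf = sum(o - f for o, f in zip(observations, forecasts)) / len(observations)
    mean_obs = sum(observations) / len(observations)
    return 100.0 * mean_omf / mean_obs


# Hypothetical total ozone columns in Dobson units (not real data).
obs = [300.0, 280.0, 320.0, 260.0]
fcst = [298.0, 281.0, 317.0, 259.0]
bias = omf_relative_bias_percent(obs, fcst)
print(f"OmF bias: {bias:.2f} %")
```

A positive value indicates that the forecast is on average lower than the observations.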

Comparison of ozone background errors
The specification of the background error covariance matrix (e.g. Kalnay, 2002) is one of the most difficult parts of an assimilation system: since assimilation errors are never observed directly, they can only be estimated in a statistical sense. Each of the considered analyses has a different way of dealing with background error statistics. In IFS-MOZART, the background error covariance matrix is given in a wavelet formulation (Fisher, 2006), allowing both spatial and spectral variations of the horizontal and vertical background error covariances (Inness et al., 2013). For ozone, the background error correlations were derived from an ensemble of forecast differences, using a method proposed by Fisher and Andersson (2001). The background error standard deviation profiles and the horizontal and vertical correlations can be found in Fig. 1 of Inness et al. (2009).
BASCOE analyses use a diagonal background error covariance matrix B with a fixed error, usually between 20 and 50 % of the background field and set to 30 % in this version. The diagonal setup of B implies that spatial correlations are neglected. Spatial correlations help to spread the information from the data into the model. As mentioned by Errera et al. (2008), they can be neglected in a first approximation if the spatial coverage of the assimilated observations and their vertical resolution are comparable to the DAS resolution. This is the case here, where a maximum of 3 days of MLS observations are necessary to constrain all BASCOE grid points. Note that spatial correlations in the B matrix have been implemented recently in BASCOE (Errera and Ménard, 2012), following the method by Hollingsworth and Lönnberg (1986).
The SACADA 4D-Var assimilation uses a flow dependent parameterisation of the background error covariance matrix with a diffusion approach (Weaver and Courtier, 2001). The basic idea is to formulate covariances by Gaussians and approximate these Gaussians by integration of the diffusion operator over a specified time. Horizontal and vertical background error correlation lengths are fixed to 600 km and 3 km, respectively. The background standard deviation is set to 50 % of the background field, which is quite high and allows the observations to have a strong impact on results.
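The link between a diffusion operator and a Gaussian correlation function can be illustrated with a one-dimensional toy problem. The sketch below is not the SACADA implementation; the grid spacing, the pseudo-time and the explicit Euler scheme are assumptions chosen for illustration. It integrates the diffusion equation from a unit impulse and compares the result with a Gaussian of length scale L = 600 km, using the property that diffusion over a time T with coefficient κ yields a Gaussian with L² = 2κT.

```python
import numpy as np

def diffuse(field, kappa, dx, total_time, n_steps):
    """Integrate d(field)/dt = kappa * d2(field)/dx2 with explicit Euler
    steps on a periodic 1-D grid."""
    dt = total_time / n_steps
    r = kappa * dt / dx**2
    if r > 0.5:
        raise ValueError("explicit scheme unstable: increase n_steps")
    out = field.astype(float).copy()
    for _ in range(n_steps):
        out = out + r * (np.roll(out, 1) - 2.0 * out + np.roll(out, -1))
    return out

dx = 50.0                                  # grid spacing in km (assumption)
x = (np.arange(401) - 200) * dx            # 1-D grid centred on zero
length = 600.0                             # horizontal correlation length (km)
total_time = 1.0                           # pseudo-time, arbitrary units
kappa = length**2 / (2.0 * total_time)     # chosen so that L^2 = 2 * kappa * T

impulse = np.zeros_like(x)
impulse[200] = 1.0                         # unit impulse at x = 0
response = diffuse(impulse, kappa, dx, total_time, n_steps=400)

# Reference Gaussian, normalised to the same discrete sum as the response.
gaussian = np.exp(-0.5 * (x / length) ** 2)
gaussian /= gaussian.sum()
```

The appeal of this construction is that the covariance never has to be stored as a full matrix: applying it to a vector only requires a short time integration of the diffusion operator.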
In the parameterised Kalman filter approach of TM3DAM, the forecast error covariance matrix is written as a product of a time independent (i.e. fixed) correlation matrix and a time dependent diagonal variance (Eskes et al., 2003). All aspects of the covariance matrix, including the time dependent error growth and correlation length, are carefully tuned on the basis of OmF (Observation minus Forecast) statistics. In the total ozone product, a realistic time dependent error bar is provided for each location and time.
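The two simplest covariance models described in this section can be written down in a few lines. The following is a minimal sketch under stated assumptions: the Gaussian shape of the fixed correlation, the grid coordinates, the correlation length and all numerical values are hypothetical and are not taken from BASCOE or TM3DAM. It builds a diagonal matrix whose standard deviation is a fixed fraction of the background, and a full matrix factorised into a fixed correlation and a variable standard deviation.

```python
import numpy as np

def diagonal_covariance(background, fraction=0.3):
    """Diagonal B: the standard deviation is a fixed fraction of the
    background field, so spatial correlations are neglected."""
    return np.diag((fraction * background) ** 2)

def gaussian_correlation(coords_km, length_km):
    """Fixed (time independent) correlation matrix from pairwise distances.
    The Gaussian shape is an assumption made for this sketch."""
    d = coords_km[:, None] - coords_km[None, :]
    return np.exp(-0.5 * (d / length_km) ** 2)

def covariance_from_correlation(sigma, corr):
    """P = D^(1/2) C D^(1/2), where D = diag(sigma**2) may change in time
    while the correlation matrix C stays fixed."""
    return sigma[:, None] * corr * sigma[None, :]

background = np.array([300.0, 250.0])            # hypothetical values
B_diag = diagonal_covariance(background)

coords = np.array([0.0, 300.0, 600.0, 900.0])    # hypothetical 1-D grid (km)
C = gaussian_correlation(coords, length_km=500.0)
sigma = np.array([3.0, 4.0, 5.0, 6.0])           # hypothetical std devs (DU)
P = covariance_from_correlation(sigma, C)
```

With this factorisation only the vector of standard deviations has to be propagated in time, which is where the reduction in computational cost mentioned for TM3DAM comes from.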

Brewer-Dobson observations
To assess the condition of the ozone layer, one frequently uses the total column of ozone. Roughly 150 ground stations perform total ozone measurements on a regular basis. Data are submitted to the World Ozone and UV Data Center (WOUDC), operated by Environment Canada (http://www.woudc.org), as part of the Global Atmosphere Watch (GAW) programme of the World Meteorological Organization (WMO). The observations are predominantly taken with Dobson and Brewer UV spectrophotometers at about 60 and 70 stations respectively, but WOUDC also includes observations from UV-Vis DOAS spectrometers.
Even though Dobson and Brewer instruments are based on the same general measurement principle, previous studies have identified a seasonal bias of a few percent between their midlatitude total ozone column measurements, with Brewer measurements being in slightly better agreement with satellite data than Dobson measurements. In the northern hemisphere, Dobson instruments exhibit a +1 % bias compared to Brewer instruments, and this bias exhibits a seasonal cycle, which is not the case for Brewer instruments (Scarnato et al., 2009; Lerot et al., 2013). Similar conclusions hold for the southern hemisphere. Since the Brewer network does not have such good coverage in the southern hemisphere, however, we use the Dobson instruments as a reference in the Antarctic, keeping in mind this +1 % bias compared to Brewer instruments (i.e. we did not correct the Dobsons for this bias, but instead used the original data).
In order to assess the quality of the total ozone columns (TOCs) delivered by the four systems, we selected three stations from the WOUDC database for which the time cover-

DOAS observations
Conventional techniques for measuring ozone in the UV, such as Dobson spectrometers, are inapplicable for solar zenith angles (SZA) larger than about 80  Pommereau and Goutail (1988)) instruments belong to this family and have a standardised design, which allows observations of NO2 and O3 total columns twice a day during twilight (sunrise and sunset). As a general result, the SAOZ O3 measurements are 2-8 % higher than the Dobson ones, with a scatter of about 5 % in midlatitudes and increasing at higher latitudes.
For the "Arctic ozone hole 2011" case study in this paper (see Sect. 6), the total ozone columns from the four analyses were compared with data from three UV-Vis zenith-sky instruments at Arctic locations, which are part of the Network for the Detection of Atmospheric Composition Change (NDACC, http://www.ndacc.org): Scoresby Sund (Greenland, 70.49° N, 21.98° W); Zhigansk (Russia, 66.8° N, 123.4° E); and Harestua (Norway, 60° N, 11° E). The instruments at Zhigansk and Scoresby Sund have the SAOZ design and are owned by LATMOS/CNRS (Laboratoire Atmosphères, Milieux, Observations Spatiales/Centre National de Recherche Scientifique), while the instrument at Harestua has an improved design and is operated by BIRA-IASB (van Roozendael et al., 1995).
We use ozone observations gathered by balloon sondes at 38 locations, taken from the above-mentioned databases for the period September 2009 to September 2012: 12 in the Arctic; 19 in the tropics; and 7 in the Antarctic (see Fig. 1). For a more detailed discussion, we picked out one station per latitude band that is representative of the general behaviour in that band and for which the time coverage over this 3-year period was sufficiently large: the Arctic station at Ny-Ålesund (79° N, 12° E); the equatorial station at Nairobi (1.27° S, 36.8° E); and the Antarctic station at Neumayer (70.65° S, 8.25° W) (red dots in Fig. 1). Data are provided by the Alfred-Wegener Institute in Potsdam, Germany (for Ny-Ålesund and Neumayer) and by MeteoSwiss in Payerne, Switzerland (for Nairobi).

ACE-FTS satellite data
ACE-FTS is one of the two instruments on the Canadian satellite mission SCISAT-1 (first Science Satellite), also known as ACE (Bernath et al., 2005). It is a high spectral resolution Fourier transform spectrometer operating with a Michelson interferometer. Vertical profiles of atmospheric parameters such as temperature, pressure and volume mixing ratios of trace constituents are retrieved from the occultation spectra, as described in Boone et al. (2005), with a vertical resolution of 3-4 km at most. Level 2 ozone retrievals (version 3.0) are used as an independent reference data set to validate the ozone profiles of the MACC stratospheric ozone system.
It must be noted that the low spatio-temporal sampling of ACE-FTS (due to the solar occultation technique) does not deliver profiles in all latitude bands for each month. There are also two periods during the year in which there are no measurements for almost 3 weeks because the spacecraft is in constant sunlight: June and December (Hughes and Bernath, 2012). There are four periods per year, lasting about 1 month (northern hemisphere: April, June, August, December; southern hemisphere: February, June, October, December), with no occultation poleward of 60° (see Fig. 4 of Hughes and Bernath (2012)). At very high β angles (i.e. the angle between the orbital plane of the satellite and the Earth-Sun direction > 57°), it is common practice to skip more than half of the available measurement opportunities to avoid exceeding onboard storage capacities and overlapping command sequences. Therefore, the number of observations in the tropics is significantly lower than in the polar regions.
The previous version of these retrievals (version 2.2) was extensively validated against 11 other satellite instruments, ozonesondes and several types of ground-based instruments (Dupuy et al., 2009). This version reports more ozone than most correlative measurements from the upper troposphere to the lower mesosphere. Dupuy et al. (2009) found a "slight positive bias with mean relative differences of about 5 % between 15 and 45 km. Tests with a preliminary version of the next generation ACE-FTS retrievals (version 3.0) have shown that the slight positive stratospheric bias has been removed." Adams et al. (2012) additionally present an intercomparison of ACE ozone profiles (both versions 2.2 and 3.0) against ground-based observations at Eureka, confirming that the new ACE-FTS v3.0 and the validated v2.2 partial ozone columns are nearly identical, with a mean relative difference of 0.0 ± 0.2 % for v2.2 minus v3.0.
Standard deviations for levels where there are fewer than 20 observations are omitted for reasons of nonrepresentativeness.
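The screening rule for standard deviations can be sketched as follows. This is an illustration only: the function name, the array layout (profiles by levels, with NaN for missing values) and the synthetic data are assumptions, not the processing code used for the paper. The function returns the mean and standard deviation of the relative differences per level and blanks the standard deviation wherever fewer than 20 collocated pairs are available.

```python
import numpy as np

def relative_difference_stats(analysis, reference, min_count=20):
    """Per-level mean and standard deviation of the relative difference
    (analysis - reference) / reference, in %.

    Inputs have shape (n_profiles, n_levels); NaN marks missing data.
    The standard deviation is set to NaN where fewer than `min_count`
    valid pairs exist at a level."""
    rel = 100.0 * (analysis - reference) / reference
    count = np.sum(~np.isnan(rel), axis=0)
    mean = np.nanmean(rel, axis=0)
    std = np.nanstd(rel, axis=0, ddof=1)
    std = np.where(count >= min_count, std, np.nan)
    return mean, std, count

# Synthetic example: 25 profiles on 3 levels, hypothetical mixing ratios.
rng = np.random.default_rng(0)
reference = rng.uniform(1.0, 8.0, size=(25, 3))
analysis = reference * (1.0 + rng.normal(0.0, 0.05, size=(25, 3)))
reference[10:, 2] = np.nan        # third level keeps only 10 valid pairs
mean, std, count = relative_difference_stats(analysis, reference)
```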

Validation of total ozone columns
We intercompare for the first time analyses based on data from different satellites and of different types: partial/total ozone columns, profile observations or a combination of both. For an optimal interpretation of the validation results, it is important to keep in mind that SACADA and TM3DAM exclusively assimilated total ozone columns, but while TM3DAM delivers only total ozone columns as output product, SACADA also provides ozone profiles. BASCOE exclusively assimilated vertical profiles of ozone (besides other species) and IFS-MOZART used a combination of total columns, partial columns and vertical profiles from various instruments.
In this section, we discuss the results obtained for the validation of the total ozone columns against Brewer observations at Alert (Arctic) and Chengkung (tropics), and against Dobson observations at Syowa (Antarctic). The TOC data sets from the four systems were interpolated to the latitude and longitude of these stations. The resulting time series are shown in Fig. 2, side by side with the corresponding observed ground-based data.
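The interpolation method used for the total columns is not specified here; a bilinear scheme is one common choice and is assumed in the sketch below. The function name, the regular 2° by 3° test grid and the sample point are hypothetical, and longitude wrap-around and pole handling are deliberately left out.

```python
import numpy as np

def bilinear_to_station(lats, lons, field, lat_s, lon_s):
    """Bilinear interpolation of field[lat, lon] to a station location.

    `lats` and `lons` must be regular and ascending. No longitude
    wrap-around or pole handling is included in this sketch."""
    i = int(np.clip(np.searchsorted(lats, lat_s) - 1, 0, len(lats) - 2))
    j = int(np.clip(np.searchsorted(lons, lon_s) - 1, 0, len(lons) - 2))
    wy = (lat_s - lats[i]) / (lats[i + 1] - lats[i])
    wx = (lon_s - lons[j]) / (lons[j + 1] - lons[j])
    bottom = (1.0 - wx) * field[i, j] + wx * field[i, j + 1]
    top = (1.0 - wx) * field[i + 1, j] + wx * field[i + 1, j + 1]
    return (1.0 - wy) * bottom + wy * top

lats = np.arange(-90.0, 91.0, 2.0)      # 2 degree latitude spacing
lons = np.arange(0.0, 360.0, 3.0)       # 3 degree longitude spacing
# A field that is linear in latitude and longitude is reproduced exactly.
field = 2.0 * lats[:, None] + 3.0 * lons[None, :]
value = bilinear_to_station(lats, lons, field, lat_s=-69.0, lon_s=40.0)
```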

Alert (Arctic)
The seasonal O3 cycle at Alert is very similar each year. Deviations from the usual behaviour of the total ozone columns occurred, e.g., in November 2009, when an air mass with exceptionally high ozone passed over Alert, and in February-March 2011, when 30 % of the total ozone column above Alert was destroyed by the end of March. The latter event will be studied in detail in Sect. 6 as a separate case study.
All four analyses match each other and the observed total ozone columns very closely. Peak-to-peak differences in TOC are of the order of 250 Dobson Units (DU), with maximum values reached during boreal winter and spring as a result of poleward and downward transport of ozone-rich air by the large-scale Brewer-Dobson circulation (Brewer, 1949; Dobson, 1956; Weber et al., 2011).
The only significant differences among the analyses occur during the O3 maximum in northern spring (where mutual differences of at most 50 DU, about 10 %, are observed) and during the Arctic ozone hole season, where SACADA delivers TOC values which are about 75 DU (20 %) above the other analyses. Unfortunately, this coincides exactly with the periods where reliable ground-based observations are missing due to the lack of sunlight.

Chengkung (tropics)
The ozone columns in the tropics are lower (between 240 and 330 DU) due to the large-scale ascent of tropospheric low-ozone air and the higher incidence of solar radiation. Ozone maxima are reached in April each year, after which ozone decreases slowly until the beginning of November and more rapidly afterwards. The lowest values are seen in December, January, and February (DJF), when the upwelling part of the Brewer-Dobson circulation is strongest. Ozone recovers very rapidly from January to April. This seasonality is in general well reproduced by all analyses. The ozone recovery is slower in the analyses than in the observations and the observed ozone maxima are never reached.
IFS-MOZART, SACADA, and TM3DAM mutually differ by 2 % at most, and underestimate the Brewer observations by no more than 5 %. BASCOE systematically underestimates total ozone by 20 DU throughout the year (about 7-10 %). As discussed below (Sect. 4.4.2), this is due to the underestimation of ozone in the lower stratosphere.

Syowa (Antarctic)
At the Antarctic station Syowa, the local spring-time ozone hole is evident, with values below 200 DU during the months September, October, and November (SON). The total ozone columns are reduced by up to 50 %, from approximately 300 DU during austral summer and autumn down to 150 DU during the austral spring season.
The seasonal cycle of total ozone is very well reproduced by the IFS-MOZART, BASCOE and TM3DAM analyses with results very close to each other (biases < 2 %). After the loss of Envisat in April 2012, the differences between IFS-MOZART and TM3DAM become slightly larger. Before this incident, both IFS-MOZART and TM3DAM assimilated SCIAMACHY data, but afterwards, TM3DAM switched to GOME-2, while IFS-MOZART continued to assimilate observations from SBUV/2, OMI, and MLS.
SACADA exhibits strong positive biases from observations during austral winters, right before the onset of the ozone hole (up to 30 % in 2012). Closer inspection of SACADA analyses shows that these larger differences coincide with missing SCIAMACHY and GOME-2 observations during polar night when solar zenith angles are close to or in excess of 90°. While this coverage effect should especially influence systems that assimilate data from UV instruments only, the TM3DAM system is found less vulnerable to data gaps than SACADA, as it performs very well under the same circumstances.

Discussion of SACADA total column results
All analyses show a realistic seasonal cycle in all three latitude bands and total ozone column values that are generally in very good agreement with independent observations, with the exception of SACADA during polar night. Differences between IFS-MOZART, BASCOE, and TM3DAM are usually within 5 %. Only a few exceptions were identified, i.e. larger mutual differences (up to 10 %) are found at high latitudes during polar night, and for BASCOE in the tropics, where the system underestimates total ozone by 7-10 %.
In contrast to these three analyses, SACADA total ozone results deviate strongly from observations during certain episodes. There is a general tendency in SACADA results for positively biased ozone columns during the winter months at high latitudes compared to Alert and Syowa station data in the northern and southern hemispheres, respectively. Backscatter UV instruments provide no information for zenith angles above 90°. As recommended for SCIAMACHY data version 3 (Lerot et al., 2007), only observations with zenith angles up to 75° were used. Thus, no SCIAMACHY data were assimilated until May 2011 at the latitudes of Alert station (82.49° N). Accordingly, at Syowa station (69° S), SCIAMACHY data were not processed from the end of March until the end of September.
From 28 October 2011 onwards, GOME-2 observations were assimilated by SACADA up to zenith angles of 90°. In this case, the instrument is blind from mid September 2011 to April 2012 at Alert, and from mid April to mid September at Syowa. These time periods generally correlate well with the positive bias anomalies in ozone columns found in SACADA results. The area of impact of a total column observation on assimilation results is limited by the background correlation matrix, which uses a horizontal correlation radius of 600 km. Latitudes not covered by observations can therefore only be influenced via tracer transport and chemistry. In summary, we conclude that these large biases reflect a general tendency of the SACADA model to overestimate total ozone in polar night regions. Since its assimilation setup was limited to UV-Vis observations, these could not constrain the erroneous model results at high latitudes.

Validation of the vertical distribution of stratospheric ozone against ozonesondes
In this section, we discuss the results obtained for the validation of the ozone profiles against ozonesonde observations at Ny-Ålesund (Arctic), Nairobi (tropics), and Neumayer (Antarctic).
In order to compare the ozone fields from the three systems with the observed ozonesonde data, the analyses were first linearly interpolated to the geographical location of the launch sites. Even though sondes may drift long distances during their ascent, especially within the polar vortex, this often significant horizontal movement was disregarded, as tracking information is not always available. As a next step, the two analysis profiles preceding and following the measurement closest in time were linearly interpolated to the time of observation. Since the ozonesonde profiles have a much higher vertical resolution than the analyses, the ozonesonde data have been vertically re-gridded to the coarser pressure grid of the DAS, degrading the observations to the lower resolution of the DAS through a mass-conserving algorithm (Langerock et al., 2014). Figure 3 shows time series of the monthly mean ozone bias profiles with respect to the ozonesondes at the selected sites for each of the three MACC systems.
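The temporal interpolation and the vertical degradation of the sonde profiles can be sketched as below. This is a simplified illustration and not the algorithm of Langerock et al. (2014): the function names, the layer-edge representation and the pressure-thickness weighting are assumptions. Weighting each fine layer by its pressure thickness preserves the sum of mixing ratio times layer thickness, which is proportional to the partial column.

```python
import numpy as np

def interpolate_in_time(t_obs, t0, profile0, t1, profile1):
    """Linear interpolation between two analysis profiles valid at
    t0 <= t_obs <= t1 (times in any consistent unit, e.g. hours)."""
    w = (t_obs - t0) / (t1 - t0)
    return (1.0 - w) * profile0 + w * profile1

def regrid_mass_conserving(p_edges_fine, vmr_fine, p_edges_coarse):
    """Average fine-layer mixing ratios onto coarse layers, weighting each
    fine layer by its pressure-thickness overlap with the coarse layer.

    Edges are increasing pressures (hPa); each edge array has one more
    element than the number of layers it delimits."""
    lo_f, hi_f = p_edges_fine[:-1], p_edges_fine[1:]
    out = np.full(len(p_edges_coarse) - 1, np.nan)
    for k in range(len(out)):
        lo_c, hi_c = p_edges_coarse[k], p_edges_coarse[k + 1]
        overlap = np.clip(np.minimum(hi_f, hi_c) - np.maximum(lo_f, lo_c),
                          0.0, None)
        if overlap.sum() > 0.0:
            out[k] = np.sum(overlap * vmr_fine) / overlap.sum()
    return out

# Hypothetical sonde profile: 90 layers of 1 hPa between 10 and 100 hPa.
p_fine = np.linspace(10.0, 100.0, 91)
vmr_fine = np.linspace(8.0, 1.0, 90)             # ppmv, synthetic
p_coarse = np.array([10.0, 30.0, 60.0, 100.0])   # three coarse layers
vmr_coarse = regrid_mass_conserving(p_fine, vmr_fine, p_coarse)

# Analysis profiles at 06 and 12 h UT interpolated to a 09 h UT launch.
at_launch = interpolate_in_time(9.0, 6.0, vmr_coarse, 12.0, vmr_coarse + 0.2)
```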

Arctic -Ny-Ålesund
The seasonal cycle at Ny-Ålesund is very well reproduced by the three analyses. Biases at Ny-Ålesund are generally smaller than 20 % for all MACC analyses throughout the stratosphere (Fig. 3). The time series of the ozone profiles shows alternating behaviour in the vertical for IFS-MOZART, persistent over the entire 3-year period, with positive biases in the lower (below 70 hPa) and upper (above 20 hPa) stratosphere and no or only slightly negative biases (mostly 5-10 %) in the middle stratosphere. The performance of BASCOE is stable throughout the stratosphere and for the entire 3-year period, with biases mostly less than 5 %. Largest biases over the whole period for IFS-MOZART (−20 to −30 % between 50 and 70 hPa) and for SACADA (> 50 % between 35 and 65 hPa) are found for March 2011.
While the ozone hole simulated by IFS-MOZART is too deep, SACADA simulates an Arctic ozone hole that is not deep enough. This special event will be discussed in detail in Sect. 6. Until March 2011, SACADA mainly overestimates ozone over the entire altitude range, while middle stratospheric ozone is mostly underestimated afterwards.

Tropics -Nairobi
The O3 bias profile time series (Fig. 3) now displays a changing performance in the vertical for all three analyses. Lower stratospheric ozone is underestimated by more than 40 % by both IFS-MOZART and BASCOE (below 80 hPa and below 100 hPa, respectively) throughout the year. For BASCOE, this is followed by a small pressure range just above
[Figure caption fragment: ... and ozonesondes (black). Bottom row: mean (solid lines) and standard deviations (dashed lines) of the relative differences, in %, of these analyses against the ozonesondes, over the period from September 2009 to September 2012.]
above BASCOE) with biases of about 15 % compared to the ozonesonde data at Nairobi, and even up to 30 % in the period August-November 2010, SACADA shows ozone values which are at least 35 % higher than the other two analyses, while the seasonality is well reproduced. The discontinuity in the SACADA products from 6 to 7 September 2010 is due to the resumption of the assimilation after a period in which SACADA ran freely (July-September 2010) because of a gap in the assimilated SCIAMACHY data. As mentioned before for the total ozone columns, the SACADA analysis tends to drift in the absence of UV observations to assimilate. Once resumed, the assimilation reduces the mismatch with the other two analyses from 60 % down to only 10 %.

Antarctic -Neumayer
The O3 bias profile time series show that the biases are smallest and most stable for BASCOE (usually less than 10 %). IFS-MOZART on the other hand has an annually recurrent pattern, overestimating O3 by more than 50 % between roughly 70 and 150 hPa each Antarctic ozone hole season, from September to December, while underestimating ozone between 30 and 60 hPa in September. This indicates that IFS-MOZART has problems with a correct simulation of the ozone depletion. This is a known problem of the underlying MOZART CTM in the MACC configuration, which cannot be completely fixed by the data assimilation (Flemming et al., 2011; Inness et al., 2013), especially because the assimilated profile only gives information down to 68 hPa. MOZART performs better with WACCM meteorology (Kinnison et al., 2007), which indicates that the chemical parameterisations are sensitive to the meteorological fields that are used to drive transport in the models. SACADA has problems correctly simulating the ozone concentration in the lower stratosphere (below 80 hPa). While the ozone hole depth of 2010 is underestimated (positive bias), the corresponding ozone depletion in 2011 and 2012 is overestimated by more than 50 %. This is related to the premature onset and end of the ozone depletion as predicted by the model, which is reflected also in the ozone values at 50 hPa. Apart from this, the observed ozone values at Neumayer at 50 hPa are in general well reproduced by the three analyses of the MACC system. IFS-MOZART and BASCOE do not differ much in their analyses.

SACADA results
In our evaluation, SACADA is the only chemical data assimilation system with full chemistry that assimilates total column ozone only. Ozone columns are assimilated by constraining the system's ozone column first guess at the satellite footprint. We find that, as in the case of SACADA, the lack of information constraining the shape of the ozone profile leads primarily to an overestimation of ozone in the lower stratosphere, as can be seen, e.g., in Fig. 3 in comparison to the station at Nairobi (1.27° S). The excess ozone in the lower stratosphere leads to an underestimation at higher altitudes above 30 hPa (see also Fig. 4). The standard deviations between the MACC systems and the ozonesondes are largest for SACADA. We conclude that total column assimilation does not sufficiently constrain the system's ozone profile.

IFS-MOZART and BASCOE results
Biases are mostly smaller than 10 % for IFS-MOZART and BASCOE in the middle to upper stratosphere. IFS-MOZART has problems with a correct representation of the vertical distribution of ozone. Often, over- and underestimations alternate in the vertical. Biases are highest in austral spring during the Antarctic ozone hole season. Also during March 2011, when the first documented significant ozone hole in the Arctic occurred (Manney et al., 2011), somewhat larger differences are found. While IFS-MOZART and BASCOE deliver quite similar results, BASCOE profiles have a more stable behaviour at all altitudes and during the Arctic and Antarctic ozone hole seasons. Largest biases occur, for both systems, in the lower stratosphere in the tropics. This can be partially explained by the strong gradients in ozone near the tropopause, which is located at higher altitudes in the tropics than at the poles. These sharp ozone gradients in the upper troposphere-lower stratosphere (UTLS) are very difficult to represent in three-dimensional models and likely require a very fine vertical resolution (Considine et al., 2008). Furthermore, relative differences are amplified in this region due to its low ozone abundance.
For BASCOE, two more elements play a role in the poorer performance in the lower tropical stratosphere: the low vertical resolution and aliasing errors in the horizontal wind fields, which are larger close to the UTLS and which lead to noise in the horizontal distribution of chemical tracers. This bug has been corrected in an upgraded version, which has been running operationally since the beginning of 2013. The vertical grid of the system is improved, from 37 levels to 91 levels, with a much finer resolution in the UTLS region. Comparison between both versions shows that O3 values become smaller around 80 hPa and larger at lower heights (which would thus correct the currently large biases in these regions).
The larger biases for IFS-MOZART in the lower stratosphere globally (i.e. not only at the tropics, but also at the poles, especially in the Antarctic) also result from the fact that the useful range of the NRT MLS v2.2 data was restricted to levels above 68 hPa, which means that it included no profile information below that pressure level, in contrast to BASCOE, which assimilated the offline MLS v2.2 data set down to 150 hPa. Tests with the improved NRT MLS v3.4 data (Livesey et al., 2013b), which can be used down to 261 hPa, show that many of those biases in the lower stratosphere disappear (see Sect. 6.2).
To illustrate that the selected stations at each latitude band are representative for the results at all stations and that the same conclusions hold in general, we additionally show the mean ozone profiles and ozone bias profiles for the MACC analyses compared to all considered ozonesonde measurements in each latitude band (see Fig. 1), averaged over the entire 3-year period from September 2009 to September 2012 (Fig. 4). On average, all analyses agree with the sondes mostly to within ±10 % above 70 hPa. Larger biases are observed for IFS-MOZART in the upper stratosphere (above 10 hPa) at the poles and in the lower stratosphere with overall biases reaching 30 % in the Antarctic and −40 % at the equator, and for BASCOE below 150 hPa.
Standard deviations between the MACC systems and the ozonesondes are smallest for BASCOE, and only slightly higher for IFS-MOZART, usually between 10 and 20 %, except for the region below 70 hPa in the tropics. The standard deviations for IFS-MOZART are higher in the area between 60 and 100 hPa in the tropics, and between 100 and 200 hPa in the Antarctic.

Influence of the temporal and horizontal resolution
SACADA data are sampled only once a day (at 12 h UT), IFS-MOZART data 6-hourly, and BASCOE data 3-hourly. This may affect their performances when compared to ozonesondes. To exclude the effect of temporal resolution, we have degraded the temporal resolution of both IFS-MOZART and BASCOE to the temporal resolution of SACADA.
Relative differences between the fine and the coarse temporal resolution data sets are usually less than 2 %, but can be as high as 10 % for some months and at some altitudes, without any clear pattern. The effect on the standard deviation of the differences when using the 24 h resolution data set for all three analyses is not significant except in the lower tropical stratosphere (figures not shown).
On the other hand, a lower horizontal resolution may also lead to larger standard deviations. BASCOE and SACADA have, however, the same horizontal resolution (3.75° by 2.5°), which is coarser than for IFS-MOZART (1.875° by 1.875°). This illustrates that the differences in standard deviations between the MACC systems do not depend exclusively on either the temporal or the horizontal resolution.

Validation of the vertical distribution of stratospheric ozone against ACE-FTS
In addition to the ground-based and ozonesonde data, the MACC ozone analyses have been compared to independent ACE-FTS satellite observations. The comparison between the measurements by ACE-FTS and the analysis output is performed in the following manner. The analyses, first regridded to a common 1  ACE-FTS data in space (horizontally and vertically) and time through linear interpolation. Since SACADA results are only provided every 24 h, we assume a constant composition throughout the day. Monthly mean biases of the spatio-temporally collocated data are calculated for five latitude bins, using 25 pressure bins based on the standard Upper Atmosphere Research Satellite (UARS) fixed pressure grid (i.e. six pressure levels per decade, which corresponds approximately to 2.5 km). These monthly mean biases and their associated standard deviations can be displayed as time series (Figs. 5 and 6) or as vertical profiles (Fig. 7).
In view of the problems in constraining the SACADA three-dimensional ozone field using only total column assimilation, we will still show the SACADA results in the figures but we will not include these analyses in the discussion.
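For readers unfamiliar with the UARS pressure grid, its levels follow p_i = 1000 × 10^(−i/6) hPa, which gives six levels per decade of pressure. The sketch below generates such a grid and assigns samples to the nearest level in log-pressure. The starting level of 1000 hPa, the nearest-level binning and the function names are assumptions made for illustration; the exact 25 levels used for the comparison are not listed here.

```python
import numpy as np

def uars_pressure_grid(n_levels):
    """UARS-style grid: p_i = 1000 * 10**(-i / 6) hPa, i = 0 .. n_levels-1,
    i.e. six levels per decade of pressure."""
    return 1000.0 * 10.0 ** (-np.arange(n_levels) / 6.0)

def nearest_level(pressure_hpa, grid):
    """Index of the grid level closest to each sample in log-pressure."""
    d = np.abs(np.log(pressure_hpa)[:, None] - np.log(grid)[None, :])
    return np.argmin(d, axis=1)

grid = uars_pressure_grid(25)
levels = nearest_level(np.array([100.0, 50.0, 10.0]), grid)
```

Binning in log-pressure rather than pressure keeps the bins roughly equidistant in altitude.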

Partial ozone columns
The time series of the standard deviations in Fig. 5 gives a global view of how well the analyses are performing against the satellite data. The standard deviations are averaged over the entire globe (90° S-90° N) and over the entire stratospheric area of interest (200-5 hPa). As shown earlier, when compared with ground-based and ozonesonde observations, the results by IFS-MOZART and BASCOE are very similar. Standard deviations are on average around 6-7 %. This is only slightly larger than the relative mean difference between ACE-FTS and coincident MLS profiles, reported by Dupuy et al. (2009, Table 7) as +4.7 %. The largest standard deviations are found around March and August each year.
Binning into a stratospheric pressure layer (100-5 hPa for the tropics, and 200-5 hPa for all other latitude bands) shows small overall mean biases for both systems. Individual monthly mean biases for IFS-MOZART and BASCOE always remain below 5 %, which shows that these analyses have an overall stable behaviour (figure not shown).

Ozone at predefined pressure levels
Even though partial columns indicate a stable behaviour for both IFS-MOZART and BASCOE, interpolation at specific pressure levels (10, 50, and 100 hPa) reveals alternating positive and negative biases in the vertical for IFS-MOZART, both in the Arctic and in the Antarctic, especially during ozone hole events (Fig. 6), which was also seen earlier in the comparison with O3 sondes (Fig. 3). These vertical oscillations in bias compensate to deliver correct (assimilated) partial or total columns (see Sect. 3).
In the Arctic, biases are, for all analyses, largest in March 2011. Biases remain low for BASCOE (< 10 %), but attain values up to 20 % for IFS-MOZART. One obvious explanation is the occurrence of extreme conditions in the Arctic during this period (Manney et al., 2011). This event will be discussed in Sect. 6. The same conclusions can be drawn for the Antarctic during the yearly ozone hole conditions. Biases for BASCOE still remain within 10 %, but are more pronounced for IFS-MOZART than in the Arctic, especially in the lower stratosphere (100 hPa), where relative differences up to almost 50 % in 2011 and even 60 % in 2010 are found in September, even now that Aura MLS data are available for assimilation.

Seasonal mean ozone profiles
Figure 7 shows seasonally averaged relative ozone biases for austral spring and boreal winter, for the three consecutive years in the studied period. BASCOE has a stable performance compared to ACE-FTS data throughout the stratosphere, very similar each year, but slightly underestimating ozone with an average of 5 % in the Arctic. While the biases vary between −10 and 0 % in austral spring 2010, the variability is larger (biases between −15 and +5 %) in austral spring 2011. The seasonal mean biases of IFS-MOZART again illustrate the oscillating behaviour of the profiles, both in the Arctic and Antarctic. Antarctic biases appear to be three times as large as those in the Arctic, and largest the first

Arctic ozone hole event 2011

Case study
Besides the overall performance of the different analyses, we want to evaluate the ability of the MACC system to capture special events, such as the yearly recurrent Antarctic ozone holes, or the exceptional Arctic ozone hole in northern winter/spring 2011 (Manney et al., 2011). Long-lasting exceptionally cold conditions prevailing over the Arctic, together with man-made ozone-depleting compounds lingering in the atmosphere, caused the destruction of almost 40 % of stratospheric ozone by the end of March (Manney et al., 2011). In this section, we address the performance of the MACC system during this particular event. Throughout the previous discussions, we have already shown that biases with respect to observations are largest at the peak of the ozone hole (i.e. March 2011), illustrating that most systems have difficulties in correctly simulating such an unexpected event.
Figure 8 shows the evolution of the ozone depletion, as simulated by IFS-MOZART and BASCOE in the North Pole vortex at 485 K potential temperature (∼ 20 km, ∼ 50 hPa) during the month of March 2011. The vortex is determined by the potential vorticity (PV). Two contours of scaled PV (sPV) delimit the outer and inner vortex edges, using sPV values of 1.4 × 10⁻⁴ (as in Manney et al., 2011) and 1.7 × 10⁻⁴ s⁻¹, respectively. In view of the fact that SACADA did not assimilate SCIAMACHY data at high northern latitudes before May 2011 (see earlier), we have omitted the discussion of SACADA results in this particular case study. Manney et al. (2011) showed that, in February-March 2011, the barrier to transport at the Arctic vortex edge was the strongest in either hemisphere for the last ∼ 30 years. This barrier isolates the cold air in the vortex, preventing it from mixing with air in the midlatitudes, and causes a build-up of ozone, brought by long-range transport, outside the vortex. Inside the vortex, the air masses were cold enough to allow PSC particles to condense. Heterogeneous reactions took place at the surface of these particles, converting the chlorine reservoir molecules HCl and ClONO2 into chemically active ClO and Cl2. Catalytic destruction of ozone could start as soon as sunlight came back to illuminate these air masses. From late February/early March 2011 onwards, reduced levels of ozone were observed inside the vortex and the ozone hole started to develop. The largest chemical loss was recorded on 26 March. At that time, a stretched vortex was covering Scandinavia and northwest Russia.
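The use of the two sPV thresholds can be sketched as a simple classification of grid points on the 485 K surface. The function names and the synthetic values below are hypothetical, the thresholds are those quoted in the text, and the unweighted mean is a simplification: a real vortex average on a latitude-longitude grid would need area weights.

```python
import numpy as np

SPV_OUTER = 1.4e-4   # s^-1, outer vortex edge (as in the text)
SPV_INNER = 1.7e-4   # s^-1, inner vortex edge (as in the text)

def classify_vortex(spv):
    """Return 0 outside the vortex, 1 in the edge region between the two
    thresholds, and 2 inside the inner edge (northern hemisphere sPV)."""
    spv = np.asarray(spv)
    return (spv >= SPV_OUTER).astype(int) + (spv >= SPV_INNER).astype(int)

def vortex_mean(field, spv, threshold=SPV_INNER):
    """Unweighted mean of `field` over points with sPV >= threshold;
    NaN if no point qualifies. Area weighting is omitted in this sketch."""
    mask = np.asarray(spv) >= threshold
    return float(np.asarray(field)[mask].mean()) if mask.any() else float("nan")

spv = np.array([1.0e-4, 1.5e-4, 2.0e-4, 2.2e-4])   # synthetic sPV values
ozone = np.array([3.0, 2.5, 1.5, 1.3])             # synthetic ozone (ppmv)
regions = classify_vortex(spv)
inner_mean = vortex_mean(ozone, spv)
```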
As seen in Fig. 8, IFS-MOZART and BASCOE provide very similar results, with BASCOE values being slightly higher and slightly noisier than the IFS-MOZART ones. The slightly higher noise has been corrected in a later version of BASCOE (see Sect. 2.2.2).
Comparison with ozone soundings at Ny-Ålesund (Spitsbergen), which is always located within the polar vortex, shows that both IFS-MOZART and BASCOE could correctly reproduce the ozone hole conditions, with relative biases mostly less than 10 % in the stratosphere (Fig. 10). IFS-MOZART, however, has difficulty reproducing the vertical profile when the ozone depletion is strongest (in March 2011): alternating positive and negative biases of up to 30 % can be seen.
These uncontrolled modifications of the observing system led us to explore in a more systematic manner the impact of the assimilated observations on the quality of the analyses. We chose a 1-month period with the Arctic ozone depletion already well underway while MLS and ACE-FTS were still scanning the area, i.e. the month of March 2011. We first defined three new experiments with IFS-MOZART, BASCOE and SACADA assimilating the same data set: Aura MLS version 3.3 offline ozone, keeping all observations down to 215 hPa. BASCOE was not allowed to assimilate any species other than ozone. To allow a short spin-up period of about 1 week, the three systems were started on 25 February from the BASCOE analysis delivered in NRT for that date.

Influence of the assimilated data on the performance of the analyses
Figure 11 (left) shows the mean bias and standard deviations of the differences between the NRT (i.e. the original) analyses and ACE-FTS observations, keeping only the (∼ 200) ACE profiles within the North Pole vortex, with the vortex edge defined by sPV > 1.7 × 10⁻⁴ s⁻¹. At the level where ozone depletion is at its maximum (θ ≈ 485 K), we see that the depletion is much too severe in IFS-MOZART NRT analyses and completely absent in SACADA NRT analyses. Figure 11 (right) shows the results of the three new offline experiments assimilating the same data set; now all analyses perform very similarly.
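The statistics described above can be sketched as follows, assuming that the bias is the mean of the relative differences (analysis minus observation, in %) and that the spread is a sample standard deviation. Both are assumptions, since the text does not spell out the estimator. The function name, the `min_count` parameter and the sample values are hypothetical.

```python
from statistics import mean, stdev


def relative_difference_stats(analysis, observed, min_count=2):
    """Mean bias and sample standard deviation of the relative differences
    100 * (analysis - observation) / observation, in percent.

    Returns None when fewer than `min_count` collocated pairs are available
    (hypothetical guard; at least 2 pairs are always required for stdev).
    """
    if len(analysis) != len(observed):
        raise ValueError("analysis and observed must have the same length")
    rel = [100.0 * (a - o) / o for a, o in zip(analysis, observed)]
    if len(rel) < max(min_count, 2):
        return None
    return mean(rel), stdev(rel)


# Made-up collocated values at one level (arbitrary units), for illustration.
analysis_values = [1.1, 0.9, 1.2]
observed_values = [1.0, 1.0, 1.0]
bias, spread = relative_difference_stats(analysis_values, observed_values)
```

Applied level by level to the collocated profiles, this yields one bias value and one standard deviation per level, which is the kind of pair plotted as a profile in a bias figure.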
To identify the exact cause of the large improvement in IFS-MOZART analyses, we ran a final sensitivity test with IFS-MOZART assimilating the usual set of UV-Vis data (OMI and SCIAMACHY total columns; SBUV/2 partial columns) in addition to the offline MLS v3 data set. As can be seen in Fig. 12, the poor performance of IFS-MOZART NRT was not due to the assimilation of UV-Vis data but rather to the assimilation of the MLS v2 NRT data. If the MLS v3 and UV-Vis observations are assimilated together (green curves), the quality of the ozone analyses delivered by IFS-MOZART improves: tropospheric ozone is improved over the previous sensitivity test assimilating only MLS v3 (blue curves), and the simultaneous assimilation of UV-Vis observations does not degrade the analysis of stratospheric ozone.
The poorer performance of IFS-MOZART NRT is probably not due to the earlier version of the MLS data set either, because our sensitivity test with BASCOE (Fig. 11, blue lines) shows that the analyses of MLS v2.2SCI (left) performed nearly as well as the analyses of MLS v3 (right) despite the use of an earlier version of BASCOE. Hence the better performance of BASCOE NRT is primarily due to its assimilation of MLS v2.2SCI down to 150 hPa, while IFS-MOZART had to assimilate MLS v2.2 NRT, which was not valid (and filtered out) below 68 hPa. This subtle difference in configuration is due to an operational constraint: IFS-MOZART had to be run closer to real time and could not wait 3 extra days for the distribution of MLS v2.2 SCI.
It is now possible to interpret the slight differences between the performances of the three systems assimilating the same MLS data set (Fig. 11, right). The biases and standard deviations are smallest for IFS-MOZART, probably thanks to its higher horizontal resolution. The standard deviations for SACADA are slightly larger than the ones for IFS-MOZART and BASCOE, which is due to its lower time sampling (24 h output frequency instead of 6 h for the other two). Finally, the BASCOE experiment delivers smaller biases and standard deviations than the original NRT analysis. This may be due to two different causes: the assimilation of MLS offline v3.3 instead of MLS offline v2.2, and/or an improvement in the pre-processing of the ECMWF wind fields that drive the transport in BASCOE. The BASCOE version used in NRT suffered from aliasing errors in the input wind fields, leading to some erroneous noise in the horizontal distribution of chemical tracers (Fig. 8).

Conclusions
Four ozone data assimilation systems (DAS) have been run continuously and simultaneously since September 2009. We have validated and compared the resulting analyses over a period of 3 years, i.e. until September 2012. These DAS have very different designs (offline or online dynamics; grid setup; specification of background error covariances) and were set up very differently with respect to the assimilated data sets. In this paper we seized this opportunity, first to provide an intercomparison and validation of the resulting analyses, and second to investigate the causes of their very different biases.
This study shows what can be achieved in near real time (NRT) with state-of-the-art DAS for stratospheric ozone and provides guidance to the users of the resulting analyses. Among the three sets of vertically resolved NRT analyses of stratospheric ozone, those delivered by BASCOE had the best overall quality. This is due primarily to the focus of BASCOE on stratospheric observations retrieved from limb sounders, and to more relaxed operational constraints allowing it to wait for the delivery of the best input data set available.
TM3DAM is based on a sequential Kalman filter algorithm and does not model stratospheric chemistry explicitly. It aims only to provide total columns of ozone by making optimal use of the ozone column measurements from UV-Vis satellite sounders, with very small biases between the analyses/forecasts and satellite data sets. It was shown that TM3DAM is a good reference to test the ability of the three other systems to produce accurate ozone column amounts.
The low quality of the analyses delivered in NRT by SACADA is a good indication of the drawbacks to expect from current CDA systems when they are configured to assimilate total ozone columns only. This should be considered as a worst-case scenario in a future situation where no limb sounder would be available and no proper effort would be invested to assimilate the vertical profiles retrieved from nadir-looking instruments.
Finally, while IFS-MOZART did not deliver the best NRT analyses in this intercomparison, it still has the potential to deliver the best analyses (Figs. 11 and 12). Official reviews of international monitoring capacities (e.g. WMO, 2011) expect an imminent lack of ozone-profiling capabilities at high vertical resolution. Unlike the BASCOE version used here, IFS-MOZART should be able to adapt to this situation thanks to its demonstrated ability to assimilate several instruments simultaneously.
From a system design point of view, the sensitivity tests performed in Sect. 6.2 deliver important conclusions:
- All systems used in MACC require profile data to provide a good vertical distribution of stratospheric ozone.
- These profile data must include the lower stratosphere.
- IFS-MOZART is able to assimilate limb profiles and nadir products successfully. The profiles constrain the stratosphere well, allowing the partial and total columns (from UV-Vis instruments) to constrain the troposphere well.
- When they assimilate the same data set with good quality and large observational density, BASCOE, IFS-MOZART and SACADA deliver very similar performance despite their very different designs. The quality of modern ozone analyses depends primarily on the assimilated data. This conclusion has large implications for the planning of future satellite missions.
The newer SBUV/2 v8.6 profiles are distributed over 21 layers and each profile is distributed with its matrix of averaging kernels. Kramarova et al. (2013) illustrated the importance of using this information properly. While it is planned to implement SBUV/2 averaging kernels in the MACC NRT system at ECMWF, the sensitivity test in Sect. 6.2 shows that this improvement was not necessary to assimilate SBUV/2 v8 successfully after a vertical re-gridding over six thick layers.
This study demonstrates the large benefit obtained from the assimilation of a single limb-scanning instrument with a high density of observations (Aura MLS). Therefore we can only share the serious concern about the lack of ozone-profiling capabilities at high vertical resolution in the short term, as expressed already in WMO (2011).

Figure 1. Location of all stations used in this paper. O₃ sondes are indicated as filled black circles. The ones selected for a more detailed discussion have been marked in red: Ny-Ålesund (79° N, 12° E) in the Arctic, Nairobi (1.27° S, 36.8° E) in the tropics, and Neumayer (70.65° S, 8.25° W) in the Antarctic. The three sites selected for the total ozone column (TOC) discussion are indicated by the red squares.

Figure 2. Comparison between the TOC time series (5-day moving average) of the four analyses of the MACC stratospheric ozone service (IFS-MOZART in red, BASCOE in blue, SACADA in green, and TM3DAM in cyan) interpolated to a high northern latitude station (Alert, 82.49° N, 62.42° W), a tropical station (Chengkung, 23.1° N, 121.365° E) and a southern latitude station (Syowa, 69° S, 39.58° E), for the period from September 2009 to September 2012. Black symbols are 5-day moving averages of Brewer (for Alert and Chengkung) and Dobson (for Syowa) observations from the WOUDC network. In order to indicate the observational uncertainty, the height of each symbol is set to 4 % of the observed value.

Figure 4. Top row: mean ozone profiles as partial pressures in mPa from IFS-MOZART (red), BASCOE (blue), SACADA (green) and ozonesondes (black). Bottom row: mean (solid lines) and standard deviations (dashed lines) of the relative differences, in %, of these analyses against the ozonesondes, over the period from September 2009 to September 2012.

Figure 5. Comparison of the global (i.e. from 90° S to 90° N) monthly mean standard deviation between IFS-MOZART (red), BASCOE (blue), and SACADA (green) with ACE-FTS (analysis minus observations) in %, for the [200, 5] hPa pressure bin, for the period September 2009 to September 2012. Standard deviations for levels with fewer than 20 observations are omitted. Note that standard deviations are not weighted by the cosine of the latitude.

Figure 10. Comparison of the monthly averaged O₃ partial pressures, in mPa, by IFS-MOZART (red), BASCOE (blue) and SACADA (green) with O₃ sonde profiles observed at Ny-Ålesund for January to April 2011. The number of available O₃ sonde profiles and the number of collocated system profiles are indicated in brackets.

Table 2. Characteristics of the four assimilation systems: IFS-MOZART; BASCOE; SACADA; and TM3DAM. The horizontal and vertical resolution have been abbreviated to Hor. and Vert. resol., respectively. Freq. stands for frequency, and Assim. for assimilation.

... a high northern latitude station, Alert (82.49° N, 62.42° W, data gathered by the Meteorological Service of Canada); a tropical station, Chengkung (23.1° N, 121.365° E, data gathered by the Central Weather Bureau of Taiwan); and a southern latitude station, Syowa (69° S, 39.58° E, data gathered by the Japan Meteorological Agency). As indicated above, we used the observations gathered by the Brewer instruments at Alert and Chengkung, and those gathered by the Dobson spectrophotometer for Syowa. For Alert, we used the data for both Brewer instruments 019 (MKII) and 029 (MKV). The Brewer instrument (#061) at Chengkung is of type MKIV. Brewer data at µ > 3 were filtered out, where µ is the increase in the ozone optical path length due to the obliquity of the sun's rays (Brewer, 1973). The Dobson instrument (#119) at Syowa was replaced on 1 February 2011 by a new Beck model (#122).