Dynamic Linear Modeling estimates of long-term ozone trends from homogenized Dobson Umkehr profiles at Arosa/Davos, Switzerland

Six collocated spectrophotometers based in Arosa/Davos, Switzerland, have been measuring ozone profiles continuously, since 1956 for the oldest Dobson instrument and since 2005 for the Brewer instruments. The datasets of these two ground-based triads (three Dobsons and three Brewers) allow continuous intercomparisons and the derivation of long-term trend estimates. Two periods in the post-2000 Dobson D051 dataset show anomalies when compared to the Brewer triad time series: in 2011-2013, an offset has been attributed to technical interventions during the renewal of the spectrophotometer acquisition system, and in 2018, an offset with respect to the Brewer triad was detected following an instrumental change on the spectrophotometer wedge. In this study, the world's longest Umkehr dataset (1956-2020) is carefully homogenized using collocated and simultaneous Dobson and Brewer measurements. A recently published report (Garane et al., 2022) described results of an independent homogenization of the same [...]

Non-monotonic post-2000 trends are also reported in Arosio et al. (2019), where MLR trends are estimated from a merged SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY), OMPS (Ozone Mapping and Profiler Suite) and SAGE (Stratospheric Aerosol and Gas Experiment) II dataset over the 2003 to 2018 period. In their study, stratospheric tropical trends are shown to be negative during the 2004 to 2011 period and positive since 2012.
Trend estimates by DLM are recent in the literature. The first reports are from Laine et al. (2014), who developed the DLM analysis for trend evaluation and applied it to a merge of SAGE II and GOMOS (Global Ozone Monitoring by Occultation of Stars) data records. They compare trend estimates by DLM to trend estimates by piecewise MLR, the latter being described in a companion paper by Kyrölä et al. (2013). They conclude that DLM is a robust method well suited for modeling ozone time series changes (see Section 4.2). Their results show a statistically significant turnaround in the ozone time series after 1997 at mid-latitudes in the 35 to 55 km altitude range, and a more complex behavior of the ozone concentration than can be described by a simple piecewise multilinear regression model. Consequently, stronger local ozone variations (decrease or increase) are reported when estimated by DLM than by MLR. DLM has also been used (Ball et al., 2017) to estimate trends in the lower stratosphere based on the merged SWOOSH/GOZCARDS (Stratospheric Water and Ozone Satellite Homogenized/Global OZone Chemistry And Related Datasets for the Stratosphere) data records (Ball et al., 2018), as discussed previously.
More recently, DLM trend estimates on the SOS (SAGE II, OSIRIS (Optical Spectrograph and InfraRed Imaging System) and SAGE III) merged satellite data record have been reported (Bognar et al., 2022). They indicate a clear upper stratospheric ozone recovery with turnaround years that vary with latitude, a decrease since 2012 in the NH upper/middle stratosphere (for which a step in the OSIRIS dataset cannot be excluded as a cause), and a persistent decrease in the tropical lower stratosphere.
Dobson Umkehr ozone profile data records, which are distributed all around the world (Petropavlovskikh et al., 2022; Godin-Beekmann et al., 2022; Stone et al., 2015; Miyagawa et al., 2009; Garane et al., 2022), have been extensively used in the pre-1998 stratospheric trend estimates (Reinsel et al., 1989; Randel et al., 1999; Miller et al., 1995). Beginning in 1956 for the oldest, the Umkehr records were unique at that time, since satellite records only became available in 1979 (McPeters et al., 1996a; Bhartia et al., 2013) and ozonesondes, starting in 1960 (Smit et al., 2007), do not reach the upper stratosphere. Few studies based exclusively on Umkehr measurements report on NH post-2000 stratospheric ozone trends (Zanis et al., 2006; Park et al., 2013). Zanis et al. (2006) derived trends from the Arosa Dobson Umkehr dataset and reported statistically significant negative trends in the 1970 to 1995 period, and the first signs of a reversing trend in the lower and the upper stratosphere for the period 1996 to 2004. Since this turnaround was not statistically significant, the authors suggested that the dataset should be reevaluated at a future stage when more measurements become available. The homogenized Umkehr time series was used by Park et al. (2013) to derive trends using functional mixed models, and in the frame of the LOTUS project, which derived stratospheric ozone trends from improved and combined datasets (satellites, ground-based and models). The NH trends derived from the Umkehr datasets are in accordance with trends derived from other ground-based instruments for the pre-1997 period and the post-2000 period. Umkehr data also corroborate the satellite findings, showing highly statistically significant evidence of declining ozone concentrations since the mid-1980s in the upper stratosphere and post-2000 positive trends ranging between 2.0% and 3.1% per decade in the upper stratosphere of NH mid-latitudes.
The Umkehr data records are still extensively used for trend estimates along with datasets from other ground-based techniques, satellites and models (Steinbrecht et al., 2017; Harris et al., 2015; Petropavlovskikh et al., 2019; Tarasick et al., 2019; Godin-Beekmann et al., 2022). However, trend estimations on Brewer Umkehr data records are sparse. A study by Fragkos et al. (2018), using simple linear regression without explanatory variables on data from the Brewer 005 of Thessaloniki, reports statistically significant positive trends of 0.3%/year above 35 km in the NH for 1997-2017, and statistically non-significant trends below. Fitzka et al. (2004) report linear trends estimated with Sen's Q method, with significance assessed by the Mann-Kendall test. We innovate here by estimating Brewer Umkehr trends with explanatory variables included in the regression by DLM.
The dataset quality is of primary importance for trend studies, and multi-instrument comparison analyses are suited to assess the long-term stability of data records by estimating the drift and bias of instruments (Hubert et al., 2016). Using microwave radiometer data records, Bernet et al. (2019) showed the effect of instrumental artefacts on the long-term ozone profile trends.
Recently, trends estimated on updated and reprocessed ozone profile datasets have resulted in reduced trend uncertainties (Godin-Beekmann et al., 2022).
The quality of the Arosa/Davos total column ozone (TCO) dataset is currently under investigation through a reprocessing and a homogenization that use the ozone absorption cross sections from Serdyuchenko et al. (2014) (Gröbner et al., 2021) and consider the effects of the relocation from Arosa to Davos (Stübi et al., 2021b). In Arosa/Davos, the Dobson D051 is the station's primary instrument for the continuous Umkehr profile time series. It was dedicated exclusively to Umkehr measurements from 1988 until February 2013, when total ozone measurements were added to the schedule. The number of observations dedicated to Umkehr was not impacted, and the number of retrieved Dobson D051 Umkehr profiles has been kept at two profiles per day up to now. This observation frequency allows the computation of statistically reliable monthly means for trend estimations. However, the instrument operations recently suffered from anomalies following technical interventions.
Therefore, a complete homogenization of the Dobson D051 Umkehr data record has been performed and is described in this paper. Trend estimations free from known instrumental artefacts can then be derived from this dataset.
The paper is organized as follows: the data sources used in this study are described in section 2, with a special focus on the Umkehr method description. In section 3, the complete homogenization of the Dobson D051 Umkehr data record is detailed and compared to the homogenization performed by NOAA on the same data record in the frame of the ESA project WP-2190 (Garane et al., 2022). The MLR and DLM trend estimate methods are described in section 4, with a comparison of the trend values resulting from both regressions on the same Dobson D051 data record. Results of vertically resolved long-term trend estimates by DLM are presented and discussed in section 5, followed by conclusions in section 6.

Umkehr data records from Arosa/Davos
The Umkehr technique, which will be described in section 2.1.1, allows low-resolution retrieval of ozone profiles from measurements made by Dobson and Brewer spectrophotometers. TCO and ozone profile measurements with Dobson (and Brewer) spectrophotometers were performed at Arosa (46.82° N, 6.95° E) from 1926 (and 1988) to 2021 and at Davos since 2012. For a detailed description of the Dobson and Brewer spectrophotometers, we refer to Stübi et al. (2017a, 2021a). The progressive relocation of the Dobson and Brewer triads from Arosa to Davos (13 km north of Arosa and 260 m lower in altitude) between 2012 and 2021 is described and analyzed in Stübi et al. (2017b, 2021b) [...] the TCO level within the instrumental noise (Stübi et al., 2017a, 2021b).

The Umkehr method
The Umkehr method is based on the measurement of the ratio of downward scattered zenith sky radiation at two wavelengths in the UVB-UVA range from 300 nm to 330 nm (Huggins absorption band). The two wavelengths are subject to different strengths of ozone absorption, the shorter wavelength being more strongly absorbed by ozone. This ratio changes as a function of the solar zenith angle (SZA) during sunset and sunrise due to changes in the scattering height along the zenith (Mateer, 1965; Stone et al., 2015). As the SZA increases from 60° to 90°, the scattering height increases, and the two intensities decrease because of increased absorption and scattering by ozone and air molecules. As the shorter wavelength has a higher scattering point than the longer wavelength, its intensity decreases faster than the longer wavelength intensity as long as both scattering heights are below the ozone maximum. At high SZA, the scattering height for the shorter wavelength is above the ozone maximum while the scattering height of the longer wavelength is still below it. The shorter wavelength intensity then decreases less rapidly than the longer wavelength intensity, and the ratio reaches a maximum at high SZA, called the Umkehr effect (Götz et al., 1934).
The Umkehr method allows the retrieval of ozone profiles from the measurements by Dobson and Brewer spectrophotometers.
We describe the particularities of Dobson and Brewer Umkehr measurements in the following subsections.

Umkehr measurements by Dobson spectrophotometer
The logarithm of the ratio of the two wavelength intensities (R values) is converted to radiance using calibration tables (RtoN [...]
The 12 N values (further called N curve) are screened for clear sky conditions and corrected for cloud influence using a nearby UV/VIS lux meter. This empirical correction is based on the relation between the UV/VIS intensity of clear days (within the same month, for each SZA) and the UV/VIS intensity variation during the cloudy N curve measurement (see Basher, 1982). This cloud correction assumes a uniform cloud layer and may fail for more complicated cloud structures.

Haze correction is not included. It was shown that the effect of small cloud corrections of the N values on the vertically resolved ozone trends is negligible. For these reasons, only profiles retrieved from N curves without any cloud correction or with a small correction are considered for our study.

Umkehr measurements by Brewer spectrophotometer
The intensities of eight wavelengths (306.3, 310.1, 313.5, 316.8, 320.1, 323.2, 326.5, and 329.5 nm) are measured quasi-simultaneously for solar zenith angles ranging from 60° to 90°. A holographic grating is used as the dispersive element for the solar radiation, which then passes through narrow slits centered on the desired wavelengths. Mark II Brewer instruments use a single holographic grating and therefore only one dispersive element to separate the wavelengths. Mark III Brewer instruments are double monochromators that use two holographic gratings (Staehelin et al., 2003).
[...]
In the layers below Dobson Layer (DL) 4, peaking at 20 km, the Averaging Kernels (AKs, not shown) of both instruments show sensitivity of the observations to ozone variability in several layers, and therefore the partitioning of the retrieved ozone into individual layers is based on the a priori information.
The quality check of the retrieved ozone profile includes assessment of the number of iterations (fewer than four is considered a good profile) and the condition that the difference between observed and retrieved Umkehr observations at all SZAs remains within measurement uncertainty (Petropavlovskikh et al., 2022).
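The two acceptance criteria above can be written as a short check. The following Python function is only an illustrative sketch: the function name, the argument names and the per-SZA uncertainty array are assumptions made for this example and are not taken from the operational retrieval code.

```python
import numpy as np

def profile_passes_qc(n_iterations, n_observed, n_retrieved, n_uncertainty,
                      max_iterations=4):
    """Return True if a retrieved profile meets both criteria quoted in the text.

    n_iterations  : number of iterations used by the retrieval
    n_observed    : observed Umkehr values, one per SZA
    n_retrieved   : values simulated from the retrieved profile, one per SZA
    n_uncertainty : assumed measurement uncertainty, one per SZA (hypothetical input)
    """
    residual = np.abs(np.asarray(n_observed, dtype=float)
                      - np.asarray(n_retrieved, dtype=float))
    # Criterion 1: fewer than four iterations.
    converged_quickly = n_iterations < max_iterations
    # Criterion 2: observed minus retrieved stays within uncertainty at all SZAs.
    within_uncertainty = bool(np.all(residual <= np.asarray(n_uncertainty, dtype=float)))
    return converged_quickly and within_uncertainty
```

A profile is kept only if both conditions hold, so a single SZA with a residual above its uncertainty is enough to reject it in this sketch.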

A generic stray light correction can be applied to reduce systematic biases in the Dobson Umkehr retrieved profiles (Petropavlovskikh et al., 2011). The NOAA version of the Dobson retrieval applies this correction while the MeteoSwiss (MCH) version does not. The seasonal bias between the Dobson and Brewer ozone records is reduced when a stray light correction is applied to the Dobson record. Moreover, as a step change in the record can be related to a change in the amount of stray light, a proper correction of the stray light effect can help to reduce the magnitude of the step.

The Dobson D051 Umkehr observation dataset is regularly archived at the World Ozone and Ultraviolet Radiation Data Centre (WOUDC, www.woudc.org). The Brewers are part of the EUBREWNET network (http://www.eubrewnet.org/eubrewnet), where raw data files are available to registered users.

Aura MLS
The Microwave Limb Sounder (MLS) is a microwave limb-sounding radiometer on board the Aura Earth observing satellite, [...] levels p_i and converted to DU following Godson (1962), with C = 0.00079 DU hPa⁻¹ ppbv⁻¹ and X̄ the ozone mean VMR in ppbv. Approximate heights are given as in Petropavlovskikh et al. (2022).
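As a worked illustration of this unit conversion, the sketch below assumes that the layer ozone amount is the product of C, the layer-mean VMR and the pressure thickness of the layer, which is the only form consistent with the units of C quoted above. The function name and the example pressure levels are hypothetical.

```python
C_DU_PER_HPA_PPBV = 0.00079  # DU hPa^-1 ppbv^-1, value quoted in the text

def layer_ozone_du(mean_vmr_ppbv, p_bottom_hpa, p_top_hpa):
    """Partial ozone column (DU) of a layer bounded by two pressure levels.

    Assumed form: column = C * mean VMR * (p_bottom - p_top).
    """
    if p_bottom_hpa < p_top_hpa:
        raise ValueError("p_bottom_hpa must be greater than or equal to p_top_hpa")
    return C_DU_PER_HPA_PPBV * mean_vmr_ppbv * (p_bottom_hpa - p_top_hpa)

# Example with made-up numbers: a mean VMR of 5000 ppbv (5 ppmv)
# in a layer between 31.7 hPa and 15.8 hPa.
example_du = layer_ozone_du(5000.0, 31.7, 15.8)
```

With these illustrative inputs the layer holds roughly 63 DU, the order of magnitude expected for a middle-stratospheric Umkehr layer.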
3 Homogenizations of the Dobson D051 dataset

As the quality of a dataset is essential in order to estimate reliable long-term trends with uncertainties as reduced as possible, we first investigate the quality of the Arosa/Davos longest Umkehr ozone profile dataset and proceed to its detailed homogenization.
The world's longest Umkehr ozone profile record was recently impacted by short-term anomalies due to instrumental changes and technical issues. It has been homogenized by two simultaneous but independent studies, one by the principal [...] reported in the metadata. If we cannot see any indication in the metadata for an instrumental drift, no correction is applied.
For each period that requires a correction (see Table 2), we apply to the N values an SZA-dependent offset that is constant over the period to be corrected. The offset is calculated such that the difference averaged over the period and over the reference instruments (two Dobsons in 2003 or three Brewers after 2011) matches the difference averaged over two years before and two years after the period and over all reference instruments (see Fig. 3).
Table 2. Columns: year of Dobson D051 anomaly; technical issue/instrumental change.
Figure 3. [...] after (period P3) the Dobson D051 problematic period (period P2). All values are averaged over two-year periods. N2corr_SZA is the corrected N value in period P2.
In case of a step in the time series (e.g. in July 2003 and in May 2018), the period P2 does not exist and should not be considered in Fig. 3. The corrected N value N2corr_SZA of period P1 is then obtained following equations 4 and 5.
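The offset logic described in this section can be sketched in a few lines of Python. This is a minimal illustration on synthetic data, not the MCH processing code: the function name, the array layout (months by SZA), the sign convention of the correction and the use of NumPy are assumptions made for the example.

```python
import numpy as np

def period_offset(diff, target, reference):
    """SZA-dependent offset of a target period relative to reference periods.

    diff      : array (n_months, n_sza), D051 N values minus the mean of the
                reference instruments
    target    : boolean mask (n_months,) of the period to correct
    reference : boolean mask (n_months,) of the periods used as reference
    """
    return np.nanmean(diff[target], axis=0) - np.nanmean(diff[reference], axis=0)

# Synthetic example: 2 years before (P1), 2 problematic years (P2), 2 years after (P3).
rng = np.random.default_rng(0)
n_months, n_sza = 72, 12
month = np.arange(n_months)
p1, p2, p3 = month < 24, (month >= 24) & (month < 48), month >= 48

refs = rng.normal(100.0, 5.0, size=(3, n_months, n_sza))  # three reference instruments
ref_mean = refs.mean(axis=0)
d051 = ref_mean + 1.0          # constant instrument bias, which is preserved
d051[p2] += 2.5                # artificial anomaly during P2

diff = d051 - ref_mean
offset = period_offset(diff, p2, p1 | p3)   # one value per SZA
d051_corrected = d051.copy()
d051_corrected[p2] -= offset                # constant over the corrected period
```

For a step rather than a bounded anomaly, the same helper could be called with P1 as the target and P3 as the reference, which mirrors the case described above where P2 does not exist.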

NOAA homogenization of the Dobson D051 dataset
In parallel but in separate work, a homogenization and a stray light correction of the same Dobson dataset have been performed by NOAA (Garane et al., 2022; Petropavlovskikh et al., 2022). They use the comparison of the Dobson D051 [...]
The NOAA homogenized Dobson D051 dataset has been compared to satellite data records, including Aura MLS, in Garane et al. (2022). The agreement is within ±5 % in the upper and middle stratosphere, and larger biases (up to 10 %) are found in the lower stratosphere.

Comparison of the homogenizations of the Dobson D051 dataset
The NOAA homogenization has been developed to remove artificial steps in the Umkehr ozone profile records and to reduce the bias relative to other ozone observing systems. The MCH homogenization approach is different in that the homogenization process aims to remove artificial steps in the Dobson D051 Umkehr profile record while maintaining the constant offset value [...]. For the same SZA, the amount of correction is different for each monthly mean value of the time series, in proportion to the seasonal changes in total column ozone (Fig. 4a). This is not corrected for in the MCH N value homogenization. The years around 1982 and 1992 are periods of volcanic eruptions (El Chichón and Pinatubo), which are corrected by the NOAA homogenization but not considered in the MCH homogenization, as the Umkehr retrieval does not account for the change in atmospheric scattering due to aerosol injection (Petropavlovskikh et al., 2022). ([...] Fig. 1 for an example).
In order to evaluate the effects of both homogenizations on the Dobson D051 time series, monthly mean relative differences to the Aura MLS data record are plotted in Figure 5 for two altitude levels, i.e. DL5 (25 km) in the middle stratosphere and DL8 (40 km) in the upper stratosphere. The relative difference of the Brewer B040 time series is also shown for the same layers.
The Brewer B040 relative difference shows a constant offset to Aura MLS but clear anomalies in 2012 and 2013 in DL5 (Fig. 5a). The Dobson D051 dataset homogenized by NOAA shows very good agreement with Aura MLS in both DL5 and DL8.

The small mean bias is a result of the NOAA optimization of the stray light correction. Therefore, it is not the magnitude of the bias between the homogenized dataset and Aura MLS but its variation (the bias should be constant) which should be considered here. No clear offset in the difference to Aura MLS between the NOAA and the MCH homogenized records is found in DL5. The variability of the differences to Aura MLS of each dataset appears higher after 2010, while the mean values are constant. However, the slight underestimation of the MCH homogenization since 2017 seems to match the Brewer B040 difference to Aura MLS in DL5 (Fig. 5a). After 2017, the relative difference to Aura MLS of D051 homogenized by MCH and of the collocated B040 is within -5% to -10%, while the D051 homogenized by NOAA lies within -2% of Aura MLS.
A clear correction of the 2011-2013 period is visible in DL8 (Fig. 5b). Apart from the respective mean offsets of the MCH and NOAA homogenized datasets to Aura MLS, a slight overestimation by the NOAA homogenization is visible in 2012 and 2013.
However, the Brewer B040 relative difference to Aura MLS is also slightly smaller during this time range, when the Brewer instrument had not undergone any technical interventions. This is particularly visible in the anomaly time series of B040 in Fig. 5c. As the MCH homogenization relies on the collocated Brewer datasets, it can take into account the local variability of the DL8 ozone content that the M2GMI model, the basis for the NOAA homogenization, probably does not consider. As atmospheric processes are more homogeneous in the stratosphere than in the troposphere, the M2GMI ozone profiles should be representative of stratospheric ozone variability. Nevertheless, it is possible that other atmospheric interferences (e.g. aerosols) can impact the Dobson readings of zenith sky radiance, which would also impact Brewer observations, but might not be fully included in the M2GMI simulations.
Due to the occurrence of an anomaly in 2018, which is particularly visible in DL8 for all datasets (Fig. 5c), the last corrections applied to the dataset by the NOAA and the MCH homogenizations differ.
As the MCH homogenization considers a step correction in May 2018, the ozone increase during the 2018 anomaly is accounted for in the mean difference of the D051 dataset to the Brewer datasets in both the pre-step and the post-step periods. As a result, the calculated offset is small. The NOAA homogenization method detects a change in the Umkehr ozone with respect to the M2GMI record that starts a year earlier, in 2017. The ozone increase during the 2018 anomaly is accounted for only in the mean difference to M2GMI of the post-step period of the D051 dataset. Moreover, this post-step difference is overestimated, as M2GMI does not seem to simulate any significant anomaly in that period. As a result, the calculated offset, applied in 2017, is probably overestimated.
Now that the Dobson D051 dataset is fully homogenized, vertically resolved long-term trends can be estimated with limited influence of instrumental artefacts.
Two regression methods for trend estimation are described in this section. First, we describe the common and widely used MLR and second, we detail the more recent DLM regression method. Trend estimates by both methods are then compared on the case study of the MCH homogenized Dobson D051 dataset.

MLR trend estimation method
[...] points are considered with equal weights, and the uncertainty of the fit parameters is estimated from the regression residuals.
Residual autocorrelations are accounted for by applying a Cochrane-Orcutt transformation to the model (Cochrane and Orcutt, 1949).
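To illustrate the Cochrane-Orcutt step, the following Python sketch fits an equally weighted linear regression and then iterates the transformation on a synthetic monthly series. It is a generic textbook version under stated assumptions: the regressors (intercept, linear trend, annual harmonic), the synthetic data and all names are invented for the example and are not the explanatory variables or the code used in this study.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares with equal weights for all points."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def cochrane_orcutt(X, y, n_iter=20, tol=1e-8):
    """Iterative Cochrane-Orcutt estimation for AR(1) residuals."""
    beta = ols(X, y)
    rho = 0.0
    for _ in range(n_iter):
        resid = y - X @ beta
        # Lag-1 autocorrelation of the residuals of the untransformed model.
        rho_new = (resid[1:] @ resid[:-1]) / (resid[:-1] @ resid[:-1])
        # Quasi-difference both sides, then refit; the intercept column is
        # transformed too, so beta stays on the original scale.
        y_star = y[1:] - rho_new * y[:-1]
        X_star = X[1:] - rho_new * X[:-1]
        beta = ols(X_star, y_star)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho

# Synthetic monthly series: trend of -3 units per decade plus AR(1) noise.
rng = np.random.default_rng(0)
n = 240
t_decades = np.arange(n) / 120.0
phase = 2.0 * np.pi * np.arange(n) / 12.0
X = np.column_stack([np.ones(n), t_decades, np.sin(phase), np.cos(phase)])
noise = np.zeros(n)
for k in range(1, n):
    noise[k] = 0.5 * noise[k - 1] + rng.normal(0.0, 0.5)
y = 300.0 - 3.0 * t_decades + 5.0 * np.sin(phase) + noise

beta, rho = cochrane_orcutt(X, y)  # beta[1] is the trend per decade
```

The point estimate of the trend usually changes little compared with plain least squares; the main effect of the transformation is on the residuals from which the parameter uncertainties are derived.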

DLM trend estimation method
Dynamic Linear Modeling allows the determination of a non-linear time-varying trend from a time series of monthly means. It is a Bayesian regression approach which fits the data time series with a non-linear time-varying trend, regression coefficients for explanatory variables, and seasonal and annual modes, considering their uncertainties and an autoregressive component.
The trend is allowed to vary smoothly in time, and its degree of non-linearity is inferred from the data, as is the turnaround period. We use the code by Alsing (2019) [...] trends. In order to check the agreement of trends derived from different datasets, uncertainties including a term accounting for remaining steps and for inhomogeneities in the dataset (Bernet et al., 2021) should be considered. Figure 6 shows the long-term trend estimates from the MCH homogenized Dobson D051 dataset by DLM (in blue, with ±2 sigma uncertainty as shaded area) and by MLR (PWLT, in black, with ±2 sigma uncertainty as shaded area) for the same explanatory variables at three altitude levels.
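To make the idea of a smoothly time-varying trend concrete, the sketch below implements a bare local linear trend model with a Kalman filter and a Rauch-Tung-Striebel smoother. It is not the code by Alsing (2019) and it is much simpler than the DLM described in this section: there are no explanatory variables, no seasonal modes and no autoregressive term, and the two noise levels are fixed by hand instead of being inferred from the data in a Bayesian way. All names and numbers are illustrative assumptions.

```python
import numpy as np

def local_linear_trend_smoother(y, sigma_obs, sigma_slope):
    """Smoothed level and slope of a local linear trend state-space model.

    State = [level, slope]; the slope follows a random walk with standard
    deviation sigma_slope, which controls how non-linear the trend may be.
    """
    n = len(y)
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition
    H = np.array([1.0, 0.0])                 # only the level is observed
    Q = np.diag([0.0, sigma_slope ** 2])     # process noise on the slope
    R = sigma_obs ** 2                       # observation noise variance

    x_pred = np.zeros((n, 2)); P_pred = np.zeros((n, 2, 2))
    x_filt = np.zeros((n, 2)); P_filt = np.zeros((n, 2, 2))
    x = np.array([y[0], 0.0])
    P = np.diag([1e4, 1e4])                  # vague initial state
    for k in range(n):
        if k > 0:                            # prediction step
            x = F @ x
            P = F @ P @ F.T + Q
        x_pred[k], P_pred[k] = x, P
        S = H @ P @ H + R                    # innovation variance
        K = P @ H / S                        # Kalman gain
        x = x + K * (y[k] - H @ x)           # update step
        P = P - np.outer(K, H @ P)
        x_filt[k], P_filt[k] = x, P

    x_smooth = x_filt.copy()                 # backward (RTS) smoothing pass
    for k in range(n - 2, -1, -1):
        G = P_filt[k] @ F.T @ np.linalg.inv(P_pred[k + 1])
        x_smooth[k] = x_filt[k] + G @ (x_smooth[k + 1] - x_pred[k + 1])
    return x_smooth[:, 0], x_smooth[:, 1]

# Synthetic series: decline for 10 years, then recovery, plus white noise.
rng = np.random.default_rng(1)
t = np.arange(240)
truth = np.where(t < 120, -0.05 * t, -6.0 + 0.05 * (t - 120))
y = truth + rng.normal(0.0, 0.2, size=t.size)
level, slope = local_linear_trend_smoother(y, sigma_obs=0.2, sigma_slope=0.002)
```

In this toy example the smoothed slope changes sign around the turnaround without any inflection date being prescribed, which is the property that distinguishes the DLM trend from a piecewise linear fit.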

Comparison of MLR and DLM trend estimation: case of Dobson D051 dataset
Overall, the trends are similar but differ over short timescales because the two methods represent the nonlinearity of the changes in the data record differently. The advantage of DLM lies in the estimation of a smoothly varying trend without assuming any shape.
The inflection year depends on the method: while the inflection point is fixed by the MLR PWLT (1998 in this case, see [...] the fraction of the KDE above/below zero, slightly differs from the MLR uncertainty estimates. In the lower stratosphere, for DL4 (Fig. 6a and d), the DLM estimate is significantly negative at the 95% level. In the upper stratosphere, for DL8 (Fig. 6c and f) [...]

[...] implemented the Umkehr Brewer retrieval algorithm. L.F. is responsible for the Aura MLS measurements. All co-authors contributed to the preparation of the manuscript.
Competing interests. The authors have no competing interests.