Articles | Volume 20, issue 16
Research article
26 Aug 2020
Research article |  | 26 Aug 2020

Statistical regularization for trend detection: an integrated approach for detecting long-term trends from sparse tropospheric ozone profiles

Kai-Lan Chang, Owen R. Cooper, Audrey Gaudel, Irina Petropavlovskikh, and Valérie Thouret

Detecting a tropospheric ozone trend from sparsely sampled ozonesonde profiles (typically once per week) is challenging due to the short-lived anomalies in the time series resulting from ozone's high temporal variability. To enhance trend detection, we have developed a sophisticated statistical approach that utilizes a geoadditive model to assess ozone variability across a time series of vertical profiles. Treating the profile time series as a set of individual time series on discrete pressure surfaces, a class of smoothing spline ANOVA (analysis of variance) models is used for the purpose of jointly modeling multiple correlated time series (on separate pressure surfaces) by their associated seasonal and interannual variabilities. This integrated fit method filters out the unstructured variation through a statistical regularization (i.e., a roughness penalty) by taking advantage of the additional correlated data points available on the pressure surfaces above and below the surface of interest. We have applied this technique to the trend analysis of the vertically correlated time series of tropospheric ozone observations from (1) IAGOS (In-service Aircraft for a Global Observing System) commercial aircraft profiles above Europe and China throughout 1994–2017 and (2) NOAA GML's (Global Monitoring Laboratory) ozonesonde records at Hilo, Hawaii, (1982–2018) and Trinidad Head, California (1998–2018). We illustrate the ability of this technique to detect a consistent trend estimate and its effectiveness in reducing the associated uncertainty in the profile data due to the low sampling frequency. We also conducted a sensitivity analysis of frequent IAGOS profiles above Europe (approximately 120 profiles per month) to determine how many profiles in a month are required for reliable long-term trend detection. When ignoring the vertical correlation, we found that a typical sampling strategy (i.e. four profiles per month) might result in 7 % of sampled trends falling outside the 2σ uncertainty interval derived from the full dataset with an associated 10 % of mean absolute percentage error. Based on a series of sensitivity studies, we determined optimal sampling frequencies for (1) basic trend detection and (2) accurate quantification of the trend. When applying the integrated fit method, we find that a typical sampling frequency of four profiles per month is adequate for basic trend detection; however, accurate quantification of the trend requires 14 profiles per month. Accurate trend quantification can be achieved with only 10 profiles per month if a regular sampling frequency is applied. In contrast, the standard separated fit method, which ignores the vertical correlation between pressure surfaces, requires 8 profiles per month for basic trend detection and 18 profiles per month for accurate trend quantification. While our method improves trend detection from sparse datasets, the key to substantially reducing the uncertainty is to increase the sampling frequency.

1 Introduction

The vertical profile is a type of time series data that reports the composition (e.g., ozone, water vapor, carbon monoxide) or thermodynamic properties (e.g., temperature, relative humidity, wind speed) of the atmosphere from the surface to an altitude that can range from a few tens of meters (e.g., tethered weather balloons) to more than 30 km (e.g., ozonesondes). Conventionally, the analysis of trends based on vertical profile data is conducted on particular altitude bins or pressure levels which are treated as independent time series (e.g., Miller et al.2006; Harris et al.2015; Lossow et al.2019; Petropavlovskikh et al.2019), but due to the low sampling frequency of vertical profiles, the vertically distributed trend estimates may be inconsistent from one layer to the next. Trace gas or thermodynamic properties on adjacent pressure or altitude levels are often correlated, and the treatment of the levels as independent time series ignores the potential vertical interconnection. In order to reduce the estimation uncertainty of a trend on a particular level, methods have been developed that produce an area average from multiple time series in close spatial proximity (Wigley et al.1984; Chang et al.2017) or that employ a dimension reduction technique (e.g., principal component analyses) to project multiple layers of data into a reduced set (Ghil et al.2002; Meiring2007; Park et al.2013).

In terms of producing an area average based on multiple stations dispersed across a given region, a typical procedure is to create a surface gridded product, which usually aggregates all available monitoring stations within the grid cell without considering spatial sampling and irregularities (Simmons et al.2010). Another example is the trend analysis of ozonesonde profiles based on units of partial pressure with observations integrated into a limited number of layers for the simplification of the analysis and calculation of column sum (Tiao et al.1986; Miller et al.2006). The dimension reduction technique, e.g., principal component analysis, also known as the empirical orthogonal function, is commonly used to reduce correlated multivariate data into uncorrelated vectors that maximize the explained variance with as few vectors as possible. This approach is used to calculate certain climate indices, such as Atlantic and Antarctic oscillations (, last access: 19 August 2020). However, this technique is a purely numerical algorithm that is not based on physical principles; thus a meaningful interpretation of the outcome can be subjective or prohibitive.

The aim of such aggregations, either through averaging or dimension reduction, is to enhance a certain common underlying signal (or signals) or to produce regional means (Wigley et al.1984; Ghil et al.2002). While the issue of spatial irregularity can now be tackled by a wide range of sophisticated spatial interpolation techniques (Stein1999), the averaging approach has not been thoroughly investigated in an objective manner with respect to enhancing the signal-to-noise ratio. In terms of calculating long-term trends from sparse time series comprised of vertical ozone profiles, this study seeks a statistical framework to accommodate a broad set of correlations rather than trying to maximize the signal-to-noise ratio (e.g., trend value and its 2σ uncertainty) by deliberately selecting a subset of correlated time series. Specifically, this paper aims to extend the traditional multivariate linear regression model by applying a smoothing spline ANOVA (analysis of variance) framework (Wang1998; Gu2004, 2013) to account for the vertical profile variability without a dimension reduction; therefore, a physical meaning in each pressure layer could be maintained. A smoothing spline is a nonlinear curve or surface fitting technique commonly used to model longitudinal data (for example a large cohort study in biostatistics) (Wahba et al.1995; Verbyla et al.1999; Shaddick and Zidek2015), and it has been widely applied to a variety of spatial pattern and/or temporal trend studies (Schlink et al.2006; Augustin et al.2009; Park et al.2013; Wood et al.2015; Chang et al.2017). We thus propose a geostatistical approach to fit the vertical profiles jointly as a smooth field for understanding nonlinear interactions and to determine the long-term vertical variability.

Our method uses a two-dimensional geoadditive model for carrying out the multiple correlated time series analyses on every level of the vertical profiles. This approach borrows the strength of spatial correlation models, but instead of working on dimensions of latitude and longitude, we replace the geographical coordinates with a temporal index and altitude bins. This approach also enables us to identify the common structures among correlated time series (e.g., ozone observations on adjacent layers of the atmosphere) through their seasonality and interannual variability and to filter out unstructured variations from the underlying signal by a regularization technique. We demonstrate that this method reduces the short-lived anomalies in an irregular and/or sparse time series by taking advantage of the additional information in adjacent layers of the vertical profiles and yields a trend with reduced uncertainty. While our method was developed for tropospheric ozone profiles, it can also be applied to the stratosphere, to profiles of other trace gases or atmospheric thermodynamic properties, and to profiles of oceanic or geologic properties.

Section 2 begins with a brief introduction of the time series decomposition and smoothing spline ANOVA framework. We then propose a statistical methodology to assess the profile variability. Section 3 applies this methodology to two tropospheric ozone profile datasets measured by (1) IAGOS (In-service Aircraft for a Global Observing System) commercial aircraft above Europe and China (including a probability test of how the trend precision and accuracy change according to sampling frequency) and (2) ozonesondes launched from Hilo, Hawaii, and Trinidad Head, California, by NOAA's Global Monitoring Laboratory (GML). In Sect. 4, we discuss the potential extension and benefit of this trend technique and provide a summary of our ozone profile trend analysis.

2 Model and data

2.1 Smoothing spline decomposition

Conventional long-term trend detection of a time series is influenced by many factors such as the number of years of data, the magnitude of the uncertainty and autocorrelation of the residuals (Weatherhead et al.1998). We consider a long-term time series where observations are averaged into monthly means over a number of years in discrete altitude bins or pressure levels. The basic statistical model for an individual time series can be written as

(1) y ( t ) = seasonal cycle + trend + residual ,

where y(t) is the monthly mean ozone value at time t. A regular seasonal cycle can be estimated by Fourier harmonic series, by cyclic spline functions or by simply averaging the monthly values from the long-term climatology. The specification of a trend component is, however, dependent on the goals of the analysis, and it must be flexible enough to represent the underlying structure in the data (e.g., Augustin et al.2009; Park et al.2013; Wood et al.2015; Chang et al.2017); the residual series is generally assumed to be an autoregressive process in this scenario.

Ozone variations are in general strongly dependent on the altitude or pressure level. For a given altitude or pressure level, our method borrows the information from the neighboring altitudes at approximately the same time for a more stable trend estimate (e.g., observations at nearby altitudes are more highly correlated with the layer of interest than those further away). In order to explicitly account for common structures among correlated time series, a class of generalized additive models (GAMs; Hastie and Tibshirani1990) for the functional decomposition can form the general framework for joint trend estimates. Additive models extend traditional multivariate linear models by replacing linear covariates with smooth functions, each represented by various types of spline functions (depending on the nature of the covariate) (Wahba1990; Wood2006; Wood et al.2015; Chang and Guillas2019). Nonlinear variability and complex interactions can thus be modeled by using the combination of multiple spline basis representations within the GAM framework.

With an extension of a single time series to a vertical profile series (let y(t,h) be the monthly mean ozone value at altitude or pressure level h and time t), the statistical model can be extended to a two-dimensional signal decomposition:

(2) y ( t , h ) = f seasonal ( ω , h ) + f interannual ( τ , h ) + ϵ ,


ω=1,,12 monthly index;τ=1,,ny interannual index (period of coverage);t=ω×τ=1,,12×ny total temporal index;h=1,,nh altitude bin or pressure level;

and where the temporal index t is decomposed into monthly index ω and interannual index τ, and fseasonal(ω,h) and finterannual(τ,h) are two-dimensional smooth functions representing the seasonal and interannual ozone variations across different altitudes. Specifically, fseasonal(ω,h) represents a smooth matrix including a single mean for every month and every altitude (each cross section is an annual cycle for the corresponding altitude), and finterannual(τ,h) can be seen as a de-seasonalized annual mean anomaly series at each altitude representing the year-to-year variability. The theory of the smoothing spline ANOVA is to decompose a multivariate function into a sum of multiple independent functions with main components and all possible two-way (or higher) interactions (up to as many variables as required) (Wang1998; Gu2004, 2013) in order to understand the contribution from each single component. In terms of our trend study, we only consider the vertical distribution of two basic elements in the time series: seasonal and interannual variability. A higher order of interactions can be included, but it might not be helpful or informative and could require a substantial amount of computational complexity.

The two-dimensional penalized regression splines were chosen for representing both smooth components (Wood2003), which can be expressed through a basis representation:

fseasonal(ω,h)=k1=1K1ck1ψk1(ω,h) and finterannual(τ,h)=k2=1K2ck2ψk2(τ,h),

where {ψk1(ω,h)} and {ψk2(τ,h)} are the collections of smoothing spline basis functions evaluated on the seasonal grid (ω,h) or the interannual grid (τ,h), and {ck1} and {ck2} are the collections of associated coefficients (including an overall intercept). Under this formulation, we account for temporal and vertical ozone variation in a spatial correlation model (i.e., ozone with different altitudes and different time lags can be correlated, which are not handled simultaneously by the standard multivariate linear regression model). Technical details of the statistical fitting procedure are described in Appendix A.

The specification of our model enables us to decompose the vertical profile into two geoadditive fields: seasonal and interannual components, according to their associated variabilities. These surfaces are defined by approximating the overall monthly means and annual mean anomalies in each layer or altitude through a statistical optimization. Rather than directly specifying a linear component into a multivariate linear regression model, our functional approach aims to partition monthly mean ozone profiles into components attributable to different sources of variation before any attempt at trend detection. We choose the thin-plate smoothing spline as the spatial basis function because it is computationally efficient and because it avoids the problem of choosing “knot locations” (Wood2006). Note that the focus of this study is jointly modeling multiple correlated time series at different heights above the same region, which differs from a trend study of a series of measurements with uneven temporal sampling at a single altitude (Tiao et al.1990; Weatherhead et al.2017) or irregularly distributed observations from a monitoring network (Stein1999; Chang et al.2015, 2017). Our problem can be seen as decomposing a two-dimensional signal according to the associated variability on a regular grid cell (i.e., based on the monthly and annual index at a fixed altitude bin). Specifically, we implemented the penalized regression spline decomposition using the thin-plate splines available from the R package mgcv (Wood2006; R Core Team2013).

2.2 Ozone profile observations

Our new trend calculation methodology is applied to the long-term tropospheric ozone vertical profile datasets measured by (1) IAGOS commercial aircraft above western Europe and China at 50 hPa intervals from 950 to 250 hPa for the period 1994–2017 and (2) ozonesondes launched from Hilo, Hawaii, (1982–2018) and Trinidad Head, California, (1998–2018) with a vertical resolution of 100 m from the surface to 15 km. For consistency the vertical coordinate of the ozonesondes is converted from meters to pressure levels.

The IAGOS program has measured ozone worldwide since 1994, using instruments on board commercial aircraft of internationally operating airlines (, last access: 19 August 2020, Marenco et al.1998; Petzold et al.2015). This analysis utilizes the same subset of IAGOS ozone profiles above western Europe reported by Cooper et al. (2020). Ozone is measured using a dual-beam UV-absorption monitor (time resolution of 4 s) with an accuracy estimated at about ± 2 nmol mol−1 +2 % (Thouret et al.1998; Nédélec et al.2015). Because most IAGOS aircraft have belonged to airlines based in Europe since the program began in 1994, western Europe is the most frequently profiled region in the world. Above western Europe (47–55 N, 0–15 E), IAGOS aircraft measured 34 600 ozone profiles between 1994 and 2016 with 99 % of profiles from Frankfurt, Paris, Munich, Brussels, Dusseldorf and Amsterdam. The 22-year time series has one data gap spanning March–August 2010 when instrument failures resulted in just a few sporadic profiles (Petetin et al.2016b) (the 2010 annual mean is represented by the other 6 months, after adjustments for seasonality). The lower tropospheric portions of the profiles have been shown to be regionally representative of ozone across western Europe (Petetin et al.2018). The sampling frequency varies according to airline schedules, but on average, four profiles are recorded somewhere in this region every day. IAGOS aircraft can take off and land at any time of day, and all data are used in this analysis. No diurnal ozone cycle occurs in the free troposphere above Europe (above the 750 hPa level), although a clear ozone cycle occurs in the boundary layer, and it is strongest below 950 hPa (Petetin et al.2016a). Our analysis only calculates ozone trends from the 950 hPa surface and above, which avoids the lowest layers with very strong diurnal cycles. We analyze a similar IAGOS dataset based on a compilation of ozone profiles from airports in northeastern China and Seoul, South Korea. The European and Chinese datasets focus on limited regions of central and western Europe (47–55 N, 0–15 E) and northeastern China/South Korea (28–45 N, 110–135 E).

NOAA's Global Monitoring Laboratory has measured ozone profiles above Hilo, Hawaii, (19.72 N, 155.05 W) since 1982 and Trinidad Head, California (41.06 N, 124.15 W), since 1997 using balloon-borne ozonesondes equipped with electrochemical concentration cell (ECC) sensors that have an accuracy of ±10 % in the troposphere (Johnson et al.2002; Smit et al.2007). The ozonesondes have been launched on a weekly schedule since 1992 (Oltmans et al.1996) with a lower frequency of approximately 2–4 profiles per month from 1982 to 1991. Compared to UV-absorption measurements, ECC ozonesondes show a modestly (∼1 %–5 % ±5 %) high bias in the troposphere but no change over time (Tarasick et al.2019). The Hilo ozonesonde record has been reprocessed according to the recommendations of the Ozonesonde Data Quality Assessment activity (Smit et al.2012; Smit and ASOPOS2014; Sterling et al.2018) to remove artifacts introduced by changes of sonde type, manufacturer, strength of sensing solution and preparation procedure. The sampling frequency at Trinidad Head has been approximately once per week since August 1997.

3 Results

The IAGOS dataset in western Europe is a regional aggregation of profiles from several airports. The profiles are measured by several aircraft, and on average there are approximately four profiles per day within the study region. Several studies have demonstrated that IAGOS data above western Europe are consistent with ozonesonde records in the upper troposphere–lower stratosphere (UTLS) (Staufer et al.2013, 2014), and the data have compared well to regional surface and free tropospheric ozonesonde records (Thouret et al.1998; Logan et al.2012; Petetin et al.2018). Petetin et al. (2018) validated IAGOS data in the lowest troposphere over western Europe against rural monitoring sites at various elevations and concluded that those observations can be used to study the air quality in the agglomeration. Nevertheless, to reduce the influence from the strong diurnal cycle at the surface, all observations below the 950 hPa level and within 300 m of the surface were removed from the analysis (Gaudel et al.2020).

With such a high sampling frequency, the data are expected to be far less inconsistent than a sparsely sampled ozonesonde time series (one to three profiles per week) and highly representative of the true ozone variability above western Europe. The IAGOS dataset in China is also a regional aggregation, but the time series has a period of low sampling frequency (2007–2010) with many months containing no data. In contrast, the ozonesonde records at Hilo and Trinidad Head are expected to have greater uncertainty due to the lower sampling rate of one profile per week. Our first task was an evaluation of our method against the standard linear model using the IAGOS data above Europe to determine if our method yields trend values with reduced uncertainty. The next step was to examine the impact of sampling frequency on the trend estimate based on the IAGOS profiles above western Europe. Finally, we applied our method to the sparsely sampled Hilo and Trinidad Head ozonesonde records to demonstrate the improved trend quantification.

3.1 IAGOS trends above western Europe

Figure 1 shows the pairwise simple correlation coefficients of IAGOS data between every 50 hPa layer from 950 to 250 hPa based on all de-seasonalized monthly ozone values from 1994 to 2017. As expected from previous work (Petetin et al.2016b), the results show a high correlation between adjacent layers which decays with distance. Time series are well correlated (∼0.6–0.9) within a vertical range of 100 hPa except for the top layer where correlations decrease with distance at twice the rate of the lowest layers. As noted by Weatherhead et al. (2017), two time series can be highly correlated even if they have very different mean values or magnitudes of variability. Because our technique takes advantage of the nearby layers with a moderate or high correlation, we expect an improved quantification of the absolute variability and resulting trend estimate.

Figure 1Correlation coefficients between different pressure layers based on IAGOS profiles above western Europe.


The smooth seasonal and interannual components in units of parts per billion by volume (ppbv) derived from Eq. (2) are shown in Fig. 2a and b, respectively. For each component the bottom panel provides the one-dimensional climatological seasonal cycle or the annual estimated mean anomalies on each pressure layer, while the top panel shows their continuous fields. The combination of the two fields represents our overall fit to the vertical profile data. For example, the fitted result for the year 2000 is equal to the sum of the column values (by pressure) in 2000 in Fig. 2b and the corresponding column values in Fig. 2a. In addition, based on the variogram fitting to the interannual component (Stein1999), two ozone time series separated beyond 150 hPa are no longer (auto)correlated.

Figure 2Seasonal and interannual components for the ozone distribution above western Europe.


From a visual inspection of the interannual component, the variability in the lower layers is fairly steady and unwavering compared to the upper layers. The largest variation over 1994–2017 occurred in the upper layers around 2017 (the absolute magnitude of the variability during 2017 is still in question as it was the last year of the available time series and could be modified by additional years of data as they become available). The result from the smoothing spline decomposition provides a good visualization of the trend and also provides a quantification of ozone variation through the multiple correlated time series using all profile data without a dimension reduction.

Figure 3 provides some statistical model diagnostics for the overall fit. In general the model fits the data well, but heteroscedasticity is present, as expected. The model underestimates the extreme values, which is deliberate because we developed this penalized regression approach to prevent overfitting. Figure 3b–d present the marginal residual distributions by year, month and pressure layer. Overall, we can see that the mean structures of seasonal, interannual and vertical variations are well represented (the medians of residuals in the boxplots are close to zero and the interquartile ranges are well constrained), but unsurprisingly the variance structures are not completely captured by our model. Heteroscedasticity is most prominent in the top pressure layers, in certain years such as 2017 and in the sub-seasonal variability. Besides the extreme values, the diagnostics indicate that an extended model with a more complicated variance component, or an additional climatic index for a specific layer or month/year, is required to explain the remaining variability. While the model cannot explain all of the variance, applying our method to long-term, multi-decadal records reduces the impact of the unstructured variations on the trend estimate (as illustrated below).

Figure 3Diagnosis of statistical model fitting of the tropospheric ozone distribution above western Europe.


Using the interannual component in Fig. 2b, we calculate the trend value for each individual pressure layer. This integrated fit is then compared to the separated linear fit in each layer based on Eq. (1) and the original monthly mean time series (with seasonal cycle and autocorrelation accounted for). Note that the trend result from the integrated fit (i.e., via the smoothing spline ANOVA model) takes advantage of the correlated observations from neighboring layers, thereby reducing the unstructured variation in the dataset, while the separated fit is obtained by applying a simple linear fit to the original (and rather inconsistent) time series on each independent pressure level.

As shown in Fig. 4, these two methods agree very well because, as shown in Sect. 3.2, the monthly means are generated from very large sample sizes (approximately 120 profiles each month), which minimizes the uncertainty on the simple linear trend in any given layer. A slight discrepancy can be seen above 400 hPa; to investigate the cause, we plot the observed time series and model fitted values for four pressure levels in Fig. 5. The year-to-year variations at 400 hPa are well captured by the model fit. Above 350 hPa, a large spike in the observed values occurred in 2017, allowing us to see how the penalized regression spline responds to such sharp variations. Especially at 250 hPa, if only 1 month shows a spike, such as in 2000, 2001, 2006, 2008 and 2013 (and if these spikes are not obvious on adjacent layers, such as at 300 and 350 hPa), the variation will be penalized and smoothed out. On the other hand, if the enhancement occurs over multiple months, then it will be reflected by the model, such as in 2017. These results demonstrate that the smoothing spline decomposition is a powerful tool for identifying and replicating persistent features in the ozone profiles while ignoring short-lived anomalies.

Figure 4Ozone trend estimates and associated 2σ variabilities at 50 hPa vertical resolution above western Europe based on the separated fit (black) and the integrated fit (red).


Figure 5Monthly mean ozone time series and model fitted values for four different layers above western Europe.


To demonstrate the role of regularization in the model fitting, we first provide a synthetic example that illustrates the problems of underfitting (the model is not flexible enough to capture the general pattern in the data) and overfitting (the model is overparameterized, so some unphysical fluctuations are present) in Supplement Fig. S1. Neither underfitting nor overfitting is an appropriate representation of the true process. Once sufficient model complexity is supplied (e.g., placing a knot for spline functions every 50 hPa), the statistical regularization can be used to penalize overly complex models and thus prevent overfitting. The overfitting of a surface is less obvious than a curve, but we provide a demonstration of the fitting by adjusting the optimized penalty coefficient, which is selected by the generalized cross validation (Wood2006) and is proven to be reliable, as discussed above. We scale the penalty coefficient (λ2 in Eq. A2 while keeping λ1 fixed) by a factor of 10 and 0.1, respectively (i.e., the smaller the penalty, the more roughness will be present). The result is shown in Fig. 6; the underfitting can be seen as an over-smoothed representation of the underlying structure, and sharp variations (e.g., overly complicated roughness) are an indication of overfitting.

Figure 6Sensitivity analysis of the fitted result from different smoothing penalties: (b) the smoothness determined by the generalized cross validation (GCV) criterion and potentially (a) underfitting and (c) overfitting by scaling the penalty coefficient (λ2 in Eq. A2) by a factor of 10 and 0.1, respectively.


Figure 4 shows that the ozone trends increase with altitude, ranging from 3 to 8 ppbv decade−1 above 400 hPa, as determined by the integrated fit method. These strong trends are driven by the ozone in the stratospheric air masses that are frequently sampled at these altitudes. To focus on the ozone trends in the upper tropospheric air masses, we removed all of the stratospheric air mass samples from the time series, as determined from the potential vorticity values associated with each ozone profile and available from the IAGOS data portal ( (Cohen et al.2018). The removal of the stratospheric air masses reduces the monthly mean ozone values above 400 hPa by approximately 10 ppbv and reveals a much more subtle year-to-year variability especially in the upper levels (Supplement Fig. S2). Overall the trends in the upper levels are reduced by more than a factor of 2 when the stratospheric air masses are removed, but they are still greater than the trends in the mid-troposphere (Supplement Fig. S3).

3.2 Effect of sampling frequency on IAGOS trends above Europe

Due to ozone's high temporal variability, accurate trend detection may not be possible if the ozone profile sampling rate is low. Three previous studies have estimated the optimal sampling frequency for the accurate quantification of tropospheric ozone variations from profiles. Logan (1999) concluded that at least 20 profiles per month are required in tropical and extratropical regions to derive reliable monthly mean ozone values. Saunois et al. (2012) analyzed free tropospheric IAGOS profiles above Frankfurt (1995–2008) and determined that sampling frequencies of 4 and 12 profiles per month approximately result in uncertainties of 9 %–29 % and 5 %–15 % on the seasonal mean, respectively. Leonard et al. (2017) determined that at least 3–6 ozonesonde profiles per month are required for the estimation of climatological ozone values in terms of matching the 95 % confidence level of the fully sampled monthly ozone means derived by the GFDL-AM3 model.

Here we determine the optimal ozone profile sampling frequency for trends calculated using either the separated fit or the integrated fit methods. In order to investigate the impact of sampling frequency on profile trends, we randomly selected a specified number of profiles per month (from 1 to 20) and conducted the same integrated fit analysis on the resulting monthly mean time series based on 1000 iterations of random sampling. We illustrate the impact of the sampling frequency by showing the trend results based on one, five and nine profiles per month in Fig. 7. The plot shows the vertical distributions of sampled trends by separated and integrated fits. We also show the histogram of sampled trend values at 550 hPa (where the trends and 2σ intervals are quite similar for these two approaches) and associated trend fitting uncertainties in Fig. 8. We can see a substantial improvement of the trend estimate when more data are available with a similar trend accuracy for both approaches. However, in terms of trend precision, the integrated fit produces lower uncertainty.

Figure 7Sensitivity analysis for one (a, d), five (b, e) and nine (c, f) profiles per month based on 1000 random samples for each of the 15 vertical layers above western Europe. The analysis was conducted using the separated fit (a, b, c) and the integrated fit (d, e, f). Black curves represent the vertical distribution of the true trends based on the full IAGOS dataset.


Figure 8Sensitivity analysis at 550 hPa based on 1000 random samples of one (a, d), five (b, e) and nine (c, f) selected profiles per month. Shown are the sampled trend distributions with vertical lines indicating the trend values derived from the full dataset (a, b, c) and the corresponding matched trend uncertainties based on the separated and integrated fits (d, e, f).


If we assume that the trend derived from the full dataset (dozens of profiles per month) is the “true” value, the sampled trend is randomly located around the true value as a Gaussian distribution for both approaches. For both the integrated and separated fit methods, the increased sampling frequency results in a diminished distribution of trends. The integrated fit is designed to smooth out the short-lived anomalies in the vertical profiles, and as a result the integrated fit produces a narrower range of trend distributions than the simple separated fit. This case study clearly illustrates that an inconsistent trend structure results from infrequent sampling.

The summary statistics for the sensitivity analysis in terms of both precision and accuracy are reported in Table 1. Due to a lack of systematic behavior across the 15 vertical layers between 950 and 250 hPa in this sensitivity analysis, we do not report individual layer results; instead, each number in the table represents an average from 15 000 samples (1000 samples per layer). It should be noted that all the interpretations of this analysis are based on two assumptions: (1) time series are long term, i.e., 20 years or longer, and (2) the underlying variations are fairly steady and regular. Therefore, our results indicate the minimal requirement for the sampling scheme.

Table 1Percentage of randomly generated trends that fall outside of the 1 or 2σ interval of the trend values that were derived from the full IAGOS dataset. One thousand randomly generated trends were calculated on 15 pressure surfaces for each of a predetermined number of profiles per month. Also shown are the associated mean absolute percentage error (MAPE) values and mean signal-to-noise ratio (MSNR) between trends derived from the full dataset and the uncertainty in each sample (i.e., standard error of the trend estimate). The assumed true MSNR values derived from the full dataset are 2.32 and 2.34 for separated and integrated fits, respectively.

Download Print Version | Download XLSX

As an indication of the trend precision, we calculate the percentage of the 15 000 random trend values that fall outside of the 1 or 2σ intervals of the reference trend, which is based on the full dataset. The integrated fit delivers an improved precision as demonstrated by the lower outlier rate (i.e., a higher coverage rate) in relation to a narrower uncertainty interval. For the trend accuracy metric, we calculate the mean absolute percentage error (MAPE) for each sampling profile:


where ph and ah are the sampled and the assumed true trend values at height h with the averaged metric reported in Table 1. The separated fit method improves the accuracy but to a relatively smaller degree than its improvement on the precision. While our sophisticated statistical method clearly improves the accuracy and precision of the trends, greater improvements can be achieved by simply increasing the sampling rate by one or two profiles per month.

Figure 9 visualizes Table 1 and shows the sampling rates that allow the precision and accuracy to approach that of the complete dataset. At least four or five profiles per month are required to arrive at a 5 % outlier rate with respect to the 2σ interval for the integrated or separated fit, respectively. Improved precision becomes negligible at sample sizes greater than 15 profiles per month. In terms of accuracy, a monthly sampling frequency of 14 profiles is required to reduce the error to 5 % using the integrated fit, while the separated fit requires 18 profiles per month to meet the same goal.

Figure 9The marginal decrement of the outlier rate (sampled trends located outside the 1 or 2σ interval of the assumed true trend) (a) and the sampling error (mean absolute percentage error) (b).


In order to investigate the impact of sampling strategy on the integrated and separated fits, we carry out an additional sensitivity analysis in Supplement Table S2 through a comparison of two strategies: (a) a completely random design as illustrated above and (b) a fixed sampling strategy based on the random selection of an initial day followed by additional profiles at fixed intervals of 1 to 10 d. For example, a 5 d sampling frequency is based on the random selection of an initial reference from day 1 to day 5 in the beginning of the record, followed by the random selection of a profile every 5th day until the end of the record. For both strategies, only a single profile is chosen randomly if multiple profiles are available on the same day. Therefore, the sampling scheme with a fixed frequency of 1 d represents a random selection of a single profile in each day instead of using all available data. Also, if the chosen day does not have any profiles, we treat it as missing and do not look for a replacement.

The sensitivity analysis demonstrates that sample size remains the dominant influence on precision and accuracy. When the sampling interval is not dense enough (e.g., greater than 10 d), the benefit of a regular frequency scheme is inconsequential. However, when the monthly sample size is greater than four profiles (i.e., the sampling frequency is less than once per week), the fixed frequency scheme could achieve a better performance. As a result, an optimal frequency can decrease to 10 profiles (3 d frequency) for an integrated fit and 15 profiles (2 d frequency) for a separated fit.

The above discussion is focused on the precision and accuracy of the trend estimate at various sampling frequencies. In addition, we can explore the impact of sampling frequency on our ability to simply detect the presence of a trend based on uncertainty analysis. In order to evaluate the resulting uncertainty associated with the sampled trends, we include the mean signal-to-noise ratio (MSNR) between trends derived from the full dataset (the assumed true trend values) and the uncertainty in each sample (i.e., standard error associated with the trend estimate) in Table 1. In the interest of a fair comparison, this calculation is done by comparing the sampling uncertainty with trend values derived by the same method. Note that we do not use the concept of “statistical significance” to indicate evidence for the trends, following the recent recommendations from the American Statistical Association (Wasserstein and Lazar2016; Wasserstein et al.2019). Instead, the benchmark we selected for comparing trend uncertainties is a rejection of the null hypothesis at the 95 % confidence level when the signal-to-noise ratio (SNR) exceeds a threshold value of 2. As noted in Table 1, we already knew the MSNR is around 2.3 for both methods when using the full datasets; however, the SNR analysis shows that the benchmark value of 2 can be achieved at a sampling frequency of just four profiles per month when using the integrated fit, whereas the separated fit requires eight profiles per month. In summary, at a low sampling frequency, the integrated fit provides not only more precise and accurate trend estimates, but the uncertainty associated with trends can be reduced.

3.3 Application of the integrated fit to IAGOS trends above northeastern China

IAGOS aircraft have measured ozone profiles above northeastern (NE) China and Seoul, South Korea, since 1994 but at a much lower sampling rate than western Europe. While western Europe has 36 298 profiles during 1994–2017, NE China (including profiles from Seoul, South Korea) only has 1636 profiles with many months containing sparse data during 2007–2010. The lower sampling frequency and the data gap mean that the NE China IAGOS dataset can provide a further test of the advantages of the integrated fit method. The seasonal and interannual components of the integrated fit to NE China are shown in Fig. 10. Compared to western Europe, the seasonal component appears more incoherent in the vertical partially due to a lower sampling frequency; however, a clear seasonal peak can be seen in June in the upper troposphere and in April–June in the lower troposphere. The interannual component shows a steady increase in ozone at all pressure layers from 1994 to 2005 but rather large fluctuations after 2005 in both the upper and lower troposphere.

Figure 10Seasonal and interannual components for the ozone distribution above NE China.


In this particular region, the variogram fitting suggests that ozone time series within a range of 200 hPa should be considered correlated. This vertical correlation range is greater than was found for the IAGOS data above western Europe. The longer correlation range is the result of the more systematic temporal variations across ozone vertical profiles, as seen in Fig. 10b. While the correlation structure above Europe is heavily affected by the high anomalies from stratospheric influences, if those high anomalies are filtered out (see Supplement Fig. S2), the vertical correlation range increases and becomes similar to the IAGOS data above China. Time series of observations and fitted values for four upper and lower layers are provided in Supplement Figs. S5 and S6, respectively, to illustrate the statistical fitting quality.

The resulting trend distribution above China is displayed in Fig. 11. The separated fit finds that the 2σ interval of trend values is inseparable from zero at 600 and 300 hPa, while the integrated fit detects positive trends with lower uncertainties throughout the depth of the troposphere with a larger trend in the boundary layer and a smaller trend at 500 hPa. Compared to the separated fit, which is a typical example of overfitting to the unstructured variation at a small scale, our statistical adjustment that borrows information from adjacent layers is effective in terms of a better representation between fidelity and complexity.

Figure 11Ozone trend estimates and associated 2σ variabilities at 50 hPa vertical resolution above NE China based on the separated fit (black) and the integrated fit (red).


3.4 Application of the integrated fit to the Hilo and Trinidad Head ozonesonde records

Ozonesondes have been launched from Hilo, Hawaii, continuously since 1982 at an average rate of three profiles per month for the first 10 years and four profiles per month since 1993; the sampling rate at Trinidad Head has been weekly since August 1997. Figure 12 shows all Hilo and Trinidad Head ozone observations by decade from the surface to 15 km at 100 m intervals and with profiles of the 5th, 50th and 95th percentiles in each decade. From a visual inspection, the decadal 50th percentile profiles at Hilo are similar in the lower troposphere, but the profiles diverge above 700 hPa (∼3 km), and a clear enhancement of ozone can be seen after the year 2000. The decadal 5th, 50th and 95th percentile profiles at Trinidad Head are similar during the 2000s and 2010s, while the much shorter record over 1997–1999 indicates higher ozone values in the upper troposphere.

Figure 12All ozone observations below 15 km as measured by ozonesondes above Hilo, Hawaii, and Trinidad Head, California. The observations are colored according to the decade in which they were measured. The solid lines represent the 5th, 50th and 95th percentiles. The vertical lines represent a reference at 10, 40 and 80 ppb.


The results of the functional decomposition to distinguish the seasonal and interannual components of the Hilo record are shown in Fig. 13. The magnitude of the seasonal peak increases with altitude, while the interannual component reveals a wide range of vertical variation. With a nearly flat variation in the lower troposphere and strong fluctuations at higher altitudes, the ability of the smoothing spline decomposition to identify a complex range of variations is clearly illustrated. The same plot for Trinidad Head is displayed in Fig. 14. The seasonality varies with altitude, with maximum values occurring in March in the upper troposphere and in May in the lower troposphere. The annual anomalies show greater variability in the upper troposphere especially in the first 4 years when a strong positive anomaly abruptly transitioned to a strong negative anomaly. The upper tropospheric enhancement above midlatitude western North America in 1998 has been reported previously (Langford1999; Cooper et al.2010), and this feature has been attributed to enhanced stratosphere–troposphere exchange following the strong 1997–1998 El Niño event (Langford1999; Lin et al.2015).

Figure 13Seasonal and interannual components for the ozone distribution above Hilo, Hawaii.


Note that increasing the vertical resolution of the profile data does not play an important role in the quantification of the vertical correlation range as the correlation range of a fitted variogram is still approximately 150–200 hPa in the troposphere for these two stations, as previous results suggested. The vertical distributions of the Hilo (1982–2018) and Trinidad Head (1998–2018) ozone trends from the separated and integrated fits are shown in Fig. 15, reported at a 20 hPa vertical resolution as opposed to the 50 hPa resolution of the IAGOS profiles above Europe and China. For reference, the trend estimation from the 1982–2018 surface ozone measurements at Mauna Loa Observatory (∼3400 m elevation; 60 km southwest of Hilo) was added to the Hilo plot (Cooper et al.2020). The separated fit results in trends that can vary considerably from one layer to the next. For example, from 650 to 350 hPa at Hilo the trend uncertainty range includes zero for some levels but not for others. In contrast, the integrated fit produces narrower uncertainty ranges because it avoids the influence of extreme events and borrows information from adjacent layers, resulting in all uncertainty ranges exceeding zero above 650 hPa. Supplement Table S1 reports the trend value and associated 2σ uncertainty above Hilo and Trinidad Head for the tropospheric column, along with the partial column in the lower, middle and upper troposphere, for the full records and since the common reference year of 2000.

Figure 14Seasonal and interannual components for the ozone distribution above Trinidad Head, California.


For the trend distribution at Trinidad Head, the overall trend profile is shifted toward negative values due to the positive ozone anomaly in 1998–1999 (as described above), but above 900 hPa the 2σ uncertainty range still does not exclude zero. Supplement Fig. S7 shows the trend profile from when the time series begins in 2000 to avoid the positive anomaly in 1998–1999. Since 2000, trends are very weak except for the boundary layer where trends are generally negative.

Figure 15Ozone trend estimates and associated 2σ variabilities at 20 hPa vertical resolution based on the separated fit and integrated fit methods above Hilo, Hawaii, with the ozone trend at Mauna Loa Observatory at 680 hPa provided for reference (1982–2018), and Trinidad Head, California.


4 Conclusions

Detecting trends of tropospheric ozone from ozonesonde profiles is challenging due to the relatively low sampling frequency combined with high temporal variability. Regularization is a statistical learning tool which makes a trade-off between (1) high fitted bias (low flexibility) that results from underfitting and (2) high sensitivity to small data fluctuations (low generalizability) that results from overfitting. The underfitting can be avoided by making sure the number of basis functions (e.g., thin-plate splines) is large enough to represent the underlying process. A model can be considered to be overfit if a high cross validation error is found (which can be investigated by, e.g., iteratively removing one observation and predicting this value from the remaining observations). In terms of detecting tropospheric ozone trends from vertical profiles, the vertical correlated structures in the neighboring pressure layers can be used to inform the learning process. The benefit of this approach can be reflected by the detectable trends (if any) at a low sampling rate (i.e., we have higher confidence of our ability to detect a trend from weekly sampled ozonesonde data) and by an improved quantification of trend estimates in terms of accuracy and precision.

This technique efficiently reduces the uncertainty of the resulting trends and thus increases our ability to detect and quantify trends of smaller magnitude. Therefore, this method is valuable for improved trend detection of ozonesonde records because, although these records are sufficiently long term and have a high vertical resolution with high accuracy, standard trend analysis is still challenging due to the limited sampling rate. This refined estimation is also expected to be beneficial when comparing trends derived from different regions or observing systems since the result is less sensitive to incoherent or unstructured variations.

This paper provides a sophisticated statistical regularization approach for carrying out a joint trend analysis of vertical profile data as applied to tropospheric ozone observations from IAGOS commercial aircraft profiles above Europe and China and NOAA GML's ozonesonde records at Hilo, Hawaii, and Trinidad Head, California. Our approach is designed to deliver a consistent trend and uncertainty estimate by functionally decomposing the overall vertical profile into seasonal and interannual components. Instead of fitting the trend component as a single linear term, our technique adopts a two-step data-driven approach that (1) expresses the continuous change as a series of estimated annual mean anomalies at each altitude or pressure level after accounting for de-seasonalization and vertical autocorrelation and (2) then derives the trend value from the annual mean anomalies according to its underlying structure (a linear approximation of the interannual variability is considered to be appropriate). In this study, we introduce roughness penalties on both monthly and annual scales, which are the fundamental elements of time series decomposition. This data-driven approach reveals the underlying trend structure without determining the linear, nonlinear or any other type of polynomial in advance. The success of this approach lies in the smoothing spline decomposition of ozone profile variability. Importantly, the regularization aspect diminishes the influence from extreme observations. Our method is easy to implement with low computational costs, and it offers a spatial visualization for investigating the trend and interannual variability typical of ozone profile data.

While this study focused on temporal and vertical variations at a specific location or region, an extension to a spatially varying vertical process (i.e., incorporation of multiple sites with dimensions of longitude and latitude) is theoretically possible. However, four-dimensional correlations could be difficult to disentangle while still maintaining computational tractability. The scaling problem beyond three dimensions could be even more challenging as it involves multiple measurement units in space and time.

The main results of the ozone trend analysis are summarized as follows.

  1. The abundance of IAGOS ozone profiles above Europe is sufficient to provide a reliable trend result without the need for applying an advanced statistical technique. Nevertheless, a sensitivity analysis demonstrates that at low sampling frequencies the integrated fit outperforms the separated fit in terms of accuracy and precision.

  2. A further sensitivity test confirms that regular sampling frequency can be more beneficial than random sampling under the same number of available profiles.

  3. While our method improves trend detection from sparse datasets, the key to substantially reducing the uncertainty is to increase the sampling frequency.

  4. Based on a series of sensitivity studies, we determined optimal sampling frequencies for (1) basic trend detection and (2) accurate quantification of the trend. When applying the integrated fit method, we find that a typical sampling frequency of four profiles per month is adequate for basic trend detection; however, the accurate quantification of the trend requires 14 profiles per month. Accurate trend quantification can be achieved with only 10 profiles per month if a regular sampling frequency is applied. In contrast, the standard separated fit method, which ignores the vertical correlation between pressure surfaces, requires 8 profiles per month for basic trend detection and 18 profiles per month for accurate trend quantification.

  5. Trends derived from the separated and integrated fits are similar above China: stronger positive signals in the upper and lower troposphere and relatively weaker positive signals in the middle troposphere. However, the integrated fit yielded a more consistent trend estimate.

  6. Application of the integrated fit to the Hilo record enables us to effectively reduce the unstructured variation in the profile and derive clear positive trends in the middle and upper troposphere.

  7. The reference year of the trend estimate is particularly crucial at Trinidad Head due to high anomalies in 1998–1999. The trends are weak in the middle and lower troposphere and tend to be more negative in the upper troposphere.

Finally, although this method was developed for tropospheric ozone profiles, it can also be applied to the stratosphere, to profiles of other trace gases or atmospheric thermodynamic properties, and to profiles of oceanic or geologic properties. We provide an example in Supplement Fig. S8 to illustrate the application of this method to the quantification of trends in the lower stratosphere above Hilo, Hawaii.

Appendix A: Statistical details

The model proposed in Sect. 2 can be fit to the data, and the coefficients can be estimated by minimizing the penalized least square criterion (Wood2006):

(A1) y ( ω , τ , h ) - f seasonal ( ω , h ) - f interannual ( τ , h ) 2 + D 2 F ,

where is the Euclidean norm. The first term in Eq. (A1) minimizes the residual sum of squares (quantifying the goodness of fit). The second term is a nonnegative measure of the complexity of F, which penalizes the roughness of the curve or surface (avoiding overfitting of the short-lived anomalies) defined in terms of an integral of second-order partial derivatives of F.

On the geometry of a function, the second derivative corresponds to the curvature or concavity of the field; thus this penalty term is designed to prevent a “stand-alone” sharp variability from its neighborhood system. For example, except for the top and bottom layer, the variability in each layer should be bounded by its upper and lower layers, and any variability exceeding these bounds is a potentially short-lived anomaly. Specifically, the penalty term in Eq. (A1) can be defined as follows (Wood2003):

(A2) D 2 F = λ 1 2 f seasonal ( ω , h ) ω 2 2 + 2 f seasonal ( ω , h ) h 2 2 + 2 2 f seasonal ( ω , h ) ω h 2 d ω d h + λ 2 2 f interannual ( τ , h ) τ 2 2 + 2 f interannual ( τ , h ) h 2 2 + 2 2 f interannual ( τ , h ) τ h 2 d τ d h ,

where {λ1,λ2} are the tuning parameters for seasonal and interannual components, respectively. A major feature of the roughness penalty term in the regression spline approach is the isotropic property, which means that the roughness penalty is invariant in all directions (i.e., across altitude and time). This isotropic property is natural and straightforward when considering smooth spatial structures at the same geographic coordinates. However, since time and space are measured in different units, we cannot interpret or distinguish the relative importance between smoothness in time and smoothness in space. A pragmatic but efficient approach for avoiding such scaling issues is to transform all variables into the unit hypercube (Sobol2001). We considered a potential extension for our particular problem by specifying a different smoothing parameter for each derivative term in Eq. (A2) (Wood2006); however, this step proved ineffective as the specification of many smoothing parameters resulted in an over-smooth field. The primary concern in this study is to borrow the strength of common variabilities among neighboring time series for a better trend estimate with one penalty for the monthly scale and another penalty for the annual scale which are reasonable for our application as long as the relative vertical distance for the different time series is appropriately documented.

The solution to the penalized least square in Eq. (A1) is provided by the joint (K1+K2) coefficient vector c=(ck1,ck2) as follows:


where X is a (12×ny×nh)×(K1+K2) covariate matrix that stacks up all smooth basis functions {ψk1} and {ψk2}, and S1 and S2 are known penalized matrices consisting of the smoothing spline basis functions from Eq. (A2). If both λ1 and λ2 equal 0, the solution is equivalent to the ordinary least square estimate. Extremely large values of the tuning parameter depreciate the curve features and could lead to a plane estimate for F. The estimation of tuning parameter λ={λ1,λ2} can be solved by minimizing the generalized cross validation (GCV) score, which is defined as (Golub et al.1979)

(A3) GCV = N y - y ^ 2 [ N - tr ( A ( λ ) ) ] 2 ,

where A(λ)=X(XX+λ1S1+λ2S2)-1X is the influence matrix such that A(λ)y=y^, tr(A(λ)) is the trace (sum of the elements on the main diagonal) of A(λ), and N=nh×ny×12 is the total number of data points. Furthermore, the number of basis functions has to be selected, which is not a part of the optimization problem. The number of basis functions for each component is adjusted cautiously so that the impact on the result from a further increase in basis functions is negligible (here K1=120 and K2=300; a full spline representation would be up to K1=nh×12 and K2=nh×ny, but it is considered to be computationally wasteful). Further technical details on spline functions in the penalized least square can be found in Wood (2006).

Code and data availability

The IAGOS data are publicly available at (IAGOS2020). The ozonesonde data can be found in the NOAA ESRL Global Monitoring Laboratory (GML) database at (NOAA2020). The R code for the proposed approach is available on GitHub at and (Chang2020).


The supplement related to this article is available online at:

Author contributions

KLC and ORC contributed to the conception and design. AG, IP and VT contributed to the acquisition of data. KLC conducted the analysis. KLC and ORC drafted the paper, while AG, IP and VT helped with the revision. All authors approved the submitted and revised versions for publication.

Competing interests

The authors declare that they have no conflict of interest.


We thank Toshihiro Kuwayama ( and the California Air Resources Board for supporting the ozonesonde program at Trinidad Head in recent years, and Michael Ives at Humboldt State University for his many years of collaboration with NOAA GML and his dedication to launching weekly, and sometimes daily, ozonesondes from Trinidad Head. The IAGOS program acknowledges the European Commission for its support of the MOZAIC project (1994–2003) and the preparatory phase of IAGOS (2005–2013) and IGAS (2013–2016); the partner institutions of the IAGOS Research Infrastructure (FZJ, DLR, MPI and KIT in Germany; CNRS, Météo-France and Université Paul Sabatier in France; and the University of Manchester, UK); the French Atmospheric Data Center AERIS for hosting the database; and the participating airlines (Lufthansa, Air France, China Airlines, Iberia, Cathay Pacific, and Hawaiian Airlines) for transporting the instrumentation free of charge.

Review statement

This paper was edited by Stefano Galmarini and reviewed by two anonymous referees.


Augustin, N. H., Musio, M., von Wilpert, K., Kublin, E., Wood, S. N., and Schumacher, M.: Modeling spatiotemporal forest health monitoring data, J. Am. Stat. Assoc., 104, 899–911,, 2009. a, b

Chang, K. L.: R code for “Statistical regularization for trend detection: an integrated approach for detecting long-term trends from sparse tropospheric ozone profiles”, Zenodo,, 2020. a

Chang, K.-L. and Guillas, S.: Computer model calibration with large non-stationary spatial outputs: application to the calibration of a climate model, J. Roy. Stat. Soc. C-App., 68, 51–78,, 2019. a

Chang, K.-L., Guillas, S., and Fioletov, V. E.: Spatial mapping of ground-based observations of total ozone, Atmos. Meas. Tech., 8, 4487–4505,, 2015. a

Chang, K.-L., Petropavlovskikh, I., Cooper, O. R., Schultz, M. G., and Wang, T.: Regional trend analysis of surface ozone observations from monitoring networks in eastern North America, Europe and East Asia, Elem. Sci. Anth., 5, p. 50,, 2017. a, b, c, d

Cohen, Y., Petetin, H., Thouret, V., Marécal, V., Josse, B., Clark, H., Sauvage, B., Fontaine, A., Athier, G., Blot, R., Boulanger, D., Cousin, J.-M., and Nédélec, P.: Climatology and long-term evolution of ozone and carbon monoxide in the upper troposphere–lower stratosphere (UTLS) at northern midlatitudes, as seen by IAGOS from 1995 to 2013, Atmos. Chem. Phys., 18, 5415–5453,, 2018. a

Cooper, O. R., Parrish, D. D., Stohl, A., Trainer, M., Nédélec, P., Thouret, V., Cammas, J.-P., Oltmans, S. J., Johnson, B. J., Tarasick, D. W., Leblanc, T., McDermid, I. S., Jaffe, D., Gao, R., Stith, J., Ryerson, T., Aikin, K., Campos, T., Weinheimer, A., and Avery, M. A.: Increasing springtime ozone mixing ratios in the free troposphere over western North America, Nature, 463, 344–348,, 2010. a

Cooper, O. R., Schultz, M. G., Schröder, S., Chang, K.-L., Gaudel, A., Benitez, G. C., Cuevas, E., Fröhlich, M., Galbally, I. E., Molloy, S., Kubistin, D., Lu, X., McClure-Begley, A., Nédélec, P., O'Brien, J., Oltmans, S. J., Petropavlovskikh, I., Ries, L., Senik, I., Sjöberg, K., Solberg, S., Spain, G. T., Steinbacher, M., Tarasick, D. W., Thouret, V., and Xu, X.: Multi-decadal surface ozone trends at globally distributed remote locations, Elem. Sci. Anth., 8, p. 23,, 2020. a, b

Gaudel, A., Cooper, O. R., Chang, K.-L., Bourgeois, I., Ziemke, J. R., Strode, S. A., Omen, L., Sellitto, P., Nedelec, P., Blot, R., Thouret, V., and Granier, C.: Aircraft observations since the 1990s reveal increases of tropospheric ozone at multiple locations across the Northern Hemisphere, Sci. Adv., 6, eaba8272,, 2020. a

Ghil, M., Allen, M., Dettinger, M., Ide, K., Kondrashov, D., Mann, M., Robertson, A. W., Saunders, A., Tian, Y., Varadi, F., and Yieu, P.: Advanced spectral methods for climatic time series, Rev. Geophys., 40, 1–41,, 2002. a, b

Golub, G. H., Heath, M., and Wahba, G.: Generalized cross-validation as a method for choosing a good ridge parameter, Technometrics, 21, 215–223, 1979. a

Gu, C.: Model diagnostics for smoothing spline ANOVA models, Can. J. Stat., 32, 347–358,, 2004. a, b

Gu, C.: Smoothing spline ANOVA models, vol. 297, Springer, New York, USA, 2013. a, b

Harris, N. R. P., Hassler, B., Tummon, F., Bodeker, G. E., Hubert, D., Petropavlovskikh, I., Steinbrecht, W., Anderson, J., Bhartia, P. K., Boone, C. D., Bourassa, A., Davis, S. M., Degenstein, D., Delcloo, A., Frith, S. M., Froidevaux, L., Godin-Beekmann, S., Jones, N., Kurylo, M. J., Kyrölä, E., Laine, M., Leblanc, S. T., Lambert, J.-C., Liley, B., Mahieu, E., Maycock, A., de Mazière, M., Parrish, A., Querel, R., Rosenlof, K. H., Roth, C., Sioris, C., Staehelin, J., Stolarski, R. S., Stübi, R., Tamminen, J., Vigouroux, C., Walker, K. A., Wang, H. J., Wild, J., and Zawodny, J. M.: Past changes in the vertical distribution of ozone – Part 3: Analysis and interpretation of trends, Atmos. Chem. Phys., 15, 9965–9982,, 2015. a

Hastie, T. J. and Tibshirani, R. J.: Generalized additive models, vol. 43, CRC press, New York, USA, 1990. a

IAGOS: In-service Aircraft for a Global Observing System, available at:,, last access: 19 August 2020. a

Johnson, B. J., Oltmans, S. J., Vömel, H., Smit, H. G. J., Deshler, T., and Kröger, C.: Electrochemical concentration cell (ECC) ozonesonde pump efficiency measurements and tests on the sensitivity to ozone of buffered and unbuffered ECC sensor cathode solutions, J. Geophys. Res.-Atmos., 107, ACH 8-1–ACH 8-18,, 2002. a

Langford, A. O.: Stratosphere-troposphere exchange at the subtropical jet: Contribution to the tropospheric ozone budget at midlatitudes, Geophys. Res. Lett., 26, 2449–2452, 1999. a, b

Leonard, M., Petropavlovskikh, I., Lin, M., McClure-Begley, A., Johnson, B. J., Oltmans, S. J., and Tarasick, D. W.: An assessment of 10-year NOAA aircraft-based tropospheric ozone profiling in Colorado, Atmos. Environ., 158, 116–127,, 2017. a

Lin, M., Fiore, A. M., Horowitz, L. W., Langford, A. O., Oltmans, S. J., Tarasick, D., and Rieder, H. E.: Climate variability modulates western US ozone air quality in spring via deep stratospheric intrusions, Nat. Commun., 6, 1–11,, 2015. a

Logan, J. A.: An analysis of ozonesonde data for the troposphere: Recommendations for testing 3-D models and development of a gridded climatology for tropospheric ozone, J. Geophys. Res.-Atmos., 104, 16115–16149, 1999. a

Logan, J. A., Staehelin, J., Megretskaia, I. A., Cammas, J.-P., Thouret, V., Claude, H., De Backer, H., Steinbacher, M., Scheel, H.-E., Stübi, R., Frohlich, M., and Derwent, R.: Changes in ozone over Europe: Analysis of ozone measurements from sondes, regular aircraft (MOZAIC) and alpine surface sites, J. Geophys. Res.-Atmos., 117, D09301,, 2012. a

Lossow, S., Khosrawi, F., Kiefer, M., Walker, K. A., Bertaux, J.-L., Blanot, L., Russell, J. M., Remsberg, E. E., Gille, J. C., Sugita, T., Sioris, C. E., Dinelli, B. M., Papandrea, E., Raspollini, P., García-Comas, M., Stiller, G. P., von Clarmann, T., Dudhia, A., Read, W. G., Nedoluha, G. E., Damadeo, R. P., Zawodny, J. M., Weigel, K., Rozanov, A., Azam, F., Bramstedt, K., Noël, S., Burrows, J. P., Sagawa, H., Kasai, Y., Urban, J., Eriksson, P., Murtagh, D. P., Hervig, M. E., Högberg, C., Hurst, D. F., and Rosenlof, K. H.: The SPARC water vapour assessment II: profile-to-profile comparisons of stratospheric and lower mesospheric water vapour data sets obtained from satellites, Atmos. Meas. Tech., 12, 2693–2732,, 2019. a

Marenco, A., Thouret, V., Nédélec, P., Smit, H., Helten, M., Kley, D., Karcher, F., Simon, P., Law, K., Pyle, J., Poschmann, G., Von Wrede, R., Hume, C., and Cook, T.: Measurement of ozone and water vapor by Airbus in-service aircraft: The MOZAIC airborne program, An overview, J. Geophys. Res.-Atmos., 103, 25631–25642, 1998. a

Meiring, W.: Oscillations and time trends in stratospheric ozone levels: a functional data analysis approach, J. Am. Stat. Assoc., 102, 788–802,, 2007. a

Miller, A. J., Cai, A., Tiao, G. C., Wuebbles, D. J., Flynn, L. E., Yang, S.-K., Weatherhead, E. C., Fioletov, V. E., Petropavlovskikh, I., Meng, X.-L., Guillas, S., Nagatani, R., and Reinsel, G. C.: Examination of ozonesonde data for trends and trend changes incorporating solar and Arctic oscillation signals, J. Geophys. Res.-Atmos., 111, 1–10,, 2006. a, b

Nédélec, P., Blot, R., Boulanger, D., Athier, G., Cousin, J.-M., Gautron, B., Petzold, A., Volz-Thomas, A., and Thouret, V.: Instrumentation on commercial aircraft for monitoring the atmospheric composition on a global scale: the IAGOS system, technical overview of ozone and carbon monoxide measurements, Tellus B, 67, 27791,, 2015. a

NOAA: Homogenized ozonesonde data archive, available at:, last access: 19 August 2020. a

Oltmans, S. J., Hofmann, D. J., Lathrop, J. A., Harris, J. M., Komhyr, W. D., and Kuniyuki, D.: Tropospheric ozone during Mauna Loa Observatory Photochemistry Experiment 2 compared to long-term measurements from surface and ozonesonde observations, J. Geophys. Res.-Atmos., 101, 14569–14580, 1996. a

Park, A., Guillas, S., and Petropavlovskikh, I.: Trends in stratospheric ozone profiles using functional mixed models, Atmos. Chem. Phys., 13, 11473–11501,, 2013. a, b, c

Petetin, H., Thouret, V., Athier, G., Blot, R., Boulanger, D., Cousin, J.-M., Gaudel, A., Nédélec, P., and Cooper, O. R.: Diurnal cycle of ozone throughout the troposphere over Frankfurt as measured by MOZAIC-IAGOS commercial aircraft, Elem. Sci. Anth., 4, 000129,, 2016a. a

Petetin, H., Thouret, V., Fontaine, A., Sauvage, B., Athier, G., Blot, R., Boulanger, D., Cousin, J.-M., and Nédélec, P.: Characterising tropospheric O3 and CO around Frankfurt over the period 1994–2012 based on MOZAIC–IAGOS aircraft measurements, Atmos. Chem. Phys., 16, 15147–15163,, 2016.b a, b

Petetin, H., Jeoffrion, M., Sauvage, B., Athier, G., Blot, R., Boulanger, D., Clark, H., Cousin, J.-M., Gheusi, F., Nédélec, P., Steinbacher, M., and Thouret, V.: Representativeness of the IAGOS airborne measurements in the lower troposphere, Elem. Sci. Anth., 6, p. 23,, 2018. a, b, c

Petropavlovskikh, I., Godin-Beekmann, S., Hubert, D., Damadeo, R., Hassler, B., and Sofieva, V.: SPARC/IOC/GAW report on Long-term Ozone Trends and Uncertainties in the Stratosphere, SPARC report No. 9, 78 pp.,, 2019. a

Petzold, A., Thouret, V., Gerbig, C., Zahn, A., Brenninkmeijer, C. A. M., Gallagher, M., Hermann, M., Pontaud, M., Ziereis, H., Boulanger, D., Marshall, J., Nédélec, P., Smit, H. G. J., Friess, U., Flaud, J.-M., Wahner, A., Cammas, J.-P., Volz-Thomas, A., and IAGOS TEAM: Global-scale atmosphere monitoring by in-service aircraft–current achievements and future prospects of the European Research Infrastructure IAGOS, Tellus B, 67, 28452,, 2015. a

R Core Team: R: A language and environment for statistical computing, Vienna, Austria, 2013. a

Saunois, M., Emmons, L., Lamarque, J.-F., Tilmes, S., Wespes, C., Thouret, V., and Schultz, M.: Impact of sampling frequency in the analysis of tropospheric ozone observations, Atmos. Chem. Phys., 12, 6757–6773,, 2012. a

Schlink, U., Herbarth, O., Richter, M., Dorling, S., Nunnari, G., Cawley, G., and Pelikan, E.: Statistical models to assess the health effects and to forecast ground-level ozone, Environ. Modell. Softw., 21, 547–558,, 2006. a

Shaddick, G. and Zidek, J. V.: Spatio-temporal methods in environmental epidemiology, Chapman and Hall/CRC, New York, USA, 2015. a

Simmons, A. J., Willett, K. M., Jones, P. D., Thorne, P. W., and Dee, D. P.: Low-frequency variations in surface atmospheric humidity, temperature, and precipitation: Inferences from reanalyses and monthly gridded observational data sets, J. Geophys. Res.-Atmos., 115, D01110,, 2010. a

Smit, H. G. J., Straeter, W., Johnson, B. J., Oltmans, S. J., Davies, J., Tarasick, D. W., Hoegger, B., Stubi, R., Schmidlin, F. J., Northam, T., Thompson, A. M., Witte, J. C., Boyd, I., and Posny, F.: Assessment of the performance of ECC-ozonesondes under quasi-flight conditions in the environmental simulation chamber: Insights from the Juelich Ozone Sonde Intercomparison Experiment (JOSIE), J. Geophys. Res.-Atmos., 112, D19306,, 2007. a

Smit, H. G. J., Oltmans, S. J., Deshler, T., Tarasick, D. W., Johnson, B. J., Schmidlin, F., Stuebi, R., and Davies, J.: SI2N/O3S-DQA Activity: Guide Lines for Homogenization of Ozone Sonde Data, Tech. rep., SI2N/O3S-DQA activity as part of “Past changes in the vertical distribution of ozone assessment”, available at: (last access: 22 August 2019), 2012. a

Smit, H. G. J. and Panel for the Assessment of Standard Operating Procedures for Ozonesondes (ASOPOS): Quality assurance and quality control for ozonesonde measurements in GAW, Tech. rep., World Meteorological Organization (GAW Report# 201), available at: (last access: 22 August 2019), 2014. a

Sobol, I. M.: Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates, Math. Comput. Simulat., 55, 271–280,, 2001. a

Staufer, J., Staehelin, J., Stübi, R., Peter, T., Tummon, F., and Thouret, V.: Trajectory matching of ozonesondes and MOZAIC measurements in the UTLS – Part 1: Method description and application at Payerne, Switzerland, Atmos. Meas. Tech., 6, 3393–3406,, 2013. a

Staufer, J., Staehelin, J., Stübi, R., Peter, T., Tummon, F., and Thouret, V.: Trajectory matching of ozonesondes and MOZAIC measurements in the UTLS – Part 2: Application to the global ozonesonde network, Atmos. Meas. Tech., 7, 241–266,, 2014. a

Stein, M. L.: Interpolation of spatial data: some theory for kriging, Springer, New York, USA, 1999. a, b, c

Sterling, C. W., Johnson, B. J., Oltmans, S. J., Smit, H. G. J., Jordan, A. F., Cullis, P. D., Hall, E. G., Thompson, A. M., and Witte, J. C.: Homogenizing and estimating the uncertainty in NOAA's long-term vertical ozone profile records measured with the electrochemical concentration cell ozonesonde, Atmos. Meas. Tech., 11, 3661–3687,, 2018. a

Tarasick, D., Galbally, I. E., Cooper, O. R., Schultz, M. G., Ancellet, G., Leblanc, T., Wallington, T. J., Ziemke, J. R., Liu, X., Steinbacher, M., Stähelin, J., Vigouroux, C., Hannigan, J. W., Garcìa, O., Foret, G., Zanis, P., Weatherhead, E. C., Petropavlovskikh, I., Worden, H., Osman, M., Liu, J., Chang, K.-L., Gaudel, A., Lin, M., Granados-Muñoz, M., Thompson, A. M., Oltmans, S. J., Cuesta, J., Dufour, G., Thouret, V., Hassler, B., Trickl, T., and Neu, J. L.: Tropospheric Ozone Assessment Report: Tropospheric ozone from 1877 to 2016, observed levels, trends and uncertainties, Elem. Sci. Anth., 7, p. 39,, 2019. a

Thouret, V., Marenco, A., Logan, J. A., Nédélec, P., and Grouhel, C.: Comparisons of ozone measurements from the MOZAIC airborne program and the ozone sounding network at eight locations, J. Geophys. Res.-Atmos., 103, 25695–25720, 1998. a, b

Tiao, G. C., Reinsel, G. C., Pedrick, J. H., Allenby, G. M., Mateer, C. L., Miller, A. J., and DeLuisi, J. J.: A statistical trend analysis of ozonesonde data, J. Geophys. Res.-Atmos., 91, 13121–13136, 1986. a

Tiao, G. C., Reinsel, G. C., Xu, D., Pedrick, J. H., Zhu, X., Miller, A. J., DeLuisi, J. J., Mateer, C. L., and Wuebbles, D. J.: Effects of autocorrelation and temporal sampling schemes on estimates of trend and spatial correlation, J. Geophys. Res.-Atmos., 95, 20507–20517, 1990. a

Verbyla, A. P., Cullis, B. R., Kenward, M. G., and Welham, S. J.: The analysis of designed experiments and longitudinal data by using smoothing splines, J. Roy. Stat. Soc. C-App., 48, 269–311, 1999. a

Wahba, G.: Spline models for observational data, SIAM, Philadelphia, Pennsylvania, USA, 1990. a

Wahba, G., Wang, Y., Gu, C., Klein, R., and Klein, B.: Smoothing spline ANOVA for exponential families, with application to the Wisconsin epidemiological study of diabetic retinopathy, Ann. Stat., 23, 1865–1895, 1995. a

Wang, Y.: Mixed effects smoothing spline analysis of variance, J. Roy. Stat. Soc. B-Met., 60, 159–174, 1998. a, b

Wasserstein, R. L. and Lazar, N. A.: The ASA statement on p-values: context, process, and purpose, Am. Stat., 70, 129–133,, 2016. a

Wasserstein, R. L., Schirm, A. L., and Lazar, N. A.: Moving to a world beyond p<0.05, Am. Stat., 73, 1–29,, 2019. a

Weatherhead, E. C., Reinsel, G. C., Tiao, G. C., Meng, X.-L., Choi, D., Cheang, W.-K., Keller, T., DeLuisi, J., Wuebbles, D. J., Kerr, J. B., Miller, A. J., Oltmans, S. J., and Frederick, J. E.: Factors affecting the detection of trends: Statistical considerations and applications to environmental data, J. Geophys. Res.-Atmos., 103, 17149–17161, 1998. a

Weatherhead, E. C., Bodeker, G. E., Fassò, A., Chang, K.-L., Lazo, J. K., Clack, C. T. M., Hurst, D. F., Hassler, B., English, J. M., and Yorgun, S.: Spatial coverage of monitoring networks: A climate observing system simulation experiment, J. Appl. Meteorol. Clim., 56, 3211–3228,, 2017. a, b

Wigley, T. M. L., Briffa, K. R., and Jones, P. D.: On the average value of correlated time series, with applications in dendroclimatology and hydrometeorology, J. Clim. Appl. Meteorol., 23, 201–213, 1984.  a, b

Wood, S. N.: Thin plate regression splines, J. Roy. Stat. Soc. B-Met., 65, 95–114,, 2003. a, b

Wood, S. N.: Generalized additive models: an introduction with R, CRC press, New York, USA, 2006. a, b, c, d, e, f, g

Wood, S. N., Goude, Y., and Shaw, S.: Generalized additive models for large data sets, J. Roy. Stat. Soc. C-App., 64, 139–155,, 2015. a, b, c

Short summary
We provide a statistical framework for detecting trends of multiple autocorrelated time series from sparsely sampled profile data. The result is a better and more consistent quantification of trend estimates of vertical profile data. The focus was placed on the long-term ozone time series from commercial aircraft and balloon-borne ozonesonde measurements. This framework can be applied to other trace gases in the atmosphere.
Final-revised paper