A multi-model comparison of meteorological drivers of surface ozone over Europe

The implementation of European emission abatement strategies has led to a significant reduction in the emissions of ozone precursors during the last decade. Ground-level ozone is also influenced by meteorological factors such as temperature, which exhibit interannual variability and are expected to change in the future. The impacts of climate change on air quality are usually investigated through air-quality models that simulate interactions between emissions, meteorology and chemistry. Within a multi-model assessment, this study aims to better understand how air-quality models represent the relationship between meteorological variables and surface ozone concentrations over Europe. A multiple linear regression (MLR) approach is applied to observed and modelled time series across 10 European regions in springtime and summertime for the period 2000–2010. Overall, the air-quality models agree better with observations in summertime than in springtime, particularly in regions such as France, central Europe and eastern Europe, where local meteorological variables show a strong influence on surface ozone concentrations. Larger discrepancies are found for the southern regions, such as the Balkans, the Iberian Peninsula and the Mediterranean basin, especially in springtime. We show that the air-quality models do not properly reproduce the sensitivity of surface ozone to some of the main meteorological drivers, such as maximum temperature, relative humidity and surface solar radiation. Specifically, all air-quality models have difficulty capturing the strength of the ozone–relative-humidity relationship detected in the observed time series in most of the regions, for both seasons. Here, we speculate that dry-deposition schemes in the air-quality models might play an essential role in capturing this relationship. We further quantify the relationship between ozone and maximum temperature (mO3−T, climate penalty) in observations and air-quality models. In summertime, most of the air-quality models are able to reproduce the observed climate penalty reasonably well in certain regions such as France, central Europe and northern Italy. However, larger discrepancies are found in springtime, where air-quality models tend to overestimate the magnitude of the observed climate penalty.

N. Otero et al.: A multi-model comparison over Europe. Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
Tropospheric ozone is recognised as a threat to human health and ecosystem productivity (Mills et al., 2007). It is produced by photochemical oxidation of carbon monoxide and volatile organic compounds (VOCs) in the presence of nitrogen oxides (NOx = NO + NO2) (Jacob and Winner, 2009). While it is an important pollutant on a regional scale due to long-range transport, it may also influence air quality on a hemispheric scale (Hedegaard et al., 2013; Monks et al., 2015). Previous studies have shown that reducing emissions of ozone precursors leads to a decrease in tropospheric ozone concentrations in Europe (Solberg et al., 2005; Jonson et al., 2006). However, there is also a large year-to-year variability due to weather conditions (Andersson et al., 2007).
Ozone variability is strongly related to meteorological conditions. Significant correlations between ozone and temperature have been associated with the temperature-dependent lifetime of peroxyacetyl nitrate (PAN) and with the temperature dependence of biogenic isoprene emissions (Sillman and Samson, 1995). Substantial increases in surface ozone have been associated with high temperatures and stable anticyclonic, sunny conditions that promote ozone formation (Solberg et al., 2008). Moreover, the strong relationship between ozone and temperature represents a major concern, since under a changing climate the efforts on new air pollution mitigation strategies might be insufficient. This effect, referred to as the climate penalty (Wu et al., 2008), is expected to play an important role in future air quality (Hendriks et al., 2016). Similarly, increasing solar radiation leads to high levels of ozone, though with a weak correlation (Dawson et al., 2007), and it has been suggested that this could in part reflect the association of clear sky with high temperatures (Ordóñez et al., 2005). Humidity influences photochemistry through reactions between water vapour and atomic oxygen (Vautard et al., 2012). High levels of humidity are normally related to enhanced cloud cover and thus reduced photochemistry (Dueñas et al., 2002; Camalier et al., 2007). The relationship between ozone and relative humidity can also be explained by dry deposition through stomatal uptake: under low levels of humidity plants close their stomata, which reduces the biogenic uptake of ozone (Hodnebrog et al., 2012; Kavassalis and Murphy, 2017). High wind speed is usually correlated with low ozone concentrations due to enhanced advection and deposition, although the processes involved are complex and studies from different regions reported weak or insignificant correlations (Dawson et al., 2007; Jacob and Winner, 2009).
Chemistry transport models (CTMs) are one of the most common tools with which to investigate the impacts of climate change on air quality (Jacob and Winner, 2009; Colette et al., 2015). Due to assumptions, parameterizations and simplifications of processes, the models themselves are subject to large uncertainties (Manders et al., 2012), which have been reflected in some regional differences in the magnitude of surface ozone response to projected climate change (Andersson and Engardt, 2010). Thus, model biases still remain a concern when compared to the observations, especially in terms of the response of air quality under future climate (Fiore et al., 2009; Rasmussen et al., 2012). Comparisons between model outputs and available observational data sets are essential to evaluating the models' ability to reproduce observations. Discrepancies in the outputs of the CTMs can be due to chemical and physical processes, fluxes (emissions, deposition and boundary fluxes) and meteorological processes (Vautard et al., 2012; Bessagnet et al., 2016). In particular, quantification and isolation of the effects of meteorology on ozone is a challenge due to the complex interrelation between ozone, meteorology, emissions and chemistry (Solberg et al., 2016). Thus, evaluating air-quality models with respect to the meteorological inputs is important, given that meteorology drives numerous chemical processes (Vautard et al., 2012). A number of studies have evaluated the performance of the meteorological models that drive CTMs by comparing them with observations of weather parameters relevant for air quality (Smyth et al., 2006; Vautard et al., 2012; Brunner et al., 2015; Makar et al., 2015; Bessagnet et al., 2016).
Capturing observed sensitivities of ozone to meteorological factors is required to assess confidence in the models and their ability to reproduce the observed relationships between pollutants and meteorology, and to better understand potential impacts under climate change. However, only a few studies have used model simulations to analyse ozone sensitivities to meteorological parameters. Davis et al. (2011) evaluated the ability of the Community Multiscale Air Quality (CMAQ) model to reproduce the ozone sensitivities to meteorology across the eastern USA. Their results showed that the model underestimated the observed ozone sensitivities to temperature and relative humidity. Recently, Fix et al. (2018) examined the capability of the NRCM-Chem model to capture the meteorological sensitivities of high or extreme ozone. Overall, they found substantial differences between the modelled and the observed sensitivities of high levels of ozone to meteorological drivers that were not consistent between the three regions of study. Due to the complex interactions and processes, estimating the ozone sensitivities to key meteorological variables remains a challenge. Thus, we aim to examine the capabilities of a set of CTMs to reproduce the observed ozone responses to meteorological variables. To our knowledge, this is the first multi-model evaluation that compares observed and modelled meteorological sensitivities of ozone over Europe using a set of regional air-quality models.
Atmos. Chem. Phys., 18, 12269–12288, 2018, www.atmos-chem-phys.net/18/12269/2018/
The EURODELTA-Trends (EDT) exercise has been designed to better understand the evolution of air pollution and assess the efficiency of mitigation strategies for improving air quality. The EDT exercise allows the evaluation of regional air-quality models and the quantification of the role of the different key driving factors of surface ozone, such as emission changes, long-range transport and meteorological variability (more details on the EDT exercise can be found in Colette et al., 2017a). Earlier phases of EURODELTA and other relevant modelling exercises, such as AQMEII (Air Quality Model Evaluation International Initiative, Rao et al., 2011), covered a short period of 1 year, while only a few studies assessed long-term air quality, and these were limited to one model (Vautard et al., 2005; Jonson et al., 2006; Wilson et al., 2012) or utilised climate data rather than reanalysed meteorology (e.g. Simpson et al., 2014; Colette et al., 2015). The EDT exercise presents a multi-model hindcast of air quality over 2 decades (1990–2010) and thus offers a good opportunity to evaluate the role of driving meteorological factors on ozone variability.
The present study provides a novel and simple method to evaluate the performance of air-quality models in terms of meteorological sensitivities of ozone. Specifically, our analysis focuses on the European ozone season (April to September) over the years 2000–2010. The choice of this period is mainly motivated by the availability of the observational data set from Schnell et al. (2014, 2015) (see Sect. 2.1). Within the EDT framework, a recent report has presented the main findings on the long-term evolution of air quality (Colette et al., 2017b). A part of these results was obtained from the analysis of the 1990s (1990–2000) and 2000s (2000–2010) separately. We focus on the second decade (2000–2010), for which the interpolated data set of observed maximum daily 8 h mean ozone (MDA8 O3) used in this study was available. Similarly to Otero et al. (2016), we apply a multiple linear regression approach to examine the meteorological influence on MDA8 O3. Statistical models are developed separately for observational data sets and air-quality models, with the primary focus on examining both observed and simulated relationships between MDA8 O3 and meteorological drivers.
The present paper is structured as follows. Section 2 describes the observational data as well as the air-quality models studied here. The methodology and the design of the statistical models are introduced in Sect. 3. Section 4 discusses the results, and the summary and conclusions are presented in Sect. 5.

Observations
This study uses gridded MDA8 O3 concentrations created with an objective-mapping algorithm developed by Schnell et al. (2014). They applied a new interpolation technique to hourly observations from stations of the European Monitoring and Evaluation Programme (EMEP) and the European Environment Agency's air-quality database (AirBase) to calculate surface ozone averaged over 1° by 1° grid cells (see Schnell et al., 2014, 2015). Otero et al. (2016) used this data set to examine the influence of synoptic and local meteorological conditions on ozone over Europe. This interpolated product offers a possibility to establish a direct comparison between observations and CTMs. However, it must be acknowledged that for some areas with a low number of stations (i.e. the south-eastern or north-eastern European regions) the values interpolated into the 1° × 1° grid cells may not be representative of such large scales. Recently, Ordóñez et al. (2017) and Carro-Calvo et al. (2017) used this product to assess the impact of high-latitude and subtropical anticyclonic systems on surface ozone and the synoptic drivers of summer ozone, respectively. They reported inhomogeneities during some years for specific grid cells (e.g. in the Balkans and Sweden), which were excluded from their analysis. However, we did not observe a clear shift when analysing the spatial averages of the time series of MDA8 O3 for those particular regions (e.g. Balkans and Scandinavia) (Figs. S1, S2 in the Supplement). Therefore, our analysis includes the whole data set.
This study investigates the influence of observed meteorological variables on MDA8 O3, based on the ERA-Interim reanalysis product provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) at 1° × 1° resolution (Dee et al., 2011). Meteorological reanalysis products are essentially model simulations constrained by observations, and they have been widely validated against independent observations. Daily mean values are calculated as the mean of the four available time steps at 00:00, 06:00, 12:00 and 18:00 UTC for 10 m wind speed components (u and v) and 2 m relative humidity. Maximum temperature is approximated by the daily maximum of those time steps, while daily mean surface solar radiation is obtained from the hourly values provided for the forecast fields.
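As a minimal sketch of the daily aggregation described above, the snippet below collapses four 6-hourly values into a daily mean and a daily maximum. The function name `daily_aggregates`, the dictionary layout and the temperature values are hypothetical and chosen only for illustration; they do not reproduce the actual ERA-Interim processing used in the study.

```python
from statistics import fmean

def daily_aggregates(six_hourly):
    """Collapse 6-hourly values into a daily mean and a daily maximum.

    `six_hourly` maps a date string to the four values at
    00:00, 06:00, 12:00 and 18:00 UTC (hypothetical layout).
    """
    out = {}
    for day, values in six_hourly.items():
        if len(values) != 4:
            raise ValueError(f"{day}: expected 4 time steps, got {len(values)}")
        out[day] = {"mean": fmean(values), "max": max(values)}
    return out

# Illustrative 2 m temperatures (deg C) for a single day; not real data.
temps = {"2003-08-10": [18.0, 17.0, 29.0, 26.0]}
agg = daily_aggregates(temps)
print(agg["2003-08-10"])
```

In this sketch the mean would serve variables such as relative humidity or the wind components, while the maximum would stand in for the daily maximum temperature.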

Chemistry transport models (CTMs)
A set of state-of-the-art air-quality models participating in the EDT exercise is used here: LOTOS-EUROS (Schaap et al., 2008; Manders et al., 2017), EMEP/MSC-W (Simpson et al., 2012), CHIMERE (Mailler et al., 2017), MATCH (Robertson et al., 1999) and MINNI (Mircea et al., 2016). The domain of the CTMs extends from 17 [...] For instance, the boundary conditions were defined from a climatology of observational data for most of the experiments of the EDT exercise (including the data used here). However, the representation of physical and chemical processes differs among the CTMs, as does the vertical distribution of model layers (including the altitude of the top layer and the derivation of surface concentration at 3 m height in the case of EMEP, LOTOS-EUROS and MATCH). Moreover, no specific constraints were imposed on biogenic emissions (including soil NO emissions), which are represented by most of the models using an online module (Colette et al., 2017a). Since we aim here to compare the modelled relationship between meteorology and surface ozone, prescribing common features in the CTMs is a particular advantage when identifying potential sources of discrepancies.
The CTMs were forced by regional numerical weather model simulations using boundary conditions from the ERA-Interim global reanalysis (Dee et al., 2011). Most of the CTMs used the same meteorological input data, with a few exceptions. Three of them (EMEP, CHIMERE and MINNI) used input meteorology from the Weather Research and Forecasting (WRF) model (Skamarock et al., 2008). LOTOS-EUROS and MATCH used the input meteorology produced by RACMO2 (van Meijgaard, 2012) and HIRLAM (Dahlgren et al., 2016), respectively. Unlike the rest of the regional weather models, RACMO2 as used in the EDT exercise was not nudged towards ERA-Interim, which might have some impact on the meteorological fields generated by RACMO2. A summary of the CTMs and the corresponding sources of meteorological input data, with some of their main characteristics, is given in Table 1. As with the observations, the CTM outputs and their meteorological counterparts were interpolated to a common grid with 1° × 1° horizontal resolution. The use of a coarser resolution could have an impact in some regions with complex orography, where airflow is usually controlled by mesoscale phenomena (e.g. sea breezes and mountain-valley winds), or in regions characterized by high emission densities (Schaap et al., 2015; Gan et al., 2016). In such cases the use of a finer grid could be beneficial for capturing the variability of local processes.
A set of meteorological parameters was selected from the meteorological input data for the regression analyses. Similarly to the procedure with ERA-Interim, daily means are obtained from the available time steps (every 3 h in the case of WRF and RACMO2, and every 6 h for HIRLAM) for the following variables: 10 m wind speed components, 2 m relative humidity and surface solar radiation. Maximum temperature is also approximated by the daily maximum of those time steps.

Multiple linear regression model
Summertime usually brings favourable conditions for high near-surface ozone concentrations, such as air stagnation due to high-pressure systems, warmer temperatures, higher UV radiation and lower cloud cover (Dawson et al., 2007). This study attempts to better understand how CTMs represent the meteorological sensitivities of ozone. To this end, we use a multiple linear regression approach that can provide useful information on sensitivities in the distribution of ozone concentration as a whole (Porter et al., 2015).
A total of five meteorological predictors (Table 2) are selected based on the existing literature that has shown their strong influence on ozone pollution (e.g. Bloomfield et al., 1996; Barrero et al., 2005; Camalier et al., 2007; Dawson et al., 2007; Andersson and Engardt, 2010; Rasmussen et al., 2012; Davis et al., 2011; Doherty et al., 2013; Otero et al., 2016). Moreover, it has been shown that the occurrence of air pollution episodes might increase when the pollution levels of the previous day are higher than normal (Ziomas et al., 1995). Therefore, in addition to the meteorological predictors, we include the lag of ozone (MDA8 O3 from the previous day) in order to examine the role of ozone persistence. Additionally, we include harmonic functions that capture the effect of seasonality as in Rust et al. (2009) and Otero et al. (2016); this component is referred to as "day" in the MLRs (see Table 2).
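The seasonal and persistence predictors described above could be assembled along the following lines. This is only a sketch: the function name, the annual period of 365.25 days and the sample values are assumptions made for illustration and are not taken from the study.

```python
import math

def seasonal_and_lag_terms(day_of_year, mda8, period=365.25):
    """Build (sin term, cos term, previous-day ozone) rows.

    The first day is skipped because it has no previous-day value
    within the series. `period` (days) is an assumed annual cycle.
    """
    rows = []
    for i in range(1, len(mda8)):
        angle = 2.0 * math.pi * day_of_year[i] / period
        rows.append((math.sin(angle), math.cos(angle), mda8[i - 1]))
    return rows

# Three consecutive days in early April with illustrative MDA8 O3 values.
rows = seasonal_and_lag_terms([91, 92, 93], [40.0, 42.0, 45.0])
for sin_term, cos_term, lag in rows:
    print(round(sin_term, 3), round(cos_term, 3), lag)
```

Each row would then be joined with the meteorological predictors of the same day before fitting the regression.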
For this study, we divide the European domain into 10 regions: England (EN), Inflow (IN), Iberian Peninsula (IP), France (FR), central Europe (ME), Scandinavia (SC), northern Italy (NI), Mediterranean (MD), Balkans (BA) and eastern Europe (EA). These regions are based on those defined in the recent ETC/ACM technical paper (Colette et al., 2017b). For our study, we further split the original Mediterranean region (MD) to obtain a separate region covering the Balkans (BA), due to the strong influence of ozone persistence on MDA8 O3 over this particular region, as noted previously in Otero et al. (2016). Figure 1 shows the spatial coverage of each region and Table 3 lists their coordinates. As shown in Otero et al. (2016), the relative importance of predictors in the MLRs shows distinct seasonal patterns. Here, multiple linear regression models (MLRs, hereafter) are developed for each region for two seasons: springtime (April-May-June, AMJ) and summertime (July-August-September, JAS). These seasons differ from the meteorological definition but cover the period when surface ozone typically reaches its highest concentrations (i.e. April-September). Additionally, we analysed the impact of the seasons' definition by performing sensitivity tests using the meteorological seasons (i.e. March-April-May, MAM, and June-July-August, JJA). As shown in Figs. S3 and S4, we found a stronger impact of some key driving factors of ozone (e.g. temperature and relative humidity) when using the seasons defined above (AMJ and JAS) than when using the meteorological seasons. Therefore, we consider that our [...]
[Table fragment (model characteristics): Koeble and Seufert (2001), and hourly temperature and light using Guenther et al. (1993, 1994), see Simpson et al. (1995, 2012); dd: resistance model for gases (Venkatram and Pleim, 1999), for aerosols: [...]; Koeble and Seufert (2001), and hourly temperature and light (Guenther et al., 1991, 1993).]
An MLR is built to describe the relationship between MDA8 O3 (predictand) and a set of covariates (or predictors) describing seasonality, ozone persistence and the influence of meteorological fields (Table 2). A data series y_t, t = 1, ..., N (e.g. observations or CTM simulations) for a given region and season is conceived as a Gaussian random variable Y_t with varying mean µ_t and homogeneous variance σ². The mean µ_t is described as a linear function of the covariates (Eq. 1), with t indexing daily values and d_t referring to the day of the year associated with the index t. β_0 is a constant offset, β_sin and β_cos are the first-order coefficients of a Fourier series (e.g. Rust et al., 2009, 2013; Fischer et al., 2018), and β_lag describes the persistence with respect to the previous-day concentration y_{t−1}; if t is the first day in the late summer season (JAS, 1 July), y_{t−1} is the concentration of 30 June. Further regression coefficients β_k describe the linear relation to potential meteorological drivers (see Table 2). For covariates standardised to unit variance, the regression coefficients (β) are standardised coefficients giving the change in the predictand with the covariate in units of covariate standard deviation.
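Written out, a form of Eq. (1) consistent with the definitions in the preceding paragraph would be the following. It is a reconstruction under stated assumptions rather than the published expression: the annual period of 365.25 days and the covariate symbol x_{k,t} (the value of the k-th meteorological driver on day t) do not appear in the text above and are introduced here only for illustration.

```latex
% Assumptions: annual period 365.25 d; x_{k,t} denotes the k-th
% meteorological covariate on day t (symbol introduced for illustration).
\begin{equation}
  \mu_t = \beta_0
        + \beta_{\sin}\,\sin\!\left(\frac{2\pi d_t}{365.25}\right)
        + \beta_{\cos}\,\cos\!\left(\frac{2\pi d_t}{365.25}\right)
        + \beta_{\mathrm{lag}}\, y_{t-1}
        + \sum_{k} \beta_k\, x_{k,t},
  \qquad
  Y_t \sim \mathcal{N}\!\left(\mu_t, \sigma^2\right)
\end{equation}
```

The two harmonic terms together represent the "day" component, the β_lag term carries the ozone persistence, and the sum runs over the meteorological drivers retained in the predictor.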
Following the same strategy as used in Otero et al. (2016), the MLRs are developed through two common steps: (1) starting with the full set of potentially useful components in the predictor, a stepwise backward regression using the Akaike information criterion (AIC) as a selection criterion successively removes those components that contribute least to the model performance; and (2) a multicollinearity index known as the variance inflation factor (VIF, Maindonald and Braun, 2006) is used to detect multicollinearity problems in the predictor (i.e. high correlations between two or more components in the predictor). Components with a VIF above 10 are left out of the predictor (Kutner et al., 2004).
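The two steps above can be sketched in plain Python as follows. This is an illustrative simplification, not the procedure used in the study: the helper names, the synthetic data and the use of the Gaussian AIC up to an additive constant are assumptions, and the ordinary least squares fit is solved through the normal equations, which is adequate only for small, well-conditioned examples.

```python
import math

def ols_rss(columns, y):
    """Residual sum of squares of an OLS fit with intercept (normal equations)."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    for c in range(p):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[r][p] / A[r][r] for r in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def aic(columns, y):
    """Gaussian AIC up to an additive constant: n*ln(RSS/n) + 2*(coefficients)."""
    n = len(y)
    return n * math.log(ols_rss(columns, y) / n) + 2 * (len(columns) + 1)

def backward_select(predictors, y):
    """Drop one predictor at a time for as long as doing so lowers the AIC."""
    kept = dict(predictors)
    best = aic(list(kept.values()), y)
    while kept:
        trials = {name: aic([c for k, c in kept.items() if k != name], y)
                  for name in kept}
        name = min(trials, key=trials.get)
        if trials[name] >= best:
            break
        best = trials[name]
        del kept[name]
    return sorted(kept)

def vif(predictors, name):
    """VIF = 1 / (1 - R^2) from regressing one predictor on all the others."""
    target = predictors[name]
    others = [c for k, c in predictors.items() if k != name]
    mean = sum(target) / len(target)
    tss = sum((v - mean) ** 2 for v in target)
    return tss / ols_rss(others, target)  # TSS / RSS equals 1 / (1 - R^2)

# Synthetic example: y depends on x1 only; near_copy is almost collinear with x1.
n = 40
x1 = [float(i % 7) for i in range(n)]
x2 = [float((i * 3) % 5) for i in range(n)]
noise = [0.1 * ((i * 7) % 11 - 5) for i in range(n)]
y = [2.0 * a + e for a, e in zip(x1, noise)]
selected = backward_select({"x1": x1, "x2": x2}, y)
near_copy = [a + 0.01 * b for a, b in zip(x1, x2)]
vif_x1 = vif({"x1": x1, "near_copy": near_copy}, "x1")
print(selected, vif_x1 > 10)
```

In this toy case the informative predictor survives the backward elimination, and the near-duplicate column pushes the VIF far above the threshold of 10 quoted in the text.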
The statistical performance of each MLR (built separately from observations and CTMs) is assessed through the adjusted coefficient of determination (R²) and the root mean square error (RMSE). The R² estimates the fraction of total variability described by the MLR, and the RMSE gives the average deviation between the modelled and observed values obtained in the MLR. We also examine the relative importance of the individual components in the predictor. Following the method proposed by Lindeman et al. (1980), the relative importance of each predictor is estimated by its contribution to the R² coefficient (Grömping, 2007). We assess the sensitivities of ozone to the predictors through the standardised coefficients obtained from the regression. These coefficients indicate the change in the ozone response to changes in the predictors, in terms of standard deviations. Thus, for every standard deviation increase (decrease) in a specific predictor, the predictand (MDA8 O3) will increase (decrease) by the amount indicated by its coefficient, in standard deviation units. The use of standardised coefficients allows us to establish a direct comparison of the influence of individual predictors. The effect of seasonality introduced by the harmonic functions (namely, "day"; Table 2) is kept in the MLRs (Eq. 1) for its usefulness in improving the power of the regression analysis. However, further discussion of the effect of the predictors focuses on the rest of the variables.
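The performance measures and the standardised coefficients discussed above can be illustrated with a few lines of Python. The function names and the numbers are hypothetical; the standardised coefficient is computed here as standard deviations of the predictand per standard deviation of the predictor, which is one common convention and is assumed rather than confirmed by the text.

```python
import math
from statistics import fmean, pstdev

def rmse(observed, fitted):
    """Root mean square error between two equally long series."""
    return math.sqrt(fmean([(o - f) ** 2 for o, f in zip(observed, fitted)]))

def adjusted_r2(observed, fitted, n_predictors):
    """R^2 penalised for the number of predictors (intercept not counted)."""
    n = len(observed)
    mean = fmean(observed)
    rss = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    tss = sum((o - mean) ** 2 for o in observed)
    return 1.0 - (rss / (n - n_predictors - 1)) / (tss / (n - 1))

def standardised_coefficient(raw_slope, x, y):
    """Rescale a raw slope to standard deviations of y per standard deviation of x."""
    return raw_slope * pstdev(x) / pstdev(y)

# Illustrative values only.
observed = [1.0, 2.0, 3.0, 4.0]
fitted = [1.5, 1.5, 3.5, 3.5]
error = rmse(observed, fitted)
fit_quality = adjusted_r2(observed, fitted, n_predictors=1)
beta_std = standardised_coefficient(2.0, [1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(error, fit_quality, beta_std)
```

Because standardised coefficients are unit-free, values for predictors with different physical units (for example temperature and relative humidity) can be compared directly.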
4 Results and discussion

CTM performance by region
We compare the seasonal cycle of observations and CTM results through the time series of daily averaged values of MDA8 O3 from observations and CTMs for the whole period (i.e. April-September, 2000–2010), spatially averaged over each region. Furthermore, correlation coefficients between CTMs and observations for each region and season are used to quantify the CTM performance.
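A per-season correlation of this kind could be computed as sketched below for a single region. The month-based season definitions follow the AMJ and JAS seasons used in this paper, whereas the function names and the short series are hypothetical and serve only to illustrate the calculation.

```python
import math

AMJ = (4, 5, 6)   # springtime as defined in this study
JAS = (7, 8, 9)   # summertime as defined in this study

def pearson(a, b):
    """Pearson correlation coefficient of two equally long series."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def seasonal_correlation(months, observed, modelled, season):
    """Correlate observed and modelled values restricted to one season."""
    idx = [i for i, m in enumerate(months) if m in season]
    return pearson([observed[i] for i in idx], [modelled[i] for i in idx])

# Illustrative regional averages, one value per month; not real data.
months = [4, 5, 6, 7, 8, 9]
obs = [40.0, 50.0, 60.0, 70.0, 60.0, 50.0]
mod = [42.0, 52.0, 62.0, 50.0, 60.0, 70.0]
r_amj = seasonal_correlation(months, obs, mod, AMJ)
r_jas = seasonal_correlation(months, obs, mod, JAS)
print(r_amj, r_jas)
```

In practice the inputs would be daily, regionally averaged MDA8 O3 series, and the calculation would be repeated for every model and region.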

Seasonal cycle of MDA8 O 3
We examine the ozone seasonal cycle represented by both the observational and modelled data sets. Figure 2 [...] For example, we observed substantial differences in the southern regions, such as IP, MD and BA, where the models show a considerable spread. In those regions, the CTMs behave differently from each other. For instance, the EMEP model shows ozone peak concentrations in April, while CHIMERE and MINNI show a peak in July.
Overall, LOTOS shows a relatively constant positive bias in all regions, which is more evident in the MD and NI regions.

Correlation coefficients between modelled and observed time series
The correlation coefficients between the observed and modelled values of MDA8 O3 for each region and season are shown in Fig. 3. Overall, MDA8 O3 from the CTMs is better correlated with observations in JAS than in AMJ in the regions ME, NI, EA and EN. As expected from inspection of the average time series (Fig. 2), the lowest correlations between models and observations are found in BA, especially in AMJ for all models. In particular, EMEP is negatively correlated with observations over this region. As mentioned above, the larger discrepancies between CTMs and observations found over BA might be attributed to a low density of observation sites from which the interpolated data set is derived, resulting in a lower quality or higher uncertainties of such products (Schnell et al., 2014).
[...] NI, or FR) where MLRs are able to describe more than 80 % of the variance in CTMs and observations. This could be attributed to the major role of meteorology in summer influencing local photochemical processes of ozone production, while in spring long-range transport plays a stronger role (Monks, 2000; Tarasova et al., 2007). As it includes the bias, the RMSE reveals more differences among the MLRs when compared to each other (e.g. larger errors for LOTOS when compared to MATCH or CHIMERE). However, it is interesting that in general all MLRs show a similar tendency when evaluating the statistical performance, which indicates that observation-based and CTM-based MLRs present a similar statistical performance for modelling MDA8 O3. The ability of the CTMs to reproduce the influence of meteorological drivers on MDA8 O3 is discussed in more detail below.

Effects of drivers on ozone concentrations
The analysis of the influence of the predictors in the MLRs reveals distinctive regional patterns in both observation-based and CTM-based MLRs. In agreement with Otero et al. (2016), we find that the regions located towards the interior (including central, western and eastern regions) appear to be more sensitive to the meteorological predictors, especially in JAS. In contrast, a smaller meteorological contribution is found in the regions over the northernmost and southernmost parts of the domain, implying that non-local processes (e.g. long-range transport) play a stronger role there. Considering such similarities, in the following, the regions EN, FR, ME, NI and EA are referred to as the internal regions, while the remaining regions (IN, SC, IP, MD and BA) are referred to as the external regions (see Fig. 1).

Relative importance
Figure 6 depicts the relative importance of the predictors for the observation-based and CTM-based MLRs in the internal regions (Fig. 1). Here, a larger meteorological influence (i.e. the predictors other than LO3 and day) can be seen in JAS compared to AMJ in all of these regions. In general, the dominant meteorological drivers from the observation-based MLRs in these internal regions are RH and Tx. The contribution of RH is evident in AMJ (e.g. ME or EA), while Tx is clearly dominant in JAS. SSRD is also a key driver of MDA8 O3, and generally, the wind factors (W10m and Wdir) appear to have a minor contribution.
Although the CTM-based MLRs are able to capture the meteorological predictors, we observe discrepancies among the internal regions when compared to the observation-based MLR. The inter-model differences in terms of the relative importance of predictors are greater in AMJ than in JAS. For instance, the contribution of LO3 is overestimated by CTMs. Substantial differences are found in the influence of RH when comparing the observation-based and the CTM-based models. The CTMs do not capture the relative importance of RH well, especially in AMJ. In general, the CTMs driven by WRF meteorology show a slightly larger contribution of RH in most cases, although we notice that there are also some differences among the models that share the same meteorology. CTMs do capture the relative importance of Tx in all regions, but overall they overestimate it, as they also do for SSRD. Here, we find discrepancies when comparing the contribution of predictors in the statistical models from CTMs driven by the same meteorology (e.g. EMEP when compared to CHIMERE and MINNI). Such differences among the models using the same meteorology indicate that the model set-up (e.g. number of vertical levels, depth of first layer) and model parameterizations (e.g. chemical and physical processes) have a larger influence on the model performance than the meteorological processes.
Overall, all CTMs show this tendency, although there are substantial differences when comparing the individual drivers' contributions in the observation-based and CTM-based MLRs, particularly in AMJ (Fig. 7). CTMs do not capture the contribution of LO3 reflected by the observation-based MLRs. As in the previous analysis (Sect. 4.1), the largest discrepancies are found in BA. In this region, the observation-based MLR shows that most of the variability [...]
Our results indicate that the relative importance of meteorological factors is stronger in the internal regions (Fig. 6) than in the external regions (Fig. 7), which could be partially attributed to a larger variability of most of the meteorological fields in internal regions (Fig. S5). The external regions are also more likely to be influenced by the lateral boundary conditions applied by each CTM. In addition, as mentioned in Sect. 2, in some external regions (e.g. IP or MD) the coarse grid might be insufficient to capture mesoscale processes, such as land-sea breezes, which also control MDA8 O3 concentrations (Millán et al., 2002). Moreover, we observe that meteorology becomes more important in summer, when local photochemical processes are dominant. In general, CTMs show this tendency, but they have limitations in reproducing the effect of some meteorological drivers. Specifically, while CTMs tend to overestimate the contribution of Tx and SSRD, they underestimate the relative importance of RH, which is also reflected in the correlation coefficients between the predictand and the predictors (Figs. S6, S7).

Sensitivity of ozone to the drivers
We assess the sensitivities of MDA8 O3 to the drivers through their standardised coefficients obtained in the MLR (Sect. 3). These coefficients provide further information about the changes in MDA8 O3 due to the effect of each driver. Figures 8 and 9 depict the coefficients of the main driving factors (LO3, Tx and RH) obtained in the MLR for the internal and the external regions, respectively. Consistent with the patterns described by the relative importance of drivers, we observe that the ozone response to LO3 is stronger in AMJ than in JAS: the corresponding standardised coefficients are always positive and generally higher in AMJ. The observed sensitivities to LO3 are smaller in the internal regions (Fig. 8) and particularly dominant in the external regions (Fig. 9). Overall, most of the CTMs reflect a similar tendency. However, there are evident differences between observations and CTMs when comparing the values of the standardised coefficients, specifically in some regions such as BA or MD. When comparing the ozone responses of the CTMs to LO3, we observe that in most of the regions MATCH and MINNI show values closest to observations, which is consistent with the results described in Sect. 4.1.2.
Correlations between MDA8 O3 and Tx are strong, especially in the internal regions in JAS (Fig. S6). Overall, we show that the CTMs appear to capture the observed effect of Tx better in JAS than in AMJ in most of the regions. The highest sensitivities to Tx are found in internal regions such as ME, NI, FR and EN, which is also shown in the CTMs (Fig. 8). However, we see that most of the CTMs tend to overestimate the effect of Tx, and distinct sensitivities to Tx are also found for those models that share the same meteorology (i.e. CHIMERE, EMEP and MINNI). In particular, the MINNI and CHIMERE models show higher Tx sensitivities when compared to the rest of the CTMs. While the MINNI model presents the highest sensitivities to Tx in spring, especially in EN and FR, EMEP shows smaller values and underestimates the correlations between Tx and MDA8 O3 (Figs. S6, S7).
The slope of the ozone-temperature relationship (mO3−T) has been used in several studies to assess the ozone climate penalty (e.g. Bloomer et al., 2009; Steiner et al., 2006; Rasmussen et al., 2012; Brown-Steiner et al., 2015) in the context of future air quality. Thus, we additionally analyse the ozone-temperature relationship in order to provide insight into the ability of CTMs to reproduce the observed mO3−T. Similarly to previous work (Brown-Steiner et al., 2015), the slopes are obtained from a simple linear regression using only Tx (without the influence from other predictors) and they are used to quantify this relationship in both seasons, AMJ and JAS.
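As an illustration of how such a slope can be obtained, the sketch below computes the ordinary least-squares slope of ozone regressed on Tx alone. It is a minimal example under stated assumptions: the function name and the two short series are hypothetical, and the values are made up rather than taken from the study.

```python
from statistics import mean

def ozone_temperature_slope(tx, o3):
    """Least-squares slope of o3 regressed on tx alone (ppb per K)."""
    if len(tx) != len(o3) or len(tx) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    tx_bar, o3_bar = mean(tx), mean(o3)
    # Sum of squared deviations of the predictor.
    sxx = sum((t - tx_bar) ** 2 for t in tx)
    if sxx == 0:
        raise ValueError("tx has zero variance")
    # Sum of cross-deviations between predictor and predictand.
    sxy = sum((t - tx_bar) * (o - o3_bar) for t, o in zip(tx, o3))
    return sxy / sxx

# Made-up values for illustration only (not data from the study).
tx_demo = [290.0, 292.0, 294.0, 296.0, 298.0]   # daily maximum temperature, K
o3_demo = [40.0, 43.0, 46.0, 49.0, 52.0]        # MDA8 O3, ppb

slope_demo = ozone_temperature_slope(tx_demo, o3_demo)  # 1.5 ppb per K
```

Since no other predictor enters the fit, this slope absorbs any influence of variables that co-vary with Tx, which is why it differs in meaning from the Tx coefficient of the MLR.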
We see a stronger effect of RH in AMJ than in JAS in the observations, with the greatest impact in the internal regions (e.g. EA, ME, NI, FR and EN), which is not well represented by the CTMs (Figs. 8 and 9). As mentioned, CTMs underestimate the strength of the correlations between ozone and relative humidity (Figs. S6, S7). This general lack of sensitivity to RH could also partially explain the tendency for all CTMs to show a high bias in simulated ozone compared with observations (Fig. 2). Among the possible reasons for this inconsistency, we hypothesize that ozone removal processes can be associated with higher relative humidity levels during thunderstorm activity on hot moist days, which might not be well captured by CTMs. As previous studies pointed out (e.g. Andersson and Engardt, 2010), ozone dry deposition may also play a role in explaining the difficulties that CTMs show in reproducing the observed ozone-relative humidity relationship. With a simple modelling approach, Kavassalis and Murphy (2017) found that the relationship between ozone and relative humidity was better captured by the inclusion of vapour pressure deficit-dependent dry deposition, pointing out the relevance of detailed dry-deposition schemes in the CTMs.
High SSRD levels favour photochemical ozone formation and are usually positively correlated to ozone. In this case, CTMs also present some limitations in capturing this effect: they show a stronger sensitivity of ozone to SSRD than the observations and overestimate its influence on surface ozone (Figs. S8, S9). The observations, by contrast, show a lower and surprisingly negative effect of SSRD. Although the correlations between SSRD and ozone are positive (see Figs. S6, S7), the presence of other predictors in the regression may reverse the sign of the estimated coefficient. Similarly, the sensitivities to Wdir and W10m are also overestimated by the CTMs, especially in AMJ (Figs. S8, S9).
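The sign reversal mentioned above can be reproduced with a small synthetic example. The sketch below uses the closed-form expression for the standardised coefficients of a two-predictor regression, written in terms of pairwise correlations; the series are constructed by hand so that the second predictor closely tracks the first, and they are not data from the study. The function and variable names are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def standardised_betas(x1, x2, y):
    """Standardised coefficients of y regressed jointly on x1 and x2."""
    r1, r2, r12 = pearson(x1, y), pearson(x2, y), pearson(x1, x2)
    denom = 1.0 - r12 ** 2
    if denom <= 0.0:
        raise ValueError("predictors are perfectly collinear")
    return (r1 - r2 * r12) / denom, (r2 - r1 * r12) / denom

# Synthetic series for illustration only (not data from the study).
tx = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ssrd = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]           # closely tracks tx
o3 = [2.0 * t - s for t, s in zip(tx, ssrd)]     # built with a negative ssrd term

r_ssrd = pearson(ssrd, o3)                        # positive pairwise correlation
beta_tx, beta_ssrd = standardised_betas(tx, ssrd, o3)  # beta_ssrd is negative
```

In this construction the pairwise correlation between `ssrd` and `o3` is positive only because `ssrd` rises together with `tx`; once `tx` is in the regression, the remaining contribution of `ssrd` is negative.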
Our analysis suggests that CTMs present more limitations in reproducing the influence of meteorological drivers on MDA8 O3 concentrations in the external regions than in the internal regions, particularly in AMJ. Moreover, we find the largest discrepancies in BA, where models show the poorest seasonal performance and correlation coefficients (Figs. 2 and 3, respectively), probably due to the low quality of the observational data set.
Furthermore, LO3 is the main driver over most of the external regions and explains a large proportion of the total variability of MDA8 O3, while meteorological factors have a smaller influence. Lemaire et al. (2016) found a very low performance (based on R2) over the British Isles, Scandinavia and the Mediterranean using a different statistical approach that only included two meteorological drivers. They attributed this low skill to the large influence of long-range transport of air pollution over those regions (Lemaire et al., 2016). Our results confirm the small influence of the meteorological drivers over those regions and the strong influence of ozone persistence. Moreover, in the case of the external regions of northern Europe, this could also be explained by the dominance of transport processes such as stratospheric-tropospheric exchange or long-range transport from the European continent, rather than local meteorology, particularly in AMJ (Monks, 2000; Tang et al., 2009; Andersson et al., 2009).
Previous work suggested that local sources of NOx and biogenic VOC (ozone precursors) are important factors in summertime ozone pollution in the Mediterranean basin (Richards et al., 2013). Moreover, some studies suggested that the local vertical recirculation and accumulation of pollutants play an important role in ozone pollution episodes in this region: during the night-time the air masses are held offshore by a land-sea breeze, creating reservoirs of pollutants that are brought back the following day (Millán et al., 2002; Jiménez et al., 2006; Querol et al., 2017). All of these factors (e.g. local emissions as well as local and large-scale processes) control the ozone variability, which might explain the smaller influence of local meteorological factors shown in this study over the Mediterranean basin when compared to the meteorological influence in the internal regions. Thus, we may hypothesize that the strong impact of LO3 observed in the external regions over southern Europe (i.e. IP, MD, BA) could be partially due to the role of vertical accumulation and recirculation of air masses along the Mediterranean coasts as a result of mesoscale phenomena, which is enhanced by the complex terrain that surrounds the basin. Another important factor for the strong impact of LO3 observed is the slow dry deposition of ozone on water, which would favour ozone persistence in southern Europe.
Overall, we conclude that CTMs capture the effect of meteorological drivers better in the internal regions (EN, FR, ME, NI and EA), where the influence of local meteorological conditions is stronger. The major effect of meteorological parameters found in the internal European regions might also be attributed to the fact that overall the variability of meteorological conditions is larger in those regions (Fig. S5). We also find differences among the CTMs driven by the same meteorology. As mentioned in the introduction, Bessagnet et al. (2016) suggested that the spread in the model results could be partly explained by the differences in the vertical turbulent mixing in the planetary boundary layer, which is diagnosed differently in each of the CTMs. Our results also indicate that, even though models share the same meteorology (considering the prescribed requirements defined by the EDT exercise), they show discrepancies when compared to each other, which could be attributed to other sources of uncertainty (such as physical and chemical internal processes in the CTMs). The NMVOC and NOx emissions from the biosphere are critical in ozone formation. Since biogenic emissions were not specifically prescribed and have a strong dependence on temperature and solar radiation, discrepancies in the CTMs' performance (e.g. different sensitivities to Tx) might be expected. Furthermore, we notice that the CTMs do not consistently reproduce the regional ozone-temperature relationship, which is a key factor when assessing the impacts of climate change on future air quality.
Atmos. Chem. Phys., 18, 12269-12288, 2018, www.atmos-chem-phys.net/18/12269/2018/

5 Summary and conclusions

The present study evaluates the capabilities of a set of chemical transport models (CTMs) to capture the observed meteorological sensitivities of daily maximum 8 h average ozone (MDA8 O3) over Europe. Our study reveals systematic differences between the CTMs in reproducing the seasonal cycle when compared to observations. In general, CTMs tend to overestimate the MDA8 O3 in most of the regions.
In the western and northern regions (i.e. Inflow, England and Scandinavia), some models did not capture the high ozone levels in spring (e.g. CHIMERE and MINNI), while in some southern regions (e.g. Iberian Peninsula, Mediterranean and Balkans) they overestimated the ozone levels in summer (e.g. LOTOS, CHIMERE). Of the CTMs, MATCH and MINNI were the most successful in capturing the observed seasonal cycle of ozone in most regions. All CTMs revealed limitations in reproducing the variability of ozone over the Balkans region, with a general overestimation of the ozone concentrations that was considerably larger during the warmer months (July, August). As reflected in the results, a limitation of the interpolated observational product used here is that in some regions (e.g. southern Europe) it has a lower quality due to a reduced number of stations (Sect. 2.1). The MLRs performed similarly for most of the CTMs and observations, describing more than 60 % of the total variance of MDA8 O3. Overall, the MLRs perform better in JAS than in AMJ, and the highest percentages of described variance were found in central Europe and northern Italy. This could be attributed to local photochemical processes being more important in JAS and is consistent with a relatively stronger influence of long-range transport in AMJ.
The effects of predictors revealed spatial and seasonal patterns, in terms of their relative importance in the MLRs. Particularly, we noticed a larger local meteorological influence in the regions located towards the interior of Europe, here termed the internal regions (i.e. England, France, central Europe, northern Italy and eastern Europe). A minor local meteorological contribution was found in the remaining regions, referred to as the external regions (i.e. Inflow, Iberian Peninsula, Scandinavia, Mediterranean and Balkans). The CTMs are in better agreement with the observations in the internal regions than in the external regions, where they were not as successful in reproducing the effects of the ozone drivers. Overall, different behaviour in the MLRs developed in the external regions could be attributed to (i) a larger influence of dynamical processes rather than local meteorological processes (e.g. long-range transport in the northern regions), (ii) a stronger impact of the boundary conditions and (iii) the use of a coarser grid that might be insufficient to capture mesoscale processes that also influence MDA8 O3 (e.g. sea-land breezes in the southern regions).
We found substantial differences in the sensitivities of MDA8 O3 to the different meteorological factors among the CTMs, even when they used the same meteorology. As Bessagnet et al. (2016) point out, the differences among CTMs could be partly attributed to some other diagnosed model variables (e.g. vertical turbulent mixing and boundary layer height, as well as vertical model resolution). To assess the effect of such potential sources of uncertainty, further investigations would be required. Moreover, variations in the sensitivity of ozone to meteorological parameters could depend on differences in the chemical and photolysis mechanisms and the implementation of various physics schemes, all of which differ between the CTMs (see Colette et al., 2017a). Specifically, the discrepancies found in the sensitivities of MDA8 O3 to maximum temperature might also be attributed to biogenic emissions not being prescribed in the models. This was particularly reflected in the analysis of the slopes of the ozone-temperature relationship (mO3−T) used to assess the climate penalty, which differed between CTMs and regions when compared to the observations in both seasons. Most of the CTMs confirm the observed climate penalty in JAS but with larger discrepancies in the external regions than in the internal regions. Furthermore, CTMs tend to overestimate the climate penalty in AMJ (particularly in the external regions).
Our results have shown discrepancies in the observed and simulated ozone sensitivities to relevant meteorological parameters for ozone formation and removal processes. In particular, we found that CTMs tend to overestimate the influence of maximum temperature and surface solar radiation in most of the regions, both of which are strongly associated with ozone production. None of the CTMs captured the strength of the observed relationship between ozone and relative humidity appropriately, underestimating the effect of relative humidity, a key factor in the ozone removal processes. We speculate that ozone dry-deposition schemes used by the CTMs in this study may not adequately represent the relationship between humidity and stomatal conductance, thus underestimating the ozone sink due to stomatal uptake. Further sensitivity analyses would be recommended for testing the impact of the current dry-deposition schemes in the CTMs.
Data availability. The data are available upon request from the corresponding author.
Author contributions. NO prepared the data set provided by the EURODELTA working group, performed the analysis and wrote the paper. TB and JS helped to guide the data analysis, and HWR provided statistical support and helped to interpret the results. The rest of the co-authors contributed to scientific discussions and the whole team helped to prepare the final version of the paper.

Figure 1. Map of the regions considered in the study. Regions indicated with a black star refer to the internal regions in the text. The rest of the regions refer to the external regions of the European domain.

Figure 2. Time series of daily averages of MDA8 O3 during the ozone season (April-September) for the period of study (2000-2010) at each subregion.

Figure 2 depicts daily averages during 2000-2010 of MDA8 O3 at each region for the CTMs and observations. In general, all CTMs are biased high compared with the observations. CTM results are visually closer to the observations in the north-western regions (i.e. IN, EN and FR), while the spread becomes larger over the southern and south-eastern regions (i.e. BA, NI, MD). The IN, EN and SC regions show the highest observed concentrations in the starting months (AMJ), which are generally not well captured by most of the CTMs, which show a flatter time series (e.g. LOTOS, MATCH, CHIMERE). For example, in the SC region, some of the CTMs underestimate the ozone concentrations in AMJ (i.e. CHIMERE and MINNI). The rest of the regions show the highest observed concentrations in JAS, which are generally overestimated by the CTMs. Models show discrepancies in the ozone seasonal cycle when compared to each other and when compared with observations.

Figure 3. Correlation coefficients between observed and modelled MDA8 O3 for spring (AMJ) and summer (JAS) for the period of study (2000-2010) at each region (rows) and model (columns, ordered by highest correlation values).

Figure 4. Coefficients of determination (R2) for each CTM-based (ordered as in Fig. 3) and observation-based MLR in spring (AMJ) and summer (JAS).

Figure 5. Root mean square errors (RMSE) for each CTM-based (ordered as in Fig. 3) and observation-based MLR at each region, in spring (AMJ) and summer (JAS).

Figure 6. Proportion of each predictor to the total explained variance for each CTM-based (ordered as in Fig. 3) and observation-based MLR in AMJ (a) and JAS (b) for the internal regions: England (EN), France (FR), central Europe (ME), northern Italy (NI) and eastern Europe (EA).

Figure 7 presents the relative importance of individual predictors in the MLRs developed for the external regions (Fig. 1) for both seasons. The observation-based MLRs show that the main driving factor is LO3 in AMJ, while the effect of meteorological drivers becomes stronger in JAS. RH presents a larger contribution in some regions (e.g. IN, IP or SC) in AMJ and Tx in JAS (e.g. IN, IP, SC and BA). The contribution of wind components, Wdir and W10m, is mainly reflected in both seasons in the western regions (i.e. IN and IP) and in MD, respectively.

Figure 7. Proportion of each predictor to the total explained variance for each CTM-based (ordered as in Fig. 3) and observation-based MLR in AMJ (a) and JAS (b) for the external regions: Inflow (IN), Iberian Peninsula (IP), Scandinavia (SC), Mediterranean (MD) and Balkans (BA).

Figure 8. Standardised coefficient values of the key driving factors (LO3, Tx and RH) for each CTM-based (ordered as in Fig. 3) and observation-based MLR in AMJ (a) and JAS (b) and for the internal regions: England (EN), France (FR), central Europe (ME), northern Italy (NI) and eastern Europe (EA).

Figure 9. Standardised coefficient values of the key driving factors (LO3, Tx and RH) for each CTM-based (ordered as in Fig. 3) and observation-based MLR in AMJ (a) and JAS (b) and for the external regions: Inflow (IN), Iberian Peninsula (IP), Scandinavia (SC), Mediterranean (MD) and Balkans (BA).

Figure 10. Slopes (mO3−T; ppb K−1) obtained from a simple linear regression to estimate the ozone-temperature relationship for each CTM-based (ordered as in Fig. 3) and observation-based MLR in AMJ (a) and JAS (b) and for the internal regions: England (EN), France (FR), central Europe (ME), northern Italy (NI) and eastern Europe (EA).

Figure 11. Slopes (mO3−T; ppb K−1) obtained from a simple linear regression to estimate the ozone-temperature relationship for each CTM-based (ordered as in Fig. 3) and observation-based MLR in AMJ (a) and JAS (b) and for the external regions: Inflow (IN), Iberian Peninsula (IP), Scandinavia (SC), Mediterranean (MD) and Balkans (BA).

Figures 10 and 11 illustrate the mO3−T for the internal and external regions, respectively. The observed mO3−T is larger in JAS than in AMJ. In AMJ, it ranges between −0.45 and 1.15 ppb K−1 with the largest values found in ME, NI and MD. In JAS, the observed climate penalty is of the order of 1-2.7 ppb K−1 with the largest values in EN, FR, ME, NI and MD. CTMs show a better agreement with observations in JAS than in AMJ. CTMs tend to overestimate the climate penalty in AMJ in most of the regions, with some exceptions, such as EMEP and MATCH, which systematically underestimate the slopes. Also, CTMs are generally better at simulating the observed mO3−T in the internal regions compared to the external regions, where in general CTMs appear to overestimate the climate penalty in both seasons. Using this metric, we identify some regions that are partic-
° W to 39.8° E and from 32° N to 70° N, and it follows a regular latitude-

Table 1. Chemistry-transport models used in the study and the main characteristics (adapted from Colette et al., 2017a).

Table 2. List of the predictors used in the multiple linear regression analysis: meteorological parameters, lag of MDA8 O3 (24 h, previous day) and the seasonal cycle components.

est. Since the domains covered by observations and CTMs do not coincide exactly, we applied an observational mask to use the same number of grid cells for CTMs and observations. Data used to estimate parameters of the MLR were spatially averaged over each region. Thus, we compare MLRs developed separately for CTMs and observations for each region and season. The observational data set contains the gridded MDA8 O3 and the meteorology input from ERA-Interim, while the data set for the CTMs contains the MDA8 O3 from each one of them along with the corresponding meteorological input (LOTOS and RACMO2, CHIMERE and WRF, MATCH and HIRLAM) (see Table 1).
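The masking and regional averaging described above can be sketched as follows. This is a minimal illustration under stated assumptions: grids are represented as nested lists with `None` for missing cells, the regional mean is unweighted (no area weighting), and the function name and the small 2 x 3 grids are hypothetical, with made-up values that do not come from the study.

```python
def masked_regional_means(model_grid, obs_grid):
    """Average model and observed values over cells where both exist.

    Grids are equally shaped lists of rows; None marks a missing cell.
    The mean is unweighted (no area weighting), an assumption of this sketch.
    """
    if len(model_grid) != len(obs_grid):
        raise ValueError("grids must have the same number of rows")
    model_vals, obs_vals = [], []
    for model_row, obs_row in zip(model_grid, obs_grid):
        if len(model_row) != len(obs_row):
            raise ValueError("grids must have the same number of columns")
        for m, o in zip(model_row, obs_row):
            # Observational mask: keep a cell only if it is observed.
            if m is not None and o is not None:
                model_vals.append(m)
                obs_vals.append(o)
    if not obs_vals:
        raise ValueError("no overlapping cells")
    n = len(obs_vals)
    return sum(model_vals) / n, sum(obs_vals) / n, n

# Made-up 2 x 3 region (ppb); None marks cells without observations.
obs_demo = [[40.0, None, 44.0], [None, 42.0, 46.0]]
model_demo = [[45.0, 60.0, 47.0], [70.0, 44.0, 48.0]]

model_mean, obs_mean, n_cells = masked_regional_means(model_demo, obs_demo)
```

Applying the same mask to both data sets ensures that the two regional means are built from the same grid cells, so that model cells without an observed counterpart (here the values 60 and 70) do not enter the comparison.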

Table 3. List of regions with the acronym and coordinates.

MLRs show a similar performance to the observation-based MLRs in most of the regions. The lowest RMSE values for most of the MLRs are found in SC, ranging between 1 and 3 ppb, while EN shows the largest RMSE values. The MLRs from MATCH and CHIMERE show the lowest RMSE values (1-3 ppb), suggesting the best statistical fit from a predictive point of view. Both R2 and RMSE metrics indicate that the statistical performance of MLRs for observations and CTMs shows distinct variations between seasons and regions. Overall, better performances are found in JAS and in some regions (i.e. ME,