An analysis of long-term regional-scale ozone simulations over the Northeastern United States: variability and trends

C. Hogrefe, W. Hao, E. E. Zalewsky, J.-Y. Ku, B. Lynn, C. Rosenzweig, M. G. Schultz, S. Rast, M. J. Newchurch, L. Wang, P. L. Kinney, and G. Sistla
Atmospheric Sciences Research Center, State University of New York at Albany, Albany, NY, USA
New York State Department of Environmental Conservation, Albany, NY, USA
Weather It Is, LTD, Efrat, Israel
NASA-Goddard Institute for Space Studies, New York, NY, USA
Forschungszentrum Jülich, Germany
Max Planck Institute for Meteorology, Hamburg, Germany
University of Alabama, Huntsville, AL, USA
Mailman School of Public Health, Columbia University, New York, NY, USA


Introduction
Ground-level ozone has long been recognized as a pollutant that causes adverse health effects in humans (Kinney and Özkaynak, 1991; Bell et al., 2005; Ito et al., 2005) and damages crops and ecosystems (Mauzerall and Wang, 2001; NRC, 2004). While initial concerns about elevated ozone levels were local in scale and centered on highly polluted urban airsheds such as the Los Angeles basin (McRae and Seinfeld, 1983; Harley et al., 1993), subsequent research also examined regional aspects of ozone pollution such as the multi-state transport of ozone and its precursors (Eder et al., 1994; Vukovich, 1995; Brankov et al., 1998; Schichtel and Husar, 2001; Civerolo et al., 2003). During the past decade or so, work on intercontinental transport of air pollution (Jacob et al., 1999; Li et al., 2002; Holloway et al., 2003; Fiore et al., 2009) and the potential effects of climate change on air pollution (Hogrefe et al., 2004; Weaver et al., 2009; Jacob and Winner, 2009) introduced a global aspect to the problem of surface-level ozone, recognizing it as an environmental problem that is affected by phenomena occurring on spatial scales ranging from local to global and on temporal scales from hours to decades.
Published by Copernicus Publications on behalf of the European Geosciences Union.
This interplay of many scales poses significant challenges for air quality management. To date, most air quality management applications in the US have relied on applying regional-scale photochemical modeling systems for individual pollution episodes (Harley et al., 1993; Sistla et al., 2001), single years (Tesche et al., 2006; Tong and Mauzerall, 2006; Eder and Yu, 2006; Hogrefe et al., 2006; Zhang et al., 2009), and only rarely multiple years (Bouchet et al., 1999; Pierce et al., 2010; Godowitch et al., 2010) to quantify the effect of emission control strategies on ambient ozone concentrations. In many episodic or annual applications, emissions are defined for two scenarios: a baseline scenario reflecting current conditions and a control scenario reflecting future conditions. Model evaluation focuses on comparing predictions from the baseline scenario against observations. While such comparisons are certainly necessary to build confidence in the performance of the modeling system, they often leave several key questions unanswered. How well does the modeling system capture the effects of projected changes in emissions, i.e. how well does it perform for the purpose it is most often used for? How well do projected changes in emissions over time scales of a decade or more capture actual changes in emissions? How well does the modeling system capture the effects of meteorological variability on intra- and interannual time scales on ozone pollution, a question of particular importance when using regional-scale models to assess the effects of climate change? Most of these questions are at the core of "dynamic model evaluation", a concept defined by Gilliland et al. (2008) and integrated into an overall model evaluation framework by Dennis et al. (2010).
In this study, we present and analyze results based on air quality simulations performed with a regional photochemical model over the Northeastern US covering an 18-year period from 1988 to 2005. The objective of this study is to illustrate how future modeling studies going beyond typical photochemical model applications could be designed to help address some of the questions raised above, and to identify key inputs and processes that need to be considered when performing such simulations. In particular, we introduce various methods for comparing observed and simulated trends and variability of ground-level ozone concentrations, ozone precursors, and ozone/precursor relationships. Furthermore, to quantify the impact of lateral boundary conditions on simulated ozone concentrations and their variability and trends, we performed another 18-year model simulation utilizing chemical boundary conditions derived from archived monthly mean fields of global chemistry simulations performed with the ECHAM5-MOZART modeling system (Aghedo et al., 2007; RETRO, 2007; Rast et al., 2011) rather than the climatological time-invariant boundary conditions used in the base simulation. Finally, we discuss the results in the context of the model evaluation framework introduced by Dennis et al. (2010), especially from the point of view of dynamic model evaluation (Gilliland et al., 2008; Pierce et al., 2010; Godowitch et al., 2010).

Modeling system
The following is a brief summary of the model set-up used to perform the simulations analyzed in this study. The reader is referred to Hogrefe et al. (2009) for additional details. The Mesoscale Meteorological Model MM5 (Grell et al., 1994) was used to simulate meteorological conditions for the time period from 1 January 1988 to 31 December 2005. The meteorological simulations were performed on two-way nested grids with 36 km and 12 km grid cell sizes covering the Northeastern US. Throughout the model simulation, MM5 was nudged towards reanalysis fields from the National Centers for Environmental Prediction (NCEP) using four-dimensional data assimilation. All emission processing, including mobile sources and biogenic sources, was performed within the Sparse Matrix Operator Kernel Emissions (SMOKE) system (Houyoux et al., 2000). Anthropogenic emission inventories for 1988-2005 were compiled from a variety of sources as described in Hogrefe et al. (2009). Biogenic emissions were estimated with the BEIS3.12 model, taking into account MM5 temperature, radiation, and precipitation. To illustrate the changes in emissions over time, Table 1 presents the domain-wide anthropogenic NOx, VOC, and CO emissions for 1990, 1995, 2000, and 2005 grouped by major source sectors. The emission reductions between 1990 and 2005 were found to vary between 40% and 50% for total NOx, VOC, and CO, with the largest reductions attributable to the mobile source and point source sectors.
Two sets of regional air quality simulations, differing in their choice of boundary conditions as described below, were performed with version 4.6 of the Community Multiscale Air Quality (CMAQ) model (Byun and Schere, 2006) rather than the CMAQ 4.5.1 version that was used in Hogrefe et al. (2009) with the same set of meteorological and emission inputs. Air quality model simulations were performed with two one-way nested grids of 36 km and 12 km, corresponding to the MM5 grids except for a ring of buffer cells. These modeling domains along with the locations of the monitoring stations discussed in Sect. 2.2 are presented in Fig. 1. The height of the first model layer was set at 38 m. Gas phase chemistry was represented by the CB-IV mechanism (Gery, 1989) [...] , 2007). It should be noted that the differences between the CMAQ/STATIC and CMAQ/ECHAM5-MOZART simulations can be caused both by differences in the magnitude and by differences in the temporal variability of the boundary conditions. Additional CMAQ simulations would be needed to separate these effects. One simulation could utilize boundary conditions generated by overlaying normalized temporal fluctuations extracted from the ECHAM5-MOZART simulation on the time-invariant climatological CMAQ profile, while another simulation could utilize a constant (time-invariant) vertical profile derived from time-averaging the ECHAM5-MOZART concentrations. However, performing these additional simulations is outside the scope of the present study.
As pointed out in previous studies (e.g. Tang et al., 2008; Lam and Fu, 2010), several issues need to be considered when deriving chemical boundary conditions from a global model for a regional-scale application. First, because the chemical mechanisms of the two models differ, species mapping needs to be performed. In the present study, the archived ECHAM5-MOZART fields did not contain any of the CMAQ aerosol species and only a limited set of gas phase species. For the unavailable species, including most VOC groups except isoprene, the same time-invariant climatological values used in the CMAQ/STATIC simulations were used in the CMAQ/ECHAM5-MOZART simulations. Second, the concentration fields from the global model need to be mapped to the spatial and temporal structure of the regional model. In the present study, spatial mapping was accomplished through bilinear interpolation in the horizontal and linear interpolation in the vertical dimension from the ECHAM5-MOZART grid to the horizontal and vertical structure of the boundary cells along the 36 km CMAQ grid. Third, studies have pointed out that regional-scale modeling systems often are configured with a vertical resolution in the upper troposphere and lower stratosphere that is not sufficiently fine to resolve stratosphere-troposphere exchange processes (Eder et al., 2006; Lam and Fu, 2010). Therefore, prescribing stratospheric concentration values extracted from the global model at the lateral boundaries in upper levels can result in an unrealistic downward mixing of these concentrations to the lower troposphere and even the surface (Mathur et al., 2004). Because the setup of the MM5/CMAQ system used in this study also uses a relatively coarse vertical resolution in the upper troposphere and lower stratosphere, boundary conditions for the top two model layers 14 and 15 (which have midpoint heights of 9.5 km and 13 km, respectively, as shown in Table 2) were set to the same value as for layer 13 for the CMAQ/ECHAM5-MOZART simulations to avoid intrusion of stratospheric concentration values. This approach is similar to the one described by Lam and Fu (2010), who derived CMAQ boundary conditions from the GEOS-CHEM model, except that in their study a dynamic tropopause detection algorithm was used to exclude stratospheric concentrations from GEOS-CHEM from the calculation of CMAQ boundary conditions. Since the top of model layer 13 in our study is at roughly 8 km, restricting the use of ECHAM5-MOZART concentrations to this and lower levels for the computation of CMAQ boundary conditions serves the same purpose as the algorithm described by Lam and Fu (2010).
Differences between the two sets of boundary conditions are illustrated in Fig. 2 and Table 2. Figure 2 shows the spatial variations of the ozone concentrations along the four model boundaries used in the CMAQ/STATIC and CMAQ/ECHAM5-MOZART simulations along with the original ECHAM5-MOZART concentrations before setting layers 14 and 15 to the same value as layer 13. The ECHAM5-MOZART based concentrations were temporally averaged over the entire simulation period for display in this figure. The figure illustrates that the ozone boundary conditions derived from ECHAM5-MOZART generally are higher than those derived from the static profile and also show more spatial variability along the boundaries. Moreover, this figure clearly illustrates the desired effect of setting the concentrations for layers 14 and 15 to the same value as layer 13 to avoid the utilization of stratospheric ozone values in the CMAQ simulations. This effect is particularly pronounced for the northern boundary and for the northernmost cells of the western and eastern boundaries, consistent with generally lower tropopause heights at northern latitudes compared to southern latitudes. Table 2 shows the boundary conditions used in the CMAQ/STATIC and CMAQ/ECHAM5-MOZART simulations for additional species and selected layers, spatially averaged over all boundary cells and, in the case of the CMAQ/ECHAM5-MOZART simulations, temporally averaged over the entire simulation time period. Besides the differences in ozone concentrations already illustrated in Fig. 2, this table also shows noticeable differences in the magnitude and vertical distribution of NO, NO2, and especially PAN, which have the potential to affect simulated ozone concentrations.
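The vertical treatment of the boundary conditions described above (linear interpolation in the vertical, then overwriting layers 14 and 15 with the layer 13 values) can be sketched in a few lines. This is a minimal illustration and not the processing code used for the study; the function names, the array layout (layers by boundary cells, layer 1 first), and the use of NumPy are assumptions made here for clarity.

```python
import numpy as np

def interp_vertical(z_src, c_src, z_dst):
    """Linearly interpolate one concentration column to new heights.

    z_src must be increasing; heights outside its range take the end values.
    """
    return np.interp(z_dst, z_src, c_src)

def cap_upper_layers(bc, top_layer=13):
    """Copy the values of `top_layer` (1-based) into every layer above it.

    bc: array of shape (n_layers, n_boundary_cells), layer 1 at index 0
        (hypothetical layout chosen for this sketch).
    Returns a new array; the input is left unchanged.
    """
    capped = np.array(bc, dtype=float)      # np.array copies by default
    idx = top_layer - 1                     # 0-based index of layer 13
    capped[idx + 1:, :] = capped[idx, :]    # layers 14 and 15 <- layer 13
    return capped
```

With 15 layers and `top_layer=13`, only the two uppermost rows change, which mirrors the intent of keeping stratospheric values from the global model out of the regional boundary conditions.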

Observations
Hourly ozone, CO, and NOx observations from 1988 to 2005 were obtained from the US EPA Air Quality System (AQS). As stated above, only sites located within the 12 km CMAQ modeling domain, shown in Fig. 1a, were included in the analysis. All data were screened for completeness prior to analysis, and sites with more than 60% of data missing in any given year were excluded from this analysis. The application of this screening criterion resulted in the selection of 90, 34, and 3 sites with at least 40% data completeness in each year for ozone, CO, and NOx, respectively. To evaluate temporal [...] Munger et al. (1996, 1998).
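The completeness screening described above amounts to requiring at least 40% valid hourly values at a site in every year. A minimal sketch follows, assuming missing hours are stored as NaN and that a year label is available for each hour; the function name and array layout are hypothetical.

```python
import numpy as np

def sites_passing_completeness(obs, years, min_fraction=0.4):
    """Flag sites that meet the completeness threshold in every year.

    obs:   (n_sites, n_hours) hourly values, NaN where missing (assumed layout)
    years: (n_hours,) calendar year of each hourly value
    Returns a boolean array of length n_sites.
    """
    obs = np.asarray(obs, dtype=float)
    years = np.asarray(years)
    keep = np.ones(obs.shape[0], dtype=bool)
    for yr in np.unique(years):
        in_year = years == yr
        # fraction of non-missing hours per site in this year
        frac_valid = np.mean(~np.isnan(obs[:, in_year]), axis=1)
        keep &= frac_valid >= min_fraction
    return keep
```

A site that falls below the threshold in a single year is dropped for the whole period, which is what keeps the set of monitors fixed across all 18 years.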
For the evaluation of upper air ozone simulations, ozonesonde observations taken at two sites within the 36 km CMAQ modeling domain were obtained from the World Ozone and Ultraviolet Radiation Data Center (WOUDC). These two sites are Wallops Island, Virginia, operated by the National Aeronautics and Space Administration (NASA), and another site operated by the University of Alabama at Huntsville (UAH). The UAH ozonesonde station is part of the National Oceanic and Atmospheric Administration (NOAA) ozonesonde network and is funded by NOAA. The total number of available ozonesonde launches at these two sites during the 1988-2005 analysis time period was 660 and 305, respectively.
The locations of all monitoring sites are shown in Fig. 1. In all analyses comparing observations and model predictions, monitored values were assigned to the model grid cells in which the monitor was located.
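Assigning a monitor to the grid cell that contains it reduces to integer division once the monitor location is expressed in the model's projected coordinates. The sketch below assumes a regular grid with square cells and a known lower-left corner; the projection step itself is not shown, and all names and the example numbers are hypothetical.

```python
import math

def grid_cell_index(x, y, x_origin, y_origin, cell_size, ncols, nrows):
    """Return (row, col) of the grid cell containing point (x, y), or None.

    x, y:               monitor location in projected coordinates (m)
    x_origin, y_origin: lower-left corner of the grid (m)
    cell_size:          grid spacing (m), e.g. 12000 for a 12 km grid
    """
    col = math.floor((x - x_origin) / cell_size)
    row = math.floor((y - y_origin) / cell_size)
    if 0 <= col < ncols and 0 <= row < nrows:
        return row, col
    return None  # monitor lies outside the modeling domain
```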

Variability and trends in surface ozone
While the focus of the analysis in this paper is on the comparison of observed and simulated ozone variability and trends over 18 years, we also compiled standard statistical measures of model performance for May-September 8-h daily maximum ozone concentrations. The results of this analysis for the CMAQ/STATIC simulations across the 18 years and 90 monitors are shown in Table 3 and reveal a similar level of model performance as reported in other studies for individual years (e.g. Eder and Yu, 2006; Appel et al., 2007), with an absolute (normalized) bias of +4.9 ppb (+9.7%) and an absolute (normalized) root mean square error of 14.5 ppb (28.2%). At the 95th percentile of May-September 8-h daily maximum ozone concentrations, the absolute (normalized) bias is −0.7 ppb (−0.9%) and the absolute (normalized) root mean square error is 7.7 ppb (9.3%). At the 5th percentile of May-September 8-h daily maximum ozone concentrations, the absolute (normalized) bias is +12.5 ppb (+56.6%) and the absolute (normalized) root mean square error is 13.6 ppb (60%), indicating that the model tends to slightly underestimate high observed values and to strongly overestimate low observed values. Correlation coefficients are greater than 0.7 both for all values and for the 95th percentile, while they are less than 0.2 for the 5th percentile values, again indicating performance issues at the low end of the distribution, which is strongly influenced by background concentrations as specified through boundary conditions. Results for 1-h daily maximum ozone concentrations generally are similar to those for 8-h daily maximum ozone concentrations.
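For reference, the kinds of metrics quoted above can be computed at a single monitor as in the sketch below. The text does not spell out the normalization, so dividing by the mean observed value is an assumption made here, as are the function and key names.

```python
import numpy as np

def performance_metrics(obs, mod):
    """Bias, RMSE, their normalized forms, and correlation for paired values.

    obs, mod: 1-D arrays of paired observed and simulated concentrations (ppb).
    Normalization by the mean observation is an assumption of this sketch.
    """
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    diff = mod - obs
    bias = diff.mean()
    rmse = np.sqrt((diff ** 2).mean())
    return {
        "bias": bias,
        "norm_bias_pct": 100.0 * bias / obs.mean(),
        "rmse": rmse,
        "norm_rmse_pct": 100.0 * rmse / obs.mean(),
        "r": np.corrcoef(obs, mod)[0, 1],
    }
```

Following the table description, such metrics would be computed separately at each monitor and then averaged across monitors.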
As a first step in comparing observed and simulated variability, Fig. 3 presents power spectra calculated from 18 years of hourly observed and CMAQ/STATIC ozone time series. To reduce the noise in the spectra and facilitate the comparison, we calculated the spectra at 19 selected sites and then averaged the spectral density at each frequency over these sites. Figure 3 illustrates that CMAQ/STATIC tends to capture the variability in the diurnal and synoptic bands but underestimates variability in the high-frequency (intra-day) and low-frequency (seasonal and long-term) bands of the spectrum. The underestimation of the intra-day variability is consistent with earlier analyses of simulations for single summers (Hogrefe et al., 2001), while an analysis of the strength of longer-term fluctuations had not been possible previously because of the limited duration of simulations.
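The site-averaged spectrum described above can be approximated with a plain periodogram, as sketched below. The spectral estimator actually used for Fig. 3 is not stated in the text, so the raw FFT periodogram, the requirement of gap-free series, and the function name are all assumptions of this sketch.

```python
import numpy as np

def mean_power_spectrum(series, dt_hours=1.0):
    """Average the periodogram of several equally long time series.

    series: (n_sites, n_times) gap-free hourly values (assumed; real data
            would need gap filling before this step)
    Returns (freq, power): frequencies in cycles per hour and the
    spectral density averaged over sites at each frequency.
    """
    x = np.asarray(series, dtype=float)
    x = x - x.mean(axis=1, keepdims=True)          # remove each site's mean
    n = x.shape[1]
    power = np.abs(np.fft.rfft(x, axis=1)) ** 2 / n
    freq = np.fft.rfftfreq(n, d=dt_hours)
    return freq, power.mean(axis=0)                # average across sites
```

Applying the same function to observed and simulated series at identical sites gives two spectra that can be compared band by band (intra-day, diurnal, synoptic, seasonal, long-term).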
To further study longer-term variability, we calculated inter-annual variability (IAV) of observed and CMAQ/STATIC 8-h daily maximum ozone as follows. First, we rank-ordered each year's May-September distribution of daily maximum 8-h ozone at each site. Next, for each rank we calculated IAV as the standard deviation of these 18 values divided by the mean of these 18 values. We performed this calculation separately for observations and the CMAQ/STATIC simulations at each site. Figure 4a shows boxplots of the observed and simulated IAV for the 5th, 25th, 50th, 75th, and 95th percentiles of May-September 8-h daily maximum ozone; the boxplots show the distribution of IAV for a given percentile across all 90 sites. It is evident that the CMAQ/STATIC IAV is lower than the observed IAV for all percentiles. This is confirmed by Fig. 4b, which shows the ratio of simulated to observed IAV versus all percentiles of May-September 8-h daily maximum ozone. While this ratio is less than one for all percentiles, the underestimation is most pronounced for the lower percentiles.
Table 3. Model performance metrics calculated for the CMAQ/STATIC simulated daily maximum 8-h ozone concentrations at 90 monitors for 1988-2005. Metrics for the "All days" row were calculated using all observed and simulated values for May-September for each year. Metrics for the 5th and 95th percentile rows were calculated by first determining the corresponding percentile from the 153 daily May-September values for each year and then calculating the metrics across the 18 years from 1988-2005. All metrics were calculated separately at each monitor and then averaged for display in this table.
In addition to comparing observed and simulated variability on interannual timescales, the extended simulation period also provides an opportunity to compare observed and simulated trends in ozone concentrations. Figure 5 shows time series of the 5th, 50th, and 95th summertime percentiles estimated from observed and simulated May-September 8-h daily maximum ozone concentrations for 1988-2005 across the 90 ozone monitors in the modeling domain. The figure indicates that CMAQ/STATIC appears to capture the trend in the upper range but not in the middle and lower range of the summertime ozone distribution. In addition, CMAQ underestimates intra-seasonal variability, as indicated by the spread of the percentiles within a given year, and interannual variability, as measured by the variability across the years for a given percentile. This latter result is consistent with Fig. 3 shown above.
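The IAV calculation introduced earlier in this section (rank-order each year's daily values, then take the standard deviation across years divided by the mean across years for each rank) can be sketched as follows. The choice of the sample standard deviation (`ddof=1`) and the function name are assumptions; the text does not specify either.

```python
import numpy as np

def interannual_variability(daily_max):
    """IAV per rank for one site.

    daily_max: (n_years, n_days) May-September daily maximum 8-h ozone,
               one row per year (assumed layout, no missing values).
    Returns an array of length n_days: std across years / mean across years
    for each rank of the sorted within-year distribution.
    """
    ranked = np.sort(np.asarray(daily_max, dtype=float), axis=1)  # rank-order each year
    return ranked.std(axis=0, ddof=1) / ranked.mean(axis=0)
```

Dividing the simulated result by the observed one, rank by rank, yields a modeled/observed IAV ratio of the kind shown in Fig. 4b.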
To provide an illustration of the spatial variability in observed and simulated ozone trends, Fig. 6a-d provide least-squares trend estimates for the 5th and 95th percentiles of May-September daily maximum 8-h ozone concentrations at each of the 90 O3 monitors considered in this study. Consistent with Fig. 5, there is good agreement in both magnitude and spatial variability of the linear trends estimated for the 95th percentiles of observed and CMAQ/STATIC simulated summertime 8-h daily maximum ozone concentrations. Trends are generally downward, with the largest negative trends of −1.5 ppb/yr or more in the greater New York City area. On the other hand, for the 5th percentiles, while observations show an increasing trend at almost all stations, there is a mixture of upward and downward trends in the CMAQ/STATIC simulations with only a small trend when averaged over all stations.
Further analysis of the differences between observed and simulated trend estimates across different parts of the summertime ozone distribution was performed as follows: linear trends were estimated at each site for each percentile of the rank-ordered May-September 8-h daily maximum ozone concentrations over the 1988-2005 time period. Figure 7 shows the magnitude of these trends on the y-axis plotted against the percentiles on the x-axis. While trends were calculated separately at each site, the median across all sites is shown in this figure. Figure 7 shows that for the 95th percentile, the median trend across all sites is −0.71 ppb/yr for the observations and −0.89 ppb/yr for CMAQ/STATIC. For the 50th percentile, the trends are −0.07 ppb/yr observed vs. −0.32 ppb/yr simulated, and for the 5th percentile, the trends are +0.24 ppb/yr observed vs. −0.03 ppb/yr simulated. In other words, the relative difference between the observed and simulated trends is about 25% at the 95th percentile but much larger for the 50th and 5th percentiles. Moreover, for the 5th percentile the direction of the trend differs between the observations and the CMAQ/STATIC simulations, with the observations showing an increase and the CMAQ/STATIC simulations exhibiting a small decrease. These results confirm that the agreement between the linear trends estimated from observations and CMAQ/STATIC is better for the upper than for the lower percentiles.
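The per-percentile trend analysis can be sketched as a least-squares fit at every rank of the sorted seasonal distribution, followed by a median across sites. This is an illustrative sketch rather than the code behind Fig. 7; the function names and array layout are hypothetical.

```python
import numpy as np

def rank_trends(daily_max, years):
    """Least-squares linear trend (ppb/yr) for each rank at one site.

    daily_max: (n_years, n_days) May-September daily maximum 8-h ozone
    years:     (n_years,) calendar years matching the rows
    """
    ranked = np.sort(np.asarray(daily_max, dtype=float), axis=1)
    x = np.asarray(years, dtype=float)
    x = x - x.mean()                    # centering does not change the slope
    # polyfit fits all columns at once; row 0 of the result holds the slopes
    return np.polyfit(x, ranked, 1)[0]

def median_trend_across_sites(site_data, years):
    """Median of the per-rank trends over a list of per-site arrays."""
    return np.median([rank_trends(d, years) for d in site_data], axis=0)
```

As a check on the numbers quoted above, the relative difference at the 95th percentile is |−0.89 − (−0.71)| / 0.71 ≈ 0.25, i.e. about 25%.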

Variability and trends in ozone precursors
Simulated ozone trends, especially at the upper end of the distribution, are strongly influenced by trends in anthropogenic emissions within the modeling domain. While the relatively close agreement between observed and simulated ozone trends at the upper end of the distribution presented in the previous section suggests that the underlying emission trends assumed in this study were reasonable, such an agreement of ozone trends could be the result of compensating errors. Therefore, trend analysis was extended to ground-level concentrations of NOx and CO measured during early morning hours, which can serve as a proxy for emissions (Godowitch et al., 2010), to examine whether the assumptions about emission trends made in this study as described in Sect. 2.1 were consistent with observational evidence. Figure 8a-b show the baseline time series of observed and simulated ground-level NOx and CO for 1988-2005, and Table 4 depicts average concentrations, variability in space, and trends over time for these pollutants. The baseline time series were estimated by applying a Kolmogorov-Zurbenko (KZ) iterated moving average filter as described in Rao et al. (1997) using a window length of 31 days and 3 iterations. The time series were computed as spatial averages over the 3 NOx and 34 CO monitors described in Sect. 2.2 and are based on 06:00-09:00 local time average concentrations for each day. For NOx, it can be seen that the simulated concentrations are about 50% lower than the observations, which may in part be due to the location of monitors near emission sources such as roadways and the inability of the modeling system to capture the observed spatial concentration gradients. However, despite the differences in magnitude, the simulated concentrations capture many aspects of the variability and trends present in the observations. Both observations and CMAQ/STATIC exhibit wintertime maxima, and both show a decrease in average concentrations by about 30 ppb over the 18-year time period (from about 90 ppb to about 60 ppb in the observations and from about 50 ppb to about 20 ppb for CMAQ/STATIC). Table 4 further illustrates that the trends vary spatially among the three NOx sites analyzed here and that the model-simulated trends are in good agreement with observations at all sites. For CO, the simulated concentrations are ∼50% lower than observations for the earlier time periods, while the underestimation decreases to about 20%-25% for later time periods. Again, part of the underestimation is likely due to the fact that the CO monitors are located near sources and the concentrations measured by these monitors are not reflective of the 12 km spatial scale simulated by CMAQ. In terms of variability, both observations and CMAQ exhibit wintertime maxima. In terms of trends, the observations show a steeper decrease over time than CMAQ, with observations decreasing on average by 1 ppm over the 18 years while the CMAQ concentrations decreased by only about 0.4 ppm on average over the same period. However, as illustrated by Table 4, the level of agreement between observed and simulated trends varies across space, with somewhat better agreement at the lower absolute range of observed trends.
Atmos. Chem. Phys., 11, 567-582, 2011, www.atmos-chem-phys.net/11/567/2011/
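The KZ filter used to extract the baseline is a simple moving average applied repeatedly. The sketch below uses the window length of 31 days and 3 iterations quoted in the text; the handling of missing values (averaging over whatever valid points fall in each window) and of the series ends (shortened windows) is an assumption of this sketch, as is the function name.

```python
import numpy as np

def kz_filter(x, window=31, iterations=3):
    """Kolmogorov-Zurbenko filter: an iterated centered moving average.

    x: 1-D daily series, NaN where missing; must be at least `window` long.
    Each pass averages the valid values inside the window around each point.
    """
    x = np.asarray(x, dtype=float)
    if x.size < window:
        raise ValueError("series must be at least one window long")
    kernel = np.ones(window)
    for _ in range(iterations):
        valid = ~np.isnan(x)
        sums = np.convolve(np.where(valid, x, 0.0), kernel, mode="same")
        counts = np.convolve(valid.astype(float), kernel, mode="same")
        with np.errstate(invalid="ignore", divide="ignore"):
            x = np.where(counts > 0, sums / counts, np.nan)
    return x
```

Applied to the daily 06:00-09:00 averages, the output retains seasonal and longer-term fluctuations while suppressing day-to-day and synoptic variability.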
Because the relative abundance of NOx and VOC emissions can influence the chemical regime for ozone formation, a comparison of trends in observed and simulated indicators of the photochemical regime can serve as another indirect way of testing whether the assumptions made about emission trends in this study are consistent with observational evidence. A number of such indicators have been used in previous studies (e.g. Zhang et al., 2009, and references therein); however, very few measurements exist to compute most of these indicators over an extended time period for the modeling domain considered here. A notable exception is the set of long-term measurements at the Harvard Forest experimental site (Munger et al., 1996, 1998), which provide measurements of O3, NOx, and NOy since 1990 and, therefore, can be used to compute the ratio of O3 to NOz. This ratio can be viewed both as an indicator of the photochemical regime (Trainer et al., 1993; Olszyna et al., 1994) and as an indicator of the amount of ozone produced per molecule of NOx being oxidized. In this analysis, the slope of a least-squares regression of O3 vs. NOz was computed for each May-September time period, using only values during the photochemically active 10:00 to 17:00 local time period when the NOx/NOz ratio was less than 0.3 and NOz was less than 8 ppb to screen for aged air masses. Time series of the slopes derived from observations and CMAQ/STATIC simulations are depicted in Fig. 9. Overall, the magnitude of these slopes is similar between the observations and CMAQ/STATIC simulations, with typical values ranging between four and six. The slopes also exhibit interannual variability for both observations and model predictions. Moreover, the observations generally show a trend towards larger slopes, a feature that is captured by CMAQ/STATIC and is consistent with the results presented in Godowitch et al. (2008). It should be noted that there is only a single annual data point in the observations after 1996 due to missing data in either NOx or NOy. The general agreement between the observed and simulated slopes is another indirect confirmation that the trends in emission inventories assumed in this study are within reasonable bounds. However, it is important to point out that uncertainties in the treatment of nitrogen chemistry in the photochemical mechanism may affect the simulated O3/NOz slopes; therefore, the relatively close agreement with observations might be fortuitous.
Fig. 9. Time series of the O3 vs. NOz slope for observations and CMAQ/STATIC simulations. For a given year, the slope was estimated from a least-squares regression of O3 vs. NOz concentrations during May-September, considering only values during the photochemically active time period between 10:00 and 17:00 local time. Furthermore, only hours when the NOx/NOz ratio was less than 0.3 and NOz was less than 8 ppb were included in the analysis to screen out air masses influenced by fresh emissions.
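The screening and regression behind the O3 vs. NOz slope can be sketched as below for one May-September season. NOz is taken as NOy minus NOx, the usual definition; treating the time window as 10:00 ≤ hour < 17:00, requiring NOz > 0, and the function name are assumptions of this sketch.

```python
import numpy as np

def o3_noz_slope(o3, nox, noy, hour):
    """Slope of a least-squares regression of O3 on NOz for screened hours.

    o3, nox, noy: hourly concentrations (ppb) for one May-September season
    hour:         local hour of day (0-23) for each value
    Screening: 10:00-17:00 local time, NOx/NOz < 0.3, NOz < 8 ppb.
    """
    o3, nox, noy, hour = (np.asarray(a, dtype=float) for a in (o3, nox, noy, hour))
    noz = noy - nox                                   # NOz = NOy - NOx
    with np.errstate(invalid="ignore", divide="ignore"):
        ratio = nox / noz
        keep = (
            (hour >= 10) & (hour < 17)                # photochemically active hours
            & (noz > 0) & (noz < 8.0)                 # assumed: positive NOz only
            & (ratio < 0.3)                           # aged air masses
            & ~np.isnan(o3)
        )
    slope, _intercept = np.polyfit(noz[keep], o3[keep], 1)
    return slope
```

Hours with missing NOx or NOy drop out automatically because comparisons involving NaN evaluate to False.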
In summary, while the analysis presented in this section does not constitute a comprehensive evaluation of reported emission trends such as those reported in previous studies (e.g. Parrish, 2006), the relatively good agreement between observed and simulated trends in morning NOx concentrations and the O3/NOz ratio at Harvard Forest tends to support the notion that the emission assumptions made in this study were adequate for the purpose of simulating ground-level ozone concentrations. On the other hand, the discrepancies between observed and simulated CO in terms of both magnitudes and trends point to the need for additional research. It should also be noted that recent studies such as Dallmann and Harley (2010) have pointed out that assumptions about NOx trends in the MOBILE6 emission model used in this study may have misrepresented trends in the relative contribution of gasoline and diesel vehicles to total mobile source NOx emissions, raising the possibility of compensating errors and pointing to the need for future diagnostic studies evaluating long-term trends in ozone precursors in a variety of urban and rural environments.

Impact of chemical boundary conditions on absolute concentrations, variability and trends
Because lower percentiles of the summertime ozone distribution tend to be more influenced by background concentrations and boundary conditions, the use of time-invariant lateral boundary conditions in the CMAQ/STATIC simulations likely contributed to the underestimation of interannual variability and the disagreement between observed and simulated ozone trends, especially for lower percentiles. To investigate this hypothesis, we repeated the analysis of IAV and trends for the 18-year CMAQ simulations that utilized chemical boundary conditions derived from monthly-mean concentrations from archived ECHAM5-MOZART simulations as described in Sect. 2. Figure 10a and b show the results of this analysis, with the IAV analysis (analogous to Fig. 4b) displayed in Fig. 10a and the trend analysis (analogous to Fig. 7) displayed in Fig. 10b. The results of the IAV analysis indicate that both sets of CMAQ simulations underestimate observed IAV, with modeled/observed IAV ratios less than 1, but also show that the CMAQ simulation deriving its boundary conditions from the archived ECHAM5-MOZART simulations significantly improves the representation of IAV for mid and low percentiles. However, the trend estimates shown in Fig. 10b reveal that the CMAQ/ECHAM5-MOZART simulations exhibit an even larger discrepancy with the observations than the CMAQ/STATIC simulations. This is visible for all percentiles but most pronounced for the lower percentiles, lending support to the hypothesis that modeled trends at the 5th percentile are strongly impacted by large-scale features specified through the chemical boundary conditions.
It should be noted that some of the upward trend in the observed 5th percentile may be due to localized effects such as a decrease in the amount of NOx titration over time, and such localized effects caused by the location of some of the AQS monitors in urban areas may not have been captured by the CMAQ simulations because of the 12 km grid spacing. To investigate whether this is the primary reason for the discrepancy between observed and simulated trends for the lower end of the distribution, we performed a similar trend analysis using ozone data from the predominantly rural Clean Air Status and Trends Network (CASTNet). The results from this additional analysis are qualitatively similar to the information shown in Fig. 10b, i.e. the observed 5th percentile shows an upward trend while the CMAQ/STATIC simulations show a slight downward trend and the CMAQ/ECHAM5-MOZART simulations show a strong downward trend, indicating that localized effects are not the main factor for the discrepancies between the observed and simulated trends shown in Fig. 10b. Because of these pronounced impacts of the choice of boundary conditions on variability and trends, it is of interest to further study the differences between these two simulations. Figure 11 shows differences in seasonal average daily maximum ozone concentrations between the CMAQ/ECHAM5-MOZART and CMAQ/STATIC simulations for model layer 1 for winter, spring, summer, and fall, each averaged over the 18 years of the simulation period. While the impact of different boundary conditions on monthly average daily maximum ozone decreases towards the interior of the domain, it still reaches 3-9 ppb in July for the regions typically exhibiting the highest observed ozone concentrations.
Figure 12a-b display the impact of different boundary conditions on average daily maximum ozone concentrations as a function of day-of-year, again averaged over 1988-2005 and over all 90 monitors in the modeling domain. It can be seen that CMAQ/ECHAM5-MOZART generally yields higher concentrations than CMAQ/STATIC and that the differences are largest in spring and fall and can be as large as 12 ppb averaged over all sites. The higher concentrations for the CMAQ/ECHAM5-MOZART simulations are consistent with Table 2, which showed higher boundary conditions for ozone as well as NOx and PAN compared to the time-invariant static profile. It is also evident that the CMAQ/STATIC concentrations are generally closer to observed concentrations than the CMAQ/ECHAM5-MOZART concentrations. This is confirmed by a comparison of the
Table 5. Model performance metrics calculated for the CMAQ/ECHAM5-MOZART simulated daily maximum 8-h ozone concentrations at 90 monitors for 1988-2005. Metrics for the "All days" row were calculated using all observed and simulated values for May-September for each year. Metrics for the 5th and 95th percentile rows were calculated by first determining the corresponding percentile from the 153 daily May-September values for each year and then calculating the metrics across the 18 years from 1988-2005. All metrics were calculated separately at each monitor and then averaged for display in this table.
, 2007), and recent work suggests that this bias can be reduced by an improved representation of biogenic emissions (Engel, 2009).
While the analysis presented thus far has focused on surface observations, the choice of lateral boundary conditions is also expected to have a significant impact on simulated concentrations in the free troposphere. Figure 13a-d show a comparison of observed and modeled vertical profiles of the average and standard deviation of ozone concentrations across all available launches at the two ozonesonde sites described in Sect. 2. We restricted the comparison to CMAQ layers that are completely within the troposphere because of the limited vertical resolution of these simulations in the tropopause region as discussed in Sect. 2. The mean concentration profiles show that the CMAQ/STATIC simulations are closer to observations than the CMAQ/ECHAM5-MOZART simulations throughout the troposphere at the Wallops Island (WI) launch site. At the University of Alabama at Huntsville (UAH) launch site, the CMAQ/STATIC simulations are closer to observations in the lower and upper troposphere while the CMAQ/ECHAM5-MOZART simulations are closer to observations in the mid troposphere. The comparison of observed and simulated vertical profiles of ozone standard deviations over all available launches shows better performance for the CMAQ/ECHAM5-MOZART simulations at both sites, especially in the free troposphere. In addition, despite the differences in absolute magnitude, the shape of the vertical ozone profiles is better reproduced in the CMAQ/ECHAM5-MOZART simulation at both sites.
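As a rough illustration of the profile statistics discussed above, the following Python sketch computes the per-layer mean and standard deviation across launches for paired observed and modeled profiles. It assumes the ozonesonde data have already been mapped onto the model layers and that missing values are stored as NaN; the function name `paired_profile_stats`, the pairing rule (a launch/layer pair is used only when both values are present), and the use of the sample standard deviation are assumptions of this sketch, not details given in the text.

```python
import numpy as np

def paired_profile_stats(obs, mod):
    """Per-layer mean and standard deviation across launches.

    obs, mod: arrays of shape (n_launches, n_layers) with ozone
    concentrations (ppb), NaN where a value is missing. Only
    launch/layer pairs present in both datasets are used (assumption).
    """
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    valid = np.isfinite(obs) & np.isfinite(mod)
    obs_v = np.where(valid, obs, np.nan)
    mod_v = np.where(valid, mod, np.nan)
    return {
        "obs_mean": np.nanmean(obs_v, axis=0),
        "mod_mean": np.nanmean(mod_v, axis=0),
        # ddof=1 gives the sample standard deviation over launches
        "obs_std": np.nanstd(obs_v, axis=0, ddof=1),
        "mod_std": np.nanstd(mod_v, axis=0, ddof=1),
    }

# Tiny made-up example: 3 launches, 2 layers, one missing observation.
obs = np.array([[40.0, 60.0], [50.0, np.nan], [60.0, 80.0]])
mod = np.array([[45.0, 70.0], [55.0, 75.0], [65.0, 90.0]])
stats = paired_profile_stats(obs, mod)
print(stats["obs_mean"], stats["mod_mean"])
```

Masking the model wherever the observation is missing keeps both profiles based on the same set of launches, so differences in the statistics are not an artifact of unequal sampling.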
In terms of long-term trends, Fig. 14a-d show time series of observed and simulated annual mean anomalies at four different vertical levels at Wallops Island. The annual mean anomalies were calculated by subtracting the overall mean concentration for a given dataset from the annual mean concentrations that were calculated for all available observations and corresponding model predictions in a given year. These figures illustrate that the observed time series exhibit more
Overall, these figures confirm that boundary conditions have a profound impact on simulated ozone concentrations and their trends throughout the troposphere, that the CMAQ/ECHAM5-MOZART simulations have a tendency for overpredictions that is less evident in the CMAQ/STATIC simulations, and that the CMAQ/ECHAM5-MOZART simulations capture more of the observed variability than the CMAQ/STATIC simulations, especially in the free troposphere.
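The annual mean anomaly calculation used for the ozonesonde time series can be sketched in a few lines of Python. The function name `annual_mean_anomalies` is hypothetical, and the sketch assumes that the "overall mean" is the mean over all available values in a dataset rather than the mean of the annual means; the text does not specify which of the two was used.

```python
import numpy as np

def annual_mean_anomalies(years, values):
    """Annual mean minus overall mean for one dataset at one level.

    years: calendar year of each available sample (e.g. each launch).
    values: ozone concentration of each sample; NaNs are dropped.
    The overall mean is taken over all samples (assumption).
    """
    years = np.asarray(years)
    values = np.asarray(values, dtype=float)
    ok = np.isfinite(values)
    years, values = years[ok], values[ok]
    overall_mean = values.mean()
    unique_years = np.unique(years)
    annual_means = np.array([values[years == y].mean() for y in unique_years])
    return unique_years, annual_means - overall_mean

# Made-up example: two samples in 1995 and one in 1996.
yrs, anom = annual_mean_anomalies([1995, 1995, 1996], [40.0, 50.0, 60.0])
print(dict(zip(yrs.tolist(), anom.tolist())))
```

Running the function separately on the observations and on each simulation, using only model values that correspond to available observations, would yield anomaly series that can be compared directly.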

Discussion and summary
In this study, we presented and analyzed the results from two sets of 18-year air quality simulations over the Northeastern US performed with a regional photochemical modeling system. These two simulations used different sets of lateral boundary conditions, one corresponding to a time-invariant climatological vertical profile and the other derived from monthly mean concentrations extracted from archived ECHAM5-MOZART global simulations. The objective was to provide illustrative examples of how model performance in several key aspects (trends, intra- and interannual variability of ground-level ozone, and ozone/precursor relationships) can be evaluated against available observations, and to identify key inputs and processes that need to be considered when performing and improving such long-term simulations. To this end, we have introduced several methods for comparing observed and simulated trends and variability of ground-level ozone concentrations, ozone precursors, and ozone/precursor relationships. The application of these methods to the simulation using time-invariant boundary conditions revealed that the observed downward trend in the upper percentiles of summertime ozone concentrations was captured by the model in both directionality and magnitude. However, for lower percentiles there is a marked disagreement between observed and simulated trends. In terms of variability, the CMAQ simulations using the time-invariant boundary conditions underestimate observed interannual variability by 30%-50% depending on the percentiles of the distribution. The use of boundary conditions from the ECHAM5-MOZART simulations improves the representation of interannual variability but has an adverse impact on the simulated ozone trends. Moreover, biases in the global simulations have the potential to significantly affect ozone simulations throughout the modeling domain, both at the surface and aloft. The comparison of both simulations highlights the significant impact lateral boundary conditions can have on a regional air quality model's ability to simulate long-term ozone variability and trends, especially for the lower percentiles of the ozone distribution. Moreover, the differences in observed and simulated long-term trends of CO also raise the possibility that some of the emission assumptions made in these simulations may have to be refined for future long-term modeling applications. Boundary conditions and emission trends have been identified as key uncertainties in a previous long-term modeling study over Europe by Vautard et al. (2006).
From the perspective of dynamic model evaluation as defined in Gilliland et al. (2008) and Dennis et al. (2010) and applied in Gilliland et al. (2008), Godowitch et al. (2010), and Pierce et al. (2010), several key issues for future applications of regional-scale modeling systems for long-term simulations emerge from this study. First, it is crucial to create long-term records of internally consistent and spatially resolved emission inventories such as those developed for the RETRO project (RETRO, 2008). Retrospective simulations such as the one presented in this study can potentially highlight areas for methodological refinements in creating such inventories and performing emission projections. Second, boundary conditions that are either unrealistic or affected by incompatibilities between global and regional models can affect the modeling system's ability to simulate long-term ozone variability and trends, especially for the lower percentiles of the ozone distribution. Potential future improvements would include long-term hemispheric modeling with a single modeling system employing nested grids from global to urban scales and the use of tropospheric observations in data assimilation (Hollingsworth et al., 2008). Third, future work needs to be directed towards improving the modeling system's ability to capture the effect of meteorological variability (intra- and interannual variability) on ozone concentrations. Such work could include the analysis of inter-relationships between meteorological variables and air quality variables on a range of time scales (e.g., Gilliam et al., 2006) and could help to build credibility for applying regional-scale modeling systems to quantify the potential effects of climate change on air quality.

Fig. 2. Vertical cross-sections of time-averaged ozone concentrations specified along the southern, eastern, northern, and western boundaries of the CMAQ 36 km modeling domain. Top row: ozone concentrations derived from the time-invariant climatological vertical profiles used in the CMAQ/STATIC simulations. Center row: ozone concentrations derived from the ECHAM5-MOZART simulations. Bottom row: ozone concentrations used in the CMAQ/ECHAM5-MOZART simulations after setting the concentrations for layers 14 and 15 to the same value as for layer 13.

Fig. 3. Power spectra calculated from 18 years of hourly time series of observed and CMAQ/STATIC ozone concentrations. To reduce the noise in the spectra and facilitate the comparison, the spectra were calculated at 19 selected sites and the spectral density at each frequency was then averaged over these sites.
Fig. 4. Interannual variability (IAV) of observed and CMAQ/STATIC 8-h daily maximum ozone concentrations. Details on the calculation of IAV are provided in the text. (a) Boxplots of the observed and simulated IAV for the 5th, 25th, 50th, 75th, and 95th percentiles of May-September 8-h daily maximum ozone concentrations; the boxplots show the distribution of IAV for a given percentile across the 90 sites considered in the analysis. (b) Ratio of CMAQ/STATIC IAV to observed IAV vs. percentiles of May-September 8-h daily maximum ozone concentrations; the median IAV ratio across all 90 sites is shown for each percentile.

Fig. 6. Least-squares trend estimates for the 95th and 5th percentiles of May-September 8-h daily maximum ozone concentrations over the 1988-2005 time period. Results for observations are shown in the top row while results for the CMAQ/STATIC simulations are shown in the bottom row.

Fig. 7. Observed and CMAQ/STATIC least-squares trend estimates for 1988-2005 (y-axis) vs. percentiles of the May-September 8-h daily maximum ozone distribution. Trends were calculated separately at each of the 90 monitoring sites; the median trend across all sites is shown here.

Fig. 8. Observed and CMAQ/STATIC baseline time series of NOx (top) and CO (bottom). The results represent a spatial average over 3 monitors for NOx and 34 monitors for CO. Only concentrations between 06:00 and 09:00 local time were considered in this analysis.
Fig. 10. (a) Ratio of modeled to observed IAV vs. percentiles of May-September 8-h daily maximum ozone concentrations for both CMAQ/STATIC and CMAQ/ECHAM5-MOZART; the median IAV ratio across all 90 sites is shown for each percentile. (b) Observed, CMAQ/STATIC, and CMAQ/ECHAM5-MOZART least-squares trend estimates for 1988-2005 (y-axis) vs. percentiles of the May-September 8-h daily maximum ozone distribution. Trends were calculated separately at each of the 90 monitoring sites; the median trend across all sites is shown here.

Fig. 11. Differences in seasonal average daily maximum ozone concentrations between the CMAQ/ECHAM5-MOZART and CMAQ/STATIC simulations for model layer 1 for winter, spring, summer, and fall, each averaged over the 18 years of the simulation period. The differences were calculated as CMAQ/ECHAM5-MOZART minus CMAQ/STATIC.

Fig. 13. Observed and simulated vertical profiles of the mean and standard deviation of ozone concentrations across all available launches at the two ozonesonde sites described in Sect. 2. (a) Mean concentrations at Wallops Island, (b) standard deviations at Wallops Island, (c) mean concentrations at Huntsville, (d) standard deviations at Huntsville.
while the 36 km simulation was used to create hourly boundary conditions for the 12 km grid.For the second set of 1988-2005 simulations, referred to as CMAQ/ECHAM5-MOZART, chemical boundary conditions for the 36 km grid were extracted from archived monthly-mean fields of global chemistry simulations performed for the 1988-2005 time period with the ECHAM5-MOZART modeling system as part of the RETRO project (RETRO

Table 2. Boundary conditions used for the CMAQ/STATIC and CMAQ/ECHAM5-MOZART simulations. The concentrations for selected species are shown for several vertical levels and were averaged over all sides of the modeling domain. The numbers shown for the CMAQ/ECHAM5-MOZART simulation were temporally averaged over the entire simulation length for display in this table.
changes in the relationship between ozone and its precursors, measurements for ozone, NO, NO2, and NOy at the Harvard Forest Environmental Management Site in Petersham, MA operated by Harvard University were obtained from http://www.as.harvard.edu/data/nigec-datat.html. Trace gas measurements at this site have been described by

Time series of the 5th, 50th, and 95th summertime percentiles estimated from observed and CMAQ/STATIC simulated May-September 8-h daily maximum ozone concentrations for 1988-2005. The time series represent spatial averages over the locations of the 90 ozone monitors in the modeling domain.

Table 4. Observed and CMAQ/STATIC NOx and CO concentrations and trends at 3 NOx and 34 CO AQS sites, 1988-2005. The analysis considered year-round 06:00-09:00 a.m. average concentrations. The observed and modeled trends shown in the "maximum over all sites" and "minimum over all sites" rows reflect the trends at the sites with the maximum/minimum 18-year mean concentrations shown in the first two data columns of these rows. Note that many AQS CO monitors report concentrations only in parts per million (ppm) or parts per ten million. The ppb values shown in this table are the result of temporal averaging and trend analysis.