Average versus high surface ozone levels over the continental USA: model bias, background influences, and interannual variability

US background ozone (O3) includes O3 produced from anthropogenic O3 precursors emitted outside of the USA, from global methane, and from any natural sources. Using a suite of sensitivity simulations in the GEOS-Chem global chemistry transport model, we estimate the influence from individual background sources versus US anthropogenic sources on total surface O3 over 10 continental US regions from 2004 to 2012. Evaluation with observations reveals model biases of +0–19 ppb in seasonal mean maximum daily 8 h average (MDA8) O3, highest in summer over the eastern USA. Simulated high-O3 events cluster too late in the season. We link these model biases to excessive regional O3 production (e.g., from US anthropogenic, biogenic volatile organic compound (BVOC), and soil NOx emissions) or coincident missing sinks. On the 10 highest observed O3 days during summer (O3_top10obs_JJA), US anthropogenic emissions enhance O3 by 5–11 ppb and by less than 2 ppb in the eastern versus western USA. The O3 enhancement from BVOC emissions during summer is 1–7 ppb higher on O3_top10obs_JJA days than on average days, while intercontinental pollution is up to 2 ppb higher on average versus on O3_top10obs_JJA days. During the summers of 2004–2012, monthly regional mean US background O3 MDA8 levels vary by up to 15 ppb from year to year. Observed and simulated summertime total surface O3 levels on O3_top10obs_JJA days decline by 3 ppb (averaged over all regions) from 2004–2006 to 2010–2012, reflecting rising US background O3 (+2 ppb) and declining O3 from US anthropogenic emissions (−6 ppb) in the model. The model attributes interannual variability in US background O3 on O3_top10obs days to natural sources, not international pollution transport. We find that a 3-year averaging period is not long enough to eliminate interannual variability in background O3 on the highest observed O3 days.


Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
In the USA, ozone (O3) is regulated as a criteria pollutant under the National Ambient Air Quality Standard (NAAQS). The current NAAQS for ground-level O3, set in October 2015, states that the fourth highest daily maximum 8 h average (MDA8) O3, averaged across three consecutive years, cannot be 71 ppb or higher (U.S. Environmental Protection Agency, 2015). The 3-year average is nominally intended to smooth out fluctuations in O3 levels resulting from natural variability in meteorology within the timing constraints of the federal Clean Air Act for air quality planning. As even 1 ppb of excess O3 may be enough to push a county out of NAAQS attainment, it is relevant to understand which sources influence the severity and timing of the highest O3 events. Since measured O3 does not retain a signature of the source from which it was produced, estimates of background O3 rely on models, ideally evaluated closely with observations, to build confidence in the model capability for source attribution. Here we apply a global chemistry transport model alongside O3 observations to examine which sources are influencing average versus high-O3 events, and the extent to which they vary from year to year.
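The form of the standard described above can be illustrated with a minimal sketch. The function names and the daily values below are hypothetical and for illustration only; regulatory details such as data completeness and rounding conventions are not represented.

```python
def fourth_highest(mda8_values):
    """Return the fourth highest MDA8 O3 value (ppb) in one year of daily data."""
    if len(mda8_values) < 4:
        raise ValueError("need at least four daily values")
    return sorted(mda8_values, reverse=True)[3]


def three_year_metric(mda8_by_year):
    """Average the annual fourth highest MDA8 O3 over three consecutive years."""
    if len(mda8_by_year) != 3:
        raise ValueError("expected exactly three years of data")
    return sum(fourth_highest(year) for year in mda8_by_year) / 3.0


def meets_standard(mda8_by_year, limit_ppb=71.0):
    """True if the 3-year metric is below the limit (71 ppb or higher fails)."""
    return three_year_metric(mda8_by_year) < limit_ppb


# Made-up daily MDA8 O3 values (ppb) for three consecutive years.
example_years = [
    [80.0, 76.0, 74.0, 72.0, 65.0, 60.0],
    [78.0, 75.0, 73.0, 70.0, 66.0, 58.0],
    [79.0, 77.0, 72.0, 71.0, 64.0, 59.0],
]
print(three_year_metric(example_years))  # fourth highest values are 72, 70, 71
print(meets_standard(example_years))
```

In this made-up case the three annual fourth highest values average exactly 71 ppb, so the check fails, which illustrates how a difference of about 1 ppb can decide the outcome.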
As US anthropogenic emissions of O3 precursors decline, the relative importance of US background O3 to total surface O3 rises. US background O3 is defined here as the O3 levels that would exist in the absence of US anthropogenic emissions of the O3 precursors nitrogen oxides (NOx) and nonmethane volatile organic compounds (NMVOCs). US background O3 thus includes naturally occurring O3 as well as O3 produced from global methane (including US anthropogenic methane emissions) and from O3 precursor emissions outside of the USA. Jaffe et al. (2018) review the current understanding of US background O3 from models and observations and its relevance to air quality standard setting and implementation. Previous studies estimating background O3 over the USA found that background sources of O3, including stratospheric O3 intrusions (Lin et al., 2012, 2015a), increasing Asian anthropogenic emissions (Lin et al., 2015b), and more frequent wildfires in summer (Abatzoglou and Williams, 2016; Jaffe, 2011; Yang et al., 2015), may present challenges to attaining the O3 standard, especially since regional emission controls may be offset by a warming climate (Fiore et al., 2015). At high-altitude sites in the western USA (WUS) in spring, the influence from stratospheric intrusions and foreign transport, combined with relatively deep planetary boundary layers, can lead to high background O3 events (Fiore et al., 2002; Zhang et al., 2011). Lin et al. (2017) investigated surface O3 trends over the USA from 1980 to 2014 with the GFDL AM3 model and found that emissions controls decreased the 95th percentile summer O3 values in the eastern USA (EUS) by 5–10 ppb over 1988–2014, but rising Asian emissions increased this O3 metric by 2–8 ppb at individual sites in the WUS over the same period.
Earlier work with the GEOS-Chem model analyzing background O3 during a single meteorological year noted a tendency for the model to underestimate springtime O3 at high-altitude WUS sites but overestimate summertime O3 over the EUS (e.g., Fiore et al., 2002, 2003; Wang et al., 2009; Zhang et al., 2011, 2014). Identifying the extent to which these biases reflect poor representation of US anthropogenic versus background O3 sources is relevant for assessing uncertainties in estimates of background O3 on days when the O3 NAAQS is exceeded. We build upon prior studies by analyzing MDA8 O3 measurements and 9-year model simulations spanning 2004–2012 from the GEOS-Chem 3-D global chemistry transport model (CTM). A suite of sensitivity simulations in which different emissions of O3 precursors are perturbed allows us to identify which sources contribute the most on observed high-O3 days and on the days with the highest model bias. We assess here whether biases in the model reflect problems in the modeled transported background O3 versus O3 produced within the USA from both background and anthropogenic sources. In addition, the availability of these simulations for 2004–2012 allows us to investigate the year-to-year variability in background sources and the extent to which this variability is relevant for observed high events and, therefore, potentially for attaining the O3 NAAQS. Though coarse-resolution global models such as GEOS-Chem will mix into the same grid cell emissions that may remain separate in the real atmosphere, a global model is necessary to quantify background O3 transported intercontinentally, including that produced via oxidation of methane. We estimate the influence from various individual background sources on O3 concentrations and the interannual variability in background O3 levels with a focus on the highest 10 events in each of the 10 U.S. EPA regions during each summer (JJA) or year. We aim to answer the following questions: (1) Which sources exert the strongest influence on O3 on the 10 days with the highest model biases against observations? (2) Which background sources influence total O3 the most on average versus the 10 highest O3 days? (3) Which sources influence the interannual variability of O3 in each region on average versus the 10 highest O3 days?

(Baylon et al., 2016). Baseline O3 is defined as the O3 concentration at sites with negligible influence from local emissions (National Research Council, 2010). Baseline O3 is a measurable quantity and differs from background O3 in that it contains some influence from US anthropogenic emissions that were not recently emitted but contributed to the global background. This station is analyzed as a standalone site in Sect. 3.2, given the relevance of high-altitude measurements for downwind surface O3 (Stauffer et al., 2017). We take all hourly O3 concentrations from Mount Bachelor and calculate the MDA8 O3 concentrations for 2004–2012. Daily averages are included only if at least 18 h of data are available, and monthly averages require at least 20 days with valid 24 h mean or MDA8 data. For our comparison to monthly average O3 at Mount Bachelor Observatory, we sample the model both at the level closest to 2.7 km and at the surface.
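The data completeness rules stated above (at least 18 h per day, at least 20 valid days per month) can be sketched as follows. This is an illustrative sketch, not the processing code used for the Mount Bachelor record: the function names are hypothetical, and the choices to use only 8 h windows that lie fully within the calendar day and to require at least 6 valid hours per window are assumptions that are not taken from the text.

```python
def mda8(hourly, min_hours=18, min_hours_per_window=6):
    """Maximum daily 8 h average for 24 hourly values (None marks missing data).

    Returns None if fewer than `min_hours` valid hours exist in the day.
    The per-window threshold is an assumption made for this sketch.
    """
    if len(hourly) != 24:
        raise ValueError("expected 24 hourly values")
    if sum(v is not None for v in hourly) < min_hours:
        return None
    best = None
    for start in range(24 - 8 + 1):  # 8 h windows fully inside the day
        window = [v for v in hourly[start:start + 8] if v is not None]
        if len(window) < min_hours_per_window:
            continue
        mean = sum(window) / len(window)
        if best is None or mean > best:
            best = mean
    return best


def monthly_mean(daily_values, min_days=20):
    """Monthly mean of daily values; None if fewer than `min_days` are valid."""
    valid = [v for v in daily_values if v is not None]
    if len(valid) < min_days:
        return None
    return sum(valid) / len(valid)


# Made-up day: 30 ppb overnight, 60 ppb for eight afternoon hours.
day = [30.0] * 10 + [60.0] * 8 + [30.0] * 6
print(mda8(day))
```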
We use temperature data from a 0.5° × 0.5° resolution gridded dataset developed by Fan and van den Dool (2008) from the Global Historical Climatology Network (GHCN) and the Climate Anomaly Monitoring System (CAMS). GHCN Gridded V2 data were provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA (https://www.esrl.noaa.gov/psd/). Each observational site is matched to the model grid cell it falls in, and the average monthly temperature is computed by averaging across all the sites in each region.
In order to evaluate the GEOS-Chem model O3 simulation (described below in Sect. 2.3) at a spatial scale comparable to the coarse horizontal resolution of the global grid (2° × 2.5°), we use a 1° × 1° grid of surface MDA8 O3 measurements, interpolated from the AQS, CASTNet, and Canadian NAPS networks (Schnell and Prather, 2017). We degrade this 1° × 1° grid to match the horizontal resolution of the GEOS-Chem simulations. As we did not archive 3-D high-frequency data, all MDA8 O3 values from the model are sampled at the lowest (surface) layer for comparison to observational sites.

Analysis regions
Each observational site in the EPA AQS and CASTNet datasets is linked to 1 of the 10 U.S. EPA air quality regions (Fig. S1 in the Supplement) based on which state the site is in. The Mount Bachelor data were included with Region 10 (Pacific Northwest) sites even though this site is not a regulatory monitor. Following Reidmiller et al. (2009), we select two regions, the Southeast (Region 4) and Mountains and Plains (Region 8), as representative regions for the EUS and WUS for illustration purposes in the main text. Figures for the other eight regions are included in the Supplement.
To find the daily mean O3 concentration within each region, we first match each observational site to the model grid cell within which it falls. We then average across all sites in each region to obtain a regional mean MDA8 O3 value in the observations and in the model. From the regionally averaged observed MDA8 O3, we find (1) the 10 days with the highest observed O3 during each year (hereafter, O3_top10obs days), similar to the definition for extreme events used in Schnell et al. (2014), (2) the 10 days with the highest O3 observations during each season (hereafter, O3_top10obs_MAM, O3_top10obs_JJA, and O3_top10obs_SON), and (3) the fourth highest MDA8 O3 within each year. In addition, we sample the model to find the 10 days each year with the highest positive biases. There is at most a 2–6-day overlap between the top 10 O3_Base days and the top 10 most biased days in 2004–2012 across all regions, but during most years, the overlap is around 0–2 days. We restrict our analysis to examining the top 10 observed O3 days as these days are most relevant from a policy perspective. We use O3_top10obs as our primary metric, however, instead of the policy-relevant fourth highest O3 because the model bias is typically lower on O3_top10obs days (Fig. S2 versus Fig. S3). On the days when the fourth highest values occur, the model bias is generally more strongly negative in the West and South Central regions and more strongly positive in the Midwest than on O3_top10obs days (Figs. S2, S3). In addition, while the model rarely captures the exact day when the observed fourth highest MDA8 O3 event occurs, there is a 3–4-day overlap on average between the O3_top10obs days and the highest 10 MDA8 O3 days in the model. This overlap is similar to the 3- and 6-day overlap Jaffe et al. (2018) found in their regional models for 1 May to 29 September 2011.
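The selection steps described above (regional averaging, ranking days, taking the fourth highest value, and counting overlapping days) can be sketched as below. All function names and the tiny four-day dataset are hypothetical; the sketch assumes each site series has already been matched to its model grid cell.

```python
def regional_mean(site_series):
    """Average daily MDA8 O3 across sites; each series is a list over days."""
    n_days = len(site_series[0])
    means = []
    for day in range(n_days):
        vals = [s[day] for s in site_series if s[day] is not None]
        means.append(sum(vals) / len(vals) if vals else None)
    return means


def top_days(daily, n=10):
    """Indices of the n days with the highest values, highest first."""
    ranked = sorted((i for i, v in enumerate(daily) if v is not None),
                    key=lambda i: daily[i], reverse=True)
    return ranked[:n]


def nth_highest(daily, n=4):
    """Value of the n-th highest day (n=4 gives the policy-relevant metric)."""
    return daily[top_days(daily, n)[n - 1]]


def overlap(days_a, days_b):
    """Number of days common to two lists of day indices."""
    return len(set(days_a) & set(days_b))


# Made-up example: two observational sites and one model series over four days.
obs_region = regional_mean([[40.0, 60.0, 80.0, 50.0],
                            [50.0, 70.0, 60.0, None]])
model_region = [55.0, 75.0, 72.0, 80.0]
print(obs_region)
print(overlap(top_days(obs_region, 2), top_days(model_region, 2)))
```

With real data, `n=10` over a year or season would give the O3_top10obs days, and the overlap count corresponds to the day-overlap comparisons quoted in the text.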

GEOS-Chem model simulations
We use simulations from the GEOS-Chem v9_02 global 3-D chemical transport model (CTM; http://www.geos-chem.org) driven by Modern-Era Retrospective analysis for Research and Applications (MERRA) reanalysis meteorology from the NASA Global Modeling and Assimilation Office for 2004–2012 (Rienecker et al., 2011). The MERRA reanalysis is available at 1/2° × 2/3° horizontal resolution, which we degrade here to 2° × 2.5° horizontal resolution. MERRA meteorology captures summer mean surface temperatures to within 1–2 K across US regions and precipitation to within 0.5 mm d−1, except over the Northern Great Plains where a positive bias exceeds 1 mm d−1, but the variance in summer mean precipitation is lower than observed in some regions (Bosilovich, 2013). While interannual variability in cloudiness observed at weather stations is largely captured by MERRA, the reanalysis generally underestimates cloud cover and thus overestimates observed downward surface shortwave fluxes (Free et al., 2016). Methane surface concentrations are prescribed each month using spatially interpolated surface distributions from NOAA Global Monitoring Division flask data. We use the standard v9_02 chemical mechanism, which includes recycling of isoprene nitrates (Mao et al., 2013), in contrast to the mechanisms used in earlier versions of GEOS-Chem (e.g., Zhang et al., 2014, as discussed in Fiore et al., 2014). Anthropogenic base emissions are from the Emission Database for Global Atmospheric Research (EDGAR) version 3.2 FT2000 inventory (Olivier et al., 2005) for inorganic compounds and the REanalysis of the TROpospheric chemical composition (RETRO) inventory (Hu et al., 2015; Schultz, 2007) for organic compounds. Inorganic emissions are overwritten by regional inventories for the US (EPA National Emissions Inventory 2005), Canada (Criteria Air Contaminants), Mexico (Big Bend Regional Aerosol and Visibility Observational study; Kuhns and Green, 2003), Europe (European Monitoring and Evaluation Programme; Auvray and Bey, 2005), and South and East Asia (Streets et al., 2006). Separate global inventories are used for ammonia (Bouwman et al., 1997), black carbon (Bond et al., 2007; Leibensperger et al., 2012), and ethane (Xiao et al., 2008). Anthropogenic surface emissions have diurnal and monthly variability, some with additional weekly cycles, and are scaled each year on the basis of economic data and estimates provided by individual countries, where available (van Donkelaar et al., 2008). The model does not include daily variations in US anthropogenic emissions associated with higher electricity demand on hotter days (e.g., Abel et al., 2017). Aircraft emissions are from the Aviation Emissions Inventory Code (AEIC) inventory (Stettler et al., 2011), and shipping emissions are from the International Comprehensive Ocean-Atmosphere Data Set (ICOADS; Lee et al., 2011; Wang et al., 2008). Biomass burning emissions follow the interannually varying monthly Global Fire Emissions Database version 3 (GFED3) inventory driven by satellite observations of fire activity (Giglio et al., 2010; van der Werf et al., 2010). Biofuel emissions are constant (Yevich and Logan, 2003). Biogenic VOC (volatile organic compound) emissions from terrestrial plants follow the Model of Emissions of Gases and Aerosols from Nature (MEGAN) scheme version 2.1 (Guenther et al., 2012) and vary with meteorology (Barkley et al., 2011). Global and US emissions are 29.5 and 5.2 Tg N yr−1, respectively, for anthropogenic NOx emissions (including biofuels), 4.2 and 0.1 Tg N yr−1 for biomass burning, 8.7 and 0.9 Tg N yr−1 for soil NOx, 6.7 and 1.0 Tg N yr−1 for lightning NOx, and 466.1 and 20.6 Tg C yr−1 for isoprene emissions. Emissions for NOx sources and isoprene are provided globally and within the USA for each year in Table S3.

www.atmos-chem-phys.net/18/12123/2018/ Atmos. Chem. Phys., 18, 12123–12140, 2018
We first perform a base simulation (O3_Base) with all emissions turned on for 2003–2012. We conduct a parallel suite of sensitivity simulations in which selected sources are removed. In all simulations, we discard 2003 from our analysis as initialization. Our first set of sensitivity simulations estimates three different "background" definitions: (1) "North American background" (denoted O3_NAB), in which anthropogenic emissions within Canada, Mexico, and the USA are set to zero, but methane surface abundances are kept at present-day values; (2) "US background" (O3_USB), which is similar to O3_NAB except only US anthropogenic emissions are set to zero; and (3) "Natural background" (O3_NAT), in which all anthropogenic emissions have been removed globally and methane is prescribed at preindustrial levels. We estimate Canadian and Mexican influence (O3_CA+MX) on US O3 by subtracting O3_NAB from O3_USB; the influence from intercontinental pollution transport plus global methane (O3_ICT+CH4) is estimated by subtracting O3_NAT from O3_NAB. A second set of sensitivity simulations enables us to estimate the contribution of individual background sources to total simulated surface O3 by subtracting a simulation with that source shut off from the O3_Base simulation: (1) O3_NALNOx by turning off North American lightning NOx, (2) O3_SNOx by zeroing out global soil NOx, (3) O3_BVOC by zeroing out terrestrial biogenic VOC emissions (we also examine this "O3_noBVOC" simulation in Sect. 3.3), and (4) O3_BB by zeroing out biomass burning emissions, as summarized in Table 1. Due to nonlinearities in atmospheric photochemistry, these "zero out" estimates of source contributions depend on the presence of all other precursor emissions at present-day levels (e.g., the impact of BVOC emissions is sensitive to the amount of anthropogenic NOx emissions in the Base simulation). This set of model simulations does not directly isolate stratospheric O3 or Asian influences. Previous work has shown that stratospheric O3 can increase springtime O3 levels by 17–40 ppb in the WUS when MDA8 O3 levels are 70–85 ppb, and Asian emissions can contribute 8–15 ppb to MDA8 O3 on days above 60 ppb (Lin et al., 2012, 2015a). Stratospheric and Asian influences are included in O3_USB, Asian influences are included in O3_ICT+CH4, and O3_NAT includes stratospheric O3, biogenic emissions of O3 precursors, wildfires, and lightning NOx. As O3_BVOC includes O3 produced from biogenic VOC reacting with both natural and anthropogenic NOx, O3_USA and O3_BVOC are not additive. O3_BVOC thus contributes to both O3_USA and O3_USB.
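The differencing scheme described above can be written out as a short sketch. The dictionary keys for the perturbation runs ("noNALNOx", "noSNOx", "noBB") and the numbers are hypothetical; only "noBVOC" follows a name used in the text. Taking O3_USA as O3_Base minus O3_USB is an assumption consistent with the definition of US background given earlier, not a formula quoted from this section.

```python
def source_contributions(sims):
    """Derive source influences (ppb) from simulated MDA8 O3 values.

    `sims` maps a simulation label to the MDA8 O3 (ppb) for the same
    day and region in that simulation.
    """
    base = sims["Base"]
    return {
        # Assumed: US anthropogenic influence is Base minus US background.
        "USA": base - sims["USB"],
        # Stated in the text: subtract O3_NAB from O3_USB.
        "CA+MX": sims["USB"] - sims["NAB"],
        # Stated in the text: subtract O3_NAT from O3_NAB.
        "ICT+CH4": sims["NAB"] - sims["NAT"],
        # Zero-out estimates: Base minus the run with that source shut off.
        "NALNOx": base - sims["noNALNOx"],
        "SNOx": base - sims["noSNOx"],
        "BVOC": base - sims["noBVOC"],
        "BB": base - sims["noBB"],
    }


# Made-up values (ppb) for a single day and region.
example = {
    "Base": 60.0, "USB": 40.0, "NAB": 38.0, "NAT": 28.0,
    "noNALNOx": 59.0, "noSNOx": 57.0, "noBVOC": 54.0, "noBB": 59.5,
}
for name, value in source_contributions(example).items():
    print(f"O3_{name}: {value:.1f} ppb")
```

Because the zero-out terms are each computed against the full Base chemistry, they should not be expected to sum to the total, in line with the non-additivity noted for O3_USA and O3_BVOC.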
3 Model evaluation

MDA8 O 3 distributions
To evaluate the ability of our coarse-resolution model to capture observed high-O3 events, we compare the MDA8 O3 averaged over each of the 10 EPA regions simulated by GEOS-Chem to the observations in two ways. In the first method, we use the Schnell and Prather (2017) gridded dataset degraded to the model resolution and sample the model directly at each of the degraded grid cells prior to calculating the regional average. In the second method, we sample the model at the grid cell containing each individual measurement site prior to calculating the regional average. The model is biased positively with either method (Fig. 1a, b), but the shape of the model distribution constructed with the latter approach (Fig. 1b) better matches the observed distribution than that of the former (Fig. 1a). Matching individual sites to the nearest model grid cell (Fig. 1b) yields a better estimate of high-O3 days; the model overestimates the percentage of days above 70 ppb by about 3 times when we match it to individual measurement sites (3.14 % of days are above 70 ppb in the observations versus 9.92 % in the model) but by about 10 times in comparison to the re-gridded Schnell et al. (2014) dataset (0.37 % of days are above 70 ppb in the observations versus 3.91 % in the model).
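The "about 3 times" and "about 10 times" factors follow directly from the quoted percentages, as the short sketch below shows. The function names are hypothetical; `percent_above` indicates how such a percentage could be computed from a daily series and is not applied to real data here.

```python
def percent_above(mda8_values, threshold_ppb=70.0):
    """Percentage of valid days with MDA8 O3 above the threshold."""
    valid = [v for v in mda8_values if v is not None]
    return 100.0 * sum(v > threshold_ppb for v in valid) / len(valid)


def overestimate_factor(model_percent, observed_percent):
    """Ratio of modeled to observed exceedance percentages."""
    return model_percent / observed_percent


# Percentages quoted in the text for days above 70 ppb.
print(round(overestimate_factor(9.92, 3.14), 1))  # site-matched sampling: 3.2
print(round(overestimate_factor(3.91, 0.37), 1))  # re-gridded dataset: 10.6
```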
Simulated seasonal mean MDA8 O3 averaged over the full 2004–2012 period is higher than observed by 5–30 ppb (Fig. 2a, b, c), with the largest biases typically occurring in the Northeast and Midwest. The model bias is highest in summer (JJA) (15–30 ppb at most sites), followed by fall (SON) (10–20 ppb) (Fig. 2a, b, c). Recent work with a newer version of GEOS-Chem attributes some of the positive model bias in the EUS to excessive NOx emissions in the 2011 National Emission Inventory (NEI) (Travis et al., 2016), an inability of the model to resolve vertical mixing in the boundary layer, and a weak response to cloud cover (Travis et al., 2017). Travis et al. (2016) find that the 3.5 Tg N yr−1 NEI 2011 estimate for US fuel NOx emissions is too high and contributes to excessive surface O3. Our simulations include even higher US fuel NOx emissions of 4.4 Tg N yr−1 during 2010–2012 (Table S3), implying that some portion of the model O3 bias reflects excessively high anthropogenic NOx emissions (Travis et al., 2016). The low bias in cloud cover in the MERRA meteorology and the associated overestimate in downward shortwave surface radiation (Free et al., 2016) may also contribute to excessive O3 production in the model. The model is closest to the observations in spring, with a positive bias usually < 10 ppb over the eastern states and generally within ±5 ppb over most western sites (Fig. 2a, b, c).

Baseline O 3 at Mount Bachelor
Mount Bachelor Observatory (MBO) regularly samples free tropospheric O3 and is rarely influenced by local anthropogenic emissions (Reidmiller et al., 2009). It is, therefore, a valuable site for examining baseline O3. In Fig. S4, we compare the observed 24 h and MDA8 O3 concentrations at MBO for 2004–2012. The observed O3 concentrations vary from year to year, and MDA8 O3 is, by construction, higher than the 24 h mean mixing ratios (here by a few ppb). However, the seasonal pattern is similar across both metrics, with a springtime peak that reaches its maximum in April and a secondary summertime peak in July.
Figure 3 compares modeled and observed monthly mean 24 h O3 concentrations at the grid box that contains Mount Bachelor. For the model, we examine O3_Base and O3_USB 24 h average concentrations at 2.7 km, the height of the Mount Bachelor Observatory, as well as at the surface. It is important to note that the diurnal variations on the mountain may not be well captured by the CTM, due to upslope (daytime) and downslope (nighttime) flow. We focus on the 24 h average because we only archived hourly O3 fields from the model at the surface and, thus, do not have the MDA8 O3 metric available at 2.7 km. The year-to-year variability is smaller in the model than observed (narrower shaded range). In all months, the O3_Base and O3_USB values are higher, by 9–14 and 11–21 ppb, respectively, at 2.7 km than at the surface. The model captures the magnitude of the observed springtime peak at 2.7 km, but summertime values are too high, with an overall peak in August. O3_USB contributes a greater fraction to O3_Base at 2.7 km (92–94 %) than at the surface (72–94 %). The simulated seasonal cycle differs at the surface, peaking in spring (March–April) and in September. In 2012, the observations show equivalent springtime and summertime peaks, more similar to the modeled seasonal cycle.

Our sensitivity simulations enable us to interpret the sources contributing to the simulated seasonal distribution. The model indicates that at MBO, O3_USB is the major component of O3_Base, including during the summertime overestimate. In turn, the model indicates that the seasonality of O3_USB is largely driven by O3_NAT, which includes the influence from biogenic VOCs and NOx, lightning NOx, and stratospheric O3. O3_ICT+CH4 contributes around 15 ppb at 2.7 km and 5–10 ppb at the surface (Fig. 3). The model does suggest a springtime peak influence from O3_ICT+CH4 in the WUS, consistent with earlier work (e.g., Dentener et al., 2010). Even at this baseline site, the model indicates that O3_USA enhances monthly mean O3 by at least a few ppb at 2.7 km; at the surface, the model simulates a seasonal cycle for O3_USA that is typical of photochemical production from regional precursor emissions. O3_CA+MX is less than a few ppb at MBO whether the model is sampled at 2.7 km or the surface (not shown).

Magnitude and timing of high-O 3 events
On O3_top10obs days, the model biases are typically lower than on average days (Fig. 2, Table 2; see also year-by-year maps in Fig. S2). At some WUS sites, the model underestimates O3 levels during the highest events by 10–20 ppb. The model systematically underestimates O3 in the Central Valley of California in all three seasons, which we attribute to the inability of the coarse model resolution to resolve topographical gradients and valley circulations (or stagnation) in this region, which experiences some of the highest observed O3 in the nation.
We compare the MDA8 O3 distributions in the observations versus the model (O3_Base) during the 10 most biased days in each of the 10 regions across the 9 years (900 total events). These "most-biased" days in the model tend to fall around the observed median (Fig. 1c) during the warm season (June–October), with almost 40 % of the days falling in August alone (Fig. S5), and are 9–45 ppb higher than the observations (circles in Fig. S6). We analyze the perturbation simulations (Table 1) to identify which sources influence simulated O3 most strongly on the most-biased days versus on average (i.e., all 365 or 366 days), which we assume are also likely the main drivers of the bias. In all regions, the largest sources on the most-biased model days are O3_USA (3–30 ppb higher MDA8 O3 than on average, with the exception of the Pacific Southwest (SW), where O3_USA is smaller than on average days), O3_BVOC (by 1–15 ppb), and O3_SNOx (by 1–10 ppb; Figs. 4, S6). By contrast, O3_ICT+CH4 is up to a few ppb higher on average days than on the most-biased model days.
To explore possible drivers of model biases across the different seasons, we evaluate the timing of the highest 10 events across each year in the O3_Base, O3_USB, and O3_noBVOC (BVOCs shut off) simulations for each region (900 events). We bin these 900 events by month and calculate the percentage of the total events that fall within each month. Note that all the top 10 days fall between March and October. The standard model (O3_Base) underestimates the occurrence of high events early in the O3 season (March–June) and overestimates them later in the season (July–September) (Fig. S7). While the model indicates that most top 10 O3 days fall in July and August (35 % each), the observations show that the months of May through August each contain around 15–25 %, with the maximum in June at 25 %. Both O3_noBVOC and O3_USB shift the relative timing of the 10 highest O3 events towards April and May compared to O3_Base, but the shortage of high springtime O3 events remains (Fig. S7). The lack of high events in spring may reflect in part poor representation of stratospheric O3 intrusions at the coarse resolution of the CTM (Lin et al., 2012; Zhang et al., 2014), in addition to the role of US anthropogenic and BVOC emissions in the temporal mismatch, as indicated by the improvements to the timing that occur in the O3_USB (US anthropogenic emissions shut off) and O3_noBVOC simulations. In addition to contributions from these sources, poor representation of O3 sinks may contribute to the model biases. For example, Makar et al. (2017) suggest that failing to represent canopy turbulence and shading effects on photolysis can lead to high-O3 biases in models.
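The monthly binning of high-O3 events described above amounts to a simple count and normalization. The following sketch uses a hypothetical function name and a handful of made-up dates rather than the 900 events analyzed in the text.

```python
from collections import Counter
from datetime import date


def monthly_percentages(event_dates):
    """Percentage of events falling in each calendar month (1 to 12)."""
    if not event_dates:
        raise ValueError("no events to bin")
    counts = Counter(d.month for d in event_dates)
    total = len(event_dates)
    return {month: 100.0 * counts.get(month, 0) / total
            for month in range(1, 13)}


# Made-up event dates for illustration.
events = [date(2004, 6, 10), date(2004, 6, 25),
          date(2005, 7, 4), date(2006, 8, 15)]
shares = monthly_percentages(events)
print({month: pct for month, pct in shares.items() if pct > 0})
```

Applying the same function to the observed and simulated top 10 days would give the two monthly distributions that the text compares.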

Interannual variability
Figure S8 shows the Pearson correlation coefficients (r) between monthly average observed and O3_Base values from 2004 to 2012. In May, correlations are generally strong (r ≥ 0.9) in the Mid-Atlantic and Southeast regions, but much lower (r = 0.2) in the New England region. This pattern may reflect shortcomings in representing the onset of BVOC emissions. In July, the regions flip, with lower correlations in the Southeast and higher correlations in New England. At some sites in the WUS, lower correlations occur during summer months, which may be tied to excessive influence from lightning NOx advected from Mexico (see also Zhang et al., 2011, 2014) or anomalous events such as wildfires that are not well captured by the model. In general, correlations only average about r = 0.2 in the winter and early spring over much of the USA (Fig. S8); the drivers for these weak correlations may be connected to the model tendency to underestimate the occurrence of springtime high-O3 events. From May to September, however, the months during which high-O3 events are most likely to occur, the correlation between observed and simulated monthly average O3 for 2004–2012 over much of the contiguous USA exceeds r = 0.7 (Fig. S8). We conclude that the model broadly captures monthly variations from year to year during the warm season and can thus be applied to interpret the role of background sources in contributing to interannual variations during most of the high-O3 season. We note that Clifton et al. (2017) found that the GEOS-Chem model does not capture interannual variability in deposition velocities observed at Harvard Forest, MA, but it is unclear to what extent this process would amplify or dampen interannual variability associated with changes in emissions.
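For a given month and location, the correlation discussed above is computed across the nine yearly values of observed and simulated monthly mean O3. A minimal sketch of the Pearson coefficient follows; the function name and the two nine-value series are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up July monthly means (ppb) for 2004 to 2012 at one location.
observed = [52.0, 55.0, 54.0, 53.0, 50.0, 47.0, 51.0, 53.0, 56.0]
simulated = [66.0, 70.0, 67.0, 68.0, 64.0, 62.0, 65.0, 66.0, 71.0]
print(round(pearson_r(observed, simulated), 2))
```

Note that r measures only how well the year-to-year variations co-vary; a model with a large mean bias, as in this made-up example, can still yield a high correlation.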

Influence of individual sources on average versus high-O 3 days
In Tables 2 and 3, we report the influence of the O3 sources defined in Table 1 on average versus O3_top10obs days separately for spring (MAM), summer (JJA), and fall (SON) (10 days from each of the 9 simulation years, i.e., 90 events for each region and season). We also report the difference in source influences between average and O3_top10obs days, which we interpret as the enhancement from that source relative to average conditions. We first consider the average ranges in MDA8 O3 contributed by the various sources. Both O3_USA and O3_USB tend to follow the seasonal cycle of O3_Base, with highest abundances in summer. The model indicates that O3_USB is 30–50 ppb (range over regions) during summer and highest over the WUS. O3_USA is generally 20–30 ppb over the EUS in summer, but only 10–20 ppb over the WUS (Table 2). O3_ICT+CH4 averages 2–13 ppb over all regions and is highest in spring (8–13 ppb compared to 2–11 ppb in summer and 6–12 ppb in fall) (Table 3, Figs. 5, S9). O3_NALNOx has a relatively minor influence (at most 1.5 ppb) in all regions and seasons. The influence from O3_CA+MX is generally less than a couple of ppb except in New York (NY) and New Jersey (NJ), and in New England, where it can be as much as 4–7 ppb (Table 3, Fig. S9).
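The difference between the influence of a source on high days and on average days can be sketched as below. The function name, the four-day series, and the chosen day indices are hypothetical; in practice the series would be a daily source contribution (for example O3_BVOC) for one region and season, and the indices would be the O3_top10obs days.

```python
def enhancement_on_top_days(source_series, top_day_indices):
    """Mean source influence on selected days minus the mean over all days (ppb)."""
    mean_all = sum(source_series) / len(source_series)
    top_values = [source_series[i] for i in top_day_indices]
    mean_top = sum(top_values) / len(top_values)
    return mean_top - mean_all


# Made-up daily source contribution (ppb) and two high-O3 days (indices 2, 3).
daily_contribution = [10.0, 20.0, 30.0, 40.0]
print(enhancement_on_top_days(daily_contribution, [2, 3]))
```

A positive value indicates a source that contributes more on the highest days than on average; a negative value indicates the opposite.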
We interpret the "difference" lines in Tables 2 and 3 as the enhancements from each source on high days in each season (O3_top10obs_MAM, O3_top10obs_JJA, O3_top10obs_SON) relative to average conditions. Over all regions, O3_BVOC and O3_SNOx influence O3_Base more on O3_top10obs days (for all seasons) than on average, whereas O3_ICT+CH4 is typically lower by up to 3 ppb on O3_top10obs days (for all seasons) than on average days (Tables 2, 3, Figs. 5, S9). O3_USA is 8-11 ppb higher on O3_top10obs_JJA days versus the average over the New England, NY and NJ, Mid-Atlantic, Midwest, and South Central regions, but only up to 5 ppb higher over other regions (Table 2, Figs. 5, S9). The model indicates an even stronger anthropogenic enhancement (up to 19 ppb) on O3_top10obs_SON days in some EUS regions (Table 2). O3_USB is enhanced on O3_top10obs_JJA days by 2-12 ppb relative to the average, with the smallest enhancements occurring in the Mid-Atlantic, Southeast, and Midwest regions, and the largest enhancements occurring in the Pacific Northwest (NW). In contrast to all the other regions, O3_USB is the dominant source enhancing O3_top10obs_JJA days over the Mountains and Plains, Pacific NW, and Pacific SW regions (4-12 ppb for O3_USB but < 5 ppb from either O3_USA or O3_BVOC). In line with earlier work reviewed by Jaffe et al. (2018), enhanced O3_USA dominates O3_top10obs_JJA days over much of the USA, whereas in the WUS, O3_USB enhancements exceed O3_USA enhancements on O3_top10obs_JJA days. O3_BVOC enhances O3_top10obs days (for all seasons) by up to 9 ppb, with the influence often largest in fall (when O3 formation is more sensitive to VOC; e.g., Jacob et al., 1995). We re-emphasize that BVOCs contribute both to O3_USA when reacting with anthropogenic NOx and to O3_USB when reacting with all other NOx sources. In contrast to the sources discussed above, O3_ICT+CH4 influences average days by up to a few ppb more than it influences O3_top10obs days (for all seasons), with the largest differences between average and high days occurring in EUS regions (1-3 ppb lower on O3_top10obs days (for all seasons) in New England, NY and NJ; Table 3, Figs. 5, S9). O3_NALNOx is at most 2 ppb higher than average on O3_top10obs days. The O3_CA+MX influence is roughly equivalent (generally to within a ppb) on average versus O3_top10obs days during all seasons.
We note that over the Pacific NW there is a 4 ppb decrease in O3_USB from 2004-2006 to 2010-2012. Over this period, temperatures generally warm over the EUS, but slightly cool in the WUS. Within the 10 regions, the model captures the sign of the changes in MDA8 O3 over this period but not the magnitude (Table 4). The monthly mean temperatures in the model (from the MERRA reanalysis) closely match the observed GHCN+CAMS dataset (Table S4). Table 4 shows that regions with O3_USB increases generally experienced rising temperatures over this period, as the 2010-2012 period includes two of the warmest years on record.
In NY and NJ, the Southeast, Midwest, South Central, and Plains regions, O3_USB and O3_USA both contribute to the interannual variability on O3_top10obs_JJA days (r = 0.5-0.8 for both O3_USB and O3_USA versus O3_Base), while in New England and the Mid-Atlantic regions, O3_USA drives the interannual variability more than O3_USB (r = 0.64 and 0.72 for O3_USA versus O3_Base but only 0.28 and 0.54, respectively, for O3_USB versus O3_Base; Table S5).
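The r values quoted above compare yearly series of a source contribution against O3_Base. A minimal sketch of such a correlation follows, assuming r denotes the Pearson correlation coefficient (the text does not state which coefficient is used); the function name `pearson_r` is illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length series.

    For example, x could hold the nine yearly O3_USB means on
    O3_top10obs_JJA days and y the matching O3_Base means
    (an assumed input layout, not taken from the paper).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products of deviations from the mean.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)
```

With only nine yearly values per region, a coefficient computed this way carries wide uncertainty, which is worth keeping in mind when comparing regions.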
Year-to-year variations in monthly average O3_USB are relatively large, with 10-15 ppb differences between the highest and lowest O3_USB years during the warmest months (Figs. 7, S11). Seasonal variations also differ by region, especially during summer. For example, the western US regions have a smooth seasonal cycle, with O3_USB concentrations rising from January to a peak in July and August, and then declining again.
O3_USA anomalies relative to the 2004-2012 average illustrate declining influence in all regions, with negative anomalies after 2007 on both O3_top10obs and average days (Figs. 8, S14). This finding is well established by earlier work demonstrating decreases in high-O3 concentrations as a result of regional NOx emissions reductions over the past few decades (Cooper et al., 2012, 2014; Jaffe et al., 2018; Young et al., 2017). O3_BVOC is the main driver of the high and low O3 anomalies (up to ±5 ppb on O3_top10obs_JJA days) from year to year (Figs. 8, S15).
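An anomaly relative to the multi-year average, as used above, can be computed as in this minimal sketch. The dictionary layout and the function name `anomalies` are assumptions made for illustration.

```python
def anomalies(yearly):
    """Map each year to its departure from the multi-year mean.

    yearly: dict of year -> source contribution (ppb), e.g. O3_USA
    averaged over the O3_top10obs_JJA days of that year (assumed layout).
    """
    if not yearly:
        raise ValueError("need at least one year of data")
    baseline = sum(yearly.values()) / len(yearly)
    # Negative values mark years below the multi-year average.
    return {year: value - baseline for year, value in yearly.items()}
```

For example, `anomalies({2004: 10.0, 2005: 12.0, 2006: 14.0})` gives -2.0, 0.0, and 2.0 ppb for the three years.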
Specific events can affect O3 in any given year. For example, in 2008, there were extensive fires across much of California in May, June, and July. In 2008, the Pacific SW region, which includes California, Nevada, and Arizona, shows a positive anomaly in O3_BB (> 1 ppb) on the O3_top10obs days, stronger than during any other year in that region (Fig. S15). If we restrict our analysis solely to Reno, NV, the anomaly for O3_BB was 7 ppb in July 2008 relative to the 2004-2012 July average (not shown). We emphasize that a single location can be more strongly influenced by a specific source than the regional averages on which we have focused.
Currently, the U.S. EPA uses a 3-year averaging period of the fourth highest MDA8 O3 to assess compliance with the O3 NAAQS. We evaluate the extent to which this 3-year averaging period removes interannual variability in meteorology (the grounds for the averaging) (Figs. 9, S16). The observed range is generally much smaller than the model estimate. We find that the 3-year average of the fourth highest day decreases the range by 2-6 and 5-18 ppb in the observations and O3_Base, respectively, when compared to taking the fourth highest day in any given year when we look across all regions (Table 5). However, the 3-year average of the fourth highest day still ranges from 3 to 9 and 2 to 11 ppb in the observations and O3_Base, respectively, across all regions (compared to 5-15 and 10-36 ppb in the observations and O3_Base on the fourth highest day in each individual year). Thus, while averaging across the years decreases the spread, variability remains. In keeping with our previous analysis of the O3_top10obs days, we compare the spread of the fourth highest O3 day in each of the 3 years to the range of the O3_top10obs days across each 3-year span; the fourth highest days can range almost as widely as the O3_top10obs days in some years, but in other years, are clustered closer together (Fig. 9). Figure 9 shows that the range in O3_top10obs days for O3_Base generally correlates with O3_USB in the WUS, suggesting that O3_USB is the dominant influence on the high days there, but there is little correlation in the EUS. We conclude that a 3-year smoothing period is not long enough to eliminate the interannual variability in MDA8 O3 levels entirely, and in the WUS, this interannual variability tends to reflect variations in O3_USB.
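The effect of the 3-year averaging on the spread can be sketched as below. This is an illustration under simplifying assumptions: the helper names are hypothetical, each year is a plain list of daily MDA8 values, and regulatory details such as data-completeness and rounding rules are deliberately left out.

```python
def fourth_highest(daily_mda8):
    """Fourth highest daily MDA8 O3 value (ppb) in one year."""
    if len(daily_mda8) < 4:
        raise ValueError("need at least four daily values")
    return sorted(daily_mda8, reverse=True)[3]


def three_year_averages(annual_values):
    """Means over each run of 3 consecutive years (9 years give 7 values)."""
    return [
        sum(annual_values[i:i + 3]) / 3
        for i in range(len(annual_values) - 2)
    ]


def value_range(values):
    """Spread between the highest and lowest value."""
    return max(values) - min(values)
```

Comparing `value_range` of the nine annual fourth highest values with `value_range` of the seven 3-year averages mirrors the two rows reported for each region in Table 5.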

Discussion and conclusions
As air quality controls decrease US anthropogenic emissions of O3 precursors, the relative importance of the background influence on total surface O3 increases. We use O3 MDA8 concentrations spanning 2004-2012 from the EPA AQS, CASTNet, and Mount Bachelor Observatory sites, and sensitivity simulations from the global GEOS-Chem 3-D chemistry transport model to estimate the influence from various individual background sources on O3 in each of the 10 EPA regions in the continental USA. The global scale of the GEOS-Chem model allows us to quantify intercontinental transport (including global methane) in addition to regional natural and anthropogenic sources of O3. The sensitivity simulations span 9 years, allowing us to examine the role of these sources in contributing to interannual variability. Our analysis contrasts average- and high-O3 days.
Correlations between monthly averages across 2004-2012 show that the model captures monthly variations from year to year, especially during summer (JJA). The model shows substantial variability in simulated monthly US background O3 concentrations from year to year, on the order of 10-15 ppb between 2004 and 2012 in summer (Fig. 7). We find that the extent to which the current 3-year averaging period for assessing compliance with the National Ambient Air Quality Standard for O3 succeeds in smoothing out interannual variability depends on the range in consecutive years, and thus varies by region and time period, but is generally not long enough to completely eliminate the interannual variability in background O3 (Fig. 9). We find substantial biases in the severity (+0-19 ppb in maximum daily 8 h average (MDA8) O3) and timing of high-O3 events in the model. The model underestimates the frequency of high events in spring, possibly associated with stratospheric intrusions (Fiore et al., 2014; Zhang et al., 2011, 2014). Future efforts would benefit from quantifying the stratospheric (as well as Asian) influence alongside the other background sources we consider. We find a stronger influence of US anthropogenic emissions on regionally averaged MDA8 O3 (up to 30 ppb), and from BVOCs (up to 15 ppb) and soil NOx (up to 10 ppb) on the 10 most biased days as compared to average days. We conclude that regional production of O3 is driving the pervasive high positive model bias in summer, as opposed to transported background O3, although our sensitivity simulations do not allow us to rule out the possibility of a coincident missing sink.
Our finding that BVOC emissions contribute to the summertime surface O3 biases could reflect poor representation of the emissions (and subsequent oxidation chemistry). Earlier work has noted that MEGAN BVOC emissions are too high over California (Bash et al., 2016), southeast Texas (Kota et al., 2015), the Ozarks in southern Missouri (Carlton and Baker, 2011), and across much of the USA (Wang et al., 2017). One recent model study uniformly reduced MEGAN isoprene emissions by 20 % (Li et al., 2018), but we did not apply any such scaling here. In regions that are highly NOx-sensitive, additional isoprene should not strongly influence O3, as found over southeast Texas (Kota et al., 2015). While not eliminated entirely, the summertime model bias does lessen in the simulation with BVOC emissions set to zero, suggesting that the O3 bias is indeed exacerbated if BVOC emissions are overestimated in the model.
On the 10 days with the highest observed MDA8 O3 values (O3_top10obs) in each season, the model indicates that US anthropogenic and biogenic VOC emissions are the most important drivers, relative to average days, over most regions (Tables 2, 3). O3_top10obs_MAM and O3_top10obs_SON days (i.e., the 10 highest spring and fall MDA8 O3 days) are up to 9 °C warmer than average, but O3_top10obs_JJA days (i.e., the 10 highest summer MDA8 O3 days) are only 1-2 °C warmer than average (Table 2). US anthropogenic emissions enhance O3_top10obs_JJA days by 5-11 ppb above average in the eastern US regions, but by less than 2 ppb over the three western regions. Over these westernmost regions, US background O3 is 4-12 ppb higher on O3_top10obs_JJA days than on average (Table 2). Across the continental USA, biogenic VOC emissions enhance O3 by 1-7 ppb above average on O3_top10obs_JJA days, while intercontinental pollution is either similar or up to 2 ppb higher on average days (Table 3). Analysis of our simulations thus indicates that the highest O3 events are associated with regional O3 production rather than transported background.
With our sensitivity simulations, we interpret the lack of an overall trend in MDA8 O3 on O3_top10obs_JJA days as a balance between rising US background O3 (by 2 ppb for O3_USB from 2004-2006 to 2010-2012 averaged over all regions) and a declining contribution from US anthropogenic emissions (by 6 ppb for O3_USA from 2004-2006 to 2010-2012 averaged over all regions). The declining influence of US anthropogenic emissions on O3_top10obs_JJA days is consistent with earlier work showing high-O3 concentrations decreasing in response to regional precursor emissions controls since the late 1990s (e.g., Cooper et al., 2012, 2014; Frost et al., 2006; Simon et al., 2016).
In contrast to previous work, including with the GEOS-Chem model (e.g., Fiore et al., 2014 and references therein), we find that US background O3 tends to be higher in summer than in spring in most regions. This likely reflects differences in the isoprene chemistry, specifically the isoprene nitrates, between our version of GEOS-Chem (Mao et al., 2013) and older versions that treat isoprene nitrates as greater sinks for NOx and thereby suppress O3 production. The coarse resolution of our model will excessively mix isoprene and soil NOx sources (e.g., Yu et al., 2016), and thus may exaggerate the relative importance of enhanced background O3 resulting from soil NOx and isoprene. Nevertheless, the model skill at capturing the observed year-to-year variability in the regionally averaged 10 highest days lends some confidence to its attribution of this variability to natural sources (e.g., Fig. 6). Future work with high-resolution models (e.g., at the regional scale, ideally with boundary conditions that include source attributions from a global model) is needed, along with observational evidence, to quantify the extent to which biogenic VOC and NOx contribute to the highest observed O3 levels in the warm season. The importance of temperature-sensitive sources like biogenic VOC and NOx emissions to background O3 implies that in a warmer climate, these background influences on O3 will play an even more important role in driving up O3 levels.

Figure 1. Frequency distribution of regionally averaged US MDA8 O3 values from 2004 to 2012 in the (a) Schnell and Prather (2017) dataset interpolated to 2° × 2.5° and (b) at individual observational sites prior to averaging over each of the 10 EPA regions (total number of points is 9 years × 365 or 366 days × 10 regions) in the observations (blue) and the GEOS-Chem model (orange). (c) As in panel (b) but for the 10 most biased days in each region (total number of points is 9 years × 10 days × 10 regions). The line drawn at 70 ppb in panels (a, b) is the current O3 NAAQS level.

Figure 4. Multi-year (2004-2012) March-October average temperature and MDA8 O3 source contributions estimated with the GEOS-Chem model in the (a) Southeast and (b) Mountains and Plains regions on the 10 most biased days (blue) versus averaged across all days (yellow). Note that the two regions are on different scales.

5 Interannual variability in the sources influencing high versus average ground-level O3
Despite its high mean bias and seasonal phase shift, the model does capture some of the interannual variability in observed O3_top10obs_JJA MDA8 O3 concentrations (Figs. 6, S10; r = 0.5 to ≥ 0.9). Comparing the 2004-2006 period with 2010-2012, both observed and simulated MDA8 O3 concentrations on O3_top10obs_JJA days hold steady or decrease across all regions. This change reflects opposing influences in the model: rising O3_USB (by 2 ppb averaged over all regions) and declining O3_USA concentrations (by 6 ppb averaged over all regions) (Figs. 6, S10; Table 4).

Figure 5. Average 2004-2012 influence of each sensitivity simulation on O3_Base in the (a) Southeast and (b) Mountains and Plains regions on MDA8 O3_top10obs_JJA days (red) versus averaged across all days (blue). Error bars show the concentration on the lowest versus highest year for each sensitivity simulation in each region. Daily 24 h average temperature is also shown.

Figure 6. Average yearly MDA8 O3_top10obs_JJA concentrations for observations (divided by 2 to fit on the same axes; blue dashed line), O3_Base (divided by 2; blue solid line), O3_USB (red), O3_USA (black), O3_NAT (green), and daily average temperature (in °C; light blue) in the (a) Southeast and (b) Mountains and Plains regions on the O3_top10obs_JJA days.
Figure 6 shows that O3_NAT tracks with O3_USB and temperature. Dips in MDA8 O3 occur during years with cooler temperatures (2008-2009) and increases occur in years with warmer temperatures (2011-2012), indicating that year-to-year variability in O3_USB on O3_top10obs_JJA days is primarily driven in the model by natural sources sensitive to meteorology rather than international O3 transport (Figs. 6, S10). Although 2012 was the hottest year on average between 2004 and 2012 (except in the Pacific NW where 2004 was warmer by about a degree), it was not the hottest summer in all regions. We find that O3_USB drives the interannual variability on O3_top10obs_JJA days in the WUS (r = 0.72-0.85 for O3_USB versus O3_Base, whereas r = 0.05-0.64 for O3_USA versus O3_Base; Table S5).
Interannual and seasonal variability in O3_USB are generally greater in the Eastern U.S. regions than in the Mountains and Plains region and the Plains region (Figs. 7, S11). Year-to-year variability in O3_BVOC is smaller than that in O3_USB, with a maximum range of about 10 ppb between the highest and lowest years during August (Figs. 7, S12). O3_SNOx ranges by a few ppb throughout the summer in the Southeast and by up to 6 ppb over the Mountains and Plains in August (Figs. 7, S13).

Figure 9. The three fourth highest values (solid dots) used to calculate the 3-year average of the fourth highest MDA8 O3 day (hollow diamond). Vertical bars show the range between the highest and lowest O3_top10obs days across each 3-year span (i.e., across 30 total points) occurring between March and October in the (a) Southeast and (b) Mountains and Plains regions in the observations (black), and the O3_Base (blue) and O3_USB (red) simulations sampled on the same days as the top 10 observed values.

From 2004-2006 to 2010-2012, MDA8 O3 concentrations on O3_top10obs_JJA days vary from year to year, but show little overall trend, decreasing by 3 ppb in both the observations and the model averaged over all regions (Fig. 6, Table 4).
[...] O3 concentration from hourly values at 108 CASTNet sites with data between 2004 and 2012, requiring at least 18 h of data per day for each MDA8 O3 calculation. The Mount Bachelor Observatory, established in 2004 by the University of Washington Jaffe Research Group, is located 2.7 km above sea level on the summit of Mount Bachelor, an extinct volcano in the Cascade Mountains of central Oregon. It provides an estimate of baseline O3 levels over the west coast of the USA.
Atmos. Chem. Phys., 18, 12123-12140, 2018, www.atmos-chem-phys.net/18/12123/2018/

Table 1. Sensitivity simulations with the GEOS-Chem model and their application to estimate sources of ground-level O3.
North American lightning NOx: O3_Base minus the simulation with the North American lightning NOx source shut off (O3_NALNOx)
Soil NOx emissions: O3_Base minus the simulation with the soil NOx emissions shut off (O3_SNOx)
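The differencing in Table 1 amounts to subtracting a perturbed simulation from the base simulation day by day. The sketch below assumes both simulations are available as equal-length lists of daily MDA8 O3 for one region; the function name `source_contribution` is hypothetical.

```python
def source_contribution(base, source_off):
    """Estimate a source's influence as O3_Base minus the perturbed run.

    base:       daily MDA8 O3 (ppb) from the O3_Base simulation.
    source_off: daily MDA8 O3 (ppb) from the simulation with one
                source shut off (for example soil NOx emissions).
    """
    if len(base) != len(source_off):
        raise ValueError("simulations must cover the same days")
    # A positive value means the source raises O3 on that day.
    return [b - s for b, s in zip(base, source_off)]
```

Because O3 chemistry is nonlinear, contributions estimated this way for different sources need not sum to the total, which is a general caveat of the shut-off approach rather than a claim specific to this study.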

Table 2. Summary information for each region. Columns show the model bias and O3 abundances in the O3_Base, O3_USB, and O3_USA simulations, and in the observations, as well as daily average temperature (1) on the O3_top10obs days in each season (average of 2004-2012), (2) across all days in each season (average of 2004-2012), and (3) the difference between these values, rounded to the nearest whole number.

Table 3. Summary information for each region. Each column shows the concentration for each background O3 source influence (1) on the O3_top10obs days in each season (average of 2004-2012), (2) across all days in each season (average of 2004-2012), and (3) the difference between these values, rounded to the nearest whole number.

Table 5. Summary information for each region. The first row next to each region reports the range across 2004-2012 of the fourth highest MDA8 O3 values from each of the 9 individual years for the observations, O3_Base, and O3_USB. The second row reports the range across 2004-2012 of each of the 3-year averages of the fourth highest values (7 values) in each region for the observations, O3_Base, and O3_USB.