A comprehensive evaluation of seasonal simulations of ozone in the northeastern US during summers of 2001 – 2005

Regional air quality simulations were conducted for summers 2001–2005 in the eastern US and subjected to extensive evaluation using various ground and airborne measurements. A brief climate evaluation focused on transport by comparing modeled dominant map types with those from reanalysis. Reasonable agreement was found for their frequency of occurrence and distinctness of circulation patterns. The two most frequent map types from reanalysis were the Bermuda High (22%) and passage of a Canadian cold front over the northeastern US (20%). The model captured their frequency of occurrence at 25% and 18% respectively. The five average distributions of 1-h ozone (O3) daily maxima simulated using the Community Multiscale Air Quality (CMAQ) modeling system reproduced salient features in the observations. This suggests that the ability of the regional climate model to depict transport processes accurately is critical for reasonable simulations of surface O3. For CMAQ summer surface 8-h O3 daily maxima compared with observations, the mean bias, root mean square error, and index of agreement were −0.6±14 nmol/mol, 14 nmol/mol, and 71% respectively. CMAQ performed best in moderately polluted conditions and less satisfactorily in highly polluted ones. This highlights the common problem of models overestimating lower and underestimating higher O3 levels. Diagnostic analysis suggested that significant overestimation of inland nighttime low O3 mixing ratios may be attributed to underestimates of nitric oxide (NO) emissions at night. The absence of the second daily peak in simulations for the Appledore Island marine site possibly resulted from misrepresentation of the land surface type at the coarse grid resolution. Comparison with shipboard measurements suggested that CMAQ has an inherent problem of underpredicting O3 levels in continental outflow. Modeled O3 vertical profiles exhibited a lack of structure, indicating that key processes missing from CMAQ, such as lightning-produced NO and stratospheric intrusions, are important for accurate upper tropospheric representations.
Correspondence to: H. Mao (hmao@gust.sr.unh.edu)


Introduction
Air quality in New England is especially susceptible to variations in seasonal climate due to its extensive areal forest coverage, complex terrain features, and significant influx of air pollutants from major urban/industrial centers in the eastern US. There is already strong evidence that the length of the growing season is increasing in New England (New England Regional Assessment Group, 2001). Longer and hotter summers can have a major impact on air quality and the occurrence of O3 episodes via chemical and physical processes.
To provide an assessment of the current environment, it is imperative to evaluate model performance for simulations of present day air quality.
Three-dimensional air quality models have been evaluated extensively with a focus on regional climate/meso-scale meteorological and O3 simulations (e.g., Hogrefe et al., 2004; Dawson et al., 2008; Vivanco et al., 2009). For instance, Hogrefe et al. (2004) found that the Fifth-Generation NCAR/Penn State Mesoscale Model (MM5) and the Community Multiscale Air Quality modeling system (CMAQ) captured interannual and synoptic-scale variability in surface temperature and O3 mixing ratios during summers 1993-1997 with the magnitude of fluctuations on intra-day to diurnal variation underestimated. Van Loon et al. (2007) conducted an intercomparison of one-year simulations from seven regional air quality models and suggested that mixing ratios at night and in winter were more difficult to reproduce than during daytime and in summer due to difficulties of representing a stable atmosphere accurately in models. In their three summer simulations, Vivanco et al. (2009) found fair agreement between model and measurements at rural sites in Spain with the exception of significant underestimation of O3 levels over the surrounding Madrid metropolis possibly due to poor representation of precursor transport in the model. Most model-observation comparison studies were conducted on a domain and multi-season average basis. However, besides providing such overall comparison, this study examined our modeling systems particularly in their performance of capturing episodes embedded in multiple seasons using extensive observations from field campaigns. We believe that our model evaluation here is thus more rigorous than previous studies and can provide insight on possible causes for model-observation discrepancies through evaluation of case studies.
Published by Copernicus Publications on behalf of the European Geosciences Union.
H. Mao et al.: A comprehensive evaluation of seasonal simulations of ozone
A common problem in model-simulated O3 levels has been underestimation of high O3 values, which makes it difficult to predict O3 exceedance days reliably. One exception is Huang et al. (2007), who showed realistically simulated summer ozone peaks, especially for the northeastern US, averaged over the summer season and the subdomain. Zhang et al. (2006a, b) attributed underprediction in daily maximum 1-h O3 mixing ratios on high O3 days to overpredicted planetary boundary layer (PBL) height. On most low O3 days uncertainties in O3 precursor emissions and overestimated surface layer vertical mixing led to inaccurate results. Yu et al. (2008) revealed that during their four-day simulation the underprediction of peak O3 concentrations on high-O3 days was caused by underrepresentation of regional contributions and, to a lesser extent, of local production. Results from these studies point to the critical importance of accurately simulating the processes that govern PBL dynamics and physics in order to reproduce observed O3 distributions.
A few studies included comparison of modeled upper level O3 mixing ratios with measurements. Ozonesonde data have recently been used to evaluate models (Mao et al., 2006; Chai et al., 2007; Pierce et al., 2007; Tarasick et al., 2007; Yu et al., 2007). Mena-Carrasco et al. (2007) compared simulated O3 using their regional air quality model STEM with NASA DC-8 and NOAA WP-3 airborne measurements during International Consortium for Atmospheric Research on Transport and Transformation (ICARTT) 2004 and found a strong positive surface level bias and a negative upper tropospheric bias. The low altitude and mid-tropospheric bias was reduced by using improved emission inventories, whereas the upper tropospheric bias was predominantly affected by boundary conditions that were the output of global models.
In this study, we conducted an evaluation of multi-season regional climate and air quality model simulations for summers (1 June-30 September) 2001-2005 over the eastern US. We evaluated the performance of the climate and air quality models with a focus on transport processes and O3 mixing ratios. A unique aspect of this study was that in addition to evaluating models on the domain and seasonal average basis, we examined their ability to represent pollution episodes embedded in the five summer ensemble. To do that, we utilized measurements from a suite of observing platforms encompassing long-term ground-based networks, the NOAA ship Ronald H. Brown, aircraft, ozonesondes, radiosondes, the field campaigns New England Air Quality Study (NEAQS) 2002 and ICARTT 2004 intensive studies, and our 2004 Duke Forest, North Carolina (NC) intensive studies.

Climate model
The regional climate for summers 2001-2005 over the eastern United States was simulated using the Regional Climate Modeling System (RCMS) (Chen et al., 2004) with 36 km spatial resolution (indicated as RCMS in Fig. 1). The five seasonal runs were driven by the 6-hourly National Centers for Environmental Prediction (NCEP)/National Center for Atmospheric Research (NCAR) Reanalysis Project (NNRP) 1° × 1° reanalysis data, which provided initial and boundary conditions for the modeling domain. Details of the RCMS configuration can be found in Chen et al. (2004, 2005). The physical options include the Medium Range Forecast (MRF) PBL scheme (Hong and Pan, 1996), the simple ice microphysics scheme (Dudhia, 1989), the Community Climate Model (CCM2) radiation transfer package (Briegleb, 1992), the Grell et al. (1991) cumulus parameterization, and the land-surface-transfer model developed by Pollard et al. (1995). RCMS was run over the time period of 1 May-30 September for each year of 2001-2005.

Emission model
The EPA Sparse Matrix Operator Kernel Emissions (SMOKE) Modeling System was used to produce gridded emission data over the five summers. The most recent version of SMOKE, with MOBILE6 and BEIS3, was used with the 1999 NEI data. It was interfaced with the RCMS output to generate gridded emission data for the seasonal air quality simulations.

Photochemical model
The Community Multi-scale Air Quality model (CMAQ) (Byun and Schere, 2006) was employed to simulate the distributions of pollutants over 1 June-30 September of 2001-2005, with the first three days as spin-up days. The 36 km horizontal grid structure in CMAQ follows that of MM5 with 72 × 59 cells, which is marked as CMAQ in Fig. 1. Vertically there were 21 layers with the first 13 layers (from the surface up) identical to those in RCMS to maintain high resolution in the PBL. The CB-IV chemistry mechanism (Gery et al., 1989) was used due to its applicability and wide usage in regional-scale modeling and its lesser computational demand compared to other schemes, which is particularly important for multi-seasonal runs. The cloud and aerosol modules were both used in this study. The cloud module included parameterizations for sub-grid convective precipitating and non-precipitating clouds and grid-scale resolved clouds. Cloud effects were included for both gas-phase species and aerosols. The aerosol module used an approach that represents the particle size distribution as the superposition of three lognormal subdistributions, termed modes. The module calculates the concentrations of both PM2.5 and PM10 and includes estimates of the primary emissions of elemental and organic carbon, dust and other species not further specified. Secondary aerosol species considered were sulfate, nitrate, ammonium, water, and organics from precursors of anthropogenic and biogenic origins.

Map typing
The modeled transport processes were evaluated by comparing the dominant circulation patterns from reanalysis data and model runs, which were classified using the map typing technique (Lund, 1963). The NCEP Global Final Analysis (FNL) products are available for four time intervals each day (00:00, 06:00, 12:00, and 18:00 UTC) on a 1° × 1° global horizontal grid (http://dss.ucar.edu/datasets/ds083.2). For the evaluation of the circulation patterns we extracted the reanalyzed and modeled sea level pressure fields at 12:00 UTC from FNL and RCMS respectively. The synoptic-scale circulation patterns were classified by applying the correlation-based map typing algorithm of Lund (1963) to the RCMS and NCEP FNL sea level pressure (SLP) fields. This technique has been successfully applied to synoptic classification of summertime circulation patterns over the northeastern United States (Hegarty et al., 2007, 2009; Hogrefe et al., 2004). The algorithm calculates a correlation coefficient between the grids representing scalar meteorological analysis fields over a given spatial domain at different times. The map types were selected using a critical correlation coefficient (i.e., 0.65), and then all days in a given study period were classified as one of these types based on the degree of correlation. A minimum group size equal to 5% of the total number of maps in each set (NCEP FNL or RCMS) was utilized to eliminate map types with few members.
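The classification steps described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name `lund_map_types`, the flattened array layout, the iterative selection of key maps, and the choice to leave days unclassified when they correlate below the critical value with every key map are all assumptions made for illustration. Only the critical correlation coefficient (0.65) and the 5% minimum group size are taken from the text.

```python
import numpy as np

def lund_map_types(slp, r_crit=0.65, min_frac=0.05):
    """Correlation-based map typing sketch (after Lund, 1963).

    slp: array of shape (n_days, n_gridpoints), one flattened SLP map per day.
    Returns (keys, labels): indices of the key days and, for every day,
    the index into `keys` of its map type (-1 if unclassified).
    """
    n_days = slp.shape[0]
    corr = np.corrcoef(slp)                  # day-by-day correlation matrix
    remaining = np.ones(n_days, dtype=bool)  # days not yet assigned to a type
    keys = []
    while remaining.any():
        # For each remaining day, flag the remaining days it correlates with
        match = (corr >= r_crit) & remaining[None, :] & remaining[:, None]
        counts = match.sum(axis=1)
        key = int(np.argmax(counts))
        # Assumed stopping rule: drop types below the minimum group size
        if counts[key] < min_frac * n_days:
            break
        keys.append(key)
        remaining &= ~match[key]             # remove this type's members
    # Final pass: assign each day to the best-correlated key map, if any
    labels = np.full(n_days, -1)
    if keys:
        r_to_keys = corr[:, keys]
        best = r_to_keys.argmax(axis=1)
        classified = r_to_keys.max(axis=1) >= r_crit
        labels[classified] = best[classified]
    return keys, labels
```

The frequency of occurrence of each type then follows from counting the entries of `labels`, and the fraction of entries equal to -1 corresponds to days that could not be classified.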

General characteristics of simulated seasonal climate
Over the entire domain during the five summers, hourly meteorological measurements were collected at 1449 sites. For the spatial and temporal averages, temperature was underpredicted by 2.5 °C, with a standard deviation of 4 °C. Our further analysis suggested that the spatial and temporal distribution of surface temperature was well simulated during clear-sky days. However, surface temperature was underestimated when frontal systems were passing over the Northeast due to overpredicted cloudiness and subsequently underpredicted solar radiation reaching the surface in RCMS. The same problem was reported by Liang et al. (2004), who found that the CCM2 radiation package produces a deficit of up to 80 W m⁻² in solar radiation reaching the surface compared with station measurements in Illinois. The RCMS underpredicted moisture by 0.8 g kg⁻¹ on average, with greater underestimation during daytime. This is consistent with the findings by Braun and Tao (2000) that MRF predicted a drier PBL than other schemes due to its strong mixing processes. In our study, wind speed was better simulated at night than in daytime, and was in general underpredicted by the model by 3 m s⁻¹. Modeled wind direction lagged observations by 4.5° on average.
Simulations above the surface layer were evaluated using twice daily (00:00 and 12:00 UTC) vertical sounding data below 2 km from the twenty sites within the RCMS domain as illustrated in Fig. 1. The model underpredicted temperature by <2 °C in all levels and specific humidity by ≤1 g kg⁻¹. Wind speed was overpredicted by ≤1 m s⁻¹ except in layers 2-4 where the average was ≤2 m s⁻¹, while wind direction was simulated fairly accurately.
Transport processes are important in redistributing airborne pollutants and thus model simulations of these processes need to be evaluated. One way to do that is to compare the modeled and observed dominant circulation patterns for the summers of 2001-2005. Note that we used the FNL from the NCEP to extract circulation patterns approximating the real world. The map typing analysis identified six map types from the NCEP FNL analyses (denoted as FNL1-FNL6, Fig. 2) and five map types from the RCMS simulations (denoted as RCM1-RCM5, Fig. 3). A total of 70% of the days could be classified as one of the FNL types and 77% of the days could be classified as one of the RCMS types. The frequency of occurrence of each map type and the corresponding meteorological conditions are summarized in Table 1.
The FNL1 map type (Fig. 2a) representing the Bermuda High circulation was the most common of the NCEP FNL types (22%) and featured light south-southwest flow over much of the northeastern US. Similarly, RCM1 (Fig. 3a) occurred on 25% of the days with great resemblance to FNL1, and its higher frequency of occurrence suggests a slight tendency for RCMS to overpredict the occurrence of the Bermuda High (Table 1). Map types FNL2 and FNL3 (Fig. 2b and c), which depict a cold front located off the east coast with a combined 20% frequency of occurrence, were closely matched by RCM2 (Fig. 3b) occurring on 18% of the days. This suggests close agreement in capturing cold frontal passages over the eastern US between the reanalysis and model results. The anticyclonic type FNL4 (Fig. 2d) was reproduced by RCMS reasonably well in RCM4 (Fig. 3d) with very close agreement in frequency of occurrence (7% and 8% respectively). The inland cold frontal trough extending from a cyclone tracking across Canada approaching the east coast in FNL5 was matched well both in pattern and frequency of occurrence (10% and 13%) by RCM3. This map type produced strong south-southwesterly warm, humid flow with showers ahead of the front and cooler drier conditions and northwest flow over the Midwest behind the front. RCM5 (Fig. 3e) shared similar features with FNL6 (Fig. 2f) [...]
[...] maximum mixing ratios at all sites within the model domain during the five summers, the mean bias (OBS-MOD) was 0.7 nmol/mol (±17 nmol/mol), the root mean square error (RMSE) 17 nmol/mol, the index of agreement (IA) 69%, and the correlation coefficient (r) between observations and modeled results 0.51. The modeled and observed 1-h O3 daily maxima at all monitoring sites from each day of the five summers were also plotted in a one-to-one corresponding manner (Fig. 4a). The slope value of this correlation was 0.37, resulting from overprediction of lower values and underprediction of higher levels. Apparently, the small mean bias value does not necessarily mean close one-to-one agreement between observations and model results, which is reflected in the standard deviation of the mean bias. On the contrary, it was a result of over- and underpredicted values cancelling each other. Current regional air quality models commonly produce a similar order of agreement with observations (e.g., Hogrefe et al., 2004; Zhang et al., 2006a; Otte, 2008; Vivanco et al., 2009).
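The evaluation statistics quoted above can be computed as in the following Python sketch. The exact formulas used in this study are not given in the text, so several choices here are assumptions: the index of agreement is taken to be Willmott's d, the slope is taken from a least-squares fit of modeled on observed values, and the function name `evaluation_stats` is hypothetical. The sign convention of the mean bias (OBS-MOD) follows the text.

```python
import numpy as np

def evaluation_stats(obs, mod):
    """Paired model-observation statistics for daily maximum O3 (nmol/mol)."""
    obs = np.asarray(obs, dtype=float)
    mod = np.asarray(mod, dtype=float)
    diff = obs - mod                        # OBS - MOD, as in the text
    obs_mean = obs.mean()
    # Assumed form of the index of agreement (Willmott's d)
    denom = ((np.abs(mod - obs_mean) + np.abs(obs - obs_mean)) ** 2).sum()
    return {
        "mean_bias": diff.mean(),
        "bias_sd": diff.std(ddof=1),        # spread of the individual biases
        "rmse": np.sqrt((diff ** 2).mean()),
        "ia": 1.0 - (diff ** 2).sum() / denom,
        "r": np.corrcoef(obs, mod)[0, 1],
        "slope": np.polyfit(obs, mod, 1)[0],  # modeled regressed on observed
    }

# Illustrative values only: low values overpredicted, high values underpredicted
stats = evaluation_stats([40.0, 60.0, 80.0], [50.0, 60.0, 70.0])
```

In this made-up three-point example the mean bias is zero while the slope is well below one, which is the cancellation effect described in the text: a small mean bias alone does not imply one-to-one agreement.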
One-hour O3 daily peaks averaged over the 2001-2005 summers from the EPA AIRNOW measurements show that south of the northern borders of Pennsylvania (PA), Ohio (OH), and Indiana (IN), mean 1-h O3 daily maxima at most stations exceeded 60 nmol/mol. The highest levels (70-75 nmol/mol) were found in neighboring areas of PA, New Jersey (NJ), Maryland (MD), Virginia (VA), and Delaware (DE) (Fig. 5a). An elongated patch of O3 mixing ratios of 60-70 nmol/mol spanned mainly the middle of the domain in a southwest-northeast orientation and a few smaller areas in the southwestern states (Fig. 5a). Lower values (<50 nmol/mol) occurred in Maine (ME), Wisconsin (WI), Minnesota (MN), and Iowa (IA).
Our model simulations captured the pattern and magnitude of the observed salient features in the spatial distribution of 1-h O3 daily maxima with primary exceptions in Alabama (AL) and Georgia (GA) as well as along the coast of the Mid-Atlantic States extending to southern New England, where modeled O3 levels were higher than observations by 5-10 nmol/mol (Fig. 5b). Interestingly, the model simulations suggested higher O3 mixing ratios (>60 nmol/mol) over water than over land, i.e., the Great Lakes and off the east coast, possibly due to lower dry deposition and lower PBL height.
The observed 8-h O3 daily maxima averaged over the five summers (Fig. 5c) exhibited a spatial pattern similar to the 1-h data with overall decreases of 5-10 nmol/mol at most sites compared to the latter. The model reproduced the general pattern and magnitude of the observed distribution (Fig. 5d). The overall mean bias was −0.4 nmol/mol (±14 nmol/mol), the RMSE 14 nmol/mol, and IA 71%. The observation-model r was 0.54 with a slope value of 0.39 (Fig. 4b), which is understandably slightly less scattered compared to the hourly data owing to the use of a moving average to obtain the 8-h data.
The seasonal time series of observed and modeled 1- and 8-h O3 daily maxima averaged at all sites across the domain over the five summers are presented in Fig. 6. Both 1- and 8-h O3 data suggested close agreement in temporal variability between observations and model simulations. The observed 1-h O3 shows a range of 46-76 nmol/mol over the season compared to the simulated range of 47-65 nmol/mol. In general smaller values occurred in the first and last two weeks of the season, and a biweekly cycle appeared to be embedded in the seasonality. The model captured these key features very well albeit with overprediction of dips and underprediction of peaks. In particular, the largest discrepancy (9 nmol/mol) between observations and model was found during the 23-27 June period when O3 reached the seasonal maximum of 72 nmol/mol. This is illustrated in more detail at individual UNH AIRMAP sites in Sect. 5.1.
The over-/underprediction tendency of model simulations is represented distinctly in the cumulative distribution of instantaneous 1- and 8-h O3 daily maxima from the entire domain over all summers (Fig. 7). The model overpredicted observed 1-h O3 daily peaks <56 nmol/mol by 0-9 nmol/mol, which comprised 47% of the total number of data points, and underpredicted observed 1-h O3 daily peaks >56 nmol/mol with the largest difference reaching 21 nmol/mol at the 100th percentile level (168 nmol/mol). This suggests that the model performed best in simulating moderately polluted conditions and less satisfactorily in highly polluted ones. The 8-h data showed under- and overprediction of close magnitude (∼10 nmol/mol) at the lower and upper end of the distribution.
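The smoothing introduced by the 8-h moving average, noted above in connection with the reduced scatter of the 8-h data, can be illustrated with a short Python sketch. The windowing convention is an assumption made for illustration: each 8-h window is attributed to the day on which it starts, and only windows lying entirely within that day (start hours 0-16) are considered. The function name `daily_max_1h_and_8h` is hypothetical.

```python
import numpy as np

def daily_max_1h_and_8h(hourly):
    """Daily maxima of 1-h values and of 8-h running means.

    hourly: 1-D sequence of hourly O3 values covering whole days,
            starting at local midnight (length must be a multiple of 24).
    """
    hourly = np.asarray(hourly, dtype=float)
    n_days = hourly.size // 24
    max_1h = hourly.reshape(n_days, 24).max(axis=1)
    # run8[i] is the mean of hours i .. i+7
    run8 = np.convolve(hourly, np.ones(8) / 8.0, mode="valid")
    # Assumed convention: windows starting at hours 0..16 of each day
    max_8h = np.array([run8[d * 24 : d * 24 + 17].max() for d in range(n_days)])
    return max_1h, max_8h
```

A short, sharp afternoon spike raises the 1-h daily maximum far more than the 8-h one, which is why 1-h data are preferred when the aim is to examine model skill for individual peaks.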
Overall ∼90% of 1-h O3 daily maxima from observations were <80 nmol/mol centered in the bin of 50-60 nmol/mol with the frequency of occurrence 21%, a few percent higher than the other 4 bins (Fig. 8). About 14% of the total points varied between 80 and 125 nmol/mol, and only 0.6% exceeded 125 nmol/mol, the criterion from the National Ambient Air Quality Standards (NAAQS) for 1-h O3 data. The model performed reasonably well in simulating the overall distribution, with the peak frequency of occurrence in the same bin as observations. The predicted frequency in bins over 40-70 nmol/mol was larger than the observed by 3-8% whereas it was a factor of two smaller than the observed in the bins of <40 nmol/mol and >80 nmol/mol. Compared to 0.3% of the observed points being categorical O3 exceedance, the model captured 0.05% of the total for that case, a factor of six smaller, which raises caution in using models to predict the areas of O3 exceedance. The observed distribution of frequency of occurrence for averaged 8-h O3 daily peaks showed a decreasing trend from the lowest O3 bin to the highest with the largest frequency being 25% in the bin <40 nmol/mol. The model captured less than half of the frequency of occurrence in the bin <40 nmol/mol, overpredicted by ∼10% in the bins 40-60 nmol/mol, predicted a value in close agreement with observations in the bin of 60-70 nmol/mol, and underpredicted significantly in the bins >80 nmol/mol.
Another index to quantify air pollution is the length of an O3 episode; the longer an episode lasts the more likely a regional build-up of pollutant levels will occur. CMAQ's tendency to underestimate higher O3 levels can conceivably cause simulations to miss O3 exceedance days defined by the NAAQS criterion, and can further lead to underestimating the number of O3 episodes. To apply model results more meaningfully, as opposed to just using one single number as the universal exceedance criterion, we defined an exceedance to be when the O3 daily maximum level exceeded the 90th percentile value calculated for each monitoring site based on all daily maxima during the five summers, and we defined an episode to be a succession of days of such exceedances.
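The site-specific exceedance and episode definitions above can be sketched in Python as follows. This is an illustration under stated assumptions, not the analysis code of this study: the function names are hypothetical, the percentile uses NumPy's default linear interpolation, and each call to `episode_lengths` is assumed to receive one continuous summer so that runs are not joined across different years.

```python
import numpy as np

def site_exceedances(daily_max, pct=90):
    """Flag days whose O3 daily maximum exceeds the site's own percentile.

    daily_max: all daily maxima at one site (all summers pooled) used to
    derive the threshold. Returns a boolean array of the same length.
    """
    daily_max = np.asarray(daily_max, dtype=float)
    threshold = np.percentile(daily_max, pct)
    return daily_max > threshold

def episode_lengths(exceed):
    """Lengths of runs of consecutive exceedance days in one summer."""
    lengths, run = [], 0
    for flag in exceed:
        if flag:
            run += 1
        elif run:
            lengths.append(run)   # a run just ended
            run = 0
    if run:                       # run reaching the end of the record
        lengths.append(run)
    return lengths
```

Binning the returned lengths into one-, two-, three-, four-, and more-than-four-day groups, separately for observations and model output, gives the kind of episode frequency comparison discussed in the text.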
During the five summers there were 16 887 and 13 949 exceedance occurrences across all monitoring sites in the 1-h O3 daily maxima data of observations and simulations, respectively. The model underestimated the occurrence of O3 exceedance by 17%. We grouped the episodes into five categories by length: one to four days at a one-day interval, and more than four days (Fig. 9). The model overestimated the occurrence of episodes in all length bins by 27-52%, with the largest overestimation (∼52%) for the four-day episode type, except for the one-day group, for which the model underestimated episode occurrence by 50%. The model captured three- and >four-day episodes with the greatest accuracy. The 8-h O3 daily maxima data (Fig. 9) showed that the model overestimated all types even more significantly except for the >four-day type, in which the model was in close agreement with observations (2%).

Association between circulation and O 3 distribution
Corresponding to the five map types derived from the reanalysis data, where map types FNL2 and FNL3 were combined due to their similarity in circulation patterns, the observed and modeled average distributions of 1-h daily maximum O3 are presented in Fig. 10. The salient features in the observed average distribution corresponding to the five map types are very similar to those in Hegarty et al. (2007), where the relationship was investigated between synoptic-scale circulation patterns and surface O3 across the northeastern US for summers 2000-2004. We found that FNL1 and FNL5 depict the two stages of cold front passage over the Northeast with FNL1 preceding FNL5 (Fig. 10a and d). In these two map types, the Bermuda High prevailed over the eastern US producing weather conditions with strong solar radiation and warm temperatures conducive to occurrence of high O3 levels. Consequently, in these two map types higher O3 levels were observed across the eastern US before the cold front moved the polluted air offshore (Fig. 10f and i).
Map types FNL2 and FNL3 represent the stages where the cold front moved off the continent with an influx behind it of relatively clean Canadian air into New England, reducing the O3 level in the region as revealed by observations (Fig. 10g). Lower O3 levels were observed along the east coastline (Fig. 10h) corresponding to the FNL4 circulation and in particular spread out extensively across the Northeast as shown in FNL6 (Fig. 10e and j). Map types FNL4 and FNL6 are not conducive to O3 formation and build-up due to the influx of marine air and cool and dry Canadian air masses from over Hudson Bay respectively.
These features in average distributions of daily maximum 1-h O3 for different map types were captured generally in model simulations (Fig. 10k-o). The underestimation of higher levels of O3 was manifested in map types FNL1 and FNL5, while overestimation of lower levels was particularly apparent along the southeastern coastline in FNL4 and in New England in FNL2 and FNL3. Overall, capturing the primary features of climatological circulation patterns appeared to be critical to simulating surface O3 mixing ratios accurately.
www.atmos-chem-phys.net/10/9/2010/ Atmos. Chem. Phys., 10, 9-27, 2010

Model comparison to observations
In this section only 1-h data were used for model and observation comparison because the objective was to further evaluate model skill in representing chemical and dynamical processes. Hence we believe that the model-observation discrepancy for this purpose should be examined as is, rather than being mathematically smoothed to some extent due to the use of 8-h O3 data.
In our previous episode studies (Mao et al., 2004, 2006) we used the AIRMAP ground-based and some of the ICARTT multi-platform observations for model evaluation, which overlaps in part with the use of these observations in this study. In this study, in addition to capturing the overall seasonal variability in O3 over the five summers, the model was also examined for its ability to reproduce the ensemble of particular episodes in the multi-season context. Thus the model-observation comparison conducted here is more robust than episode studies, and consequently it can enhance our confidence in the model performance should the results be satisfactory.

AIRMAP ground-based observations
We compared modeled hourly O3 mixing ratios with observations from the four AIRMAP sites, Thompson Farm (TF) (24 m a.g.l.), Castle Springs (CS) (320 m a.g.l.), Mount Washington (MW) (∼2 km a.g.l.), and Appledore Island (AI) (sea level) for all five summers with the exception of summer 2001 for AI when measurements at that site were not yet available. Observations showed that median mixing ratios of O3 at all four sites were similar except at MW, at 28, 36, 33, and 45 nmol/mol for TF, AI, CS, and MW respectively, which suggests that the low elevation sites (<400 m) likely reflect the same regional airshed. The time series of O3 mixing ratios at all sites showed a periodicity of 3-5 days (Fig. 11), which is possibly associated with synoptic scale dynamical processes.
Further, there are also distinct site idiosyncrasies due to their differing geographic characteristics. Our previous study suggested that the location of AI allows it to receive polluted air masses from more upwind source regions, such as the Mid-Atlantic States, the Greater Boston area, and/or the northeastern urban corridor, than inland New England sites (Mao and Talbot, 2004). This explains why AI was observed to experience higher O3 levels more frequently than the inland sites, as evidenced by its highest 90th percentile value of 75 nmol/mol compared to 55 and 53 nmol/mol at TF and CS respectively (Table 3). It is even higher than the 90th percentile value of 67 nmol/mol at MW, which often had enhancements in O3 levels due to free tropospheric and stratospheric influences (Xiao et al., 2009). Specifically, over the five summers, there were 18 sample points at AI that exceeded 120 nmol/mol, 3 at TF and MW, and none at CS. The 10th percentile value at the coastal site TF was 8 nmol/mol, a factor of ∼3-4 lower than the other sites, which was driven by the low mixing ratios that were observed mostly on nights with the occurrence of a nocturnal inversion layer. These low values reached single digits and frequently zero for an extended period of hours on ∼50% of summer nights (Talbot et al., 2005). Talbot et al. (2005) suggested that this phenomenon of nighttime depletion of surface O3 was caused by continuous loss of surface O3 via dry deposition and in situ chemistry with limited re-supply of O3-rich air masses from aloft.
CS is situated at an elevation of 320 m, near the top of the boundary layer at night, but in the middle of the convective boundary layer during the day. This characteristic means that even on nights with the inversion layer, there is always exchange between the surface and the layers aloft. The loss of O3 near the surface is thus constantly replenished, which is reflected in its lower 10th percentile value of 22 nmol/mol, almost a factor of 3 higher than at TF. At MW the lower 10th percentile and median values were larger than at CS and AI by ∼10 nmol/mol and the 90th percentile value larger than at TF and CS by ∼13-14 nmol/mol but smaller than at AI by 8 nmol/mol. MW is frequently in the free troposphere, and hence observations suggest a mixture of influences on O3 ranging from regional to farther upwind sources as well as stratospheric intrusions (Xiao et al., 2009).
The model captured the general patterns of variability at all sites (Fig. 11). For instance, it reproduced the 3-5 day periodicity, the high O3 episodes, and the week-long low O3 periods during 12-19 July 2001 uniformly occurring at TF, CS, and MW and 24 August-4 September 2002 at all sites. However, the model tended to underpredict significantly the spikes in O3 mixing ratios, particularly during episodes. For example, during the 12-19 August 2002 episode the observed 1-h O3 daily maxima at AI reached >144 nmol/mol at 22:00 UT on 14 August and two hours later reached nearly the same level at TF. The model underpredicted these by >60 nmol/mol, the largest underprediction of all 4 summers.
We examined all nights with >30 nmol/mol overestimation, and averaged modeled and observed diurnal cycles including these nights (Fig. 12). Overprediction by >30 nmol/mol at TF was mostly on nights with the observed occurrence of O3 depletion. However, instead of a steady decrease in O3 to near zero, modeled O3 increased slightly from midnight to 05:00 LT followed by a ∼5 nmol/mol decrease during the next three hours. This indicates that in the model, there were source(s) for O3 at night that likely included advection and/or exchange between the nighttime boundary layer and the residual layer aloft. We speculated that this may be due to misrepresentation of the nighttime boundary layer in the model, which was subsequently examined.
The occurrence of nighttime O3 depletion is a distinct indicator of the nocturnal boundary layer, which is characterized by minimal turbulence and represented in the model by small values of vertical eddy diffusivity (kzz). Zhang et al. (2006b) suggested that the floor value of kzz of 1.0 m² s⁻¹ used in CMAQ v4.4 was too large, causing over-mixing at night. Hu et al. (2003) showed that an optimal floor value of kzz may be between 0.1 and 1.0 m² s⁻¹, and that using a value of 0.1 m² s⁻¹ reduced the positive normalized mean bias in modeled O3 mixing ratios by 16%. However, in our study the floor kzz value was reduced by a factor of 2 to 0.5 m² s⁻¹, and this did not seem to yield realistic nighttime O3 levels, especially during depletion events. For example, during the period of 11-31 August 2001 our observations showed six nights with O3 mixing ratios <5 nmol/mol. This clearly suggests the occurrence of the nocturnal inversion layer on those nights, and consequently kzz should be minimal. This was indeed captured in the model by implementation of the floor value of kzz on those nights, and yet the model still overestimated O3 during the depletion events by >30 nmol/mol. A separate model sensitivity study (Hwang et al., 2009) showed that the value of kzz needed to be reduced to 10⁻⁴ m² s⁻¹ to reproduce the nighttime O3 depletion, which is 3 orders of magnitude smaller than the lower end of the wide range (0.1-5 m² s⁻¹) reported/applied in the literature (Hanna et al., 1982; Johansson and Janson, 1993; Plummer et al., 1996; Seinfeld and Pandis, 1998; Stull, 1988; Zhang et al., 1982). This raises the possibility that the prescribed floor value of kzz may not be the dominant factor leading to nighttime overprediction of O3 mixing ratios.
Talbot et al. (2005) found that, averaged over the whole summer, 65% of the overnight depleted amount of O3 observed at TF was likely caused by titration by NO. They estimated that the total nighttime contribution of mobile source emissions in Strafford County to the ambient NO level in the area was ∼25 nmol/mol. In comparison, over the grid cell containing TF, which is comparable in size to Strafford County, the modeled rate of mobile emission of NO was 2 mol s⁻¹ on average, which amounts to a contribution of 14 nmol/mol, or about 50% of that estimated by Talbot et al. (2005). This suggests that the model may underestimate O3 titration by NO, which might have contributed to the considerable overestimation of O3 at night.
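The conversion from an emission rate to a mixing ratio contribution is not spelled out here, so the following is only a rough box-model sketch of how such a number could arise. Every parameter other than the 2 mol s⁻¹ emission rate and the 36 km grid size is an assumption made for illustration: a 10 h night, a 100 m deep, well-mixed nocturnal layer, surface pressure and 20 °C, and no chemical loss, deposition, or advection. The function name is hypothetical.

```python
R_GAS = 8.314462618  # universal gas constant, J mol-1 K-1


def no_mixing_ratio_nmol_per_mol(emission_mol_s, duration_h, cell_size_m,
                                 layer_depth_m, temperature_k=293.15,
                                 pressure_pa=101325.0):
    """Mixing ratio (nmol/mol) reached if a constant NO emission accumulates,
    without any loss, in a well-mixed box covering one square grid cell."""
    emitted_mol = emission_mol_s * duration_h * 3600.0
    air_mol_per_m3 = pressure_pa / (R_GAS * temperature_k)  # ideal gas law
    air_mol = air_mol_per_m3 * cell_size_m ** 2 * layer_depth_m
    return emitted_mol / air_mol * 1e9


# 2 mol/s and 36 km come from the text; 10 h and 100 m are assumed values.
estimate = no_mixing_ratio_nmol_per_mol(2.0, 10.0, 36_000.0, 100.0)
print(f"{estimate:.1f} nmol/mol")
```

Under these assumptions the result is of the same order as the 14 nmol/mol quoted above, but it scales inversely with the assumed layer depth and linearly with the assumed accumulation time, so it should be read as an order-of-magnitude check only.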
In comparison, the largest underestimation of O3 was found at AI in the window of 17:00-23:00 LT on days with a secondary daily maximum occurring at the site (Fig. 12). On average the observed O3 mixing ratios reached their first peak at 14:00 LT and hovered around it for two hours. A secondary peak occurred at 18:00 LT, followed by a slow decrease during the next three hours and a faster decrease afterward. In comparison, the model depicted a textbook diurnal cycle with the peak at 14:00-15:00 LT followed by a linear decrease over the next five hours. Mao and Talbot (2004) suggested that on days with favorable flow regimes and meteorological conditions conducive to O3 formation and buildup, it takes ∼4 h at an average wind speed of 6 m s⁻¹ for air masses originating from the Greater Boston area to reach AI. In situ O3 destruction/production likely occurs in air masses in transit between the two locations (Mao et al., 2006; Griffin et al., 2004; Pszenny et al., 2007). This can lead to a second rise in O3 levels at AI after the daytime peak hours if the net production is positive, which was frequently observed in continental outflow from the Northeast (Mao et al., 2006). TF, a site just 25 km from AI and 10 km inland, was not likely to receive air masses from such a transport route. However, the 36 km grid resolution is not fine enough to separate AI from TF; instead, the model treated the land use type of both sites as deciduous broadleaf vegetation, whereas in reality AI is situated completely in the marine boundary layer. We speculate that the misrepresented land use type and the subsequent incorrect simulation of circulation and chemistry might have contributed to the model missing the secondary daily peak.
Ozone daily maxima were most underpredicted at MWO, likely for three reasons. First, the underprediction is possibly due to the no-flux upper boundary conditions at the model top (100 hPa), under which the influence of stratosphere-troposphere exchange was not properly taken into account in the model. IONS ozonesonde data suggested that the free troposphere over northeastern North America was frequently enriched by stratospheric O3 during ICARTT 2004 (Thompson et al., 2007a) and that 20-25% of tropospheric O3 is of stratospheric origin (Thompson et al., 2007b). Tang et al. (2009) showed much improved prediction of upper tropospheric O3 levels using boundary conditions derived from IONS ozonesonde measurements. Second, this site can be influenced by transboundary transport from southern Canada (DeBell et al., 2004a) and from as far away as Asia (DeBell et al., 2004b). In our model runs the default boundary conditions in CMAQ were used, so O3 mixing ratios on the boundaries were time-independent, which made it nearly impossible to represent accurately the impact of upstream source regions on the continental scale. Third, the underprediction may result from a mismatch between the model grid-averaged value and the observed value from a single point in the grid (Mao et al., 2006). Perhaps averaging the summit observations with those along the vertical slope of MW (not available) would make a more reasonable comparison with model results.

Model comparison with the Duke Forest intensive
The ∼2-week O3 measurements at Duke Forest during 12-28 September 2004 showed an increasing trend in the daily maximum value over the first 12 days, reaching the highest 1-h O3 daily peak of 75 nmol/mol on 23 September, followed by a decrease afterward (Fig. 13a). During that period, a high pressure system moved from southern Canada toward the northeastern US on 18 September, stagnated with its center over OH and PA during 20-23 September, and began weakening and moving off the coast on the 24th. The spatial distribution of 1-h O3 daily peaks on 23 September showed pervasive higher O3 levels in the Northeast, with the highest patches (>80 nmol/mol) in OH, PA, the Mid-Atlantic States, and southern Canada.
The model overpredicted O3 levels during the first 5 days by up to a factor of 5 (14 September) when the observed daily maxima were <30 nmol/mol. In contrast, it captured the magnitude and timing of the daily maxima very well, with nearly zero difference between model and observation, on 21-24 September when O3 levels were the highest (Fig. 13a). Overall, the modeled hourly mixing ratios appeared to correlate with the observed values from the five summers at three AIRMAP sites and the September 2004 Duke Forest Intensive at r² = 0.32 with a slope of 0.33 (not shown). Further examination of the results revealed that the model represented wind speed poorly over the entire period except for the few days, 22-24 September, with relatively higher O3 mixing ratios and calm winds (Fig. 13a and b). The Duke site was influenced predominantly by northerly to easterly winds during September 2004, with two excursions caused by the peripheral influence of Hurricanes Ivan (16 September) and Jeanne (26 September). Periods of elevated wind speeds coincided with the passing of the storms, as evidenced by the revolving nature of the wind direction over a two-day interval. The disruption in the basic diurnal cycles of O3 and air temperature by the passage of these storms is readily apparent. For instance, in the 19 September reanalysis data the center of Hurricane Ivan was a few hundred kilometers offshore, and NC was under the influence of a ridge associated with the strong high pressure system centered in southern Canada (Fig. 14a). The modeled wind speed deviated the most from reanalysis on that day, as the simulated sea level pressure and wind vectors showed that Hurricane Ivan made landfall in NC (Fig. 14b). In comparison, during the period of 22-24 September the model performed well in capturing the dominance of the high over land in both position and magnitude and the approaching low in the west (Fig. 14c and d). This result again points to the imperative requirement of reasonable simulation of circulation patterns in order to reproduce observed O3 levels, as documented here for Duke Forest.

Model-observation comparison during NEAQS 2002 and ICARTT 2004
Model results were also compared with a suite of NEAQS 2002 and ICARTT 2004 measurements from the NOAA ship Ronald H. Brown and 2004 ozonesondes from the IONS network. Overall the modeled trends in O3 during the two intensives agreed reasonably well with the Ron Brown observations (Fig. 15). The model performed particularly well in capturing episodes of higher O3 mixing ratios, such as 15-20 July 2002, 22-23 July 2002, 3-6 August 2002, 9-14 July 2004, 20-23 July 2004, and 3-4 August 2004. The three episodes of lower O3 levels, 24-27 July 2002, 5-8 July 2004, and 27 July-2 August 2004, were also depicted with reasonable agreement in magnitude compared to measurements. Geographically, the model reproduced the west-east gradient in O3 mixing ratios during NEAQS 2002, from higher values immediately offshore to lower ones farther out over water, and the latitudinal gradient, from higher values in the south near the North Carolina coast to lower ones in the northern states (as far north as Maine). It also simulated a few hot spots just offshore of the New York City and Boston metropolises (Fig. 16a and b). Similarly during ICARTT 2004 the modeled trend in O3 along the Ron Brown cruise tracks agreed well with shipboard observations, especially in the near-coastal areas northeast of Greater Boston and in southern ME (Fig. 16c and d). The lowest O3 mixing ratios (<30 nmol/mol) occurred in the near-coastal area east of Boston, with slightly higher levels of 40-50 nmol/mol ∼50 km farther out over the ocean. A scatter plot of modeled versus observed values showed a 1-to-1 correlation with r² = 0.25 (not shown). Note that in the two intensives the modeled west-to-east offshore gradients were much flatter compared to observations, implying considerably underpredicted continental export of O3 and other pollutants.
www.atmos-chem-phys.net/10/9/2010/ Atmos. Chem. Phys., 10, 9-27, 2010
In addition to ground-level comparisons, we examined model performance in simulating upper air O3 mixing ratios using ozonesonde data obtained at six ICARTT-IONS locations: Narragansett, RI; Pellston, MI; Beltsville, MD; Huntsville, AL; Wallops Island, VA; and the ship Ron Brown (Thompson et al., 2007a, b). The model-observation agreement varied between locations and days. On average, the sounding data showed a gradual increase from 33 nmol/mol (±16 nmol/mol) at the 4 m height to 76 nmol/mol at the 5.4 km altitude, followed by a precipitous increase to 130 nmol/mol (±105 nmol/mol) near the 10 km altitude, 20% of which resulted from stratospheric influences (Fig. 17a) (Pfister et al., 2008; Yorks et al., 2009).
In comparison, the modeled vertical profile remained almost constant from the surface to 10 km, varying within 5 nmol/mol with a standard deviation of ∼10 nmol/mol. The model captured the vertical trend at all sites, with deviations of around ±15 nmol/mol from observations below 6 km, except for the Ron Brown and Beltsville. The Ron Brown observations showed an increasing trend in O3 from 25 nmol/mol at 2 m to 109 nmol/mol at 10 km, whereas the model simulated an increase from 44 nmol/mol at 2 m to a peak of 59 nmol/mol at 600 m, followed by a gradual decrease to 3 km and then an increase to the top of the troposphere. The model reproduced the shape of the vertical profile over Beltsville with underprediction >14 nmol/mol at all levels, reaching maximum values in the top layer.
In their evaluation of ETA-CMAQ forecast model results using the IONS 2004 measurements, Yu et al. (2007) reproduced the O3 vertical profile well at low altitudes, especially at the Pellston site, similar to what is shown in this study. However, the authors revealed a consistent model overestimation above ∼6 km due to the lateral boundary conditions derived from the Global Forecast System.
In our study the lack of structure in the modeled vertical profile indicates that some key processes are missing in the model in addition to the lack of stratosphere-troposphere exchange. For instance, lightning NOx is not represented in the CMAQ version employed in this study. Cooper et al. (2009) suggested that more than 80% of summertime upper tropospheric NOx above the eastern US is produced by lightning. This missing source of NOx in the upper troposphere could potentially result in model underestimation of O3 mixing ratios in that region.
The NASA DC-8 during INTEX-A, as part of ICARTT 2004, surveyed air quality over the northeastern US and its adjacent near-coastal area. The averaged vertical profile of O3 mixing ratio from the model agreed reasonably with the onboard airborne measurements, with the smallest difference (<10 nmol/mol) below 2.5 km (Fig. 18). The modeled O3 mixing ratios correlated with the observed at r² = 0.23 and a slope of 0.19, resulting from the model completely missing O3 levels >80 nmol/mol (Fig. 18a). The time series of O3 mixing ratios showed that the model captured most observations except at O3 levels >70 nmol/mol (Fig. 18b). Mixing ratios >100 nmol/mol were observed on 2 and 22 July and 6-7 August, and were completely missed by the model. These mostly occurred at altitudes >7 km (Fig. 18c) and are likely the result of stratospheric influence that cannot be captured by the model owing to the top lateral boundary conditions.
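For reference, the r² and slope values quoted in this section can be obtained from paired model and observation samples with an ordinary least-squares fit. The sketch below is an illustration under stated assumptions: it regresses modeled on observed values (the text does not state which variable was treated as independent), drops pairs with missing data, and uses a hypothetical function name, `slope_and_r2`.

```python
import numpy as np


def slope_and_r2(observed, modeled):
    """Least-squares fit modeled = slope * observed + intercept.

    Returns (slope, intercept, r2), where r2 is the squared Pearson
    correlation coefficient. Pairs containing NaN are ignored.
    """
    obs = np.asarray(observed, dtype=float)
    mod = np.asarray(modeled, dtype=float)
    valid = np.isfinite(obs) & np.isfinite(mod)
    obs, mod = obs[valid], mod[valid]
    slope, intercept = np.polyfit(obs, mod, 1)
    r = np.corrcoef(obs, mod)[0, 1]
    return slope, intercept, r ** 2
```

A slope well below 1, as reported above, is what such a fit returns when the modeled values span a much narrower range than the observations, for example when the highest observed mixing ratios are missed.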

Summary
We have examined model performance in the five summer (2001-2005) simulations of regional climate and O3 mixing ratios using long-term continuous measurements from US and Canadian Surface Hourly Observations, National Weather Service radiosondes, the EPA AIRNOW and UNH AIRMAP, as well as data from the field campaigns NEAQS 2002, ICARTT 2004, and our Duke Forest 2004 work. Our map typing analysis suggested that RCMS captured the patterns and frequency of the five dominant map types, with the best agreement with the reanalysis data found for the two most dominant map types. The modeled distributions of surface O3 daily maxima corresponding to these map types were in excellent agreement with observations. This suggests that accurate simulation of circulation was a determining factor in reproducing the salient features in surface O3 distribution. The mean bias, root mean square error, and index of agreement of the five summer modeled surface 8-h O3 daily maxima simulated by CMAQ, as compared to observations, were −0.6±14 nmol/mol, 14 nmol/mol, and 71% respectively, and the values calculated using the 1-h O3 data were very similar. Both modeled 1- and 8-h O3 daily maxima suggested that the model performed best in simulating moderately polluted conditions and less satisfactorily in simulating highly polluted ones.
Moreover, the diagnostic analysis suggested that significant overestimation of nighttime low O3 mixing ratios for the coastal site Thompson Farm may have resulted from underestimated NO emissions at night. The absence of the second daily peak at the marine site Appledore Island possibly resulted from the misrepresentation of the land surface type of the site due to the coarse grid resolution. The comparison of modeled and Ron Brown shipboard measurements from NEAQS 2002 and ICARTT 2004 suggested that CMAQ has an inherent problem in underpredicting O3 levels in continental outflow, probably due to underrepresented O3 precursor emissions in the model. While CMAQ appeared to simulate the lower tropospheric O3 distribution reasonably, the overall lack of structure in the modeled vertical profiles indicates that key processes missing in the model, such as lightning-produced NOx and stratospheric intrusions, are important for accurate free tropospheric simulations. Future work is warranted to improve the representation of O3 precursor emissions and processes influencing upper tropospheric air to better simulate the three-dimensional distribution of O3.
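The three summary statistics quoted above (mean bias, root mean square error, and index of agreement) can be computed from paired daily maxima as sketched below. The definitions are assumptions, since they are not restated in this section: the bias is taken as model minus observation, its spread as the sample standard deviation of the differences, and the index of agreement in the commonly used form of Willmott (1981). The function name is hypothetical.

```python
import numpy as np


def evaluation_stats(modeled, observed):
    """Return (mean_bias, bias_std, rmse, index_of_agreement) for paired
    modeled and observed values, ignoring pairs that contain NaN."""
    mod = np.asarray(modeled, dtype=float)
    obs = np.asarray(observed, dtype=float)
    valid = np.isfinite(mod) & np.isfinite(obs)
    mod, obs = mod[valid], obs[valid]

    diff = mod - obs
    mean_bias = diff.mean()
    bias_std = diff.std(ddof=1)
    rmse = np.sqrt((diff ** 2).mean())

    # Willmott (1981): d = 1 - sum((M - O)^2) / sum((|M - Obar| + |O - Obar|)^2)
    obs_mean = obs.mean()
    potential_error = ((np.abs(mod - obs_mean) + np.abs(obs - obs_mean)) ** 2).sum()
    index_of_agreement = 1.0 - (diff ** 2).sum() / potential_error
    return mean_bias, bias_std, rmse, index_of_agreement
```

With this form the index ranges from 0 to 1, so a value reported as 71% would correspond to 0.71. Note that a near-zero mean bias together with a large RMSE, as found here, indicates compensating over- and underpredictions rather than close point-by-point agreement.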

Fig. 2. The representative sea level pressure (SLP) in hPa distribution for map types FNL1-FNL6 and the date of occurrence.

Fig. 8. Distributions of frequency of 1-h and 8-h O3 daily maxima for different bins from observations and model simulations.

Fig. 9. Distributions of occurrence of episodes of varying length using 1- and 8-h O3 daily maxima from observations and model simulations.

Fig. 10. The top 5 map types from reanalysis (a-e) and corresponding average distributions of 1-h O3 daily maxima from observations (f-j) and model results (k-o).

Fig. 13. Time series of modeled (red) and observed (blue) hourly O3 (a), hourly wind speed (b), and temperature (c) at Duke Forest, NC during our campaign.

Fig. 14. Analyzed (a and c) and modeled (b and d) sea level pressure and wind vectors for 19 and 23 September 2004.

Fig. 16. Hourly O3 from ship observations (a and c) and model results (b and d) in NEAQS 2002 (a and b) and ICARTT 2004 (c and d).

Fig. 17. Vertical profiles of O3 mixing ratios from IONS observations and model simulations during ICARTT 2004.

Table 1. Predominant map types for JJA from NCEP FNL and RCMS data sets and associated meteorological characteristics. The correlation coefficient for map typing was r = 0.70.
the west; general east to southeasterly flow in the northeastern US
FNL5 and RCM3 | 10 | 13 | Inland cold frontal trough extending from cyclone tracking across Canada approaching the east coast, strong south-southwesterly warm, humid flow with showers ahead of the front and cooler drier conditions and northwest flow over Midwest behind the front
FNL6 and RCM5 | 5 | 4 | Anticyclone centered near Hudson Bay extending into the northeastern US; cool, dry northerly flow over the coastal states with warmer, humid, southerly flow over the Midwest states