Long-term particulate matter modeling for health effect studies in California – Part 1: Model performance on temporal and spatial variations

For the first time, a ∼ decadal (9 years from 2000 to 2008) air quality model simulation with 4 km horizontal resolution over populated regions and daily time resolution has been conducted for California to provide air quality data for health effect studies. Model predictions are compared to measurements to evaluate the accuracy of the simulation with an emphasis on spatial and temporal variations that could be used in epidemiology studies. Better model performance is found at longer averaging times, suggesting that model results with averaging times ≥ 1 month should be the first to be considered in epidemiological studies. The UCD/CIT model predicts spatial and temporal variations in the concentrations of O3, PM2.5, elemental carbon (EC), organic carbon (OC), nitrate, and ammonium that meet standard modeling performance criteria when compared to monthly-averaged measurements. Predicted sulfate concentrations do not meet target performance metrics due to missing sulfur sources in the emissions. Predicted seasonal and annual variations of PM2.5, EC, OC, nitrate, and ammonium have mean fractional biases that meet the model performance criteria in 95, 100, 71, 73, and 92 % of the simulated months, respectively. The base data set provides an improvement for predicted population exposure to PM concentrations in California compared to exposures estimated by central site monitors operated 1 day out of every 3 days at a few urban locations. Uncertainties in the model predictions arise from several issues. Incomplete understanding of secondary organic aerosol formation mechanisms leads to OC bias in the model results in summertime but does not affect OC predictions in winter when concentrations are typically highest. The CO and NO (species dominated by mobile emissions) results reveal temporal and spatial uncertainties associated with the mobile emissions generated by the EMFAC 2007 model. 
The WRF model tends to overpredict wind speed during stagnation events, leading to underpredictions of high PM concentrations, usually in winter months. The WRF model also generally underpredicts relative humidity, resulting in less particulate nitrate formation, especially during winter months. These limitations must be recognized when using data in health studies. All model results included in the current manuscript can be downloaded free of charge at http://faculty.engineering.ucdavis.edu/kleeman/.


Introduction
Numerous scientific studies have demonstrated associations between exposure to ambient airborne particulate matter (PM) and a variety of health effects, such as cardiovascular diseases (Dockery, 2001; Ford et al., 1998; Franchini and Mannucci, 2009; Langrish et al., 2012; Le Tertre et al., 2002), respiratory diseases (Gordian et al., 1996; Hacon et al., 2007; Sinclair and Tolsma, 2004; Willers et al., 2013), low birth weight and birth defects (Barnett et al., 2011; Bell et al., 2010; Brauer et al., 2008; Laurent et al., 2013, 2014; Stieb et al., 2012), lung cancer (Beelen et al., 2008; Beeson et al., 1998; Pope et al., 2002; Vineis et al., 2006), mortality, and lower life expectancy (Chen et al., 2013; Correia et al., 2013; Dockery et al., 1993; Franklin et al., 2007; Goldgewicht, 2007; Cao et al., 2011; Laden et al., 2000; Ostro et al., 2006; Pope et al., 2009). Recently a few studies have investigated the associations between particle composition and health effects (Bell et al., 2007, 2010; Burnett et al., 2000; Cao et al., 2012; Franklin et al., 2008; Ito et al., 2011; Krall et al., 2013; Levy et al., 2012; Mar et al., 2000; Ostro et al., 2007, 2010; Son et al., 2012). However, there remains large uncertainty about which PM components are most responsible for the observed health effects, possibly due to the fact that central site monitoring measurements used in the PM composition studies have limited temporal, spatial, and chemical resolution, which could potentially lead to misclassification of exposure estimates and mask some detailed correlations. Central site PM measurements typically have a collection schedule of one sample every 3 or 6 days at a few sites used to represent an entire population region. Important particle size distribution and chemical composition information is not always routinely measured. Additional information relating PM composition to health effects would provide a solid foundation to design effective PM control strategies to protect public health at a reduced economic and social cost.
Published by Copernicus Publications on behalf of the European Geosciences Union.
Chemical transport models (CTMs) have recently been used as one of the alternative approaches to address the limitations of central site monitors (Anenberg et al., 2010; Bravo et al., 2012; Sarnat et al., 2011; Tainio et al., 2013). The latest generation of CTMs represents a "state-of-the-science" understanding of emissions, transport, and atmospheric chemistry. CTM predictions provide more detailed composition information and full spatial coverage of air pollution impacts with a typical temporal resolution of 1 h. CTMs have great potential to fill the time and space gaps in the central site monitoring data set for PM measurements, leading to improved exposure assessment in epidemiological studies.
The CTM applications in epidemiology studies to date have generally used relatively coarse spatial resolutions in order to reduce computational burden. Global CTMs have used horizontal resolutions of over 100 km and regional CTMs have used resolutions of 12-36 km. These resolutions cannot capture fine spatial gradients of PM concentrations, especially in areas with diverse topography and demography. Previous CTM predictions used in epidemiology studies have also been limited to time periods of less than 1 year. Recently Zhang et al. (2014a) evaluated the performance of the Community Multiscale Air Quality (CMAQ) model over a 7-year period in the eastern USA, but no other long-term CTM studies for health effect analyses have been published to date. As a further limitation, previous epidemiology studies based on CTM predictions have mostly used predicted mass concentrations of particles with aerodynamic diameter less than 2.5 µm (PM2.5) without taking full advantage of the ability of CTMs to simultaneously estimate population exposure to multiple particle size fractions, chemical components, and source contributions. The variation in CTM prediction bias as a function of space and time due to uncertainties in model inputs (emissions, meteorological fields, mechanism parameters) is often not sufficiently characterized to understand potential impacts on health effect estimates. Detailed analyses are needed to assess the temporal and spatial features of CTM predictions to identify accurate and/or unbiased information for exposure assessment before such information can be applied in health effect studies (Beevers et al., 2013).
The objective of the current study is to develop and apply advanced source-oriented CTMs to predict the concentrations and sources for enhanced PM exposure assessment in epidemiological studies over a long-term period with high spatial resolution in California. California is chosen as the focus area for the current study because it has extensive infrastructure to support CTM studies, and it has one of the largest populations in the USA that is experiencing unhealthy levels of PM pollution. In 2013, 104 US counties with a population of 65 million people were in nonattainment with the National Ambient Air Quality Standards for PM2.5 (EPA, 2013). Approximately half of that population (31 million people) lives in 29 California counties, meaning that California suffers a disproportionately large share of US PM-related mortality (Fann et al., 2012). The California Air Resources Board (CARB) estimates that 14 000-24 000 California residents die prematurely each year due to particulate air pollution (Tran, 2008). The severity of this problem has motivated extensive investments to support air pollution studies. California has the densest ambient PM measurement network, the most accurate emissions inventories, and the most health effect study groups of any state in the United States. Rich data sets are available to support model application and evaluation.
The current study is the first attempt to address the sparse PM data problem in exposure assessment using CTM results over a ∼decadal time period (9 years from 2000 to 2008) over a domain spanning ∼1000 km at a spatial resolution of 4 km. Companion studies have modeled primary PM2.5 and PM0.1 (particles with aerodynamic diameter less than 0.1 µm) concentrations and sources in California (Hu et al., 2014a, b). The current paper, as the third in the series, focuses on model evaluation of total (= primary + secondary) PM2.5 and major components (elemental carbon (EC), organic carbon (OC), nitrate, sulfate, ammonium), emphasizing the aspects of temporal and spatial variations, to identify the features of the CTM results that could add skill to the exposure assessment for epidemiological studies. A future study will investigate the model capability for PM source apportionment of primary and secondary organic aerosols, which is currently an area with great uncertainty.

Air quality model description
The host air quality model employed in the current study is based on the Eulerian source-oriented University of California, Davis/California Institute of Technology (UCD/CIT) chemical transport model (Chen et al., 2010; Held et al., 2004, 2005; Hixson et al., 2010, 2012; Hu et al., 2010, 2012; Kleeman and Cass, 2001; Kleeman et al., 1997, 2007; Mahmud, 2010; Mysliwiec and Kleeman, 2002; Rasmussen et al., 2013; Ying et al., 2007, 2008; Ying and Kleeman, 2006; Zhang and Ying, 2010). The UCD/CIT model includes a complete description of atmospheric transport, deposition, chemical reaction, and gas-particle transfer. The details of the standard algorithms used in the UCD/CIT family of models have been described in the above references and therefore are not repeated here. Only the aspects that were updated during the current study are discussed in the following section.
The photochemical mechanism used by the UCD/CIT model was updated to reflect the latest information from smog-chamber experiments. The SAPRC-11 photochemical mechanism (Carter and Heo, 2012, 2013) was used to describe the gas-phase chemical reactions in the atmosphere. The secondary organic aerosol (SOA) treatment was updated following the method described in Carlton et al. (2010). Seven organic species (isoprene, monoterpenes, sesquiterpenes, long-chain alkanes, high-yield aromatics, low-yield aromatics, and benzene) are considered as precursors for SOA formation. A total of 12 semi-volatile and 7 nonvolatile products are formed from the oxidation of the precursor species. The gas-particle transfer of the semi-volatile and nonvolatile products in the UCD/CIT model is dynamically calculated based on the gas vapor pressures calculated over the particle surface and the kinetic limitations to mass transfer. The explicit chemical reactions and the parameters for the thermodynamic equilibrium calculation (i.e., enthalpy of vaporization, saturation concentrations, and stoichiometric yields) are provided in Carlton et al. (2010) and references therein.
Model simulations were configured using a one-way nesting technique with a parent domain of 24 km horizontal resolution that covered the entire state of California (referred to as CA_24 km) and two nested domains with 4 km horizontal resolution that covered the Southern California Air Basin (SoCAB) (referred to as SoCAB_4 km) and the San Francisco Bay Area, San Joaquin Valley (SJV), and South Sacramento Valley air basins (referred to as SJV_4 km) (shown in Fig. 1). The nested 4 km resolution domains are configured to cover the major ocean, coast, urban, and rural regions that influence California's air quality and, most importantly, to cover most of California's population for the purpose of health effect analyses. Over 92 % of California's population lives in the 4 km domains based on the most recent census information. The UCD/CIT model was configured with 16 vertical layers up to a height of 5 km above ground level in both the parent and nested domains, with 10 layers in the first 1 km. Note that the use of relatively shallow vertical domains is only appropriate in regions with well-defined air basins and would not be appropriate for locations in the eastern USA or other regions with moderate topography. Particulate composition, number, and mass concentrations are represented in 15 size bins, ranging from 0.01 to 10 µm in diameter. Primary particles are assumed to be internally mixed, i.e., all particles within a size bin have the same composition. Previous studies (Ying et al., 2007) have shown that this assumption provides adequate predictions for total PM concentrations relative to source-oriented mixing treatments in California when feedbacks to meteorology are not considered (Zhang et al., 2014b).
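The sectional size representation can be illustrated with a short sketch. The text states only that 15 bins span 0.01 to 10 µm; the equal spacing in log10(diameter) used below is an assumption (a common choice in sectional aerosol models), and the function name is hypothetical rather than part of the UCD/CIT code.

```python
import math


def log_spaced_bin_edges(d_min_um=0.01, d_max_um=10.0, n_bins=15):
    """Return n_bins + 1 diameter edges (in µm), equally spaced in log10(diameter).

    Assumption: logarithmic spacing. The source gives only the number of bins
    and the overall diameter range.
    """
    log_min = math.log10(d_min_um)
    log_max = math.log10(d_max_um)
    step = (log_max - log_min) / n_bins
    return [10.0 ** (log_min + i * step) for i in range(n_bins + 1)]


edges = log_spaced_bin_edges()
# Each bin is described by its lower and upper edge.
bins = list(zip(edges[:-1], edges[1:]))
```

Under this assumption each bin covers one fifth of a decade in diameter, so the upper edge of every bin is about 1.58 times its lower edge.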

Meteorology and emissions
Hourly meteorology inputs (wind, temperature, humidity, precipitation, radiation, air density, and mixing layer height) were generated using the Weather Research and Forecasting (WRF) model v3.1.1 (Wang et al., 2010; Skamarock et al., 2008). Two-way nesting was used with the outer domain at 12 km resolution and the inner nested domain at 4 km resolution. North American Regional Reanalysis data with 32 km resolution and 3 h time resolution were used as initial and boundary conditions of the coarse 12 km domain. The WRF model was configured with 31 vertical layers up to 100 hPa (around 16 km). Four-dimensional data assimilation was used. The Yonsei University (YSU) boundary layer scheme, thermal diffusion land-surface scheme, and Monin-Obukhov surface layer scheme were used based on results from previous studies in California (Mahmud, 2010; Zhao et al., 2011). The surface wind was overpredicted with the original version of WRF, especially for wind speeds less than 3 m s−1, consistent with other studies in California (Angevine et al., 2012; Fast et al., 2014; Michelson et al., 2010a). Overprediction of the slow winds caused underprediction of concentrations during high pollution events. A recent study (C. F. Mass, personal communication, 2010) found that increasing the surface friction velocity (u*) by 50 % reduced the bias in surface wind predictions in a complex-terrain domain. This technique was tested and adopted in previous studies (Hu et al., 2012, 2014a; Mass and Ovens, 2010; Wang et al., 2015) where it improved the accuracy of air quality predictions. In the current study, a 1-year sensitivity simulation for California in the year 2000 revealed that increasing u* by 50 % changed the mean wind bias from 1.15 to −0.50 m s−1 and lowered the root-mean-square error (RMSE) from 2.95 to 2.20 m s−1 (Hu et al., 2014a). It should be noted that this approach reduces positive bias for wind speeds less than ∼3 m s−1 but increases negative bias at higher speeds. Analysis of the wind speed measurements in California air basins shows that 78 % of winds are less than 3 m s−1. Therefore, increasing u* by 50 % in our study improves the wind predictions for a majority of cases during the modeling period. Similar detailed evaluations should be conducted before applying the increased u* approach to other regions and periods. It should also be noted that concentration is inversely proportional to wind speed. As a result, the concentration bias created by a wind speed bias of 1 m s−1 at a true wind speed of 3.5 m s−1 is 50 times lower than the concentration bias created by the same wind speed bias at a true wind speed of 0.5 m s−1. This implies that the underprediction of high wind speeds in the present study has minimal impact on concentration fields used for epidemiology.
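The factor of 50 quoted above can be reproduced with a first-order estimate. The derivation below is a sketch of one plausible reading, not necessarily the authors' calculation; the proportionality constant k and the symbols C, u, and Δu are introduced here for illustration only.

```latex
% Assumption: concentration C is inversely proportional to wind speed u.
C(u) = \frac{k}{u}
\quad\Longrightarrow\quad
\frac{\partial C}{\partial u} = -\frac{k}{u^{2}}
\quad\Longrightarrow\quad
\Delta C \approx -\frac{k}{u^{2}}\,\Delta u .

% Ratio of the concentration biases for the same wind speed bias \Delta u:
\frac{\left|\Delta C\right|_{u = 3.5\ \mathrm{m\,s^{-1}}}}
     {\left|\Delta C\right|_{u = 0.5\ \mathrm{m\,s^{-1}}}}
\approx \left(\frac{0.5}{3.5}\right)^{2} = \frac{1}{49} \approx \frac{1}{50} .
```

The linearization is rough when Δu is comparable to u, as it is at 0.5 m s−1, so the ratio is best read as an order-of-magnitude argument.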
Hourly average meteorology outputs at the air-quality-model vertical layer heights were created by averaging the WRF fields. The meteorology predictions were evaluated against meteorological observations (CARB, 2011a). The meteorological statistical evaluation over the period 2000-2006 has been presented in a previous study (Hu et al., 2014a), and the results in the period 2007-2008 are consistent with those years. In summary, meteorology predictions of temperature and wind speed generally meet benchmarks suggested by Emery et al. (2001). Mean fractional biases (MFBs) of temperature and wind are generally within ±0.15, RMSEs of temperature are around 4 °C, and RMSEs of wind are generally lower than 2.0 m s−1, especially in the SoCAB and SJV air basins which are the focus of the current study. Relative humidity is underpredicted, consistent with findings in other studies in California (Bao, 2008; Michelson et al., 2010b). Precipitation is also underpredicted with an MFB of −76.1 % and RMSE of 2.84 mm h−1. Wind, temperature, and humidity are the major meteorological factors that influence the PM concentrations. Further discussions of the effect of uncertainties in meteorology predictions on PM predictions are included in the Results section.
Hourly gridded gas and particulate emissions were generated using an updated version of the emissions model described by Kleeman and Cass (1998). The standard emissions inventories from anthropogenic sources (i.e., point sources, stationary area sources, and mobile sources) were provided by CARB. Size- and composition-resolved particle emissions were specified using a library of primary particle source profiles measured during actual source tests (Cooper, 1989; Harley et al., 1992; Hildemann et al., 1991a, b; Houck, 1989; Kleeman et al., 1999, 2000, 2008; Robert et al., 2007a, b; Schauer et al., 1999a, b, 2001, 2002a, b; Taback et al., 1979). A few studies have revealed some uncertainties associated with the standard emissions inventories. Millstein and Harley (2009) found that PM and NOx emissions from diesel-powered construction equipment were overestimated by factors of 3.1 and 4.5, respectively. Countess (2003) suggested that a scaling factor of 0.33-0.74 should be applied to the fugitive dust emissions in the Californian San Joaquin Valley. Therefore, scaling factors of 0.32 for off-road diesel sources and 0.50 for dust emissions were applied in the current study. The EMFAC 2007 model (CARB, 2008) was used to scale the mobile emissions using predicted temperature and relative humidity fields through the entire 9-year modeling episode. Biogenic emissions were generated using the Biogenic Emissions Inventory System v3.14 (BEIS3.14), which includes a 1 km resolution land cover database with 230 different vegetation types (Vukovich and Pierce, 2002). Sea-salt emissions were generated online based on the formulation described by de Leeuw et al. (2000) for the surf zone and the formulation described by Gong (2003) for the open ocean. Emissions from wildfires and open burning at 1 km × 1 km resolution were obtained from the Fire INventory from NCAR (FINN) (Hodzic et al., 2007; Wiedinmyer et al., 2011). The FINN inventory provides SAPRC99-speciated daily emissions of gaseous and particulate species (EC, organic matter (OM), PM2.5, and PM10) based on satellite observations of open burning events. Each open burning event is allocated to model grid cells of each domain based on the reported longitude/latitude of the event and the area burned. The emissions were injected at the height of the atmospheric mixing layer (PBL). The temporal variation of wildfire emissions was obtained from the Western Regional Air Partnership report (WRAP, 2005). A size distribution profile was calculated based on assumptions described in Hodzic et al. (2007).
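The inventory adjustments described above amount to multiplying selected source categories by a constant. The following is a minimal sketch of that step; the category names and the dictionary layout are hypothetical and do not reflect the structure of the CARB inventory or the emissions model. Only the two factors, 0.32 and 0.50, come from the text.

```python
# Scaling factors stated in the text. 0.32 is consistent with the reported
# factor-of-3.1 overestimate of PM from diesel construction equipment (1 / 3.1),
# and 0.50 lies inside the 0.33-0.74 range suggested for fugitive dust.
SCALING_FACTORS = {
    "offroad_diesel": 0.32,  # hypothetical category name
    "fugitive_dust": 0.50,   # hypothetical category name
}


def scale_emissions(emissions_by_source, factors=SCALING_FACTORS):
    """Return a new dict with each source's emission rate multiplied by its factor.

    Sources without an entry in `factors` are left unchanged (factor 1.0).
    """
    return {
        source: rate * factors.get(source, 1.0)
        for source, rate in emissions_by_source.items()
    }


raw = {"offroad_diesel": 100.0, "fugitive_dust": 40.0, "onroad_gasoline": 50.0}
scaled = scale_emissions(raw)
```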

Ambient air quality measurements
The evaluation data set was compiled from several measurement networks, including CARB's "2011 Air Quality Data DVD" (CARB, 2011b) and the database maintained by the Interagency Monitoring of Protected Visual Environments (IMPROVE). The data DVD includes daily average mass concentrations of PM2.5, EC, OC, nitrate, sulfate, ammonium, and trace metals every 3 or 6 days at the sites of the PM2.5 Speciation Trends Network (STN) and the state and local air monitoring stations. A total of 13 PM2.5 speciation sites included in the DVD fall within the 4 km domains during the modeling period. The precision of STN measurements is estimated to be 3.5, 8.6, and 3.9 % for sulfate, nitrate, and ammonium, respectively (Sickles II and Shadwick, 2002). Measured EC concentrations at five sites are found to be exactly 0.5 µg m−3 on > 80 % of the measurement days, suggesting corrupt or missing data at these locations. Therefore, these five sites were excluded from the evaluation for EC but still included in the evaluation for other PM components. The OC data were not blank corrected, resulting in a positive artifact from the NIOSH 5040 method that is equivalent to approximately 1 µg m−3. Measured OC concentrations were blank corrected in the current study by subtracting 1 µg m−3 from all OC measurements. The IMPROVE network provides daily average mass concentrations every 3 days for PM2.5, EC, OC, nitrate, sulfate, and soil. A total of nine IMPROVE sites fall within the 4 km domains. The precision of IMPROVE measurements is estimated to be 4-6 % for PM2.5 mass, nitrate, and sulfate and to be > 15 % for EC and OC (http://vista.cira.colostate.edu/improve/Publications/OtherDocs/IMPROVEDataGuide/IMPROVEDataGuide.htm). Daily average PM10 mass measurements and hourly measurements of several key gaseous pollutants (ozone, CO, NO, NO2, and SO2) are also included in the data DVD. A total of 66 PM2.5 federal reference method (FRM) sites fall within the 4 km domains. Frank (2006) found that FRM PM2.5 mass measured using STN monitors was within ±30 % of reconstructed fine mass concentrations measured using IMPROVE monitors.
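The two data-screening steps described above (the OC blank correction and the exclusion of EC sites dominated by a repeated 0.5 µg m−3 value) can be sketched as follows. The function names, the tolerance used to test for "exactly 0.5", and the list-based data layout are assumptions made for illustration. The text does not say whether corrected OC values were floored at zero, so no floor is applied here.

```python
def blank_correct_oc(oc_values, blank=1.0):
    """Subtract a constant blank (µg m^-3) from every OC measurement."""
    return [value - blank for value in oc_values]


def fraction_at_value(values, flag_value=0.5, tol=1e-9):
    """Fraction of measurements equal to flag_value within a small tolerance."""
    if not values:
        return 0.0
    hits = sum(1 for value in values if abs(value - flag_value) <= tol)
    return hits / len(values)


def is_suspect_ec_site(ec_values, flag_value=0.5, max_fraction=0.8):
    """Flag a site when more than max_fraction of its EC values equal flag_value."""
    return fraction_at_value(ec_values, flag_value) > max_fraction


corrected = blank_correct_oc([3.0, 1.5])
suspect = is_suspect_ec_site([0.5] * 9 + [1.2])        # 90 % of days at 0.5
not_suspect = is_suspect_ec_site([0.5] * 5 + [1.0] * 5)  # 50 % of days at 0.5
```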

Statistical evaluation
Statistical measures of MFB and mean fractional error (MFE) were calculated to evaluate the accuracy of model estimates in space and time. Boylan and Russell (2006) proposed concentration-dependent MFB and MFE performance goals and criteria, realizing that lower concentrations are more difficult to accurately predict. The performance goals are the level of accuracy close to the best that a model can be expected to achieve, while performance criteria are the level of accuracy acceptable for standard modeling applications.
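For reference, the two statistics can be computed as in the sketch below, which uses the standard definitions MFB = (2/N) Σ (Cm − Co)/(Cm + Co) and MFE = (2/N) Σ |Cm − Co|/(Cm + Co), where Cm is the model prediction and Co the observation. The function names are illustrative, and the code assumes paired, positive concentrations so that Cm + Co is never zero.

```python
def mean_fractional_bias(predicted, observed):
    """MFB = (2/N) * sum((Cm - Co) / (Cm + Co)); bounded between -2 and +2."""
    pairs = list(zip(predicted, observed))
    return 2.0 / len(pairs) * sum((cm - co) / (cm + co) for cm, co in pairs)


def mean_fractional_error(predicted, observed):
    """MFE = (2/N) * sum(|Cm - Co| / (Cm + Co)); bounded between 0 and 2."""
    pairs = list(zip(predicted, observed))
    return 2.0 / len(pairs) * sum(abs(cm - co) / (cm + co) for cm, co in pairs)


# A prediction of 1 against an observation of 3 gives MFB = -1 and MFE = 1.
example_mfb = mean_fractional_bias([1.0], [3.0])
example_mfe = mean_fractional_error([1.0], [3.0])
```

Because both statistics normalize by the sum of prediction and observation, over- and underpredictions are treated symmetrically, which is why an MFB near −1.0 corresponds to predictions roughly a factor of 3 below the measurements.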
Figures 2 and 3 show the monthly MFB and MFE values, respectively, of predicted daily average EC, OC, nitrate, ammonium, sulfate, and total PM2.5 mass in the 4 km domains. Measured EC, OC, nitrate, ammonium, and total PM2.5 mass concentrations follow similar seasonal patterns with high concentrations occurring in winters (indicated by blue colors in figures) and low concentrations occurring in summers (indicated by red colors in figures). These patterns are driven by the meteorological cycles (i.e., lower mixing layer and wind speed providing less dilution and lower temperature encouraging partitioning of ammonium nitrate to the particle phase) and the emissions variations (i.e., additional wood burning emissions for home heating in winters). The opposite seasonal variations in sulfate concentrations are observed due to higher oxidation rates from S(IV) to S(VI) and higher sulfur emissions from natural sources in summer (Bates et al., 1992).
EC predictions are in excellent agreement with measurements. MFBs in all months and MFEs in 107 months out of the total 108 months are within the model performance goal. EC MFBs and MFEs show no significant difference among months/seasons, indicating consistently good EC performance during the entire 9-year modeling period. OC, nitrate, sulfate, and ammonium, the PM components that include the secondary formation pathways, meet the MFB model performance criteria in 71, 73, 46, and 92 % of the simulated months, respectively. These components generally have good agreement between predictions and measurements in winter months, with only a few months not meeting the performance criteria. When analyzing by season, predicted concentrations of these species are found to be more biased in summer months, especially for sulfate and nitrate. Different factors influence the seasonal profile of each species. The more significant OC underprediction in summertime is mainly associated with the underprediction of SOA due to incomplete knowledge of SOA formation mechanisms at the present time. Similar patterns have been reported in other modeling studies outside California (Matsui et al., 2009; Volkamer et al., 2006; Zhang et al., 2014a; Zhang and Ying, 2011). Measured nitrate concentrations in summertime (1-5 µg m−3) are factors of 2-5 lower than concentrations in wintertime (5-12 µg m−3). Model predictions tend to underestimate the low particle phase nitrate concentrations in summer, especially when temperatures exceed 25 °C. Model predictions for particulate nitrate are usually less than 1 µg m−3 under these conditions, while 2-3 µg m−3 nitrate concentrations are still observed in the ambient air. Similar underpredictions of summertime nitrate have been reported in other regional modeling studies (Appel et al., 2008; Tesche et al., 2006; Yu et al., 2005; Zhang et al., 2014a). Model calculations reflect thermodynamics and kinetic gas-particle transfer for ammonium nitrate in mixed particles, suggesting that some other form of nitrate is present in the real atmosphere, such as organo-nitrates (Day et al., 2010). Sulfate concentrations are consistently underpredicted throughout the modeling period at all locations, especially in Southern California where the measured sulfate concentrations are highest. Underprediction of sulfate has also been reported by other regional modeling studies in California (Chen et al., 2014; Fast et al., 2014) using different air quality models (e.g., CMAQ, WRF-Chem). This consistent behavior suggests that the specific model is not the cause of the sulfate underprediction. A global model study that included ocean dimethyl sulfide (DMS) emissions showed a better sulfate performance in California (Walker et al., 2012). Therefore, missing emissions sources such as the sulfur emitted as DMS from the Pacific Ocean likely contribute to the sulfate underpredictions in the current study. The sulfate concentrations at the sites in Southern California are ∼2 to 3 times higher than in Northern California and are underpredicted by an even larger amount (with MFBs around −1.0). It is therefore likely that anthropogenic sulfur sources are missing in Southern California in addition to background DMS sources. In the remote areas where the sulfate concentrations are low, the omission of nucleation processes in the current study could reduce seed aerosol surface area onto which sulfuric acid can condense. This factor could contribute to the underprediction of sulfate mass in these regions along with the missing sulfur sources. Ammonium is drawn to acidic particles and so ammonium concentration predictions reflect the combined trends of nitrate and sulfate predictions.
The model predictions of total PM2.5 mass, as a summation of all components, show very good agreement with measurements, with only 3 summer months and 2 spring months (5 % of all simulated months) not meeting the performance criteria, and 78 and 75 % of months within the performance goals for MFB and MFE, respectively. The largest biases in the total PM2.5 mass occur in summer. Underpredictions of summer sulfate and OC contribute to negative biases in the total PM2.5 mass predictions. Sulfate and OC concentrations in summer accounted for ∼18 and ∼37 % of the total PM2.5 mass, respectively. Sulfate and OC underprediction contributed to a combined ∼37 % underprediction of total PM2.5 mass. However, positive biases in predicted dust concentrations rich in crustal elements such as aluminum and silica (Hu et al., 2014a) compensate for the underpredictions in carbonaceous components and water-soluble ions described above.
Figure 4 shows the MFB and MFE values of the particulate species (PM2.5 total mass, EC, OC, nitrate, sulfate, and ammonium) and the gaseous species (O3, CO, NO, NO2, and SO2) using daily averages across all measurement sites during the entire modeled 9-year period. PM2.5 total mass, EC, OC, ammonium, and the gaseous species O3, CO, and NO2 have MFBs within ±0.3 and MFEs less than 0.75, indicating general agreement between predictions and measurements for these species. Nitrate and NO have MFBs of −0.4 and −0.28, respectively, but MFEs of 0.8 and 1.07, respectively. The relatively moderate or small bias combined with relatively large error indicates that the daily predictions miss the extremely high and low concentrations. Sulfate and SO2 have high MFBs of −0.7 and −0.5, respectively, and high MFEs of 0.8 and 0.9, respectively, indicating that these species are consistently underpredicted.
Concentrations averaged over longer times, such as 1 month or 1 year, are used in some studies of air pollution health effects. A previous examination of primary particles in California revealed that air quality model predictions are more accurate over longer averaging times because the influence of extreme events and short-term variability is reduced as the averaging period gets longer (Hu et al., 2014a). Figure 4 compares the MFB and MFE values for total (= primary + secondary) particulate matter and gaseous species using daily, monthly, and annual averages across all sites in the 4 km domains. The results demonstrate that longer averaging times produce better agreement between model predictions and measurements (except for sulfate, which is underpredicted due to missing emissions), because they remove the effects of random measurement errors at monitoring stations and variations in actual emissions rates that are not reflected in seasonally averaged emissions inventories. The reduced errors associated with longer averaging times indicate that model results may be most useful in epidemiological studies that can take advantage of averaging times ≥ 1 month.
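The effect of averaging time can be illustrated with a toy calculation in which daily prediction errors of opposite sign cancel once the values are averaged by month. The records below are invented for illustration and are not model output or measurements; the record layout (year, month, predicted, observed) and the function names are likewise assumptions.

```python
from collections import defaultdict


def mean_fractional_error(predicted, observed):
    """MFE = (2/N) * sum(|Cm - Co| / (Cm + Co))."""
    pairs = list(zip(predicted, observed))
    return 2.0 / len(pairs) * sum(abs(cm - co) / (cm + co) for cm, co in pairs)


def monthly_means(records):
    """Average predicted and observed values within each (year, month).

    `records` is an iterable of (year, month, predicted, observed) tuples.
    Returns two lists, ordered by (year, month).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for year, month, predicted, observed in records:
        entry = sums[(year, month)]
        entry[0] += predicted
        entry[1] += observed
        entry[2] += 1
    keys = sorted(sums)
    pred = [sums[k][0] / sums[k][2] for k in keys]
    obs = [sums[k][1] / sums[k][2] for k in keys]
    return pred, obs


# Invented daily pairs: the model alternates between 5 and 15 while the
# observation stays at 10, so the daily errors are large but unbiased.
daily = [(2000, 1, 5.0, 10.0), (2000, 1, 15.0, 10.0),
         (2000, 1, 5.0, 10.0), (2000, 1, 15.0, 10.0)]

daily_mfe = mean_fractional_error([r[2] for r in daily], [r[3] for r in daily])
monthly_pred, monthly_obs = monthly_means(daily)
monthly_mfe = mean_fractional_error(monthly_pred, monthly_obs)
```

In this constructed case the daily MFE is about 0.53 while the monthly MFE is zero. Real data would show a smaller reduction, and a systematic bias such as the one reported for sulfate would not be removed by averaging.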

Spatial and temporal variations
Figure 5a shows the predicted and measured monthly average concentrations of 1 h peak O3 at five major urban sites (Sacramento, Fresno, Bakersfield, Los Angeles, and Riverside). Strong seasonal variations are observed in measured and predicted 1 h peak O3. The measured 1 h peak O3 shows seasonal variation from 100 ppb in summertime to 20 ppb in wintertime. The predicted high 1 h peak O3 concentrations in non-winter months are in good agreement with, or slightly higher than, ambient measured concentrations at all sites. This is consistent with studies in the eastern USA (Zhang et al., 2014a) which found similar slight overpredictions of summer O3 concentrations. Predicted 1 h peak O3 concentrations in cold winter months, however, are generally higher than measured values. Photochemical reaction rates in winter months are slow and the predicted O3 concentration at the surface mostly reflects downward mixing of the aloft background O3, followed by titration by surface NO emissions. The STN measurement sites in California are located in urban areas that are close to major freeways (see the site locations and nearby sources information in Hu et al., 2014a). The 4 km × 4 km model grid cells that contain both freeways and monitors dilute the high NO concentrations around the measurement sites, leading to an underprediction of O3 titration and an overprediction of O3 concentrations. EPA recommends a threshold O3 value of 60 ppb for model O3 evaluations (U.S. EPA, 2007), which means that wintertime O3 concentrations at the urban sites will generally not be considered in the formal model evaluation.
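A threshold of this kind can be applied by discarding prediction-measurement pairs before the statistics are computed, as in the sketch below. Applying the 60 ppb cutoff to the observed value rather than the predicted value is an assumption here, and the function name and sample numbers are illustrative only.

```python
def filter_by_observed_threshold(predicted, observed, threshold_ppb=60.0):
    """Keep only the pairs whose observed O3 is at or above the threshold.

    Assumption: the cutoff is applied to the observed concentration.
    """
    kept = [(p, o) for p, o in zip(predicted, observed) if o >= threshold_ppb]
    return [p for p, _ in kept], [o for _, o in kept]


# Invented 1 h peak O3 pairs (ppb): the middle pair represents a winter day
# with low observed O3 and is dropped from the evaluation.
pred_kept, obs_kept = filter_by_observed_threshold(
    [70.0, 40.0, 95.0], [65.0, 20.0, 100.0]
)
```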
Figure 5b and c show the predicted and measured monthly average CO and NO concentrations. Strong seasonal variations in CO and NO can be observed, with wintertime concentrations that are a factor of 3-5 higher than summertime concentrations. Model predictions generally reproduce the seasonal variations except at the Riverside site, where predicted seasonal variations are weaker than measurements. A clear and similar decreasing trend is apparent in measured CO and NO concentrations from 2000 to 2008. This interannual trend is not well captured by the model predictions due to the uncertainties in the emissions. An adjusted NO prediction (NO_adj) can be calculated using CO as a tracer for the mobile emissions and dilution according to the equation NO_adj = NO_noadj × CO_predicted / CO_measured, where NO_noadj is the NO prediction before the adjustment (i.e., the concentrations shown in Fig. 5c). NO_adj has a higher correlation coefficient (R²) with measured NO concentrations than the NO_noadj prediction at all five monitoring sites (as shown in Fig. 7), and NO_adj has a regression slope closer to 1.0 than NO_noadj at three out of five sites. This suggests that either emissions or physical dilution processes in the model contribute to the errors observed in Fig. 5 (in addition to the possibility of errors in model chemistry). Unfortunately, the large variation in the correction factor among different locations suggests that these scaling factors cannot be simply interpolated/extrapolated from the indicated five monitoring sites to the full modeling domain.
Figures 5d and 6a show the predicted and measured monthly average ammonium and nitrate concentrations. Ammonium nitrate is a major PM2.5 component in California, especially in wintertime when the low temperature and high relative humidity favor partitioning to the condensed phase. The monthly average ammonium and nitrate results demonstrate similar model performance. The predicted concentrations agree reasonably well with measured ambient concentrations and seasonal variations. Model predictions are lower than measured values in the early years, especially during winter months when concentrations are highest. This pattern is very consistent with CO model performance, suggesting mobile emissions are underestimated for the early years of the simulation period. Nitrate is formed through NO oxidation to nitric acid but NO concentrations are not underpredicted, suggesting that the chemical conversion of NO to nitric acid is too slow. Carter and Heo (2012) suggested that the SAPRC11 mechanism systematically underpredicts OH radical concentrations by ∼ 30 %, which would be consistent with the observed trends. Gas-particle partitioning of ammonium nitrate depends on temperature and relative humidity. While there is no systematic bias in WRF temperature, relative humidity is generally underpredicted by up to 40 % over California. A 1-year sensitivity analysis was conducted for 2008 with RH increased uniformly by 30 % (but not to exceed 95 %, with all other meteorological parameters kept the same) to investigate the impact of the relative humidity bias on particulate nitrate predictions. The arbitrary increase in RH by 30 % in the air quality model simulations yields an upper bound estimate of the nitrate sensitivity to RH.
Figure 8 compares the monthly average nitrate concentrations predicted with the original RH (denoted as "RH_ori" case) and the enhanced RH (denoted as "RH + 0.3" case) at Sacramento and Fresno. Nitrate predictions are generally higher in the "RH + 0.3" case due to more particle-phase water available to absorb nitrate into the condensed phase. The nitrate predictions at Sacramento are significantly improved during most months in 2008, suggesting this area suffers from the low RH bias in the WRF predictions. Nitrate at Fresno is improved mostly in the winter and spring but is still underpredicted during the time period with peak winter concentrations, indicating this area is influenced by other factors besides RH. Nitrate predictions at Fresno in summer and fall are lower when RH is enhanced, due to faster deposition caused by larger particle sizes with more particle-phase water. The uniform RH increase of 0.3 in this region is likely unrealistically large during these months.
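The RH perturbation used in the sensitivity test can be sketched in a few lines of Python. The sketch assumes that RH is expressed as a fraction between 0 and 1 and that the "+30 %" increase is additive (consistent with the "RH + 0.3" label). The treatment of values that already exceed the 95 % cap is not stated in the text; leaving them unchanged is an assumption of this sketch, and the function name perturb_rh is hypothetical.

```python
def perturb_rh(rh, delta=0.30, cap=0.95):
    """Raise a relative humidity value (fraction, 0-1) by `delta`, capped at `cap`.

    Assumption: values already at or above the cap are returned unchanged,
    so the perturbation never lowers RH.
    """
    if rh >= cap:
        return rh
    return min(rh + delta, cap)


# Hypothetical hourly RH values (fractions), not WRF output.
rh_ori = [0.25, 0.40, 0.80, 0.97]
rh_plus = [perturb_rh(value) for value in rh_ori]
print(rh_plus)
```

Because of the cap, the effective increase shrinks as the original RH approaches 95 %, so the perturbation is largest in dry conditions.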
Figure 6b shows the OC predictions and measurements. Organic aerosol in California is typically the second most abundant species after ammonium nitrate. In the comparison, an OM / OC ratio of 1.6 (Turpin and Lim, 2010) is applied to convert primary organic aerosol OM back to OC for comparison to measured concentrations. The conversion ratios for SOA species are taken from Table 1 in Carlton et al. (2010). Predicted OC agrees reasonably well with measured concentrations but is lower than the wintertime high concentrations in the early years, similar to other PM components. Predicted OC in summers is also in good agreement with measurements at the indicated monitoring sites. As mentioned previously, these sites are all near major freeways and therefore OC is dominated by primary organic aerosols. Larger bias is found at sites distant from local sources where SOA becomes more important. More analysis of the concentrations and sources of OC is included in a companion paper (Hu et al., 2015).
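The OM-to-OC conversion described above amounts to dividing each organic aerosol mass by its OM / OC ratio before summing. The following Python sketch assumes that predicted organic mass is available separately for primary organic aerosol and for each SOA species. The function name total_oc, the species key "soa_x", and its ratio of 2.0 are hypothetical placeholders; the actual SOA ratios are those tabulated by Carlton et al. (2010) and are not reproduced here.

```python
POA_OM_TO_OC = 1.6  # OM / OC ratio applied to primary organic aerosol in the text


def total_oc(poa_om, soa_om_by_species, soa_om_to_oc):
    """Convert predicted organic mass (OM) to organic carbon (OC).

    poa_om            -- primary organic aerosol mass
    soa_om_by_species -- dict of SOA species name -> organic mass
    soa_om_to_oc      -- dict of SOA species name -> OM / OC ratio (user supplied)
    """
    oc = poa_om / POA_OM_TO_OC
    for species, om in soa_om_by_species.items():
        oc += om / soa_om_to_oc[species]
    return oc


# Hypothetical values for illustration only.
oc = total_oc(3.2, {"soa_x": 1.0}, {"soa_x": 2.0})
print(oc)
```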
Figure 6c shows that predicted EC concentrations agree well with measured concentrations. High measured EC concentrations in a few winter months in the early years are underpredicted, but EC concentrations in the summer months are generally overpredicted.
Figure 6d shows that monthly average predictions for PM2.5 mass concentrations agree well with observations, and seasonal trends are generally captured with high concentrations in winter and low concentrations in summer. PM2.5 is overpredicted in summer months when nitrate, sulfate, and ammonium are found to be underpredicted. These trends reflect the overprediction of the primary components, mostly dust particles, in the model calculations (Hu et al., 2014a). This result suggests that a uniform scaling factor of 0.5 for dust emissions may not be appropriate. A smaller factor (for example, a factor of 0.25 was used in the eastern USA; Tesche et al., 2006) or a spatially resolved method that accounts for the land-use types (Pace, 2005) should be used for future studies in California.
California experiences the highest PM2.5 concentrations in wintertime, caused by stagnant meteorological conditions characterized by low wind speed and a shallow atmospheric mixing layer. The WRF model tends to overpredict wind speed during low wind speed events (≤ 2 m s⁻¹) in California (Zhao et al., 2011). Increasing u* by 50 % improves the WRF wind prediction but still overpredicts wind speed during events when measured wind speed is < 1.5 m s⁻¹. A zero-order approximation of air pollutant concentration (Mahmud, 2010) is C = E / V = E / (u × H), where C is the pollutant concentration, E is the source pollutant emission rate, V is the air ventilation rate, which is equal to wind speed × mixing height, and u and H are the horizontal wind speed and mixing height, respectively. The concentration is linearly dependent on the inverse wind speed (1/u). Figure 9 shows the MFBs of the predicted atmospheric inverse wind speed (1/u) as a function of the observed atmospheric inverse wind speed. Also shown in Fig. 9 are the MFBs of PM component concentrations as a function of the observed concentrations. The MFBs decrease when the inverse wind speed or concentrations increase, indicating that low inverse wind speeds/concentrations are overpredicted but high inverse wind speeds/concentrations are underpredicted.
The trends of inverse wind speed and concentrations are well correlated, indicating that simple wind bias leads to bias in PM predictions, especially during the events with high PM pollution. The correlation with 1/u MFB is stronger for primary PM components than for secondary components, indicating that additional processes, e.g., chemistry and gas-particle partitioning, affect the secondary PM. Sulfate bias has the weakest correlation to inverse ventilation bias, because sulfate bias is mainly driven by the bias in sulfur emissions.
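The link between wind bias and concentration bias can be made concrete with a short Python sketch of the zero-order approximation, in which concentration equals the emission rate divided by the product of wind speed and mixing height. All numerical values and the function names (concentration, fractional_bias) are hypothetical; emissions and mixing height are held fixed so that only the wind speed differs between the "observed" and "predicted" cases.

```python
def concentration(emission_rate, wind_speed, mixing_height):
    """Zero-order estimate C = E / (u * H), in consistent (arbitrary) units."""
    return emission_rate / (wind_speed * mixing_height)


def fractional_bias(pred, obs):
    """Fractional bias of a single prediction/observation pair."""
    return 2.0 * (pred - obs) / (pred + obs)


# Hypothetical stagnation event: observed wind 1.0, predicted wind 1.5,
# with identical emissions and mixing height in both cases.
E, H = 100.0, 200.0
u_obs, u_pred = 1.0, 1.5

c_obs = concentration(E, u_obs, H)
c_pred = concentration(E, u_pred, H)

fb_conc = fractional_bias(c_pred, c_obs)
fb_inv_wind = fractional_bias(1.0 / u_pred, 1.0 / u_obs)
print(fb_conc, fb_inv_wind)
```

Under these assumptions the fractional bias in concentration equals the fractional bias in 1/u, which is why an overpredicted wind speed during stagnation maps directly onto an underpredicted primary PM concentration. Secondary components depart from this simple picture because chemistry and partitioning also change.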
Figure 10 shows the predicted 9-year average concentrations of PM2.5, EC, OC, nitrate, sulfate, and ammonium compared with measured average concentrations over California. High concentrations of all PM pollutants occur in the urban areas with large populations, indicating that most of the PM is generated by anthropogenic activities. The predicted spatial distributions generally agree well with measurements but provide much more detailed information. PM2.5 concentrations are overpredicted in the SJV air basin due to an overprediction of agricultural dust. High OC concentrations were measured at two sites in Northern California due to intense wood burning. The two sites are in the 24 km model domain but outside the 4 km domains; therefore the predicted OC concentrations in the 24 km grids do not agree well with the measurements at these locations. This finding confirms that 24 km resolution is probably too coarse for studies of health effects and justifies the use of 4 km grids over the majority of California's population in the current work. Background sulfate concentrations at IMPROVE sites were measured to be 0.6-1 µg m⁻³, but higher concentrations of 2-3 µg m⁻³ were measured in Southern California. Model calculations do not reproduce this concentration enhancement, leading to an underprediction in the concentrations of this PM2.5 species.

Discussion
In general, the reasonable agreement between model predictions and measurements builds confidence that the model predictions can provide a reasonable estimate of exposure fields in locations with no available measurements. The detailed analysis described in the previous section identifies several aspects that must be considered when applying the data in health effect studies. For the gaseous pollutants, daily maximum O3 predictions are in good agreement with measurements across the entire modeling domain. Seasonal and annual variations are captured accurately. Therefore daily maximum O3 predictions can be used in health analyses with high confidence. The predictions also capture the seasonal variations in NO and CO but do not reflect the long-term trends, especially in Southern California. Predicted monthly averages of NO and CO in Northern California are preferred over daily averages for use in health analyses. For the PM pollutants, daily concentrations and spatial distributions of EC and total PM2.5 mass generally agree well with observations, but monthly averages should be considered first in health studies as they are in better agreement with observations than shorter averages. Predicted OC in winter is also reasonably accurate, but OC in summer should be used with caution. Sulfate and nitrate are both underpredicted. Sulfate has greater bias in Southern California than in Northern California, while nitrate has consistent bias throughout the modeling domain. This suggests that the spatial distribution information of nitrate might still be useful for health effect studies that use contrasts in exposure as a function of location, but sulfate data are likely not useful in health effect studies at the present time.
Predicted monthly averages for PM concentrations are more accurate than daily averages, suggesting that the PM exposure predictions will be most useful in studies that can take advantage of averaging times ≥ 30 days. Longer averaging times smooth out short-term PM variations that could be useful in some epidemiological studies that focus on short-term changes in health effects. More accurate pollutant predictions at shorter timescales would require more accurate representation of emissions, meteorological conditions, and atmospheric chemistry at these timescales. Many intensive studies that manually corrected input data have focused on high temporal resolution for short periods (generally less than 1 month), such as the California Regional PM10/PM2.5 Air Quality Study (Ying et al., 2008). It is currently impractical to carry out such efforts for a ∼ 10-year modeling period in which there are a large number of special events that are not represented by automated meteorology and emissions models. The atmospheric modeling community continues to refine tools that can capture and accurately represent these special cases. For example, the current study includes automatic detection and incorporation of wildfire emissions into the modeling system based on satellite observations. This automated feature was not generally available in previous studies. Future advances will detect transportation patterns responding to traffic accidents or holiday traffic jams, drought effects on biogenic emissions, etc. These future advances will improve models to have more accurate predictions in both short (< 1 month) and long (> 1 month) averaging times.

Conclusions
For the first time, a ∼ decadal (9-year) CTM air quality model simulation with 4 km horizontal resolution over populated regions has been conducted in California to provide air quality data for health effect studies. Model predictions are compared to measurements in order to evaluate both the spatial and temporal accuracy of the results. The performance of the source-oriented UCD/CIT air quality model is satisfactory for O3, PM2.5, and EC (both spatially and temporally). Predicted OC, nitrate, and ammonium are less satisfactory but generally meet standard model performance criteria. OC bias is larger in summertime than wintertime mainly due to an incomplete understanding of SOA formation mechanisms. Bias in predicted ammonium nitrate is associated with uncertainties in emissions, the WRF-predicted relative humidity fields, and the chemistry mechanism. Predicted sulfate is not satisfactory due to missing sulfur sources in the emissions. The CO and NO (species dominated by mobile emissions) results reveal significant temporal and spatial uncertainties associated with the mobile emissions generated by the EMFAC 2007 model. The WRF model tends to overpredict wind speed during stagnation events, leading to underpredictions of high PM concentrations, usually in winter months. The WRF model also generally underpredicts relative humidity, resulting in less particulate nitrate formation, especially during winter months. Despite the issues noted above, predicted spatial distributions of PM components are in reasonably good agreement with measurements. Predicted seasonal and annual variations also generally agree well with measurements. Better model performance with longer averaging time is found in the predictions, suggesting that model results with averaging times ≥ 1 month should be first considered in epidemiological studies. All model results included in the current manuscript can be downloaded free of charge at http://faculty.engineering.ucdavis.edu/kleeman/.

Figure 1. Modeling domains (blue lines outline the CA_24 km domain, and red lines outline the SoCAB_4 km (bottom) and SJV_4 km (top) domains) and PM measurement sites (dots). Blue dots represent the sites of the PM2.5 Speciation Trends Network (STN) and the State and Local Air Monitoring Stations (SLAMS); green dots represent the Interagency Monitoring of Protected Visual Environments (IMPROVE) sites; gray dots represent the PM2.5 federal reference method (FRM) sites.

Figure 2. Monthly mean fractional bias (MFB) of PM2.5 EC, OC, nitrate, ammonium, sulfate, and total mass. Solid lines represent the MFB criteria, and the blue dashed lines represent the MFB goals.

Figure 3. Monthly mean fractional errors (MFE) of PM2.5 EC, OC, nitrate, ammonium, sulfate, and total mass. Solid lines represent the MFE criteria, and the blue dashed lines represent the MFE goals.

Figure 4. Mean fractional bias and mean fractional errors of PM and gaseous species when calculated using daily, monthly, and annual averages.
The model performance varies by simulation year and location. At the Sacramento and Fresno sites, predicted CO is in good agreement with measured concentrations in all months of 2002 through 2006, but CO is underpredicted in winter months of 2000-2001 and slightly overpredicted in most months of 2007-2008. At the Bakersfield site, CO is underpredicted in 2000-2003 and in good agreement with measurements in 2004-2005 (after which further measurements are not available). At the Los Angeles site, CO is in good agreement in 2000-2003 and overpredicted in the later years. At the Riverside site, CO is underpredicted in all months of 2000-2003, underpredicted in non-summer months in 2004-2006, and in general agreement with measurements in 2007-2008. NO predictions generally agree well with measured NO concentrations in 2000-2004 at Sacramento, Fresno, Bakersfield, and Los Angeles and then are overpredicted in the later years. NO at Riverside is underpredicted in the winter months of 2000-2003 and overpredicted in the summer

Figure 7. Monthly average NO concentrations adjusted with the predicted/observed CO ratios. NO_noadj represents the NO concentrations in the UCD/CIT model predictions, and NO_adj represents the NO concentrations adjusted with observations as NO_adj = NO_noadj × CO_predicted / CO_measured.

Figure 8. Monthly average nitrate concentrations in 2008 at Sacramento and Fresno, predicted with perturbed relative humidity (RH + 0.3), compared to the base-case nitrate predictions (RH_ori) and observed concentrations (Obs).

Figure 9. Association between predicted PM concentration bias and wind bias vs. observed values. The observed PM concentrations and 1/u values on the x axis are expressed in a relative scale of 0-100 % of maximum range calculated as x (%) = (C − C_min) / (C_max − C_min) × 100. Values for [C_min, C_max] are listed in the concentration key. Bias between predicted vs. observed values is shown on the y axis. Ideal behavior is bias of zero at all concentrations and wind speeds.

Figure 10. Predicted (1) vs. measured (2) 9-year average PM2.5 total mass (a), EC (b), OC (c), nitrate (d), sulfate (e), and ammonium (f) concentrations. The SoCAB_4 km and SJV_4 km results are overlaid on top of CA_24 km results to create the model predicted spatial distributions. Predicted and measured concentrations of the same species are in the same scale shown in the measurement panels.