Improving PM2.5 retrievals in the San Joaquin Valley using A-Train Multi-Satellite Observations

Introduction
Air quality in the United States has generally improved since the Air Pollution Control Act of 1955 was enacted. It is, however, still a concern in many regions of the country and in the world, primarily because of its effect on human health (Samet et al., 2000; Pope et al., 2009; Krewski et al., 2009). Indirect calculations have pointed to an ap-

are ground-based point measurements, there are many areas where there are huge gaps between measurements. The idea of using satellite measurements to fill in the gaps between ground sensors has been an area of active research since 2003, when the first two studies on this topic were published (Wang and Christopher, 2003; Chu et al., 2003). Several satellite instruments measure the spectral extinction of light as it travels through the Earth's atmosphere. This unitless quantity, called aerosol optical depth (AOD), is used to infer the amount of suspended aerosols in the total atmospheric column. While it is the surface measurement of particulate mass that is used for regulatory purposes, satellite observations of AOD have some advantages over surface measurements. They can provide better spatial coverage of pollutants in remote and non-monitored areas, track pollution transport, validate and guide model predictions, and suggest the placement of future surface sensors. Researchers have also been working to use satellite observations to improve air quality forecasts and to provide information to assist regulators and policy makers (Al-Saadi et al., 2005). The question we must answer to determine the utility of remote sensing data for particulate air quality monitoring is, "How well do the column measurements of AOD represent PM2.5 concentrations at the ground?" While much progress has been made over the last decade, there is still much work to be done (Hidy et al., 2009; Hoff and Christopher, 2009).
To be useful for air quality purposes, satellite observations must be (1) validated using ground-based AOD measurements from the globally distributed AErosol RObotic NETwork (AERONET) and (2) translated to surface PM2.5 values; (3) those "retrieved" mass concentration values must then be validated against surface PM2.5 measurements. Many studies use a single-variate, least-squares linear regression (a simple linear regression) to relate AOD with PM. Correlations in these studies show considerable variation, with correlation coefficients (r²) between 0 and 0.85 depending on the season and location, or region, of comparison. Hoff and Christopher (2009) review and discuss over 30 papers that have explored the relationship between surface PM and satellite AOD measurements. Satellite AOD measurements from MODIS (MODerate resolution Imaging Spectroradiometer), MISR (Multi-angle Imaging SpectroRadiometer), POLDER (POLarization and Directionality of Earth's Reflectance), and GOES (Geostationary Operational Environmental Satellite) have been used in these comparisons. For example, Gupta et al. (2006) investigated linear correlations between MODIS AOD and PM2.5 at 25 urban areas globally and found that correlations varied between 0.11 and 0.85. They found that the AOD-PM2.5 relationships depended strongly on aerosol concentrations, ambient relative humidity, and height of the mixing layer. Kacenelenbogen et al. (2006) used simple linear regressions to compare POLDER AOD, which is retrieved from spectral, directional, and polarized characteristics of the solar radiation, with surface PM2.5 measurements over France, obtaining a correlation coefficient of 0.55 for the general area and 0.80 for particular sites. In some cases, notably in the western US, there is little or no correlation (Al-Saadi et al., 2005; Engel-Cox et al., 2004; Kahn et al., 2005; Gupta et al., 2006). Hoff and Christopher (2009) suggest that the poorer correlations in the West were due to a wider variation in aerosol types (more nitrate than sulfate and more biomass burning smoke than over the eastern part of the United States), higher surface reflectivity making passive satellite AOD retrieval difficult, and more elevated plumes in the AOD signatures. In general, satellite retrievals of AOD are most accurate over areas of dark vegetation (Levy et al., 2010). Not surprisingly, AOD-PM2.5 correlations are poorest in areas where the AOD retrievals themselves are poor. Additionally, elevated aerosol plumes, typically associated with long-range transport, are integrated into the column AOD retrieval but are not reflected in the surface PM measurements.
Many schemes have been applied to improving satellite-surface correlations. Satellite AOD measurements have been combined with other parameters, such as the vertical structure of aerosol as calculated from a chemical transport model (van Donkelaar et al., 2010; Liu et al., 2009a), or estimates of boundary layer height and relative humidity from models (Liu et al., 2005; Kacenelenbogen et al., 2006). Liu et al. (2009b) used fractional component AOD from MISR and aerosol vertical distributions from the GEOS-Chem transport model to constrain estimates of surface PM2.5. Liu et al. (2009b) used generalized linear regression models (GLM) and model-generated meteorological fields to compare the ability of MISR and MODIS AOD to predict surface PM2.5 in the St. Louis region during 2003. The correlation coefficient between retrieved and measured PM using a GLM was r² = 0.62 and r² = 0.51 for MISR and MODIS, respectively.
Since many different parameters influence the PM-AOD relationship (for example, various measurements of AOD and Ångström exponent), another way to proceed is to adopt a statistical approach. Gupta et al. (2008) used multiple linear regression (MLR) and neural network (NN) techniques to include planetary boundary layer (PBL) height, location, temperature, and relative humidity (RH) in an analysis in the southeastern US and achieved an improved r² of 0.7 using MLR and 0.83 using NN, compared with an r² of 0.41 for a simple linear regression model. Pelletier et al. (2007) applied generalized additive model (GAM) techniques to retrieve PM from the combination of AERONET AOD and National Centers for Environmental Prediction (NCEP) meteorological data and compared the result with surface PM collected at two sites in northwestern Europe. Their model increased r² to 0.76, compared to an r² of 0.27 derived from a simple linear regression. Vidot et al. (2007) then used the Pelletier et al. GAM method to relate surface PM with SeaWiFS (Sea-viewing Wide Field-of-view Sensor) AOD and NCEP meteorological parameters, achieving an r² of 0.61 compared with 0.48 for the linear case.
PM2.5 concentrations are needed in all weather conditions and at all locations, and air quality models are used to "fill in the gaps" from satellite observations or surface stations. These models typically get their meteorological parameters from forecast, transport, or assimilation models and need to be validated. Using these same inputs in the PM2.5 satellite retrievals would compromise the value of these retrievals in the validation process. Therefore, our objective in this paper is to improve estimates of PM2.5 using only satellite observations and statistical techniques.
The study area, data sets, and methodology are discussed in the next section of this paper. Particular attention is paid to the generalized additive models (GAM) used in this research. Results of the GAM applied to six individual sites in the San Joaquin Valley (SJV) are presented and compared to the simple linear regression as a standard metric. Then results from a case combining all six sites are described. One advantage of GAM over other statistical techniques is that general spline functions can be used to model the sensitivity of the PM2.5 response to each parameter. These sensitivities are discussed in the next section. The effect of adding satellite observations on sample size is discussed. Finally, time lines of PM2.5 retrieved with a linear model and the GAM are compared with surface measurements.

Study area
California's San Joaquin Valley (SJV) is located southeast of San Francisco in the center of the state between the Coastal Mountain Range to the west and the Sierra Nevada Range to the east (see Fig. 1). This topography leads to northwesterly winds during the spring, summer, and fall and a thick, well-mixed boundary layer. During the winter, the winds die down and cold temperatures form an inversion layer, keeping the boundary layer low and inhibiting surface mixing. The SJV encompasses nearly 64 000 km² and contains a population of over 3 million. Fresno, located in the center of the SJV, has a population of more than 500 000 and experiences frequent hospitalizations for asthma (Watson, 2000). From 1991 to 1996, PM2.5 annual averages ranged from 18 to 24 µg m⁻³, with the highest 24-h averages ranging from 56 to 93 µg m⁻³. The 24-h averages exceeded the National Ambient Air Quality Standards (NAAQS) for PM 2. quality data (Tran et al., 2009). Ground-based measurements are sparse in California and do not provide enough coverage for air quality monitoring. Further, there is little or no correlation between MODIS AOD and surface PM measurements in this region using single-variate, least-squares linear regressions (Engel-Cox et al., 2004). Previous attempts to improve this correlation by taking surface reflectance into account, using the MODIS Deep Blue retrieval for bright surfaces (Hsu et al., 2006) instead of the standard MODIS dark target retrieval, have met with little success (Ballard et al., 2008; Justice et al., 2009). We have therefore concentrated our study on the SJV.

Surface data
The California Air Resources Board (CARB) has established air quality monitoring stations throughout California. Air quality indicators such as chemical compounds and particulate matter concentrations, ozone levels, and meteorological conditions are monitored (Bucsela et al., 2008). The sites used for this study are listed in Table 1 and shown in Fig. 1. While the Federal Reference Method (FRM) gravimetric filter-based samplers are the only samplers currently approved to provide data for determining the attainment status for an area, continuous mass monitors are used to collect data for understanding the diurnal and episodic behavior of fine particles, for transport assessment, and for use by health scientists investigating exposure patterns.
In this study we use PM2.5 obtained using both the FRM filter method for daily PM and the beta attenuation monitor (BAM) for the hourly PM. BAM measurements are a confirmed US Federal Equivalence Measurement relative to the 24-h filter-based reference standard (Zhu et al., 2007). The BAM determines the deposited mass by the attenuation of high-energy electrons through the sample filter. Accuracy is estimated to be ±8 % of indication for hourly readings and ±2 % for daily readings when compared to the FRM measurements (Ecotech, 2006). Hourly PM2.5, daily PM2.5 averages, and daily PM10 averages were collected from


Satellite data
The satellite data used in this study come from the Ozone Monitoring Instrument (OMI) onboard NASA's EOS-Aura (Earth Observing System-Aura) satellite and the Moderate Resolution Imaging Spectroradiometer (MODIS) onboard NASA's EOS-Aqua (Levelt et al., 2006). Aqua and Aura were launched into sun-synchronous orbits on 4 May 2002 and 15 July 2004, respectively. Aqua observes the SJV between 1300 and 1400 h, Pacific Standard Time. There is a time lapse of approximately 8 min between the Aqua and Aura overpasses. MODIS AOD was used for this study because of its good spatial resolution, effective cloud mask algorithms, and near-daily global coverage. MODIS continuously acquires daily global measurements in 36 spectral bands (from 0.41 to 14 µm) at three different spatial resolutions (250 m, 500 m and 1 km). AOD is reported at 10 × 10 km² resolution. MODIS' high spatial and temporal resolutions are

vegetation (Levy et al., 2007)". This algorithm is accurate for determining AOD above highly vegetated surfaces, but performs poorly over highly reflective surfaces such as cities and deserts. The Deep Blue algorithm has proven to be successful at deriving AOD values over bright surfaces such as deserts (Hsu et al., 2006). Deep Blue uses a set of lookup tables based on a polarized radiative transfer model, allowing it to simulate the radiance for "a range of solar and viewing geometries at the top of the atmosphere" (Hsu et al., 2004). The MODIS dark target AOD product retrieves within an expected error envelope of ±(0.05 + 15 %) at 550 nm (Levy et al., 2010).
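As a small illustration of how an expected error envelope of ±(0.05 + 15 %) can be applied, the sketch below checks whether a satellite AOD value falls inside the envelope around a collocated reference value. The function names and sample numbers are hypothetical, and taking the 15 % term relative to the reference AOD is an assumption made for this example.

```python
def expected_error(reference_aod):
    """Half-width of the envelope +/-(0.05 + 15 %) for a given reference AOD."""
    return 0.05 + 0.15 * reference_aod


def within_envelope(satellite_aod, reference_aod):
    """True when the satellite AOD lies inside the expected error envelope."""
    return abs(satellite_aod - reference_aod) <= expected_error(reference_aod)


# Hypothetical values: a reference AOD of 0.20 gives an envelope of about +/-0.08.
print(expected_error(0.20))
print(within_envelope(0.26, 0.20))  # 0.06 away from the reference
print(within_envelope(0.35, 0.20))  # 0.15 away from the reference
```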
The OMI is an imaging spectrometer that measures solar light backscattered by the Earth's atmosphere and surface (Bucsela et al., 2006). The instrument consists of two spectrometers, one measuring the UV spectral range from 270 to 365 nm in two sub-ranges (UV1: 270-314 nm, resolution: 0.42 nm, sampling: 0.32 nm; UV2: 306-380 nm, resolution: 0.45 nm, sampling: 0.15 nm), the other measuring the UV-visible spectrum from 350 to 500 nm (resolution: 0.63 nm; sampling: 0.21 nm). OMI uses a CCD array with one dimension resolving the spectral features and the other dimension allowing a 114° field of view, providing a 2600-km viewing swath transverse to the orbit track. Its nadir spatial resolution ranges from 13 × 24 to 24 × 48 km², depending on the instrument's operating mode. For this study we used OMI aerosol optical depth, aerosol absorption optical depth (AAOD), and NO2 from data product version 3. Good agreement is generally seen between OMI and ground-based measurements, with tropospheric columns underestimated by 15-30 % (Celarier et al., 2008).

Data preparation
Satellites usually provide AOD only once a day, and correlations are typically better with hourly rather than daily PM2.5 surface measurements (Gupta et al., 2008; Ballard et al., 2008; Justice et al., 2009). The best correlation between MODIS AOD and surface PM2.5 for sites in the SJV was found when averaging a 5 × 5 array of 10-km pixels centered at the ground station using a linear regression (Ballard et al., 2008; Justice et al., 2009).
Comparing the average of a 5 × 5 pixel area AOD with hourly PM2.5 measurements is the standard practice for such studies (Abdou et al., 2005; Ichoku et al., 2002; Kahn et al., 2005; Remer et al., 2005). Two studies suggest that a 5 × 5 pixel average is a good approximation for AOD and validated it against AERONET data collected within one hour of the MODIS overpass time (Anderson et al., 2003; Ichoku et al., 2002). This finding is supported by the observation that the average speed of aerosol transport in the mid-troposphere over ocean is 50 km h⁻¹ based on TOMS images, so an air mass crosses a 50-km window of 5 × 5 pixels in about one hour. We therefore used the 5 × 5 pixel average in the present study. OMI already has a large pixel size, so the single-pixel OMI data was used in this analysis. Including only MODIS AOD points with "Good" or "Very Good" quality flags marginally improved correlation (Justice et al., 2009), but this also restricted the number of data points in the comparison, sometimes severely.
We therefore ignored the data quality flags for this project. The surface PM2.5 measurement was taken at the nearest hour to the satellite overpass. We used the same data sets when comparing linear and GAM retrievals. For example, we have PM and MODIS data at Fresno from 2002 to present, but OMI data was only available from 2004 to 2008, so the comparisons in the results section between correlations using the linear model with AOD only and the GAM model that included OMI data are made for the period 2004-2008.
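The spatial averaging described in this section can be sketched as follows. This is a minimal illustration, assuming the AOD field is held as a plain grid of 10-km pixels with missing retrievals marked as None; the function name, the grid layout, and the sample values are hypothetical and are not the processing code used in the study.

```python
def window_mean(grid, row, col, half_width=2):
    """Mean of the valid pixels in a (2*half_width+1)^2 window centred on (row, col).

    `grid` is a list of rows; missing retrievals (e.g. cloud) are None.
    Returns None when no valid pixel falls inside the window.
    """
    values = []
    for r in range(row - half_width, row + half_width + 1):
        for c in range(col - half_width, col + half_width + 1):
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[r])
            if inside and grid[r][c] is not None:
                values.append(grid[r][c])
    return sum(values) / len(values) if values else None


# Hypothetical 5 x 5 block of 10-km AOD pixels with two missing retrievals.
aod = [
    [0.10, 0.12, 0.11, None, 0.13],
    [0.11, 0.14, 0.15, 0.12, 0.10],
    [0.12, 0.15, 0.20, 0.16, 0.11],
    [0.10, 0.13, 0.16, None, 0.12],
    [0.09, 0.11, 0.12, 0.10, 0.10],
]
station_row, station_col = 2, 2  # pixel containing the ground station
print(window_mean(aod, station_row, station_col))
```

The averaged value would then be paired with the PM2.5 reading from the hour nearest the overpass.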

Relating PM and AOD
PM can be related to AOD through the relationship (Koelemeijer et al., 2006)
$$ \mathrm{AOD} = \mathrm{PM} \, H \, f(\mathrm{RH}) \, \frac{3 \, Q_{\mathrm{ext,dry}}}{4 \, \rho \, r_{\mathrm{eff}}} \qquad (1) $$
where H is the height of the layer over which the AOD is measured, f(RH) is the ratio of ambient and dry extinction coefficients, Q_ext,dry is the extinction efficiency (a function of particle size, composition, and wavelength), ρ is the particle density, and r_eff is the particle effective radius (the ratio of the third and second moments of the size distribution). Equation (1) shows that the relationship between PM and AOD is dependent on the vertical distribution of particles in the atmosphere, their size and composition, and the efficiency with which they interact with light. The particles measured in determining PM reside in the planetary boundary layer (PBL), whose height is a function of the local meteorology. In winter in the SJV, the PBL is shallow, resulting in high concentrations of PM near the surface. In the summer, the PBL is deeper, resulting in decreased concentrations of PM near the surface. The presence of f(RH) in Eq. (1) shows that the surface RH also affects AOD, since the uptake of water vapor onto particles increases their size and extinction coefficient. In this study, these meteorological effects are represented with a 'seasonality' parameter, θ, that is modeled as a simple function of 'day of year'. Admittedly, θ will not take into account episodes where large amounts of particles reside above the PBL, affecting AOD but not PM. Incorporating these occurrences in the present algorithm would likely improve results; however, information on the vertical distribution of particles in the atmosphere is not routinely available in the SJV from ground-based lidars, and data from the space-based CALIPSO lidar are too infrequent for this study. Finally, particle and gas phase constituents are interrelated through the processes of nucleation, condensation, and evaporation. Therefore, a relationship between the satellite gas phase product and the PM response can reasonably be expected. In this study, column concentrations of NO2 from OMI were found to be significant in the retrieval of PM.
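To make the dependence on layer height concrete, the following sketch evaluates the PM-AOD relation in the form AOD = PM · H · f(RH) · 3 Q_ext,dry / (4 ρ r_eff), which is the form assumed here for Eq. (1). The function names and all aerosol property values are illustrative assumptions, not values used or reported in this study.

```python
def aod_from_pm(pm, height, f_rh, q_ext_dry, density, r_eff):
    """AOD implied by a surface PM concentration (all inputs in SI units).

    pm        particulate mass concentration [kg m^-3]
    height    depth H of the aerosol layer [m]
    f_rh      ratio of ambient to dry extinction coefficients [-]
    q_ext_dry dry extinction efficiency [-]
    density   particle density rho [kg m^-3]
    r_eff     particle effective radius [m]
    """
    return pm * height * f_rh * 3.0 * q_ext_dry / (4.0 * density * r_eff)


def pm_from_aod(aod, height, f_rh, q_ext_dry, density, r_eff):
    """Invert the same relation to estimate PM from a column AOD."""
    return aod * 4.0 * density * r_eff / (3.0 * q_ext_dry * height * f_rh)


# Illustrative (not measured) aerosol properties.
props = dict(f_rh=1.5, q_ext_dry=2.0, density=1500.0, r_eff=2.0e-7)

aod = 0.15
pm_summer = pm_from_aod(aod, height=1000.0, **props)  # deep boundary layer
pm_winter = pm_from_aod(aod, height=500.0, **props)   # shallow boundary layer

# Convert kg m^-3 to ug m^-3 for printing.
print(pm_summer * 1e9, pm_winter * 1e9)
```

With everything else held fixed, halving the layer height doubles the PM implied by a given AOD, which is one reason a single linear AOD-PM fit can struggle across seasons.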

Statistical methods
In a simple linear regression model, the relationship between a response y and parameters x_1, ..., x_p is given by
$$ y = B_0 + \sum_{j=1}^{p} B_j x_j + \varepsilon \qquad (2) $$
where the B_j are constants and ε is the residual error. In the simplest case, when PM is compared with AOD only, Eq. (2) becomes:
$$ \mathrm{PM} = B_0 + B_1 \, \mathrm{AOD} + \varepsilon \qquad (3) $$
In an additive model, we replace the linear relationship between the parameters and response with a functional relationship so that an additive model takes the form:
$$ y = B_0 + \sum_{j=1}^{p} f_j(x_j) + \varepsilon \qquad (4) $$
where the f_j are simple, smooth arbitrary functions replacing the coefficients B_j. Replacing the B_j with f_j is necessary because some parameters show a nonlinear dependency with PM2.5. Some parameters may have a large effect on the response in only a particular part of their range. For example, a variable may be dominated by random error at very low values or have other undesirable effects at very high values, such as MODIS Deep Blue AOD as shown in Fig. 4a. For these reasons it is appropriate to make the simple extension of linear to non-linear regressions. Generalized additive models (GAMs) are such an extension, retaining additive terms (Wood, 2006, 2007).
In this case we employ several retrievals of AOD as parameters (e.g., MODIS standard algorithm, MODIS Deep Blue, OMI), because each has its possible virtues and range of greatest applicability. For example, the OMI AAOD tends to be more sensitive to aerosols at high altitude.
Generalized additive models (GAMs) and neural network algorithms have been used previously but not with satellite-only data sets (Gupta et al., 2006; Pelletier et al., 2007; Vidot et al., 2007; Lary et al., 2009). Additive models provide flexibility but also allow functional forms to be plotted, allowing us to evaluate the physical rationale of the relationship expressed in the function. In this implementation we will use splines to describe the relationship between parameters and response in a functional form that can be related to a physical basis. The resulting predictive model is used to predict surface PM from satellite observations.
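The difference between a simple linear regression and an additive model can be illustrated with a small, self-contained sketch. It is not the GAM implementation used in this study: for brevity each smooth function is approximated by a piecewise-constant function on equal-width bins rather than by a spline, and the model is fitted by backfitting. The synthetic data, the cosine seasonal cycle, and all names are assumptions made for the example.

```python
import math


def mean(values):
    return sum(values) / len(values)


def bin_index(x, lo, hi, n_bins):
    """Equal-width bin number of x on [lo, hi]."""
    if hi == lo:
        return 0
    k = int((x - lo) / (hi - lo) * n_bins)
    return min(max(k, 0), n_bins - 1)


def fit_additive(columns, y, n_bins=12, n_iter=20):
    """Backfit y = b0 + sum_j f_j(x_j), each f_j piecewise constant on bins.

    `columns` holds one list of values per parameter. Returns fitted values.
    """
    n = len(y)
    b0 = mean(y)
    bins = []
    for col in columns:
        lo, hi = min(col), max(col)
        bins.append([bin_index(x, lo, hi, n_bins) for x in col])
    f = [[0.0] * n_bins for _ in columns]
    for _ in range(n_iter):
        for j in range(len(columns)):
            # Partial residuals: remove the intercept and all other terms.
            sums = [0.0] * n_bins
            counts = [0] * n_bins
            for i in range(n):
                others = sum(f[k][bins[k][i]]
                             for k in range(len(columns)) if k != j)
                sums[bins[j][i]] += y[i] - b0 - others
                counts[bins[j][i]] += 1
            f[j] = [s / c if c else 0.0 for s, c in zip(sums, counts)]
            # Centre f_j so the intercept stays identifiable.
            shift = mean([f[j][b] for b in bins[j]])
            f[j] = [v - shift for v in f[j]]
    return [b0 + sum(f[j][bins[j][i]] for j in range(len(columns)))
            for i in range(n)]


def fit_line(x, y):
    """Fitted values of the least-squares straight line y = b0 + b1 * x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return [my + slope * (a - mx) for a in x]


def r_squared(y, fitted):
    my = mean(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, fitted))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data: PM depends linearly on AOD and nonlinearly on day of year.
days = [i + 1 for i in range(360)]
aod = [((i * 37) % 100) / 100.0 for i in range(360)]
pm = [20.0 + 15.0 * math.cos(2.0 * math.pi * d / 365.0) + 30.0 * a
      for d, a in zip(days, aod)]

r2_linear = r_squared(pm, fit_line(aod, pm))
r2_additive = r_squared(pm, fit_additive([aod, days], pm))
print(round(r2_linear, 2), round(r2_additive, 2))
```

Because the seasonal term is not monotonic in day of year, no single coefficient can represent it, whereas the additive term recovers its shape directly from the data.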

3 Results and discussion

Linear regressions between parameters
When a particular response, in this case PM, is a function of multiple parameters, it is instructive to explore the relationship between different parameters. This information can guide the selection of parameters to include in a multiple regression and help explain the results of the analysis. Correlations for the linear regressions between the parameters of interest in this study are shown in Table 2. These correlations are based on the data from all sites that form the complete set described in more detail in Sect. 3.2. Some features of the data in Table 2 stand out. First, there are excellent correlations between AOD retrieved by the same method or instrument at different wavelengths. For example, the correlation between MODIS AOD at 0.47 and 0.55 µm using the standard retrieval (AOD 47 and AOD 55) is 0.97. The correlation for the Deep Blue retrieval (DB AOD 47 and DB AOD 55) was also 0.97. The high correlation between AOD at different wavelengths may be due to the constraints that the MODIS retrieval puts on the wavelength dependence. This practice is typical of other satellite retrievals. It indicates that no new information will be obtained from including both wavelengths from the same retrieval. It is also interesting that the AOD retrieved using different methods does not correlate very well: the correlation between AOD 47 and DB AOD 47 was 0.35; the correlation between AOD 47 and OMI AOD was 0.15; and the correlation between DB AOD 47 and OMI AOD was 0.7. The correlation between OMI AOD and AAOD was 0.6. These low correlations are surprising. They suggest that the different retrieval algorithms play an important role in determining AOD. They also suggest that different retrievals or observations are providing different information about the aerosol optical depth. This may be because the measurements are made at different wavelengths (OMI observations are made at shorter wavelengths than MODIS) or because different assumptions for aerosol models or surface reflectance are used in the retrieval algorithm. The other surprising fact from the data in Table 2 is the generally low level of correlation between the other parameters. Low correlations between PM and satellite observations in this region have been observed and commented on before.
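The kind of screening described above, in which highly correlated parameters are identified before building a multiple regression, can be sketched as below. The parameter names echo those in the text, but the numbers are hypothetical and are not data from Table 2; the 0.95 threshold is likewise an arbitrary choice for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def redundant_pairs(parameters, threshold=0.95):
    """Pairs of parameter names whose |r| is at least `threshold`."""
    names = sorted(parameters)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson_r(parameters[first], parameters[second])
            if abs(r) >= threshold:
                pairs.append((first, second, r))
    return pairs


# Hypothetical values, not data from Table 2.
parameters = {
    "AOD_47": [0.10, 0.22, 0.15, 0.31, 0.27, 0.18],
    "AOD_55": [0.09, 0.20, 0.13, 0.28, 0.25, 0.16],
    "OMI_AOD": [0.30, 0.12, 0.25, 0.20, 0.10, 0.33],
}
for first, second, r in redundant_pairs(parameters):
    print(first, second, round(r, 3))
```

A pair flagged this way carries nearly the same information, so only one member would normally be kept in the parameter set.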

Correlations for individual sites
As an example of how the GAM model can work for an individual site, we look at the Fresno site for four years, 2004-2008. Figure 2a shows a scatter plot of PM2.5 modeled using a simple linear regression of MODIS Deep Blue AOD (DB AOD 47) versus PM2.5. The correlation coefficient, R², is 0.21. Note the lack of agreement, especially for large values of PM2.5. These are the most important points to match since they are associated with exceedances of the EPA criterion pollutant standard, which is 35 µg m⁻³ for a 24-h average PM2.5. Figure 2b shows a scatter plot of PM2.5 modeled using a GAM for the same data. Parameters used in the model are MODIS Deep Blue AOD at 0.4 µm wavelength, OMI NO2, and θ (day of year, a proxy for seasonal variation). The correlation coefficient has improved to R² = 0.72 for PM2.5 measurements.
We see from Fig. 2a and b that a positive intercept is obtained using both the simple linear and GAM retrievals. It has been suggested that this intercept represents the minimum level of particle concentration for which the satellite-derived AOD is sensitive (Gupta et al., 2008). It is probably more suitable to claim the lowest reported value as the instrument sensitivity. In both cases this is about 12 µg m⁻³.
Table 3 summarizes the results of the GAM model for six San Joaquin Valley sites: Bakersfield, Fresno, Modesto, Stockton, Turlock, and Tracy. In all cases the response was hourly PM2.5 measured at the surface site. Very little Deep Blue AOD was available for Modesto; therefore, data from the standard algorithm was used. Results of the simple linear regression are shown in red for comparison. The number of points used in the correlation is listed and discussed later. The low number of points for Tracy is due to PM data only being available for part of one year, 2007. The correlation to daily PM is better for Bakersfield and Fresno than for the hourly PM.
The data show that there is general improvement in the correlations for the various sites over the simple linear regression. We explored many combinations of parameters looking for the best fit of retrieved to measured PM. Better correlations were obtained using surface measurements of NOx, but the objective in this study was to use only satellite observations since these are most transferable to non-monitored areas. The parameter set chosen for Table 3 was the set that gave the best performance for the entire SJV; however, some sites did have slightly better agreement than those listed in Table 3 with a different set of parameters.

Multiple-site retrieval
Our goal in this work is to obtain the best retrieved PM for the entire SJV using only satellite observations. We combined all of our San Joaquin Valley sites into one data set. A parameter to differentiate between sites was included to account for topographical and population differences, but it had a negligible contribution to the retrieval and was not included in the parameter sets considered here. Table 4 summarizes the results of several GAM models for the six San Joaquin sites. The best correlation (r² = 0.74) between observed and retrieved surface PM occurred when the GAM1 model was used. This model included all six parameters: MODIS standard AOD, MODIS Deep Blue AOD, OMI AOD, OMI AAOD, OMI NO2, and θ. However, as discussed in Sect. 3.5, adding more parameters significantly reduced the sample size. In this case the sample size was reduced from 2122 to 630. To obtain the other retrieval models that make up Table 4, different parameters (indicated by an "X") were eliminated and retrievals obtained. The results of the correlation and effects on sample size are shown in Table 4.
Then a scheme for combining several GAM models was devised that kept the sample size constant and had minimal effect on the correlation coefficient. We discovered that we could retain all 2122 sample points in this way and only reduce the correlation to 0.69. The combination was performed using the following steps:
1. The GAM1 model (see Table 4) was run with the full set of six parameters. The retrievals were weighted as described in Sect. 3.4.
2. Parameters were eliminated and the effect on correlation coefficient and sample size was noted.
3. This procedure resulted in six models using six different sets of parameters. The six models and their results are shown in Table 4 along with a simple linear model for comparison and each parameter's p-value sensitivity.
4. The next step was to combine all six retrieval models, ensuring that no data point was used more than once. The sample points were combined by taking the points from the model with the highest r² (GAM1) and adding the points from the model with the next highest r² (GAM3) that were not contained in the previous model. Points from the remaining models were added in this same fashion.
5. The combined data set r² was determined and is shown in Table 4.
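The combination steps above can be sketched as a simple priority rule: each sample is retrieved by the highest-ranked model whose parameters are all available for that sample. In this illustration only GAM1's parameter list follows the text; the reduced parameter sets, their names, and the sample records are hypothetical.

```python
def combine_retrievals(samples, models):
    """Assign each sample to the first applicable model.

    `samples` maps a sample id to its available parameters (name -> value);
    a missing satellite observation is simply absent from the dict.
    `models` is a list of (name, required_parameters), already sorted from
    highest to lowest r^2. Each sample is used once, by the first model
    whose required parameters are all present.
    """
    assigned = {}
    for sample_id, available in samples.items():
        for name, required in models:
            if all(p in available for p in required):
                assigned[sample_id] = name
                break
    return assigned


# GAM1 uses all six parameters; the reduced sets below are hypothetical.
models = [
    ("GAM1", ["MODIS_AOD", "DB_AOD", "OMI_AOD", "OMI_AAOD", "OMI_NO2", "theta"]),
    ("reduced_a", ["DB_AOD", "OMI_NO2", "theta"]),
    ("reduced_b", ["MODIS_AOD", "theta"]),
]

# Hypothetical samples with different combinations of missing observations.
samples = {
    "s1": {"MODIS_AOD": 0.2, "DB_AOD": 0.25, "OMI_AOD": 0.3,
           "OMI_AAOD": 0.02, "OMI_NO2": 4.1, "theta": 15},
    "s2": {"DB_AOD": 0.18, "OMI_NO2": 3.2, "theta": 200},
    "s3": {"MODIS_AOD": 0.12, "theta": 310},
    "s4": {"theta": 45},
}
assigned = combine_retrievals(samples, models)
print(assigned)
```

A sample such as "s4", for which no model's parameters are complete, is left out of the combined set.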
The values listed in the row across from the parameters are their p-values. The p-value is a measure of how significant a parameter is in the retrieval. The smaller the p-value, the more significant the effect of the parameter in the model. (In statistical terms, the p-value is a measure of how much evidence we have against the null hypothesis, that is, the assumption that the parameter has no effect on the retrieval.)

Weighting for PM exceedances
It is possible to weight the response when using regressions. In this case it was desirable to put more weight on higher PM values, those associated with EPA exceedances that are of more interest to the regulatory and epidemiology community. This was accomplished in this study using an iterative process. First the GAM models in Table 4 were run without weighting. Then the resultant retrieved PM values were used to establish a weighting and the GAM model was run again. Figure 3 shows the data correlations for the combined data set with a variety of models. Figure 3a is a simple linear model correlating standard MODIS AOD with PM for reference. Figure 3b is the combined GAM model without weighting for PM, and Fig. 3c is a simple correlation for the weighted model results. It is seen that the r² has not changed between Figs. 3b and c. When using a weighting function, it is appropriate to look at the correlation as a function of the weighted variables. A standard correlation between two variables, x and y, is
$$ r = \frac{1}{N} \sum_{n=1}^{N} \frac{(x_n - \mu_x)(y_n - \mu_y)}{\sigma_x \sigma_y} \qquad (5) $$
where N is the number of samples, µ is the mean value, and σ is the standard deviation of the parameter x or y. The weighting function, w_n, was based on the retrieved values and normalized so that $\sum_n w_n^2 = 1$, and the weighted correlation was then computed from the weighted variables. Figure 3d shows the scatter plot for the weighted observed and retrieved PM. It is clear that weighting is doing a better job at matching the higher values of PM.
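A weighted correlation of this kind can be sketched as follows. The exact weighting function and weighted-correlation expression used in the study are not reproduced here; the sketch assumes weights proportional to the retrieved PM, normalized so that their squares sum to one, and uses one common definition of a weighted Pearson correlation. All names and data values are hypothetical.

```python
import math


def normalise(values):
    """Scale non-negative values so that the squared weights sum to one."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]


def weighted_corr(x, y, w):
    """Weighted Pearson correlation of x and y with weights w."""
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(var_x * var_y)


# Hypothetical observed and retrieved PM2.5 values (ug m^-3).
observed = [8.0, 12.0, 15.0, 22.0, 40.0, 65.0]
retrieved = [14.0, 13.0, 18.0, 20.0, 35.0, 58.0]

weights = normalise(retrieved)  # assumption: weights proportional to retrieved PM
print(weighted_corr(observed, retrieved, [1.0] * len(observed)))  # unweighted
print(weighted_corr(observed, retrieved, weights))                # weighted
```

With equal weights the function reduces to the standard correlation, so the two printed values show how much emphasis on the high-PM points changes the apparent agreement.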

Parameter sensitivities
OMI NO2 was a significant contributor to predicting PM and had a linear sensitivity with PM. This is probably because OMI NO2 is a measure of how much the SJV is affected by NOx-generating polluters, in this case, vehicle traffic. Nitrogen dioxide is formed in the environment from primary emissions of oxides of nitrogen. Although there are natural sources of NOx (e.g., forest fires), the combustion of fossil fuels has been, and remains, the major contributor in urban areas. Traffic pollution and power plants are two of the biggest sources of NO2 pollution in the United States. Sulfate aerosol makes up only about 5 % of the total aerosol in the Fresno area, while nitrate makes up about 25 % and black carbon (BC) and organic carbon (OC) make up about 18 % and 21 %, respectively (McMurry et al., 2004). We note that OMI NO2 does not typically correlate well with surface measurements. We believe this is because the surface measurement is a point measurement, while the satellite data represents a more distributed value.
Including surface reflectance in the model had very little effect on the agreement between retrieved and measured PM, suggesting that the Deep Blue surface reflectance model is sufficient for the AOD retrieval in the SJV. Ångström exponent was not a significant parameter in any of the models. This was expected due to the high correlation between AOD at different wavelengths. As previously mentioned, one of the advantages of using GAMs in this application is that they produce plots of the relationships between the response (PM) and the parameters (e.g., AOD, NO2) that can suggest a physical relationship. To illustrate, Fig. 4a shows the sensitivity of MODIS Deep Blue AOD with respect to PM. This sensitivity is the function f_j in Eq. (4) that corresponds to the parameter MODIS Deep Blue AOD. This sensitivity is nearly linear, as one would hope; however, there are some departures from linearity that are captured by the spline function.
The significance of the seasonal variation in the GAM models is large, as seen in the very low p-values in Table 4. This is expected because meteorological conditions strongly affect PM concentrations. These include vertical mixing of air pollutants, temperature, moisture, long- and short-range transport, and the available sunlight, which affects secondary organic aerosol. Liu et al. (2009b) discriminated between spring and non-spring seasons in their work using GAMs and model-generated meteorological fields. Figure 4b shows the seasonal sensitivity as a function of day-of-year. In the SJV it is observed that very high levels of pollution occur on strong inversion days (Watson et al., 2000). Others have considered the factors affecting the air quality in the region (e.g., Chow et al., 2006), and found that motor vehicles and residential wood burning are the principal sources of aerosol particles. Emissions from motor vehicles are essentially constant during the year, while wood burning emissions constitute a major aerosol mass fraction in the fall and winter seasons beginning mid-November. Presumably, residential wood smoke particulates would be minimal at 1400, the time of the satellite overpass. Justice et al. (2009) found that while surface measurements of PM2.5 in the SJV were greatest in the winter, AOD values were lowest in the winter. Investigating the relationship between sunphotometer-derived AOD and PM at Ispra, Italy, Barnaba et al. (2010) found a similar effect, with maximum AOD occurring in spring/summer and maximum PM values occurring in winter. They were able to use coincident lidar measurements to demonstrate that the variation in aerosol vertical distribution was important in this discrepancy. Since there are no lidar sites in the SJV, we were unable to explore this phenomenon in the present study; however, the statistical model was able to capture the proper seasonal behavior and successfully model it.
Given that PM is a dry measurement and AOD is not, the seasonal variation in RH may be reflected in the seasonal functional dependence. The day-of-year parameter was chosen for expedience. The influence of seasonal factors on surface PM, such as relative humidity, the height of the planetary boundary layer, and the composition of the aerosol due to seasonal emissions, is well documented. We could obtain no reliable observations of these parameters from remote sources, in keeping with our premise to use only remote observations to retrieve surface PM values. There was an attempt to use RH from assimilation models, but these proved to be inaccurate unless some modeled meteorology was included in the assimilation. Barnaba et al. (2010) found that the correlation did not improve when RH was considered. Similarly, for the few cases in which surface measurements of RH did exist, we found that they did not have a significant effect on the correlations between retrieved and measured PM. Additionally, the usefulness of our retrievals for validating model predictions of PM would be compromised if the same meteorological model were used in estimating RH. Thus the day-of-year parameter seemed the best choice for this study. The limitation of using day-of-year is that the meteorological factors that affect PM do not commence at the same time every year. We are exploring ways to add this influence in our algorithm development.

Effect of adding parameters on sample size
The number of sample points decreases dramatically when PM is merged with satellite data. Hidy et al. (2009) note that for MODIS data in Fresno from 2002 to 2008, cloud-free days were available 43, 61, 94, and 81 % of the time in winter, spring, summer, and fall, respectively. Table 5 shows how the number of sample points decreases as more parameters are added. The data set is limited to days that have OMI data, from 1 October 2004 to 6 July 2008. This gives a total of 8244 possible samples for six sites. Of these possible samples, MODIS data existed for 4455 points. However, our criterion for averaging over the 5 × 5 pixels was that >50 % of the 25 pixels had to have valid data. This restriction is more severe than is typically used for these types of studies and slightly reduced the number of points to 4340. The reduction in points resulting from the merge with satellite data is largely due to cloudy pixels. Since OMI has a larger footprint than MODIS, the number of "cloudy" pixels is larger.
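The >50 % validity criterion for the 5 × 5 pixel average can be sketched as follows. This is an illustrative sketch only: it assumes invalid (e.g., cloudy) pixels are stored as NaN, and the function name `window_mean` is hypothetical.

```python
import numpy as np


def window_mean(aod_window, min_valid_fraction=0.5):
    """Average the valid pixels of a window, or return NaN.

    Assumption: invalid pixels are encoded as NaN. The mean is returned
    only if strictly more than `min_valid_fraction` of the pixels are
    valid (for a 5 x 5 window, at least 13 of 25).
    """
    w = np.asarray(aod_window, dtype=float)
    valid = np.isfinite(w)
    if valid.sum() > min_valid_fraction * w.size:
        return float(w[valid].mean())
    return float("nan")
```

With the counts quoted above, the 4340 retained points correspond to roughly 53 % of the 8244 possible samples.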

Retrieved and observed PM trends
To further illustrate the utility of the multi-satellite GAM retrieval, observed PM for days in our sample set is plotted with retrieved PM from a simple linear regression with standard MODIS AOD (Fig. 5a) and with the GAM-retrieved PM (Fig. 5b). The figures clearly illustrate that the GAM retrieval does a superior job in matching the observed PM and capturing exceedances. The lines in the plots are smoothed data to show trends. Of the 1272 data points in the set, surface measurements recorded 149 exceedances. The linear fit recorded six and the GAM fit recorded 167. The GAM fit correctly identified 68 % of the exceedances, missed 31 %, and gave false positives for 43 %. While this is a big improvement, it underscores the fact that there is a long way to go before surface measurements will be replaced by satellite retrievals for regulatory purposes. Thus we contend that the real near-term utility of satellite-derived PM will be in filling in the gaps of surface measurements to improve validation of air quality models and for epidemiology studies.

Conclusions
This paper demonstrates the use of GAM models with multi-platform satellite observations to dramatically improve the correlation between observed and retrieved PM in California's San Joaquin Valley. The parameters used are MODIS AOD, OMI AOD, AAOD and NO 2 concentration, and a seasonal parameter. Correlations (r 2) for the retrieved/observed PM 2.5 for a data set combining six surface sites improve to 0.69, compared with an r 2 of 0.27 for the linear regression of MODIS AOD to surface PM. Particularly noteworthy is the fact that the PM retrieved using the GAM captures many of the PM exceedances that were not seen in the simple linear regression.
Further improvements are needed. The GAM models and input data can be refined by using other combinations of parameters and including more measurement sites, especially rural sites. Certainly, the inclusion of other data, especially information on the vertical distribution of particles, when available, may further improve these results. The question of the generality of this technique needs to be addressed in the future. We intend to apply this technique to other areas globally that have similar topography to the SJV and that have demonstrated poor correlations between AOD and PM. We expect that the combinations of parameters used in the SJV will be useful in these areas. However, there may be some regions in which the combination of parameters needs to be adjusted. For example, NO 2 concentration was a significant factor in retrieving PM in the SJV, but this may have been due to the prevalence of transportation-related emissions in the region. In other regions more heavily dominated by electrical power plant emissions, SO 2 may prove to be more significant than NO 2 in the retrievals. However, we feel that the techniques demonstrated in this study can be used to greatly enhance the utility of satellite-retrieved PM for air quality purposes.
In addition, the GAM model produces relationships between the parameters and the response that may lead to improved satellite retrieval of AOD, surface reflectance, etc. Improved PM 2.5 retrievals from satellite data will have profound benefits. They will be valuable in validating emission inventories used in global climate models and in validating air quality models. They will be especially useful in sparsely populated regions where no data are available and in areas where residents are exposed to significant PM 2.5 concentrations but where measurements are not available. Such data can be used to track pollution transport, to suggest the placement of future surface monitoring stations, and in epidemiological studies that seek to identify the sources of the most toxic air pollutants and the susceptible populations. These techniques will prove useful in improving satellite observations of particulate as well as gas-phase air quality, not only in the western United States, but globally.

5, which are 15.0 µg m −3 for the annual average and 35 µg m −3 for the 24-h average. The highest PM 2.5 concentrations are typically found during the winter and fall. The winter and fall are also characterized by high concentrations of NH 4 NO 3. Watson et al. (2000) provide an excellent overview of the Fresno air quality and air quality measurements made at the Fresno supersite. Elevated concentrations of PM 2.5 can be associated with about 18 000 (ranging from 5600 to 32 000) premature deaths in California each year based on 2004-2006 air

size was recorded. It was found that using OMI AAOD as a parameter resulted in the largest decrease in sample size. Eliminating either the standard or Deep Blue MODIS AOD did not have a large effect on the correlations and reduced the number of sample points only slightly.

Fig. 1. Map of the San Joaquin Air Pollution Control District in California highlighting the coverage provided by ground sites and satellites.

Fig. 5. Trends of retrieved and measured PM in the SJV. The EPA 24-h criterion is shown as a red dashed line. The blue and gray points are the retrieved and measured PM 2.5, respectively. The blue and black solid lines are loess fits to the retrieved and measured PM 2.5, respectively.
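The exceedance comparison against the 24-h criterion can be tallied with a short sketch such as the one below. It is illustrative only: the function name `exceedance_scores` is hypothetical, the default threshold of 35 µg m −3 is taken from the 24-h standard quoted in the text, and because the text does not state how the false-positive percentage is normalized, the sketch returns raw counts alongside the hit and miss rates.

```python
import numpy as np


def exceedance_scores(observed, retrieved, threshold=35.0):
    """Compare exceedances of a threshold in observed and retrieved PM.

    Assumption: `observed` and `retrieved` are paired daily values in
    the same units as `threshold`.
    """
    obs = np.asarray(observed, dtype=float) > threshold
    ret = np.asarray(retrieved, dtype=float) > threshold
    hits = int(np.sum(obs & ret))          # exceedance seen in both
    misses = int(np.sum(obs & ~ret))       # observed but not retrieved
    false_pos = int(np.sum(~obs & ret))    # retrieved but not observed
    n_obs = int(obs.sum())
    return {
        "n_observed": n_obs,
        "n_retrieved": int(ret.sum()),
        "hits": hits,
        "misses": misses,
        "false_positives": false_pos,
        # Rates are relative to the observed exceedances; NaN if none occur.
        "hit_rate": hits / n_obs if n_obs else float("nan"),
        "miss_rate": misses / n_obs if n_obs else float("nan"),
    }
```

Returning counts rather than a single score leaves the choice of denominator for false positives (observed or retrieved exceedances) to the user.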

Table 2. Correlation coefficients (R 2) for linear regressions between response and parameter. Data from all sites.