Atmospheric Chemistry and Physics Open Access

Abstract. An aerosol optical depth (AOD) three-dimensional variational data assimilation technique is developed for the Gridpoint Statistical Interpolation (GSI) system for which WRF-Chem forecasts are performed with a detailed sectional model, the Model for Simulating Aerosol Interactions and Chemistry (MOSAIC). Within GSI, forward AOD and adjoint sensitivities are performed using Mie computations from the WRF-Chem optical properties module, providing consistency with the forecast. GSI tools such as recursive filters and weak constraints are used to provide correlation within aerosol size bins and upper and lower bounds for the optimization. The system is used to perform assimilation experiments with fine vertical structure and no data thinning or re-gridding on a 12 km horizontal grid over the region of California, USA, where improvements on analyses and forecasts is demonstrated. A first set of simulations was performed, comparing the assimilation impacts of using the operational MODIS (Moderate Resolution Imaging Spectroradiometer) dark target retrievals to those using observationally constrained ones, i.e., calibrated with AERONET (Aerosol RObotic NETwork) data. It was found that using the observationally constrained retrievals produced the best results when evaluated against ground based monitors, with the error in PM 2.5 predictions reduced at over 90% of the stations and AOD errors reduced at 100% of the monitors, along with larger overall error reductions when grouping all sites. A second set of experiments reveals that the use of fine mode fraction AOD and ocean multi-wavelength retrievals can improve the representation of the aerosol size distribution, while assimilating only 550 nm AOD retrievals produces no or at times degraded impact. While assimilation of multi-wavelength AOD shows positive impacts on all analyses performed, future work is needed to generate observationally constrained multi-wavelength retrievals, which when assimilated will generate size distributions more consistent with AERONET data and will provide better aerosol estimates.


Introduction
Atmospheric aerosols interact with society and the environment in several important ways such as producing acute health impacts, generating visibility issues and creating a substantial climate response (e.g., Ramanathan et al., 2008).Thus, it is important to have accurate estimates of aerosol concentrations.However, predicting aerosols remains a challenge and models produce estimates with substantial errors and biases (Koch et al., 2009;McKeen et al., 2007).Current efforts to reduce the uncertainties in aerosol distributions include assimilating aerosol-related observations (e.g., Pagowski et al., 2010), where one of the most commonly used observations is aerosol optical depth (AOD) from satellite retrievals.AOD has been used along with models to constrain aerosol concentrations in multiple ways: to generate AOD to surface PM 2.5 (particles with a diameter of 2.5 µm or less) conversion factors (e.g., van Donkelaar et al., 2006;Liu et al., 2005); to improve model daily and monthly estimates of ground level PM 2.5 (e.g., Carmichael et al., 2009;Adhikary et al., 2008); to correct model initial conditions to produce improved reanalysis and forecasts (e.g., Liu et al., 2011;Benedetti et al., 2009); and to produce better emissions estimates (Huneeus et al., 2012;Heald et al., 2010;Wang et al., 2012;Xu et al., 2013).
Among the satellites and sensors that produce AOD estimates, one of the most commonly used is the operational dark target retrieval from the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor on board the Terra and Aqua platforms (Remer et al., 2005), as it tends to generate accurate observations over a wide range of surfaces (Petrenko and Ichoku, 2013).However, this retrieval often shows deviations from ground measurements, and centers such as the Naval Research Laboratory (NRL) (Zhang et al., 2008) and National Aeronautics and Space Administration (NASA) (GMAO, 2013) use observationally constrained retrievals (where AOD is empirically fitted to ground sunphotometer data) in their assimilation systems.To our knowledge, the impact of assimilating operational MODIS products versus observationally constrained products has not been previously assessed.
MODIS AOD, as other satellite/sensor products, is reported in several wavelengths (three and seven for land and ocean retrievals, respectively), and the wavelength dependency of AOD (Ångström exponent) contains aerosol size information (Schuster et al., 2006).However, most studies assimilate a single retrieval (usually 550 nm), and there are few studies analyzing the impact of simultaneously assimilating multiple wavelengths.For instance, Schutgens et al. (2010) assimilated AOD and Ångström exponent (obtained combining multiple wavelengths) ground measurements from the Aerosol RObotic NETwork (AERONET) network to constrain a global aerosol model.AOD retrieval algorithms can also produce a fine mode fraction product.A few studies have explored the use of the fine mode fraction and total AOD simultaneously.For example, Generoso et al. (2007) used fine and coarse mode AOD on global data assimilation experiments using POLDER satellite measurements, as well as Huneeus et al. (2012), who used fine and total MODIS AOD in the context of a global emissions inversion with positive impacts, including improved aerosol size distributions.There is a need to further assess the impacts of simultaneous use of these data sets in a data assimilation framework.
In this study we develop the ability of the Gridpoint Statistical Interpolation (GSI) three dimensional variational (3DVAR) system to simultaneously assimilate various AOD products to correct Weather Research and Forecasting/Chemistry (WRF-Chem) forecasts when using the Model for Simulating Aerosol Interactions and Chemistry (MO-SAIC) treatment (Zaveri et al., 2008).This aerosol model is widely used in several applications, but its use in an assimilation framework is challenging due to the large num-ber of species and size bins that need to be treated simultaneously (Li et al., 2012).However, assimilation performed for aerosol treatments that have higher degrees of freedom (i.e., multiples species and multiples size bins) may be useful when assimilating many data sources at the same time, as both the total mass and aerosol size distribution could be modified to produce a better fit to observations.Section 2 describes the method and additions introduced to GSI to effectively perform assimilation with the MOSAIC model.Then, the system is used in two experiments, shown in Sect.3. First, we assess the impact of assimilating operational MODIS retrievals (dark target land and ocean) versus observationally constrained ones; and second, we evaluate the impact on forecasts when simultaneously assimilating multiple wavelengths and fine and total AOD compared to just assimilating total 550 nm AOD.Finally, shortcomings, conclusions and future directions are presented.

Forecast model
The aerosol forecasts were performed with the Weather Research and Forecasting (WRF) version 3.4.1 regional meteorological model (Skamarock et al., 2008) coupled to aerosols and chemistry (WRF-Chem) (Grell et al., 2005).This is a fully coupled online model.The chemical and aerosol mechanism used is the CBMZ gas-phase chemical mechanism (Zaveri and Peters, 1999;Fast et al., 2006) coupled to the 8-bin sectional MOSAIC (Zaveri et al., 2008) aerosol model.MOSAIC keeps track of 8 chemical species (sulfate, nitrate, ammonium, organic carbon, black carbon, sodium, chloride and other inorganics, where dust is included) on two phases (dry/interstitial and wet/activated) that, along with number concentration (on both phases), water and hysteresis water content per size bin, results in a total of 160 species tracked.The model configuration is based on Saide et al. (2012b).Some of the configuration choices include a 12 km horizontal grid spacing with 72 vertical levels with ∼ 60 m level thickness below 3 km, MYNN level 2.5 planetary boundary layer (PBL) scheme (Nakanishi and Niino, 2004), Lin microphysics (Chapman et al., 2009), Goddard short wave radiation (Chou et al., 1998;Fast et al., 2006), andAbdul-Razzak andGhan (2002) aerosol activation.
Emissions from different sources are treated as follows: NEI (National Emission Inventory) 2005 anthropogenic emissions (http://www.epa.gov/ttnchie1/net/2005inventory.html), MEGAN (Model of Emissions of Gasses and Aerosols from Nature) biogenics (Guenther et al., 2006), and FINN (Fire INventory from NCAR) biomass burning emissions (Wiedinmyer et al., 2011), coupled to an online plumerise model (Grell et al., 2011), Gong et al. (1997) sea salt parameterization and GOCART dust scheme (Zhao et al., 2010) (Emmons et al., 2010), respectively.MOZART uses monthly dust distributions from Community Atmosphere Model (CAM) (Mahowald et al., 2006) calculations, which are also used in this study.Even though WRF-Chem has the option to include secondary organic aerosol (SOA) formation coupled to MOSAIC aerosols (Shrivastava et al., 2011;Hodzic and Jimenez, 2011), this process is not included in this analysis, as the focus of this paper is the development and testing of the new assimilation system.
Even though more sophisticated assimilation schemes such as 4DVAR (Benedetti et al., 2009) and Ensemble Kalman filter (Pagowski and Grell, 2012) can be used for assimilation, we chose 3DVAR as a computationally inexpensive but powerful way to demonstrate AOD assimilation for the MO-SAIC aerosol scheme, without having to perform an ensemble of simulations or develop the WRF-Chem adjoint.The GSI version used is based on the modifications made by Liu et al. (2011) and Schwartz et al. (2012) to assimilate AOD.However, we incorporated substantial additional modifications suited to the MOSAIC aerosol model, as described in the following subsections.

3DVAR method
In this study we build upon work of Liu et al. (2011) and Schwartz et al. (2012).As they presented, we use the 3DVAR functional (J ), but add terms to allow weak constraints: where y represents the observation and x the control variable (e.g., the one modified during optimization); with x b the a priori estimate and x uc and x lc the upper and lower constraints; H the observation operator; R, B and K the observation, background and weak constraint error covariance matrixes, respectively; and k uc and k lc regularization parameters to weight the weak constraint.Liu et al. (2011) and Schwartz et al. (2012) considered as control variables three dimensional (3-D) aerosol concentrations from different species and AOD as observations.In this work, we introduce new GSI options.First, we incorporate the option of using as control variables the natural logarithm (LN) of 3-D aerosol concentrations.This choice naturally constrains concentrations to be positive and provides multiplicative rather than additive corrections (Henze et al., 2009).In the same manner, we add the option to use the AOD natural logarithm as the observation.As both aerosol concentration and AOD are positive, it is likely that their errors are of multiplicative nature, and the use of a transformation becomes more natural as Eq. ( 1) implicitly assumes that the errors are normally distributed (Bocquet et al., 2010).When using these logarithmic choices, the sensitivities from the observation operator have to be computed accordingly, which is achieved by using the non-log sensitivities and the chain rule of derivatives: where c represents the aerosol concentrations being analyzed.By using this conversion we are able to use the same code to compute sensitivities for any choice of control variable.To avoid zero values, we set a threshold of 1 × 10 −20 for AOD and aerosol concentrations when converting to and from the LN variables.
An additional modification with respect to the control variable used in previous research is that, instead of using aerosol concentrations output by WRF-Chem (µg kg −1 ), we multiply them by the grid-cell vertical thickness (in meters), which provides a measure of the column concentration and is proportional to aerosol mass rather than aerosol concentration.The consequences of not applying this correction are depicted by the following example.For two given grid-cells in the same column and containing the same aerosol concentrations and uncertainty, the grid with the deeper thickness will contain higher sensitivities, as the same change in concentration will generate a higher increase in AOD due to the deeper layer.This will end up preferentially modifying concentrations in the assimilation in those deeper grid-cells, biasing the model.By multiplying by the thickness, we avoid the assimilation favoring changes in deeper grid-cells, which could be important in configurations with great vertical variability as the one used in this study.
Finally, instead of using as control variables all aerosol species (Liu et al., 2011;Schwartz et al., 2012) on all size bins, we introduce the option of using total mass per size bin as control variables, and distribute the changes within GSI considering the percentage of mass contribution of each species as a constant for each size bin.This consideration allows a reduction in the number of control variables by a factor equal to the number of species, which is eight in our case.When using this choice the system is faster, has fewer degrees of freedom and is less likely to accumulate changes on single species.Similar assumptions have been made in other AOD assimilation systems (e.g., Benedetti et al., 2009), with the difference that here we can still produce changes in the total aerosol composition, as different species often dominate different size bins (Saide et al., 2012a).We use the total mass per size bin as the control variable for all experiments presented in this study.

P. E. Saide et al.: Aerosol optical depth assimilation for a size-resolved sectional model
The weak constraint term added in Eq. (1) constrains the control variable so the optimal solution would be within user specified bounds or close to them.The implementation is based on the relative humidity weak constrain done in GSI meteorological assimilation.K is diagonal and chosen as a scale (in the variance space) of the control variable.For simplicity we chose it equal to the diagonal of B and use the parameters k uc and k lc for weighting the constraint, with higher values giving a higher weight to the terms in Eq. ( 1), thus allowing a smaller departure of x from the target bound once it has been exceeded.x uc and x lc represent the desired bounds for the control variable and are calculated as multiplicative factors applied to the prior (additive in the case of LN control variable).In the experiments, x uc and x lc were chosen equal to 5•x b and 0.01•x b , meaning that the upper and lower bound terms are activated during minimization when x is over 5 times or below 1/100 times the background, respectively.The weights of the constraint term k uc and k lc were equal to 0.5 and 0.05, which were chosen experimentally by trying different values and keeping a range that both restricts x to the bounds and at the same time avoids the constraint term becoming the largest term in the functional J .Higher weight and more constrained multiplicative bound are given for the upper constraint as we found that overly increasing concentrations (i.e., incorrectly high AOD retrieval) can excessively damage the forecasts.

Background error covariance matrix
By using standard deviations and vertical and horizontal correlation length scales as inputs, GSI is able to approximate the convolution of a background error covariance matrix (B) by the use of recursive filters (Wu et al., 2002;Purser et al., 2003).Besides vertical and horizontal correlations, chemical and aerosol data assimilation often incorporates the use of cross-species correlations as many of these are co-emitted or have similar precursors (Elbern et al., 2007).Since we use total mass of all species per size bin as the control variable, inter-species correlation is not applicable.There is also a natural correlation for different size bins for each species that needs to be considered (e.g., Saide et al., 2012a).By using recursive filters we incorporate the capacity to add correlations between aerosol size bins in GSI.Filter passes run along size bins in incremental order and are applied locally for each aerosol size distribution, in a similar way to how vertical scales are applied (Wu et al., 2002).For simplicity, the inter-size bin correlation lengths are specified in the namelist by the user and not computed through the method described in the next paragraph.However, we do not discard this possibility for future studies.The size bin correlation length scale was chosen equal to 2 bin units, which prevents excessive accumulation of innovations on a single size bin and distributes the changes along them.The isotropic nature of one-dimensional recursive filters restricts the ability to apply different correlations scales to bins that have smaller and larger sizes than the reference one.Such anisotropic correlation would be preferred for bins located at the edges of fine and coarse distributions.We hypothesize this limitation could be partially overcome when computing the correlation with methods such as the one described next.
As in Liu et al. (2011) and Schwartz et al. (2012), we use the NMC (National Meteorological Center) method (Parrish and Derber, 1992) for computing the standard deviations and vertical and horizontal length scales.Depending on the choice of control variable (see Sect. 2.2.1), the same variable has to be the input to the NMC computation.For the case of LN control variables, we constrain the standard deviation to be less than or equal to one LN unit to avoid a very unconstrained system.The NMC method generally uses two forecasts (12 and 24 h or 24 and 48 h) to compute statistics.We use a long meteorological spin-up time (Saide et al., 2012b), so following this strategy could consume too much computational resources.To compensate, we assess uncertainties by running two continuous parallel simulations driven by different meteorology.In the case of retrospective North American experiments, this can be done using NCEP Final Analysis and North American Regional Reanalysis (NARR).This method used for May 2010 yields isotropic horizontal length scales between 15 and 36 km, with smaller and higher values in the lower and upper troposphere, respectively.These are considered small values compared to global data assimilation systems, but are in the range of 1 to 3 times the horizontal grid resolution, which falls between typical ranges (Liu et al., 2011).Vertical length scales vary between 1 and 6 model grid vertical level units.In general they are large near the surface due to boundary layer mixing, then decrease rapidly, reaching small values around the capping inversion height, and then remain high up to ∼ 3 km where the model vertical grid gets coarser (see Sect. 2.1) and thus the length scales decrease down to small values.

Forward and adjoint of the observation operator
While Liu et al. (2011) and Schwartz et al. (2012) used the Community Radiative Transfer Model (CRTM) (Han et al., 2006) as the forward and adjoint observation operator, here we use WRF-Chem optical properties (OP) routines (Fast et al., 2006).This choice provides consistency between the AOD computed for assimilation and for forecast models.The WRF-Chem OP code considers an internal mixture within each aerosol size bin and uses Mie theory along with Chebyshev expansion coefficients for reducing computational time (Fast et al., 2006).This code has shown skill in predicting optical properties against total column data for several regions and aerosol regimes (Yang et al., 2011;Zhang et al., 2010;Chapman et al., 2009;Qian et al., 2010;Zhao et al., 2010Zhao et al., , 2012;;Kalenderski et al., 2013) and against in situ data ( Barnard et al., 2010;Shrivastava et al., 2013).The tangent linear (TL) and adjoint of this code were obtained using the automatic differentiation tool TAPENADE v 3.6 (Hascoët and Pascual, 2004).Two tests were performed to validate the code generated.First, the TL code was tested using the TL test, which consists of comparing the derivatives obtained from the code against finite differences derivatives obtained using the forward code, obtaining better agreement as the perturbation used was reduced, which is considered a successful test.Second, the adjoint code was tested using the adjoint test, which consists of generating derivatives with the TL code and then using them as an input for the adjoint code.In this case, a successful test is obtained when, for different perturbations, the dot product of the derivatives generated with the TL is equal to machine precision to the dot product of the adjoint derivatives and the original perturbation (Zou et al, 1997), which was also accomplished.
We incorporated aerosol water and number calculations within the WRF-Chem OP code added to GSI so they will be dependent on aerosol concentrations.The water uptake code is extracted from MOSAIC, which uses the activity coefficients of the electrolytes present and the Zdanovskii-Stokes-Robinson method (Zdanovskii, 1948;Stokes and Robinson, 1966).A threshold of 99 % relative humidity was set for water uptake calculations, and columns with clouds present are excluded from assimilation.Aerosol number was computed using aerosol concentration and diameter in each bin, assuming that the assimilation does not update diameter using the one in the prior.
Another addition to the WRF-Chem OP code added to GSI was the column AOD computation for specific MODIS wavelengths.WRF-Chem computes OP for four wavelengths: 300, 400, 600 and 999 nm.Similarly to WRF-Chem radiative transfer calculations (Fast et al., 2006), interpolation/extrapolation to MODIS wavelengths is done using the Ångström exponent from the two closest wavelengths.No modifications are needed when computing fine mode AOD, as the coarse bin mass and number are zeroed out before AOD and sensitivity computations.As the aerosol models in the MODIS algorithm use a modal approach (Remer et al., 2005) while MOSAIC uses a sectional approach, it is hard to create a complete match between the two when computing the fine fraction.For simplicity, we consider fine mode as aerosols with a dry diameter equal or less than 625 nm (first 4 size bins from the 8 bins of MOSAIC), which is in agreement with the cut-off diameter of 600 nm used in the standard AERONET retrieval (Dubovik and King, 2000;Dubovik et al., 2000).

Observations and their errors
The observational data sets that were assimilated in the different experiments are described in the following.There was no thinning of the data to maximize data usage.

Operational MODIS level 2 retrieval
Collection 5.1 MODIS aerosol data from Aqua and Terra satellites were obtained from NASA Goddard Space Flight Center.The dark target retrieval, which is the one used, is based on Remer et al. (2005) and Levy et al. (2007).Over land, AOD (the "Corrected_Optical_Depth_Land" product) is provided in three wavelengths: 470, 550 and 660 nm.However, for AOD at 550 nm lower than 0.2, the Ångström exponent used to compute the other two wavelengths is fixed (Levy et al., 2007), not providing an independent measurement of size distribution.Most AOD values over land were lower than 0.2 for the period of study, thus only the 550 nm retrieval is used in the assimilation.Over ocean OD (the "Effective_Optical_Depth_Average_Ocean" product) is provided in seven wavelengths (470,550,660,870,1240,1630 and 2130 nm) but only the ones in the range 550-1240 nm are used in the assimilation to keep the wavelengths used close to the range computed by WRF-Chem.The 470 nm wavelength is not used as there is no validation presented for this wavelength over ocean (Remer et al., 2005).The MODIS aerosol data set also provides fine mode fraction, defined as fraction that the fine mode (effective radius less than 0.5 µm) (Kaufman et al., 1997) contributes to the total optical thickness, which can be used to compute fine mode AOD.
When operational MODIS data are assimilated, the data are quality controlled to avoid degrading the assimilation.These controls include accepting the highest quality flag (qf = 3) over land and any flag (qf = 1, 2 or 3) over ocean and processing only pixels with zero cloud fraction.

NASA Neural Network Retrieval
The NASA Neural Network Retrieval (NNR) is an observationally constrained retrieval designed to generate a better fit with respect to AERONET observations, and is used operationally in the Goddard Earth Observing System, Version 5 (GEOS-5) (Rienecker et al., 2008) aerosol assimilation system (GMAO, 2013).It uses a neural network as an alternative to linear regression to capture possible nonlinear relationships.Predictors used for the ocean retrieval include level 2 multi-channel top of the atmosphere (TOA) reflectances, glint, solar and sensor angles, cloud fraction (only when it is lower than 85 %, otherwise pixel is discarded) and GEOS-5 surface wind speeds.Predictors used for the land retrievals are TOA reflectances, solar and sensor angles, cloud fraction (< 85 %) and climatological albedo (only if lower than 0.25).An important difference with other post-processing techniques is that it does not use any MODIS AOD retrieval as a predictor.The target used in the neural network (and in the GEOS-5 assimilation system) is not directly AERONET AOD, but log(AOD + 0.01), which tends to better represent a Gaussian probability distribution.The AOD at 550 nm is available at the same 10 km resolution of the MODIS level 2 operational retrievals (GMAO, 2013).The NRL-UND retrieval is a value-added AOD data set based on MODIS Level 2 aerosol products specifically designed for quantitative applications, including data assimilation and model validation.The quality assurance procedures and empirical correction algorithms (to better fit AERONET data) applied to this product are described in Zhang and Reid (2006), Zhang et al. (2008), Shi et al. (2011) and Hyer et al. (2011).This 550 nm AOD retrieval is derived from MODIS collection 5.A product gridded to 0.5 degree is produced by NASA's Land, Atmosphere Near-realtime Capability for EOS (LANCE) with product code MC-DAODHD.Due to the high resolution used in this study, the source code of this algorithm was modified to output results on a 0.05 degree grid, with a minimum of one retrieval per grid and without checking for neighbors on the output grid (no "grid buddy checking").This method always produces a maximum of one retrieval per grid cell (as MODIS minimum grid size is ∼10 km) with no aggregation, being comparable in terms of possible pixels generated to the other two retrievals used (MODIS and NASA NNR).In addition, only pixels with cloud fractions equal to zero and with the highest context quality checking were processed.

Observation error
Observational errors were assumed to be the same for all data sets, even though uncertainty is usually provided for the different data sets (e.g., Shi et al., 2011).This assumption was made to provide the same basis for comparing results.AOD errors over land and ocean were assumed to be equal to 0.6 and 0.2 in LN units (∼ 60 % and ∼ 20 % error, respectively).Our approach does not follow the error estimates proposed by Remer et al. (2005) and used by Liu et al. (2011) and Schwartz et al. (2012) (error = a + b • AOD, with a and b constants function of the type of retrieval) because in this treatment relative errors (computed as a percent of the AOD magnitude) increase as AOD is lower.In the case of operational MODIS data assimilation and when computing errors with this approach, spurious high AOD can significantly damage assimilation results as the high AOD will dominate due to the high relative error of the surrounding small AOD.
In the case of applying the same relative error (our approach), the surrounding small AOD control the spike of mass incorporated in the model.These MODIS AOD artifacts are effectively erased by the post-processing techniques (NASA NNR and NRL-UND).We also found that the fixed log-space uncertainty estimates resulted in better analysis results.These improvements suggest that the uncertainty estimates used in previous research may be too high for low AOD values, or may not correctly account for reduction of random error by spatial averaging in the data assimilation system.

Study domain and experimental design
The study region is California and its surroundings, which is an area with important air pollution problems, affected by both local and distant sources (Huang et al., 2010) (Zaveri et al., 2012).The coast of California is also important since in this area a persistent stratocumulus deck is found, which means that (1) aerosol retrieval from satellite is more challenging compared with more cloud-free areas, and (2) aerosol-cloud interaction is likely to be important (Hegg et al., 2012;Twohy et al., 2005).
The region also represents a challenge in terms of accurate meteorological and air quality predictions (Yver et al., 2013;Fast et al., 2012).The existence of the stratocumulus deck plus the pollution issues makes this area a good place to demonstrate the application of AOD assimilation approaches and assess its limitations.
The modeling domain is centered on the central California coast, with a domain spanning from 30 • N to 47 • N and from 133 • W to 112 • W. A large portion of the domain covers the ocean to allow a higher influence of data assimilated here and to better resolve the stratocumulus deck (Saide et al., 2012b).As previously mentioned, 12 km horizontal grid spacing is used.
Results are presented for May 2010.Simulations without data assimilation (from now on referred as "nonassimilated") start on 26 April to allow for model spin-up and run continuously until the end of May.On the assimilation experiments, analysis steps are performed every three hours with a three hour observation window, then forecasts are restarted from meteorology of the previous forecast and run for three hours.Additional simulations were performed for the first 10 days of May to assess the impact of assimilation on forecasts by performing 48 h unconstrained simulations after each daily 21:00 UTC analysis.The 550 nm operational MODIS AOD retrieval assimilation is considered as the "control" for all experiments, and impacts of other or additional data are also assessed.First, we evaluate the impact of assimilating observationally constrained retrievals (i.e., NASA NNR and NRL-UND) and, second, we assess the inclusion of fine mode AOD and multiple-wavelengths to the assimilation.We evaluate impacts for PM 2.5 , AOD and Ångström exponent (AE).Fractional error and fractional bias (Morris et al., 2005) are computed to assess model performance against non-assimilated observations (see next paragraph).Fractional error reductions (FER) are computed by subtracting fractional errors of the experiments and control assimilations.For the second set of experiments, as we assimilate multi-wavelength AOD only over ocean and fine fraction is very infrequent over land for this area and period, we focus our performance analysis on satellite data over ocean and coastal stations.
Observations from different ground monitoring networks were used as independent data to evaluate the data assimilation impacts (Fig. 1).Hourly PM 2.5 data were obtained from the US Environmental Protection Agency (EPA) Air Quality System (AQS, http://www.epa.gov/ttn/airs/airsaqs/), which provides a high density of measurements over California with most sites located in urban or suburban areas (Pagowski and Grell, 2012).We also used data from the Interagency Monitoring of Protected Visual Environments (IMPROVE, http://vista.cira.colostate.edu/improve/,Malm et al., 1994) network, which collects measurements mainly on remote regions (parks and wilderness areas), which are representative of one day and collected every 3 days.Besides total PM 2.5 , it also collects aerosol chemical composition measurements, from which we use sulfate, nitrate, chloride, sodium, organic carbon and black carbon.Additionally, total column AOD and Ångström exponent (AE) measurements were obtained from AERONET network data (Holben et al., 2001).For the period of study, 10 AERONET stations had data available within the study domain (Fig. 1).Finally, AOD retrievals not yet assimilated are considered as independent data and compared against model forecasts.Even though we performed assimilation every 3 h, most of the data is available in the 18:00 and 21:00 UTC cycles (due to the satellite overpass time and domain of study), with Terra and Aqua data accumulated mainly in the 18:00 and 21:00 UTC cycles, respectively.Thus, by comparing model forecasts and Aqua data one can analyze the performance of assimilation against independent satellite data for a 3 h forecast, or for a 21 h forecast by comparing to Terra data.
The validation using the three types of observations (aerosol concentration, ground AOD and satellite AOD) represent different levels of independence.The comparison against aerosol concentration observations is the true independent validation as these are not assimilated nor used for obtaining the retrievals assimilated.Comparing against AERONET AOD represents an intermediate level of independence as, even though these observations are not assimilated and have a lower level of uncertainty, they are used to tune the algorithms which compute the assimilated retrievals.Finally, validating against satellite retrievals represents the lowest level of independence, as, even when observations from a different satellite not assimilated are compared to the forecasts, they are computed with the same algorithms as the one assimilated so they retain the same systematic biases, which are propagated into the analysis and forecast.These levels of independence must be considered when analyzing the assimilation tests performance.
Model to observation mapping is described as follows.WRF-Chem output is saved hourly and mapped to ground stations using nearest neighbor interpolation.The hourly PM 2.5 WRF-Chem concentrations are used directly to compare against AQS observations.For IMPROVE stations, local time daily averages are computed.AERONET observations are averaged to hourly values, which are then compared to hourly WRF-Chem output using the Ångström exponent for interpolation to AERONET wavelengths.Finally, both satellite retrievals and model fields are re-gridded to a fixed regular 0.2 × 0.2 degree grid where averages and performance statistics are computed.

Non-assimilated model and retrievals evaluation
Figures 2 and 3 show non-assimilated model performance with respect to PM 2.5 ground observations from the AQS network.In general, the model overestimates PM 2.5 concentrations at most sites, with a global mean of 8.5 µg m −3 and 14.1 µg m −3 for observation and model, respectively.As seen in Fig. 2c and d, model biases tend to be more negative over northern California with biases close to zero and smaller errors in the Los Angeles area.Despite the biases, the model is able to reproduce the patterns of highest concentrations in the urban centers (Fig. 2) and captures the synoptic features which generate the high and low particle concentrations in the region (Fig. 3).
Figure 4 shows the non-assimilated model evaluation using the IMPROVE speciated observations.The model also overestimates PM 2.5 at these sites.These high model values come from the "other" chemical species aerosol (Fig. 4), which corresponds mainly (96 %) to the "other inorganics"  (oin) aerosol specie in MOSAIC.This overestimation can be traced back to dust aerosol in the chemical boundary conditions coming predominantly from the western and northwestern boundaries.The model also shows overestimation of aerosol nitrate, sea salt and black carbon.Sea salt aerosol overestimation is consistent with previous work (Saide et al., 2012b), which is produced by too high sea salt emissions.The nitrate overestimation may be due to emissions, as the NO x NEI2005 emissions have been found to be overestimated (Kim et al., 2009) and they do not reflect the decreasing trend in NO x emissions up to year 2010 (EPA, 2013).Opposite to the general trend, organic carbon is highly underestimated by the model, which is expected as no SOA scheme was included in the simulations.This difference is more evident as the IMPROVE network consists mostly of remote stations, leaving longer time for SOA production.Sulfate is slightly underestimated, which reflects that SO2 emissions may be low in NEI2005.This could be the result of the NEI emissions not including shipping emissions, which is an important source in the region (Huang et al., 2011).
As issues with local emissions are found, an emission inversion along with data assimilation could be performed, as suggested by other studies (Jiang et al., 2013).However, as the major problem in this study arises from the dust boundary conditions, adjusting just emissions would end up reducing them when they do not necessarily need to be reduced (e.g., in the case of SO 4 ).Thus, future studies performing data assimilation and emissions inversions would also need to assimilate chemical boundary conditions.), nitrate (NO 3 ), chloride (Cl), sodium (Na), organic carbon (OC), black carbon (BC) and "other", which in the observation is obtained as the mean PM 2.5 minus the sum of the mean of rest of the species mentioned, and in the model as the sum of the mean of "other inorganics" (which includes dust) and ammonium species.Monthly mean values for the different 550 nm AOD retrievals are shown in Fig. 5 (top).Significant differences can be seen between the different retrievals.The NASA NNR and NRL-UND retrievals tend to make corrections of the same sign with respect to the operational MODIS, for example they increase AOD over coastal California, while decreasing Fig. 6.AOD time series on a selection of sites for AERONET data (500 nm), operational MODIS (550 nm), NASA NNR (550 nm), the non-assimilated forecast and the two assimilation forecasts (500 nm).MODIS shows pixels lumped from Terra and Aqua, while NASA NNR shows pixels lumped for Terra, Aqua, land and ocean retrievals.For satellite data, the closest retrieval to the site is plotted only when the distance is less than 0.2 degree.See Fig. 1 for AERONET sites locations.AOD over the ocean and on the more inland territories, especially over Nevada.Both of these post-processing retrievals are calibrated with the AERONET data, so this behavior is expected.However, as the algorithms and inputs are different, there are still some significant differences between both data sets.When considering specific AERONET sites (Fig. 6), we can see that the post-processed techniques are usually closer to the AERONET values than the operational MODIS retrieval.However, there are still persistent biases that the post-processed techniques are not able to overcome, mainly in the area of Nevada and southeast California (Fig. 6f), which is also shown by the high AOD in the monthly means (Fig. 5) in all retrievals.
Non-assimilated model monthly mean values (Fig. 5d) show a persistent overestimation in AOD over the ocean and over land for most of the domain.As mentioned above, there appears to be a high bias in the boundary conditions associated with dust, which produces a high background AOD over the modeling domain.The non-assimilated model underestimates AOD in Nevada and southeast California, which corresponds to the area mentioned before where the retrievals show higher deviations from AERONET sites.A general overestimation is also found when comparing the non-assimilated model to AERONET stations (Fig. 6), so we anticipate that assimilation should move the aerosol state towards the AERONET observations for most sites and retrievals.An interesting station to analyze is the Caltech site (Fig. 6e), located in northern Los Angeles.Here, the model shows very small bias for the high AERONET AOD values, which is consistent with small errors and almost no bias found in the PM 2.5 AQS comparison (Fig. 2c and d).For this site, satellite retrievals do not exactly match the AERONET data, so we anticipate that assimilation will tend to degrade results in this area as errors in the retrieval are higher than model errors.

MODIS and observationally constrained 550 nm AOD assimilation
Model AOD after assimilation is shown in Fig. 5e and f.Model estimates are closer to the observations being assimilated compared to the non-assimilated one, showing that the optimization process is working properly.Figures 3, 4, 7 and 8 show improvements in PM 2.5 after AOD assimilation.When looking at the time series of PM 2.5 for the whole month (Fig. 3), it is seen that the bias reduction of the assimilation changes from day to day.These variations can be partially explained by the amount of data being assimilated (Fig. 9) which is a function of several factors, including the scan pattern of the MODIS sensor, the quality control applied (the most important being the cloud fraction threshold) and post-processing algorithms.For instance, the first 8 days of the month show the consecutive period with the most data available to assimilate and the largest bias reductions.One factor contributing to the correlation between the amount of assimilated data and the resulting bias reduction is the small horizontal length scale used (see Sect. 2.2.2), which prevents corrections during assimilation extending too far from the observation location.Thus, more data will translate into a larger spatial coverage.
Results show that all assimilated retrievals reduce the fractional error (from 0.71 on the background to 0.65, 0.62 and 0.64 for MODIS, NASA NNR and NRL-UND assimilation) on a large fraction of AQS PM 2.5 stations (85 %, 92 % and 96 % for MODIS, NASA NNR and NRL-UND assimilation).Fractional error reductions (Fig. 7) tend to be higher in stations that originally had higher errors (Fig. 2), like locations in northern and central California.In general, the assimilation of post-processed data (NASA NNR and NRL-UND) has better performance than the operational MODIS data.A very clear example is southern Nevada, where MODIS assimilation degrades results, which is in agreement with Schwartz et al. (2012), while both post-processed techniques reduce the errors.As seen in Fig. 9, the NASA NNR retrieval has the highest amount of data assimilated, which is mainly due to the less restrictive quality control applied in this algorithm (e.g., cloud fraction less than 0.85 versus no cloud fraction in MODIS and NRL-UND tests).As discussed previously, having more data tends to improve assimilation performance, thus this is a factor influencing the higher error reductions of assimilating NASA NNR versus the other two re-trievals.However, quality of the data is also important, which is why the post-processed techniques show a considerably higher fraction of stations with reduced errors compared to MODIS.In this dimension, the NRL-UND product is the one that shows the highest fraction of stations improved due to the more restrictive quality control applied.The assimilations tend to slightly reduce or even increase errors in the region surrounding Los Angeles (Fig. 7).As mentioned in Sect.3.1, this was expected as the non-assimilated model has small error and bias in this region, both against PM 2.5 and AERONET measurements.Also, as this is a populated area, spatial and temporal concentration gradients are not completely resolved by the 12 km horizontal grid spacing used.This can be observed by comparing observation and model mean maps (Fig. 2a and b), and by the great variability during the day at the AERONET Caltech site not entirely captured in the non-assimilated model (Fig. 6e).
The bias reduction against monitored PM 2.5 can also be seen at the IMPROVE stations (Fig. 4, bottom), with improvements in the fractional error (from 1.1 to 0.88 globally) for 100 % of the stations analyzed when NASA NNR is assimilated (similar for the other two retrievals).The assimilation does not significantly change aerosol composition as only total AOD is assimilated and because a relatively long correlation length is used between size bins.Thus, the general trend is that aerosol species in the non-assimilated model that have a bias of the same sign of total PM 2.5 bias will have their biases reduced in the analysis, while the bias will be increased in the opposite case.For instance, assimilated black carbon and nitrate improved while sulfate and organic carbon degraded after the assimilation tests.This behavior is similar to the one found in experiments assimilating PM 2.5 observations (Pagowski and Grell, 2012).
An analysis of the impact of assimilation on forecasts starting at 21:00 UTC is shown in Figs. 8 and 10.When evaluating against PM 2.5 AQS measurements (Fig. 8), as all forecasts start at the same time, the diurnal cycle modulates the bias and error reductions.For instance, the decreasing trend in fractional error on the 0-4 h forecast follows the increase in error shown by the non-assimilated model in this portion of the diurnal cycle.PM 2.5 concentrations show low bias one hour after assimilation, reaching zero values when NASA NNR retrievals are assimilated.Then, the assimilation gradually returns towards concentrations and errors found when no assimilation is performed, in agreement with previous studies (Schwartz et al., 2012).This is also seen in the AQS PM 2.5 comparison, where assimilation almost never goes back to the non-assimilated model levels (Fig. 3) and fractional error reduction at 18:00 UTC for all stations and days in May is equal to 0.06.After 48 h there is a slight but positive influence of assimilation for both retrievals (> 0.012 fractional error reduction).These results show that, in the context of operational air quality forecasting, AOD assimilation with the method developed here can be beneficial for improving the skill of the forecasts for the day after the satellite overpass.As shown earlier, the NASA NNR retrieval assimilation outperforms the MODIS 550 nm assimilation for all times for both bias and fractional error.
Figure 10 shows performance evaluation for a 3 and 21 h forecast against not yet assimilated satellite data.For the 3 h one, assimilation shows error reductions in most of the domain, except in the Nevada and southeast California regions, where the retrievals tend to present issues (see Sect. 3.1).Error reductions tend to be higher over ocean due to the smaller error assigned to these observations during assimilation, which allow the system to better fit the model to observations.Different error reductions over different ocean areas are related to cloud presence from day to day, with areas with higher cloud fractions showing less error reductions.The 21 h forecast (Fig. 10b) shows smaller but still significant error reductions as model errors over time bring the state closer to the non-assimilated model.Fractional error reductions are close to zero over the ocean from 125 • W to the west as after 21 h the assimilated aerosols have already been advected away from this region, matching the error of the non-assimilated model.Over the ocean south of California there is still the presence of the assimilated aerosol and the fractional error reductions are considerable (above 0.1).Over California, error reductions are generally less than 0.1 but positive, showing that after 21 h there is still persistence of the innovation.High error reductions over land after 21 h (Fig. 10b) are found along coastal southern California and northern Mexico, which is consistent with the excellent and long-lasting performance of NASA NNR (and the other retrievals) assimilation against AERONET measurements at the La Jolla site (Fig. 6a).On the other hand, sites like Trinidad Head (Fig. 6b) tend to approach the non-assimilated model more rapidly as the domain boundary is close to the site and boundary conditions blow in this direction, in agreement with what happens over the ocean west of 125 • W (see above).As NASA NNR (and NRL-UND as well, not shown) data are closer to AERONET AOD, assimilation of these data sets generally provides a closer agreement to the observations compared to the operational MODIS data assimilation (Fig. 6).From the 10 AERONET stations with data during May 2010, MODIS assimilation reduces fractional error in 8 of them from a global fractional error of 0.66 to 0.6, while the AERONET calibrated techniques reduce errors in all 10 stations, yielding smaller fractional errors (0.54 and 0.58 for NASA NNR and NRL-UND, respectively).

Multiple wavelength and fine mode AOD assimilation
Compared to AQS and IMPROVE data, fine and total, and multiple wavelength assimilations do not degrade results compared to the control 550 nm AOD assimilation, obtaining similar statistics to the ones shown in Sect.3.2, even when results are filtered for coastal stations.
Figure 11 shows error reductions for a 3 h forecast.We see that all approaches considerably reduce errors over the ocean for both wavelengths (Fig. 11 top and middle rows).Assimilating only 550 nm AOD (control) reduces the aerosol loads, thus also reducing the 870 nm AOD, generating a better fit with these observations without assimilating them.Assimilating fine and total AOD generates smaller error reductions (for both 550 nm and 870 nm AOD) compared to only assimilating total AOD.This is probably because the additional constraint to the fine aerosol reduces the ability of the optimization to generate a closer fit to the total AOD.Another factor that could also create these results and that has been noted to generate issues (Kleidman et al., 2005) is the possible mismatch in the fine and coarse mode definitions due to different aerosol approaches used in MOSAIC and the MODIS algorithm (sectional versus modal, respectively).On the other hand and opposite to fine AOD assimilation, using multi-wavelength AOD data generates slightly better error reductions for 550 nm AOD while considerably better reductions for 870 nm AOD when compared to the control  assimilation.The better fit to 870 nm AOD observations is expected as the 870 nm retrieval is being directly assimilated.
Figure 11 (bottom row) also shows error reductions for the Ångström exponent (AE).In general, increasing values on AE indicate finer aerosols (Schuster et al., 2006).Over the ocean, the non-assimilated model tends to show very low AE compared to the observed values (not shown), which is consistent with the overestimation of dust coming from the boundaries (see Sect. 3.1).Even though the 550 nm AOD only assimilation generates a good fit to AOD observations, there is only a small change in AE, with regions where there is even an increase in the error.As this assimilation only uses one observation per column and a large correlation length between bin sizes, the assimilation tends to uniformly modify aerosols within bin sizes, not significantly changing the size distribution and thus the AE.A completely different picture is seen for fine and total, and multi-wavelength AOD assimilations, where the use of multiple observations per column modifies the AE and in the right direction, reducing the errors in most of the domain.The fine and total AOD assimilation tends to generate slightly better AE results than the multi-wavelength AOD assimilation as the former directly modifies the fine aerosol.However, we recommend the use of the multi-wavelength over the fine and total AOD data as total AOD burdens are much better estimated by the multiwavelength approach, as described in the previous paragraph.
To better understand the differences between the control assimilation (550 nm AOD only) versus adding additional multi-wavelength data, Fig. 12 shows vertical profiles of PM 2.5 and aerosol number concentration 3 h after a given assimilation.Even though the PM 2.5 column is reduced for both assimilations (Fig. 12a), the use of multiple wavelength data selectively reduces PM 2.5 in different model layers (higher reductions in the 3-8 km layer, smaller below 2 km) to better fit all observations simultaneously.This can generate a shift in the AE as different size distributions are found at different heights.On the other hand, the different assimilation approaches generate opposite results for aerosol number concentrations (Fig. 12b), with the single and multiple wavelength cases reducing and increasing it below 5 km, respectively.As explained in the previous paragraph, when assimilating 550 nm AOD only, the long correlation length generates uniform modifications along bin sizes, so as the total aerosol concentration is reduced, aerosols in the small bin sizes (where aerosol number dominates) will also be reduced, not changing the overall size distribution (Fig. 12c).Again, multiple-wavelength AOD assimilation will selectively modify size bins to create a better fit to observations at all wavelengths, even if changes go in opposite directions between bin sizes.In the case shown, coarse and fine size bins are reducing and increasing their mass respectively, which globally reduces mass (Fig. 12a) but increases number (Fig. 12b), changing the size distribution (Fig. 12c).Changing aerosol number concentrations in different directions can have a great impact in this region, as these aerosols can act as cloud condensation nuclei and substantially modify cloud properties (Saide et al., 2012b).
Comparisons over coastal AERONET stations show that for periods when the single-and multi-wavelength assimilations have differences (flow towards the coast), assimilation of multi-spectral AOD tends to show slightly better performance against 870 nm AOD (Fig. 13, left column), but the single wavelength assimilation still shows very good skill, as mentioned previously.Stronger differences can be appreciated in the AE time series (Fig. 13, right column).The 550 nm AOD assimilation usually follows the nonassimilated model closely, while the multiple wavelength assimilation deviates from it and generally fits the observation better.The error reductions when comparing to AERONET AE are not as significant as the ones shown when comparing to MODIS AE.This is probably because MODIS retrieved AE, which has yet to be validated (Remer et al., 2005), is often inconsistent with AERONET AE (Fig. 13, right column).In this sense, obtaining observationally constrained retrievals for multiple wavelengths AOD and AE would allow assimilations to obtain additional improvements, as shown in Sect.3.2.Also, further work evaluating against marine AERONET stations and/or maritime aerosol network (MAN) data (Smirnov et al., 2011) is needed to substantiate these conclusions.

Conclusions
We developed the ability for the GSI system to perform AOD assimilation to correct WRF-Chem aerosol fields when simulations are done with the MOSAIC sectional aerosol model.This enables the assimilation to impact aerosol concentrations, size and composition.In doing so, we added several new capabilities to the GSI system: using the AOD forward and adjoint Mie computations from WRF-Chem routines, making GSI results consistent with the forecasts; adding the use of logarithmic state and observations; including bounds during optimization time in the form of weak constraints; adding correlations within aerosol size bins into the background error covariance matrix by the use of GSI recursive filters; and modeling aerosol water uptake, as done in MOSAIC, considering atmospheric conditions and the electrolytes present.The assimilation is performed using as state variable total mass within each size bin, significantly reducing computational resources used compared to using all species in all size bins.This is all demonstrated using a 3DVAR assimilation system, but it could eventually be applied to more sophisticated frameworks such as 4DVAR or Kalman filter systems to make use of their strengths over 3DVAR (e.g., Pagowski and Grell, 2012).These methods would allow performing data assimilation simultaneously with boundary conditions and emissions inversions (e.g., Elbern et al., 2007), which is likely to make the improvements in the aerosol predictions persist longer in time.
This newly developed assimilation scheme was demonstrated in a regional analysis and forecast application for one month at sites throughout California and its surroundings.The first set of assimilation experiments explored the use of observationally constrained AOD retrievals (NASA NNR and NRL-UND) against using operational MODIS 550 nm dark target data.All three assimilations decreased global error and biases by improving forecasts on a large fraction of PM 2.5 and AOD monitoring ground stations.The assimilation of observationally constrained retrievals consistently showed better performance compared to the operational MODIS data as they corrected the spatial biases and quality controlled odd retrievals, with the NASA NNR producing the higher error reductions (due to a larger amount of data) and the NRL-UND showing the higher fraction of improved PM 2.5 stations (96 %, due to the more restrictive quality control applied).48 h forecasts starting from an analysis step showed improvements in the aerosol predictions (0.15-0.015 fractional error reductions for the NASA NNR retrieval vs. PM 2.5 ), demonstrating the potential of the developed technique for air quality forecasting applications.These assimilation experiments did not change the overall aerosol composition, thus degrading model performance for single aerosol species that had an opposite bias to the global tendency.Improvements in the non-assimilated estimates are necessary to correct this issue, which could be achieved in the study case by incorporating missing SO 2 emissions and processes not modeled such as secondary organic aerosol formation.
A second set of experiments assessed the impact of assimilating fine mode and multiple-wavelength AOD.Results showed that while single wavelength assimilation did not significantly change size distributions, assimilation of additional data selectively modified aerosol at different vertical layers and changed size distributions, producing a better fit to the Ångström exponent (AE), an indicator of aerosol particle size distributions.The inclusion of fine AOD could not outperform the assimilation of just total AOD when comparing AOD burdens, possibly due to a mismatch between the fine mode fraction definition on model and retrieval.On the other hand, forecasts including multiple wavelengths in the assimilation further reduced errors for MODIS 550 nm and 870 nm AOD and simultaneously improved the 550-870 nm AE.The use of multiple wavelengths in the assimilation was also found to have positive influence on predictions at coastal AERONET sites.However, AE error reductions were not as significant as when evaluating with MODIS AE, possible due to an inaccurate performance of the MODIS against AERONET AE.
In this paper we showed the value of assimilating observationally constrained AOD and multiple wavelength data over assimilation of off-the-shelf 550 nm AOD products.Future research should point towards generating observationally constrained AOD and AE for multiple wavelengths, which will bring together the best of the techniques explained in this research.We directly used the MODIS resolution (10×10 km 2 ) in assimilation without thinning or re-gridding, showing that data assimilation on fine resolution models is feasible with positive impacts.This becomes important as newer products are available at higher resolutions (e.g., Lyapustin et al., 2012;Munchak et al., 2013).Even though we performed assimilations on a region densely populated by monitoring networks, we assimilated satellite retrievals only; thus, this method should be able to be applied anywhere in the world.Future work should point towards simultaneously assimilating several AOD data sets, including other observation types such as ground measurements (Schwartz et al., 2012) and cloud retrievals (Saide et al., 2012a).We also showed that the impact of assimilation increases with the amount of data used, so further error reductions may be achieved by using AOD retrievals from geostationary satellites, provided that their quality is appropriate for data assimilation.Integration of all these data sets is likely to help in providing better aerosol estimates for a large variety of applications.
As we showed that assimilation can improve estimates of surface PM 2.5 , this technique could also be used to generate analysis with high temporal and spatial resolution for use in health assessments (e.g., Silva et al., 2013).Also, the improved aerosol loads should be able to help in better estimating aerosol climate forcing.Finally, the assimilation can likely be used in forecasting mode to predict air quality more accurately.

Fig. 2 .
Fig. 2. Non-assimilated model evaluation against PM 2.5 monitors from AQS network throughout California and Nevada for May 2010.Panels show Observation mean (a), model map and model masked to observations means (b), fractional error (c) and fractional bias (d).

Fig. 3 .
Fig. 3. Model and observed mean PM 2.5 time series for May 2010 over AQS sites in California and Nevada.Model simulations are the non-assimilated and assimilated using the NASA NNR product.

Fig. 4 .
Fig. 4. Summary of May 2010 IMPROVE observations versus non-assimilated and NASA NNR assimilated model estimates.Top figures show aerosol composition and bottom ones show mean aerosol concentration per chemical specie.Chemical species are sulfate (SO 4), nitrate (NO 3 ), chloride (Cl), sodium (Na), organic carbon (OC), black carbon (BC) and "other", which in the observation is obtained as the mean PM 2.5 minus the sum of the mean of rest of the species mentioned, and in the model as the sum of the mean of "other inorganics" (which includes dust) and ammonium species.

Fig. 5 .
Fig. 5. May 2010 average maps of operational MODIS Terra (a), NASA NNR (b) and NRL-UND (c) products at 550 nm for the same MODIS Terra data, non-assimilated model (d) and assimilated estimates (e, f).While the non-assimilated model is masked by the NASA NNR product, the assimilations are masked by each data ingested.MODIS data (a) are not quality controlled by cloud fraction or quality flags, as done during assimilation.

Fig. 7 .
Fig. 7. PM 2.5 fractional error reductions from non-assimilated to assimilated models at AQS sites for May 2010.Positive values represent error reductions.

Fig. 8 .
Fig. 8. Mean PM 2.5 concentrations (top) and fractional error reductions (FER, bottom) as a function of forecast hour for all AQS stations during the first 10 days of May 2010.All 48 h forecasts used to build the mean and FER start at 21 UTC.

Fig. 9 .
Fig. 9. Time series of the number of pixels being assimilated for each day in May 2010 for the different 550 nm AOD data sets.

Fig. 10 .
Fig. 10.Fractional error reductions from non-assimilated to NASA NNR assimilated models computed with respect to NASA NNR Aqua (a) and Terra (b) observations.Model fields used for comparison are forecasts at 21:00 and 18:00 UTC for (a) and (b) respectively.Thus in the fractional error computation, observed data has not been assimilated yet and can be considered as independent.Thus, (a) and (b) are fractional error reductions for a 3 and 21 h forecast respectively (see Sect. 2.4 for more details).

Fig. 11 .
Fig. 11.Fractional error reductions for 550 nm AOD, 870 nm AOD and 550-870 nm Ångström exponent (rows) from non-assimilated to assimilated model computed with respect to Aqua retrievals.Figures in the left column assimilate only MODIS 550 nm AOD (control), in the center column assimilate both total and fine AOD at 550 nm, while the ones in the right column assimilate MODIS 550, 660, 870, and 1240 nm over ocean and only 550 nm over land.As described in Fig. 10, these figures correspond to a fractional error reduction for a 3 h forecast for May 2010.

Fig. 12 .
Fig. 12.On the top, vertical profiles for PM 2.5 (a) and aerosol number concentration over 80 nm diameter (b); and on the bottom, mass fraction size distribution at 4 km altitude (c), for forecasts on 6 May 2010 at 21:00 UTC.The forecasts are the non-assimilated and the two assimilated using the 550 nm AOD only (control) and multiple wavelength AOD retrievals.Data assimilation was performed 3 h before (18 Z the same day).For 8 bin MOSAIC, dlogDp is 0.693.

Fig. 13 .
Fig. 13.As Fig. 6 but for of 870 nm AOD and 500-870 nm Ångström exponent from coastal AERONET sites.The three models shown are the non-assimilated, and forecasts assimilating MODIS 550 nm only and wavelengths from 550 nm to 1240 nm.MODIS ocean retrieval (870 nm AOD and 550-870 nm Ångström exponent) is shown when data is within 0.2 degree of the site.See Fig. 1 for AERONET sites locations.
. This region has been the target of several recent measurement campaigns such as NASA's Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) mission for the California Air Resources Board (CARB) (Jacob et al., 2010), the California Research at the Nexus of Air Quality and Climate Change (CalNex) field study (Ryerson et al., 2013) and the 2010 Carbonaceous Aerosols and Radiative Effects Study (CARES)