RAMS-MLEF Atmosphere-Aerosol Coupled Data Assimilation: A Case Study of A Dust Event over the Arabian Peninsula on 4 August 2016

The Regional Atmospheric Modeling System (RAMS) has been interfaced with the Maximum Likelihood Ensemble Filter (MLEF) to improve initial conditions for aerosol weather forecasting via atmosphere-aerosol coupled data assimilation (RAMS-MLEF). To assimilate satellite-retrieved aerosol optical depth (AOD), an AOD observation operator customized for the RAMS aerosol module is implemented. Two RAMS-MLEF experiments are carried out for a dust storm event over the Arabian Peninsula that occurred on 4 August 2016. In the first experiment (ATMONLY), conventional atmospheric observations from the National Centers for Environmental Prediction (NCEP) Prepared Binary Universal Form for the Representation of meteorological data (PrepBUFR) dataset are assimilated. In the second experiment (ATMAOD), both the atmospheric observations and AOD retrievals from the Moderate Resolution Imaging Spectroradiometer (MODIS) are assimilated. Both experiments use the same control variables: the three-dimensional wind components, perturbation Exner function, ice-liquid water potential temperature, total water mass mixing ratio, and the two dust modes from the aerosol module. Results indicate that assimilating MODIS AOD retrievals improves the representation of the dust plume over the Persian Gulf but has no obvious impact on the dust plume in the interior of Saudi Arabia. This finding is further supported by the analysis increments of selected control variables and by an information measure, the degrees of freedom for signal. The limited impact over Saudi Arabia is likely due to the lack of AOD retrievals in the interior of the Arabian Peninsula. Finally, a 12-h forecast is initialized from the analysis of each experiment. When verified against the aerosol reanalysis product from the Modern-Era Retrospective Analysis for Research and Applications version 2 (MERRA-2) dataset, the ATMAOD forecast in general better represents the Persian Plume but performs poorly for the Saudi Plume.

Atmos. Chem. Phys. Discuss., https://doi.org/10.5194/acp-2018-1249 Manuscript under review for journal Atmos. Chem. Phys. Discussion started: 17 December 2018 © Author(s) 2018. CC BY 4.0 License.


Introduction
Interest in assimilating aerosol and/or chemistry data into numerical models to improve aerosol weather forecasts has grown over the last two decades (Collins et al., 2001, Wang et al., 2003, Weaver et al., 2007, Wang and Niu, 2013, Zhang et al., 2014, Randles et al., 2017). From the modelling perspective, there are, in general, two approaches. One approach uses an offline aerosol model that is driven by meteorological fields produced by an atmospheric model (e.g., Sekiyama et al., 2010, Rubin et al., 2017). In this approach, interactions between the atmospheric model and the aerosol model are one-way: the meteorological fields from the atmospheric model are used to initialize the aerosol model, but the output of the aerosol model is not fed back to the atmospheric model. The other approach uses an atmosphere-aerosol coupled modelling system (e.g., Liu et al., 2011, Lee et al., 2017), in which meteorological and aerosol fields are forecast simultaneously and the interactions between the atmospheric and aerosol components are two-way. One such example is the Weather Research and Forecasting model coupled with Chemistry (WRF-Chem) (Grell et al., 2005).
From the data assimilation perspective, depending on how the modelling components interact with each other, there exist three general regimes: i) uncoupled, ii) weakly coupled, and iii) strongly coupled systems. As summarized in Županski (2017), in an uncoupled data assimilation system each component is completely independent in both the data assimilation and forecast steps. A weakly coupled data assimilation system performs data assimilation separately for each component, and the updated initial conditions for both meteorological and aerosol fields are then used to initialize a coupled forecast. Specifically, the cross-component elements of the forecast error covariance matrix are not considered during the data assimilation update. Finally, a strongly coupled data assimilation system conducts both data assimilation and forecast in a coupled sense. Since the cross-component elements of the forecast error covariance matrix are used in a strongly coupled data assimilation system, observational information from one component has the potential to influence other components.
Similar to WRF-Chem, the Colorado State University (CSU) Regional Atmospheric Modeling System (RAMS) is also an atmosphere-aerosol coupled modelling system that is capable of simulating aerosol and cloud microphysical processes across multiple atmospheric scales. In this study, RAMS is interfaced with an ensemble-based data assimilation system, the Maximum Likelihood Ensemble Filter (MLEF), for the first time. This atmosphere-aerosol coupled data assimilation system is referred to as RAMS-MLEF hereafter. Following the above discussion, the RAMS-MLEF system is configured as a strongly coupled system in the sense that a set of control variables is chosen to represent both the meteorological and aerosol fields. The goal of this study is to assimilate aerosol optical depth (AOD) retrievals in the RAMS-MLEF system to improve the representation and forecast of the spatial and temporal distribution of aerosol in the littoral zone.

AOD retrievals derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) provide near-global coverage of aerosol distribution. Specifically, retrievals from the Deep Blue (DB) algorithm over land (Hsu et al., 2006), along with those derived by the Dark Target (DT) algorithm over land (DT-land) and over ocean (DT-ocean), are the three major MODIS AOD retrievals that have been widely used in the research community. Nevertheless, retrieval of AOD over turbid coastal water has been challenging due to the large variability in ocean colour along coastal regions. Wang et al. (2017) developed an algorithm that takes advantage of the negligible radiances at 2.1 µm regardless of water turbidity and assumes that the aerosol single scattering properties over turbid water are similar to those over adjacent open ocean scenes. Their technique is referred to as the Coastal Water (CW) algorithm. As a consequence, the CW algorithm not only complements the existing DT-ocean algorithm by filling the observational gap in coastal regions, but also improves MODIS AOD retrieval evaluation against collocated Aerosol Robotic Network (AERONET) sites. In this study, 3-km DT-land and DT-ocean AOD retrievals at 550 nm (0.55 µm), along with 3-km CW AOD retrievals at the same wavelength, are used for data assimilation and are hereafter referred to as MODIS AOD retrievals.
The rest of the manuscript is organized as follows. Section 2 describes the development of interfaces that connect RAMS with MLEF to form an atmosphere-aerosol coupled data assimilation system, along with the implementation of an AOD observation operator for RAMS. Section 3 presents the case study, Section 4 describes the experimental design, and Section 5 presents the results. Finally, Section 6 provides a summary and conclusions.

RAMS model
RAMS is a multi-purpose mesoscale numerical prediction model that was developed at CSU (Cotton et al., 2003). Over the years, RAMS has undergone multiple upgrades, including improvements to its microphysics via the implementation of a cloud nucleation scheme (Saleeby and Cotton, 2004), an improved capability to assimilate lightning data (Federico et al., 2017), the implementation of a non-local boundary layer scheme (Gómez et al., 2016), and the development of an aerosol module (Saleeby and Van den Heever, 2013). Of these upgrades, the development of the RAMS aerosol module is directly related to this study.

A total of nine aerosol species are represented by the aerosol module in RAMS: i) submicrometer sulphate, ii) supermicrometer sulphate, iii) submicrometer mineral dust, iv) supermicrometer mineral dust, v) film-mode sea salt, vi) jet drop-mode sea salt, vii) spume-mode sea salt, viii) submicrometer regenerated aerosols, and ix) supermicrometer regenerated aerosols. For each aerosol type, the size is represented by a lognormal distribution given by

$$ n(r) = \frac{N}{\sqrt{2\pi}\, r \ln\sigma_g} \exp\left[-\frac{(\ln r - \ln r_g)^2}{2 \ln^2 \sigma_g}\right], \tag{1} $$

where $n(r)$ is the number concentration of aerosols of dry radius $r$, $N$ is the total number concentration of aerosols, $r_g$ is the geometric median radius of the lognormal distribution, and $\sigma_g$ is its geometric standard deviation. Although the shape of the size distribution described by (1) is fixed during a simulation, the distribution is allowed to translate along $r$. That is, as a result of sources and sinks of aerosol mass during a simulation, the distribution in (1) can shift toward larger or smaller values of $r$. In addition, the width of the size distribution is determined by $\sigma_g$, which behaves like the dispersion parameter of the Gamma size distribution used in microphysical development.
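The behaviour of the lognormal size distribution can be illustrated with a minimal sketch. It assumes the standard dn/dr form of a lognormal distribution, which may differ from the exact form used in RAMS; the function names and the numerical values (total number 100, median radii 0.05 and 0.5, geometric standard deviation 1.8) are hypothetical and chosen only for illustration. The sketch checks that integrating the density over radius recovers the total number N, and that changing the median radius shifts the distribution without changing its shape.

```python
import math


def lognormal_number_density(r, n_total, r_g, sigma_g):
    """Number density n(r) of a lognormal size distribution (assumed dn/dr form)."""
    ln_sigma = math.log(sigma_g)
    coeff = n_total / (math.sqrt(2.0 * math.pi) * r * ln_sigma)
    return coeff * math.exp(-(math.log(r / r_g) ** 2) / (2.0 * ln_sigma ** 2))


def total_number(n_total, r_g, sigma_g, steps=4000, span=10.0):
    """Integrate n(r) dr with the trapezoid rule in ln r over +/- span * ln(sigma_g)."""
    ln_sigma = math.log(sigma_g)
    lo = math.log(r_g) - span * ln_sigma
    hi = math.log(r_g) + span * ln_sigma
    step = (hi - lo) / steps
    total = 0.0
    for j in range(steps + 1):
        r = math.exp(lo + j * step)
        weight = 0.5 if j in (0, steps) else 1.0
        # dr = r d(ln r), so the integrand in ln r is n(r) * r
        total += weight * lognormal_number_density(r, n_total, r_g, sigma_g) * r * step
    return total


# Hypothetical values: same N and sigma_g, two different median radii
small = total_number(100.0, 0.05, 1.8)
large = total_number(100.0, 0.50, 1.8)  # same shape, shifted to larger r
```

Both integrals should return approximately the prescribed total number, which is one way to see that a shift in the median radius redistributes particles across sizes without creating or destroying them.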

MLEF
MLEF is an ensemble data assimilation algorithm based on control theory, developed by Zupanski (2005) and Zupanski et al. (2008). In other words, MLEF is a hybrid data assimilation algorithm with both variational and ensemble features. As illustrated in Fig. 1, during the forecast step MLEF generates an ensemble of forecasts to estimate the flow-dependent forecast error covariance, while during the analysis step the nonlinearity of the observation operators is accounted for through an iterative minimization of the cost function

$$ J(x) = \frac{1}{2}(x - x_f)^T P_f^{-1} (x - x_f) + \frac{1}{2}\left[y - h(x)\right]^T R^{-1} \left[y - h(x)\right], \tag{2} $$

where $x$ and $y$ denote the state vector and the observation vector, respectively. The subscript $f$ denotes the forecast, and the superscripts $T$ and $-1$ denote the transpose and inverse of a matrix, respectively. $P_f$ is the forecast error covariance and $R$ is the observation error covariance, which is often taken to be diagonal under the assumption that observation errors are not spatially correlated. $h$ denotes a collection of nonlinear observation operators.
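To make the two terms of the cost function concrete, the following toy sketch evaluates a cost function of this general form (a background term weighted by the inverse forecast error covariance plus an observation term weighted by the inverse observation error covariance). It is not the MLEF implementation: both covariances are assumed diagonal for brevity, whereas MLEF estimates the forecast error covariance from the ensemble, and the observation operator `h_product`, the function names, and all numbers are hypothetical.

```python
def cost_function(x, x_f, p_f_diag, y, r_diag, h):
    """Evaluate J(x) for diagonal P_f and R (an assumption made for brevity)."""
    # Background term: squared departure from the forecast, scaled by forecast error variance
    background = sum((xi - xfi) ** 2 / p for xi, xfi, p in zip(x, x_f, p_f_diag))
    # Observation term: squared departure of h(x) from y, scaled by observation error variance
    hx = h(x)
    observation = sum((yi - hxi) ** 2 / r for yi, hxi, r in zip(y, hx, r_diag))
    return 0.5 * background + 0.5 * observation


def h_product(x):
    """Hypothetical nonlinear observation operator: observes the product x0 * x1."""
    return [x[0] * x[1]]


x_f = [1.0, 2.0]       # forecast (first guess)
p_f_diag = [0.5, 0.5]  # forecast error variances
y = [3.0]              # single observation
r_diag = [0.1]         # observation error variance

j_forecast = cost_function(x_f, x_f, p_f_diag, y, r_diag, h_product)
j_candidate = cost_function([1.5, 2.0], x_f, p_f_diag, y, r_diag, h_product)
```

In this toy setting the candidate state fits the observation exactly at the price of a small departure from the forecast, so its cost is lower than that of the forecast itself. An iterative minimization searches for the state that best balances these two terms.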
A large part of this study focused on developing interfaces to connect RAMS with MLEF, resulting in the RAMS-MLEF system. This system is different from the CSU Regional Atmospheric Modeling Data Assimilation System (RAMDAS, Zupanski et al., 2005). The schematic diagram in Fig. 1 outlines the RAMS-MLEF system. Specifically, three major interfaces are implemented in MLEF: 1) I/O interfaces between MLEF and RAMS, 2) an interface that acts as a driver to call and run RAMS, and 3) an interface for observation operators that use input from RAMS for assimilation. In MLEF, observation operators for conventional atmospheric observations are adapted from the forward component of the Gridpoint Statistical Interpolation (GSI, Wu et al., 2002, Kleist et al., 2009) through a module (ATM in the orange box of Fig. 1). With that, atmospheric observations provided by the National Centers for Environmental Prediction (NCEP), such as the conventional observations within the NCEP Prepared Binary Universal Form for the Representation of meteorological data (PrepBUFR) dataset and satellite radiance data from various platforms, can be assimilated by MLEF in parallel with, or in conjunction with, the operational approach. However, the AOD observation operator embedded in the Community Radiative Transfer Model (CRTM, Han et al., 2006), which is part of the forward operators of GSI, was built for the Goddard Chemistry Aerosol Radiation and Transport (GOCART, Chin et al., 2000) aerosol species.
Therefore, there is a need to develop an AOD operator in MLEF specifically for the RAMS aerosol module.

AOD Observation Operator for RAMS
Following Liu et al. (2011) and Pagowski et al. (2014), the AOD at a given wavelength λ (nm) is calculated by

$$ \mathrm{AOD}(\lambda) = \mathrm{const} \times \sum_{i=1}^{N_{aero}} \sum_{k=1}^{k_{top}} E_{ext}(\lambda, n_r, r_{eff})_i \, c_{i,k} \, \frac{\Delta p_k}{g}, \tag{3} $$

where $\mathrm{AOD}(\lambda)$ represents the spectrally dependent AOD operator (unitless), $i$ is the index for aerosol species, $N_{aero}$ is the total number of aerosol species that contribute to the AOD calculation, $k$ is the index for model vertical levels, and $k_{top}$ is the model top level. $E_{ext}$ is the spectrally dependent mass extinction coefficient (m² g⁻¹), which is a function of the index of refraction $n_r$ and effective radius $r_{eff}$ (nm) of a given aerosol species. $c_{i,k}$ is the amount of aerosol species $i$ at level $k$, expressed as a mass mixing ratio (g of species per kg of dry air). $\Delta p_k$ is the pressure difference (mb) between two vertical levels $k$ and $k+1$, and $g$ is the acceleration due to gravity (m s⁻²). const is a constant of $10^5$ that results from unit conversion.

Of the nine aerosol species, six are used in (3), i.e., $N_{aero} = 6$, to calculate AOD in this study. Supermicrometer sulphate is not used because its contribution to AOD is small. The two regenerated aerosol species are not available because the experimental design configures RAMS with the microphysics parameterization disabled (see Section 4). The optical properties of the six aerosol species at 550 nm under dry conditions are provided in Table 1. The mass extinction coefficient is computed using Mie theory (Bohren and Huffman, 1983), which requires the assumption that aerosol particles are spherical. For each aerosol species, particles are first grown hygroscopically to equilibrium with the ambient relative humidity using κ-Köhler theory (Petters and Kreidenweis, 2007), and the refractive index is adjusted based on volume mixing with water. To reduce computational expense, a lookup table of the mass extinction coefficient as a function of ambient relative humidity (RH, %) is prepared for each of the six aerosol species at 550 nm (similar to Kliewer et al., 2018). The lookup table, which is plotted in Fig. 2, uses a 1 % RH interval. A simulated RH that falls between two integers (e.g., 85.6 %) is rounded to the nearest integer (e.g., 86 %).
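The lookup-table approach and the vertical summation in (3) can be sketched as follows. This is an illustrative outline rather than the RAMS-MLEF code: the function names, the assumption that each table holds one entry per integer RH from 0 to 100 %, the half-up treatment of ties, the clamping of out-of-range RH to the table bounds, and all numerical values are hypothetical. The unit-conversion constant is passed in as `const` rather than hard-coded, since its value depends on the unit conventions described in the text.

```python
import math


def nearest_rh_index(rh_percent, table_size):
    """Round RH (%) half-up to the nearest integer and clamp to the table range."""
    index = int(math.floor(rh_percent + 0.5))
    return max(0, min(table_size - 1, index))


def column_aod(ext_tables, mixing_ratio, rh, dp, const, g=9.81):
    """Double sum over species and levels: const * sum_i sum_k E_ext * c * dp / g."""
    total = 0.0
    for species, table in ext_tables.items():
        for c_ik, rh_k, dp_k in zip(mixing_ratio[species], rh, dp):
            # Mass extinction coefficient looked up at the nearest integer RH
            e_ext = table[nearest_rh_index(rh_k, len(table))]
            total += e_ext * c_ik * dp_k / g
    return const * total


# Hypothetical single-species column with two levels and a flat extinction table
tables = {"dust1": [2.0] * 101}       # one entry per integer RH, 0..100 %
mixing = {"dust1": [1.0e-3, 2.0e-3]}  # mass mixing ratio per level
rh_profile = [30.0, 85.6]             # relative humidity (%) per level
dp_profile = [10.0, 10.0]             # pressure thickness per level
aod = column_aod(tables, mixing, rh_profile, dp_profile, const=1.0, g=10.0)
```

With a precomputed table, the cost of each AOD evaluation reduces to one index computation and one multiply-add per species and level, which matters because the operator is applied to every ensemble member at every minimization iteration.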

Case Study
Situated in one of the major dust sources of the world, the so-called dust belt, the Arabian Peninsula regularly experiences dust storms. In particular, dust storms are found to be more frequent in summer over the southern Arabian Peninsula (Jish Prakash et al., 2015). On 4 August 2016, two distinct dust plumes occurred: one plume was advected offshore of the United Arab Emirates (UAE) to the central portion of the Persian Gulf (referred to as the Persian Plume), and the other was located in the interior of Saudi Arabia (referred to as the Saudi Plume). The Saudi Plume was well detected by Meteosat Second Generation (MSG) imagery with a dust enhancement (Miller et al., 2017) applied, which reveals the dust in yellow (Fig. 3a). At times, dust plumes are also evident in satellite imagery from the so-called reflective bands. A true colour image generated from MODIS aboard Aqua exhibits the Persian Plume (Fig. 3b). Note that neither image reveals both dust plumes: the dust enhancement captures only the Saudi Plume (bright yellow) and not the Persian Plume (no yellow), while the true colour image captures only the Persian Plume and not the Saudi Plume. The presence of the Persian Plume in the true colour image (Fig. 3b) indicates that the IR-based dust detection algorithm missed the Persian Plume (Fig. 3a).

As pointed out by Jish Prakash et al. (2015), the Arabian Peninsula is an under-sampled region in terms of observations. As a result, reanalysis data from the Global Forecast System (GFS) are used to provide some meteorological fields at 1200 UTC 4 August 2016 (Fig. 4). As indicated in Fig. 4, the environment of the Saudi Plume is characterized by northerly flow and relatively low total precipitable water (TPW) (~25 mm). In contrast, the Persian Plume is in an environment characterized by southeasterly winds and TPW values in excess of 45 mm. Additional details regarding the case study can be found in Miller et al. (2018).

In this study, the RAMS-MLEF system is used to study the above-mentioned two-plume dust event over the Arabian Peninsula on 4 August 2016. As mentioned earlier, RAMS serves as the forecast component of the RAMS-MLEF system.
The configuration of RAMS used for this case study is now described. Only one grid is used, with 15 km grid spacing in both horizontal directions.

There are many prognostic variables in RAMS, a few of which are discussed here. Due to the complexity of potential temperature ($\theta$) at cloud boundaries (Tripoli and Cotton, 1981), ice-liquid water potential temperature ($\theta_{il}$) is used in place of $\theta$. In addition, $\theta_{il}$ depends on the mass mixing ratios of the microphysical species. One may consider any of the observed microphysical species, such as cloud droplets, and the particles within observed "blowing dust", such as clay particles, as aerosol suspended in the Earth's atmosphere. As a first step, RAMS is run dry; that is, simulated microphysical species remain inactive. As a result, water vapour is the only prognostic water species in the experiments discussed below; another consequence is that $\theta_{il}$ is equivalent to $\theta$. In addition, the absence of simulated microphysical species simplifies the interpretation of the assimilation of observed AOD. Since AOD depends on RH, removing the microphysical species that compete for water vapour makes the assimilation results easier to interpret. In other words, water vapour is the only water species that interacts with the six aerosol species from which AOD is computed at 550 nm.

For the case study herein, blowing dust occurred in a somewhat cloud-free area, as discussed in Section 3; therefore, there is some observational support for running the forecast model dry. As described in Section 2.3, RH is used in the calculation of AOD. Since RAMS is run dry, diagnosed values of RH in areas where upward motions occur (i.e., sloping terrain) can exceed 100 % because no condensate is allowed to form and water vapour is conserved. As a consequence of RH > 100 %, simulated values of AOD were approximately one order of magnitude larger than observed values near sloping terrain in the experiments. Such large differences between simulated and observed AOD create biases that are large enough to cause rejection of AOD observations that could otherwise have been assimilated. To alleviate the large biases introduced by running the model dry, only grid points with RH values less than 100 % are included in the vertical summation of (3) during the computation of AOD.

The configuration of MLEF used for this study is described here. A time-lagged methodology (Zupanski et al., 2006) is used to generate an initial set of 32 ensemble forecasts from RAMS, which are valid at 0000 UTC 3 August 2016. During the cost function minimization, a generalized quasi-Newton algorithm is used (Zupanski et al., 2008). The control variables include the three-dimensional wind components (u, v, and w), perturbation Exner function (pi), ice-liquid water potential temperature ($\theta_{il}$), total water (water vapour and all condensate types) mass mixing ratio (rtp), and the two mineral dust modes (dust1 and dust2). Because RAMS uses a leapfrog scheme, there exist two temporal solutions for some RAMS prognostic variables, and RAMS includes a procedure designed to prevent the two temporal solutions from diverging. MLEF alters the prognostic variables on only one of the temporal solutions. To prevent the two temporal solutions from diverging after the update, a methodology is used in MLEF that keeps the difference between the two time solutions "close".

The observations assimilated in the two experiments come from NCEP PrepBUFR and 550 nm MODIS AOD retrievals. In Fig. 5, the NCEP PrepBUFR dataset used in both experiments is displayed over the RAMS domain described earlier. Note that the majority of the dataset is available only at the surface (green, blue, and orange symbols), while rawinsondes (red symbol) are the single source of data on the vertical structure of the atmosphere. Owing to their availability, MODIS AOD retrievals are assimilated at the 2nd, 3rd, and 6th cycles (0600 UTC and 1200 UTC 3 August and 0600 UTC 4 August) in the ATMAOD experiment. In this study, the observation error for the AOD retrievals is 0.1 (unitless). Similar to Liu et al. (2011), the observation error is increased by 5 % (15 %) of the AOD value when it is over land (ocean). To reduce the effects of spatial observation error correlation, data thinning is applied to the AOD retrievals prior to assimilation. For a given cycle, AOD retrievals are thinned so that only every third pixel of a given retrieval image is retained. Once spatial thinning is completed, quality control is applied. During quality control, the so-called gross check removes AOD retrievals whose difference from the first guess is large (more than three times the prescribed observation error).
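The observation preprocessing described above (observation error assignment, thinning, and gross check) can be sketched as follows. This is a hedged illustration rather than the actual RAMS-MLEF preprocessing: the function names are hypothetical, thinning is assumed to keep every third pixel in each image direction, and the land and ocean error fractions are taken directly from the values quoted in the text.

```python
def thin_every_third(image):
    """Keep every third pixel along both image directions (assumed interpretation)."""
    return [row[::3] for row in image[::3]]


def aod_obs_error(aod, over_land, base=0.1):
    """Base error of 0.1 plus 5 % (land) or 15 % (ocean) of the AOD value."""
    fraction = 0.05 if over_land else 0.15
    return base + fraction * aod


def passes_gross_check(obs, first_guess, error, factor=3.0):
    """Reject retrievals that differ from the first guess by more than factor * error."""
    return abs(obs - first_guess) <= factor * error


# Hypothetical 6 x 6 retrieval image with pixel values 0..35
image = [[6 * i + j for j in range(6)] for i in range(6)]
thinned = thin_every_third(image)

land_error = aod_obs_error(1.0, over_land=True)
ocean_error = aod_obs_error(1.0, over_land=False)
kept = passes_gross_check(1.0, 0.6, land_error)
rejected = not passes_gross_check(1.0, 0.5, land_error)
```

Because the gross check threshold scales with the observation error, a retrieval with a larger assigned error is retained for a wider range of departures from the first guess.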

Observed vs. Model Equivalent AOD Before and After Data Assimilation
As one means of evaluating the success of a data assimilation experiment, it is common practice to examine the model equivalent of the observed quantity before and after the assimilation. In Fig. 6, the assimilated MODIS AOD retrievals (thinned and quality controlled) are presented along with the model equivalent AOD computed from the first guess of both the ATMONLY and ATMAOD experiments and from the analysis of the ATMAOD experiment, all valid at 0600 UTC 4 August 2016, the 6th cycle of both experiments. At first glance, there is an obvious lack of AOD retrievals over the Arabian Peninsula (Fig. 6a), although adjacent areas such as the Red Sea, Persian Gulf, and Gulf of Oman have some coverage. This is because the AOD retrievals were derived from the MODIS data shown in Fig. 3b, in which the Persian Plume was well detected while the Saudi Plume was missing. In terms of the two plumes of interest, the first guesses of the ATMONLY (Fig. 6b) and ATMAOD (Fig. 6c) experiments have different representations of the AOD distribution and magnitude, although both have the two plumes in place. After assimilating the AOD retrievals, the analysis of the ATMAOD experiment further strengthens the areal extent and magnitude of the Persian Plume as well as the AOD signals in the Gulf of Oman. However, due to the lack of observations indicative of aerosol (AERONET data were not available during the period of the experiments for the Mezaira and Masdar Institute sites, the two sites located in the UAE), a quantitative verification of the above-mentioned adjustment is not feasible. Instead, a different perspective for evaluating the data assimilation experiments is provided by examining an information measure.

Information Measure in Data Assimilation
As pointed out by Zupanski et al. (2007), it is useful to compare information measures obtained in different data assimilation approaches. One information measure that is often used in information theory as well as data assimilation is the degrees of freedom for signal (DFS, Rodgers, 2000), which is defined by

$$ \mathrm{DFS} = \mathrm{trace}\left[I_{state} - P_a P_f^{-1}\right], \tag{4} $$

where trace is the trace operator of a square matrix, $I_{state}$ is an identity matrix of dimension $N_{state} \times N_{state}$ ($N_{state}$ is the dimension of the state vector), and $P_a$ is the analysis error covariance of the same dimension as $I_{state}$. As indicated by (4), DFS measures the forecast error reduction due to new information brought to the assimilation by observations. Following Zupanski et al. (2007), the information matrix in ensemble subspace, $C$, is introduced as

$$ C = Z^T Z, \qquad z_i = R^{-1/2}\left[h(x + p_i^f) - h(x)\right], \tag{5} $$

where vector $z_i$ is the $i$th column of the matrix $Z$, $C$ is of dimension $N_{ens} \times N_{ens}$ ($N_{ens}$ is the number of ensemble members), and vector $p_i^f$ is the $i$th column of the square root forecast error covariance matrix $P_f^{1/2}$. With this, (4) can be further reduced, in terms of the eigenvalues $\lambda_i$ of $C$, to

$$ \mathrm{DFS} = \sum_{i=1}^{N_{ens}} \frac{\lambda_i}{1 + \lambda_i}. \tag{6} $$

According to (6), the value of DFS is non-negative and ranges between 0 and $N_{ens}$. A value close to zero suggests that there is minimal reduction of uncertainty due to the assimilation of observations, i.e., negligible impact from assimilating observations. On the other hand, a value near $N_{ens}$ would be optimal. Note that a DFS value equal to $N_{ens}$ is almost impossible because it is very difficult to have a set of ensemble members that are linearly independent.
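The ensemble-subspace computation of DFS can be illustrated with a small pure-Python sketch. It assumes that the information matrix is formed as C = Z^T Z from the columns z_i, as in Zupanski et al. (2007), and evaluates DFS as trace[(I + C)^-1 C], which is mathematically equal to the sum of eig/(1 + eig) over the eigenvalues of C. The function names and the toy columns are hypothetical, and the explicit matrix inverse is only suitable for small matrices like these.

```python
def information_matrix(z_columns):
    """C = Z^T Z, where z_columns[i] is the i-th column z_i of Z."""
    n = len(z_columns)
    return [[sum(a * b for a, b in zip(z_columns[i], z_columns[j]))
             for j in range(n)] for i in range(n)]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def degrees_of_freedom_for_signal(z_columns):
    """DFS = trace[(I + C)^-1 C] = N_ens - trace[(I + C)^-1]."""
    c = information_matrix(z_columns)
    n = len(c)
    i_plus_c = [[c[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                for i in range(n)]
    inv = invert(i_plus_c)
    return n - sum(inv[i][i] for i in range(n))


# Two independent members: C = diag(1, 4), so DFS = 1/2 + 4/5
dfs_independent = degrees_of_freedom_for_signal([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
# Two identical members: C has eigenvalues 4 and 0, so DFS = 4/5
dfs_dependent = degrees_of_freedom_for_signal([[1.0, 1.0], [1.0, 1.0]])
```

The second case shows why linear dependence among ensemble members keeps DFS below the ensemble size: a dependent member contributes a zero eigenvalue and therefore no additional degree of freedom.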
Using the formulation in (6), DFS was computed for the ATMONLY and ATMAOD experiments. In Fig. 7, DFS computed from the analysis of the 6th cycle of the ATMONLY experiment and that of the ATMAOD experiment are displayed side by side. The maximum value of DFS is 1.67 for the ATMONLY case (Fig. 7a) and 3.74 for the ATMAOD case (Fig. 7b). Neither value is large, i.e., neither is close to $N_{ens} = 32$. However, assimilating AOD appears to reduce uncertainty more than assimilating atmospheric observations alone. In addition, the areal extent of non-zero DFS values is generally larger in the ATMAOD experiment, especially over the Red Sea, Persian Gulf, and Gulf of Oman, where MODIS AOD retrievals were available (Fig. 6a).
Additional information brought by AOD not only reduces the forecast uncertainty but also introduces substantial adjustments to the first guess, which are often referred to as analysis increments (analysis minus first guess) in the field of data assimilation.

Analysis Increment
In this section, analysis increments of some of the control variables (Section 4) from the ATMONLY and ATMAOD experiments are discussed. Fig. 8 illustrates the analysis increments of total dust (dust1 + dust2, µg kg⁻¹), ice-liquid water potential temperature (K), total water mass mixing ratio (g kg⁻¹), and horizontal wind components (m s⁻¹) from the two experiments at model level 11, which is approximately 1 km above ground. A side-by-side comparison of the analysis increments from the two experiments reveals considerable adjustments from assimilating AOD retrievals. Analysis increments from the ATMONLY experiment, in general, have a smaller spatial extent and are negligible in magnitude, which is likely a consequence of the paucity of conventional atmospheric observations in this region. In contrast, analysis increments from the ATMAOD experiment are evident along the Persian Gulf, Iraq, western and northern Iran, the Red Sea, eastern Sudan, the Gulf of Aden, and the northern Arabian Sea. However, due to the lack of AOD retrievals, no increments are found in the interior of the Arabian Peninsula. While it is often challenging to identify a relationship between the analysis increments of one variable and those of another, positive increments of total dust appear to match the locations of positive increments of horizontal wind speed (Figs. 8a, b, g, and h). This finding suggests that there exists a positive correlation between the dust variables and horizontal winds in the forecast error covariance.

Impact on Forecast
The analyses of the ATMONLY and ATMAOD experiments valid at 0600 UTC 4 August 2016 are each used to initialize a 12-h forecast. Due to the lack of observations over the Arabian Peninsula, the aerosol reanalysis product contained in the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2, Gelaro et al., 2017, Randles et al., 2017) dataset is used in place of the true state of aerosol. As a reanalysis product, MERRA-2 includes the assimilation of AOD retrievals from various satellite instruments such as MODIS, the Advanced Very High Resolution Radiometer (AVHRR), and the Multi-angle Imaging SpectroRadiometer (MISR), as well as direct measurements of AOD from ground-based AERONET. In addition, MODIS AOD retrievals that were not assimilated in the two experiments are used along with the MERRA-2 aerosol reanalysis product.

In Fig. 9, the AOD field from the MERRA-2 product (Fig. 9a) and the MODIS AOD retrievals valid at 1200 UTC 4 August 2016 (Fig. 9b) are displayed alongside the model equivalent AOD computed from the 6-h forecasts of the ATMONLY (Fig. 9c) and ATMAOD (Fig. 9d) experiments. Similar to the findings in Fig. 6, both the Persian Plume and the Saudi Plume are captured by the ATMONLY forecast and the ATMAOD forecast. However, both forecasts underestimate the areal extent of the Saudi Plume overall, although the magnitude and areal extent of the Persian Plume are better represented by the ATMAOD forecast. A similar trend is found in the 12-h forecast (Fig. 10), in which the ATMAOD forecast continues to represent the Persian Plume more accurately, while the Saudi Plume is poorly captured by both the ATMAOD and ATMONLY forecasts.

Summary and Conclusions
The goal of this research is to improve aerosol weather forecasts by assimilating atmospheric and aerosol observations. To that end, an atmosphere-aerosol coupled data assimilation system, RAMS-MLEF, is developed and tested. RAMS-MLEF uses the forward component of GSI as the observation operator for atmospheric observations. However, the AOD observation operator embedded in GSI is specifically designed for GOCART aerosol species. An AOD observation operator for RAMS aerosol species is therefore required and has been built for this study.

This study demonstrates the capability of RAMS-MLEF as an atmosphere-aerosol coupled data assimilation system using a single case study over the Arabian Peninsula. This area is known to be severely under-sampled. Specifically, conventional observations are sparse and are restricted to ground stations, and satellite observations from polar-orbiting platforms are limited due to their infrequent revisits to any specific location. One important lesson learned from this study is that the location and timing of observations largely determine the improvements achieved by data assimilation. The lack of observations also makes it very challenging to perform a quantitative verification of the results obtained from data assimilation. This issue may be addressed in future work by assimilating AOD retrievals over deserts and turbid coastal water from geostationary satellites, as they can further constrain dust source functions as well as improve dust transport.

Figure 8: Analysis increment at cycle 6, valid at 0600 UTC 4 August 2016, of a) total dust (dust1 + dust2, µg kg⁻¹), c) total water mass mixing ratio (rtp, g kg⁻¹), e) ice-liquid water potential temperature ($\theta_{il}$, K), and g) horizontal wind components (u and v, m s⁻¹) from the ATMONLY experiment. b), d), f), and h) are the corresponding increments from the ATMAOD experiment.