An improved dust emission model. Part 2: Evaluation in the Community Earth System Model, with implications for the use of dust source functions

The complex nature of mineral dust aerosol emission makes it a difficult process to represent accurately in weather and climate models. Indeed, both measurements and the new physically-based dust emission parameterization presented in the companion paper indicate that many large-scale models underestimate the dust flux’s sensitivity to the soil’s threshold friction velocity. We hypothesize that this finding explains why many dust cycle simulations are improved by using an empirical dust source function that shifts emissions towards the world’s most erodible regions. Here, we both test this hypothesis and evaluate the performance of the new dust emission parameterization. We do so by implementing the new emission scheme into the Community Earth System Model (CESM) and comparing the resulting dust cycle simulations against an array of measurements. We find that the new scheme shifts emissions towards the world’s most erodible regions in a manner that is strikingly similar to the effect of implementing a widely-used source function based on satellite observations by the Total Ozone Mapping Spectrometer. Furthermore, model comparisons against aerosol optical depth measurements show that the new physically-based scheme produces a statistically significant improvement in CESM’s representation of dust emission, which exceeds the improvement produced by implementing a source function. These results indicate that the need to use an empirical source function is (partially) eliminated, at least in CESM, by the additional physics in the new scheme, and in particular by its increased sensitivity to the soil’s threshold friction velocity. Since the threshold friction velocity is affected by climate changes, our results further suggest that many large-scale models underestimate the global dust cycle’s climate sensitivity.


Introduction
Mineral dust aerosols affect the Earth system through a wide variety of interactions, including scattering and absorbing radiation, altering cloud lifetime and reflectance, and serving as a nutrient source (Martin et al., 1991;Miller and Tegen, 1998;Forster et al., 2007). Conversely, the global dust cycle is highly sensitive to changes in climate (Tegen et al., 2004;Mahowald et al., 2006b;Washington et al., 2009), as evidenced both by global dust deposition being several times larger during glacial maxima than during interglacials (Rea, 1994;Harrison et al., 2001) and by the apparent increase in global dust deposition over the past century (Prospero and Lamb, 2003;Mahowald et al., 2010). The radiative forcing resulting from such changes in the dust cycle might have played a critical role in amplifying past climate changes (Jansen et al., 2007), and may play an important role in present and future climate changes (Harrison et al., 2001;Mahowald et al., 2010).
Unfortunately, an accurate quantification of dust interactions with the Earth system in past and future climates is hindered by the empirical nature of dust emission parameterizations in climate models. Since these parameterizations are generally tuned to reproduce the current dust cycle (Ginoux et al., 2001;Zender et al., 2003a;Cakmur et al., 2006), applying them to a past or future climate, with substantial differences in global circulation and land surface, could produce large systematic errors. In particular, many dust modules in climate models use a dust source function S (Ginoux et al., 2001;Zender et al., 2003b;Grini et al., 2005;Koven and Fung, 2008) to help account for global variations in soil erodibility (defined in the dust modeling community as the efficiency of a soil in producing dust aerosols under a given wind stress (Zender et al., 2003b)).
The flux of dust emitted through wind erosion in a model grid cell is thus commonly represented by (Ginoux et al., 2001;Zender et al., 2003a;Grini et al., 2005;Colarco et al., 2014)  where C tune is a global tuning constant, usually set to maximize agreement against observations (Cakmur et al., 2006), and F d is the vertical dust flux produced by an eroding soil per unit time and area, as predicted by a dust emission parameterization such as Gillette and Passi (1988), Marticorena and Bergametti (1995), or Kok et al. (2014). The dust source function S is a function of latitude and longitude, and usually shifts emissions towards the world's most erodible regions, such as North Africa (Ginoux et al., 2001;Zender et al., 2003b).
The need to use a source function to improve agreement against observations was first noted by the pivotal study of Ginoux et al. (2001). They used the observation of Prospero et al. (2002) that dust "hot spots" tend to be co-located with topographic depressions to design a source function based on the relative height of a model grid cell compared to its surrounding cells. However, some subsequent studies challenged this association of dust hot spots with topographic depressions by Prospero et al. (2002) because (i) the used remote sensing product from the Total Ozone Mapping Spectrometer (TOMS) is sensitive to boundary layer height, which tends to be higher over depressions in central desert regions (Mahowald and Dufresne, 2004), and because (ii) advection causes the remotely sensed dust loading to be shifted downwind from source regions . Nonetheless, the use of source functions, and the consequent shift of emissions towards regions with observed high dust loadings (Ginoux et al., 2001;Prospero et al., 2002), can substantially improve the agreement of dust cycle simulations with measurements (Zender et al., 2003b;Cakmur et al., 2006). This improvement of dust cycle simulations by semi-empirical source functions suggests that a key piece of physics is missing from existing dust emission parameterizations in models. And indeed, dust flux parameterizations in most large-scale models account for variations in soil erodibility in mostly empirical manners. The dust emission parameterization of Gillette and Passi (1988), which is used in the Goddard Chemistry Aerosol Radiation and Transport (GOCART) model (Ginoux et al., 2001) and many other models (Huneeus et al., 2011), does not account for the effect of either sediment availability or other soil properties on the dust flux. Similarly, the dust flux parameterization of Marticorena and Bergametti (1995), which is used in simplified form in the Dust Entrainment And Deposition model (Zender et al., 2003a) that is used in many climate models (Huneeus et al., 2011), accounts for the effect of fine sediment availability on the dust flux using an empirical fit to data from a single study (Gillette, 1979). Since such empirical parameterizations and source functions cannot accurately capture changes in soil erodibility produced by climate changes, which for instance affect soil moisture content and soil aggregation (Zobeck, 1991;Fecan et al., 1999;Shao, 2008;Kok et al., 2012), their use could cause substantial errors in model estimates of climate-induced changes in the global dust cycle.
This paper and its companion paper  strive to take a step towards an improved representation of the global dust cycle in climate models, in particular for climate regimes other than the current climate to which most models are tuned (Cakmur et al., 2006). The first component of this objective was achieved in our companion paper , by presenting a physically-based theory for the vertical dust flux emitted by an eroding soil. The resulting parameterization (henceforth referred to as K14) was able to reproduce a quality-controlled compilation of dust flux measurements with substantially less error than the existing dust flux parameterizations we were able to test against, and is relatively straightforward to implement since it uses only globally-available parameters. A critical insight from our companion paper is that the dust flux is likely substantially more sensitive to changes in the soil state than most climate models account for. The resulting underestimation of the dust flux sensitivity to the soil state might explain why an empirical source function that shifts emissions towards more erodible regions improves agreement against measurements (Cakmur et al., 2006). Since the new K14 scheme accounts for the increased sensitivity of the dust flux to the soil state, our companion paper hypothesized that K14 can reduce the need to use a source function in dust cycle simulations.
Here we use simulations with the Community Earth System Model (CESM) to both test the above hypothesis, and to evaluate the performance of the K14 parameterization in a climate model. Through a detailed comparison against measurements, we find that the K14 scheme produces a statistically significant improvement in the representation of dust emission in CESM, and that this improvement exceeds that produced by a source function. These results indicate that the additional physics accounted for by K14, which result in an increased sensitivity of the dust flux to the soil's threshold friction velocity, reduces the need for a source function in dust cycle simulations, at least in CESM. Since the threshold friction velocity is affected by climate changes, our finding that many models underestimate the dust flux's sensitivity to this threshold further suggests that these models have underestimated the global dust cycle's climate sensitivity.

Methods
We use simulations with CESM version 1.1 (Hurrell et al., 2013) to evaluate the performance of the K14 dust emission scheme. Specifically, we simulate the present-day dust cycle with four different combinations of source functions and dust flux parameterizations (see Table 1). In order to assess whether K14 improves the representation of dust emission in CESM, we compare the simulation results against measurements of aerosol optical depth (AOD) by the AERosol RObotic NETwork (AERONET; Holben et al. (1998)), as well as to satellite-derived estimates of dust optical depth and dust mass path close to source regions (Evan et al., 2014). Note that the code to implement the K14 parameterization in CESM is freely available from the main author.
We also evaluate how each of the four dust cycle simulations performs against measurements further from source regions, namely dust surface concentration and dust deposition. Since these measurements were thus generally taken far from source regions, the simulation performance against these measurements depends on the model's ability to simulate a variety of other processes in addition to dust emission, especially transport and deposition.
The next section briefly describes CESM and its treatment of the dust cycle for each of the four simulations (Table 1). We then describe the properties of the data sets used to evaluate the model performance in Sections 2.2 and 2.3. In Section 2.4, we describe the method to assess whether improvements in the ability of the different simulations to reproduce these measurements are statistically significant. Table 1. Summary of the four CESM simulations used in this study, and the statistics of their comparison against the data set most characteristic of dust emission, namely AERONET AOD measurements at dusty stations (see text). Model results are compared to measurements of AERONET AOD climatology (fourth and fifth columns), the mean correlation to the measured seasonal cycle at each station (sixth column), and the mean correlation to the measured daily variability at each station (seventh column). Statistically significant improvements (see Section 2.4) of Simulations II-IV relative to the 'control' Simulation I are indicated with bold font. Additionally, simulation results that are statistically significantly improved over the results of each of the other three simulations are both bold and underlined. atmosphere model, the Community Atmosphere Model version 4 (CAM4), to calculate the threedimensional transport and deposition of dust, as well as the dust aerosol optical depth (Mahowald et al., 2006b;Albani et al., 2014). In addition to accounting for the global dust cycle and the consequent optical depth produced by dust aerosols (see next section), CESM also includes the effects of other kinds of aerosols, including sea salt, biomass burning, and sulfate aerosols. Black and organic carbon, dimethyl-sulphide, and sulphur oxides emissions are prescribed based on AeroCom specifications (Neale et al., 2010), whereas sea salt aerosol emission is prognostic, based on 10 m wind speed and humidity (Mahowald et al., 2006a).

General treatment of the dust cycle in CESM
The emission of dust aerosols in CLM4 follows the treatment of Zender et al. (2003a), with modifications described in Mahowald et al. (2006bMahowald et al. ( , 2010, and further adjustments described below. Specifically, the vertical dust flux d φ in a model grid cell is parameterized using Eq. (1), with the source function S and the vertical dust flux F d given in Table 1 for the four simulations (also see 2.1.2). We adjust the global tuning factor C tune to maximize agreement against AERONET AOD measurements (see Section 2.1.2).
The threshold friction velocity u *t ' at which dust emission is initiated is critical to determining dust emissions. The value of u *t ' depends on air density, soil properties, and the presence of non-erodible roughness elements (Marticorena and Bergametti, 1995;Shao and Lu, 2000;Kok et al., 2014). However, CLM4 does not account for the effect of non-erodible roughness elements on dust emissions, and the effect of air density on u *t ' is limited. Consequently, u *t ' in CLM4 is mostly determined by soil moisture content; the treatment of the effect of soil moisture on u *t ' follows Eqs. (12) and (14) in Fécan et al. (1999): where u *t ' and u *dt ' are respectively the threshold friction velocities in the presence and absence of soil moisture, and u *dt ' is calculated following the semi-empirical relation of Iversen and White (1982), as described on p. 3 of Zender et al. (2003a). Furthermore, w is the gravimetric water content in percent for CLM4's top soil layer, which has a thickness of 1.75 cm (Oleson et al., 2010). The threshold gravimetric water content w′ of the top soil layer above which w increases u *t ' is given by (Fecan et al., 1999;Zender et al., 2003a) ( ) 2 clay clay where w′ is given in percent, b is a tuning parameter introduced by Zender et al. (2003a), and f clay is the soil's clay fraction, which is taken from the FAO (2012) soil database (see Fig. S1).
The larger the value of the tuning parameter b, the smaller the effect of soil moisture on the dust emission threshold u *t '. The range of plausible values of b extends from less than 1, to 1 (i.e., no tuning constant; Fecan et al. (1999)) to 3 (Mokhtari et al., 2012) to 1/f clay (Zender et  2003a). Since dust emissions are non-linear in u *t ' (e.g., Kok et al. (2014)), and since u *t ' is a critical variable in the K14 dust emission scheme tested in this paper, the choice of b can be expected to substantially affect the simulated dust cycle. Unfortunately, the 'correct' value of b is highly uncertain, in part because the parameterization of Fecan et al. (1999) is based on wind tunnel studies. Implementing this small-scale parameterization into a climate model scales it up by many orders of magnitude, potentially producing physically unrealistic results. Furthermore, the inhibition of dust emission by soil moisture depends on the moisture content of the top layer of soil particles (McKenna Neuman and Nickling, 1989), which is in direct contact with the surface air. In contrast, the top soil layer of hydrology models in climate models usually has a thickness of multiple centimeters and thus responds differently to precipitation and changes in atmospheric humidity, which are important in determining the dust emission threshold (Ravi et al., 2004;Ravi et al., 2006). The 'correct' value of b in a climate model is therefore likely to depend substantially on the model methodology, and in particular on the specifics of the model's hydrology module. Since the choice of b is thus ambiguous, we investigated the sensitivity of our results to the particular value of b by running simulations with a wide range of values (Table S1). We emphasize that other models using the K14 scheme should do a similar sensitivity study to avoid an unrealistic influence of the model's hydrology on the dust emission fluxes. Because we found that the simplest case of not using a tuning constant (i.e., b = 1) produces the best overall results for all four model configurations, we used b = 1 for the results reported in Section 3. But note that the wide range of values of b that we tested all produced results qualitatively similar to those presented here (see Table S1).
In addition to the effects of soil moisture, CLM4 also accounts for the inhibition of dust emissions by vegetation. Specifically, CLM4 assumes that the fraction of the grid cell consisting of bare soil capable of emitting dust aerosols (f bare ) decreases linearly with the leaf area index (LAI), which is the ratio of the total surface area of leaves with the land surface area. That is, where λ denotes LAI, and λ thr = 0.3 is the threshold LAI above which no dust emission is assumed to occur . (Note that the Ginoux et al. (2001) source function already includes the effects of vegetation, such that f bare is set to 1 for all grid cells in Simulation III (see Table 1) to prevent accounting for the effects of vegetation twice.) After CLM4 has calculated the dust flux, CAM4 distributes the emitted dust aerosols into 4 size bins (Mahowald et al., 2006b), from 0.1 -1.0 µm, 1.0 -2.5 µm, 2.5 -5.0 µm, and 5.0 -10 µm. The fraction distributed into each bin follows the 'brittle fragmentation' dust size distribution derived in Kok (2011b), which is in good agreement with a wide range of measurements . The emitted dust size distribution does not depend on the wind speed at emission, as shown by measurements (Kok, 2011a). The optical properties for each bin are specified in Albani et al. (2014) and are derived from a representation of dust as an internal mixture of the primary mineral classes of dust (quartz, aluminosilicates, clays, carbonates, iron-bearing minerals), combined into an effective medium using the Maxwell Garnett approximation (e.g., Videen and Chylek (1998)). The proportions of the mineral classes are consistent with the ranges reported in atmospheric dust and its parent soils (Claquin et al., 1999), and they are in agreement with bulk optical properties observed in dusty regions (Albani et al., 2014). The resulting radiative effects of dust aerosols do not feed back onto the simulated atmospheric dynamics .   6   190  191  192  193  194  195  196  197  198  199  200  201  202  203  204  205  206  207  208  209  210  211  212  213  214   215  216  217  218  219  220  221  222  223  224  225  226  227  228  229  230  231  232 CAM4 simulates both dry and wet deposition of dust. Dry deposition includes turbulent and gravitational settling, and follows the treatment in Zender et al. (2003a). Wet deposition accounts for in-and below-cloud scavenging and follows Neale et al. (2010) with the modifications described in Albani et al. (2014), which improve the model's ability to simulate the observed spatial gradients of dust. Specifically, the dust solubility (i.e., the fraction of dust available for incloud removal) was changed from 0.15 to 0.30, in line with a more recent version of the model (Liu et al., 2012). In addition, instead of using a constant below-cloud scavenging coefficient (collection efficiency) of 0.1 (Balkanski et al., 1993;Neale et al., 2010), the scavenging coefficient was made size-dependent (Andronache, 2003;Zender et al., 2003a), and was set to 0.1 for dust diameters below 2.5 μm and 0.3 for larger dust particles.

Dust emission schemes of the four CESM simulations
We used CESM to conduct four simulations, each with a different dust emission scheme ( Table  1). The 'control' Simulation I uses CESM's default dust emission parameterization of Zender et al. (2003a) (henceforth referred to as Z03) and does not use a source function; Simulations II and III then respectively add the source functions of Zender et al. (2003b) and Ginoux et al. (2001); and Simulation IV replaces the Zender et al. (2003a) parameterization with the K14 parameterization and also does not use a source function.
For Simulations I -III, F d thus follows Z03, which is essentially a simplified version of the Marticorena and Bergametti (1995) parameterization. It is given by where C MB is a dimensionless proportionality constant, ρ a is the air density, g is the gravitational acceleration, and the sandblasting efficiency η (units of m -1 ) depends on the soil clay fraction f clay (see Fig. S1) following . Note that, whereas the soil friction velocity u * is defined from the wind stress on the bare erodible soil, the friction velocity u * ' is defined by the wind stress on the entire surface, so including the stress on non-erodible roughness elements (see Kok et al. (2014) for exact definitions). But since Z03 (and thus CLM4) does not account for the presence of non-erodible roughness elements, we have that u * = u * ', and u *t = u *t ', where u *t and u *t ' are respectively the threshold friction velocity and the threshold soil friction velocity above which dust emissions occur.
Simulation IV uses the K14 dust emission parameterization , which is given by The standardized threshold friction velocity u *st is the value of u *t at standard atmospheric density (ρ a0 = 1.225 kg/m 3 ), that is, optimally-erodible soil, and measurements show that u *st0 ≈ 0.16 m/s (Kok et al., 2012). Furthermore, the dust emission coefficient C d is a measure of soil erodibility (i.e., a soil's ability to produce dust) that depends only on the standardized threshold friction velocity, and is defined as The dimensionless coefficients in Eq. (7) were determined in Kok et al. (2014) from comparison against a quality-controlled compilation of dust flux measurements, and are as follows: C α = 2.7 ± 1.0, C e = 2.0 ± 0.3 and C d0 = (4.4 ± 0.5) × 10 -5 .
As discussed in Kok et al. (2014), Eq. (7) accounts for the experimental observation that a more erodible soil produces a larger flux of dust per saltator impact. That is, per Eq. (7b), the dust flux F d increases exponentially with a decrease in the standardized threshold u *st . Consequently, the dust flux is substantially more sensitive to the soil's threshold friction velocity in K14 than in Z03 (Fig. 1), which is in agreement with measurements . We discuss the implications of the increased sensitivity of the dust flux to the threshold friction velocity in Sections 4.2 and 4.3.
The simulations used the capability of CESM to be forced with reanalysis winds instead of predicting winds, and used the ERA-Interim reanalysis meteorology from the European Centre for Medium-range Weather Forecasts (ECMWF) (Dee et al., 2011;Ma et al., 2013). CESM uses Monin-Obukhov Similarity Theory to obtain the friction velocity u * ' from the wind field (see section 5.1 in Oleson et al. (2010)). All simulations were run at a resolution of 1.9º latitude by 2.5º longitude, and cover the period 1994 to 2011 to produce substantial overlap with the data set of AERONET AOD measurements. The first year of each simulation was used as model spin-up and thus not used for analysis. m/s and for f clay = 15%, which is a typical value for dust emitting regions (see Fig. S1). The predicted dust fluxes include the global tuning factors that eliminates the bias against AERONET AOD measurements for Simulations I and IV, respectively (see Eq. 1 and Section 2.2.1).

2.2.Measurements used to evaluate the representation of dust emission in CESM
We evaluate the representation of dust emission in the four dust cycle simulations by comparing their predictions against dust cycle measurements close to source regions. Arguably the best available data to test a model's dust emission scheme are the extensive and accurate (Eck et al., 1999) AOD measurements by the AERONET network (Holben et al., 1998). In addition to these measurements, we also compare the simulation results against a satellite-derived estimate of dust optical depth and dust mass path (DMP) off the West African coast (Evan et al., 2014).

AERONET AOD measurements
AERONET sites use sun photometers to measure radiances at a range of wavelengths, which are then inverted to retrieve aerosol properties (Dubovik et al., 2002;Dubovik et al., 2006). For this study, we used the daily-averaged level 2.0 quality-assured AOD (pre and post-field calibrated and manually inspected), obtained from the Version 2 Direct Sun Algorithm. We select 'dusty' AERONET stations by only using stations for which our simulations indicate that over 50% of the annually-averaged AOD is due to dust aerosols (i.e., stations for which at least three of the four simulations find that over 50% of AOD is due to dust for the grid box in which the station is located; see Fig. S2). Furthermore, for each station, we select only days for which the Angstrom exponent α (in the 440-870 nm wavelength range) is smaller than 1 (Eck et al., 1999;Dubovik et al., 2002). Since α is a measure of particle size, with smaller values indicating coarser aerosols, values of α < 1 indicate that a substantial fraction of AOD is due to dust (Eck et al., 1999;Dubovik et al., 2002), which is a relatively coarse aerosol. Choosing a different plausible cut-off for AE does not qualitatively affect our results. Finally, we select only stations for which at least 6 months of data (i.e., at least 183 days) is available over the simulation period.
The above procedure resulted in the selection of 42 'dusty' AERONET stations: 18 in North Africa, 4 in the North Atlantic, 11 in the Middle East, 6 in the rest of Asia, and 3 in Australia S5). Comparisons between simulated and measured AOD at these stations is sensitive to the value of the parameter C tune (see Eq. 1), which scales the size of the global dust cycle. Because of the many uncertainties in parameterizing dust emission on both small scales (Barchyn et al., 2014;Kok et al., 2014) and the much larger global model grid box scale (Cakmur et al., 2004), the value of C tune is poorly constrained (Cakmur et al., 2006;Huneeus et al., 2011). We therefore choose the value of C tune for each of the four simulations such that the bias vanishes. That is, we determine C tune by forcing the average modeled optical depth at AERONET stations to equal the average observed optical depth: where i sums over the N = 42 'dusty' AERONET stations, and τ model is the AOD in the visible wavelength (550 nm) simulated at the AERONET station location, the component of which that is due to dust aerosols scales with C tune . The measured AERONET AOD (τ meas ) is obtained at 550 nm, the central wavelength in the visible spectrum, by correcting the AOD measured at 675 nm to 550 nm using the measured value of the Angstrom exponent α. That is, where τ 675 is the measured AOD at 675 nm, and λ 550 and λ 675 equal 550 and 675 nm, respectively.
We use the above procedure to generate three quantitative comparisons between the simulated and the measured AERONET AOD. First, we obtain the measured 'climatological' AOD for each station over the period of the model simulations (1995)(1996)(1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011) by averaging over all days for which AERONET data is available (subject to the quality-control criteria discussed above). For each station, we compare this average measured AOD to the simulated AOD averaged over the same days. Second, we obtain a comparison for the seasonal cycle of AOD by (i) calculating the measured monthly-averaged AOD for months with at least 10 days of data, (ii) calculating the corresponding simulated monthly AOD averaged over the same days, and (iii) averaging all simulated and modeled monthly-averaged AOD values for each of the 12 months in the year, and comparing them. And, third, we obtain a comparison between daily-averaged variations in AOD at each station. In order to prevent mismatches between model time and local time, we restrict the analysis of daily-averaged AOD variations to the 31 stations with longitude between 60W -60E.

Satellite-derived estimates of dust mass path (DMP)
In addition to assessing the fidelity of the four different dust emission parameterizations through comparisons against AERONET AOD, we use recent satellite-derived estimates of dust optical depth and dust mass path (DMP; g/m 2 ) off the West African coast around Cape Verde by Evan et al. (2014). This study followed Kaufman et al. (2005) and Evan and Mukhopadhyay (2010) in separating the dust contribution to AOD from the (smaller) contributions of anthropogenic and marine aerosols for the long-term records of the Advanced Very High Resolution Radiometer (AVHRR) and the Moderate Resolution Imaging Spectroradiometer (MODIS) satellites. Evan et al. (2014) subsequently obtained the DMP by using the result of Kaufman et al. (2005) that a unit dust AOD corresponds to 2.7 ± 0.4 g/m 2 of suspended dust. Since DMP simulated at Cape Verde by climate models participating in the Climate Model Intercomparison Phase 5 (CIMP5) is strongly correlated to the simulated total North African emissions (Evan et al., 2014), and since the fraction of AOD contributed by dust in the study area exceeds the 50% threshold for which we consider the area 'dusty'(see Section 2.2.1 and Fig. S2), we use the dust AOD and the DMP off the West African coast as additional tests of the dust emission component of CESM.

2.3.Other measurements of the dust cycle
The measurement comparisons described in the previous section are designed to evaluate possible improvements in CESM's representation of dust emission. This section describes measurement comparisons to test whether such improvements propagate into an improved simulation of dust cycle properties further from source regions. Specifically, we compare the model results against measurements of dust surface concentration (Prospero and Nees, 1986) and deposition (Albani et al., 2014), which are generally taken far from source regions.

Surface concentration measurements
We compare the simulation results against a data set of dust concentration measurements at the surface. These data are a compilation of measurements taken in the North Atlantic during the Atmosphere-Ocean Chemistry Experiment (AEROCE; Arimoto et al. (1995)) and in the Pacific Ocean during the sea/air exchange program (SEAREX; Prospero et al. (1989)). Since the majority of these stations measure long-range transported dust, the ability of the model to reproduce these measurements depends on the accuracy of model parameterizations of several processes in addition to dust emission, including transport, and wet and dry deposition. We only use stations where CESM predicts at least some dust (> 0.05 µg/m 3 on an annual basis), such that the comparisons between the four simulations are meaningful. This results in a total of 15 stations (see Fig. S6).
The data sets from AEROCE and SEAREX were obtained by drawing large volumes of air through a filter when the wind was onshore and not very light (i.e., > 1 m/s); this was done to help reduce the effects of anthropogenic aerosols on the measurements (Prospero et al., 1989). The mineral dust fraction of the collected airborne particulates was determined either from their Al content (assumed to be 8%, corresponding to the Al abundance in the Earth's crust), or by burning the sample and assuming the ash residue to represent the mineral dust fraction (Prospero, 1999).
For each station, we first calculate the average dust concentration for each of the 12 months in the year using the data provided by Joseph Prospero to the AeroCom project (Joseph Prospero and Nicolas Huneeus, personal communication, 2014). We then calculate a 'climatological' average for each site by averaging over the seasonal cycle. We compare these measures to the seasonal cycle and 'climatological' dust concentration predicted by CESM over the period 1995 -2011 (Table 2). We emphasize that, since the data of the AEROCE and SEAREX campaigns were taken for different dates at each site in the period 1981-2000, the 'climatology' derived from these measurements is for a different period than that of the model simulation, introducing systematic errors of unknown size. Similar comparisons in previous studies have suffered from the same problem (e.g., Huneeus

Deposition measurements
We further compare the simulation results against the data set of dust deposition measurements compiled by Albani et al. (2014), which consists of 110 stations. This compilation was produced by merging pre-existing datasets (Ginoux et al., 2001;Lawrence and Neff, 2009;Mahowald et al., 2009), and retaining only measurements representative of modern climate (i.e., excluding sites representing a sediment flux integrated over hundreds or thousands of years). Furthermore, the size range of the measured deposition flux was adjusted to be consistent with the dust size range simulated in CESM of 0.1 -10 µm (Albani et al., 2014).
Since the deposition fluxes represent an integration over long time scales (generally years to decades), we can only compare the global distribution of the deposition fluxes to those simulated by CESM, and cannot compare seasonal or daily variations. Specifically, we compare the measured annual deposition flux at each station against the mean annual deposition flux simulated by CESM with the various dust emission configurations over the period 1995 -2011 (Table 1). As with the concentration comparison, the dates over which the measurements were obtained were by and large not coincident in time with the simulations, introducing systematic errors of unknown size. Furthermore, note that model representation of the dust deposition flux depends on the realistic simulation of a variety of processes. In addition to dust emission and transport, this notably includes wet and dry deposition, which is generally poorly captured by models (Huneeus et al., 2011).

2.4.Bootstrapping method to assess statistical significance of model improvements
We quantify the model performance for the range of comparisons outlined above by calculating the Pearson correlation coefficient r and the root mean square error (RMSE) (Bevington and Robinson, 2003;Wilks, 2011). An increase in r or a decrease in RMSE thus denotes an improvement in the model's ability to reproduce a given data set. To assess whether such an improvement is statistically significant, we use the bootstrap method (Efron, 1982). That is, we randomly select a number of samples from the 'distribution' of the observed data points, equal to the total number of data points. This is done with replacement, such that each data point can be chosen several times, or not at all. For each of the four simulations, we then calculate the value of r and RMSE for the model comparison with this randomly-chosen sample. By repeating this procedure a large number of times, the uncertainty of any arbitrary statistical measure can be estimated (Efron, 1982). In our case, we assess the significance of a model improvement by checking whether the statistical measure (r or RMSE) is improved for ≥ 95% of the bootstrap samples, which is equivalent to the p-value of ≤ 0.05 generally used to represent statistical significance (e.g., Wilks (2011)).

Results
This section presents the results of the four simulations. Below, we first present and qualitatively discuss the spatial patterns of the source functions, and the resulting dust emissions and dust AOD. Section 3.2 then reports the quantitative comparison of simulation results against aerosol optical depth measurements in dusty regions, after which Section 3.3 presents comparisons against dust cycle measurements further from source regions (i.e., dust concentration and deposition measurements).

3.1.The spatial patterns of source functions, dust emissions, and dust AOD
One of the objectives of the present study is to evaluate the hypothesis that use of the K14 scheme reduces the need for a source function in dust cycle simulations. We investigate this hypothesis by comparing the simulated spatial pattern of the dust emission coefficient C d in K14, which scales the dust flux in Simulation IV, to the Zender et al. (2003b) and Ginoux et al. (2001) source functions (Figs. 2 and S3). As such, we interpret C d as a physically-based and temporallyvariable source function because C d scales the dust emission flux in K14 in a manner that is similar (but not identical) to that of a source function (compare Eqs. 1 and 7a). We find that the large-scale spatial pattern of C d , which is highly anti-correlated to the soil moisture content of the top soil layer (Fig. S4), is strikingly similar to that of the Ginoux et al. (2001) Fig. 2b (Simulation III), this shift is empirically parameterized based on TOMS satellite observations (Ginoux et al., 2001). In contrast, in Fig. 2c (Simulation IV) this shift arises from the additional physics accounted for in K14 over Z03. Specifically, the shift of emissions to the most erodible regions arises from the greater sensitivity of the dust flux to the soil's threshold friction velocity (Fig. 1), which occurs because K14 accounts for a soil's increased ability to produce dust as it becomes more erodible. The spatial patterns of the source functions and the dust emission coefficient C d partially determine the dust flux , which in turn determines the dust optical depth . The simulation results show that application of a source function tends to shift dust emissions (Fig. 3) and dust AOD (Fig. 4) from less erodible regions, such as North America, to more erodible regions, such as North Africa. As hypothesized in our companion paper , and consistent with the similarity between C d and the Ginoux et al. (2001) source function (Fig. 2), replacing Z03 with K14 has an effect that is similar to the application of an empirical source function. That is, it shifts emissions and dust AOD to the most erodible regions, producing increases in emissions and AOD over most of North Africa, and decreases over less erodible regions such as North America and Southern Africa (Figs. 3d, 4d). Moreover, K14 shifts dust AOD within North Africa westward, as well as southward towards the 15º -25º N latitude belt. This appears to bring the simulations in better agreement with qualitative satellite observations of North African dust emitting regions (Prospero et al., 2002;Crouvi et al., 2012;Ginoux et al., 2012;Ashpole and Washington, 2013). Furthermore, applying K14 substantially increases dust emissions in Patagonia (Figs. 2c,3d,and 4d), which in the default version of CESM needs to be increased by about two orders of magnitude to match available observations (Albani et al., 2014).
The simulation results further show that dust emissions depend quite differently on soil properties for the different dust emission schemes. In particular, the soil clay fraction f clay has two competing effects, and the relative strength of these differ between the schemes. On the one hand, f clay scales the threshold moisture content above which additional soil moisture increases the soil's threshold friction velocity (see Eq. 4), and on the other hand f clay scales the dust emission flux (see Eq. 7a). Since the scaling of the dust flux with f clay is exponential in Z03, it overwhelms the effect of f clay on the threshold friction velocity, such that dust fluxes in Simulation I correlate strongly with f clay (compare Figs. 3a and S1). In contrast, in the K14 scheme the dust flux scales merely linearly with f clay , such that both effects of the clay fraction on the dust flux are important. Consequently, the effect of f clay on dust emissions in K14 is not straightforward, and the map of Simulation IV's dust emission fluxes (Fig. S2d) does not show an obvious correlation with the map of the clay fraction (Fig. S1). However, because the strong exponential dependence of the dust flux on f clay in Z03 is replaced by a linear dependence in K14, areas with low clay content, such as sand dunes, produce consistently more dust in Simulation IV than in Simulation I (compare Figs. 3d and S1). This brings the results of Simulation IV in better qualitative agreement with the observation by Crouvi et al. (2012) that a large fraction of North African dust plumes originate from sand dunes. Furthermore, since CESM does not use a map of the spatially variable aerodynamic roughness length to more accurately calculate the aerodynamic drag on the bare (erodible) soil, it is likely that improving the model by using such   13   459  460  461  462  463  464  465  466  467  468  469  470  471  472  473  474  475  476  477  478  479  480  481  482  483  484  485  486  487  488  489  490  491  492  493  494  495  496  497  498  499  500  501  502  503  504 a map Menut et al., 2013) would produce even higher fluxes in regions covered in sand dunes, since these regions tend to have low roughness length.   As discussed previously, the data sets of AOD at the 42 'dusty' AERONET stations (see Section 2.2.1) are arguably the most reliable currently available data to quantitatively evaluate the dust emission components of large-scale models. We therefore perform an extensive comparison against the AERONET AOD data sets (Table 1 and Figs. 5, S5). As expected, the application of the source functions of both Zender et al. (2003b) and Ginoux et al. (2001) improve model agreement against the average ('climatological') AERONET AOD data (Figs. 5a-c). However, a substantially larger and statistically significant improvement is obtained with Simulation IV, which uses the K14 scheme and does not use a source function (Fig. 5d and Table 1). In particular, Simulation IV produces improved agreement over North Africa, the Middle East, and Australia, and somewhat lesser agreement over Asia. parameterized following K14 instead of Z03). For each simulation, the root mean square error (RMSE) and correlation coefficient (r) are noted.

AERONET AOD seasonal cycle and daily variability
In addition to improving the representation of AOD climatology, applying K14 also produces a statistically significant improvement of CESM's simulation of the seasonal AOD cycle at the AERONET stations ( Fig. S5 and Table 1). Indeed, Simulation IV produces the best seasonal agreement for 25 of the 42 AERONET stations, including 15 of the 18 North African stations (Fig. S5). However, the influence of changes in the emission scheme on seasonal variability appears limited: for a given station, the standard deviation of the four correlations produced by the four different simulations is on average only 0.05, which is only a small fraction of the mean correlation coefficient of ~0.80 (Table 1). This indicates a limited influence of the emission scheme on the seasonal variability of AOD, which is likely primarily controlled by seasonal variations in soil moisture, wind, and vegetative soil cover. Consequently, the statistically significant improvement in Simulation IV's ability to reproduce the seasonal cycle of dust AOD results in only a modest improvement in the correlation coefficient (Table 1). The comparison of model results to the measured daily variability of dust AOD at the different AERONET stations produces results that are qualitatively similar to those for the seasonal cycle. That is, Simulation IV again produces a statistically significant improvement in the model's ability to reproduce the daily AOD variability (Table 1), and produces best agreement at 18 of the 31 available stations, including 13 of the 18 North African stations. However, the emission scheme appears to also have limited influence on the daily AOD variability, which is probably largely controlled by the daily variations in wind speed and in soil moisture, which controls the dust emission threshold (see Section 2.1.1). Indeed, for a given station, the standard deviation of the four correlations produced by the four different simulations is on average only 0.03, which is only a small fraction of the mean correlation coefficient for daily variability of ~0.45 (Table 1). Consequently, the absolute improvement in the correlation coefficient produced by the statistically significant improvement in the model's ability to capture daily variability in the AOD is again modest (Table 1). Evan et al. (2014) analyzed satellite data to estimate both the dust optical depth and the dust column mass path (DMP) off the West African coast over the region of 10º-20º N, 20º-30º W around Cape Verde. They found that most models that participated in the Coupled Model Intercomparison Project Phase 5 (CMIP5) underestimated the dust optical depth for this region, and that all models substantially underestimated the dust mass path. We find that the dust optical depth in Simulations I, III, and IV are close to the satellite estimates, and that all four simulations produce a DMP that is in better agreement with the satellite estimates than that of the CMIP5 models, which includes a previous version of CESM (Fig. 6). As already suggested by Evan et al. (2014), these improvements most likely occur because our newer CESM version uses the size distribution of Kok (2011b), which corrects a bias towards fine dust aerosols in most large-scale models. Consequently, using this size distribution results in a higher dust mass path per unit of dust AOD, which is in better agreement with measurements.

Dust optical depth and dust mass path off the West African coast
Although all four simulations are in better agreement with the Evan et al. (2014)  Simulations III and IV are consistent with the lower uncertainty range of the satellite estimates. The likely reason for the improved performance of Simulations III and IV is that these simulations shift emissions towards West Africa (see , resulting in a higher, and thus more realistic, value of DMP off the West African coast. Note that the simulated values of DMP and the dust AOD in Fig. 6 both depend on the tuning constant C tune scaling the dust emissions (see Eqs. 1 and 8). It's thus possible to obtain better agreement with these measures by choosing a different method to set C tune , although doing so would degrade the model's comparison against AERONET AOD (Figs. 5,S5). Also note that the dust optical depth and dust emissions are sensitive to dust radiative effects and dust optical properties (Perlwitz et al., 2001;Albani et al., 2014).

Figure 6.
Dust optical depth (left) and dust mass path (right) averaged over the region 10˚-20˚ N and 20˚-30˚ W. Satellite-derived estimates for AVHRR  and MODIS (2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012), as well as the ensemble average of CMIP5 models , are from Evan et al. (2014). Error bars denote the uncertainty on the satellite estimates, and the standard deviation of the available CMIP5 model results (23 models for the dust optical depth, and 11 models for the dust mass path; see Evan et al. (2014)). The dust optical depth and dust mass paths calculated from the four CESM simulations were averaged over the period 2001 -2011 to be most comparable to the MODIS results.

3.3.Comparisons against other measurements of the dust cycle
The previous section evaluated possible improvements in CESM's representation of dust emission through comparisons against measurements of aerosol optical depth close to source regions. In this section, we evaluate whether these improvements propagate into the simulation of other dust cycle properties further from source regions. Specifically, we compare the simulation results against measurements of dust surface concentration and deposition, which are generally taken further from source regions (Table 2). We find that Simulation IV, which uses the K14 scheme, produces a statistically insignificant improvement against both the surface dust concentration climatology and seasonal cycle (Figs. 7 and S6). We also find that Simulation IV produces agreement against dust deposition flux measurements that is slightly less than that of the other simulations (Fig. 8) (Zender et al. (2003b) source function), (c) Simulation III (Ginoux et al. (2001) source function), and (d) Simulation IV (no source function, and dust flux is parameterized following K14 instead of Z03). For each panel, the root mean square error (RMSE) and correlation coefficient (r) in log10-space are noted.

Discussion
The simulations reported in the previous section were designed to (i) evaluate the performance of the K14 dust emission scheme in CESM, and (ii) test the hypothesis in Kok et al. (2014) that the K14 scheme reduces the need to use a source function in dust cycle simulations. The next two sections discuss these two objectives, after which we discuss the implications of our results for the dust cycle's sensitivity to climate changes.

4.1.Evaluation of the K14 dust emission scheme in CESM
Relative to the 'control' (Sim. I), the simulation using K14 (Sim. IV) shows statistically significant improvements against measurements most characteristic of dust emission, namely (i) AERONET AOD climatology (Fig. 5), (ii) variations in AERONET AOD on daily and seasonal timescales (Table 1 and Fig. S5), and (iii) satellite-derived estimates of dust optical depth and dust mass path off the West African coast (Fig. 6). The representation of dust emission in Simulation IV is also statistically significantly improved over the simulations using either the Zender et al. (2003b) or the Ginoux et al. (2001) source function. We find that the improvement in CESM's dust emission module by the K14 scheme does not propagate into a statistically significant improvement of the model's ability to reproduce surface measurements of dust concentration (Figs. 7 and S6), or of dust deposition (Fig. 8). Since most stations in these data sets are far from source regions, this indicates that the improvement in CESM's dust emission scheme does not produce comparable improvements in simulated dust cycle properties further from source regions. This could be caused by model errors in other processes, particularly in dust transport and deposition. This is especially likely to explain the lack of improvement in the simulated dust deposition fluxes, which is known to be subject to large model errors (Huneeus et al., 2011). In addition, it is possible that tuning of parameters describing processes such as transport and deposition in previous model versions (e.g., Albani et al. (2014)) degrades the model performance with the new K14 parameterization.

4.2.Does the K14 scheme reduce the need to use a dust source function?
The main reason for the improved representation of dust emissions with K14 is likely that this scheme accounts for a soil's increased ability to produce dust as its erodibility increases . This results in an increased sensitivity of the dust flux to a soil's standardized dust emission threshold (Fig. 1), which quantifies a soil's susceptibility to wind erosion. Since the dust emission threshold in CESM is parameterized mainly in terms of soil moisture, the dust emission coefficient in the K14 scheme (Fig. 2c) (Fig. S4). Consequently, relative to the 'control' simulation with the Z03 scheme, the simulation with K14 produces a shift of emissions and dust AOD to hyper-arid -and thus typically highly erodible -regions such as North Africa (Figs. 2c, 3d, 4d). This effect is remarkably similar to that of applying the Ginoux et al. (2001) source function (Figs. 2b, 3c, 4c), which was developed to shift emissions to regions with high dust loadings observed by the TOMS satellite (Prospero et al., 2002). This similarity between the effects of the K14 scheme and the Ginoux et al. (2001) source function suggests that the K14 scheme replaces some of the empiricism introduced by this source function with an improved description of the physics of dust emission.
The K14 scheme thus reduces, or even eliminates, the need to use a source function in CESM. Although further work is needed to investigate whether the K14 scheme produces similar improvements in other large-scale models, the additional dust emission physics accounted for in K14 could aid in moving dust modules beyond the widespread use of empirical source functions. Such a transition towards a more physically-explicit treatment of dust emission is necessary for better understanding past and forecasting future changes in the global dust cycle: although empirical descriptions of dust emission can work well for the current climate, such descriptions are unlikely to accurately capture the effect of climate-driven changes in soil erodibility and other relevant factors.
Note that parallel efforts to improve the fidelity of dust modules might also help to reduce the reliance on empirical source functions in dust cycle simulations. Indeed, since many source functions were formulated for models that did not account for the spatial variability of the aerodynamic roughness length (Ginoux et al., 2001;Zender et al., 2003b), the use of drag partition schemes employing high-resolution maps of roughness length Chappell et al., 2010;Menut et al., 2013) might be particularly effective. In addition, a more accurate description of the land surface through higher resolution and more detailed soil data sets could further reduce the reliance on empirical parameterizations in dust modules.

4.3.Implications for the dust cycle's sensitivity to climate changes
An important result from Kok et al. (2014) is that current parameterizations in climate models likely underestimate the dust flux's sensitivity to the soil's threshold friction velocity, which is further supported by the results presented here (Figs. 1 -4). This underestimation might have important implications for evaluating the global dust cycle's response to climate changes. In particular, since soil erodibility is affected by climate, which partially determines the soil moisture content, aggregation state, and crusting of the soil (Zobeck, 1991;Kok et al., 2012), our results here and in Kok et al. (2014) suggest that many climate models underestimate the dust cycle's climate sensitivity.
This result could help explain a series of observations. For instance, climate models have difficulty reproducing the increase in North African dust emissions during the Sahel drought in the 1980s (Mahowald et al., 2002;Evan et al., 2014;Ridley et al., 2014), which is likely partially due to the underestimation of the dust flux's sensitivity to drought conditions (Fig. 1). Furthermore, an increased sensitivity of dust emissions to climate changes could help explain the large differences in the global dust cycle between different climates, such as the much larger dust deposition fluxes measured for the Last Glacial Maximum (Rea, 1994;Harrison et al., 2001), which climate models also have difficulty reproducing without positing large changes in source areas (Werner et al., 2002;Mahowald et al., 2006b).

Summary and Conclusions
We used CESM simulations to both evaluate the performance of the K14 dust emission scheme developed in the companion paper , and to test the hypothesis that the K14 scheme reduces the need to use an empirical source function in dust cycle simulations.
We find that implementing the K14 scheme has an effect that is strikingly similar to that of implementing the Ginoux et al. (2001) source function. That is, it shifts emissions and dust AOD towards the most erodible regions, especially North Africa (Figs. 2-4). Indeed, the spatial pattern of the dust emission coefficient C d , which scales the dust flux in K14 using only the dust emission threshold and can thus be interpreted as a physically-based and dynamic source function, is remarkably similar to that of the Ginoux et al. (2001) source function (Fig. 2). We further find that the K14 scheme improves CESM's representation of dust emission, as evidenced by statistically significant improvements of the model's ability to reproduce AERONET AOD measurements in dusty regions (Table 1, Figs. 5, S5) and satellite observations of dust optical depth and dust mass path off the West African coast (Fig. 6). These improvement substantially exceed those produced by implementing either the Zender et al. (2003b) or the Ginoux et al. (2001) source functions.
These results suggest that the K14 scheme replaces some of the empiricism introduced by the use of a source function with an improved description of the physics of dust emission. As such, the K14 scheme reduces, or even eliminates, the need to use a source function in dust cycle simulations in CESM. Further work is required to investigate whether the K14 scheme can similarly improve other large-scale models.
Our results further suggest that many large-scale models have used source functions to empirically account for a part of the sensitivity of the dust flux to the soil threshold friction velocity (Figs. 1 -4). Because climate changes affect this dust emission threshold, for instance by affecting the soil moisture content and soil aggregation (Zobeck, 1991;Fecan et al., 1999;Shao, 2008;Kok et al., 2012), many models might underestimate the dust cycle's sensitivity to climate changes. Since the K14 scheme accounts for this missing component of the dust flux' sensitivity to the soil threshold friction velocity, namely the effect that a more erodible soil will produce more dust per saltator impact , we expect that it can improve the accuracy of dust cycle simulations for past and future climates.
Accounting for the dust flux's increased sensitivity to the soil state will thus affect simulations of the global dust cycle's response to future climate changes. In particular, since arid regions are generally predicted to become drier in most climate models , accounting for the dust flux' increased sensitivity to the soil threshold friction velocity would likely produce an increase in the future global dust emission rate, and thus in the global dust radiative forcing, relative to simulations that do not account for this. Since the dust cycle is sensitive to a variety of processes, including CO 2 fertilization (Mahowald et al., 2006b), land use change (Ginoux et al., 2012), and changes in sediment availability (Harrison et al., 2001), a substantial body of further work is required to assess the dust cycle's response to future climate changes.