Optimizing CALIPSO Saharan dust retrievals

We demonstrate improvements in CALIPSO (Cloud–Aerosol Lidar and Infrared Pathfinder Satellite Observations) dust extinction retrievals over northern Africa and Europe when corrections are applied regarding the Saharan dust lidar ratio assumption, the separation of the dust portion in detected dust mixtures, and the averaging scheme introduced in the Level 3 CALIPSO product. First, a universal, spatially constant lidar ratio of 58 sr instead of 40 sr is applied to individual Level 2 dust-related backscatter products. The resulting aerosol optical depths show an improvement compared with synchronous and collocated AERONET (Aerosol Robotic Network) measurements. An absolute bias of the order of −0.03 has been found, improving on the statistically significant biases of the order of −0.10 reported in the literature for the original CALIPSO product. When compared with the MODIS (Moderate-Resolution Imaging Spectroradiometer) collocated aerosol optical depth (AOD) product, the CALIPSO negative bias is even smaller for the lidar ratio of 58 sr. After introducing the new lidar ratio for the domain studied, we examine potential improvements to the climatological CALIPSO Level 3 extinction product: (1) by introducing a new methodology for the calculation of pure dust extinction from dust mixtures and (2) by applying an averaging scheme that includes zero extinction values for the nondust aerosol types detected. The scheme is applied at a horizontal spatial resolution of 1° × 1° for ease of comparison with the instantaneous and collocated dust extinction profiles simulated by the BSC-DREAM8b dust model. Comparisons show that the extinction profiles retrieved with the proposed methodology reproduce the well-known model biases per subregion examined. The very good agreement of the proposed CALIPSO extinction product with respect to AERONET, MODIS and the BSC-DREAM8b dust model makes this dataset an ideal candidate for the provision of an accurate and robust multiyear dust climatology over northern Africa and Europe.


V. Amiridis et al.: Optimizing CALIPSO Saharan dust retrievals

Introduction
Since the launch of the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) instrument on board the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO; Winker et al., 2009) satellite in June 2006, global aerosol and cloud profiles have been provided to the scientific community through analysis of CALIOP backscatter observations at the operating wavelengths 532 and 1064 nm. CALIOP probes the atmospheric vertical structure, which is geometrically separated into layers (Vaughan et al., 2009), with each layer being characterized either as cloud or aerosol (Liu et al., 2009). For aerosol observations, a further discrimination into six subtypes (dust, marine, smoke, polluted dust, polluted continental and clean continental) is performed based on the layer-integrated attenuated backscatter and approximate particulate depolarization ratio, as well as the location of the measurement (either land or ocean; Omar et al., 2009). Based on the aerosol classification scheme, CALIPSO algorithms produce aerosol extinction and backscatter coefficients using a look-up table for the six aerosol types in order to define the aerosol-type-dependent lidar ratio (LR), a parameter that is required for the inversion of Level 1 attenuated backscatter coefficient profiles. The LRs are estimated from scattering calculations based on the definition of typical size distributions and refractive indices for each aerosol type, mostly drawn from analysis of global Aerosol Robotic Network (AERONET) observations (Omar et al., 2009).
Following the retrieval of extinction coefficient profiles, the aerosol optical depth (AOD) for each CALIPSO layer is obtained by integrating with respect to height. Validation studies performed so far to evaluate columnar CALIPSO estimates of AOD have revealed that CALIPSO is biased low with respect to other global observations (e.g., Redemann et al., 2012; Schuster et al., 2012; Omar et al., 2013). With regard to the Moderate-Resolution Imaging Spectroradiometer (MODIS) sensor, most studies emphasize a CALIPSO underestimation of the order of 0.1 over regions having a strong mineral dust presence, like the Mediterranean (e.g., Redemann et al., 2012). However, MODIS AOD accuracy decreases with cloud cover (e.g., Loeb and Manalo-Smith, 2005; Zhang and Reid, 2006); thus, trustworthy comparisons between CALIPSO and MODIS should contain only cloud-free MODIS retrievals, which constrains the correlative datasets to a small but more reliable number of coincidences. Recently, Schuster et al. (2012) compared CALIPSO AODs with ground-based retrievals using AERONET and found that the relative bias of CALIPSO with respect to 147 global Sun-photometric stations is −13 % when dust is present and −3 % when dust retrievals are not included in the analysis. The results reported in that study are based on the segregation of the dataset into different aerosol types based on the CALIPSO aerosol classification scheme. Although this aerosol classification scheme is yet to be thoroughly evaluated (e.g., Burton et al., 2013), the CALIOP depolarization sensor has proven to be a direct and robust means by which mineral dust can be identified (e.g., Omar et al., 2009), and thus the results reported in Schuster et al. (2012) are likely to be representative for this aerosol type.
In any case, a detailed evaluation of CALIPSO dust extinction profiles (rather than AODs) using ground-based Raman lidars would be the ideal way to evaluate the reported CALIPSO underestimations for dust and to investigate possible causes of such discrepancies. So far, only a small number of Level 2 CALIPSO evaluation studies using Raman lidars have been reported in the literature (e.g., Pappalardo et al., 2010). Most evaluation studies have been performed over Europe in the framework of the European Aerosol Research Lidar Network (EARLINET) and over North America using High Spectral Resolution Lidar (HSRL) airborne measurements during CALIPSO under-flights of the NASA B200 aircraft (e.g., Kacenelenbogen et al., 2011; Burton et al., 2013), i.e., at sites having complex aerosol mixtures that are not suitable for pure dust detection and therefore validation. It is only recently that Tesche et al. (2013) utilized ground-based Raman lidar measurements over Cape Verde, performed during the second Saharan Mineral Dust Experiment (SAMUM), in order to validate CALIPSO pure dust observations. The researchers reported an underestimation of the CALIPSO Level 2 product for the 532 nm extinction coefficient as high as 30 % and attributed the difference to the low dust LR value of 40 sr used in the CALIPSO algorithm at this wavelength.
The LR value of 40 sr has been estimated for CALIPSO by assuming typical size distributions and refractive indices obtained from AERONET dust sites and then applying scattering calculations using the discrete dipole approximation technique to account for nonspherical particles in terms of spheroids (Omar et al., 2009). This value has also been retrieved directly from CALIPSO observations of isolated dust layers. In particular, AOD constraints have been set for dust layers (Liu et al., 2008), allowing LRs to be retrieved. Liu et al. (2008) report an effective dust LR of the order of 41 ± 6 sr at various locations in the Saharan dust plume off the west coast of Africa, agreeing well with the dust model of Omar et al. (2009).
Although CALIPSO dust retrievals may appear to be self-consistent, comparisons with ground-based Raman lidar measurements of Saharan dust show considerable discrepancies with respect to the LR at 532 nm. Direct LR measurements of pure Saharan dust obtained during the SAMUM-1 experiment yield LRs of 55 ± 7 sr at 532 nm (Tesche et al., 2009a). Moreover, EARLINET reports a broad range of dust LRs from 30 sr to 80 sr across Europe (e.g., Mattis et al., 2002; Balis et al., 2004; Mona et al., 2006; Papayannis et al., 2008). This large dispersion in EARLINET LRs is mostly attributed to variations in the mixing of dust with other aerosol types, since the values are retrieved from the analysis of Raman lidar measurements during Saharan dust advection over lidar sites that are contaminated by local aerosol sources. Here, we use LR values calculated from the statistical analysis of pure dust elevated layers found in multiyear EARLINET observations. The analysis reveals LRs at 532 nm equal to 58 ± 8 sr. This value is also supported by recent AERONET calculations performed by Schuster et al. (2012) using a different methodology to that of Omar et al. (2009). The highest LRs obtained by Schuster et al. (2012), of the order of 58 sr, occurred at sites in Africa that are not located in the Sahel, while the lowest LRs, of the order of 43 sr, were found in the Middle East. Schuster et al. (2012) attributed the variability in the retrieved LR to the variability of the real refractive index of dust, which in turn is caused by the variability of the relative proportion of the mineral illite. Further evidence that the LRs of Arabian dust are significantly lower than those of Saharan dust has been recently provided by Mamouri et al. (2013) based on combined lidar/Sun-photometric observations of advected Arabian dust over Cyprus.
A possible explanation for the difference between the LR of 58 sr, which is closer to all reported values from SAMUM-1, EARLINET and Schuster et al. (2012), and the LR of 40 sr used in the CALIPSO retrieval algorithm has been given by Wandinger et al. (2010). This latter study showed that the LR of 40 sr used by CALIPSO is an effective value accounting for the increased atmospheric transmission caused by multiple scattering, and gives reasonable backscatter coefficients that compare well with ground-based observations. However, using the same value of 40 sr to convert backscatter into extinction coefficients introduces a systematic underestimation of extinction and AOD by 25–35 % (Wandinger et al., 2010; Tesche et al., 2013). The authors suggest that this artifact can easily be overcome by applying two different look-up values for the LR of mineral dust in the CALIPSO retrieval algorithm, i.e., an effective value of 40 sr for the backscatter retrieval and a single-scattering value of 55 sr for the backscatter-to-extinction conversion. In addition, the authors suggest that CALIPSO dust retrievals could be further optimized by applying the method introduced by Tesche et al. (2009b) for separating out the dust portion of the polluted dust CALIPSO aerosol type.
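As a rough consistency check on the quoted 25–35 % range, and assuming only that extinction is obtained from a given backscatter coefficient β as α = S·β (with S the LR used for the conversion), the fractional error from using 40 sr in place of a single-scattering value S can be written as follows. This is an illustrative calculation rather than a result taken from the cited studies.

```latex
\[
\frac{\alpha_{40} - \alpha_{S}}{\alpha_{S}}
  = \frac{40\,\beta - S\,\beta}{S\,\beta}
  = \frac{40}{S} - 1,
\qquad
S = 55\ \mathrm{sr} \;\Rightarrow\; \approx -27\,\%,
\qquad
S = 58\ \mathrm{sr} \;\Rightarrow\; \approx -31\,\%.
\]
```

Both values fall inside the 25–35 % underestimation range, and because AOD is the height integral of extinction, the same fractional error carries over to the AOD of a pure dust column.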
In this work, we investigate the possible improvement of CALIPSO dust retrievals by appropriately filtering CALIPSO Level 2 data and applying the LR value of 58 sr to CALIPSO backscatter retrievals. Moreover, we examine potential improvements on Level 3 climatological monthly means when accounting for pure dust only, by separating pure dust from both "polluted dust" and "dust" CALIPSO subtypes based on depolarization observations. The domain of our application is northern Africa and Europe, and we wish to note that this methodology cannot be applied to mineral dusts different from those advected from the Sahara. This point has been re-emphasized by the recent study of Schuster et al. (2012), which implied that the use of a spatially constant LR for all CALIPSO dust retrievals is inappropriate and would produce a positive bias for CALIPSO AODs over the Middle East, where the dust LR is lower than that for the Sahara (of the order of 43 sr). The data used in this study refer to a domain that excludes the Middle East, and are presented in Sect. 2. Methodologies followed for each comparison together with the corresponding results are presented in Sect. 3, and the paper closes with our conclusions in Sect. 4.

Data
The satellite and ground-based observations used in this study, together with their corresponding products and the dust model utilized for Saharan dust simulations, are described in this section.

The CALIPSO product
CALIOP, the principal instrument on board the CALIPSO satellite of the NASA A-Train, is a standard dual-wavelength (532 and 1064 nm) backscatter lidar operating a polarization channel at 532 nm (Winker et al., 2009), and has been acquiring global atmospheric profiles since June 2006. CALIOP measures high-resolution (1/3 km in the horizontal direction and 30 m in the vertical direction) profiles of the attenuated backscatter of aerosols and clouds at 532 and 1064 nm along with polarized backscatter in the visible channel (Winker et al., 2009). These data are distributed as a part of the CALIPSO Level 1 products. After calibration and range correction, cloud and aerosol layers are identified, and aerosol backscatter and extinction are retrieved at 532 and 1064 nm and delivered in the Level 2 product. In this study, we use the CALIOP Level 2 product, which is derived from the Level 1 product using a succession of algorithms that are described in detail in a special issue of the Journal of Atmospheric and Oceanic Technology (e.g., Winker et al., 2009). In brief, the CALIOP Level 2 retrieval scheme is composed of an algorithm for feature detection, a module that classifies features according to layer type (e.g., aerosol vs. cloud) and subtype, and, finally, an extinction retrieval algorithm that estimates the aerosol backscatter and extinction coefficient profile and total column AOD for an assumed LR for each detected aerosol layer. The CALIPSO Level 2 product determines the locations of layers within the atmosphere (Vaughan et al., 2009), discriminates aerosol layers from clouds (Liu et al., 2009), categorizes aerosol layers as one of six subtypes (dust, marine, smoke, polluted dust, polluted continental, and clean continental; Omar et al., 2009), and estimates the AOD of each layer detected (Young and Vaughan, 2009). Due to CALIOP's sensitivity to polarization at 532 nm, the depolarization arising from scattering from nonspherical dust particles serves as an independent means of discrimination between dust and other aerosol species.
In this study we use Version III.01 of the Level 2 product. The older Version II product reported aerosol spatial properties (in the layer product files) at a horizontal resolution of 5 km, and range-resolved aerosol optical properties (in the profile product files) at a horizontal resolution of 40 km. The new Version III data products report aerosol optical properties at the same 5 km horizontal resolution used for the spatial properties. However, the same optical properties retrieval strategy is used in both Version II and III of the CALIOP data products (Young and Vaughan, 2009).
Moreover, we use the methodology developed for the production of the Level 3 aerosol product (Winker et al., 2013) in order to derive 1° × 1° latitude-longitude monthly averaged vertical distributions. This methodology has been developed in order to produce the CALIPSO Level 3 product, in which the Level 2 532 nm aerosol extinction product is aggregated onto a global 2° × 5° latitude-longitude grid. The vertical resolution of the product is 60 m over the range of heights −0.5 to 12 km relative to the mean sea level. Mean extinction profiles are computed for dust-only and for all aerosol types. CALIOP retrieves aerosol below optically thin clouds, in clear skies and above clouds. Monthly-mean extinction profiles are computed for four conditions: daytime all-sky, daytime cloud-free, nighttime all-sky and nighttime cloud-free. In addition, several quality control flags contained in the Level 2 files are used to screen the data prior to averaging. A detailed summary of the methodology used for the production of the Level 3 product is provided in the Appendix of Winker et al. (2013).

The AERONET product
Ground-based AOD measurements from the well-known AErosol RObotic NETwork (AERONET) of NASA (Holben et al., 2001) are used for validation purposes in our study. AERONET Sun photometers provide directly measured AODs at seven wavelengths from UV to the near IR (approximately 340, 380, 440, 500, 675, 870, and 1020 nm) with an estimated uncertainty of 0.01–0.02 (Holben et al., 2001). In the present study, quality-assured direct-sun data (Level 2, Version II) in the wavelength range 440–870 nm are used.

The MODIS product
Level 3 gridded, daily mean AODs at 550 nm from MODIS on board the Aqua satellite are utilized in our study. Our selection of MODIS-Aqua rather than MODIS on board the Terra satellite is based on the fact that CALIPSO is flown in formation with Aqua as part of the A-Train satellite constellation, so that a large number of coincident observations are available from the CALIOP and MODIS-Aqua instruments. A detailed description of the MODIS aerosol product is given in, e.g., Remer et al. (2002), and the accuracy of MODIS AODs has been evaluated against ground measurements globally (e.g., Levy et al., 2010). Over sea surfaces, the accuracy of the AOD is ±0.03 ± 0.05 · AOD and is higher than that over vegetated land, ±0.05 ± 0.2 · AOD (Ichoku et al., 2002; Remer et al., 2005). Over land, errors larger than ±0.05 ± 0.2 · AOD can be found in coastal zones due to subpixel water contamination (Barnaba and Gobbi, 2004).
In this study, we use Level 3 1° × 1° gridded daily mean values of the AOD at 550 nm from Collection 5.1. In addition, we use information on the Level 2 counts used for the production of the Level 3 AOD in order to constrain our dataset to representative 1° × 1° Level 3 values, calculated from an adequate number of 10 km × 10 km Level 2 records. Deep Blue retrievals (Hsu et al., 2004) over bright surfaces such as deserts are ignored, since no information on the number of Level 2 cells used for the derivation of the Deep Blue Level 3 product is provided in the current version. Moreover, we utilize the total cloud coverage product in order to constrain our datasets to retrievals under almost cloud-free conditions, since the presence of clouds is a determining factor that strongly affects the accuracy of the algorithm retrieval and usually leads to a significant overestimation of the AOD (e.g., Zhang et al., 2005; Remer et al., 2008).

The BSC-DREAM8b dust model
Dust extinction and dust AOD at 550 nm simulated by the BSC-DREAM8b dust model are utilized in this work for comparisons with the CALIPSO Level 3 dust product in the domain of interest. BSC-DREAM8b (Nickovic et al., 2001; Pérez et al., 2006a, b) is a regional model designed to simulate and predict the atmospheric cycle of mineral dust aerosol. The model is fully embedded as one of the governing prognostic equations in the atmospheric NCEP/Eta model and solves the mass balance equation for dust taking into account the following processes: (1) dust production (Shao et al., 1993) including a viscous sublayer (Janjic, 1994), (2) horizontal and vertical advection, (3) turbulent and lateral diffusion (Janjic, 1994), (4) dry deposition and gravitational settling (Zhang et al., 2001), and (5) a simple below-cloud scavenging scheme (Nickovic et al., 2001). The model includes a source function based on the arid and semi-arid categories of the 1 km land-use dataset provided by the US Geological Survey (USGS), eight size bins within the 0.1–10 µm radius range according to Tegen and Lacis (1996), a source size distribution derived from D'Almeida (1987), as well as dust radiative feedbacks on meteorology (Pérez et al., 2006a).
In recent years, operational versions of the model have been used for dust forecasting and as a dust research tool in northern Africa and southern Europe (e.g., Pay et al., 2010; Kokkalis et al., 2012). Several case studies have highlighted the high capability of BSC-DREAM8b (e.g., Pérez et al., 2006a, b; Amiridis et al., 2009) with regard to both the horizontal and vertical extent of dust plumes in the Mediterranean Basin. The model has also been validated and tested over longer time periods in Europe (e.g., Basart et al., 2012) and against measurements in source regions during SAMUM-1 (Haustein et al., 2009) and the Bodélé Dust Experiment (BoDEx; Todd et al., 2008). Additionally, in order to improve the dust forecast and to implement operational products, daily evaluation with near-real-time (NRT) observations is conducted at the Barcelona Supercomputing Center (BSC) in collaboration with the Spanish Meteorological State Agency (AEMET). Currently, the NRT evaluation includes both satellite observations (MODIS and Meteosat) and AERONET Sun photometers.
The initial state of the dust concentration in the BSC-DREAM8b model is defined by the 24 h forecast from the previous-day model run. For the present study, global meteorological files (at 1° × 1°) at 00:00 UTC from the National Center for Environmental Prediction's Global Forecast System (FNL/NCEP) are used as initial conditions and boundary conditions at intervals of 6 h. The resolution is set to 1/3° in the horizontal and to 24 layers extending up to approximately 15 km in the vertical. The domain of simulation covers northern Africa, the Mediterranean Sea, southern Europe and the Middle East, and the output of model simulations is available hourly. The model outputs have been regridded to a horizontal resolution of 1° × 1° so as to be suitable for the present analysis.
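In BSC-DREAM8b, the AOD (τ(λ)) and the extinction coefficient (α(λ)) are related to column mass loading and mass concentration, respectively, by

$$\tau(\lambda) = \sum_{k=1}^{8} \tau_k(\lambda) = \sum_{k=1}^{8} \frac{3}{4 \rho_k r_k} M_k \, Q_{\mathrm{ext}}(\lambda)_k ,$$

$$\alpha(\lambda) = \sum_{k=1}^{8} \alpha_k(\lambda) = \sum_{k=1}^{8} \frac{3}{4 \rho_k r_k} C_k \, Q_{\mathrm{ext}}(\lambda)_k ,$$

where, for each size bin k, τ_k(λ) is the aerosol optical depth, α_k(λ) is the extinction coefficient, ρ_k is the particle mass density, r_k is the effective radius, M_k is the column mass loading, C_k is the concentration, and Q_ext(λ)_k is the extinction efficiency factor calculated using Mie scattering theory.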
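The per-bin relation can also be sketched numerically. The following is a minimal illustration, assuming that each size bin contributes 3·Q_ext·C_k / (4·ρ_k·r_k) to the extinction (and the same expression with M_k to the AOD), as implied by the quantities defined above. The function name and the two-bin sample values are hypothetical and are not BSC-DREAM8b parameters.

```python
import numpy as np


def sum_over_bins(load, q_ext, rho, r_eff):
    """Sum 3 * load * Q_ext / (4 * rho * r_eff) over the size bins.

    With load = mass concentration C_k (kg m^-3) the result is the
    extinction coefficient (m^-1); with load = column mass loading
    M_k (kg m^-2) it is the dimensionless AOD.
    rho is in kg m^-3 and r_eff in m.
    """
    load, q_ext, rho, r_eff = (np.asarray(x, dtype=float)
                               for x in (load, q_ext, rho, r_eff))
    return float(np.sum(3.0 * load * q_ext / (4.0 * rho * r_eff)))


# Hypothetical two-bin example (illustrative values only).
conc = [1.0e-7, 2.0e-7]      # mass concentration, kg m^-3
q_ext = [2.0, 2.0]           # extinction efficiency factor
rho = [2500.0, 2500.0]       # particle density, kg m^-3
r_eff = [1.0e-6, 2.0e-6]     # effective radius, m
alpha = sum_over_bins(conc, q_ext, rho, r_eff)   # extinction in m^-1
```

Passing column mass loadings instead of concentrations to the same function gives the AOD, which is why a single helper is used for both sums.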

Methods, results and discussion

Comparison methodology
In order to compare CALIPSO dust AODs with AERONET measurements for the 5 yr period of our analysis (2007 to 2011), we apply a method similar to that introduced by Schuster et al. (2012) to spatially collocate and synchronize CALIPSO and AERONET data. The spatial collocation is based on an acceptable closest approach between the CALIPSO overpass and the AERONET station, set to 80 km. The time synchronization of the observations is defined as a maximum 30 min difference between the CALIPSO closest approach and a single AERONET AOD measurement. The use of the AERONET Level 2 quality-assured product ensures the lowest possible contamination of the AOD measurement by clouds. In order to convert AERONET AODs to the CALIOP operating wavelength of 532 nm, the methodology introduced by Schuster et al. (2006) is applied. First, we use the CALIPSO AODs reported in the 5 km Level 2 product. Only 5 km cases with pure dust presence in the atmospheric column are accepted. Since our intention is to use only cloud-free profiles, it was a requirement that the CALIPSO cloud and aerosol detection (CAD) score for these profiles was lower than −20. Moreover, we required that the extinction quality control (QC) flag was equal to zero or 16, indicating that a successful extinction solution was achieved with the default LR assigned to each layer. Furthermore, we required that the aerosol extinction uncertainty was less than 99.9 km⁻¹. Finally, we required that CALIPSO surface elevations were within 100 m of the AERONET site in order to ensure that the optical path lengths for the CALIPSO and AERONET instruments were approximately equal. The locations used in this study were restricted to the domain of latitudes between 20 and 55° north and longitudes between −20 and 30°. As a result, 11 AERONET stations fulfilled the aforementioned requirements: Dakar, Caceres, Autilla, Chilbolton, Le Fauga, Dunkerque, Venise, Gustav Dalen Tower, FORTH Crete, Toravere and Eforie. The locations of the AERONET stations used in our study can be found at the AERONET website (http://aeronet.gsfc.nasa.gov/).
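The collocation and quality criteria listed above can be expressed as a single acceptance test per 5 km profile. The sketch below is illustrative only: the dictionary keys, the function names and the use of a haversine great-circle distance on a spherical Earth are assumptions, not the names or the distance formula used in the CALIPSO files or in our processing chain.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_accepted(profile, site_lat, site_lon, site_elevation_m):
    """Apply the distance, time and quality thresholds to one profile.

    `profile` is a dict with hypothetical keys: lat, lon,
    minutes_from_aeronet, cad_score, extinction_qc,
    extinction_uncertainty (km^-1) and surface_elevation_m.
    """
    distance = great_circle_km(profile["lat"], profile["lon"],
                               site_lat, site_lon)
    return (
        distance <= 80.0                                   # closest approach
        and abs(profile["minutes_from_aeronet"]) <= 30.0   # synchronization
        and profile["cad_score"] < -20                     # confident aerosol
        and profile["extinction_qc"] in (0, 16)            # default LR solution
        and profile["extinction_uncertainty"] < 99.9       # km^-1
        and abs(profile["surface_elevation_m"] - site_elevation_m) <= 100.0
    )
```

The pure-dust requirement for the whole column is not included here, since it depends on how the aerosol subtype flags of all layers in the profile are stored.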

Results and discussion
Considering homogeneous CALIPSO profiles where only dust is detected in the atmospheric column, we found 1203 profile coincidences with a distance less than or equal to 80 km from the reference AERONET stations located in the domain of our interest. In Fig. 1 (upper-left panel) we present the CALIPSO single 5 km AODs vs. the AERONET measurements for this dataset. A significant absolute bias (absolute difference of the means: averaged CALIPSO AOD minus the averaged AERONET AOD) of the order of −0.1 is revealed by our dataset. This absolute bias, found for Saharan dust, is almost identical to that reported by Schuster et al. (2012) for dust worldwide. In general, we find good agreement with Schuster et al. (2012) for both the absolute biases and all the statistical parameters of the comparison (e.g., relative biases of the order of −0.36 and large root mean square (rms) biases of the order of 0.25). As already stated in Schuster et al. (2012), the high correlations and large relative biases of the dust comparisons in their work (but also here) indicate that dust aerosols are generally being "typed" correctly over the AERONET sites, but that perhaps the LR assigned to dust is too low. Thus, the LR underestimation is believed to be the main factor affecting the CALIOP AOD underestimation, and the resulting absolute bias is expected to increase linearly with AOD. The expected linear increase with AOD is revealed here if we separate the 5 km CALIPSO absolute biases by AOD class (Fig. 1, upper-right panel). While absolute biases are affected more by larger AODs, relative biases with respect to AOD class consistently show a random variability around a −35 % average (Fig. 1, upper-right panel), which is what is expected when underestimating the LR by a factor of 0.7. The variability of the relative bias, however, implies that other artifacts may affect the comparison as well.
Recently, Omar et al. (2013) performed a detailed global CALIOP-AERONET comparison and found a number of discrepancies, including CALIOP's failure to correctly detect the aerosol layer base or failure to detect aerosol at all, misclassification of aerosol type, classification of dense aerosol layers as clouds, cloud contamination in both datasets and horizontal scene inhomogeneity, all of which affect the comparison. In order to account for these discrepancies, we screen our correlative dataset to ensure cloud-free conditions and scene homogeneity, acknowledging however that misclassifications are not likely for dust due to the CALIOP capability of detecting this aerosol type using its high-quality depolarization signatures. To be specific, we apply quality criteria to account for the CALIPSO scene inhomogeneity, which is depicted by the high variability of the Level 2 AODs presented in Fig. 1 for certain collocations (upper-left panel). The observed inhomogeneity most probably results from natural aerosol horizontal variability, or arises because the CALIPSO scene contains a number of 5 km products that have different optical path lengths due to removal of layers by the application of quality screening criteria or due to CALIPSO misdetections in general. To screen the latter effect on CALIPSO scenes used in our work, we examine the relationship between the mean AOD of each scene produced by averaging single 5 km columnar AODs (hereafter referred to as "AOD_AvgCol") and the respective AOD produced by averaging the extinction profiles into a mean extinction for the scene and then integrating to acquire a mean AOD (hereafter referred to as "AOD_IntOfMean"). The averaging procedure used to acquire a mean extinction profile representative for each scene follows the quality criteria defined in the Level 3 CALIPSO product. In addition, we exclude from our comparison scenes that are interrupted by clouds within an 80 km overpass distance from the AERONET station, and we keep only scenes with pure dust presence.
The comparison between AOD_IntOfMean and AOD_AvgCol for our dataset is presented in Fig. 1 (lower-left panel). In order to retain only homogeneous CALIPSO scenes, we exclude from our dataset those cases where the absolute difference between the two AOD retrieval methods is greater than 0.02 (blue squares). Moreover, to exclude cases of high aerosol spatial inhomogeneity, a threshold of 0.02 is applied to the standard deviation of the single AODs that enter the average (pink error bars). After screening our dataset, we find that the absolute biases are more clearly affected for larger AODs, while the relative biases with respect to AOD class show a lower variability around an average that is again of the order of −35 % (Fig. 1, lower-right panel). The slightly higher relative biases at AODs below 0.5 may result from possible layer detection artifacts that passed the 0.02 AOD thresholds.
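The two scene-averaged AODs and the 0.02 thresholds can be illustrated with a short sketch. It assumes a regular vertical grid, marks bins removed by quality screening with NaN, lets such bins contribute nothing to a column AOD, and ignores them when averaging a level; these handling choices, the function names and the use of the population standard deviation are assumptions made for illustration, not the exact Level 3 procedure.

```python
import numpy as np


def scene_aods(extinction, dz_km):
    """Return (AOD_AvgCol, AOD_IntOfMean, std of the column AODs).

    extinction: array (n_profiles, n_levels) in km^-1, with NaN marking
    bins removed by quality screening. dz_km: uniform level thickness.
    """
    ext = np.asarray(extinction, dtype=float)
    valid = ~np.isnan(ext)
    filled = np.where(valid, ext, 0.0)

    # Column-wise: integrate every 5 km profile, then average the AODs.
    column_aod = filled.sum(axis=1) * dz_km
    aod_avgcol = column_aod.mean()

    # Profile-wise: average each level over its valid samples, then integrate.
    counts = valid.sum(axis=0)
    mean_profile = np.divide(filled.sum(axis=0), counts,
                             out=np.zeros(ext.shape[1]), where=counts > 0)
    aod_intofmean = mean_profile.sum() * dz_km
    return aod_avgcol, aod_intofmean, column_aod.std()


def is_homogeneous(extinction, dz_km, max_diff=0.02, max_std=0.02):
    """Keep a scene only if both 0.02 thresholds are satisfied."""
    avgcol, intofmean, std = scene_aods(extinction, dz_km)
    return bool(abs(avgcol - intofmean) <= max_diff and std <= max_std)
```

For a scene without gaps the two AODs coincide by linearity, so a difference between them signals profiles with unequal optical path lengths, which is the effect the first threshold is meant to catch.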
The differences found in our CALIPSO-AERONET AOD comparison are of the order of what is expected when a LR of 58 sr is replaced by the lower value of 40 sr. Backscatter errors, on the other hand, are much less sensitive to the LR assumption for an elastic lidar. For example, Tesche et al. (2013) showed, in a study using 15 collocated/synchronous ground-based lidar measurements during CALIPSO overpasses, that CALIPSO retrievals work best for the 532 nm backscatter coefficient. However, for dust cases it was found that using the effective dust LR of 40 sr for the retrieval rather than the observed mean LR value of 55 sr in their dataset led to an underestimation of the 532 nm extinction coefficient by as much as 30 %. When backscatter values were corrected for the low LR (i.e., by multiplying the backscatter by the ratio 55/40), the agreement between ground-based and CALIPSO extinctions was significantly improved.
Here, we follow the same approach in order to investigate this potential improvement on our CALIPSO-AERONET comparison. We use the 532 nm backscatter coefficient retrievals of CALIPSO multiplied by the value 58/40. The mean LR of 58 sr is the value that we derive by processing multiyear EARLINET Raman lidar measurements of pure Saharan dust. To be more specific, this LR has been retrieved by in-depth investigation of more than 500 aerosol layers selected from measured dust profiles at 16 EARLINET stations. Layer boundaries have been determined by the application of the derivative method (e.g., Mattis et al., 2008) and, for each layer analyzed, mean optical properties have been retrieved and the BSC-DREAM8b dust model has been used in order to validate the dust origin of each dust layer. The analysis of the EARLINET observed dust layers revealed statistical average LR values of 58 ± 9 sr at 532 nm and 58 ± 11 sr at 355 nm, showing almost no wavelength dependence for this parameter. As mentioned earlier, these values are consistent with measured LRs over the Sahara during the SAMUM-1 experiment (e.g., Tesche et al., 2009b).
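The effect of the LR choice on the column AOD can be sketched as follows. This is a simplified illustration that assumes a uniform vertical grid and converts a backscatter profile to extinction with a single LR before integrating; the function name and the sample backscatter values are hypothetical.

```python
import numpy as np


def dust_aod_from_backscatter(backscatter, dz_km, lidar_ratio_sr=58.0):
    """Integrate extinction = LR * backscatter over uniform layers.

    backscatter: particulate backscatter profile in km^-1 sr^-1.
    dz_km: layer thickness in km.
    """
    extinction = lidar_ratio_sr * np.asarray(backscatter, dtype=float)  # km^-1
    return float(extinction.sum() * dz_km)


beta = [0.001, 0.002, 0.001]                          # hypothetical profile
aod_40 = dust_aod_from_backscatter(beta, 0.06, 40.0)  # original dust LR
aod_58 = dust_aod_from_backscatter(beta, 0.06, 58.0)  # Saharan dust LR
```

Because the conversion is linear, the ratio of the two AODs equals 58/40 = 1.45 regardless of the shape of the profile.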
Using our screened CALIPSO-AERONET dataset of 77 quality-controlled and homogeneous scenes, the AOD retrievals from CALIPSO (AOD_IntOfMean) using average LR values of 40 sr and 58 sr are compared with those from AERONET in the scatter plots presented in Fig. 2. On the left, the CALIPSO AOD_IntOfMean calculated from the original CALIPSO products is compared, while on the right, the same comparison is presented with the LR adjusted to 58 sr. A Pearson correlation coefficient of 0.91 reveals excellent agreement for both collocated datasets. Moreover, the use of the LR value of 58 sr improves the slope of the linear regression from 0.66 (for the original CALIPSO product) to 0.96 (for LR = 58 sr). Absolute biases between CALIPSO and AERONET AODs are reduced to −0.03 from −0.1, while the confidence parameters (t test scores and p values) show that the bias for the AODs computed with LR equal to 58 sr changes from statistically significant (with very high confidence for the original CALIPSO product) to nonsignificant (see also Table 1). We have to emphasize once again that the improvements refer only to the domain examined, i.e., the Sahara and Europe. When we apply our methodology over the Middle East (not shown here), the original CALIPSO product is in very good agreement with AERONET, a result that is in line with the recent comparison performed by Omar et al. (2013). Thus, an average LR of 40 sr applies well for that region, as already reported by Schuster et al. (2012) and according to further evidence provided by Mamouri et al. (2013), but is not appropriate for the Sahara and Europe.
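The statistical parameters quoted in the comparison can be computed along the following lines. The sketch assumes that the absolute bias is the difference of the means (CALIPSO minus AERONET), that the relative bias is this difference divided by the mean AERONET AOD, and that the regression is an ordinary least-squares fit of CALIPSO on AERONET; the relative-bias and regression definitions, as well as the function name and sample values, are assumptions made for illustration.

```python
import numpy as np


def comparison_stats(calipso_aod, aeronet_aod):
    """Bias, rms difference, regression slope and Pearson r for paired AODs."""
    c = np.asarray(calipso_aod, dtype=float)
    a = np.asarray(aeronet_aod, dtype=float)
    absolute_bias = c.mean() - a.mean()
    slope, intercept = np.polyfit(a, c, 1)   # least-squares line c = slope*a + b
    return {
        "absolute_bias": float(absolute_bias),
        "relative_bias": float(absolute_bias / a.mean()),
        "rms": float(np.sqrt(np.mean((c - a) ** 2))),
        "slope": float(slope),
        "intercept": float(intercept),
        "pearson_r": float(np.corrcoef(a, c)[0, 1]),
    }


# Hypothetical check: a CALIPSO AOD that is low by exactly 40/58 everywhere.
aeronet = [0.1, 0.2, 0.3, 0.4]
calipso = [x * 40.0 / 58.0 for x in aeronet]
stats_original = comparison_stats(calipso, aeronet)
stats_rescaled = comparison_stats([x * 58.0 / 40.0 for x in calipso], aeronet)
```

In this idealized case the slope and the relative bias of the original values are 40/58 and 40/58 − 1, and both the bias and the slope deficit vanish after rescaling; real data add scatter on top of this pattern.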

Comparison methodology
In this section, we compare 1° × 1° spatial averages of CALIPSO dust AODs with the collocated MODIS-Aqua Level 3 AOD product. The quality screening methodology for CALIPSO is similar to that followed for the AERONET comparison; i.e., we use only cases for which the 5 km product is cloud free in all profiles included in the 1° × 1° MODIS-Aqua cell. At the same time, we restrict the dataset to dust cases only, i.e., to CALIPSO overpasses where the aerosol classification scheme reveals exclusively the presence of dust in the column. The quality filters introduced for the CALIPSO Level 3 product are applied to the observations used for the comparison (Winker et al., 2013). Of these, the most important are the CAD score (−20 to −100), the extinction QC flag (only aerosol layers with values 0 and 16) and the extinction uncertainty (only data with reliable extinction retrievals having an uncertainty in the layers above them of less than 99.9 km⁻¹). Then, the average extinction profile for the 1° × 1° cell is calculated taking into account dust extinction values as well as zero extinction values for heights containing only molecules. The dust AOD used for the comparison is calculated by integrating the final average extinction profile, representative for the cell (AOD_IntOfMean).
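The screening and the "integral of the mean" AOD described above can be sketched as follows. The function and argument names are hypothetical, and the uncertainty filter is simplified to a per-sample test, whereas the text applies it to the layers above each sample.

```python
import numpy as np

def passes_level3_screening(cad_score, ext_qc_flag, ext_uncertainty):
    """Boolean mask of samples that pass the three main Level 3 quality filters."""
    cad = np.asarray(cad_score)
    cad_ok = (cad <= -20) & (cad >= -100)             # confident aerosol classification
    qc_ok = np.isin(np.asarray(ext_qc_flag), (0, 16)) # accepted extinction QC flags
    unc_ok = np.asarray(ext_uncertainty) < 99.9       # km^-1; simplified per-sample test
    return cad_ok & qc_ok & unc_ok

def aod_int_of_mean(extinction_profiles, altitude_km):
    """Average the profiles of a cell first, then integrate the mean profile.

    extinction_profiles: array (n_profiles, n_levels) in km^-1, with zeros in clear air.
    altitude_km: increasing altitude grid of length n_levels.
    """
    mean_profile = np.mean(np.asarray(extinction_profiles, dtype=float), axis=0)
    dz = np.diff(np.asarray(altitude_km, dtype=float))
    # Trapezoidal integration of the cell-mean extinction profile.
    return float(np.sum(0.5 * (mean_profile[1:] + mean_profile[:-1]) * dz))
```

Clear-air samples enter the mean as zeros, so a dust layer present in only half of the profiles contributes half of its extinction to the cell average.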
Furthermore, the MODIS Level 3 product is screened. We use AOD retrievals for which the MODIS-retrieved cloudiness is less than 20 % within the cell, in order to constrain our dataset to accurate, almost cloud-free retrievals. The criterion for cloudiness is rather strict if we consider that realistic aerosol MODIS products are reported in the literature for cloudiness levels of less than 80 % (e.g., Zhang et al., 2005; Remer et al., 2008). However, our main concern for this comparison is to avoid a possible overestimation of MODIS AODs due to the presence of clouds, since it is well documented that clouds can lead to a significant overestimation of MODIS AOD, especially for cloud fractions higher than 80 % (e.g., Zhang et al., 2005; Remer et al., 2008). In addition to the cloudiness criterion, the MODIS Level 3 product is filtered in order to ensure the representativeness of the selected AOD values for the 1° × 1° cell. To ensure this, we select Level 3 data produced from at least 60 Level 2 records of 10 km spatial resolution, out of a maximum of 121 pixel counts, as input for the Level 3 aerosol data. It should be noted that after filtering the dataset, 80 % of the selected cells are over maritime areas. This also increases the accuracy of the MODIS AODs used, since over land the sensor is less reliable because the retrievals are affected by higher surface reflectance (e.g., Remer et al., 2005, 2008). The aforementioned constraints led to a significant decrease in the size of the initial dataset, but maintained the "quality" of the selected cases. Spectral conversions are not applied and the final comparison is between AODs at 532 nm for CALIPSO and 550 nm for MODIS.

Results and discussion
The final dataset of CALIPSO vs. MODIS AODs for the 5 yr period is presented in Fig. 3 for the original CALIPSO retrievals using the LR of 40 sr (left) and for the corrected product using the LR of 58 sr (right). The upper panel of Fig. 3 presents the dataset without applying controls that ensure a relative aerosol horizontal homogeneity within the cell. The lower panel shows the final 234 cells of our comparison, resulting from filters applied to account for the horizontal homogeneity (as described in Sect. 3.1) and catering for the spatial sampling differences of the two sensors. For the latter, we use only MODIS Level 3 AODs produced from at least 60 Level 2 records of 10 km spatial resolution out of a maximum of 121 pixel counts to ensure the representativeness of the MODIS Level 3 product. Moreover, we use only the cases where the CALIPSO cross-section has a length greater than 100 km (approximately 20 CALIPSO profiles) within the MODIS cell. Since we use exclusively dust CALIPSO retrievals, the latter prerequisite ensures that dust presence is dominant in the 1° × 1° cell as well. Finally, the MODIS-retrieved cloudiness is set to be less than 20 %.
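The cell selection criteria listed above can be summarized in a single predicate. The sketch below is illustrative only: the function and argument names are hypothetical, and the thresholds are exposed as parameters with defaults taken from the text.

```python
def cell_passes_filters(cloud_fraction, level2_pixel_count, calipso_track_km,
                        max_cloud_fraction=0.20, min_pixel_count=60, min_track_km=100.0):
    """Return True if a 1 x 1 degree MODIS cell is kept for the CALIPSO-MODIS comparison."""
    nearly_cloud_free = cloud_fraction < max_cloud_fraction   # MODIS cloudiness below 20 %
    well_sampled = level2_pixel_count >= min_pixel_count      # at least 60 of at most 121 Level 2 records
    long_enough_track = calipso_track_km > min_track_km       # roughly 20 CALIPSO 5 km profiles
    return nearly_cloud_free and well_sampled and long_enough_track
```

A cell with 15 % cloudiness, 80 contributing Level 2 records and a 110 km CALIPSO cross-section would be kept, whereas the same cell with 25 % cloudiness would be rejected.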
From Fig. 3 it is evident that the LR correction reveals an agreement with cloud-free MODIS AODs, similar to that obtained from the AERONET comparison. In the upper panel of Fig. 3, the number of cases found for each CALIPSO-MODIS AOD bin between 0 and 1.0 is presented with a bin step equal to 0.0125. The distribution peaks at low AODs, below 0.2. After filtering the data (lower panel of Fig. 3), the statistics of the comparison (presented analytically in Table 2) show an improved, nonsignificant absolute bias of −0.02 for the LR-corrected CALIPSO AOD_IntOfMean, much lower in magnitude than the statistically significant bias of the order of −0.07 for the original product. The slope of the linear regression between the two datasets improves from 0.73 to close to unity, with a Pearson correlation coefficient of the order of 0.8 for both comparisons. In conclusion, the two sensors show fairly good agreement for dust observations after the correction of the LR is used in the CALIPSO algorithm and when only cloud-free MODIS cells (less than 20 % cloudiness) are considered. Residuals in this comparison are most likely attributable to other retrieval errors for both sensors.
The agreement found between the two sensors in the case of dust shows that the dust LR issue is critical and should be taken into account in similar future work. Many studies in the literature have reported negative biases for CALIPSO AODs with respect to MODIS collocated Level 2 or Level 3 retrievals. For example, Redemann et al. (2012) assessed the consistency between collocated Level 2 AODs from MODIS and CALIPSO Version II and III and found that the CALIPSO Version III product is generally in better agreement with MODIS AOD, showing however regional and seasonal variability in the absolute biases of the two sensors. Figure 8 in Redemann et al. (2012) shows a clear CALIPSO underestimation of the order of 0.1 over Europe and the Mediterranean, mostly during the spring and summer months, which are the seasons containing frequent Saharan dust advections. Recently, Winker et al. (2013) concurred with the aforementioned low CALIPSO biases, but added that MODIS AOD accuracy decreases as the environment becomes cloudier (e.g., Zhang and Reid, 2006). The methodology and results presented in this section suggest that constraints regarding the dust LR, cloudiness and the representativeness of cell samples have to be applied to both sensors in order for them to be comparable.

CALIPSO comparison with BSC-DREAM8b simulated dust fields
The CALIPSO evaluation study against AERONET and MODIS observations in Sects. 3.1 and 3.2, respectively, points to the need for a larger average LR (of the order of 58 sr) for the dust component of the CALIPSO retrieval algorithm over the domain examined in our work. This correction is expected to eliminate the negative bias of CALIPSO AODs reported in the literature, especially over the Sahara and surrounding regions. In order to evaluate the impact of a larger LR on climatological averages, specifically the recently released Level 3 CALIPSO climatological product, we evaluate this product in this section against dust simulations from the BSC-DREAM8b regional dust model. In addition to the original Level 3 CALIPSO product and the amended product using an LR equal to 58 sr, a third product is evaluated that uses both a corrected value of LR equal to 58 sr and corrections that account for the pure dust component included in dust and polluted dust CALIPSO subtypes.

Comparison methodology
In this section we present the methodology followed for the comparison and the production of the three versions of climatological products used in our study:
- Version I: the original CALIPSO Level 3 dust extinction product with LR equal to 40 sr, based on the original CALIPSO averaging scheme;
- Version II: a dust extinction product retrieved by the application of an LR equal to 58 sr on Level 2 backscatter profiles, based on the original CALIPSO averaging scheme; and
- Version III: a product retrieved by the application of an LR equal to 58 sr together with an averaging scheme different to that of CALIPSO that (1) acknowledges zero extinction values for nondust aerosol types detected in the cell, and (2) corrects for pure dust by separating the pure dust component from the dust and polluted dust subtypes.
The three versions are compared with collocated and synchronized dust extinction simulations from the BSC-DREAM8b model. Vertical averaging is applied to the CALIPSO products in order to collocate extinction values with the model's vertical resolution. No spectral correction is applied since dust particles are expected to have a weak spectral dependence on extinction (e.g., O'Neill et al., 2003; Schuster et al., 2006). As a result, the CALIPSO extinction at 532 nm is directly compared with the BSC-DREAM8b extinction at 550 nm. Our comparison is applied to data spanning the period from January 2007 to December 2010. The methodology followed for the production of the different versions of CALIPSO climatological dust products is described below.
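The vertical averaging onto the model resolution mentioned above could be implemented as a simple layer-wise mean, as in the sketch below. The function name and the layer edges are hypothetical; the actual BSC-DREAM8b vertical grid and the exact averaging used in the study are not specified here.

```python
import numpy as np

def average_to_model_layers(altitude_km, extinction, layer_edges_km):
    """Average a high-resolution extinction profile within each model layer.

    layer_edges_km: increasing layer boundaries (hypothetical values, not the real model grid).
    Returns one mean per layer; layers without valid samples are set to NaN.
    """
    alt = np.asarray(altitude_km, dtype=float)
    ext = np.asarray(extinction, dtype=float)
    edges = np.asarray(layer_edges_km, dtype=float)
    layer_index = np.digitize(alt, edges) - 1   # index of the layer that contains each sample
    means = np.full(edges.size - 1, np.nan)
    for k in range(edges.size - 1):
        selected = (layer_index == k) & np.isfinite(ext)
        if selected.any():
            means[k] = ext[selected].mean()
    return means
```

For instance, samples at 0.25 and 0.75 km fall into a 0 to 1 km layer and are replaced by their mean before the comparison with the model value for that layer.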
Version I: the methodology followed for the production of the original CALIPSO 1° × 1° monthly mean dust extinction product is based on the averaging and screening techniques introduced by the CALIPSO team for the Level 3 climatology (Winker et al., 2013, and Appendix therein). CALIOP Version III aerosol extinction profiles at 532 nm are aggregated onto the 1° × 1° grid, and monthly mean extinction profiles are computed for aerosol species classified as dust. Following the definitions of the CALIPSO Level 3 climatology, we use the cloud-free product only. In brief, the CALIPSO Level 2 data are screened by CAD score (use only data with CAD score between −20 and −100), extinction QC flag (only aerosol layers with values 0 and 16 are accepted), and extinction uncertainty (use only data with reliable extinction retrievals having uncertainty in the layers above them of less than 99.9 km⁻¹). Additional filters are applied in order to screen misclassified clouds, isolated layers due to noise spikes, subsurface samples, samples below opaque cloud and aerosol layers, large negative near-surface extinction samples, surface contamination beneath surface-attached opaque layers, and undetected surfaces associated with low aerosol biases. In clear air, the extinction value is set to zero (for details, see Winker et al., 2013, and http://eosweb.larc.nasa.gov/PRODOCS/calipso/Quality_Summaries/CALIOP_L3AProProducts_1-00.html). In order to validate the ability of our Version I product to reproduce the CALIPSO Level 3 averaging scheme, the algorithm developed in this study has been evaluated against the original CALIPSO product that is distributed on a 5° × 2° longitude-latitude spatial resolution grid. The comparison revealed that the Level 3 retrievals obtained from both algorithms are in excellent agreement, having a Pearson correlation coefficient of 0.98 and a linear regression slope of approximately 1.0 in the case of a test comparison of global extinction retrievals for January 2008 (not shown here). After validating the algorithm, the method was applied on the 1° × 1° grid.
We should note here that, for the comparison of the CALIPSO Level 3 dust product with BSC-DREAM8b, we used both daytime and nighttime CALIPSO products. The original Level 3 CALIPSO product distinguishes between daytime and nighttime profiles, since for the daytime product the solar background reduces the aerosol detection sensitivity and results in smaller column AODs (Winker et al., 2013). Differences between daytime and nighttime products can be attributed to tuning of the retrieval algorithms to account for differences in signal-to-noise ratios (Winker et al., 2013). However, differences between the daytime and nighttime Level 3 product could also be attributed to sampling differences since, due to its orbital pattern, CALIPSO samples different geographical areas during day and night orbits. Moreover, aerosol loads can have large diurnal variations depending on the region and result in real, rather than artificial, differences between the daytime and nighttime product. For the domain of our study, we do not distinguish between lighting conditions, given the small differences between the daytime and nighttime product reported in Winker et al. (2013) for zonally averaged mean aerosol extinction during the summer months between 2006 and 2011 (Fig. 7 of Winker et al., 2013). These findings suggest that the ratio between the daytime and nighttime climatological extinction product is close to unity for the latitude zone between 20° and 50° that includes the Sahara, the Mediterranean and a large part of Europe.
Version II: this version follows the definitions of the CALIPSO Level 3 product (Version I) with regard to the data averaging and screening procedures. The only alteration applied regards the production of extinction profiles from the dust backscatter profiles, which are multiplied by the LR of 58 sr.
Version III: in this version, the LR used for the production of extinction profiles is kept equal to 58 sr as in Version II. However, two alterations are introduced; the first regards the vertically resolved separation of pure dust from aerosol types reported as dust and polluted dust, and the second involves the CALIPSO averaging scheme.
Regarding the first alteration, the separation of the pure dust component is obtained by applying the method introduced by Tesche et al. (2009a). This method makes use of the particle backscatter coefficient and the particle depolarization ratio at 532 nm in order to separate the backscatter contributions of the weakly light-depolarizing aerosol components ("other type") from the contribution of strongly light-depolarizing particles (pure dust). In order to define the dust mixtures in our study, we first examined CALIPSO conventions related to its classification scheme. In general, dust presence in the atmosphere is classified by CALIPSO either as "dust", meaning pure dust, or "polluted dust", meaning dust mixed with other nondepolarizing aerosols. These types are distinguished by the CALIPSO algorithm using the only available Level 1 intensive aerosol property capable of classifying nonspherical particles, namely the volume depolarization (Omar et al., 2009). From the volume depolarization, an approximate particle depolarization ratio δ_p is calculated by

$$\delta_p = \frac{\delta_v\,[(R-1)(1+\delta_m)+1]-\delta_m}{(R-1)(1+\delta_m)+\delta_m-\delta_v},$$

where δ_v indicates the volume depolarization, δ_m is the molecular depolarization, and R is the total scattering ratio, equal to the ratio of the total backscatter to the molecular backscatter. The approximate particle depolarization ratio is affected by the total scattering ratio, which is not corrected for attenuation of the laser beam between the satellite and the layer under investigation. This leads to overestimation of the actual particle depolarization ratio and correspondingly affects the classification of dust into pure dust or polluted dust. Recent CALIPSO validation results using airborne HSRL collocated measurements (Burton et al., 2013) show that the CALIPSO dust classification corresponds to a classification of either dust or dust mixtures by HSRL. This is attributed by the authors either to the overestimation of the approximated particle depolarization ratio or to the polarization thresholds used by the CALIPSO classification algorithm. To be specific, while the threshold for the approximate particle depolarization ratio regarding the pure dust classification of CALIPSO is 0.2, the particle depolarization ratio for pure dust reported in the literature is much higher. Particle depolarization ratios measured over the Sahara during the SAMUM-1 campaign were found to be of the order of 0.31 ± 0.03 at 532 nm (e.g., Freudenthaler et al., 2009). Values of 0.35 have also been reported in the literature for Asian dust from long-term observations over China and Japan (Sugimoto et al., 2002; Shimizu et al., 2004). Thus, the finding by Burton et al. (2013) that pure dust CALIPSO classifications can, in reality, be dust mixtures (according to HSRL) is not surprising.
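The dependence of the approximate particle depolarization ratio on the scattering ratio can be illustrated numerically. The sketch below uses the standard relation between volume and particle depolarization, which we assume is the one intended in the text; the function name and the default molecular depolarization value are illustrative assumptions.

```python
def approx_particle_depolarization(delta_v, scattering_ratio, delta_m=0.0036):
    """Particle depolarization ratio from volume depolarization and total scattering ratio.

    delta_v: volume depolarization ratio; scattering_ratio: total / molecular backscatter (R > 1).
    delta_m: molecular depolarization ratio (the default is an assumed, illustrative value).
    """
    r = scattering_ratio
    numerator = (1.0 + delta_m) * delta_v * r - (1.0 + delta_v) * delta_m
    denominator = (1.0 + delta_m) * r - (1.0 + delta_v)
    return numerator / denominator

# For a fixed volume depolarization, an uncorrected (too low) scattering ratio
# yields a larger particle depolarization ratio, i.e. an overestimation.
delta_true_r = approx_particle_depolarization(0.20, 3.0)
delta_low_r = approx_particle_depolarization(0.20, 2.5)
```

Under these assumptions `delta_low_r` exceeds `delta_true_r`, which is the direction of the overestimation discussed above for scattering ratios that are not corrected for attenuation.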
In Version III of our climatological product, we treat both dust and polluted dust types as dust mixtures and assume a value of 0.33 for the particle depolarization ratio of pure dust, as confirmed by ground measurements (e.g., Freudenthaler et al., 2009). In order to examine the true particle depolarization ratio of the layers included in our study instead of relying on the Level 1 approximated value used for the classification, we vertically average the reported CALIPSO Level 2 particle depolarization ratio for each layer and present their distribution in Fig. 4 (black line). In the same figure, a second distribution is also shown (red line), representing the layer-averaged particle depolarization ratios retrieved for the same dataset using the standard equation

$$\delta_p = \frac{\beta_\perp}{\beta_t - \beta_\perp}, \quad (4)$$

where β_t is the CALIPSO Level 2 total backscatter and β_⊥ is the perpendicular backscatter product. The results for the standard depolarization formula (Eq. 4) have been found to be different from those using the particle depolarization ratio product reported by CALIPSO Level 2. This finding has already been discussed by Tesche et al. (2013), who performed a detailed validation of the CALIPSO depolarization retrievals and found satisfactory agreement with ground-based collocated lidar measurements for this product when Eq. (4) is used instead of the depolarization product itself.
The authors state that this inconsistency is currently under investigation by the CALIPSO team and is most likely attributed to a software error in the CALIPSO retrieval algorithm. The differences for our dataset are presented in Fig. 4.
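The standard depolarization formula referred to as Eq. (4) is the ratio of perpendicular to parallel backscatter, with the parallel component obtained as total minus perpendicular. A minimal sketch follows, evaluated on layer means to limit the influence of noise; the function name is hypothetical and NaN samples are assumed to mark missing data.

```python
import numpy as np

def layer_mean_particle_depolarization(beta_total, beta_perp):
    """Particle depolarization ratio of a layer from layer-averaged backscatter values.

    beta_total, beta_perp: total and perpendicular particle backscatter samples in the layer.
    """
    mean_total = np.nanmean(np.asarray(beta_total, dtype=float))
    mean_perp = np.nanmean(np.asarray(beta_perp, dtype=float))
    # Parallel backscatter is the total minus the perpendicular component.
    return mean_perp / (mean_total - mean_perp)
```

Averaging the backscatter components before taking the ratio, rather than averaging bin-by-bin ratios, avoids giving large weight to noisy bins with near-zero parallel backscatter.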
In the upper panel of Fig. 4, the distribution of particle depolarization ratios is presented for the layers characterized as polluted dust, while the lower panel refers to layers classified as dust by the CALIPSO scheme using the Level 1 product's approximation of the particle depolarization ratio (Omar et al., 2009). The grey-shaded areas denote the approximate depolarization ratio ranges for the classification of polluted dust (0.075 < depolarization < 0.2) and pure dust (depolarization > 0.2). The corrected distributions for pure dust (Fig. 4, lower panel) show values lower than 0.5, ranging mainly between 0.15 and 0.4. Most of the values are greater than the threshold value of 0.2 for the Level 1 approximate depolarization product, suggesting that the classification is mostly justified by the Level 2 particle depolarization ratio as well. The maximum of the distribution is found to be at 0.3, which is in good agreement with ground-truth particle depolarization ratio values measured over the Sahara during the SAMUM-1 campaign (e.g., Freudenthaler et al., 2009). The distribution of the original CALIPSO particle depolarization ratio is skewed towards higher values, which are often unrealistic. Regarding the distribution for the polluted dust type (upper panel), this is again within the range of values intended to be used as thresholds for classification purposes based on Level 1 approximations. This is true especially for the corrected values produced by Eq. (4) (red) which, in general, are shifted to lower values. All the above considerations are consolidated in the methodology followed for producing the Version III climatological product. We separate the pure dust component included in dust mixtures (either classified as dust or polluted dust) by using the methodology of Tesche et al. (2009a) and apply the correct particle depolarization ratios using Eq. (4). Because the 5 km Level 2 CALIPSO depolarization profile is mostly noisy, we also chose to use a layer-averaged depolarization value for our corrections. This is done by applying Eq. (4) to layer-averaged perpendicular and total backscatter values. The corrected particle depolarization ratios are then used to apply the method of Tesche et al. (2009a). As already mentioned, the method makes use of the particle backscatter coefficient and the particle depolarization ratio at 532 nm in order to separate the backscatter contributions of the weakly light-depolarizing aerosol components from the contribution of strongly light-depolarizing particles.
Fig. 4 (caption, partially recovered). Lower: the same as upper panel but for layers categorized as dust. Grey areas denote the classification thresholds for polluted dust and dust that are followed by the CALIPSO algorithm using the Level 1 approximated depolarization ratio.
To be more specific,
the method assumes that, if we have two aerosol types, the backscatter contribution of the first aerosol type β_1 is obtained from the measured total backscatter coefficient β_t by

$$\beta_1 = \beta_t\,\frac{(\delta_p - \delta_2)(1 + \delta_1)}{(\delta_1 - \delta_2)(1 + \delta_p)}, \quad (5)$$

where δ_p is the observed particle depolarization ratio and δ_1, δ_2 are the assumed "typical" particle depolarization ratios of the two pure aerosol types. The particle backscatter coefficient of the second aerosol type is given by β_t − β_1. In our interpretation of the method, we assume as mixtures all CALIPSO dust types (dust and polluted dust), acknowledging as pure dust only those layers having depolarization ratio values greater than 0.33 (e.g., Freudenthaler et al., 2009). A value of 0.33 was used for the particle depolarization ratio of pure dust in Eq. (5) (aerosol type 1), while a value of 0.03 was used for the nondepolarizing aerosol type 2 in our separation procedure. This methodology is demonstrated in the example of Fig. 5. For the selected profile, the classification of CALIPSO revealed a dust layer between 0 and 1.5 km and an elevated polluted dust layer between 1.5 and 3.2 km (Fig. 5, left panel). The Level 2 layer-mean particle depolarization ratio (Fig. 5, middle panel) shows values of the order of 0.3 for the dust layer (green) and 0.25 for the polluted dust layer. The red lines represent the corrected layer-averaged particle depolarization ratios derived by application of Eq. (4). Using the corrected depolarization ratio values and applying the method of Tesche et al. (2009a) to the backscatter Level 2 CALIPSO product (Eq. 5), we finally retrieve the result presented in the right panel of Fig. 5, where the pure dust backscatter has been separated from that of the "other" aerosol type, which has a particle depolarization ratio of 0.03 and is assumed to be present in the dust mixture. The pure dust backscatter profile is then multiplied by the LR of 58 sr in order to retrieve the pure dust extinction coefficient.
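The two-component separation and the subsequent conversion to extinction can be sketched as below, using the depolarization values of 0.33 and 0.03 quoted in the text. The function names are hypothetical, and clipping the observed depolarization ratio to the interval between the two pure-type values is our assumption for handling layers outside that range.

```python
import numpy as np

def separate_dust_backscatter(beta_total, delta_p, delta_dust=0.33, delta_other=0.03):
    """Split total particle backscatter into pure dust and "other" contributions.

    Follows the two-type separation of Tesche et al. (2009a) with type 1 = pure dust.
    Clipping delta_p to [delta_other, delta_dust] is an assumption of this sketch:
    layers above 0.33 count as pure dust, layers below 0.03 as dust free.
    """
    beta_total = np.asarray(beta_total, dtype=float)
    dp = np.clip(np.asarray(delta_p, dtype=float), delta_other, delta_dust)
    dust_fraction = ((dp - delta_other) * (1.0 + delta_dust)) / ((delta_dust - delta_other) * (1.0 + dp))
    beta_dust = beta_total * dust_fraction
    return beta_dust, beta_total - beta_dust

def pure_dust_extinction(beta_total, delta_p, lidar_ratio_sr=58.0):
    """Pure dust extinction: separated dust backscatter times the dust lidar ratio."""
    beta_dust, _ = separate_dust_backscatter(beta_total, delta_p)
    return lidar_ratio_sr * beta_dust
```

For a layer with a corrected depolarization ratio of 0.25, as in the polluted dust layer of the Fig. 5 example, this sketch attributes roughly 78 % of the backscatter to pure dust.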
After producing the pure dust extinctions for Version III, we aggregate the profiles on a 1° × 1° cell. The averaging procedure for dust is altered from the original CALIPSO methodology followed for Version I and II by introducing in the averaging scheme nondust observations beyond those of clear air, namely the presence of other aerosol types detected by CALIPSO (marine, clean continental, polluted continental, smoke). These types are taken into consideration in the Version III averaging routine as zero extinction values and not as "nonavailable" observations, as is the case in the CALIPSO Level 3 algorithm (Version I and II). To demonstrate how this averaging scheme performs in contrast to the CALIPSO methodology, an example of our approach is given in Fig. 6. In this example, the classification scheme shows the presence of dust, polluted dust and marine aerosol subtypes. Version I, II and III extinction averages are presented in the right panel of Fig. 6.
Version II shows larger extinction values due to the use of a larger LR (58 sr). The acknowledgment of dust mixtures and respective dust contributions in Version III causes significant differences from the other versions. In particular, it leads to lower values of dust extinction in general, especially between the surface and 0.5 km since, in this height range, we acknowledge as zero values the extinction values corresponding to the marine subtype. Moreover, Version III retrieves dust extinction by separating pure dust from polluted dust in the height range between 1.5 and 2.7 km. In contrast, Version I and II retrieve zero extinction for the same height range, since only the polluted dust type is present (which is not considered).
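The difference between the two averaging schemes can be made concrete with a small sketch. The function name, the type labels and the two-profile scene are hypothetical; the only behaviour taken from the text is that nondust aerosol types are either excluded from the average (Version I and II) or counted as zero dust extinction (Version III).

```python
import numpy as np

def mean_dust_extinction(extinction, feature_type, dust_types, zero_for_nondust):
    """Cell-mean dust extinction per height level.

    extinction: array (n_profiles, n_levels); feature_type: same shape, string labels.
    dust_types: labels whose extinction counts as dust; "clear" always counts as zero.
    zero_for_nondust: if True, other aerosol types count as zero (Version III style);
    if False, they are excluded from the average (Version I and II style).
    """
    ext = np.asarray(extinction, dtype=float)
    ftype = np.asarray(feature_type)
    is_dust = np.isin(ftype, dust_types)
    is_clear = ftype == "clear"
    values = np.where(is_dust, ext, 0.0)
    valid = np.ones(ftype.shape, dtype=bool) if zero_for_nondust else (is_dust | is_clear)
    count = valid.sum(axis=0)
    total = np.where(valid, values, 0.0).sum(axis=0)
    return np.divide(total, count, out=np.full(count.shape, np.nan), where=count > 0)

# Hypothetical scene: profile 1 is dust at both levels, profile 2 is marine below, clear above.
ext = [[0.1, 0.1], [np.nan, np.nan]]
types = [["dust", "dust"], ["marine", "clear"]]
v12 = mean_dust_extinction(ext, types, ("dust",), zero_for_nondust=False)
v3 = mean_dust_extinction(ext, types, ("dust", "polluted_dust"), zero_for_nondust=True)
```

In this scene the exclusion of the marine sample gives a mean of 0.1 km⁻¹ at the lower level, whereas counting it as zero halves the mean to 0.05 km⁻¹; both schemes agree at the upper level, where the second profile is clear air.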
To demonstrate the difference of the proposed averaging procedure in relation to the CALIPSO Level 3 approach, we present in Fig. 7 two synthetic scenes containing identical dust content homogeneously distributed between 0 and 4 km. In the first scene (upper panel), the consecutive 5 km product contains one layer classified as marine between 0 and 2 km and clear air above. The CALIPSO Level 3 algorithm would produce the averaged extinction profile for the scene presented in the upper-middle panel, overestimating the real dust extinction between 0 and 2 km due to the exclusion of the marine layer from the averaging procedure. However, zero extinction values are acknowledged for clear air; thus the average for the 2-4 km height range would produce half the extinction of that of the lower layer. In the lower panel of Fig. 7, a similar example is presented for the same dust load where the consecutive 5 km product contains one layer classified as smoke between 2 and 4 km and clear air beneath. The CALIPSO Level 3 algorithm would produce the averaged extinction profile for the scene presented in the lower-middle panel, overestimating the dust extinction between 2 and 4 km.

Results and discussion
The CALIPSO-BSC-DREAM8b comparison of monthly mean AODs for the domain of our study obtained from all three versions of 1° × 1° Level 3 products is presented in Fig. 8. All AODs refer to integrals of the extinction profiles in the vertical range between the maximum surface elevation of the cell and a height of 8 km. The color bar represents the latitudinal zone of the comparison, in 5° bins. While Pearson's correlation coefficient remains almost constant for all versions at around 0.87, we observe a significant improvement in the regression slopes for Version II and Version III, which increase to 0.73 from 0.5 for the original CALIPSO Version I. A significant improvement is also visible in the absolute biases for Version II and III, which reduce to −0.01 as compared with −0.05 for Version I. As expected, the AOD shows a latitudinal dependence, having larger values over Africa and lower values over northern Europe.
In Fig. 9 we present the vertically resolved comparisons averaged over the domain of our study. The upper-left panel shows the mean extinction profiles resulting from BSC-DREAM8b simulations and for all three CALIPSO climatological versions examined (1-3). While the original CALIPSO Version I mostly underestimates model simulations, Version II seems to overestimate them. Furthermore, the use of a dust LR equal to 58 sr to correct for the global mean value of 40 sr in Version II does not seem to give satisfactory results in relation to the BSC-DREAM8b model, contrary to the results of our comparison with AERONET and MODIS. However, this is most likely attributable to the averaging procedure and the mixing of dust with other types, since we obtain a very satisfactory agreement with Version III where the same LR of 58 sr has been used as well. The agreement of Version III with the model is clearly visible for the lower troposphere, where most of the mixing of dust with aerosol types from ground or sea sources is expected to occur. The absolute biases presented in the upper-right panel of Fig. 9 point to the same conclusions. In the same plot, the vertically averaged absolute biases are also presented. The black line represents the reported model bias over the domain as this is retrieved from comparison with AERONET observations (Basart et al., 2012). The results of Version I and III are close to this bias, showing that the best agreement with the model is achieved for these two versions. However, when linearly regressing Version I and III on the BSC-DREAM8b model as a function of height, the Pearson correlation coefficients show better agreement for Version III, especially for height ranges between the ground and 4 km (lower-left panel of Fig. 9). The regression slopes also show better agreement for Version III, reaching values close to unity (lower-right panel of Fig. 9).
The spatial distribution of 5 yr AOD absolute biases obtained when comparing the model and the three versions examined is presented in Fig. 10. The columnar biases show a significant improvement over northern Africa for Version II and III. For the Sahel region, however, Version II and III overestimate when compared to the model. Nevertheless, the biases observed over the Sahel and northwestern Africa fall within regions of model underestimation and overestimation, respectively. This is reported in the detailed evaluation of BSC-DREAM8b against AERONET published by Basart et al. (2012). The results of this study are geographically summarized in Fig. 11, where the radii of the circles correspond to the model biases obtained with respect to AERONET. Biases lower than 0.1 were found over western, central and eastern regions of the Mediterranean, and a bias close to 0.1 is reported for the Atlantic region. The model evaluation results as compared with AERONET have a better spatial agreement with the comparison made with the CALIPSO Version III climatological product, as shown in Fig. 10 (lower panel). Version II clearly overestimates over Europe (especially eastern Europe), the Mediterranean and especially the Atlantic. Version I, on the other hand, underestimates the BSC-DREAM8b model across almost the whole domain, and especially over source regions in northern Africa. If we compare Version II and III, taking into account known model biases (Fig. 11), then we can conclude that the LR correction improves biases over northern Africa, while the correction in Version III for pure dust retrievals from dust mixtures significantly improves the biases over Europe, where more mixing is expected.
To demonstrate the regional differences between the products, we present in Fig. 12 the three product versions averaged separately over Europe and northern Africa (upper and lower panel, respectively). The vertical distributions of the occurrence of each aerosol type acknowledged in the averaging scheme for each domain are presented in the right panel of Fig. 12. Over Europe (upper panel), the Version II profile shape differs significantly from Version III due to the aggregation of significant occurrences of dust mixtures and other aerosol types (marine, continental, smoke), which are dominant for this region. As already stated, Version I and II acknowledge only pure dust and clear air, while Version III takes into account polluted dust and also other aerosol types. Over northern Africa (lower panel), pure dust dominates in relation to other types; thus the profile shapes for the three versions are similar. Differences observed between Version I and II are due to the LR used, while differences between Version II and Version III are due to the averaging scheme. Version II and III differences maximize over Europe, where polluted dust and other types are dominant. Beyond the type occurrence frequency, the impact of the "other type" on the averaging procedure is larger than that of the polluted dust correction due to the zero extinction value introduced, leading to much lower averages.

Conclusions
CALIPSO is capable of providing a multiyear, robust 4-D dust climatology, a task that cannot easily be achieved by passive sensors, especially over deserts. However, limitations on retrieval performance using CALIPSO exist, especially regarding the classification of dust and its mixtures based on the approximate particle depolarization ratio and the LR assumption. In this paper, we show the potential improvement of CALIPSO dust retrievals over Europe and northern Africa by using a dust LR of 58 sr, demonstrating that a regional correction is feasible when using a universal and spatially constant LR. Moreover, improvements in the Level 3 climatological product for dust are demonstrated when comparing with BSC-DREAM8b dust simulations. This is achieved by altering the CALIPSO Level 3 averaging scheme so as to account for the pure dust component in dust mixtures and acknowledging the presence of other nondust aerosol types instead of only dust and clear air. Combining the calculations with the LR correction for the region examined, the results show very good agreement with AERONET, MODIS and the BSC-DREAM8b dust model. The agreement presented here will facilitate and hopefully encourage accurate, climatological dust studies in this large geographical domain. Future work could include the application of the methodology in similar studies over the deserts in the Middle East and China in order to optimize CALIPSO dust retrievals over these areas as well. Ground-based measurements of the dust LR and particle depolarization ratio for these regions will be vital for the success of implementing similar improvements.
Accurate climatological CALIPSO extinction retrievals could also help form a bridge between CALIPSO time series and future European Space Agency (ESA) ADM-Aeolus and EarthCARE retrievals, in order to accomplish a multidecadal climatological record. Such efforts are considered feasible especially for dust since this aerosol type has a relatively small wavelength dependence, and it should be straightforward to combine CALIPSO products in the visible with future EarthCARE products in the ultraviolet spectral region.
Finally, the agreement between CALIPSO and MODIS reported in this study is encouraging for future combinations of paired data from the two sensors. Such synergy will help the community make further deductions about aerosol types and origin, facilitating at the same time the evaluation of, e.g., the Deep Blue product over the Sahara and potentially other deserts.

Fig. 1. Upper: scatter plot comparison of CALIPSO 5 km dust AOD versus collocated AERONET measurements (left), absolute and relative biases per AERONET AOD class (right). Lower: comparison of CALIPSO scene-averaged dust AOD using different methodologies (left), absolute and relative biases per AERONET AOD class (right).

Fig. 2. Scatter plot comparison of CALIPSO AOD IntOfMean vs. collocated AERONET measurements when LR is equal to 40 sr (left) and when LR is equal to 58 sr (right).

Fig. 3. Comparison of CALIPSO AODs (1° × 1°) vs. collocated MODIS-Aqua Level 3 using LR equal to 40 sr (left) and LR equal to 58 sr (right). Upper: 2-D histograms representing the number of cases found for each CALIPSO-MODIS AOD bin between 0 and 1.0 (bin step equal to 0.0125). MODIS data are not filtered, while CALIPSO data are filtered according to Level 3 specifications. Only CALIPSO overpasses that are cloud-free and for which the aerosol classification scheme reveals only dust presence are considered. Lower: data are screened to ensure horizontal homogeneity in the cell and the representativeness of CALIPSO data compared with MODIS spatial averages, as well as cloud-free conditions for the MODIS cell.

Fig. 4. Upper: particle depolarization ratio distributions for the polluted dust layers examined. The black curve represents the Level 2 product reported by CALIPSO and the red curve represents the values recalculated using the perpendicular and total backscatter products. Lower: the same as the upper panel but for layers categorized as dust. Grey areas denote the classification thresholds for polluted dust and dust that are applied by the CALIPSO algorithm using the Level 1 approximated depolarization ratio.

Fig. 7. Synthetic scene examples demonstrating the differences between the averaging procedures followed for the derivation of the CALIPSO Level 3 product and in this work for the derivation of the Version III product.

Fig. 8. Comparison of CALIPSO and BSC-DREAM8b dust AODs for (upper) original Version I CALIPSO AODs, (middle) Version II CALIPSO AODs for LR equal to 58 sr, (lower) Version III CALIPSO AODs for LR equal to 58 sr and acknowledgment of nondust aerosol types (extinction equal to zero) in the averaging scheme as well as the pure dust component contained in dust and polluted dust types. The color bar represents the latitudinal zone of the comparison, in 5° bins.

Fig. 9. Comparison between the Level 3 extinction CALIPSO product and BSC-DREAM8b model outputs for the three versions of the CALIPSO climatological product. Upper: averaged extinction profile over the domain (left) and absolute biases (right). Lower: Pearson's correlation coefficient profile for the domain (left) and regression slopes (right).

Fig. 10. Spatial distribution of 5 yr AOD absolute biases for the three versions of the CALIPSO climatological product and the BSC-DREAM8b dust model outputs.

Table 1. Statistical indicators for CALIPSO and AERONET comparisons under different LR assumptions for CALIOP (40 sr vs. 58 sr). Average CALIPSO aerosol optical depth at 532 nm (τ_C), absolute bias (B_a), absolute standard error (σ_a), Student's t test score (t), p value (p), relative bias (B_r), root-mean-square error (RMSE), correlation coefficient (R_fit), slope (S_fit) and intercept (I_fit) of the linear fit and number of comparisons (N) are shown. Average AERONET aerosol optical depth at 532 nm for this dataset is 0.267.
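As a hedged illustration of how the indicators listed for Table 1 could be computed from paired AOD values, the sketch below uses made-up numbers, not data from this study. Two definitions are assumptions of the sketch rather than statements from the paper: the relative bias is taken as the absolute bias divided by the mean reference AOD, and the t score as the absolute bias divided by the standard error of the paired differences. The p value is omitted because it requires the Student's t distribution, which the Python standard library does not provide.

```python
# Illustrative sketch only: the AOD pairs below are hypothetical.
import math

aeronet_aod = [0.10, 0.20, 0.30, 0.40]  # reference values
calipso_aod = [0.08, 0.17, 0.28, 0.35]  # retrieved values


def comparison_statistics(reference, retrieved):
    """Return bias, error, and linear-fit indicators for paired AODs."""
    n = len(reference)
    diffs = [c - a for a, c in zip(reference, retrieved)]
    mean_ref = sum(reference) / n
    mean_ret = sum(retrieved) / n

    bias = sum(diffs) / n                      # absolute bias
    sample_sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    std_err = sample_sd / math.sqrt(n)         # standard error of the bias
    t_score = bias / std_err                   # paired t statistic (assumed)
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Ordinary least-squares fit: retrieved = slope * reference + intercept
    sxx = sum((a - mean_ref) ** 2 for a in reference)
    syy = sum((c - mean_ret) ** 2 for c in retrieved)
    sxy = sum((a - mean_ref) * (c - mean_ret)
              for a, c in zip(reference, retrieved))
    slope = sxy / sxx
    intercept = mean_ret - slope * mean_ref
    correlation = sxy / math.sqrt(sxx * syy)

    return {
        "mean_retrieved": mean_ret,
        "absolute_bias": bias,
        "standard_error": std_err,
        "t_score": t_score,
        "relative_bias": bias / mean_ref,      # assumed definition
        "rmse": rmse,
        "correlation": correlation,
        "slope": slope,
        "intercept": intercept,
        "n": n,
    }


stats = comparison_statistics(aeronet_aod, calipso_aod)
for name, value in stats.items():
    print(name, value)
```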

Table 2. Statistical indicators for CALIPSO and MODIS comparisons under different LR assumptions for CALIOP (40 sr vs. 58 sr). Average CALIPSO aerosol optical depth at 532 nm (τ_C), absolute bias (B_a), absolute standard error (σ_a), Student's t test score (t), p value (p), relative bias (B_r), root-mean-square error (RMSE), correlation coefficient (R_fit), slope (S_fit) and intercept (I_fit) of the linear fit and number of comparisons (N) are shown. Average MODIS aerosol optical depth at 532 nm for this dataset is 0.187.
Level 2 products for the domain of interest, so that the monthly averaged products could be compared with the results of BSC-DREAM8b dust simulations.

2013 V. Amiridis et al.: Optimizing CALIPSO Saharan dust retrievals