Evaluation and comparison of multiangle implementation of the atmospheric correction algorithm, Dark Target, and Deep Blue aerosol products over China

A new multiangle implementation of the atmospheric correction (MAIAC) algorithm has been applied in the Moderate Resolution Imaging Spectroradiometer (MODIS) sensor and has recently provided globally high-spatial-resolution aerosol optical depth (AOD) products at 1 km. Moreover, several improvements have been modified in the classical Dark Target (DT) and Deep Blue (DB) aerosol retrieval algorithms in MODIS Collection 6.1 products. Thus, validation and comparison of the MAIAC, DT, and DB algorithms are urgent in China. In this paper, we present a comprehensive assessment and comparison of AOD products at a 550 nm wavelength based on three aerosol retrieval algorithms in the MODIS sensor using ground-truth measurements from AErosol RObotic NETwork (AERONET) sites over China from 2000 to 2017. In general, MAIAC products achieved better accuracy than DT and DB products in the overall validation and accuracy improvement of DB products after the QA filter, demonstrating the highest values among the three products. In addition, the DT algorithms had higher aerosol retrievals in cropland, forest, and ocean land types than the other two products, and the MAIAC algorithms were more accurate in grassland, builtup, unoccupied, and mixed land types among the three products. In the geometry dependency analysis, the solar zenith angle, scattering angle, and relative azimuth angle, excluding the view zenith angle, significantly affected the performance of the three aerosol retrieval algorithms. The three products showed different accuracies with varying regions and seasons. Similar spatial patterns were found for the three products, but the MAIAC retrievals were smaller in the North China Plain and higher in Yunnan Province compared with the DT and DB retrievals before the QA filter. After the QA filter, the DB retrievals were significantly lower than the MAIAC retrievals in south China. Moreover, the spatiotemporal completeness of the MAIAC product was also better than the DT and DB products.


Introduction
Aerosols are a multi-compartment system consisting of suspended solid and liquid particles in the atmosphere, which play an important role in radiative forcing (Rajeev et al., 2001), regional climate (Qian and Giorgi, 1999;Feng et al., 2019), and urban air pollution (Dominici et al., 2014). The aerosol optical depth (AOD) is the key aerosol optical parameter, defined as the vertical integration of the aerosol extinction coefficient from the ground to the top of the atmosphere (TOA). Ground measurements from the AErosol RObotic NETwork (AERONET) provide high-quality multiband aerosol optical and microphysical properties at 15 min sampling frequencies on a global scale (Holben et al., 1998). High-quality ground measurements are often employed to validate satellite aerosol products  and to provide a regional aerosol model for the satellite aerosol retrieval algorithm . However, they cannot grasp the high aerosol spatial variability due to the sparse ground sites where spatial variability information is still necessary. Though some active remote-sensing meth-ods, e.g., spaceborne lidar, can monitor vertical distribution of aerosol, they still cannot observe high aerosol spatial variability (Huang et al., 2007;Jia et al., 2015;Liu et al., 2015). Although model-simulated AOD can obtain spatially continuous data, its very coarse resolution and large uncertainties limit its application (Sun et al., 2019;Cesnulyte et al., 2014). In contrast, the satellite aerosol retrieval algorithm has the ability to achieve continuous spatial measurements with high spatial resolution (She et al., 2017).
The Moderate Resolution Imaging Spectroradiometer (MODIS) sensor with its multiband detection ability from the visible band to thermal infrared spectrum band (Salomonson et al., 1989) can readily detect aerosol properties. With the Terra satellite and Aqua satellite carrying the MODIS sensor successfully launched in 2000 and 2002, respectively, MODIS has stored over 17 years of historical globally monitored data. Recently, a new multiangle implementation of the atmospheric correction (MAIAC) algorithm has been applied in the MODIS sensor, which provides high-spatial-resolution aerosol data at 1 km (Lyapustin et al., 2018). Moreover, some important improvements in classical Dark Target (DT; Mattoo, 2017) and Deep Blue (DB; Hsu, 2017) aerosol retrieval algorithms have been revised in MODIS Collection 6.1 products. However, all satellite aerosol retrieval algorithms are under some hypothesis and approximation assessments, and the accuracy should be validated before applying a satellite aerosol product in related studies.
China is experiencing severe aerosol pollution, and numerous studies on aerosol pollution have utilized MODIS Collection 6.0 aerosol retrievals to map aerosol pollution and to analyze its spatiotemporal trends (Fang et al., 2016;Ma et al., 2014;He and Huang, 2018a, b;Zou et al., 2016Zou et al., , 2019Zhai et al., 2018). Few studies have applied 1 km MAIAC aerosol retrievals to map finer aerosol concentrations in regional China, e.g., the Yangtze River Delta (Xiao et al., 2017) and Shandong Province . Before widely applying MAIAC and C6.1 products in China, the accuracy differences and applicable conditions of the three aerosol retrievals should first be recognized to guide the utilization of these products. Recently, the global validation (Lyapustin et al., 2018) and regional validation in South America (Martins et al., 2017), North America (Superczynski et al., 2017), and South Asia (Mhawish et al., 2019) for MAIAC products has shown that more than 66 % of retrievals fall within the expected error (EE = ±(0.05 + 0.05 × AOD)) limits, indicating a good accuracy for MAIAC products. In China, a comprehensive validation of the C6.1 product was initially performed  and then the MAIAC product was relatively simply evaluated against ground AERONET measurement in different seasons and land cover types and at different sites . Thus, an urgent demand persists for a detailed comparison of the three products to guide user selection of these products.
In this context, we provide the first comprehensive understanding and comparison of the aerosol retrieval uncertainties for MAIAC, DT, and DB products in China based on spatiotemporal accuracy differentiation patterns, spatiotemporal completeness, land type dependence characteristics, view geometry dependence characteristic aspects, and other features. The following paper is organized as follows: Sect. 2 briefly introduces three satellite products with their retrieval algorithm and ground AERONET data, the validation approach is clarified in Sect. 3, and Sect. 4 provides the detailed validation results and discussion. The conclusions are presented in Sect. 5.

Data description
Three aerosol products, e.g., MAIAC, DT, and DB, are stored in Hierarchical Data Format (*.hdf) files, and we obtain corresponding *.hdf files in the China region from 2000 to 2017 from the NASA Earthdata Search website (https:// search.earthdata.nasa.gov/search/, last access: 28 May 2019). Some ground surface types, e.g. snow, cloud, and desert, will increase the retrieval uncertainty; thus three products all provide a quality assurance (QA) flag to indicate the retrieval quality. Ground measurement aerosol data obtained from the AERONET website (https://aeronet.gsfc.nasa.gov/, last access: 28 May 2019) were used to validate the accuracy of three satellite aerosol products. Additionally, land cover data from the Geographical Information Monitoring Cloud Platform (GIMCP, http://www.dsac.cn/, last access: 28 May 2019) were utilized to analyze the land cover dependency for three satellite aerosol products.

DT products
The DT algorithm retrieves AOD parameters based on the assumption that the surface reflectance in two visible bands, e.g., 470 and 644 nm, presents a good linear relationship with the surface reflectance in the shortwave infrared (SWIR) band, e.g., 2119 nm, in dark, dense vegetated area, and the measurement in the SWIR band is transparent with the aerosol particle (Kaufman et al., 1997;Levy et al., 2013). The surface and aerosol information can then be decoupled from the TOA spectral reflectance. Compared with the DT algorithm in Collection 6.0, the DT algorithm in Collection 6.1 mainly revises the surface characterization over the land surface when the urban percentage is larger than 20 % (Gupta et al., 2016).
The DT algorithm produces two aerosol resolution products in Collection 6.0 and 6.1, e.g., 3 km × 3 km and 10 km × 10 km. The two resolution products share the same retrieval protocol except the use of different retrieval boxes. For example, the 10 km product organizes 20×20 group pixels with the three aforementioned band measurements at 500 resolution into the retrieval box, whereas the 3 km product combines three band measurements in the 6 × 6 pixel group into a retrieval box . The comparison be-tween the 10 km product and 3 km product from Collection 6.0 on the global scale  and the China region (He et al., 2017) shows that the accuracy of the 10 km product is superior to one of the 3 km product, even though the 3 km product provides finer-resolution aerosol retrievals. In this study, we consider the 10 km product of the newest Collection 6.1 version from the Terra satellite. In DT products, QA = 3 indicates high-confidence data, and QA = 1 indicates marginal-confidence data . In this paper, the scientific datasets (SDSs), named the "Im-age_Optical_Depth_Land_And_Ocean" without a QA filter and "Optical_Depth_Land_And_Ocean" with a QA filter (QA > 1 for ocean and QA = 3 for land), are extracted to compare the accuracy with and without the QA filter.

DB products
The DB algorithm retrieves the AOD parameter under the hypothesis that the surface reflectance in the Deep Blue band, e.g., 412 nm, is much smaller than in longer bands over bright surfaces, such as urban and desert regions (Hsu et al., 2004). First, the DB algorithm retrieves 1 km aerosol properties using the global surface reflectance database in visible bands, e.g., 412, 470, and 650 nm, and then aggregates 1 km pixels into a 10 km scale. In Collection 6.0, the surface reflectance database is improved using knowledge of the normalized difference vegetation index, scattering angle, and season . The ability to retrieve aerosol data over a bright surface for the DB algorithm greatly expands the coverage of aerosol retrieval. The general principles for collection of the 6.1 DB products are still the same as those in the Collection 6.0 version. The major improvements for Collection 6.1 DB products are in the radiometric calibration, heavy smoke detection, artifact reduction over heterogeneous terrain, surface model in elevated terrain and regional/seasonal aerosol optical models (Hsu, 2017).
The same as the DT products, SDSs named "Deep_Blue_Aerosol_Optical_Depth_550_Land" without the QA filter and "Deep_ Blue_Aerosol_Optical_Depth_550_Land_Best_Estimate" with the QA filter (QA = 2, 3 for land) in Collection 6.1 from the Terra satellite were selected for our study to validate the accuracy improvement by the QA filter. The solar zenith angle in the "Solar_Zenith" SDSs, view zenith angle in the "Sensor_Zenith" SDSs, solar azimuth angle in the "Solar_Azimuth" SDSs, sensor azimuth angle in the "Sensor_Azimuth" SDSs, and scattering angle in the "Scattering_Angle" SDSs were also assessed to determine the geometry dependence for DT and DB products.

MAIAC products
The MAIAC algorithm relies on the assumption that the surface reflectance changes slowly over time and shows high variability over space, whereas the aerosol loading changes very fast over time and varies only on a limited space scale. The main procedure of MAIAC is as follows: first, MAIAC resamples MODIS L1B measurements into a fixed 1 km grid, and then it adopts 4-16 d time series of resampled MODIS measurements to retrieve the surface Ross-Thick Li-Sparse (RTLS) bidirectional reflectance distribution function (Lucht et al., 2000) using the measurements in the SWIR band. Subsequently, the linear spectral regression coefficient (SRC) between 470 and 2119 nm for each 1 km grid is retrieved instead of using the empirical regression coefficient in the DT algorithm. Finally, the AOD parameter at 470 nm can be retrieved by searching the minimum spectral residual between the theoretical TOA reflectance of the lookup table and the measurements in the red and SWIR bands. The AOD is originally retrieved at 470 nm, and the AOD parameter at 550 nm is computed using the AOD parameter at 470 nm based on spectral properties, expressed by the regional aerosol model from the MAIAC lookup table. The detailed MAIAC algorithm has been described by Lyapustin et al. (2011).
Data used in this study were from the "Opti-cal_Depth_055" and "AOD_QA" SDSs, and data were collected from the Terra satellite. The data type of the "AOD_QA" SDSs is a 16-bit unsigned integer, and the best retrieved quality can be selected if 8-11 bytes of "AOD_QA" SDS bits are "0000", which indicates the retrieval pixel and its adjacent pixel is clear (Lyapustin et al., 2018). The solar zenith angle in the "cosSZA" SDSs, view zenith angle in the "cosVZA" SDSs, relative azimuth angle in the "RelAZ" SDSs, and scattering angle in the "Scattering_Angle" SDSs were also selected to analyze the view geometry dependence for MAIAC products.

AERONET data
AERONET is a global ground-based aerosol monitoring network that provides continuous optical and microphysical properties of aerosols at a 15 min sampling rate. The total uncertainty for the AERONET AOD parameter under cloudfree conditions is lower than ±0.01 for a wavelength longer than 440 nm and less than ±0.02 for shorter wavelengths (Holben et al., 1998). Some studies were also conducted to examine the properties of these high-quality measurements in China (Liu et al., 2011;Xia et al., 2007;Li et al., 2007;Che et al., 2008Che et al., , 2014Bi et al., 2014). These high-accuracy datasets support various satellite AOD product evaluation studies in the China region (Tao et al., 2015;Tian et al., 2018;He et al., 2017;Sogacheva et al., 2018). AERONET provides three quality levels of data, e.g., level 1.0, level 1.5, and level 2.0, in version 3. Here, we only selected qualityassured level 2.0 data as ground-truth data to validate the  satellite data. Figure 1 shows the locations of the selected 50 AERONET sites across China in this study. Table 1 reports the site name, longitude, and latitude of the selected sites. However, the AERONET site does not record aerosol measurements at 550 nm, and thus we interpolated the AOD parameter at 550 nm using the Ångström exponent in the two neighboring bands at 500 and 675 nm (Ångström et al., 1929;Eck et al., 1999) which can be shown by where τ 500 and τ 675 are the AOD parameter at 500 and 675 nm, respectively, τ 550 is the interpolated AOD parameter at 550 nm, α 500−675 is the corresponding Ångström exponent, and ln(*) is the logarithmic operator.

Land cover data
One key difficulty in the aerosol retrieval algorithm is to decouple surface and atmosphere information in the satellite apparent reflectance. Land cover information greatly af-fects atmosphere properties Feng and Zou, 2019). Understanding the uncertainties in a satellite aerosol retrieval algorithm for different land cover types is necessary (W. . GIMCP land cover data with 30 m resolution in the years 2000, 2005, 2008, 2010, and 2013 were used in this study. The first level of GIMCP land cover data includes cropland, forest, grassland, water, and built-up and unoccupied land. Among them, unoccupied land includes desert, saline-alkaline soil, swampland, bare land, and bare rock gravel, which mainly includes bright surfaces. The high spatial resolution and abundant land cover types support our studies. Figure 1 shows the first level land cover type across the China mainland in 2013.
3 Evaluation method

The selected spatiotemporal window
There are only a small number of matchup data between the satellite data and ground data when using the direct matching method, e.g., use only 1 pixel where the AERONET sites are located and ground measurement at the exact satellite overpass time, due to large numbers of missing data in AERONET or satellite data and the time delay between the satellite overpass time and AERONET sampling time. Therefore, under the assumption that aerosol information is homogeneous in a limited spatial and temporal lag, a suitable spatiotemporal window is often adopted to increase the matchup data number. Thus, satellite measurements in the spatial window around the AERONET sites are averaged, and ground measurements in the temporal window centered on the satellite overpass time are averaged. For 10 km DT and DB products, the selected spatial window is often 50 km × 50 km, and the temporal window is ±30 min He et al., 2017;Tao et al., 2015). For MAIAC products, Matins et al. (2017) described five different spatial windows, e.g., 3, 15, 25, 75, and 125 km, and four temporal windows, e.g., 30, 60, 90, and 120 min, to validate the MAIAC product over South America (Matins et al., 2017). The results showed that 25 km × 25 km and ±60 min are reasonable for the Terra satellite. For comparison with 10 km DT and DB products, we selected 30 km × 30 km as the spatial window closest to the best spatial window for the MAIAC product and employed the best temporal window ±30 min of the 10 km product because we also noticed that the validation accuracy is very close for the ±30 and ±60 min temporal windows in the results of Matins et al. (2017), although the matchup data number of the ±60 min temporal window is more than one of the ±30 min temporal windows (Matins et al., 2017).

Land cover types for the AERONET sites
The first level of GIMCP land cover data was used to label the AERONET site group. Due to the selected 30 km × 30 km spatial window in Sect. 3.1, we labeled the AERONET sites if the proportion of one land cover type exceeded 50 % in the spatial window around the AERONET site. If there was no dominant land cover type, we defined the land cover type for this AERONET site as a mixed group. Except for the defined first level type in the GIMCP land cover data, we found some coastal AERONET sites in which the dominant region was ocean, so we defined the land cover type for these sites as the ocean group. Table 2 shows the land cover types for each AERONET site in 2013. There were no land cover type changes for most sites except Hangzhou_City, Muztagh_Ata, and NAM_CO. For the Hangzhou_City site, the land cover type changed from cropland to mixed group from 2005 to 2008, potentially due to the process of urbanization. For the Muztagh_Ata site, the land cover type changed from unoccupied land to grassland from 2008 to 2010, and the land cover type for the NAM_CO site varied from grassland to the mixed group between 2008 and 2010. We labeled each matchup dataset for the three sites using the land cover type in the nearest year to the AERONET sampling time.

Statistical approach
The expected error (EE) envelope is often used to validate satellite retrieval uncertainties. More than 66 % of retrievals falling within the expected error lines indicate good accuracy. For the DT algorithm, the EE envelope is generally defined as ±(0.05 + 0.15 × AOD) over land, and over 66 % of retrievals meet the defined expected error limits at the global scale (Levy et al., 2010Remer et al., 2005). In the global-scale validation for the MAIAC product, over 66 % of retrievals satisfy the ±(0.05 + 0.1 × AOD) EE limits, demonstrating that the accuracy of MAIAC is relatively higher than the DT algorithm over land (Lyapustin et al., 2018). In the regional validation of South America and South Asia for the MAIAC product, the EE envelope is defined as ±(0.05 + 0.05 × AOD) and ±(0.05 + 0.1 × AOD) respectively, and the fraction of retrievals within these EE limits are all over 66 % (Mhawish et al., 2019;Matins et al., 2017). In our study, to compare DT and DB products, we adopted ±(0.05 + 0.15 × AOD) as the EE envelope and calculated the proportion within the EE envelope (Within_EE) using Eq. (2): (2) In addition to the EE envelope, we also adopted coefficient of determination (R 2 ) and Pearson correlation coefficient (R) to study the correlation between the satellite AOD and AERONET AOD. The root-mean-square error (RMSE) was also utilized to analyze the dispersion degree of accuracy of the satellite AOD. The mean bias (bias) was used to describe the bias of the satellite AOD. These statistical indicators were calculated using Eqs.
AOD sat and AOD aero are the satellite AOD retrievals and AERONET data, respectively. AOD sat and AOD aero are the corresponding mean values. N is the matchup data number. In order to compare the spatiotemporal completeness of three products, daily spatial completeness and the temporal completeness are defined by Eqs. (7) and (8). Spatial completeness = (7) available AOD pixel numbers the total number of pixels in the study region × 100 % Temporal completeness = available AOD numbers in each pixel during the study period The length of the study period × 100 % All the statistical indicators are calculated for three products before and after the QA filter to indicate the accuracy improvements and the reduction of spatiotemporal completeness by the QA flag.
4 Results and discussion 4.1 Overall accuracy comparison Figure 2 shows the overall evaluation for MAIAC, DT, and DB products before and after the QA filter. In total, MAIAC products have more matchup data than DT and DB products, which indicates the completeness of the MAIAC product may be higher than the DT and DB products. Before the QA filter, the statistic showed that 69.84 % of retrievals fall within the EE envelope, indicating a good accuracy for MA-IAC products in China. Compared with DT and DB products, only 53.64 % and 55.66 % of retrievals were determined for DT and DB products. Based on the R statistical result, the results for the three products were all greater than 0.9, indicating that the three products are all well correlated with the ground-truth AERONET data. By contrast, the R 2 statistical result for MAIAC products, e.g., 0.847, was superior to those for the DT and DB products, e.g., less than 0.8. From the bias statistical result, no significant bias was observed for the overall MAIAC product. However, according to the corresponding bias box plot in different AOD bins, a slight overestimation was observed when the AODs were less than 0.5, and a slight underestimation was observed when the AODs were between 0.5 and 1. The DT and DB products appeared to be less overestimated based on the bias result. From the corresponding bias box plot, the mean bias result for each different AOD bin was also almost greater than zero. After the QA filter, the correlation for the MAIAC product slightly improved, but the Within_EE result was slightly reduced, and the RMSE and bias results increased. From the corresponding bias box plot subfigure, the positive mean biases when the AODs were less than 0.2 increased compared with corresponding results before the QA filter, and the negative biases when the AODs were between 0.5 and 1 were reduced. These phenomena resulted in the reduced overall accuracy. The reason for the changes in these statistical indicators will be explained in Sect. 4.2. For the DT and DB products, the overall accuracies were all improved after the QA filter. The improvement of the DB product was more obvious than for the DT product. The Within_EE result was improved from 57.66 % to 63.32 %, and the mean biases in the bias box plot showed no obvious overestimation trend after the QA filter. However, the DT product was still overestimated after the QA filter, and only a little improvement was achieved in the Within_EE result.
To analyze and compare the retrieval accuracy at different AOD levels for three products, four bins with different levels, low level (< 0.2), moderate level (0.2-0.4), moderate-high level (0.4-0.6), and high level (> 0.6), are defined . Table 3 shows the corresponding statistical results. At the low, moderate, and moderate-high levels, all statistical indicators showed that the MAIAC product had better accuracy than the DT and DB products before the QA filter. At the high level, the DT product achieved the highest correlation with the ground-truth data and low RMSE results, but the positive bias result for the DT product revealed that the overestimation phenomenon was more serious than for the other two products. After the QA filter, the accuracy of the DB product was higher compared with the other two products at the low level because the positive bias phenomenon became more severe for the MAIAC product at this level. At the moderate level, the MAIAC product demonstrated the best correlation and lowest RMSE results with a slightly higher positive bias than the DB product. At the moderate-high level, the MAIAC product remained the best quality product among the three products. At the high level, the DT product achieved the best correlation and lowest RMSE with the highest positive bias. Figure 3 shows a scatterplot figure of the MAIAC products in different land cover types before and after the QA filter. In total, MAIAC retrievals in cropland, built-up, grassland, and ocean types were more accurate than forest, unoccupied land, and mixed types according to the Within_EE results. After the QA filter, except for grassland, the accuracies all improved, and the improvement effect in ocean type was more obvious.

Land cover type dependency analysis
The high aerosol loading, e.g., AODs > 1, mostly emerged in cropland (Fig. 3a-i and a-ii) and built-up (Fig. 3d-i and dii) types due to biomass burning in the dry season and multiple human activities in the built-up area (Zhang et al., 2010;. MAIAC retrieved AODs with a very high accuracy for the two land cover types. The R and R 2 results were over 0.93 and 0.84, respectively, and the Within_EE results showed that more than 74 % of retrievals fell within the EE limits. In comparison, retrievals in cropland showed little bias, in contrast to a small positive bias in the builtup area, and RMSE results in the built-up area were smaller than those in the cropland area. This high retrieval accuracy in cropland and built-up regions can support relative studies on biomass burning and anthropogenic emissions. In evergreen forest areas (Fig. 3b-i and b-ii), the retrievals showed a good correlation with ground measurements, with R no_QA = 0.874 and R QA = 0.904. However, the R 2 results without and with the QA filter were all lower than 0.8, and only approximately 45 % of retrievals fell within the EE envelope. The result is opposite to the conclusion that the MA-IAC algorithm improves the dark target retrieval accuracy better than the DT algorithm in Lyapustin et al. (2011). To eliminate the influence of retrieval accuracy at a specific site, Fig. 4 shows a scatterplot figure of the forest AERONET site, ignoring the sites with matchup numbers less than 10. We can observe good performance at the Chiayi, Qiandaohu, and Xinglong sites, and the corresponding Within_EE results were all higher than 70 %. The relatively inferior performance sites were Banqiao and Taipei_CWB. After the QA filter, the accuracies were improved to 76.19 % and 61.79 %, respectively. The site with the worst performance was only the Lulin site, where the MAIAC retrievals were systemically higher than the ground measurements, and less than 4 % of retrievals fell within the EE limits. The percentage of forest type in the 30 km × 30 km spatial window around the Lulin site always exceeded 80 % in 2000, 2005, 2008, 2010, and 2013. This high proportion of forest type eliminates the influence of other mixed land cover type. The Lulin site is located on the Taiwan peninsula, and thus the improper aerosol type in the MAIAC algorithm and cloud cover may explain the overestimation at the Lulin site (Lyapustin et al., 2018).
For the grassland type ( Fig. 3c-i and c-ii), over 83.68 % of MAIAC retrievals fell into the EE lines before the QA filter, and the R 2 = 0.750, R = 0.875, RMSE = 0.085, and bias = −0.018 results all showed good accuracy in the grassland type. However, after the QA filter, the accuracy was dramatically decreased with Within_EE = 46.02 %, R 2 = 0.687, R = 0.868, bias = 0.051, and RMSE = 0.114, representing the main reason for some of the decreased overall statistical results shown in Fig. 2 for the MAIAC product after the QA filter. It is noteworthy that some values were underestimated when the AODs were less than 0.5, and these values were discarded after the QA filter. However, some overestimated values emerged when the AODs were very small. To identify the reason, we also performed a statistical validation for each grassland type site in Fig. 5, excluding the site with a matchup number less than 10. Before the QA filter, the underestimated values were mainly at the NAM_CO and QOMS_CAS sites. These two sites are located on the Tibetan Plateau. The MAIAC algorithm filled the AOD retrievals using climatology values, e.g., 0.014, in high-altitude regions, e.g., over 4.2 km, and the QA for climatology values was 0111 (Lyapustin et al., 2018). After the QA filter, the climatology values were thrown away at the NAM_CO site. For the QOMS_CAS site, nearly 2.13 % of pixels still had altitudes less than 4.2 km in the spatial window. MAIAC retrievals in these pixels were overestimated compared with the ground measurements. After the QA filter, the Within_EE results decreased from 92.26 % to 38.53 %. A severe under- estimation phenomenon was found at the Lanzhou_City site, in contrast to the positive bias at its closest SACOL site. The small matchup number for the Lanzhou_City site might be the reason for the underestimation phenomenon. A great improvement was found at the Muztagh_Ata site after the QA filter.
MAIAC had good accuracy in the unoccupied land cover type (Fig. 3e-i and e-ii), with Within_EE results of 67.44 % and 71.43 % before and after the QA filter, and R and R 2 results over 0.9 and 0.8, respectively. Figure 3f-i and f-ii indicate that MAIAC also achieved better performance in the mixed land cover area, with Within_EE = 66.80 % and R = 0.882. In the ocean area (Fig. 3f-i and f-ii), MAIAC algorithm retrievals seemed to be overestimated when AODs were small, and the R = 0.796 result was a little worse than one of the other land types. After the QA filter, the overestimated values were discarded, and the accuracy was greatly improved from R = 0.796, Within_EE = 67.96 % to R = 0.921, and Within_EE = 78.22 %.
In comparison to DT and DB products, Table 4 shows the validation of the statistical results for the MAIAC, DT, and DB products with different land type covers. In cropland area, the accuracy of the DT product was evidently better than that of the MAIAC and DB products according to the R 2 , R, and RMSE results. However, the values seemed to be overestimated compared with the MAIAC product, and the Within_EE result was a little smaller compared with the MAIAC product. In the forest area, the DT algorithm also achieved optimal accuracy compared with the MAIAC and DB products. However, only 56.23 % of the retrievals met the EE limits, which was less than the DB product. In the grassland type region, the accuracies for the three products were all decreased after the QA filter, and we consider the validations of the three products to have all been influenced by the NAM_CO and QOMS_CAS sites. Compared with DT and DB products, the MAIAC product obtained the best retrieval accuracy. Owing to the overestimation phenomenon at the QOMS_CAS sites after the QA filter, the Within_EE result dramatically dropped from 83.68 % to 46.02 %. In builtup, unoccupied land, and mixed regions, the MAIAC product performed better than the DB product, and the DB product was more accurate than the DT product. In the ocean region, the DT product was clearly more accurate than the DB and MAIAC products. Table 5 shows the validation accuracy for three products after the QA filter in four seasons. In cropland, the retrieval accuracies in autumn for the three products were better than in other seasons. For forest land types, three products showed a higher correlation in autumn than the other seasons, but the Within_EE values demonstrated the best results in winter, and the corresponding results for DB products were clearly higher than for the other two products. In terms of grassland type, MAIAC and DB products were more accurate in summer and spring, respectively. In the built-up region, all products showed a high correlation in all seasons, but DT products were seriously overestimated. In unoccupied land, matchup pairs for MAIAC and DB products were more focused in spring, and MAIAC products performed better than DB products. A high correlation was also found for the three products in mixed and ocean regions in all seasons, but more MAIAC retrievals met the EE envelope line.
The Ångström exponent (AE) is a key parameter to describe aerosol particle size, and in general, local aerosol sources play a dominant role in aerosol regimes (Mhawish et al., 2019). To discover aerosol particle sizes in different land covers, Fig. 6 shows a scatterplot of the AE (440-675 nm)  parameter versus AOD for different land cover types. Our results were similar to those of Martins et al. (2017). The aerosol types in China are mainly fine-mode aerosol particles (AE > 1). Some coarse-mode particles (AE < 0.5) are mainly found in some regions with sparse vegetation, e.g., grassland (vegetation coverage at selected site less than 20 %), built-up, and unoccupied land. As observed in Fig. 3, high AOD values mainly occurred in cropland and built-up areas. According to the AE parameter, the aerosol types for these high AOD values were mainly fine-mode aerosol particles. Figure 7 presents the AOD bias distribution along with the AE parameter. A higher AOD bias often occurred when the AODs were higher than 0.8 with 1 < AE < 1.5. There was no AE dependence when the AODs were very small, e.g., lower than 0.1, for the three products. However, MAIAC seemed to have a more positive bias than the DB product at a very small level.

View of the geometry dependency analysis
To determine how the view geometry influences the accuracy for three retrieval algorithms, we analyzed view geometry dependency using the following four angles: solar zenith angle (SZA), view zenith angle (VZA), scattering angle (SA), and relative azimuth angle (RAA) (Superczynski et al., 2017;. We separated each kind of angle into 10 bins and statistically analyzed the AOD bias distribution in each bin. The results are displayed in Fig. 8. In terms of the solar zenith angle, the three retrieval algorithms all showed a strong dependency with different characteristics. A slight downtrend along with SZA was found in the MAIAC algorithm, and the MAIAC retrievals seemed slightly overestimated when SZA was less than 40 • and underestimated when it was larger. The mean biases only fluctuated between −0.05 and 0.05. For the DT algorithm, the mean bias first arose when the SZAs were small, and the mean bias reached the maximum at SZA ≈ 25 • . Then, the mean biases decreased as SZA increased. The mean biases were close to zero when SZA reached the maximum value. With regard to the DB algorithm, the mean bias first slowly decreased when the SZAs were less than 35 • and then rapidly rose as SZA increased. After the QA filter, the whole mean bias line shifted down.
The MAIAC and DB algorithms showed no dependency on the view zenith angle, and the corresponding mean bias lines did not fluctuate much along with VZA. Compared with the results obtained before and after the QA filter, the mean bias line for the MAIAC algorithm slightly increased, and the mean bias line for the DB algorithm moves down to a relatively large degree. VZA slightly affected the DT performance with a little downtrend. After the QA filter, the mean bias line slightly declined.
The scattering angle also greatly impacted the performance of the three retrieval algorithms. MAIAC retrievals seemed to be underestimated when the SAs were less than 100 • and slightly overestimated when they were between 100 Table 4. Comparison of the retrieval accuracy of the MAIAC, DT, and DB products for different land cover types before and after the QA filter. "-" means no matchup pairs or that the matchup pairs number fewer than 10. The bold number is the highest peformance among three algorithms by each indicator. and 155 • . When the SAs were larger than 155 • , the retrievals tended to be underestimated. After the QA filter, the corresponding retrievals at large SAs tended to be overestimated. For the DT and DB retrievals, a significant uptrend was observed for the mean bias along with SAs. Small positive biases were found when the SAs were very small, and large positive biases occurred when the SAs were very large. After the QA filter, the significant uptrend was alleviated for DB retrievals, but a large negative bias was found when SA approached 180 • . We consider the scarce matchup number of DB retrievals to be the main reason for the negative bias. For the MAIAC algorithm, positive biases occurred as RAA approached the extremes of 0 • and 180 • , and negative bias emerged as RAA neared 90 • , where the matchup numbers were very limited in the three angle intervals. In the other angle intervals, MAIAC showed no dependency on RAA. After the QA filter, a downtrend of the mean bias was apparent along with RAA during backscattering (RAA < 90 • ), and an uptrend of the mean bias was observed during forward-scattering (RAA > 90 • ). For the DT algorithm, the positive mean bias decreased as RAA increased upon backscattering and first increased and then decreased upon forward-scattering. After the QA filter, the downward trend tended to be alleviated upon backscattering. For the DB algorithm, upon backscattering, the positive mean bias first decreased from very high to zero and then increased to become somewhat high. Upon forward-scattering, the positive mean biases were all larger than 0.05. After the QA filter and Table 5. Comparison of the retrieval accuracy of the MAIAC, DT, and DB products for different land cover types in four seasons after the QA filter. "-" means no matchup pairs or that the matchup pairs number fewer than 10. The bold number is the highest peformance among four seasons by each indicator. upon backscattering, no dependency on RAA was observed for the DB algorithm, but the highest mean bias was lower than zero. Upon forward-scattering, an obvious linear downtrend from positive to negative bias was observed as RAA increased.

Analysis of the spatiotemporal retrieval accuracy
To investigate retrieval accuracy of the three algorithms at different regions and different times, Fig. 9 shows the R, RMSE, bias, and Within_EE results for each AERONET site, ignoring the sites with fewer than 10 matchup numbers, which might cause unreliable statistical results. Three products presented different retrieval accuracies in different regions. In the BTH region (marked by the black box in Fig. 1), three products showed a good correlation with the ground measurements, e.g., R > 0.9. There were, however, more retrievals for MAIAC and DT products falling within the EE limits than the DB product. Based on the Bias results, the DT and DB products seemed to be overestimated compared with the MAIAC product. The DT product was Figure 6. Scatterplot of AOD at 550 nm against the Ångström exponent for different land cover types. We selected AERONET sites with maximum observations for each land cover type: XiangHe (cropland); Taipei_CWB (forest); QOMS_CAS (grassland); Beijing (built-up); Dunhuang (unoccupied land); Hong_Kong_PolyU (ocean). more positively biased compared with the MAIAC product. In the YRD region, the within_EE results showed that more MAIAC retrievals met the EE limits than DT and DB products. A good correlation of the three products was also found in this region. However, the DT product was overestimated, and DB was underestimated. In the PRD region, the MAIAC retrievals were obviously more accurate than the DT and DB retrievals. The Within_EE results for the MAIAC retrievals in this region were all greater than 70 %. The Within_EE results for the DT retrievals were relatively low for some sites before the QA filter. After the QA filter, the Within_EE results were greatly improved. DB retrievals in this region demonstrated the worst performance with low Within_EE results, a bad correlation, and a negative bias. In addition, the MAIAC product was also the most accurate product in the NW area. The Within_EE and R results overall were higher than for the DT and DB products. Additionally, the RMSE results for the MAIAC product in this region were also relatively lower than those for the BTH and YRD regions. The Within_EE results for the MAIAC product for most sites in the west of Taiwan were higher than 66 % after the QA filter, demonstrating a high accuracy compared with DT and DB products. However, according to the east site, e.g., Lulin, the MAIAC retrievals seemed to be overestimated with low Within_EE and R results. Additionally, DB retrievals at the Lulin site seemed to be unbiased with high Within_EE (over 70 %) and relatively high R (over 0.8) results. In the Tibet area, three algorithms all failed to retrieve AODs according to the statistical results due to the high latitude and snow cover. Figure 10 presents the monthly validation results for the three products. We overlooked the specific QOMS_CAS site for this purpose due to its poor performance after the QA filter, which would affect the overall accuracy. Three products showed a good correlation with the ground measurements for all months with R > 0.85. The AOD deviation for the DT and MAIAC products was higher in spring and summer than autumn and winter, consistent with the results of He et al. (2017). The RMSE results for the DB products were generally higher than the DT and MAIAC products before the QA filter. After the QA filter, the RMSE results decreased with no obvious seasonal variability law. The DT product seemed to be systematically overestimated, and the positive biases were extremely high in spring and summer. The MA-IAC product was positively biased from June to October with a bias < 0.1. The DB product was positively biased in all seasons before the QA filter, but the bias results from June to October were significantly reduced after the QA filter. Before the QA filter, the Within_EE results for the MAIAC product were higher than the DT and DB products in all months. However, less than 60 % of the MAIAC retrievals fell within the EE limits in summer. After the QA filter, the Within_EE results for the DB product from June to September were superior to those of the MAIAC and DT products. The R 2 results for the MAIAC products were stable for all months, and most R 2 results were over 0.8. The DB product had a lower R 2 in the cold season from November to February, and in April and May, the R 2 results for the DT product were generally lower than those in the other months. After the QA filter, the DB product achieved higher R 2 results from April to September. According to the matchup number results, the MAIAC product had more matchup numbers than the DT and DB products. However, all products had fewer matchup numbers in summer due to the increased cloud cover in the rainy season. In summary, the MAIAC product was more accurate than the DT and DB products except for in summer. In contrast to the positive bias of MAIAC retrievals in summer, the DB product after the QA filter could achieve unbiased results with higher Within_EE and R 2 than the MAIAC product.
We investigate the annual change in retrieval accuracy for three products to ascertain whether the MODIS instrument maintains its performance due to it exceeding its designed lifetime. However, according to Table 1, the time durations of each AERONET site were significantly different. Thus, the matchup observation pair during each year was from different sites. This phenomenon may result from incomparable validation results for each year. However, if only considering the sites with the same monitoring time, most sites will be discarded, and fewer matchup numbers will cause unreliable corresponding statistical results. Thus, we still adopted all site measurements. We ignored the results for the years 2000, 2001, 2002, and 2003 due to fewer matchup numbers in these years. According to Fig. 11, three products showed a high correlation with ground measurements according to the R and R 2 results, except in 2009. The reason for the sharp decline in the R and R 2 results in 2009 was mainly that some sites, e.g., Qiandaohu, SACOL, Kaiping, Shouxian, and Zhangye, did not have matchup pairs in this year, and matchup pairs containing bad retrieval satellite pixels around the Lanzhou_City and NAM_CO sites appeared in 2009. Thus, the high correlation revealed MODIS instrument results consistent the AOD retrievals from 2000 to 2017. The MAIAC AOD deviation was generally small, with most RMSE results being lower than 0.2 and larger than 0.15. The RMSE results for the DB product were generally larger than 0.2 before the QA filter. After the QA filter, the RMSE results varied in a large range from 0.15 to 0.25. Based on the bias result, there was a significant uptrend for the three products over the year. The MAIAC bias results were generally smaller than the DT and DB products, and most bias results for the MAIAC product were within ±0.05, with a negative bias before 2010 and a positive bias after 2010. To eliminate the influence of the contribution of some specific sites in the specific year, Fig. 12 plots bias time series for five AERONET sites with a monitoring time covering most study years and ignoring data with matchup numbers less than 20. The bias uptrend seemed to appear in three products for all selected sites except the EPA-NCU site for DT products. Thus the significant uptrend of bias results is not caused by the significantly different time durations of AERONET sites. For the Within_EE results, MAIAC also showed better accuracy than DT and DB products, and for Within_EE results, a slight declining trend was observed over the year.
The matchup numbers for the three products revealed an increasing trend due to the establishment of greater numbers of AERONET sites in the China region over time.

Analysis on spatial pattern variation difference
To compare the difference in spatial variations for the three products, we upscaled the MAIAC product to match the grid of the DT and DB products; thus, 1 km pixels falling within the 10 km grid were averaged. Such a protocol can aid in investigating differences in different regions between the three products. Figure 13 presents multiyear averaged and difference results between MAIAC, DT, and DB products, with aerosol loading presenting a noteworthy assembly characteristic.
Higher AOD values were concentrated in the North China Plain and Sichuan Basin where the land cover types were mainly cropland-oriented, as shown in Fig. 1. Before the QA filter, compared with the DT and DB observations, the MA-IAC AODs were smaller in the North China Plain and larger in Yunnan Province and east Taiwan. After the QA filter, the DB AODs became smaller in the North China Plain and southeast region. Compared with the DB AODs, the MAIAC AODs became slightly higher in the North China Plain (difference over 0.1) and obviously higher in southeast China (difference over 0.3). Recall the statistical result presented in Fig. 9, in which the DT and DB products were overestimated in the BTH region, the DB product was underestimated in the YRD region, and the MAIAC product seemed to be overestimated in east Taiwan. These findings indicate that MAIAC Figure 8. Dependency of the AOD bias on the solar zenith angle, view zenith angle, scattering angle, and relative azimuth angle for the (a) MAIAC product, (b) DT product, and (c) DB product before and after the QA filter. The dark blue bar is the histogram bin, red points in the shadowed area are the mean bias of the corresponding bin, and the top and bottom blue lines are the 75 % and 25 % percentiles of the AOD biases in the corresponding bin, respectively. retrievals are more accurate than DT and DB in the North China Plain and southeast region, and DB retrievals are more accurate than MAIAC in east Taiwan. However, due to the lack of the AERONET site in Yunnan Province, we could not evaluate the accuracy of the three products in Yunnan Province. The difference before and after the QA filter for the MAIAC product was very small, except for some individual pixels in the Tibet region. In addition, there was an obvious boundary in the 30 • latitude for MAIAC AODs. This boundary was caused by the different regional aerosol models used above and below 30 • latitude (Lyapustin et al., 2018).  Figure 14 shows the seasonal comparison results among three products before and after the QA filter. The AOD spatial variation for the three products showed apparent seasonal characteristics. The AODs in the North China Plain in summer were higher than in other seasons, and the AODs in the Tarim Basin in spring were higher than in other seasons. Based on the AOD spatial variation difference map, the difference between MAIAC and DT in the North China Plain evolved gradually from negative in spring to positive in winter. The negative difference between MAIAC and DB in the North China Plain was higher in summer and winter than in spring and autumn. The positive difference in Yunnan Province between MAIAC and DT was slightly lower than that between MAIAC and DB. After the QA filter, AODs in south China for the DB product were extremely low compared with those for the MAIAC product.

Analysis of spatiotemporal completeness
Based the upscale MAIAC 10 km data in Sect. 4.4, the spatial completeness in Eq. (7) and temporal completeness in Eq. (8) for three products are shown in Figs. 15 and 16. According to Fig. 15, the spatial completeness of the MAIAC product was higher than the DT and DB products before and after the QA filter. The spatial completeness of the DT product was smallest due to its retrieval failure on a bright surface. The spatial completeness for all the products showed an ob-vious periodical trend change. Table 6 shows the statistics for the spatial completeness of the three products in different seasons. Before the QA filter, the averaged spatial completeness of MAIAC (46.87 %) was higher than DT (16.66 %) and DB (34.80 %). After the QA filter, the reduced proportion of MAIAC (17.18 %) exceeded DB (15.30 %) and DT (8.66 %) because many climatology values in the Tibet Plateau were discarded. Comparison of the spatial completeness in four seasons revealed a higher spatial completeness for the three products in autumn than the other three seasons due to the  reduced cloudiness in the dry autumn season. The spatial completeness in winter was smallest due to the influence of the surface snow cover and large deciduous trees. Compared with MAIAC and DB products, the spatial completeness of the DT product in winter was minimal due to the bright surface in winter. Figure 16 presents the temporal completeness in China for the three products. Due to the climatology values in the Tibet Plateau, the temporal completeness of the MAIAC product in   this region was very high (over 80 %). After the QA filter, the temporal completeness rapidly decreased in this region. In the other region, the declining proportions of temporal completeness for MAIAC were mostly lower than 10 %, except for Yunnan Province (nearly 15 %), Hainan Province (nearly 20 %), and east Taiwan (nearly 20 %). Compared with the MAIAC and DB products, DT retrievals were very scarce in the Tarim Basin due to failure on the bright desert surface. DT retrievals were more concentrated on the North China Plain and in Yunnan Province. After the QA filter, a dramatically reduced proportional area of temporal completeness (nearly 30 %) for DT products was observed in the cropland region in northeast China. The severely reduced proportional area (nearly 40 %) for the DB product after the QA filter was mainly focused on unoccupied land, e.g., gobi, saline-alkaline soil, at the top of the Tibet Plateau. Compared with the MAIAC product, before the QA filter, the DB product showed more retrievals in the Tarim Basin, North China Plain, and southeast China and fewer retrievals in Yunnan Province and northeast China. After the QA filter, the temporal completeness of the MAIAC product was better than the DB product in all regions.

Conclusion
In this study, we present the first comprehensive validation and comparison of three MODIS aerosol retrieval algorithms (i.e., MAIAC, DT, and DB) across China in terms of overall accuracy, land cover dependency, viewed geometry dependency, spatiotemporal retrieval accuracy, spatial distribution difference, and spatiotemporal completeness. These validation results may guide users to utilize the three products appropriately. The main results and conclusions are presented below.
In terms of overall accuracy, the MAIAC product is more accurate than the DT and DB products. The DT and DB prod- ucts are positively biased before the QA filter, and the positive bias for the DB product is alleviated by the QA filter.
DT retrievals in cropland, forest, and ocean seem to be more accurate but with a positive bias than retrievals by the MAIAC and DB algorithms. The MAIAC algorithm performs better in grassland, built-up, and mixed areas than the DT and DB algorithms.
Three algorithms show a strong dependency on SZA, SA, and RAA. VZA only marginally affects the retrieval accuracy of the three algorithms.
The MAIAC product performs better in the BTH, YRD, PRD, and NW regions than the DT and DB algorithms, and the DB product performs better than the DT and MAIAC products after the QA filter in east Taiwan. The MAIAC algorithm performs better than the DT and DB algorithms in most months except June, July, August, and September. In these four months, MAIAC retrievals appear to be overestimated, and DB retrievals after the QA filter are more accurate than MAIAC retrievals.
Three AOD products present a similar spatial pattern with high aerosol loading in the North China Plain and Sichuan Basin. In comparison, MAIAC retrievals are lower in the North China Plain and Sichuan Basin than DT and DB retrievals and are higher in Yunnan Province and east Taiwan than DT and DB retrievals. After the QA filter, the DB AOD values are significantly reduced and obviously lower than the MAIAC product in southeast China.
Based on spatiotemporal completeness analysis, the MA-IAC product has more retrievals in the spatiotemporal domain than the DT and DB products. The spatial completeness exhibits a strong periodical change, and the temporal completeness is highest in autumn compared to other seasons due to the decreasing cloud cover in this dry season, which is lowest in winter due to the snow cover and deciduous vegetation. In terms of temporal completeness, MA-IAC has more retrievals in the Tarim Basin and the cropland in northeast China compared with the DT algorithm. Compared with the DB algorithm, MAIAC has fewer retrievals in the Tarim Basin and southeast China and more retrievals in northeast China. After the QA filter, the temporal completeness of MAIAC in all regions of China is better than that of the DB product.