Many applications of geophysical data – whether from surface observations, satellite retrievals, or model simulations – rely on aggregates produced at coarser spatial (e.g. degrees) and/or temporal (e.g. daily and monthly) resolution than the highest available from the technique. Almost all of these aggregates report the arithmetic mean and standard deviation as summary statistics, which are what data users employ in their analyses. These statistics are most meaningful for normally distributed data; however, for some quantities, such as aerosol optical depth (AOD), it is well known that distributions on large scales are closer to log-normal, for which a geometric mean and standard deviation would be more appropriate. This study presents a method of assessing whether a given sample of data is more consistent with an underlying normal or log-normal distribution, using the Shapiro–Wilk test, and tests AOD frequency distributions on spatial scales of 1

Geophysical data are obtained from a variety of data sources and model simulations across many disciplines in the Earth sciences. As one example, aerosol optical depth (AOD) is often measured on the ground by Sun photometry

Reasons for preferring L3-type (i.e. aggregated) data for some applications over L2-type data include the decreased storage and computational overhead, the fact that aggregates are typically reprojected onto a regular grid and so are often more user-friendly, and a desire to have a data set with fewer gaps. Gaps can be caused by unfavourable retrieval conditions; for example, algorithms to retrieve atmospheric aerosol or surface reflective and emissive properties often require cloud-free, snow-free, and daytime scenes. Gaps also arise from the simple fact that surface and satellite observations do not observe every location all the time. Unfortunately, sampling incompleteness adds a representativity error to comparisons; in some fields, such as aerosol remote sensing, this can be difficult to quantify and is sometimes non-negligible

While the principles of uncertainty propagation in remote sensing are well established

This assumption runs counter to the fact that AOD at a given location tends not to be normally distributed, which has been indicated in the literature for at least 50 years. Writing in terms of aerosol-induced turbidity (directly proportional to AOD),

Note that Eq. (

Other studies published around this time

Daily and monthly averages of extinction at multiple locations presented by

Synthetic frequency distributions for log-normally distributed AOD with a mean of 0.2 and geometric standard deviation 0.35,

Due to this asymmetry, normal statistics (i.e. arithmetic mean

Difference between geometric- and arithmetic-mean AOD (

Figure

Log-normal distributions are common across quantities in the natural sciences and tend to arise when the underlying phenomenon is governed partly by multiplicative (rather than additive) factors;

Similar behaviour is found for many other remotely sensed quantities; for example,

Recent efforts by other researchers have helped with understanding spatial and temporal scales in AOD variations and their potential effects on data aggregates.

Several studies have sought to assess representation uncertainty in L3-type aggregates;

This analysis aims to complement these other recent studies, building most directly on

This analysis uses ground-based observations from AERONET, together with satellite retrievals from the Multiangle Imaging SpectroRadiometer (MISR) and MODIS instruments, and model simulations from the Goddard Earth Observing System (GEOS) Version 5 Nature Run (G5NR). All of these have different spatiotemporal sampling techniques and associated uncertainties in their estimates of AOD. Considering a diverse set of data sources such as this provides a more comprehensive picture of the frequency distributions of AOD than would be obtained from only a single data type. It allows the strengths of individual techniques to be used while helping to avoid erroneous conclusions stemming from limitations of individual techniques. The data sources are described below.

AERONET provides aerosol (and water vapour) data from Sun photometer measurements, obtained with standardised acquisition, calibration, and processing protocols. This analysis uses the latest Version 3 direct-Sun level 2 (cloud-screened, post-deployment-calibrated, and quality-assured) AERONET AOD data. Note that “level 2” in AERONET terminology refers to quality-assurance level, regardless of temporal aggregation level, as opposed to the satellite level 2, which refers to instantaneous data only. Version 3 includes improvements to sensor characterisation, site geolocation accuracy, and cloud–aerosol discrimination

All instruments deployed as part of AERONET provide AOD at 440, 675, 870, and 1020 nm at a minimum; the majority include additional channels between 340 and 1600 nm, with 500 nm being a common addition. In this analysis, AERONET AOD is interpolated spectrally to 550 nm, as this is a common reference wavelength for many satellite data products and model simulations, although the conclusions do not change if other wavelengths are used instead. Hereafter, mentions of AOD without a specified wavelength refer to AOD at 550 nm. The interpolation is performed with a least-squares fit of all available AERONET AODs within the 440–870 nm wavelength range (typically four, more for some configurations) to a quadratic polynomial,
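As a minimal sketch of this kind of spectral interpolation, the example below fits a quadratic polynomial by least squares and evaluates it at 550 nm. It assumes the fit is done in logarithmic space (ln AOD against ln wavelength), which is a common choice but is not confirmed by the text here. The function name `interpolate_aod` and the example spectrum are hypothetical.

```python
import numpy as np

def interpolate_aod(wavelengths_nm, aod, target_nm=550.0):
    """Estimate AOD at target_nm from a quadratic least-squares fit.

    Assumption: the fit is of ln(AOD) against ln(wavelength).
    """
    wl = np.asarray(wavelengths_nm, dtype=float)
    tau = np.asarray(aod, dtype=float)
    # Use only positive AODs in the 440-870 nm range mentioned in the text.
    keep = (wl >= 440.0) & (wl <= 870.0) & (tau > 0.0)
    if keep.sum() < 3:
        raise ValueError("need at least three valid channels for a quadratic fit")
    coeffs = np.polyfit(np.log(wl[keep]), np.log(tau[keep]), deg=2)
    return float(np.exp(np.polyval(coeffs, np.log(target_nm))))

# Made-up spectrum following a power law (Angstrom exponent 1.3, AOD 0.2 at 550 nm).
wavelengths = [440.0, 500.0, 675.0, 870.0]
spectrum = [0.2 * (w / 550.0) ** -1.3 for w in wavelengths]
aod_550 = interpolate_aod(wavelengths, spectrum)
```

Because a pure power law is linear in log space, the quadratic fit recovers the 550 nm value of this made-up spectrum almost exactly; real spectra with curvature would use the quadratic term.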

The latest version, 23, of MISR L2 data provides AOD at 558 nm, over land and ocean, with a horizontal pixel size of 4.4 km; the use of 558 rather than 550 nm has a negligible impact on the analysis here. The instrument includes nine cameras with a maximum swath width around 400 km, although the edges of the scan are not covered by all cameras and so retrievals are provided over a slightly narrower swath. This provides repeat views of a given scene roughly once per week at tropical latitudes and once every 3 d at high latitudes. MISR flies on the Sun-synchronous Terra platform, providing data from early 2000, with a 10:30 local solar equatorial crossing time. Separate processing algorithms are applied over land and dark water; version 23 updates and initial evaluation are provided by

This analysis uses 5 years (2004–2008) of L2 data, corresponding to around half a million retrievals per day (after accounting for unfavourable retrieval conditions). The choice of record length is a balance between the robustness of the analyses and storage and processing concerns; 1 year of the MISR L2 product (MIL2ASAE) corresponds to approximately 170 GB. As an order-of-magnitude estimate, assuming (on average) a revisit time of 5 d and half the data being unsuitable for retrieval due to, for example, cloudiness, approximately 200 views of a given point on the Earth would be expected over a 5-year period. While this would show considerable spatial variation, qualitatively it is expected to be sufficient, as it is well known from observations and modelling that the main features of the global aerosol system are systematic and repeat year-to-year
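The order-of-magnitude estimate above can be reproduced with a few lines of arithmetic. The variable names are illustrative; the inputs are the values quoted in the text.

```python
# Inputs quoted in the text: 5-year record, ~5 d average revisit time,
# and about half of overpasses unsuitable for retrieval.
years = 5
revisit_days = 5.0
usable_fraction = 0.5

overpasses = years * 365.25 / revisit_days
expected_views = overpasses * usable_fraction
```

This gives roughly 180 usable views per location, consistent with the "approximately 200" quoted as an order-of-magnitude figure.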

The MODIS instruments fly on the Terra and Aqua platforms; L2 data from the latest Collection 6.1 (C61) from Aqua (launched 2002) are used here for the same 5-year period as the MISR analysis. MODIS Aqua is thought to have slightly better radiometric performance than Terra

The 2330 km swath of MODIS results in near-global daily observations in the tropics and once-daily or twice-daily observations at higher latitudes. Retrievals are provided at a nominal horizontal pixel size of 10 km at the sub-satellite point. Towards the edge of the scan, the scan geometries and Earth's curvature cause a “bow-tie distortion” where pixels become larger and consecutive scans begin to overlap

The G5NR is a global 7 km non-hydrostatic mesoscale simulation based on the Ganymed version of GEOS-5

Aerosol output fields are provided on a 30 min time step on a 0.0625

The SW test is employed here as follows. First, spatial distributions of AOD are assessed by aggregating the MISR, MODIS, and G5NR data from their native spatial resolutions to 1

In each case, at least three data points are required for an aggregate to be considered valid; this is the minimum required for the SW test calculation and also the minimum number of observations for AERONET or MODIS standard processing to report a daily average value and the minimum number of days for MODIS products to report a monthly average value. The SW

The results will be interpreted in terms of relative frequencies of these four categories, as it is important to realise that the idiosyncrasies in real-world data complicate the estimation and calculation of
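A four-way classification of this kind might be sketched as follows, using `scipy.stats.shapiro` on the data and on their logarithms. The decision rule is an assumption made for illustration, not necessarily the procedure used in this study: the 0.01 mean-difference threshold is taken from the discussion of results, whereas the significance level `alpha`, the rule of choosing the larger p value, and the category labels are hypothetical.

```python
import numpy as np
from scipy import stats

def classify_sample(aod, mean_diff_threshold=0.01, alpha=0.05, floor=1e-4):
    """Assign a sample of AOD values to one of four (assumed) categories."""
    aod = np.asarray(aod, dtype=float)
    aod = aod[np.isfinite(aod)]
    if aod.size < 3:              # minimum sample size for the SW test
        return "invalid"
    aod = np.maximum(aod, floor)  # logarithms need strictly positive values
    arithmetic_mean = aod.mean()
    geometric_mean = np.exp(np.log(aod).mean())
    # Category 1: the choice of averaging method barely matters.
    if arithmetic_mean - geometric_mean < mean_diff_threshold:
        return "small difference"
    # SW test on the data (normality) and on ln(data) (log-normality).
    _, p_normal = stats.shapiro(aod)
    _, p_lognormal = stats.shapiro(np.log(aod))
    if p_normal < alpha and p_lognormal < alpha:
        return "neither"
    return "log-normal" if p_lognormal > p_normal else "normal"

# Deterministic synthetic samples built from normal quantiles (not real data).
z = stats.norm.ppf((np.arange(50) + 0.5) / 50)
lognormal_sample = np.exp(np.log(0.3) + 0.8 * z)
normal_sample = 0.5 + 0.15 * z
```

With these synthetic inputs, `classify_sample(lognormal_sample)` returns `"log-normal"` and `classify_sample(normal_sample)` returns `"normal"`. Tallying the labels over many grid cells or days would give relative frequencies of the kind mapped in the figures.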

Fraction of data falling into each of the four categories of Shapiro–Wilk test results for AOD distributions aggregated temporally over a day. Columns show (left) AERONET and (right) G5NR data (note that the latter were previously spatially aggregated to 1

Mean fraction of data falling into the four categories of SW test results.

Figures

Patterns shown between Figs.

In areas of low to moderate AOD, including the global oceans, mountains, and fairly clean continental regions, for a strong majority (typically 80 % or more) of days the difference between arithmetic- and geometric-mean AOD (

In southern and eastern Asia and parts of North Africa, where the AOD is often high, the difference between the arithmetic and geometric mean is more frequently (up to around half the time) larger than 0.01. This implies greater sensitivity to the choice of averaging method. For these cases, log-normality tends to be a better representation of the distributions than normality, although for a non-negligible fraction of the data, neither distribution shape provides a good fit.

As in Fig.

Figure

Histograms (grey) of 550 nm AOD observed at three AERONET sites on individual dates (given in panel titles), corresponding to different SW test classification results. Arithmetic- and geometric-mean AOD (

Days where neither normal nor log-normal distributions provide a good fit to AOD observations are commonly those where multiple regimes are present within a grid cell or during a day. Figure

Note also that the near-universal choice of aggregating daily on a UTC calendar day basis, rather than in terms of local solar time (LST), can further complicate matters for locations far from the meridian. For example, AERONET sites in eastern Asia, Australasia, and the western Americas often contain data from midnight to mid-morning UTC, with a long gap, and then from late evening to midnight UTC. The break in the middle is due to local nighttime, during which no data are collected; i.e. observations from a single UTC day can contain data from 2 local days. If something happens during this gap to affect the AOD distribution, which is often the case due to the diurnal variations or meteorology, this will naturally increase the chances of multimodality. Thus, something as basic as the definition of the day to aggregate to can affect the inferred AOD distribution shape. This could be contributing to some of the cases where neither distribution fits (Figs.
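The effect of the day definition can be illustrated with a small sketch that assigns observations to UTC days and to approximate local solar days. It uses mean solar time (a shift of 1 h per 15° of longitude) and ignores the equation of time. The site longitude and timestamps are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def local_solar_date(utc_time, longitude_deg):
    """Approximate local solar date: shift UTC by 1 h per 15 degrees east."""
    return (utc_time + timedelta(hours=longitude_deg / 15.0)).date()

# Hypothetical site at 135 E, where local solar time is about UTC + 9 h.
longitude = 135.0
observations_utc = [
    datetime(2010, 6, 1, 1, 0, tzinfo=timezone.utc),   # about 10:00 local, 1 June
    datetime(2010, 6, 1, 5, 0, tzinfo=timezone.utc),   # about 14:00 local, 1 June
    datetime(2010, 6, 1, 23, 0, tzinfo=timezone.utc),  # about 08:00 local, 2 June
]

utc_days = {t.date() for t in observations_utc}
local_days = {local_solar_date(t, longitude) for t in observations_utc}
```

All three observations fall on the same UTC calendar day, but they span two local solar days. Grouping by `local_solar_date` instead keeps each local daytime period in a single aggregate.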

While similar, the patterns in Figs.

G5NR temporal aggregates also show increased incidence of log-normality and of neither distribution fitting well in the Southern Ocean, while G5NR spatial aggregates do not; this implies diurnal cycles which affect the aerosol field here coherently on scales larger than 1

Fractional SW test assignment of spatial AOD variation within a day from selected AERONET DRAGON-like deployments and clustered sites.

AERONET also provides some opportunities to study the spatial distribution of AOD on horizontal scales of tens of kilometres to around 100 km, similar to L3 and global-climate-model resolution. These are mostly in so-called Distributed Regional Aerosol Gridded Observation Networks (DRAGONs) of up to several dozen sites, as detailed by

As in Fig.

As in Fig.

Maps of categorisation of monthly and seasonal AOD aggregates, in both cases from daily AOD, are shown in Figs.

Unlike daily aggregation, for monthly or seasonal aggregation the difference between arithmetic and geometric means is frequently more than 0.01. Thus, monthly and seasonal aggregates are more sensitive to the choice of averaging method. This implies that temporal variability on timescales of months and seasons is generally larger than spatial variability within a day, which is consistent with previous work

The exceptions to the above are very clean areas outside of aerosol transport paths: parts of the open ocean, Australasia, and mountainous or remote continental areas. Here, for much (but not always a majority) of the time, the AOD difference remains less than 0.01.

Downwind of major aerosol source regions, over both land and ocean, all data sets tend to be consistent more frequently with log-normal than with normal distributions.

The previous portion of the analysis focuses mostly on the occurrence and distinguishability of normal and log-normal distributions for AOD; also relevant are the magnitudes of the differences introduced into the data sets by the choice of averaging method and summary statistic. Figure

It is important to realise that the AOD difference

Even a small offset in reported AOD, if systematic, can have important implications for calculations of climate forcing. This is particularly true for aerosol–cloud interactions, as these are very sensitive to both the anthropogenic perturbation and the natural background state assumed. For example, using perturbed parameter simulations with global climate models,

In contrast to the daily results, and even in low–moderate AOD loadings around 0.3, for monthly aggregates (right panel of Fig.

As Fig.

A potential counterexample to the need to account for distribution shape is the case of particulate matter (PM) estimation from AOD retrievals, in which case it might be more sensible to report arithmetic-mean AOD than the geometric-mean AOD, even if the underlying distribution is log-normal. This is because the arithmetic mean is directly proportional to the total, while the geometric mean requires knowledge of distribution width as well to return total mass, and in PM studies it is often the total mass which is of interest. However, in practical terms, PM forecasts and nowcasts and daily exposure estimates typically use the finest resolution data available rather than aggregates

Median (symbols) and central 68 % (lines) of binned difference between geometric- and arithmetic-mean AOD (

The issue may also be less crucial in analyses where the purpose is to assess the offsets between two data sets (e.g. difference between L3 composites), as opposed to the geophysical fields themselves, as differences between the arithmetic means of two data sets and geometric means of the same two data sets are likely to be of the same sign. The magnitudes will, however, differ depending on the tails of the distributions. As indicated earlier, for AOD, sampling differences between data sets can often be a large determinant of observed offsets

Decadal trends (

As the differences between arithmetic and geometric mean are larger for higher-AOD regions (Figs.

AOD over the global ocean, and over many ocean basins, has not changed very much.

AOD over parts of eastern North America and Europe has decreased in recent decades.

Some of the strongest positive AOD changes tend to be seen over the Arabian Peninsula.

Using three long-term AERONET sites (one for each of the above features), Table

The purpose here is not to perform an exhaustive global trend analysis but to assess quantitatively the implications of log-normally distributed AOD for some well-reported features of global aerosol trends. Prior studies typically calculated trends by applying a linear least-squares regression fit to deseasonalised monthly mean AOD time series. Deseasonalisation was achieved either by subtracting the mean AOD annual cycle over the time period or, as in

Here, linear trends are calculated using both monthly and seasonal aggregates for both normal (i.e. arithmetic mean) and log-normal (i.e. geometric mean) aggregates, both calculated from daily AOD (
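A trend calculation of this general form can be sketched as below: monthly arithmetic and geometric means are built from daily values, the mean annual cycle is subtracted, and a linear least-squares fit gives the decadal trend. The daily AOD values are synthetic (log-normal with an assumed growth rate and seasonal cycle), not observations, and `monthly_trend` is a hypothetical helper.

```python
import numpy as np

def monthly_trend(monthly_values, months_per_year=12):
    """Decadal linear least-squares trend of a deseasonalised monthly series."""
    series = np.asarray(monthly_values, dtype=float)
    n_years = series.size // months_per_year
    by_year = series[: n_years * months_per_year].reshape(n_years, months_per_year)
    # Deseasonalise by subtracting the mean annual cycle over the record.
    anomalies = (by_year - by_year.mean(axis=0)).ravel()
    time_years = np.arange(anomalies.size) / months_per_year
    slope_per_year = np.polyfit(time_years, anomalies, deg=1)[0]
    return 10.0 * slope_per_year

# Synthetic daily AOD: log-normal scatter about a slowly increasing median.
rng = np.random.default_rng(42)
n_years, days_per_month = 10, 30
arithmetic, geometric = [], []
for month in range(n_years * 12):
    median = 0.2 * (1.0 + 0.03 * month / 12.0)  # assumed 3 % per year growth
    seasonal = 1.0 + 0.3 * np.sin(2.0 * np.pi * (month % 12) / 12.0)
    daily = rng.lognormal(mean=np.log(median * seasonal), sigma=0.6,
                          size=days_per_month)
    arithmetic.append(daily.mean())
    geometric.append(np.exp(np.log(daily).mean()))

trend_arithmetic = monthly_trend(arithmetic)
trend_geometric = monthly_trend(geometric)
```

Comparing `trend_arithmetic` with `trend_geometric` for the same underlying daily data shows how much of a reported trend magnitude depends on the choice of monthly summary statistic.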

Fraction of months where the difference between arithmetic- and geometric-mean AOD is larger than the GCOS goal uncertainty for an AOD climate data record; i.e.

At each of the three sites, the decadal AOD trends are qualitatively the same whether calculated using arithmetic- or geometric-mean AOD time series as a basis. However, as expected, trends using geometric-mean AOD are smaller in magnitude (i.e. increases and declines in AOD are less pronounced). The decrease in magnitude is often of the order of 10 %–30 %, which is typically within the

Note that during presentations and reviews of this work, a question arose as to whether, moving from point trends to larger regional-scale trends, the central limit theorem (CLT) would mean that arithmetic and geometric trends would converge. It would not: the CLT does not imply that expanding the region (adding more data) brings the AOD distribution closer to normal. Rather, the uncertainties on estimates of the summary statistic (whether arithmetic or geometric means) will behave approximately according to normal statistics, even if the underlying AOD distribution is not normal. This common misconception of the CLT is discussed in Sect. 3.1.3 of the review by
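The distinction can be checked numerically with synthetic data. In the sketch below, the log-normal parameters and sample sizes are arbitrary assumptions. Pooling more samples leaves the data themselves strongly skewed, whereas the distribution of sample means becomes close to symmetric.

```python
import numpy as np

def skewness(x):
    """Sample skewness (third standardised moment)."""
    x = np.asarray(x, dtype=float)
    centred = x - x.mean()
    return float((centred ** 3).mean() / (centred ** 2).mean() ** 1.5)

rng = np.random.default_rng(0)
# Synthetic log-normal "AOD": 2000 independent samples of 200 values each.
samples = rng.lognormal(mean=np.log(0.2), sigma=0.7, size=(2000, 200))

# Pooling all values does not make the data any less skewed.
skew_of_data = skewness(samples.ravel())
# The sampling distribution of the mean is what approaches normality.
skew_of_means = skewness(samples.mean(axis=1))
```

Here `skew_of_data` remains large and positive, while `skew_of_means` is much closer to zero. This is the behaviour the CLT describes for estimates of a summary statistic, not for the underlying quantity.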

Widely used spatiotemporal aggregates of aerosol data from surface observations, satellite retrievals, and model simulations typically consist of arithmetic means and standard deviations of finer-resolution data. These statistics are most meaningful for normally distributed data, while previous work has indicated that AOD is often distributed close to log-normally on large scales. While one can transform between normal and log-normal summary statistics

As timescales increase from days to months to seasons, data become increasingly more consistent with log-normal than normal distributions, and the differences between arithmetic- and geometric-mean AOD become larger; assuming normality systematically overstates both the typical level of AOD and its variability. In low-AOD regions such as the open ocean and mountains, often the AOD difference is small enough (

As noted earlier, using the arithmetic mean and standard deviation to summarise log-normal data is not “wrong” in a mathematical sense, as one can transform between the two. The danger is in not explicitly considering the underlying distribution when drawing an inference, as the result may be misleading or, at a minimum, less of a full picture than could otherwise be obtained.
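For reference, the standard relations between the two sets of summary statistics can be sketched as follows. They are exact only if the data are truly log-normal, and the function names are illustrative. With ln(AOD) having mean μ and standard deviation σ, the geometric mean is exp(μ), the geometric standard deviation is exp(σ), the arithmetic mean is exp(μ + σ²/2), and the arithmetic variance is (exp(σ²) − 1) exp(2μ + σ²).

```python
import math

def lognormal_from_arithmetic(mean, std):
    """Geometric mean and geometric standard deviation of a log-normal
    distribution with the given arithmetic mean and standard deviation."""
    sigma2 = math.log(1.0 + (std / mean) ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return math.exp(mu), math.exp(math.sqrt(sigma2))

def arithmetic_from_lognormal(geometric_mean, geometric_std):
    """Inverse transformation, under the same log-normal assumption."""
    mu = math.log(geometric_mean)
    sigma2 = math.log(geometric_std) ** 2
    mean = math.exp(mu + 0.5 * sigma2)
    std = mean * math.sqrt(math.exp(sigma2) - 1.0)
    return mean, std

# Illustrative values only: arithmetic mean 0.2, standard deviation 0.1.
gm, gsd = lognormal_from_arithmetic(0.2, 0.1)
mean_back, std_back = arithmetic_from_lognormal(gm, gsd)
```

The geometric mean is always below the arithmetic mean for non-zero spread, and the gap grows with the width of the distribution.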

The main recommendations from this study for future missions and reprocessing of current data sets and simulations are as follows.

The frequency distribution of a geophysical quantity should be analysed in order to assess how best to aggregate it. This analysis should be done at the spatial and temporal scale or scales of interest for the aggregation because distributions are scale-dependent. The Shapiro–Wilk technique is a powerful tool to assess discrepancies from a normal or log-normal distribution and should be further combined with desired performance thresholds to assess whether discrepancies are scientifically relevant for a given quantity.

Ideally, AOD aggregates such as satellite L3 products, but also those from ground-based observations (e.g. AERONET) and model simulations, should report geometric-mean or median AOD rather than (or in addition to) arithmetic-mean AOD. Where data sets permit zero or unphysical negative AOD values (incompatible with geometric calculations), these should be truncated to some reasonable lower bound which will not introduce meaningful artefacts in derived statistics (such as 0.0001 used here for MODIS and MISR). These summary statistics are relevant because multiple data records provide evidence that AOD distributions are generally closer to log-normal than normal, particularly on monthly and seasonal timescales, and the geometric mean is the more natural and meaningful summary statistic for such data. This information should be clearly communicated to potential data users. Geometric-mean AOD is systematically lower, often (on monthly or seasonal timescales) by more than the GCOS goal climate data record uncertainty of the larger of 0.03 or 10 %, so the choice of averaging method is scientifically important.

Due to the computational burden required on the data producer's or user's end (i.e. for satellites, obtaining the full L2 data record to reaggregate to daily and then monthly time steps), this is unlikely to happen in the short term. In the meantime, calculation of geometric-mean monthly aggregates from current standard (i.e. arithmetic mean) daily L3 aggregates could be a useful stopgap measure. This is because the volume of daily L3 data is smaller than L2, and daily spatial aggregates were found to be less sensitive than monthly ones to the choice of arithmetic vs. geometric averaging.

Comparisons and statistical assessments of AOD must account for the expected numerical distribution. Some common performance assessment techniques making use of sum-of-squares calculations, such as the root-mean-square error or coefficient of determination, should not be used in all cases, as they can be systematically skewed by large tails on non-normally distributed data

The analysis presented here refers to AOD, but the methodology is general.
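The aggregation recommendations above could be implemented along the following lines. This is a sketch rather than a reference implementation. The 0.0001 floor and the minimum of three valid values follow the text, whereas interpreting the GCOS goal as the larger of 0.03 or 10 % of the arithmetic mean is an assumption, and the function names and daily values are hypothetical.

```python
import numpy as np

def geometric_mean(values, floor=1e-4, min_count=3):
    """Geometric mean of positive data, flooring zero or negative values."""
    values = np.asarray(values, dtype=float)
    values = values[np.isfinite(values)]
    if values.size < min_count:
        return float("nan")
    return float(np.exp(np.log(np.maximum(values, floor)).mean()))

def exceeds_gcos_goal(arithmetic_mean, geometric_mean_value):
    """True if the means differ by more than the larger of 0.03 or 10 %.

    Assumption: the 10 % is taken relative to the arithmetic mean.
    """
    goal = max(0.03, 0.1 * arithmetic_mean)
    return abs(arithmetic_mean - geometric_mean_value) > goal

# Stopgap use: a monthly geometric mean from (made-up) daily arithmetic means.
daily_means = [0.05, 0.08, 0.10, 0.12, 0.90]
monthly_arithmetic = float(np.mean(daily_means))
monthly_geometric = geometric_mean(daily_means)
differs = exceeds_gcos_goal(monthly_arithmetic, monthly_geometric)
```

In this made-up month, a single high-AOD day pulls the arithmetic mean well above the geometric mean, and the difference exceeds the assumed goal uncertainty. The same functions apply to any strictly positive quantity once a suitable floor is chosen.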

It is important to bear in mind that these simple distribution forms are just approximations for the true underlying distribution of a geophysical quantity, and the relevant problem is in identifying one which is a sufficiently accurate representation for a given task. Normal and log-normal distributions are mathematically convenient and represent many data sets reasonably well, which is a motivating factor for considering these two both historically and in the present work. Sometimes multiple distribution forms are suitable: this analysis has shown that often in low-AOD conditions the choice of normal or log-normal representation may not matter for many purposes. Furthermore, while not analysed here, dependent on the choice of parameters, gamma distributions

If only a few distributions or points need to be summarised, then it is of course preferable to show the actual distributions and/or an informative summary which is agnostic to any particular distribution shape, such as a box-and-whisker plot. However, for many larger-scale analyses, aggregated outputs from observations and model simulations are likely to remain the format of choice for many data users due to their convenience and significantly lower computational and storage requirements than full-resolution (e.g. L2) data. While these unavoidably lead to a loss of information, it is important that users consciously consider the underlying distributions that the data sets are drawn from when utilising these summary statistics in research. The above recommendations will result in more statistically and scientifically meaningful data sets and decrease potential systematic biases which can lead to erroneous qualitative and quantitative interpretations of the state of the Earth system.

The geometric-mean AOD output presented in this work is available upon request to the authors. AERONET data are available from

AMS and KDK jointly conceptualised the analysis. AMS performed the analysis and drafted the paper. Both authors contributed to the editing of the text.

The authors declare that they have no conflict of interest.

This research was performed as part of the NASA Plankton, Aerosol, Cloud, ocean Ecosystem (PACE) mission development. The AERONET team and site principal investigators and managers are thanked for the creation and maintenance of the AERONET data record. Satellite retrieval and modelling teams, and hosting entities, are acknowledged for the development and archiving of these data sets. Tom F. Eck (USRA), Brent N. Holben (NASA GSFC), and Alexander Smirnov (SSAI) are thanked for useful discussions about early AOD and turbidity measurement networks and their strengths and limitations. Patricia Castellanos (NASA GSFC) is thanked for advice on the use of the G5NR simulation. Chris J. Merchant (University of Reading) is thanked for input on uncertainty characterisation in sea surface temperature data. This analysis was also presented as part of NASA GSFC's AeroCenter seminar series and to the NASA Ocean Ecology Laboratory. The insightful comments, questions, suggestions, and endorsement of attendees of those seminars, as well as Yilun Chen (USTC), Adam C. Povey (Oxford), and three anonymous reviewers, are appreciated.

This research has been supported by the NASA PACE project.

This paper was edited by Anja Schmidt and reviewed by three anonymous referees.