CLARA-A1: a cloud, albedo, and radiation dataset from 28 yr of global AVHRR data

A new satellite-derived climate dataset – denoted CLARA-A1 (“The CM SAF cLoud, Albedo and RAdiation dataset from AVHRR data”) – is described. The dataset covers the 28 yr period from 1982 until 2009 and consists of cloud, surface albedo, and radiation budget products derived from the AVHRR (Advanced Very High Resolution Radiometer) sensor carried by polar-orbiting operational meteorological satellites. Its content, anticipated accuracies, limitations, and potential applications are described. The dataset is produced by the EUMETSAT Climate Monitoring Satellite Application Facility (CM SAF) project. Its strengths are its long duration, its foundation upon a homogenized AVHRR radiance data record, and some unique features, e.g. the availability of 28 yr of summer surface albedo and cloudiness parameters over the polar regions. Quality characteristics are also well investigated, and particularly useful results can be found over the tropics, mid to high latitudes and over nearly all oceanic areas. Because this is the first CM SAF dataset of its kind, an intensive evaluation of its quality was performed, and major findings regarding its merits and shortcomings are reported. The CM SAF’s long-term commitment to perform two additional reprocessing events within the time frame 2013–2018 will allow limitations to be addressed, new features (e.g. uncertainty estimates) to be added, and the temporal coverage to be extended.


Introduction
Sustained climate monitoring activities are extremely important for assessing the presumed anthropogenically induced and accelerated climate change over the last hundred years (Trenberth et al., 2002; Karl et al., 1995). Since the fundamental forcing of the climate is the net input and output of radiation into the Earth-atmosphere system (Manabe and Wetherald, 1967), there is a need to closely monitor the evolution of radiation budget components and associated influencing factors, both at the Earth's surface and at the top of the atmosphere (TOA; note that all acronyms are listed in Appendix A). Among the influencing factors, changes in global cloudiness and surface albedo are two essential factors with large impacts on the radiation budget that require special attention (Dufresne and Bony, 2008; Bekryaev et al., 2010; Flanner et al., 2011).
The need for global monitoring inherently means that satellites must play an increasingly important role due to their ability to observe the Earth at high spatial and temporal resolution. However, it is vital that the satellite radiance datasets are carefully collected and prepared in order to achieve the highest standards of homogeneity and calibration (Ohring et al., 2005). In addition, one must ensure that retrieval methods applied to original radiance datasets are appropriate and used in a consistent manner. The EUMETSAT Climate Monitoring Satellite Application Facility (CM SAF) project was formed in 1998 to address these issues and, in particular, to ensure proper use of operational meteorological satellite data for climate monitoring purposes. A comprehensive description of CM SAF activities and plans is given by Schultz et al. (2009).
Satellite observations have a very short history in a climatological perspective; useful systematic measurements did not begin until around 1980 (Davis, 2007). Consequently, it is only possible to study satellite-based observational series spanning at most three decades. Despite this seemingly critical limitation, it is important that these observations are analysed and prepared for future continuation in order to be used for careful evaluation of short- and medium-term climate fluctuations, in particular those suggested by climate scenarios from climate model simulations. Equally important is the ability to assist in the evaluation of climate models' ability to describe historic climate fluctuations during the last decades through hindcast simulations.
One of the longest satellite observation records is that collected by the Advanced Very High Resolution Radiometer (AVHRR) operated onboard the polar-orbiting NOAA satellites (also carried by the Metop-A polar orbiter operated by EUMETSAT from 2006). Measurements began in 1978 and have continued to the present (Kogan et al., 2011). The last AVHRR sensor is scheduled for launch in 2017 (on Metop-C), but AVHRR-like datasets will be available from sensors on future satellite missions. This can be accomplished by subsetting the spectral channels from new imagers, noting that many of them inherit the original AVHRR channels. Examples of these new imagers are the Visible Infrared Imager Radiometer Suite (VIIRS, carried by the NOAA satellite successors Suomi NPP and the forthcoming JPSS satellites; Justice et al., 2011), the Sea and Land Surface Temperature Radiometer (SLSTR, to be carried by ESA ENVISAT successor satellites Sentinel 3A and 3B; Coppo et al., 2009) and the METimage sensor (to be carried by EUMETSAT Metop successors from the EUMETSAT Polar System - Second Generation (EPS-SG); see Schmülling, 2010). Thus, AVHRR-like observations will be available for several additional decades to come, putting the AVHRR sensor in the front row among satellite sensors best suited for climate monitoring purposes.
Climate data records (CDR) based on historic AVHRR data have been compiled by e.g. the International Satellite Cloud Climatology Project (ISCCP; Rossow and Schiffer, 1999), but only based on a subset of the full AVHRR spectral channel dataset. The first multi-parameter dataset making use of all AVHRR channels is the AVHRR Pathfinder Atmospheres - Extended (PATMOS-x) dataset. PATMOS-x has been generated in several versions and the latest algorithm and dataset versions are described by Heidinger et al. (2012), Foster and Heidinger (2012), and Walther and Heidinger (2012).
This paper presents a new comprehensive cloud and radiation dataset prepared by the CM SAF project based on global historic AVHRR data. It includes cloud products similar to those of the PATMOS-x dataset but produced with different algorithms. In addition, it also includes products for the surface albedo and the surface radiation budget. The dataset is given the acronym CLARA, formed from the expression "The CM SAF cLoud, Albedo and RAdiation dataset". We will use CLARA in the remainder of the text to refer to the entire dataset and as a prefix to individual dataset components. To clarify that the dataset is based on AVHRR data and that this is the first of several reprocessing efforts, we have added the suffix A1 in order to form the complete name of the dataset as CLARA-A1. The observation period amounts to 28 yr, starting in 1982 and ending in 2009.
The basic AVHRR radiance dataset is described in Sect. 2, followed by method descriptions, initial results, and comparisons with existing satellite datasets for the three groups of products in Sects. 3-5. Some results from validation studies are described in these sections as well as recommended application areas. Section 6 discusses strengths and limitations of the dataset with some focus on observation sampling effects. Finally, the concluding Sect. 7 summarizes the main features of the dataset and outlines future plans for the extension and improvement of the dataset.

The historic AVHRR dataset
Table 1 describes the AVHRR instrument, its various versions, and the satellites carrying them. Initially, the instrument only measured in four spectral bands (AVHRR/1), but from 1982 a fifth channel at 12 µm was added (AVHRR/2). Further, a sixth channel at 1.6 µm was added in 1998 (AVHRR/3); however, this channel was only accessible if switched with the previous third channel at 3.7 µm. The retrieval of cloud physical properties (in particular particle effective radius and liquid/ice water path; see more detailed descriptions later in Sect. 3.2) is sensitive to the shortwave near-infrared channel being used, which was for example investigated in Stengel et al. (2012). Table 2 summarizes when either of the channels 3A and 3B has been active on the AVHRR/3 instruments. The AVHRR instrument measures at a horizontal resolution close to 1 km at nadir, but only data at a reduced resolution of approximately 4 km are permanently archived and currently available with global coverage since the onset of measurements.
Figure 1 describes the coverage of observations used in CLARA-A1 for each individual satellite over the entire period. Cloud product retrieval methods have been dependent on access to two infrared split-window channels at 11 and 12 µm, meaning that only data from satellites carrying the AVHRR/2 or AVHRR/3 instruments have been used. As seen in Fig. 1, this leads to reduced time sampling (i.e. only one satellite available for daily observations) between 1982 and 1991. On the other hand, from 2001 and onwards, more than two satellites have been available for daily observations.
Table 1. Spectral channels of the Advanced Very High Resolution Radiometer (AVHRR). The three different versions of the instrument are described as well as the corresponding satellites. Notice that channel 3A was only used continuously on NOAA-17 and Metop-1. For the other satellites with AVHRR/3 it was used only for shorter periods.
Observations from polar-orbiting sun-synchronous satellites are made at the same local solar time at each latitude band. Normally, satellites are classified into observation nodes according to the local solar time when crossing the equator during daytime (illuminated conditions). For the NOAA/Metop-A satellite observations, a system with one morning observation node and one afternoon observation node has been utilized as the fundamental polar-orbiting observation system. Theoretically, this yields four equally distributed observations per day (if including the complementary observation times at night and in the evening when the satellite passes again 12 h later). However, equator-crossing times have varied slightly between satellites. Morning satellites have generally been confined to the local solar time interval 07:00-08:00 and afternoon satellites to the interval 13:30-14:30 (Foster and Heidinger, 2012). A more significant deviation was introduced for the morning satellites NOAA-17 and Metop-A, which were placed in a so-called mid-morning orbit with equator-crossing times close to 10:00. A specific problem with the observation nodes for the NOAA satellites has been the difficulty in keeping observation times stable for each individual satellite (e.g. as described by Ignatov et al., 2004).
An important aspect for any product-based climate dataset (formally denoted thematic climate data records, TCDRs) is that retrieved products have to be derived from accurately calibrated and homogenized radiances (formally denoted fundamental climate data records, FCDRs). This is necessary for several reasons, but most importantly for ensuring that analysed trends are not artificially caused by differences between individual satellites and changes in observation frequencies and times. We have used an AVHRR FCDR prepared by NOAA (Heidinger et al., 2010). This FCDR was originally prepared for the compilation of the PATMOS-x dataset and focuses in particular on homogenization and inter-calibration of the AVHRR visible reflectances. The calibration of infrared AVHRR channels is basically left untouched since the use of onboard blackbody calibration targets has been found to provide reasonably stable and reliable results (i.e. at least in the sense that only small trends or degradations have been detected as opposed to the situation for visible channels; e.g. see Trishchenko et al., 2002). However, future upgrades of the AVHRR FCDR need to address existing calibration uncertainties for the infrared channels as well (e.g. see Mittaz et al., 2009).

Cloud products
The presentation of the derived cloud products in CLARA-A1 has been subdivided into the following three subgroups:
1. Basic cloud products derived from the EUMETSAT Nowcasting Satellite Application Facility (NWC SAF) cloud-processing package.
2. Cloud products derived from the CM SAF cloud physical properties (CPP) package.
3. Multi-parameter cloud product representations.
The first group of cloud products (consisting of cloud amount or cloud fraction, denoted CFC, and cloud-top level, denoted CTO) represents the general three-dimensional occurrence of clouds as described by the horizontal and vertical extension of cloud layers. The second group (cloud phase, CPH; cloud optical thickness, COT; liquid and ice water path, LWP and IWP) represents cloud optical and microphysical properties. Finally, the third group (joint cloud-property histograms, JCH) represents condensed forms of cloud information involving both previous groups.
The CLARA-A1 cloud dataset is based on instantaneous AVHRR global area coverage (GAC) retrievals at the original swath level (4 km horizontal resolution), which have been used to derive the spatio-temporally averaged datasets. The products are available as daily and monthly composites for each satellite on a regular latitude/longitude grid with a spatial resolution of 0.25° × 0.25°. In addition, results for the CFC and the surface albedo (SAL, introduced in Sect. 4) products are available on two equal-area polar grids at 25 km resolution for the Arctic and Antarctic regions. These grids are centred at the poles and cover areas of 1000 km × 1000 km.
The monthly averages are also available in aggregated form (i.e. merging all satellites). Acknowledging the different observation capabilities during night and during day and also taking into account existing diurnal variations in cloudiness, a further separation of results into daytime and nighttime portions has also been done. Here, all observations made under twilight conditions (solar zenith angles between 80° and 95°) have been excluded in order to avoid being affected by specific cloud detection problems occurring in the twilight zone (as explained by Derrien and LeGleau, 2010).
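The day/night separation described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and treating both 80° and 95° as belonging to the excluded twilight interval is an assumption, since the text does not state how the boundaries are handled.

```python
def illumination_class(sza_deg):
    """Classify an observation by solar zenith angle (degrees).

    Assumption: the twilight interval is closed, i.e. 80 and 95 degrees
    are both treated as twilight and therefore excluded.
    """
    if sza_deg < 80.0:
        return "day"
    if sza_deg <= 95.0:
        return "twilight"  # excluded from day and night composites
    return "night"


def split_day_night(observations):
    """Split (sza_deg, value) pairs into day and night lists, dropping twilight."""
    day, night = [], []
    for sza, value in observations:
        label = illumination_class(sza)
        if label == "day":
            day.append(value)
        elif label == "night":
            night.append(value)
    return day, night
```

For example, `split_day_night([(30.0, 1), (85.0, 2), (120.0, 3)])` keeps the first value as daytime, drops the second, and keeps the third as nighttime.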
All cloud products to be described in the following subsections (and also the following surface albedo and surface radiation products) are described in detail in product user manuals (PUM), algorithm theoretical basis documents (ATBD) and validation reports (VAL) available via the CM SAF web user interface accessible from www.cmsaf.eu.These documents are important since the peer-reviewed publications referred to in the following may not include the latest algorithm changes.

Algorithm descriptions and product examples
The CFC product is derived directly from results of a cloud-screening or cloud-masking method. CFC is defined as the fraction of cloudy pixels per grid box compared to the total number of analysed pixels in the grid box; CFC is expressed in percent. This product is calculated using the NWC SAF Polar Platform System (PPS) cloud-processing software. The algorithm (Dybbroe et al., 2005) is based on a multi-spectral thresholding technique applied to every pixel of the satellite scene. Several threshold tests may be applied (and must be passed) before a pixel is assigned to be cloudy or cloud free. Thresholds are assigned depending on present viewing and illumination conditions and on the current atmospheric state (prescribed from meteorological analyses; here, the ERA-Interim dataset, see Dee et al., 2011). Ancillary information about the surface (e.g. land use categories and surface emissivities) is also taken into account. Thus, thresholds are dynamically defined, and therefore unique, for each individual pixel.
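The CFC definition above (cloudy pixels divided by analysed pixels per grid box, in percent) can be illustrated with a minimal Python sketch. The function name and the use of `None` to mark pixels that were not analysed are assumptions made for the example; they are not part of the PPS software.

```python
def cloud_fraction_percent(mask):
    """Return CFC in percent for one grid box.

    mask: iterable of per-pixel flags, True = cloudy, False = cloud free,
    None = pixel not analysed (excluded from the total).
    """
    analysed = [m for m in mask if m is not None]
    if not analysed:
        return None  # no analysed pixels, so CFC is undefined
    cloudy = sum(1 for m in analysed if m)
    return 100.0 * cloudy / len(analysed)


# Three cloudy pixels out of five analysed pixels; one pixel is skipped.
example = [True, True, False, None, True, False]
print(cloud_fraction_percent(example))
```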
The CTO product is also derived using the NWC SAF PPS cloud software. Two separate algorithms are used: one for opaque clouds, and one for fractional and semitransparent clouds. For opaque clouds, cloudy top-of-atmosphere radiances from various levels in the atmosphere are simulated using the RTTOV radiative transfer code (Saunders et al., 1999). The simulations are then compared and matched against measured radiances. Semitransparent clouds are identified as clouds having significant brightness temperature differences between AVHRR channels 3B, 4, and 5 (i.e. at 3.7 µm, 11 µm and 12 µm). In a subsequent step, the cloud-top height is derived in an iterative manner by analysing the distribution of 11 µm and 12 µm radiance differences. This difference is large for thin ice clouds over sufficiently warm surfaces below the semitransparent cloud layer as a consequence of the fact that ice clouds appear more opaque at 12 µm due to differences in refractive indices for water and ice.
Observe that the CTO product exists in three different varieties, all simply different representations of the same product:
1. Cloud-top temperature (CTT), expressed in Kelvin.
2. Cloud-top pressure (CTP), expressed in hPa.
3. Cloud-top height (CTH), expressed in metres.
Examples of the monthly CFC and CTO products are shown in Fig. 2 for July 2007. A corresponding yearly mean of zonally averaged results for afternoon orbits (NOAA-18) is shown in Fig. 3. Results are here compared with results from five other satellite-based datasets (PATMOS-x, ISCCP, MODIS Science Team, MODIS-CERES team, and CALIPSO Science Team). The reference datasets were extracted from the GEWEX global cloud assessment database (Stubenrauch et al., 2013).
From Fig. 3 we observe reasonably good agreement with the other datasets with respect to the overall global distribution of cloudiness; this also concerns the geographical distribution of cloud features according to Fig. 2 (although not shown for the other reference datasets). However, there are some features of the CLARA-A1 dataset that deviate from the other datasets (best visible in the difference plot in the bottom panel of Fig. 3). Cloud amounts appear to be generally lower outside the tropical regions (e.g. from midlatitudes to the poles). Additionally, the CLARA-A1 CFC is substantially lower over the Southern Ocean between 50° S and 70° S. We suspect that issues related to the extent of sea ice in the Southern Ocean during the polar winter are the primary contributor to this apparent low bias; efforts are underway to improve the seasonal CFC here for the CLARA-A2 edition.
Another deviation from the other datasets can be seen for the latitude bands between 20° and 40° in both hemispheres, where the CLARA-A1 CFC is slightly larger than in the other passive satellite observations (although this is only valid for afternoon passages). This has been traced to inappropriately relaxed cloud thresholds in the transition zone between pure desert areas and tropical vegetated areas. A third remarkable feature is the large deviation of CALIPSO-CALIOP cloud amounts from all other datasets near the equator. This is due to the much higher sensitivity of the CALIOP sensor in detecting thin and subvisible cirrus. Thus, most datasets based on passive imagery seriously underestimate the amount of thin clouds in this region.
Concerning the CTO product in Fig. 2, the northward movement and intensification of the ITCZ over the Asian branch in July is well depicted here.Results agree quite well with other reference datasets (not shown here) despite some differences in the basic cloud amounts.More specific features and differences to other datasets are better visible in multi-parameter visualisations (see Sect. 3.3).

Quality aspects and recommended applications
Extensive validation efforts comparing with surface observations, A-Train observations (mainly from the CALIPSO-CALIOP sensor) and the datasets displayed in Fig. 3 suggest that CFC results are accurate to within 10 % (absolute).Corresponding studies of CTO results indicate accuracies of 60 hPa for CTP and within 500 m for CTH (although, the latter is only achieved if filtering out topmost CALIPSO-CALIOP cloud layers with COT lower than 0.3).
A more detailed examination of the performance of the basic CLARA-A1 cloud products, based on comparisons with high-quality CALIPSO-CALIOP observations, is given by Karlsson and Johansson (2013).
We repeat that an aspect that needs improvement in coming editions is daytime cloud screening over subtropical land regions, where the current dataset is not optimal. This has led to some overestimated cloud amounts and a too high frequency of optically thin water clouds. Furthermore, the daytime distribution of thin water clouds and thin ice clouds in this region is then slightly biased, meaning that e.g. studies of subtropical and tropical cirrus cloudiness are compromised.
Nevertheless, for other regions (e.g. over mid and high latitudes, over most oceanic regions and over polar regions during the polar summer) results should be of sufficient quality for allowing detailed studies.For example, the long temporal record of the CLARA-A1 dataset would be a valuable asset for studies focussing on the sea ice-cloud interactions during the polar summer when surface albedo, radiation and cloud properties are available.This is also supported by the good validation results obtained from the surface albedo retrievals over snow and ice (Riihelä et al., 2013).

Algorithm descriptions and product examples
Four CLARA-A1 optical and microphysical cloud products are derived using the CPP algorithm (Roebeling et al., 2006). These are CPH, COT, LWP, and IWP. The central principle of the method to retrieve these cloud properties is that the reflectance of clouds at a non-absorbing wavelength in the visible region (0.6 or 0.8 µm) is largely dependent on the optical thickness with little dependence on particle effective radius (r_e), whereas the reflectance of clouds at an absorbing wavelength in the near-infrared region (1.6 or 3.7 µm) is strongly dependent on effective radius (Nakajima and King, 1990). In the CPP algorithm, the Doubling-Adding KNMI (DAK; De Haan et al., 1987; Stammes, 2001) radiative transfer model (RTM) is used to simulate 0.6 and 1.6/3.7 µm TOA reflectances as a function of viewing geometry, COT, effective radius, and cloud phase. These simulated reflectances are stored in a look-up table (LUT).
COT and r_e are retrieved for cloudy pixels in an iterative manner by simultaneously comparing satellite-observed reflectances to the LUT of RTM-simulated reflectances. Simulations are made for both ice and water clouds, enabling the retrieval of CPH. In those cases when simulated radiances for ice and water clouds overlap (suggesting two different solutions for COT and r_e), the solution is found by also utilizing cloud-top temperature and the assumption that water clouds should be warmer than 265 K. In a subsequent step, the liquid water path (LWP) of water clouds can be computed using the following relation (Stephens, 1978):
$$\mathrm{LWP} = \frac{2}{3}\,\tau\,r_e\,\rho_l,$$
where ρ_l is the density of liquid water and τ is the COT. For water clouds, effective radii between 1 and 24 µm are retrieved. The IWP is approximated using the same relation as for LWP but with COT and r_e retrievals based on RTM simulations for imperfect hexagonal ice crystals. Homogeneous distributions of C0, C1, C2, and C3 type ice crystals from the COP library (Hess et al., 1998) are assumed, with effective radii of 6, 12, 26, and 51 µm, respectively. A final, but critical, remark is that the CPP products depend on the availability of reflectances from visible channels; consequently CPP products are exclusively daytime products.
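As a worked illustration of the water path calculation, the sketch below assumes the commonly cited form of the Stephens (1978) relation, LWP = (2/3) τ r_e ρ_l, with r_e given in µm and ρ_l taken as 1000 kg m⁻³. The function name, the unit handling, and the choice of g m⁻² for the output are assumptions of the example, not a description of the CPP code.

```python
RHO_LIQUID = 1000.0  # density of liquid water in kg m-3 (assumed constant)


def liquid_water_path(cot, r_eff_um, rho=RHO_LIQUID):
    """Liquid water path in g m-2 from optical thickness and effective radius.

    cot:      cloud optical thickness (dimensionless)
    r_eff_um: particle effective radius in micrometres
    """
    r_eff_m = r_eff_um * 1.0e-6                     # micrometres to metres
    lwp_kg_m2 = (2.0 / 3.0) * cot * r_eff_m * rho   # kg m-2
    return lwp_kg_m2 * 1000.0                       # kg m-2 to g m-2


# A cloud with COT = 10 and r_e = 10 µm gives roughly 67 g m-2.
print(round(liquid_water_path(10.0, 10.0), 1))
```

Because the relation is linear in both τ and r_e, a relative error in either retrieved quantity carries over directly to the water path.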
Figure 4 illustrates the CPH, LWP, and IWP products for one selected month (July 2007). Notice the consequence of requiring solar zenith angles (SZA) below 72°: the majority of the globe south of 50° S experiences too large SZAs or is in the midst of the polar night. The CPP products give a good description of large-scale cloud climatologies, such as the liquid-dominated stratocumulus regions off the west coast of continents and the deep convective nature of mainly ice-topped clouds along the ITCZ. The midlatitude cyclone tracks are also present in both hemispheres. Limitations are most notably seen in the Arctic, where inadequate characterization of sea ice has led to the retrieval of too large cloud water paths.

Quality aspects and recommended applications
A comparison of LWP in the tropics with two other satellite-based datasets is shown in Fig. 5. One of them (ISCCP) has a record length similar to that of CLARA-A1, but has a large contribution from geostationary satellites; the other (MODIS) spans less than a decade. The three datasets agree reasonably well in the absolute amount of tropical cloud liquid water, but have different levels of variability. As expected, MODIS is most stable because it involves a single, well-calibrated instrument. CLARA-A1 and ISCCP show considerable trends during various parts of the time series. Although part of this variability may be real, the rest is likely related to artifacts such as jumps between satellites, orbital drift, and availability of different channels (AVHRR ch3a vs. ch3b). Despite these issues, a promising finding is that the three datasets, and in particular CLARA-A1 and MODIS, agree relatively well on the average seasonal cycle of tropical LWP (see lower panel of Fig. 5). Comparisons were also made (not shown) with an independent, microwave-based (SSM/I and AMSR-E) dataset prepared by O'Dell et al. (2008). These comparisons focused on the main stratocumulus regions and showed good agreement in the seasonal cycle of LWP with biases on the order of 20 %. Results from the evaluation of the ice water path product (not displayed here) showed a considerably larger spread between different datasets. It is clear that current estimations of this parameter are still very uncertain (e.g. Eliasson et al., 2011).

Multi-parameter cloud product representations
The joint cloud property histogram (JCH) product is a combined histogram of CTP and COT covering the solution space of both parameters. This two-dimensional histogram gives the absolute numbers of occurrences for specific COT and CTP combinations defined by specific bins, separated into liquid and ice clouds. Notice that the product is defined on a slightly coarser grid (1° × 1° resolution) in order to achieve higher statistical significance and to maintain manageable file sizes. As the product is currently archived, analysis is possible in several modes, from the grid-point resolution (local distributions), to smaller, user-specified geographical domains (regional distributions) or for all grid points describing average distributions for the entire globe.
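The structure of such a joint histogram can be sketched as follows. The bin edges below are hypothetical placeholders chosen for illustration only (the actual JCH bins are defined in the product documentation), and the function names are likewise assumptions.

```python
from bisect import bisect_right

# Hypothetical bin edges, for illustration only.
COT_EDGES = [0.0, 1.0, 5.0, 20.0, 100.0]
CTP_EDGES = [0.0, 440.0, 680.0, 1100.0]  # hPa


def bin_index(value, edges):
    """Index of the bin [edges[i], edges[i+1]) holding value, or None if outside."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def joint_histogram(pixels):
    """Count occurrences per (CTP bin, COT bin), separately for each phase.

    pixels: iterable of (cot, ctp_hpa, phase) with phase "liquid" or "ice".
    Returns a dict mapping phase to a 2-D list indexed [ctp_bin][cot_bin].
    """
    n_cot, n_ctp = len(COT_EDGES) - 1, len(CTP_EDGES) - 1
    hist = {p: [[0] * n_cot for _ in range(n_ctp)] for p in ("liquid", "ice")}
    for cot, ctp, phase in pixels:
        i, j = bin_index(ctp, CTP_EDGES), bin_index(cot, COT_EDGES)
        if i is None or j is None or phase not in hist:
            continue  # skip out-of-range values and unknown phases
        hist[phase][i][j] += 1
    return hist
```

Storing absolute counts rather than frequencies means that histograms from neighbouring grid points can simply be summed to obtain regional or global distributions.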
Figure 6 shows the JCH product integrated globally for March 2007 compared to corresponding histograms from the MODIS Science Team and ISCCP. It is obvious that all three datasets show very different CTP-COT distributions. Some similarities are found with the MODIS distribution with respect to the vertical distribution of clouds, but it is clear that the MODIS range of COT values is much larger than those of both CM SAF and ISCCP. Additionally, on the global scale, CM SAF had a tendency to give lower frequencies of optically thin (τ < 5) ice clouds compared to MODIS. This can be attributed to a higher efficiency of MODIS-based methods in detecting these clouds because of the availability of sounding channels (not available for CM SAF or ISCCP). In general, all three datasets have different bin sizes in COT and CTP, making direct interpretations difficult.

Dataset description and algorithm overview
The AVHRR radiance data record and the cloud mask product (i.e. the basic cloud-screening product used for generation of CFC) have been utilized to generate a 28 yr record of terrestrial surface albedo.This dataset, henceforth called CLARA-A1 SAL, describes the global black-sky surface albedo over the waveband of 0.25-2.5 µm.The dataset is generated at the same spatial resolution and projection(s) as the CLARA-A1 cloud products.The dataset is available as 5-day (pentad) or monthly means.Examples of the CLARA-A1 SAL product for January and July 2007 are given in Fig. 7.
The dataset and its validation are described in detail by Riihelä et al. (2013); therefore we will only provide a brief overview here.The retrieval algorithm is composed of sequential steps of (1) topography corrections in geolocation and radiometry over mountainous terrain, (2) an atmospheric correction for scattering and absorption effects of aerosols and other constituents, (3) a correction for reflectance anisotropy of vegetated surfaces and spectral albedo calculation, and (4) a narrow-to-broadband conversion to derive the albedo over the full waveband.Snow and ice are special cases: as the reflectance anisotropy of snow is large and varies according to snow type (Peltoniemi et al., 2005), we derive only broadband bidirectional reflectances from the AVHRR overpasses and derive the surface albedo by averaging the bidirectional reflectances spanning the viewing hemisphere.

Quality aspects and recommended applications
The dataset has been validated against in situ albedo observations from the Baseline Surface Radiation Network (Ohmura et al., 1998), the Greenland Climate Network (Steffen et al., 1996), and the Surface Heat Balance of the Arctic Ocean (SHEBA) Project and Tara floating ice camps (Perovich et al., 2002; Gascard et al., 2008). Apart from the ice camps, data coverage of 10 yr or more was generally required at each validation site. The validation results showed that CLARA-A1 SAL can retrieve the surface albedo with a relative accuracy of 10-20 % over vegetated sites and 5-15 % over snow and ice. At some snow-free sites the albedo retrieval accuracy was considerably poorer. However, at these sites a significant correlation was found between poor retrieval accuracy and the heterogeneity of high-resolution near-infrared surface reflectances at CLARA-A1 SAL pixel scales. This indicates that spatial representativeness issues in the in situ albedo measurements have to be considered when assessing the product quality (Riihelä et al., 2013).

The time series was also compared with existing surface albedo products from MODIS (Schaaf et al., 2002) and CERES FSW (Rutan et al., 2009). Both comparisons showed similar results: on a global scale, CLARA-A1 SAL mean albedo is 10-20 % higher than either CERES or MODIS mean albedo in relative terms. The MODIS comparison is presented in more detail by Riihelä et al. (2013). Here we show an overview of the CERES comparison results in Fig. 8. The figure shows the monthly mean albedo from both CLARA-A1 SAL and CERES FSW averaged over the commonly retrievable land/snow area after CLARA-A1 SAL has been coarsened to the 1° × 1° spatial resolution of CERES FSW. The dashed line shows the relative difference between the products. As we can see, the difference is fairly constant in time. An analysis of the differences in latitudinal bands (not shown) reveals that the products agree best over the boreal zone north of 50° N, whereas the largest disagreements are over the tropical latitudes. Regional exceptions to this tendency do of course occur.
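The coarsening and relative-difference steps of such a comparison can be sketched as below. This is a simplified illustration under stated assumptions: plain block averaging of 4 × 4 cells (0.25° to 1°) without area weighting, `None` as the marker for cells without a retrieval, and the second product used as the reference in the relative difference. None of these choices is confirmed by the text as the procedure actually used.

```python
def coarsen(grid, factor=4):
    """Block-average a 2-D list of albedo values; None cells are skipped."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, factor):
        out_row = []
        for c0 in range(0, cols, factor):
            vals = [grid[r][c]
                    for r in range(r0, min(r0 + factor, rows))
                    for c in range(c0, min(c0 + factor, cols))
                    if grid[r][c] is not None]
            # A coarse cell with no valid fine cells stays missing.
            out_row.append(sum(vals) / len(vals) if vals else None)
        out.append(out_row)
    return out


def relative_difference_percent(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference
```

Within a single 1° block the variation of cell area with latitude is small, which is why unweighted averaging is used in this sketch; a global mean would need cosine-of-latitude weighting.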
The stability of the CLARA-A1 SAL time series was evaluated using the central part of the Greenland Ice Sheet as a site whose albedo was expected to remain fairly constant over a long period (Riihelä et al., 2013). The results showed that the maximum deviation of CLARA-A1 SAL monthly mean albedo over this site from its 28 yr mean was 6.8 %, including some natural variability associated with e.g. varying solar zenith angles. Also, the 28 yr mean albedo for this site was estimated to be 0.844, which agrees well with literature values for the albedo of dry fresh snow (0.85; Konzelmann and Ohmura, 1995). A similar stability evaluation was also carried out over Dome C in Antarctica, with similar results.
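A stability metric of this kind can be computed as in the sketch below. It assumes that the quoted deviation is relative to the long-term mean, which the text does not state explicitly, and the function name is hypothetical.

```python
def max_relative_deviation_percent(monthly_means):
    """Largest absolute deviation from the series mean, in percent of that mean.

    Assumption: the deviation is expressed relative to the long-term mean.
    """
    mean = sum(monthly_means) / len(monthly_means)
    return 100.0 * max(abs(x - mean) for x in monthly_means) / mean
```

For an illustrative series of 0.80, 0.90 and 0.85, the mean is 0.85 and the largest deviation is 0.05, i.e. just under 6 % of the mean.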
There are also some caveats which need to be kept in mind when using the CLARA-A1 SAL dataset. The aerosol optical depth (AOD) input in the atmospheric correction was kept universally constant at 0.1 in this first edition. We acknowledge that this scenario is accurate only over the polar regions where the atmosphere is dry and thin, and that considerable over- and underestimations will occur over regions with high and variable aerosol loading. Efforts are currently underway to identify a dataset or algorithm that will allow for an accurate correction for aerosol effects in the next release of the CLARA-A SAL. However, we wish to point out that over vegetated terrain, the near-infrared reflectance typically dominates the resulting broadband albedo. As aerosol scattering and absorption effects are smaller in the near-infrared region, our simulations indicate that a true AOD of 0.25 will only cause an additional relative error of 3-5 % in the retrieved broadband albedo given typical grass reflectances and viewing/illumination geometries (Riihelä et al., 2013). Based on AOD retrievals from e.g. MISR (Martonchik et al., 1998), the annual mean AOD is less than this for most non-tropical regions of the Earth. This of course is not the case over deserts, where retrieval errors can be considerably larger.
Other issues such as sporadic cloud masking errors (especially during low-sun conditions) or inaccuracies in the land cover dataset used to resolve the CLARA-A1 SAL algorithms may cause retrieval errors as well.The users are recommended to utilize the existing support data (number of observations and standard deviation per pixel) to remove suspect retrievals from their analysis.
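The screening recommended above can be sketched as follows. This is a minimal illustration only: the function name, the thresholds `min_obs` and `max_std`, and the sample values are assumptions made for the example and are not CM SAF recommendations.

```python
import math

def screen_albedo(albedo, n_obs, std_dev, min_obs=5, max_std=0.1):
    """Replace suspect retrievals by NaN using the per-pixel support data.

    min_obs and max_std are illustrative thresholds (assumptions); suitable
    values depend on the application and region.
    """
    screened = []
    for a, n, s in zip(albedo, n_obs, std_dev):
        # Too few observations or too much scatter: treat as suspect.
        if n < min_obs or s > max_std:
            screened.append(math.nan)
        else:
            screened.append(a)
    return screened

# Hypothetical monthly-mean pixels: albedo, number of observations, std. dev.
albedo = [0.84, 0.31, 0.62, 0.15]
n_obs = [40, 2, 25, 30]
std_dev = [0.02, 0.05, 0.25, 0.03]
screened = screen_albedo(albedo, n_obs, std_dev)
```

In this example the second pixel is rejected for having too few observations and the third for its large standard deviation.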
Our quality assessment of the CLARA-A1 SAL surface albedo dataset has shown that albedo retrievals over snow and ice, particularly over the Arctic, are the strongest point of the dataset.As such, we recommend the dataset particularly for climate model validation and climate monitoring studies involving the polar regions.

Algorithm overview – solar surface irradiance
The mesoscale atmospheric global irradiance code (MAGIC) algorithm is used for the retrieval of the solar surface irradiance (SIS) (Mueller et al., 2004, 2009). A brief description of the applied algorithm is given below following Wang et al. (2011). The effect of the atmospheric variables ozone, aerosol, water vapour and clouds, and of the surface albedo is considered by radiative transfer calculations. Atmospheric transmittance is pre-calculated and saved in a LUT for a variety of combinations of atmospheric variables and surface albedos. The solar surface irradiance is then derived from the pre-calculated LUTs for the atmospheric state given at the specific location and time for each pixel. However, instead of a traditional LUT approach, which requires a huge number of pre-calculations, a more sophisticated approach, the hybrid eigenvector approach, is applied. This approach is motivated by linear algebra and exploits the eigenvector behaviour of the system (see Mueller et al. (2009) and, for a more coherent description, Mueller et al. (2012) for further details about the eigenvector hybrid approach). The effect of the solar zenith angle on the transmission, and hence on the surface solar irradiance, is considered by the use of the modified Lambert-Beer (MLB) function (Mueller et al., 2004).
The algorithm considers the effect of aerosols with different aerosol optical thickness, single-scattering albedo, and asymmetry parameters. The respective information is taken from an aerosol climatology which is based on the Aerocom model median (Kinne et al., 2006) merged with Aeronet in situ data (Holben et al., 1998). Water vapour is considered by its density and a standard profile. The water vapour density is taken from the ERA-Interim project (Dee et al., 2011). In addition to the atmospheric variables, the effect of the surface albedo is also considered. For this purpose, surface albedo is calculated based on the spatial distribution of 20 surface types following the recommendation of the Surface and Atmospheric Radiation Budget (SARB) working group, which is part of the Clouds and the Earth's Radiant Energy System (CERES) mission.
Variations in cloud properties induce variations in the top-of-atmosphere albedo. The top-of-atmosphere albedo is therefore used as input to account for the effects of clouds, together with the information about cloudy and cloud-free conditions provided previously by the cloud-screening methods. Hence, the algorithm requires the satellite-derived TOA broadband albedo in the shortwave spectral region as an input parameter. However, this quantity is not measured directly by the AVHRR instrument and therefore has to be calculated. As a first step, the broadband reflectance is calculated from the reflectances measured in the two visible channels of the AVHRR instruments (see Table 1) following Hucek and Jacobowitz (1995). The derived broadband reflectance for each pixel is then converted to broadband fluxes using the bidirectional reflectance distribution function (BRDF, also termed angular dependence model (ADM)) derived for ERBE (Suttles et al., 1988). Figure 9 shows a diagram of the algorithm processing steps and the input used.
The output of the MAGIC algorithm is the all-sky surface solar irradiance in the 0.2–4.0 µm wavelength region. The extraterrestrial total solar irradiance is 1365 W m−2 and is adjusted according to the Earth-Sun distance. Figure 10 shows a long-term mean of the CLARA-A1 solar irradiance data for September as an example. Information on the data quality is provided for the surface sites from the Baseline Surface Radiation Network (BSRN, Ohmura et al., 1989). All main features of the global distribution of surface irradiance are visible in the CLARA SIS dataset, including the stratocumulus regions in the eastern Atlantic and eastern Pacific, as shown by their reduced surface irradiance. During the evaluation the data quality was found to be strongly degraded over bright surfaces (i.e. snow-covered areas, desert) and the corresponding data were set to missing (white areas in Fig. 10).
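To illustrate the role of the MLB function, the following sketch uses the form in which it is commonly written, I = I0 exp(−τ0 / cos^a θ) cos θ, where τ0 is the optical depth at zero zenith angle and a is a fitted correction exponent. The function name and all parameter values are assumptions chosen for illustration; the sketch does not reproduce how MAGIC determines τ0 and a.

```python
import math

def mlb_irradiance(i0, tau0, a, sza_deg):
    """Surface irradiance from a modified Lambert-Beer relation.

    i0      : extraterrestrial irradiance (W m-2)
    tau0    : optical depth at zero solar zenith angle (hypothetical value)
    a       : fitted correction exponent (hypothetical value)
    sza_deg : solar zenith angle in degrees
    """
    if sza_deg >= 90.0:
        return 0.0  # sun at or below the horizon
    mu = math.cos(math.radians(sza_deg))
    return i0 * math.exp(-tau0 / mu ** a) * mu

# Illustrative call with assumed parameters
surface = mlb_irradiance(1365.0, 0.3, 1.2, 60.0)
```

With a = 1 the expression reduces to the classical Lambert-Beer law; the exponent is what allows a single fit to follow the zenith-angle dependence of the transmission.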
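The adjustment of the total solar irradiance for the Earth-Sun distance can be sketched as below. The cosine approximation used for the distance factor is a common textbook formula and is an assumption here; the correction actually applied in MAGIC is not specified in this paper.

```python
import math

TSI = 1365.0  # extraterrestrial total solar irradiance (W m-2), as stated in the text

def toa_normal_irradiance(day_of_year):
    """Total solar irradiance adjusted for the Earth-Sun distance.

    Uses the common approximation 1 + 0.033*cos(2*pi*doy/365) for the
    inverse squared relative distance (an assumption for illustration).
    """
    distance_factor = 1.0 + 0.033 * math.cos(2.0 * math.pi * day_of_year / 365.0)
    return TSI * distance_factor

january = toa_normal_irradiance(1)   # near perihelion, above 1365 W m-2
july = toa_normal_irradiance(183)    # near aphelion, below 1365 W m-2
```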

Algorithm overview – terrestrial part
The CM SAF algorithm to derive the surface downwelling longwave (SDL) radiation from the AVHRR GAC dataset is based on the monthly mean surface downwelling longwave radiation data from the ERA-Interim dataset. The CLARA-A1 cloud fraction (CFC) dataset and high-resolution topographic information are used to generate the SDL dataset on the global 0.25° grid.
The surface downwelling longwave radiation from the AVHRR GAC dataset is calculated from the monthly mean of the clear-sky surface downwelling longwave radiation derived from ERA-Interim and the cloud correction factor (CCF) multiplied by the CLARA-A1 CFC dataset at 0.25° × 0.25° resolution:
$$\mathrm{SDL}_{\mathrm{CLARA}} = \mathrm{SDL}_{\mathrm{clr}} + \mathrm{CCF} \times \mathrm{CFC}_{\mathrm{CLARA}},$$
where $\mathrm{SDL}_{\mathrm{clr}}$ denotes the monthly mean clear-sky surface downwelling longwave radiation from ERA-Interim and $\mathrm{CFC}_{\mathrm{CLARA}}$ is the CLARA-A1 cloud fraction.
The CCF is defined as the ratio of the difference between the model all-sky and clear-sky surface longwave downwelling radiation to the model cloud fraction:
$$\mathrm{CCF} = \frac{\mathrm{SDL}_{\mathrm{all}} - \mathrm{SDL}_{\mathrm{clr}}}{\mathrm{CFC}_{\mathrm{ERA}}},$$
where $\mathrm{SDL}_{\mathrm{all}}$ is the model all-sky surface downwelling longwave radiation and $\mathrm{CFC}_{\mathrm{ERA}}$ represents the reanalysis grid-box horizontal cloud fraction. The CCF describes the sensitivity of the surface downwelling longwave radiation to changes in cloud fraction. It is derived from linear regression for grid boxes that exhibit a CFC variability of more than 10 % and for grid boxes with a correlation coefficient between SDL and CFC above 0.6. For the remaining grid boxes, CCF is extrapolated from neighbouring grid boxes. Figure 11 [...] wave radiation decreases on average by 2.8 W m−2 per 100 m in elevation. To account for this effect when generating the CLARA-A1 SDL dataset, the Global Land One-km Base Elevation Project (GLOBE) database has been used to calculate the topography on the 0.25° global grid. The GLOBE dataset is a global 1 km gridded, quality-controlled digital elevation model (DEM) accessible from the National Geophysical Data Center at NOAA (http://www.ngdc.noaa.gov/mgg/topo/globe.html). Using the topography information from the ERA-Interim dataset, the surface downwelling longwave radiation ($\mathrm{SDL}_{\mathrm{CLARA}}$) has been corrected according to Wild et al. (1995) to account for the differences in the surface elevation between the two grids. The conservation of the surface downwelling longwave radiation on the original ERA-Interim grid is taken into account during the topographic correction. The multi-year averaged surface downwelling longwave radiation for July is shown in Fig. 12. The high quality of this dataset is indicated by the validation using the BSRN surface measurements. The large-scale features of this dataset correspond to the ERA-Interim data. The small-scale features in CLARA SDL, e.g. in topographically varying regions, lead to a significant improvement for regional climate monitoring and analysis.
Based on the surface radiation products, cloud radiative effect products are derived and provided in addition. Finally, the outgoing thermal radiation and the surface radiation budget are available as well. All CM SAF GAC surface radiation datasets are globally available as monthly means from 1982 to 2009 on an equal-angle grid of 0.25°. An overview of product characteristics (i.e. associated accuracies and uncertainties) is given in Table 3.
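The chain described above (cloud correction factor, cloud-corrected SDL, elevation adjustment) can be sketched for a single grid box as follows. This is a minimal sketch under stated assumptions: cloud fraction is expressed as a fraction between 0 and 1, the CCF is taken directly as the ratio in its definition rather than from the regression over time used in the dataset, and the elevation step applies the 2.8 W m−2 per 100 m gradient linearly without the flux conservation on the ERA-Interim grid. Function names and sample values are hypothetical.

```python
def cloud_correction_factor(sdl_all, sdl_clr, cfc_era):
    """CCF as the ratio (all-sky minus clear-sky SDL) / model cloud fraction."""
    return (sdl_all - sdl_clr) / cfc_era

def sdl_cloud_corrected(sdl_clr, ccf, cfc_clara):
    """SDL = clear-sky SDL + CCF * satellite-derived cloud fraction."""
    return sdl_clr + ccf * cfc_clara

def elevation_adjusted(sdl, z_fine_m, z_era_m, gradient=2.8):
    """Reduce SDL by `gradient` W m-2 per 100 m of additional elevation."""
    return sdl - gradient * (z_fine_m - z_era_m) / 100.0

# Hypothetical monthly means for one grid box (W m-2, cloud fraction 0..1, m)
ccf = cloud_correction_factor(sdl_all=330.0, sdl_clr=300.0, cfc_era=0.6)
sdl = sdl_cloud_corrected(sdl_clr=300.0, ccf=ccf, cfc_clara=0.7)
sdl_topo = elevation_adjusted(sdl, z_fine_m=1500.0, z_era_m=1000.0)
```

In this example the satellite-derived cloud fraction exceeds the model value, so the corrected SDL ends up above the model all-sky value before the elevation adjustment lowers it again.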

Quality aspects and recommended applications
The datasets of the surface shortwave radiation quantities (SIS, SNS, SAL, CFS) exhibit high quality and are mainly derived from satellite observations. The quality of the upwelling and downwelling longwave surface fluxes is also remarkably good, as shown by a low bias and low absolute differences in comparison with BSRN stations. However, these datasets use substantial information from reanalyses, which should be considered if the data are used for the evaluation of reanalyses and other model-derived datasets. Nevertheless, the high quality makes this variable very valuable for the analysis of greenhouse warming, which directly affects SDL.
The surface solar irradiance data is expected to be useful for studies dealing with historical global dimming and brightening effects as well as with analyses of trends of extreme events (drought, heat waves). In addition, it is expected to be used for solar energy applications as a complement to data retrieved from geostationary satellites. Please note, however, that the mentioned applications are hampered by the data gaps over bright surfaces. There, the accuracy of the data was found to be systematically lower and the data have consequently been masked. The majority of the Earth's surface is not affected by this limitation.
The temporal stability and homogeneity of the surface radiation datasets have not yet been fully evaluated. While all possible measures have been taken in the generation of these datasets, artificial shifts or trends in the final datasets cannot be excluded (to be further discussed in Sect. 6). Application of these datasets for the analysis of temporal changes/trends is recommended only after a careful evaluation of the temporal behaviour of these datasets.

Discussion
The strength of the CLARA-A1 dataset is the long observation record, since many other available datasets (e.g. from MODIS) are only available for the last decade. In addition, access to multiple shortwave channels allows a better retrieval of cloud optical properties, and access to split-window channels in the thermal infrared region allows a better delineation of cirrus cloudiness compared to datasets based exclusively on just one visible and one infrared channel (e.g. ISCCP). Regional results have also been evaluated extensively (e.g. see Karlsson and Dybbroe, 2010, and Riihelä et al., 2013) and are stable after many years of development and use in the CM SAF project.
The advantages of the CLARA-A1 product time series stem mainly from its origin in a homogenized long-term AVHRR radiance dataset. This is particularly important for products relying exclusively on AVHRR visible channels compared to the ones that are based on the full multispectral channel dataset. Those products are the CPPs, the SAL product and the solar part of the surface radiation budget products. Closely related to this issue, the SAL product covers a sufficiently long timespan to be of use as a reference against, for example, surface albedo parameterizations in climate models.
We claim that over oceanic and sparsely populated areas, satellite-based data is still the main observational data source. Here, the CLARA-A1 dataset definitely fills a gap of observational data for climate monitoring and analysis purposes.
Still, for some areas on the globe, results are less reliable and users may have to await further updates of the dataset to ensure proper use. This is further emphasized by noting that, despite homogenization efforts, the current dataset has remaining weaknesses in the temporal coverage and frequency of observations. Consequently, CLARA-A1 applications aiming at performing global trend analyses from this first edition of the dataset must be made with great care. To illustrate the problem with the temporal sampling, we will now examine more closely the CLARA-A1 time series of daily mean global CFC over the full period 1982–2009. We have chosen only the CFC product for this illustration, but it must be borne in mind that all other products are affected to some extent by the quality of the CFC product.
Figure 13 shows the daily mean cloud fraction for CLARA-A1 over the analysed period, together with the corresponding daily mean CFC from the PATMOS-x version 5 dataset with cloud screening based on methods described by Heidinger et al. (2012). We immediately notice a clear decreasing trend for CLARA-A1 CFC over the period amounting to approximately 10 %. The corresponding trend for PATMOS-x is approximately 5 %. Since the CFC values for CLARA-A1 and PATMOS-x are more or less the same at the beginning of the period, PATMOS-x values are about 5 % higher than CLARA-A1 at the end of the period. This is also consistent with results in Fig. 3, where PATMOS-x values are generally higher than CLARA-A1, in particular near the poles.
Thus, both datasets indicate a negative global temporal trend (although with different magnitudes) in CFC over the period. However, if we compare CLARA-A1 results with results from all available surface stations (synoptic observations) in Fig. 14, we see a trend only in the difference between the two datasets. This means that only the satellite results have a negative trend (bias trend shown in lower panel of Fig. 14). To avoid being influenced by a changing surface observation network, we have here only used surface stations that were active over the full observation period. Unfortunately, this biases the geographic distribution of the 165 stations included to primarily European and North American stations. Thus, results in Fig. 14 are more representative for the Northern Hemisphere than for the entire globe. Nevertheless, we do not see signs of a large negative trend in cloudiness for this restricted surface observation dataset.
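The kind of comparison described above (mean error, RMS error, and a trend in the difference between co-located satellite and surface cloud fractions) can be sketched as follows. The function names and the short sample series are hypothetical and serve only to show the calculation; they are not values from Fig. 14.

```python
import math

def mean_error(sat, ref):
    """Mean of (satellite - reference) over co-located values."""
    diffs = [s - r for s, r in zip(sat, ref)]
    return sum(diffs) / len(diffs)

def rms_error(sat, ref):
    """Root-mean-square of (satellite - reference)."""
    diffs = [s - r for s, r in zip(sat, ref)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def linear_trend(values):
    """Ordinary least-squares slope per time step for an evenly spaced series."""
    n = len(values)
    t_mean = (n - 1) / 2.0
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den

# Hypothetical co-located cloud fractions (%) for four consecutive periods
synop = [65.0, 65.0, 65.0, 65.0]
satellite = [66.0, 64.0, 62.0, 60.0]

bias_series = [s - r for s, r in zip(satellite, synop)]
bias_trend = linear_trend(bias_series)   # trend in the difference
synop_trend = linear_trend(synop)        # trend in the surface data alone
```

Here the surface series has no trend while the difference series does, which is the pattern that points to the satellite record as the source of the trend.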
Another interesting fact is that, when looking at daytime and night-time results from CLARA-A1 separately (not shown here), no or only weak trends are seen, as opposed to results for the total satellite-based dataset. However, we also note that CLARA-A1 CFC values are generally lower at night and in twilight conditions, pointing to a slightly different cloud detection efficiency between day and night. Consequently, we might suspect that the trend seen for both satellite datasets in Fig. 13 might at least partly be explained by changes in the temporal sampling of observations throughout the period (as illustrated in Fig. 1). The introduction of morning-evening satellites in the 1990s, and even a slight dominance of morning-evening satellites during the last 10 yr, could be responsible for creating this trend in global cloud amounts. The fact that the two methods show different slopes indicates that additional differences (e.g. use of different image features and input datasets) also influence results. Future editions of the CLARA dataset need to address all these limitations and differences.
A final remark on the usefulness of the CLARA-A1 dataset is that a central criterion for deciding whether a dataset is useful for an application is the availability of transparent and extensive documentation. This important issue is well covered by the CLARA-A1 dataset discussed here. All individual components of the dataset are well documented and validated. The documentation comprises user manuals, validation reports, and algorithm theoretical baseline documents (all documents available at www.cmsaf.eu). The extensive validation enables a good estimation of application uncertainties induced by the CLARA-A1 datasets.

Conclusions and future plans
This paper has described the CLARA-A1 dataset, a 28 yr cloud, surface albedo, and radiation budget dataset based on data from the AVHRR sensor on polar-orbiting operational meteorological satellites. Its content, anticipated accuracies, limitations, and potential applications have been described in some detail. However, the evaluation and validation of the products have been extensive and we intend to provide more details in subsequent papers.
The dataset has its strength in the long duration, its foundation upon a homogenized AVHRR radiance data record, and in some unique features compared to other available datasets. For example, we would like to highlight the availability of 28 yr of polar summer surface albedo and cloudiness parameters. Quality characteristics are also well investigated and particularly useful results can be found over the tropics, mid to high latitudes and over nearly all oceanic areas.
Being the first CM SAF dataset of this kind, some shortcomings and limitations have been identified, especially with regard to daytime cloud retrieval results over the subtropical land regions and also the polar winter results in the region closest to the poles. Also, retrievals over regions with high aerosol-loading conditions are an issue in the surface albedo dataset. However, commitments to perform two additional reprocessing events within the time frame 2013–2018 aim at upgrading the dataset to much-improved levels. For example, it will include extension of the dataset with data forward in time for years 2010–2015 and backward in time to 1978 (including data from the AVHRR/1 sensor starting with the Tiros-N satellite). The ultimate goal is that this, together with actions to harmonize results for night-time and daytime conditions and to correct for orbital drift effects, will eventually lead to more trustworthy results with the potential to describe real global and regional trends of the various derived parameters.

Fig. 1. Visualization of the NOAA satellites used in CLARA-A1. The NOAA satellite numbers (ordinate) are shown as a function of length of observational period (abscissa). Notice that number 20 denotes Metop-A. Some data gaps are present but only for isolated months for NOAA-7, NOAA-9, NOAA-12 and NOAA-14.

Fig. 2. Global monthly mean cloud fractional coverage (CFC) (top) and cloud-top pressure (CTP) [hPa] (bottom) for July 2007 derived from four satellites (see Fig. 1). Regions without values are grey-shaded (here resulting from problems due to insufficient radiometric resolution for very cold surfaces in Antarctica during the polar winter).

Fig. 4. Fraction of liquid clouds relative to total cloud fraction (top panel), all-sky liquid water path (middle panel) and all-sky ice water path (bottom panel) for the month of July 2007. Regions without values are grey-shaded. This concerns locations for which no retrievals were performed because of too high surface albedo (Greenland) or solar zenith angle (Southern Hemisphere high latitudes).

Fig. 7. Global monthly mean surface albedo for July 2007 (top). Corresponding plots for two polar grids are shown at the bottom of the figure: one for the Arctic region (bottom left) and one for the Antarctic region (bottom right, but observe that the month here is January instead of July). Regions without values are grey-shaded (here resulting from dark conditions prevailing close to Antarctica during the polar winter).

Fig. 8. [Caption not recoverable; the surviving fragments refer to CERES.]

Fig. 9. Diagram of the calculation of the surface solar incoming radiation for all-sky conditions. The required input data is shown on the left side of the diagram; the right part represents the calculation of the surface solar irradiance using the look-up tables for the TOA albedo. The figure is taken from Mueller et al. (2009).

Fig. 10. [Caption not recoverable.]

Fig. 12. Multi-year mean of July from the CLARA-A1 surface downwelling longwave radiation [W m−2] dataset. Green dots correspond to BSRN surface stations, where the dataset fulfils the accuracy requirements of 10 W m−2 for monthly means.

Fig. 13. Daily mean cloud fraction for PATMOS-x (red) and CLARA-A1 (black). The green line shows the corresponding monthly mean value for CLARA-A1. Results are computed from Level-2B datasets of all (ascending/descending) overpasses from all NOAA satellites.

Fig. 14. Time series of cloud fraction (CFC, top panel) and the mean error and the RMS error (bottom panel) compared to observations from SYNOP stations available for the full period 1982–2009. Observe that all results are co-located; thus CLARA-A1 results (GAC ALL in the figure) are only those matched over the selected surface stations.

Table 2. Channel 3A and 3B operations for the AVHRR/3 instruments during daytime.

Table 3. Overview of available CM SAF CLARA-A1 radiation datasets. The resolution of the datasets is 0.25° × 0.25°. The accuracy of the data is defined by the mean absolute differences between BSRN surface measurements and satellite-based data. The estimated uncertainties of the radiation budgets (SNL, SNS, SRB) and the cloud radiative effects are calculated by error propagation. The bias is used as input for the error propagation.