EARLINET evaluation of the CATS L2 aerosol backscatter coefficient product

We present the evaluation activity of the European Aerosol Research Lidar Network (EARLINET) for the quantitative assessment of the Level 2 aerosol backscatter coefficient product derived by the Cloud-Aerosol Transport System (CATS) onboard the International Space Station (ISS). The study employs correlative CATS and EARLINET backscatter measurements within 50 km distance between the ground station and the ISS overpass and as close in time as possible, typically with the start or stop time of the EARLINET measurement window within 90 minutes of the ISS overpass, from February 2015 to September 2016. The results demonstrate good agreement between the CATS Level 2 backscatter coefficient and EARLINET. Three ISS overpasses close to the EARLINET stations of Leipzig (Germany), Évora (Portugal) and Dushanbe (Tajikistan) are analysed here to demonstrate the performance of the CATS lidar system under different conditions. The results show that under cloud-free, relatively homogeneous aerosol conditions CATS is in good agreement with EARLINET, independently of daytime/nighttime conditions. CATS low negative biases, partially attributed to the inability of lidar systems to detect tenuous aerosol layers with backscatter signal below the minimum detection thresholds, may lead to systematic deviations and slight underestimations of the total Aerosol Optical Depth (AOD) in climate studies. In addition, CATS misclassification of aerosol layers as clouds, and vice versa, in cases of coexistent and/or adjacent aerosol and cloud features, may lead to non-representative, unrealistic and cloud-contaminated aerosol profiles. Regarding solar illumination conditions, low negative biases in CATS backscatter coefficient profiles, of the order of 6.1 %, indicate the good nighttime performance of CATS. During daytime, the signal-to-noise ratio reduced by solar background illumination prevents retrievals of weakly scattering atmospheric layers that would otherwise be detectable during nighttime, leading to higher negative biases, of the order of 22.3 %, in CATS daytime performance.

Introduction
The Cloud-Aerosol Transport System (CATS) is a satellite-based elastic backscatter lidar developed to provide near-real-time, vertically resolved information on the distribution of aerosols and clouds in the Earth's atmosphere (McGill et al., 2015). Developed at NASA's Goddard Space Flight Center, CATS is based on the Cloud Physics Lidar (CPL; McGill et al., 2002) and the Airborne Cloud-Aerosol Transport System (ACATS; Yorks et al., 2014), designed to operate onboard the high-altitude NASA ER-2 aircraft. CATS operated as a scientific payload onboard the Japanese Experiment Module - Exposed Facility (JEM-EF), utilizing the International Space Station (ISS) as a space science platform. Starting from 10 February 2015, CATS provided aerosol and cloud profile observations along the ISS flight track for more than 33 months, until 30 October 2017, when the system suffered an unrecoverable fault.
CATS was developed to meet three main science goals. The primary objective was to measure and characterize aerosols and clouds on a global scale. The spaceborne lidar orbited the Earth at an altitude of approximately 405 km and 51-degree inclination. The use of the ISS as an observation platform facilitated for the first time global lidar-based climatic studies of aerosols and clouds at various local times (Noel et al., 2018; Lee et al., 2018). In addition, near-real-time data acquisition of the CATS observations was developed towards the improvement of aerosol forecast models (Hughes et al., 2016). A secondary objective was related to the need for long-term and continuous satellite-based lidar observations to be available for climatic studies. The first spaceborne lidar mission, the Lidar In-space Technology Experiment (LITE; McCormick et al., 1993) in 1994, was succeeded by the joint NASA and Centre National d'Études Spatiales (CNES) Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) mission in June 2006 (Winker et al., 2007). Since 2009 the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) instrument (Winker et al., 2009) onboard CALIPSO has operated on the secondary backup laser. The launches of the post-CALIPSO missions, the joint European Space Agency (ESA) and JAXA satellite Earth Cloud Aerosol and Radiation Explorer (EarthCARE; Illingworth et al., 2015) and NASA's Aerosols, Clouds, and Ecosystems (ACE), are planned for 2021 and post-2020 respectively. The CATS project was partially intended to fill a potential gap in global lidar observations of the vertical profiles of aerosols and clouds. The third scientific objective of CATS was to serve as a low-cost technological demonstration for future satellite lidar missions (McGill et al., 2015). Its science goal to explore different technologies was fulfilled through the use of photon-counting detectors and of two low-energy (1-2 mJ), high repetition rate (4-5 kHz) Nd:YVO4 lasers (Multi-Beam and High Spectral Resolution Lidar (HSRL)-UV demonstrations), aiming to provide simultaneous multiwavelength observations (355, 532 and 1064 nm). Additional gains of the CATS project were related to the exploitation and risk reduction of newly applied laser technologies (high repetition rate, injection seeding, wavelength tripling at 355 nm), to pave the way for future spaceborne lidar missions.
CATS performance has been validated against ground-based AErosol RObotic NETwork (AERONET; Holben et al., 1998) measurements and evaluated against satellite-based Aerosol Optical Depth (AOD) retrievals of the Aqua and Terra Moderate Resolution Imaging Spectroradiometer (MODIS; Levy et al., 2013) and active CPL (McGill et al., 2002) and CALIPSO CALIOP (Winker et al., 2009) [...] surface shows also good shape agreement, despite an apparent CALIOP underestimation in the lowest 2 km height. CATS and CALIOP observations were used by Rajapakshe et al. (2017) to study the seasonally transported aerosol layers over the SE Atlantic Ocean. The comparative analysis reported similar geographical patterns in Above Cloud Aerosols (ACA), Cloud Fraction (CF) and ACA occurrence frequency (ACA_F) between CATS and CALIOP retrievals. However, the authors also reported differences between CATS and CALIOP vertical aerosol distributions, with the ACA bottom height identified by CATS lower than that of CALIOP. Noel et al. (2018) used measurements from CATS to investigate the diurnal cycle and variations of clouds over land and ocean. The authors showed that CATS and CALIOP profiles and CF agree well in both vertical patterns and values at 01:30 and 13:30 LT, over both land and ocean, with minor differences of the order of 2-7 % throughout the entire cloud profiles. CATS depolarization measurements, which are critical in the processing algorithms of aerosol subtype classification, were investigated in the case of desert dust, smoke from biomass burning and cirrus clouds, and were found consistent and in good agreement with depolarization measurements from previous studies and historical datasets based on CPL (Yorks et al., 2011) and CALIOP (Liu et al., 2015).
Overall, CATS retrievals have been evaluated and found in reasonable agreement with ground-based AERONET, airborne CPL and satellite-based MODIS and CALIOP measurements. However, the quality assessment of CATS backscatter coefficient profiles requires a large-scale and dense network of ground-based lidar systems, in order to facilitate high-quality collocated and concurrent measurements. This necessity is largely related to the ISS orbital characteristics and the CATS near-nadir viewing (0. [...] EARLINET employs advanced Raman lidar systems and is characterized by extensive geographical coverage.
In this paper, we utilize EARLINET for the evaluation of the CATS Level 2 aerosol backscatter coefficient product at 1064 nm. The paper is structured as follows: in section 2 we introduce the aspects of CATS and EARLINET relevant to the study and present and discuss the methodology. Specific study cases are evaluated and discussed in section 3. Section 4 presents the general intercomparison results between CATS and EARLINET, while the concluding remarks on the CATS-EARLINET backscatter coefficient evaluation are summarized in section 5.

CATS
The CATS elastic backscatter lidar was designed to provide near-real-time measurements of the vertical profiles of aerosol and cloud optical properties at three wavelengths (355, 532 and 1064 nm). As a payload of the JEM-EF on the ISS, CATS was designed to operate two high repetition rate lasers in three different Modes and at four instantaneous fields of view (iFOV). Mode 1 was designed as a multi-beam backscatter and depolarization configuration at 532 and 1064 nm, where a beam splitter would produce two footprints of 14.38 m diameter on the Earth's surface, to the left side FOV (LSFOV) and the right side FOV (RSFOV) of the ISS orbit track, separated by a distance of approximately 7 km. Mode 2 was designed as a demonstration of HSRL, to provide backscatter profiles at 532 nm and backscatter and depolarization ratio profiles at 1064 nm (Forward FOV, FFOV). Mode 3 was designed to provide backscatter at 355, 532 and 1064 nm, and depolarization ratio at 532 and 1064 nm. CATS was a technology demonstration designed to operate on-orbit for a minimum of six months and up to three years. Due to a failure in the CATS optics at the 355 nm wavelength, CATS did not operate in Mode 3, while the use of Mode 1 was limited to the period between 10/02/2015 and 21/03/2015 due to a failure in the electronics of laser 1. Nevertheless, the successful long-term operation of Mode 2, between 02/2015 and 10/2017, allowed CATS to fulfil its science objectives.
CATS processing algorithms (Pauly et al., 2019) rely heavily on the processing algorithms developed in the framework of the CPL, ACATS and CALIPSO lidar systems (Palm et al., 2002; Yorks et al., 2011; Hlavka et al., 2012), while CATS products are provided in different levels of processing. CATS Level 1B data include vertical profiles of total and perpendicular attenuated backscatter signals, range-corrected, calibrated and annotated with ancillary meteorological parameters, based on previous work using CPL and CALIPSO (McGill et al., 2007; Powell et al., 2009; Vaughan et al., 2010). Level 2 products provide the vertical distribution of aerosol and cloud properties (depolarization ratio, backscatter and extinction coefficient profiles at 1064 nm, FFOV), with a horizontal and vertical resolution of 5 km and 60 m respectively. In addition, Level 2 data include geophysical parameters of the identified atmospheric layers (vertical feature mask: feature type, aerosol subtype), the required horizontal averaging and information on the feature type classification confidence. In addition to the CATS Level 2 Feature Type (namely: clear air, cloud, aerosol and totally attenuated), the algorithm provides the confidence level of the Feature Type classification, similar to the CALIOP Cloud-Aerosol-Discrimination (CAD) algorithm (Liu et al., 2004; Liu et al., 2009). The CATS Feature Type Score is based on a multidimensional probability density function (PDF) developed from multiyear CPL observations that discriminates cloud and aerosol features, assigning an integer between -10 and 10 to each detected atmospheric layer. In this study, we used CATS Level 2 v2.01 profiles. A comprehensive overview of the CATS instrument and CATS science goals is given by McGill et al. (2015) and Yorks et al. (2016), while detailed information about CATS datasets and an image browser can be found in the CATS Data Release Notes, Quality Statements and Theoretical Basis, available at https://cats.gsfc.nasa.gov/ (last access: 20 December 2018).

EARLINET
The main objective of EARLINET is to establish an extended, coordinated and continent-wide network of sophisticated ground-based Raman lidar systems. The vertical distribution of aerosols in the atmosphere, as well as their temporal evolution, are provided by high-resolution EARLINET measurements over Europe. The long-term continuous operation of the EARLINET infrastructure has fostered a quantitative, comprehensive and statistically significant database of the distribution of aerosol on a continental scale (Bösenberg et al., 2003; Pappalardo et al., 2014). Since the beginning of the initiative in 2000, EARLINET has significantly increased its observing and operational capacity and capability. Currently, EARLINET is composed of twenty-nine operating lidar stations distributed over Europe (Fig. 1), including seven admitted or joining stations. EARLINET stations are classified as "active", "not permanent", "joining" or "not active". A station is classified as active on condition of regularly contributing aerosol backscatter/extinction coefficient profiles to the EARLINET database (https://www.earlinet.org/, last access: 20 December 2018). Lidar observations in the framework of EARLINET are performed regularly and simultaneously according to a common schedule, on preselected dates. The schedule involves three measurements per week, one during daytime around local noon (Monday, 14:00 ± 1h) and two during nighttime (Monday/Thursday, sunset + 2/3h), to enable Raman extinction retrievals. In addition to the preselected dates of the operation schedule, dedicated measurements are performed to monitor special events such as major volcanic activity (Ansmann et al., 2010; Ansmann et al., 2011; Pappalardo et al., 2013; Perrone et al., 2012; Sicard et al., 2012; Wang et al., 2008), long-range transport of Saharan dust (Ansmann et al., 2003; Solomos et al., 2017, 2018) and smoke particles (Ortiz-Amezcua et al., 2017; Janicka et al., 2017; Stachlewska et al., 2018). Some EARLINET systems, for example the PollyXT systems, meanwhile perform continuous 24/7 measurements (Engelmann et al., 2016; Baars et al., 2016). The quality and performance of the EARLINET systems are assured and improved through the intercomparison of both the infrastructure (Wandinger et al., 2015) and the optical products (Böckmann et al., 2004; Pappalardo et al., 2004). In addition, the homogenization of the lidar data in a standardized output format is facilitated, and an automatic algorithm has been developed to further address the quality assurance of the lidar measurements (the Single Calculus Chain (SCC); D'Amico et al., 2015; D'Amico et al., 2016; Mattis et al., 2016). The SCC has been used in near-real time to show the potential operationality of the network in a 72-hr continuous measurement exercise in 2012 (Sicard et al., 2015).
Due to its inherent characteristics, EARLINET is an optimal tool to support satellite-based lidar missions, with extensive experience in satellite calibration and validation activities. EARLINET and CALIPSO (Winker et al., 2009) correlative measurements are regularly performed in order to investigate the quality of the CALIOP observations, to test the presence of possible biases, and to assess the aspects of spaceborne lidar measurements (e.g. Pappalardo et al., 2010; Mamouri et al., 2009; Mona et al., 2009; Perrone et al., 2011; Wandinger et al., 2011; Amiridis et al., 2013; Grigas et al., 2015; Papagiannopoulos et al., 2016). Similarly, the validation programs of the ESA Atmospheric Laser Doppler Instrument (ALADIN) onboard Aeolus (Stoffelen et al., 2005; Ansmann et al., 2007) and of the ESA-JAXA EarthCARE (Illingworth et al., 2015) are highly dependent on ground-based EARLINET correlative measurements. In addition, EARLINET supports the homogenization of the different satellite missions. CALIOP is a two-wavelength polarization-sensitive lidar that operates at 532 and 1064 nm, while ESA's ALADIN onboard Aeolus and the ESA-JAXA ATLID onboard EarthCARE operate at 355 nm and NASA's CATS lidar at 532 and 1064 nm in Mode 1 and 1064 nm in Mode 2 (Yorks et al., 2014). EARLINET supports the continuity of satellite lidar missions through the calculation of aerosol-dependent spectral conversion factors between different wavelengths, to homogenize different missions at different operating wavelengths in order to provide a long-term 3D climatic record from space (Amiridis et al., 2015; Marinou et al., 2017; Proestakis et al., 2018).
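The idea of a spectral conversion factor between lidar wavelengths can be illustrated with a minimal sketch. It assumes a simple power-law wavelength dependence described by a backscatter-related Ångström exponent; the function name, the exponent value and the input backscatter value are hypothetical choices for illustration and are not the conversion factors derived in the cited studies.

```python
def convert_backscatter(beta_src, wl_src_nm, wl_dst_nm, angstrom_exp):
    """Scale a backscatter coefficient from wl_src_nm to wl_dst_nm.

    Assumes beta ~ wavelength ** (-angstrom_exp); the exponent depends on
    the aerosol type and must be supplied by the user (assumption).
    """
    return beta_src * (wl_dst_nm / wl_src_nm) ** (-angstrom_exp)


# Illustrative values only: 2.0e-6 m^-1 sr^-1 at 532 nm, exponent of 1.0.
beta_532 = 2.0e-6
beta_1064 = convert_backscatter(beta_532, 532.0, 1064.0, angstrom_exp=1.0)
print(beta_1064)
```

With an exponent of 1.0, doubling the wavelength halves the backscatter coefficient. In practice the exponent would differ between aerosol types such as dust, smoke or marine particles, which is why the text refers to aerosol-dependent factors.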

Comparison methodology
To obtain a significant number of collocated and concurrent EARLINET-CATS cases, a large number of EARLINET stations contributed to the CATS evaluation activity. Figure 1 shows the geographical distribution over Europe and Asia of the EARLINET stations active during the study, including the daytime/nighttime overpasses of the ISS within the evaluation period, between 02/2015 and 09/2016, encompassing the first twenty months of CATS operation. The green circles denote the stations participating in the EARLINET-CATS intercomparison activity (in alphabetical order: Athens-NOA, Athens-NTUA, Barcelona, Belsk, Bucharest, Cabauw, Dushanbe, Évora, Hohenpeissenberg, Lecce, Leipzig, Potenza, Thessaloniki and Warsaw). All participating stations operate high-performance multiwavelength lidar systems. Six of the contributing stations (Athens-NOA, Cabauw, Dushanbe, Évora, Leipzig and Warsaw) are part of the PollyNET subnetwork (http://polly.tropos.de/), operating 24/7 portable, remote-controlled multiwavelength-polarization-Raman lidar systems (PollyXT; Baars et al., 2016; Engelmann et al., 2016). Due to the geographical distribution of EARLINET stations, the evaluation activity accounts for a large variety of aerosol types (marine, urban, desert dust, smoke). Table 1 provides the locations of the EARLINET stations contributing to this analysis, along with the surface elevation and the respective identification codes.
In order to quantitatively address the accuracy and representativeness of CATS retrievals, we follow the methodology introduced by EARLINET for CALIOP validation, which is based on correlative independent measurements (Pappalardo et al., 2010). For the validation of spaceborne lidar observations, the spatial and temporal variability of the atmospheric scene is of fundamental significance. The effect of the distance between ground-based and spaceborne lidar measurements was investigated in the framework of the CALIPSO validation. In particular, EARLINET-based studies attribute a discrepancy of the order of 5 % to the intercompared signals when the horizontal distance between the EARLINET station and the spaceborne lidar footprint is below 100 km (Mamouri et al., 2009; Mona et al., 2009; Pappalardo et al., 2010; Papagiannopoulos et al., 2016). In the context of the applied validation criteria, we selected CATS measurements within 50 km horizontal distance between the EARLINET stations and the ISS subsatellite overpass position. In addition, the correlative measurements should be as close in time as possible: EARLINET contributed measurements typically with the start or stop time of the measurement window within 90 minutes of the ISS overpass. The EARLINET-CATS cases considered in the assessment of the accuracy and representativeness of CATS backscatter coefficient profiles are provided in Table 2, including the name of the EARLINET station, the EARLINET measurement window, the ISS overpass time, the minimum distance between the corresponding EARLINET station and the CATS lidar footprint, and the daytime/nighttime information.
The number of available cases for the intercomparison is subject to a number of constraints. First and foremost, the orbital inclination of the ISS does not allow overpasses close to EARLINET stations north of 52.2° latitude. Second, the ISS crossing time and ground track over an area are highly variable, increasing the probability that the overpass time falls outside the predefined common and fixed schedule of EARLINET measurements. In addition, to account for contamination effects of multiple scattering and specular reflection in the intercomparison process, only cloud-free (including cirrus-free) atmospheric scenes are used. Cases with cirrus detected in the EARLINET range-corrected signal quicklooks, in the ISS-CATS backscatter coefficient profiles or in the feature type profiles are not considered in the study. Initially, the presence of clouds is investigated using CATS backscatter coefficient and depolarization time-height images and the EARLINET range-corrected signal. Cases for which the retrieval of the EARLINET temporally averaged profile is not feasible due to the presence of clouds, and/or CATS cases in which clouds propagated into the CATS spatially averaged profile, are discarded from the analysis. Regarding CATS, the "Sky_Condition" flag is used to screen cloudy (no aerosols) and hazy/cloudy (both clouds and aerosols) profiles from the analysis. The "Feature_Type_Score" parameter stored in the Level 2 data was additionally used to remove aerosol cases of medium/low confidence from the comparison process ("Feature_Type_Score" ≥ -1). Applying all match-up selection criteria resulted in a total of 47 correlative EARLINET-CATS cases suitable to quantitatively address the accuracy and representativeness of the CATS Level 2 backscatter coefficient product at 1064 nm. The CATS requirements applied in the methodology are summarized in Table 3.
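The match-up criteria described above can be sketched as a simple filter. This is a minimal illustration under stated assumptions, not the authors' processing code: the `Candidate` fields, the spherical-Earth haversine distance, the example coordinates and times, and the reading of the score criterion (layers with "Feature_Type_Score" ≥ -1 are rejected, so scores ≤ -2 are kept) are all assumptions introduced here.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


@dataclass
class Candidate:
    """Hypothetical container for one station/overpass pair."""
    station_lat: float
    station_lon: float
    footprint_lat: float     # CATS footprint closest to the station
    footprint_lon: float
    window_start: datetime   # ground-based measurement window
    window_stop: datetime
    overpass: datetime       # ISS overpass time
    cloud_free: bool         # outcome of the cloud and cirrus screening
    feature_type_score: int  # CATS score, integer in [-10, 10]


def is_matchup(c, max_dist_km=50.0, max_dt=timedelta(minutes=90), max_score=-2):
    """Apply the distance, time, cloud and confidence criteria."""
    dist_ok = haversine_km(c.station_lat, c.station_lon,
                           c.footprint_lat, c.footprint_lon) <= max_dist_km
    # Start or stop of the measurement window within max_dt of the overpass.
    dt = min(abs(c.window_start - c.overpass), abs(c.window_stop - c.overpass))
    time_ok = dt <= max_dt
    score_ok = c.feature_type_score <= max_score
    return dist_ok and time_ok and c.cloud_free and score_ok


# Made-up example values, for illustration only.
case = Candidate(51.35, 12.43, 51.60, 12.80,
                 datetime(2015, 9, 13, 0, 30), datetime(2015, 9, 13, 2, 0),
                 datetime(2015, 9, 13, 1, 10), True, -8)
print(is_matchup(case))
```

Keeping the thresholds as keyword arguments makes it easy to test how the number of accepted cases changes when, for example, the distance limit is relaxed from 50 km towards the 100 km used in earlier CALIPSO studies.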

Particle backscatter coefficient retrievals from ground based lidars at 1064 nm
In order to evaluate the CATS Level 2 aerosol backscatter product at 1064 nm we utilized backscatter coefficient profiles calculated either with the SCC algorithm or, in case of PollyXT lidar systems, with independently developed user assisted retrieval algorithms (Baars 2016). The EARLINET backscatter coefficient profiles used in this study are calculated with the SCC version 4 algorithm (for the stations that are not part of PollyNET) and with the 5 methodology described in Haarig et al., 2017 (for the stations that are part of PollyNET). The SCC algorithm (D'Amico et al., 2015;D'Amico et al., 2016;Mattis et al., 2016) is developed in the concept of sustaining the homogeneity of aerosol products derived from different EARLINET lidar systems while satisfying the need for coordinated, quality assured measurements. It consists of five different modules, including one for handling the preprocessing of raw lidar data by applying all the necessary instrumental corrections to the signal and a module for 10 providing the final aerosol optical products, namely the particle backscatter and extinction coefficient. In particular, SCC algorithm calculates the backscatter coefficient with the iterative method (Di Girolamo et al., 1995), using only the elastic lidar channels. To calculate the b1064nm with these methods, an assumption of the lidar ratio value is required (as a profile or a height independent value, representative of the corresponding atmospheric scene) and the selection/determination of a reference height (R0), usually chosen at an altitude range with the minimum aerosol 15 contribution. All methods applied within the SCC, have been tested against synthetic (Mattis et al., 2016) and real lidar data (D'Amico et al., 2015). 
The comparison showed that by using only the signal from the elastic channels, the mean relative deviation in the calculation of the aerosol backscatter coefficient at 1064 nm is less than 30 % (Althausen et al., 2009; Baars et al., 2012; Engelmann et al., 2016; Hänel et al., 2012), thus meeting the quality assurance requirements of EARLINET. None of the lidar systems participating in the present study is equipped with a rotational-vibrational Raman channel excited at 1064 nm, such as the one recently reported by Haarig et al. (2017). In the case of PollyXT lidars, for the daytime backscatter coefficient calculations, the Fernald-Klett method (Klett, 1981; Fernald, 1984) is implemented assuming a height-independent lidar ratio. For the nighttime calculations, the Raman channel at 607 nm is additionally used (Baars et al., 2016). In this retrieval, P607 and P1064 stand for the power received from a distance R, with respect to the lidar system, at 607 nm and 1064 nm, respectively. The constant C at 607 or 1064 nm contains all range-independent system parameters. The overlap function O(R), which is less than unity for the altitude range where the laser beam is not completely inside the receiving telescope field of view (Wandinger and Ansmann, 2002), is assumed identical between the two channels, which is the case for PollyXT systems, as they use one beam expander for all three emitted wavelengths. βmol and βpar represent molecular and particle backscattering, respectively, whereas αmol and αpar are the molecular and particle extinction coefficients.
Finally, in order to perform the intercomparison between CATS and EARLINET profiles, the high resolution of the EARLINET profiles was lowered to match the vertical resolution of the CATS profiles (i.e. 60 m). Profiles of similar vertical resolution were obtained by computing the EARLINET mean backscatter coefficient value from all EARLINET bins within each CATS 60 m backscatter coefficient height range. The resulting EARLINET profiles followed with high accuracy the characteristics and tendencies, both qualitative and quantitative, of the initial EARLINET profiles, despite the loss of vertical resolution (Iarlori et al., 2015).
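The regridding step can be illustrated with a short sketch that averages all fine-resolution bins falling inside each 60 m layer. It assumes, purely for illustration, that the coarse layers start at height 0 and are aligned to multiples of 60 m; the actual CATS height grid may be offset, and the function name `regrid_to_cats` is hypothetical.

```python
def regrid_to_cats(heights_m, beta, cats_bin_m=60.0):
    """Average fine-resolution values into consecutive layers of width cats_bin_m.

    Returns a list of (layer_centre_height_m, mean_value) tuples, one per
    layer that contains at least one input bin.
    """
    sums, counts = {}, {}
    for z, b in zip(heights_m, beta):
        k = int(z // cats_bin_m)              # index of the 60 m layer
        sums[k] = sums.get(k, 0.0) + b
        counts[k] = counts.get(k, 0) + 1
    return [((k + 0.5) * cats_bin_m, sums[k] / counts[k]) for k in sorted(sums)]


# Synthetic profile sampled every 15 m (values in arbitrary units).
heights = [0.0, 15.0, 30.0, 45.0, 60.0, 75.0, 90.0, 105.0]
values = [1.0, 1.0, 3.0, 3.0, 2.0, 2.0, 4.0, 4.0]
coarse = regrid_to_cats(heights, values)
```

Simple bin averaging preserves the layer-mean backscatter, which is the quantity compared against CATS, at the cost of smoothing structures thinner than 60 m.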

Demonstration of the comparison methodology for a case study over Athens
To illustrate the evaluation methodology for the CATS Level 2 aerosol backscatter coefficient at 1064 nm, a pair of collocated and concurrent CATS and EARLINET lidar observations is shown in Figure 2. The example refers to a nighttime ISS overpass of the coastal city of Athens, Greece, on 1 February 2016. During that period, the PollyXT-NOA system was operating in a 24/7 mode in Athens, at the premises of the National Observatory of Athens, to fulfill the needs of an ACTRIS Joint Research Activity (JRA) for aerosol absorption (Tsekeri et al., 2018). At the same time, on Monday 1 February 2016, the lidar station operating at the National Technical University of Athens (NTUA) was performing nighttime measurements according to the EARLINET schedule of regular and simultaneous measurements, in order to enable Raman extinction retrievals. The closest distances between the CATS footprint of the ISS overpass and the locations of the EARLINET-at (NTUA) and EARLINET-no (NOA) stations were approximately 18.58 and 23.3 km at 17:24 UTC (Fig. 2a). The vertical distribution of aerosols and clouds is shown in the CATS 1064 nm backscatter coefficient quicklook (Fig. 2b) and the PollyXT-NOA lidar range-corrected signal at 1064 nm, between 01/02/2016 at 12:00 UTC and 02/02/2016 at 00:00 UTC (Fig. 2c), indicating the vertical homogeneity of the scene. For the comparison of CATS and EARLINET observations, the latter are regridded to the CATS Level 2 vertical resolution (60 m). Accordingly, the CATS spatially averaged and the EARLINET NOA and NTUA temporally averaged backscatter coefficient profiles are qualitatively compared (Fig. 2d).
The observed disagreements between the two EARLINET profiles are related to differences between the two systems, to the different surface elevations of the two stations (86 m for EARLINET-no and 212 m for EARLINET-at), and to the different overlap regions. The horizontal bars in the CATS profile (Fig. 2d) correspond to the standard deviation of the spatially averaged backscatter coefficient profiles.
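A spatially averaged profile with its standard deviation, as drawn by the horizontal bars, can be computed level by level across the individual CATS profiles. The sketch below is only an illustration under stated assumptions: it uses the sample standard deviation, skips the -999.9 invalid marker, and returns `None` where fewer than two valid values exist. How the study treats invalid or fill values in the average is not stated in the text.

```python
import statistics

INVALID = -999.9  # marker for bins where no retrieval was possible


def average_profiles(profiles):
    """Return per-level (means, standard deviations) across several profiles.

    `profiles` is a list of equally long lists, one per CATS profile.
    """
    means, sds = [], []
    for level_values in zip(*profiles):
        valid = [v for v in level_values if v != INVALID]
        means.append(statistics.mean(valid) if valid else None)
        sds.append(statistics.stdev(valid) if len(valid) >= 2 else None)
    return means, sds


# Three synthetic profiles on the same three height levels.
profiles = [
    [0.25, 0.50, INVALID],
    [0.50, 1.00, INVALID],
    [0.75, 0.75, 0.10],
]
mean_profile, sd_profile = average_profiles(profiles)
```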
The comparison of the mean backscatter coefficient profiles retrieved by CATS and the two corresponding EARLINET NOA and NTUA profiles presented in Figure 2 is an initial demonstration of the good agreement between the two products. The CATS instrument reproduces the observed aerosol features, in terms of aerosol load as well as their vertical distribution (Fig. 2d). The assessment of the CATS backscatter coefficient is performed in the region between 0.5 km above ground level of the EARLINET sites, to account for overlap effects between the laser beam and the telescope (Wandinger and Ansmann, 2002), topographic effects, surface returns, and differences of atmospheric samples within the Planetary Boundary Layer (Fig. 2d, shaded area iii), and 10 km height (a.s.l.). An upper limit of 2 Mm⁻¹ sr⁻¹ is applied to the aerosol backscatter coefficient values, in order to account for cloud features possibly misclassified as aerosols. A further constraint is applied to account for very thin layers detected by ground-based lidar systems with backscatter values below the CATS minimum detection limit due to the low Signal-to-Noise Ratio (SNR) values. The discussed constraints are employed because our basic aim is to quantitatively assess the representativeness and accuracy of the aerosol features detected by CATS, while preventing possible contamination (e.g. the presence of clouds) from propagating into the CATS-EARLINET dataset.

Results and discussion

EARLINET-CATS Correlative Cases
To illustrate the strengths and limitations of CATS products, we discuss in detail three selected cases of collocated and concurrent CATS-EARLINET observations close to the EARLINET stations of Leipzig, Évora and Dushanbe. The three case studies represent different atmospheric conditions with an increasing degree of difficulty in the detection of representative aerosol layers by CATS.

Case I: ISS-CATS over Leipzig - 13/09/2016 03:37 UTC
The first overpass considered here is a representative case of a nighttime ISS orbit, on September 13, 2016 (blue line), at a minimum distance of 3.78 km from the EARLINET Leipzig (Germany) PollyXT lidar system (indicated by a white dot), at 03:37 UTC (Fig. 3a). The CATS particulate backscatter coefficient cross section at 1064 nm (Fig. 3b) shows the presence of aerosols up to 2.6 km (a.s.l.). The CATS feature mask algorithm classifies all of the detected layers as aerosols (not shown). The ground-based lidar measurements at the Leipzig station between 00:00 and 12:00 UTC did not report any cloud features either, including cirrus clouds. The CATS spatially averaged and Leipzig temporally averaged profiles were derived from CATS profiles within a horizontal distance of 50 km between the Leipzig station and the ISS footprint, and from Leipzig measurements within 90 minutes of the ISS overpass, respectively (Fig. 3c). The direct comparison of the backscatter coefficient profiles measured by the EARLINET Leipzig station (red line) and CATS (blue line), along with their standard deviations (horizontal error bars), also indicates the presence of aerosol up to 2.6 km height (a.s.l.). The intercompared ISS-CATS and EARLINET-Leipzig profiles are characterized by high agreement, although discrepancies are also present. In the uppermost part of the profiles, between 2.5 and 3 km (a.s.l.), the Leipzig lidar, owing to its higher SNR, is capable of detecting tenuous atmospheric features of low backscatter coefficient values.
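The collocation criteria applied to this case (footprint within 50 km of the station, ground-based measurements within 90 minutes of the overpass) can be sketched as follows. This is an illustrative check, not the authors' matching code: the great-circle distance on a spherical Earth, the function names, and the coordinates and measurement times in the example are assumptions chosen for demonstration.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical-Earth assumption


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_matchup(station, footprints, overpass, meas_start, meas_stop,
               max_km=50.0, max_minutes=90.0):
    """True if the closest footprint and the measurement window both qualify."""
    closest = min(haversine_km(station[0], station[1], lat, lon)
                  for lat, lon in footprints)
    window = timedelta(minutes=max_minutes)
    close_in_time = (
        meas_start <= overpass <= meas_stop
        or abs(meas_start - overpass) <= window
        or abs(meas_stop - overpass) <= window
    )
    return closest <= max_km and close_in_time


# Hypothetical station position, footprints and measurement window.
station = (51.35, 12.43)
footprints = [(51.0, 12.0), (51.35, 12.5), (51.7, 13.0)]
overpass = datetime(2016, 9, 13, 3, 37)
start, stop = datetime(2016, 9, 13, 2, 30), datetime(2016, 9, 13, 4, 30)
matched = is_matchup(station, footprints, overpass, start, stop)
```

Treating an overpass that falls inside the measurement window as a match is an added assumption; the text only states that the start or stop time is typically within 90 minutes of the overpass.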
Although the case presented in Figure 3 corresponds to a nighttime ISS overpass, it is representative of cloud-free and relatively homogeneous aerosol scenes for both daytime and nighttime solar background illumination, demonstrating the overall high performance of CATS and the absence of significant biases under such conditions.
Small biases between the EARLINET and CATS backscatter coefficients are also identified in specific cases. CATS particulate backscatter coefficient profiles are available only for the identified atmospheric features and not as full profiles, as in the case of the attenuated backscatter profiles. When no cloud or aerosol layers are detected and no overlying opaque layers are present, the feature classification algorithm classifies the atmospheric layers as clear air. Clear-air segments, though, are not pristine and aerosol-free, as they frequently contain tenuous particulate layers (Kim et al., 2018). Layers in which no atmospheric features are detected either contain fill values (0.0 km⁻¹ sr⁻¹) or are marked as invalid (-999.9) when the calculation of the particulate backscatter coefficient was not possible.
This scheme of assigning backscatter coefficients only to the detected atmospheric features (e.g. aerosols and clouds) propagates through many of the Level 2 products into the comparison of CATS Level 2 data, and thus into the assessment of the representativeness of CATS observations. Consequently, the comparison of CATS Level 2 backscatter coefficient profiles against EARLINET observations is only possible over the detected atmospheric features. In addition, the identification of the atmospheric features strongly depends on the calibration of the CATS lidar system and on the level of the background signal (solar illumination conditions), due to the different SNR between daytime and nighttime.
For the Évora overpass, CATS shows the absence of aerosol and/or cloud features, while the Évora temporally averaged profile during the cloud-free window (Fig. 4c) indicates the presence of thin aerosol layers in the altitude range between 1 and 2.5 km height (a.s.l.). The aerosol layer detected by the Évora PollyXT lidar system is characterized by backscatter coefficient values lower than 0.3 Mm⁻¹ sr⁻¹. Although CATS is characterized by relatively low minimum detection thresholds, its capabilities are limited in terms of detecting similarly tenuous aerosol layers that lie below these thresholds (e.g. CATS M7.2 Minimum Detectable Backscatter at 1064 nm: night 0.05 ± 0.0077 Mm⁻¹ sr⁻¹, day 1.3 ± 0.24 Mm⁻¹ sr⁻¹, for cirrus clouds; Yorks et al., 2016). This detection limitation may propagate into scientific studies implementing CATS data through underestimations and possible biases.
The assessment of the accuracy of CATS Level 2 data against EARLINET collocated and concurrent observations is performed on the basis of backscatter coefficient profiles, because this product constitutes the CATS Level 2 parameter with the lowest influence of a priori assumptions (e.g. lidar ratio). In addition to the backscatter coefficient, CATS Level 2 data provide the feature classification of the detected layers (namely clear air, cloud, aerosol and totally attenuated) and the numerical confidence level of the classification, similar to the CALIOP Cloud-Aerosol Discrimination (CAD) algorithm (Liu et al., 2004; Liu et al., 2009). The CATS Feature Type Score is based on a multidimensional probability density function (PDF) developed from multiyear CPL observations, which discriminates cloud and aerosol features, assigning an integer between -10 and 10 to each detected atmospheric layer. The cloud-aerosol discrimination, though, is not perfect; aerosol layers may be misclassified as clouds, and vice versa. In the framework of the study, strict cloud filtering is applied in the assessment of the CATS Level 2 aerosol quality. In particular, cloud-contaminated profiles (Sky Condition 2, 3) and aerosol layers characterized by medium/low classification confidence (Feature_Type_Score ≥ -1) are filtered out.
The strict cloud screening is applied because our basic aim is to establish the accuracy of CATS aerosol backscatter coefficient profiles through intercomparison against EARLINET, preventing any contamination by cloud features from propagating into the dataset.
As discussed in the case of the Leipzig overpass, on average, the agreement between CATS Level 2 backscatter coefficient profiles and EARLINET is good, especially under relatively homogeneous cloud-free atmospheric conditions. Under complex atmospheric conditions, though, with coexistent and adjacent aerosol and cloud features, the impact of the CATS Feature Type Score on the CATS aerosol retrievals becomes significant.
The cloud features that cause the observed discrepancies are classified by the CATS CAD algorithm as aerosol layers, contaminating the CATS profile despite the strict cloud screening. Features with an invalid CATS CAD Score, although not frequently observed, may impact the quality of the column aerosol optical depth (AOD) and related climatological studies. In addition, complex topography, erroneous mean backscatter coefficient profiles due to the high variability of the aerosol load in the Planetary Boundary Layer, the horizontal distance between the CATS lidar footprint and the ground-based lidar stations, and surface returns further enhance these discrepancies, especially in the lowermost part of the profiles. Based on this analysis and comparisons with CALIPSO, the CATS cloud-aerosol discrimination algorithm was updated for the V3-00 Level 2 data products (to be released at the end of 2018) to improve the accuracy of the Feature Type and Feature Type Score, especially during daytime.

EARLINET-CATS comparison statistics
In this section an overall assessment of the CATS backscatter coefficient product at 1064 nm is given, using the paired correlative observations. The paired correlative observations show an underestimation of the CATS retrievals, more pronounced during daytime than nighttime. In the case of daytime observations, the calculated mean (median) CATS difference from EARLINET is -0.123 Mm⁻¹ sr⁻¹ (-0.095 Mm⁻¹ sr⁻¹). In the case of nighttime observations, the corresponding mean (median) difference from EARLINET is -0.031 Mm⁻¹ sr⁻¹ (-0.065 Mm⁻¹ sr⁻¹). The observed standard deviation (SD) is 0.431 Mm⁻¹ sr⁻¹ during daytime and 0.342 Mm⁻¹ sr⁻¹ during nighttime. During daytime, minimum and maximum CATS- ... discrepancies are also evident. CATS, as a result of the high spatial atmospheric variability, usually yields higher values of standard deviation than EARLINET. In addition, at altitudes higher than 6 km (a.s.l.), the CATS mean backscatter coefficient profile yields zero or close-to-zero values, while EARLINET shows the presence of elevated aerosols with rather low mean backscatter values, lower than 0.2 Mm⁻¹ sr⁻¹.
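The summary statistics quoted above (mean, median and standard deviation of the CATS minus EARLINET differences) can be reproduced for any set of paired values with a few lines. This is a minimal sketch on synthetic numbers; whether the study uses the sample or the population standard deviation is not stated, so the sample form used here is an assumption, as is the function name `bias_stats`.

```python
import statistics


def bias_stats(cats, earlinet):
    """Mean, median and sample SD of the paired CATS - EARLINET differences."""
    diffs = [c - e for c, e in zip(cats, earlinet)]
    return {
        "mean": statistics.mean(diffs),
        "median": statistics.median(diffs),
        "sd": statistics.stdev(diffs),
    }


# Synthetic paired backscatter values (Mm^-1 sr^-1), for illustration only.
cats_values = [0.50, 0.25, 1.00, 0.75]
earlinet_values = [0.75, 0.50, 1.00, 1.25]
stats = bias_stats(cats_values, earlinet_values)
```

A negative mean difference corresponds to CATS being biased low relative to EARLINET, which is the sign convention used in the text.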
The CATS Level 2 backscatter coefficient product evaluation study shows that CATS agrees reasonably well with ground-based EARLINET measurements, although it is generally biased low. To assess the ability of the CATS lidar to detect aerosol features and optical properties, and to shed light on the origin of the observed CATS-EARLINET discrepancies, the CALIOP validation studies conducted so far offer an unprecedented basis. This is due to the similar viewing geometry of CATS and CALIOP and to the similarities between their Level 1B and Level 2 processing algorithms (McGill et al., 2015; Yorks et al., 2016, 2019).
Since CALIPSO joined the A-Train constellation of Earth observation satellites in June 2006 (Winker et al., 2007), several studies have been conducted to validate and evaluate CALIOP Level 1B, Level 2 and Level 3 products against ground-based, airborne, and spaceborne measurements. Airborne NASA Langley HSRL (Hair et al., 2008) and CPL (McGill et al., 2002) ... (Amiridis et al., 2013; Omar et al., 2013; Schuster et al., 2012). In addition, evaluation studies against AOD observations from the passive spaceborne MODerate resolution Imaging Spectroradiometer (MODIS; Remer et al., 2005) show that CALIOP reproduces known climatic features reasonably well, although with apparent AOD underestimations (Amiridis et al., 2013; Kittaka et al., 2011; Oo and Holz, 2011; Redemann et al., 2012). The magnitude of the documented agreements and biases in the detection of aerosol features varies from study to study and with the CALIOP version used.
Substantial improvement in the detection of aerosol features is expected in the latest CALIPSO Version 4 (AMT CALIPSO special issue).
Overall, CATS, much like CALIOP, observes reasonably well the vertical distribution of atmospheric aerosol backscatter coefficient, although with slight underestimations. The observed discrepancies in the compared CATS-EARLINET profiles are attributed to several sources.
First, the retrieval accuracy of CATS Level 2 data products, such as the aerosol and cloud backscatter and extinction coefficient profiles, the vertical feature mask and the integrated parameters (e.g. AOD), depends crucially on the calibration of the lidar system and the calibration region (Kar et al., 2018). The calibration of the CATS total attenuated backscatter from molecules and particles in the atmosphere is performed in the region between 22 and 26 km, starting with V2-08 of the L1B data (Russell et al., 1979; Del Guasta, 1998; McGill et al., 2007; Powell et al., 2009). Uncertainties in the CATS Level 1B backscatter calibration are attributed to random and systematic errors (CATS ATBD). Random errors result mainly from normalizing the 1064 nm lidar signal to the modeled molecular signal and are dominated by lidar noise. Systematic errors, on the other hand, result from a number of different sources, including uncertainties in the CALIOP stratospheric scattering ratios and in the molecular backscatter coefficient values generated from the Goddard Earth Observing System (GEOS) atmospheric general circulation model and assimilation system used to calculate molecular and ozone atmospheric transmission (Rienecker et al., 2008), and the non-ideal performance of CATS. The total uncertainty due to the CATS calibration constants is estimated between 5 % and 10 % (CATS ATBD). The total uncertainty, the sum of the systematic and random errors, in the CATS ATB at 1064 nm is estimated at 10-20 % for nighttime data and 20-30 % for daytime data.
Secondly, the CATS detection and classification schemes, similar to those of CALIOP, provide Level 2 aerosol products only in regions where aerosol features are detected and identified. This implies that optically thin aerosol layers can go undetected by CATS, due to weak backscattering intensities below the CATS detection thresholds (Kacenelenbogen et al., 2014; Thorsen et al., 2015). To increase the detection of tenuous aerosol layers CATS incorporates an iterated horizontal averaging scheme (5 and ...
Another source of discrepancy between CATS and EARLINET is the horizontal distance between the ground-based lidar systems and the space-based lidar footprint. Studies performed in the framework of EARLINET attribute a discrepancy of the order of 5 % to the intercompared profiles when the horizontal distance is below 100 km (Mamouri et al., 2009; Pappalardo et al., 2010; Papagiannopoulos et al., 2016). The opposite viewing geometry (upward for EARLINET, downward for CATS/CALIPSO) and the different transmittance terms are further sources of discrepancies (Mona et al., 2009). In addition, the enhanced disagreements observed between CATS and EARLINET in the lowermost part of the mean backscatter coefficient profiles are attributed to the high spatial and temporal variability of the aerosol content within the PBL, to the complexity of the local topography and to surface returns.

Summary and conclusions
This study implements independent retrievals carried out at several EARLINET stations to qualitatively and quantitatively assess the performance of NASA's CATS lidar, which operated onboard the ISS from February 2015 to October 2017. We compared satellite-based CATS and independent ground-based measurements over twelve high-performance EARLINET stations across Europe and one located in Central Asia. Our analysis is based on the first twenty months of CATS operation (02/2015-09/2016). Comparison of CATS Level 2 and EARLINET backscatter coefficient profiles at 1064 nm is allowed only when the distance between the ISS overpass and the EARLINET station is below 50 km. EARLINET contributed observations as close in time as possible, typically with the start or stop time of the measurements within 90 minutes of the ISS overpass. The analysis was restricted to cloud-free profiles to avoid possible cloud contamination of the intercompared aerosol backscatter coefficient profiles.
In the quantitative assessment of the performance of CATS, 47 collocated, concurrent and cloud-free measurements of CATS and EARLINET were identified (21 daytime and 26 nighttime), offering a unique opportunity for the evaluation of the space-borne lidar system. The results of the generic comparison are encouraging, demonstrating the overall good performance of CATS, although with negative biases. The agreement, as expected due to the higher SNR, is better during nighttime operation, with observed underestimations of 22.3 % during daytime and 6.1 % during nighttime.
In addition to the generic comparison, three CATS-EARLINET comparison cases were examined to demonstrate the system's performance under different conditions. The comparison showed that under cloud-free, relatively homogeneous atmospheric aerosol conditions, the spatially averaged CATS backscatter coefficient profiles are in good agreement with EARLINET, independently of light conditions. The deficiency of CATS in detecting tenuous aerosol layers, though, due to the inherent limitations of space-based lidar systems, may lead to systematic deviations and slight underestimations of the total AOD in climatic studies. In addition, the CATS V2-01 Feature Type Score misclassification of aerosol layers as clouds, and vice versa, in cases of coexistent and/or adjacent aerosol and cloud features, may lead to non-representative, unrealistic and cloud-contaminated aerosol profiles. While CATS feature identification will improve in the V3-01 data products, the most crucial reason for the observed discrepancies between CATS and EARLINET in the lowermost part of the profiles is related to the complexity of the topography and the geographical characteristics. Especially in the case of large elevation/slope differences, the effects of both inadequate sampling below the maximum elevation and of the different atmospheric sampling volumes result in large AOD biases and unrealistic AOD values.
The qualitative and quantitative agreement between CATS and EARLINET reported in this study is encouraging, especially during nighttime, and will hopefully facilitate further studies implementing CATS observations in the future. For a period of almost three years, CATS provided an unprecedented global dataset of vertical profiles of aerosols and clouds, much like CALIOP, while taking advantage of the unique orbital characteristics of the ISS.
The ISS enabled CATS to provide, for the first time, satellite-based lidar measurements of the diurnal evolution of aerosols and clouds over the tropics and midlatitudes, and more specifically at latitudes below 52°. Since CALIPSO and Aeolus (and in the future also EarthCARE) are polar sun-synchronous satellites with fixed equatorial crossing times (01:30 and 13:30 LT for CALIOP, 06:00 and 18:00 for ALADIN), it is expected that, at least for the near future, the CATS dataset will remain the only available satellite-based lidar source of nearly global diurnal measurements of atmospheric aerosols and clouds. In addition, while CALIOP is a two-wavelength lidar system operating at 532 nm and 1064 nm with depolarization capabilities at 532 nm, CATS provided satellite-based aerosol and cloud depolarization profiles at 1064 nm, thus at a different wavelength. This dataset, much like the CALIOP dataset, is especially useful for studies of the three-dimensional distribution of non-spherical aerosol particles in the atmosphere (e.g. mineral dust and volcanic ash) and, since CATS is an active sensor, over regions of high reflectivity (e.g. deserts, ice). Future studies exploiting the unique CATS observations may help the scientific community to shed new light on the physical processes of aerosols and clouds in the Earth's atmosphere.

Competing interests.
The authors declare that they have no conflict of interest.

The authors would like to thank the reviewer for the interesting and substantial comments and suggestions. We did our best to incorporate the proposed changes and corrections in the revised manuscript, aiming at improving the presented paper. Our responses to the comments follow, one by one.
Kind regards,
Emmanouil Proestakis

Specific Comments:
1. Abstract: Page 2, line 1: "Independently of daytime/nighttime conditions.". Please consider revising this statement. At the end of this paragraph the authors mention an underestimation of 22.3% during daytime and 6.1% during nighttime. So there is a significant difference in the comparison based on the sky light conditions, something that has to be mentioned clearly in the abstract. To what can you attribute this difference? E.g. SNR issue, significance of your day-night statistical sample?
The authors agree with the statement of the reviewer. Therefore, the sentence was modified from: "In addition, CATS misclassification of aerosol layers as clouds, and vice versa, in cases of coexistent and/or adjacent aerosol and cloud features, may lead to non-representative, unrealistic and cloud contaminated aerosol profiles."
Regarding the reviewer's question about the origin of this difference, the authors consider the effect of SNR the most critical factor, because measurement noise from the solar illumination background, and hence layer detection, differ between daytime and nighttime, with the effect propagating through the retrieval algorithms to atmospheric layer detection and classification and eventually to Level 2 and Level 3 products. An example of how critical the SNR effect is, is the Minimum Detectable Backscatter (MDB), as reported by McGill et al. (2007), for both CALIOP and CATS and for both daytime and nighttime conditions (Table 1). According to Table 1, the detection sensitivity to thin, weakly scattering atmospheric layers for CATS M7.2 at 1064 nm is two orders of magnitude higher during nighttime than during daytime (MDB two orders of magnitude lower during nighttime than during daytime). In the case of CALIOP, for both 532 and 1064 nm, the MDB is an order of magnitude lower during nighttime than during daytime.

Table 1.
Channel          CATS                     CALIOP
532 nm night     1.6×10⁻² ± 0.84×10⁻³     1.6×10⁻⁴ ± 0.84×10⁻⁴
1064 nm night    5.0×10⁻⁵ ± 0.77×10⁻⁵     1.6×10⁻⁴ ± 0.84×10⁻⁴
532 nm day       3.8×10⁻² ± 1.05×10⁻³     1.7×10⁻³ ± 0.84×10⁻³
1064 nm day      1.3×10⁻³ ± 0.24×10⁻³     1.0×10⁻³ ± 0.30×10⁻³

Introduction: The Introduction is well written; however, I am missing the scientific question that this manuscript envisages to answer. Please try to make this clear in this section and consider mentioning the achievements and progress of the scientific community so far on this topic. Are there any similar activities for CATS? Do the results presented here differ greatly from similar studies for other spaceborne lidars? The reader has to reach Section 2.1 in order to find some answers to the aforementioned concerns.
The authors agree with the reviewer that the manuscript was characterized by a significant lack of references to similar achievements and activities towards the assessment of CATS performance. The authors agree with the reviewer regarding the necessity of including the findings of the aforementioned studies and have adjusted the manuscript accordingly. To be more specific, the following paragraphs were added to the manuscript (Section 1 - Introduction):
"CATS performance has been validated against ground-based AErosol RObotic NETwork (AERONET; Holben et al., 1998) measurements and evaluated against satellite-based Aerosol Optical Depth (AOD) retrievals of Aqua and Terra Moderate Imaging Spectroradiometer (MODIS; Levy et al., 2013) and active CPL (McGill et al., 2002) and CALIPSO CALIOP (Winker et al., 2009) ... and were found consistent and in good agreement with depolarization measurements from previous studies and historical datasets implementing CPL (Yorks et al., 2011) and CALIOP (Liu et al., 2015)."
Regarding the question the manuscript envisages to answer, the authors have modified/included the following paragraphs in the manuscript (Section 1 - Introduction):

Section 2.3.1: I think it would be beneficial for the manuscript to include a flowchart showing the methodology of the comparison followed by the authors. The entire process can be summarized there along with the methodology requirements followed by the authors, e.g. the spatial-temporal constraints, cloud screening requirements, etc. The information exists in the manuscript but I feel like it is scattered among the sections.
The authors agree with the reviewer that it would be beneficial to summarize the key parameters and the associated thresholds implemented in the framework of the study. For this reason the following table was included in the manuscript:
... for the homogenization of the lidar data in a standardized output format. SCC facilitates an automatic algorithm developed to further address the quality assurance of the lidar measurements. The EARLINET implementation is described in "Section 3.2.3".
4. Page 7, lines 18-19: "... is less than 30%, ... requirements of EARLINET". The authors are kindly requested to provide a reference for this statement.
The text is modified according to the reviewer's recommendation, and the following references were included: "The comparison showed that by using only the signal from the elastic channels, the mean relative deviation in the calculation of the aerosol backscatter coefficient at 1064 nm is less than 30 % (Althausen et al., 2009; Baars et al., 2012; Engelmann et al., 2016; Hänel et al., 2012)

Section 2.3.3: This section is important for following up the manuscript and has to be highlighted. Therefore, I would kindly suggest to the authors to list it as 2.4.
The text is modified according to the reviewer's recommendation.

7. Page 9, line 6: "The discussed constraints:::": How much do these constraints affect the final dataset (in terms of number of measurements and overall evaluation)?
Regarding the reviewer's question on the effect of the discussed constraints on the dataset, Figures 1-4 show quantitatively the effects of (i) distance between the EARLINET station and the closest profile of the CATS-ISS overpass for each correlative case, (ii) CATS Feature Type, (iii) number of CATS Level 2 (L2) Aerosol Profiles (APro) used in the CATS horizontal average, and (iv) topography of EARLINET stations. The comparison exercise examines the effect of one constraint at a time, while keeping all the other parameters in the methodology constant, and considers various evaluation metrics, as discussed in the following sections.

(i) Effect of distance between the EARLINET station and the closest profile of the CATS-ISS overpass

Figure 1 shows the effect of distance between the closest CATS L2 APro and the respective EARLINET station matchup, for different upper Euclidean distance thresholds (i.e. 5n km, with n = 1, ..., 10). To be more specific, the Mean Bias (MB; [Mm⁻¹ sr⁻¹]) (Fig. 1a), Root Mean Square Error (RMSE; [Mm⁻¹ sr⁻¹]) (Fig. 1b), Correlation Coefficient (Fig. 1c), and the number of CATS-EARLINET correlative cases for each upper distance threshold are considered. For each upper distance threshold, all the available CATS-EARLINET cases of Euclidean distance lower than or equal to the respective upper limit are considered in the computation of the aforementioned evaluation metrics. This cumulative approach is selected due to the limited number of CATS-EARLINET correlative cases, and is applied separately for daytime and nighttime ISS overpasses, due to the different CATS measurement conditions.

Based on the analysis, during nighttime (daytime), the CATS-EARLINET MB is increasing (decreasing) starting from the 5 km upper distance threshold, to reach -0.0300 (-0.123) Mm⁻¹ sr⁻¹ for the radius threshold of 50 km shown in the study. The computed RMSE values range from 0.447 to 0.343 Mm⁻¹ sr⁻¹ for nighttime and from 0.357 to 0.448 Mm⁻¹ sr⁻¹ for daytime, for the distance thresholds of 5 km and 50 km respectively. The minimum RMSE values are observed when considering ISS overpass cases closer than 40 km to the EARLINET stations during nighttime, corresponding to an MB of 0.018 Mm⁻¹ sr⁻¹. The Correlation Coefficient decreases with increasing distance between the ISS overpass and the EARLINET stations. Notably, the Correlation Coefficient does not change considerably for thresholds between 15 and 40 km for nighttime (~0.8) and between 15 and 30 km for daytime (~0.7). Sharp decreases in the Correlation Coefficient are observed at shorter distances during daytime (0.547 at 35 km) than during nighttime (0.693 at 40 km).

The observed tendencies can be explained in terms of the distance thresholds and number of available cases, since the distance thresholds define the number of cases that are used in the analysis and the number of cases is critical to assess the performance of CATS. Consequently, the MB, RMSE and Correlation Coefficient are all subject to both the number and the characteristics of the CATS-EARLINET cases used. In the study the authors use the maximum number of available EARLINET cases, to avoid any possible selection effect resulting from a poor sample of correlative cases when strict collocation filters are applied. Using the maximum number of available correlative cases, i.e. twenty-six (26) and twenty-one (21) for nighttime and daytime respectively, for ISS overpasses within a 50 km radius from the EARLINET stations, the authors aim to quantitatively address the question of CATS performance and the representativeness of the aerosol backscatter coefficient profiles, over various atmospheric, illumination and ISS overpass conditions.
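The cumulative distance-threshold evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `cumulative_metrics`, the input layout (one paired CATS and EARLINET backscatter value per sample, each tagged with the distance of its correlative case) and the default thresholds of 5 to 50 km in 5 km steps are hypothetical choices made for the example.

```python
import numpy as np

def cumulative_metrics(distance_km, cats, earlinet, thresholds_km=range(5, 55, 5)):
    """Return {threshold: (n, MB, RMSE, r)} using all pairs with distance <= threshold.

    Hypothetical helper: inputs are 1-D sequences of equal length holding the
    case distance and the paired CATS / EARLINET backscatter values.
    """
    distance_km = np.asarray(distance_km, dtype=float)
    cats = np.asarray(cats, dtype=float)
    earlinet = np.asarray(earlinet, dtype=float)
    results = {}
    for limit in thresholds_km:
        sel = distance_km <= limit              # cumulative selection
        n = int(sel.sum())
        if n < 2:                               # metrics undefined for fewer than 2 pairs
            results[limit] = (n, np.nan, np.nan, np.nan)
            continue
        residuals = cats[sel] - earlinet[sel]   # CATS minus EARLINET
        mb = residuals.mean()                   # Mean Bias
        rmse = np.sqrt((residuals ** 2).mean()) # Root Mean Square Error
        r = np.corrcoef(cats[sel], earlinet[sel])[0, 1]  # Pearson correlation
        results[limit] = (n, mb, rmse, r)
    return results
```

Applying the function separately to the daytime and nighttime subsets would reproduce the structure of the comparison (one curve per illumination condition), with the returned case count corresponding to the dashed lines of Figure 1.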

Figure 1: CATS backscatter coefficient at 1064 nm with respect to EARLINET ground-based measurements, as a function of distance (km) between the closest CATS Level 2 Aerosol Profile and the respective "collocated" EARLINET station, for daytime (red line) and nighttime (blue line) ISS overpasses. Left: Mean Bias [Mm⁻¹ sr⁻¹], center: RMSE [Mm⁻¹ sr⁻¹] and right: Correlation Coefficient. Dashed lines correspond to the number of CATS-EARLINET correlative cases considered for each upper distance threshold between the CATS footprint and the locations of EARLINET stations.
(ii) Effect of Feature Type Score

The main objective of the CATS Cloud Aerosol Discrimination (CAD) score, or Feature Type Score, is to assign a level of confidence to the Feature Type classification. In the case of CATS, the Feature Type Score is an integer ranging between -10 and 10. Negative values correspond to atmospheric layers classified as aerosol and positive values to layers classified as cloud, while the magnitude of the score corresponds to the confidence level of the classification. A value of -10 indicates complete confidence that the layer is an aerosol layer, while a Feature Type Score equal to 0 indicates an atmospheric layer with equal probability of being cloud or aerosol. Figure 2 shows the effect of the Feature Type Score, for different values between -8 and 0 (i.e. for atmospheric layers classified as aerosol layers). The Mean Bias (MB; [Mm⁻¹ sr⁻¹]) (Fig. 2a), Root Mean Square Error (RMSE; [Mm⁻¹ sr⁻¹]) (Fig. 2b) and Correlation Coefficient (Fig. 2c) are shown for each Feature Type Score. For each Feature Type Score, cases of lower classification confidence level are not considered in the assessment of CATS performance and representativity, indicating the effect of the selected Feature Type thresholds.
Based on the MB, RMSE and Correlation Coefficient, a similar tendency is observed for different Feature Type Scores. To be more specific, no considerable changes are observed, regardless of the selected Feature Type threshold. This effect is due to the atmospheric characteristics of the CATS-EARLINET cases considered in the analysis. In the framework of the study, to account for contamination effects of multiple scattering and specular reflection in the intercomparison process, only cloud-free atmospheric scenes are used. Furthermore, cases with detected cirrus, either in the EARLINET range-corrected-signal quicklooks or in the ISS-CATS backscatter coefficient profiles or the feature type profiles, are not considered in the study. Initially, the presence of clouds was investigated using CATS backscatter coefficient and depolarization time-height images and the EARLINET range-corrected signal. Cases for which the retrieval of the EARLINET temporally averaged profile was not feasible due to the presence of clouds, and/or CATS cases in which the presence of clouds propagated into the CATS spatially averaged profile, were discarded from the analysis. Consequently, the lack of dependence shown in Figure 2 (a-c) results from the a priori selection of cloud-free conditions in the analysis. However, a notable characteristic is the nighttime performance of CATS, which, as shown by the lower absolute MB, the lower RMSE and the higher Correlation Coefficient values, is more representative than the corresponding daytime performance, due to the higher SNR.
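The Feature Type Score screening discussed above can be illustrated with a short sketch. The function name `keep_confident_aerosol`, the convention that a bin is kept when its score is lower than or equal to the threshold, and the sample values are assumptions made for this example; they are not taken from the CATS processing software.

```python
import numpy as np

def keep_confident_aerosol(beta, feature_type_score, threshold):
    """Mask backscatter bins whose Feature Type Score is above `threshold`.

    Scores run from -10 (confident aerosol) to +10 (confident cloud), so a bin
    is kept only when score <= threshold (assumed convention, threshold <= 0).
    """
    beta = np.asarray(beta, dtype=float)
    score = np.asarray(feature_type_score)
    return np.where(score <= threshold, beta, np.nan)

# Hypothetical layer values, for illustration only.
beta = [0.8, 1.2, 0.5, 2.0]
score = [-9, -4, 0, 7]
for threshold in range(-8, 1, 2):  # thresholds -8, -6, -4, -2, 0
    kept = keep_confident_aerosol(beta, score, threshold)
    print(threshold, int(np.isfinite(kept).sum()))
```

Recomputing the evaluation metrics on the bins kept for each threshold gives one point per threshold, as in Figure 2. For scenes that are already cloud-free, stricter thresholds remove few bins, which is consistent with the weak dependence reported above.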

Figure 2: CATS backscatter coefficient at 1064 nm with respect to EARLINET ground-based measurements, as a function of Feature Type score, for daytime (red line) and nighttime (blue line) ISS overpasses. Left: Mean Bias [Mm⁻¹ sr⁻¹], center: RMSE [Mm⁻¹ sr⁻¹] and right: Correlation Coefficient.

(iii) Effect of number of CATS-ISS L2 aerosol profiles used in the spatial averaging
Similarly to the analysis presented and discussed above, Figure 3 shows the effect of the number of aerosol profiles used in the spatial averaging to retrieve the CATS aerosol profiles used in the framework of the study. In Figure 3, the acronym "CPro" corresponds to the closest CATS profile to the corresponding EARLINET station. Accordingly, the Mean Bias (MB; [Mm⁻¹ sr⁻¹]) (Fig. 3a), Root Mean Square Error (RMSE; [Mm⁻¹ sr⁻¹]) (Fig. 3b) and Correlation Coefficient (Fig. 3c) are computed for different numbers of profiles used (i.e. CPro±1 profile, CPro±2 profiles, …).

Based on the MB, RMSE and Correlation Coefficient, the representativeness of the CATS spatial profile increases with increasing number of aerosol profiles used in the horizontal averaging. To be more specific, the nighttime MB is almost constant, showing a low dependence on the number of profiles used, while for daytime CATS cases the opposite effect is observed, with improvement of CATS performance through increasing number of profiles used. Regarding RMSE, no significant changes are observed, though a slight decreasing tendency is observed for both daytime and nighttime cases. The Correlation Coefficient also increases with increasing number of profiles used, for both daytime and nighttime cases, denoting the improvement of the representativeness with increasing number of CATS profiles used in the spatial averaging.

Table 2. In addition, Figure 4 shows the locations of the participating stations; green circles denote Continental stations, blue circles denote Coastal stations and brown circles denote Mountainous stations. In addition to the geographical distribution of the active EARLINET stations, Figure 4 shows the daytime/nighttime overpasses of the ISS within the evaluation period, between 02/2015 and 09/2016, encompassing the first twenty months of CATS operation.
Due to the limited available dataset of CATS-EARLINET cases, the daytime/nighttime approach was not followed in the analysis regarding the effect of topography.

Figure 5 shows the effect of topography, for three different clusters of station characteristics, as introduced above (Case I: Continental, Case II: Coastal and Case III: Mountainous). In Figure 5a, the box-and-whisker plot of the CATS_i-EARLINET_i residuals is shown, with the lower and upper whiskers indicating the 10th and 90th percentiles respectively, and the lower and upper box boundaries indicating the 25th and 75th percentiles respectively. The horizontal line and the red dot indicate the statistical mean and median values respectively, while outliers are indicated by red crosses.

According to the results, the correlative measurements between the Mountainous EARLINET stations and the ISS overpasses are characterized by higher variability, more extreme differences, higher absolute mean and median biases and higher RMSE than in the Continental and Coastal cases. Complex topography, erroneous mean backscatter coefficient profiles due to the high variability of aerosol load in the Planetary Boundary Layer, the horizontal distance between the CATS lidar footprint and the ground-based lidar stations, and surface returns enhance the discrepancies, especially in the lowermost part of the profiles, resulting in higher differences between the EARLINET and CATS profiles. In the absence of the aforementioned effects arising from complex topography, CATS representativeness and performance is higher over the Continental cases, while CATS performance over the Coastal stations is characterized by a slightly lower absolute value of mean bias and at the same time by a lower Correlation Coefficient than in the Continental cases.
However, an important factor to be taken into consideration regarding the presented results is the number of CATS-EARLINET correlative cases used in the analysis: 23 for Case I - Continental, 10 for Case II - Coastal and 14 for Case III - Mountainous. Analytical evaluation metrics on the effect of topography are given in Table 3. Fig. 5b and Fig. 5c show the RMSE and Correlation Coefficient as a function of the different clusters, including the number of available cases per cluster.

8. Page 9, line 18: "here considered" -> "considered here".

The text is modified according to the reviewer's recommendation.

9. Page 9, lines 32-33: I cannot understand this conclusive statement. How is "the absence of significant biases, both daytime and nighttime" obvious from figure 3c?
The reviewer is right that Figure 3c corresponds to a nighttime atmospheric scene; therefore the statement, referring not only to nighttime but also to daytime conclusions, may be confusing for the reader. The authors have inspected all available cases one by one, and wanted to convey through this section that, when the atmospheric scene is homogeneous and the scattering characteristics of the aerosol layers are above the MDB thresholds of the CATS sensor (i.e. sufficient SNR for detection and classification), the overall CATS performance is good, with absence of significant biases. This conclusion holds for both daytime and nighttime. For this reason the term "representative case" was used. However, since the authors agree with the reviewer that the sentence may be confusing, the sentence was reformulated from:

"The intercomparison presented in Figure 3c is a representative case, indicating the overall high performance of CATS and the absence of significant biases, during both daytime and nighttime, under relative homogeneous and cloud free conditions."

to:

"Although the case presented and discussed in Figure 3 corresponds to a nighttime ISS overpass, the case is representative for cloud free and relative homogeneous atmospheric scenes in terms of aerosols, for both daytime and nighttime solar background illumination, demonstrating the overall high performance of CATS under such conditions."

10. Page 10, lines 9-10: "due to the different SNR:::": I think that indeed this is the case. But this contradicts the authors' statement of no significant bias between day and night conditions stated earlier (page 9, lines 32-33).
The reviewer is right on the high importance and effect of SNR in CATS retrievals and algorithms. The statement of page 9, lines 32-33 has been reformulated to avoid possible confusion, according to the reviewer's comment.

Page 10, lines 24-29: I have the feeling that this information should be moved to section 2.2 where the description of the CATS data level product is already given. In that section, the authors can present a detailed description of the methodology followed for cloud screening.
According to the reviewer's recommendation, the suggested part of the manuscript was moved (and slightly modified to fit better in the paragraph) to Section 2.1 (former Section 2.2 in the ACPD discussion version).
To be more specific, the suggested part was modified from:

"In addition to the backscatter coefficient, CATS Level 2 data provide the feature classification of the detected layers (namely: clear air, cloud, aerosol and totally attenuated) and the numerical confidence level of the classification, similar to the CALIOP Cloud-Aerosol-Discrimination (CAD) algorithm (Liu et al., 2004; Liu et al., 2009). CATS Feature Type Score is a multidimensional probability density function (PDF) developed based on multiyear CPL observations, that discriminates cloud and aerosol features, assigning an integer between -10 and 10 for each detected atmospheric layer."

to:

"In addition to CATS Level 2 Feature Type (namely: clear air, cloud, aerosol and totally attenuated), the algorithm provides the confidence level of the Feature Type classification, similar to the CALIOP Cloud-Aerosol-Discrimination (CAD) algorithm (Liu et al., 2004; Liu et al., 2009). CATS Feature Type Score is a multidimensional probability density function (PDF) developed based on multiyear CPL observations, that discriminates cloud and aerosol features, assigning an integer between -10 and 10 for each detected atmospheric layer."

12. Page 11, line 23: "end of 2018:::" -> Maybe "end of 2019"?

The manuscript was modified to: "Based on this analysis and comparisons with CALIPSO, the CATS cloud-aerosol discrimination algorithm was updated for the V3-00 Level 2 data products (released in the end of 2018) to improve the accuracy of the Feature Type and Feature Type Score, especially during daytime."

13. Section 3.2: I wonder why the authors constrained their study only to the comparison of aerosol backscatter and did not proceed with comparison of other aerosol-related properties as well (e.g. physical and not properties such as integrated backscatter, AOD, lidar ratio, layer center of mass-thickness).
I have the feeling that taking into account more properties in their comparison will improve the manuscript and will enhance the arguments (i.e. argument of tenuous layer, argument of lidar ratio assumption) for the discrepancies shown here.

In

CATS products and processing algorithms are provided in different levels of processing. CATS Level 1B (L1B) data include vertical profiles of total and perpendicular attenuated backscatter signals, range-corrected, calibrated and annotated with ancillary meteorological parameters (McGill et al., 2007; Powell et al., 2009; Vaughan et al., 2010). CATS Level 2 (L2) products provide the vertical distribution of aerosol and cloud properties (depolarization ratio, backscatter and extinction coefficient profiles at 1064 nm - FFOV), with a horizontal and vertical resolution of 5 km and 60 m respectively. In addition, L2 data include geophysical parameters of the identified atmospheric layers (vertical feature mask - feature type, aerosol subtype), the required horizontal averaging and information on the feature type classification confidence.

Regarding CATS L1B, the validation is a study led by the NASA GSFC Team, and more specifically by Dr. Rebecca Pauly (Science Systems and Applications Inc., Lanham, 20706, United States), member of the CATS Team. The study has already been submitted to the AMT journal: "Pauly, R. M., Yorks, J. E., Hlavka, D. L., McGill, M. J., Amiridis, V., Palm, S. P., Rodier, S. D., Vaughan, M. A., Selmer, P. A., Kupchock, A. W., Baars, H., and Gialitaki, A.: Cloud Aerosol Transport System (CATS) 1064 nm Calibration and Validation, Atmos. Meas. Tech. Discuss., https://doi.org/10.5194/amt-2019-172, in review, 2019".

In this study, the EARLINET authors in collaboration with the CATS Team evaluate CATS Level 2 Mode 7.2 v2.01 backscatter profiles at 1064 nm. The reason for focusing on the evaluation of the backscatter coefficient is the operating wavelength of CATS, i.e. 1064 nm. Since EARLINET lidar systems do not provide depolarization ratio measurements at 1064 nm, the particulate depolarization ratio could not be evaluated or included in the analysis.
In addition, since CATS is a satellite-based elastic backscatter lidar (McGill et al., 2015), in order to provide vertically resolved extinction coefficient profiles (km⁻¹) of aerosols and clouds in the Earth's atmosphere, the computation algorithm implements a number of intermediate parameters (i.e. lidar ratio, feature type classification, aerosol subtype classification, among others). Because the profiles of extinction coefficient are a computed product and not a direct measurement, extinction coefficient profiles were also not included in the analysis. The authors have focused on particulate backscatter coefficients (km⁻¹ sr⁻¹), since this is the product directly derived from measurements, as the sum of the parallel and perpendicular backscatter measurements (i.e. β_total(1064 nm) = β_parallel(1064 nm) + β_perpendicular(1064 nm)). Future study will include high collocated analysis on the CATS performance and representativeness, including the issues mentioned by the reviewer,

Regarding the comment of the reviewer on explicitly addressing the differences between CATS and EARLINET for daytime and nighttime conditions per station, along with the mean value, to explain some of the discrepancies, it has to be noted that the sample of collocated profiles at many stations does not permit an analysis with strong "per-station" conclusions. For instance, the Barcelona (ba), Athens_NTUA (at) and Bucharest (bu) stations participate with only one available case of CATS-EARLINET collocated measurements. In addition, a number of stations happen to contribute with either only nighttime or only daytime correlative cases, i.e. Athens_NOA (no) and Lecce (le) with only nighttime cases (three and two cases respectively) and Evora (ev) with only daytime cases (two cases), which does not allow a per-station approach.
The value of EARLINET lies in the approach whereby the participating community treats EARLINET as a single entity, with the main objective to obtain an extended, coordinated, continental-scale network of sophisticated ground-based Raman lidars and eventually to foster a quantitative, comprehensive and statistically significant database of the distribution of aerosol on a continental scale (Bösenberg et al., 2003; Pappalardo et al., 2014). The quality assurance and improvement of the performance of the EARLINET systems is tested through the intercomparison of both the infrastructure (Wandinger et al., 2015) and the optical products (Böckmann et al., 2004; Pappalardo et al., 2004). In addition, the homogenization of the lidar data in a standardized output format is facilitated and an automatic algorithm is developed to further address the quality assurance of the lidar measurements (

In order to clarify and demonstrate the sample issue, which does not allow a per-station approach, the authors have included here (but also in the manuscript) the following "Table 4", where the cases used in the intercomparison are given.

Does the pair of observations "i" refer to the vertical height of each case study or to each case study individually? This is a general comment related to the comparison methodology followed by the authors: I speculate that the initial vertical resolution of the two profiles is

The authors agree with the reviewer that the respective aspect was not properly commented on. Regarding CATS L2 profiles, the product provides the vertical distribution of aerosol and cloud properties (depolarization ratio, backscatter and extinction coefficient profiles at 1064 nm - FFOV), with a horizontal and vertical resolution of 5 km and 60 m respectively. In contrast, EARLINET profiles were provided by the EARLINET community with higher vertical resolution. Towards the assessment of CATS performance, for the comparison of CATS against EARLINET, we implemented the CATS_i-EARLINET_i residuals for each pair of observations "i", as a statistical indicator of CATS average overestimation or underestimation of the aerosol load, in terms of backscatter coefficient values. Since the vertical resolution of the two profiles was not the same, and in order to compute the CATS_i-EARLINET_i residuals, the EARLINET profiles were reduced in resolution to obtain one-to-one datasets characterized by the same vertical resolution. This was achieved by computing the EARLINET mean backscatter coefficient value from all EARLINET bins within each CATS 60 m backscatter coefficient range. Thus, the speculation of the reviewer on the methodology, i.e. computing mean values in specific vertical height windows, is indeed right.

The aforementioned approach led to loss of vertical resolution in the EARLINET profiles (Iarlori et al., 2015). For this reason, the authors (in the initial steps of the study) performed an exercise to investigate the magnitude of the effect of the selected approach and the significance of the loss of resolution in the EARLINET profiles, since the opposite approach (i.e. to increase the resolution of CATS profiles to match the EARLINET resolution) was not feasible.

Figure 6 shows an example of the exercise, corresponding to a nighttime ISS orbit on September 30, 2015 (blue line), at a minimum distance of 12.9 km from the EARLINET Leipzig - Germany PollyXT lidar system (indicated by a white dot), at 22:21 UTC (Fig. 3a). The CATS particulate backscatter coefficient cross section at 1064 nm (Fig. 6 - right) shows the presence of aerosols up to 2.2 km (a.s.l.). CATS spatially averaged and Leipzig temporally averaged profiles were derived from CATS profiles within a horizontal distance of less than 50 km between the Leipzig station and the ISS footprint.
Figure 7 shows the direct comparison between the backscatter coefficient profiles measured by the EARLINET Leipzig station (red line) and CATS (blue line), along with their standard deviations (horizontal error bars). The profiles indicate the presence of aerosol up to 2.6 km height (a.s.l.). The intercompared profiles of ISS-CATS and the EARLINET Leipzig station are characterized by adequate agreement, although significant discrepancies are also present, especially in the lowermost part of the profiles, as discussed in the manuscript.

The intercomparison presented in Figure 7 is shown to provide the reviewer with a quantitative response to the specific comment. Figure 7 shows the CATS averaged backscatter coefficient profile in blue, while for EARLINET both the initial (high resolution) and final (reduced in resolution to match the CATS profile resolution) profiles are provided, in black and red respectively. The loss of resolution in the EARLINET profiles required to achieve the vertical match between the two datasets is very low, with the final EARLINET profile following with high accuracy the characteristics and tendencies, both qualitative and quantitative, of the initial EARLINET profile.
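The resolution-matching step described above, i.e. averaging all EARLINET bins that fall within each 60 m CATS range bin, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the processing code of the study: the function name `average_to_cats_grid`, the use of explicit CATS bin edges and the synthetic altitude grid are hypothetical.

```python
import numpy as np

def average_to_cats_grid(earlinet_alt_m, earlinet_beta, cats_bin_edges_m):
    """Average high-resolution EARLINET backscatter onto coarser CATS bins.

    Each output value is the mean of all finite EARLINET values whose altitude
    lies in [edge_k, edge_k+1); bins without EARLINET data are set to NaN.
    """
    alt = np.asarray(earlinet_alt_m, dtype=float)
    beta = np.asarray(earlinet_beta, dtype=float)
    edges = np.asarray(cats_bin_edges_m, dtype=float)
    out = np.full(edges.size - 1, np.nan)
    idx = np.digitize(alt, edges) - 1  # CATS bin index of every EARLINET bin
    for k in range(out.size):
        in_bin = (idx == k) & np.isfinite(beta)
        if in_bin.any():
            out[k] = beta[in_bin].mean()
    return out

# Synthetic example: 15 m EARLINET bins averaged onto three 60 m CATS bins.
alt = np.arange(0.0, 180.0, 15.0)
beta = alt.copy()                      # placeholder values, not real data
edges = np.array([0.0, 60.0, 120.0, 180.0])
earlinet_on_cats = average_to_cats_grid(alt, beta, edges)
```

Once both profiles share the same grid, the CATS_i-EARLINET_i residuals follow from a bin-by-bin subtraction of `earlinet_on_cats` from the CATS profile.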

Figure 7: CATS and EARLINET-Leipzig backscatter coefficient profiles (1064 nm) for the nighttime ISS orbit over the EARLINET Leipzig station on the 30th of September 2015. The CATS backscatter coefficient profile at 1064 nm is shown in blue. EARLINET-Leipzig initial and final profiles are shown in black and red respectively.
However, the authors agree with the reviewer that the vertical matching between the two datasets was not properly addressed. For this reason, the following part was added to "Section 2.3.2 - Particle backscatter coefficient retrievals from ground based lidars at 1064 nm":

"Finally, in order to perform the intercomparison between CATS and EARLINET profiles, the high resolution of EARLINET profiles was lowered to match the vertical resolution of CATS profiles (i.e. 60 m). The objective of obtaining profiles of similar vertical resolution was addressed through computing the EARLINET mean backscatter coefficient value from all EARLINET bins within each CATS 60 m backscatter coefficient height range. The computed EARLINET profiles of similar vertical resolution with CATS followed with high accuracy the characteristics and tendencies, both qualitative and quantitative, of the initial EARLINET profiles, despite the loss of vertical resolution (Iarlori et al., 2015)."

15. Page 13, line 30: "CALIOP" -> Maybe "CATS" instead of CALIOP?
CATS calibration is performed by normalizing the NRB signal in the altitude regime between 23 and 27 km. Although the region is used to normalize the NRB signal to the molecular backscatter, the region between 23 and 27 km is not aerosol free. According to the ATBD, the scattering ratios (i.e. total backscatter to molecular backscatter) at 532 nm are estimated based on CALIPSO CALIOP V4 L1 data. The 532 nm scattering ratios are used to estimate the 1064 nm scattering ratios and, accordingly, for the calibration of CATS. Consequently, a source of systematic errors in the CATS calibration is related to errors in the stratospheric scattering ratios provided by CALIPSO (

The authors agree with the reviewer. Although analyses of the CATS extinction coefficient and AOD at 1064 nm were not included, the authors, in order to provide a more detailed overview of CATS capabilities and representativeness, have included a literature review of studies investigating the performance of CATS. To be more specific, the following paragraph was added to the manuscript (Section 1 - Introduction), in line with the comment of the reviewer and in order to justify the statement mentioned by the reviewer:

"CATS performance has been validated against ground-based AErosol RObotic NETwork (AERONET; Holben et al., 1998) measurements and evaluated against satellite-based Aerosol Optical Depth (AOD) retrievals of Aqua and Terra Moderate Resolution Imaging Spectroradiometer (MODIS; Levy et al., 2013) and active CPL (McGill et al., 2002) and CALIPSO CALIOP (Winker et al., 2009)   , and were found consistent and in good agreement with depolarization measurements from previous studies and historical datasets implementing CPL (Yorks et al., 2011) and CALIOP (Liu et al., 2015).

The text is modified according to the reviewer's recommendation.

Page
18. Figure 7: For the nighttime mean profiles the discrepancies are negligible, but for the daytime, and specifically for the height region from 1-2 km, large differences are observed. What is the main reason behind this? The significant influence of the topography? In that case, why is this difference not shown also in the nighttime profiles, considering this as a bias from one or more stations? The low daytime CATS SNR? In that case I would expect to see higher discrepancies than shown inside the PBL (longer atmospheric path), compared to 1-2 km. The calibration region of CATS? In any case, I think that a solid and quantitative explanation on this is missing.

The effect of signal-to-noise ratio (SNR) and the associated Minimum Detection Backscatter (MDB) are the critical factors determining the performance of CATS. However, along with the technical capabilities of CATS, there are different factors affecting the final CATS profiles (e.g. topography, as mentioned by the reviewer). Regarding the quantitative and qualitative explanation, exercises under different cases are presented and discussed in the response to the reviewer's question #7.

The authors would like to thank the reviewer for the interesting and at the same time substantial comments and suggestions. We tried, and did our best, to incorporate the most suitable proposed changes and corrections in the revised manuscript, aiming at improving the presented paper. Following, you will find our responses, one by one to the comments addressed, in the uploaded supplement pdf file.

Kind regards,
Emmanouil Proestakis

General comments: This manuscript compares EARLINET (ground-based) and CATS (onboard the International Space Station) retrievals of the aerosol backscatter coefficient over 12 European sites and 1 Asian site. The paper is well written; however, I did miss some explanation in the introduction about the importance of the CATS product. I believe this could be easily achieved by modifying the order of some paragraphs and including extra information. In particular, I suggest moving the second paragraph of Section 2.2 (page 4, line 30 to page 5, line 12) to the introduction, with the due adjustments.

The authors agree with the reviewer. The science goals of CATS were indeed not mentioned in the introduction, making it difficult to understand the scientific importance of the project in the early stages of the manuscript. For this reason the authors have followed the referee's recommendation to rearrange the manuscript, making at the same time all the appropriate modifications to ensure that the adjustments did not have a negative impact on the understanding of the manuscript context. To be more specific, the following section was added to the introduction:

"CATS was developed to meet three main science goals. The primary objective was to measure and characterize aerosols and clouds on a global scale. The space-borne lidar orbited the Earth at an altitude of approximately 405 km and 51-degree inclination. The use of the ISS as an observation platform facilitated for the first time global lidar-based climatic studies of aerosols and clouds at various local times (Noel et al., 2018; Lee et al., 2018). In addition, near-real-time data acquisition of the CATS observations was developed towards the improvement of aerosol forecast models (Hughes et al., 2016). A secondary objective was related to the need of long-term and continuous satellite-based lidar observations to be available for climatic studies. The first spaceborne lidar mission, the Lidar In-space Technology Experiment (LITE; McCormick et al., 1993) in 1994, was succeeded by the joint NASA and Centre National d'Études Spatiales (CNES) Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) mission in June 2006 (Winker et al., 2007). Since 2009 the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) instrument (Winker et al., 2009)  intercomparison studies. However, the authors agree with the reviewer regarding the necessity of including the findings of the aforementioned studies and have adjusted the manuscript accordingly. To be more specific, the following paragraph was added to the manuscript (Section 1 - Introduction):

"CATS performance has been validated against ground-based AErosol RObotic NETwork (AERONET; Holben et al., 1998) measurements and evaluated against satellite-based Aerosol Optical Depth (AOD) retrievals of Aqua and Terra Moderate Resolution Imaging Spectroradiometer (MODIS; Levy et al., 2013) and active CPL (McGill et al., 2002) and CALIPSO CALIOP (Winker et al., 2009)   , and were found consistent and in good agreement with depolarization measurements from previous studies and historical datasets implementing CPL (Yorks et al., 2011) and CALIOP (Liu et al., 2015).

The authors agree with the reviewer, and a final paragraph stating suggestions related to the use of the unique CATS dataset was included. To be more specific, the following section was added to the "Summary and Conclusions" section:

"The qualitative and quantitative agreement between CATS and EARLINET reported in this study is encouraging, especially during nighttime, agreement that will hopefully facilitate further studies implementing CATS observations in the future. CATS, for a period of almost three years, provided an unprecedented global dataset of vertical profiles of aerosols and clouds, much like CALIOP, taking though advantage of the unique orbital characteristics of the ISS.

ISS enabled CATS to provide for the first time satellite-based lidar measurements of the diurnal evolution of aerosols and clouds over the tropics and midlatitudes, and to be more specific at latitudes below 52°. Since CALIPSO and Aeolus (and in [...] atmospheric aerosols and clouds. In addition, while CALIOP is a two-wavelength lidar system operating at 532 nm and 1064 nm with depolarization capabilities at 532 nm, CATS provided satellite-based aerosol and cloud depolarization profiles at 1064 nm, thus at a different wavelength. This dataset, much like the CALIOP dataset, is especially useful for studies of the three-dimensional distribution of non-spherical aerosol particles in the atmosphere (e.g. mineral dust and volcanic ash) and, since it comes from an active sensor, especially over regions of high reflectivity (e.g. deserts, ice). Future studies exploiting the unique CATS observations may help the scientific community to shed new light on physical processes of aerosols and clouds in the Earth's atmosphere."

Specific comments:

page 2, line 3 - Please modify "Physic" to "Physics".
The text is modified according to the reviewer's recommendation.

The text is modified according to the reviewer's recommendation.
The text is modified according to the reviewer's recommendation.

page 3, line 15 - What is the difference between capacity and capability?
The text is reformulated according to the reviewer's recommendation:
"Since the beginning of the initiative in 2000, EARLINET has significantly increased its observing and operational capacity"
page 3, line 16 - Please reformulate or remove the sentence "EARLINET stations are classified as active on condition of...".
According to the reviewer's comment, the sentence was reformulated to:
"EARLINET stations are classified as "active", "not permanent", "joining" or "not active". An EARLINET station is classified as active on condition of regularly performing measurements simultaneously with the other stations composing the lidar network and, accordingly, of uploading the performed measurements to the EARLINET database (https://www.earlinet.org/, last access: 20 December 2018)."
page 4, line 32 - Please modify "space-borne" to "spaceborne".
The text is modified according to the reviewer's recommendation.
page 6, line 16 - It's not clear to me if observations more than 90 minutes apart were compared or not. Could you clarify this?
The study follows the CALIPSO CALIOP validation methodology developed in the framework of a collaboration between ESA and EARLINET (Pappalardo et al., 2010). The dedicated ESA program of collocated and concurrent EARLINET and CALIOP observations was developed prior to the launch of CALIPSO and is planned to last until the end of the mission. In contrast to the well-established CALIPSO-EARLINET validation activity, and to the activities for the ESA Aeolus and the upcoming ESA EarthCARE satellite missions, no similar CATS-EARLINET validation strategy was established. The EARLINET stations participating in the study contributed to the evaluation of CATS through measurements performed during the fixed-schedule program of EARLINET operation. As described in Pappalardo et al. (2014), the EARLINET scheduled program includes three measurements per week, one during daytime around local noon (Monday, 14:00 ± 1 h) and two during nighttime (Monday/Thursday, sunset + 2/3 h), to enable Raman extinction retrievals. In addition, EARLINET operates a small number of lidar systems capable of continuous 24/7 measurements (Engelmann et al., 2016). The absence of a dedicated validation activity between NASA and EARLINET established prior to the operation of CATS, in combination with the fixed measurement schedule of EARLINET, the highly variable overpass time of CATS (bounded by the orbital characteristics of the ISS) and the frequent cloud-contaminated cases, led to a low number of collocated and concurrent EARLINET-CATS cases being available for the study. Eventually, this obstacle was tackled through the cooperative effort of a large number of EARLINET stations, contributing with already performed measurements.
The increasing number of EARLINET stations showing interest in contributing to the study led to a total of forty-seven (47) cloud-free EARLINET-CATS collocated cases available for the evaluation of CATS. The EARLINET-CATS correlative study applies the collocation criteria established in the validation plan of CALIPSO. Regarding the spatial collocation, participating EARLINET stations contributed measurements when the ISS overpass was within a 50 km horizontal radius of their location.
Regarding the temporal collocation, the study used ground-based EARLINET measurements whose temporal window had a starting time or stop time as close in time as possible to the ISS overpass. Accordingly, all the identified EARLINET cases were studied through case-by-case inspection of the Range-Corrected-Signal quicklooks, primarily for atmospheric homogeneity, which was of high importance, and additionally for other constraints (e.g. cirrus clouds). During the first twenty months of CATS operation, based on thirteen contributing EARLINET stations, only 47 cases were found suitable for the comparison. Of these 47 cases, 44 were performed with starting time or stop time within 90 minutes of the ISS overpass. This is why the phrase "typically within 90 minutes of the ISS overpass" was used in the manuscript. In addition, it has to be mentioned that in the majority of the cases the EARLINET temporal window encompasses the ISS overpass. The length of the temporal window was variable, based on the expertise of the EARLINET teams, the homogeneity of the atmospheric scenes and the specific cloud constraints of each case, in order to allow retrievals of high-quality EARLINET backscatter coefficient profiles.
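The collocation criteria described above can be illustrated with a short sketch. It is not the procedure used by the EARLINET teams: the function names, the great-circle (haversine) distance on a spherical Earth and the treatment of the 90-minute window as a hard limit are assumptions made here for illustration (in the study, 3 of the 47 cases fall outside that window).

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical-Earth assumption


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_collocated(station, footprint, window_start, window_stop, overpass,
                  max_km=50.0, max_minutes=90.0):
    """Hypothetical check of the spatial and temporal collocation criteria.

    station, footprint: (lat, lon) tuples in degrees.
    The temporal criterion holds when the start or stop time of the
    ground-based window is within max_minutes of the overpass, or when
    the window encompasses the overpass.
    """
    dist = great_circle_km(station[0], station[1], footprint[0], footprint[1])
    limit = timedelta(minutes=max_minutes)
    close_in_time = (abs(window_start - overpass) <= limit
                     or abs(window_stop - overpass) <= limit
                     or window_start <= overpass <= window_stop)
    return dist <= max_km and close_in_time


# Illustrative values only; coordinates and times are hypothetical.
station = (51.35, 12.43)
footprint = (51.50, 12.60)
overpass = datetime(2015, 9, 13, 22, 10)
start, stop = datetime(2015, 9, 13, 21, 0), datetime(2015, 9, 13, 23, 0)
print(is_collocated(station, footprint, start, stop, overpass))
```

Keeping the distance and time limits as parameters makes it easy to repeat the selection with stricter thresholds.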
The authors agree with the reviewer, though, that this part of the manuscript was not clear. Therefore Section 2.3.1 ("Comparison methodology") was revised and, in addition, the manuscript was updated with the following table ("Table 2" in the manuscript), which includes information on the correlative cases used in the study. The table provides the "Day-Night Flag" of the study case, the "Date" and "Time" of the ISS overpass, the corresponding EARLINET station, the minimum distance between the ISS orbit track and the station location, and finally the EARLINET temporal window of measurements.
In Section 2.3.1, the following part of the manuscript was reformulated according to the reviewer's recommendation, [...]
The authors acknowledge that the sentence was not clearly written, thus the sentence was reformulated from:
"In addition, to account for contamination effects of multiple-scattering and specular reflection in the intercomparison process, only cloud-free (including cirrus clouds) atmospheric scenes are used."
to:
"In addition, to account for contamination effects of multiple-scattering and specular reflection in the intercomparison process, only cloud-free atmospheric scenes are used. Cases with detected cirrus either at the EARLINET Range-Corrected-Signal quicklooks or at the ISS-CATS backscatter coefficient profiles or the feature type profiles are not considered in the study."
page 7, line 19 - Please modify "participated" to "participating".
The text is modified according to the reviewer's recommendation.
page 7, line 20 - "exited". Did you mean "excited"?
The reviewer is correct; the text is modified according to the reviewer's comment.
page 9, line 14 - Please modify "in details" to "in detail".
The text is modified according to the reviewer's recommendation.
page 9, line 24 - Please modify "below" to "of".
The text is modified according to the reviewer's recommendation.
The text is modified according to the reviewer's recommendation.
[...] later on in the beginning of 2019. Since CATS products are provided in different levels of processing, the changes made in the algorithms concern both L1B and L2O products.
To be more specific, the changes in the L1B algorithms include: (1) improvement of the nighttime attenuated total backscatter (ATB) profiles due to improvements in the calibration of CATS, and thus improvement also of the daytime ATB profiles, since the nighttime ATB is used in the calculation of the daytime calibration.
The changes made in the CATS L1B algorithms carry over as improvements to the CATS L2O products. Additional changes in the CATS L2O algorithms include:
(1) updates in the number of profiles in the L2O datasets;
(2) improvements in the calculation of uncertainties in the L2O layer-integrated parameters;
(3) changes to the "Depolarization_Quality_Flag";
(4) improvements of the Cloud Aerosol Discrimination (CAD) through the implementation of an additional parameter, namely the "Cloud_350m_Fraction_XXX_FOV", which reports the number of 350 m L1B profiles within each 5 km L2O bin of the L2O layer product with attenuated total backscatter values greater than 0.03 km⁻¹ sr⁻¹, thus atmospheric features with a high probability of being a cloud. In addition, the parameter "Num_Profs_Avg_LRatio_XXX_FOV" was added to the L2O Layer data product;
(5) improvements in the CATS Feature Type and Feature Type Score variables, and in the Aerosol Subtype classification (replacement of "volcanic" with "UTLS Aerosol"), and addition of the parameters "Opaque_Feature_Optical_Depth_1064_XXX_FOV" and "Opaque_Feature_Optical_Depth_Uncertainty_1064_XXX_FOV" in Mode 7.2 L2O datasets;
(6) updates in the Lidar Ratio (LR) values for cirrus clouds;
(7) update of the effective multiple scattering factor for ice clouds to 0.52.
The above changes in the CATS V3-00 and V3-01 algorithms and the respective products are presented and discussed in depth on the official CATS website (last visit on: 22/05/2019), in the "Publications" section.
pages 11 and 12, Section 3.2 and Table 2: It would be interesting to show the mean relative bias (that is bias over mean value).
According to the referee's comment, we have computed the Mean Relative Bias (MRB) and included it in the table of comparison statistics between CATS and EARLINET, calculated as follows: [...] The MRB was found equal to -24.06% and -19.84% for daytime and nighttime CATS observations respectively, and the results were included in the table.
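The reviewer describes the mean relative bias as the bias over the mean value. One way to write this, as a sketch only, assumes that the EARLINET mean backscatter coefficient serves as the reference value; the symbols β_i (backscatter coefficient of the i-th matched sample) and N (number of matched samples) are introduced here for illustration and may differ from the notation of the manuscript:

```latex
\mathrm{MB} = \frac{1}{N}\sum_{i=1}^{N}\left(\beta_i^{\mathrm{CATS}} - \beta_i^{\mathrm{EARLINET}}\right),
\qquad
\mathrm{MRB} = 100\,\% \times
\frac{\mathrm{MB}}{\frac{1}{N}\sum_{i=1}^{N}\beta_i^{\mathrm{EARLINET}}}
```

Under this convention a negative MRB indicates that CATS underestimates the backscatter coefficient relative to EARLINET.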
page 14, line 10, Please modify "discrepancies" to "discrepancy".
The text is modified according to the reviewer's recommendation.
page 15, line 1, Please modify "based to" to "based on".
The text is modified according to the reviewer's recommendation.
Regarding the question of the reviewer on the discussed constraints on the dataset, Figures 1-4 show quantitatively the effects of (i) the distance between the EARLINET station and the closest profile of the CATS-ISS overpass for each correlative case, (ii) the CATS Feature Type, (iii) the number of CATS Level 2 (L2) Aerosol Profiles (APro) used in the CATS horizontal average, and (iv) the topography of the EARLINET stations. The comparison exercise examines the effect of one constraint at a time, while keeping all other parameters in the methodology constant, and considers various evaluation metrics, as discussed in the following sections.
(i) Effect of distance between the EARLINET station and the closest profile of the CATS-ISS overpass
Figure 1 shows the effect of the distance between the closest CATS L2 APro and the respective EARLINET station matchup, for different upper Euclidean distance thresholds (i.e. 5n km, n = 1, ..., 10). To be more specific, the Mean Bias (MB; [Mm⁻¹ sr⁻¹]) (Fig. 1a), Root Mean Square Error (RMSE; [Mm⁻¹ sr⁻¹]) (Fig. 1b), Correlation Coefficient (Fig. 1c), and the number of CATS-EARLINET correlative cases per upper distance threshold are considered. For each upper distance threshold, all the available CATS-EARLINET cases with Euclidean distance lower than or equal to the respective upper limit are included in the computation of the aforementioned evaluation metrics. This cumulative approach is selected due to the limited number of CATS-EARLINET correlative cases, and is applied separately for daytime and nighttime ISS overpasses, due to the different CATS measurement conditions. Based on the analysis, during nighttime (daytime) the CATS-EARLINET MB increases (decreases) starting from the 5 km upper distance threshold, reaching -0.0300 (-0.123) Mm⁻¹ sr⁻¹ at the 50 km radius threshold used in the study. The computed RMSE values range between 0.447 and 0.343 Mm⁻¹ sr⁻¹ for nighttime and between 0.357 and 0.448 Mm⁻¹ sr⁻¹ for daytime, for the distance thresholds of 5 km and 50 km respectively. The minimum RMSE values are observed during nighttime when considering ISS overpass cases closer than 40 km to the EARLINET stations, corresponding to an MB of 0.018 Mm⁻¹ sr⁻¹. The Correlation Coefficient decreases with increasing distance between the ISS overpass and the EARLINET stations. Notably, the Correlation Coefficient does not change considerably for thresholds between 15 and 40 km for nighttime (~0.8) and between 15 and 30 km for daytime (~0.7).
Sharp decreases in the Correlation Coefficient are observed at distances closer to the EARLINET stations during daytime (0.547, at 35 km) than during nighttime (0.693, at 40 km).
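The cumulative-threshold analysis described above can be sketched as follows. This is a minimal illustration, not the authors' code: the data layout (a list of cases, each with a distance and matched CATS and EARLINET values), the function names and the pooling of all matched samples before computing the metrics are assumptions made here.

```python
import math


def mean_bias(cats, ref):
    """Mean of CATS minus reference differences."""
    return sum(c - r for c, r in zip(cats, ref)) / len(ref)


def rmse(cats, ref):
    """Root mean square error of CATS against the reference."""
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(cats, ref)) / len(ref))


def correlation(cats, ref):
    """Pearson correlation coefficient (requires non-constant inputs)."""
    n = len(ref)
    mc, mr = sum(cats) / n, sum(ref) / n
    cov = sum((c - mc) * (r - mr) for c, r in zip(cats, ref))
    var_c = sum((c - mc) ** 2 for c in cats)
    var_r = sum((r - mr) ** 2 for r in ref)
    return cov / math.sqrt(var_c * var_r)


def cumulative_metrics(cases, thresholds_km):
    """For each upper distance threshold, pool all cases at or below it."""
    results = {}
    for limit in thresholds_km:
        selected = [c for c in cases if c["distance_km"] <= limit]
        cats = [v for c in selected for v in c["cats"]]
        ref = [v for c in selected for v in c["earlinet"]]
        if len(ref) < 2:
            continue  # too few samples for a correlation coefficient
        results[limit] = {
            "n_cases": len(selected),
            "mb": mean_bias(cats, ref),
            "rmse": rmse(cats, ref),
            "r": correlation(cats, ref),
        }
    return results


# Upper thresholds 5n km for n = 1, ..., 10, as in the text.
thresholds = [5 * n for n in range(1, 11)]
```

Daytime and nighttime cases would be passed to `cumulative_metrics` separately, mirroring the separate treatment described in the text.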
The observed tendencies can be explained in terms of the distance thresholds and the number of available cases, since the distance thresholds define the number of cases used in the analysis and the number of cases is critical for assessing the performance of CATS. Consequently, the MB, RMSE and Correlation Coefficient all depend on both the number and the characteristics of the CATS-EARLINET cases used. In the study the authors use the maximum number of available EARLINET cases, to avoid any possible selection effect resulting from a poor sample of correlative cases when strict collocation filters are applied. Using the maximum number of available correlative cases, i.e. twenty-six (26) and twenty-one (21) for nighttime and daytime respectively, for ISS overpasses within a 50 km radius of the EARLINET stations, the authors aim to quantitatively address the question of CATS performance and the representativeness of the aerosol backscatter coefficient profiles over various atmospheric, illumination and ISS overpass conditions.

The main objective of the CATS Cloud Aerosol Discrimination (CAD) score, or Feature Type Score, is to provide a level of confidence for the Feature Type classification. In the case of CATS, the Feature Type Score is an integer ranging between -10 and 10. Negative values correspond to atmospheric layers classified as aerosol and positive values to layers classified as cloud, while the magnitude of the score corresponds to the confidence level of the classification. A value of -10 indicates complete confidence that the layer is an aerosol layer, while a Feature Type Score equal to 0 indicates an atmospheric layer with equal probability of being cloud or aerosol. Figure 2 shows the effect of the Feature Type Score, for different values between -8 and 0 (i.e. for atmospheric layers classified as aerosol layers). The Mean Bias (MB; [Mm⁻¹ sr⁻¹]) (Fig. 2a) [...] Scores.
To be more specific, no considerable changes are observed across the different Feature Type Scores, regardless of the selected Feature Type threshold. This is due to the atmospheric characteristics of the CATS-EARLINET cases considered in the analysis. In the framework of the study, to account for contamination effects of multiple-scattering and specular reflection in the intercomparison process, only cloud-free atmospheric scenes are used. Furthermore, cases with detected cirrus, either in the EARLINET Range-Corrected-Signal quicklooks or in the ISS-CATS backscatter coefficient profiles or the feature type profiles, are not considered in the study. Initially, the presence of clouds was investigated using CATS backscatter coefficient and depolarization time-height images and the EARLINET range-corrected signal. Cases for which the retrieval of the EARLINET temporally averaged profile was not feasible due to the presence of clouds, and/or CATS cases in which the presence of clouds propagated into the CATS spatially averaged profile, were discarded from the analysis. Consequently, the lack of dependence shown in Figure 2 (a-c) results from the a priori selection of cloud-free conditions in the analysis. However, a notable characteristic is the nighttime performance of CATS which, as shown by the lower absolute MB, the lower RMSE and the higher Correlation Coefficient values, is more representative than the corresponding daytime performance, due to the higher SNR.

Similarly to the analysis presented and discussed above, Figure 3 shows the effect of the number of aerosol profiles used in the spatial averaging that yields the CATS aerosol profiles used in the study. In Figure 3, the acronym "CPro" denotes the CATS profile closest to the corresponding EARLINET station.
Accordingly, the Mean Bias (MB; [Mm⁻¹ sr⁻¹]) (Fig. 3a), Root Mean Square Error (RMSE; [Mm⁻¹ sr⁻¹]) (Fig. 3b) and Correlation Coefficient (Fig. 3c) are computed for different numbers of profiles used (i.e. CPro ± 1 profile, CPro ± 2 profiles, ...). Based on the MB, RMSE and Correlation Coefficient, the representativeness of the CATS spatially averaged profile increases with the number of aerosol profiles used in the horizontal averaging. To be more specific, the nighttime MB is almost constant, showing a low dependence on the number of profiles used, while for daytime CATS cases the MB improves as the number of profiles used increases. Regarding the RMSE, no significant changes are observed, though a slight decreasing tendency is seen for both daytime and nighttime cases. The Correlation Coefficient also increases with the number of profiles used, for both daytime and nighttime cases, denoting the improvement in representativeness with an increasing number of CATS profiles in the spatial averaging.
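The horizontal averaging over CPro ± n profiles can be sketched as below. The function name, the representation of missing retrievals as `None` and the use of a plain arithmetic mean per altitude bin are assumptions made for illustration; the actual CATS averaging may weight or screen profiles differently.

```python
def average_around_closest(profiles, closest_index, half_width):
    """Average the profiles CPro - half_width ... CPro + half_width bin by bin.

    profiles: list of profiles on a common vertical grid, each a list of
              backscatter values (None marks a missing retrieval).
    closest_index: index of the profile closest to the station (CPro).
    half_width: number of neighbouring profiles taken on each side.
    """
    lo = max(0, closest_index - half_width)
    hi = min(len(profiles), closest_index + half_width + 1)
    window = profiles[lo:hi]
    n_bins = len(window[0])
    averaged = []
    for k in range(n_bins):
        values = [p[k] for p in window if p[k] is not None]
        # A bin with no valid value in any profile stays missing.
        averaged.append(sum(values) / len(values) if values else None)
    return averaged
```

Calling the function with increasing `half_width` and recomputing the evaluation metrics each time reproduces the kind of sensitivity test described in the text.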

(iv) Effect of EARLINET stations topography
In order to study the effect of topography on the CATS profiles, the authors separated the participating EARLINET stations into three clusters: Continental (Case I - Belsk, Bucharest, Leipzig and Warsaw), Coastal (Case II - NOA, Athens NTUA, Barcelona, Cabauw, Thessaloniki and Lecce) and Mountainous (Case III - Dushanbe, Evora, Observatory Hohenpeissenberg, Potenza). The three clusters and the characteristics of the stations are given in Table 1. In addition, Figure 4 shows the locations of the participating stations; green circles denote Continental stations, blue circles Coastal stations and brown circles Mountainous stations. In addition to the geographical distribution of the active EARLINET stations, Figure 4 shows the daytime/nighttime overpasses of the ISS within the evaluation period, between 02/2015 and 09/2016, encompassing the first twenty months of CATS operation. Due to the limited available dataset of CATS-EARLINET cases, the daytime/nighttime separation was not applied in the analysis of the effect of topography.

Figure 5 shows the effect of topography for the three clusters of station characteristics introduced above (Case I: Continental, Case II: Coastal and Case III: Mountainous). In Figure 5a, the box-and-whisker plot of the CATS_i - EARLINET_i residuals is shown; the lower and upper whiskers indicate the 10th and 90th percentiles respectively, and the lower and upper box boundaries indicate the 25th and 75th percentiles respectively. The horizontal line and the red dot indicate the statistical mean and median values respectively, while outliers are indicated by red crosses. According to the results, it is evident that the correlative measurements between the Mountainous EARLINET stations and the ISS overpasses are characterized by higher variability, more extreme differences, higher absolute mean and median biases and higher RMSE than in the Continental and Coastal cases.
Complex topography in terms of geographical characteristics, erroneous mean backscatter coefficient profiles due to the high variability of the aerosol load in the Planetary Boundary Layer, the horizontal distance between the CATS lidar footprint and the ground-based lidar stations, and surface returns all enhance the discrepancies, especially in the lowermost part of the profiles, resulting in larger differences between the EARLINET and CATS profiles. In the absence of these effects of complex topography, CATS representativeness and performance are higher over the Continental stations, while CATS performance over the Coastal stations is characterized by a slightly lower absolute mean bias and at the same time by a lower Correlation Coefficient than over the Continental stations. However, an important factor to consider when interpreting these results is the number of CATS-EARLINET correlative cases used in the analysis: 23 for Case I - Continental, 10 for Case II - Coastal and 14 for Case III - Mountainous. Detailed evaluation metrics on the effect of topography are given in Table 2. Fig. 5b and Fig. 5c show the RMSE and Correlation Coefficient as a function of the different clusters, including the number of available cases per cluster.

[...] written and its contribution to the scientific aerosol community is valuable.
I believe that the paper is adequate for publication under the special issue "EARLINET aerosol profiling: contributions to atmospheric and climate research" of the Atmospheric Chemistry and Physics journal after minor revision.
The authors would like to thank the reviewer for the interesting and substantial comments and suggestions. We did our best to incorporate the proposed changes and corrections in the revised manuscript, aiming at improving the presented paper. Below you will find our responses, one by one, to the comments addressed.
Kind regards,
Emmanouil Proestakis

Specific Comments:

Page 2, Line 5: "…underestimations of the total Aerosol Optical Depth (AOD)". Please reframe this sentence. The way it is currently written it gives the impression that the AOD exploration is part of this study.
The authors agree with the reviewer regarding CATS AOD at 1064 nm. CATS AOD at 1064 nm has been investigated by a significant number of research groups, towards the assessment of CATS performance ([...] Comparison of CATS and CALIOP collocated extinction coefficient profiles also shows good shape agreement. Rajapakshe et al. (2017) reported similar geographical patterns of Above Cloud Aerosols and Cloud Fraction in CATS and CALIOP retrievals. Furthermore, CATS retrievals were used to document the diurnal cycle and variations of clouds. Noel et al. (2018) showed that CATS and CALIOP cloud profiles agree well, with minor differences of the order of 2-7% throughout the entire profiles. In addition, CATS depolarization measurements were investigated in the case of desert dust, smoke from biomass burning and cirrus clouds [...], and were found consistent and in good agreement with depolarization measurements from previous studies and historical datasets implementing CPL (Yorks et al., 2011) and CALIOP (Liu et al., 2015). The studies report in general on the good performance of CATS, despite apparent underestimations. Similarly to the aforementioned studies, the present study also reports on the good performance of CATS, especially during nighttime, despite underestimations in the backscatter coefficients.
Previous studies, not only the present one, report CATS underestimations, from CATS L1 to CATS L3 data; for this reason the sentence includes the phrase: "CATS low negative biases, partially attributed to the deficiency of lidar systems to detect tenuous aerosol layers of backscatter signal below the minimum detection thresholds, may lead to systematic deviations and slight underestimations of the total Aerosol Optical Depth (AOD) in climate studies."
Page 3, Line 29: "CATS retrievals…..complementarily used". This sentence is incomplete as written. Please reframe.
According to the reviewer's recommendation, the commented sentences of the manuscript were modified from:
"CATS retrievals were used to document the diurnal cycle and variations of clouds, with CALIOP complementarily used. Noel et al. (2018) [...]
The reviewer is right regarding the potential value of the Mode 7.2 532 nm product. However, as mentioned in the CATS "Data Release Notes" and the CATS "Algorithm Theoretical Basis Document" (ATBD), unlike the M7.1 data, where the 532 and 1064 nm signals are comparable, the M7.2 532 and 1064 nm signals are very different. Mode 7.2 data at 532 nm are noisy due to issues with stabilizing the seeded laser (laser 2). Since the frequency stability of laser 2 is poor, the laser is not aligned properly with the CATS etalon, causing very weak signal transmission. Therefore, the NASA CATS team strongly recommends not using the M7.2 532 nm data for any application, especially for daytime. By contrast, the use of the 1064 nm data is recommended, though only for studies that are wavelength-independent (i.e. layer detection, relative backscatter intensity).
According to the reviewer's recommendation, the commented sentences of the manuscript were modified from:
"CATS was a technology demonstration designed to operate on-orbit for a minimum of six months and up to three years"
to:
"CATS was a technology demonstration designed to operate on-orbit between six months and three years".
Page 4, Line 35: "CATS products and….of processing". Please rephrase the sentence. The part "…and provided in …" is not in the correct tense or it is not a continuation of the previous text.
According to the reviewer's recommendation, the commented sentences of the manuscript were modified from:
"CATS products and processing algorithms (Pauly et al., 2019) rely heavily on the processing algorithms developed in the framework of the CPL, ACATS and CALIPSO lidar systems (Palm et al., 2002; Yorks et al., 2011; Hlavka et al., 2012) and provided in different levels of processing."
to:
"CATS processing algorithms (Pauly et al., 2019) rely heavily on the processing algorithms developed in the framework of the CPL, ACATS and CALIPSO lidar systems (Palm et al., 2002; Yorks et al., 2011; Hlavka et al., 2012), while CATS products are provided in different levels of processing".
Page 5, Line 1: What is the error in CATS aerosol backscatter retrievals?
The primary sources of uncertainty in the CATS attenuated backscatter signal are the calibration constant and signal noise. Thus if the calibration constant is accurate, the CATS [...] The source of the systematic error in the CATS ATB is the uncertainty in the calibration constant, estimated at 5-10% for 1064 nm at night (10-20% for daytime data). The random error in the ATB is dominated by noise in the lidar signal. The total uncertainty, the sum of the systematic and random errors, in the CATS ATB at 1064 nm is estimated at 10-20% for nighttime data and 20-30% for daytime data.
According to the reviewer's observation and comment, the manuscript was expanded to include the phrase: "The total uncertainty, the sum of the systematic and random errors, in the CATS ATB at 1064 nm is estimated at 10-20% for nighttime data and 20-30% for daytime data."
Page 7, Line 4: Hohenpeissenberg site is not listed here. Please add it.
The authors would like to thank the reviewer for observing that the Hohenpeissenberg EARLINET station was not included in the list. The list is updated to include the Hohenpeissenberg station.
Page 7, Line 22: "widow" -> window
The text is corrected according to the reviewer's comment.
Page 8, Line 11-13: The authors used two different processing algorithms for the retrieval of the ground-based aerosol backscatters namely the SCC and PollyXT specified retrieval algorithms. Under the SCC, all measurements could have been processed/treated in the same way. Could you comment on this decision not to process all measurements in the same way and whether these two algorithms can introduce discrepancies in the reported CATS comparison?
Indeed, in the case of PollyXT systems, an algorithm designed specifically for these systems is used for the retrieval of aerosol backscatter coefficient profiles. We decided not to process these data with the Single Calculus Chain (SCC) because, up to now, the SCC very efficiently supports the processing of lidar signals for systems employing two different acquisition receivers (i.e. one in analogue and one in photon-counting mode, or both in photon-counting mode) for the acquisition of far-range and near-range signals, but only one receiving telescope. It does not yet support the simultaneous processing of lidar signals from systems employing two different telescopes for the acquisition of far-range and near-range signals (i.e. PollyXT systems). Therefore, for PollyXT systems in the SCC, the only way to correct for the incomplete overlap effect at lower altitudes would be to use an overlap function profile. By contrast, the PollyXT-specific retrieval algorithm enables the simultaneous processing of the two signals (collected from the far-range and near-range telescopes); for this reason we decided not to ignore the useful information collected by these two receiving units and to use this algorithm instead. The aforementioned algorithm has been extensively studied in the past (Baars et al., 2016) and proven to provide accurate results in an efficient manner. We thus believe that in this way the comparison of PollyXT profiles with profiles from other lidar systems processed with the SCC is more accurate, since for the former an assumption on the overlap profile is not needed.
Page 9, Line 28: CATS has an overpass over Athens-NTUA on the same day, even closer to the measurement site than Athens-NOA but in a different time frame (a bit later). As the authors explain, the atmospheric conditions were rather stable on that day. At the authors' discretion, I would find it valuable and informative if the profiles from that station were added to Figure 2d, with further discussion of the possible differences or similarities.
[...] 19:57:41 UTC was used. Both the CATS and PollyXT-NOA quicklooks indicate the horizontal and vertical homogeneity of the scene. For the comparison of CATS and EARLINET observations, the latter are regridded to the CATS Level 2 vertical resolution (60 m). Accordingly, the spatially averaged CATS and the temporally averaged EARLINET NOA and NTUA backscatter coefficient profiles are qualitatively compared (Fig. 2d). The observed disagreements between the two EARLINET profiles are related to differences between the two systems, to the different surface elevations of the two stations (86 m for EARLINET-no and 212 m for EARLINET-at), and to the different overlap regions. The horizontal bars in the CATS profile (Fig. 2d) correspond to the standard deviation of the spatially averaged backscatter coefficient profiles. The comparison of the mean backscatter coefficient profile retrieved by CATS with the two corresponding EARLINET NOA and NTUA profiles presented in Figure 2 is an initial demonstration of the good agreement between the two products. The CATS instrument reproduces the observed aerosol features, in terms of both aerosol load and vertical distribution (Fig. 2d). The assessment of the CATS backscatter coefficient is performed in the region between 0.5 km above ground level of the EARLINET sites, to account for overlap effects between the laser beam and the telescope (Wandinger and Ansmann, 2002), topographic effects, surface returns, and differences of atmospheric samples within the Planetary Boundary Layer (Fig. 2d [...]
Page 10, Line 30: I suggest putting each of the cases into a different section, with a short title indicating the complexity of the example.
According to the reviewer's recommendation, the suggested part of the manuscript has been modified to include the following headers, indicating the different EARLINET-CATS correlative cases:
Table 4.
The authors agree with the reviewer. The sentence was therefore modified to report the Minimum Detectable Backscatter (MDB) with the same accuracy as in the paper of Yorks et al. (2016) on the CATS Level 1 processing algorithms and data products, as follows: "CATS M7.2 Minimum Detectable Backscatter 1064 nm: Night: 5.00E-5 ± 77E-5 km⁻¹ sr⁻¹ / Day: 1.30E-3 ± 0.24E-3 km⁻¹ sr⁻¹ - for cirrus clouds; Yorks et al., 2016". In addition, and according to the reviewer's comment, [...]
Page 14, Line 10: To my understanding, the authors mention at Page 10, lines 19-21 that cases where the CATS backscatter coefficient is zero or at its minimum detection limit have been eliminated from the study, yet they are present in this figure for altitudes higher than 6 km. Could you clarify?
In the framework of the comparison methodology, cases of EARLINET backscatter coefficient values below the CATS minimum detectable backscatter limit at 1064 nm are not included in the comparison when the corresponding CATS backscatter coefficient is reported to be zero (Fig. 2d - shaded area i). This constraint accounts for very thin layers that are detected by ground-based lidar systems but whose backscatter values lie below the CATS minimum detection limit, owing to the low Signal-to-Noise Ratio (SNR). The constraints are employed because our basic aim is to quantitatively assess the representativeness and accuracy of the aerosol features detected by CATS, while preventing possible contamination (e.g. the presence of clouds) from propagating into the CATS-EARLINET dataset. The constraint is applied in the comparison shown, for instance, in Figure 6, to address quantitatively the accuracy and representativeness of the satellite-based lidar retrievals and to estimate possible biases in the CATS backscatter coefficient.
It is applied in the comparison of CATS against EARLINET when computing the CATSi-EARLINETi residuals for each pair of observations "i", which are used as statistical indicators of the average CATS overestimation or underestimation of the aerosol load, in terms of backscatter coefficient values. In this respect, since the analysis focuses on the possible CATS overestimation or underestimation of the aerosol load, the authors compare cases where aerosols are detected by both EARLINET and CATS, or by at least one of the two systems. The comparison statistics on the efficiency of CATS in detecting atmospheric features detected by EARLINET systems refer to the aforementioned discussion. However, the mean aerosol backscatter coefficient profiles at 1064 nm are evaluated as provided by CATS and EARLINET, without further processing, for both daytime (Fig. 7a) and nighttime (Fig. 7b) lidar observations, in order to investigate the characteristics, similarities and discrepancies between CATS and EARLINET.
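As a minimal illustration of the constraint and of the CATSi-EARLINETi residuals described above, the following Python sketch may help; it is not the processing code used in the study. The function names, the example values, the use of a strict "below the limit" comparison, and the choice of the nighttime detection limit are assumptions made only for illustration. Both profiles are assumed to be already regridded to the same 60 m vertical grid.

```python
import math

# Nighttime CATS minimum detectable backscatter at 1064 nm (km^-1 sr^-1),
# as quoted in the response above; using it as the threshold is an assumption.
MDB_NIGHT_1064 = 5.00e-5


def paired_residuals(cats, earlinet, mdb):
    """Return CATS_i - EARLINET_i for every retained pair of observations.

    Hypothetical helper: a pair is dropped when the EARLINET value lies
    below the CATS detection limit AND CATS reports exactly zero, i.e. the
    layer could not have been detected from space.
    """
    residuals = []
    for c, e in zip(cats, earlinet):
        # Skip missing values (NaN or infinite entries).
        if not (math.isfinite(c) and math.isfinite(e)):
            continue
        # Constraint: EARLINET below the CATS limit while CATS is zero.
        if e < mdb and c == 0.0:
            continue
        residuals.append(c - e)
    return residuals


def mean_bias(residuals):
    """Mean residual; negative values indicate a CATS underestimation."""
    return sum(residuals) / len(residuals)


# Made-up backscatter values (km^-1 sr^-1) on a common vertical grid.
cats_profile = [1.0e-3, 0.0, 4.0e-4, 0.0]
earlinet_profile = [1.2e-3, 3.0e-5, 5.0e-4, 2.0e-4]

res = paired_residuals(cats_profile, earlinet_profile, MDB_NIGHT_1064)
print(len(res), mean_bias(res))
```

In this example the second pair is excluded, because the EARLINET value is below the detection limit while CATS reports zero. The fourth pair is kept, because the layer is strong enough to be detectable from space and its absence in CATS therefore counts towards the bias.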
Page 16, Line 4: The authors have used the Level 2 v2.01 for the evaluation. Nonetheless, the latest available version is v3.01. How is this versioning going to change the associations reported here? Could you correct the versioning to the latest available, as in here and line 1 at page 17, for consistency?
The authors would like to thank the reviewer for this comment; the versioning at page 17 was corrected accordingly. Regarding the CATS algorithms and their different versions, CATS V3-00 replaced CATS V2-05 on October 1st, 2018. Due to algorithm issues present in CATS V3-00, it was shortly afterwards replaced by CATS V3-01. Initially, the changes in the CATS Level 1 and Level 2 algorithms corresponding to the CATS Version 3-00 data were planned to be the final algorithm release for the CATS project, but issues observed in the CATS products led to modifications of V3-00 and to the release of V3-01 at the beginning of 2019. Since CATS products are provided at different levels of processing, the changes made in the algorithms concern both the L1B and L2O products. To be more specific, the changes in the L1B algorithms include: (1) improvement of the nighttime attenuated total backscatter (ATB) profiles due to improvements in the calibration of CATS, and thus also improvement of the daytime ATB profiles, since the nighttime ATB is used in the calculation of the daytime calibration.
The changes made in the CATS L1B algorithms are reflected in improvements to the CATS L2O products. Additional changes in the CATS L2O algorithms include:
(1) updates in the number of profiles in the L2O datasets;
(2) improvements in the calculation of uncertainties in the L2O layer-integrated parameters;
(3) changes to the "Depolarization_Quality_Flag";
(4) improvements of the Cloud Aerosol Discrimination (CAD) through the implementation of an additional parameter, namely "Cloud_350m_Fraction_XXX_FOV", which reports the number of 350 m L1B profiles within each 5 km L2O bin of the L2O layer product with attenuated total backscatter values greater than 0.03 km⁻¹ sr⁻¹, i.e. atmospheric features with a high probability of being clouds. In addition, the parameter "Num_Profs_Avg_LRatio_XXX_FOV" was added to the L2O Layer data product.
(5) improvements in the CATS Feature Type and Feature Type Score variables, as well as in the Aerosol Subtype classification (replacement of "volcanic" with "UTLS Aerosol"), and addition of the parameters "Opaque_Feature_Optical_Depth_1064_XXX_FOV" and "Opaque_Feature_Optical_Depth_Uncertainty_1064_XXX_FOV" in Mode 7.2 L2O datasets.
(6) updates in the Lidar Ratio (LR) values for cirrus clouds;
(7) update of the effective multiple scattering factor for ice clouds to 0.52.
The above changes in the CATS V3-00 and V3-01 algorithms and the respective products are presented extensively and discussed in depth on the official CATS website (https://cats.gsfc.nasa.gov/; last visit on: 22/05/2019), in the "Publications" section. CATS products and processing algorithms are provided at different levels of processing. CATS Level 1B (L1B) data include vertical profiles of total and perpendicular attenuated backscatter signals, range-corrected, calibrated and annotated with ancillary meteorological parameters (McGill et al., 2007; Powell et al., 2009; Vaughan et al., 2010). CATS Level 2 (L2) products provide the vertical distribution of aerosol and cloud properties (depolarization ratio, backscatter and extinction coefficient profiles at 1064 nm - FFOV), with a horizontal and vertical resolution of 5 km and 60 m, respectively. In addition, L2 data include geophysical parameters of the identified atmospheric layers (vertical feature mask - feature type, aerosol subtype), the required horizontal averaging and information on the feature type classification confidence. Regarding the way this versioning changes the associations reported in the present manuscript, more information is included in the CATS validation led by the NASA GSFC Team, and more specifically by Dr. Rebecca Pauly (Science [...]
Page 17, Line 11: "52o" -> "52°"
The text is corrected according to the reviewer's comment.
Page 17, Line 20: "explotations" -> "explotation"
The text is modified according to the reviewer's comment.
The reviewer is right regarding the Dushanbe station, regarding the time of the ISS overpass and the type of color frequently used, at least in CALIPSO CALIOP quicklooks.
Although in the case of ISS-CATS the orbits are not presented in the same colors (blue and red for nighttime and daytime orbits, respectively), we have used the same colors here, and Figure 5a was adapted accordingly, as follows: