Comparison of Antarctic polar stratospheric cloud observations by ground-based and spaceborne lidars and relevance for Chemistry Climate Models

This paper compares PSC measurements at McMurdo Station, Antarctica, from a ground-based lidar with CALIOP satellite measurements. The paper then tries to extend the comparison to PSC statistics from CALIOP and results from several CCMs from CCMVal-2 and CCMI. Although the scientific value of this study might be significant, the method of comparison, especially with the CCMs, is not organized well enough to derive scientifically useful conclusions, as pointed out below. There are also too many typos and careless mistakes in the draft. A major revision is required before this paper can be published in ACP. I recommend that the authors check the draft carefully, including a check by a native English speaker, before submitting the revised draft.


Introduction
Lidar observations have been extensively used to characterize the occurrence of PSCs in the polar stratosphere (see e.g. Browell et al. (1990); Adriani et al. (2004); Di Liberto et al. (2014); Achtert and Tesche (2014)). The observed optical parameters make it possible to discriminate between different cloud types, such as STS (supercooled ternary solution), NAT (nitric acid trihydrate) and water ice, and external mixtures of these. Pitts and co-workers (Pitts et al., 2009, 2013) calculated the optical parameters of cloud particles with different size distributions and chemical compositions in order to define a PSC classification, which was then applied to the CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization) data. Achtert and Tesche (2014) assessed several lidar-based PSC classifications and their impact on the occurrences of the different PSC types. They concluded that the comparison of PSC classifications obtained from different lidar observations is not straightforward and should take into account the measurement technique and classification methodology used. A variety of schemes using different thresholds for detection and classification has been proposed, rendering a comparison difficult. Here we compare ground-based and satellite-based lidar data by using a detection and classification scheme for the ground-based data that closely approaches the new v2 classification scheme used for CALIOP.
Ground-based lidar observatories provide a unique database with decadal coverage, albeit with discontinuities, spanning from the mid-1980s to today. The first lidar observations in Antarctica started in 1985 at Syowa Station. Iwasaka and co-workers (Iwasaka, 1985, 1986) used a polarization-sensitive lidar to measure backscatter and depolarization to observe PSCs. Later, in 1987/1988, at the Amundsen-Scott South Pole Station, Fiocco and co-workers (Fiocco et al., 1992) used the elastic backscatter signal from a lidar operating at 532 nm to observe PSCs in relation to the temperature. PSCs have also been observed at Davis from 2001 to 2004 (Innis and Klekociuk, 2006) and at Rothera from 2002 to 2005 (Simpson et al., 2005). Long-term observations of PSCs have been performed at McMurdo from 1989 until 2010 (Adriani et al., 1992, 1995, 2004; Di Liberto et al., 2014) and at Dumont d'Urville from 1990 until now (Santacesaria et al., 2001; David et al., 1998, 2010), both with polarization-sensitive lidars. Recently the McMurdo lidar has been transferred to Dome C, where it has been operating since 2014 (Snels et al., 2018).
A clear issue is that the limited representativeness of long-term ground-based lidar data series of the Antarctic stratosphere might restrict their value in climatological studies and model evaluation. Since long-term ground-based lidar observations have been performed at only a few locations, the comparison with model simulations and satellite-borne instruments is necessarily limited to these locations. Satellite-borne lidar observations, which have recently become available, provide almost complete coverage of the globe and present the opportunity to test the polar stratospheric cloud schemes of Chemistry Climate Models (CCMs) on synoptic scales. The Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite was launched in April 2006 with the primary objective of improving our understanding of the impact of clouds and aerosols on the climate. CALIOP provides total backscatter and depolarization profiles, allowing classification of the observed clouds and aerosols. The original CALIPSO mission had a minimum time frame of 3 years, but has been extended several times and is still active.
A comparison between CALIOP and ground-based observations of PSCs in the Antarctic stratosphere is thus possible from 2006 onwards and has been pursued at McMurdo Station by performing measurements coincident with CALIPSO overpasses whenever possible.
Due to their primary role in ozone chemistry, a correct representation of PSCs in CCMs is needed. At present, the parametrization of PSC formation in most CCMs depends only on temperature thresholds and on nitric acid and water vapour concentrations for the determination of supersaturation conditions. A rather complete description of the parametrizations used in state-of-the-art CCMs is reported in Morgenstern et al. (2017). The SPARC Report No. 5 (2010) Chemistry-Climate Model Validation (CCMVal-2) (Eyring et al., 2010) has shown that CCMs can have a biased representation of the stratospheric conditions, with colder temperatures that lead to an overestimate of ozone depletion, also due to an unrealistic PSC coverage. Hence PSC simulations show a large uncertainty, as reported in the CCMVal-2 report. Nevertheless, the report presents a preliminary evaluation based on global averages with a subset of CALIOP data.
The most recent CCMs are able to reproduce the denitrification by the formation of STS and NAT and the dehydration through the formation of ice clouds, but use rather approximate schemes based on temperature thresholds for the onset of nucleation, with additional constraints on how much of the available nitric acid is depleted by STS and NAT formation.
Although the overall denitrification and dehydration can be represented rather well, a correct description of the formation of STS, NAT and mixed-type PSCs would need a more sophisticated microphysics model.
In the present work we first compare the statistics of occurrence of different PSC classes in the stratosphere over McMurdo Station, as detected by the ground-based lidar operating there and by the satellite-borne CALIOP. Subsequently we use the full coverage of the Antarctic CALIOP data to assess the performance of different CCMs in simulating PSC occurrences and PSC distribution over Antarctica.

(Stephens et al., 2002, 2017). With an orbit inclination of 98.2°, it provides extensive daily measurement coverage over the polar regions of both hemispheres, up to 82° in latitude. It hosts the CALIOP two-wavelength polarization diversity lidar, which measures backscatter at wavelengths of 1064 nm and 532 nm, the latter signal separated into parallel and cross polarization with respect to the polarization of the outgoing laser beam. Details of CALIOP can be found in Hunt et al. (2009) and Winker et al. (2009). CALIOP data have been used extensively for observing PSCs, and improved algorithms for PSC classification have been reported in Pitts et al. (2009, 2011, 2013, 2018).

2.2 Ground-based PSC observations at McMurdo
A Rayleigh polarization diversity lidar has operated at the Antarctic station of McMurdo since 1991, in the framework of a USA-Italian collaboration (Adriani et al., 2004; Di Liberto et al., 2014). It measures aerosol backscatter and depolarization profiles from 12 km to 30 km, with a vertical resolution of 30 m. Aerosol backscattering is retrieved using the Klett algorithm (Klett, 1981) and the extinction is calculated according to Gobbi (1995). The depolarization is calibrated following the method described in Snels et al. (2009). The lidar was operated by science technicians of the National Science Foundation (NSF) during the Antarctic winter, typically from the end of May until the end of September, to cover the whole period of PSC occurrence. Potential vorticity reanalysis shows that McMurdo is well within the stratospheric polar vortex from mid-June to the end of September, except for rare events of major vortex perturbation. As a routine, the lidar is operated at the same time every day when meteorological conditions are favorable, or at the earliest chance to do so, for about 30 minutes to render a single profile. When possible, the observations are synchronized with overpasses of the CALIPSO satellite, when its footprint is within 100 km of McMurdo. Observations are intensified in coincidence with Optical Particle Counter (OPC) and ozone sonde balloon measurements (Adriani et al., 1992). All observations at a wavelength of 532 nm used in the present analysis have been quality checked and the relevant data are publicly available in the NDACC database (ftp://ftp.cpc.ncep.noaa.gov/ndacc/station/mcmurdo/ames/lidar/). For the ground-based lidar data a single vertical profile with a vertical resolution of 150 m is obtained by averaging 30 minutes of acquisition.
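The 100 km coincidence criterion can be illustrated with a great-circle distance check. This is a minimal sketch, not the procedure actually used at the station: the station coordinates, the spherical-Earth radius and the names `great_circle_km` and `is_coincident` are assumptions introduced here for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical-Earth assumption

# Approximate McMurdo coordinates in degrees (assumption, not from the text).
MCMURDO = (-77.85, 166.67)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_coincident(footprint_lat, footprint_lon, max_km=100.0):
    """True if a satellite footprint lies within max_km of the station."""
    return great_circle_km(MCMURDO[0], MCMURDO[1], footprint_lat, footprint_lon) <= max_km
```

A footprint at the station itself passes the check, while one several degrees of latitude away does not.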

PSC detection and classification
PSC detection and classification from lidar measurements with orthogonal polarization is usually based on two optical parameters derived from the optical signals with parallel and perpendicular polarization with respect to the laser: the backscatter ratio and the aerosol depolarization. Here we use the backscatter ratio R and the perpendicular backscatter coefficient β⊥, in order to be consistent with the v2 detection and classification scheme used for the CALIOP data. The backscatter ratio is defined as

$$R = \frac{\beta_{aer} + \beta_{mol}}{\beta_{mol}}$$

where β_aer is the total aerosol backscatter and β_mol is the total molecular backscatter.
We must bear in mind that for all lidar measurements the optical parameters represent an average value of the microscopic properties of an ensemble of many particles in a large air volume, which may belong to different composition classes. Only

The CALIOP v2 PSC detection and composition classification algorithm

the random uncertainties u(β⊥) and u(R) due to shot noise in each data sample, which are used to establish dynamic detection thresholds and composition boundaries. The CALIOP v2 algorithm is represented pictorially in Figure 1 and is described in more detail in the following sections.

PSC detection
PSCs are detected in the CALIOP data as statistical outliers relative to the background stratospheric aerosol population. The v2 background aerosol thresholds β⊥,thresh and R_thresh are calculated as the daily median plus one median deviation of CALIOP data at ambient temperatures above 200 K. PSCs are those data points for which either β⊥ > β⊥,thresh + u(β⊥) or R > R_thresh + u(R). If β⊥ ≤ β⊥,thresh + u(β⊥) and R ≤ R_thresh + u(R), the point is a non-PSC. Noise spikes are eliminated in the CALIOP v2 data by requiring coherence within a running 3-point vertical by 5-point horizontal along-track box.
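The detection rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the operational CALIOP v2 code: "median deviation" is interpreted here as the median absolute deviation, the spatial coherence test is omitted, and the function names are hypothetical.

```python
from statistics import median


def background_threshold(values, temperatures, t_min=200.0):
    """Median plus one median absolute deviation of samples warmer than t_min (K)."""
    warm = [v for v, t in zip(values, temperatures) if t > t_min]
    med = median(warm)
    mad = median(abs(v - med) for v in warm)
    return med + mad


def is_psc(beta_perp, r, u_beta_perp, u_r, beta_perp_thresh, r_thresh):
    """A sample is a PSC if either quantity exceeds its threshold plus its uncertainty."""
    return (beta_perp > beta_perp_thresh + u_beta_perp) or (r > r_thresh + u_r)
```

The same `is_psc` test is applied to every data sample; only the uncertainties u(β⊥) and u(R) change from sample to sample, which is what makes the thresholds dynamic.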

PSC composition
The PSC composition is determined as follows:
• If β⊥ ≤ β⊥,thresh + u(β⊥), the PSC is classified as STS.
• A PSC with β⊥ > β⊥,thresh + u(β⊥) is assumed to contain non-spherical particles and is classified as a NAT (or enhanced NAT) mixture or ice based on its value of R. The boundary value separating ice from NAT and enhanced NAT mixtures, R_NAT|ice, is calculated from the total abundances of HNO3 and H2O vapors, as determined on a daily basis as a function of altitude and equivalent latitude from nearly coincident cloud-free Aura MLS data.
The CALIOP v2 data set provides both the grid of classified PSCs according to the v2 algorithm and the associated optical parameters.
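The composition step can be summarized as a small decision function. This is a hedged sketch: the STS branch is taken as the complement of the non-spherical condition, ice is assumed to lie on the high-R side of R_NAT|ice, and the enhanced NAT mixture class is not separated because its criterion is not given in the text. The function name is hypothetical.

```python
def classify_composition(beta_perp, r, u_beta_perp, beta_perp_thresh, r_nat_ice):
    """Return a composition label for a sample already detected as a PSC."""
    if beta_perp <= beta_perp_thresh + u_beta_perp:
        # No significant perpendicular signal: assumed spherical (liquid) particles.
        return "STS"
    # Non-spherical particles: split on the backscatter ratio boundary.
    return "ice" if r > r_nat_ice else "NAT mixture"
```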

PSC Detection and classification criteria for the ground-based data
In order to compare the ground-based lidar data to the CALIOP data we have adopted a new algorithm which follows the same approach and uses the same optical parameters as the v2 CALIOP algorithm (see Figure 1).

PSC detection
The ground-based raw data have been reprocessed to produce the backscatter ratio R and the perpendicular backscatter coefficient β⊥. While the determination of the background aerosol thresholds for the CALIOP data uses a very large number of observations, the quantity of ground-based lidar data is much smaller and does not allow a similar treatment. Instead of using daily medians we calculated a median value from all ground-based data in the 5-year period without PSCs (typically before 15 June or after 1 October) or in obvious clear sky conditions. Thus the background aerosol thresholds were determined as the median values plus one standard deviation of the median. In this way we obtained fixed background thresholds of R_thresh = 1.15 for the backscatter ratio and β⊥,thresh = 1·10⁻⁶ m⁻¹ sr⁻¹ for the perpendicular backscatter coefficient. While most PSC detection schemes for ground-based lidar data use a threshold only for R (Achtert and Tesche, 2014), the scheme used here is more permissive: all data with R > 1.15 + u(R) or β⊥ > 1·10⁻⁶ m⁻¹ sr⁻¹ + u(β⊥), where u(R) and u(β⊥) are the running standard deviations over altitude, and with a local temperature below 200 K in the range between 12 and 30 km, are detected as PSCs. Note that this procedure is very similar to the v2 CALIOP algorithm, except that we use fixed background thresholds and different estimates of the uncertainties in the data. Finally, to mimic the CALIOP coherence criteria, we require continuity along the vertical profile to avoid identifying isolated noise spikes as PSCs.
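The ground-based detection criteria can be put together as follows. This is a minimal sketch under assumptions: the running-window half width, the use of the population standard deviation, and the continuity rule (at least one flagged vertical neighbour) are illustrative choices that the text does not specify.

```python
from statistics import pstdev

R_THRESH = 1.15          # fixed backscatter ratio threshold
BETA_PERP_THRESH = 1e-6  # fixed perpendicular backscatter threshold, m^-1 sr^-1


def running_std(profile, half_window=2):
    """Running standard deviation over altitude (window size is an assumption)."""
    n = len(profile)
    return [pstdev(profile[max(0, i - half_window):min(n, i + half_window + 1)])
            for i in range(n)]


def detect_psc(alt_km, temp_k, r, beta_perp):
    """Flag PSC bins between 12 and 30 km, below 200 K, above threshold plus uncertainty."""
    u_r, u_b = running_std(r), running_std(beta_perp)
    flags = [
        12.0 <= z <= 30.0 and t < 200.0
        and (ri > R_THRESH + ur or bi > BETA_PERP_THRESH + ub)
        for z, t, ri, ur, bi, ub in zip(alt_km, temp_k, r, u_r, beta_perp, u_b)
    ]
    n = len(flags)
    # Vertical continuity: drop flagged bins that have no flagged neighbour.
    return [
        flags[i] and ((i > 0 and flags[i - 1]) or (i < n - 1 and flags[i + 1]))
        for i in range(n)
    ]
```

With these choices, a two-bin layer of enhanced R is retained while a single-bin spike is rejected.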

PSC composition
Composition classification for ground-based PSCs is nearly identical to the CALIOP v2 procedure, the exception being that we use monthly averages for R_NAT|ice computed from the daily values included in the v2 CALIOP data files.
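The monthly averaging of the daily boundary values can be sketched as below. The input layout (a dictionary keyed by ISO date strings with one value per day) is an assumption made for illustration; in the CALIOP files the boundary also depends on altitude and equivalent latitude, which would add further grouping keys.

```python
from collections import defaultdict
from statistics import mean


def monthly_mean(daily_values):
    """Average daily values per month; keys are 'YYYY-MM-DD' strings (assumed layout)."""
    by_month = defaultdict(list)
    for day, value in daily_values.items():
        by_month[day[:7]].append(value)  # 'YYYY-MM' prefix identifies the month
    return {month: mean(vals) for month, vals in by_month.items()}
```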
2.6 Comparison of co-located PSC observations at McMurdo from the ground and from CALIPSO during the 5-year observation period
Here, we compare PSC statistics from ground-based and satellite-borne lidars, with the goal of assessing whether the differing measurement procedures used for each of them induce a bias in the PSC classification, which might hamper the definition of useful common diagnostics for assessing the performance of regional and global climate models. CALIOP overpasses do not occur every day, and occur at most twice per day. On average we have up to 40 CALIOP overpasses per month. Ground-based lidar data are mostly recorded during a CALIOP overpass, but also on days without CALIOP overpasses, usually at the same time of day at which CALIOP overpasses occur and sometimes at different times. The latter are not included in this analysis. All other ground-based measurements have been used in the statistical comparison.
Generally speaking, most of the ground-based profiles have been recorded during a CALIOP overpass, but there might be days with only a ground-based measurement or only a CALIOP measurement.
The comparison between data obtained by space-borne and ground-based instruments is not straightforward. Lidars on satellites provide altitude-resolved PSC observations on a synoptic scale, with fixed revisit times on the ground spot. The ground-based lidar observes at distances up to 30 km from the ground, while the satellite-based lidar is in orbit at 705 km and observes backscattering from distances around 700 km. This implies that the signal-to-noise ratio of CALIOP is in general lower than that of the ground-based lidar. Therefore the CALIOP processing averages the data where the signal-to-noise ratio is low, and varies the thresholds on both R and β⊥ as a function of the signal-to-noise ratio.
For these reasons, a point-to-point profile comparison of these databases may not be sufficient to evaluate whether or not the instruments provide compatible information on PSC coverage and its partition into different classes, which, in the end, is the information needed to evaluate models and provide a climatic survey of the polar stratosphere. The purpose of this analysis is not to validate the satellite-borne instrument, but to verify whether the two instruments provide compatible information in terms of occurrences of the different PSC classes around McMurdo.
In order to illustrate how ground-based and space-borne lidar observations of PSCs compare, we show as an example the

Both the CALIOP PSC product and the classification of the ground-based lidar optical parameters, obtained with the v2 algorithm adapted for ground-based data, provide a similar view for this winter, with a dominance of NAT mixtures and isolated periods of ice PSCs in July. Enhanced NAT mixtures appear mostly in June and July, around and above 18 km, while STS has been observed in the lower layers throughout the season, being the major species in September. These results are not directly comparable with the analysis previously reported (Di Liberto et al., 2014), where a different classification scheme for ground-based data was adopted and different PSC classes were assigned. Although the overall agreement with CALIOP is acceptable, many small differences are evident and confirm that a point-to-point comparison of these data is not straightforward.

The figure shows that PSCs are observed up to 25 km in July and August. Above 25 km the number of PSC observations is negligible, both for ground-based and CALIOP observations. NAT mixtures are the dominant species, with a slightly different altitude distribution in July: ground-based occurrences of NAT mixtures are more frequent below 18 km than in the CALIOP data.
The occurrences of ice clouds in July are very similar, while in August some low ice clouds appear in the ground-based data but are absent in the CALIOP observations. Enhanced NAT mixtures occur mainly in July and are observed between 17 and 25 km, though they are more abundant in the ground-based observations. The vertical distribution of STS shows good agreement in July and August.
Another way to compare the statistical distribution of PSCs as observed by both instruments is to use the temperature dependence. The temperature dependence of the occurrence of different PSC classes has been studied intensively with in-situ and remote data, with the goal of confirming hypotheses on microphysical mechanisms of PSC formation (Peter, 1997).

The total number of observations has a very similar temperature distribution, which indicates that the two instruments statistically sample air masses with a similar temperature distribution. The temperature dependence of the NAT and STS PSCs is very similar, although the peak for NAT is slightly shifted to lower temperatures. The onset for ice is the same, although the ice fraction at lower temperatures appears to be larger for CALIOP than for the ground-based data.
3 Comparison of CALIOP PSC observations in the Southern Hemisphere with CCM simulations

The coupling of stratospheric chemical models with climate models has led to a new generation of models. These coupled CCMs have been used within the Chemistry-Climate Model Validation activity 2 (CCMVal-2) (Eyring et al., 2008) and represent both stratospheric chemistry and atmospheric climate. CCMVal-2 models do not include a representation of stratospheric aerosol physics and chemistry, but use parametrizations to take into account the formation of PSCs. There are large differences among CCMs in their treatment of PSCs, regarding formation mechanisms, types, and sizes (Morgenstern et al., 2010). Evaluating the ability of CCMs to reproduce ice and NAT PSCs is a key factor in interpreting simulated stratospheric polar ozone changes. The comparison of space-borne PSC observations with CCM simulations requires adequate diagnostic methods. Here we assess the ability of models to simulate PSCs using diagnostics that mostly focus on microphysical factors, such as the NAT and ice surface area densities, and diagnostics that are sensitive to the coupling of those with the simulation of polar vortex variability and its mean state. Some general features, such as the horizontal resolution and vertical levels, are displayed in Table 3.

All models include water-ice PSCs as well as NAT. They also treat sulfate aerosols in different forms, such as STS (CAM3.5, WACCM and CCSRNIES) or liquid aerosol (LMDZrepro).

The conditions at which PSCs condense and evaporate vary, not only for water-ice PSCs but also for NAT and STS, between CCMs (Morgenstern et al., 2010). Most CCMVal-2 models use a thermodynamic equilibrium assumption that PSCs are formed at the saturation points of HNO 3 over NAT and H 2 O over water-ice.
The microphysical processes of condensation and evaporation of the PSCs vary among the different models.

Table 4. Main features of simulation and of the microphysics of polar stratospheric clouds. EQ = thermodynamic equilibrium with gaseous HNO3 / H2SO4 / H2O assumed. HY = non-equilibrium / hysteresis considered. LA = liquid aerosol (adapted from CCMVal-2 report (2010)).
Note that the equilibrium assumption allows the total mass of condensed PSCs to be determined, and that a size distribution needs to be postulated in order to derive surface area densities (SAD). Since the sedimentation velocity depends on the size of the particles, the assumed size distribution has a significant impact on denitrification and dehydration processes through sedimentation of PSCs.
Some differences between WACCM and WACCM-CCMI should be mentioned here. While the CCMVal-2 version of WACCM simulated Southern Hemisphere winter and spring temperatures that were too cold compared with observations, in the CCMI-1 simulations this problem was addressed by introducing additional mechanical forcing of the circulation via parametrized gravity waves. The polar heterogeneous chemistry was also recently updated (Wegner et al., 2013) and further evaluated by Solomon et al. (2015).
Recently Zhu and co-workers introduced a new PSC model (Zhu et al., 2015, 2017a) within the CESM1 (Community

These models are, to our knowledge, the most significant advancements in the field of PSC representation in Global Climate Models used for ozone and climate change studies. The CARMA model is an interactive aerosol and radiation model fully coupled to WACCM, able to simulate advection, diffusion, sedimentation, deposition, coagulation, nucleation and condensational growth of atmospheric aerosols online with the temperature, dynamics and radiation structure simulated by the GCM (Toon et al., 1988). This approach is completely different from the parametrizations available in the simulations we are analysing here. A full evaluation of the WACCM/CARMA models in Specified Dynamics runs with respect to CALIPSO data is available in the literature (Zhu et al., 2015, 2017a) but is beyond the scope of this intercomparison, where free-running simulations are used.
Here we limit our analysis to simulations produced by four models from CCMVal-2 and one model from CCMI. One of the goals is to use different diagnostics to test the model simulations against the CALIOP observations. Several studies concerning PSC simulations by WACCM (Brakebusch et al., 2013; Wegner et al., 2013) and WACCM/CARMA (Zhu et al., 2015, 2017a) have been published recently.

Comparison based on the PSC vertical extent
Presently, the evaluation of CCMs with regard to stratospheric aerosol, and in particular PSCs, is still incomplete. The SPARC report (Eyring et al., 2010)

To be able to compare with the CALIOP lidar observations, we have to derive the mean PSC layer vertical extent and the frequency of occurrence as a function of height and of temperature for the models from the spatial distribution of the PSC surface area density (SAD). To do so, it is necessary to apply a simplified observation operator to the model output (i.e. identify the model grid points where a lidar would have observed NAT or ice clouds by defining a threshold for the SAD values produced by the models). We first define the vertical extent of PSCs as the sum of all layers (in km) containing a specific class of PSC.
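The simplified observation operator can be illustrated for a single model column. This is a sketch under assumptions: the threshold value is a free parameter supplied by the user, the layer thicknesses are taken as known, and the function name is hypothetical.

```python
def vertical_extent_km(sad_column, layer_thickness_km, sad_threshold):
    """Sum the thickness of all layers whose SAD exceeds the detection threshold."""
    return sum(dz for sad, dz in zip(sad_column, layer_thickness_km)
               if sad > sad_threshold)
```

Applying the function separately to the NAT and ice SAD columns gives the vertical extent per PSC class, which can then be accumulated into monthly maps.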
In order to study seasonal and geographical variations, we construct maps of monthly means by accumulating all observations.

Höpfner et al. (2006) suggested that mountain waves may be responsible for the non-zonal NAT distribution, which was indeed observed closer to the Transantarctic chain, while Alexander et al. (2011) also consider that NAT formation can be related to the outflow of ice clouds. Wang et al. (2008) pointed out that increased convection due to orographic triggering in the lee of the Transantarctic chain is related to the occurrence of enhanced NAT mixtures. Enhanced NAT mixtures have a minor vertical extent with respect

The vertical extent for the models is estimated analogously to the observations. The horizontal resolution applied to estimate the occurrence is the same for the models and the CALIOP data. The effect of the differences in vertical resolution among models and observations is reduced by calculating a total aggregate vertical occurrence.

Table 5 reports the total PSC vertically integrated frequencies of occurrence for the five models and for CALIPSO from June to September, as already indicated in Figures 5, 6, 7, 8, 9 and

Table 5. Total PSC frequencies (in %) in the 12-30 km height layer for NAT and ice clouds for June-July-August-September for the observations and models. Note that CALIPSO NAT includes the enhanced NAT mixtures class.
The differences between the simulations obtained from the CCMs and CALIOP observations are discussed in terms of geographical distribution, onset and decline of PSCs during polar winter and total vertical extent for NAT and ice.
The CAM3.5 model overestimates NAT and ice throughout the winter and shows an early onset of PSCs in June and also an early decline in August with respect to the CALIOP observations. CCSRNIES also shows too strong a presence of NAT and ice with respect to CALIOP, in particular in September, but shows a correct seasonality, with July and August being the months with the largest presence of PSCs.
The LMDZrepro model produces a correct onset and decline of the PSC formation, but shows the largest NAT frequency and the lowest ice frequency of all models.

WACCM is similar to CAM3.5, but with a larger NAT and ice frequency. The onset of PSC formation is early, as for CAM3.5.
The simulations produced by WACCM-CCMI follow the same trend for both NAT and ice as observed by CALIOP, although the NAT frequency in July and August is underestimated and the ice PSCs are overestimated with respect to CALIOP.
In discussing the geographical distribution of PSCs, it should be noted that the small numbers of observations in some

All other models overestimate the NAT occurrences, most probably due to the cold temperature bias. Ice is also strongly overestimated, with the exception of LMDZrepro, which underestimates the ice occurrences with respect to CALIOP.

Comparison based on SAD
Another diagnostic method consists of comparing the SAD for CCMs and CALIOP. A range of SAD values can be obtained for NAT and ice for each model. The surface area density for the CCMVal-2 models is estimated from a semi-empirical relation between mass and mean surface area given by the model providers and reported in the CCMVal-2 report. We must be aware, however, that SAD is a derived variable and depends on the assumptions on the mean particle size for each model (as detailed in Eyring et al. (2010)). When models predict both NAT and ice clouds, we assigned the SAD to ice if the SAD for ice is larger by a factor of 3 than the one for NAT. The SADs for CALIOP have been evaluated by using an empirical relationship derived from coincident lidar and size distribution observations (Snels et al., 2018). Figure 11 shows the histograms of the ice and NAT SAD values for each model, together with the range of SAD reported in Adriani et al. (1995). The fraction is normalized to the total number of model grid points in order to identify the differences in PSC occurrence among models and between classes.
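The class assignment and the normalization of the histograms can be sketched as follows. This is an illustration only: the choice of one bin per decade of SAD and the function names are assumptions, and the treatment of grid points with no condensed phase (returning `None`) is not described in the text.

```python
import math
from collections import Counter


def assign_class(sad_nat, sad_ice, factor=3.0):
    """Assign a grid point to ice if its ice SAD exceeds factor times its NAT SAD."""
    if sad_nat <= 0.0 and sad_ice <= 0.0:
        return None  # no PSC at this grid point
    return "ice" if sad_ice > factor * sad_nat else "NAT"


def log10_histogram(sad_values, n_grid_points):
    """Fraction of all grid points per decade of SAD (bin key = floor(log10(SAD)))."""
    counts = Counter(math.floor(math.log10(s)) for s in sad_values if s > 0.0)
    return {decade: c / n_grid_points for decade, c in sorted(counts.items())}
```

Normalizing by the total number of grid points, rather than by the number of cloudy points, keeps differences in overall PSC occurrence visible in the histogram.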
Figure 11. Histograms of the NAT (solid lines) and ice (dashed lines) SADs for some CCMVal models and for CALIOP (2006-2010). The histograms for the model data have been truncated and represent 93% of the total SAD. The straight lines at the top of the figure indicate the range of SAD values for NAT and ice "observed" by ground-based lidars and are taken from Adriani et al. (1995).
We observe that for most of the models NAT PSCs have SADs ranging between 3·10⁻¹⁰ and 10⁻⁸ cm⁻¹, except for LMDZrepro, which has larger SADs for NAT PSCs and is clearly an outlier. In general all models produce SADs for NAT that are smaller by one order of magnitude than the SAD calculated from CALIOP data, except for LMDZrepro. The variability among models for the NAT SAD may be related to the assumptions made on the number of particles per cm³. The narrow peak at larger NAT SAD values for the LMDZrepro model could be consistent with the use of a much larger particle number density and smaller particle radius in the simulation. This in turn would lead to less irreversible denitrification in the models with larger NAT SAD (CCMVal-2 report, 2010, Chapter 6). Most of the models have ice PSCs in a SAD range between 2·10⁻⁹ and 10⁻⁶ cm⁻¹ and are generally a factor of 2-3 smaller than the CALIOP values, except for the WACCM-CCMI simulations, which predict a larger value than that derived from CALIOP observations.

Comparison based on PSC occurrences
The comparison between CALIOP and CCMs can also be made by using the occurrences as a function of T-T_NAT, similarly to what has been done above for the comparison between ground-based and satellite-borne lidars above McMurdo. In Figure 12 the PSC occurrences as predicted by the models and as observed by CALIPSO between 60°S and 82°S, averaged over the 2006-2010 period, are displayed as a function of T-T_NAT, where T_NAT has been calculated from HNO3 and H2O number densities. Note that the models produce only NAT and ice occurrences. As reported in the CCMVal-2 report, most models show a well-known cold pole bias in stratospheric temperature. The bias is in general attributed to model dynamics, as in Austin et al. (2003), who identify a lack of westward wave forcing resulting in a more intense and persistent polar vortex. A clear improvement is obtained with an improved gravity wave scheme as in Kinnison et al. (2007), resulting in more realistic temperatures in the WACCM-CCMI simulation as described above.
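The occurrence diagnostic amounts to binning samples by T-T_NAT and computing, in each bin, the fraction that belongs to a given PSC class. The sketch below assumes that T-T_NAT has already been computed for every sample; the 1 K bin width and the function name are illustrative choices.

```python
import math
from collections import defaultdict


def occurrence_fraction(delta_t, is_class, bin_width=1.0):
    """Fraction of samples of a given PSC class per T-T_NAT bin (keyed by lower bin edge)."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for dt, flag in zip(delta_t, is_class):
        lower_edge = math.floor(dt / bin_width) * bin_width
        total[lower_edge] += 1
        hits[lower_edge] += int(flag)
    return {edge: hits[edge] / total[edge] for edge in sorted(total)}
```

Because each bin is normalized by its own number of samples, the resulting curve is largely insensitive to how often a model or instrument samples a given temperature range.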

The fraction of data with different PSC classes helps in evaluating how realistic the microphysical scheme is, since this variable is normalized to the number of observations and is in principle independent of the possible biases. The onset of NAT is similar for all models, except for WACCM-CCMI, where NAT starts to form only below T_NAT. The onset of ice formation occurs at T-T_NAT = -5 K for all models, except for CCSRNIES. The increase of NAT occurrences with decreasing temperature is stronger for all models than for CALIOP. This is due to the fact that the models consider only the thermodynamic equilibrium conditions for the formation of PSCs, and do not allow the existence of supersaturation without PSC formation. The family of models CAM3.5, WACCM and WACCM-CCMI shows a faster increase of the ice occurrences with decreasing temperature than CALIOP. The reason is probably the same as for the NAT behaviour. LMDZrepro evidently produces much less ice than the other models and CALIOP, and at low temperature NAT is the dominant species, while the other models and CALIOP show a dominant ice occurrence at low temperatures. The CCSRNIES model shows a slower increase of the ice occurrences than CALIOP and the other models.
In general CAM3.5 and WACCM, which share the same microphysical scheme, show more than satisfactory agreement, notwithstanding the cold bias that generates an excessive PSC coverage. On the other hand, WACCM-CCMI has a more realistic PSC coverage but a likely too efficient ice PSC generation due to the new scheme. So, even if the overall skill of the model is largely improved, this kind of diagnostic (the slopes of the curves in Figure 12 and the "onset" PSC temperature) suggests the need to examine the performance of individual components of the model system, such as the microphysical scheme.

A statistical comparison has been proposed for PSC observations at McMurdo, obtained from ground-based and satellite-borne lidar measurements. The analysis of the ground-based data has been performed by using a detection and classification algorithm which closely follows the v2 algorithm applied to CALIOP data, in order to avoid a bias due to different classification schemes.
Results have been shown for July and August 2006, the months with the best temporal coverage. A comparison of PSC occurrences as a function of time and height in 2006 shows that both data sets capture the general features of the PSC season in terms of the occurrence of each species throughout the winter. The vertical distribution and the temperature dependence of the occurrences of the different PSC classes show some discrepancies; in particular, there are noticeable differences in the height distribution of NAT around 20 km. In conclusion, the statistical agreement between CALIOP and ground-based data is acceptable, considering the different observation geometry and other possible biases.
A set of diagnostics has been proposed to compare the PSC simulations from CCMs with CALIOP, with the goal of evaluating possible biases. The diagnostics are based on the spatial (vertical and horizontal) SAD distribution of ice and NAT particles together with their temperature distributions. These diagnostics are applied here to a subset of CCM simulations from CCMVal-2 and to a more recent version of WACCM from CCMI. The geographical distribution of PSCs in the polar vortex observed by CALIOP is not well reproduced by most of the models. Moreover, the NAT frequency is overestimated with respect to CALIOP for all models except WACCM-CCMI. The onset of PSC formation occurs too early in the CAM3.5 and WACCM models with respect to CALIOP, while CCSRNIES and LMDZrepro show too strong a presence of NAT in June and September relative to July and August. LMDZrepro has the largest amount of NAT and the smallest amount of ice PSCs.
WACCM-CCMI shows the best agreement with CALIOP, both for onset and decline and for absolute values, although NAT is slightly underestimated in July and August and ice is overestimated in the same months. In conclusion, the WACCM-CCMI model compares better with CALIOP observations for ice and NAT, owing to the additional forcings applied in order to eliminate the cold temperature bias.