Statistical estimation of stratospheric particle size distribution by combining optical modelling and lidar scattering measurements

A method for estimating the stratospheric particle size distribution from multiwavelength lidar measurements is presented. It is based on matching measured and model- simulated backscatter coefficients. The lidar backscatter co- efficients measured at the three commonly used wavelengths 355, 532 and 1064 nm are compared to a precomputed look- up table of model-calculated values. The optical model as- sumes that particles are spherical and that their size distribu- tion is unimodal. This inverse problem is not trivial because the optical model is highly non-linear with a strong sensi- tivity to the size distribution parameters in some cases. The errors in the lidar backscatter coefficients are explicitly taken into account in the estimation. The method takes advantage of the statistical properties of the possible solution cluster to identify the most probable size distribution parameters. In order to discard model-simulated outliers resulting from the strong non-linearity of the model, solutions farther than one standard deviation of the median values of the solution clus- ter are filtered out, because the most probable solution is ex- pected to be in the densest part of the cluster. Within the fil- tered solution cluster, the estimation algorithm minimizes a cost function of the misfit between measurements and model simulations. Two validation cases are presented on Polar Stratospheric Cloud (PSC) events detected above the ALOMAR obser- vatory (69 N - Norway). A first validation is performed against optical particle counter measurements carried out in January 1996. In non-depolarizing regions of the cloud (i.e. spherical particles), the parameters of an unimodal size dis- tribution and those of the optically dominant mode of a bi- modal size distribution are quite successfully retrieved, es- pecially for the median radius and the geometrical standard deviation. As expected, the algorithm performs poorly when solid particles drive the backscatter coefficient. A small bias is identified in modelling the refractive index when compared to previous works that inferred PSC type Ib refractive in- dices. The accuracy of the size distribution retrieval is im- proved when the refractive index is set to the value inferred in the reference paper. Our results are then compared to values retrieved with an- other similar method that does not account for the effect of the measurements errors and the non-linearity of the optical model on the likelihood of the solution. The case considered is a liquid PSC observed over northern Scandinavia on Jan- uary 2005. An excellent agreement is found between the two methods when our algorithm is applied without any statisti- cal filtering of the solution cluster. However, the solution for the geometrical standard deviation appears to be rather un- likely with a value close to unity ( 1.04). When our algo- rithm is applied with solution filtering, a more realistic value of the standard deviation ( 1.27) is found. This highlights the importance of taking into account the non linearity of the model together with the lidar errors, when estimating particle size distribution parameters from lidar measurements.


Introduction
Stratospheric particles play an important role in atmospheric chemistry (WMO, 2007) and, in the case of large volcanic eruption, in the earth radiative budget (Robock, 2005). For example, Polar Stratospheric Clouds (PSC) are a key element in polar ozone depletion by providing the surfaces for chlorine activation through heterogeneous chemistry (Peter, 1997;Charlson and Heintzenberg, 1995). Characterizing and understanding 15 stratospheric particles remains a major scientific issue. The most important characteristics of stratospheric particles are their size distribution, shape and composition. This information can be used to calculate radiative properties of particles, rates of heterogeneous chemistry or gravitational sedimentation (ASAP, 2006). But, very few measurements give access to the full size distribution of stratospheric particles. They from multiwavelength Raman lidar data (Müller et al., 1998;Veselovskii et al., 2002Veselovskii et al., , 2005. This typical inverse problem can be presented in the following way : given a set of lidar observations and optical modelling of the observations, how can the most probable aerosol size distribution be estimated? This problem is often addressed by assuming model linearity and normally distributed errors. Most retrieval methods are 10 only based on the least square criterion (i.e. minimal discrepancy between measured and model-simulated quantities, variance weighted by the errors). Note that, in some cases, the criterion is used without even considering observation errors. This leastsquare criterion should lead to the maximum likelihood estimate and to a minimum variance for the analysis error (error in the estimation) if all the errors are Gaussian, 15 unbiased and the model quasi-linear (Lahoz et al., 2007).
In our case, the model is highly non linear (see Sect. 3.3). It is very sensitive to the size distributions parameters in some cases, and, as a consequence, the least square estimator may not always give the most probable solution depending on the magnitude of the observation errors. In addition, even if the errors in the observables were to be 20 normally distributed, the non linearity of the model would result in a Probability Density Function (PDF) of the solution that is not necessarily Gaussian, meaning that the PDF cannot be described by its first and second moment. Therefore, the entire shape of the solution PDF has to be considered.
The best match estimator, identified as the minimal difference between measured 25 and modelled quantities in the least square sense, is usually the most probable solution when errors in the observables are negligible and the model quasi-linear. In our case, errors in the lidar retrieved backscatter coefficient are not at all negligible, and so it is important to check whether the best match solution is the most probable solution.

8916
It is worth pointing out that the need to find a physical solution has led to added constraints on the form or properties of the solution. One such approach is the regularization technique, and an example of such a constraint is the smoothness of the solution for vertical profile retrievals (Tikhonov and Arsenin, 1977;Tarantola, 1987;Müller et al., 1999Müller et al., , 2000. An accurate retrieval also often requires numerous optical quantities (backscatter and/or extinction coefficients) at different wavelengths. With enough constraints, regularization techniques could also enable the determination of the refractive index (Veselovskii et al., 2004).
In the stratosphere, the particle extinction coefficient is measured with a limited accuracy, because very few photons are detected by the lidar at stratospheric altitudes. 10 In addition, most of the best-equipped lidar stations only monitor the stratosphere at three wavelengths, typically at 355, 532 and 1064 nm. Because of these two limitations, the regularization technique is not necessarily the most suited approach for retrieving stratospheric particle size distributions from lidar data. In this paper, we explore an alternative method based on the comparison between measured and model-simulated Backscatter Coefficients (BC), using a Monte Carlo approach. The formulation of the method is inspired by the works of Beyerle et al. (1994) and Merhtens et al. (1999). The retrieval algorithm minimizes a cost function of the misfit between measurements and model simulations with the control variables being the parameters of the PSC size distribution. The errors in the measurements are explicitly taken into account in the 20 search for a solution. A cluster-based filtering of the solution pool ensures both stability and reliable error estimation. The refractive index is determined from the particle composition calculated by the microphysical model, taking into account both sulphuric acid aerosols and liquid type Ib PSC particles.
The paper is organized as follow.  von Zahn et al., 2000). The vibrational Raman at 387 and 607 nm (associated with the 355 and 532 nm wavelengths, respectively) and the rotational Raman measurements at 529 and 530 nm are also performed simultaneously. The measurement integration time is typically 3 min with a vertical resolution of 150 m. Data are acquired with the lidar pointing to the zenith.

Methodology of the size distribution estimation
The particle size distribution is retrieved from comparisons between measured and model-simulated backscatter coefficients, taking the measurement errors into account. In the first phase, the model calculates the chemical composition according to the specific environmental conditions (pressure, temperature, total HNO 3 , total H 2 O) for a Introduction We assume in our case that the stratospheric particle size distribution can be repre-5 sented by a lognormal size distribution (Pinnick et al., 1976;WMO, 2007): where N 0 is the total number of particles per unit volume and σ the geometrical standard deviation (hereafter called standard deviation) around the mode radius r m .
3.1 Refractive index modelling 10 In the first phase, the model calculates the particle composition, ranging from a binary H 2 SO 4 /H 2 O solution to a ternary H 2 SO 4 /HNO 3 /H 2 O (STS) solution (Larsen, 2000;Luo et al., 1996;Krieger et al., 2000). The composition of the condensed phase is assumed to be in thermodynamic equilibrium with the gas phase. In order to determine the particle equilibrium composition (weight fractions of sulphuric acid, nitric acid and water), one needs to solve a set of 2 non-linear equations describing the equality between the partial pressures of HNO 3 and H 2 O just over the surface of the condensed phase and the partial pressures in the gas phase. The model is initialized with total (gaseous + condensed) amounts of HNO 3 and H 2 O and it then redistributes HNO 3 and H 2 O between the gas and the condensed phases according to the calculated par-20 ticle composition. The iterative procedure of the equilibrium composition calculation ensures that the gas phase and condensed HNO 3 and H 2 O is equal to the initial total HNO 3 and H 2 O. The model then derives the condensed aerosol mass concentration (or aerosol volume concentration). Finally, the refractive index is calculated from the equilibrium composition (Luo et al., 1996) because of the lack of available data beyond the 360-1000 nm range. Modelling the composition allows us to account for the rapid variations of the particle composition, and hence, the refractive index, around the PSC type Ib temperature threshold (Larsen, 2000;Carslaw et al., 1997) instead of assuming a constant refractive index whatever the environmental conditions. 5

Backscatter modelling
The particle backscatter coefficient is simulated using Mie theory. It is strictly valid for spherical particles only. Consequently, our size distribution algorithm can only be applied to lidar measurements of spherical particles such as supercooled sulphuric acid aerosol particles or type Ib PSC. The model-simulated size backscatter coefficient, where λ is the incident wavelength, n(r) is the size distribution, the number of particles at the radius r between r and r+dr, m the refractive index and d σ b /d Ω the Mie particle backscattering differential cross section. 15 The other optical quantity used in the retrieval algorithm is the colour ratio CR λ (or CR) which is the BC at the wavelength λ normalised by the BC at the most sensitive wavelength of the lidar system, 532 nm : 3.3 Size distribution retrieval methodology 20 An optical module coupled to a size-resolving aerosol model is used to calculate the three backscatter coefficients (β 355 nm , β 532 nm and β 1064 nm ) and the two associated colour ratios (CR 355 nm and CR 1064 nm ) as a function of the size distribution parameters. In principle, the CR should not depend on N o because N o vanishes when CR are formed. But, in our size distribution algorithm, that is not the case. In the look-up table, the refractive index also varies with N o because N o , along with σ and r m , determines the aerosol volume concentration and, hence, the condensed mass of HNO 3 and H 2 O. As the total HNO 3 and H 2 O is fixed, the partitioning between gas phase and condensed 5 phase as well as the aerosol composition depend on aerosol volume concentration. For example, if N o is very small (or large), most of the HNO 3 is in the gas-phase (or condensed phase), and so, the particle equilibrium composition would correspond to high (or low) partial pressures of gaseous HNO 3 . As a result, even when σ and r m are kept constant, different N o give different aerosol compositions, different refractive 10 indices, and hence, different CR.
The influence of N o on modelling the refractive index appears in Fig. 1, which displays sample plots of CR 355 nm and CR 1064 nm . The standard deviation is fixed in Fig. 1a (CR=f i (r m ) σ=1.45 ) while the median radius is fixed in Fig. 1b (CR=f i (σ) rm=0.3 µm ). Two sets of curves are plotted. They correspond to N o being equal to either 0.1 or 10 cm −3 . 15 The reference values of r m and σ come from the validation case described in Sect. 4.3. For high aerosol volumes (high values of σ or r m ) the curves of CR(r m ,σ) for N o =0.1 and for N o =10 cm −3 start to differ, illustrating the influence of N o on CR. The nonlinearity of the retrieval problem is also highlighted in this figure, in that the colour ratios are not in univalent relationship with the size distribution parameters in Fig. 1. 20 For the environmental conditions (temperature, pressure, mixing ratios of total HNO 3 and H 2 O) of each lidar data point, a look-up table of model-calculated BC and CR is generated as a function of N o , r m and σ. The resolution step is typically 0.1 cm −3 for N o , 0.01 µm for r m and 0.01 for σ. The influence of the look-up table resolution on the retrieval is checked and increased till such influence is not noticeable anymore. 25 The size distribution retrieval algorithm then searches the look-up table for the modelsimulated BC (β i ,model with i =355, 532, 1064 nm) and CR (CR j,model with j =355/532, 1064/532) that correspond to the measurements (β i ,lidar , CR j,lidar ). As CR also depend on N o in the look-up Interactive Discussion one step by fitting the 5 optical quantities (= 3 BC + 2 CR). If the refractive index had been assumed constant, CR would have been independent of N o . We could have first searched for σ and r m by fitting the 2 CR and then derived N o by fitting the 3 BC (as in Blum et al., 2006;Baumgarten et al., 2007). This two-step procedure would have been less demanding in terms of computing time than the simultaneous search of N o , 5 r m and σ adopted here. In order to estimate the errors in the size distribution retrieval, the lidar measurement errors (∆β i ,idar , ∆CR j,lidar ) are taken into account by searching the β i and CR j that are within the measurement intervals (β i ,lidar − ∆β i ,lidar , β i ,lidar + ∆β i ,lidar ) and (CR j,lidar − ∆CR j,lidar , CR j,lidar + ∆CR j,lidar ), respectively. Any (N o , r m , σ) combination of the lookup table whose associated β i ,model and CR j,model belong to the measurement intervals is taken as a possible solution. Obviously, the larger the lidar errors, the wider the pool of possible solutions. Note that, in our algorithm, the model errors are ignored. For instance, the size distribution retrieval algorithm does not account for errors in modelling the particle refractive index. In other words, the model is assumed to be 15 perfect. The additional size distribution retrieval errors originating from possible model deficiencies are discussed in the results and conclusion sections.
Several approaches are possible for determining the model solution (N o , r m , σ). It is possible to simply look for the best (according to the least-square criterion) match between model simulations and measurements by minimizing the following scalar func-20 tion: (4) This type of quadratic function that quantifies the misfit between model and data is variously referred as cost function, distance function, objective function, or penalty 25 function in data assimilation (Elbern and Schmidt, 2001 (N o ,σ, r m ) such as X k,lidar − ∆X k,lidar <M(N o ,σ, r m , λ)<X k,lidar + ∆X k,lidar (with X =BC or CR, and k=1 to 5). The best match model solution is indeed identified independently from the statistical properties of the cluster of possible solutions. This approach gives the most likely solution when the model is quasi-linear and the errors are Gaussian. A second approach is to look for the most likely solution using the statistical prop-5 erties of the cluster of possible solutions, given the lidar uncertainties. According to estimation theory, the most likely solution is expected to be found in the densest part of the solution cluster, around the maximum of the probability density function (i.e. maximum likelihood estimation). The best match solution (i.e. solution corresponding to the minimization of the model-  Figure 2 clearly indicates that the solutions to a given value of CR 355 nm do not necessarily form a tight cluster in the size distribution parameter space. This is due to the non-linearity of the model and hence, to the high sensitivity of the calculated BC to the input size distribution parameters. In 20 the same way, the other model-simulated optical properties, BC or CR 1064 nm , can also have highly non-linear dependencies on the size distribution parameters. As shown by Eqs. (1) and (2), the only possible linear relationship at first sight is the dependency of BC on N o . BC have a somewhat exponential dependency on r m and σ but this dependency varies in a complex way in the size distribution parameter space. When σ 25 gets close to 1, the particles tend to have their radius tightly scattered around the median radius. This narrow size distribution enhances the behaviour of the Mie differential backscattering cross section as a function of r, and, when σ is close to 1, its oscillations drive the backscatter as can be seen on Fig. 2 (Bohren and Huffman, 1983 The final cluster of possible model solutions is the intersection of 5 different solution clusters corresponding to the 5 optical quantities (=3 BC+2 CR). The resulting 3-D surface of possible model solutions can be very convoluted.
Multiple sensitivity tests have shown that, in several cases, the sole best match approach leads to somewhat unrealistic values of N o , r m and σ with, in particular, σ being 5 close to 1. Indeed, the surface of J=f (N o , r m , σ) sometimes exhibits a deep and extremely localised minimum, but the realistic solutions are mostly found in a very broad but shallower minimum. When looking at the cluster of possible model solutions in these cases, the best match solution (i.e. minimum of J=f (N o , r m , σ)) is found on the edge or even completely disconnected from the cloud of possible solutions in the size distribution parameters space. This is confirmed by other numerical experiments using model-simulated BC as measurements. In this setup, the synthetic measurements are produced by adding random errors to the model-calculated BC. The true solution is the set of size distribution parameters taken as input to the model. In several cases, depending on the amplitude of the added random errors, the best match solution can 15 be localised in a deep minimum area far away from the true solution. Therefore, on its own, the best match criterion guarantees neither unicity nor optimality of the solution because of the high non-linearity of the model.
In order to look for a model solution in the densest part of the possible solution cluster, the cluster is filtered based on its spread. The standard deviations σ No , σ rm , and σ σ 20 on the 3-D solution cluster are first calculated. Then, any possible solutions outside the ±σ=(σ No , σ rm , σ σ ) envelope around the cluster centre are discarded. The coordinates of the cluster centre are taken as those of the median values of N o , r m and σ because the median values are less sensitive to anomalous solution points than the mean values. Overall, this data filtering ensures removal of most unrealistic possible solutions. 25 As expected, test simulations indicate that stable model solutions are associated with a large number of possible solutions within the clusters. Therefore, we also check the number of possible solutions left in the filtered cluster and proceed to the next step only when there is a significant fraction of the look-up (typically one hundred points). This issue is of course linked to the amplitude of lidar BC errors and to the resolution of the look-up table.
The model solution is finally searched within the filtered cluster of possible solutions. The 3-D size of the initial solution cluster and, hence, the filtered cluster are determined by the amplitude of the lidar errors that are not known accurately. We prefer to use the 5 best match approach (i.e. minimum of the model-measurement, as quantified by the scalar function J) to determine the final model solution instead of an average of the solution cluster, because the best match solution is not sensitive to the size of the solution cluster (except when the errors are assumed to be excessively small and so the realistic model solution may not be found within the searched domain in some 10 particular cases), whereas the mean or median values of the cluster can fluctuate with the cluster size. Overall, the uncertainties on the lidar errors are not critical for the best match model solution but they do influence the estimated errors in the retrieved size distribution parameters. Indeed, the standard deviations on the filtered cluster are taken as estimates of the size distribution retrieval errors. These estimates are 15 possibly lower limits on the retrieval errors because the size of the cluster is reduced by the filtering and model errors are ignored. Note that the points within the cluster are not exactly normally distributed. Therefore, we could have estimated the retrieval error with percentiles that do not require any assumptions on the shape of the solution PDF. However, simple tests have shown that the use of the standard deviation or the 20 75 percentile does not have a very significant influence on the final estimation.
If the inverse problem was properly posed, the best match solution and the cluster centre should be very close. In order to be consistent with estimation theory, a final and slight adjustment is carried out using the lidar error as an adjustable parameter. Indeed, lidar errors are notoriously difficult to estimate (ASAP, 2006). The estimated 25 errors in the measured BC are uncertain by at least 20%. Therefore, the distance between the best match solution and the cluster centre (defined by the median values) in the size distribution parameter space is minimized by varying the BC errors ∆β lidar . (5) where X best−match and X center are the size distribution parameter X of the best match solution and of the cluster centre (defined as the median value), respectively; σ X,cluster is the standard deviation of the size distribution parameter X on the solution cluster.

5
The uncertainties on the BC errors are assumed to be 20% typically. Therefore, if the BC error is 12.5% for example, the distance D is minimized by varying the BC error between 10% and 15%. The main objective of the whole procedure is to ensure that the retrieval algorithm produces realistic and stable model solutions (i.e. size distribution parameters) that are weakly sensitive to the estimated BC errors as well as being 10 consistent with the estimation theory.

Evaluation against size-resolved measurements
The first evaluation of the retrieval algorithm is based on more or less coincident lidar and balloonborne size-resolved measurements of a stratospheric thick cloud observed above ALOMAR and described in Deshler et al. (2000). The lidar measurements pro- 15 vide the input data for the retrieval algorithm and the size-resolved measurements provide the reference values for an independent evaluation of the retrieved size distribution.

PSC Observations
On 23 January 1996, a thick cloud was detected by lidar above ALOMAR. This cloud 20 was present between 19 and 26 km. The lidar signal was depolarized between 23 and 25 km. The cloud was also probed at about the same time with an Optical Particle Counter (OPC) instrument that provided size distribution measurements. Centre for Meteorological Weather Forecasts (ECMWF) analyses indicated favourable conditions for PSC formation. A detailed description of the temporal evolution of the lidar measurements and of the cloud structure can be found in Hanson and Hoppe (1997). Measurements were carried out at several altitudes within the broad cloud layer, both in the depolarizing and non-depolarizing regions of the cloud. These sets 5 of lidar and size-resolving measurements appear to be excellent opportunities to validate our size distribution retrieval methodology. Around 22.5 km, in a non-depolarizing (spherical particles) region of the cloud, Deshler et al. (2000) identified a type Ib PSC whose measured size distribution could be fitted with a unimodal lognormal distribution. This type Ib PSC represents an ideal case of validation because it fulfils the conditions 10 of sphericity for the particles and unimodality for the size distribution. The size-resolved measurements of the other PSC layers indicated bimodal distributions with some being non-depolarizing. In the depolarizing (non-spherical particles) layers, type Ia or even type II PSC were identified. To check the influence of the sphericity and unimodality conditions on the results, we also perform size distribution retrievals on theses cases 15 where the retrieval system is a priori expected to perform poorly. One of the aims is to identify the size distribution parameter whose retrieval is most affected by the nonrespect of these conditions. All the lognormal parameters of the measured (reference) and retrieved size distributions are gathered in Table 1.

Lidar data 20
Lidar signals are averaged between 17:50 and 18:30 UTC in order to be coincident with the in situ balloon-borne size-resolving measurements, as described in Deshler et al. (2000). It is worth pointing out that, even if the lidar and OPC measurements were truly coincident, a BC derived from lidar data averaged over 40 min is not expected to be equivalent to a BC calculated from the corresponding particle size distribution 25 averaged over the same time interval unless the particle distribution does not vary temporally. The BC at 355 and 532 nm are retrieved using the vibrational Raman N 2 scattering detected at 387 and 607 nm, respectively. As the association of elastic and 8927 Introduction

Back Close
Full Screen / Esc

Printer-friendly Version
Interactive Discussion inelastic scattering allows the determination of the particle extinction, there is no assumption on the value of the lidar ratio (extinction over backscatter) in the inversion of the equation linking the BC and the lidar signal. The lidar inversion still requires the choice of a reference altitude at which the BC is supposed to be known (clear-air assumption) (Ansmann et al., 1990;Ansmann et al., 1992aAnsmann et al., , 1992b. Figure 3 shows the 5 backscatter ratios profiles at 355, 532 and 1064 nm, defined as the total backscatter divided by the Rayleigh (molecular) contribution. These profiles can be compared to those obtained in Deshler et al. (2000) who used the Klett method for the three wavelengths assuming a linear relation between extinction and backscatter (Klett, 1981(Klett, , 1985. There is still an excellent agreement between the two inversion techniques. We 10 estimate the errors in our determination of the backscatter coefficient to be about 10% at 355 and 532 nm, and 20% at 1064 nm.

Non-depolarizing unimodal size distribution
The unimodal type Ib PSC was observed at about 22.5 km. The temperature was about 186 K according to the measurements. This temperature is used here as input 15 to the microphysical model. The volume mixing ratios in total HNO 3 and H 2 O required by the model are set to the typical values of 10 ppbv and 4.5 ppmv, respectively. The ranges of size distribution parameters considered by the retrieval algorithm are as follows: 10 −3 <N o <20 cm −3 , 10 −3 <r m <3 µm and 1.01<σ<2. After applying the retrieval algorithm to the BC and CR data, we obtain N o =11.4 cm −3 (9%), r m =0.30 µm (5%) 20 and σ=1.44 (2%) with the numbers in brackets being the estimated retrieval errors.
The retrieved values can be compared with the OPC measurements : N o =7.71 cm −3 , r m =0.29 µm and σ=1.45. The r m and σ values agree extremely well, around a few percent, which is within the error bars. In contrast, the N o values differ by about 50% which is a factor 5 higher than the estimated retrieval error in N o (9%). As stated before, BC 25 and CR are generally less sensitive to N o than r m and σ. Therefore, the retrieval of N o is expected to be less accurate.
From and V =1.4 µm 3 .cm −3 with uncertainties of around 30%, with 20% originating from the OPC measurement errors and 10% from errors in the refractive index (Deshler et al., 2000). Our A and V values differ by about 60% from the reference values, which is almost greater than the total (= size distribution retrieval + OPC) errors. Although A and V are most sensitive to r m (power law dependency), the 60% bias on A and V mostly 10 originates from the bias in N o . Indeed, the respective 3% and 1% biases on r m and σ only lead to a 5% error in A and V . It is not totally surprising that BC are less sensitive to N o and hence, that N o is the most difficult parameter to retrieve. The Mie scattering kernels in the spectral range of interest (from 0.385 to 1.02 µm) show that lidar data are most sensitive to particles in the 0.1-1 µm range (Fig. 4.28 in ASAP, 2006). As a 15 result, the moments that depend on smaller particles (such as the N o concentration) are less likely to be accurately retrieved.
To check the stability of the solution, the retrieval is performed with BC errors ranging from 5% to 50%. The solution does not change from around 10% to 50% lidar errors. When lidar errors are lower than 10%, another solution (N o =11 cm −3 , r m =0.32 µm and 20 σ=1.39) is identified. However, this solution has to be discarded because of the very low number of points present in the solution cluster. Besides, lidar errors are unlikely to be lower than 10%. Another test is carried out with a look-up table that does not cover the range of the reference OPC values. As expected, no solutions are identified. Overall, it can be concluded that the algorithm appears to perform well, especially when retrieving the size distribution parameters r m and σ, when applied to unimodal liquid PSC distributions. The total number of particles N o is found to be the most difficult parameter to retrieve accurately.  (Deshler et al., 2000). The values of the size distribution parameters of the two modes are given in Table 1b. Using them as inputs to the optical model, the 5 simulated BC of the small particle mode is found to be more than a hundred times higher than the BC of the large particle mode at all the considered wavelengths. The retrieval algorithm is applied at the 21.8 km altitude for the corresponding pressure and temperature and for the same conditions as above (total HNO 3 and H 2 O, considered ranges of size distribution parameters and lidar errors  (2000) analysed the measured size distribution and the associated particle volume to arrive at a type Ia (Nitric Acid Trihydate) PSC. As for the case of the liquid bimodal distribution previously considered, optical model calculations indicate that the optical signal of the small particle mode is largely dominant over the signal of the large particle mode. As solid PSC are composed of non-spherical particles, this case is used to test and V ≈12 µm 2 .cm −3 . The OPC-derived parameters for the small particle mode were N o =8.2 cm −3 , r m =0.26 µm and σ=1.39 with A≈9.0 µm 2 .cm −3 and V ≈1.0 µm 3 .cm −3 . In contrast to the previous results, the agreement is very poor for r m and σ but it is satisfactory for N o . The very large discrepancies on r m and σ lead to retrieved values of A and V differing from the OPC-derived values by a factor 5-10. Note that the refractive index value was calculated for STS particles, whose value is close to the NAT value. We also performed a retrieval forcing the refractive index to be equal to the PSC type II refractive index value as inferred in Scarchilli et al. (2005). But the overall results 15 do not change very significantly. The size distribution was found to be N o =7 cm −3 , r m =0.53 µm and σ=1.75 with A≈46 µm 2 .cm −3 and V ≈18 µm 2 .cm −3 . Accounting for a more accurate refractive index does not improve the retrieval in this case.
As expected, using Mie theory for non-spherical particles results in erroneous retrieved parameters. For the retrieval of solid PSC size distributions, it is necessary to 20 replace the Mie optical model by a T-matrix optical model in the algorithm (Mishchenko and Travis, 1998), possibly making the particle asphericity (for example the aspect ratio) one of the control variables along with N o , r m and σ.

Refractive index
An important issue in the retrieval is connected to the modelling of the refractive index 25 m. The backscatter coefficient is actually most sensitive to m, but its possible range of values is relatively limited (1.3-1.6) (Deshler et al., 2000;Luo et al., 1996;Scarchilli et , 2005). For the case of the non-depolarizing unimodal distribution, the STS equilibrium composition was calculated for an assumed 4.5 ppmv H 2 O and 10 ppbv HNO 3 giving a refractive index of 1.45 at 355 nm and reducing to 1.43 at 1064 nm (Luo et al., 1996). Deshler et al. (2000) inferred an m value of 1.47±0.03 for the same PSC from a comparison between scattering and size-resolved aerosol concentration mea-5 surements. They also concluded that this value was higher than expected on the basis of laboratory experiments, measurements in Antarctica and theory. In order to estimate the impact of this m uncertainty on the results, another retrieval is performed on the same case, but with the refractive index calculation replaced by setting m to 1.48, 1.46 and 1.51 at 355, 532 and 1064 nm wavelengths, respectively, as provided in 10 Deshler et al. (2000). The retrieved parameters are N o =6.7 cm −3 (22%), r m =0.32 µm (6%) and σ=1.38 (2%) with A=10.6 µm 2 .cm −3 (40%) and V =1.5 µm 3 .cm −3 (50%). Let us recall that the OPC reference values were N o =7.71 cm −3 , r m =0.29 µm and σ=1.45 with A=10.4 µm 2 .cm −3 and V =1.4 µm 3 .cm −3 . The quality of the retrieval is slightly degraded for r m and σ, but it improves drastically for N o . The improvement is even more 15 noticeable for A and V ; the retrieved values are now within 1% and 7%, respectively, of the reference values instead of the 60% difference initially obtained when the refractive index was calculated by the model (Sect. 4.3). Although the change in m seems to be very marginal (about 0.03 or 2%), it has a profound effect on the retrieval results, confirming by the way that the backscatter coefficient is highly sensitive to m. The results 20 obtained by bypassing the modelling of the composition and refractive index suggest that most of the bias in the initial retrieval was due to errors in m and not to problems in the Monte Carlo approach. This further validates our retrieval methodology.
If the refractive index inferred by Deshler et al. (2000) is taken as a reference value, the bias in m could originate from either errors in the relationship linking m to the 25 particle composition (Luo et al., 1996) or uncertainties in modelling the composition. The input variables are the temperature, pressure and total HNO 3 and H 2 O mixing ratios. The temperature is taken from in situ measurements and therefore should not carry a large uncertainty. The volume mixing ratio of total HNO 3 and H 2 O were set to The results are very close to those presented in Sect. 4.3. The effect of H 2 O and temperature uncertainties on the model-calculated refractive index is further illustrated in Fig. 4. Size distribution parameters and altitude are set to the reference values of the liquid unimodal distribution given in Table 1. The curves show that the uncertainties on the temperature and H 2 O can account for, at best, a 5.10 −3 change in m which is way 10 insufficient to explain the 0.03 discrepancy with the refractive index derived by Deshler et al. (2000). It is difficult for us to pinpoint the exact reasons for the discrepancy on this particular case because of several possible sources of biases, ranging from the lack of spatio-temporal coincidence between the lidar and size-resolving measurements, errors in the composition model itself (for example, the assumption of thermodynamical 15 equilibrium) or errors in the relationship between composition and m (Luo et al., 1996). In addition, the refractive indices of PSC do not appear to be known with an accuracy better than 0.03 (Scarchilli et al., 2005).

Evaluation against a simplified estimation methodology
Our results are then compared to values retrieved with another approach that is similar 20 but does not consider the effect of the non-linearity of the model on the likelihood of the solution (Blum et al., 2006). The test case is a liquid PSC observed over northern Scandinavia on 5 January 2005. Blum et al. (2006) and first searched for σ and r m by fitting the 2 CR and then derived N o by fitting the 3 BC. They did not consider the errors in the lidar measurements and retrieved the parameters by simply looking for the minimum of the misfit between model simulations and measurements (i.e. best match solution). The size distributions were found to be very narrow throughout the 5 km Interactive Discussion thick cloud. Retrieved values of r m and σ were more or less constant at 0.3 µm and 1.04, respectively, whereas N o varied between 2 and 20 cm −3 . We processed the lidar signal using vibrational Raman detection for determining the BC and rotational Raman detection at 529 and 530 nm to get the in situ temperature using the Rotational Raman Technique (Cooney and Pina, 1976;Nedeljkovic et al., 1993). The determination of 5 a simultaneous and collocated temperature profile improved the accuracy of both the BC and the chemical composition modelling. The volume mixing ratios of total HNO 3 and H 2 O are set to the typical values of 10 ppbv and 4.5 ppmv, respectively. Lidar backscatter errors are increased to 20% at 355 and 532 nm to account for differences in the inversion procedure with the BC retrieval of Blum et al. (2006) The full retrieval 10 procedure gives N o =16 cm −3 (20%), r m =0.26 µm (7%) and σ=1.27 (5%) in the middle of the PSC layer, at 21 km. The difference with the results of Blum et al. for the σ value is well beyond any possible retrieval errors. When the measurement errors are ignored (i.e. skipping the cluster filtering and statistical error analysis), our retrieval becomes a search for the best match solution only. This simplified version of our retrieval algorithm 15 is then applied to the same lidar data, but the cluster size is still used to estimate the retrieval errors. The retrieved parameters become N o =9 cm −3 (47%), r m =0.34 µm (18%) and σ=1.04 (12%) which is in excellent agreement with the results of Blum et al. (2006).
The results clearly illustrate the importance of the lidar errors and of the model non-20 linearity in the choice of the best estimator. The best match solution is found to be outside the ±σ boundary of the initial solution cluster. It corresponds to a very localised and deep minimum in the surface of the model-data misfit function (see Fig. 5). But the highest density of possible solutions (when lidar errors are accounted for) is found in a very broad but shallower minimum in that case. The advantage of the filter- 25 ing can be further justified by performing a retrieval in which the backscatter at 532 nm is substantially biased on purpose (i.e. increasing BC 532 nm by 20%). To be consistent, the error in BC 532 nm is also increased by 20%. BC 532 nm that are completely different, well beyond the retrieval error bars, from those produced with the unbiased BC 532 nm . In contrast, taking into account the retrieval error bars, the results of the full retrieval with the artificially biased BC 532 nm are fully consistent with the full retrieval with the unbiased BC 532 nm . Our retrieval algorithm shows less sensitivity to the specified lidar errors and a much greater convergence and sta-10 bility when the statistical filtering is applied.

Summary and conclusions
This paper introduces a particle size distribution retrieval method developed to take advantage of the numerous 3-wavelength lidar setups. It is based on comparing measured and model-simulated lidar backscatter coefficients. The retrieval algorithm mini-15 mizes a cost function of the misfit between measurements and model simulations with the control variables being the parameters of the PSC size distribution. In similar approaches, the estimator is often chosen as the sole "best match" estimator (i.e. best estimation corresponding to the best fit between modelled and measured backscatter coefficients). The errors in the lidar backscatter coefficients and associated colour ra-20 tios are rarely used as input parameters in the retrieval procedure. This may be critical in that the retrieval problem is highly non linear, and, on its own, the best match criterion guarantees neither unicity nor optimality of the solution. Our approach looks for the most likely estimation, using the statistical properties of the solution cluster given the lidar uncertainties. This solution is expected to be found in the densest part of the solution cluster, around the maximum of the probability density function. With our retrieval procedure, we aim at producing realistic and stable estimators of the size dis-8935 Introduction We use a microphysical model to calculate the refractive index from the particle composition, taking into account sulphuric acid aerosol (binary H 2 SO 4 /H 2 O solution) and liquid type Ib PSC particles (ternary STS solution). We then generate a look-up table of backscatter coefficients as a function of the particle size distribution parameters. Modelling the composition allows us to account for the rapid variation of the particle composition, and hence, the refractive index around the PSC type Ib temperature threshold. This refractive index modelling requires the local temperature to be known accurately because the temperature remains the key parameter regarding PSC 10 formation.
To be able to resolve the inverse problem from measurements at only three wavelengths, the size distribution is assumed to be unimodal. The particles are also assumed to be spherical, due to the use of Mie theory in the optical modelling. The model errors are ignored. Our methodology was described and validations were performed 15 against size-resolved OPC measurements and against a similar approach. The first validation case is a PSC event observed above ALOMAR on 23 January 1996. OPC size distribution measurements performed by Deshler et al. (2000) on this PSC provided the reference validation dataset. Due to photo-counting statistics, calculations from OPC measurements suffer from an error of about 20%. 20 In regions where the criteria of unimodality and sphericity are fulfilled, the parameters of size distribution of PSC type Ib particles are mostly correctly retrieved, with a respective 3% and 1% error in the median radius (r m ) and standard deviation (σ) (see Table 1a). The total number of particles N o is found to be the most difficult parameter to retrieve accurately, with an error of around 50%. In addition, we ran our retrieval 25 algorithm at other altitudes, where non-depolarizing (spherical particles) bimodal size distributions are observed, to check the influence of the assumption of unimodality. When retrieving the size distribution of this bimodal particle population, the retrieved parameters mainly correspond to the particle mode that dominates the lidar backscat-  Table 1b). Overall, the results appear to be satisfying for spherical particles. When considering the depolarizing region of the PSC, as expected, the retrieved parameters strongly differ from the OPC reference values. It is clear that the assumption of spherical particles appears to be critical (see Table 1c). Accounting for the particle 5 asphericity in the optical model could allow our algorithm to be applied to solid particles (PSC type Ia or type II).
Model errors are not accounted for in our retrieval algorithm. However, the refractive index is a critical parameter in the calculation of the aerosol optical properties. The model-calculated refractive index of liquid particles in the case observed in 1996 was 10 found to be 0.03 lower than the one inferred in Deshler et al. (2000) at the same altitudes. Using this inferred refractive index value in the optical model, the retrieved parameters are found to be in excellent agreement with the OPC measurements, with both the surface area density and aerosol volume carrying less than 10% error as compared to the reference values. This suggests that most of the errors in the retrieval 15 originate from uncertainties on the refractive index rather than errors in the retrieval methodology. Note that our estimation of the retrieval errors does not take into account the errors involved in the optical modelling such as errors in the chemical composition required in refractive index calculations or errors in the scattering theory.
Our results are then compared to values retrieved with another approach that is sim-20 ilar but does not consider the effect of the non-linearity of the model on the likelihood of the solution: Blum et al. (2006) retrieved the size distribution a liquid PSC observed over northern Scandinavia on 5 January 2005 using comparison between measured and model-simulated colour ratios, the solution being the best match. We performed retrieval tests with and without the cluster filtering approach on the PSC case described 25 in Blum et al. (2006). An excellent agreement is found between their values and values retrieved with our algorithm without any statistical filtering of the solution cluster. However, the value of the geometrical standard deviation appears to be unlikely, being very close to 1 (σ≈1.04). When the full retrieval algorithm is applied (statistical Introduction Interactive Discussion filtering), our results are found to be significantly different with a more realistic value of the standard deviation (σ≈1.27). Additional tests were performed with synthetic lidar measurements generated a priori with the optical model using specific values of size distribution parameters. These synthetic measurements are then used as inputs to the retrieval algorithm. Even with a 20% artificial bias on one of the backscatter 5 coefficients, the full algorithm is able to retrieve the original size distribution parameters with a reasonable accuracy. Our full retrieval procedure does not appear to be very sensitive to the specified lidar errors. Overall, our statistical approach produces reliable estimates of liquid particle size distributions from only three lidar backscatter coefficients.
10 Table 1. Reference and retrieved size distribution parameters on the PSC case observed on 23 January 1996. On the left part, OPC size distribution reference values in Deshler et al. (2000). On the right part, retrieved values in case of liquid (a and b) and solid particles (c). The associated retrieval errors in the estimated parameters are bracketed.   The purely best match solution is found within the encircled area. Taking lidar errors into account, the densest part of the solution cluster is found to be within the squared area.