Evaluation of upper tropospheric humidity forecasts from ECMWF using AIRS and CALIPSO data

An evaluation of the upper tropospheric humidity from the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecast System (IFS) is presented. We first analyze the spinup behaviour of ice supersaturation in weather forecasts. The analysis shows that a spinup period of at least 12 h is necessary before using forecast humidity data from the upper troposphere. We compare the forecasted upper tropospheric humidity with coincident relative humidity fields retrieved from the Atmospheric InfraRed Sounder (AIRS) and with cloud vertical profiles from the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO). The analysis is made over one year, from October 2006 to September 2007, and we discuss how relative humidity and cloud features appear both in the IFS and in the observations. The comparison with AIRS is made difficult by the vertical resolution of the sounder and the impossibility of retrieving humidity for high cloudiness. Clear sky relative humidities show a rather good correlation, whereas cloudy situations mostly reflect a dry bias of AIRS that increases with the relative humidity. The comparison with CALIPSO shows that the IFS predicts high relative humidity where CALIPSO finds high clouds, which supports the good quality of the ECMWF upper tropospheric cloud forecast. Finally, we investigate the presence of ice supersaturation within low vertical resolution pressure layers by comparing the IFS outputs for high-resolution and low-resolution humidity profiles and by simulating the interpolation of humidity over radiosonde data. A new correction method is proposed and tested with these data.
Correspondence to: N. Lamquin (nicolas.lamquin@lmd.polytechnique.fr)


Introduction
Ice supersaturation in the upper troposphere is an explicit feature in the Integrated Forecast System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF), operational since 13 September 2006 (IFS cycle 31r1). This new feature, introduced by Tompkins et al. (2007), has produced some changes in the statistics of upper tropospheric humidity and cloud fraction in the IFS. In particular, there is an increase in upper-tropospheric humidity, a decrease in high-level cloud cover and, to a much lesser extent, in cloud ice amounts. First analyses of the frequency distribution of the modelled supersaturation values showed good agreement with a climatology derived from in situ aircraft observations (Tompkins et al., 2007). The global distribution of frequency of occurrence of supersaturated regions in the new scheme compares well with remotely sensed microwave limb sounder (MLS) data, with the most marked underprediction occurring in regions where the model is known to underpredict deep convection (Tompkins et al., 2007).
A major inconsistency in the IFS is that the tangent linear and adjoint models used in the data assimilation scheme do not involve ice supersaturation (Janisková et al., 2002), while the forecast model does. The only way an analysis (i.e. the initial state for a new forecast run) can obtain supersaturation is during the final trajectory integration, which uses the full physics of the forecast model. In other words, forecasts are initialized with states that usually have less ice supersaturation than the forecast for the same time from the day before. Hence, there is a supersaturation spinup in the forecast runs, and we will first present an analysis of it. The present paper aims at further tests of the new supersaturation scheme regarding the simulated upper tropospheric humidity. The ECMWF tropical water vapour has been assessed by Luo et al. (2007) with the help of Measurements of OZone and water vapour by in-service AIrbus airCraft (MOZAIC) data, but the forecast data did not include the new ice supersaturation scheme. For our purpose we first compare the IFS forecasts with collocated Atmospheric InfraRed Sounder (AIRS) retrievals of relative humidity within pressure layers of 200-250, 250-300, 300-400, and 400-500 hPa. For the comparisons, we mostly use data on standard pressure levels. These data are interpolated from much higher resolved model level data, and it is conceivable that some supersaturation events will get lost in the interpolation procedure. To explore the effect of the interpolation further and to assess the occurrence of ice supersaturation using data with low vertical resolution, we compare relative humidity using ECMWF forecasts of high and low resolution, and we also simulate the averaging effect by using radiosonde data from Lindenberg, Germany.

Data handling

ECMWF forecasts
ECMWF provides global weather forecasts twice daily (00:00 and 12:00 UTC) with output time steps of 3 h in the first 3 days and 6 h from then on to day 10. The model uses a horizontal spectral resolution of T799 (corresponding to a resolution of about 25 km at the equator) and 91 layers in the vertical (corresponding to a resolution of about 15 hPa). For our studies we extracted temperature, humidity and cloud cover data from the upper troposphere (500 to 200 hPa) in a regular 0.5° × 0.5° grid. We analyzed one year (from October 2006 to September 2007) of data on the standard pressure levels (200, 250, 300, 400, 500 hPa) and January and July 2007 on model levels, which we interpolated on a vertical grid with a higher resolution of 25 hPa. We first computed relative humidity from the specific humidity and temperature on model levels and then made the vertical interpolation. For the comparison with the satellite data, layer values instead of level values of relative humidity are obtained by averaging the lower and upper edges for each standard pressure layer. This procedure is summarized in Fig. 1.
Figure 2 shows the difference between relative humidity with respect to ice obtained at high vertical resolution and the one obtained by interpolation to lower resolution, as a function of relative humidity at high vertical resolution. The figure shows that data produced with different interpolation procedures differ typically by a few percent (in RHi units).
Fig. 2. Two-dimensional histogram of the difference between the interpolated low-resolution relative humidity and the original high-resolution relative humidity (y-axis) against the high-resolution relative humidity (x-axis) at standard pressure levels 200, 250, 300, 400, and 500 hPa. The probability is normalized by the maximum value (in percent) and the colour scale is limited to 50%, with values higher than 50% represented by the same colour. All levels combined, January and July 2007 combined. The solid line represents the binned mean; dashed lines represent the binned mean ± standard deviation.
The new supersaturation scheme is only in effect at temperatures lower than 250 K. Relative humidity is calculated with respect to the ice phase (RHi) in this temperature regime. Thus comparisons are only made for temperatures lower than 250 K.
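The conversion from specific humidity and temperature to RHi, followed by the layer averaging of the two edge levels, can be sketched as follows. This is only an illustration under stated assumptions: the Magnus-type saturation formula over ice is a common stand-in and is not necessarily the formula used in the IFS post-processing, and the function names are hypothetical.

```python
import math

EPSILON = 0.622  # ratio of the molar masses of water vapour and dry air


def e_sat_ice_hpa(temp_k):
    """Saturation vapour pressure over ice in hPa (Magnus-type fit).

    Assumption: used here as a stand-in for the formula actually
    applied to the forecast data.
    """
    t_c = temp_k - 273.15
    return 6.112 * math.exp(22.46 * t_c / (272.62 + t_c))


def rhi_percent(q, temp_k, p_hpa):
    """RHi in percent from specific humidity q (kg/kg), T (K) and p (hPa)."""
    # Invert q = eps * e / (p - (1 - eps) * e) for the vapour pressure e.
    e = q * p_hpa / (EPSILON + (1.0 - EPSILON) * q)
    return 100.0 * e / e_sat_ice_hpa(temp_k)


def layer_rhi(rhi_lower_edge, rhi_upper_edge):
    """Layer value as the average of the two bounding pressure levels."""
    return 0.5 * (rhi_lower_edge + rhi_upper_edge)
```

For example, `layer_rhi(rhi_percent(q300, t300, 300.0), rhi_percent(q250, t250, 250.0))` would give the 250-300 hPa layer value, where `q300`, `t300`, `q250` and `t250` are placeholder inputs on the two levels.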

A-Train data and collocations with ECMWF
The A-Train satellite constellation (Stephens et al., 2002) provides a large panel of new instruments and offers increased possibilities to understand the Earth's atmosphere and climate. Among these instruments, AIRS has provided temperature and humidity profiles since May 2002, and CALIPSO has provided cloud vertical profiles since July 2006. For our purposes we will use observations made by these two instruments to evaluate the ECMWF forecasts. Differences in spatial resolution and sampling of both instruments have led to different collocation schemes, which are described in the following.
Onboard the Earth Observing System (EOS) satellite Aqua, AIRS provides very high resolution measurements of Earth emitted radiation in three spectral bands from 3.74 to 15.40 µm, using 2378 channels, at 01:30 and 13:30 local time (LT). The spatial resolution of these measurements is 13.5 km at nadir. Nine AIRS measurements (3×3) correspond to one footprint of the Advanced Microwave Sounding Unit (AMSU). AIRS level 2 (L2) standard products include temperature at 28 pressure levels from 0.1 hPa to the surface and water mixing ratios w within 14 pressure layers from 50 hPa to the surface (Susskind et al., 2003, 2006). These atmospheric profiles were retrieved from cloud-cleared AIRS radiances (Chahine et al., 2006) within each AMSU footprint, at a spatial resolution of about 45 km, which is close to that of ECMWF. Validations with radiosonde data from the NOAA NESDIS operational meteorological database archive (Divakarla et al., 2006) have shown that the accuracy is close to 1 K in 1 km layers for temperature and better than 15% in 2 km layers for water vapour. However, Tobin et al. (2006), using Atmospheric Radiation Measurement (ARM) data, have shown that the uncertainty in the upper troposphere can increase to 30-35% in the midlatitudes. The nominal vertical resolution of temperature and humidity profiles is about 2 to 3 km in the upper troposphere (Gettelman et al., 2004; Read et al., 2007).
We use version 5 of AIRS L2 data and retrieve RHi as in Lamquin et al. (2008). Specific humidity $q$ of an atmospheric layer is obtained from the given mixing ratio with $q = w/(1+w)$, whereas temperatures are given at the top and bottom of each layer. Therefore RHi is determined over the atmospheric layer, as in Stubenrauch and Schumann (2005). The saturation specific humidity with respect to ice, $q_s^i$, integrated over the pressure layer, is obtained from the saturation water vapour pressure with respect to ice, $p_s^i$ (Sonntag, 1990). The latter is determined in steps of 1 hPa from the linearly interpolated temperature profiles within the pressure layer between the boundaries $p_1$ and $p_2$. Then $\mathrm{RHi} = q/q_s^i$. The uncertainties related to the AIRS relative humidity can be quantified from the uncertainties of $w$ and $T$ given in the L2 products. We have $\Delta\mathrm{RHi}/\mathrm{RHi} = \Delta q/q + \Delta q_s^i/q_s^i$. $\Delta q$ arises from the uncertainty on the mixing ratio $w$. $\Delta q_s^i$ is determined by computing $q_s^i$ with $T(p_1) \pm \Delta T(p_1) = T^{\pm}(p_1)$ along with $T(p_2) \pm \Delta T(p_2) = T^{\pm}(p_2)$. This leads to four values of $\Delta q_s^i = |q_s^i(T(p_1), T(p_2)) - q_s^i(T^{\pm}(p_1), T^{\pm}(p_2))|$, from which we take an average. Figure 3 presents $\Delta\mathrm{RHi}$ as a function of RHi for July 2007 over Europe, with standard deviations shown as error bars. The influence of $\Delta q_s^i$ is fairly small (maximum 1% in RHi) since the uncertainties on $T$ are about 1 K. Figure 3 shows an increase of the absolute uncertainty with RHi, but the relative uncertainty (not shown), $\Delta\mathrm{RHi}/\mathrm{RHi} \times 100$, shows only a small increase from about 30% to about 35%, which agrees with Tobin et al. (2006). [...] Qual_H2O =2 and PGood > 600 hPa from AIRS L2 quality flags (Susskind et al., 2006; Tobin et al., 2006). Please note that atmospheric profiles of good quality are only obtained when the atmosphere is not too cloudy (effective cloud cover lower than 70-80%). In addition, as suggested in Gettelman et al. (2004) and Read et al. (2007), layers for which the water vapour content is lower than the nominal instrument sensitivity (q = 20 ppmv) are rejected. RHi is kept for four pressure layers between 200 and 500 hPa: 200-250, 250-300, 300-400, and 400-500 hPa. Our analysis is made over Europe, a region where the tropopause is situated around 200-300 hPa, mostly depending on the season. Pressure layers are discarded when the tropopause lies inside the layer, to avoid biases due to the inversion of the temperature gradient at the tropopause.
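The layer retrieval of RHi described above can be sketched as follows. This is only an illustration under several assumptions that are not stated in the text: a Magnus-type fit replaces the Sonntag (1990) formula, the layer-integrated saturation specific humidity is taken as the simple mean over 1 hPa steps, temperature is interpolated linearly in pressure, and all function names are hypothetical.

```python
import math

EPSILON = 0.622  # ratio of the molar masses of water vapour and dry air


def e_sat_ice_hpa(temp_k):
    """Saturation vapour pressure over ice in hPa (Magnus-type stand-in)."""
    t_c = temp_k - 273.15
    return 6.112 * math.exp(22.46 * t_c / (272.62 + t_c))


def q_sat_ice(temp_k, p_hpa):
    """Saturation specific humidity over ice (kg/kg) at one pressure."""
    e = e_sat_ice_hpa(temp_k)
    return EPSILON * e / (p_hpa - (1.0 - EPSILON) * e)


def layer_q_sat_ice(p1, p2, t1, t2):
    """Mean saturation specific humidity between p1 < p2 in 1 hPa steps.

    Assumption: T varies linearly with pressure between the boundaries.
    """
    n_steps = int(round(p2 - p1))
    total = 0.0
    for k in range(n_steps + 1):
        p = p1 + k
        temp = t1 + (t2 - t1) * (p - p1) / (p2 - p1)
        total += q_sat_ice(temp, p)
    return total / (n_steps + 1)


def airs_layer_rhi(w, p1, p2, t1, t2):
    """Layer RHi in percent from the mixing ratio w (kg/kg)."""
    q = w / (1.0 + w)
    return 100.0 * q / layer_q_sat_ice(p1, p2, t1, t2)


def mean_delta_q_sat(p1, p2, t1, t2, dt1, dt2):
    """Average of the four |q_s(T) - q_s(T +/- dT)| combinations."""
    ref = layer_q_sat_ice(p1, p2, t1, t2)
    diffs = [
        abs(ref - layer_q_sat_ice(p1, p2, t1 + s1 * dt1, t2 + s2 * dt2))
        for s1 in (-1.0, 1.0)
        for s2 in (-1.0, 1.0)
    ]
    return sum(diffs) / len(diffs)
```

The relative uncertainty of RHi would then follow by adding the relative uncertainty of `q` and `mean_delta_q_sat(...) / layer_q_sat_ice(...)`.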
AIRS L2 also provides a description of clouds in terms of cloud pressure and effective cloud cover for up to two cloud layers. A layer is declared cloudy when the pressure of the highest cloud layer is between the pressures at the layer's top and bottom; it is termed clear when no cloud is detected over the whole profile. Recent studies by Kahn et al. (2008) and Stubenrauch et al. (2008) have revealed an underestimation of AIRS L2 cloud pressure for low and middle clouds (p_AIRS > 440 hPa). Figure 4 presents distributions of differences between AIRS cloud pressure and CALIPSO pressure of the middle of the cloud for two retrievals: the L2 standard product and the LMD retrieval presented in Stubenrauch et al. (2008), separately for clouds with p_AIRS < 400 hPa and clouds with p_AIRS between 400 and 500 hPa. Statistics are shown for one year of AIRS-CALIPSO collocated data in the Northern Hemisphere (NH) midlatitudes. Due to the slight underestimation of the AIRS L2 cloud pressure in the layer between 400 and 500 hPa, we only keep data with clouds having L2 cloud pressure smaller than 450 hPa in this specific pressure layer.
Fig. 4. Normalized frequency of the difference in cloud pressure between AIRS and CALIPSO, p_AIRS − p(mid)_CALIOP (hPa), for clouds higher than 400 hPa (top) and clouds between 400 and 500 hPa (bottom). Cloud pressures from AIRS determined from LMD retrieval (plain) and from AIRS level 2 (dashed). NH midlatitudes, one year of data.
Aqua overflies at 01:30 and 13:30 LT, and the AIRS large swath (about 1700 km) makes it possible to find numerous events which are close in terms of time and location to ECMWF forecasts for 00:00 and 12:00 UTC. The geographical proximity is realized by associating events with a centre-to-centre distance $\sqrt{(\Delta\mathrm{lat})^2 + (\Delta\mathrm{lon})^2}$ smaller than 0.25°. Events are rejected when the time interval is larger than 30 min. One year of collocations then leads to a total amount of 325851, 527561, 377856, and 69754 events for the pressure layers 400-500, 300-400, 250-300 and 200-250 hPa, respectively. Note that further selections in the data regarding cloudiness reduce the statistics but still keep the amount significant (see Sect. 4.1 and Table 1). AIRS RHi and effective cloud cover data will be used to evaluate RHi from ECMWF.
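The collocation criterion can be sketched as a simple predicate. This is a minimal illustration, not the processing code used for the study: the function name and argument layout are hypothetical, times are assumed to be given in minutes on a common clock, and the wrapping of longitude differences across the date line is an added assumption. The same predicate with a 0.15° threshold would correspond to the CALIPSO collocation described further on.

```python
import math


def is_collocated(lat_a, lon_a, time_a_min, lat_b, lon_b, time_b_min,
                  max_dist_deg=0.25, max_dt_min=30.0):
    """True if two events are close enough in space and time.

    The distance is the plain Euclidean distance in degrees,
    sqrt(dlat^2 + dlon^2), as in the text.
    """
    dlat = lat_a - lat_b
    # Assumption: wrap the longitude difference into [-180, 180).
    dlon = (lon_a - lon_b + 180.0) % 360.0 - 180.0
    close_in_space = math.hypot(dlat, dlon) < max_dist_deg
    close_in_time = abs(time_a_min - time_b_min) <= max_dt_min
    return close_in_space and close_in_time
```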
The CALIPSO mission is a collaboration between the National Aeronautics and Space Administration (NASA) and the French National Center of Space Studies (Centre National d'Etudes Spatiales, CNES) (Winker et al., 2003). CALIPSO contains the Cloud-Aerosol LIdar with Orthogonal Polarization (CALIOP), which can discriminate the vertical distribution of water and ice clouds as well as aerosol masses (Sassen, 1991). The performance of CALIOP is summarized in Winker et al. (2007). The instrument provides an accurate vertical profile of backscattered radiation at 532 nm and 1064 nm at a vertical resolution of 60 m for altitudes between 8.2 and 20.2 km, where high clouds are situated. CALIOP is a nadir viewing instrument, and the width of each shot is about 70 m, sampled every 333 m along the track. The 5 km CALIPSO L2 cloud products provide the number of cloud layers as well as their vertical extent (top and base altitudes) averaged over 5 km, as long as the signal is not totally attenuated by thick clouds (with τ larger than 5; Winker et al., 2003). We use version 2 of CALIPSO L2 data, with geometrical cloud height transformed to pressure using meteorological atmospheric profiles of the Global Modeling and Assimilation Office (GMAO), which are available in the CALIPSO L1 data. The collocation of the ECMWF forecasts with CALIPSO does not provide as many events as the collocation with AIRS, because CALIOP only measures at nadir. Nevertheless, a sufficient number of collocated profiles (75280) was found. Again the time interval has to be smaller than 30 minutes, and an IFS grid box is associated with a CALIPSO pixel only when the centre-to-centre distance is smaller than 0.15°. This takes into account the fact that CALIPSO pixels are much smaller than the IFS grid boxes and that we only keep events close to the centre. One to four CALIPSO pixels fall within an IFS box. We will, however, consider these cases as independent. In Lamquin et al. (2008), collocations of AIRS and CALIPSO considered five pixels with an averaging of their vertical profiles, and no large difference was inferred by considering the pixels independently. Since CALIPSO does not provide information on the relative humidity, we will use the data as an indication of presence, position and vertical extension of clouds in relationship to RHi from ECMWF.

Radiosonde data
We use relative humidity data obtained from radiosonde measurements made at the meteorological observatory Lindenberg near Berlin, Germany, between February 2000 and April 2001. Usually, radiosonde humidity measurements in the upper troposphere have problems; in particular, they are subject to a dry bias. Several correction methods exist in the literature (e.g. Soden and Lanzante, 1996; Wang et al., 2002), and our data have been corrected, too. The data and their handling as well as the correction method have been described in detail by Nagel et al. (2001), Spichtinger et al. (2003a) and Leiterer et al. (2005). The vertical resolution of the data is approximately 50 m. Ice supersaturation can be detected and is present in 28% of the measured profiles, but it cannot be decided whether a data point was taken within a cloud or in clear sky. Most ice supersaturated layers are located between 200 and 450 hPa; their mean level is 300 hPa in summer and fall, and 340 hPa in winter and spring. For more details see Spichtinger et al. (2003a).

Supersaturation spinup
ECMWF uses an assimilation scheme that does not account for ice supersaturation in the upper troposphere. Hence, data assimilation leads to analyses that severely underestimate the true occurrence and range of ice supersaturation. Since the analyses serve as initial conditions for the forecast runs, the forecast model needs some time for spinup of the supersaturation field. Studies of upper tropospheric humidity should not use forecast humidity data from the spinup phase since they are unreliable.
We have investigated the spinup using forecasts from two months, namely October 2006 and April 2007. For every day, and for the noon and midnight forecast runs, we used forecasts up to day 3 and counted at each 3-h forecast step the number of grid boxes (in 0.5° × 0.5° resolution) on the 250 hPa level where RHi ≥ 100%. This number, divided by the total number of grid boxes per level and averaged over all forecast runs for that month, yields the fraction of supersaturated grid boxes, shown in the left hand panels of Fig. 5. The corresponding right hand panels show the average supersaturation in the grid boxes with supersaturation. The upper and lower curves in each panel are the mean plus/minus one standard deviation. We only studied global means, i.e. we did not look at single regions such as the northern midlatitudes, as we think one should use the humidity data only when the model has left the spinup phase everywhere. Figure 5 shows that the investigated quantities display similar spinup behaviour in the two months. There is a steep rise of the fraction of supersaturated grid boxes in about the first half day into the forecast. During that period the average supersaturation also shows an increase and a noisier behaviour than later. Analogous behaviour was found for April 2007 on the 200 and 300 hPa levels (only one month studied with these levels, not shown) and for January 2007 on the 250 hPa level (not shown). Hence a spinup period of at least 12 h is necessary before using forecast humidity data from the upper troposphere. For the following investigations we use 24 h forecasts.
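The two spinup diagnostics can be sketched as follows. This is a minimal illustration with hypothetical function names and a deliberately simple input layout (nested lists of RHi values in percent); following the text, grid boxes are counted without any area weighting.

```python
def supersaturation_stats(rhi_values, threshold=100.0):
    """Return (fraction of boxes with RHi >= threshold,
    mean supersaturation in percent within those boxes)."""
    excess = [r - threshold for r in rhi_values if r >= threshold]
    fraction = len(excess) / len(rhi_values)
    mean_excess = sum(excess) / len(excess) if excess else 0.0
    return fraction, mean_excess


def spinup_curve(runs):
    """Average supersaturated fraction per forecast step.

    runs: list of forecast runs; each run is a list with one entry per
    forecast step, and each entry is the list of RHi values on the level.
    """
    n_steps = len(runs[0])
    curve = []
    for step in range(n_steps):
        fractions = [supersaturation_stats(run[step])[0] for run in runs]
        curve.append(sum(fractions) / len(fractions))
    return curve
```

Plotting `spinup_curve(...)` against forecast lead time would give the kind of curve described for the left hand panels of Fig. 5.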
We note that the mean fraction of grid boxes with ice supersaturation is slightly higher than 10%, consistent with findings from airborne in situ data (Gierens et al., 1999). The mean supersaturation is of the order 6 to 10%, which is slightly lower than airborne and satellite measurements (e.g. Spichtinger et al., 2003b).

Evaluation of upper tropospheric humidity
In the following, we first compare RHi between ECMWF and AIRS in the upper troposphere for clear and cloudy situations. CALIPSO is used to investigate the effect of clouds in more detail. Finally, the effect of vertical resolution on ice supersaturation occurrence is studied with the help of radiosonde data.

Comparison with AIRS relative humidity fields
The comparison of RHi from AIRS (RHi_A) versus RHi from ECMWF (RHi_E) is made separately for four pressure layers in the upper troposphere (termed 1 to 4 from bottom to top). We distinguish clear and cloudy cases (the distinction is based on AIRS L2 cloud data, see Sect. 2.2 above), but it turned out to be useful to further subdivide the cloudy cases into two classes, one with effective cloud cover lower than 30% and another one with higher effective cloud cover in the upper troposphere.
Figure 6 presents two-dimensional histograms of RHi_E − RHi_A vs. RHi_E for these three cloudiness classes: clear sky with letter a, and the two degrees of effective cloudiness with letters b (low) and c (high). The frequencies of occurrence are normalized by the maximum value for each plot.
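The assignment of a pressure layer to one of the three cloudiness classes can be sketched as follows. This is a minimal illustration with a hypothetical function name; it assumes that the effective cloud cover used for the 30% threshold is the one of the highest AIRS cloud layer, which the text does not state explicitly.

```python
def cloudiness_class(cloud_layers, layer_top_hpa, layer_bottom_hpa,
                     cover_threshold=0.30):
    """Classify a pressure layer as 'a' (clear), 'b' (low effective
    cloud cover) or 'c' (high effective cloud cover).

    cloud_layers: list of (cloud_pressure_hpa, effective_cover) tuples,
    at most two entries as in AIRS L2; empty when no cloud is detected.
    Returns None when the highest cloud lies outside the layer.
    """
    if not cloud_layers:
        return "a"  # no cloud detected over the whole profile
    # The highest cloud is the one with the smallest pressure.
    p_highest, cover = min(cloud_layers)
    if not (layer_top_hpa <= p_highest <= layer_bottom_hpa):
        return None
    return "b" if cover < cover_threshold else "c"
```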
This representation reveals two distinct modes: 1) a linear correlation between RHi_A and RHi_E for cases a and b and 2) a large range of RHi_A values when RHi_E is around 100% for cases c. The spread of values around the found correlations is in line with the uncertainties discussed in Sect. 2.2 and shown in Fig. 3. Figure 6 shows that RHi_E − RHi_A is not simply randomly distributed; the marginal distributions are neither centered at zero nor are they symmetric. On the contrary, most differences are slightly positive (i.e. ECMWF is a bit moister than AIRS), and the marginal distributions are left-skewed for low RHi_E (say up to 60%) and right-skewed at higher RHi_E. This means that a linear regression of RHi_A vs. RHi_E would rise less steeply than y = x.
For each panel in the figure we have determined the linear correlation coefficient and its statistical significance under the null hypothesis that the humidity data from AIRS and ECMWF are uncorrelated. The results (the squares of Pearson's r) are compiled in Table 1. The sheer data amount makes all correlation values statistically highly significant, so there is no need to show significance figures. As generally known, r² is a measure of the fraction of variance in one of the data series that could be represented by the values of the other data series if we made a linear regression. For the classes with low effective cloud amount or even clear sky a regression model would have at least some predictive power, and this would be better in the upper than in the lower levels and, unsurprisingly, better under clear sky than under cloudy conditions. The difference between the two modes mentioned above is also apparent from the correlation analysis. We analyze the two modes of Fig. 6 further. The first ("dry") mode (with RHi_E roughly lower than 80%), occurring for clear (and mostly clear) cases a and b, shows a rather good agreement, with mean differences RHi_E − RHi_A of 2.7, 1.7, 3.4, and 1.4% and standard deviations of 10.4, 13.2, 15.1, and 15.3% for the layers 200-250, 250-300, 300-400, and 400-500 hPa, respectively. The mean differences and their standard deviations sound surprisingly good, but one should note that they are obtained for values of RHi_E < 80% and are heavily influenced by the relatively high probabilities of low values of relative humidity in both datasets. This kind of selection bias arises because RHi_A of good quality is only available when the AIRS L2 effective cloud cover is not too high. In the data that are used for this comparison the distributions of AIRS L2 effective cloud cover peak at small values and do not extend to values larger than 70-80% (see below).
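The squared Pearson correlation coefficient used for Table 1 follows the textbook definition, which can be sketched as follows. The function name is hypothetical and the inputs are assumed to be two equally long sequences of collocated RHi values.

```python
def pearson_r_squared(x, y):
    """Square of Pearson's linear correlation coefficient of x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy * s_xy / (s_xx * s_yy)
```

A value of `pearson_r_squared(rhi_e, rhi_a)` close to 1 would mean that a linear regression on one series reproduces nearly all the variance of the other; `rhi_e` and `rhi_a` are placeholder names for the two collocated samples.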
The second ("moist") mode (with RHi_E around 100%) of Fig. 6 occurs only for cases c, that is, when AIRS detects clouds with a high effective cloud cover. AIRS humidity shows a broad distribution with a maximum between 80 and 100% when ECMWF shows saturation. This dry bias can be explained by the fact that AIRS atmospheric profiles correspond to situations with effective cloud cover lower than 70-80% (see Fig. 7).
Ice saturation does not generally imply cloud presence, neither in the model nor in the AIRS data. The clear sky panels of Fig. 6 have data with RHi_E around 100% while the corresponding AIRS data are mostly subsaturated. In the model itself, situations with RHi_E ∈ [90%, 110%] reveal a U-shaped distribution of cloud cover (see Fig. 7), with maxima at low and at large cloud cover. The fact that homogeneous ice nucleation needs high supersaturation (as implemented in the cloud scheme) explains why there are clear sky situations in spite of saturated or supersaturated air. When the cloud cover within an IFS grid box approaches 100%, the relative humidity approaches 100% because in-cloud supersaturation (as observed for instance by Comstock et al., 2004; Lee et al., 2004; Ovarlez et al., 2002) cannot be represented in the current cloud scheme (Tompkins et al., 2007). However, in-cloud supersaturation is often a transient phenomenon in cloud evolution, hence humidity statistics in clouds always peak at saturation. This is highlighted in Fig. 8, presenting overall distributions of RHi_E and RHi_A in the pressure layer 300-400 hPa. The presence of the peak in the RHi_E curve and the absence of it in the RHi_A curve show that the ECMWF distribution is more realistic than the one obtained from AIRS.
The pdfs of effective cloudiness for the AIRS data used in the present comparison are shown in Fig. 7, separately for RHi_A < 50%, 50% < RHi_A < 80% and RHi_A > 80%. The distribution is narrower for the lower RHi_A interval. The ECMWF model shows a similar distribution of cloud cover when RHi is low, shown in the right-hand panel of Fig. 7 for RHi_E < 50%. The distribution is broader because ECMWF cloud cover does not take into account the emissivity of the cloud. The effective cloud cover (weighted by cloud emissivity) of an optically thin cloud is much smaller than its cloud cover. The IFS cloud cover distribution has larger values than the AIRS L2 effective cloud cover distribution, because high RHi can be forecasted in the presence of high cloud cover, whereas an infrared sounder can determine RHi (using cloud-cleared radiances) only when the cloudiness is not too high.

Fig. 8. Overall distributions of relative humidity wrt ice from AIRS (plain) and ECMWF (dashed) for the layer 300-400 hPa.
To investigate RHi of the moist mode further in relationship with clouds, we use in addition collocated AIRS-CALIPSO data (for the details on the collocation scheme see Lamquin et al., 2008). Figure 9 presents distributions of RHi_A when RHi_E > 80% compared to distributions of RHi_A in the presence of CALIPSO high clouds, separately for the same pressure layers as in Fig. 6. In Lamquin et al. (2008) the presence of clouds within the AIRS field-of-view is used as a proxy for "humidity around ice saturation" occurring within the pressure layers. In all cases of Fig. 9 the distributions compare well, suggesting that the conditions "humidity around ice saturation" in the IFS and "cloud present" in CALIPSO lead to similar distributions of RHi_A, which peak at about 80-90% and are very broad. This also suggests that ECMWF is quite successful in predicting high clouds because, as Fig. 7 (right panel) shows, only a minor fraction of cases with RHi_E between 90 and 110% are cloud free; the majority has clouds, admittedly often with small fractional coverage.
Besides the limitation to scenes with relatively low effective cloudiness for AIRS humidity and temperature retrievals, another reason for RHi_A appearing on average drier than RHi_E could be the vertical resolution of the AIRS weighting functions of the channels used for the humidity retrieval: the retrieval of humidity considers parts of the vertical profile both above and below the standard pressure levels. The effect of this has been demonstrated by Lamquin et al. (2008), showing that the mean relative humidity for cirrus scenes with clouds extending over the whole pressure layer is on average only about 70%. This value is slightly lower than the peak value of the distributions in Fig. 9, since clouds are on average geometrically thicker than the pressure layers (Lamquin et al., 2008).
The comparison between AIRS and ECMWF shows an increasing bias, most probably caused by the AIRS vertical resolution and the inability of AIRS to retrieve humidity for high cloudiness. This bias renders it difficult to draw clear conclusions about the forecast quality of relative humidity in the IFS. However, on a statistical basis, the moist mode in the IFS mostly implies high clouds, which agrees with the findings from CALIPSO. This is encouraging. A fairer comparison may be possible by computing the radiance fields from the model profiles and comparing these to the measured radiances. However, it is not obvious how histograms of radiance differences could be used for model improvements. For this purpose one may prefer to use instruments with higher vertical resolution. Towards this end we now use CALIPSO to relate RHi_E to a vertically highly resolved description of clouds.

Comparison with CALIPSO cloud vertical profiles
The use of CALIPSO data gives insight into how well cloud formation is represented in the model. All results are presented in Fig. 10. In a first step (Fig. 10, top left) we relate the high cloud (with apparent middle of the cloud p_cld = (p_top + p_base)/2 < 500 hPa) occurrence as seen by CALIPSO to the maximum RHi_E forecasted between 200 and 500 hPa by the IFS (data in low vertical resolution). High cloud occurrence is declared as soon as one cloud layer from CALIPSO is observed at altitudes higher than 500 hPa. The maximum humidity is better suited than the average over the layer, because cloud formation is related to the humidity peak rather than to the average. To study the effect of temperature on this relationship we divided the data into two temperature intervals of roughly equal data amounts, T < 220 K and T > 220 K, T corresponding to the level of maximum RHi_E. The figure shows that CALIPSO high cloud occurrence increases with maximum RHi_E, for both temperature intervals. Note that there are a few cases of clouds for which the model predicts low humidity values. In these cases ECMWF either does not predict the cloud or predicts a very thin cloud embedded in dry layers, such that a maximum RHi_E of 20% results on the coarse levels. The question is thus why ECMWF fails to predict a cloud in certain situations. Only case-by-case analysis can help to resolve this problem. The cloud occurrence at RHi_E = 100% is about 65%, and 100% cloud occurrence is only reached at certain supersaturations that reflect the supersaturation thresholds for homogeneous nucleation (Kärcher and Lohmann, 2002) together with the ±20% humidity fluctuations in the model. Larger maximum RHi_E values are reached in the cold temperature interval than in the warmer interval, since the humidity threshold for homogeneous nucleation increases with decreasing temperature.
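The relation between high cloud occurrence and maximum RHi_E can be sketched as a binned frequency. This is a minimal illustration: the function names and the 10% bin width are assumptions, and cloud layers are represented as hypothetical (p_top, p_base) tuples in hPa.

```python
def has_high_cloud(cloud_layers, p_limit_hpa=500.0):
    """True if at least one layer has its middle pressure below the limit.

    cloud_layers: list of (p_top_hpa, p_base_hpa) tuples from CALIPSO.
    """
    return any(0.5 * (p_top + p_base) < p_limit_hpa
               for p_top, p_base in cloud_layers)


def cloud_occurrence_by_rhi(max_rhi, cloud_present, bin_width=10.0):
    """Fraction of collocations with a high cloud per bin of maximum RHi.

    Returns a dict mapping the lower bin edge (percent) to the fraction.
    """
    counts = {}
    for rhi, cloudy in zip(max_rhi, cloud_present):
        lower_edge = bin_width * int(rhi // bin_width)
        n_total, n_cloudy = counts.get(lower_edge, (0, 0))
        counts[lower_edge] = (n_total + 1, n_cloudy + (1 if cloudy else 0))
    return {edge: n_cloudy / n_total
            for edge, (n_total, n_cloudy) in sorted(counts.items())}
```

Splitting the input by the temperature at the level of maximum RHi_E (below or above 220 K) before calling `cloud_occurrence_by_rhi` would give one curve per temperature interval.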
In a second step (Fig. 10, top right) the total vertical extent of high clouds from CALIPSO is related to the maximum RHi_E. We define the total vertical extent as the sum of the vertical extent (in metres) of all high cloud layers detected by CALIPSO. In both temperature intervals the total extent increases with the maximum RHi_E. This agrees qualitatively with the results of Lamquin et al. (2008). One observes that the average thickness of the clouds detected by CALIPSO in spite of a dry forecast is of the order of 1500 m.
For single-layer high clouds detected by CALIPSO, Fig. 10 (bottom left) presents the distribution of the difference between their altitude and the altitude of the middle of the layer in which the maximum RHi_E is found. The distribution is symmetric with a peak at zero. The standard deviation of the distribution reflects the resolution of the standard pressure levels rather than the actual data scatter, 100 hPa roughly corresponding to 1-2 km (less at the highest levels). The model predicts the highest humidity values mostly at the correct altitude, namely at that altitude where CALIPSO finds a cloud.
Finally (Fig. 10, bottom right), a distribution of the maximum of RHi_E between 200 and 500 hPa is shown for scenes of high clouds detected by CALIPSO. The peak probability is around 100%, which shows that this particular value in the model is well representative of the presence of high clouds. The distribution is strongly left-skewed because of dry cases for which a high cloud was detected by CALIPSO. The standard deviation is 19%, comparable to the distribution widths for in-cloud humidity distributions reported by Ovarlez et al. (2002) and Immler et al. (2008).
Summarising, the IFS model predicts high RHi where CALIPSO actually detects high clouds.This lends credence to the good quality of the ECMWF upper tropospheric cloud forecast.However, sometimes clouds are observed where ECMWF predicts dry air.Such cases need further consideration.

Influence of vertical resolution on occurrence of ice supersaturation
In this section we investigate the effects of vertical resolution on the apparent frequency of ice supersaturation. This problem is especially important for satellite IR sounders, since ice supersaturated layers are frequently much shallower (see Spichtinger et al., 2003a) than the weighting functions of satellite instrument channels (Gierens et al., 2004).
Since ECMWF humidity data exist in both high and low vertical resolution, we first investigate the effect with these. Generally we expect that interpolation leads to a loss of detail in the interpolated profiles: single shallow supersaturated layers will not be detected at low vertical resolution, resulting in an underestimated overall occurrence of ice supersaturation. We expect the same effect to occur when using satellite IR sounder data, because the broad weighting functions average thin supersaturated layers within thicker subsaturated layers (Gierens et al., 2004; Kahn et al., 2008; Lamquin et al., 2008).
The two upper panels of Fig. 11 present the probability of occurrence of ice supersaturation within a layer (determined from ECMWF data with high vertical resolution) as a function of RHi E averaged over this layer. Results are shown for layers 400-500, 300-400, and 250-300 hPa in January and for layers 300-400, 250-300, and 200-250 hPa in July (as the tropopause is higher than in January). To avoid difficulties caused by the tropopause possibly lying inside the pressure layer, we only select cases for which the temperature gradient is negative throughout the layer. One observes that supersaturation at the high-resolution vertical levels can be present as soon as the data on the low-resolution levels indicate RHi E of about 50%. In all cases the probability of having a supersaturated layer at 25 hPa resolution follows an s-shaped function of RHi E. The curves increase sharply just below RHi E = 100% and reach 100% probability when the low-resolution RHi E is slightly supersaturated, around 105-110%. That 100% probability is not already reached at RHi E = 100% is certainly an effect of the two different interpolation schemes used to compute the low- and high-resolution profiles. This is supported by Fig. 2: when the high-resolution relative humidity is at 100%, the low-resolution relative humidity is on average slightly higher than 100%.
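The construction of such an empirical probability curve can be sketched as follows. This is a minimal illustration, not the code used for Fig. 11; it assumes that each case provides a layer-mean RHi together with the high-resolution RHi and temperature values inside the layer, ordered from bottom to top, and all names and numbers are hypothetical.

```python
import numpy as np

def s_function(layer_mean_rhi, highres_rhi, highres_temp, edges):
    """Fraction of cases with local supersaturation per bin of layer-mean RHi.

    layer_mean_rhi: shape (n_cases,), percent.
    highres_rhi, highres_temp: shape (n_cases, n_levels), levels ordered
    from bottom to top (assumption).
    Cases whose temperature does not decrease monotonically with height are
    discarded, mimicking the exclusion of layers containing the tropopause.
    """
    layer_mean_rhi = np.asarray(layer_mean_rhi, dtype=float)
    highres_rhi = np.asarray(highres_rhi, dtype=float)
    highres_temp = np.asarray(highres_temp, dtype=float)

    no_tropopause = np.all(np.diff(highres_temp, axis=1) < 0.0, axis=1)
    supersaturated = np.any(highres_rhi > 100.0, axis=1)

    bin_index = np.digitize(layer_mean_rhi, edges) - 1
    prob = np.full(len(edges) - 1, np.nan)
    for k in range(len(edges) - 1):
        sel = (bin_index == k) & no_tropopause
        if sel.any():
            prob[k] = supersaturated[sel].mean()
    return prob

# Synthetic example with two high-resolution levels per layer.
mean_rhi = [40, 60, 95, 98, 105, 70]
rhi = [[30, 50], [50, 101], [90, 99], [95, 110], [102, 108], [60, 120]]
temp = [[230, 225]] * 5 + [[220, 222]]  # last case: temperature increases
prob = s_function(mean_rhi, rhi, temp, edges=[0, 50, 100, 150])
```

With real data the bins would be much narrower, so that the sharp rise just below 100% becomes visible.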
If one can find a general expression for this s-shaped behaviour of the probability function due to low vertical resolution, it could be used for correcting apparent frequencies of supersaturation obtained from satellite IR sounder instruments. A common way to estimate the frequency of occurrence of ice supersaturation is to select a threshold lower than 100% (as in Stubenrauch and Schumann, 2005 and Rädel and Shine, 2008) and to count all cases with RHi exceeding the threshold as supersaturated. The threshold can be obtained by considering the distribution of RHi for cloudy cases. Rädel and Shine (2008) use a method based on hit and rejection rates as well as Peirce skill scores (Peirce, 1884). The use of such a threshold is subject to the risk of false alarms.
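For orientation, the Peirce skill score of a binary classification is the hit rate minus the false alarm rate. The sketch below shows how a threshold could be chosen by maximising this score over a set of candidates. It uses the generic definition only and is not a reproduction of the procedure of Rädel and Shine (2008); the function names and sample values are hypothetical, and both classes are assumed to be present in the reference data.

```python
import numpy as np

def peirce_skill_score(predicted, observed):
    """Hit rate minus false alarm rate for two boolean arrays."""
    predicted = np.asarray(predicted, dtype=bool)
    observed = np.asarray(observed, dtype=bool)
    hits = np.sum(predicted & observed)
    misses = np.sum(~predicted & observed)
    false_alarms = np.sum(predicted & ~observed)
    correct_negatives = np.sum(~predicted & ~observed)
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return float(hit_rate - false_alarm_rate)

def best_threshold(rhi, observed, candidates):
    """Candidate threshold (percent) with the highest Peirce skill score."""
    rhi = np.asarray(rhi, dtype=float)
    scores = [peirce_skill_score(rhi > t, observed) for t in candidates]
    return candidates[int(np.argmax(scores))], scores

# Synthetic example: low-resolution RHi and a reference supersaturation flag.
rhi = [50, 70, 85, 95, 110]
reference = [False, False, True, True, True]
threshold, scores = best_threshold(rhi, reference, candidates=[60, 80, 100])
```

A score of 1 means perfect separation, a score of 0 means no skill; any threshold below 100% trades missed events against false alarms.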
To study this behaviour further, we also analysed the radiosonde data from Lindenberg described in Sect. 2.3. We determine RHi at each standard pressure level by averaging the specific humidity inside the layer and by integrating the saturation specific humidity (as in Sect. 2.2 for q i s) between the edges of the layer. The high-resolution radiosonde levels give the opportunity to detect local ice supersaturation within the boundaries of a layer, and its probability of occurrence is displayed as a function of RHi averaged over the whole layer. The plots are shown at the bottom of Fig. 11; they show an s-function similar to the one found before, even though the scheme and the source of the data are different. This confirms that, on a statistical basis, ice supersaturation may be present even when the low-resolution values of RHi are small. It seems possible to replace the use of a fixed threshold for the determination of ice supersaturation frequency by a probability function. Its shape may vary from one type of data to another but is in any case likely to be close to an s-function.
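The layer averaging applied to the radiosonde profiles can be illustrated with a short sketch. It assumes that the layer RHi is the ratio of the pressure-integrated specific humidity to the pressure-integrated saturation specific humidity over ice, and that the saturation profile is supplied by the user; the exact formulation of Sect. 2.2 is not reproduced here, and all names and numbers are hypothetical.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal integral of y over x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def layer_rhi(pressure, q, q_sat_ice):
    """Layer RHi (percent) as the ratio of the two pressure integrals."""
    return 100.0 * trapezoid(q, pressure) / trapezoid(q_sat_ice, pressure)

def has_local_supersaturation(q, q_sat_ice):
    """True if any high-resolution level inside the layer exceeds saturation."""
    return bool(np.any(np.asarray(q) > np.asarray(q_sat_ice)))

# Synthetic 300-400 hPa layer with three levels (specific humidity in kg/kg).
p = [400.0, 350.0, 300.0]
q = [1.0e-4, 1.2e-4, 0.4e-4]
q_sat = [2.0e-4, 1.0e-4, 1.0e-4]

rhi_layer = layer_rhi(p, q, q_sat)
local = has_local_supersaturation(q, q_sat)
```

In this synthetic layer the averaged RHi is 76% although the middle level is supersaturated, which is the situation the probability function is meant to capture.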
The empirical s-functions start at zero at low RHi and reach unity at saturation (or slightly above). Furthermore, they are monotonically increasing, that is, they have the mathematical properties of a cumulative distribution function, F(r). F(r) is the probability that the high-resolution humidity profile exceeds saturation when the low-resolution humidity is r. The derivative of F is a probability density function, f(r), and we have $\int_0^r f(r')\,dr' = F(r)$. We are now seeking a suitable interpretation, which might then help to simulate the s-function for later applications. A suitable interpretation is the following: when scanning data from small to higher values of r, f(r) dr is the probability that the high-resolution humidity profile exceeds saturation for the first time in the infinitesimal interval [r, r+dr]. For values of r exceeding 100% this probability tends to zero because the high-resolution humidity profile then always exceeds saturation.
Finally, we test the application of the s-function to determine the occurrence of ice supersaturation in the upper troposphere by comparing the result to the one obtained by applying a simple threshold. Since the empirical s-function is not easy to relate to a simple function (it is not symmetrical enough to be modelled by a logistic function), we use the function obtained empirically for all layers combined. We determine the frequency of ice supersaturation for all pressure layers for the year of ECMWF low-resolution data over Europe by using the s-function or thresholds at 80% (as in Rädel and Shine, 2008) and 100%: the use of the s-function leads to about 20% supersaturation occurrence, while the use of thresholds at 80% or 100% leads to 25% and 9% supersaturation occurrence, respectively. In a recent study using MOZAIC data at 230 hPa, Burkhardt et al. (2008) found ice supersaturation occurrence in the mid-latitude upper troposphere to be around 20%, with large seasonal variations. As expected, the 100% threshold leads to a severe underestimation of the true probability of ice supersaturation. Using a fixed threshold of 80% leads to a slightly higher value than using the s-function. Our estimation from ECMWF using the s-function compares well to MOZAIC, all the more so because we observe a comparable seasonal cycle of ice supersaturation occurrence. Burkhardt et al. (2008) find (DJF) 26-27%, (MAM) 18-19%, (JJA) 17-18%, (SON) 23-24% with MOZAIC, while we find (DJF) 23%, (MAM) 19%, (JJA) 15%, (SON) 22%.
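The difference between the two counting methods can be made explicit with a short sketch. With the s-function, each low-resolution value r contributes its probability F(r) to the frequency estimate; with a threshold, it contributes either 0 or 1. The tabulated nodes below are purely illustrative and are not the empirical function derived from Fig. 11; the function names are hypothetical.

```python
import numpy as np

def frequency_from_s_function(rhi, r_nodes, f_nodes):
    """Expected supersaturation frequency: mean of F(r) over all cases.

    F is given as a table (r_nodes, f_nodes) and interpolated linearly;
    values outside the table take the first or last tabulated probability.
    """
    return float(np.mean(np.interp(rhi, r_nodes, f_nodes)))

def frequency_from_threshold(rhi, threshold):
    """Fraction of cases with RHi exceeding a fixed threshold."""
    return float(np.mean(np.asarray(rhi, dtype=float) > threshold))

# Illustrative s-function table (percent RHi versus probability).
r_nodes = [50.0, 80.0, 100.0, 110.0]
f_nodes = [0.0, 0.2, 0.9, 1.0]

# Synthetic low-resolution RHi values (percent).
rhi = [40.0, 80.0, 100.0, 120.0]

freq_s = frequency_from_s_function(rhi, r_nodes, f_nodes)
freq_80 = frequency_from_threshold(rhi, 80.0)
freq_100 = frequency_from_threshold(rhi, 100.0)
```

The probabilistic count avoids the sharp cut of a threshold: values just below saturation still contribute most of their weight, while moderately humid layers contribute only a small fraction.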

Conclusions
Upper tropospheric humidity and cloudiness forecasts from ECMWF's Integrated Forecast System (IFS) including the new ice supersaturation feature (Tompkins et al., 2007) have been compared with collocated humidity and cloud retrievals from AIRS and CALIPSO over Europe. An initial study of the global supersaturation spinup behaviour shows that a spinup period of at least 12 h is necessary before forecast humidity data from the upper troposphere can be used. Relative humidity of IFS and AIRS was compared in different pressure layers, separately for clear and cloudy situations distinguished by AIRS. Two modes were detected: 1) a dry mode in which IFS predicts RHi E < 80% (and is mainly cloud free) and in which the relative humidities show a reasonably good agreement (with standard deviations of the order of 10 to 15%) and 2) a moist mode in which IFS predicts values around ice saturation and AIRS provides a range of RHi values from about 50% up to and exceeding 150% with a peak probability around 80-90%. It may be noted that AIRS always detects clouds in these cases. The linear correlation between AIRS and IFS relative humidity values is weak in the moist mode, while in the dry mode it is higher, in particular in clear sky conditions. A comparison of IFS relative humidities with the cloud products from CALIPSO showed a strong positive correlation between RHi E and the probability that CALIPSO detects a cloud in the respective layer. The peak relative humidity from IFS is mostly located in the pressure layer where CALIPSO indeed detects a cloud. The CALIPSO cloud probability reaches 100% when the IFS humidity approximately reaches the threshold for homogeneous nucleation; the cloud probability at ice saturation is 65%. The comparison reveals that CALIPSO occasionally finds clouds (of geometrical thickness exceeding 1 km) where IFS predicts dry air. These cases need further consideration.
Finally we tested the dependence of the reported frequency of ice supersaturation on vertical resolution. This is a problem in particular for satellite sounder observations, but also for the forecast data when only the standard pressure levels are retained. We compared the IFS data on the standard pressure levels with those of higher vertical resolution, and high-resolution radiosonde data with layer averages obtained in a way close to how upper tropospheric humidity is determined for AIRS. These exercises demonstrated that the true frequency of occurrence as a function of the low-resolution relative humidity follows an s-shaped function that can be used for a correction algorithm. Application of such a correction yielded slightly lower values than the method using a fixed RHi threshold corresponding to the maximum of the RHi distribution of cloudy scenes. The s-function correction should be further tested with more data. Similar functions appear in other data; for instance, the relationship between cloud fraction and total water divided by saturation follows an s-shaped function (Wood and Field, 2000). Such relations and possible connections to extreme-value theory should be explored as well. The s-function might turn out to be a valuable tool for predicting the true number of exceedances over a threshold when only coarse-resolution data profiles are available for analysis.

Fig. 1. Sketch of the different relative humidity products. AIRS provides humidity fields integrated over pressure layers, and ECMWF humidity fields are given at pressure levels either at low or high vertical resolution. For comparison with AIRS the relative humidities from ECMWF are averages between edges of the layers.
400-500 hPa for clear sky and cloudy situations. The influence of the vertical extent of clouds has been investigated by relating upper tropospheric humidity of the IFS forecasts to collocated Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) cloud vertical profiles. The comparisons are made over Europe (latitudes ∈ [32°, 74°], longitudes ∈ [−27°, 45°]) with one year of data, from October 2006 to September 2007.
Fig. 2. Two-dimensional histogram of the difference between the interpolated low-resolution relative humidity and the original high-resolution relative humidity (y-axis) against the high-resolution relative humidity (x-axis) at standard pressure levels 200, 250, 300, 400, and 500 hPa. The probability is normalized by the maximum value (in percent) and the colour scale is limited to 50%, with values higher than 50% represented by the same colour. All levels combined, January and July 2007 combined. The solid line represents the binned mean, dashed lines represent the binned mean ± standard deviation.

Fig. 3. Relative humidity uncertainties RHi determined for AIRS as a function of RHi. The error bars are the standard deviations. Upper troposphere, all pressure layers combined, July 2007.

Fig. 5. Fraction of grid boxes with ice supersaturation (left panels) and mean supersaturation in these grid boxes (right panels), averaged over all forecast runs (noon and midnight each day) of ECMWF for October 2006 and April 2007, in 0.5° × 0.5° resolution globally. The upper and lower curves in each panel are the respective mean plus/minus one standard deviation. All data refer to the 250 hPa pressure level.

Fig. 10. Results from the CALIPSO-ECMWF collocation. Top left: occurrence of high clouds (p cld < 500 hPa) seen by CALIPSO as a function of the maximum relative humidity wrt ice seen between 200 and 500 hPa by ECMWF for two temperature ranges, T < 220 K and T > 220 K. Top right: same but showing the total thickness of clouds seen by CALIPSO. Bottom left: distribution of the distance (in hPa) between the apparent middle of CALIPSO single-layer clouds and the apparent middle of the pressure layer where the maximum relative humidity wrt ice (between 200 and 500 hPa) seen from ECMWF is located. Bottom right: distribution of the maximum relative humidity wrt ice from ECMWF in the presence of collocated CALIPSO high clouds.

Fig. 11. Top: probability of occurrence of ice supersaturation on the high-resolution vertical levels as a function of the relative humidity wrt ice on the low-resolution vertical levels, January and July 2007, IFS data. Bottom: probability of occurrence of ice supersaturation within thin (≈50 m) layers seen from corrected radiosondes as a function of the relative humidity computed for the whole layer.
indicate. This underestimation will be discussed in Sect. 4.3.

Table 1. Correlation coefficients squared between RHi A and RHi E for all panels of Fig. 6. The numbers of data pairs used are given in brackets.
von Storch and Zwiers, 2001). If we would like to predict RHi E by linear regression from the measured RHi A with a mean squared error of less than half the variance in RHi E, we would need r² > 0.5. If we would like to predict it with an rms error of less than half the standard deviation of RHi E, we would need r² > 0.75. (Note that this stays the same if we interchange the roles of RHi A and RHi E.) That said, we see that such a regression model would not be very successful in this case. First we note that in the class with high effective cloud amount, r² is low in all considered pressure layers. (