Technical Note: Reanalysis of upper troposphere humidity data from the MOZAIC programme for the period 1994 to 2009

In situ observational data on the relative humidity (RH) in the upper troposphere and lowermost stratosphere (UT/LS), or tropopause region, collected aboard civil passenger aircraft in the MOZAIC (Measurements of OZone, water vapour, carbon monoxide and nitrogen oxides by in-service AIrbus airCraft) programme were reanalysed for the period 2000 to 2009. Previous analyses of probability distribution functions (PDFs) of upper troposphere humidity (UTH) data from MOZAIC observations from year 2000 and later indicated a bias of UTH data towards higher RH values compared to data of the period 1994 to 1999. As a result, the PDF of UTH data shows a substantial fraction of observations above 100 % relative humidity with respect to liquid water. Such supersaturations, however, do not occur in the atmosphere because there is always a sufficient number of condensation nuclei available that trigger condensation as soon as liquid saturation is slightly exceeded. An in-depth reanalysis of the data set identified a coding error in the calibration procedure from year 2000 on. The error did not affect earlier data from 1994 to 1999. The full data set for 2000–2009 was reanalysed applying the corrected calibration procedure. Applied correction schemes and a revised error analysis are presented along with the reanalysed PDF of relative humidity with respect to liquid water (RH liquid) and ice (RH ice).


Introduction
Upper troposphere humidity (UTH) is one of the still poorly understood climate variables, although its role in the global climate system is considered essential (Solomon et al., 2010; Gettelman et al., 2011; Riese et al., 2012). The latest IPCC report (IPCC, 2013) states that the knowledge about potential trends and feedback mechanisms of upper tropospheric water vapour is low because of its large natural variability in the troposphere and relatively short records of observations. Although balloon-borne data collected over Boulder, CO (Hurst et al., 2011), and data from satellite-borne instruments like the AURA Microwave Limb Sounder (MLS; Read et al., 2007) or the High-Resolution Infrared Radiation Sounder (HIRS; Gierens et al., 2014) permit investigating long-term trends over specific regions, there is still an urgent need for in situ observation of UTH on a global scale.
In situ data on meteorological quantities like temperature and pressure as well as data on atmospheric composition (O3, CO) and UTH have been collected regularly since 1994 in the framework of the European research programme MOZAIC (Marenco et al., 1998) and since 2011 in its successor programme IAGOS (Petzold et al., 2013), which aims at the continuation of measurements for another two decades (see http://www.iagos.org for further information).
From the start of the programme in 1994, autonomous instruments for measuring meteorological quantities and atmospheric chemical composition were installed aboard in-service aircraft of several internationally operating airlines. Measurements are conducted during scheduled flights of the equipped long-haul passenger aircraft. Using the existing infrastructure of the international air transport system permits the continuous collection of high-quality in situ observation data of excellent spatial and temporal resolution. However, the sampling regions are restricted to the major global flight routes and to the cruising altitude band of 9–13 km, i.e. the data refer to a large extent to the upper troposphere and lowermost stratosphere (UT/LS). In addition, vertical profiles of atmospheric composition (O3, CO) collected during ascent after take-off and descent into airports are of increasing importance for satellite validation (e.g. Cooper et al., 2011; Zbinden et al., 2013) and for regional air quality studies including the impact of trans-boundary long-range transport of air pollutants (Cooper et al., 2010; Solazzo et al., 2013).
Atmospheric relative humidity (RH) is measured in the framework of MOZAIC by means of a compact airborne humidity sensing device using capacitive sensors (MOZAIC Capacitive Hygrometer, MCH). The sensor itself and applied calibration techniques are described in detail by Helten et al. (1998). The sensor is calibrated for relative humidity with respect to liquid water (RH liquid), and values of relative humidity with respect to ice (RH ice) are then calculated from the respective RH liquid data (e.g. Pruppacher and Klett, 1997).
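The conversion from RH liquid to RH ice can be illustrated with a minimal sketch. It assumes Magnus-type approximations of the saturation vapour pressure over liquid water and over ice with commonly used coefficients; these formulae, the coefficients and all function names are assumptions made for illustration and are not necessarily those applied in the MOZAIC data processing.

```python
import math


def e_sat_liquid_hpa(t_celsius):
    """Saturation vapour pressure over liquid water in hPa (Magnus-type fit, assumed)."""
    return 6.1094 * math.exp(17.625 * t_celsius / (243.04 + t_celsius))


def e_sat_ice_hpa(t_celsius):
    """Saturation vapour pressure over ice in hPa (Magnus-type fit, assumed)."""
    return 6.1121 * math.exp(22.587 * t_celsius / (273.86 + t_celsius))


def rh_ice_from_rh_liquid(rh_liquid_percent, t_celsius):
    """Convert RH with respect to liquid water (%) to RH with respect to ice (%).

    Both quantities refer to the same water vapour partial pressure, so the
    conversion is the ratio of the two saturation vapour pressures.
    """
    return rh_liquid_percent * e_sat_liquid_hpa(t_celsius) / e_sat_ice_hpa(t_celsius)
```

Because the saturation vapour pressure over ice is lower than over supercooled liquid water, RH ice exceeds RH liquid at temperatures below 0 °C, and the difference grows with decreasing temperature.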
First sensor validation studies from formation flights of a MOZAIC aircraft and a research aircraft are reported by Helten et al. (1999), while Smit et al. (2008) have presented an approach for a potential in-flight calibration method.
The reanalysis period for atmospheric RH data presented here focuses on the first 15 years of MOZAIC observations. As is reported by Lamquin et al. (2012), the probability distribution functions (PDFs) of RH ice as calculated from the MCH data show a significant shift in RH ice towards higher values for data since 2000, while data are in agreement with theoretical expectations and experimental findings for the period 1994 to 1999 (e.g. Gierens et al., 1999; Spichtinger et al., 2004).
The reason for this bias towards higher humidity values is identified as an error in the pre- and post-flight calibration regularly conducted in the environmental simulation chamber at Jülich (Smit et al., 2000) from year 2000 onward. Here we report the procedures followed to reanalyse the calibrations and to reprocess the MOZAIC RH data. An in-depth evaluation of the RH data before and after the reprocessing of calibrations and flight data since year 2000 is presented and compared to MOZAIC RH data for the previous period 1994–1999. In summary, this study will serve as the reference publication for the reanalysed MOZAIC RH database for the period 1994 to 2009. Data from year 2010 onward are analysed using the correct sensor calibration procedure.

MOZAIC data set 1994 to 2009
In the first 15 years of MOZAIC, between the start of the programme in August 1994 and the end of the reanalysis period in December 2009, a total of 32 678 flights were conducted. Table 1 summarises the airlines contributing to the MOZAIC programme and the fraction of flights conducted by the respective aircraft. The global distribution of flights in the period 1994–2009 is shown in Fig. 1. The vast majority (93 %) of the collected data is confined to the Northern Hemisphere, and there mainly to the region between Europe and North America. Major gaps of the MOZAIC data set exist for the Pacific region (no flights) and for flights across the equator to the Southern Hemisphere (7 % of all flights).
In addition to the global distribution of flights shown in Fig. 1, the worldwide distribution of airports visited by MOZAIC aircraft is presented in Fig. 2. The larger the symbols shown in this graph, the more frequently the airport was visited, and in turn the more vertical profiles of the atmospheric composition are available for these regions. Specifically, the investigation of seasonal variations of atmospheric chemical composition is meaningful only for those airports visited continuously over the entire period; see e.g. Zbinden et al. (2013).
From experience gained in MOZAIC, each aircraft contributes approximately 500 flights per year to the data set. The distribution of flights and aircraft in operation over the considered period is shown in Fig. 3, whereas Fig. 4 illustrates the distribution of observations over altitude. As is clearly visible, the majority of observations (> 80 %) are confined to the UT/LS region. For this analysis, the tropopause is defined according to Thouret et al. (2006) as the altitude band centred around the pressure level (±15 hPa) at potential vorticity 2.0 PVU. PVU values are calculated for each single MOZAIC data point from ECMWF analyses.
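The pressure-based tropopause band can be sketched as a simple classification. The function name, the layer labels and the example pressures are hypothetical; only the ±15 hPa band around the 2.0 PVU pressure level is taken from the text, and the full layer definition of Thouret et al. (2006) may differ in detail.

```python
def classify_layer(p_hpa, p_tropopause_hpa, half_width_hpa=15.0):
    """Classify a data point relative to the tropopause band.

    p_hpa: pressure of the observation in hPa.
    p_tropopause_hpa: pressure of the 2.0 PVU surface at that location in hPa.
    Returns "UT" (below the band), "TP" (inside the band) or "LS" (above it).
    """
    if p_hpa > p_tropopause_hpa + half_width_hpa:
        return "UT"  # higher pressure means lower altitude
    if p_hpa < p_tropopause_hpa - half_width_hpa:
        return "LS"  # lower pressure means higher altitude
    return "TP"
```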
In addition, observed vertical profiles from ascent and descent phases during the flights provide relevant information for the vertical distribution of measured species which are of increasing importance for detailed studies on air quality effects of long-range transport events (e.g. Cooper et al., 2010) or satellite validation studies (e.g. Cooper et al., 2011;Zbinden et al., 2013).
The regional distribution of data coverage by MOZAIC UTH observations is shown in Fig. 5 for the period 1994 to 2009, emphasising that the horizontal coverage by MOZAIC observations is highly inhomogeneous and dominated by the major global flight routes. Boundary conditions for selecting UTH data only are (1) an ambient air temperature range of T < −40 °C to exclude perturbations by liquid water clouds and to restrict the altitude range to approx. 9 to 12 km altitude, and (2) potential vorticity below 2.0 PVU to exclude stratospheric air masses. The densest data coverage is obtained for the entire North Atlantic region. A few main air traffic routes to the Middle East region, Far East and South America are also well covered, whereas the Pacific region and in particular Australia are completely missing in this data set.
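The two boundary conditions for UTH data can be expressed as a simple filter. The record layout and all names below are hypothetical and chosen for illustration only; they do not reflect the actual MOZAIC database format.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    temperature_c: float  # ambient air temperature in degC
    pv_pvu: float         # potential vorticity in PVU
    rh_liquid: float      # relative humidity over liquid water in %


T_MAX_C = -40.0     # condition (1): excludes liquid water clouds
PV_MAX_PVU = 2.0    # condition (2): excludes stratospheric air masses


def is_uth_sample(sample):
    """True if the sample fulfils both UTH boundary conditions."""
    return sample.temperature_c < T_MAX_C and sample.pv_pvu < PV_MAX_PVU


def select_uth(samples):
    """Return only the samples that qualify as UTH observations."""
    return [s for s in samples if is_uth_sample(s)]
```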

Description of errors
UTH data confined to air temperatures below −40 °C (threshold for spontaneous freezing of supercooled liquid water) should show only values below the homogeneous freezing threshold, which is below water saturation. This feature is confirmed for a large set of UTH data from research aircraft observations (Krämer et al., 2009). However, analysing MOZAIC RH Version 0 data (before recalibration and reprocessing) yields a significant fraction of observations above 100 % RH liquid; see blue line in Fig. 6. When analysing the UT distribution of RH ice, the PDF exhibits a steep decrease at RH ice ≥ 100 % (RH liquid ≥ 60 %) towards ice supersaturation, and maximum values of RH ice of approx. 160 % (e.g. Ovarlez et al., 2002; Spichtinger et al., 2004; Krämer et al., 2009). Analysing the MOZAIC RH Version 0 data set in a similar manner yields PDFs which deviate strongly from the observations reported for research-type field studies. Lamquin et al. (2012) report a significant difference in PDF behaviour for MOZAIC RH data between the period 1994 to 1999 and data from year 2000 and later. The modification appears as a significant shift in RH ice towards higher values by 10–20 % RH ice for data since 2000.
The bias of MCH data towards higher values for the period starting in year 2000 could not be explained by physical reasons (see e.g. Lamquin et al., 2012, and the discussion therein) but is related to an error in sensor handling during calibration. An in-depth analysis of the calibration and data processing procedures indicated a change in the sensor calibration at the end of 1999. The identification of this error and respective corrective measures are described in the following sections. As a brief summary anticipating the result of the reprocessing effort, the average PDF of reanalysed data is shown in Fig. 6 (red line) together with the PDF of MOZAIC data from the period 1994 to 1999 (green line), which were found to be correct. The reprocessed data set agrees well with the data from the first period and shows only a small and statistically insignificant fraction of data above 100 % RH liquid, which falls within the limit of uncertainty of the MCH of ±5 % RH liquid. Thus, data reprocessing based on the reanalysis of MCH calibrations has solved the problem of wet-biased MCH data for the period 2000 to 2009.

Pre-and post-flight calibration procedure
In the MOZAIC programme the humidity sensors in operation aboard the in-service aircraft are regularly changed every 1–2 months and calibrated in an environmental simulation chamber under typical atmospheric flight conditions for pressure, temperature and RH.
In the test chamber, a Lyman-α fluorescence hygrometer (LAH; Kley and Stone, 1978) is installed as reference instrument for the measurement of low water vapour mixing ratios (1–1000 ppmv) with a relative accuracy of ±4 %. At water vapour mixing ratios above 1000 ppmv a dew/frost point hygrometer (DFH; General Eastern, Type D1311R) with an accuracy of ±0.5 K serves as a reference method. Up to three water vapour sensors can be simultaneously calibrated. They are positioned in the outlet duct flow of the Lyman-α fluorescence hygrometer and sample the air just after it has passed the hygrometer.
The calibration procedures are described in detail by Helten et al. (1998). The calibrations revealed that the relative humidity of a calibrated sensor (RH C) for a constant temperature T i (with subscript i indicating the ith temperature level of the calibration procedure) can be expressed by a linear relation

$$\mathrm{RH_C} = a(T_i) + b(T_i)\,\mathrm{RH_{UC}}, \qquad (1)$$

where RH UC is the uncalibrated output from an individual sensor, while offset a and slope b are determined as functions of temperature. At a fixed sensor temperature T i, three different levels of humidity are set which correspond to typical conditions encountered at the sensing element during in-flight operation in the troposphere. In order to derive the coefficients a and b as functions of temperature, calibrations have been performed at three temperature levels of −20, −30 and −40 °C, while at higher temperatures an extrapolation of the calibration to the nominal calibration of the manufacturer at 20 °C has been applied. However, since late 1999 additional calibrations at 0 °C and 20 °C have become standard in the calibration process to improve the accuracy of the measurements made in the corresponding altitude region between 0 and 5 km. From investigations made at constant temperature but at different pressures between 100 and 1000 hPa, no significant pressure dependence of the sensitivity of the humidity sensor was observed. Figure 7 shows the relation between the uncalibrated sensor (RH UC) at five sensor temperatures and relative humidity RH C as measured by the reference instruments: (i) Lyman-α fluorescence hygrometer (LAH) for T i of −40, −30 and −20 °C and (ii) dew/frost point hygrometer (DFH) for 0 °C and +20 °C. Excellent linear relationships were always observed.

Figure 8 (caption fragment). Upper panel: temperature difference between air flow (T AFL) and duct wall (T ACH), plus temperature differences (T Si − T AFL) between the three MOZAIC hygrometers (T S1, T S2 and T S3) and the air flow (T AFL), respectively.
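The determination of offset a and slope b at one temperature level can be sketched as an ordinary least-squares fit of the reference humidity against the uncalibrated sensor output. The function names and the use of ordinary least squares are assumptions made for illustration; the fitting method actually applied in the calibration software is not specified here.

```python
def fit_calibration(rh_uncalibrated, rh_reference):
    """Fit RH_C = a + b * RH_UC for one temperature level.

    rh_uncalibrated: uncalibrated sensor readings (% RH) at the humidity levels.
    rh_reference: reference hygrometer values (% RH) at the same levels.
    Returns the tuple (a, b) of offset and slope.
    """
    n = len(rh_uncalibrated)
    if n < 2 or n != len(rh_reference):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(rh_uncalibrated) / n
    mean_y = sum(rh_reference) / n
    sxx = sum((x - mean_x) ** 2 for x in rh_uncalibrated)
    if sxx == 0.0:
        raise ValueError("humidity levels must differ")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(rh_uncalibrated, rh_reference))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def apply_calibration(rh_uncalibrated, a, b):
    """Apply the linear calibration to a single uncalibrated reading."""
    return a + b * rh_uncalibrated
```

With three humidity levels per temperature level, as described above, the fit is overdetermined by one point, which gives a first check on the linearity of the sensor response.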

Error in the calibration procedure
As pointed out in the previous section, the sudden jump of MCH data towards higher RH values is caused by an error introduced in the sensor calibration in fall 1999 after (1) the calibration procedure was expanded by two additional temperature levels at 0 °C and +20 °C, and (2) the data acquisition software was switched from Pascal to LabView programming language. A typical behaviour of the temperature measured at different locations inside the environmental simulation chamber as a function of time during a calibration run is shown in Fig. 8. The following temperatures are measured with different sensors: (i) T AFL and T ACH are the temperatures of the air flow and at the wall inside the flow duct of the LAH, respectively; (ii) T S1, T S2 and T S3 are the temperatures of three different MCH units which are subject to calibration; (iii) T wall is the temperature of the wall inside the simulation chamber.

Table 2. Mean and standard deviations of the differences between calibration coefficients a(T) (offset) and b(T) (slope) for 1994 to 1997 (Helten et al., [...]) and [...] to 2009 (table body not recovered).

In the new data acquisition software the air flow temperature (T AFL) was no longer used but instead, by mistake, the wall temperature of the flow duct of the LAH reference instrument (T ACH) was applied. Since calibration was and is conducted at a variety of temperatures, adjustment of the wall temperature of the LAH to the changed air temperature (lower panel of Fig. 8) requires time. Because a standard calibration run always starts at the lowest air temperature level of −40 °C and then increases in steps of 10–20 °C towards higher temperature levels, T ACH values are systematically 1–3 °C, or even more, lower than the air flow temperature T AFL or the three sensor temperatures T S1, T S2 and T S3 (upper panel of Fig. 8). However, T Si are all very close to T AFL.
To derive relative humidity RH C, either from the water vapour volume mixing ratio measured by the LAH or from the dew/frost point temperature T DF measured by the DFH, the temperature of the air flow, T AFL, has to be applied in both cases in the equations

$$\mathrm{RH_{LAH}} = \frac{\mu_{\mathrm{LAH}}\; p_{\mathrm{air}}}{e_S(T_{\mathrm{AFL}})} \times 100\,\%, \qquad (2)$$

where µ LAH is the water vapour volume mixing ratio as measured by LAH, e S (T) is the saturation water vapour pressure at temperature T and p air is air pressure; and

$$\mathrm{RH_{DFH}} = \frac{e_S(T_{\mathrm{DF}})}{e_S(T_{\mathrm{AFL}})} \times 100\,\%, \qquad (3)$$

where T DF is the dew/frost point temperature as measured by DFH. Due to the erroneous use of the lower T ACH instead of T AFL, all RH C values were systematically too high. Consequently, this bias introduced systematic errors (larger values) in the offset a(T i) and slope b(T i) as derived from Eqs. (2) and (3) at five different air temperature levels (T i) of the calibration (Figs. 7 and 8).
There are no indications that the temperature sensors used have changed their performance over time. Thus, calibration coefficients for offset a and slope b (i.e. sensitivity) are affected by this systematic temperature bias of 1–3 K. Because saturation water vapour pressure e S (T) is a strong function of temperature and decreases almost exponentially with decreasing temperature (6 % K⁻¹ at 300 K and 10 % K⁻¹ at 200 K), it is obvious that the systematic temperature bias of 1–3 K can introduce systematic effects of 10 % or more in RH LAH or RH DFH and thus an impact of similar magnitude on the offset a and slope b of the calibration function (Eq. 1).
Consequently, this bias in the calibration function has had a quantitative impact of equal magnitude on the RH flight data, and thus correcting the bias requires (1) reprocessing of all pre- and post-flight calibrations made since 1999 by applying the correct temperature and (2) applying the corrected offset and slope as a function of the sensor temperature. Since all calibration records including T AFL and T ACH have been archived since the start of measurements in 1994, all calibrations and in consequence all MOZAIC RH flight data could be fully reprocessed.
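The size of the effect can be illustrated numerically. The sketch below assumes a Magnus-type approximation of the saturation vapour pressure over liquid water with commonly used coefficients; the formula, the coefficients and the function names are assumptions for illustration and are not taken from the calibration software.

```python
import math


def e_sat_liquid_hpa(t_celsius):
    """Saturation vapour pressure over liquid water in hPa (Magnus-type fit, assumed)."""
    return 6.1094 * math.exp(17.625 * t_celsius / (243.04 + t_celsius))


def rh_from_mixing_ratio(mu_ppmv, p_air_hpa, t_celsius):
    """Reference RH (%) from a volume mixing ratio in ppmv and air pressure in hPa."""
    partial_pressure_hpa = mu_ppmv * 1.0e-6 * p_air_hpa
    return 100.0 * partial_pressure_hpa / e_sat_liquid_hpa(t_celsius)


def relative_rh_bias(t_air_flow_c, t_duct_wall_c):
    """Relative overestimate of the reference RH when the duct wall temperature
    is used in place of the air flow temperature."""
    return e_sat_liquid_hpa(t_air_flow_c) / e_sat_liquid_hpa(t_duct_wall_c) - 1.0
```

Under these assumptions, a duct wall that is 1 K colder than the air flow at −40 °C already raises the derived reference RH by roughly one-tenth of its value, in line with the magnitude stated above, and the effect grows more than proportionally with the temperature difference.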

Quality assurance of calibration
The error analysis and the resulting corrective measures taken for the MCH calibration as described in the previous section yielded a set of calibration functions of offset a and slope b. In order to assure the quality of the obtained calibration functions, the statistical distribution of the obtained calibration parameters and their long-term stability were analysed similar to the analysis conducted at the beginning of the MOZAIC RH measurements. Comparing the scatter of reanalysed calibration parameters and their long-term stability with the results from the early period of this programme provides a measure for the quality of the reanalysed MOZAIC RH data and in particular a measure for the validity of the long-term time series of MOZAIC RH data from 1994 to 2009. The statistical distributions of the differences in parameters a and b between calibrations conducted before installation on an aircraft and after removal are shown in Fig. 9. Both frequency distributions are of Gaussian type, similar to the observations reported for the first set of calibration parameters by Helten et al. (1998). The respective mean values of parameters a and b and associated standard deviations are compiled in Table 2. The differences of slopes b between pre-flight and post-flight calibrations are centred at zero, i.e. the slopes do not change on a statistically significant level. On the other hand, the differences of offsets between pre- and post-flight calibrations are significant, shifting from −0.2 to −0.4, which however is a consistent finding for the periods 1994 to 1999 and 2000 to 2009.
The quantitative values of the statistical distribution of differences (a post − a pre) and (b post − b pre) are in unexpectedly close agreement for the analysed periods 1994–1999 and 2000–2009; see Table 2 for details. Smit et al. (2008) have shown that shifts of the sensor offset a are the dominant contribution to the uncertainty of the measurements, while the sensitivity (slope b) is stable in time. The observed consensus of data underpins the consistency of the RH data set which has emerged from the MOZAIC programme.
The long-term stability of sensor calibrations was investigated by checking calibration parameters of the same sensor over the entire analysed decade from 2000 to 2009. Results are shown in Fig. 10 with different colours referring to different sensor units; they agree well with previous findings reported by Helten et al. (1998). Although a significant scatter of calibration factors is observed among different sensor units, the behaviour of each single sensor unit is robust. Observed changes of offset a and slope b between a post-flight and the next pre-flight calibration are most likely caused by the cleaning procedure of the sensor in the laboratory prior to the pre-flight calibration. However, it should be mentioned that despite the consistency of the long-term sensor behaviour, only current calibration functions are used for the data analysis.
In a final assessment, the uncertainty of RH liquid data was analysed as a function of altitude or temperature, respectively. As is explained in detail by Helten et al. (1998), the analysis of the MOZAIC RH measurement is performed with the averages of the individual pre-flight and post-flight calibration coefficients a and b for each interval of flight operation.
Recalling details of sensor installation and operation, the capacitive humidity sensor is installed inside a conventional Rosemount inlet housing together with a Pt 100 temperature sensor. The movement of the aircraft forces airflow around the RH and T sensors but at a higher pressure and temperature than for the surrounding atmosphere due to adiabatic heating of the air when entering the inlet. The transformation of RH values measured by the capacitive sensor of the MCH (RH D; Helten et al., 1998) to RH values for ambient air temperature and pressure conditions (RH S; Helten et al., 1998) requires knowledge of the static air temperature (SAT) of ambient air and of the total air temperature (TAT) at the position of the capacitive sensor inside the MCH housing. The latter quantity TAT is calculated from the actually measured sensor temperature and the so-called recovery factor, which expresses the effect that the adiabatic conversion of energy into heat is not exactly 100 %, such that the temperature measured inside the housing, the total recovery temperature, is about 0–1.0 K lower than TAT, depending on aircraft speed. The housing manufacturer provides an empirical recovery factor to determine the real TAT from the measured recovery temperature.
Relative humidity of the ambient air (RH S ) is then determined from the measured values for RH D , TAT, and SAT by applying the procedure described by Helten et al. (1998). The uncertainty of RH is deduced by the law of error propagation with the uncertainty of these parameters.
The uncertainty of RH D is a composite of the uncertainty of the Lyman-α fluorescence hygrometer calibration and half of the absolute value of the differences of the individual pre-flight and post-flight calibration coefficients, a and b. To convert to the uncertainty of RH, the uncertainties of TAT (0.25 K) and SAT (0.5 K) have to be included. The contribution of uncertainty of the air speed measurement by the aircraft to the uncertainty of temperature determination is below 0.01 K and was excluded from the error propagation determination. The uncertainty of the recovery factor of the Rosemount probe housing contributes to the uncertainties of the temperature measurements and thus to the uncertainty of the recovered RH.
The major contribution to RH uncertainty stems from the differences of calibration coefficients a and b between pre-flight and post-flight calibrations. If these differences are in a similar range as the values listed in Table 2 and shown in Fig. 9, then this contribution is of the same order of magnitude as the uncertainty caused by the temperature uncertainty. The MOZAIC database contains estimates of the total uncertainty of RH for each individual data point.
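The transformation from RH D to RH S and the associated error propagation can be sketched as follows. The sketch assumes adiabatic compression of dry air with a conserved water vapour mixing ratio, a Magnus-type saturation vapour pressure formula extrapolated below its usual fit range, and independent uncertainties combined in quadrature; these are assumptions made for illustration, and the procedure of Helten et al. (1998) may differ in detail. All function names are hypothetical.

```python
import math

CP_OVER_R = 3.5  # ratio of specific heat to gas constant for dry air (assumed)


def e_sat_liquid_hpa(t_kelvin):
    """Saturation vapour pressure over liquid water in hPa (Magnus-type fit, assumed)."""
    t_c = t_kelvin - 273.15
    return 6.1094 * math.exp(17.625 * t_c / (243.04 + t_c))


def rh_ambient(rh_d, tat_k, sat_k):
    """Transform sensor-level RH (%) at TAT to ambient RH (%) at SAT."""
    # Ambient-to-sensor pressure ratio for adiabatic compression.
    pressure_ratio = (sat_k / tat_k) ** CP_OVER_R
    return rh_d * pressure_ratio * e_sat_liquid_hpa(tat_k) / e_sat_liquid_hpa(sat_k)


def rh_ambient_uncertainty(rh_d, tat_k, sat_k, u_rh_d, u_tat=0.25, u_sat=0.5):
    """Propagate the uncertainties of RH_D, TAT and SAT to ambient RH (%)."""
    step = 1.0e-3  # finite-difference step in K
    d_rh = rh_ambient(1.0, tat_k, sat_k)  # the transformation is linear in RH_D
    d_tat = (rh_ambient(rh_d, tat_k + step, sat_k)
             - rh_ambient(rh_d, tat_k - step, sat_k)) / (2.0 * step)
    d_sat = (rh_ambient(rh_d, tat_k, sat_k + step)
             - rh_ambient(rh_d, tat_k, sat_k - step)) / (2.0 * step)
    return math.sqrt((d_rh * u_rh_d) ** 2 + (d_tat * u_tat) ** 2 + (d_sat * u_sat) ** 2)
```

The sketch makes visible why small uncertainties at the sensor matter: at cruise conditions the sensor operates at a much lower RH than the ambient air, so the transformation amplifies both the calibration uncertainty and the temperature uncertainties.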
Since at the beginning the MOZAIC programme focused on the middle and upper troposphere, the pre-flight and post-flight calibrations of the humidity sensors above −20 °C were not performed before the year 2000. This means that for this period the coefficients a and b of the MOZAIC humidity sensors for measurements in the lower troposphere are based on the interpolation between pre-flight and post-flight calibrations at around −20 °C and the manufacturer's calibration at +20 °C. Also, estimates of calibration uncertainties based on pre-flight and post-flight analyses cannot be given for the lower troposphere for the period 1994–1999. Since 2000 the calibrations were extended to two additional temperature levels at 0 and +20 °C. Figure 11 shows the variations of uncertainties of RH measurements in % RH liquid for the altitude range covered by the observations. Uncertainties are calculated from the mean plus standard deviation of the individual total uncertainties over all MOZAIC data of 1994–1999 and 2000–2009. In the middle and upper troposphere the total uncertainties centre at approx. 4.5 % RH liquid (2.5–6.5 % RH liquid) for both periods. In the lower troposphere the total uncertainties for the first period of approx. 6 % RH liquid are slightly higher compared to the value of < 5 % RH liquid for the second period due to the missing calibrations at temperatures above −20 °C.

Figure 11. Mean uncertainty of MOZAIC relative humidity measurements in % RH liquid as a function of altitude (blue solid line) for periods 1994–1999 (left) and 2000–2009 (right). Horizontal bars represent the standard deviation of the mean uncertainty.
For measurements of stratospheric humidity, where RH liquid values below 5 % prevail, the uncertainty of the MOZAIC Capacitive Hygrometer is insufficient for quantitative water vapour measurements, since sensor response time is too slow to equilibrate at the low relative humidity and low temperatures. Thus, these data have to be considered carefully in the data analysis. However, cold and dry sequences in the lower stratosphere are used for an in-flight calibration of the sensor offset (calibration coefficient a) which is described in more detail by Smit et al. (2008).

Performance of MCH
In order to back up and extend data on the performance of the MCH which were collected in the beginning of MOZAIC RH measurements during formation flights of research aircraft equipped with water vapour instruments and MOZAIC aircraft (Helten et al., 1999), the MCH was operated aboard a Learjet 35A aircraft as part of the CIRRUS-III field study; see Kunz et al. (2008) and Krämer et al. (2009) for more information. A detailed analysis of the MCH performance during CIRRUS-III is provided elsewhere (Neis et al., 2014), while we present here a brief summary of campaign details and key findings.
The overarching goals of CIRRUS-III were to understand the formation mechanisms of cirrus clouds in different background conditions, their radiative effects and the microphysical properties of the cirrus cloud particles. In total six flights were conducted in the period between 23 and 29 November 2006 at mid-latitudes (45–70 °N) and at flight altitudes between 7 km and 12 km. These flights in the UT/LS were launched from Hohn Air Base in northern Germany with the Learjet 35A operated by enviscope GmbH. CIRRUS-III provided a data set with approx. 14 flight hours in air masses colder than −40 °C, approx. 4 flight hours in cirrus clouds and 10 flight hours out of cloud. Furthermore, stratospherically influenced air masses were sampled for 20 min with ozone volume mixing ratios (VMRs) above 125 ppbv and 35 min with ozone VMRs above 100 ppbv, respectively.
Part of the scientific payload of CIRRUS-III was dedicated to the measurement of water vapour and total water by one MCH for measuring relative humidity and one open path tuneable diode laser system (OJSTER; MayComm Instruments; May and Webster, 1993; Krämer et al., 2009) which delivered the water vapour VMRs. Simultaneously, total water, i.e. gas phase and ice water, was measured by the reference instrument FISH (Fast In-Situ Hygrometer). This closed-cell Lyman-α fluorescence hygrometer (Zöger et al., 1999) was equipped with a forward-facing inlet to sample also the ice particles. To determine whether a data point was inside a cirrus cloud or not, the difference between total water and water vapour was used to define a cloud index; see Krämer et al. (2009) for the detailed data analysis procedure.
For the sensor intercomparison study, data for H2O VMR > 1000 ppmv were excluded because at these large water vapour abundances the FISH instrument, which is based on the absorption of Lyman-α radiation by H2O molecules, becomes optically opaque and thus insensitive to further changes in VMR (Zöger et al., 1999). Furthermore, data at sensor temperatures TAT < −40 °C, i.e. below the MCH calibration limits, were excluded from the data analysis. In order to exclude warm clouds from the data set, the maximum ambient air temperature of accepted data was set to the level of instantaneous freezing of −40 °C. For a complete validation of the MCH the data set was split into a clear-sky set and a cirrus-cloud set by means of the above-described cloud index. Finally, flight sequences of the Learjet 35A with strong ascents and descents were excluded. These flight conditions are not suitable for instrument intercomparison, because already small time shifts between instruments with different response times lead to large differences due to the rapidly changing H2O VMR.
For the instrument intercomparison we analysed the sensors with respect to RH liquid since this is the quantity the MCH is calibrated against. The correlation between the two sensors is shown in Fig. 12 for RH liquid values averaged over 5 % RH liquid bins. The bin size was selected according to the expected uncertainty of the sensor of ±5 % RH liquid. The plotted data points and whiskers per bin shown in Fig. 12 represent the median and the 25th and 75th percentiles of the binned RH liquid data from the reference instruments (x-axis) and MCH (y-axis), respectively. The top panel of Fig. 12 illustrates the number of data points in each 5 % RH liquid bin.
In a cloud-free atmosphere (clear-sky section of Fig. 12) and around cirrus clouds (transition area in Fig. 12), MCH and reference instruments agree very well. Linear regression analysis provides a correlation coefficient R² = 0.99 and a slope m = 1.02 ± 0.03, while the y-axis intercept equals zero within the limit of uncertainty (−0.15 ± 1.29 % RH liquid). The data for RH liquid ≥ 75 % and RH liquid ≤ 10 % suffer from a small number of counts, but contribute only weakly to the MCH performance analysis because data bins were weighted by the number of contained data points.

Figure 13. Frequency of occurrence for observations of RH liquid during CIRRUS III; blue and red lines refer to data from reference hygrometers and the MOZAIC Capacitive Hygrometer (MCH) (Neis et al., 2014).
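The binning and the count-weighted regression described for the instrument intercomparison can be sketched as follows. The function names are hypothetical, and the exact regression procedure used for Fig. 12 is not specified beyond the weighting by the number of data points per bin, so this is an illustration rather than a reproduction of the analysis.

```python
from statistics import median


def bin_medians(reference, mch, bin_width=5.0):
    """Group paired RH values into bins of the reference RH.

    Returns a list of (median reference, median MCH, count), one entry per
    populated bin, ordered by increasing reference RH.
    """
    bins = {}
    for ref_value, mch_value in zip(reference, mch):
        index = int(ref_value // bin_width)
        ref_list, mch_list = bins.setdefault(index, ([], []))
        ref_list.append(ref_value)
        mch_list.append(mch_value)
    return [(median(r), median(m), len(r)) for _, (r, m) in sorted(bins.items())]


def weighted_linear_fit(points):
    """Weighted least-squares line through (x, y, weight) points.

    Returns the tuple (intercept, slope).
    """
    total = sum(w for _, _, w in points)
    mean_x = sum(w * x for x, _, w in points) / total
    mean_y = sum(w * y for _, y, w in points) / total
    sxx = sum(w * (x - mean_x) ** 2 for x, _, w in points)
    if sxx == 0.0:
        raise ValueError("at least two distinct bins are required")
    sxy = sum(w * (x - mean_x) * (y - mean_y) for x, y, w in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

Weighting each bin by its count keeps sparsely populated bins at the dry and humid ends of the distribution from dominating the fitted slope and intercept.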
Inside cirrus clouds, i.e. RH liquid > approx. 60 % (cirrus section of Fig. 12), deviations between instruments are larger, with a systematic bias of the reference instruments towards higher RH liquid values than measured by MCH. One potential and likely explanation is related to the fact that both reference instruments FISH and OJSTER report data on a 1 Hz basis while the response time of the MCH is of the order of 1 min or longer at these temperatures. Hence, small-scale fluctuations of high RH liquid values are captured by the reference instruments but not resolved by MCH.
Despite the weaker agreement between MCH and reference instruments close to and inside cirrus clouds, the data shown in Fig. 12 rule out the speculated contamination of MCH data by partial or complete evaporation of hydrometeors via adiabatic heating in the sensor housing; see e.g. Helten et al. (1998). This type of contamination would result in systematically higher RH liquid values measured by MCH inside clouds compared to reference instruments using another type of inlet. However, this behaviour was not found; for details see Neis et al. (2014).
The good quality of the MCH RH liquid data in a statistical sense is shown in Fig. 13. The PDFs for RH liquid agree well between MCH and the reference instruments (FISH or OJSTER, respectively) for the entire CIRRUS-III data set. The shift of the RH liquid PDF by one bin towards more humid data at cirrus cloud edges (transition area to cirrus in Fig. 12) can also be explained by the slower response time of the MCH under these conditions, because the MCH adjusts more quickly to higher RH liquid when entering cirrus clouds, while it requires a longer adjustment time when leaving the cloud and changing from higher to lower RH liquid. An in-depth analysis of the MCH performance including implications for the MCH data analysis is provided separately by Neis et al. (2014).

Discussion and conclusions
The identification of a bias of UTH data from the MCH towards more humid conditions (e.g. Lamquin et al., 2012) sparked an in-depth reanalysis of the entire MOZAIC UTH data set from year 2000 onwards, whereas MOZAIC MCH data from the pre-2000 period (Gierens et al., 1999) were found to be unbiased. The reanalysis identified an error in the analysis of the instrument calibration as the source for this bias. The entire calibration data set since year 2000 was reanalysed and the MOZAIC data set was reprocessed using the corrected calibration functions.
The annually averaged PDF of reprocessed UTH data from the MCH operated aboard the MOZAIC fleet is shown in Fig. 14. The reprocessed MOZAIC MCH data set exhibits the key features of physically sound UTH data, i.e. only a statistically insignificant fraction of the observations (< 10⁻⁴) is above the physical limit of 100 % RH liquid (Fig. 14a, c), and the inflection point of the PDF with respect to RH ice is close to 100 % RH ice (Fig. 14b, d).
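The two diagnostics named above, a normalised frequency distribution and the fraction of observations above a threshold, can be sketched as follows. The bin width of 5 %, the upper limit of 200 % and the function names are illustrative assumptions; the binning actually used for Fig. 14 is not specified in this section.

```python
def rh_pdf(rh_values, bin_width=5.0, rh_max=200.0):
    """Return the fraction of observations per RH bin over [0, rh_max).

    Values outside [0, rh_max) are ignored in counts and normalisation.
    """
    n_bins = int(rh_max / bin_width)
    counts = [0] * n_bins
    for rh in rh_values:
        idx = int(rh // bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


def fraction_above(rh_values, threshold=100.0):
    """Fraction of observations strictly above the threshold (in % RH)."""
    return sum(1 for rh in rh_values if rh > threshold) / len(rh_values)
```

Applied to RH liquid, `fraction_above` with a threshold of 100 % gives the quantity compared against 10⁻⁴ in the text.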
Concerning the scatter of data at high ice supersaturation (RH liquid ≥ 80 % or RH ice ≥ 130 %, respectively), it has to be noted that the PDFs displayed in Figs. 14b and d represent annual mean distributions with only a small fraction of data in this range of RH values. The mean uncertainty of MCH data is about 4–6 % RH liquid for the 1994–1999 period and about 4 % for the 2000–2009 period. Because the RH uncertainty is statistical rather than systematic, including the uncertainty range of approx. 5 % RH liquid in the calculation of the PDF would add data scatter but would not shift the PDF systematically.
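The correspondence between RH liquid and RH ice quoted above depends on temperature through the ratio of the saturation vapour pressures over liquid water and over ice. A minimal sketch is given below, assuming Magnus-type formulas with the commonly cited Sonntag coefficients; the saturation vapour pressure formulation used in the MOZAIC processing is not stated in this section, and the liquid-water formula is extrapolated here to supercooled conditions.

```python
import math


def e_sat_liquid_hpa(t_celsius):
    """Saturation vapour pressure over liquid water in hPa (Magnus-type)."""
    return 6.112 * math.exp(17.62 * t_celsius / (243.12 + t_celsius))


def e_sat_ice_hpa(t_celsius):
    """Saturation vapour pressure over ice in hPa (Magnus-type)."""
    return 6.112 * math.exp(22.46 * t_celsius / (272.62 + t_celsius))


def rh_liquid_to_rh_ice(rh_liquid, t_celsius):
    """Convert RH with respect to liquid water into RH with respect to ice."""
    return rh_liquid * e_sat_liquid_hpa(t_celsius) / e_sat_ice_hpa(t_celsius)


# Assumed example temperature of -50 degC, chosen for illustration.
print(round(rh_liquid_to_rh_ice(80.0, -50.0)))
```

Under these assumptions, 80 % RH liquid at −50 °C corresponds to roughly 130 % RH ice, consistent with the pair of thresholds quoted in the text; at other temperatures the ratio differs.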
The validity of the reprocessed MOZAIC UTH data set is further confirmed by the comparison with an extensive data set collected by Krämer et al. (2009); see the solid line in Fig. 14d. This data set is based on 28 research flights in 10 field campaigns in the UT/LS and in/around cirrus clouds using the Lyman-α fluorescence Fast In situ Hygrometers FISH (Zöger et al., 1999) as well as FLASH (Sitnikov et al., 2007) and the open-path tunable diode laser instrument OJSTER (Krämer et al., 2009). The PDFs shown in Fig. 14d refer to clear-sky conditions and are based on FISH total water measurements far off cirrus and FLASH or OJSTER gas phase measurements in the vicinity of cirrus.
The difference between the MOZAIC and the FISH-FLASH-OJSTER PDFs can be explained by the different underlying flight strategies. While in the MOZAIC programme flights are not targeted to scientific questions, the flights performed by FISH-FLASH-OJSTER are dedicated to research in the UT/LS and in/around cirrus clouds. Hence, the peak around 100 % RH ice is slightly higher and the peak at 10 % RH ice slightly lower in FISH-FLASH-OJSTER than in the MOZAIC PDF, since regions around cirrus are more frequently present in the research flights than in the regular passenger flights. Further, the larger fraction of data points at high ice supersaturation in the MOZAIC compared to the FISH-FLASH-OJSTER data set is due to the fact that MOZAIC data include occasional cirrus cloud encounters where ice supersaturation frequently occurs, whereas the FISH-FLASH-OJSTER data represent cloud-free conditions.
Major modifications of the MOZAIC RH data due to the reprocessing can be understood as a shift of single observation data towards drier conditions, i.e. towards lower RH liquid data. The shift cannot be parameterised in a simplistic way because its magnitude depends on the correction which has been applied to the calibration function of each single MCH unit.
However, from a statistical point of view, the major modifications of the data set are that the fraction of observations close to or above ice supersaturation is significantly reduced and that the inflection point of RH ice data is shifted from RH ice ≅ 130 % to 100 %. In contrast, fractional changes in the RH liquid range between 20 and 60 % are only minor. Finally, the maximum of RH liquid values for dry conditions, which is associated with observations in the dry and cold lowermost stratosphere, is shifted from RH liquid ≅ 10 % to 5 %.
We have evaluated all previous studies, which have potentially used the flawed MOZAIC water vapour data, addressing the extent to which the wet bias may have influenced the results and the conclusions made.
Studies by Crowther et al. (2002), Offermann et al. (2002), and Spichtinger et al. (2004) analysed MOZAIC UTH data from the period 1995–1999, whereas Nedoluha et al. (2002) and Kley et al. (2007) Fig. 5b in . Using reanalysed data would lower these enhanced UTH values to values common to the period before 2000. Conclusions drawn are not influenced. Most of the comparison has been performed on decadal averages of UTH data such that the impact of the wet bias is of minor influence on the results because the variability of UTH is very large in that region. Ekström et al. (2007, 2008) compared RH ice values from ODIN (ODIN-SMR is a limb sounder operating in the 500 GHz region) at 200 hPa with MOZAIC RH ice at 200 hPa for the period 2001–2004 over tropical regions. The agreement of the PDF for RH ice from ODIN and MOZAIC sensors is better than 5 % RH ice, which is within the retrieval error of ODIN. In consequence, using reanalysed MOZAIC data for the intercomparison would suggest that ODIN-SMR shows a wet bias of about 10 % on a relative scale; see the PDF shown in Fig. 7 of Ekström et al. (2007). In their subsequent study, Ekström et al. (2008) compared PDFs of RH ice measured by ODIN, AURA-MLS and UARS-MLS with MOZAIC UTH data optimised at 205 hPa; see Fig. 4 of their paper. They found that MOZAIC UTH data are slightly wetter. Thus, agreement would improve if the MOZAIC PDF of RH ice were shifted by about 10 % RH ice towards drier values. However, uncertainties in satellite retrievals are large, so that conclusions drawn in the paper are not affected at all by the wet bias of the MOZAIC UTH data. Kunz et al. (2008) used climatological data of MOZAIC UTH from the period August 1994–December 2005 for comparison with SPURT-FISH data on UTH which were collected in the periods November 2001 and July 2003 during dedicated research flights. Applying the performed statistical analyses to reanalysed MOZAIC data would reduce the reported difference between the PDFs of H₂O volume mixing ratio of SPURT and MOZAIC. Further statistical studies focused on the analysis of variances. In this case, the wet bias of MOZAIC UTH data is only of minor influence and the conclusions drawn by Kunz et al. (2008) are not affected. Heise et al. (2008) used MOZAIC UTH data from March 2001 to February 2006 for the comparison of UTH and temperature results from GPS Radio Occultation aboard the CHAMP mini-satellite with MOZAIC measurements. Observed wet bias effects of MOZAIC UTH data compared to ECMWF and CHAMP results can be explained qualitatively, and in part quantitatively, by the 10 % RH liquid wet bias of MOZAIC UTH data; see Fig. 3 of Heise et al. (2008). Agreement between CHAMP and MOZAIC increases when using revised MOZAIC UTH data. Sahu et al. (2009, 2011) analysed MOZAIC UTH data and RH liquid vertical profiles over Delhi/India for the period 1996 to 2001. Data are lumped together to obtain sufficient statistical relevance for investigating the seasonal variations on a monthly average basis. RH liquid (%) and H₂O mass mixing ratio (g kg⁻¹) are analysed only in a qualitative way. Since the period 2000–2001 contributes only 1/3 to the monthly averages, the MOZAIC RH liquid data revision is of limited relevance. Lamquin et al. (2012) raised the issue of the wet bias, and the data were corrected by 10 % RH liquid such that the major impact had already been corrected for. Results and conclusions are appropriate.
Figure 14. Annually averaged probability distribution of UTH observations from the MOZAIC Capacitive Hygrometer with respect to RH liquid (a, c) and RH ice (b, d) for the indicated periods; the solid line in panel (d) represents the average RH ice PDF for the UTH clear-sky data set reported by Krämer et al. (2009).
In conclusion, the reanalysis of MOZAIC RH data should be considered for studies which have focused on the investigation of ice supersaturation in the UT and used mainly MOZAIC data from year 2000 and later. The reprocessed UTH data set from measurements aboard MOZAIC aircraft will become available at the IAGOS/MOZAIC Database website http://www.iagos.fr/web/ for scientific exploration as Version No. 1.