The total aviation effective radiative forcing is dominated by

Since ISS is more common in some dynamical regimes than in others, the aim of this study is to find variables/proxies that are related to the formation of ISSRs and to use these in a regression method to predict persistent contrails. To find the best-suited proxies for regressions, we use various methods of information theory. These include the log-likelihood ratios, known from Bayes' theorem, a modified form of the Kullback–Leibler divergence, and mutual information. The variables (the relative humidity with respect to ice, RH

It turns out that RH

In order to avoid persistent (warming) contrails, it is necessary that they can be reliably predicted. For this aim, three conditions need to be fulfilled. First, the formation of contrails has to be predicted with reasonable skill. Contrails form if (super)saturation with respect to water occurs during the mixing process of the ambient air with the exhaust gases from the aircraft. This criterion is called the Schmidt–Appleman criterion

While the first of these conditions, the ability to predict the SAC, is generally fulfilled with a satisfying quality, this is not the case for the prediction of ice supersaturation

There are several reasons why the prediction of persistent contrails is currently challenging. The main reason is the strong variability in the atmospheric water vapour field. Water occurs in all three phases and is involved in chemical and aerosol processes, so its concentration varies greatly in space and time. The problem is intensified by the low number of humidity measurements at cruise levels that are available for data assimilation, which is necessary to keep the simulation of a complex system close to measured reality. Therefore, more data on relative humidity at flight levels are urgently needed. Note that satellite data cannot fill this gap since their vertical resolution is insufficient

However, there is growing interest in reducing the climate impact of aviation nowadays, and a relatively straightforward possibility would be the avoidance of the formation of persistent contrails if only ice supersaturation could be predicted with the precision necessary for flight routing. Because of the challenges mentioned before, the relative humidity field is insufficient for this purpose, and we need either corrections to the humidity field

In the present paper, we concentrate on the prediction of ice supersaturation, i.e. the prediction of persistent contrails. For this purpose, we use data obtained from an instrumented passenger aircraft and reanalysis data, which are explained in Sect.

Various data sources are utilised in this study. These are briefly described in the following sections, Sect.

In this study, we use pressure and relative humidity with respect to ice (RH

For this study, we have chosen aerial boundaries of

In addition to the data from commercial aircraft, hourly ERA5 high-resolution realisation (HRES) reanalysis data

In this work, we consider whether it is possible to use the dynamical proxies suggested by

The simplest way to map the values of the six dynamical proxies to probabilities for ice supersaturation or contrail persistence is to divide the phase space into six-dimensional rectangles/blocks (six because of the six dynamical proxies suggested by
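The block approach can be illustrated with a minimal sketch. It assumes that each sample is a tuple of proxy values with a matching ISS flag, and that the bin edges per dimension are chosen by the user; the function names `block_index` and `block_probabilities` and the two-dimensional toy data are hypothetical and not taken from the study.

```python
from bisect import bisect_right
from collections import defaultdict


def block_index(values, edges_per_dim):
    """Map one sample (a tuple of proxy values) to the index tuple of its block.

    Values outside the outermost edges are clamped to the first or last bin.
    """
    return tuple(
        min(max(bisect_right(edges, v) - 1, 0), len(edges) - 2)
        for v, edges in zip(values, edges_per_dim)
    )


def block_probabilities(samples, iss_flags, edges_per_dim):
    """Relative frequency of ISS in every block that contains at least one sample."""
    counts = defaultdict(lambda: [0, 0])  # block -> [n_total, n_iss]
    for values, is_iss in zip(samples, iss_flags):
        key = block_index(values, edges_per_dim)
        counts[key][0] += 1
        counts[key][1] += int(is_iss)
    return {key: n_iss / n_total for key, (n_total, n_iss) in counts.items()}


# Toy example with two proxies instead of six (illustrative numbers only).
edges = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0]]
samples = [(0.5, 0.5), (0.5, 0.6), (1.5, 1.5)]
flags = [True, False, True]
probabilities = block_probabilities(samples, flags, edges)
```

With six proxies the number of blocks grows as the product of the bin counts per dimension, so many blocks may contain few or no samples in practice.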

We are interested in whether persistent contrails are possible or not, i.e. whether there is ice supersaturation or not. Unfortunately, the moisture field in the models is not accurate enough for that purpose. So, how can we solve this problem?

It is known that ISS is more frequent in some dynamic situations than in others

Let us assume there is a value

Naively, one could compare

Another possibility of framing Bayes’ theorem for the present problem is to use an odds ratio:

The first factor on the right side of Eq. (

As long as there is only one special value

As one does not know in advance whether a situation is ISS or not, it is best to also use the corresponding expectation of the absolute logit,
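The quantities discussed here can be sketched numerically with binned probability estimates. The sketch below makes two assumptions that are not confirmed by the text: the densities are approximated by histograms over user-chosen bin edges, and the expectation of the absolute logit is taken with respect to the prior-weighted mixture of the two conditional distributions. All function names are hypothetical.

```python
import math
from bisect import bisect_right


def bin_probabilities(samples, edges):
    """Fraction of samples per bin; values outside the edge range are ignored."""
    counts = [0] * (len(edges) - 1)
    for value in samples:
        if edges[0] <= value <= edges[-1]:
            idx = min(bisect_right(edges, value) - 1, len(counts) - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples inside the bin range")
    return [c / total for c in counts]


def log_likelihood_ratios(p_iss, p_non):
    """Per-bin log(p_iss / p_non); None where the ratio is undefined."""
    return [
        math.log(p / q) if p > 0 and q > 0 else None
        for p, q in zip(p_iss, p_non)
    ]


def posterior_probability(llr, prior_iss):
    """Odds form of Bayes' theorem: posterior odds = likelihood ratio * prior odds."""
    prior_odds = prior_iss / (1.0 - prior_iss)
    posterior_odds = math.exp(llr) * prior_odds
    return posterior_odds / (1.0 + posterior_odds)


def expected_abs_logit(p_iss, p_non, prior_iss):
    """Mixture-weighted mean of |log(p_iss / p_non)| (assumed weighting).

    Bins where either probability is zero are skipped.
    """
    total = 0.0
    for p, q in zip(p_iss, p_non):
        if p > 0 and q > 0:
            weight = prior_iss * p + (1.0 - prior_iss) * q
            total += weight * abs(math.log(p / q))
    return total
```

A proxy whose two conditional distributions are identical yields an expectation of zero, while well-separated distributions yield large values.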

Note that

The conditional probability densities of the dynamical proxies of ERA5 for ISS and

For the calculation of the expectation of the absolute logit

To apply the Bayesian law for several proxies simultaneously, e.g. as for

Expectation values for absolute logit of the different proxies.

Log-likelihood ratios for dynamical quantities. Positive values raise the probability for ISS and negative values lower it. The probability for ISS exceeds the probability for

The absolute logarithm of the quotient of the densities for ISS and

The dynamical candidate proxies are not independent quantities, and one has to take care that a regression is not formulated with redundant information. But, of course, a variable that has some relation with (i.e. information on) the relative humidity is welcome. Above, we see that RH

The mutual information is a measure of information that one variable,
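As an illustration, mutual information between two sampled variables can be estimated from a joint histogram. The following is a minimal sketch that assumes equal-width binning and a result in bits; the bin count, the binning scheme and the function names are illustrative choices and not necessarily those used in the study.

```python
import math
from collections import Counter


def to_bin(value, lo, hi, n_bins):
    """Equal-width bin index of value within [lo, hi]."""
    if hi == lo:
        return 0
    return min(int((value - lo) / (hi - lo) * n_bins), n_bins - 1)


def mutual_information(xs, ys, n_bins=10):
    """Histogram estimate of I(X;Y) in bits for paired samples xs, ys."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    x_lo, x_hi = min(xs), max(xs)
    y_lo, y_hi = min(ys), max(ys)
    pairs = [
        (to_bin(x, x_lo, x_hi, n_bins), to_bin(y, y_lo, y_hi, n_bins))
        for x, y in zip(xs, ys)
    ]
    n = len(pairs)
    joint = Counter(pairs)
    marg_x = Counter(i for i, _ in pairs)
    marg_y = Counter(j for _, j in pairs)
    mi = 0.0
    for (i, j), count in joint.items():
        # p(x,y) / (p(x) p(y)) expressed with raw counts
        mi += (count / n) * math.log2(count * n / (marg_x[i] * marg_y[j]))
    return mi
```

Histogram estimates depend on the number of bins and tend to be biased upwards for small samples, so values are best compared between proxies that were binned in the same way.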

Since we assume the humidity of MOZAIC/IAGOS (RH

To be a good proxy for a regression, it must not only be well correlated with RH

As mentioned before, out of all quantities, RH

The mutual information matrix,

A generalised additive model (GAM) is a regression method for predicting a response

The procedure is as follows: for the tests,

In this study, the equitable threat score (ETS) is used to validate and compare the prediction accuracy of the different GAMs (with varied input parameters) with that of the raw data. For the calculation of the ETS

Contingency table for predicting and observing persistent contrails.

The sum of the events is labelled as

If the prediction agrees perfectly with the observation, ETS

In order to fill the contingency table, it is necessary to decide on a conditional probability threshold
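The verification step can be sketched as follows, assuming the standard definition of the equitable threat score, in which the number of hits expected by chance is subtracted from the numerator and the denominator. The argument names (hits, false alarms, misses, correct negatives) are generic and may differ from the labels used in the contingency table of the paper; the helper `contingency_table` is hypothetical.

```python
import math


def contingency_table(probabilities, observed, threshold):
    """Count hits, false alarms, misses and correct negatives.

    An event is predicted wherever the probability reaches the threshold.
    """
    hits = false_alarms = misses = correct_negatives = 0
    for prob, obs in zip(probabilities, observed):
        predicted = prob >= threshold
        if predicted and obs:
            hits += 1
        elif predicted and not obs:
            false_alarms += 1
        elif obs:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def equitable_threat_score(hits, false_alarms, misses, correct_negatives):
    """ETS = (hits - hits_random) / (hits + false_alarms + misses - hits_random)."""
    n = hits + false_alarms + misses + correct_negatives
    if n == 0:
        raise ValueError("empty contingency table")
    hits_random = (hits + false_alarms) * (hits + misses) / n
    denominator = hits + false_alarms + misses - hits_random
    if denominator == 0:
        return math.nan  # undefined, e.g. if the event never or always occurs
    return (hits - hits_random) / denominator


# Toy example (illustrative numbers only).
table = contingency_table([0.9, 0.2, 0.7, 0.1], [True, False, False, True], 0.5)
score = equitable_threat_score(*table)
```

Repeating the calculation for a range of thresholds shows how sensitive the score is to the chosen probability threshold.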

Table

In

As we saw in Sect.

For

Next, we use all proxies that show separate distributions in their probability density functions (PDFs, not shown),

Even if we supposedly put more information into the GAM using more proxies, the ETS does not increase.

The relative humidity cannot be omitted as an input variable: even though it is imprecise, excluding it drastically decreases the ETS value.

These two new insights may be explained by the log-likelihood ratios (Fig.

Using the same proxies as before and adding RH

It seems that GAMs that use dynamical proxies barely outperform a simple GAM that uses only relative humidity and temperature. At least the ETS values obtained via the GAMs (that is, for the prediction of potential persistent contrails) distinctly exceed those obtained from a simple check of the ISS prediction, as can be seen from the study of

Note that despite

As we have seen, the relative humidity should definitely be used as an input for a GAM, although it is not very precise. Two questions arise. Can the regression results be improved by applying corrections to the relative humidity field from the weather forecast models? And why are even the most advanced regression methods unable to yield better ETS values? These questions are dealt with in the next section.

Results of comparing RH

If weather forecasts were perfect, contrail persistence could easily be predicted using temperature and relative humidity alone and it would not be necessary to use any proxies. Unfortunately, it seems that the predicted humidity field in particular (at least from ERA5 but certainly from other weather models as well) is not good enough to allow for such a forecast for single flights, that is, waypoint to waypoint

We suspect that the root of the problem of predicting ice supersaturation and contrail persistence is the excessive overlap of the two conditional humidity PDFs, namely

The results are shown in Table

Conditional probability density functions

Results of the sensitivity test.

As Sect.

The quantile mapping procedure uses the two cumulative distributions of RH
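A minimal sketch of empirical quantile mapping is given below. It assumes that two sorted samples are available, one of model humidity and one of observed humidity, and it maps a model value to the observed value with the same empirical cumulative probability. The function name and the toy numbers are hypothetical, and the study may use a different (for example, smoothed or interpolated) form of the cumulative distributions.

```python
from bisect import bisect_right


def quantile_map(value, model_sorted, obs_sorted):
    """Map a model value to the observed value at the same empirical quantile.

    Both samples must be sorted in ascending order.
    """
    n_model, n_obs = len(model_sorted), len(obs_sorted)
    # Number of model values <= value, i.e. the empirical CDF times n_model.
    rank = bisect_right(model_sorted, value)
    # Index of the matching observed quantile: ceil(rank * n_obs / n_model) - 1,
    # computed with integers to avoid floating-point rounding.
    idx = -(-rank * n_obs // n_model) - 1
    idx = min(max(idx, 0), n_obs - 1)
    return obs_sorted[idx]


# Toy example (illustrative numbers only, in percent relative humidity).
model_rhi = [10, 20, 30, 40]
observed_rhi = [15, 30, 45, 60]
corrected = [quantile_map(v, model_rhi, observed_rhi) for v in model_rhi]
```

Because the mapping is monotonic, it changes the distribution of the humidity values but not their rank order.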

Illustration of quantile mapping for RH

In a study by

Table

We check how good the prediction of ice supersaturation is using the corrected versions of RH

Results of comparing RH

Now, we use the same proxies as in

Unfortunately, it turns out that neither a GAM produced with quantile-mapped ERA5 humidity values nor a GAM where the

The probable reason for this negative result is seen in Fig.

Another reason for the insensitivity of the GAMs to these corrections may be that they absorb such modifications in the coefficients of the non-linear smooth functions. This may become clearer if one thinks of a linear regression (
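The linear analogue of this argument can be checked numerically. The sketch below fits an ordinary least-squares line to a predictor and to an affinely rescaled version of it; the fitted coefficients differ, but the predictions coincide. The data, the rescaling factors and the function name `fit_line` are invented for illustration, and the exact invariance holds only for affine corrections in a linear model, not for a GAM in general.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Illustrative numbers only: a predictor and a response.
raw = [80.0, 90.0, 100.0, 110.0, 120.0]
response = [0.1, 0.3, 0.4, 0.8, 0.9]

# A hypothetical affine "correction" of the predictor.
scale, shift = 1.1, 5.0
corrected = [scale * x + shift for x in raw]

slope_raw, icpt_raw = fit_line(raw, response)
slope_cor, icpt_cor = fit_line(corrected, response)

pred_raw = [slope_raw * x + icpt_raw for x in raw]
pred_cor = [slope_cor * x + icpt_cor for x in corrected]
```

The correction is absorbed entirely by the coefficients (the slope is divided by the scale factor), which is why the fitted values, and hence any score derived from them, remain unchanged.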

The conditioned PDFs

There are various approaches to minimising the climate impact of aviation. One of these approaches is to prevent the formation of persistent contrails by avoiding flying through ice-supersaturated regions, where contrails can last for hours. For implementing such aircraft diversions, these regions have to be accurately predicted in terms of time and location, which is currently associated with difficulties and uncertainties. This is mainly due to the inaccurate forecast of the relative humidity. Since ice supersaturation (ISS) is more common in some dynamical regimes than in others, we use different dynamical proxies (in addition to the relative humidity with respect to ice RH

To find out which dynamical variables are best suited for the regressions and which do not provide redundant information, we use various methods of information theory and test them. These include the log-likelihood ratios, known from Bayes' theorem; a modified form of the Kullback–Leibler divergence, which we call the expectation of the absolute logit; and the mutual information.

Log-likelihood ratios with values greater than

Particularly high values of the expectation for absolute logits are found for RH

Furthermore, to estimate the suitability of a proxy for a regression, we use the mutual information, which is a measure of how much information one variable,

We use the most promising variables in several regression models to predict

It turns out that the dynamical proxies hardly provide information on the question of whether a situation is

In the present paper, we use the meteorological data only at the point and time where the prediction of ice supersaturation is required. One can increase the effort and use additional forecast data from earlier points in time and locations upstream of the location of interest

We conclude that the representation of RH

The R code can be shared on request.

ERA5 data can be obtained from the Copernicus Climate Data Service at

This paper is part of SH's PhD thesis. SH wrote the codes, ran the calculations, analysed the results, and produced the figures. KG supervised her research. SH and KG discussed the methods and results and wrote the paper. SR curated the MOZAIC/IAGOS data.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.

The authors would like to thank Robert Sausen for the helpful discussion and Johanna Mayer for her thorough read-through and comments on a draft of the paper.

This research has been supported by the Horizon 2020 Framework Programme H2020 Societal Challenges as part of project ACACIA (grant no. 875036). The article processing charges for this open-access publication were covered by the German Aerospace Center (DLR).

This paper was edited by Fangqun Yu and reviewed by two anonymous referees.