Using a Bayesian framework in the inverse problem of estimating the source of an atmospheric release of a pollutant has proven fruitful in recent years. Through Markov chain Monte Carlo (MCMC) algorithms, the statistical distributions of the release parameters, such as the location, the duration, and the magnitude, as well as of the error covariances, can be sampled so as to obtain a complete characterisation of the source.
In this study, several approaches are described and applied to better quantify these distributions, and therefore to get a better representation of the uncertainties.
First, we propose a method based on ensemble forecasting: physical parameters of both the meteorological fields and the transport model are perturbed to create an enhanced ensemble.
To account for physical model errors, the importance of each ensemble member is represented by a weight, sampled together with the other variables of the source.
Second, after showing that the choice of the statistical likelihood alters the nuclear source assessment, we suggest several suitable distributions for the errors.
Finally, we propose two specific designs of the covariance matrix associated with the observation error.
These methods are applied to the source term reconstruction of the

The inverse modelling of a nuclear release source is an issue fraught with uncertainties

A major source of uncertainties in the inverse modelling for source term estimation of nuclear accidents originates from the meteorological fields and the transport models

We wish to parametrise the distribution of the variable vector

The shape of the posterior distribution strongly depends on the uncertainties related to the data and the modelling choices,
which include the definitions of the meteorological data and transport models as well as the definition of the likelihood.
The objective of this study is to investigate the various sources of uncertainties compounding the problem of source reconstruction,
and to propose solutions to better evaluate them, i.e. to increase our confidence in the reconstructed posterior distribution.
The quantification of the uncertainties largely depends on the definition of the likelihood and its components (for example, a corresponding covariance matrix).
The choice of the likelihood is the concern of Sect.

In this paper and in the case of a source of unknown location, the predictions are written as

Equation (

the observations

the physical models: the meteorological fields

the likelihood definition: its choice and the design of its associated error covariance matrix

the representation error: the release rates

the choice of the priors.

This study is a continuation of a previous study by the authors

Subsequently, these three sources of uncertainty are explored in an application of source term estimation of the

In the field of source assessment, and more precisely radioactive material source assessment,
the likelihoods are often defined as Gaussian

With the assumption

The whole set of measurements should provide information:
if the inversion is dominated by the few measurements with the largest errors (which may be outliers), valuable information provided by the other measurements may be missed.
More generally, the following inventory lists the criteria that a good likelihood choice under the assumption

positive domain of definition: should be defined for values on the semi-infinite interval

symmetry between the prediction vector and the observation vector, i.e.

close to proportionality: the ratio of the cost function value of a couple

existence of a covariance matrix, and of a term able to play the role of the modelled predictions. Indeed, the likelihood measures the difference between the observations and the predictions, which should therefore appear as a parameter of the distribution. Distributions with a location parameter comply with this requirement.

Their difference lies in the treatment of the relative quantity

For all choices, the value of

We should also consider that the log-Cauchy distribution needs a second threshold

As will be shown later, the choice of the likelihood has in practice a significant impact on the shape of the posterior distribution. Hence, to better describe the uncertainties of the problem, the approach proposed here is to combine and compare the distributions obtained with these three likelihoods.
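To make the comparison concrete, a minimal sketch of the three cost functions (negative log-likelihoods, up to additive constants) is given below. The log-space residual with an additive threshold and the unit scale parameter are illustrative assumptions, not the paper's exact parametrisation.

```python
import math

def log_cost(obs, pred, thr, dist="log-normal", scale=1.0):
    """Cost (negative log-likelihood, up to constants) of one observation.

    The residual r compares observation and prediction in log space;
    thr is a small threshold keeping the logarithm defined for
    zero-valued measurements (hypothetical parametrisation).
    """
    r = (math.log(obs + thr) - math.log(pred + thr)) / scale
    if dist == "log-normal":   # squared residual, sensitive to outliers
        return 0.5 * r * r
    if dist == "log-laplace":  # absolute residual, more robust
        return abs(r)
    if dist == "log-cauchy":   # bounded influence of very large residuals
        return math.log(1.0 + r * r)
    raise ValueError(dist)

# A large observation/prediction mismatch is penalised very differently
# by the three choices (heavier to lighter tails):
obs, pred, thr = 100.0, 1.0, 0.1
costs = {d: log_cost(obs, pred, thr, d)
         for d in ("log-normal", "log-laplace", "log-cauchy")}
```

Note that the residual is antisymmetric in `obs` and `pred`, so all three costs satisfy the symmetry criterion between the prediction and observation vectors listed above.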

The likelihood definition, and therefore the posterior distribution shape, is also greatly impacted by the modelling choice of the error covariance matrix

This critical reduction can lead to paradoxes.
With

In the following, we refer to this algorithm as the observation-sorting algorithm. A justification of the use of this clustering using the Akaike information criterion (AIC) is proposed in Appendix

We now propose a second approach to improve the design of the covariance matrix

Using both methods, the set of variable

As explained in Sect.

First, ensemble weather forecasts can be used to represent variability in the meteorological fields. The members of the ensemble are based on a set of

Therefore, to create an ensemble of observation operators with both uncertainty in the meteorological fields and in the transport parametrisation,
a collection of observation operators

choosing a member from an ensemble of meteorological fields and hence a discrete value in

a constant deposition velocity in

a distribution of the height of the release between two layers defined between 0 and 40

a multiplicative constant on the Kz values chosen in

Once the set of operators has been built, the idea is to combine them linearly to get a more accurate forecast. A weight
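The weighted linear combination of the member predictions can be sketched as follows; the list-of-lists layout and the normalisation check are illustrative choices, not the paper's implementation.

```python
def combine(ensemble_predictions, weights):
    """Linear combination of the predictions of an ensemble of
    observation operators.

    ensemble_predictions[k][i] is the prediction of member k for
    observation i; the weights are assumed to sum to one.
    """
    assert abs(sum(weights) - 1.0) < 1e-9
    n_obs = len(ensemble_predictions[0])
    return [sum(w * member[i]
                for w, member in zip(weights, ensemble_predictions))
            for i in range(n_obs)]

# Two members, two observations, weights favouring the second member:
combined = combine([[1.0, 2.0], [3.0, 6.0]], [0.25, 0.75])
```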

Several methods are used in Sect.

Maximum air concentrations of

Main configuration features of the ldX dispersion simulations for the

Density of the longitude for a log-Laplace likelihood with threshold

In this section, the methods are applied to the detection event of

Small quantities of

The concentration measurements used in this study are available in

All simulations, described in Sect.

Simulations are performed forward in time from 22 September 2017 at 00:00

In a Bayesian framework, the prior knowledge on the control variables for the

We rely on Markov chain Monte Carlo (MCMC) algorithms to sample from the target

The transition probabilities used for the random walk of the Markov chains are defined independently for each variable and based on the folded-normal distribution as described by

The variances of the transition probabilities are chosen based on experimentation and are set to be
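A minimal componentwise Metropolis sampler with folded-normal proposals might look as follows. Reflecting the Gaussian step at zero is an assumed reading of the folded-normal transition, and the sketch omits the parallel tempering and tuning used in the actual experiments.

```python
import math
import random

def folded_normal_step(x, sigma, rng):
    """Random-walk proposal based on a folded-normal transition:
    a Gaussian step reflected at zero, so the proposal stays in the
    positive domain. The folded-normal kernel is symmetric in (x, x'),
    so no Hastings correction is needed below."""
    return abs(x + rng.gauss(0.0, sigma))

def metropolis(log_post, x0, sigmas, n_iter, seed=0):
    """Componentwise Metropolis sampler with an independent
    folded-normal proposal variance per variable (illustrative)."""
    rng = random.Random(seed)
    x = list(x0)
    lp = log_post(x)
    chain = []
    for _ in range(n_iter):
        for i, s in enumerate(sigmas):
            prop = list(x)
            prop[i] = folded_normal_step(x[i], s, rng)
            lp_prop = log_post(prop)
            if math.log(rng.random()) < lp_prop - lp:
                x, lp = prop, lp_prop
        chain.append(list(x))
    return chain
```

For instance, targeting an exponential density (log-posterior `-x`) yields positive samples with mean close to one.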

To see the impacts of the techniques proposed in Sect.

Sect.

Sect.

Sect.

We present here an experiment supporting the observation-sorting method.
A reconstruction of the source variables is proposed using the enhanced ensemble of observation operators, only on the first

Figure

The mean of the observation error variance

What happens when the observation-sorting algorithm is not used (orange histogram), i.e. with the basic design

The observation-sorting algorithm is a clustering algorithm that avoids this compromise.
Observations that are always equal to their predictions (i.e. associated with a very small observation error variance, or with a very high confidence, for all probable sources) – the non-discriminant observations – are assigned a specific observation error variable. In this way, the error variance associated with the remaining observations is far more appropriate. This clustering is fully justified, as explained in Appendix

Finally, note that sampling the longitude posterior distribution using

Pdfs of the coordinates describing the

Pdfs of the total retrieved released activity (TRRA) describing the

In this section, we study two cases.
First, we assess the impact of the choice of the likelihood on the reconstruction of the control variable pdfs (

Evolution of the meridional wind of the HRES and EDA meteorologies for a random location in Europe since the beginning of the simulations. The red curve represents the mean of the meridional wind EDA members with the space between the minimum and maximum values

Figures

The daily total retrieved released activity (TRRA) was mostly significant on 25 September. The overlap between the release pdfs is smaller than the overlap between the coordinate pdfs; probable TRRA values range from 140 to 300 TBq.

This shows that using a single likelihood is not enough to aggregate the whole uncertainty of the problem. Furthermore, these graphs show that the choice of the likelihood threshold also has a moderate impact on the final coordinate pdfs and a strong impact on the TRRA pdfs. More precisely, the daily TRRA pdfs obtained from the log-normal and the log-Laplace choices are moderately impacted by the choice of the threshold value.

Figures

Pdfs of the variables describing the

Densities of the weights of the members of the enhanced ensemble using the parallel tempering method and the observation-sorting algorithm for diverse likelihoods:
log-normal with threshold 0.5

Before reconstructing the pdfs of the

The original ERA5 EDA meteorology is under-dispersive as can be seen in Fig.

To examine the spread of the ensemble of observation operators, we need to define a reference source

We now study the impact of adding meteorological and transport uncertainties into the sampling process:

The pdfs of the member weights are displayed in Fig.

Member 17 is present four times and corresponds to a deposition velocity of

These conclusions must, however, be strongly qualified; they are mainly intended to illustrate the interest and potential of the method.

In this paper, we proposed several methods to quantify the uncertainties in the assessment of a radionuclide atmospheric source. In the first step, we examined the impact of the choice of the likelihood, which largely defines the posterior distribution when the chosen priors are non-informative. Several likelihoods quantifying the fit between observations and predictions were selected from a list of criteria: the log-normal, log-Laplace, and log-Cauchy distributions.

In the second step, we focused on the likelihood covariance matrix

Finally, in order to incorporate the uncertainties related to the meteorological fields and the transport model into the sampling process, ensemble methods have been implemented. An ensemble of observation operators, constructed from the ERA5 ECMWF EDA and a perturbation of the IRSN ldX transport model dry deposition, release height, and vertical turbulent diffusion coefficient parameters, was used in place of a deterministic observation operator. Following a Bayesian approach, each operator of the ensemble was given a weight which was sampled in the MCMC algorithm.

Thereafter, a full reconstruction of the variables describing the source of the

First, the refinement of

Second, independent MCMC samplings with the three likelihoods examined performed with the HRES meteorological fields showed that the support of the TRRA distribution was moderately impacted by the choice of the likelihood. This reveals that the uncertainties are not correctly estimated when using a single likelihood.

Finally, incorporating the uncertainties of the meteorological and transport fields through the observation operator set, in combination with the use of multiple likelihoods, had a significant impact not only on the conditional distribution of the TRRA, increasing the variances of the release magnitude and timing, but also on the conditional distribution of the release source coordinates. We have also shown that this method allows the reconstruction of transport model parameters such as the dry deposition velocity or the release height.

With the help of the three main methods proposed in this paper, the longitude spread of the

We recommend the use of all three methods when sampling sources of atmospheric releases: all three can have a moderate to large effect depending on the event being modelled (e.g. the use of three likelihoods has a large effect only in combination with the inclusion of physical uncertainties). As far as the likelihood is concerned, we think that the log-Cauchy distribution is the most suitable, while the choice of the associated threshold necessarily depends on the observations. We intend to apply these methods to the Fukushima–Daiichi accident.

Let us suppose that the cost function is computed from a Gaussian likelihood; then we have

If we take into account this new set of observations, the cost function becomes

Therefore, in this configuration, adding a given number of observations anterior to the accident will degrade the distributions of the source variables. This problem is due to the homogeneous, and hence inconsistent, design of the observation error covariance matrix. Assigning a different

We compare a first model (0) where only one variable

AIC(0)

We write

According to the AIC criterion, the model
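The appendix argument can be illustrated with a toy AIC comparison under a zero-mean Gaussian error model: mixing near-zero residuals (non-discriminant observations) with large ones inflates a single shared variance, while giving each cluster its own variance fits far better. The residual values and the parameter counting below are illustrative assumptions, not the paper's exact formulation.

```python
import math

def gaussian_aic(residual_clusters, k_extra=0):
    """AIC of a zero-mean Gaussian error model where each cluster of
    residuals has its own maximum-likelihood variance; the parameter
    count is one variance per cluster (plus k_extra shared parameters,
    here zero). AIC = 2k - 2 max log-likelihood."""
    loglik, k = 0.0, k_extra
    for res in residual_clusters:
        n = len(res)
        var = sum(r * r for r in res) / n  # MLE variance of the cluster
        loglik += -0.5 * n * (math.log(2 * math.pi * var) + 1)
        k += 1
    return 2 * k - 2 * loglik

small = [1e-3 * (-1) ** i for i in range(20)]  # non-discriminant residuals
large = [1.0 * (-1) ** i for i in range(20)]   # discriminant residuals
aic_one = gaussian_aic([small + large])  # model (0): one shared variance
aic_two = gaussian_aic([small, large])   # model (1): one variance per cluster
```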

A ROC and a reliability diagram are computed using the reference source defined in Sect.

Each curve of Fig.

From the ROC curves, the enhanced ensemble appears to be good for discriminating: curves always have a low rate of false occurrence and an acceptable hit rate. In the reliability diagrams, the forecast overestimates the probability that an observation is between 0 and
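Each point of a ROC curve can be computed in the standard way from the ensemble's exceedance probabilities; the function below is a generic sketch, not the exact verification code used here.

```python
def roc_point(prob_forecasts, occurred, threshold):
    """One point of a ROC curve: the event is predicted when the
    forecast probability exceeds the threshold; returns the hit rate
    (hits over observed events) and the false-alarm rate (false
    alarms over observed non-events)."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, occ in zip(prob_forecasts, occurred):
        predicted = p > threshold
        if occ:
            hits += predicted
            misses += not predicted
        else:
            false_alarms += predicted
            correct_negatives += not predicted
    hit_rate = hits / max(hits + misses, 1)
    false_alarm_rate = false_alarms / max(false_alarms + correct_negatives, 1)
    return hit_rate, false_alarm_rate

# A perfectly discriminating forecast at threshold 0.5:
hr, far = roc_point([0.9, 0.8, 0.2, 0.1], [True, True, False, False], 0.5)
```

Sweeping the threshold from 0 to 1 traces the full curve; a curve hugging the top-left corner (high hit rate, low false-alarm rate) indicates good discrimination, as observed for the enhanced ensemble.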

ROC curve and reliability diagram of the enhanced ensemble created with sampling of EDA and the transport model parameters.

The observation dataset used is described in detail and is publicly available in the work of

JDLB designed the software, contributed to the construction of the methodology and conceptualisation, performed the investigations, and wrote the original version of the paper. MB supervised the work and contributed to the construction of the methodology, conceptualisation, and the revision/edition of the manuscript. OS supervised the work, provided resources and contributed to the construction of the methodology, conceptualisation, and the revision/edition of the manuscript. YR contributed to the construction of the methodology and the revision/edition of the manuscript.

The authors declare that they have no conflict of interest.

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors are grateful to the European Centre for Medium-Range Weather Forecasts (ECMWF) for the ERA5 meteorological fields used in this study. They also wish to thank Didier Lucor, Yann Richet and Anne Mathieu for their comments on this work. CEREA is a member of Institut Pierre Simon Laplace (IPSL). The authors would like to thank the reviewers for their many constructive comments.

This paper was edited by Yun Qian and reviewed by Pieter De Meutter, Ondřej Tichý, and three anonymous referees.