Quantifying the impact of meteorological uncertainty on emission estimates and volcanic ash forecasts of the Raikoke 2019 eruption

. Due to the remote location of many volcanoes, there is large uncertainty in the timing, amount and vertical distribution of volcanic ash released when they erupt. One approach to determine these properties is to combine prior estimates with satellite retrievals and simulations from atmospheric dispersion models to create posterior emissions estimates constrained by both the observations and the prior estimates using a technique known as source inversion. However, the results are dependent not only on the accuracy of the prior assumptions, the atmospheric dispersion model and the observations used but also the 5 accuracy of the meteorological data used in the dispersion simulations. In this study we advance the source inversion approach by using an ensemble of meteorological data to represent the uncertainty in the meteorological data and apply it to the 2019 eruption of Raikoke. This provides conﬁdence in the posterior emission estimates and associated dispersion simulations that are used to produce ash forecasts. Prior mean estimates of ﬁne volcanic ash emissions for the Raikoke eruption based on plume height observations are more than 15 times higher than any of the mean posterior ensemble estimates. In addition, the posterior 10 estimates have a different vertical distribution with 27-44% of ash being emitted into the stratosphere compared to 8% in the mean prior estimate. This has consequences for the long-range transport of ash as deposition to the surface from this region of the atmosphere happens over long time-scales. The posterior ensemble spread represents uncertainty in the inversion estimate of the ash the ﬁrst the eruption, the prior ash column loadings lie outside an estimate of the associated with a set of independent satellite retrievals the posterior ensemble column loadings do not. a risk-based methodology to an ensemble of dispersion simulations using the posterior emissions shows that the area deemed to be highest risk to aviation, based on the fraction of ensemble members exceeding predeﬁned ash concentration thresholds, is reduced by 51% compared to estimates using an ensemble of dispersion simulations using the prior emissions with ensemble meteorology. If source inversion had been used following the eruption of Raikoke it would have had the potential to signiﬁcantly reduce the disruption to aviation operations. The posterior inversion emission estimates are also sensitive to uncertainty in other eruption source parameters (e.g., the ash density and size distribution) and internal dispersion model parameters (e.g., parameters relating to the turbulence parameterisation). the ensemble inversion methodology to for uncer-1 determine an optimal estimate between the assumed and the observed radiances in the at 10.4 µ m, 12.4 µ m, and 13.3 The detection is based on a combination of brightness temperature difference (BTD) tests and beta ratio tests that are optimised for the Raikoke The beta ratio tests a radiative , Pavolonis (2010), optical ratio and are used to ﬁlter by the BTD To reduce false detections over high satellite zenith angle, several ﬁlters are the neighbouring also removes other false detections.

quantitative ash forecasts need to be provided but also uncertainty information, potentially through the use of ensembles of 55 VATDM simulations.
The forecasting of ash location and concentration following an eruption is strongly dependent on information about the eruption that is used to initialise the VATDM simulations (e.g. Webley et al., 2009;Dacre et al., 2011;Stohl et al., 2011;Tesche et al., 2012;Webster et al., 2012;Harvey et al., 2018;Prata et al., 2019). Typically, VATDMs require the eruption location, start time and duration, particle size distribution and ash density to be specified. Plus, the time evolution of the height 60 of the ash plume, the vertical distribution of the ash within the eruption column and the mass eruption rate also need to be defined. It is possible to estimate the eruption start time using satellites or local observers. There are several remote sensing techniques to estimate the height of the ash plume (e.g. Oppenheimer, 1998;Petersen et al., 2012). Mass eruption rates are typically estimated using empirical relationships based on the reported plume height and ash deposits from past eruptions (e.g. Mastin et al., 2009;Sparks et al., 1997). However, these empirical relationships do not account for other factors that window and the need to present the ensemble forecasts in a format which can be used by decision makers to make fast and robust decisions in an emergency response situation (Mulder et al., 2017). 90 The use of ensemble meteorology to produce an ensemble of dispersion simulations as a research tool is not new (e.g. Straume et al., 1998;Galmarini et al., 2004Galmarini et al., , 2010 but there are only a small number of studies that apply this approach to volcanic ash forecasts. Recent work by Zidikheri et al. (2018) found that using an ensemble of dispersion model simulations driven by an ensemble of meteorological fields and different values of ash source parameters gave increased Brier skill scores at all lead times (the length of time between when the forecast is issued and the time the phenomena are predicted to occur) 95 compared to a deterministic forecast. A study focussing on the 2013 eruption of Kelut (Dare et al., 2016) found that if an ensemble is used rather than a single realisation of the meteorological situation there is better qualitative agreement with satellite observations for lead times greater than 12 hours and similar agreement at short lead times. This is relevant for the VAAs and VAGs as they are issued out to a forecast lead time of 18 hours. Two earlier studies by Stefanescu et al. (2014) and Madankan et al. (2014) found that there can be a large spread in predicted ash concentrations at lead times greater than 48 100 hours when using ensemble meteorology. These studies suggest that the use of ensembles can provide improved volcanic ash forecasts, however, there is a need to develop strategies to extract information from them that is useful for decision making at longer lead times.
Satellite imagery in the visible and infrared can show the presence and extent of volcanic ash clouds. Advances in satellite retrieval techniques mean that estimates of ash cloud top height, effective ash radius and ash column loading are also available 105 (e.g. Francis et al., 2012;Pavolonis et al., 2013;Grainger et al., 2013). Mass eruption rates at the neutral buoyancy level can be estimated under certain assumptions (e.g. Woods and Kienle, 1994;Pouget et al., 2013;Prata et al., 2021a) but direct retrievals of the vertical distribution within the eruption column are not possible. However, satellite retrievals, typically of ash column loading, can be combined with VATDM simulations using inversion techniques to give time-evolving estimates of these crucial quantities. There are numerous published approaches that use inversion modelling to estimate ash source parameters for 110 volcanic eruptions (e.g. Kristiansen et al., 2012;Schmehl et al., 2012;Denlinger et al., 2012;Pelley et al., 2021;Zidikheri et al., 2017a,b) using a single deterministic realisation of the meteorological situation. This means that uncertainty in the forecast precipitation or 3d wind fields will lead to uncertainty in the estimated mass eruption rates and their vertical distribution.
As in Harvey et al. (2020), this study brings together inverse modelling and the use of an ensemble of meteorological forecasts to give an ensemble of the most probable source emission estimates of volcanic ash that will undergo long range 115 transport following the 2019 Raikoke eruption. These emission estimates can be used to obtain robust ash forecasts constrained by observations. There will be a particular focus on regions where medium and high levels of ash contamination are predicted, as these are areas that aircraft may be prohibited from entering, and on the influence of emissions of ash into the stratosphere on these regions.
The methods and data used in this study are described in Section 2. Section 3 describes the details of the 2019 Raikoke 120 eruption. The volcanic emission estimates determined using the ensemble inversion system, their impact on volcanic ash forecasts and on flight planning decisions are presented in Sections 4 and 5. A summary, conclusions and implications for future work are presented in Section 6.

125
The Met Office Global and Regional Ensemble Prediction System (MOGREPS-G) has 17 ensemble members plus a control member. It has a horizontal resolution of 20 km in mid-latitudes and there are 70 vertical levels with the lid at approximately 80 km. Each forecast is initialised 4 times per day at 0000, 0600, 1200 and 1800 UTC and they extend out for 7 days (Bowler et al., 2008). At the time of the Raikoke eruption, MOGREPS-G used a stochastic physics scheme to account for model uncertainty and an online inflation factor calculation to calibrate the spread of the ensemble in space and time 130 Bowler, 2011, 2013). The MOGREPS-G forecasts used to drive NAME in this study are initialized at 1200 UTC 21 June 2019.
Its high temporal (10 min) and spatial (2 km at nadir for the infra-red bands) resolution makes its observations ideally suited 135 to evaluate the transport of volcanic ash following an eruption. Two independent volcanic ash retrieval algorithms, one based on work primarily from the UK Met Office (Francis et al., 2012) and one determined using the methodology described in (McGarragh et al., 2018), are used in this study and are described below. Note that although the retrieval methods have been developed independently in different research groups, they both use information from the Himawari-8 satellite and therefore have the same bias when volcanic ash is obscured by meteorological cloud (which is prevalent during the Raikoke 2019 140 eruption).
Met Office Algorithm: This algorithm is based on the method described in Francis et al. (2012), with slight adaptations for the channels of the Advanced Himawari Imager (AHI). Firstly, the channels at 8.6 µm, 10.4 µm and 12.4 µm are used to detect which pixels are contaminated by volcanic ash. Then the ash column loading, layer height and effective radius are determined using a one-dimensional variational (1D-Var) analysis to determine an optimal estimate between the assumed background and 145 the observed radiances in the channels at 10.4 µm, 12.4 µm, and 13.3 µm. The detection is based on a combination of brightness temperature difference (BTD) tests and beta ratio tests that are optimised for the June 2019 Raikoke eruption. The beta ratio tests use a derived radiative parameter β, originally devised by Pavolonis (2010), that is the effective absorption optical depth ratio of two channels and are used to filter pixels marked as ash by the BTD tests. To reduce false detections over arid land surfaces and at high satellite zenith angle, several geographical filters are used. Checking the consistency of ash detection in 150 neighbouring pixels also removes other false detections.
Where ash is detected, the Met Office algorithm determines the ash column loading. These pixels are flagged as containing ash. If a pixel is free from both ash and meteorological cloud then it is flagged as a clear sky pixel. Pixels that don't have detectable ash and are not flagged as clear skies are unclassified. As in Pelley et al. (2021), further processing is performed to regrid the retrieved column loadings on to a grid of 0.375 • latitude by 0.5625 • longitude (approximately 40 km × 40 km in 155 mid-latitudes) and averaged over 1 h. This is to match the resolution of the NAME ash column loading output and to reduce data volumes. If 50% or more satellite pixels in a grid box contain ash or more than 90% of pixels are classified as ash or clear skies, then the grid box is selected for use in the InTEM inversion. If all classified pixels within a grid box are flagged as clear sky pixels, then the grid box is deemed to be a clear sky observation. Otherwise, the grid box is deemed to be an ash grid observation with the column loading in this grid box given by the mean of all the classified pixels (including clear skies).

160
ORAC Algorithm: To estimate the mass loading of fine ash for the Raikoke ash clouds, the Optimal Retrieval of Aerosol and Cloud (ORAC, McGarragh et al., 2018) algorithm was applied to infrared measurements made by the AHI on board the Himawari-8 satellite. ORAC uses the optimal estimation approach (Rodgers, 2000) to retrieve state variables based on radiative transfer simulations, satellite measurements and prior information. The ORAC algorithm includes optical depth, effective radius, surface temperature and cloud-top pressure in the state vector and the mass loading is derived from the 165 retrieved optical depth and effective radius consistent with previous authors (e.g. Wen and Rose, 1994;Corradini et al., 2008;Prata and Prata, 2012). To retrieve volcanic ash properties consistently over day and night the 10.4, 11.2, 12.4 and 13.3 µm thermal infrared channels are used in the measurement vector. The microphysical model used in the radiative transfer model assumes an ash cloud containing spherical particles which conform to a log-normal distribution and ash composition based on the Eyjafjallajökull ash taken from the Aerosol Refractive Index Archive (ARIA, http://eodg.atm.ox.ac.uk/ARIA/).

170
Ash detection follows the approach presented in Appendix A of Prata et al. (2021b) and only retrievals that converge with a measurement cost at solution of less than 10 are considered. Four forward model configurations are used in the retrievals: 1) A single tropospheric layer of ash, 2) A tropospheric ash layer above a liquid water cloud layer, 3) A single stratospheric layer of ash and 4) A stratospheric ash layer above a liquid water cloud layer. The multi-layer configurations are introduced to account for prevalent low-level stratus clouds during the explosive phase of the Raikoke eruption. The water cloud layer is tightly 175 constrained with an a priori cloud-top pressure of 800hPa, effective radius of 10 µm and 550 nm optical depth of 16. The a priori uncertainties on these values are set to 50 hPa, 1 µm and 3. The a priori settings were chosen based on ORAC standard cloud retrievals (Poulsen et al., 2012;McGarragh et al., 2018) that are run separately on the nearby, low-level stratus clouds.
The troposphere/stratosphere model configurations are run by setting two different a priori cloud-top pressures to determine cloud-top heights above and below the tropopause.

180
The a priori cloud-top pressures considered are 500 hPa and 200 hPa. The stratospheric a priori cloud-top pressure is chosen based on independent CALIPSO observations of the stratospheric ash cloud. After running all four forward model configurations, we select the retrieval with the lowest cost for each ash affected AHI pixel to generate the final retrieval product. Further details of the algorithm used to generate the retrievals shown here are given in Prata et al. (In preparation, 2021).

NAME
In this study the Numerical Atmospheric-dispersion Modelling Environment (NAME) model was used to simulate the dispersion of volcanic ash (Jones et al., 2007). To model the transport and removal of volcanic ash, NAME includes parameterizations of dispersion due to free tropospheric turbulence, sedimentation, dry deposition and wet deposition. It is assumed that the ash particles are spherical and have a density of 2300 kg m −3 , although in reality the density of the ash is determined by its porosity, chemical composition and grain size. In this study, aggregation of ash particles, near source plume rise and processes driven by the eruption dynamics (e.g., Woodhouse et al., 2013) are not explicitly modelled. The particle size distribution used is based on data from Hobbs et al. (1991), but only includes ash particles with diameters between 1 -30 µm, as ash particles larger than this are not typically detected by the AHI. We refer to this as fine ash in this paper.

InTEM for volcanic ash 195
The Inversion Technique for Emissions Modelling (InTEM) for volcanic ash is an inversion system that combines VATDM simulations, satellite retrievals and a prior estimate of the emission using a Bayesian approach to give the best estimate of the emissions profile for fine ash that can undergo long range dispersion. This system has been developed at the UK Met Office and was originally developed to estimate greenhouse gas emissions (Manning et al., 2011). It has been previously used by Harvey

Estimate of the prior source term
To ensure that the inverted source term is not overfitted to the satellite information, a prior estimate of the source term is used.

205
This also makes sure that the posterior source term is informed by known information about the eruption, such as the eruption start and end time and the maximum plume height. To construct the prior, it is assumed that emissions in the eruption column are uniform in the vertical from the volcano vent to an estimate of the plume height based on observations with an error of +/-2km. The mass eruption rate is determined using the empirical relationship in Mastin et al. (2009). The mean and error covariance matrix of the prior are estimated using a stochastic model that includes correlations between errors in the emissions 210 at different heights and times. Thomson et al. (2017) contains a full description of how the prior is determined and used in the InTEM system. In this study the prior source term is based on a plume height of 13km, which is consistent with information provided by the Tokyo VAAC and Bruckert et al. (2021). Note that the prior is a probability distribution and from now on it is assumed that when the prior is mentioned refers to the mean of this distribution.

VATDM simulations 215
The VATDM simulations within InTEM are performed using NAME. As this methodology exploits ash column loadings determined from satellite retrievals, the NAME simulations only include ash particles with diameters between 1 -30 µm as ash particles larger than this are not typically detected by the AHI. This assumption is consistent to those made in other inversion systems for this application (e.g., Stohl et al., 2011). Simulations representing a nominal release rate (1 g s −1 ) from each possible source term component (4km height range and 3-hourly time period) are conducted. Model predictions of ash column 220 loads can be easily determined for an arbitrary emission profile by a linear combination of these nominal simulations. The resolution of the posterior emission estimate (3-hourly with 4km vertical resolution), is chosen to ensure that the inversion can be performed within a time frame that is compatible with VAAC operational constraints.

The inversion algorithm
The prior estimate, NAME simulations and satellite retrievals are combined within the inversion scheme to give a posterior 225 distribution of emissions. The posterior distribution is Gaussian and within InTEM the best estimate of emissions is taken as the peak of this distribution with a non-negative constraint applied. It is possible that selecting the peak of the distribution ignores a large amount of uncertainty information within the posterior and results in a loss of information about the emissions, but this is not the focus of the present study. A quadratic cost function, representing the simultaneous fit of the VATDM simulations and satellite retrievals, and between the emission estimate and the prior, is minimised using the Lawson and Hanson (1974) 230 non-negative least squares algorithm. This algorithm was chosen as it converges in a finite number of iterations and is very fast.
In this study the inversion algorithm took less than 1 minute to run for each of the ensemble members using satellite retrievals from 1800 UTC on 21 June to 0000 UTC on 22 June 2019. The speed of the InTEM system is therefore governed by the   for the temporal evolution of the simulated ash cloud as ash in the stratosphere cannot be easily deposited to the surface through wet deposition and sedimentation. Plus, due to vertical wind shear, ash within the stratosphere can be transported by atmospheric winds with different speeds and directions compared to other parts of the atmosphere. Simulations of the evolution 280 of the Northern Hemisphere mean sulphur dioxide mass burden performed by de Leeuw et al. (2020) were also found to be sensitive to the amount of sulphur dioxide emitted in the stratosphere. The NAME simulation with the best agreement with satellite retrievals of sulphur dioxide from TROPOMI in terms of peak concentrations and e-folding times had a larger fraction of sulphur dioxide emitted in the lower stratosphere compared to the control simulation.

Ensemble InTEM ash forecasts 285
The previous discussion in Section 4.1 focused on the differences in posterior emission profiles, however VATDM forecasts of the ash cloud are used to inform VAAC graphics and advisories. This section analyses NAME simulations driven with the MOGREPS-G meteorological ensemble both using the peak of the posterior distribution of emissions (shown in Figure 1) and prior emissions and evaluates these simulations against independent satellite retrievals using the ORAC Himawari retrievals.       Posterior ensemble (blue), prior ensemble (cyan), Himawari ash column loading retrieved using the Met Office algorithm (red crosses) and Himawari ash column loading retrieved using the ORAC algorithm (black circles). The error bars represent ± one standard deviation in the estimates of ash column loading determined from the satellite retrievals. The coloured shading indicates the range of column loading in the ensemble.
represents the uncertainty in ash column loading due to the meteorological ensemble. Satellite retrievals of ash column loading and associated uncertainty from Himawari using the Met Office algorithm (red crosses) and from Himawari using the ORAC algorithm (black circles) are shown for comparison. The Met Office and ORAC retrievals, at this time and along this cross  The prior ensemble has a much higher mean magnitude than the posterior emissions (by over a factor of 10) and the ensemble spread does not encompass either set of satellite retrievals within one standard deviation. Meteorological cloud obscures much of the domain so obtaining a contiguous retrieval of ash and clear sky pixels is not possible at this time. To assess the impact of a larger fraction of ash being emitted into the stratosphere we focus on the evolution of the ash plume using inverted emissions and matching meteorology for ensemble members 5 and 9. These members were chosen as member 5 has the smallest fraction of ash emitted into the stratosphere (27%), whereas member 9 has the largest (44%). To determine the differences between the simulations driven by emissions from members 5 and 9, the output from the NAME simulations is post-processed by dividing the vertical grid into three layers (near surface, troposphere and stratosphere) and setting the 335 concentration of each of the three layers to the maximum ash concentration of the original higher resolution layers. Figure   7 shows the maximum over the simulated ash layer concentrations near the surface, in the troposphere and stratosphere at 1200 UTC on 23 June 2019. At this time, the ash plume structure at all three levels is qualitatively very similar. However, there are some differences both in plume location, extent and ash concentration values. The location differences are likely to be due to the use of different driving meteorology. Near the surface and in the troposphere, member 5 has higher maximum 340 concentrations than member 9 but in the stratosphere the opposite is true. This is expected as more mass is emitted at these heights in member 9. The differences in peak concentration are largest when considering the stratosphere (2.53 g m −3 in member 5 compared to 3.32 g m −3 in member 9). In the stratosphere member 9 has the highest peak concentration despite the plume extending further east (by approximately 8 degrees) and the plume having an area that is 1.4 times larger than the plume produced using member 5 and the region with concentrations greater than 200 µg m −3 being 1.5 times larger than the 345 equivalent region in member 5. These differences highlight the importance of the vertical distribution of the ash emissions as it can lead to an increase in areas that may be considered too contaminated for aircraft to fly through. Note that the lowest ash concentration shown in Figure 7 is 100 µg m −3 .

Potential implications for aviation operations
The comparison between simulated and satellite retrievals of ash column loadings is valuable for forecast validation, however 350 this variable does not give any information about the vertical extent of the ash cloud or peak ash concentrations at the three International Civil Aviation Organisation (ICAO) prescribed flight levels. Numerous charts are required to visualise the ash extent at different flight levels, different concentration levels and different times. During an emergency, the number of graphics associated with an ensemble of simulations can be overwhelming, plus the interpretation of ensemble spread relies on the decision makers experience and risk appetite (Mulder et al., 2017).

355
One approach of condensing this data into a single chart using a risk matrix is presented by Prata et al. (2019). Here we apply the same risk-based approach to the Pacific region following the eruption of Raikoke using both the prior and posterior ensembles outlined in Section 4. The use of this type of graphic reduces state-of-the-art ensemble information into an easy to interpret decision making tool that can be used to make fast and scientifically robust decisions. As in Prata et al. (high). This is done for each of the three VAAC flight levels and the overall risk at a given location is the maximum risk over the three flight levels. This is a conservative approach but is in line with the current ICAO guidance (International Civil Aviation Organization: Montréal., 2007). In this analysis, to be consistent with the approach in Prata et al. (2019), the ash concentration 365 fields output by NAME were multiplied by a factor of 10, known as the "peak-to-mean" factor. This factor accounts for peak concentrations that are not resolved in the NAME simulations. by the highest risk reduced from 11% to 7% when the posterior ensemble is used instead of the prior. For med-level risk, the percentage is reduced from 17% to 11%. At both times, the application of this risk-matrix approach highlights the potential 380 impact of using the ensemble inversion approach on airline operations. Disruption could have been reduced and the high economic cost actions (such as flight cancellation, rerouting and enhanced engine checks) could have been greatly decreased.

Conclusions
The eruption of Raikoke on 21 June 2019 sent volcanic ash high into the atmosphere. In this study satellite retrievals from the Himawari satellite and an ensemble of NAME simulations driven by an ensemble of meteorological forecasts have been 385 combined using the InTEM inversion system. Our main results can be summarized as follows: -For this case study, the posterior ash emission rates determined using InTEM are substantially lower compared to the prior emission profile estimated using the Mastin et al. (2009) relationship. The posterior emission profiles produced using a range of plausible meteorological situations are qualitatively very similar giving confidence in the use of the InTEM system. However, there are differences in the magnitude of the ash emitted at different heights. There is a large 390 range in the fraction of mass that is emitted into the stratosphere (above 12 km avl in this study). These differences lead to a range of values (0.32 -0.71 Tg) for the total amount of ash (in the size range 0.1-100 µm) emitted over the eruption period. This range is broadly consistent to the range found in Muser et al. (2020). It should be noted that the reduction in emissions determined by InTEM compared to using the Mastin et al. (2009) approach is much larger than the differences between the emissions determined with the ensemble of meteorological situation. In this case, this points 395 to the Mastin et al. (2009) relationship giving ash emissions that are grossly over estimated. However, the Mastin et al. (2009) relationship is still routinely used in VAAC operations and ensures that the ash forecasts are conservative.
-As expected with reduced emissions, the VATDM forecasts produced using the posterior emission ensemble with matching meteorology have ash clouds with much lower column loadings compared to the prior ensemble simulations, although they have a similar evolution. The simulations using the posterior emission ensemble have a much smaller range of col-400 umn loadings and are a closer match to the ORAC retrievals of ash column loading than the prior ensemble simulations.
Thus, the Himawari observations constrain the ensemble spread.
-For this case study, the amount of ash emitted into the stratosphere is important. Higher fractions of ash (in terms of mass) are emitted into the stratosphere leading to higher peak stratospheric concentrations and ash plumes with greater horizontal extent when using the posterior ensemble. This could potentially increase the risk to aviation as this is near 405 the cruise altitude of aircraft in the Pacific region.
-The risk-matrix approach to presenting ensemble forecast data has been applied to the VATDM simulations produced using the prior and posterior emissions from InTEM. In this case study, the use of the posterior emissions reduces the region of highest forecast risk by up to 51%. This has the potential to reduce disruption to civil flight plans. This result is consistent with that found in Harvey et al. (2020) and builds confidence in applying this methodology.

410
Future work will focus on applying this methodology to further case studies and comparing with ensemble and inversion systems used by other modelling centres. Here the focus of the study was the impact of meteorological uncertainty on the InTEM emission estimates and VATDM forecasts of ash location and the magnitude of ash column loadings but there are other sources of uncertainty which could be incorporated into a full ensemble inversion scheme. These include uncertainties in ash density and particle size and the representation of free tropospheric turbulence and wet deposition within the VATDM.