Time-lagged ensemble simulations of the dispersion of the Eyjafjallajökull plume over Europe with COSMO-ART

An extended version of the German operational weather forecast model was used to simulate the ash dispersion during the eruption of the Eyjafjallajökull. As an operational forecast was launched every 6 hours, a time-lagged ensemble was obtained. Sensitivity runs show the ability of the model to simulate thin ash layers when an increased vertical resolution is used. Calibration of the model results with measured data allows for a quantitative forecast of the ash concentration. After this calibration an independent comparison of the simulated number concentration of 3 μm particles and observations at Hohenpeißenberg gives a correlation coefficient of 0.79. However, this agreement could only be reached after additional modifications of the emissions. Based on the time lagged ensemble the conditional probability of violation of a certain threshold is calculated. Improving the ensemble technique used in our study such probabilities could become valuable information for the forecasters advising the organizations responsible for the closing of the airspace.


Introduction
After resting for 187 years, the Icelandic volcano Eyjafjallajökull woke up again on 20 March 2010.Starting on 14 April, massive emissions of volcanic ash occurred and finally led to a shutdown of civil aviation over entire Europe.
The emissions went on with variable strength until 23 May 2010.The volcanic eruption represents a unique field experiment for investigating atmospheric processes, such as transport, radiation, and cloud formation, on a large variety of scales by combining observations and numerical models.In addition to the academic interest, the huge economic costs of the shutdown of civil aviation raised the need for accurate forecasts of the temporal and spatial distribution of the ash concentrations within the plume.From first principles a quantitative numerical forecast of the spatial distribution of volcanic ash particles requires the forecast of the source strength of the particles and gaseous precursors, the vertical distribution of the effective source heights, and the size distributions at the source.Hardly any of these requirements was fulfilled during the eruption of Eyjafjallajökull or will be fulfilled in case of future eruptions.
Almost parallel to the ongoing eruption, numerical simulations were started with different numerical model systems by different research institutions and weather services.None of these model forecasts were quantitative during the time interval the eruption took place.Most of these model systems were coupled offline, i.e. the aerosol transport models were driven by pre-calculated outputs of meteorological forecast models (e.g.Folch et al., 2012;Matthias et al., 2012;O'Dowd et al., 2012b).During specific episodes, aircraft measurements were carried out in addition to the remote sensing observation data taken by LIDARS, sun photometers, ceilometers, and satellites (Flentje et al., 2010;Ansmann et al., 2010;Emeis et al., 2011;Gasteiger et al., 2011;Schumann et al., 2011;Langmann et al., 2012;O'Dowd et al., 2012a, Wiegner et al., 2012).
After the shutdown caused by the volcano, many efforts have been made to transform the qualitative model results and the measurements into reliable quantitative results.Moreover, some of those model results in combination with the observations were even used to quantitatively calculate Published by Copernicus Publications on behalf of the European Geosciences Union.
the temporal development of the source strength and the vertical emission profile (Stohl et al., 2011).
For our study we transferred the online coupled model system COSMO-ART into an operational version at Deutscher Wetterdienst (DWD, Sect.2).Forecasts were launched every 6 hours for a period of at least 48 h with the same integration procedure as used for the weather forecast at DWD.With this procedure, we obtain a time-lagged ensemble forecast for volcanic ash.
Additionally, we perform sensitivity runs to quantify the impact of vertical grid resolution, washout, and sub grid scale deep convection.These results are qualitatively compared with LIDAR backscatter data derived in Munich, Germany (Sect.3).
Quantitative measurements that were recently published are used to recalibrate our model results for subsequent calculation of absolute values of the volcanic ash number density and the mass concentration of individual size classes.This calibration allows a comparison with measured number concentrations at Hohenpeißenberg, Germany (Sect.4).
An evaluation of the results of our time lagged ensemble and additional model runs varying the plume height at the volcano allow us to quantify the uncertainty of the model results with respect to meteorology and source strength as part of the overall uncertainty (Sect.5).As far as the authors are aware of this has not been done in previous studies.

The model system, procedure of integration, and input data
2.1 The model system COSMO-ART (Vogel et al., 2009;Bangert et al., 2012;Knote et al., 2011) is a recently developed comprehensive online-coupled model system.Here, online-coupled means that the Eulerian prognostic equations for e.g.volcanic ash are solved and integrated time step by time step in the same way as it is done for the moisture scalar variables.COSMO-ART is an extension of the operational weather forecast model COSMO of DWD (Baldauf et al., 2011).The main purpose of the model system is to quantify the interaction of gaseous and particulate air constituents with the state of the atmosphere on the regional scale.Within COSMO-ART, the treatment of natural (e.g.mineral dust and sea salt) and anthropogenic (soot, secondary organic and inorganic) aerosol particles is usually based on the so-called modal approach.In case of the volcanic ash, we use a sectional approach.We describe the size distribution of the number density of the volcanic ash particles by six individual size mono disperse bins with a diameter of 1, 3, 5, 10, 15, and 30 µm.Coagulation between the size bins is not taken into account.Neglecting this process is justified for the considered size ranges.The atmospheric concentrations of particles emitted by the volcano are modified by advection, turbulent diffusion, sedimentation, deposition, and washout by below cloud scavenging.

Emission data
Crucial input parameters to calculate the volcanic ash plume are the emission data.In our case, we need the plume height at the source, the vertical profile of the emission strength, and the size distribution of the ash particles at the source.The only parameter that was determined with certain accuracy was the plume height at the eruption site.We used the observations that were published on the volcanic ash advisory centre London website (VAAC, http://www.metoffice.gov.uk/aviation/vaac/).VAAC is responsible for the official forecast of ash coming from volcanoes in Iceland according to international agreements.
The information on the plume height was updated every 6 hours.Figure 1 shows the temporal development of the plume height used for our simulations.For the source strength, we made the assumption of a constant emission rate for the number density on all vertical model levels.The value of the strength was scaled according to the parameterization given by Mastin et al. (2009).
Based on their findings we prescribed the source term in the following way: (1) h is the top of the plume height at the location of the volcano in m. 100 is a prescribed constant.The unit of Q is number of particles per second.
The actual plume height was kept constant until a new observation was available.Figure 1 gives the source strength we calculated with this approach.

Procedure of integration
Our model domain is identical to the COSMO-EU domain that is used by DWD for operational weather forecast with a horizontal grid size of 7 km.We present results for two different methods of time integration.The first method is named hindcast mode.We simulate the 9-day period between 14 April 2010, 00:00 UTC and 23 April 2010, 00:00 UTC with a single run in the hindcast mode (Sect.3).To obtain the best possible meteorological simulation, we use the operational analyses of COSMO-EU as boundary data.
For the second method named operational forecast mode the model is driven by boundary data of the operational GME (global model of DWD).The individual model runs were started from the operational COSMO-EU analysis data.The operational data assimilation cycle consists of 3-hourly runs of the COSMO model, where the basic meteorological fields are nudged to the observations.These assimilation runs are complemented by additional surface and sea surface temperature, and snow analyses as well as by an additional soil moisture analysis which aims at improving the 2 m-temperature forecast.The actual forecast runs are started from so called main run analyses with, i.e. again 3-hourly nudging analysis runs of COSMO with shorter cut-of times for the observations compared to the final COSMO analyses.A more detailed description of this procedure is given by Hanisch (2002).Since the prognostic volcanic ash fields are not part of the operational data assimilation cycle, the 6hourly forecasts of volcanic ash were handed over from one simulation run to the next.Results obtained with this time integration procedure are presented in Sects.4 and 5.

Model results and sensitivity studies
In the following section we will present a qualitative comparison of the non-calibrated hindcast model results with observations.Profiles of LIDAR backscatter coefficients were measured during the eruption at several sites.We compare our model results with the LIDAR measurements obtained close to Munich (Gasteiger et al., 2011;Wiegner et al., 2012) presented in Fig. 2. The observation shows a thin vertical ash layer of a few hundred metres thickness.Eulerian models, such as COSMO-ART, usually use a telescoping grid in the vertical direction.As the vertical grid size increases with height, it is difficult to resolve such a thin layer.In order to study the impact of the vertical resolution on the model results, we increased the number of vertical layers from 40 used operationally to 80.This increase alters the meteorological variables and, consequently, the concentration distributions.Additionally, artificial vertical mixing caused by the vertical  grid size is reduced when the vertical resolution is increased.Figure 3 shows the simulated vertical profiles for the individual runs.
At 17 April, 09:00 UTC the vertical depth of the simulated ash plume above 1200 m is 640 m in case of 80 vertical levels and 930 m in case of 40 vertical levels.Increasing the vertical resolution increases the number of vertical layers between 1000 and 3000 m from 7 to 15 vertical layers.The observations of the backscatter coefficients give a corresponding vertical depth of the ash plume of 580 m (Fig. 2).
The centre of the simulated ash layer above the boundary layer starts to decrease in height around midnight on 17 April 2010 and finally reaches 1600 m at 17 April, 17:00 UTC.This is in good agreement with the height of the observed plume (Fig. 2).Based on the results of the model run with 80 vertical levels (Fig. 3b), a height decrease rate of 8.5 cm s −1 is calculated between 01:30 UTC and 08:00 UTC and of 2.35 cm s −1 between 08:00 UTC and 21:00 UTC of 17 April 2010.Calculating the same numbers from the LI-DAR profile of Gasteiger et al. (2011) gives 5.5 cm s −1 between 01:30 UTC and 08:00 UTC and 1.7 cm s −1 between 08:00 UTC and 17:00 UTC (Fig. 2).
We performed additional sensitivity runs to quantify the contribution of washout by below cloud scavenging and the effect of sub grid scale convection.Figure 4 shows the time series of the non-calibrated number density for the 1 µm bin of the individual sensitivity runs at the surface at a grid point that corresponds to the location of the observatory Hohenpeißenberg, Germany (47 • 48 N, 11 • 01 E).For the simulated period the parameterization of deep convection does not change the results.That indicates that deep convection did not occur within the plume between Iceland and Hohenpeißenberg.Larger differences in comparison to the reference run show up when washout is neglected especially between 19 and 20 April 2010.This underlines the high relative importance of this process.Increasing the number of vertical layers leads to changes in the appearance of the concentration maxima on 17 and 21 April 2010.Among other quantities, size distributions of the particles were measured (Fig. 5, left).We used the measured size distributions of the number density to calculate the recalibration factors of our simulated size distributions.The simulated number densities of the individual size bins were multiplied by these factors.On 17 April 2010, the Falcon measured vertical profiles of the number density close to Leipzig, Germany (Fig. 5, right).We used the measured number densities for particles with a diameter greater than 1.5 µm and summed up the normalised number concentrations of our size bins 3, 5, 10, 15, and 30 µm for that location and time.For heights above 3000 m we determined a factor of roughly 100 for the final calibration of the simulated number densities (Fig. 5, right).Below 3000 m there is still a large difference between measured and simulated values.These differences can be explained by non-volcanic aerosol which is not covered by the simulations.By this procedure, we transferred our results into quantitative ones.From the calibrated number concentrations and assuming the mass density of volcanic ash to be equal to 2500 kg m −3 we calculated mass concentrations.
For comparison of the calibrated number densities we use observations that were carried out by DWD at the observatory Hohenpeißenberg which is located at a distance of roughly 400 km from Leipzig.
Figure 6a shows the temporal development of the measured number density of particles with a diameter of 3.5 µm and the simulated (calibrated) number densities for the size bin 3 µm.The results of the individual forecast runs are shown.With respect to the level of agreement, three time periods can be distinguished (labelled a, b, and c, Fig. 6a).During time period A, the individual model results differ substantially from the observations.We found that the quality of the forecast increases for shorter forecast times.The most excessive overestimation seen in period A is related to the final hours of a 78 h forecast run.This confirms the huge impact meteorology has on ash dispersion.During the 3 days of period B, the agreement of the individual forecast results with the observations is much better and the spread of the individual runs is reduced.During the beginning of period C and especially during the first 3 days, the model results differ from the observations.Stohl et al. (2011) derived emission data for the 2010 Eyjafjallajökull eruption using inverse modelling techniques.Figure 2 of the original paper compares the a priori estimated emission strength based on a plume model of Mastin (2007) with the a posteriori emission from the inversion algorithm.From 16 to 19 April 2010, huge differences up to a factor of 10 between both emissions are found.Based on this finding, we reduced our emission during the time period with the biggest differences, i.e. from 17 April, 00:00 UTC until 18 April, 12:00 UTC by a factor of 5.0 and repeated our simulations.Figure 6b gives the results of these simulations.The large deviation of the model results from the observations on 21 April disappeared completely.During the rest of the time of period C, the model results are somewhat lower than the observation for this model run.As the observations are partly outside of the uncertainty range given by our ensemble results (e.g.21 April, 00:00 UTC) it shows that our ensemble covers only parts of the uncertainty.
Figure 7 presents a correlation plot that allows a quantitative assessment of the quality of our forecast ensemble.Shown are the results presented in Fig. 6b.The correlation coefficient is 0.79.The majority of the points are located within the range of a factor of 2. We calculated additional statistical quantities according to Boylan and Russell (2006) to assess the quality of our model results.Those are the mean fractional bias (MFB) and the mean fractional error (MFE). (2) and c o is the measured concentration and c m is the modelled concentration, N is the number of pairs.According to Boylan and Russell (2006) the performance of the model is quantified in the following way: good performance: MFB ≤ ± 30% and MFE ≤ +50 %; average performance: MFB ≤ ± 60% and MFE ≤ +75 %; poor performance: MFB > ± 60 % and MFE > +75 %.In our case we found an MFB of −30 % and a MFE of 73 %.That means our results lie in the average performance range.

Uncertainties and conditional probabilities of the simulated ash concentration
Volcanic ash forecasts are associated with large uncertainties.Two extreme situations can be thought of.Firstly, we might have highly accurate emission data but high uncertainties in the forecast of the meteorological variables.Secondly, we might have highly uncertain emission data but small uncertainties in the meteorological variables.In both cases the quality of the concentration forecast of volcanic ash would be low.In reality we usually have a mixture of both and especially the uncertainties of the forecast of the meteorological situation depend on the atmospheric conditions.A common method to quantify and to account for the uncertainties of meteorological forecasts is ensemble simulations instead of a single forecast.Different methods are applied to generate ensemble forecasts e.g.variation of boundary and initial conditions or different physical parameterizations.
On the global scale the TIGGE (THORPEX Interactive Grand Global Ensemble, http://tigge.ecmwf.int/)project brings together the results of 10 international forecast centres worldwide.Ensemble forecasts are created applying different methods e.g.singular vectors (DelSole, 2007) and stochastic perturbations of physics tendencies.A detailed overview of the individual methods used by the ensemble systems is documented at the following address: http://tigge.ecmwf.int/.On the regional scale DWD runs an ensemble system with a horizontal grid size of 2.8 km for the COSMO-DE domain that covers mainly the area of Germany (Gebhardt et al., 2008).On the European scale, i.e. the COSMO-EU domain used in this study, however, such an ensemble system does not exist at DWD.Therefore, we decided to make use of a time lagged ensemble (Hoffmann and Kalnay, 1982) to address the question of uncertainty of our volcanic ash forecast.Although in case of a volcanic eruption the missing source strength is the biggest driver of uncertainty we could show that using observations at a single point of time for recalibration of the model can reduce this uncertainty to a large extent.Only when the emission characteristics of the volcano changed drastically the model results differ from the observation (Fig. 6).
In the following we will evaluate the model results for which we applied the Mastin et al. (2009) emission parameterization plus the calibration described in Sect. 4 (Fig. 6a).This ensemble is named reference ensemble.Moreover, we produced two additional time lagged ensembles varying the top of the plume height at the volcano by ± 1000 m.This in addition varies the total emissions of volcanic ash according to the Mastin et al. (2009) parameterization.Figure 1 shows the source strengths that correspond to the modifications of the top height of the plume at the volcano.
Figure 8 shows the ensemble mean and the standard deviation of the individual members for the three time lagged ensembles for the grid point Hohenpeißenberg.For a better assessment of the results the statistical information (standard deviation) of the reference ensemble is put in the foreground.The mean values of the reference ensemble and the ensembles with modified source height and strength show the same temporal behaviour during the whole simulation period.Time periods where the standard deviation of the ensemble members are small change with those where they are large but in an unsystematic manner.The reason is the nonlinear behaviour of the atmosphere.It is quite interesting that there exist time intervals where the standard deviation of the reference case covers the major part of the uncertainty while the model runs with modified source height and source strength do not add considerable amount of uncertainty.However, starting with 23 April the individual simulations do not overlap anymore.Now the variability of the atmosphere causes only a small amount of uncertainty and the impact of a 1000 m error on the plume height causes a much greater uncertainty than meteorology does.The latter is also due to the fact that the concentrations at Hohenpeißenberg are then affected by the reduction of the top of the plume height that occurred from 17 to 19 April at the volcano.At the beginning of the eruption phase, when the top of the plume height in the reference case was in the order of 8000 m a variation of the plume height of ± 1000 m varies the source strength according to Eq. ( 1) by roughly a factor of 3. When the top of the plume height of the reference case reduces to 3 km a variation of the plume height of ± 1000 m modifies the source strength according to Eq. ( 1) by a factor of 17.This is the reason why at the beginning, when the plume height is quite high the meteorology is the main contributor to the uncertainty while later on, when the plume height decreased the source strength becomes more important.
Civil aviation and official air traffic control authorities depend on quantitative ash forecasts for flight planning and decision making.Volcanic ash advisory centres usually deliver spatial distributions of ash concentrations at different flight levels.A step beyond that is giving conditional probabilities that a certain threshold is violated.
Using the results of our calibrated time lagged ensembles allows producing such conditional probabilities.The probabilities are calculated in the following way.The four daily model runs with 78 h forecasts at 00:00 and 12:00 UTC and 48 h forecasts at 06:00 and 18:00 UTC result in up to 12 realisations for a specific target date.These different realisations of the time-lagged ensemble allow us to calculate a probability measure.The conditional probabilities we are presenting here cover only parts of the total uncertainties of volcanic ash forecast.In Fig. 9 the conditional probabilities of the maximum mass concentration of volcanic ash between flight level 200 and 350 (approx.6100-10 675 m) exceeding a threshold of 2000 µ g m −3 are presented for 16 April 2010 at 12:00 UTC.For this date, the ensemble consists of 10 members.To improve the statistical basis for the calculation of the probability, an additional upscaling using a 6 × 6 neighbourhood of each grid point was performed resulting in 360 values at each grid point.
Figure 9a presents results for the time lagged ensemble with the reduced plume height at the volcano.From this fig-ure one would derive that two separated areas exist where the probability to exceed the threshold value is above 70 %.One area is close to the source another one covers parts of northern France, Belgium, the Netherlands and parts of western Germany.In case of the reference ensemble (Fig. 9b) the spatial extension of both areas are increasing.Consequently, the areas are increasing again for the ensemble with the increased plume height increased by 1000 m (Fig. 9c).
The spatial patterns in all cases remain similar.If we take all ensemble members together and calculate the probability measure we get the result shown in Fig. 9d.
Compared to a single model output based on the spatial distribution of mass concentration these probabilities that contain at least the uncertainties caused by atmospheric variability and to some extent also the uncertainties due to emissions are a step forward in volcanic ash forecast for aviation safety.This approach can be improved by using more sophisticated ensemble methods as they are applied for example within TIGGE in the future.

Summary and conclusions
The online coupled model system COSMO-ART was used in an operational forecast mode at DWD to simulate number and mass concentrations of volcanic ash for six different size bins reaching from 1 to 30 µm for the time period 14 until 25 April 2010.
Sensitivity runs with non-calibrated ash concentrations showed the great importance of a correct treatment of the washout process and a sufficient vertical resolution.
A comparison of the qualitative results with backscatter ratios measured by a LIDAR system close to Munich shows a descending ash plume in good agreement.The rate of descent calculated from the model results and from the observations differs by 50 %.
Using observations carried out with the DLR research aircraft close to the volcano and close to Leipzig the model results were calibrated.Using only these single measurements already a good quantitative agreement between simulated and observed number concentrations at Hohenpeißenberg is achieved until 21 April.Then model results and observations differ substantially.The agreement is improved using a source strength correction factor based on the inverse modelling work by Stohl et al. (2011).The correlation coefficient for the time lagged ensemble reaches a value of 0.79.
Performing additional simulations with varying plume height at the volcano allows quantifying the uncertainties of the model results due to atmospheric variability and due to source parameters.Although one might argue that the source term will always raise the highest level of uncertainty we demonstrated that already a limited number of high-quality observations can help to calibrate the model and to reduce this uncertainty.Then the uncertainty caused by meteorology including the process of washout gains in importance.
www.atmos-chem-phys.net/14/7837/2014/Atmos.Chem.Phys., 14, 7837-7845, 2014 For the first time and based on results of the time lagged ensembles probabilities of the violation of a certain threshold of ash concentration in distinct flight levels are calculated.Although the ensemble technique we used needs further improvements such probabilities can serve in the future as valuable additional information for people responsible for aviation safety.

Figure 1 .
Figure 1.Temporal development of the height of the top of the plume in April 2010 at Eyjafjallajökull as used in the simulation (brown).Data was obtained from the reports of VAAC London.The blue lines give the source strength (solid: reference case, dotted: plume height increased by 1 km, dashed: plume height decreased by 1 km).

Figure 2 .
Figure 2. Logarithm of range-corrected signal of MULIS (multiwavelength lidar system) at = 1064 nm at Maisach from 16 April 2010, 17:00 UTC to 17 April 2010, 17:00 UTC and from 0 to 10 km above ground; white areas denote periods without measurements (taken from Gasteiger et al., 2011).The two arrows indicate the descent of the plume during two time intervals.

Figure 3 .
Figure 3. Vertical cross-section of the non-calibrated ash concentration (arbitrary units) at Munich for different model configurations.(a) reference run with 40 vertical layers, (b) model run with 80 vertical layers.

Figure 4 .
Figure 4. Temporal development of the non-calibrated simulated ash concentrations (arbitrary units) with a diameter of 1 µm at Hohenpeißenberg for the model configurations indicated in the legend.

Figure 5 .
Figure 5. Left: measured size distributions close to the volcano on 2 May, 2010.Right: measured (red) and simulated (blue) number concentrations for particles greater than 2 µm at Leipzig on 17 April 2010.Measured data taken from Schumann et al. (2011).

Figure 6 :Figure 6 .
Figure 6: Temporal development of the simulated (thin) and observed (thick) number density 594 of the 3 m particles.a) relation between source height and source strength following Mastin 595 et al. (2009) and b) correction of the source strength according to Stohl et al. (2011).The 596 thin lines give the results of the forecast runs that are started with a time lag of 6 h.597 598

Figure 7 .
Figure 7. Simulated and observed number concentrations for the model runs shown in Fig. 6b.The red curve shows the calculated regression line.The correlation coefficient is 0.79.

Figure 8 .
Figure 8. Simulated mean ash concentrations together with the ± σ range (σ = standard deviation of the individual ensemble).Green: standard simulation, blue: top of the plume height at the volcano reduced by 1000 m, orange: top of the plume height at the volcano increased by 1000 m.

Figure 9 :Figure 9 .
Figure 9: Probability to exceed a threshold concentration of volcanic ash of 2000 µg m -3 in % 641 between flight levels FL200 and FL350 (6096 m and 10668 m) at 16 April, 12:00 UTC.a: 642 model runs with reduced plume height; b: reference runs; c: model runs with increased 643 plume height; d: all model runs.644 645 Figure 9. Probability to exceed a threshold concentration of volcanic ash of 2000 µg m −3 in % between flight levels FL200 and FL350 (6096 m and 10 668 m) at 16 April, 12:00 UTC.(a) model runs with reduced plume height; (b) reference runs; (c) model runs with increased plume height; (d) all model runs.