Interactive comment on “ The impact of volcanic aerosol on the Northern Hemisphere stratospheric polar vortex : mechanisms and sensitivity to forcing structure ”

stratosphere caused by the radiative effects of the volcanic aerosols. In the present paper the impacts of 4 different aerosol forcings, all estimates of Pinatubo, are studied with a climate model. For each forcing an ensemble of 12 simulations is performed and the different ensembles are compared to a control ensemble. Large differences are found among the ensembles and the changes are found to be fragile and sensitive to the details of the forcings. A more robust result is that the temperature gradient in the lower stratosphere is not due only to the direct radiative forcing but to a large extent is due to an increased wave-driven meridional circulation. I find that the topic is interesting and that the paper contributes with important results.


Introduction
The Northern Hemisphere (NH) stratospheric winter polar vortex, which shows considerable interannual and intraseasonal variability, has been observed to be stronger than normal in winters after major volcanic eruptions (Kodera, 1995;Labitzke and van Loon, 1989).While this observation is based on a relatively small sample (limited to the winters after the 1963 Agung, 1982 El Chichón and1991 Pinatubo eruptions), the theoretical argument to explain such a strengthening appears clear: namely, that heating of the lower stratosphere through the absorption of radiation by volcanic sulfate aerosols enhances the equator-to-pole temperature gradient in the lower stratosphere, which, through the thermal wind equation, leads to stronger westerly winds (Robock, 2000 and references therein).Satellite observations clearly show a warming of the tropical lower stratosphere after volcanic eruptions (Labitzke and McCormick, 1992), so changes in meridional temperature gradients and zonal Published by Copernicus Publications on behalf of the European Geosciences Union.
winds are logical consequences.The degree to which secondary feedback mechanisms -such as changes in ozone or upward propagating planetary waves (e.g.Graf et al., 2007;Stenchikov et al., 2006) -also affect the vortex strength is at present unclear.
Post-volcanic strengthening of the NH polar vortex is an important step in a proposed mechanism which explains observed changes in surface climate in post-eruption winters (Kodera, 1994;Perlwitz and Graf, 1995).Specifically, it is thought that through stratosphere-troposphere coupling (Baldwin and Dunkerton, 2001;Gerber et al., 2012) the volcanically induced strong stratospheric polar vortex leads to the observed positive anomalies in surface dynamical indexes such as the Northern Annular Mode (NAM) or North Atlantic Oscillation (NAO) (Christiansen, 2008).Such dynamical changes lead to the "winter warming" pattern of postvolcanic temperature anomalies (Robock and Mao, 1992), which is characterized by warmer temperatures over large regions of the NH continents during winter which oppose the overall cooling impact of the volcanic aerosols on the surface.On the other hand, in a model study, Stenchikov (2002) found that a positive phase of the NAM was also produced in an experiment in which only the tropospheric impact of volcanic aerosols was included, implying that aerosol heating in the lower tropical stratosphere is not necessary to force a positive NAM response.Whatever the mechanisms, observations show that 11 out of 13 eruptions since 1870 were followed by positive wintertime NAO values (Christiansen, 2008): the apparent robustness of this post-eruption dynamical response should allow for enhanced skill in seasonal prediction for winters that follow volcanic eruptions (e.g.Marshall et al., 2009).
While a number of early model simulations reported qualified success in simulating the atmospheric dynamical response to volcanic eruptions (e.g.Graf et al., 1993;Kirchner et al., 1999;Rozanov, 2002;Shindell et al., 2001), assessments of the multi-model ensembles of the Coupled Model Intercomparison Projects (CMIP) 3 and 5 showed no significant winter warming response to prescribed volcanic forcing, nor did they show significant anomalies in post-eruption dynamical quantities in the stratosphere or at the surface (Driscoll et al., 2012;Stenchikov et al., 2006).It has been suggested that in order for a model to successfully respond to volcanic forcing, it should include a reasonably well-resolved stratosphere (Shindell, 2004;Stenchikov et al., 2006).However, analysis of the CMIP5 ensemble revealed no appreciable systematic difference in post-eruption geopotential height anomalies to volcanic aerosol forcing between models with or without well-resolved stratospheres (Charlton-Perez et al., 2013).
Most model simulations that incorporate the impact of volcanic eruptions, such as the CMIP3 and 5 historical simulations, do so using prescribed volcanic aerosol fields, which have associated uncertainties.In this study we investigate how the response of the atmosphere to volcanic aerosol forc-ing depends on the prescribed aerosol forcing used in the simulation.We use one CMIP5 model, the MPI-ESM, and focus on the first winter after Mt.Pinatubo, the period of strongest volcanic forcing within the era of satellite observations.We run ensemble simulations using four different forcing data sets: two observation-based forcing sets, based primarily on different versions of SAGE II aerosol extinction measurements, and two model-constructed aerosol forcing sets.By assessing the model response to these forcing sets, we investigate (1) the mechanisms linking volcanic aerosol heating in the lower tropical stratosphere and NH winter vortex strength, ( 2) what stratospheric circulation responses to volcanic aerosol forcing are robustly produced by the model and forcings, and (3) the sensitivity of the vortex response to changes in the space-time structure of the volcanic forcing.

Experiment: observational basis, model and methods
In this section, the hypothesis that volcanic eruptions produce a strengthened NH winter polar vortex is briefly summarized by examining ERA-Interim reanalysis data.Then, the materials and methods of the present modelling study are presented including the model and volcanic forcing sets used, and the design of the ensemble simulations and analysis.

NH polar vortex response to volcanic eruptions in ERA-interim
The NH polar vortex is highly variable as a result of unforced internal variability, and the impact of external forcings such as volcanic eruptions, the 11 year solar cycle, the El Niño-Southern Oscillation (ENSO), and the Quasi Biennial Oscillation (QBO).Isolating the vortex response to any individual forcing term can be difficult, especially in the case of major volcanic eruptions for which so few actual events have occurred within the era of satellite measurements.
A common and simple method to isolate the pure volcanic impact on the state of the NH winter stratosphere is to simply average post-eruptive anomalies over winters after recent major volcanic eruptions.This technique is here applied to ERA-Interim reanalysis (Dee et al., 2011) temperature and zonal wind fields after the eruptions of El Chichón (1982) and Pinatubo (1991).Post-eruption winter anomalies are constructed for the two winters after each eruption by differencing post-eruptive December-to-February (DJF) mean fields with fields averaged for 3 (El Chichón) and 5 (Pinatubo) years before the eruption (the shorter reference period for El Chichón results from the fact that the ERA-Interim data set begins in 1979).Differences in post-eruption anomalies for Pinatubo are not strongly dependent on the choice of a 3, 4 or 5 year reference period.
Figure 1 (left) shows ERA-Interim temperature and zonal wind anomalies composited for four winters, the two winters each after Pinatubo and El Chichón.Mean tempera- ture anomalies in the post-volcanic composite show positive anomalies in the tropical lower stratosphere, as would be expected due to aerosol heating.Temperature anomalies also show cooling of the tropical upper stratosphere, cooling of the NH polar lower stratosphere, and warming of the NH polar upper stratosphere.Mean post-volcanic zonal winds show a strengthening of the NH winter polar vortex by ∼ 6 m s −1 .Such a simple average with a small sample size does not completely remove the influences of variability resulting from other forcing terms -notably, the solar cycle was at a maximum at the times of both the El Chichón and Pinatubo eruptions, and El Niño events were observed in the first winters after both eruptions -but the composite temperature and zonal wind anomaly structure is certainly a better approximation of the direct volcanic impact than anomalies in any single post-volcanic year.For comparison, single winter anomalies are shown for the first winter after the Pinatubo eruption.Temperature anomalies for DJF 1991/92 roughly follow the structure of the 4 year composite, albeit with tropical positive anomalies located at higher altitudes, and weaker polar lower stratosphere cooling.Post-Pinatubo zonal wind anomalies in the tropics reflect the state of the QBO at the time, with negative (easterly) wind anomalies in the middle stratosphere (∼ 50-15 hPa) and positive (westerly) anomalies in the upper stratosphere (15-2 hPa).The polar vortex in the first post-Pinatubo winter was actually not as clearly enhanced as the 4 year composite, with positive zonal wind anomalies only in the mid to lower polar stratosphere centred at ∼ 70 • N. It is likely that in addition to the volcanic forcing, the vortex in DJF 1991/92 was weakened somewhat due to the influences of the concurrent forcing of the El Niño and QBO easterly phase.Based on these arguments, it can be hypothesized that the pure response of the stratosphere to Pinatubo aerosol forcing would have the approximate structure of the composite response shown in Fig. 1 (left), albeit with greater amplitude, since the aerosol optical depth and hence aerosol radiative heating during the first post-Pinatubo winter is the strongest of the years used in the volcanic composite.This "expected" response in the first post-Pinatubo winter is based on a small sample size of observations, and an assumption of linear response to the magnitude of volcanic forcing; however, in light of limited evidence, it represents a best first-order, observation-based expectation, consistent with that assumed explicitly or implicitly in prior studies.

MPI-ESM
The MPI-ESM is a full Earth system model, with atmosphere, ocean, carbon cycle, and vegetation components.Major characteristics of the full ESM and its performance in the CMIP5 experiments are described by Giorgetta et al. (2013).
The "low resolution" model configuration (MPI-ESM-LR) is used here, with horizontal resolution of the atmospheric component given by a triangular truncation at 63 wave numbers (T63) and 47 vertical layers extending to 0.01 hPa.Unlike model configurations with higher vertical resolutions, the LR version has no internally generated QBO.CMIP5 historical simulations have previously been performed with the MPI-ESM over the time period 1850-2005.Prescribed external forcings for the historical simulations, including volcanic aerosols as well as greenhouse gases and ozone, follow CMIP5 recommendations, and the responses to these forcings are described by Schmidt et al. (2013).
The MPI-ESM is configured to take volcanic aerosol forcing data in two formats, both of which are used in this study.One format consists of monthly zonal mean values of aerosol extinction, single scattering albedo, and asymmetry factor as a function of time, pressure, and the 30 short wave and long wave spectral bands used by the model.This format is consistent with the observation-based forcings sets introduced in Sect.2.3.A second format consists of monthly zonal mean aerosol extinction at 0.55 µm, and zonal mean effective radius, both as a function of latitude, height, and time.Pre-calculated look-up tables are then used to scale the 0.55 µm extinction to the wavelengths of the model's radiation code based on the effective radius.This methodology has been used to perform MPI-ESM simulations using the forcing time series of Crowley et al. (2008, e.g. Timmreck et al., 2009) or output from prior runs of the MAECHAM5-HAM coupled aerosol-climate model (e.g.Timmreck et al., 2010) as done in this study and described in Sect.2.4.

Observation-based aerosol forcing sets
Volcanic sulfate aerosol forcing for the MPI-ESM CMIP5 historical simulations is based on an extended version of the aerosol data set developed by Stenchikov et al. (1998, hereafter S98) on the basis of SAGE II measurements of aerosol extinction at 1.02 µm and estimates of effective radii derived from instruments on the Upper Atmosphere Research Satellite.The data are given at 40 pressure levels and interpolated to the actual hybrid model layers during the simulations.The S98 data set is based primarily on retrievals of aerosol extinction at 1.02 µm from SAGE II, with gaps filled with data from ground-based lidar systems.Together, the S98 forcing set and that from Sato et al. (1993, with updates http://data.giss.nasa.gov/modelforce/strataer/),both primarily based on SAGE II data, have been used in roughly half of the models that performed CMIP5 historical simulations (Driscoll et al., 2012).
Subsequent updates to the SAGE II retrievals have led to significant changes in the space-time morphology of the estimated aerosol extinction after Pinatubo (Arfeuille et al., 2013;Thomason and Peter, 2006).A new volcanic forcing set (SAGE_4λ Arfeuille et al., 2013) has been compiled and made available to modelling centres (http://www.pa.op.dlr.de/CCMI/CCMI_SimulationsForcings.html),specifically for use in chemistry climate simulations within the Chemistry Climate Model Intercomparison (CCMI) initiative (Eyring and Lamarque, 2013).Time series of zonal mean aerosol optical depth (AOD) at 1 µm -the wavelength closest to the original SAGE II measurements and so less impacted by uncertainties in derived aerosol size distribution -are shown in Fig. 2 over the Pinatubo period for the S98 and SAGE_4λ reconstructions.
It should be noted that even for Pinatubo -the bestobserved eruption in history -the observation-based volcanic aerosol forcing sets suffer from significant but mostly unquantified uncertainties.Most notably, gaps in the satellite record result from sparse sampling of the satellite instruments (Stenchikov et al., 1998) and the fact that large optical depths in the initial months after the Pinatubo eruption reduced atmospheric transmission below detectability (Russell et al., 1996).

Model-based aerosol forcing sets
Two "synthetic" volcanic aerosol forcing sets were constructed based on a 12-member ensemble of simulations of a Pinatubo-like eruption using the MAECHAM5-HAM coupled aerosol-climate model with SO 2 injections of 17 Tg and prescribed climatological sea surface temperatures (Toohey et al., 2011).Figure 3 shows lower stratospheric zonal mean temperature anomalies (at 100 hPa) and zonal wind anomalies (at 50 hPa) for these simulations, in comparison with ERA-Interim post-volcanic anomalies described in Sect.2.1.Most of the MAECHAM5-HAM ensemble members (roughly 9/12) show characteristics of a strengthened polar vortex in the lower stratosphere, as quantified as negative temperature anomalies at polar latitudes and positive zonal wind anomalies between 60 and 80 • N in Fig. 3.However, the ensemble variability of the simulations is pronounced, with three members showing a weakened polar vortex with positive temperature anomalies over the polar cap and negative wind anomalies between 60 and 80 • N. From the full 12-member ensemble, two subsets were defined based on the zonal wind anomalies, with strong and weak vortex composites (hereafter SVC and WVC) defined respectively as the average of the three realizations with the most positive and most negative zonal wind anomalies at 50 hPa and 70 • N. Aerosol properties (aerosol extinction at 0.55 µm and effective radius) for these two composites were collected for use in MPI-ESM simulations.SVC and WVC zonal mean AOD time series, scaled to 1 µm to be consistent with the observation-based AOD time series, are shown in Fig. 2.

Experiments
The forcing sets described above were used to force four 16member ensemble simulations of the Pinatubo eruption time period.The number of MPI-ESM realizations used here is therefore notably greater than the three MPI-ESM realizations used in prior single- (Schmidt et al., 2013) or multimodel (Charlton-Perez et al., 2013;Driscoll et al., 2012) investigations of the CMIP5 historical simulations.Ten of the 16 unique initial condition states (at June 1991) were taken from 10 independent, pre-existing CMIP5 historical simulations.In order to increase the ensemble size, six of the historical simulations were restarted in 1980 with a small atmospheric perturbation applied, and integrated until 1991.All simulations were therefore forced with S98 volcanic forcing up until June 1991, at which point forcing either continued as S98 or switched to one of the other forcings.
A control ensemble (CTL) is based on the 5 year period 1986-1990 for the original 10 historical simulations used to produce the initial conditions, comprising in total 50 years, during which the other external forcings are negligibly different from 1991-1992 conditions.Anomalies for the volcanic ensembles are computed by differencing ensemble mean results with the CTL ensemble mean.
Observations imply that the NH winter vortex response to volcanic forcing may last for 2 years.While considering the results for the first two winters together certainly increases the ensemble size, it is quite possible that the mechanisms leading to vortex response are different in the first and second winters after an eruption.The magnitude of aerosol forcing will be different for the two winters for any forcing set, with typically stronger aerosol forcing occurring in the first winter.Such temporal changes are likely larger than differences between the different forcing sets used in this study.With these potential complications in mind, in order to simplify the analysis and interpretation of the model results in this study, we choose to focus our analysis only on the first winter after the Pinatubo eruption.
Unless otherwise stated, results shown are ensemble means.Confidence intervals (95 % level) are calculated for differences between forced and CTL ensemble means and the differences between the forced ensemble means using a bootstrapping technique, with 1000 resamples of the original ensembles.For example, for each latitude and pressure level, and for each forced ensemble, 1000 bootstrapped sample means are produced by sampling the 16 ensemble member values with replacement, resulting in an approximation of the uncertainty distribution for the mean value.The same process is applied to the CTL ensemble, with n = 50.The difference between the two bootstrapped distributions defines the uncertainty in the mean difference, and is used to define the 95 % confidence interval of the ensemble mean difference.A parallel procedure is used to define the confidence intervals for the differences of two volcanically forced ensembles.

Radiative forcing
Latitude-pressure plots of zonal mean DJF 1 µm extinction (EXT) are shown in Fig. 4 for the S98 and SAGE_4λ forcing sets.A major difference between the two forcing sets is the vertical distribution of extinction in the tropics, with SAGE_4λ extinction being more constrained to the lower stratosphere, compared to S98 which has considerable extinction extending down into the upper troposphere.This difference in tropical extinction is the result of improvements in the SAGE_4λ retrieval: the S98 vertical distribution in the tropics is very likely unrealistic (Arfeuille et al., 2013).The two forcing sets also differ in terms of the magnitude of tropical extinction, with SAGE_4λ having stronger extinction in the tropical lower stratosphere for all wavelengths greater than 0.55 µm.Differences in extinction magnitude are also apparent in the high-latitude lower stratosphere of both hemispheres, with SAGE_4λ extinction much smaller than that of S98.
Direct aerosol radiative heating rates (Q aer ) are computed by performing radiative transfer calculations at each model time step twice, once with and once without the volcanic aerosols (as in Stenchikov et al., 1998).We have calculated Q aer for single realizations of each forced ensemble.
Net (long wave + short wave) Q aer for DJF for the two observation-based forcing sets is shown in Fig. 4. Q aer values are positive over most of the stratosphere for both S98 and SAGE_4λ forcings, with highest magnitude in the tropical lower stratosphere at approximately 30 hPa, just north of the equator.Like the extinction values (Fig. 4, upper row), S98 heating rates are spread over a larger vertical extent than SAGE_4λ heating rates: at the equator, S98 heating rates > 0.1 K day −1 extend from ∼ 100 hPa upwards whereas for the SAGE_4λ forcing set, heating rates > 0.1 K day −1 begin ∼ 60 hPa.Like the extinction at 1 µm (and longer wavelengths) SAGE_4λ forcing leads to stronger Q aer , with maximum values of 0.5 K day −1 compared to max values of 0.3 K day −1 for S98.Although minor compared to the differences in the tropical stratosphere, there are differences in Q aer in the NH polar latitudes, with S98 leading to slightly larger Q aer values in the NH polar lower stratosphere than for SAGE_4λ.
Figure 5 shows DJF 1 µm aerosol extinction for the SVC and WVC forcing sets.Compared to the observation-based forcing sets, both model-based forcing sets have greater magnitudes of aerosol extinction, especially in the high latitudes.Such differences likely arise from a combination of (1) underestimates in the observation-based forcing sets due to saturation effects, especially in the weeks directly following the eruption, and (2) potential errors in the model simulations, in terms of either general model deficiencies or errors in the model formulation in regard to the actual Pinatubo eruption (e.g.SO 2 injection amount or height).The model-based extinctions also have stronger gradients (both vertical and horizontal) across the tropopause compared to the observationbased forcing sets.These differences, especially in the vertical, are likely due to the limited vertical resolution of the observations.The primary difference between the SVC and WVC forcings is the hemispheric partitioning of the aerosol extinction, with WVC having larger extinctions in the NH than SVC.We interpret this difference as a result of wave driven circulation in the NH: in cases of strong NH wave forcing in the original MAECHAM5-HAM runs, a stronger residual circulation transports more aerosol from the tropical region towards the NH, while also disturbing the NH polar vortex.Therefore, by selecting cases of weak polar vortex, we also select cases of strong northward aerosol transport.Another major difference is the magnitude of extinction in the NH high latitudes, and therefore the gradient in extinction around 60 • N. The stronger aerosol extinction gradient in the SVC forcing set is obviously a result of the strong vortex in the MAECHAM5-HAM simulations, which inhibits mixing of aerosol into the polar cap.In the tropics, SVC and WVC have very small differences, and their vertical distribution is almost identical.
Q aer values from the model-based forcing sets, also shown in Fig. 5, are more similar to the observation-based Q aer values than the extinctions: the differences in high-latitude extinction have relatively little impact on the Q aer differences due to the much weaker long wave radiation field here than in the tropics.Differences in Q aer between the two modelbased forcing ensembles are relatively small (compared to differences between the two observation-based forcing ensembles), and are characterized primarily by the north-south shift between the two forcing sets, and the gradients in Q aer at 60 • S and 60 • N. At high latitudes, Q aer values are apparently not strongly dependent on the aerosol extinction; for example, SVC has smaller aerosol extinction values in polar latitudes than WVC, but shows larger Q aer .This is likely due to the fact that the net (absorption-emission) long wave heating rate is strongly temperature dependent because of the Stippling highlights anomalies that are significant at the 95 % confidence level based on a bootstrapping algorithm.
temperature dependence of the emission.As shown below, the SVC ensemble is characterized by a colder polar vortex than for WVC, which should lead to less emission and thus larger net heating.

Temperature and zonal wind response
In order to determine which thermal and dynamical responses to the volcanic forcings are robust across the different forcings, results are first examined by concatenating the 64 volcanic simulations into one "grand" ensemble (VOLC), with ensemble mean anomalies of temperature and zonal wind shown in Fig. 6.Significant and robust temperature responses in the volcanic simulations include positive anomalies in the tropical lower stratosphere and negative temperature anomalies in the troposphere, extending to the surface between approximately 70 • S-45 • N. In the NH high-latitude stratosphere, the temperature anomaly structure for the VOLC ensemble is roughly consistent with that shown by Schmidt et al. (2013) for three MPI-ESM realizations (using S98 forcing), with positive temperature anomalies in the upper stratosphere and mesosphere.However, while temperature anomalies for the VOLC ensemble are significant throughout most of the tropopause and lower-to-middle stratosphere, anomalies in the NH polar region are generally not significant.
Significant zonal wind responses in the VOLC ensemble include weakening of the subtropical jets at ∼ 30 • and ∼ 100 hPa of both hemispheres by 2-4 m s −1 , and a weaken-ing of the SH stratospheric easterlies by 4-6 m s −1 .Grand ensemble mean zonal wind anomalies in the NH highlatitude stratosphere reach a maximum of ∼ 2-4 m s −1 , and are not significant.
Temperature and zonal wind anomalies are shown separately for the observation-and model-based forcing ensembles in Figs.7 and 8, respectively.For the observation based forcings S98 and SAGE_4λ (Fig. 7), the general features are consistent with the VOLC grand ensemble.In agreement with Q aer of Fig. 4, SAGE_4λ tropical temperature anomalies are greater in magnitude, with peak values of 4.8 K compared to 3.6 K for S98.Temperature anomalies are also shifted in height between the two ensembles, with peak temperature anomalies located at 30 hPa for SAGE_4λ, compared to 50 hPa for S98.Differences in tropical temperature anomalies between the two ensembles (right-most column of Fig. 7) are significant at the 95 % confidence level between approximately 200 and 20 hPa.The SAGE_4λ ensemble shows slightly larger warming in the polar upper stratosphere, although the difference between S98 and SAGE_4λ is not significant.
In the NH high latitudes, the S98 ensemble shows a weak (2-3 m s −1 ) increase in westerly wind, while the SAGE_4λ ensemble shows a weak (3-4 m s −1 ) negative wind anomaly in the upper stratosphere/mesosphere. Differences between the two ensembles are not significant in the midlatitudes to high latitudes.For the model-based forcing ensembles SVC and WVC (Fig. 8), temperature anomalies in the tropics and midlatitudes are quite similar in structure between the two ensem-bles and similar to the grand ensemble mean.In the NH high latitudes, however, the temperature responses are quite different in structure between the two ensembles.The SVC ensem-ble produces an NH high-latitude temperature anomaly pattern with significant warming in the upper stratosphere and lower mesosphere.WVC, on the other hand, gives a temperature anomaly pattern with insignificant positive temperature anomalies in the polar lower and middle stratosphere.Differences between the two forcing sets are significant in the polar lower and upper stratosphere.
Zonal wind anomalies for the model-based forcing ensembles follow from the temperature anomalies.The SVC ensemble produces a significant strengthening of the NH polar vortex, with peak zonal wind anomalies 6-8 m s −1 .The WVC ensemble produces no significant change in NH vortex winds.The difference between the vortex response in the SVC and WVC ensembles is significant in the polar lower stratosphere, and in the mid-to-upper stratosphere at the highest latitudes (70-90 • N).
DJF zonal mean zonal wind at 60 • N and 10 hPa for each realization of each ensemble is shown in Fig. 9. Ensemble means with 95 % confidence intervals are also shown.The mean of the control ensemble is marked by a horizontal dashed line.Consistent with results discussed above, only the SVC ensemble shows a significant zonal wind response to volcanic forcing, with an ensemble mean 95 % interval that excludes the control mean.In terms of individual ensemble members, 13/16 members of the SVC ensemble have zonal mean wind greater than the control mean, although all of the members lie within the natural variability of the control ensemble.Three SVC members have zonal winds weaker than the control; as such, the response to SVC forcing is not totally robust and the increase in ensemble mean vortex strength appears to represent a change in the probability of weak vs. strong vortex states.The WVC and S98 ensembles show ensemble means and spreads very similar to the control ensemble, with S98 showing a weak and insignificant positive anomaly in wind strength.The SAGE_4λ ensemble shows an ensemble mean equal to that of the control ensemble, but interestingly shows some evidence of a decrease in ensemble variability.

Wave-driven circulation response
For all ensembles, we have computed transformed Eulerian mean (TEM) diagnostics (Andrews et al., 1987), including Eliassen-Palm (EP) fluxes, the meridional residual mass circulation stream function (ψ * ), the residual vertical velocity (w * ), and temperature tendencies due to vertical residual advection.Ensemble means of these quantities have been compared to values from the control ensemble in order to compute post-volcanic anomalies in the first NH winter.As for the temperature and zonal wind anomalies, selected TEM quantities are examined first in terms of the grand VOLC ensemble in Fig. 10.
It is well known that the variability of polar vortex strength is largely controlled by planetary wave drag, and therefore depends on the upward wave flux from the troposphere into the stratosphere (Newman et al., 2001;Polvani and Waugh, 2004).The vertical component of EP-flux (F z ) is a commonly used proxy for the amount of wave activity entering and propagating through the stratosphere.Figure 10 shows DJF F z for the control ensemble and anomalies for the VOLC grand ensemble.Around the tropopause (∼ 100-200 hPa), F z anomalies in the VOLC ensemble are negative in the midlatitude (30-45 • ) regions of both hemispheres, and generally positive in the midlatitudes to high latitudes (45-90 • N).Positive F z anomalies are significant throughout the SH stratosphere, while in the NH, significant positive F z anomalies occur around 60 • N and 100 hPa, and extend upwards, slanting equatorward with height.
Convergence of EP-flux (or negative values of EP-flux divergence, EPFD) leads to wave drag, a slowing of the wintertime westerly (eastward) zonal wind and a poleward residual circulation.Enhanced wave drag in volcanic simulations is found throughout the SH stratosphere and in the midlatitude NH middle stratosphere (around 30 • N, 10 hPa, Fig. 10d).Wave drag in the latter location is especially important for forcing the residual circulation (Shepherd and McLandress, 2011).Figure 10e and f show the CTL and VOLC anomalies of the meridional residual circulation stream function.The poleward NH residual circulation stream function is found to be enhanced in the VOLC ensemble.
The volcanically induced residual circulation anomalies drive adiabatic heating anomalies where vertical motions are induced.Temperature tendency anomalies due to residual vertical velocity (hereafter dT w * , Fig. 10g, h) show clearly the tropical cooling associated with anomalous vertical upwelling, and heating at the midlatitudes and high latitudes due to anomalous downwelling.
The terms dT w * and Q aer are found to dominate the temperature tendency budget compared to other terms in the TEM diagnostics.Therefore, the temperature anomalies found in the volcanic simulations can be understood to be pri- marily the result of the direct (diabatic) aerosol heating Q aer , and the indirect, dynamical (adiabatic) heating dT w * .These terms, along with the corresponding temperature anomalies, are shown for each of the volcanic ensembles as lower stratosphere (100-20 hPa) averages in Fig. 11.
Ensemble mean temperature anomalies show roughly similar behaviour in the tropics and midlatitudes (0-55 • N) for all ensembles.In the NH polar latitudes, weak positive temperature anomalies are simulated for the S98, SAGE_4λ and WVC ensembles, and negative anomalies for the SVC en-semble.Q aer peaks in the tropics and decays to zero between 30 and 60 • N. dT w * is negative in the tropics, opposing the Q aer heating, and becomes generally positive in the extratropics where downward advection occurs.In the high latitudes, dT w * is positive for S98, SAGE_4λ and WVC, but negative for SVC, consistent with the temperature anomalies (Fig. 8).It is thus clear that the differences in temperature anomalies at high latitudes, and therefore the temperature gradients around 60 • N, result from differences in dT w * between the ensembles.The effect of differences in temperature tendencies on temperature anomalies is likely amplified at higher latitudes since radiative damping timescales increase with latitude in the lower stratosphere (Newman and Rosenfield, 1997).These results underscore the point that temperature gradients at the high latitudes are controlled by the structure of the volcanically induced residual circulation anomalies rather than the direct aerosol heating.
Differences in the dynamical responses to the SVC and WVC volcanic forcings are further explored by examining TEM diagnostics for these ensembles.As in the VOLC grand ensemble, both model-based forced ensembles show positive F z anomalies throughout the SH stratosphere and negative anomalies in the subtropical tropopause region (Fig. 12a,  b).Positive F z anomalies are found at the NH high-latitude lower stratosphere, extending upwards and slanting equatorward with height.However, these positive F z anomalies are much stronger (and significant only) in the WVC ensemble.
As in the VOLC ensemble, negative EP-flux divergence (wave drag) is significantly enhanced in the SH stratosphere and in the NH midlatitude middle stratosphere in both the SVC and WVC ensembles (Fig. 10d, e).NH wave EP-flux divergence anomalies are notably stronger in the WVC ensemble between 10 and 100 hPa and 30-60 • N; differences between the two ensembles are significant around 100 hPa and 60 • N. Residual circulation anomalies (Fig. 12g, h) show significant poleward circulation cells in both hemispheres.Notable differences in the form of the induced residual circulation cells between the two ensembles exist in the lower and middle NH stratosphere, with circulation cells confined to 0-45 • N in the SVC ensemble, and extending 0-90 • N in the WVC ensemble.These differences are significant in around 100 hPa and 60 • N, and as such appear related to the differences in EP-flux divergence discussed above.Temperature tendency anomalies due to residual vertical velocity (dT w * , Fig. 10j, k) show that the broad residual circulation anomaly cell in the WVC ensemble leads to dynamical heating of the lower stratosphere extending from ∼ 45-90 • N, while the narrower poleward cell in SVC leads to significant heating only in the midlatitude (45-60 • N) lower stratosphere.Differences in dynamical warming are marginally significant in the polar lower stratosphere, consistent with the differences in EP-flux divergence and residual circulation.
The TEM fields of Fig. 12 thus show that the significant difference in polar vortex zonal wind response to the SVC and WVC volcanic forcings comes about due to differences in the wave activity propagating upwards into the stratosphere from the troposphere.This mechanism is further explored in Fig. 13 for all volcanic ensembles.Figure 13 (left) shows the relationship between polar vortex wind (u at 10 hPa, 60 • N, hereafter u vortex ) and lower stratosphere polar temperature (T at 50 hPa, 60-90 • N average, hereafter T vortex ).Individual ensemble members are shown by circular markers, while ensemble mean values are shown as triangles on the bottom and right-hand axes.The compact linear relationship between u vortex and T vortex apparent in the control run is shifted in the volcanic simulations: for a given T vortex , the associated u vortex is typically ∼ 5 m s −1 stronger in the volcanic runs than in CTL.The SVC ensemble mean T vortex is similar to the CTL ensemble mean, and correspondingly, the SVC u vortex is larger than the CTL ensemble mean by 5-10 m s −1 .In contrast, the other volcanic ensembles all show increases in T vortex , and correspondingly, u vortex for these ensembles is less than that of SVC, and similar to the CTL ensemble mean value.
Polar lower stratosphere temperatures are related to upward EP-flux in the midlatitudes (Newman et al., 2001;Polvani and Waugh, 2004).In Fig. 13b, u vortex is plotted vs. F z at 100 hPa averaged over 45-75 • N (hereafter F z,100 , a common scalar metric for wave activity entering the extratropical stratosphere).Polar vortex strength, u vortex , is seen to be stronger for any particular F z,100 value in the volcanic runs compared to the CTL ensemble.For the SVC ensemble, F z,100 is comparable to the CTL value, and the u vortex anomaly is 5-10 m s −1 .The other ensembles show positive anomalies in F z,100 ; the zonal wind of these volcanic ensembles are thus reduced due to the increased wave activity and hence wave drag.Post-volcanic anomalies in F z,100 thus exert a negative feedback on the wind anomalies driven by changes in the lower stratosphere temperature gradient.

Discussion
A major result of the preceding sections is that the temperature structure of the lower stratosphere in post-volcanic winters in the high latitudes is controlled primarily by induced residual circulation anomalies.Post volcanic-eruption enhancement of the stratospheric residual circulation (or Brewer-Dobson circulation) has been suggested based on previous model studies (Pitari and Mancini, 2002;Pitari, 1993).Graf et al. (2007) assessed the observational record and found evidence of increased winter stratospheric wave activity after three eruptions (Agung, El Chichón and Pinatubo).Poberaj et al. (2011) showed that anomalously strong EP-fluxes occurred in the SH after Pinatubo.Such increases in winter stratospheric wave activity may be a result of changes in the wave propagation conditions of the stratosphere following aerosol radiative heating, allowing more planetary waves to propagate upwards.Similar arguments explain climate models' predicted increase in future stratospheric residual circulation due to changes in the atmospheric temperature structure due to climate change (Garcia and Randel, 2008;McLandress and Shepherd, 2009;Shepherd and McLandress, 2011), which also enhances the meridional temperature gradient in the lower stratosphere, albeit at lower altitudes.
The residual circulation anomalies induced by the volcanic forcing in the ensembles strengthen the climatological residual circulation, although the structure of the anomalies is different to the climatology.For example, the induced upwelling is centred on the equator whereas the maximum climatological upwelling is centred in the summer hemisphere, and the induced upwelling is strongest above the level of maximum aerosol radiative heating, and even negative below.This latter point may explain why post-Pinatubo anomalies in tropical upwelling are not apparent in observational records, which are usually displayed as time series of upwelling in the lower stratosphere (Seviour et al., 2012).We have shown that increased wave activity and wave drag in both hemispheres is a robust response to volcanic aerosol forcing in the MPI-ESM, and therefore a component of the residual circulation anomalies in the volcanically forced simulations results from wave drag anomalies.However, given the fact that maximum anomalous tropical upwelling occurs at and above the location of maximum aerosol radiative heating in the tropics, it seems possible that there is also a diabatic component to the anomalous residual circulation, forced directly by the aerosol radiative heating, as suggested by previous studies (e.g.Aquila et al., 2013;Pitari and Mancini, 2002).
The lack of a robust NH polar vortex response to the Pinatubo forcings used here does not necessarily rule out the possibility that other forcings could produce significant vortex responses.Most obviously, the magnitude of the volcanic forcing may be essentially important.Toohey et al. (2011) found a significant vortex response to a near-super eruption volcanic forcing, which likely produced a much stronger direct aerosol radiative heating gradient in the midlatitudes to high latitudes.Similarly, Bittner et al. ("Sensitivity of the Northern Hemisphere winter stratosphere variability to the strength of volcanic eruptions", submitted manuscript) find a significant response of the MPI-ESM NH polar vortex to volcanic forcing representing that of the 1815 eruption of Tambora.It is also important that our simulations do not include possible chemical depletion of stratospheric ozone brought about by the presence of volcanic aerosols, or ozone anomalies due to changes in the residual circulation.Induced changes in the meridional structure of stratospheric ozone can also affect temperature gradients in the lower stratosphere (Muthers et al., 2014), and may be a significant feedback in the response of the polar vortex to volcanic forcing.
While the SAGE_4λ volcanic forcing set is almost certainly a more accurate reconstruction of the true aerosol evolution than S98 in many aspects, e.g. the vertical distribution of aerosol extinction in the tropics, it does not produce the expected increase in NH polar vortex strength in simulations with the MPI-ESM.It actually leads to a somewhat weaker vortex than the S98 forcing, which is especially surprising given that it produces stronger radiative heating in the tropical lower stratosphere than S98.
The model-based SVC forcing was the only forcing to produce a significant vortex strengthening.Differences in simulated vortex response between the different ensembles imply sensitivity in vortex response to the exact structure of the volcanic forcing.These differences between the ensembles, and especially the significant response of SVC ensemble, should perhaps be taken with a grain of salt: the large variability of wave drag and vortex dynamics necessitates the use of large ensembles in order to negate the impact of variability on the ensemble means.While our ensemble size of 16 is larger than most prior single-model volcanic studies, it may still be insufficient to unambiguously identify significant highlatitude responses from volcanic forcing.Furthermore, given the anomalous heating of the lower stratosphere in volcanic simulations, it could be the case that insignificant anomalies in residual circulation lead to significant anomalies in ensemble mean temperature gradients and therefore zonal wind.Nevertheless, planetary wave propagation through the tropopause region has been shown to be quite sensitive to local vertical and meridional gradients in zonal wind and temperature (e.g.Chen and Robinson, 1992).It is plausible that small differences in the structure of the prescribed volcanic aerosol forcing, and resulting radiative heating, temperature and wind, could have relatively large impacts on wave propagation.

Conclusions
In simulations of the post-Pinatubo eruption period with the MPI-ESM with four different volcanic aerosol forcings, an enhanced polar vortex -which is expected based on limited observations and simple theoretical arguments -was not a robust response.The responses that were significant and robust across all four forcings in the NH winter stratospheric include: (1) positive temperature anomalies in the lower tropical stratosphere, (2) enhanced F z in the NH midlatitudes (40-60 • N) and wave drag in the midlatitude middle stratosphere, (3) enhanced meridional residual circulation, (4) dynamical cooling of the tropical lower stratosphere and heating of the midlatitude lower stratosphere.
The lack of robust polar vortex response to volcanic aerosol forcing in the MPI-ESM simulations of this work is consistent with the multi-model results of the CMIP5 historical experiments (Charlton-Perez et al., 2013;Driscoll et al., 2012).We have shown that the meridional temperature gradient in the extratropical lower stratosphere induced by volcanic aerosol forcing, and therefore the strength of the induced stratospheric winter polar vortex wind anomalies, is controlled primarily by dynamical heating associated with the induced residual circulation rather than the direct aerosol radiative heating.The vortex response in the model is therefore much less robust than would be expected if it were directly due to the aerosol radiative heating, as it is instead sub-ject to complexity and variability of wave-mean interaction in the winter stratosphere.
results further imply that the NH polar vortex response is quite sensitive to the specific structure of the volcanic forcing used.Generally, volcanic forcing leads to an overall increase in resolved wave activity entering the midlatitude to high-latitude stratosphere, and the impact of this wave activity tends to weaken the polar vortex, counteracting the impact of low-latitude heating.However, one volcanic forcing set used here, the model-based SVC forcing, did not significantly affect high-latitude wave activity, and thereby produced a strengthened polar vortex, approximately in line with the expected response.We speculate that such sensitivity is due to the role that minor differences in volcanically induced temperature and wind anomalies play in wave-mean flow interactions in the stratosphere.An alternative theory is that differences in volcanic forcing lead to differences in wave generation in the troposphere.In either case, the results imply that -at least for a Pinatubo magnitude eruption -very accurate reconstructions of volcanic aerosol forcing would be required to reproduce any impact of the aerosols on polar vortex strength in climate model simulations.

Figure 1 .
Figure 1.ERA-Interim DJF (top) temperature and (bottom) zonal wind anomalies for (left) post-volcanic composite (n = 4) of two winters after eruptions of El Chichón and Pinatubo, and (right) 1991/92, the first winter after the Pinatubo eruption.

Figure 2 .
Figure 2. Zonal mean aerosol optical depth at 1 µm for the Pinatubo eruption, (left) as reconstructed from observations producing the S98 and SAGE_4λ forcing data sets and (right) based on MAECHAM5-HAM model simulations and composited according to strong (SVC) and weak (WVC) vortex states.Dashed vertical lines demarcate the DJF period of interest.

Figure 3 .
Figure 3. First post-eruption NH winter (DJF) anomalies of (left) temperature at 100 hPa and (right) zonal wind at 50 hPa for the 12-member MAECHAM5-HAM Pinatubo ensemble of Toohey et al. (2011) (grey lines).Ensemble members constituting the strong and weak vortex composites (SVC and WVC) as defined in the main text are marked by circles and crosses, respectively.For comparison, ERA-Interim anomalies based on a composite of the two winters after the eruptions of El Chichón and Pinatubo (blue), and the single winter after the Pinatubo eruption (red) are also shown.

760Figure 4
Figure 4: (top) latitude-pressure distributions of 1 μm aerosol extinction in the S98 and SAGE_4λ 761 volcanic aerosol forcing sets in first DJF after Pinatubo (winter 1991/92).(bottom) aerosol radiative 762 heating (Q aer ) as computed within the MPI-ESM model for each forcing set.Right-hand column shows 763 S98-SAGE_4λ differences for both 1 μm extinction and Q aer .764 765 Figure 5: (top) latitude-pressure distributions of 1 μm aerosol extinction in the SVC and WVC volcanic 768 aerosol forcing sets in first DJF after Pinatubo (winter 1991/92).(bottom) aerosol radiative heating (Q aer ) 769 as computed within the MPI-ESM model for each forcing set.Right-hand column shows SVC-WVC 770 differences for both 1 μm extinction and Q aer .771 772

Figure 6 .
Figure 6.DJF temperature (top) and zonal wind (bottom) from the CTL ensemble (left) and anomalies for the VOLC grand ensemble (right).Stippling highlights anomalies that are significant at the 95 % confidence level based on a bootstrapping algorithm.

Figure 7 :Figure 7 .
Figure 7: Temperature and zonal wind response of the observations-based volcanic forced ensembles.783 Shown are DJF anomalies for the (left) S98 and (middle) SAGE_4λ ensembles, and (right) the difference 784 between the two ensembles for (top) temperature and (bottom) zonal wind.Hatching highlights 785 anomalies which are significant at the 95% confidence level based on a bootstrapping algorithm.786 787

Figure 8 :Figure 8 .
Figure 8: Temperature and zonal wind response of the model-based volcanic forced ensembles.Shown 790 are DJF anomalies for the (left) SVC and (middle) WVC ensembles, and (right) the difference between 791 the two ensembles for (top) temperature and (bottom) zonal wind.Hatching highlights anomalies which 792 are significant at the 95% confidence level based on a bootstrapping algorithm.793 794

Figure 9 .
Figure 9. DJF zonal wind at 60 • N and 10 hPa for the CTL ensemble and the volcanically forced ensembles S98, SAGE_4λ, SVC and WVC.Individual ensemble members shown as grey symbols.Ensemble means shown in coloured symbols, with vertical whiskers representing the 95 % confidence interval of the ensemble mean.Dashed horizontal line shows the ensemble mean of the CTL ensemble.

Figure 10 .
Figure 10.Selected TEM diagnostics for (left) the CTL ensemble and (right) anomalies for the VOLC grand ensemble.Shown are (a, b) vertical component of EP-flux, (c, d) EP-flux divergence, (e, f) residual circulation streamfunction and (g, h) heating due to residual vertical advection.Contours in (a) are in units of 1 × 10 4 kg s −2 .Stream function contours in (e) and (f) are red for positive values and blue for negative values, and are log-spaced, ranging from 10 to 1 × 10 4 kg m −1 s −1 in panel (e) and 1 to 100 kg m −1 s −1 in panel (f).

Figure 12 :Figure 12 .
Figure 12: Selected transformed Eularian mean diagnostics for the SVC and WVC forced ensembles.823 Shown are DJF anomalies for the (left) SVC and (middle) WVC ensembles, and (right) the difference 824 between the two ensembles for (a,b,c) vertical component of EP-flux, (d,e,f) EP-flux divergence, (g,h,i) 825 residual circulation and (j,k,l) heating due to residual vertical advection.Stream function contours in 826 Figure 12.Selected TEM diagnostics for the SVC and WVC forced ensembles.Shown are DJF anomalies for the (left) SVC and (middle) WVC ensembles, and (right) the difference between the two ensembles for (a, b, c) vertical component of EP-flux, (d, e, f) EP-flux divergence, (g, h, i) residual circulation streamfunction and (j, k, l) heating due to residual vertical advection.Stream function contours in panels (g, h, i) are red for positive values and blue for negative values, and are log-spaced ranging from 1 to 100 kg m −1 s −1 .

Figure 13 .
Figure 13.Scatter plots of vortex wind, temperature and wave forcing for the CTL and four volcanically forced ensembles.All quantities are averaged over the NH winter (DJF).Shown are (left) zonal wind at 10 hPa, 60 • N plotted against temperature at 50 hPa, 60-90 • N averaged, and (right) zonal wind at 10 hPa, 60 • N plotted against the vertical component of EP-flux averaged over 40-75 • N. Individual forced ensemble members are shown by coloured circles, and the CTL ensemble members by black crosses.Ensemble mean values of the quantities plotted on the horizontal and vertical axes of each panel are shown by coloured (black) triangles for the forced (CTL) ensembles along the bottom and right-hand axes.