Global streamflow and flood response to stratospheric aerosol geoengineering

Flood risk is projected to increase under future warming climates due to an enhanced hydrological cycle. Solar geoengineering is known to reduce precipitation and slow down the hydrological cycle and may therefore be expected to offset increased flood risk. We examine this hypothesis using streamflow and river discharge responses to Representative Concentration Pathway 4.5 (RCP4.5) and the Geoengineering Model Intercomparison Project (GeoMIP) G4 scenarios. Compared with RCP4.5, streamflow on the western sides of Eurasia and North America is increased under G4, while the eastern sides see a decrease. In the Southern Hemisphere, the northern parts of landmasses have lower streamflow under G4, and streamflow of southern parts increases relative to RCP4.5. We furthermore calculate changes in 30-, 50-, and 100-year flood return periods relative to the historical (1960–1999) period under the RCP4.5 and G4 scenarios. Spatial patterns are similar for each return period, although those under G4 are closer to historical values than under RCP4.5. by this large-scale G4 stratospheric geoengineering the a weak in soil decreased and


Introduction
Floods cause considerable damage every year (UNISDR, 2013), and this damage increases with economic development and the rate of climate change. Generally, the number of people and assets exposed to extreme hydrological disasters, including flooding, increases under global warming (Alfieri et al., 2017; Arnell and Gosling, 2013; Tanoue et al., 2016; Ward et al., 2013). Previous studies have shown that flood risk covaries with runoff and streamflow (Arnell and Gosling, 2013; Hirabayashi et al., 2008, 2013). Hirabayashi et al. (2013) analyzed CMIP5 (Coupled Model Intercomparison Project Phase 5) projections for the RCP4.5 and RCP8.5 scenarios (Meinshausen et al., 2011) and found shortened return periods for floods, especially in Southeast Asia, India, and eastern Africa, and most markedly under the RCP8.5 scenario.
Streamflow is a continuous variable, and for convenience three quantities are commonly used to characterize its distribution: Q5, the high-flow level that is exceeded 5 % of the time in a year; Q95, the low-flow level that is exceeded 95 % of the time in a year; and Qm, the annual mean flow. Koirala et al. (2014) analyzed the changes in streamflow conditions under the different RCP scenarios. Under RCP8.5, Q5 increases at high latitudes and in Asia and central Africa, while Qm and Q95 decrease in Europe and western parts of North and Central America. The spatial pattern under RCP4.5 is similar, and changes in Qm and Q5 are somewhat smaller than those under RCP8.5, while the change in Q95 is about the same under both scenarios.
Other hydrologic indicators show similar results under future climate projections. For example, Arnell and Gosling (2013) used a global daily water balance hydrologic model (Mac-PDM.09; Gosling et al., 2010), forced by 21 climate models from the CMIP3 ensemble, and analyzed 10-year and 100-year return periods of maximum daily flood under various scenarios. They found that the uncertainty in projecting river streamflow is dominated by across-model differences rather than by the climate scenario. Dankers et al. (2014) used a 30-year return period of 5-day average peak flows to study the changing patterns of flood hazard under the RCP8.5 scenario. They used nine global hydrology models together with five coupled climate models from CMIP5 and showed that simulated increases in flood risk occur in Siberia, Southeast Asia, and India, while decreases occur in northern and eastern Europe and northwestern North America.
River-routing models such as CaMa-Flood (Yamazaki et al., 2011) are important tools for simulating flood hazard. These models have been combined with high-resolution digital elevation models, flow direction maps (e.g., HYDRO1k and HydroSHEDS; Lehner et al., 2008), and hydrological models. Global-scale river models (GRMs) are typically structured to use the gridded runoff outputs from Earth system models (ESMs), land surface models (LSMs), or global hydrological models (GHMs) to simulate the lateral movement of water (Trigg et al., 2016). High-resolution offline river-routing models, such as CaMa-Flood, have contributed to improved simulation of river discharge (Yamazaki et al., 2009; Mateo et al., 2017). Zhao et al. (2017) used daily runoff from GHMs driving CaMa-Flood to produce monthly and daily river discharge and found that this approach results in better agreement between simulated and observed discharge compared with using native hydrological model routing. The CaMa-Flood model accounts for floodplain storage and backwater effects that are not represented in most GHM native routing methods, and these effects play a critical role in simulating peak river discharge (Yamazaki et al., 2014; Zhao et al., 2017; Mateo et al., 2017). Vano et al. (2014) analyzed several sources of uncertainty in future flood projections and suggested that inter-model variability in forcing from ESMs is the major source of uncertainty in modeling river discharge, although the model's ability to handle complex channels (e.g., deltas and floodplains) also has an important impact on simulation realism.
Solar radiation management (SRM) is a form of geoengineering designed to reduce the amount of sunlight incident on the surface and so cool the climate. Stratospheric aerosol injection is one SRM method, inspired by volcanic eruptions, that uses the aerosol direct effect to scatter incoming solar radiation.
Under the Geoengineering Model Intercomparison Project (GeoMIP; Robock et al., 2011; Kravitz et al., 2011, 2012, 2013a), the G4 experiment specifies a constant injection of 5 Tg sulfur dioxide (SO2) per year to the tropical lower stratosphere, or the equivalent aerosol burden, for the period of 2020-2069. This mimics about one-fourth of the stratospheric load injected by the 1991 eruption of Mount Pinatubo. Greenhouse gas forcing is specified by the RCP4.5 scenario. Nine ESMs have carried out the GeoMIP G4 experiment, with sulfate aerosols handled differently by each model. For example, BNU-ESM and MIROC-ESM use the prescribed meridional distribution of aerosol optical depth (AOD) recommended by the GeoMIP protocol; CanESM2 specifies a uniform sulfate AOD (Kashimura et al., 2017); GISS-E2-R and HadGEM2-ES adopt stratospheric aerosol schemes to simulate the AOD; NorESM1-M specifies the AOD and effective radius, calculated in previous simulations with the aerosol microphysical model ECHAM5-HAM (Niemeier et al., 2011; Niemeier and Timmreck, 2015). Indirect, potentially undesirable side effects of the injected sulfur aerosol include changing ice particle distributions in the upper troposphere and the distribution of ozone and water vapor in the stratosphere (Visioni et al., 2017). The direct radiative effects mainly result in a sharp reduction of the top-of-the-atmosphere (TOA) net radiative flux, with a significant drop in global surface temperature and a concomitant decrease in global precipitation (Yu et al., 2015). The decline of precipitation under SRM is mainly due to increasing atmospheric static stability, together with a reduction of latent heat flux from the land surface to the atmosphere (Bala et al., 2008; Kravitz et al., 2013b; Tilmes et al., 2013). The reductions of latent heat flux and precipitation together result in a slowdown of the global hydrological cycle (Kalidindi et al., 2014; Ferraro and Griffiths, 2016).
The spatial pattern of runoff roughly follows that of precipitation. Global spatially continuous and temporally variable observations of runoff are not available (Ukkola et al., 2018). Climate-model-simulated runoff is usually compared with observed downstream river discharge datasets, with the dataset collected by Dai (2016) and Dai et al. (2009) being the most complete. The Dai (2016) dataset represents historical monthly streamflow at the farthest downstream stations for the world's 925 largest ocean-reaching rivers from 1900 to early 2014, but it does not provide global daily observations. As daily runoff is largely driven by daily precipitation, it is difficult to evaluate the quality of the runoff outputs from the climate models at a daily scale. Over longer timescales, Alkama et al. (2013) found that the CMIP5 models simulate mean runoff reasonably well (±25 % of observed) at the global scale. The CMIP5 models tend to slightly underestimate global runoff, with South American runoff being underestimated by all models. Koirala et al. (2014) found more CMIP5 model agreement on streamflow projections under RCP8.5 than under the RCP4.5 scenario, but the projected changes in low flow are robust in both scenarios with strong model agreement. Previous studies have shown that under RCP4.5, precipitation would decrease over southern Africa, the Amazon Basin, and Central America, and runoff follows these patterns. Over dry continental interiors, relatively large evaporation means that runoff does not follow precipitation (Dai, 2016). SRM affects both precipitation and evaporation and hence global patterns of runoff and streamflow. The risk of drought in dry regions under SRM appears to be reduced (Curry et al., 2014; Keith and Irvine, 2016; Ji et al., 2018).
While many studies have looked at the impact of solar geoengineering on the hydrologic cycle, none have specifically considered the potential changes of river flow and flood frequency.
We investigate the potential change of streamflow using annual mean and extreme daily discharge and changes in the pattern of flooding using flood return period. Our study is organized as follows: Sect. 2 describes the models and methods used in this study; Sect. 3 presents the results of projected precipitation, evaporation, runoff, streamflow, and return period under the G4 and RCP4.5 simulations. Section 4 provides a discussion of mechanisms for the differences between G4 and RCP4.5 and uncertainties in the study. Finally, Sect. 5 summarizes the findings and mentions some social and economic implications from this study.

GeoMIP experiments
To analyze the potential changes in flooding under stratospheric sulfate injection geoengineering, we compare the streamflow patterns under the RCP4.5 and G4 scenarios. Five ESMs were used here due to data availability (Table 1). We exclude the first decade of the G4 simulation from our analysis because it follows the abrupt increase in stratospheric aerosol forcing, which likely exerts a large perturbation on some parts of the climate system, and we analyze the precipitation, evaporation, runoff, and streamflow pattern changes between each model's G4 and RCP4.5 simulations during the period of 2030-2069. Using the last 40 years of G4 simulations is common to several previous studies (e.g., Curry et al., 2014; Ji et al., 2018). The historical simulation covering the period of 1960-1999 is used as the reference for the return period analysis. Equal weight is given to each model in the analysis, and streamflow and flood response are calculated for each model before multi-model ensemble averaging is carried out. For models with multiple realizations, streamflow and flood response are calculated for each individual realization and then averaged for each model.
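The two-stage averaging described above (realizations first, then models with equal weight) can be sketched as follows. This is a minimal illustration rather than the code used in the study; the function name `ensemble_mean` and the example numbers are hypothetical.

```python
from statistics import mean


def ensemble_mean(responses_by_model):
    """Equal-weight multi-model mean of a scalar response.

    responses_by_model maps a model name to a list with one value per
    realization (for example the G4 - RCP4.5 change at one grid cell).
    """
    # Stage 1: average the realizations within each model.
    per_model = {name: mean(values) for name, values in responses_by_model.items()}
    # Stage 2: average across models, each model counting once.
    return mean(list(per_model.values())), per_model


# Hypothetical values: one model with three realizations, one with a single run.
ens, per_model = ensemble_mean({"model_a": [1.0, 2.0, 3.0], "model_b": [4.0]})
print(ens)  # a pooled mean over all four runs would differ from this value
```

Averaging per model first keeps a model with several realizations from dominating the ensemble, which is the point of the equal weighting.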

The river-routing model
The river-routing model used here is the Catchment-based Macro-scale Floodplain Model (CaMa-Flood; Yamazaki et al., 2011). CaMa-Flood uses a local inertial flow equation (Bates et al., 2010; Yamazaki et al., 2014) to integrate runoff along a high-resolution river map (HydroSHEDS; Yamazaki et al., 2013). Sub-grid characteristics such as slope, river length, river channel width, and river channel depth are parameterized in each grid box by using the innovative upscaling method Flexible Location of Waterways (FLOW) (Mateo et al., 2017; Yamazaki et al., 2014; Zhao et al., 2017). In addition, CaMa-Flood implements channel bifurcation and accounts for floodplain storage and backwater effects, which are not represented in most GHMs. CaMa-Flood is able to reproduce relatively realistic flow patterns in complex river regions, such as deltas (Ikeuchi et al., 2015; Yamazaki et al., 2011, 2013). CaMa-Flood has been extensively validated and applied to many regional- and global-scale hydrological studies (e.g., Pappenberger et al., 2012; Hirabayashi et al., 2013; Mateo et al., 2014; Ikeuchi et al., 2015, 2017; Trigg et al., 2016; Zsótér et al., 2016; Emerton et al., 2017; Suzuki et al., 2018; Yamazaki et al., 2017).
We use only the daily runoff outputs from climate models to drive CaMa-Flood v3.6.2, which calculates the river discharge along the global river network. The spatial resolution of CaMa-Flood is set to 0.25° (∼25 km at midlatitudes). An adaptive time step scheme was applied in the model numerical integration, leading to a time step of about 10 min, while the model outputs are at daily temporal resolution. To conserve the input runoff mass, an area-weighted averaging method is used in CaMa-Flood to distribute the coarse input to the fine-resolution routing model (Mateo et al., 2017). CaMa-Flood performs a 1-year spin-up before simulating 40-year river discharge in our historical, RCP4.5, and G4 experiments. The runoff and river discharge from Antarctica and Greenland are not included in the simulations. For each streamflow level, grid cells with less than 0.01 mm day⁻¹ are excluded from the analysis.
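The idea behind mass-conserving, area-weighted remapping can be illustrated with a deliberately simplified one-dimensional sketch. It is not CaMa-Flood's implementation: the function `remap_area_weighted`, the use of cell length as a stand-in for cell area, and the example grids are all assumptions made for illustration.

```python
def remap_area_weighted(coarse_edges, coarse_depth, fine_edges):
    """Remap runoff depth from coarse to fine 1-D cells.

    Each fine cell receives the overlap-weighted average of the coarse
    depths it intersects, so depth * length summed over all cells is
    unchanged. The fine grid is assumed to lie inside the coarse grid.
    """
    coarse_cells = list(zip(coarse_edges[:-1], coarse_edges[1:]))
    fine_depth = []
    for f_lo, f_hi in zip(fine_edges[:-1], fine_edges[1:]):
        weighted_sum = 0.0
        for (c_lo, c_hi), depth in zip(coarse_cells, coarse_depth):
            # Length of the intersection between the fine and coarse cell.
            overlap = max(0.0, min(f_hi, c_hi) - max(f_lo, c_lo))
            weighted_sum += overlap * depth
        fine_depth.append(weighted_sum / (f_hi - f_lo))
    return fine_depth


# Hypothetical grids: two coarse cells of width 1, five fine cells of width 0.4.
coarse_edges = [0.0, 1.0, 2.0]
coarse_depth = [1.0, 3.0]
fine_edges = [0.4 * i for i in range(6)]
fine_depth = remap_area_weighted(coarse_edges, coarse_depth, fine_edges)
fine_mass = sum(d * (hi - lo) for d, lo, hi in zip(fine_depth, fine_edges[:-1], fine_edges[1:]))
print(fine_depth, fine_mass)
```

On a real latitude-longitude grid the overlap would be a spherical area rather than a length, but the conservation argument is the same.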

Indicators of streamflow
We analyze the streamflow change under the RCP4.5 and G4 scenarios using three streamflow indicators for the 2030-2069 period, that is, annual mean flow (Qm) and extreme high (Q5) or low flow (Q95). Qm, Q5, and Q95 are averaged over 40 years for each model, then averaged among models to obtain the multi-model mean response under the different scenarios. We compared the multi-model mean and multi-model median responses of the five models used in this study and found no obvious difference between the two averages.
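As a rough illustration of how the three indicators can be derived from daily discharge at one grid cell, consider the sketch below. The function names and the simple empirical rule for picking the exceeded value (no interpolation between ranks) are assumptions; the study does not state which quantile estimator it uses.

```python
def flow_indicators(daily_flow):
    """Return (Qm, Q5, Q95) for one year of daily flow at one grid cell.

    Q5 is the flow exceeded on about 5 % of days (high flow) and Q95 the
    flow exceeded on about 95 % of days (low flow).
    """
    ordered = sorted(daily_flow, reverse=True)  # largest flow first
    n = len(ordered)

    def exceeded(percent):
        # Position such that roughly `percent` % of days have a larger flow.
        k = min(n - 1, (percent * n) // 100)
        return ordered[k]

    return sum(ordered) / n, exceeded(5), exceeded(95)


def mean_indicators(years_of_daily_flow):
    """Average each yearly indicator over all years (e.g. 40 years)."""
    yearly = [flow_indicators(year) for year in years_of_daily_flow]
    count = len(yearly)
    return tuple(sum(values) / count for values in zip(*yearly))


# Hypothetical two-year record with 100 "days" per year for readability.
print(mean_indicators([list(range(1, 101)), list(range(101, 201))]))
```

The same functions would be applied per model and per scenario before the multi-model averaging.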
We employ the two-sample Mann-Whitney U (MW-U) test to measure the significance of streamflow differences between G4 and RCP4.5. The MW-U test is a nonparametric test, which does not require the assumption of normally distributed data. We use a bootstrap resampling method with the MW-U test to increase sample size and to minimize the effects of outliers that can arise from the relatively short study period (Koirala et al., 2014). Specifically, we first apply the MW-U test to the G4 and RCP4.5 annual mean daily streamflow data for each model to obtain the rank-sum statistic, U0. Then we generate 1000 random paired series of 40-year streamflow data from the RCP4.5 and G4 simulations using the bootstrap resampling method and apply the MW-U test to each generated pair to obtain a series of statistics Uj, j = 1, 2, ..., 1000. The rank of U0 is then used to calculate the non-exceedance probability (Cunnane, 1978):
$$ p_0 = \frac{R_0 - 0.4}{N_b + 0.2} $$
Here p0 is the non-exceedance probability, R0 is the rank of U0, and Nb is the number of bootstrap samples. Finally, a non-exceedance probability less than 0.025 (or greater than 0.975) indicates a significant increase (or decrease) from RCP4.5 to G4.
Table 1.
(Arora et al., 2011; Chylek et al., 2011) | 2.8 × 2.8, L35 | 3 | 3 | 3
MIROC-ESM (Watanabe et al., 2011) | 2.8 × 2.8, L80 | 1 | 1 | 1
MIROC-ESM-CHEM (Watanabe et al., 2011) | 2.8 × 2.8, L80 | 1 | 1 | 1
NorESM1-M (Tjiputra et al., 2013) | 1.9 × 2.5, L26 | 1 | 1 | 1
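One plausible reading of the bootstrap significance procedure is sketched below in plain Python. Several details are assumptions rather than statements from the text: U is taken as the statistic of the RCP4.5 sample (so a low rank corresponds to G4 being larger), the bootstrap pairs are drawn with replacement from the pooled G4 and RCP4.5 values to approximate the no-difference case, and the non-exceedance probability uses the Cunnane plotting position (R0 − 0.4)/(Nb + 0.2). The function names and the synthetic series are hypothetical.

```python
import random


def mann_whitney_u(x, y):
    """U statistic of sample x: pairs with x_i > y_j count 1, ties count 0.5."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def bootstrap_nonexceedance(g4, rcp45, n_boot=1000, seed=0):
    """Return (U0, p0) for the change from RCP4.5 to G4 at one grid cell."""
    rng = random.Random(seed)
    u0 = mann_whitney_u(rcp45, g4)  # assumed convention: U of the RCP4.5 sample
    pooled = list(g4) + list(rcp45)
    boot = []
    for _ in range(n_boot):
        # Assumed resampling scheme: both series drawn from the pooled data.
        x = rng.choices(pooled, k=len(rcp45))
        y = rng.choices(pooled, k=len(g4))
        boot.append(mann_whitney_u(x, y))
    # Rank of U0 within the bootstrap statistics, kept inside 1..n_boot.
    rank = max(1, sum(1 for u in boot if u <= u0))
    p0 = (rank - 0.4) / (n_boot + 0.2)
    return u0, p0


# Synthetic 40-year series in which G4 is always larger than RCP4.5.
g4_series = [10.0 + i for i in range(40)]
rcp_series = [0.1 * i for i in range(40)]
u0, p0 = bootstrap_nonexceedance(g4_series, rcp_series, n_boot=200, seed=1)
print(u0, p0, "significant increase" if p0 < 0.025 else "not a significant increase")
```

Under these assumptions a value of p0 below 0.025 flags a significant increase from RCP4.5 to G4 and a value above 0.975 a significant decrease, matching the thresholds quoted in the text.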

Changes in flood frequency
The return period of a flood event is used as an indicator of flood frequency (e.g., Dankers et al., 2014; Ward et al., 2017). An N-year return period means that the probability of a flood exceeding a given level in any given year is 1/N. For each model, we choose the historical period of 1960-1999 as a reference for the return period calculation based on the annual maximum daily river discharge. We then analyze the return period change under the RCP4.5 and G4 scenarios during the period of 2030-2069. In this study, we choose the 30-, 50-, and 100-year return period levels of river flow at each grid cell to study the change of flood probability. To estimate the return period, the time series of annual maximum daily discharge for the historical, RCP4.5, and G4 scenarios from each ESM are first arranged in ascending order and then fitted to a Gumbel probability distribution. The Gumbel distribution has been used to describe the statistics of extreme flood events in previous studies (e.g., Hirabayashi et al., 2013; Ward et al., 2014). Using the Gumbel distribution, the cumulative distribution function, F(x), of river discharge (x) can be expressed as
$$ F(x) = \exp\left[-\exp\left(-\frac{x - b}{a}\right)\right], $$
where a (scale) and b (location) are the parameters of the Gumbel distribution (Gumbel, 1941). The parameters are estimated using an L-moments-based approach (Rasmussen and Gautam, 2003), where
$$ \lambda_1 = \frac{1}{N}\sum_{i=1}^{N} X_i, \qquad \lambda_2 = \frac{2}{N}\sum_{i=1}^{N}\frac{i-1}{N-1}\,X_i - \lambda_1, $$
and X_i is the annual maximum daily river discharge, sorted in ascending order, and N is the number of sample years. Then
$$ a = \frac{\lambda_2}{\ln 2}, \qquad b = \lambda_1 - c\,a, $$
where c = 0.57721 is Euler's constant. Changes in return period under SRM are expressed as differences G4 − RCP4.5 relative to the corresponding historical level.
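A compact sketch of the Gumbel fit follows. It assumes the standard sample L-moment estimators together with the usual Gumbel relations a = λ2 / ln 2 and b = λ1 − c·a; the function names and the synthetic sample are hypothetical, and this is not the authors' code.

```python
import math

EULER_C = 0.57721  # Euler's constant, as quoted in the text


def fit_gumbel_lmoments(annual_max):
    """Estimate Gumbel scale a and location b from annual maxima."""
    x = sorted(annual_max)  # ascending order, X_1 <= ... <= X_N
    n = len(x)
    lam1 = sum(x) / n
    # enumerate() starts at 0, so i / (n - 1) equals (rank - 1) / (N - 1).
    lam2 = 2.0 * sum(i / (n - 1) * xi for i, xi in enumerate(x)) / n - lam1
    a = lam2 / math.log(2.0)
    b = lam1 - EULER_C * a
    return a, b


def gumbel_cdf(x, a, b):
    """F(x) = exp(-exp(-(x - b) / a))."""
    return math.exp(-math.exp(-(x - b) / a))


def return_period(x, a, b):
    """Return period in years of discharge x: 1 / (1 - F(x))."""
    return 1.0 / (1.0 - gumbel_cdf(x, a, b))


# Synthetic 40-year sample built from evenly spaced Gumbel quantiles
# (scale 2, location 10); real input would be annual maximum discharge.
sample = [10.0 - 2.0 * math.log(-math.log((i + 0.5) / 40)) for i in range(40)]
a_hat, b_hat = fit_gumbel_lmoments(sample)
print(a_hat, b_hat, return_period(max(sample), a_hat, b_hat))
```

With only 40 annual maxima per period, the 100-year level lies well beyond the sampled range, so the fitted tail carries most of the information for the longest return periods.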

Projected changes in precipitation, evaporation, and runoff
G4 stratospheric aerosol geoengineering lowers net radiation fluxes at the TOA by ∼0.36 W m⁻², reduces mean global temperature by ∼0.5 K, and slows down the global hydrological cycle. Global precipitation decreases by 2.3 ± 0.5 % per kelvin in response to G4 stratospheric aerosol injection. Precipitation and evaporation rates are strongly influenced by incoming radiation and the water vapor content of the troposphere. Solar geoengineering produces changes in both atmospheric circulation and thermodynamics. Several studies have analyzed changes in large-scale circulation under the G1 solar dimming experiment (e.g., Moore et al., 2014; Davis et al., 2016; Smyth et al., 2017; Guo et al., 2018), but the more subtle changes under G4 have not yet been analyzed in similar depth. Broadly speaking, increasing greenhouse gases tend to produce a stronger Hadley circulation and enhanced hydrological cycle, increasing precipitation in the tropics and lowering it in the subtropics (the wet gets wetter and the dry gets drier) (Chou et al., 2013). Geoengineering, under both G1 solar dimming and G4 aerosol injection, counteracts this response, decreasing tropospheric temperatures, maintaining a higher pole-Equator meridional temperature gradient than under greenhouse gas forcing alone, and tending to reverse the wet-dry patterns produced by greenhouse gas forcing (Wang et al., 2018). Stratospheric aerosol injection geoengineering produces a more complex climate response than that produced by simple solar dimming (e.g., G1), as the aerosol layer not only scatters shortwave radiation but also absorbs near-infrared and longer-wavelength radiation (Lohmann and Feichter, 2005; Niemeier et al., 2013; Ferraro et al., 2014). The net result of these changes in the GeoMIP experiments is model dependent (Wang et al., 2018; Ji et al., 2018). Under G4, the global annual precipitation over land (excluding Greenland and Antarctica) decreases by 9.3 mm relative to the reference RCP4.5 scenario.
Tropical Africa and South Asia suffer large precipitation reductions, with values of up to 37.1 and 52.3 mm per year, respectively (Fig. 1a); southeastern North America and Alaska also see large precipitation decreases. In contrast, precipitation increases significantly over southern Africa and eastern Brazil under G4. Previous studies based on the Global Land-Atmosphere Climate Experiment-Coupled Model Intercomparison Project Phase 5 (GLACE-CMIP5) suggest strong coupling between local soil moisture and precipitation over southern Africa and eastern Brazil, both of which are simulated to experience large precipitation reductions under global warming (Seneviratne et al., 2013); these reductions are reversed under G4. Although the precipitation increase under G4 over the Mediterranean region is not statistically significant, May et al. (2017) note that soil moisture and precipitation both decrease there under global warming. Lower temperatures under G4 result in a reduction of 6.9 mm in mean global land (excluding Greenland and Antarctica) evaporation relative to RCP4.5.
Under G4, there are large precipitation reductions over the Indian subcontinent and East Asia monsoon regions of 5.4 % and 5.0 %, respectively. Under G1, these reductions have been related to a reduced latitudinal seasonal amplitude of the Intertropical Convergence Zone (ITCZ) (Schmidt et al., 2012; Smyth et al., 2017) and a reduction in the intensity of the Hadley circulation. Precipitation over other monsoon regions in G4 sees less significant changes. Displacement of midlatitude westerlies and changes to the North Atlantic Oscillation, especially during winter, will change regional precipitation variations under G4. Ferraro et al. (2015) and Muri et al. (2018) found that tropical lower-stratospheric sulfate aerosol injection leads to a thermal wind response that affects the stratospheric polar vortices. The polar vortices guide winter midlatitude jets and cyclone paths across the midlatitudes. Under a warming climate, an earlier spring snowmelt over northeastern Europe and a later onset of the winter storm season would both alter flooding conditions (Blöschl et al., 2017). Both of these will also be affected by G4 stratospheric aerosol geoengineering.
The increased evaporation projected under RCP4.5 is suppressed under G4 geoengineering due to reduced downward surface radiation (Kravitz et al., 2013a; Yu et al., 2015). Evaporation decreases significantly (p < 0.05) over a broader area than precipitation, especially in the Northern Hemisphere (Fig. 1b). The change of precipitation minus evaporation (P − E) basically follows the changes of precipitation and evaporation but is of a smaller magnitude (Fig. 1c) because both are reduced simultaneously. There are significant reductions in P − E over South Asia, tropical eastern Africa, and the Amazon Basin and significant increases over southern Africa and eastern Brazil. Increased P − E in northern Asia caused by global warming could be partly counteracted by solar geoengineering (Jones et al., 2018; Sonntag et al., 2018). The simulated precipitation and evaporation changes under G4 imply potentially significant changes in the terrestrial hydrological cycle. P − E can be used as a simplified measure of runoff and water availability. Under the G4 experiment, P − E increases over Europe during summertime, implying more water availability and a shortened return period of river discharge. Soil moisture also reflects local water mass balance, i.e., the difference between P − E and runoff. Soil moisture increases over southern Africa, southwestern North America, and several parts of South America, where P − E and runoff both increase. The regions with significant reductions in both P − E and runoff, such as tropical Africa, South Asia, and most of middle North America, also show decreases in soil moisture.
The spatial pattern of runoff change from RCP4.5 to G4 resembles that of P − E, with a broader area of significant changes (Fig. 1c, d). The annual runoff decreases by 2.4 mm, similar to the change in P − E. There are large runoff decreases over tropical Africa, South Asia, southeastern North America, the Amazon Basin, and Alaska. Runoff slightly increases over southern Africa, southwestern North America, and several regions of South America. Variability in runoff and streamflow is greater than for precipitation and evaporation (Figs. 1, 2) due to spatial heterogeneity in soil moisture and because streamflow spatially integrates runoff (Chiew and McMahon, 2002).
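The agreement between the runoff and P − E changes can be checked with a simple land water balance. The relation below is a sketch under stated assumptions: S (land water storage, including soil moisture) and R (runoff) are symbols introduced here for illustration, the quoted changes are treated as annual totals over the same land mask, and storage changes are assumed to be small when averaged over 40 years.

```latex
\frac{dS}{dt} = P - E - R
\quad\Longrightarrow\quad
\Delta R \approx \Delta (P - E) \quad \text{when } \Delta\!\left(\frac{dS}{dt}\right) \approx 0

\Delta (P - E) = \Delta P - \Delta E = (-9.3) - (-6.9) = -2.4\ \text{mm}
```

The land precipitation decrease of 9.3 mm and the evaporation decrease of 6.9 mm quoted earlier therefore account for the 2.4 mm decrease in annual runoff.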
Precipitation, evaporation, and runoff changes show that land areas dry slightly under G4, especially around the Equator, in South Asia, and at northern high latitudes. Increases in P − E are predicted in the western parts of Europe and North America, with their eastern sides becoming drier with decreasing P − E and runoff.

Projected changes in streamflow
Figures S6-S7 show the relative changes of the three streamflow indicators under G4 and RCP4.5 relative to the historical period. In general, the streamflow indicators under G4 are less changed from the historical levels than under RCP4.5. In Fig. 2, positive values mean that G4 streamflow is larger than under RCP4.5. Generally, decreases in Qm occur at high northern latitudes such as Siberia, northern Europe, and the Arctic Ocean coast of North America, along with Southeast Asia and middle and southern Africa. Qm increases in western Europe, central Asia, southwestern North America, and Central America (Fig. 2a). Significant changes are generally distributed around the globe. Based on the ensemble response of the five models analyzed here, 55 % of global continental area excluding Greenland, Antarctica, and masked cells shows decreases in Qm under G4 compared with RCP4.5, and about 45 % shows increases. Figure 3 shows areas with robust agreement among models and allows the primary regions affected to be seen more clearly. Globally, only 21 % of continental area exhibits robust decreases and 12 % robust increases in Qm under G4 (Fig. 3a). Although few grid cells show robust agreement among models, the general patterns are similar to the mean changes in Fig. 2a. Consistent decreases occur at high northern latitudes and in Papua New Guinea and the semiarid Sahel. Increases are mainly in the Southern Hemisphere but also in parts of western Europe and the southwestern US. MIROC-ESM (Fig. S3) and NorESM1-M (Fig. S5) contradict the ensemble in having larger areas with increases than with decreases in Qm under G4 relative to RCP4.5.
Figures 2b and 3b show that 52 % of unmasked land area is projected to have higher high-flow (Q5) levels under G4. Europe, western North America, central Asia, and central Australia show increases in Q5 under G4 compared with RCP4.5. Differences at the 95 % significance level are distributed fairly similarly to those for Qm in Fig. 2a. The Amazon Basin shows decreases in both Q5 and Qm, and the southwestern US shows increases in both. Globally, 17 % of unmasked land area shows robust increases and 17 % shows robust decreases in Q5 under G4 (Fig. 3b). Robust increases are generally confined to the extratropics, while decreases are mainly, but not only, in the tropics. The projections of Q5 from CanESM2 under G4 show the largest differences in spatial pattern from the ensemble mean (Fig. S2), and it is the only model with more decreases than increases in Q5 under G4. Though high flow levels usually correspond with flood events, changes in flow levels do not necessarily translate into increases in flood frequency. We elaborate further on flood return period in Sect. 3.3.
Low flow (Q95, in Figs. 2c and 3c) has a noisier spatial pattern than mean and high flow. Low flow shows a relatively uniform decrease around the globe. A total of 49 % of global unmasked land area shows increases in Q95 under G4. Despite the generally noisier pattern, the regions with differences significant at the 95 % level are more clearly defined for Q95 than for either Qm or Q5. The high northern latitudes become drier under G4, the southern high latitudes wetter. Robust increases cover about 11 % of global unmasked  in the remaining parts of South Asia, central Africa, and South America. Increased high flow and a simultaneous decrease in low flow suggest the potential for increased flood and drought frequencies. In 21 % of global unmasked land area, high flows decrease and low flows increase (regions in blue), which suggests that these areas would see a decline in streamflow extremes; they are mainly at northern midlatitudes and high latitudes. Areas with both increased high and low flow cover 29 % of the unmasked land surface (regions in green), mainly in Europe, Central America, and the Southern Hemisphere midlatitudes. Perhaps the clearest overall pattern is that streamflow generally increases under G4 on the western sides of the large continents of Eurasia and North America, especially over Mexico, southern California, Spain, and western Europe, while streamflow decreases on the eastern sides of these continents. In the Southern Hemisphere, the pattern is meridional, with the northern, wetter parts of the landmasses having lower streamflow under G4 and the southern, drier parts having higher streamflow.
Figure 4. The ensemble mean difference (G4 − RCP4.5) of high (Q5) and low (Q95) streamflow. The color bar is defined such that grid cells in which G4 is less than RCP4.5 for both Q5 and Q95 are in red (Q5 ↓, Q95 ↓), cells in which both Q5 and Q95 are greater in G4 than in RCP4.5 are in green (Q5 ↑, Q95 ↑), cells with Q5 greater in G4 and Q95 greater in RCP4.5 are in yellow (Q5 ↑, Q95 ↓), and vice versa in blue (Q5 ↓, Q95 ↑). Grid cells with Q95 less than 0.01 mm day⁻¹ are masked out.
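The four-way grouping of grid cells by the sign of the high-flow and low-flow changes, and the share of land area in each group, can be sketched as follows. The color labels follow the description of Fig. 4; the function names, the cos(latitude) area weighting for a regular latitude-longitude grid, and the example values are assumptions, and masked cells are assumed to have been removed beforehand.

```python
import math


def classify_extremes(d_q5, d_q95):
    """Label a cell by the sign of the G4 - RCP4.5 change in Q5 and Q95."""
    if d_q5 < 0 and d_q95 < 0:
        return "red"     # both high and low flow decrease
    if d_q5 > 0 and d_q95 > 0:
        return "green"   # both increase
    if d_q5 > 0 and d_q95 < 0:
        return "yellow"  # higher high flow, lower low flow
    if d_q5 < 0 and d_q95 > 0:
        return "blue"    # lower high flow, higher low flow
    return "none"        # at least one change is exactly zero


def class_area_shares(lats_deg, d_q5, d_q95):
    """Area share of each class, weighting cells by cos(latitude)."""
    shares = {}
    total = 0.0
    for lat, high, low in zip(lats_deg, d_q5, d_q95):
        weight = math.cos(math.radians(lat))
        label = classify_extremes(high, low)
        shares[label] = shares.get(label, 0.0) + weight
        total += weight
    return {label: value / total for label, value in shares.items()}


# Hypothetical three-cell example at the Equator and at 60 degrees latitude.
print(class_area_shares([0.0, 60.0, 60.0], [-1.0, 2.0, 0.5], [1.0, 0.3, -0.2]))
```

Weighting by area matters because equal-angle grid cells shrink toward the poles, so a simple cell count would overstate high-latitude changes.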

Projected changes in return period
Changes in flooding between the RCP4.5 and G4 scenarios are measured by the changes in the return period of a particular river discharge magnitude. Previous studies have used a 30-year return period as a relatively modest indicator of flood frequency. We use the same flood frequency indicator and also the more extreme 50- and 100-year return levels. The discharge for each model's 30-, 50-, and 100-year return periods in the simulated historical period defines the reference magnitudes at each grid cell. The return periods of discharge corresponding to those levels are then found under the RCP4.5 and G4 scenarios. Dry regions, defined as those with mean annual streamflow during the historical period (1960–1999) of less than 0.01 mm day⁻¹, are masked out. The 40-year time series of the historical period (1960–1999) and the 40-year future projections (2030-2069) are then fitted to the Gumbel probability distribution for each grid cell. Figure 5a and b show the global distribution of the multi-model ensemble median return period of the historical 30-year return level under the RCP4.5 and G4 scenarios. Figures S8 and S9 show the relevant patterns for 50- and 100-year return periods. The elongation of return period in some regions (such as central Asia and the Amazon Basin) indicates relatively less-frequent flooding events compared with the past. Very close to half the global unmasked land area (49 %) shows increases in return period under the RCP4.5 scenario, while the other half experiences decreases. Increases in return period are mainly in Asia and eastern Africa, while decreases occur in Europe and North America. Our results agree with similar previous studies for RCP4.5 (e.g., Hirabayashi et al., 2013). Under G4 the spatial pattern is very similar to RCP4.5, with comparably large differences from the historical levels. Figure 5c shows the difference of return period between the G4 and RCP4.5 scenarios.
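The step from a historical return level to its return period under a future scenario can be sketched as below, assuming Gumbel parameters (scale a, location b) have already been fitted for each period. The function names and parameter values are hypothetical and only illustrate the direction of the change.

```python
import math


def gumbel_cdf(x, a, b):
    """Gumbel cumulative distribution F(x) = exp(-exp(-(x - b) / a))."""
    return math.exp(-math.exp(-(x - b) / a))


def return_level(period_years, a, b):
    """Discharge whose annual exceedance probability is 1 / period_years."""
    return b - a * math.log(-math.log(1.0 - 1.0 / period_years))


def future_return_period(period_years, hist_params, future_params):
    """Return period, under the future fit, of the historical return level."""
    reference = return_level(period_years, *hist_params)
    return 1.0 / (1.0 - gumbel_cdf(reference, *future_params))


# Hypothetical parameters: the future location is shifted upward, so the
# historical 30-year flood recurs more often than once in 30 years.
hist = (2.0, 10.0)
future = (2.0, 11.0)
print(future_return_period(30, hist, future))
```

Applying the function once with the RCP4.5 fit and once with the G4 fit, and differencing the results, gives the kind of G4 − RCP4.5 comparison described for Fig. 5c.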
A negative value means a shorter return period under G4 than under RCP4.5, which indicates an increase in flood frequency under G4. Decreasing flood frequency appears in India, China, Siberia, parts of the Amazon Basin, and northern Australia. Increasing flood frequencies are projected mainly in Europe, the southwestern US, and much of Australia. The regions that are projected to experience an increased flood frequency under the RCP4.5 scenario, such as southern and southeastern Asia (Fig. 5a; Dankers et al., 2014; Hirabayashi et al., 2013), would experience a consistent decline of the flood frequency under G4. In general, the G4 return periods are less changed from the historical levels than under RCP4.5. Figure 6 shows the regions of robust agreement among models in changes of 30-year return period under RCP4.5 and G4. Slightly fewer grid cells show robust responses under G4 than under RCP4.5. As with Fig. 5, there is close agreement in the spatial pattern of return period under the RCP4.5 and G4 scenarios. The spatial pattern of the changes in 50- and 100-year return levels shown in Figs. S8 and S9 is similar to that for the 30-year return level (Fig. 5), while the spread between two different return period levels is slightly different from the 30-year levels. These results suggest a consistent changing pattern of flood frequency as defined by the three return levels, but with different magnitudes of differences between RCP4.5 and G4, with G4 being closer to the historical levels.
Discussion

G4 changes relative to RCP4.5
G4 weakens the streamflow changes expected under RCP4.5 relative to the historical period (Koirala et al., 2014). For example, in southeastern Asia and India, both high flows and low flows are projected to increase under the RCP4.5 scenario, while both would increase less under G4. In contrast, southern Europe is projected to see decreases in both high and low flow under RCP4.5, while the projected streamflow shows smaller decreases under G4. However, in the Amazon Basin, both high and low streamflow decrease under both RCP4.5 and G4 relative to the historical period. In Siberia both high and low streamflow increase under RCP4.5 relative to the historical period, while the pattern is mixed under G4. This means that G4 offsets the impact introduced by anthropogenic climate warming in some regions, while in other regions such as the Amazon Basin and Siberia, it further enhances the decreasing trend in streamflow under the RCP4.5 scenario. The pattern seen is suggestive of the role of large-scale circulation patterns (Fig. 7), namely westerly flows over the Northern Hemisphere continents and the Asian monsoon systems, with relative increases in midlatitude storm systems and decreases in monsoons under G4 compared with RCP4.5. These circulation changes result in, for example, more moist maritime air flowing into the Mediterranean region and weakened summertime monsoonal circulation under G4 in India and East Asia (Fig. 7e, f). Similar mechanisms may also account for the north-south pattern seen in Australia and South America. Monsoonal indicators do decrease under the much more extreme G1 experiment, in which solar dimming is designed to offset quadrupled CO2 levels (Tilmes et al., 2013).

Figure 5. Multi-model ensemble median of return periods for discharge that correspond to a 30-year return period level in the historical simulation under (a) G4, (b) RCP4.5, and (c) the difference of G4 and RCP4.5. Grid cells in extremely dry regions in the historical simulation, i.e., Qm < 0.01 mm day⁻¹, are masked out.
Figure 6. The number of models agreeing on the sign of change in a 30-year return period under G4 (a) and RCP4.5 (b). Blue indicates decreases and red indicates increases relative to the historical simulation. Grid cells in extremely dry regions in the historical simulation, i.e., Qm < 0.01 mm day⁻¹, are masked out.

There is a latitudinal dependence for streamflow: generally, Qm decreases across all latitudes; high flow, Q5, decreases most in tropical regions; low flow, Q95, decreases most at high latitudes. The high latitudes display a complicated streamflow pattern with weakly increasing Q5 and significantly decreasing Q95. The decrease in the lower probability tail of streamflow is indicative of hydrological droughts, while the increases in the high streamflow tail indicate hydrological flooding (Keyantash and Dracup, 2002). Previous studies (Hirabayashi et al., 2008) have noted that the flood frequency for rivers at high latitude (e.g., Alaska and Siberia) decreases under global warming, even in areas where the frequency, intensity of precipitation, or both are projected to increase. The annual hydrograph of these rivers is dominated by snowmelt, so changes of peak flow reflect the balance between length and temperature of winter season and the total amount of winter precipitation. The thawing of permafrost and changes in evapotranspiration also play an important role in the increase of runoff and streamflow (Dai, 2016). The combined effect of atmospheric circulation and land surface processes results in the complex change pattern in this cold region.
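As an illustration of the three streamflow indicators discussed here, the sketch below computes mean, high, and low flow from a daily series. It assumes the common exceedance convention, in which Q5 is the flow exceeded 5 % of the time (the 95th percentile of daily flow) and Q95 is the flow exceeded 95 % of the time (the 5th percentile). The function names and input data are hypothetical.

```python
import numpy as np


def flow_indicators(daily_flow):
    """Mean flow and exceedance-based high/low flow for one grid cell."""
    q = np.asarray(daily_flow, dtype=float)
    return {
        "Qm": float(np.mean(q)),             # mean streamflow
        "Q5": float(np.percentile(q, 95)),   # high flow: exceeded 5 % of the time
        "Q95": float(np.percentile(q, 5)),   # low flow: exceeded 95 % of the time
    }


def relative_change(scenario, reference):
    """Percentage change of each indicator in `scenario` relative to `reference`."""
    return {k: 100.0 * (scenario[k] - reference[k]) / reference[k] for k in reference}


# Illustrative daily flows: 1, 2, ..., 101 (arbitrary units).
flow = np.arange(1.0, 102.0)
reference = flow_indicators(flow)
scenario = flow_indicators(0.9 * flow)  # a uniform 10 % reduction in flow

print(reference)
print(relative_change(scenario, reference))
```

Comparing the relative changes of Q5 and Q95 separately is what allows a region to show, for example, a weak increase in high flow alongside a strong decrease in low flow.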
Under the G4 experiment, recent studies (Jones et al., 2018; Sonntag et al., 2018) have pointed out that the increased P − E in northern Asia caused by global warming could be partly counteracted by solar geoengineering. At the same time, solar geoengineering reduces polar temperatures and precipitation (Berdahl et al., 2014; Ji et al., 2018). The balance among precipitation, evaporation, and temperature accounts for the complex spatial pattern of streamflow and flood frequency under solar geoengineering, which has been previously related to soil moisture content (Dagon and Schrag, 2017). It is worth noting that the method for calculating potential evapotranspiration (ET) plays a significant role in determining simulated surface runoff changes (Haddeland et al., 2011; Thompson et al., 2013), which would influence the condition of streamflow. A recent study (Wartenburger et al., 2018) compared the ET spatial and temporal patterns simulated by GHMs in the second phase of the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP2a), which also confirmed that the ET scheme used affects model ensemble variance. The ET in this study is calculated by the ESMs (Table 1), not GHMs, and any biases in ET would feed into streamflow. For example, Mueller and Seneviratne (2014) found that climate models that participated in CMIP5 display an overall systematic overestimation of annual average ET over most regions, particularly in Europe, Africa, China, Australia, western North America, and part of the Amazon region.

Figure 7. Multi-model ensemble mean of 925 hPa wind field during December-January-February (DJF) and June-July-August (JJA). Panels (a) and (b) for RCP4.5, panels (c) and (d) for G4, and panels (e) and (f) for the difference between G4 and RCP4.5. Grid cells in which wind speed is less than 2.0 m s⁻¹ are masked out in panels (a), (b), (c), and (d). Grid cells in which wind speed is less than 0.1 m s⁻¹ are masked out in panels (e) and (f). Shaded monsoonal regions are derived using the criteria of Wang and Ding (2006) with the Global Precipitation Climatology Project (GPCP) dataset covering the years 1979-2010 (Adler et al., 2003).
The relatively drier streamflow pattern in the Amazon Basin under G4 is notable and consistent with changes in P − E (e.g., Jones et al., 2018). This drying pattern would increase the risk of a decline of the Amazon tropical rainforest (Boisier et al., 2015). Amazon Basin drying is complicated by various factors that are dependent on solar geoengineering. These include (i) the reduced seasonal movement of the ITCZ under solar geoengineering (Smyth et al., 2017;Guo et al., 2018); (ii) changes in sea surface temperature reflecting changes in frequency of El Niño-Southern Oscillation (Harris et al., 2008;Jiménez-Muñoz et al., 2016), although there is no evidence of such changes occurring under SRM (Gabriel and Robock, 2015); and (iii) changes to carbon cycle feedbacks (Chadwick et al., 2017;Halladay and Good, 2017), which would certainly be affected by changes in diffuse radiation under SRM (Bala et al., 2008;Muri et al., 2018).

Uncertainties
Previous studies suggest that the river-routing model CaMa-Flood can realistically reproduce peak river discharge because the floodplain storage and backwater effects are implemented (e.g., Zhao et al., 2017). In this study, CaMa-Flood is driven by the runoff output directly from ESMs to simulate streamflow and flood response. Therefore, the uncertainty in runoff from the ESMs is also important. To drive the high-resolution CaMa-Flood model, the coarse-resolution runoff from ESMs was regridded using a first-order conservation method. Although the regridding method conserves the mass of runoff, distributing the runoff from coarse climate model grids to fine river-routing model grids introduces unavoidable errors. The relative magnitudes of this kind of error are dependent on the regional terrain and river-routing map. The uncertainty in runoff might be transformed by the river-routing model and overlap with the built-in bias of the river-routing model itself. Comparing the ratio between inter-model spread and multi-model ensemble mean, we find that runoff usually has large inter-model spread in arid regions, and streamflow has large inter-model spread over a broader area than that of runoff. This is due to the streamflow integrating the runoff spatially along the river-routing map; therefore it carries the uncertainties of runoff to a relatively large extent. Several studies have identified the uncertainty introduced by hydrological models (e.g., Chen et al., 2011; Prudhomme et al., 2014). We assume that systematic river-routing model bias relative to observations can be alleviated by subtracting historical simulations, and simulated runoff biases are not expected to change significantly under future scenarios. In addition to inherent model biases, there are natural processes that could change river routes and river network silt-up over time; these changes would impact local runoff and streamflow (Chezik et al., 2017), and we do not account for them in this study.
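To make the mass-conservation property of first-order conservative regridding concrete, the following sketch remaps a runoff flux between two one-dimensional grids using overlap weights. It is a simplified illustration under stated assumptions: real ESM and river-routing grids are two-dimensional on a sphere and use cell areas rather than lengths, and the regridding tool used in the study is not named in the text. The function name and the numbers are hypothetical.

```python
import numpy as np


def conservative_remap_1d(src_edges, src_values, dst_edges):
    """First-order conservative remap of a per-unit-length flux between 1-D grids.

    Each destination cell receives the overlap-weighted average of the source
    cells it intersects, so the integrated amount is preserved when both grids
    cover the same interval.
    """
    src_edges = np.asarray(src_edges, dtype=float)
    dst_edges = np.asarray(dst_edges, dtype=float)
    src_values = np.asarray(src_values, dtype=float)
    # overlap[j, i] = length shared by destination cell j and source cell i
    lower = np.maximum(dst_edges[:-1, None], src_edges[None, :-1])
    upper = np.minimum(dst_edges[1:, None], src_edges[None, 1:])
    overlap = np.clip(upper - lower, 0.0, None)
    return overlap @ src_values / np.diff(dst_edges)


# Two coarse cells refined to eight fine cells over the same interval.
coarse_edges = np.array([0.0, 2.0, 4.0])
coarse_runoff = np.array([1.0, 3.0])      # e.g., mm per day
fine_edges = np.linspace(0.0, 4.0, 9)
fine_runoff = conservative_remap_1d(coarse_edges, coarse_runoff, fine_edges)

coarse_total = np.sum(coarse_runoff * np.diff(coarse_edges))
fine_total = np.sum(fine_runoff * np.diff(fine_edges))
print(fine_runoff, coarse_total, fine_total)
```

The totals match, but every fine cell inside a coarse cell receives the same value, which illustrates why conserving mass does not remove the error from distributing coarse runoff onto a fine river network.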
Gosling et al. (2017) compared the river runoff output from multiple global and catchment-scale hydrological models under three warming scenarios simulated by ESMs, finding that the across-model uncertainty overwhelmed the ensemble median differences among the scenarios. Yu et al. (2016) suggested model internal variability may be larger than across-model spread in eastern and southeastern Asia. In this study we use the offline hydrological model driven by runoff outputs from ESMs to calculate the streamflow; the uncertainty among ESMs is reflected in the range of return period based on streamflow change. Figure S10 shows the multi-model ensemble range of the 30-year return period level. Regions that have a shorter return period (i.e., higher flood frequency) in the future than in the historical period show a relatively small range among models (e.g., India and southeastern Asia). Regions that have a longer return period show a large range (e.g., Europe and North America). This reflects larger inter-model uncertainty over dry zones than over wetter ones. The return period change over dry zones is more meaningful when interpreted as the change of drought tendency. The 50- and 100-year return period level flow shows larger uncertainty than the 30-year return period level, which is expected when estimating the low-probability extreme tails of the flow probability density function from relatively short (40-year) sets of results.
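The ensemble statistics used in this section (median, range, spread-to-mean ratio, and the number of models agreeing on the sign of change) can be sketched as below. This is an illustrative assumption-laden example: the spread is taken here to be the sample standard deviation, which the text does not specify, and the five-model return periods are made-up numbers for two grid cells.

```python
import numpy as np


def ensemble_summary(values):
    """Median, range, and spread-to-mean ratio along the model axis (axis 0)."""
    v = np.asarray(values, dtype=float)
    median = np.median(v, axis=0)
    value_range = np.max(v, axis=0) - np.min(v, axis=0)
    # Spread is assumed to be the sample standard deviation across models.
    spread_ratio = np.std(v, axis=0, ddof=1) / np.mean(v, axis=0)
    return median, value_range, spread_ratio


def sign_agreement(change):
    """Number of models projecting an increase and a decrease at each grid cell."""
    c = np.asarray(change, dtype=float)
    return np.sum(c > 0, axis=0), np.sum(c < 0, axis=0)


# Hypothetical future return periods (years) of the historical 30-year level:
# rows are five models, columns are two grid cells.
future_rp = np.array([
    [20.0, 60.0],
    [25.0, 35.0],
    [22.0, 90.0],
    [18.0, 28.0],
    [24.0, 45.0],
])
median, value_range, spread_ratio = ensemble_summary(future_rp)
n_increase, n_decrease = sign_agreement(future_rp - 30.0)

print(median, value_range, spread_ratio)
print(n_increase, n_decrease)
```

In this toy example the first cell has a shorter return period in every model and a narrow range, while the second cell has a longer median return period but a much wider range and imperfect sign agreement, mirroring the contrast described above.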

Summary and implications
We analyzed the streamflow response under stratospheric aerosol injection geoengineering, G4, and the RCP4.5 scenario using the daily total runoff from five climate models that participated in GeoMIP. We investigated the mean change patterns of annual mean and extreme high and low streamflow and analyzed the global flood frequency change in terms of return period. There is a pattern of generally increasing streamflow under G4 on the western sides of the major continents of Eurasia and North America, with decreasing streamflow on their eastern sides. In the Southern Hemisphere, the pattern is meridional, with northern parts of the landmasses having lower streamflow under G4 and southern parts having higher streamflow. We further investigated the change of flooding corresponding to the magnitudes of the historical 30-, 50-, and 100-year return period levels; the flooding frequencies change dramatically from historical levels under both RCP4.5 and G4 and show similar spatial patterns. The projected return period pattern under the RCP4.5 scenario agrees well with previous studies, such as Hirabayashi et al. (2013). Generally, stratospheric aerosol injection geoengineering as simulated by G4 relieves flood stress, especially for Southeast Asia, and in turn increases the probability of flooding in the southwestern US, Mexico, and much of Australia, which are drought-prone places that might benefit from increased soil moisture and streamflow. The Amazon Basin shows a relative elongation of flood return period, while Europe shows shortening of return period under G4, and this was also implicit in streamflow characteristics in these regions.
CaMa-Flood does not consider anthropogenic infrastructure, such as dams or reservoirs, which some hydrological models do include. However, estimating future changes in human intervention on the natural system is highly uncertain. Technological advances over the century that may affect anthropogenic changes are by their nature entirely unknown at present. Hence integrating the human dimension into a model of the physical system is fraught with difficulty and uncertainty. Several studies can be used as a guide to the possible effects of anthropogenic impacts compared with natural changes that are captured in CaMa-Flood. Dai et al. (2009) argued that the direct human influence on the major global river streamflow is relatively small compared with climate forcing during the historical period. Mateo et al. (2014) suggested that dams regulate streamflow consistently in a basin study using CaMa-Flood combined with integrated water resources and reservoir operation models. Wang et al. (2017) showed that reservoirs would effectively suppress flood magnitude and frequency. Recently, analyses of the role of human impact parameterizations (HIPs) in five hydrological models found that the inclusion of HIPs improves the performance of GHMs, in both managed and near-natural catchments, and simulates fewer hydrological extremes by decreasing the simulated high flows (Zaherpour et al., 2018). These studies suggest that the high flows and flood response under G4 relative to RCP4.5 might be smaller when human intervention is considered and indicate the importance of considering human impacts in future hydrological response studies under geoengineering.
The accurate assessment of human impacts on flood frequency and magnitude depends not only on how anthropogenic effects are parameterized in hydrological models (Masaki et al., 2017) but also on how human activities are represented in geoengineering scenarios. As anthropogenic greenhouse gas emissions increase, human society would continually adapt to climate change and mitigate the related risk, including building new dams and reservoirs to withstand a strengthened global hydrological cycle. How society would respond to future streamflow and flood risk is an important topic both scientifically and in policy making. This is especially true for the developing world, where many cities are experiencing subsidence due to unsustainable rates of groundwater extraction. Subsidence accounted for up to one-third of 20th century relative sea level rise in and around China (Chen, 1991; Ren, 1993). Subsidence and sea level rise both increase flooding risks. However, in densely populated regions with much experience of irrigation management, such as Southeast Asia and India, reduced flood frequency under G4 stratospheric aerosol geoengineering might be further ameliorated.
Our results on streamflow and flood response are based on the GeoMIP G4 simulation and its reference RCP4.5 simulation. The generalization of this work to other types and extents of solar geoengineering depends on the linearity of the streamflow response to both greenhouse gas and geoengineering forcing. The linearity of the response of radiative forcing and global temperatures in particular has been explored in the CESM1 Stratospheric Aerosol Geoengineering Large Ensemble (GLENS; Tilmes et al., 2018). Many climate fields, such as temperature, are surprisingly linear under a very wide range of forcing, potentially allowing standard engineering control theory methods (e.g., MacMartin et al., 2014) to tailor a global response given the freedom to use different latitudinal input locations for the aerosol injection (Kravitz et al., 2017), or combinations of, for example, aerosol injection and marine cloud brightening (Cao et al., 2017). Nonlinearities are expected for systems that depend on ice/water phase changes, and these could affect global streamflow and flood responses in some regions, especially in the Arctic. Moreover, the type of solar geoengineering might be relevant as well. Ferraro et al. (2014) found that the tropical overturning circulation weakens in response to geoengineering with stratospheric sulfate aerosol injection due to radiative heating from the aerosol layer, but geoengineering simulated as a simple reduction in total solar irradiance does not capture this effect. A larger tropical precipitation perturbation occurs under equatorial injection scenarios (such as G4) than under simple solar dimming geoengineering, the latitudinally varying injection schemes explored by GLENS, or a mix of different geoengineering strategies (such as aerosol injection and marine cloud brightening; Cao et al., 2017). Thus the response of streamflow and flood would be expected to differ, to some extent, under different types of solar geoengineering.
Floods are among the most costly natural disasters around the world, especially for more vulnerable developing countries (e.g., Bangladesh, India, and China). Our study suggests that solar geoengineering would exert nonuniform impacts on global flooding risk and hence local hydraulic infrastructure needs would vary if solar geoengineering of the G4 type were undertaken. Changes in flooding are strongly connected with the economic cost of damage due to climate change and sea level rise (Jevrejeva et al., 2016; Hinkel et al., 2014), and thorough studies should be made for further policy and decision making, especially applied to high-value economic or ecological entities. This may be carried out in the framework of specific impact models applied to local cities or regions and would hence benefit from local knowledge, especially in the developing world where resources for adaptation measures are scarce. Linkages between developing world climate impacts researchers and the GeoMIP community will be encouraged and funded by the Developing Country Impacts Modelling Analysis for SRM (DECIMALS) project (Rahman et al., 2018). Scientists from developing countries are encouraged to apply to DECIMALS to model the solar geoengineering impacts that matter most to their regions. DECIMALS promotes wider discussion of the implications of regional impact studies of solar geoengineering. These studies will be a helpful initial step in future decision making related to climate change adaptation and urban infrastructure design.
Author contributions. DJ designed the research; LW carried out the simulations; LW, DJ, JCM, CM, and HM performed research and wrote the paper.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "The Geoengineering Model Intercomparison Project (GeoMIP): Simulations of solar radiation reduction methods (ACP/GMD interjournal SI)". It is not associated with a conference.