Understanding the surface temperature response and its uncertainty to CO2, CH4, black carbon and sulfate

Understanding the regional surface temperature responses to different anthropogenic climate forcing agents, such as greenhouse gases and aerosols, is crucial for understanding past and future regional climate changes. In modern climate 10 models, the regional temperature responses vary greatly for all major forcing agents, but the causes of this variability are poorly understood. Here, we analyse how changes in atmospheric and oceanic energy fluxes due to perturbations in different anthropogenic climate forcing agents lead to changes in global and regional surface temperatures. We use climate model data on idealized perturbations in four major anthropogenic climate forcing agents (CO2, CH4, and sulfate and black carbon aerosols) from PDRMIP climate experiments for six climate models (CanESM2, HadGEM2-ES, NCAR-CESM1-CAM4, 15 NorESM1, MIROC-SPRINTARS, GISS-E2). Particularly, we decompose the regional energy budget contributions to the surface temperature responses due to changes in longwave and shortwave fluxes under clear-sky and cloudy conditions, surface albedo changes, and oceanic and atmospheric energy transport. We also analyse the regional model-to-model temperature response spread due to each of these components. The global surface temperature response stems from changes in longwave emissivity for greenhouse gases (CO2 and CH4) and mainly from changes in shortwave clear-sky fluxes for aerosols (sulfate 20 and black carbon). The global surface temperature response normalized by effective radiative forcing is nearly the same for all forcing agents (0.63, 0.54, 0.57, 0.61 KWm). While the main physical processes driving global temperature responses vary between forcing agents, for all forcing agents the model-to-model spread in temperature responses is dominated by differences in modelled changes in longwave clear-sky emissivity. Furthermore, in polar regions for all forcing agents the differences in surface albedo change is a key contributor to temperature responses and its spread. For black carbon the modelled 25 differences in temperature response due to shortwave clear-sky radiation are also important in the Arctic. Regional model-tomodel differences due to changes in shortwave and longwave cloud radiative effect strongly modulate each other. For aerosols clouds play a major role in the model spread of regional surface temperature responses. In regions with strong aerosol forcing the model-to-model differences arise from shortwave clear-sky responses and are strongly modulated by combined temperature responses to oceanic and atmospheric heat transport in the models. 30 https://doi.org/10.5194/acp-2021-401 Preprint. Discussion started: 22 June 2021 c © Author(s) 2021. CC BY 4.0 License.


Introduction
Climate change projections depend highly on future scenarios of climate mitigation actions. But in addition to uncertainty arising from different possible futures particularly in timescales of decades the climate projection uncertainties are dominated by the climate model response uncertainty (Hawkins & Sutton, 2009;Lehner et al., 2020). This arises from 35 structural differences between different climate models. Climate models differ on how they represent the radiative forcing of anthropogenic greenhouse gases and aerosols. But perhaps more importantly, they respond differently to the same external radiative forcing (Nordling et al., 2019). As stated in Lehner (2020) the model spread in the estimated temperature responses is affected by inter-model differences in both the forcing and in how the models respond to the forcing. Smith et al. (2020) quantified the effective radiative forcings (ERFs) for modern-day greenhouse gas and aerosol 40 concentrations for a range of climate models participating to CMIP6 multi-model climate experiments. They showed that since CMIP5, the spread in modelled radiative forcing has narrowed down. Despite this, the response uncertainty in CMIP6 models appears to have grown from CMIP5 models (Lehner et al., 2020;Zelinka et al., 2020). Uncertainty in the climate response hampers efforts to robustly define carbon emission targets to maintain global warming below specified limits, such as below 1.5 °C (Matthews et al., 2021;Rogelj et al., 2019). Furthermore, the carbon emission targets depend on the climate 45 response to radiative forcers besides carbon dioxide, such as aerosols and methane (Tokarska et al., 2018). Modern day anthropogenic aerosols cool the global surface temperatures between 0.5-1.1 °C (Samset et al., 2018, Nordling et al., 2019, and their future reductions can accelerate global warming and enhance global precipitation (e.g. Merikanto et al., 2021;Wilcox et al., 2020).
Besides the need to better understand the impacts of different climate forcing agents on the global climate, there is an urgent 50 need to better understand how they impact climate on regional scale. The spatial distribution of aerosols is highly heterogenous, and much of the modern-day effective aerosol radiative forcing is concentrated over the South and East Asian region (Fiedler et al., 2019), while the radiative forcing of long-lived greenhouse gases is much more uniform (Shindell et al., 2015). Aerosols have both local and remote climate effects which depend on the emission region and type of aerosol (Merikanto et al., 2021;Nordling et al., 2019;Persad & Caldeira, 2018). Furthermore, the differences in regional 55 distributions of aerosol surface temperature responses are not dominated by the aerosol description in modern climate models (Nordling et al., 2019). Therefore, differences in modelled regional temperature responses for both greenhouse gases and aerosols appear to mainly depend on differences in dynamic responses of the atmosphere-ocean-sea ice system in the models. The main focus of this paper are these differences in modelled responses to aerosol and greenhouse gas perturbations in different climate models. 60 The Precipitation Driver Response Model Intercomparison Project (PDRMIP)  provides a data set that allows us to investigate how different climate forcing agents affect the Earth's climate in global and regional scale. PDRMIP comprised idealized single-forcer scenarios for several independent climate models. Previously, the PDRMIP data set has been used to study e.g. how different forcing agents affect the Arctic amplification (Stjern et al., 2019) and how they produce rapid adjustments and ERF . Estimating ERF is not straightforward, and different methods provide a 65 variety of different results. For example, Tang et al. (2019) used PRDMIP data to estimate ERF for different climate forcing agents with several different methods. The model-mean estimated ERF for the doubling of carbon dioxide concentrations varied from 3.65 to 4.70 Wm -2 depending on the method and on how rapid adjustments were included in the estimate. Richardson et al. (2019) showed that ERF calculated from fixed-sea-surface experiments is a good predictor for the global temperature change for different forcing agents, and particularly so if the adjustments due to land temperature change are 70 included.
The model differences in climate response are often investigated through radiative feedback analysis (e.g. Zelinka et al., 2020).
While the feedback analysis is particularly suitable for analyzing the root causes of model-to-model differences in the equilibrium climate sensitivity (the equilibrium temperature response to doubled atmospheric carbon dioxide concentrations), it is less suitable for exploring regional temperature response variance between the models due to the nonlinearity of regional 75 feedbacks (Andrews et al., 2012). Räisänen & Ylhäisi (2015) formulated an energy balance framework to explore the impact of the top-of-the atmosphere (TOA) radiative fluxes, atmospheric energy transport and the net surface energy flux on regional surface temperatures. The method relies on the local conservation of energy and it is therefore mathematically an almost exact solution for the decomposition of energetic components of the temperature response. Its also takes into account both the horizontal energy transport and surface energy fluxes on the local energy balance. Räisänen (2017) included a more detailed 80 shortwave radiative flux treatment according to Taylor et al. (2007), and Merikanto et al. (2021) included a cloud radiative kernel treatment for a more physical separation of longwave cloud and clear-sky radiative fluxes. In this paper, we use this energy balance framework with climate model data from PDRMIP experiments to study the origins of regional temperature response and its standard deviation in six different climate models to four different climate forcing agents (carbon dioxide, methane, sulfate and black carbon). Evaluation of the mechanisms responsible for the model spread is key for understanding 85 why models still exhibit a substantial spread in temperature response even when forced identically. ), changes in surface energy fluxes (∆ ↓ , essentially representing changes in atmosphere-to-ocean net heat flux), and convergence of atmospheric energy (∆ , representing horizontal atmospheric heat transport). These changes are illustrated in Fig. 1.We use the method presented in Räisänen and Ylhäisi (2015), Räisänen (2017), and Merikanto et al. 110 (2021). The method is based on a concept of planetary emissivity (Harshvardhan and Cess, 1976), which links the local surface air temperature ( ) to the outgoing long wave radiation at the top of the atmospheric column ( where is an effective local planetary emissivity and is the Boltzmann constant. Then, letting [ ] to mark the mean state between baseline and perturbed climates, the change in outgoing longwave radiation between the two climate states can be 115 written as where ∆ is the local change in outgoing thermal radiation at constant emissivity (i.e at fixed thermal atmospheric structure and water vapor concentration), and hence represents the local Planck feedback parameter. ∆ , ↑ is the change in the outgoing thermal radiation associated with the change in the local emissivity of the atmosphere. 120 The rate of energy change within an atmospheric column is given by the energy balance equation where is the change of internal energy within the column with respect to time, The change in ↑ between two climate states can thus be written as 130 Using Eq.
(2) with Eq. (5), the local change in surface temperature can be decomposed to different energetic components as ∆ , ∆ and ∆ can be calculated directly from the standard energy flux output of the models, and ∆ as a residual term. ∆ includes both horizontal energy transport and change in local atmospheric energy storage which is insignificant at annual time scales. Furthermore, ∆ can be decomposed into clear-sky, cloud, albedo and non-linear terms 135 using the Approximate Partial Radiative Perturbation (APRP) method (Taylor et al., 2007), is typically negligibly small, and can be ignored (Räisänen, 2017;Merikanto et al., 2021). the actual cloud contribution to longwave emissivity change, we applied the radiative kernel method of (Soden et al., 2008).
With this method, a correction factor can be calculated, where , and represent radiative kernels where each state variable (surface temperature, temperature profile and All results have been calculated using three different kernels (ECHAM , GFDL (Pendergrass et al., 2018) and HadGEM2 (Smith, 2018) to obtain a better estimate of the overall cloud effect. The correction factor of Eq. (9) has been calculated as an average of the three kernels.
In the above equation, the temperature responses related to the first five components build up from a sum of the instant radiative forcing (if any), rapid adjustments associated with the component, and a temperature dependent feedback which adjusts its magnitude as the surface temperature changes, normalized by (the Planck feedback). Therefore, temperature responses related to these terms are functions of a constant term (forcing and adjustments) and a time dependent term (the impact of feedback due to surface temperature changes). of the energy budget in regional forcing-feedback analysis in various ways in the literature (Crook et al., 2011;Feldl & Roe, 2013;Lu & Cai, 2009).

175
Decomposing the temperature responses ∆ also allows us to to decompose their contributions to the model-to-model standard deviations of the total temperature responses by, (13)

Models and Simulations
We use climate model data from (PDRMIP) . In PDRMIP, several independent climate models were used to simulate various idealized climate perturbations. The models used in this study are listed in Table 1. According to Knutti 185 (2013) all these models belong to different model families, and hence are independent from each other. Our study uses data from experiments of instant doubling of CO2 concentrations (co2x2), tripling of CH4 concentrations (ch4x3), five folding sulfate emission (sulx5) and ten folding black carbon emissions (bcx10) (see Table 2). Perturbations were relative to the https://doi.org/10.5194/acp-2021-401 Preprint. All simulations consisted of 100-year baseline and perturbed runs, and the last 50 years of these runs are used for the temperature response analysis carried out here. The PRDMIP experiments also included additional fixed sea-surface temperature runs, which we use for the calculation of the effective radiative forcing (ERFfsst) associated with each climate perturbation. We also calculated the effective radiative forcing by regressing the top-of-atmosphere radiative imbalance against 200 surface temperature change (ERFreg) by using the full 100-year timeseries of experiments, as further discussed below. Aerosol emissions were either defined explicitly or by multiplying pre-defined concentrations derived from AeroCom Phase II (Myhre et al., 2012) (see Table 1). Only NorESM1 and MIROC-SPRINTARS include the aerosol indirect effect (the Twomey effect) for black carbon; however, the semi-indirect effect is included in all models. The aerosol cloud effects from sulfate are included in all models except NCAR-ESM1-CAM4 and GISS-E2-R. We include the six PDRMIP models which had reported all 205 necessary fields for this analysis.

Global TOA radiative forcing and surface temperature responses of analyzed experiments
In this paper, we focus on decomposed local and global temperature responses normalized by the global effective radiative 210 forcing (ERFfsst) obtained from fixed-sea-surface-temperature experiments for each modelled perturbation. The normalization by ERFfsst enables us to compare the temperature responses of different modelled perturbations with each other on a level ground, as ERFfsst varies in sign and magnitude between different perturbations. Also, particularly in aerosol and methane experiments the modelled ERFfsst values vary between different models, likely due to differences in model aerosol setups and baseline methane concentrations. 215 Figure 2 shows the calculated effective radiative forcings and the global mean temperature responses (the difference in perturbed climate for the years 50-100 and the corresponding years from the base case) in the analyzed PDRMIP experiments.
The effective radiative forcing is calculated from both fixed-sea-surface-temperature simulations (ERFfsst, no land-warming corrections included) and by regressing the top-of-atmosphere radiative imbalance with respect to surface temperature change 220 by using the full 100-year timeseries of experiments (ERFreg, Gregory et al., 2004).   (Tables S1-S4), and the estimated equilibrium temperature responses for 235 each of the experiments is shown in Table S5.

255
In the following sections, we present decomposed effective temperature responses for each analyzed experiment and modelto-model spread of these decompositions. The effective surface temperature responses and their decompositions are calculated for each atmospheric column separately from the average differences in perturbed climates for the years 50-100 after a sudden perturbation and the corresponding years from the baseline simulations without perturbations. The local temperature responses 260 are normalized by the globally averaged ERFfsst for each experiment (hence the term effective). Scaling with ERFfsst allows a simpler comparison of responses between different forcing agents, but it also changes the sign of responses in case of sulx5 experiments. This is because in contrast to the other three forcing agents, the radiative forcing is negative for increasing sulfate feedback which adjusts its magnitude as the surface temperature changes, as described in the end of Section 2.1. Therefore, temperature responses related to these components are functions of a constant term -forcing (if any) and rapid adjustmentsand a time dependent term -the impact of feedback as surface temperature changes. 270 The temperature response decomposition applied here relies on a local conservation of energy in each atmospheric column, and hence the sums of individual temperature response components generate the local total surface temperature responses with high accuracy. Below, Section 3.1 presents the globally averaged results. Section 3.2 then presents the regional distributions of the decomposed surface temperature responses and their zonal averages. Section 3.3 presents the regional and latitudinal 275 distributions of the model-to-model standard deviations of the effective temperature components, and the contributions of each of the decomposed surface temperature response components to the total standard deviations of the responses. is directly related to the response due to surface albedo feedback, ΔSURF is a measure of the surface energy flux imbalance on global surface temperatures due to oceanic heat uptake in the models, and ΔCONV describes the impact of horizontal surface energy transport and change in atmospheric heat uptake. The models which have a fully coupled ocean have not fully reached equilibrium, and therefore there is still some heat flux from the atmosphere to the ocean. Thus, the effective temperature response from this heat flux is always negative. Globally, ΔCONV averages effectively to zero in each experiment 290 since the energy transport only redistributes regional surface temperature effects, and the change in atmospheric heat uptake is negligible on annual or longer timescales (Räisänen, 2017).   different forcers, as shown in previous PDRMIP studies (Richardson et al., 2018;Samset et al., 2018;Stjern et al., 2017). The change in TOA longwave clear-sky emissivity is the key driver of the effective temperature response for the greenhouse gas experiments (co2x2 and ch4x3), with the multi-model-mean effective surface temperature responses to ΔLWclr matching nearly exactly the overall responses (0.60 KW -1 m 2 ± 0.10 and 0.53 KW -1 m 2 ± 0.18 respectively). ΔLWclr results from the change in 310 clear-sky planetary emissivity, i.e. from the combination of the longwave clear-sky radiative forcing and its adjustments, and the change in the thermal structure of the atmosphere and water vapor concentrations which evolve with the surface temperature response (lapse rate and water vapor feedbacks). For the aerosol experiments (sulx5 and bcx10) the effective temperature response associated with ΔLWclr is only a small contribution to the total temperature response. This is because for aerosols the instantaneous radiative forcing associated with the longwave clear-sky radiation is small. 315

Decomposed global effective surface temperature responses for different forcers
The differences in effective temperature responses associated with the ΔSWclr between the greenhouse gas and aerosol forcers can be understood via a similar narrative as in case of ΔLWclr responses. The total effective temperature response for aerosol experiments (sulx5 and bcx10) is largely dominated by the temperature response to ΔSWclr , since much of instantaneous aerosol radiative forcing takes place via this channel. For the greenhouse gas experiments the temperature response to ΔSWclr 320 originates from the shortwave water vapor feedback and direct GHG's shortwave forcing (Etminan et al., 2016), and its modelmean contribution to total effective temperature response is comparable to that from the albedo response (~10%). The multi-model-mean effective temperature responses related to ΔLWcld and ΔSWcld are close to zero for all experiments besides for bcx10, for which the cloud temperature responses modestly oppose the total effective temperature response. For 325 the sulx5 experiment, the model-mean ΔSWcld is near zero and its spread is high between the models. A significant part of the spread is related to the lack of cloud-aerosol interaction in NCAR-CESM1-CAM4 and GISS-E2-R. In these models, ΔSWcld reduces the sulfate-induced global mean cooling, whereas it amplifies the cooling in the other models.
The global effective temperature response from the changes in surface albedo is similar across each experiment. The mean 330 effective temperature response due to albedo change varies from ~10% (co2x2, chx3 and bcx10) to 14% (sulx5) of the total effective temperature response. In the aerosol experiments the aerosol setup has a significant effect on the temperature contribution of the surface albedo change. Emission-driven models tend to produce a higher albedo effective temperature response than the concentration-driven models.   The spatial distribution of the total effective temperature response is largely similar for each forcer, although the total response 355 to aerosols is stronger over the continental Northern midlatitudes, compared to total responses to greenhouse gases, and weaker over the Southern hemisphere oceans. Regionally, local maximum effective temperature responses are found in the Barents Sea for all forcers, with maximum values of 2.38, 2.04, 2.96 and 2.53 KW -1 m 2 for co2x2, ch4x3, sulx5 and bcx10, respectively. For greenhouse gases, the regional effective temperature responses are mostly associated with the response to ΔLWclr. 365 However, with all forcers the spatial distribution of the ΔLWclr contribution resembles the overall effective temperature response. The spatial correlation coefficients between the effective total and ΔLWclr -induced temperature responses for the co2x2, ch4x3, sulx5 and bcx10 experiments are 0.90, 0.81, 0,94 and 0.74, respectively. The difference between the greenhouse gas and aerosol experiments is that for greenhouse gases the ΔLWclr response includes the combined effects of forcing, its rapid adjustments and lapse rate and water vapor feedbacks, while for aerosols the response results only from the rapid 370 adjustments and lapse rate and water vapor feedbacks. For the co2x2 and ch4x3 experiments, the total effective temperature response and ΔLWclr temperature response differ most in the equatorial Pacific Ocean, where the ΔLWclr response exceeds the total response. The high ΔLWclr temperature response in this region is compensated by contributions from ΔSWcld, ΔLWcld and atmospheric energy transport ΔCONV (see Fig, 4).

375
For aerosols, most of the effective temperature response is due to ΔSWclr . Besides the modest water vapor contributions to ΔSWclr , this response is directly related to excess scattering and absorption of solar radiation (direct aerosol radiative forcing) due to changes in aerosol concentrations, as was shown in Merikanto et al. (2021). Most of the sulfate emissions originate from Asia, Europe and North America, while most of the black carbon emissions originate from Asia, Europe and North America and African, South American and boreal wildfires (Myhre et al., 2012), This makes the forcing in both cases stronger 380 in the Northern than in the Southern hemisphere. The ΔSWclr temperature response to these emissions can be clearly seen both for the bcx10 and sulx5 experiments (see Fig. 5). For the bcx5 experiments, the local effective temperature response due to ΔSWclr exceeds the total effective temperature response from the tropics to the Northern midlatitudes (Fig. 5). These local excess warming responses by ΔSWclr in bcx5 experiments are counteracted by temperature responses to changes in atmospheric energy transport and clouds. In case of aerosols, ΔLWclr contributes to the effective temperature response mainly in the 385 Northern hemisphere continents and Arctic sea-ice regions. In the bcx10 experiments, ΔLWclr induces a clear negative contribution over ocean regions related to changes in the vertical temperature distribution of the atmosphere (see Fig. S2).
There is significant variation in the regional effective temperature contributions due to clouds between regions and forcing agents. In the greenhouse gas experiments (co2x2 and ch4x3) the regional effective temperature responses due to ΔLWcld and 390 ΔSWcld tend to cancel each other, except over the Southern Ocean where the total cloud contribution is dominated by a negative ΔSWcld (see Fig. 5). This relates to a strong increase in cloud cover in the same regions (see Figures S4 and S5). hemispheric cloud responses are larger in magnitude for aerosols than for greenhouse gases, and for the bcx10 experiment in particular. Throughout the latitudes bcx10 causes a strong net cloud cooling, except for the polar regions where the net cloud responses are small. The sign of the regional ΔSWcld effective temperature response in the bcx10 experiments depends strongly on the region. There is a negative temperature response due to ΔSWcld in Asia and Africa, but positive over the Amazon region. This is related to the cloud cover change, as black carbon increases the cloud cover over Asia and Africa, but decreases 400 it over the Amazon (see Fig, S5) due to the semi-direct aerosol effect of black carbon. Contrary to bcx10, sulx5 shows a mild positive contribution from clouds over Northern hemispheric midlatitudes and a mild cooling response in the Arctic regions.
However, the strength of the ΔSWcld responses in the sulx5 experiments depends on the inclusion or lack of aerosol-cloud indirect effect in the models.

405
The effective temperature response to surface albedo change originates from the change in sea-ice and snow cover and is always positive. Changes in surface albedo have a modest effect on the global effective temperature response with all forcers (0.07,0.06,0.08,0.06 KW -1 m 2 for the co2x2, ch4x3, sulx5, bcx10 experiments, respectively). However, in some regions, the effective temperature response to albedo change exceeds 1 KW -1 m 2 for all forcers. Over the Arctic, the regions of local maximum values are the same where the overall effective temperature responses are highest, highlighting the role of sea-ice 410 changes causing locally high temperature responses. The local maximum values of the effective temperature response to albedo change also match with the regions with a positive temperature response to ocean heat exchange (ΔSURF). The change in surface albedo also enhances the temperature response over the Southern Ocean, but there the temperature response to oceanic heat exchange is slightly negative. However, over the Southern Ocean the temperature response signal to surface albedo change is mainly visible in the co2x2 and ch4x3 experiments, and appears to be driven by the longwave clear-sky forcing and 415 feedbacks, and ocean heat transport.
Over the oceans, ΔSURF has a large impact on the effective temperature response, with the co2x2, ch4x3 and sulx5 experiments all showing negative effective temperature contributions due to ΔSURF south of Greenland and positive contributions over the Barents Sea. With black carbon a robust negative signal over the northern North Atlantic is missing, 420 however, but similarly to other forcers there is a robust positive signal over the Barents Sea. As earlier found for increased CO2 by Räisänen (2017), the effects of oceanic heat transport and storage (ΔSURF) and atmospheric heat transport (ΔCONV) strongly oppose each other over the oceans in the co2x2, ch4x3 and sulx5 experiments. In the sulx5 and bcx10 experiments, ΔCONV also significantly compensates for the surface temperature effects due to changes in ΔSWclr, which reflects changes in the direct aerosol forcing. Similarly, to the effective temperature response itself, also its model-to-model spread (standard deviation) can be decomposed to components that sum up to the total spread in the effective surface temperature response (Sect. 2.2). Figure 6 shows the 430 decomposed model-to-model standard deviations of the total effective temperature responses (first row) for each perturbation experiment, and the decomposed contributions of each component to the spread in total responses. The latitudinal distributions of the different components are shown in Fig. 7.

445
The globally averaged magnitude of the model-to-model spread is similar between co2x2, ch4x3 and sulx5 experiments (0.19, 0.18, and 0.18 KW -1 m 2, respectively). Black carbon induces a much larger variability between the models (0.32 KW -1 m 2 ). The spatial structure of the model-to-model spread resembles the spatial structure of the effective temperature response. spread is the largest in the ocean region between Iceland and Svalbard (2.47 KW -1 m 2 ), and for black carbon east of Svalbard (2.23 KW -1 m 2 ). In the co2x2 and ch4x3 experiments, the strongest component in model-to-model spread is ΔLWclr (see Fig.  455 7), but the partial contributions of other components are also significant. The amplification of the spread in the effective temperature response towards high latitudes (Fig. 7) is strongly related to additional spread arising from differences in the surface albedo response (ΔSWAlbedo), reflecting differences in sea ice and snow cover responses. The total contributions of cloud responses to the model spread are significant over Southern and Northern midlatitudes and to a lesser extent over the equatorial region. In the Southern Ocean sea-ice regions, the model spread originates from differences in the ΔLWclr and 460 ΔSWAlbedo responses in the models, as well as from differences in the oceanic heat exchange (ΔSURF) compensated by differences in the atmospheric heat transport (ΔCONV) (see Fig. 6 With aerosol experiments (sulx5 and bcx10) most of the cloud-induced model-to-model spread is related to emissions sources, and is highly affected by which aerosol-cloud effects are included in the models. Also for aerosols, the contributions to model spread due to ΔSWcld and ΔLWcld often oppose each other, but the stronger model spread related to the ΔSWcld response dominates the overall cloud contribution in midlatitudes, while model spread due to ΔLWcld dominates the model spread over 480 the equatorial regions.
On the other hand, in the aerosol experiments (sulx5 and bcx10) the atmospheric heat transport (ΔCONV) tends to compensate the regional differences in model responses more efficiently than in the greenhouse gas experiments. In addition, the total heat transport (ΔSURF + ΔSURF) reduces the zonally averaged model spread (see Fig. 7), particularly in the case of the bcx10 485 experiments. Similarly to the greenhouse gas experiments, ΔLWclr is still a major driver of model-to-model spread also in the aerosol experiments, and particularly in the high latitude regions (Fig. 6). Compared to the greenhouse gas experiments, the Previously, the model-to-model spread in global climate sensitivity (equilibrium response to doubled CO2 concentration) has been largely attributed to differences in cloud feedback strength (Colman, 2003;Zelinka et al., 2020;Zhao et al., 2016). Our results point to somewhat divergent conclusions. Similarly to Hu et al. (2020), our results point to the water vapor feedback 495 as the main mechanism leading to model spread. If the model spread is only attributed using feedback analysis, model differences in the forcing and adjustments may counteract some of the differences. However, it should be noted that our results are based on only six different models, and hence might be biased. For example, Zelinka et al. (2020) show that the contribution of clouds to the equilibrium climate sensitivity response exhibits notably large variation from approximately -0.2K to 3K for CMIP6 models and -0.18K to 2.6K for CMIP5. In our results ΔSWcld and ΔLWcld largely cancel each other out in the co2x2 500 experiments, leading to a smaller combined cloud contribution to the model spread and contributes only 12% (see table S1) to the global model spread. For comparison, for a sample of 16 CMIP5 models with a transient response to doubling of CO2, Räisänen (2017) found the clear-sky LW response to be the largest contributor to the model spread in 34% of the global area, whereas the combined cloud response had this position in 29% of the world.

Discussion and Conclusions 505
In this work, we have conducted an energy balance decomposition of the near-surface temperature response resulting from doubling CO2, tripling CH4, five folding sulfate concentrations, and ten folding black carbon concentrations for six independent climate models. The regional temperature response was then decomposed to contributions from different energy balance terms, namely changes in LW and SW clear-sky and cloud radiative fluxes (SW and LW), the net surface energy flux (SURF), and horizontal energy transport (CONV). All forcers produce a similar global response per unit ERF (0.63, 0.54, 0.57 and 0.61 510 KW -1 m 2 for increasing CO2, CH4, sulfates and black carbon, respectively). The majority of the globally averaged temperature change for doubling the CO2 and tripling the CH4 concentration, originates from changes in clear-sky planetary emissivity (0.60 and 0.53 KW -1 m 2 respectively), i.e. from the combination of the longwave clear-sky radiative forcing with the change in the thermal structure of the atmosphere and water vapor concentrations. In the aerosol experiments (sulfate and black carbon) the key driver of surface temperature response is the change in the clear-sky shortwave radiative flux (0.36 and 0.71 KW -1 m 2 515 respectively) related to excess scattering and absorption of solar radiation (direct aerosol radiative forcing) due to changes in aerosol concentrations. We note that if only models including sulfate aerosol-cloud interactions are included, the SW cloud terms are in the same magnitude (ΔSWclr=0.25 and ΔSWcld=0.12 KW -1 m 2 ). See below for a more detailed discussion on different aerosol setups. The overall temperature response to the forcers is largest in high latitudes, where the response is driven by changes in LW clear-sky fluxes and changes in surface albedo, except for black carbon for which the majority of the total 520 response comes from the clear-sky SW term in the high latitudes.
The temperature decomposition method provides a tool for understanding regional and global temperature changes. However, the original method is somewhat simplistic in its treatment of LW cloud processes (Räisänen, 2017). In Merikanto et al. (2021) we implemented a radiative kernel correction to make the LW treatment of clouds more realistic. However, despite this 525 correction we still have a negative effective LW temperature response from clouds when the CO2 concentration is doubled, whereas literature suggests a positive LW cloud feedback (Tomassini et al., 2013;Vial et al., 2013). This is due to neglecting the positive masking effect of increasing CO2 on the LW cloud forcing, since the corresponding effects can not be calculated for the other forcing agents with existing kernels. The applied radiative kernels for the temperature and H2O have a relatively minor effect on global averages of the LW cloud and clear-sky terms (see Fig, S6) . Similarly to previous studies (Smith et 530 al., 2018), we found that the relative importance of the kernel correction does not depend on the radiative kernel used.
In our study, clouds play a minor role in the global mean temperature response, as the LW cloud and SW cloud terms tend to cancel each other out. However, regionally the temperature response originating from the clouds is a significant contributor.
For all forcers, the temperature response in the Antarctic sea-ice region and in the Southern Ocean is dampened by clouds. 535 With BC clouds dampen the regional temperature response in Asia, North America, Africa, and Europe, and enhance the warming in Amazon. In contrast to clouds, with all forcing agents surface albedo changes enhance the temperature responses in high latitudes. For greenhouse gases, the mild polar amplification in south is associated with a negative contribution from the ocean heat exchange over the Southern Ocean, negative total cloud contribution and a mild LW clear-sky component.

540
We also decompose the model-to-model spread to the contributions of energy balance terms. The model-to-model spread is the largest in the same regions as the average temperature response, i.e. at high latitudes, where the spread is driven by differences in the lapse-rate and water vapor feedbacks (ΔLWclr) and differences in surface albedo (ΔSWAlbedo) changes. For the aerosol-induced temperature responses, also differences in the direct aerosol forcing (ΔSWclr) generate a significant contribution to the model spread, especially for black carbon. This partly arises from different aerosol configuration between 545 models.
The aerosol configuration is important in the generation of the effective temperature response and its model-to-model spread.
In the aerosol experiments, part of the model-to-model spread originates from the difference between aerosol setups, with the emission-driven models generating a higher effective temperature response than the concentration-driven models. For sulx5, 550 the concentration-driven models' mean effective temperature response is 0.49 KW -1 m 2 while for the emission-driven models it is 0.66 KW -1 m 2 . The corresponding numbers for the bcx10 experiments are 0.45 KW -1 m 2 and 0.76 KW -1 m 2 . For the sulx5 experiments, the sign and the regional distribution of ΔSWcld is strongly related to the aerosol setup; however, it should be https://doi.org/10.5194/acp-2021-401 Preprint. Discussion started: 22 June 2021 c Author(s) 2021. CC BY 4.0 License.
noted that two out of the three concentration-driven models do not include aerosol-cloud interactions. In the bcx10 experiment, the aerosol setup modifies the ΔSWclr and ΔLWclr temperature responses (see S1, showing the decomposed temperature 555 responses separately for the concentration and emission-driven models for the sulx5 and bcx10 experiments). The aerosol configuration also plays a crucial role in the SW-albedo response. For both the sulx5 and bcx10 aerosol experiments, only the emission-driven models show a significant temperature contribution from the ΔSWAlbedo term (0.1 and 0.08 KW -1 m 2 respectively) whereas the corresponding mean values for the concentration driven models are only 0.06 and 0.03 KW -1 m 2 . 560 We have demonstrated that the mechanisms behind model uncertainty vary between different regions and forcing agents.
Understanding the atmosphere's dynamical response to different forcers is key to understand future climate changes at the regional level. This is especially important in the case of aerosols, which are predicted to decline in the near future due to climate change and air pollution mitigation actions.

Data and code availability
Data and scripts used for data analysis can be obtained by contacting corresponding author