Separating radiative forcing by aerosol–cloud interactions and rapid cloud adjustments in the ECHAM–HAMMOZ aerosol–climate model using the method of partial radiative perturbations

Using the method of offline radiative transfer modelling within the partial radiative perturbations (PRP) approach, the effective radiative forcing by aerosol–cloud interactions (ERFaci) in the ECHAM–HAMMOZ aerosol climate model is decomposed into a radiative forcing by anthropogenic cloud droplet number change and adjustments of the liquid water path and cloud fraction. The simulated radiative forcing by anthropogenic cloud droplet number change and liquid water path adjustment are of approximately equal magnitude at −0.52 W m−2 and −0.53 W m−2, respectively, while the cloud fraction adjustment is 5 somewhat weaker at −0.31 W m−2 (constituting 38%, 39%, and 23% of the total ERFaci, respectively); geographically, all three ERFaci components in the simulation peak over China, the subtropical eastern ocean boundaries, the northern Atlantic and Pacific oceans, Europe, and eastern North America (in order of prominence). Spatial correlations indicate that the temporal-mean liquid water path adjustment is proportional to the temporal-mean radiative forcing, while the relationship between cloud fraction adjustment and radiative forcing is less direct. While the estimate of warm-cloud ERFaci is relatively insensitive to the 10 treatment of ice and mixed-phase cloud overlying warm cloud, there are indications that more restrictive treatments of ice in the column result in a low bias in the estimated magnitude of the liquid water path adjustment and a high bias in the estimated magnitude of the droplet number forcing. Since the present work is the first PRP decomposition of the aerosol effective radiative forcing into radiative forcing and rapid cloud adjustments, idealized experiments are conducted to provide evidence that the PRP results are accurate. The experiments show that using low-frequency (daily or monthly) time-averaged model output 15 of the cloud property fields underestimates the ERF, but three-hourly mean output is sufficiently frequent. Copyright statement. TEXT

denoted as the effective radiative forcing (ERF aci ). In liquid-water clouds, RF aci arises from the increased availability of cloud condensation nuclei (CCN) in a polluted atmosphere leading to higher droplet number N d and smaller effective radius r e at constant cloud liquid water path L (Twomey, 1977). The atmosphere responds to the higher-N d , lower-r e clouds by various processes occurring on short timescales, leading to adjustments in other cloud properties, including L and cloud vertical and horizontal geometric extent. 5 Physically, the most important adjustment mechanisms are suppression of precipitation formation (Albrecht, 1989) and enhanced cloud-edge dry-air entrainment and droplet evaporation in the smaller-droplet clouds (Ackerman et al., 2004). The former mechanism, in isolation, would lead to an increase in cloud condensate and cloud fraction, and thus a negative adjustment to the radiative forcing; the latter, in isolation, would lead to a decrease in cloud condensate and thus a positive adjustment to the radiative forcing. Since no mechanism occurs in isolation in a coupled system (Stevens and Feingold, 2009), the question 10 of whether the net adjustment is positive or negative is both difficult and unresolved (e.g., Mülmenstädt and Feingold, 2018;Gryspeerdt et al., 2019a).
One method by which aerosol effective radiative forcing is estimated is to calculate the difference in top-of-atmosphere (TOA) radiative fluxes in general-circulation models (GCMs) between runs with fixed sea-surface temperature and present-day or preindustrial aerosol concentrations or emissions (e.g., Lohmann et al., 2010;Forster et al., 2016). This requires a GCM 15 that includes the relevant aerosol-radiation and aerosol-cloud interaction mechanisms. By representing the optical properties of aerosol in the radiative transfer, the radiative forcing by aerosol-radiation interactions (RF ari ) can be estimated; by representing the N d dependence on aerosol activation during cloud formation, RF aci can be estimated; and by representing precipitation suppression and enhanced evaporation in smaller-r e clouds, the adjustments to RF aci can be estimated. (We exclude ice and mixed-phase cloud processes, which introduce further complications, from this discussion.) These processes occur on scales 20 far below the resolved scale, so their representation in the GCM requires parameterization. Thus, the model is only imperfectly (if at all) aware of subgrid-scale variability in the process rates and feedbacks between the processes; relies on imperfect basestate cloud properties (e.g., Penner et al., 2006); and only considers effects that are amenable to parameterization, meaning that precipitation suppression is included in many models but enhanced evaporation is not (e.g., Salzmann et al., 2010;Michibata et al., 2016;Zhou and Penner, 2017). Based on these considerations, a prevalent view is that, from the standpoint of achieving 25 GCM fidelity, ACI are more difficult than aerosol-radiation interactions, and ACI adjustments are more difficult than the ACI forcing. On top of this, the usual concerns about parametric uncertainty apply, so that the overall uncertainty on GCM estimates of ERF aci is large (Boucher et al., 2014). Nevertheless, the ability of GCMs to produce a global estimate will assure their star will continue to shine brightly in the firmament of ERF estimation methods until competing methods overcome their own significant drawbacks. 30 In general, one might argue that knowing the uncertainty on each term in a sum is a good first step towards attacking the uncertainty on the total; certainly, this is consistent with the GCM paradigm of building up the total forcing from parameterizations for each of the contributing processes, even if it is less applicable to "top-down" estimates from the historical evolution of the climate system. Thus, we write the effective radiative forcing by aerosol as where F ari is the RF ari , F N d is the RF aci due to the increase in N d , and F L and F f c are the L and cloud fraction ( f c ) adjustments to the RF ari and RF aci . Other adjustments, e.g., due to rapid changes in land surface temperatures or atmospheric temperature and humidity profiles, have been estimated as small in a previous study (Heyn et al., 2017). Each of these terms maps fairly well onto a parameterization in the GCM: RF ari is parameterized in the radiative transfer, F N d is parameterized in a droplet activation scheme, the ACI part of F L is parameterized in the precipitation microphysics (and, if enhanced evaporation becomes tractable 5 in the future, that component will presumably be parameterized in the turbulence scheme; e.g., Guo et al., 2011;Neubauer et al., 2014), and F f c is parameterized in the cloud cover scheme (although in our model the response to the perturbation is indirect, via relative humidity changes subsequent to precipitation rate changes); the only component that emerges from the model dynamics rather than from an explicit parameterization is the adjustments of temperature and moisture profiles that entail further adjustments to aerosol-cloud interactions and to aerosol-radiation interactions (formerly known as "semi-10 direct effect"); both of these terms are small (Heyn et al., 2017;Stjern et al., 2017). A natural first step would then be to ask how large each of these terms is, and a natural second step would be to ask how much uncertainty each contributes to the total. One of the benefits of such a decomposition would be that it would provide a more solid footing for-or falsifythe notion that models agree fairly well on the "simpler" problem of RF aci and not well at all on the "harder" problem of the adjustments. However, performing the decomposition is quite difficult in practice. Ghan (2013) addresses the issue of separating 15 F ari from F N d + F L + F f c using the model's intrinsic knowledge of the anthropogenic perturbations. Methods to separate the three ERF aci components using the intrinsic model knowledge of the time-varying, three-dimensional aerosol perturbation and resulting perturbation of cloud properties need to contend with the problems that the ERF aci is diagnosed from two separate runs that cannot easily share fields online, so that double radiation calls as in Ghan (2013) are not feasible; and that the adjustments, by definition, do not respond to the aerosol perturbations instantaneously. One possibility is the approximate 20 partial radiative perturbation (APRP) decomposition (Taylor et al., 2007;Zelinka et al., 2014). APRP decomposes the cloud property changes into changes in area fraction, cloud albedo, and cloud absorption; the change in area fraction maps well onto the f c adjustment in the forcing-adjustment framework, but the APRP cloud albedo change includes both the effect of the anthropogenic N d change and the L adjustment. Another possibility is to deactivate the parameterized precipitation suppression (and, if models include a parameterization of the enhanced evaporation, deactivate that as well); however, the 25 model with parameterized adjustments will produce a different climate than the model without (Penner et al., 2006). Due to the complications arising in methods that directly use the model state, less direct methods have been developed that idealize the cloud field as a globally homogeneous single layer (Ghan et al., 2016) or use similar regression-based statistical techniques as satellite studies (Gryspeerdt et al., 2019b).
In this work, we apply the method of partial radiative perturbations (PRP; Wetherald and Manabe, 1988;Colman and McA-30 vaney, 1997;Colman, 2003;Klocke et al., 2013) to the ERF decomposition problem. PRP falls in the category of methods that directly use the intrinsic model knowledge of the time-varying, three-dimensional aerosol perturbation and resulting perturbation of cloud properties. The starting point for PRP is a perturbed and an unperturbed model run. One then introduces the fields of the perturbed run into the unperturbed run, one at a time, and reruns the radiative transfer scheme on the "partially perturbed" state to derive the resulting change in radiative fluxes. In our application, the two runs are simulations with an atmospheric 35 GCM with prescribed climatological, seasonally varying sea surface temperature (SST) and sea ice cover (SIC) distributions with present-day and preindustrial aerosol emissions, nudged to present-day large-scale upper-level winds to reduce the internal variability without overconstraining the behavior of lower-tropospheric warm cloud and allow significant changes in cloud property to emerge after a shorter integration time than would otherwise be required (e.g., Kooperman et al., 2012;Zhang et al., 2014). The perturbed fields are N d , L, and f c ; the corresponding changes in radiative fluxes are F N d , F L , and F f c .

5
In Section 2, we describe the PRP method and ECHAM-HAMMOZ model in detail; in Section 3, we use PRP to estimate the ERF components in the ECHAM-HAMMOZ model and determine whether the adjustments are a simple proportional response to the forcing.

Methods
We first give a brief formal description of the PRP method; we then describe the model configurations to which we will apply 10 the method.

Partial radiative perturbations
We denote the shortwave TOA flux as Q and the longwave flux as R (all-sky, positive downward in both cases). For the purposes of this analysis, the radiative flux in each spectral range is considered a function of the cloud properties N d , L (or the vertically resolved analogue q l ), and f c . The dependence of the fluxes on other climate state variables -water vapor mixing ratio q; 15 ice-water particle size, shape, and mixing ratio q i ; aerosol properties; radiatively active gases; surface properties; and incoming solar radiation -is implicit: backward PRP as inserting one cloud property at a time from one run into the cloud field of the other and recalculating the radiative fluxes: where cloud property ξ is substituted from run A into run B or run B into run A, respectively, and Q (or R, analogously) is 25 recalculated using the offline version of the model's radiative transfer scheme. Forward-backward PRP is simply the average of the two, taking into consideration that reversing direction reverses the sign of the radiative-flux perturbation (e.g., Klocke et al., 2013): When A denotes the preindustrial (PI)-emissions run and B denotes the present-day (PD)-emissions run, the components of ERF aci correspond to For other meanings of A and B, as in the additional experiments performed in Sections 3.1-3.3, the equivalent expressions to Equations (7)-(9) describe pseudo-forcing components rather than forcing components; we denote them asF N d ,F L , andF f c .
In Equations (7)-(9), indicates averaging over the time dimension of a field F evaluated at the N time steps {t 1 , . . ., t N }. "Evaluated" can, itself, 10 refer to a temporal average over the interval between evaluation time steps, as in a 3-hourly or daily mean, or it can refer to the instantaneous value of the field at that time step; when the distinction matters (because Q and R are not linear functions of their input variables), we will indicate the averaging interval as F (∆t) . Thus, F (inst) denotes the temporal mean of instantaneous model output, while F (3 h) denotes the temporal mean of 3-hourly-averaged model output. 15 We use several model runs performed with the ECHAM-HAMMOZ model, version echam6.1-ham2.2-moz0.9 (Neubauer et al., 2014). The model is based on the ECHAM atmospheric general circulation model , the HAM interactive aerosol module (Stier et al., 2005;Zhang et al., 2012), and the trace-gas chemistry module MOZ (Kinnison et al., 2007) (the latter is disabled in these runs). Of most direct relevance to our study, the parameterized processes contributing to warm-cloud-aerosol interactions are aerosol activation into cloud droplets according to Lin and Leaitch (1997); diagnostic 20 warm rain processes (autoconversion and accretion) according to Khairoutdinov and Kogan (2000); and aerosol scavenging according to Croft et al. (2009Croft et al. ( , 2010. The stratiform cloud scheme is that of Lohmann and Roeckner (1996), extended to double-moment microphysics by Lohmann et al. (2007) and Lohmann and Hoose (2009), with the Sundqvist et al. (1989) critical-relative-humidity cloud cover parameterization.

Model description
To reduce internal variability and achieve low statistical uncertainty on the forcing components within a reasonable integra- To perform PRP on the model output, we have updated the offline version of the RRTM-based ECHAM6 radiative transfer code  that was originally used in Klocke et al. (2013). We neglect time-varying aerosol-radiation interactions to reduce technical complexity. To the extent that aerosol overlying cloud is a small effect, this choice mainly affects our estimate of the f c adjustment (Ghan, 2013;Zelinka et al., 2014), which, unlike the N d forcing and L adjustment, is straightforward to compute without the PRP machinery; comparison to Gryspeerdt et al. (2019b) shows that the f c adjustment 5 estimate is not strongly affected by this simplification.
When clouds are absent (or the cloud fraction is very low) in one run and present in the other, perturbing cloud properties can yield unrealistically large or small q l or N d (and thus r e ); this "decorrelation" problem is well known from the application to climate feedbacks (Colman and McAvaney, 1997) in the context of the correlation between water vapor and cloudiness.
We allow the radiative transfer code to resolve the conflicting cloud properties in the same way as it does when the cloud 10 microphysics and cloud cover schemes produce conflicting cloud properties; in particular, r e can only vary within the limits of the cloud optics lookup table used by the model (2 × 10 −6 to 32 × 10 −6 m). Appendices A1 and A2 describe tests we performed to verify that forward-backward PRP ERF aci estimates are not impacted by the decorrelation problem.

Results
Since the components of the ERF aci have not been diagnosed before in ECHAM-HAM by any method, we begin by presenting 15 their global-mean values and geographic distributions in Section 3.1. In Section 3.2, we investigate whether rapid adjustments to the Twomey forcing are proportional to F N d in terms of their spatial patterns. Section 3.3 investigates the sensitivity of the PRP results to the treatment of model columns containing ice and mixed-phase clouds. In Section 3.4, we determine how much temporal averaging is permissible before the PRP estimate becomes inaccurate. Sections A1 and A2 discuss whether PRP diagnoses the ERF aci components correctly in the presence of decorrelation effects (i.e., effects of introducing one cloud 20 property from one run into an uncorrelated cloud field in another run).

What are the ERF aci components in ECHAM-HAM?
Using the PRP decomposition, Equations (7)-(9), we can diagnose the contributions to the ERF aci from PD and PI-emission fixed-SST model runs. This is shown in Figure 1 and understand the differences between our decomposition and others is underway. All results in this section are based on 3-hourly mean output, F (3 h) . This is a commonly used model output configuration, albeit at the expensive end of the spectrum from the standpoint of storage space requirements. We will justify this choice in Section 3.4.
The geographic patterns of all components exhibit similar features. In the northern hemisphere, fairly strong forcing prevails over both oceans and over most of the continents, with the exception of desert regions, northern Asia, and the Arctic; over the continents, a plume of high forcing components over China, extending eastwards into the Pacific Ocean, and of a magnitude far 5 greater than over Europe and North America, is especially pronounced. In the southern hemisphere, on the other hand, sizable forcing components are largely limited to the subtropical southern Pacific and southern Atlantic in the vicinity of the persistent stratocumulus decks; smaller local maxima in the forcing components also exist in the outflow regions of the midlatitude westerlies downwind of South America, Africa, and Australia.
These geographic patterns result from a convolution of the distributions of susceptible clouds and N d perturbations. Figure 2   10 shows the ERF aci sensitivity, defined as the forcing or adjustment strength per e-folding of the N d burden (with the bar denoting temporal averaging over the length of the run), where the N d burden is defined as 15 Equation (11)   In observational studies or observationally constrained modeling studies, it is common to define susceptibilities analogously to Eq. (11) based on PD variability in cloud and aerosol variables and then multiply those susceptibilities by wholly or partially (Bellouin et al., 2013;Kinne, 2019) model-derived estimates of anthropogenic aerosol perturbations. (In the terminology we adopt here, "sensitivity" is a change in cloud property or cloud radiative effect in response to a climatological change in an aerosol variable, whereas "susceptibility" is a change in response to an instantaneous change in an aerosol variable.) There is disagreement among these studies on whether (Quaas et al., 2008;Alterskjaer et al., 2012;Engström et al., 2015;Gryspeerdt et al., 2016;Andersen et al., 2017;Christensen et al., 2017;McCoy et al., 2017) or not (Lebsock et al., 2008;Chen et al., 2014;5 Gryspeerdt et al., 2017) the most susceptible oceanic clouds (or strongest ERF aci components over ocean, in studies that do not report susceptibilities) occur in warm oceanic clouds where continental pollution intrudes on relatively clean conditions over the eastern ocean boundaries, as in our results. Of those studies that are not restricted to oceanic clouds, some agree with our finding of strong forcing due to relatively susceptible clouds over China (Gryspeerdt et al., 2017;McCoy et al., 2017) and some do not (Quaas et al., 2008;Alterskjaer et al., 2012;Gryspeerdt et al., 2016). An intriguing aspect of the ACI problem is whether the adjustments may be described approximately as a proportional response to the forcing (Gryspeerdt et al., 2019a). On the one hand, we do not necessarily expect proportionality in the physical atmosphere, since the processes responsible for the adjustments carry memory of the cloud evolution over various time scales; the parameterized cloud processes in GCMs share this feature, at least in principle, since the anthropogenic N d perturbation 15 seen by the precipitation parameterization at one time step could be the result of a CCN perturbation at some point in the past, carried to another point in space by advection, and influenced by any of the other parameterized cloud processes. On the other hand, complex systems oftentimes exhibit simple emergent behaviors (e.g., Mülmenstädt and Feingold, 2018, and references therein). If the adjustments were to follow proportionally from the forcing, one consequence for the ACI problem would be that the total ERF aci uncertainty should not be estimated as the uncertainty on the sum of uncorrelated RF aci and 20 adjustments but rather take the correlation between the forcing and adjustments into account, which would result in a smaller ERF aci uncertainty estimate.
In this study, we can test for proportionality in terms of the geographic distribution using the spatial variability in the temporal-mean ERF aci components. Figure 4 shows that the zonal mean of the ratio between L adjustment and N d forcing is relatively stable around unity between the southern and northern midlatitudes with fairly small interhemispheric differences 25 except in the Southern Ocean. The picture is somewhat different for the ratio between f c adjustment and N d forcing, which is more latitudinally variable and more different between the northern and southern hemisphere. Figure 5 reinforces these conclusions, showing that F L and F N d are fairly tightly correlated with a regression relationship remarkably close to one-toone, while the relationship between F f c and F N d is much looser.
One interpretation of these results is that the ERF aci components share a geographic pattern due to the fact that large effects 30 result from the coincidence of large anthropogenic aerosol sources and susceptible clouds; the shared geographic pattern then leads to an approximately proportional relationship that breaks down farther from the source regions or where a different mixture of cloud processes dominates the cloud response (e.g., the Southern Ocean). The cloud cover scheme, which diagnoses f c from the grid mean relative humidity, to some extent decouples f c from the other cloud properties, which attenuates the influence of N d on f c . Nevertheless, the vagueness of this argument is unsatisfactorily mismatched against the precision of the F L -F N d relationship, which suggests a deeper mechanism at play, e.g., that precipitation acts as a common sink process for both N d and L.
Further evidence for proportionality comes from (Gryspeerdt et al., 2019b), who find an intermodel proportional relationship between global-mean forcing and rapid adjustments. 3.3 How should we treat columns containing ice?
In attempting to diagnose warm-cloud ACI forcing components, the question arises how ice-containing clouds should be handled. We can conduct the following set of experiments to determine the range of forcing strengths associated with different thermodynamic-phase treatments: 1. Perturb cloud properties in all cloudy model levels.  shortwave flux only. We did not perform experiment 2 because we expect the result to lie between experiments 1 and 3, whose separation is already in the noise.) We conclude that how we choose to treat mixed-phase and ice clouds makes little difference in ECHAM-HAM, so long as we do not restrict ourselves to columns containing only warm clouds. In the latter case, correcting the forcing by the temporal occurrence fraction of liquid-only columns in each model latitude-longitude box approximately recovers the results when ice-containing columns are retained; however, there is some indication of diverging trends in F N d

20
(which decreases in magnitude as the ice filtering becomes more restrictive) and F L (which increases in magnitude as the ice filtering becomes more restrictive). The ice-free column requirement is often made in passive remote sensing studies to prevent contamination from ice clouds overlying warm clouds and uncertainties in multilayer cloud retrievals.

Does temporal averaging bias the results?
As Table 3 shows, longer averaging periods underestimate the forcing, but the differences between instantaneous output (the 25 model time step is 7.5 minutes, but we sample every 3 h to reduce the data volume) and 3 h averages is minimal. Multimodel ensembles which archive 3 h average output or 3 h subsampled instantaneous of column cloud properties, e.g., AeroCom and CFMIP2, are therefore amenable to treatment by the PRP method.
We have presented the first decomposition of the ACI effective forcing in ECHAM-HAM into a Twomey forcing component and rapid adjustments of L and f c . In ECHAM-HAM, no single component dominates: , and F f c = −0.31 W m −2 ; the Twomey forcing and L adjustment are approximately equally strong, and the f c adjustment is somewhat weaker, as in many other models. The global ERF is dominated by the northern-hemisphere forcing.

5
Within the northern hemisphere, the strongest forcing components occur over land in China in F N d and F L . As expected, the stratocumulus sheets over the eastern ocean basins also show strong responses in both hemispheres, as do the midlatitude North Atlantic and North Pacific.
The temporal-mean spatial patterns of F N d and F L are highly correlated, suggesting an effective proportionality in the L adjustment to the Twomey forcing even though the precipitation-suppression mechanism by which the L adjustment is 10 parameterized in the model has inherent memory that could decouple it from the Twomey effect. The spatial patterns of the temporal-mean F N d and F L , while sharing some of the same gross features, have a much less tight relationship than F N d and In our study of ECHAM-HAMMOZ, the forcing components are fairly insensitive to how we treat columns containing both ice and liquid cloud condensate. Requiring that columns be free of ice and then correcting for the temporal fractional 15 occurrence of ice cloud, a technique that is often necessary in observational studies, largely reproduces the results we obtain when we do not filter out such columns, albeit possibly causing an overestimate of the L adjustment and an underestimate of the N d forcing. (In interpreting the bearing of these results on analyses of satellite cloud retrievals, note that these studies do not necessarily apply the ice-free requirement at the coarse GCM scales of the present work, depending on whether they use gridded "level 3" data or the "level 2" native resolution of the retrieval algorithms.) 20 Through idealized sensitivity studies presented in the Appendix, we have showed that PRP is a viable method for accurately decomposing ERF aci into a N d forcing and L and f c adjustments. This is the case despite large artifacts that occur due to the decorrelated cloud property fields; the forward-backward technique advocated by Colman and McAvaney (1997) is capable of removing these artifacts.
PRP directly uses the intrinsic model knowledge of the time-varying, three-dimensional aerosol perturbation and resulting 25 perturbation of cloud properties to diagnose the ERF aci components and their spatial patterns. This makes it a useful tool for providing context to other less resource-intensive decomposition methods (e.g., Ghan et al., 2016;Gryspeerdt et al., 2019b) or to intercomparison studies (e.g., Pincus et al., 2016;Smith et al., 2018) despite its demand for high-frequency vertically resolved model output.
Appendix A: Validation of the PRP method for ACI decomposition A1 What is the effect of decorrelating the cloud properties?
Consider the results of forward and backward PRP plotted separately for the PD-PI experiment in Figure A1. Not only are the magnitudes grotesque, but taken at face value, they would imply a positive forcing in one direction and a negative forcing in the other. Furthermore, the spatial patterns bear no resemblance to that expected for ERF aci . In this section, we investigate the 5 consequences of these features for the ERF aci decomposition.
Any given atmospheric property is often correlated with many others. Substituting cloud properties one at a time breaks these correlations. For example, since ECHAM-HAM parameterizes precipitation suppression by aerosol, we expect a positive correlation between N d and L within a model run. If we substitute L from another run, the mechanistic link between N d and L through precipitation suppression, by which higher N d at a given point in time leads to higher L at later times, is broken, 10 and, therefore, the correlation between N d and L is altered.
We estimate the strength of such decorrelation effects by performing two model runs with (almost exactly) the same model physics, both nudged to the same large-scale dynamics and with the same fixed SST; the only difference between the runs is that a parameter in the Khairoutdinov and Kogan (2000) formulation for the autoconversion rate, tuned for ECHAM-HAM, 15 has been slightly perturbed from β = 1.79 to β = 1.79 + 10 −5 (α = 2.47 and γ = 4 × 1350 s −1 are unchanged). Even over short integration times (a year), these model runs will converge on the same climate, with nearly identical forcing components. (The small perturbation in β does not result in a significant change in model sensitivity.) However, at any given elapsed integration time and geographic location, the cloud properties in the two runs are essentially uncorrelated. We refer to this pair of runs as the same-climate-different-weather experiment. Knowing that the true climatological TOA flux difference between this pair of 20 runs is zero, we can use these runs to estimate decorrelation effects between any other decorrelated pair of runs, including the PD and PI emissions runs.
We find that decorrelation effects cause the PRP method to estimate enormous TOA flux perturbations when we perform forward or backward substitution of any single cloud property; this is shown in panels (a) and (b) of Figure A2.
Unlike forward PRP or backward PRP individually, forward-backward PRP is unaffected by decorrelation, both in the global 25 mean and locally in the temporal mean: panel (c) of Figure A2 shows that the fluctuations in ∆Q rapidly (i.e., within a year) average to zero. This confirms that the Colman and McAvaney (1997) prescription is successful at minimizing the spurious effects of decorrelation.
A2 Does PRP give the right answer?
The preceding section provides evidence that strong decorrelation effects do not lead to a spurious offset in forward-backward 30 PRP results. Next, we show that decorrelation effects also do not lead to spurious scale factors. To do so, we scale N d and L by a globally constant factor of 1.1 at all timesteps and scale f c by 0.99. We use PRP to diagnose the forcing associated with each of these perturbations; the results are shown in the first three rows of Table A1. We can then estimate the strength of the decorrelation effects by performing PRP on the β = 1.79 + 10 −5 run and the scaled-{N d , L, f c } β = 1.79 run. This is shown in the middle three rows of Table A1; the correct results are recovered to good approximation, with generally small attribution to incorrect ERF aci components (the largest is −0.05 W m −2 incorrectly diagnosed as f c adjustment in the L×1.1 experiment) and generally small differences between the actual and diagnosed (the largest is a diagnosed F L = −0.48 W m −2 when the correct 5 value is −0.53 W m −2 ). The final test is an experiment in which all cloud properties are perturbed simultaneously and the clouds are decorrelated by using a β = 1.79 + 10 −5 baseline run. The results are shown on the last line of Table A1; the correct ERF aci components are recovered in the presence of the confounding effects of decorrelation and of perturbing multiple cloud properties simultaneously with 0.1 W m −2 or better accuracy, the largest discrepancy being the diagnosed F f c = 0.14 W m −2 when the correct value is 0.24 W m −2 ). 10 Thus, we find that forward-backward PRP can diagnose the forcing components correctly in the presence of decorrelations, in addition to diagnosing the absence of forcing correctly in the same-climate-different-weather case.
A3 Are the results sensitive to choosing grid-mean or in-cloud perturbations?
Perturbing in-cloud or grid-mean N d and L would be equivalent in the limit in which TOA flux perturbations are linear in the cloud properties. While individual model columns do not satisfy this linearity requirement, the temporal mean apparently ex- 15 hibits sufficient effective linearity that the choice of in-cloud or grid-mean perturbations has little effect on the ERF component estimate; compare Tables 1 and A2.
Code and data availability. The PRP code is available in a GitHub repository that will be made public and receive a DOI concurrently with publication of the final paper. Similarly for the analysis code. ECHAM-HAMMOZ is available from hammoz.ethz.ch subject to acknowledgement of a licensing agreement. The PRP output on which the manuscript is based will be made available and receive a DOI on Zenodo 20 concurrently with publication of the final paper. Due to the large data volume of 3-hourly vertically resolved fields, the model output itself was not archived, but model configuration files that can be used to replicate the output are available as part of the PRP code.    Figure 5. Correlation plots between the temporal-mean Twomey forcing and the adjustments; color indicates the number of grid boxes within each 0.05 W m −2 × 0.05 W m −2 bin; the red line is a linear least-squares regression; the blue line is a generalized additive model regression (Wood, 2011), with 95% confidence interval shaded in light blue; and the dashed gray line is the one-to-one line Table 1. ERF aci components in ECHAM-HAM estimated by forward-backward PRP. The total ERF also includes the ice-phase ACI effects (−0.59 W m −2 in the SW, 0.88 W m −2 in the LW), RF ari (−0.17 W m −2 in the SW), and a negligible surface-albedo contribution (−0.01 W m −2 ), estimated for a very similar model run in Gryspeerdt et al. (2019b). The sum of the components thus balances at approximately the 0.2 W m −2 level, a relative error similar to the 0.1 W m −2 estimated uncertainty on the ERF aci components.