Articles | Volume 24, issue 12
Research article
27 Jun 2024
Research article |  | 27 Jun 2024

General circulation models simulate negative liquid water path–droplet number correlations, but anthropogenic aerosols still increase simulated liquid water path

Johannes Mülmenstädt, Edward Gryspeerdt, Sudhakar Dipu, Johannes Quaas, Andrew S. Ackerman, Ann M. Fridlind, Florian Tornow, Susanne E. Bauer, Andrew Gettelman, Yi Ming, Youtong Zheng, Po-Lun Ma, Hailong Wang, Kai Zhang, Matthew W. Christensen, Adam C. Varble, L. Ruby Leung, Xiaohong Liu, David Neubauer, Daniel G. Partridge, Philip Stier, and Toshihiko Takemura

General circulation models' (GCMs) estimates of the liquid water path adjustment to anthropogenic aerosol emissions differ in sign from other lines of evidence. This reduces confidence in estimates of the effective radiative forcing of the climate by aerosol–cloud interactions (ERFaci). The discrepancy is thought to stem in part from GCMs' inability to represent the turbulence–microphysics interactions in cloud-top entrainment, a mechanism that leads to a reduction in liquid water in response to an anthropogenic increase in aerosols. In the real atmosphere, enhanced cloud-top entrainment is thought to be the dominant adjustment mechanism for liquid water path, weakening the overall ERFaci. We show that the latest generation of GCMs includes models that produce a negative correlation between the present-day cloud droplet number and liquid water path, a key piece of observational evidence supporting liquid water path reduction by anthropogenic aerosols and one that earlier-generation GCMs could not reproduce. However, even in GCMs with this negative correlation, the increase in anthropogenic aerosols from preindustrial to present-day values still leads to an increase in the simulated liquid water path due to the parameterized precipitation suppression mechanism. This adds to the evidence that correlations in the present-day climate are not necessarily causal. We investigate sources of confounding to explain the noncausal correlation between liquid water path and droplet number. These results are a reminder that assessments of climate parameters based on multiple lines of evidence must carefully consider the complementary strengths of different lines when the lines disagree.

1 Introduction

Aerosol–cloud interactions (ACIs) remain the greatest source of uncertainty in our estimates of anthropogenic perturbations in the Earth's energy budget (Boucher et al.2014; Forster et al.2021). In liquid clouds, an anthropogenic aerosol perturbation essentially instantaneously alters the number of cloud droplets (Nd), changing cloud reflectance and thus the shortwave radiation absorbed by the climate system, which exerts a radiative forcing on climate (radiative forcing by aerosol–cloud interactions or RFaci; Twomey1977; Boucher et al.2014). While our knowledge of RFaci is uncertain (Quaas et al.2020), an even thornier issue is cloud adjustments to the Nd perturbation, where multiple processes acting at different scales from cloud droplets to planetary circulation (Stevens and Feingold2009) result in a multiscale dynamics prediction problem that is impervious to any one silver bullet solution (Mülmenstädt and Feingold2018). Estimates of ACI adjustments are, therefore, based on multiple, and often conflicting, lines of evidence (Boucher et al.2014; Bellouin et al.2020; Forster et al.2021). Those lines of evidence are, broadly, modeling at the cloud process scale (large eddy simulation or LES), global modeling, and observations at different scales.

In the following, we focus on stratocumulus (Sc) clouds, which play a large role in the energy budget due to their high albedo and frequent occurrence. Our understanding of adjustments in Sc is that two effects compete: an anthropogenic increase in Nd suppresses precipitation (Albrecht1989), increasing cloud liquid water path (); but the Nd increase also promotes increasing turbulent entrainment of subsaturated air at cloud top (Ackerman et al.2004; Bretherton et al.2007), decreasing . These mechanisms are regime-dependent; precipitation suppression only plays a role in clouds that would have precipitated in the absence of the aerosol perturbation, and the entrainment mechanisms depend strongly on the turbulence generation mechanisms (for example, cloud-top radiative cooling). The regime dependence of the underlying processes leads to “process fingerprints” in Nd space in LES (Hoffmann et al.2020) for the very limited set of boundary conditions where LES is available. Similar bifurcation behavior appears in satellite observations, where mean as a function of Nd first increases in precipitating clouds, then reaches a peak that roughly coincides with the transition to non-precipitating clouds, and then decreases again (Gryspeerdt et al.2019). There is evidence that this inverted-V relationship between and Nd overestimates the strength of the causal effect of Nd on (Gryspeerdt et al.2019; Arola et al.2022; Fons et al.2023), but qualitatively it is consistent with process understanding from LES. Integrated over all meteorological boundary conditions, the overall satellite correlation between and Nd is negative. The satellite inverted V; satellite observations of natural laboratories (Christensen et al.2022), where the origin of the perturbation is evident (Malavelle et al.2017; Toll et al.2019); and process-modeling lines of evidence lead to the assessment that the adjustment of to anthropogenic aerosol is a reduction in – that is, a positive contribution to the effective radiative forcing by ACI (ERFaci; Bellouin et al.2020; Forster et al.2021).

Global climate models – which, currently, means general circulation models (GCMs) run at a roughly 1° latitude–longitude spatial resolution – tell a different story. They would project an increase rather than a decrease in when aerosols are increased from preindustrial (PI) to present-day (PD) concentrations (Gryspeerdt et al.2020). The GCM line of evidence is discounted in multiline assessments because it conflicts with the other lines and because those lines are assumed to provide more reliable information. This assumption rests on the representation of the relevant processes in GCMs. In these models, precipitation is initiated by a microphysical parameterization with an explicit dependence on Nd (or, largely equivalent, droplet size) so that the increase by precipitation suppression is explicitly parameterized. Reduced by enhanced evaporation, on the other hand, depends critically on meter-scale or smaller interactions between turbulence, radiation, and microphysics at the cloud edge. These interactions fall between several parameterizations and are therefore tricky to formulate in GCMs. As a perverse consequence, this causes us to fret that GCMs may be structurally incapable of representing turbulent entrainment scales, while we often mistakenly consider the many-orders-of-magnitude-smaller-scale precipitation processes a parametric problem (e.g., Mülmenstädt et al.2020, 2021).

In this work, we show that some GCMs from the Coupled Model Intercomparison Project 6 (CMIP6) era, unlike earlier model generations, are capable of producing inverted-V Nd relationships in agreement with global observations and LES. Based on these PD correlations and on the Nd change between PI and PD (i.e., mimicking the information available to observation-based ERFaci estimates), these models predict a reduction in , which is consistent with assessments that use multiple lines of evidence. However, the causal effect of anthropogenic Nd changes on , as diagnosed by model experiments where all climatic boundary conditions apart from aerosols are held fixed, remains as in previous GCM generations: an anthropogenic Nd increase leads to an increase in average , consistent with a dominant role of the precipitation suppression mechanism parameterized in the model microphysics.

2 Data and methods

We use an ensemble of GCMs to perform fixed-sea-surface-temperature model experiments with PD and PI emissions and archive instantaneous aerosol and cloud information, with a sufficient frequency (3 h) to resolve the diurnal cycle and with a sufficient length (1–5 years with the large-scale winds nudged to PD meteorology) to draw statistically robust conclusions. The model ensembles used are the CMIP5-era AeroCom indirect effect experiment (AeroCom IND3) simulations on the one hand and four newer-version models prepared for CMIP6 on the other. The AeroCom models are described in Zhang et al. (2016) and Ghan et al. (2016). The CMIP6-era models are the US Department of Energy's Energy Exascale Earth System Model (E3SM) Atmosphere Model version 1 (EAMv1; Rasch et al.2019), the NASA Goddard Institute for Space Studies (GISS) ModelE3 (Cesana et al.2019, 2021) configuration Tun2, the Geophysical Fluid Dynamics Laboratory (GFDL) atmospheric model AM4.0 (Zhao et al.2018), and the Community Earth System Model version 2 Community Atmosphere Model version 6 (CESM2-CAM6; Gettelman et al.2019). The CMIP6-era models were run for 1 year for the baseline experiment. E3SM was further run for 5 years for additional experiments that needed more data to perform stratification by confounding variables (see Sect. 3.3). For E3SM, the Cloud Feedback Model Intercomparison Project (CFMIP) Observation Simulator Package (COSP) satellite simulator (Pincus et al.2012; Swales et al.2018), mimicking the MODerate Resolution Imaging Spectroradiometer (MODIS) cloud retrievals (Platnick et al.2017), and a number of vertically resolved fields were archived over a limited area over the northeast Pacific (NEP) Sc region for further analysis of confounders; only this experiment uses COSP-simulated Nd and .

2.1 Cloud selection

From the model output, we select liquid clouds, defined by the absence of ice (ice water path <10-3 kg m−2 and ice cloud cover =0) in the column. To mimic passive satellite analyses and to simplify the application of entrainment diagnostics in Part 2 of the series (Mülmenstädt et al.2024a), we require near-overcast (liquid cloud cover >0.9) conditions. For these liquid clouds, we calculate “in-cloud” cloud-top Nd and by dividing the grid-mean Nd and by the projected cloud cover. Only clouds over ocean are considered in this analysis. We refer to these clouds as “overcast clouds”.

In addition to these globally occurring overcast clouds, we also study smaller cloud subsets defined by the dynamical regime following Medeiros and Stevens (2011). In this classification, the stratocumulus regime is based on vertical velocity (ω) at 700 and 500 hPa (ω700>10 hPa d−1 and ω500>10 hPa d−1) and lower tropospheric stability (LTS), which we define here as the difference in potential temperature (θ) between 1000 and 700 hPa (θ700-θ1000>18.55 K). We further restrict the clouds to occur in grid boxes where these conditions are met at least 30 % of the time, a subjective choice that selects the subtropical Sc regions and rejects midlatitude Sc. The occurrence fraction (fSc) of these conditions is shown in Fig. 1. In addition to the Medeiros and Stevens (2011) requirements, all of the abovementioned warm-cloud criteria are applied. We refer to these clouds as Sc-regime clouds.

Figure 1Occurrence fraction, fSc, of Sc conditions by the Medeiros and Stevens (2011) criteria in EAMv1, shown where fSc>0.1. The fSc>0.3 threshold used in the analysis selects the NEP, southeast Pacific, and southeast Atlantic Sc regions, while limiting the northward extent of the NEP region beyond the subtropics and rejecting all midlatitude Sc.

2.2 Analysis methods

From and Nd, we construct the conditional probability P(ℒ|Nd), following Gryspeerdt et al. (2019). For ease of comparison among models and configurations, we collapse the two-dimensional P(ℒ|Nd) into one dimension by calculating the geometric mean in each Nd bin, also following Gryspeerdt et al. (2019).

For the MODIS simulator analysis in Sect. 3.3.3, we transform the simulated τ and droplet effective radius (re) into Nd and using a power-law relationship for adiabatic updrafts with constant Nd (Brenguier et al.2000; Bennartz2007; Painemal and Zuidema2011; Grosvenor et al.2018):


where we take the ratio k=(rv/re)3 between the volumetric mean radius rv cubed and effective radius cubed to be 1; subadiabatic factor, fad, to be 1; scattering efficiency, Q to be 2; and adiabatic condensation rate, Γ, to be 2×10-6 kg m−4. These assumptions minimize the complications involved in showing results that are mostly power-law behavior independent of these constant factors. (This does neglect important modifications that can arise if these factors are not, in fact, constant; Varble et al.2023.)

To analyze confounding by planetary boundary-layer (PBL) depth, h (Sect. 3.3.2), we identify the top of the Sc-like boundary layer by the first model level where temperature increases with height in Sc-regime overcast columns. This produces well-mixed profiles of the liquid water potential temperature, θl, and total water mixing ratio, qw. (Other definitions of PBL top – i.e., the model level of the greatest gradient in θl or qw – yield very similar results.) As we see in Sect. 3.3.2, cloud and aerosol properties are remarkably stratified by PBL depth in E3SM; to keep the properties as distinct as possible as a function of PBL depth, we retain the native model vertical discretization instead of converting the hybrid pressure levels to pressure or geometric height.

Table 1 summarizes the emissions, cloud selection, and model run duration for each experiment.

Table 1GCM experiments used in this analysis.

Download Print Version | Download XLSX

3 Results and discussion

In Fig. 2, we show the behavior of the AeroCom IND3 (CMIP5-era) GCMs in Nd space: with the exception of one model, increases monotonically as a function of Nd. In some models, the slope decreases at high Nd, but only one model, the Hadley Center Global Environment Model version 3 (HadGEM), has quantitatively similar behavior to the inverted-V satellite Nd plot. The behavior of these models (with the exception of HadGEM) is consistent with the interpretation that the predominant mechanism linking and Nd is precipitation suppression.

Figure 2AeroCom IND3 state-of-the-art models' Nd relationship. The satellite inverted-V relationship (Gryspeerdt et al.2019) is indicated by the solid line.


3.1 CMIP6-era models produce inverted-V Nd relationships

A funny thing happened on the way to CMIP6: three of the four US CMIP6-era GCMs had an inverted V with a pronounced negative slope. The behavior of these models is contrasted with the AeroCom models' behavior in Fig. 3. The geographic distribution of the regression slope between log ℒ and log Nd is predominantly negative in the models with an inverted V (Fig. 4), as Gryspeerdt et al. (2019) found in satellite retrievals.

Figure 3(a) AeroCom IND3 state-of-the-art models' marginal probability distributions and Nd conditional probability distributions compared to the (b) CMIP6-era state-of-the-art models' Nd relationship. The satellite inverted-V relationship (Gryspeerdt et al.2019) is indicated by the dashed line. Three of the four CMIP6 models examined are qualitatively similar to the satellite result in the sense that the Nd correlation turns negative at moderate Nd.


Figure 4Geographic distribution of dlogL/dlogNd in the CMIP6-era models. Model output is aggregated to 5° × 5° latitude–longitude boxes before calculating linear regression slopes of log ℒ against log Nd.

One of these models (ModelE) was designed to better represent the entrainment behavior to which the negative slope is attributed in process-scale modeling. The other two (CAM6 and EAMv1), however, were not; if the negative slope is due to an entrainment ACI mechanism, it is an emergent behavior not explicitly parameterized into the turbulence scheme. It is doubly surprising that these models produce a negative slope considering that their closely related predecessor, CAM5.3-CLUBB-MG2, was part of the AeroCom ensemble and showed, at best, a slightly negative relationship between Nd and .

Unfortunately, the intersection between CMIP5 AeroCom models and CMIP6 models in this study is small (only CAM5 and CAM6). We hope that an updated AeroCom experiment will provide comparisons between the CMIP5 and CMIP6 versions of more models. We also hope perturbed physics ensembles of each of the inverted-V models will explore the effect of physics choices on the Nd relationship.

3.2 The negative correlation between Nd and does not predict the sign of PI to PD change in

The bulk of the Nd population lies in the part of the inverted V with a negative Nd correlation. If we regarded this relationship as indicative of a causal influence of Nd on – that is, that an increase in Nd causes to decrease – then we would predict a decrease in as Nd increases from its PI value to its PD value due to anthropogenic emissions.

We can compare the change in predicted by the Nd correlation in PD internal variability to the outcome of a model experiment designed to measure the causal effect of Nd on . This experiment fixes all climatic boundary conditions affecting cloud state (i.e., solar constant, greenhouse gases, and sea-surface temperature) with the exception of anthropogenic aerosols. The change in in this experiment can therefore only be due to the anthropogenic aerosol emissions change. This model experiment shows that the causal effect of the Nd increase is to increase on average, contradicting the prediction of a decrease in based on PD internal variability (Fig. 5). The correlation seen in PD internal variability in these models therefore cannot be causal. Plotting the correlations within PD and PI, as shown in Fig. 6, provides a glimpse at what is happening instead: a secular increase in Nd does not lead to a secular reduction in by shifting the population along the correlation line, as would be expected for a causal relationship. Instead, the correlation line shifts along with the secular shifts in Nd and (mostly to the right given that the change in Nd is far greater than the change in ) in a way that is not predicted by the correlation line itself.

Figure 5PI–PD change from the causal experiment (solid arrow) contrasted with the change predicted from the PD internal variability (dashed arrow). The mean log ℒ as a function of log Nd from the PD model run (Fig. 3) is used to predict PI mean log ℒ from the PI log Nd distribution. Even though the PD Nd correlation is negative, PD>ℒPI.


Figure 6PD (solid) and PI (dashed) Nd and marginal probability distributions and Nd conditional probability distributions in two GCMs with unrelated turbulence schemes. The Nd relationship based on internal variability within one climate state is not universal across the states.


This contradiction raises three questions. First, what produces the noncausal negative Nd correlation? We provide a few hypotheses in the following section. Second, considering that these models can replicate the observed PD correlation, what can we infer about the causality of the relationship in observations, where we are unable to conduct direct experimental tests of causality? We discuss this question in Sect. 3.4. Third, is any part of the negative relationship between Nd and in the models causal? Any such causal mechanism would have to involve a direct or indirect Nd dependence in cloud-top entrainment. In ModelE, the Bretherton and Park (2009) turbulence scheme provides an explicit entrainment closure. Guo et al. (2011) have shown that the combination of the Cloud Layers Unified By Binormals (CLUBB; Larson and Golaz2005; Golaz et al.2007) cloud and turbulence scheme and the Morrison–Gettelman microphysics scheme (Morrison and Gettelman2008; Salzmann et al.2010) can reproduce entrainment-mediated enhanced evaporation at high Nd in single-column experiments. This behavior has not been documented in three-dimensional GCM experiments, but CAM6 and EAMv1 use related cloud–turbulence (Bogenschutz et al.2013; Larson2022) and cloud–microphysics (Gettelman2015) schemes, so it is conceivable that Nd-dependent entrainment mechanisms contribute to the Nd relationship in these three models. A deeper investigation of this question merits a separate paper (Part 2 of this series, Mülmenstädt et al.2024a).

3.3 Sources of covariability that produce noncausal Nd relationships

Noncausal relationships between two variables often originate from a third (possibly unobserved) variable that exerts a causal relationship on the two variables being correlated. This third variable is termed a confounding variable (Pearl and Mackenzie2018). In its most striking form, confounding can lead to a sign reversal between causation and correlation – for example, in Simpson's paradox (Simpson1951; Feingold et al.2022). Cloud properties respond strongly to the circulation at the scales of the Sc cellular organization (mesoscale) and greater. Thus, the mesoscale-to-synoptic-scale circulation is a natural place to look for confounding variables that lead to noncausal correlations between cloud properties.

3.3.1 Mesoscale cloud regimes

Mesoscale circulation manifests as cloud regimes (e.g., Rossow et al.2005; Gryspeerdt and Stier2012; Muhlbauer et al.2014; Unglaub et al.2020). ACI mechanisms likely differ between cloud regimes (e.g., Mülmenstädt and Feingold2018; Possner et al.2020; Dipu et al.2022). This could result in different Nd slopes in open- or closed-cell Sc or shallow cumulus or, as the positively and negatively sloped legs of the inverted-V relationship perhaps show, in precipitating and non-precipitating cloud regimes. Due to GCMs' coarse resolution, it is doubtful that they can correctly represent these mesoscale cloud regimes, their ACI mechanisms, or their coupling to the circulation. Nevertheless – or perhaps precisely because we can probably discount cloud-scale causal links between Nd and due to the mismatch with the GCM resolved scale – we can use GCMs to test whether the existence of cloud regimes is, on its own, a confounding mechanism for the Nd relationship.

To assess whether regime-induced confounding effects may exist in the model Nd relationship, we stratify the E3SM model clouds by surface rain rate, R. These bins of rain rate are our stand-ins for precipitation regimes. We focus on the surface rain rate because, unlike mesoscale morphological regime definitions (which are subgrid-scale in the GCM), the precipitating–non-precipitating regime delineation has a somewhat clear analog in the GCM. Because the model rain rate has a very long low tail, we do not attempt to define a binary non-precipitating versus precipitating categorization but rather divide the cloud sample into quantiles of rain rate. Specifically, we use sextiles, balancing the need for a meaningful range of rain rates with the need to maintain a large sample of clouds within each bin. The CloudSat precipitation detection sensitivity at the GCM spatial resolution (R≈0.01 mm d−1; Stephens et al.2010) falls roughly into the third rain-rate bin; so, by this definition, half the bins approximately represent precipitating and half non-precipitating clouds.

Figure 7 shows the results. The model, perhaps unrealistically, produces clouds that generate surface-reaching rain at all droplet concentrations; however, in bins with higher rain rates, the Nd distribution peak is shifted lower, as might be expected from the negative-exponent power law that parameterizes the autoconversion of cloud water to rain, and as is expected from observations (Pawlowska and Brenguier2003; Comstock et al.2004). At the same time, is higher in bins with higher rain rates, again, as might be expected from the parameterized autoconversion and accretion. Superimposing the bin mean Nd and for each rain-rate bin on the unbinned Nd distribution, we find that the negative correlation among the bin means echoes the unbinned correlation. This is the case even though, in very classic Simpson (1951) fashion, the correlations among five out of the six R bins are positive. Comparing a monovariate linear regression of the form logL=α0+αNdlogNd (with α0 and αNd being the intercept and slope parameters) with a bivariate regression of the form logL=β0+βNdlogNd+βRlogR (with β0, βNd, and βR the intercept and slope parameters) provides further confirmation of the Simpson's paradox-like behavior: while the monovariate regression slope is negative (αNd=-0.27±0.001; fraction of explained variance is 3.3 %), the bivariate slope is positive (βNd=+0.13±0.001; fraction of explained variance is 53 %). Both regressions are performed over the range where the Nd correlation is negative (Nd>20 cm−3). The fraction of the sample where R=0 is approximately 10−4; these data points are discarded in both regressions. The numbers quoted are ordinary least-squares regression slopes and their standard errors. Thus, the opposing influences of Nd and on rain rate can, without any involvement of entrainment or evaporation mechanisms, generate a noncausal negative correlation between Nd and .

Figure 7Precipitation-stratified Nd and marginal probability distributions and Nd conditional probability distributions (colored lines; surface rain-rate intervals given in brackets) in E3SM. The dashed black line shows the unstratified Nd relationship. The solid black line connects the mean (Nd,ℒ) in each precipitation sextile (colored dots). Binning by precipitation intensity exposes a precipitation-mediated negative Nd covariability, with a much steeper slope than the overall Nd correlation, even though the Nd correlation within all but the least-precipitating sextile is positive.


We note that the mechanism generating this noncausal correlation is unusual. The contrast with the causal precipitation suppression mechanism is clear: there, causation runs from Nd to autoconversion to . Here, causation runs jointly from both Nd and to precipitation. How or whether the causal chain then returns from precipitation to Nd and , as in the classic confounding mechanism, is an open question.

We further note that precipitation already appears to have a qualitative effect on the model's Nd relationship at rain rates far below the CloudSat sensitivity threshold: even in the second-lowest R bin, the correlation between Nd and is already positive. This suggests that the parameterized precipitation may exert such a strong influence on ACI even for clouds with a low precipitation rate that other ACI adjustment mechanisms, while they may in principle be represented in the model, could be so overwhelmed by the parameterized precipitation suppression that their effect is not discernible in the climate response.

3.3.2 Synoptic-scale air mass advection

At the synoptic to planetary scales, covariability between cloud and aerosol properties can lead to spurious correlations in ACI metrics (Grandey and Stier2010). Synoptic-scale meteorological covariability can take the form of continental versus marine air mass advection. When an air mass originates over land, it typically has a higher temperature, lower relative humidity (contributing to lower ), and higher aerosol concentration (contributing to higher Nd) than when an air mass originates over ocean. This contrast between air masses creates an anticorrelation between Nd and even in the absence of any causal effect of Nd on (Brenguier et al.2003). Additionally, sea-surface temperature is coldest and climatological subsidence strongest near the coast, resulting in shallow marine boundary layers. The model's conception of this synoptic-scale covariability in space can be seen in Fig. 8, with shallow boundary layers and high cloud condensation nuclei (CCN) concentrations near shore and deeper boundary layers with a low CCN farther offshore. A similar covariability exists at particular locations in time. Figure 9 illustrates the mechanism in the NEP Sc region: at any given location, PBL depth and CCN concentration are strongly linked via the synoptic-scale circulation. Presumably, the position of the anticyclonic subtropical subsidence governs both the PBL depth and whether continental or maritime air is advected.

Figure 8Within the Sc regime, synoptic-scale meteorology results in strong spatial covariability of (a) PBL depth h and (b) CCN concentration at 0.2 % supersaturation averaged over the depth of the PBL in E3SM. Both of these variables are functions of air mass continentality.

Figure 9Within the Sc regime, synoptic-scale meteorology results in strong temporal covariability between CCN and PBL depth, h, in E3SM. This is exemplified by the NEP; the Southern California Bight is depicted in top right corner. The star indicates the grid point with the highest occurrence fraction of Sc conditions according to the criteria of Medeiros and Stevens (2011).

To assess the synoptic meteorological confounding effect, we stratify the E3SM model clouds in the NEP Sc region by PBL depth. We choose PBL depth as the confounding variable because it appears to act as a proxy for air mass continentality in the model (Fig. 8), without a direct parameterized relationship to either aerosols or cloud. PBL depth is nevertheless strongly correlated with both CCN concentration (temporal- and regional-mean vertical profiles are shown in Fig. 10) and . For a fairly wide range of PBL depths (representing the central 90 % of the PBL depth distribution for Sc-regime cloud columns; 10–15 model levels, corresponding approximately to 750–1400 m), the relationship between mean Nd and mean , stratified by PBL depth, mimics the slope of the unstratified Nd relationship quite closely (Fig. 11). Based on this, it is plausible that synoptic-scale meteorological covariability contributes substantially to the overall negative Nd correlation in the model.

Figure 10Temporal- and regional-mean CCN concentration profiles in the NEP Sc region stratified by PBL depth in E3SM. Within the Sc regime, CCN is strongly sorted by PBL depth, illustrating the strong covariability between PBL thermodynamic structure and aerosol advection. The central 90 % of the PBL depth range (between 10 and 15 model levels, corresponding to approximately 750–1400 m) is shown to avoid outliers in the low-statistics PBL depth bins. The PBL depth is measured in units of model levels k, with k increasing downward from the level of the PBL-capping inversion kpbl to the model level closest to the surface (k=72 in EAMv1).


Figure 11Within the Sc regime, PBL depth–CCN covariability leads to a negative Nd correlation with a slope similar to the overall Nd correlation in E3SM. The dashed black line shows the Nd relationship not stratified by PBL depth. The solid black line connects the mean (Nd,ℒ) at each PBL depth (colored dots). The central 90 % of the PBL depth range (between 10 and 15 model levels, corresponding to approximately 750–1400 m) is shown to avoid outliers in the low-statistics PBL depth bins.


We note several caveats. The synoptic-scale covariability of aerosol advection and PBL depth is a feature of the general circulation and can therefore be expected to be modeled reasonably well in a GCM. However, the interaction of the advected aerosol with boundary-layer clouds depends on mixing between the free troposphere and the boundary layer, which is likely much less well represented in GCMs that have coarse vertical resolutions. Furthermore, cloud adjustments mediated by circulation changes are artificially suppressed as a side effect of the fixed sea-surface temperature and nudged winds that we employ to reduce internal-variability noise in the ERFaci estimates. Whether the synoptic-scale confounding signature in the model mimics the real climate is therefore uncertain. Further, the synoptic-scale covariability differs depending on the geographic particulars of each Sc basin; we have only analyzed the NEP Sc in detail. Finally, while the PBL depth-stratified negative Nd relationship in the model is consistent with observational analyses (e.g., Fons et al.2023), the model does not reproduce the weakening of the Nd correlation within each PBL depth bin that is found in observations (Possner et al.2020; Fons et al.2023). We compare (a) a monovariate linear regression of the form logL=α0+αNdlogNd (with α0 and αNd being the intercept and slope parameters) over the unstratified data, (b) a monovariate linear regression of the form logL=α0+αNd|hlogNd (with α0 and αNd|h the intercept and slope parameters) stratified by PBL depth, and (c) a bivariate regression of the form logL=β0+βNdlogNd+βRlogh (with β0, βNd, and βh the intercept and slope parameters). In all three cases, the slope parameters (αNd, αNd|h, and βNd) lie in a range from −0.33 to −0.30. The ordinary least-squares regressions are performed over the Nd range where the Nd correlation is negative (Nd>20 cm−3) and over the central 90 % PBL depth range (10–15 model levels, corresponding approximately to 750–1400 m).

3.3.3 Phase-space boundaries

Correlations between Nd and can also arise simply because not all parts of the Nd phase space are uniformly populated by clouds. This can be illustrated by applying the MODIS simulator (Pincus et al.2012) to the model. The MODIS simulator provides the optical thickness, τ, and droplet effective radius, re, diagnosed consistently with the MODIS satellite cloud retrievals (Platnick et al.2017). Power-law adiabatic relationships ℒ(re,τ) and Nd(re,τ) can be used to transform the MODIS output into Nd space (e.g., Dipu et al.2022). In logarithmic coordinates, this is a linear transformation, yielding the correlation shown in Fig. 12. This S-curve correlation, like the model-native Nd correlation, shows a steep rise in at low Nd and a steep drop at moderate Nd. It also shows another steep rise at high Nd that the model may hint at but does not exhibit clearly. Investigating the data before and after the coordinate transformation to Nd space is instructive. In log τlog re space, the MODIS simulator output falls within a rectangle, bounded by the limits the model prescribes on its clouds. Upon transformation to log ℒlog Nd space, the population is bounded by a parallelogram (see the isolines in Fig. 12). These limits on the phase space strongly sculpt the behavior of the mean log ℒ as a function of log Nd because the parts of phase space that are not populated do not contribute to the mean as a function of Nd.

Figure 12Joint probability distribution of E3SM MODIS-simulated and Nd. Isolines of re and τ, from which adiabatic Nd and are retrieved, are overlaid. The mean log ℒ as a function of log Nd is shown as a blue line. Because the model imposes a rectangular limit on re and τ, the Nd phase space has parallelogram-shaped boundaries. At least part of the rise, fall, and repeated rise in as a function of Nd (blue line) is due to these phase-space boundaries cutting off the upper and lower parts of the distribution.


3.4 Persistent disagreement with other lines of evidence

Before these results, it was only logical to discount the GCM evidence on the basis that it could not reproduce the observed Nd relationship in PD internal variability. Now that some GCMs match the other lines of evidence in PD internal variability, what do we make of the fact that the disagreement on the sign of the causal climatic adjustment to RFaci persists?

In observations, it is more difficult to establish causality than in the GCMs, where it is as simple as changing the aerosol emissions while fixing all other boundary conditions. The most reliable causal evidence in observations comes from observational natural laboratories, where the aerosol perturbation is known and an unperturbed control can be identified clearly (Christensen et al.2022). Such laboratories indicate unchanged or reduced in the perturbed clouds (Malavelle et al.2017; Toll et al.2019; Diamond et al.2020). But, such laboratories are rare, and there is no rigorous extrapolation from these laboratories to the full diversity of cloud regimes found in the climate. The most representative observations – that is, the global satellite-retrieved inverted-V correlations – have the opposite problem: they are representative, but are the correlations causal? The correlation is more negative than the estimate of the causal interannual response to Nd perturbations using an effusive volcano as a laboratory (Gryspeerdt et al.2019; albeit for shallow Cu rather than Sc). Arola et al. (2022) argue that satellite Nd correlations are negatively biased not only by covariability confounding but also by retrieval errors. Fons et al. (2023) applied a causal network approach to the temporal evolution in geostationary satellite data and found that the causal negative Nd relationship is weaker than the Nd correlation. Strong regional increasing and declining trends on multidecadal timescales in the satellite record may also contribute to disentangling covariability and causality (Quaas et al.2022).

In LES, as in GCMs, causality can be established by varying aerosol concentration, while keeping the other boundary conditions constant. This provides very clear evidence that precipitation suppression and entrainment feedbacks lead to process fingerprints of positive and negative tendencies in Nd space (Hoffmann et al.2020) that translate into steady Sc states (Glassmeier et al.2021). But these LES experiments are expensive, so boundary conditions are carefully curated to a very small subset of the high-dimensional space of meteorological conditions present in the climate. We simply do not know whether the process fingerprints would be as unambiguous if a broader spectrum of boundary conditions were simulated or if the clouds were able to interact with larger scales in the multiscale climate problem (Kazil et al.2021) instead of evolving to a steady state.

In summary, GCMs are still the odd ones out in their negative adjustment component of ERFaci. The observational and LES modeling lines of evidence have clear confounding and representativeness problems. Are these problems severe enough to flip the sign of the adjustment? It seems unlikely, but our GCM results show that it is possible; addressing the representativeness and confounder questions in the other lines of evidence thus takes on a renewed urgency.

4 Conclusions

Mülmenstädt and Wilcox (2021) expressed the hope that global models, after a long stretch of playing the odd line of evidence out in assessments of global energy budget problems (Bellouin et al.2020; Sherwood et al.2020), might return to a more equal role in the balance and struggle between conflicting lines of evidence. One way in which the global model perspective shores up the strength of the multiline assessment by providing information not available from the other lines of evidence: being able to test causality and showing that PD internal variability may not even correctly predict the sign of the causal cloud water adjustment to the anthropogenic cloud droplet perturbation.

Causality (or, in this case, lack of causality) is easy to establish in a model experiment but very difficult in observations. Where the noncausal correlation originates is another question that models can, in principle, answer definitively by shutting off confounding model mechanisms in mechanism-denial experiments. In Part 2 of this series, we more fully use the power of models as hypothesis testers by performing perturbed-physics and mechanism-denial experiments. In this paper, we have restricted ourselves to slice-and-dice analyses that could, in principle, also be performed on observations. We hope that they will be performed on observations, especially if Lagrangian investigation of the cloud life cycle (e.g., Eastman et al.2022; Christensen et al.2023) and observational fingerprints of loss processes (e.g., Varble et al.2023) can be included.

Whether lack of causality in the model system implies lack of causality in the real atmosphere is a question that models alone cannot address, so we do not yet know how worried we need to be about the sign difference between correlation and causation in the model Nd relationship. When it comes to the non-GCM lines of evidence, one can quibble with the representativeness of the causal evidence and with causality in the representative evidence – at the very least, these model results are a flashing-red warning sign hanging over our interpretation of the adjustment component of ERFaci.

Code and data availability

Model output for the AeroCom IND3 experiment is available (subject to acknowledgment of a data policy at, last access: 26 June 2024) from ssh:// (AeroCom project2016). The US CMS model output is available at (Mülmenstädt et al.2024b). The analysis code is available at (Mülmenstädt2024).

Author contributions

The idea for this research came from ASA, SEB, MWC, AMF, SD, EG, LRL, PLM, YM, JM, JQ, ACV, HW, KZ, and YZ.

ASA, AG, XL, JM, DN, DGP, PS, TT, and YZ performed model experiments.

ASA, AMF, SD, AG, EG, PLM, JM, JQ, FT, and YZ contributed to analysis.

The original manuscript draft was prepared by JM, and all co-authors provided comments.

Competing interests

At least one of the (co-)authors is a member of the editorial board of Atmospheric Chemistry and Physics. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.


Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.


We thank the two reviewers, Anna Possner and Jianhao Zhang, for suggestions that have resulted in improvements in the analysis and presentation. We thank Christopher Bretherton, Susannah Burrows, Yao-Sheng Chen, Leo Donner, Graham Feingold, Colleen Kaul, Jan Kazil, Naser Mahfouz, Daniel McCoy, Isabel McCoy, Christina Sackmann, Rob Wood, Jianhao Zhang, and Xiaoli Zhou for comments and discussion. This work arises from the 2020 US Climate Modeling Summit held virtually and chaired by Gavin Schmidt and Susanne Bauer.

Financial support

Johannes Mülmenstädt was supported by the Office of Science, US Department of Energy (DOE) Biological and Environmental Research as part of the Regional and Global Model Analysis program area and used resources of the National Energy Research Scientific Computing Center (NERSC), a US DOE Office of Science user facility located at Lawrence Berkeley National Laboratory (contract no. DE-AC02-05CH11231). Andrew S. Ackerman, Ann M. Fridlind, Florian Tornow, and Susanne E. Bauer were supported by the NASA Modeling, Analysis, and Prediction Program; their computational resources were provided by the NASA Center for Climate Simulation (NCCS) at Goddard Space Flight Center. Edward Gryspeerdt was supported by a Royal Society University Research Fellowship (URF\R1\191602). David Neubauer is supported by the European Union’s Horizon 2020 research and innovation program (grant no. 821205). Toshihiko Takemura was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI grant (grant no. JP19H05669) and the Environment Research and Technology Development Fund S-20 (grant no. JPMEERF21S12010) of the Environmental Restoration and Conservation Agency provided by the Ministry of the Environment, Japan. Philip Stier was supported by the European Research Council project RECAP under the European Union’s Horizon 2020 research and innovation program (grant no. 724602) and the FORCeS project under the European Union’s Horizon 2020 research and innovation program (grant no. 821205). The Pacific Northwest National Laboratory (PNNL) is operated for DOE by Battelle Memorial Institute (contract no. DE-AC05-76RLO1830).

Review statement

This paper was edited by Matthew Lebsock and reviewed by Anna Possner and Jianhao Zhang.


Ackerman, A., Kirkpatrick, M., Stevens, D., and Toon, O.: The impact of humidity above stratiform clouds on indirect aerosol climate forcing, Nature, 432, 1014–1017,, 2004. a

AeroCom project: AeroCom Phase II indirect effect experiment, The Norwegian Meteorological Institute [data set], ssh://, 2016. a

Albrecht, B. A.: Aerosols, Cloud Microphysics, and Fractional Cloudiness, Science, 245, 1227–1230, 1989. a

Arola, A., Lipponen, A., Kolmonen, P., Virtanen, T. H., Bellouin, N., Grosvenor, D. P., Gryspeerdt, E., Quaas, J., and Kokkola, H.: Aerosol effects on clouds are concealed by natural cloud heterogeneity and satellite retrieval errors, Nat. Commun., 13, 7357,, 2022. a, b

Bellouin, N., Quaas, J., Gryspeerdt, E., Kinne, S., Stier, P., Watson-Parris, D., Boucher, O., Carslaw, K. S., Christensen, M., Daniau, A.-L., Dufresne, J.-L., Feingold, G., Fiedler, S., Forster, P., Gettelman, A., Haywood, J. M., Lohmann, U., Malavelle, F., Mauritsen, T., McCoy, D. T., Myhre, G., Mülmenstädt, J., Neubauer, D., Possner, A., Rugenstein, M., Sato, Y., Schulz, M., Schwartz, S. E., Sourdeval, O., Storelvmo, T., Toll, V., Winker, D., and Stevens, B.: Bounding Global Aerosol Radiative Forcing of Climate Change, Rev. Geophys., 58, e2019RG000660,, 2020. a, b, c

Bennartz, R.: Global assessment of marine boundary layer cloud droplet number concentration from satellite, J. Geophys. Res., 112, D02201,, 2007. a

Bogenschutz, P. A., Gettelman, A., Morrison, H., Larson, V. E., Craig, C., and Schanen, D. P.: Higher-Order Turbulence Closure and Its Impact on Climate Simulations in the Community Atmosphere Model, J. Climate, 26, 9655–9676,, 2013. a

Boucher, O., Randall, D., Artaxo, P., Bretherton, C., Feingold, G., Forster, P., Kerminen, V.-M., Kondo, Y., Liao, H., Lohmann, U., Rasch, P., Satheesh, S., Sherwood, S., Stevens, B., and Zhang, X.: Clouds and Aerosols, book section Chapter 7, 571–658, Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, ISBN 978-1-107-66182-0,, 2014. a, b, c

Brenguier, J. L., Pawlowska, H., Schüller, L., Preusker, R., Fischer, J., and Fouquart, Y.: Radiative properties of boundary layer clouds: Droplet effective radius versus number concentration, J. Atmos. Sci., 57, 803–821,<0803:RPOBLC>2.0.CO;2, 2000. a

Brenguier, J. L., Pawlowska, H., and Schüller, L.: Cloud microphysical and radiative properties for parameterization and satellite monitoring of the indirect effect of aerosol on climate, J. Geophys. Res., 108, 8632,, 2003. a

Bretherton, C. S. and Park, S.: A New Moist Turbulence Parameterization in the Community Atmosphere Model, J. Climate, 22, 3422–3448,, 2009. a

Bretherton, C. S., Blossey, P. N., and Uchida, J.: Cloud droplet sedimentation, entrainment efficiency, and subtropical stratocumulus albedo, Geophys. Res. Lett., 34, L03813,, 2007. a

Cesana, G., Del Genio, A. D., Ackerman, A. S., Kelley, M., Elsaesser, G., Fridlind, A. M., Cheng, Y., and Yao, M.-S.: Evaluating models' response of tropical low clouds to SST forcings using CALIPSO observations, Atmos. Chem. Phys., 19, 2813–2832,, 2019. a

Cesana, Gregory, V., Ackerman, A. S., Fridlind, A. M., Silber, I., and Kelley, M.: Snow Reconciles Observed and Simulated Phase Partitioning and Increases Cloud Feedback, Geophys. Res. Lett., 48, e2021GL094876,, 2021. a

Christensen, M. W., Gettelman, A., Cermak, J., Dagan, G., Diamond, M., Douglas, A., Feingold, G., Glassmeier, F., Goren, T., Grosvenor, D. P., Gryspeerdt, E., Kahn, R., Li, Z., Ma, P.-L., Malavelle, F., McCoy, I. L., McCoy, D. T., McFarquhar, G., Mülmenstädt, J., Pal, S., Possner, A., Povey, A., Quaas, J., Rosenfeld, D., Schmidt, A., Schrödner, R., Sorooshian, A., Stier, P., Toll, V., Watson-Parris, D., Wood, R., Yang, M., and Yuan, T.: Opportunistic experiments to constrain aerosol effective radiative forcing, Atmos. Chem. Phys., 22, 641–674,, 2022. a, b

Christensen, M. W., Ma, P.-L., Wu, P., Varble, A. C., Mülmenstädt, J., and Fast, J. D.: Evaluation of aerosol–cloud interactions in E3SM using a Lagrangian framework, Atmos. Chem. Phys., 23, 2789–2812,, 2023. a

Comstock, K. K., Wood, R., Yuter, S. E., and Bretherton, C. S.: Reflectivity and rain rate in and below drizzling stratocumulus, Q. J. Roy. Meteor. Soc., 130, 2891–2918,, 2004. a

Diamond, M. S., Director, H. M., Eastman, R., Possner, A., and Wood, R.: Substantial Cloud Brightening From Shipping in Subtropical Low Clouds, AGU Adv., 1, e2019AV000111,, 2020. a

Dipu, S., Schwarz, M., Ekman, A. M. L., Gryspeerdt, E., Goren, T., Sourdeval, O., Mülmenstädt, J., and Quaas, J.: Exploring satellite-derived relationships between cloud droplet number concentration and liquid water path using large-domain large-eddy simulation, Tellus, 74, 176–188,, 2022. a, b

Eastman, R., McCoy, I. L., and Wood, R.: Wind, Rain, and the Closed to Open Cell Transition in Subtropical Marine Stratocumulus, J. Geophys. Res., 127, e2022JD036795,, 2022. a

Feingold, G., Goren, T., and Yamaguchi, T.: Quantifying albedo susceptibility biases in shallow clouds, Atmos. Chem. Phys., 22, 3303–3319,, 2022. a

Fons, E., Runge, J., Neubauer, D., and Lohmann, U.: Stratocumulus adjustments to aerosol perturbations disentangled with a causal approach, npj Clim. Atmos. Sci., 6, 130,, 2023. a, b, c, d

Forster, P., Storelvmo, T., Armour, K., Collins, W., Dufresne, J.-L., Frame, D., Lunt, D., Mauritsen, T., Palmer, M., Watanabe, M., Wild, M., and Zhang, H.: The Earth's Energy Budget, Climate Feedbacks, and Climate Sensitivity, chap. 7, 923–1054, Cambridge University Press,, 2021. a, b, c

Gettelman, A.: Putting the clouds back in aerosol–cloud interactions, Atmos. Chem. Phys., 15, 12397–12411,, 2015. a

Gettelman, A., Hannay, C., Bacmeister, J. T., Neale, R. B., Pendergrass, A. G., Danabasoglu, G., Lamarque, J.-F., Fasullo, J. T., Bailey, D. A., Lawrence, D. M., and Mills, M. J.: High Climate Sensitivity in the Community Earth System Model Version 2 (CESM2), Geophys. Res. Lett., 46, 8329–8337,, 2019. a

Ghan, S., Wang, M., Zhang, S., Ferrachat, S., Gettelman, A., Griesfeller, J., Kipling, Z., Lohmann, U., Morrison, H., Neubauer, D., Partridge, D. G., Stier, P., Takemura, T., Wang, H., and Zhang, K.: Challenges in constraining anthropogenic aerosol effects on cloud radiative forcing using present-day spatiotemporal variability, P. Natl. Acad. Sci. USA, 113, 5804–5811,, 2016. a

Glassmeier, F., Hoffmann, F., Johnson, J. S., Yamaguchi, T., Carslaw, K. S., and Feingold, G.: Aerosol-cloud-climate cooling overestimated by ship-track data, Science, 371, 485–489,, 2021. a

Golaz, J.-C., Larson, V. E., Hansen, J. A., Schanen, D. P., and Griffin, B. M.: Elucidating model inadequacies in a cloud parameterization by use of an ensemble-based calibration framework, Mon. Weather Rev., 135, 4077–4096,, 2007. a

Grandey, B. S. and Stier, P.: A critical look at spatial scale choices in satellite-based aerosol indirect effect studies, Atmos. Chem. Phys., 10, 11459–11470,, 2010. a

Grosvenor, D. P., Sourdeval, O., Zuidema, P., Ackerman, A., Alexandrov, M. D., Bennartz, R., Boers, R., Cairns, B., Chiu, J. C., Christensen, M., Deneke, H., Diamond, M., Feingold, G., Fridlind, A., Huenerbein, A., Knist, C., Kollias, P., Marshak, A., McCoy, D., Merk, D., Painemal, D., Rausch, J., Rosenfeld, D., Russchenberg, H., Seifert, P., Sinclair, K., Stier, P., van Diedenhoven, B., Wendisch, M., Werner, F., Wood, R., Zhang, Z., and Quaas, J.: Remote Sensing of Droplet Number Concentration in Warm Clouds: A Review of the Current State of Knowledge and Perspectives, Rev. Geophys., 56, 409–453,, 2018. a

Gryspeerdt, E. and Stier, P.: Regime-based analysis of aerosol-cloud interactions, Geophys. Res. Lett., 39, L21802,, 2012. a

Gryspeerdt, E., Goren, T., Sourdeval, O., Quaas, J., Mülmenstädt, J., Dipu, S., Unglaub, C., Gettelman, A., and Christensen, M.: Constraining the aerosol influence on cloud liquid water path, Atmos. Chem. Phys., 19, 5331–5347,, 2019. a, b, c, d, e, f, g, h

Gryspeerdt, E., Mülmenstädt, J., Gettelman, A., Malavelle, F. F., Morrison, H., Neubauer, D., Partridge, D. G., Stier, P., Takemura, T., Wang, H., Wang, M., and Zhang, K.: Surprising similarities in model and observational aerosol radiative forcing estimates, Atmos. Chem. Phys., 20, 613–623,, 2020. a

Guo, H., Golaz, J.-C., and Donner, L. J.: Aerosol effects on stratocumulus water paths in a PDF-based parameterization, Geophys. Res. Lett., 38, L17808,, 2011. a

Hoffmann, F., Glassmeier, F., Yamaguchi, T., and Feingold, G.: Liquid Water Path Steady States in Stratocumulus: Insights from Process-Level Emulation and Mixed-Layer Theory, J. Atmos. Sci., 77, 2203–2215,, 2020. a, b

Kazil, J., Christensen, M. W., Abel, S. J., Yamaguchi, T., and Feingold, G.: Realism of Lagrangian Large Eddy Simulations Driven by Renalysis Meteorology: Tracking a Pocket of Open Cells Under a Biomass Burning Aerosol Layer, J. Adv. Model. Earth Sy., 13, e2021MS002664,, 2021. a

Larson, V. E.: CLUBB-SILHS: A parameterization of subgrid variability in the atmosphere, arXiv [preprint],, 2022. a

Larson, V. E. and Golaz, J. C.: Using probability density functions to derive consistent closure relationships among higher-order moments, Mon. Weather Rev., 133, 1023–1042,, 2005. a

Malavelle, F. F., Haywood, J. M., Jones, A., Gettelman, A., Larisse, L. C., Bauduin, S., Allan, R. P., Karset, I. H. H., Kristjansson, J. E., Oreopoulos, L., Ho, N. C., Lee, D., Bellouin, N., Boucher, O., Grosvenor, D. P., Carslaw, K. S., Dhomse, S., Mann, G. W., Schmidt, A., Coe, H., Hartley, M. E., Dalvi, M., Hill, A. A., Johnson, B. T., Johnson, C. E., Knight, J. R., O'Connor, F. M., Partridge, D. G., Stier, P., Myhre, G., Platnick, S., Stephens, G. L., Takahashi, H., and Thordarson, T.: Strong constraints on aerosol-cloud interactions from volcanic eruptions, Nature, 546, 485–491,, 2017. a, b

Medeiros, B. and Stevens, B.: Revealing differences in GCM representations of low clouds, Clim. Dynam., 36, 385–399,, 2011. a, b, c, d

Morrison, H. and Gettelman, A.: A new two-moment bulk stratiform cloud microphysics scheme in the community atmosphere model, version 3 (CAM3). Part I: Description and numerical tests, J. Climate, 21, 3642–3659,, 2008. a

Muhlbauer, A., McCoy, I. L., and Wood, R.: Climatology of stratocumulus cloud morphologies: microphysical properties and radiative effects, Atmos. Chem. Phys., 14, 6695–6716,, 2014. a

Mülmenstädt, J.: jmuelmen/egusphere-2024-4: egusphere-2024-4 initial ACP submission (egusphere-2024-4_initial_ACP_submission), Zenodo [code],, 2024. a

Mülmenstädt, J. and Feingold, G.: The radiative forcing of aerosol–cloud interactions in liquid clouds: Wrestling and embracing uncertainty, Curr. Clim. Change Rep., 4, 23–40,, 2018. a, b

Mülmenstädt, J. and Wilcox, L. J.: The Fall and Rise of the Global Climate Model, J. Adv. Model. Earth Sy., 13, e2021MS002781,, 2021. a

Mülmenstädt, J., Nam, C., Salzmann, M., Kretzschmar, J., L'Ecuyer, T. S., Lohmann, U., Ma, P.-L., Myhre, G., Neubauer, D., Stier, P., Suzuki, K., Wang, M., and Quaas, J.: Reducing the aerosol forcing uncertainty using observational constraints on warm rain processes, Sci. Adv., 6, eaaz6433,, 2020. a

Mülmenstädt, J., Salzmann, M., Kay, J. E., Zelinka, M. D., Ma, P.-L., Nam, C., Kretzschmar, J., Hörnig, S., and Quaas, J.: An underestimated negative cloud feedback from cloud lifetime changes, Nat. Clim. Change, 11, 508–513,, 2021. a

Mülmenstädt, J., Ackerman, A. S., Fridlind, A. M., Huang, M., Ma, P.-L., Mahfouz, N., Bauer, S. E., Burrows, S. M., Christensen, M. W., Dipu, S., Gettelman, A., Leung, L. R., Tornow, F., Quaas, J., Varble, A. C., Wang, H., Zhang, K., and Zheng, Y.: Can GCMs represent cloud adjustments to aerosol–cloud interactions?, EGUsphere [preprint],, 2024a. a, b

Mülmenstädt, J., Ackerman, A., Gettelman, A., and Zheng, Y.: US CMS model runs for (0.9), Zenodo [data set],, 2024b. a

Painemal, D. and Zuidema, P.: Assessment of MODIS cloud effective radius and optical thickness retrievals over the Southeast Pacific with VOCALS-REx in situ measurements, J. Geophys. Res., 116, D24206,, 2011. a

Pawlowska, H. and Brenguier, J. L.: An observational study of drizzle formation in stratocumulus clouds for general circulation model (GCM) parameterizations, J. Geophys. Res., 108, 8630,, 2003. a

Pearl, J. and Mackenzie, D.: The Book of Why: The New Science of Cause and Effect, Basic Books, New York, USA, ISBN 9780465097616, 2018. a

Pincus, R., Platnick, S., Ackerman, S. A., Hemler, R. S., and Hofmann, R. J. P.: Reconciling Simulated and Observed Views of Clouds: MODIS, ISCCP, and the Limits of Instrument Simulators, J. Climate, 25, 4699–4720,, 2012. a, b

Platnick, S., Meyer, K. G., King, M. D., Wind, G., Amarasinghe, N., Marchant, B., Arnold, G. T., Zhang, Z., Hubanks, P. A., Holz, R. E., Yang, P., Ridgway, W. L., and Riedi, J.: The MODIS Cloud Optical and Microphysical Products: Collection 6 Updates and Examples From Terra and Aqua, IEEE T. Geosci. Remote Sens., 55, 502–525,, 2017. a, b

Possner, A., Eastman, R., Bender, F., and Glassmeier, F.: Deconvolution of boundary layer depth and aerosol constraints on cloud water path in subtropical stratocumulus decks, Atmos. Chem. Phys., 20, 3609–3621,, 2020. a, b

Quaas, J., Arola, A., Cairns, B., Christensen, M., Deneke, H., Ekman, A. M. L., Feingold, G., Fridlind, A., Gryspeerdt, E., Hasekamp, O., Li, Z., Lipponen, A., Ma, P.-L., Mülmenstädt, J., Nenes, A., Penner, J. E., Rosenfeld, D., Schrödner, R., Sinclair, K., Sourdeval, O., Stier, P., Tesche, M., van Diedenhoven, B., and Wendisch, M.: Constraining the Twomey effect from satellite observations: issues and perspectives, Atmos. Chem. Phys., 20, 15079–15099,, 2020. a

Quaas, J., Jia, H., Smith, C., Albright, A. L., Aas, W., Bellouin, N., Boucher, O., Doutriaux-Boucher, M., Forster, P. M., Grosvenor, D., Jenkins, S., Klimont, Z., Loeb, N. G., Ma, X., Naik, V., Paulot, F., Stier, P., Wild, M., Myhre, G., and Schulz, M.: Robust evidence for reversal of the trend in aerosol effective climate forcing, Atmos. Chem. Phys., 22, 12221–12239,, 2022. a

Rasch, P. J., Xie, S., Ma, P.-L., Lin, W., Wang, H., Tang, Q., Burrows, S. M., Caldwell, P., Zhang, K., Easter, R. C., Cameron-Smith, P., Singh, B., Wan, H., Golaz, J.-C., Harrop, B. E., Roesler, E., Bacmeister, J., Larson, V. E., Evans, K. J., Qian, Y., Taylor, M., Leung, L. R., Zhang, Y., Brent, L., Branstetter, M., Hannay, C., Mahajan, S., Mametjanov, A., Neale, R., Richter, J. H., Yoon, J.-H., Zender, C. S., Bader, D., Flanner, M., Foucar, J. G., Jacob, R., Keen, N., Klein, S. A., Liu, X., Salinger, A. G., Shrivastava, M., and Yang, Y.: An Overview of the Atmospheric Component of the Energy Exascale Earth system Model, J. Adv. Model. Earth Sy., 11, 2377–2411,, 2019. a

Rossow, W. B., Tselioudis, G., Polak, A., and Jakob, C.: Tropical climate described as a distribution of weather states indicated by distinct mesoscale cloud property mixtures, Geophys. Res. Lett., 32, L21812,, 2005. a

Salzmann, M., Ming, Y., Golaz, J.-C., Ginoux, P. A., Morrison, H., Gettelman, A., Krämer, M., and Donner, L. J.: Two-moment bulk stratiform cloud microphysics in the GFDL AM3 GCM: description, evaluation, and sensitivity tests, Atmos. Chem. Phys., 10, 8037–8064,, 2010. a

Sherwood, S. C., Webb, M. J., Annan, J. D., Armour, K. C., Forster, P. M., Hargreaves, J. C., Hegerl, G., Klein, S. A., Marvel, K. D., Rohling, E. J., Watanabe, M., Andrews, T., Braconnot, P., Bretherton, C. S., Foster, G. L., Hausfather, Z., von der Heydt, A. S., Knutti, R., Mauritsen, T., Norris, J. R., Proistosescu, C., Rugenstein, M., Schmidt, G. A., Tokarska, K. B., and Zelinka, M. D.: An Assessment of Earth's Climate Sensitivity Using Multiple Lines of Evidence, Rev. Geophys., 58, e2019RG000678,, 2020. a

Simpson, E.: The Interpretation of Interaction in Contingency Tables, J. R. Stat. Soc. B, 13, 238–241, 1951. a, b

Stephens, G. L., L'Ecuyer, T., Forbes, R., Gettleman, A., Golaz, J.-C., Bodas-Salcedo, A., Suzuki, K., Gabriel, P., and Haynes, J.: Dreary state of precipitation in global models, J. Geophys. Res., 115, D24211,, 2010. a

Stevens, B. and Feingold, G.: Untangling aerosol effects on clouds and precipitation in a buffered system, Nature, 461, 607–613,, 2009.  a

Swales, D. J., Pincus, R., and Bodas-Salcedo, A.: The Cloud Feedback Model Intercomparison Project Observational Simulator Package: Version 2, Geosci. Model Dev., 11, 77–81,, 2018. a

Toll, V., Christensen, M., Quaas, J., and Bellouin, N.: Weak average liquid-cloud-water response to anthropogenic aerosols, Nature, 572, 51–55,, 2019. a, b

Twomey, S.: Influence of pollution on shortwave albedo of clouds, J. Atmos. Sci., 34, 1149–1152,<1149:TIOPOT>2.0.CO;2, 1977. a

Unglaub, C., Block, K., Mülmenstädt, J., Sourdeval, O., and Quaas, J.: A new classification of satellite-derived liquid water cloud regimes at cloud scale, Atmos. Chem. Phys., 20, 2407–2418,, 2020. a

Varble, A. C., Ma, P.-L., Christensen, M. W., Mülmenstädt, J., Tang, S., and Fast, J.: Evaluation of liquid cloud albedo susceptibility in E3SM using coupled eastern North Atlantic surface and satellite retrievals, Atmos. Chem. Phys., 23, 13523–13553,, 2023. a, b

Zhang, S., Wang, M., Ghan, S. J., Ding, A., Wang, H., Zhang, K., Neubauer, D., Lohmann, U., Ferrachat, S., Takeamura, T., Gettelman, A., Morrison, H., Lee, Y., Shindell, D. T., Partridge, D. G., Stier, P., Kipling, Z., and Fu, C.: On the characteristics of aerosol indirect effect based on dynamic regimes in global climate models, Atmos. Chem. Phys., 16, 2765–2783,, 2016. a

Zhao, M., Golaz, J.-C., Held, I. M., Guo, H., Balaji, V., Benson, R., Chen, J.-H., Chen, X., Donner, L. J., Dunne, J. P., Dunne, K., Durachta, J., Fan, S.-M., Freidenreich, S. M., Garner, S. T., Ginoux, P., Harris, L. M., Horowitz, L. W., Krasting, J. P., Langenhorst, A. R., Liang, Z., Lin, P., Lin, S.-J., Malyshev, S. L., Mason, E., Milly, P. C. D., Ming, Y., Naik, V., Paulot, F., Paynter, D., Phillipps, P., Radhakrishnan, A., Ramaswamy, V., Robinson, T., Schwarzkopf, D., Seman, C. J., Shevliakova, E., Shen, Z., Shin, H., Silvers, L. G., Wilson, J. R., Winton, M., Wittenberg, A. T., Wyman, B., and Xiang, B.: The GFDL Global Atmosphere and Land Model AM4.0/LM4.0:2. Model Description, Sensitivity Studies, and Tuning Strategies, J. Adv. Model. Earth Sy., 10, 735–769,, 2018. a

Short summary
Human activities release copious amounts of small particles called aerosols into the atmosphere. These particles change how much sunlight clouds reflect to space, an important human perturbation of the climate, whose magnitude is highly uncertain. We found that the latest climate models show a negative correlation but a positive causal relationship between aerosols and cloud water. This means we need to be very careful when we interpret observational studies that can only see correlation.
Final-revised paper