Anthropogenic aerosol forcing of the AMOC and the associated mechanisms in CMIP6 models

By regulating the global transport of heat, freshwater and carbon, the Atlantic Meridional Overturning Circulation (AMOC) serves as an important component of the climate system. During the late 20th and early 21st centuries, indirect observations and models suggest a weakening of the AMOC. Direct AMOC observations also suggest a weakening during the early 21st century, but with substantial interannual variability. Long-term weakening of the AMOC has been associated with increasing greenhouse gases (GHGs), but some modeling studies suggest the buildup of anthropogenic aerosols (AAs) may have offset part of the GHG-induced weakening. Here, we quantify 1900-2020 AMOC variations and assess the driving mechanisms in state-of-the-art climate models from the Coupled Model Intercomparison Project phase 6 (CMIP6). The CMIP6 all forcing (GHGs, anthropogenic and volcanic aerosols, solar variability, and land use/land change) multi-model mean shows negligible AMOC changes up to ∼1950, followed by robust AMOC strengthening during the second half of the 20th century (∼1950-1990), and weakening afterwards (1990-2020). These multi-decadal AMOC variations are related to changes in North Atlantic atmospheric circulation, including an altered sea level pressure gradient, storm track activity, surface winds and heat fluxes, which drive changes in the subpolar North Atlantic surface density flux. Similar to previous studies, CMIP6 GHG simulations yield robust AMOC weakening, particularly during the second half of the 20th century. Changes in natural forcings, including solar variability and volcanic aerosols, yield negligible AMOC changes. In contrast, CMIP6 AA simulations yield robust AMOC strengthening (weakening) in response to increasing (decreasing) anthropogenic aerosols. Moreover, the CMIP6 all-forcing AMOC variations and atmospheric circulation responses also occur in the CMIP6 AA simulations, which suggests these are largely driven by changes in anthropogenic aerosol emissions. Although aspects of the CMIP6 all-forcing multi-model mean response resemble observations, notable differences exist. This includes CMIP6 AMOC strengthening from ∼1950-1990, when the indirect estimates suggest AMOC weakening. The CMIP6 multi-model mean also underestimates the observed increase in North Atlantic ocean heat content. And although the CMIP6 North Atlantic atmospheric circulation responses (particularly the overall patterns) are similar to observations, the simulated responses are weaker than those observed, implying they are only partially externally forced. The possible causes of these differences include internal climate variability, observational uncertainties and model shortcomings, including excessive aerosol forcing. A handful of CMIP6 realizations yield AMOC evolution since 1900 similar to the indirect observations, implying the inferred AMOC weakening from 1950-1990 (and even from 1930-1990) may have a significant contribution from internal (i.e., unforced) climate variability.

https://doi.org/10.5194/acp-2020-769. Preprint. Discussion started: 2 October 2020. © Author(s) 2020. CC BY 4.0 License.

salinification of the North Atlantic subpolar gyre via increased evaporation, decreased flux of ice through the Fram Strait and increased salt advection from the subtropical Atlantic. This study, like many of the earlier studies, relies on a single climate model. Very recently, however, Menary et al. (2020) use the new Coupled Model Intercomparison Project phase 6 (CMIP6) (Eyring et al., 2016) archive to show a ∼10% AMOC strengthening from 1850-1985, which they attribute to aerosol forcing.
The newest generation of coupled climate and earth system models, CMIP6, represents a significant opportunity to evaluate the role of external forcing, including anthropogenic aerosols, on North Atlantic climate and the AMOC. Similar to the very recent results of Menary et al. (2020), we show that a large suite of state-of-the-art climate models simulates robust strengthening of the AMOC from ∼1950-1990, and that this response is largely driven by anthropogenic aerosols. Furthermore, CMIP6 models yield robust AMOC weakening from ∼1990-2020, with anthropogenic aerosols again playing an important role. We also show that these responses are related to atmospheric circulation changes, including an altered sea level pressure gradient, storm track, surface winds and heat fluxes, which ultimately drive changes in the surface density flux in the subpolar North Atlantic.

AMOC Calculation
The AMOC is defined as the maximum stream function ($\psi$) below 500 m at 28°N in the Atlantic Ocean. It is calculated by integrating the northward sea water velocity ($v_o$) with depth, $z$, along the western ($x_w$) to the eastern boundaries ($x_e$) of the Atlantic Ocean:

$$\psi(z) = \int_{z}^{0} \int_{x_w}^{x_e} v_o \, \mathrm{d}x \, \mathrm{d}z \tag{1}$$

The AMOC percent change is estimated from the least-squares regression slope ($r_s$) of the non-normalized AMOC time series using $100 \times \frac{r_s \times N}{\mathrm{AMOC}(N=1)}$, where $N$ is the number of years (e.g., 30 for 1990-2020) and $\mathrm{AMOC}(N=1)$ is the initial AMOC strength (e.g., in 1990 for 1990-2020). The quoted AMOC percent change uncertainties are estimated as the standard error, defined as $\sigma / \sqrt{n_m}$, where $\sigma$ represents the standard deviation across each model mean AMOC percent change and $n_m$ is the number of models.
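As a minimal sketch of the percent-change and standard-error calculations described above, the following Python fragment uses only the standard library. The function names, the synthetic AMOC series, and the use of the sample (rather than population) standard deviation are assumptions made for illustration; the text does not specify them.

```python
import math
import statistics


def ls_slope(values):
    """Least-squares slope of a series against its index (units per year)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def amoc_percent_change(amoc, n_years):
    """100 * r_s * N / AMOC(N=1), taking AMOC(N=1) as the first value (assumption)."""
    return 100.0 * ls_slope(amoc) * n_years / amoc[0]


def standard_error(model_changes):
    """sigma / sqrt(n_m) across per-model percent changes (sample sigma assumed)."""
    return statistics.stdev(model_changes) / math.sqrt(len(model_changes))


# Hypothetical annual AMOC series (Sv) weakening linearly by 0.1 Sv per year.
amoc_series = [20.0 - 0.1 * year for year in range(31)]
change = amoc_percent_change(amoc_series, n_years=30)
```

With this synthetic series the slope is −0.1 Sv per year, so the 30-year change relative to the initial 20 Sv is −15%.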

SDF Calculation
The surface density flux (SDF) indicates the loss or gain of density (buoyancy) of the ocean surface due to thermal (radiation, sensible and latent heat) and haline (sea-ice melting/freezing, brine rejection, precipitation minus evaporation) exchanges (Liu et al., 2017, 2019). An increase in subpolar North Atlantic SDF is associated with strengthening of the AMOC; a decrease in SDF is associated with weakening of the AMOC. Surface density flux is defined as:

$$\mathrm{SDF} = -\alpha \frac{\mathrm{SHF}}{c_p} - \rho(0, \mathrm{SST}) \, \beta \, \frac{\mathrm{SFWF} \times \mathrm{SSS}}{1 - \mathrm{SSS}} \tag{2}$$

where $c_p$, SST, and SSS are the specific heat capacity, sea surface temperature and sea surface salinity, respectively; $\alpha$ and $\beta$ are thermal expansion and haline contraction coefficients; and $\rho(0, \mathrm{SST})$ is the density of freshwater with a salinity of zero and the temperature of SST. SHF represents the net surface heat flux into ocean (positive downward), which is estimated as a sum of shortwave (SW) and longwave (LW) radiation, sensible (SHFLX) and latent (LHFLX) heat fluxes, and heat fluxes from sea ice melting and other minor sources. SFWF represents net surface freshwater flux into ocean (positive downward) and is estimated as precipitation + runoff + ice melting − evaporation. The first term, $-\alpha \frac{\mathrm{SHF}}{c_p}$, represents the thermal contribution (TSDF); the second term, $-\rho(0, \mathrm{SST}) \, \beta \, \frac{\mathrm{SFWF} \times \mathrm{SSS}}{1 - \mathrm{SSS}}$, represents the haline contribution (HSDF) to the density flux.
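The two-term definition above can be sketched in a few lines of Python. This is an illustration only: the function name, the units noted in the comments (freshwater flux in m s⁻¹, salinity as a mass fraction), and the sample coefficient values are assumptions, not values taken from the study.

```python
def surface_density_flux(shf, sfwf, sss, alpha, beta, rho_fresh, c_p):
    """Return (TSDF, HSDF, SDF) from the thermal and haline terms.

    Assumed units (illustrative): shf in W m-2 and sfwf in m s-1, both
    positive downward; sss as a mass fraction (kg/kg); alpha in K-1;
    beta per unit salinity fraction; rho_fresh in kg m-3; c_p in J kg-1 K-1.
    """
    tsdf = -alpha * shf / c_p                              # thermal contribution
    hsdf = -rho_fresh * beta * sfwf * sss / (1.0 - sss)    # haline contribution
    return tsdf, hsdf, tsdf + hsdf


# Hypothetical subpolar values: 100 W m-2 heat loss, weak net freshwater input.
tsdf, hsdf, sdf = surface_density_flux(
    shf=-100.0, sfwf=1.0e-8, sss=0.035,
    alpha=2.0e-4, beta=0.8, rho_fresh=1000.0, c_p=4000.0,
)
```

In this example the ocean loses heat, so the thermal term is positive (a density gain), while the net freshwater input makes the haline term negative and much smaller in magnitude.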

OHC Calculation
The ocean heat content (OHC) is estimated from the ocean potential temperature for each model vertical level. It is derived by spatially integrating over the North Atlantic (0-60°N; 7.5-75°W) upper-ocean (0-700 m) (e.g., Zhang et al., 2013), and then multiplying by reference values for sea water density ($\rho$) and specific heat capacity ($C$) of 1025 kg m$^{-3}$ and 3985 J kg$^{-1}$ K$^{-1}$, respectively (Palmer and McNeall, 2014). Ocean heat content is calculated for each vertical level according to the following equation:

$$\Phi_z = \rho \, C \sum_{i} \sum_{j} \theta_{i,j,z} \, V_{i,j,z} \tag{3}$$

where $\Phi_z$ is the ocean heat content for model vertical level, $z$; $\theta$ is the potential temperature at that vertical level; $V$ is the grid cell volume; and $i, j$ are the latitudes and longitudes that cover the North Atlantic.
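A minimal sketch of the per-level heat content calculation described above, using plain nested lists in place of model grids. The function names, the toy 2×2 grid, and the final sum over levels to obtain an upper-ocean total are assumptions for illustration.

```python
RHO = 1025.0  # reference sea water density (kg m-3), as quoted in the text
C_P = 3985.0  # reference specific heat capacity (J kg-1 K-1), as quoted in the text


def level_heat_content(theta, volume):
    """rho * C * sum over grid cells of theta * V for one vertical level.

    theta and volume are 2-D lists (latitude x longitude) restricted to the
    region of interest; masking of land cells is assumed to be done already.
    """
    total = sum(
        t * v
        for row_t, row_v in zip(theta, volume)
        for t, v in zip(row_t, row_v)
    )
    return RHO * C_P * total


def upper_ocean_heat_content(theta_levels, volume_levels):
    """Sum the per-level heat content over the selected levels (assumed step)."""
    return sum(
        level_heat_content(t, v) for t, v in zip(theta_levels, volume_levels)
    )


# Toy example: one level, 2 x 2 cells, unit temperature and unit volume.
theta_level = [[1.0, 1.0], [1.0, 1.0]]
volume_level = [[1.0, 1.0], [1.0, 1.0]]
phi = level_heat_content(theta_level, volume_level)
```

In practice the trends discussed in the text depend on changes in heat content, so the choice of temperature reference does not affect them.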

Decomposition of Latent and Sensible Heat Fluxes
Using Monin-Obukhov similarity theory (Monin and Obukhov, 1954), latent (LHFLX) and sensible (SHFLX) heat fluxes can be decomposed into wind, moisture and temperature components according to:

$$\mathrm{LHFLX} = -L_v \, \rho_{air} \, u_* \, q_* \tag{4}$$

$$\mathrm{SHFLX} = -c_{p,air} \, \rho_{air} \, u_* \, \theta_* \tag{5}$$

where $L_v$ is the latent heat of vaporization; $c_{p,air}$ is the specific heat capacity of air at constant pressure; $\rho_{air}$ is the surface air density; $u_*$ is the surface velocity scale (m s$^{-1}$, also referred to as the surface friction velocity); $q_*$ is the surface humidity scale (kg kg$^{-1}$); and $\theta_*$ is the surface temperature scale (K) (Grachev and Fairall, 1997; Maronga, 2014). The velocity scale can be estimated from observed or simulated surface wind stress ($\tau$) as $u_* = \sqrt{|\tau| / \rho_{air}}$. Given values for latent and sensible heat fluxes and Eqs. (4-5), the moisture and temperature scales can be calculated as the residual. The validity of this methodology has been verified in MERRA2, where all fields (e.g., $u_*$, $q_*$, $\theta_*$, and the surface heat fluxes) are archived.

LHFLX and SHFLX trends can then be decomposed into wind, moisture and temperature components according to:

$$\delta \mathrm{LHFLX} \approx -L_v \, \rho_{air} \, (u_*^c \, \delta q_* + q_*^c \, \delta u_*) \tag{6}$$

$$\delta \mathrm{SHFLX} \approx -c_{p,air} \, \rho_{air} \, (u_*^c \, \delta \theta_* + \theta_*^c \, \delta u_*) \tag{7}$$

where $\delta$ represents the trend and $u_*^c$, $q_*^c$ and $\theta_*^c$ represent climatological values at each grid box. $\rho_{air}$ is assumed to be constant for each grid box. Cross checking the estimated and actual LHFLX and SHFLX trends shows very close agreement.
The first (second) term in Eq. (6) represents the moisture (wind) component of δLHFLX. Similarly, the first (second) term in Eq. (7) represents the temperature (wind) component of δSHFLX.
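The decomposition described in this section can be sketched as follows. The bulk form flux = −coefficient × ρ_air × u* × scale is assumed here by analogy with the δSHFLX expression, and the function names and sample values are hypothetical.

```python
import math


def friction_velocity(tau, rho_air):
    """u* = sqrt(|tau| / rho_air), from the surface wind stress magnitude."""
    return math.sqrt(abs(tau) / rho_air)


def scale_from_flux(flux, coeff, rho_air, u_star):
    """Residual scale: q* from LHFLX (coeff = L_v) or theta* from SHFLX
    (coeff = c_p,air), assuming flux = -coeff * rho_air * u* * scale."""
    return -flux / (coeff * rho_air * u_star)


def decompose_trend(coeff, rho_air, u_clim, s_clim, d_u, d_s):
    """Split a flux trend into (moisture-or-temperature part, wind part)."""
    scalar_part = -coeff * rho_air * u_clim * d_s  # trend in q* or theta*
    wind_part = -coeff * rho_air * s_clim * d_u    # trend in u*
    return scalar_part, wind_part


# Hypothetical grid-box values for a latent heat flux example.
L_V = 2.5e6    # latent heat of vaporization (J kg-1), illustrative value
RHO_AIR = 1.2  # surface air density (kg m-3), illustrative value
u_star = friction_velocity(tau=0.12, rho_air=RHO_AIR)
q_star = scale_from_flux(flux=100.0, coeff=L_V, rho_air=RHO_AIR, u_star=u_star)
moisture_part, wind_part = decompose_trend(
    coeff=L_V, rho_air=RHO_AIR, u_clim=0.3, s_clim=-1.0e-4, d_u=0.01, d_s=0.0
)
```

Here the humidity scale is held fixed (`d_s=0.0`), so the whole latent heat flux trend comes from the wind term.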

Storm Track Activity
We define the extratropical cyclone (storm track) activity using temporal variance statistics, band-pass filtered using a 24-hour difference filter (Chang et al., 2015; Allen and Luptowitz, 2017):

$$pp = \overline{[\mathrm{PSL}(t + 24\,\mathrm{h}) - \mathrm{PSL}(t)]^2} \tag{8}$$

where PSL is the daily sea level pressure and $pp$ is the 24-hour difference filtered variance of sea level pressure. The overbar corresponds to time averaging over each year.
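The 24-hour difference filtered variance described above amounts to averaging the squared day-to-day pressure changes. The sketch below assumes one value per day for a single grid point and a single year; the function name and sample values are hypothetical.

```python
def storm_track_activity(psl_daily):
    """Time mean of the squared 24-hour difference of daily sea level pressure."""
    diffs = [later - earlier for earlier, later in zip(psl_daily, psl_daily[1:])]
    return sum(d * d for d in diffs) / len(diffs)


# Hypothetical daily sea level pressure values (hPa) at one grid point.
active = storm_track_activity([1000.0, 1004.0, 1000.0, 1004.0])
quiet = storm_track_activity([1012.0, 1012.0, 1012.0, 1012.0])
```

A series that swings by 4 hPa each day gives a value of 16 hPa², while a constant series gives zero, which is why the metric tracks synoptic-scale variability.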

Anthropogenic Aerosol Effective Radiative Forcing
Anthropogenic aerosol Effective Radiative Forcing (ERF) is estimated from the net top-of-the-atmosphere (TOA) radiative fluxes (the sum of net longwave and shortwave fluxes) using ∼30 years of data from fixed sea surface temperature (SST) simulations (Forster et al., 2016). More specifically, anthropogenic aerosol ERF is the net TOA radiative flux difference between piClim-Control and piClim-aer simulations (i.e., piClim-aer − piClim-Control).

-term (1900-2020) climatology. Trends are based on a least-squares regression and significance is based on a standard t-test. The lead-lag correlation analysis is based on Pearson's correlation coefficient. The 95% confidence intervals for the lead-lag correlations are estimated by first transforming the Pearson's correlation coefficient ($r$) to a Fisher's z-score ($r_z$). The corresponding standard error of the z distribution is defined as $\sigma_z = 1 / \sqrt{N - 3}$, where $N$ is the number of years. The confidence interval under the transformed system is calculated as $r_z \pm z_{\alpha/2} \times \sigma_z$, where $z_{\alpha/2}$ is calculated from the inverse of the cumulative distribution function and $\alpha$ is 0.05 for a 95% confidence interval. The transformation is reversed to obtain the lower and upper bounds of the confidence interval. Similar lead-lag correlation results are obtained under detrended and non-detrended time series.

Figure 1 shows the models and number of realizations used). Relatively small change occurs up to ∼1950, after which the AMOC strengthens through ∼1990, and then rapidly weakens through present-day (2020). 83% (92%) of the models yield a positive (negative) AMOC trend from 1950-1990 (1990-2020). The 1950-1990 (1990-2020) ensemble mean strengthening (weakening) represents a 7.7±1.6 (−11.4±1.8) percent change (Supplementary Figure 1). These and all subsequent percent changes are relative to the beginning year of the time period (Methods Section).
As these multi-decadal AMOC variations are based on the ensemble mean from a relatively large number of models, they are not due to internal climate variability. Instead, they are driven by external forcing.
Table 1). Over the present-day (2005-2018), the CMIP6 simulated AMOC ranges from 9.1 Sv (NESM3) to 30.3 Sv (NorESM2-MM). The corresponding multi-model mean AMOC strength and one-sigma uncertainty across models is 19.8 and 5.6 Sv, respectively (similar values are obtained over the entire 1900-2020 time period at 20.5 and 5.8 Sv). This is similar to but somewhat larger than that from the RAPID array at 17.5 Sv with an interannual standard deviation of 1.4 Sv. Re-estimating

The AMOC is related to surface density fluxes in the subpolar North Atlantic (Liu et al., 2017, 2019), which modulate deepwater formation in the deep convection region. We define the subpolar North Atlantic region as 45-60°N and 0-50°W. We get similar results with alternate definitions of the subpolar North Atlantic region (e.g., 45-65°N and 10-60°W). Figure 1 also includes the corresponding time series for the subpolar North Atlantic surface density flux (SDF), as well as its thermal (TSDF) component. The AMOC and SDF exhibit similar multi-decadal variations, including an increase (decrease) from ∼1950-1990 (1990-2020). Moreover, most of the temporal variation in SDF is consistent with TSDF. The haline SDF component (HSDF) is two orders of magnitude weaker (not shown). Multi-decadal variations in TSDF are largely consistent with latent (LHFLX) and sensible (SHFLX) heat fluxes (Fig. 1d-e). Similar temporal evolution also occurs for the subpolar North Atlantic surface wind (SFWD), which is a component of both LHFLX and SHFLX. Moreover, the sea level pressure gradient (dPSL) between Europe (30-45°N and 0-30°E) and the subpolar North Atlantic also exhibits similar temporal evolution consistent with surface wind variations (Fig. 1g-h), as does the subpolar North Atlantic extratropical cyclone (storm track) activity (Fig. 1i-j). We also mention here that the multi-decadal evolution of these variables is generally out of phase with the subpolar North Atlantic net downward surface shortwave radiation (SW; Fig. 1f).
Figure 2 shows subpolar North Atlantic lead-lag Pearson correlations (r; Methods Section) based on the CMIP6 all forcing annual mean ensemble mean. The subpolar North Atlantic 550 nm aerosol optical thickness (AOT; a measure of the extinction of radiation by aerosols) and SW exhibit the maximum correlation at −0.89 with zero lag (Fig. 2a). The subpolar North Atlantic net surface shortwave radiation and AMOC exhibit maximum correlation at −0.84, with SW leading the AMOC by ∼12 years (Fig. 2b). Similarly, the subpolar North Atlantic net surface shortwave radiation and surface temperature are maximally correlated at 0.90 with zero lag (Fig. 2c); and surface temperature and AMOC have maximum correlation of −0.85, with surface temperature leading by ∼12 years (Fig. 2d). Thus, the subpolar North Atlantic net surface shortwave radiation and surface temperature are temporally in sync with aerosol optical thickness, all three of which lead the AMOC by ∼12 years.

lag with latent (r = 0.79) and sensible (r = 0.94) heat fluxes (not shown). Thus, the Europe-subpolar North Atlantic pressure gradient, as well as the subpolar North Atlantic surface wind and surface density and heat fluxes are temporally in sync and significantly correlated. These responses are similar to, and generally consistent with, North Atlantic Oscillation (NAO)-like variability driving air-sea fluxes (Eden and Jung, 2001). However, correlations between these variables (i.e., SDF, SFWD, and dPSL) and the AMOC all have maximum (and significant) correlations at a 4-5 year lead, ranging from 0.66 to 0.78 (Fig. 2f,h,j).
The 5-year lead correlation where the subpolar North Atlantic surface density flux leads the AMOC is likely related to signal propagation via Kelvin waves/boundary currents, which impact the AMOC in the lower latitudes (e.g., 28°N) (Kawase, 1987; Huang et al., 2000; Johnson and Marshall, 2002; Cessi et al., 2004; Zhang, 2010). Figure 3 shows that the Europe-subpolar North Atlantic sea level pressure gradient and the subpolar North Atlantic surface wind and surface density flux are significantly correlated with the net downward surface shortwave radiation and surface temperature, with the latter two variables leading by 6-8 years. For example, the maximum correlation between the subpolar North Atlantic surface temperature and the Europe-subpolar North Atlantic sea level pressure gradient is −0.67 at a 6-year lag (Figure 3a). Similarly, the maximum correlation between the subpolar North Atlantic net downward surface shortwave radiation and the Europe-subpolar North Atlantic sea level pressure gradient is −0.65 at a 6-year lag (Figure 3b). Similar, but somewhat stronger correlations exist between the subpolar North Atlantic surface temperature/net downward surface shortwave radiation and both surface wind and surface density flux.
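The lead-lag correlations and the Fisher z-transform confidence intervals described in the Methods can be sketched with the Python standard library. The function names and the synthetic sine-wave series are assumptions for illustration, and the sketch only handles non-negative lags (the first series leading the second).

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lagged_correlation(x, y, lag):
    """Correlation with x leading y by `lag` time steps (lag >= 0)."""
    if lag < 0:
        raise ValueError("lag must be non-negative in this sketch")
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    return pearson(x, y)


def fisher_confidence_interval(r, n, alpha=0.05):
    """Confidence interval for r via the Fisher z-transform."""
    r_z = math.atanh(r)                               # transform r to z-score
    sigma_z = 1.0 / math.sqrt(n - 3)                  # standard error of z
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # inverse normal CDF
    return math.tanh(r_z - z_crit * sigma_z), math.tanh(r_z + z_crit * sigma_z)


# Synthetic example: the response is the forcing delayed by 3 steps.
forcing = [math.sin(0.3 * t) for t in range(60)]
response = [0.0] * 3 + forcing[:-3]
best_lag = max(range(0, 11), key=lambda k: lagged_correlation(forcing, response, k))
lower, upper = fisher_confidence_interval(r=0.5, n=103)
```

Scanning lags 0 to 10 recovers the imposed 3-step delay. For r = 0.5 and N = 103 the 95% interval is roughly 0.34 to 0.63, and it is asymmetric about r because the bounds are transformed back from z-space.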

Reasons for the delay between changes in the subpolar North Atlantic surface temperature/shortwave radiation and the subsequent pressure gradient/surface wind responses are not clear. We do note, however, that this appears to be a robust relationship. Figure 4 shows CMIP6 ensemble mean 1950-2020 spatial correlation maps between the net downward surface shortwave radiation and sea level pressure, including temporally in sync correlations (i.e., not lagged), and with PSL lagged by 7 years (based on Fig. 3b). In both cases, significant negative correlations occur over Europe (and North America), implying an increase in surface shortwave radiation is associated with a decrease in European sea level pressure (and vice versa). This is consistent with an increase in net surface shortwave radiation and enhanced heating driving a reduction in surface pressure (and vice versa), particularly over the continents where the change in anthropogenic aerosol emissions is largest. Although significant positive correlations occur over the subpolar North Atlantic under both cases, implying that an increase (decrease) in surface shortwave radiation is associated with weakening (strengthening) of the Icelandic Low, this signal is much stronger when PSL is lagged by 7 years. Thus, an increase (decrease) in surface shortwave radiation is associated with weakening (strengthening) of the climatological pressure gradient between the subpolar North Atlantic and Europe (see also Fig. 1).
To summarize these results, the subpolar North Atlantic net downward surface shortwave radiation, surface temperature and aerosol optical thickness lead the Europe-subpolar North Atlantic sea level pressure gradient and the subpolar North Atlantic surface wind, surface density and heat fluxes by 6-8 years (and the AMOC by 12 years); the Europe-subpolar North Atlantic sea level pressure gradient and the North Atlantic surface wind and surface density and heat fluxes lead the AMOC by 4-5 years. Although a correlation analysis does not show causation, this analysis suggests that AMOC multi-decadal variability is initiated by North Atlantic aerosol optical thickness perturbations to net surface shortwave radiation and surface temperature,

Figure 5 shows the 1990-2020 CMIP6 all forcing ensemble mean annual mean spatial trend map, and the corresponding model agreement on the sign of the trend, for the surface density flux and its thermal component. Consistent with Fig. 1, SDF significantly decreases from 1990-2020 in the subpolar North Atlantic, with high (80-100%) model agreement (Fig. 5a,b).
Most of this SDF decrease is driven by the thermal component (Fig. 5c,d). The haline component yields very weak increases (Supplementary Figure 2). Moreover, decomposing the thermal SDF into its respective components shows that latent and sen-  Figures 4-6). However, dissimilarities in magnitude exist, suggesting these responses are only partially externally forced.
Using Monin-Obukhov similarity theory (Monin and Obukhov, 1954), latent and sensible heat fluxes can be further decomposed into wind, moisture and temperature components (Methods Section). Supplementary Figures 7-8 show the importance of wind changes to latent and sensible heat fluxes, and in turn, the thermal component of the SDF. Thus, our results suggest that strengthening (weakening) of the AMOC from ∼1950-1990 (1990-2020) is due to strengthening (weakening) of the surface winds in the subpolar North Atlantic (consistent with the altered sea level pressure gradient), which in turn leads to increases (decreases) in surface density flux through increases (decreases) in surface latent and sensible heat fluxes.

CMIP6 all forcing simulations show that multi-decadal variability of the subpolar North Atlantic net surface shortwave radiation and aerosol optical thickness leads the AMOC, as well as the atmospheric circulation (e.g., dPSL and SFWD) and SDF (Figs. 2-3). This suggests changes in anthropogenic aerosols are important drivers of North Atlantic atmospheric circulation and AMOC multi-decadal variability. Beginning near the middle of the 20th century and lasting for several decades, global anthropogenic and chemically reactive gas emissions grew quickly, particularly from North America and Europe (Hoesly et al., 2018). In the later parts of the 20th century, while emissions from Asia continued to grow, European and North American sulfate emissions declined as a result of emission control policies. Supplementary Figures 9-10 show a consistent evolution of North Atlantic SW, AOT and anthropogenic aerosol effective radiative forcing (ERF; Methods Section). This includes relatively rapid increases in AOT and corresponding decreases in SW and ERF beginning in ∼1940 and lasting until ∼1980, and opposite changes afterwards (i.e., about 10 years prior to the AMOC responses; Fig. 2a-b), particularly over Europe.
AMOC in CMIP6 AA simulations is similar to that in the corresponding all forcing simulations, in particular the strengthening from ∼1950 to 1990, and weakening afterwards. 88% (100%) of the models yield a positive (negative) AMOC trend from 1950-1990 (1990-2020). The 1950-1990 (1990-2020) ensemble mean strengthening (weakening) represents an 8.8±2.3 (−7.1±1.6) percent change (Supplementary Figure 11). Figure 6 also shows that from ∼1950-2020, surface density and heat fluxes, as well as the sea level pressure gradient, storm track activity, and surface wind follow a similar evolution as in the CMIP6 all forcing simulations. CMIP6 AA experiments also exhibit similar lead-lag relationships as in the CMIP6 all forcing simulations (not shown).

CMIP6 Anthropogenic Aerosol Simulations
We note that fewer CMIP6 AA (as compared to all forcing) models are available. Similar CMIP6 all forcing results as

The close correspondence between the CMIP6 AA and all forcing ensemble mean AMOC time series since ∼1950 again suggests anthropogenic aerosols are driving much of the response. This is further supported by looking at the CMIP6 greenhouse gas (GHG) and natural forcing ensemble mean AMOC time series. The CMIP6 GHG ensemble mean annual mean AMOC shows long-term weakening, whereas natural forcing yields negligible long-term change (Supplementary Figures 13-14). Over 1990-2020, the CMIP6 GHG AMOC weakening represents a −6.7±0.8 percent change, which is comparable to the AMOC weakening under CMIP6 AA (−7.1±1.6; Supplementary Figure 11). Thus, ∼1950-1990 AMOC strengthening in CMIP6 all forcing simulations is largely controlled by anthropogenic aerosols; from 1990-2020, both anthropogenic aerosols and GHGs contribute to AMOC weakening.

Figure 7 shows the 1990-2020 CMIP6 AA ensemble mean annual mean trends and the model agreement on the sign of the trend for the surface density flux and its thermal component, as well as the atmospheric variables (e.g., SFWD). Responses are again very similar to the corresponding CMIP6 all forcing simulations, further supporting the importance of anthropogenic aerosols. The CMIP6 AA ensemble mean shows a decrease in SDF that is largely driven by TSDF (Fig. 7a-d), weakening of the Europe-subpolar North Atlantic pressure gradient (Fig. 7e,f), a corresponding decrease in the subpolar North Atlantic surface wind (Fig. 7g-h), and a decrease in the subpolar North Atlantic storm track activity (Fig. 7i-j). Also consistent with CMIP6 all forcing simulations are near opposite changes in these variables from 1950-1990 (Supplementary Figure 15; see also Supplementary Figures 4-5).
And furthermore, Supplementary Figures 7-8 show the importance of wind changes to latent and sensible heat fluxes, and in turn, the thermal component of the SDF in CMIP6 AA simulations. The AMOC strengthening in response to increasing anthropogenic aerosol forcing is consistent with prior studies (Delworth and Dixon, 2006; Cai et al., 2006, 2007; Cowan and Cai, 2013; Collier et al., 2013; Menary et al., 2013; Cheng et al., 2013). However, unlike Menary et al. (2013), who used the HadGEM2-ES model, we do not find strong evidence that increased salinification is the dominant driving factor.
Models will continue to have uncertainties, including those relevant to the AMOC and North Atlantic climate. These include biases in the mean state, as well as their representation of the strength and depth of the AMOC (e.g., Supplementary Table 1) and ocean freshwater transport (Rahmstorf, 1996; Drijfhout et al., 2011; Danabasoglu et al., 2014; Kostov et al., 2014; Danabasoglu et al., 2016). For example, in many CMIP3/5 models, the AMOC imports freshwater into the Atlantic, in opposition to observations, likely resulting in an artificially stable AMOC (Liu et al., 2017). Models also lack realistic melting of the Greenland ice sheet and the corresponding freshening of the North Atlantic (Bakker et al., 2016).
The CMIP6 AMOC response may be too sensitive to anthropogenic aerosol forcing (e.g., Zhang et al., 2013) and CMIP6 models may also overestimate aerosol indirect effects (e.g., Toll et al., 2019). However, anthropogenic aerosol ERF estimates are consistent between CMIP6 and recent observational estimates, with 90% confidence intervals of −1.5 to −0.6 and −2.0 to −0.4 W m$^{-2}$, respectively (Bellouin et al., 2020; Smith et al., 2020). It is also notable that the aerosol ERF in CMIP5 models, with a 90% confidence interval of −1.8 to −0.2 W m$^{-2}$ (Allen, 2015), is similar to that (but with a larger range) in CMIP6 models. The mean and standard deviation of the anthropogenic aerosol ERF in 12 CMIP6 models (Supplementary Table 2 (Allen, 2015). In contrast, Menary et al. (2020) argue the larger 1850-1985 AMOC strengthening in CMIP6 models, relative to CMIP5, is due to stronger anthropogenic aerosol forcing in CMIP6. There, they show a robust relationship between AMOC strength and a proxy for aerosol forcing, namely the interhemispheric difference of net top-of-the-atmosphere shortwave radiation.
There is some evidence that the magnitude of the AMOC trends in CMIP6 models is related to a model's anthropogenic aerosol ERF (particularly over Europe), which again supports the importance of changes in European aerosols. The correlation (over model means and using the 12 models with aerosol ERF; Supplementary Table 2) between the global mean aerosol ERF and AMOC trend yields the expected negative (positive) correlation from 1950-1990 (1990-2020), implying models with a larger global mean aerosol ERF yield larger AMOC strengthening (weakening). However, these correlations are not significant at the 95% confidence level, at −0.29 for 1950-1990 and 0.11 for 1990-2020. Somewhat larger, but still non-significant, correlations between European aerosol ERF and AMOC trends exist at −0.38 for 1950-1990 and 0.26 for 1990-2020. Ideally, the transient aerosol ERF should be used for this calculation, but this quantity is only available for 3 models. Similar conclusions are also obtained if we divide the CMIP6 models into two groups, one with a larger (absolute value) global mean anthropogenic aerosol ERF ($\mathrm{ERF}_{HI}$; 7 model mean aerosol ERF of −1.17 W m$^{-2}$), and the other with a smaller global mean aerosol ERF ($\mathrm{ERF}_{LO}$; 5 model mean aerosol ERF of −0.72 W m$^{-2}$). From 1950-1990, $\mathrm{ERF}_{HI}$ ($\mathrm{ERF}_{LO}$) models yield AMOC strengthening that represents a 7.4±1.4 (4.7±2.1) percent change. From 1990-2020, $\mathrm{ERF}_{HI}$ ($\mathrm{ERF}_{LO}$) models yield AMOC weakening that represents a −14.6±1.6 (−11.3±2.6) percent change (Supplementary Table 2).
Although these CMIP6 inferred AMOC trends are comparable to the actual CMIP6 AMOC trends, there are also notable differences. The CMIP6 all forcing ensemble mean 1950-1990 inferred AMOC trend is weaker than the actual CMIP6 AMOC trend (25% weaker, 0.03 versus 0.04 Sv year$^{-1}$). And moreover, there is less model agreement for the CMIP6 1950-1990 inferred AMOC strengthening, as compared to the actual AMOC (62 versus 83%, respectively). CMIP6 1990-2020 inferred and actual AMOC trends are both significant and similar in magnitude (−0.07 versus −0.08 Sv year$^{-1}$, respectively), as is the model agreement (92% for both).
Thus, CMIP6 and observations both suggest AMOC weakening after 1990. However, disagreement exists for 1950-1990, where inferred AMOC observations show significant weakening, but CMIP6 shows significant strengthening. Moreover, disagreement exists between the CMIP6 1950-1990 actual and inferred AMOC trend, with the inferred AMOC yielding weaker and less robust strengthening. These discrepancies warrant further clarification, but they suggest that the 1950-1990 inferred AMOC in observations may yield excessive weakening (relative to the actual AMOC). A recent study suggests that the North Atlantic cooling is not only related to a weaker AMOC, but also northward heat transport. So, inferred AMOC estimates from sea surface temperature are prone to error, and they are not solely a measure of the AMOC (Keil et al., 2020). We do note, however, that multiple proxy observations support AMOC weakening during 1950-1990 (Chen and Tung, 2018). In addition to these AMOC differences, the CMIP6 multi-model mean also underestimates the magnitude of observed increase in North Atlantic upper ocean heat content (Fig. 8e).
The inferred AMOC weakening from 1950-1990 (and even from 1930-1990) may have a significant contribution from internal (i.e., unforced) climate variability. Figure 9a shows CMIP6 AMOC trends for each individual model realization for four time periods, 1950-1990, 1990-2020, 1930-2020, as well as 1930-1990. Also included are the corresponding inferred AMOC trends based on surface temperature observations. Some individual model realizations are able to reproduce the inferred AMOC trends, including the 1950-1990 weakening, as well as weakening over the longer 1930-1990 time period. 8.7% (8 of 92) and 13% (12 of 92) of the model realizations yield 1950-1990 and 1930-1990 AMOC weakening that falls within the observational uncertainty (which includes 5 and 12 different models, respectively). For the inferred AMOC strengthening from 1990-2020, 41.3% (38 of 92) of the model realizations are within the observational uncertainty (which includes 13 models).

There are 5 realizations from two different CMIP6 models (CanESM5 and IPSL-CM6A-LR) that yield AMOC trends that fall within the observational uncertainty for all four time periods. Figure 9b shows that the corresponding ensemble mean AMOC for these 5 realizations better resembles the inferred AMOC evolution since 1900, including strengthening during the first few decades, followed by a prolonged weakening, a relatively brief strengthening, and then subsequent weakening. Furthermore, these 5 realizations also better simulate the increase in North Atlantic upper ocean heat content (Fig. 9c). Differences remain, however, including a ∼decade delay in the initial AMOC weakening (inferred weakening begins in the 1930s but these models show weakening commences in the 1940s), as well as an earlier (and brief) strengthening during the late-20th century (inferred strengthening begins in the 1990s but these models show strengthening commences in the 1980s). We note that both of these models underestimate the climatological AMOC strength relative to RAPID observations (17.5 Sv with a standard deviation of 1.4 Sv, versus 11.6 Sv for IPSL-CM6A-LR and 13.1 Sv for CanESM5; Supplementary Table 1). Although no significant AMOC differences were found between the $\mathrm{ERF}_{HI}$ and $\mathrm{ERF}_{LO}$ subsets, it is interesting to note that IPSL-CM6A-LR and CanESM5 have 2 of the lowest 5 CMIP6 anthropogenic aerosol ERFs (Supplementary Table 2). It is also possible that these two models stand out because they have a relatively large number of realizations (11 and 10, respectively; Supplementary Figure 1), which simply increases the chances of a simulated AMOC evolution comparable to that observed.

CMIP6 models yield consistent multi-decadal AMOC variability, including strengthening from ∼1950-1990, followed by weakening from 1990-2020. These AMOC variations are related to robust changes in the North Atlantic atmospheric circulation, involving sea level pressure, storm track activity, surface winds and latent and sensible heat fluxes. Anthropogenic aerosol forcing alone reproduces the bulk of these responses. Moreover, reanalyses and observations yield similar patterns of the North Atlantic atmospheric circulation response, suggesting part of this signal is externally forced. However, other aspects of the CMIP6 AMOC response are at odds with observations. This includes the inferred ∼1950-1990 weakening of the AMOC based on surface temperature observations (e.g., Rahmstorf et al., 2015), when the CMIP6 multi-model mean yields strengthening.
Moreover, the CMIP6 multi-model mean underestimates the observed increase in North Atlantic ocean heat content since ∼1955. Some of these discrepancies could be due to model shortcomings, such as excessive anthropogenic aerosol forcing (Menary et al., 2020). A handful of CMIP6 realizations (5 of 92) yield AMOC evolution since 1900 similar to the indirect observations, implying the inferred AMOC weakening from 1950-1990 (and even from 1930-1990)

around the North Atlantic, will likely continue to rapidly decline over the next few decades. Our results suggest that the continued decrease in anthropogenic aerosol emissions that accompanies efforts to reduce air pollution will reinforce GHG-induced AMOC weakening over the next few decades, with the caveat that internal AMOC variability will also be important.
Code and data availability. The data and code that support the findings of this study are available from the corresponding author upon reasonable request. CMIP6 data can be downloaded from the Earth System Grid Federation at https://esgf-node.llnl.gov/search/cmip6/, or by using the acccmip6 package available at https://github.com/TaufiqHassan/acccmip6. MERRA2 data can be accessed at https://gmao.gsfc.
Author contributions. R.J.A. conceived the project, designed the study, performed analyses and wrote the paper. T. H. performed data analysis and wrote the paper. W. L. and C. R. advised on methods. All authors discussed results and contributed to the writing of the manuscript.