Present and future aerosol impacts on Arctic climate change in the GISS-E2.1 Earth system model

Abstract. The Arctic is warming 2 to 3 times faster than the global average, partly due to changes in short-lived climate forcers (SLCFs) including aerosols. To study the effects of atmospheric aerosols on this warming, recent past (1990–2014) and future (2015–2050) simulations have been carried out with the GISS-E2.1 Earth system model to quantify the aerosol burdens and their radiative and climate impacts over the Arctic (> 60° N), using anthropogenic emissions from the Eclipse V6b and the Coupled Model Intercomparison Project Phase 6 (CMIP6) databases, while global annual mean greenhouse gas concentrations were prescribed and kept fixed in all simulations.
The simulations underestimated observed surface aerosol levels, in particular black carbon (BC) and sulfate (SO₄²⁻), by more than 50 %, with the smallest biases calculated for the atmosphere-only simulations in which winds are nudged to reanalysis data. CMIP6 simulations performed slightly better than the Eclipse simulations in reproducing the observed surface aerosol concentrations and climate parameters. In addition, simulations in which atmosphere and ocean are fully coupled had slightly smaller biases in aerosol levels than atmosphere-only simulations without nudging.
Arctic BC, organic aerosol (OA), and SO₄²⁻ burdens decrease significantly in all simulations, by 10 %–60 %, following reductions of 7 %–78 % in emission projections, with the Eclipse ensemble showing larger reductions in Arctic aerosol burdens than the CMIP6 ensemble. For the 2030–2050 period, the Eclipse ensemble simulated a radiative forcing due to aerosol–radiation interactions (RFARI) of −0.39 ± 0.01 W m⁻², which is −0.08 W m⁻² larger than the 1990–2010 mean forcing (−0.32 W m⁻²), of which −0.24 ± 0.01 W m⁻² was attributed to the anthropogenic aerosols. The CMIP6 ensemble simulated a RFARI of −0.35 to −0.40 W m⁻² for the same period, which is −0.01 to −0.06 W m⁻² larger than the 1990–2010 mean forcing of −0.35 W m⁻². The scenarios with little to no mitigation (worst-case scenarios) led to very small changes in the RFARI, while scenarios with medium to large emission mitigations led to increases in the negative RFARI, mainly due to the decrease in the positive BC forcing and the decrease in the negative SO₄²⁻ forcing. The anthropogenic aerosols accounted for −0.24 to −0.26 W m⁻² of the net RFARI in the 2030–2050 period in the Eclipse and CMIP6 ensembles, respectively. Finally, all simulations showed an increase in Arctic surface air temperatures throughout the simulation period. By 2050, surface air temperatures are projected to increase by 2.4 to 2.6 °C in the Eclipse ensemble and by 1.9 to 2.6 °C in the CMIP6 ensemble, compared to the 1990–2010 mean. Overall, the results show that even the scenarios with the largest emission reductions lead to impacts on future Arctic surface air temperatures and sea-ice extent similar to those of scenarios with smaller emission reductions, implying that reductions of greenhouse gas emissions are still necessary to mitigate climate change.
Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
The Arctic is warming 2 to 3 times faster than the global average (IPCC, 2013; Lenssen et al., 2019). This is partly due to internal Arctic feedback mechanisms, such as the snow and sea-ice-albedo feedback, where melting ice leads to increased absorption of solar radiation, which further enhances warming in the Arctic (Serreze and Francis, 2006). However, Arctic temperatures are also affected by interactions with warming at lower latitudes (e.g., Stuecker et al., 2018; Graversen and Langen, 2019; Semmler et al., 2020) and by local in situ response to radiative forcing due to changes in greenhouse gases and aerosols in the area (Shindell, 2007; Stuecker et al., 2018). In addition to warming induced by increases in global atmospheric carbon dioxide (CO₂) concentrations, changes in short-lived climate forcers (SLCFs) such as tropospheric ozone (O₃), methane (CH₄), and aerosols (e.g., black carbon (BC) and sulfate (SO₄²⁻)) in the Northern Hemisphere (NH) have contributed substantially to the Arctic warming since 1890 (Shindell and Faluvegi, 2009; Ren et al., 2020). This contribution from SLCFs to Arctic heating, together with efficient local amplification mechanisms, puts a high priority on understanding the sources and sinks of SLCFs at high latitudes and their corresponding climatic effects.
SLCFs include all atmospheric species that have short residence times in the atmosphere relative to long-lived greenhouse gases and have the potential to affect Earth's radiative energy budget. Aerosols are important SLCFs and are a predominant component of air quality that affects human health (Burnett et al., 2018; Lelieveld et al., 2019). They mostly affect climate by altering the amount of solar energy absorbed by Earth, as well as by changing cloud properties and thereby indirectly affecting the scattering of radiation, and they are efficiently removed from the troposphere within several days to weeks. BC, which is a product of incomplete combustion and open biomass/biofuel burning (Bond et al., 2013, 2014), absorbs a high proportion of incident solar radiation and therefore warms the climate system (Jacobson, 2001). SO₄²⁻, which is formed primarily through oxidation of sulfur dioxide (SO₂), absorbs negligible solar radiation and cools the climate by scattering solar radiation back to space. Organic carbon (OC), which is co-emitted with BC during combustion, both scatters and absorbs solar radiation and therefore causes cooling in some environments and warming in others. Highly reflective regions such as the Arctic are more likely to experience warming effects from these organic aerosols (e.g., Myhre et al., 2013).
Aerosols also influence climate via indirect mechanisms. After being deposited on snow and ice surfaces, BC can amplify ice melt by lowering the albedo and increasing solar heating of the surface (AMAP, 2015). Aerosols also affect cloud properties, including their droplet size, lifetime, and vertical extent, thereby influencing both the shortwave cooling and longwave warming effects of clouds. Globally, this indirect cloud forcing from aerosols is likely larger than their direct forcing, although the indirect effects are more uncertain and difficult to accurately quantify (IPCC, 2013). Moreover, Arctic cloud impacts are distinct from global impacts, owing to the extreme seasonality of solar radiation in the Arctic, unique characteristics of Arctic clouds (e.g., high frequency of mixed-phase occurrence), and rapidly evolving sea-ice distributions. Together, they lead to complicated and unique phenomena that govern Arctic aerosol abundances and climate impacts (e.g., Willis et al., 2018; Abbatt et al., 2019). The changes taking place in the Arctic have consequences for how SLCFs affect the region. For example, reductions in sea-ice extent, thawing of permafrost, and humidification of the Arctic troposphere can affect the emissions, lifetime, and radiative forcing of SLCFs within the Arctic (J. L. Thomas et al., 2019).
The effect of aerosols on the Arctic climate through scattering and absorption of radiation, clouds, and surface ice/snow albedo has been investigated in previous studies (e.g., Clarke and Noone, 1985; Flanner et al., 2007; Shindell et al., 2012; Bond et al., 2013; Dumont et al., 2014). The impact of aerosols on Arctic climate change is mainly driven by a response to remote forcings (Gagné et al., 2015; Sand et al., 2015; Westervelt et al., 2015). Long-range transport is known to play an important role in Arctic air pollution levels, and in the past much of the attention on aerosol climatic effects in the Arctic was focused on long-range-transported anthropogenic pollution (Arctic haze) (Quinn et al., 2007; AMAP, 2015; Abbatt et al., 2019). BC and SO₄²⁻ transported over long ranges, in particular from Asia, travel to the Arctic at a relatively high altitude and can be deposited on the snow and ice, contributing to surface albedo reduction. On the other hand, there has been increasing attention on local Arctic aerosol sources, in particular natural aerosol sources (Schmale et al., 2021). Lewinschal et al. (2019) estimated an Arctic surface temperature change per unit global sulfur emission of −0.020 to −0.025 K Tg⁻¹ S yr⁻¹. Sand et al. (2020) calculated an Arctic surface air temperature response of 0.06–0.1 K Tg⁻¹ BC yr⁻¹ to BC emissions in Europe and North America and a slightly lower response of 0.05–0.08 K Tg⁻¹ BC yr⁻¹ to Asian emissions. Breider et al. (2017) reported a shortwave (SW) aerosol radiative forcing (ARF) of −0.19 ± 0.05 W m⁻² at the top of the atmosphere (TOA) over the Arctic, which reflects the balance between sulfate cooling (−0.60 W m⁻²) and BC warming (+0.44 W m⁻²). Schacht et al. (2019) calculated a direct radiative forcing of up to 0.4 W m⁻² over the Arctic using the ECHAM6.3-HAM2.3 global aerosol-climate model. Markowicz et al. (2021), using the Navy Aerosol Analysis and Prediction System (NAAPS) radiative transfer model, calculated a total aerosol forcing over the Arctic (> 70.5° N) of −0.4 W m⁻². Ren et al. (2020) simulated 0.11 and 0.25 W m⁻² direct and indirect warming in 2014–2018 compared to 1980–1984 due to reductions in sulfate, using the CAM5-EAST global aerosol-climate model. They also reported that the aerosols produced an Arctic surface warming of +0.30 °C during 1980–2018, explaining about 20 % of the Arctic warming observed during the last four decades, while according to Shindell and Faluvegi (2009), aerosols contributed 1.09 ± 0.81 °C to the Arctic surface air temperature increase of 1.48 ± 0.28 °C observed in 1976–2007. AMAP (2015), based on four Earth system models (ESMs), estimated a total Arctic surface air temperature response due to the direct effect of current global combustion-derived BC, OC, and sulfur emissions of +0.35 °C, of which +0.40 °C was attributed to BC in the atmosphere, +0.22 °C to BC in snow, −0.04 °C to OC, and −0.23 °C to SO₄²⁻. On the other hand, Stjern et al. (2017) and Takemura and Suzuki (2019) showed that, due to the rapid adjustments from BC, mitigation of BC emissions can lead to weak responses in surface temperatures. Samset et al. (2018), using a multi-model ensemble of ocean-coupled ESMs in which aerosol emissions were either kept at present-day conditions or anthropogenic emissions of SO₂ and fossil fuel BC and OC were set to zero, showed that Arctic surface warming due to aerosol reductions can reach up to 4 °C in some locations, with a multi-model increase for the 60–90° N region of 2.8 °C. In addition, recent studies suggest that as global emissions of anthropogenic aerosols decrease, natural aerosol feedbacks may become increasingly important for the Arctic climate (Boy et al., 2019; Mahmood et al., 2019).
In this study, we carry out several simulations with the fully coupled NASA Goddard Institute for Space Studies (GISS) Earth system model, GISS-E2.1 (Kelley et al., 2020), to study the recent past and future burdens of aerosols as well as their impacts on TOA radiative forcing and climate-relevant parameters such as surface air temperatures, sea ice, and snow over the Arctic (> 60° N). In addition, we investigate the impacts of two different emission inventories, Eclipse V6b (Höglund-Isaksson et al., 2020) vs. CMIP6 (Hoesly et al., 2018; van Marle et al., 2017; Feng et al., 2020), as well as differences between atmosphere-only and fully coupled simulations, on the evaluation of the model and the climate impact. Section 2 introduces the GISS-E2.1 model, the anthropogenic emissions, and the observation datasets used in model evaluation. Section 3 presents results from the model evaluation as well as recent past and future trends in simulated aerosol burdens, radiative forcing, and climate change over the Arctic. Section 4 summarizes the overall findings and the conclusions.

Materials and methods

Model description
GISS-E2.1 is the CMIP6 version of the GISS modelE Earth system model, which has been validated extensively over the globe (Kelley et al., 2020; Bauer et al., 2020) as well as regionally for air pollutants (Turnock et al., 2020). A full description of GISS-E2.1 and an evaluation of its coupled climatology during the satellite era are given in Kelley et al. (2020), and the recent past ensemble simulation of the atmosphere and ocean component models (1850–2014) is described in Miller et al. (2020). GISS-E2.1 has a horizontal resolution of 2° in latitude by 2.5° in longitude and 40 vertical layers extending from the surface to 0.1 hPa in the lower mesosphere. The tropospheric chemistry scheme used in GISS-E2.1 includes inorganic chemistry of Oₓ, NOₓ, HOₓ, and CO and organic chemistry of CH₄ and higher hydrocarbons using the CBM4 scheme (Gery et al., 1989), and the stratospheric chemistry scheme includes chlorine and bromine chemistry together with polar stratospheric clouds.
In the present work, we used the one-moment aerosol scheme (OMA: Bauer et al., 2020, and references therein), which is a mass-based scheme in which aerosols are assumed to remain externally mixed. All aerosols have a prescribed and constant size distribution, with the exception of sea salt, which has two distinct size classes, and dust, which is described by a sectional model with an option of four to six bins. The default dust configuration that is used in this work includes five bins, one clay and four silt ones, from submicron to 16 µm in size. The first three dust size bins can be coated by sulfate and nitrate aerosols (Bauer and Koch, 2005). The scheme treats sulfate, nitrate, ammonium, carbonaceous aerosols (black carbon and organic carbon, including the NOₓ-dependent formation of secondary organic aerosol (SOA) and methanesulfonic acid formation), dust, and sea salt. SOA is calculated from terpenes and other reactive volatile organic compounds (VOCs) using NOₓ-dependent calculations of the two-product model, as described in Tsigaridis and Kanakidou (2007). Isoprene is explicitly used as a source, while terpenes and other reactive VOCs are lumped on α-pinene, taking into account their different reactivity against oxidation. The semi-volatile compounds formed can condense on all submicron particles except sea salt and dust. In the model, an OA-to-OC ratio of 1.4 is used. OMA only includes the first indirect effect, in which the aerosol number concentration that impacts clouds is obtained from the aerosol mass as described in Menon and Rotstayn (2006). The parameterization described by Menon and Rotstayn (2006) that we use only affects the cloud droplet number concentration (CDNC) and not the cloud droplet size, which is not explicitly calculated in GISS-E2.1. Following the change in CDNC, we do not stop the model from changing either liquid water path (LWP) or precipitation rates, since the cloud code sees the different CDNC and responds accordingly. What we do not include is the second indirect effect (autoconversion). In addition to OMA, we have also conducted a non-interactive tracer (NINT: Kelley et al., 2020) simulation from 1850 to 2014, with non-interactive (though monthly varying) fields of radiatively active components (ozone and multiple aerosol species) read in from previously calculated offline fields from the OMA version of the model, run using the Atmospheric Model Intercomparison Project (AMIP) configuration in Bauer et al. (2020), as described in Kelley et al. (2020). The NINT model includes a tuned aerosol first indirect effect following Hansen et al. (2005).
The natural emissions of sea salt, dimethylsulfide (DMS), isoprene, and dust are calculated interactively. Anthropogenic dust sources are not represented in GISS-E2.1. Dust emissions vary spatially and temporally only with the evolution of climate variables such as wind speed and soil moisture (Miller et al., 2006). Dust concentrations are tuned to match the observed dust aerosol optical depth (AOD). The AMIP-type simulations (see Sect. 2.3) use prescribed sea surface temperature (SST) and sea-ice fraction during the recent past (Rayner et al., 2003). The prescribed SST dataset in GISS-E2.1 is the merged product based on the HadISST and NOAA optimum interpolation (OI) SST V2 (Reynolds et al., 2002).

Emissions
In this study, we have used two different emission datasets: the Eclipse V6b (Höglund-Isaksson et al., 2020), which has been developed with support of the EU-funded Action on Black Carbon in the Arctic (EUA-BCA) and used in the framework of the ongoing AMAP assessment (AMAP, 2021), referred to as Eclipse in this paper; and the CEDS emissions (Hoesly et al., 2018; Feng et al., 2020) combined with selected Shared Socioeconomic Pathway (SSP) scenarios used in the CMIP6 future projections (Eyring et al., 2016), collectively referred to as CMIP6 in this paper.

Eclipse V6b emissions
The Eclipse V6b emissions dataset is a further evolution of the scenarios established in the EU-funded Eclipse project (Stohl et al., 2015; Klimont et al., 2017). It has been developed with the global implementation of the GAINS (Greenhouse Gas and Air Pollution Interactions and Synergies) model (Amann et al., 2011). The GAINS model includes all key air pollutants and Kyoto greenhouse gases, where emissions are estimated for nearly 200 country regions and several hundred source sectors representing anthropogenic emissions. For this work, annual emissions were spatially distributed on 0.5° × 0.5° longitude–latitude grids for the following sectors: energy, industry, solvent use, transport, residential combustion, agriculture, open burning of agricultural waste, waste treatment, gas flaring and venting, and international shipping. A monthly pattern for each gridded layer was provided at a 0.5° × 0.5° grid level. The Eclipse V6b dataset used in this study includes an estimate for 1990 to 2015 using statistical data and two scenarios extending to 2050 that rely on the same energy projections from the World Energy Outlook 2018 (IEA, 2018) but have different assumptions about the implementation of air pollution reduction technologies, as described below.
The current legislation (CLE) scenario assumes efficient implementation of the current air pollution legislation committed before 2018, while the maximum feasible reduction (MFR) scenario assumes implementation of the best available emission reduction technologies included in the GAINS model. The MFR scenario demonstrates an additional reduction potential of SO₂ emissions of up to 60 % and 40 % by 2030 for Arctic Council member and observer countries, respectively, with implementation of best available technologies mostly in the energy and industrial sectors and to a smaller extent via measures in the residential sector. The Arctic Council member countries' maximum reduction potential could be fully realized by 2030, whereas in the observer countries additional reductions of 15 % to 20 % would remain to be achieved between 2030 and 2050. The assumptions and the details for the CLE and MFR scenarios (as well as other scenarios developed within the Eclipse V6b family) can be found in Höglund-Isaksson et al. (2020) and Klimont et al. (2021).

CMIP6 emissions
The CMIP6 emission datasets include a historical time series generated by the Community Emissions Data System (CEDS) for anthropogenic emissions (Hoesly et al., 2018; Feng et al., 2020), open biomass burning emissions (van Marle et al., 2017), and the future emission scenarios driven by the assumptions embedded in the Shared Socioeconomic Pathways (SSPs) and Representative Concentration Pathways (RCPs) (Riahi et al., 2017) that include specific air pollution storylines. Gridded CMIP6 emissions are aggregated to nine sectors: agriculture, energy, industrial, transportation, residential-commercial-other, solvents, waste, international shipping, and aircraft. SSP data for future emissions from integrated assessment models (IAMs) are first harmonized to a common 2015 base-year value by the native model per region and sector. This harmonization process adjusts the native model data to match the 2015 starting year values with a smooth transition forward in time, generally converging to native model results (Gidden et al., 2018). The production of the harmonized future emissions data is described in Gidden et al. (2019).
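The idea of harmonizing a native model series to a base-year value and then converging back to the native results can be illustrated with a minimal sketch. This is not the procedure of Gidden et al. (2018, 2019); it assumes a simple additive offset that fades linearly to zero by a chosen convergence year, and the function name, the convergence year, and all numbers are hypothetical.

```python
import numpy as np

def harmonize_offset(years, native, base_year, base_value, converge_year):
    """Shift a native-model emission series so that it matches an inventory
    value in the base year, fading the shift linearly to zero by converge_year.
    Illustrative only; the offset method and its parameters are assumptions."""
    years = np.asarray(years, dtype=float)
    native = np.asarray(native, dtype=float)
    # Mismatch between the inventory and the native model in the base year.
    offset = base_value - np.interp(base_year, years, native)
    # Weight is 1 in the base year and decays linearly to 0 at converge_year.
    weight = np.clip((converge_year - years) / (converge_year - base_year), 0.0, 1.0)
    return native + offset * weight

years = [2015, 2020, 2030, 2040, 2050]
native = [10.0, 9.0, 7.0, 5.0, 4.0]  # hypothetical native-model emissions
harmonized = harmonize_offset(years, native, 2015, 12.0, 2050)
```

With these inputs the harmonized series starts at the inventory value in 2015 and equals the native series again in 2050, with a gradually shrinking adjustment in between.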

Implementation of the emissions in the GISS-E2.1
The Eclipse V6b and CEDS emissions on 0.5° × 0.5° spatial resolution are regridded to 2° × 2.5° resolution in order to be used in the various GISS-E2.1 simulations. In the GISS-E2.1 Eclipse simulations, the non-methane volatile organic compound (NMVOC) emissions are chemically speciated assuming the SSP2-4.5 VOC composition profiles. In the Eclipse simulations, biomass burning emissions are taken from the CMIP6 emissions, which have been pre-processed to include the agricultural waste burning emissions from the Eclipse V6b dataset, while the rest of the biomass burning emissions are taken as the original CMIP6 biomass burning emissions. In addition to the biomass burning emissions, the aircraft emissions are also taken from the CMIP6 database to be used in the Eclipse simulations. As seen in Fig. 1, the emissions are consistently higher in the CMIP6 than in the Eclipse dataset. The differences between the two datasets are mainly over southeast Asia (not shown). The CMIP6 emissions are also consistently higher on a sectoral basis than the Eclipse emissions. The figure shows that for air pollutant emissions, the CMIP6 SSP1-2.6 scenario and the Eclipse MFR scenario follow each other closely, while the Eclipse CLE scenario is comparable with the CMIP6 SSP2-4.5 scenario for most pollutants; this is to some extent because the CO₂ trajectories of the Eclipse CLE and the SSP2-4.5 scenarios are very similar (not shown). A more detailed discussion of differences between historical Eclipse and CMIP6 as well as CMIP6 scenarios is provided in Klimont et al. (2021).
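The regridding step from 0.5° × 0.5° to 2° × 2.5° can be sketched as a mass-conserving block sum, since each coarse cell then covers 4 × 5 fine cells. This is only an illustration under stated assumptions: the input holds emission totals per cell (not fluxes per unit area), the two grids share cell edges, and polar cells are not treated specially. The tool actually used for GISS-E2.1 is not described in the text, and the function name is hypothetical.

```python
import numpy as np

def regrid_totals(emis_fine, lat_factor=4, lon_factor=5):
    """Aggregate per-cell emission totals from a 0.5 x 0.5 degree grid to a
    2 x 2.5 degree grid by summing blocks of 4 (lat) x 5 (lon) fine cells.
    Assumes aligned cell edges; fluxes per unit area would instead need an
    area-weighted average."""
    nlat, nlon = emis_fine.shape
    if nlat % lat_factor or nlon % lon_factor:
        raise ValueError("fine grid does not divide evenly into coarse cells")
    blocks = emis_fine.reshape(nlat // lat_factor, lat_factor,
                               nlon // lon_factor, lon_factor)
    return blocks.sum(axis=(1, 3))

fine = np.ones((360, 720))    # hypothetical 0.5-degree field of cell totals
coarse = regrid_totals(fine)  # 90 x 144 cells at 2 x 2.5 degrees
```

Summing rather than averaging keeps the global emission total unchanged, which is the property that matters when comparing inventories at different resolutions.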

Simulations
In order to contribute to the AMAP assessment report (AMAP, 2021), the GISS-E2.1 model participated with AMIP-type simulations, which aim to assess the trends of Arctic air pollution and climate change in the recent past, as well as with fully coupled climate simulations. Five fully coupled Earth system models (ESMs) simulated the future (2015–2050) changes in atmospheric composition and climate in the Arctic (> 60° N), as well as over the globe. We have carried out two AMIP-type simulations, one with winds nudged to NCEP (standard AMIP-type simulation in AMAP) and one with freely varying winds, where both simulations used prescribed SSTs and sea ice (Table 1). The nudging extends from the first model layer up to 10 hPa, which is the top of the NCEP input. In the fully coupled simulations, we carried out two sets of simulations, each with three ensemble members, that used the CLE and MFR emission scenarios. Each simulation in these two sets of scenarios was initialized from a set of three fully coupled ensemble recent past simulations (1990–2014) to ensure a smooth continuation from CMIP6 to Eclipse emissions.
In addition to the AMAP simulations, we have also conducted CMIP6-type simulations in order to compare the aerosol burdens and their impacts on radiative forcing and climate with those from the AMAP simulations. We have used the SSP1-2.6, 2-4.5, 3-7.0, and 3-7.0-lowNTCF scenarios, representing different levels of emission mitigation, in the CMIP6 simulations. SSP1 and SSP3 define various combinations of high or low socioeconomic challenges to climate change adaptation and mitigation, while SSP2 describes medium challenges of both kinds and is intended to represent a future in which development trends are not extreme in any of the dimensions but rather follow middle-of-the-road pathways. The SSP1-2.6 scenario aims to achieve a 2100 radiative forcing level of 2.6 W m⁻², keeping the temperature increase below 2 °C compared to preindustrial levels. The SSP2-4.5 scenario describes a middle-of-the-road socioeconomic family with a 4.5 W m⁻² radiative forcing level by 2100. The SSP3-7.0 scenario is a medium-high reference scenario. SSP3-7.0-lowNTCF is a variant of the SSP3-7.0 scenario with reduced near-term climate forcer (NTCF) emissions. The SSP3-7.0 scenario has the highest methane and air pollution precursor emissions, while SSP3-7.0-lowNTCF investigates an alternative pathway for the Aerosols and Chemistry Model Intercomparison Project (AerChemMIP: Collins et al., 2017), exhibiting very low methane, aerosol, and tropospheric-ozone precursor emissions, approximately in line with SSP1-2.6. As seen in Table 1, we have conducted one transient fully coupled simulation from 1850 to 2014 and a number of future scenarios.
We have employed prescribed global and annual mean greenhouse gas (CO₂ and CH₄) concentrations, where a linear increase in global mean temperature of 0.2 °C per decade from 2019 to 2050 was assumed, which is approximately in line with the simulated warming rates for the SSP2-4.5 scenario (AMAP, 2021).
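The simulations are analysed over the Arctic (> 60° N). How the regional means are computed is not stated here, so the following is only a sketch of one common choice: an area-weighted mean on a regular latitude-longitude grid with weights proportional to the cosine of the cell-centre latitude. The function name, the grid coordinates, and the field values are hypothetical.

```python
import numpy as np

def arctic_mean(field, lats, lat_min=60.0):
    """Area-weighted mean of a (lat, lon) field over cells whose centre lies
    poleward of lat_min, using cos(latitude) weights (regular grid assumed)."""
    lats = np.asarray(lats, dtype=float)
    field = np.asarray(field, dtype=float)
    mask = lats > lat_min
    weights = np.cos(np.deg2rad(lats[mask]))
    # On a regular grid all cells in a latitude row have the same area,
    # so the row mean can be taken first and weighted afterwards.
    zonal_mean = field[mask, :].mean(axis=1)
    return float(np.sum(zonal_mean * weights) / np.sum(weights))

lats = np.arange(-89.0, 90.0, 2.0)        # cell centres of a 2-degree grid
lons = np.arange(-178.75, 180.0, 2.5)     # cell centres of a 2.5-degree grid
field = np.full((lats.size, lons.size), 1.5)  # hypothetical uniform field
mean_value = arctic_mean(field, lats)
```

Without the cosine weights, the many small cells near the pole would dominate the average even though they cover little area.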

Observations
The GISS-E2.1 ensemble has been evaluated against surface observations of BC, organic aerosols (sum of OC and secondary organic aerosols (SOA), referred to as OA in the rest of the paper), and SO₄²⁻; ground-based and satellite-derived AOD at 550 nm; and surface and satellite observations of surface air temperature, precipitation, sea surface temperature, sea-ice extent, cloud fraction, and liquid and ice water content in the 1995–2014 period. The surface monitoring stations used to evaluate the simulated aerosol levels are listed in Tables S1 and S2 in the Supplement.

Aerosols
Measurements of speciated particulate matter (PM; BC, SO₄²⁻, and OA) come from three major networks: the Interagency Monitoring of Protected Visual Environments (IMPROVE) for Alaska (the IMPROVE measurements that are in the Arctic (> 60° N) are all in Alaska), the European Monitoring and Evaluation Programme (EMEP) for Europe, and the Canadian Aerosol Baseline Measurement (CABM) for Canada (Tables S1 and S2). In addition to these monitoring networks, BC, OA, and SO₄²⁻ measurements from individual Arctic stations were used in this study. The individual Arctic stations are Fairbanks and Utqiagvik, Alaska (part of IMPROVE, though their measurements were obtained from their PIs); Gruvebadet and Zeppelin mountain (Ny-Ålesund), Norway; Villum Research Station, Greenland; and Alert, Nunavut (with the latter being an observatory in the Global Atmosphere Watch Programme of the WMO and a part of CABM). The measurement techniques are briefly described in the Supplement.
AOD at 500 nm from the AErosol RObotic NETwork (AERONET, Holben et al., 1998) was interpolated to 550 nm AOD using the Ångström formula (Ångström, 1929). We also used a new merged AOD product developed by Sogacheva et al. (2020) using AOD from 10 different satellite-based products. According to Sogacheva et al. (2020), this merged product could provide a better representation of the temporal and spatial distribution of AOD. However, it is important to note that the monthly aggregates of observations for both AERONET and the satellite products depend on the availability of data and are not likely to be the true aggregate of observations for a whole month when only a few data points exist during the course of a month. In addition, many polar-orbiting satellites take one observation during any given day, typically at the same local time. Nevertheless, these datasets are key observations currently available for evaluating model performance. Information about the uncertain nature of AOD observations can be found in previous studies (e.g., Sayer et al., 2018; Sayer and Knobelspiesse, 2019; Wei et al., 2019; Schutgens et al., 2020; Schutgens, 2020; Sogacheva et al., 2020).
The HadISST dataset (Rayner et al., 2003) was used for evaluating model simulations of sea ice and SSTs. HadISST data are an improved version of their predecessor, known as global sea ice and sea surface temperature (GISST). HadISST data are constructed using information from a variety of data sources such as the Met Office marine database, the International Comprehensive Ocean-Atmosphere Data Set, passive microwave remote sensing retrievals, and sea-ice charts.
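The Ångström conversion of AERONET AOD from 500 to 550 nm mentioned above relies on the power-law relation AOD(λ) ∝ λ^(−α). A minimal sketch follows; the text does not say which wavelength pair was used to derive the Ångström exponent α, so deriving it from 440 and 870 nm is an assumption here, as are the function names and the sample values.

```python
import math

def angstrom_exponent(aod_a, wl_a, aod_b, wl_b):
    """Angstrom exponent alpha from AOD at two wavelengths (same units)."""
    return -math.log(aod_a / aod_b) / math.log(wl_a / wl_b)

def aod_at(aod_ref, wl_ref, wl_target, alpha):
    """Scale AOD from wl_ref to wl_target with AOD proportional to wl**(-alpha)."""
    return aod_ref * (wl_target / wl_ref) ** (-alpha)

# Hypothetical AOD values at 440 and 870 nm, used only to obtain alpha.
alpha = angstrom_exponent(0.12, 440.0, 0.05, 870.0)
# Hypothetical AOD of 0.10 at 500 nm converted to 550 nm.
aod_550 = aod_at(0.10, 500.0, 550.0, alpha)
```

For a positive exponent, which is typical of fine-mode aerosol, the AOD at 550 nm comes out slightly smaller than the value at 500 nm.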

Satellite observations used for cloud fraction and cloud liquid water and ice water
The Advanced Very High Resolution Radiometer (AVHRR-2) sensors aboard the NOAA and EUMETSAT polar-orbiting satellites have been flying since the early 1980s. These data have been instrumental in providing the scientific community with climate data records spanning nearly four decades. Tremendous progress has been made in recent decades in improving, training, and evaluating the cloud property retrievals from these AVHRR sensors. In this study, we use the retrievals of total cloud fraction from the second edition of EUMETSAT's Climate Monitoring Satellite Application Facility (CM SAF) Cloud, Albedo and surface Radiation dataset from AVHRR data (CLARA-A2, Karlsson et al., 2017a). This cloud property climate data record is available for the period 1982-2018. Its strengths and weaknesses, and inter-comparison with the other similar climate data records are documented in Karlsson and Devasthale (2018). Further dataset documentation including algorithm theoretical basis and validation reports can be found in Karlsson et al. (2017b).
Cloud liquid and ice water path estimates derived from the cloud profiling radar aboard CloudSat (Stephens et al., 2002) and constrained with another sensor aboard NASA's A-Train constellation, MODIS-Aqua (Platnick et al., 2015), are used for the model evaluation. These Level 2b retrievals, available through the 2B-CWC-RVOD product (Version 5), for the period 2007–2016 are analyzed. This constrained version is used instead of its radar-only counterpart, as it uses additional information about visible cloud optical depths from MODIS, leading to better estimates of cloud liquid water paths. Because of this constraint the data are available only for the daylit conditions and, hence, are missing over the polar regions during the respective winter seasons. The theoretical basis for these retrievals can be found at http://www.cloudsat.cira.colostate.edu/sites/default/files/products/files/2B-CWC-RVOD_PDICD.P1_R05.rev0_.pdf (last access: 26 October 2020). Being an active cloud radar, CloudSat provides orbital curtains with a swath width of just about 1.4 km. Therefore, the data are gridded at 5° × 5° to avoid too many gaps or patchiness and to provide robust statistics.
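The 5° × 5° gridding of narrow-swath retrievals can be illustrated by a simple bin-averaging routine. This is a sketch under stated assumptions: the source does not describe the gridding algorithm, so plain arithmetic averaging per cell is assumed, and the function name and input layout (latitude, longitude, value tuples) are hypothetical.

```python
import math
from collections import defaultdict


def grid_mean(samples, cell_deg=5.0):
    """Average point samples onto a regular latitude-longitude grid.

    samples: iterable of (lat, lon, value), lat in [-90, 90], lon in degrees.
    Returns a dict mapping (lat_index, lon_index) -> (mean_value, sample_count).
    Missing retrievals encoded as NaN are skipped.
    """
    n_lat = int(round(180.0 / cell_deg))
    n_lon = int(round(360.0 / cell_deg))
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lat, lon, value in samples:
        if math.isnan(value):
            continue
        # Clamp so that lat == 90 falls in the last row instead of overflowing.
        i = min(int((lat + 90.0) // cell_deg), n_lat - 1)
        # Wrap longitudes so that -180..180 and 0..360 inputs both work.
        j = int(((lon + 180.0) % 360.0) // cell_deg) % n_lon
        sums[(i, j)] += value
        counts[(i, j)] += 1
    return {cell: (sums[cell] / counts[cell], counts[cell]) for cell in sums}


# Illustrative liquid water path samples in g m-2 (synthetic values).
cells = grid_mean([
    (62.0, 10.0, 100.0),
    (64.9, 14.9, 200.0),
    (65.0, 10.0, 50.0),
    (62.0, 10.0, float("nan")),
])
```

Returning the sample count alongside the mean makes it possible to mask cells with too few overpasses before computing statistics.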

Evaluation
The simulations are compared against surface measurements of BC, OA, SO₄²⁻, and AOD, as well as surface and satellite measurements of surface air temperature, precipitation, sea surface temperature, sea-ice extent, total cloud fraction, liquid water path, and ice water path described in Sect. 2.4, by calculating the correlation coefficient (r) and normalized mean bias (NMB). OA refers to the sum of primary organic carbon (OC) and secondary organic aerosol (SOA).
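The two evaluation statistics can be sketched as follows. The source names the metrics but does not print their formulas, so the commonly used definitions are assumed here: NMB = 100 · Σ(Mᵢ − Oᵢ) / ΣOᵢ and the Pearson correlation coefficient for r. The function names are illustrative.

```python
import math


def normalized_mean_bias(model, obs):
    """Return the normalized mean bias in percent: 100 * sum(M - O) / sum(O)."""
    if len(model) != len(obs) or not obs:
        raise ValueError("model and obs must be non-empty and of equal length")
    total_obs = sum(obs)
    if total_obs == 0:
        raise ValueError("sum of observations must be non-zero")
    return 100.0 * sum(m - o for m, o in zip(model, obs)) / total_obs


def pearson_r(model, obs):
    """Return the Pearson correlation coefficient between model and obs."""
    n = len(obs)
    if len(model) != n or n < 2:
        raise ValueError("need at least two paired values")
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    if var_m == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_m * var_o)


# Synthetic monthly means: the model has the right seasonal shape at half the amplitude.
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [5.0, 10.0, 15.0, 20.0]
nmb = normalized_mean_bias(simulated, observed)
r = pearson_r(simulated, observed)
```

The synthetic example shows why both metrics are reported together: a simulation can reproduce the seasonal cycle perfectly (r = 1) while still being biased low by half (NMB = −50 %).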

Aerosols
The recent past simulations are evaluated for BC, OA, SO₄²⁻, and AOD (Table 2) against available surface measurements. The monthly observed and simulated time series for each station are accumulated per species in order to obtain a full Arctic time series, which also includes spatial variation, to be used for the evaluation of the model. Results showed an overall underestimation of aerosol species over the Arctic, as discussed below. Surface BC levels are underestimated at all Arctic stations by 15 % to 90 %. Surface OA levels are also underestimated, with biases from −5 % to −70 %, except for a slight overestimation of < 1 % over Kårvatn (B5) and a large overestimation of 90 % over Trapper Creek (B6). Surface SO₄²⁻ concentrations are also consistently underestimated, with biases from −10 % to −70 %, except for Villum Research Station (S11) over northeastern Greenland, where there is an overestimation of 45 %. Finally, AODs are also underestimated over all stations by 20 % to 60 %. Such underestimations at high latitudes have also been reported by many previous studies (e.g., Skeie et al., 2011; Eckhardt et al., 2015; Lund et al., 2017, 2018; Schacht et al., 2019; Turnock et al., 2020), pointing to a variety of reasons including uncertainties in emission inventories, errors in the wet and dry deposition schemes, the absence or underrepresentation of new aerosol formation processes, and the coarse resolution of global models leading to errors in emissions and simulated meteorology, as well as in the representation of point observations in coarse model grid cells. Turnock et al. (2020) evaluated the air pollutant concentrations in the CMIP6 models, including the GISS-E2.1 ESM, and found that observed surface PM2.5 concentrations are consistently underestimated in CMIP6 models by up to 10 µg m⁻³, particularly for the Northern Hemisphere winter months, with the largest model diversity near natural emission source regions and the polar regions.
The BC levels are largely underestimated in simulations by 50 % (CMIP6_Cpl_Hist) to 67 % (Eclipse_AMIP). The CMIP6 simulations have lower bias compared to Eclipse V6b simulations due to higher emissions in the CMIP6 emission inventory (Fig. 1). Within the Eclipse V6b simulations, the lowest bias (−57 %) is calculated for the Eclipse_AMIP_NCEP simulation, while the free climate and coupled simulations showed a larger underestimation (> 62 %), which can be attributed to a better simulation of transport to the Arctic when nudged winds are used. The Eclipse simulations also show that the coupled simulations had slightly smaller biases (NMB = −63 %) compared to the AMIP-type free climate simulation (AMIP-OnlyAtm: NMB = −67 %). The climatological monthly variation of the observed levels is poorly reproduced by the model with r values around 0.3. BC levels are mainly underestimated in winter and spring, which can be attributed to the underestimation of the anthropogenic emissions of BC, while the summer levels are well captured by the majority of the simulations (Fig. 2).
Surface OA concentrations are underestimated by 8 % (Eclipse_AMIP_NCEP) to 35 % (Eclipse_AMIP) in the Eclipse ensemble, while the CMIP6_Cpl_Hist simulation overestimated surface OA by 13 %. The Eclipse simulations suggest that the nudged winds lead to a better representation of transport to the Arctic, while the coupled simulations had smaller biases compared to the AMIP-type free climate simulation (AMIP-OnlyAtm), similar to BC. The climatological monthly variation of the observed concentrations is reasonably simulated, with r values between 0.51 and 0.69 (Table 2 and Fig. 2). As can be seen in Fig. S1, the OA levels are dominated by biogenic SOA, in particular via α-pinene (monoterpenes) oxidation, compared to anthropogenic (by a factor of 4–9) and biomass burning (by a factor of 2–3) OA. While OC and BC are emitted from largely similar sources, this biogenic-dominated OA seasonality also explains why the simulated BC seasonality is not as well captured, suggesting underestimations in the anthropogenic emissions of these species, in particular during the winter. It should also be noted that GISS-E2.1 does not include marine VOC emissions except for DMS, while these missing VOCs, such as isoprene and monoterpenes, are suggested to be important sources for the summertime aerosol levels over the Arctic (Orellana et al., 2011; Karl et al., 2013; Schmale et al., 2021).
Surface SO₄²⁻ levels are simulated with a smaller bias compared to the BC levels; however, they are still underestimated by 40 % (CMIP6_Cpl_Hist) to 53 % (Eclipse_AMIP_NCEP). The Eclipse_AMIP_NCEP simulation has a slightly larger bias (NMB = −53 %) than Eclipse_AMIP (NMB = −50 %), probably due to the higher cloud fraction simulated by the nudged version (see Sect. 3.1.6), leading to higher in-cloud SO₄²⁻ production. The climatological monthly variation of observed SO₄²⁻ concentrations is reasonably simulated in all simulations (r = 0.65–0.74). The observed springtime maximum is well captured by the GISS-E2.1 ensemble, with underestimations in all seasons, mainly suggesting underestimations in anthropogenic SO₂ emissions (Fig. 2), as well as biases in simulated cloud fractions, which have a high positive bias in winter and the transition seasons, while in summer the cloud fraction is well captured with a slight underestimation.
The clear-sky AOD over the AERONET stations in the Arctic region is underestimated by 33 % (Eclipse_AMIP) to 47 % (Eclipse_CplHist1). Similar negative biases are found in comparison to the satellite-based AOD product (Table 2). The climatological monthly variation is poorly captured (Fig. 2), with r values between −0.07 and 0.07 compared to AERONET AOD and between 0 and 0.13 compared to satellite AOD.

Climate
The different simulations are evaluated against a set of climate variables, and the statistics are presented in Table 3a and b and in Figs. 4 and 5. The climatological mean (1995–2014) monthly Arctic surface air temperatures are slightly overestimated by up to 0.55 °C in the AMIP simulations, while the coupled ocean simulations underestimate the surface air temperatures by up to 0.17 °C. All simulations were able to reproduce the monthly climatological variation with r values of 0.99 and higher (Fig. 4). Results show that both absorbing (BC) and scattering aerosols (OC and SO₄²⁻) are underestimated by the GISS-E2.1 model, implying that these biases can partly cancel out their impacts on radiative forcing due to aerosol–radiation interactions. This, together with the very low biases in surface temperatures, suggests that the effects of the anthropogenic aerosols on the Arctic climate via radiation are not the main driver in comparison to cloud indirect effects and forcing from greenhouse gases. The monthly mean precipitation is underestimated by around 50 % in all simulations (Table 3a), with the largest biases during the summer and autumn (Fig. 4). The observed monthly climatological mean variation was very well reproduced by all simulations, with r values between 0.80 and 0.90. Arctic SSTs are underestimated by up to 1.96 °C in the ocean-coupled simulations, while the atmosphere-only runs underestimated SSTs by 1.5 °C (Table 3a). The negative bias in the atmosphere-only simulations is due to the different datasets used to drive the model, which is a combined product of HadISST and NOAA-OI2 (Reynolds et al., 2002), and to evaluate the model, which is HadISST only (Rayner et al., 2003). The monthly climatological mean variation is well captured with r values above 0.99 (Table 3a, Fig. 4), with a similar cold bias in almost all seasons.
The sea-ice extent was overestimated by all coupled simulations by about 12 %, while the AMIP-type Eclipse simulations slightly underestimated the extent by 3 % (Table 3a). The observed variation was also very well captured with very high r values. The winter and spring biases were slightly higher compared to the summer and autumn biases (Fig. 4).
All simulations overestimate the climatological (1995–2014) mean total cloud fraction by 21 % to 25 % during the extended winter months (October through February), where the simulated seasonality is anti-correlated in comparison to AVHRR CLARA-A2 observations, whereas a good correlation is seen during the summer months irrespective of the observational data reference. The largest biases were simulated by the atmosphere-only simulations, with the nudged simulation having the largest bias (NMB = 25 %). The coupled model simulations are closer to the observations during the recent past. On the other hand, the climatology of the annual-mean cloud fraction was best simulated by the nudged atmosphere-only simulation (Eclipse_AMIP_NCEP) with an r value of 0.40, while other simulations showed a poor performance (r = −0.17 to +0.10), except for the summer where the bias is lowest (Fig. 5). The evaluation against CALIPSO data however shows much smaller biases (NMB = +3 % to +6 %). This is because in comparison to the CALIPSO satellite that carries an active lidar instrument (CALIOP), the CLARA-A2 dataset has difficulties in separating cold and bright ice/snow surfaces from clouds, thereby underestimating the cloudiness during Arctic winters. Here both datasets are used for the evaluation as they provide different observational perspectives and cover the typical range of uncertainty expected from the satellite observations. Furthermore, while the CLARA-A2 covers the entire evaluation period in current climate scenario, CALIPSO observations are based on 10-year data covering the 2007–2016 period. Figure 5 shows the evaluation of the simulations with respect to LWP and IWP. It has to be noted here that to obtain a better estimate of the cloud water content, the CloudSat observations were constrained with MODIS observations, which resulted in a lack of data during the months with darkness (October–March) over the Arctic (see Sect. 2.4.3).
Hence, we present the results for the polar summer months only. As seen in Fig. 5, all simulations overestimated the climatological (2007-2014) mean polar summer LWP by up to almost 75 %. The smallest bias (14 %) is calculated for the nudged atmosphere only (Eclipse_OnlyAtm_NCEP), while the coupled simulations had biases of 70 % or more. Observations show a gradual increase in the LWP, peaking in July, whereas the model simulates a more constant amount for the nudged simulation and a slightly decreasing tendency for the other configurations. All model simulations overestimate LWP during the spring months. The atmosphere-only nudged simulations tend to better simulate the observed LWP during the summer months (June through September). The coupled simulations, irrespective of the emission dataset used, are closer to observations only during the months of July and August.
The climatological (2007–2014) mean polar summer IWP is slightly better simulated compared to the LWP, with biases within −60 % with the exception of the nudged Eclipse (Eclipse_AMIP_NCEP) simulation (NMB = −74 %). All simulations simulated the monthly variation well, with r values of 0.95 and more. In the Arctic, the net cloud forcing at the surface changes sign from positive to negative during the polar summer (Kay and L'Ecuyer, 2013). This change typically occurs in May driven mainly by shortwave cooling at the surface. Since the model simulates the magnitude of the LWP reasonably, particularly in summer, the negative cloud forcing can also be expected to be realistic in the model (e.g., Gryspeerdt et al., 2019).
Table 3. (a) Annual normalized mean biases (NMB: %) and correlation coefficients (r) for the recent past simulations in the GISS-E2.1 model ensemble in 1995–2014 for surface air temperature (Tsurf) and sea surface temperature (SST) in units of degree Celsius (°C), precipitation (Precip), and sea-ice fraction (Sea ice). (b) Annual mean normalized mean biases (NMB: %) and correlation coefficients (r) for the recent past simulations in the GISS-E2.1 model ensemble in 1995–2014 for total cloud fraction (Cld Frac), liquid water path (LWP), and ice water path (IWP) in units of percent (%).
Figure caption (partially recovered): 1995–2014 surface air temperature, precipitation, sea surface temperature, and sea ice, along with the interannual variation shown in bars. Obs denotes the UDel dataset for surface air temperature and precipitation, and HadISST denotes that for sea surface temperature and sea-ice extent. Note that the two AMIP runs (blue and red lines) for the SST and sea ice are on top of each other as they use that data to run as input.
Furthermore, the aerosol and pollution transport into the Arctic typically occurs in the lowermost troposphere where liquid water clouds are prevalent during late spring and summer seasons (Stohl, 2006; Law et al., 2014; M. A. Thomas et al., 2019). The interaction of ice clouds with aerosols is, however, more complex, as ice clouds can have varying optical thicknesses, with mainly thin cirrus in the upper troposphere and relatively thicker clouds in the layers below. Without knowledge of the vertical distribution of optical thickness, it is difficult to infer the potential impact of the underestimation of IWP on total cloud forcing and its implications.

Arctic burdens and radiative forcing due to aerosol–radiation interactions (RFARI)
The recent past and future Arctic column burdens for BC, OA, and SO₄²⁻ for the different scenarios and emissions are provided in Fig. 6. In addition, Table 4 shows the calculated trends in the burdens for BC, OA, and SO₄²⁻ for the different scenarios, while Table 5 presents the Arctic burdens. In addition to the burdens of these aerosol species, the TOA radiative forcing due to aerosol–radiation interactions (RFARI) over the Arctic is simulated by the GISS-E2.1 ensemble. RFARI, calculated as the sum of shortwave and longwave forcing from the individual aerosol species, is presented for 1850–2050 in Fig. 7. It is important to note that the present study uses the instantaneous forcing diagnostics from the model, which are calculated with a double call to the model's radiation code, with and without aerosols, as described in Bauer et al. (2020) and Miller et al. (2021), and not the effective radiative forcing. The transient cloud radiative effect in GISS-E2.1 follows Ghan (2013), which calculates the difference in cloud radiative forcing with aerosol scattering and absorption omitted (Bauer et al., 2020). However, the present study only focuses on the RFARI. The model outputs separate forcing diagnostics for anthropogenic and biomass burning BC and OC, as well as biogenic SOA, making it possible to attribute the forcing to individual aerosol species. The negative RFARI increased significantly from 1850 until the 1970s due to an increase in aerosol concentrations. Due to the efforts to mitigate air pollution, and thus a decrease in emissions, the forcing became less negative after the 1970s until 2015. Figure 7 also shows a visible difference in the anthropogenic RFARI simulated by the NINT (prescribed aerosols) and OMA (interactive aerosols) simulations in the CMIP6 ensemble, where the anthropogenic RFARI in the NINT simulation is less negative (by almost 30 %) compared to the OMA simulation (Fig. 7b).
On the other hand, no such difference is seen in the net RF ARI time series (Fig. 7a). This compensation is largely driven by the 50 % more positive dust and 10 % less negative sea-salt RF ARI in the OMA simulation.
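The double-call diagnostic described in this section can be illustrated schematically: the radiation code is evaluated twice on the same atmospheric state, once with and once without aerosols, and the forcing is the difference in net top-of-atmosphere flux, summed over shortwave and longwave. The sketch below is not the GISS-E2.1 implementation; the class, the function names, the sign convention (net downward flux positive), and the numbers are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToaNetFlux:
    """Net downward flux at the top of the atmosphere in W m-2 (assumed convention)."""
    shortwave: float
    longwave: float


def instantaneous_forcing(with_aerosol, without_aerosol):
    """Return the instantaneous forcing as the shortwave plus longwave flux difference."""
    sw = with_aerosol.shortwave - without_aerosol.shortwave
    lw = with_aerosol.longwave - without_aerosol.longwave
    return sw + lw


def net_forcing(per_species):
    """Sum per-species forcings (dict of species name -> W m-2) into a net value."""
    return sum(per_species.values())


# Hypothetical fluxes from the two radiation calls on the same model state.
rf = instantaneous_forcing(
    with_aerosol=ToaNetFlux(shortwave=239.6, longwave=-240.0),
    without_aerosol=ToaNetFlux(shortwave=240.0, longwave=-240.1),
)
```

Because both calls use the same meteorology, the difference isolates the direct radiative effect and excludes the rapid adjustments that an effective radiative forcing would include.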

Black carbon
All simulations show a statistically significant (as calculated by Mann-Kendall trend analyses) decrease in the Arctic BC burdens (Table 4) between 1990 and 2014, except for the CMIP6_Cpl_Hist, which shows a slight non-significant increase that can be attributed to the large increase in global an- […] 1 kt) is comparable with the change of −1 kt in the SSP2-4.5 scenario, consistent with the projected emission changes in the two scenarios (Fig. 1).
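The Mann-Kendall test used for the burden trends can be sketched as below. This is the standard form of the test (sign statistic S, variance with tie correction, continuity-corrected Z, and a two-sided p value from the normal approximation); whether the study applied any pre-whitening or seasonal variant is not stated in the source, so none is assumed. The function name is illustrative.

```python
import math


def mann_kendall(series):
    """Return (S, Z, two-sided p value) of the Mann-Kendall trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least three values")
    # S counts concordant minus discordant pairs in time order.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied values.
    tie_counts = {}
    for value in series:
        tie_counts[value] = tie_counts.get(value, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in tie_counts.values():
        if t > 1:
            var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18.0
    # Continuity correction; S == 0 gives Z == 0 and avoids dividing by zero.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p_value


# Synthetic, steadily decreasing annual burdens (arbitrary units).
s_stat, z_stat, p = mann_kendall([10.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0])
significant = p < 0.05
```

The test is rank-based, so it makes no assumption about the trend being linear or the residuals being normally distributed, which suits short and noisy burden series.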
As seen in Table 6, the GISS-E2.1 ensemble calculated a BC RFARI of up to 0.23 W m⁻² over the Arctic, with both CMIP6 and Eclipse coupled simulations estimating the highest forcing of 0.23 W m⁻² for the 1990–2010 mean (Table 6a). This agrees with previous estimates of the BC RFARI over the Arctic (e.g., Schacht et al., 2019). In the future, the positive BC RFARI generally decreases (Fig. 6) due to lower BC emissions and therefore burdens, except for the SSP3-7.0 scenario, where the BC forcing becomes more positive by 0.05 W m⁻² due to increasing BC emissions and burdens. The changes in the Arctic RFARI in Table 6a follow the Arctic burdens presented in Table 5 and emission projections presented in Fig. 1, leading to largest reductions in BC RFARI simulated in SSP1-2.6 (−0.10 W m⁻²). Similar to the burdens, the Eclipse CLE and CMIP6 SSP2-4.5 scenarios simulate a very close decrease in the 2030–2050 mean BC RFARI of −0.06 and −0.14 W m⁻², respectively.

Organic aerosols
The Eclipse historical ensemble simulates a positive OA burden trend between 1990 and 2014; however, this trend is not significant at the 95 % confidence level (Table 4). The CMIP6_Cpl_Hist simulation gives a larger trend, due to a large increase in global anthropogenic OC emissions in CMIP6 (Fig. 1). The nudged AMIP Eclipse simulation calculates the largest 1990–2010 mean OA burden (57 kt), while the coupled simulation shows a slightly lower 1990–2010 mean burden (55 kt). The largest OA burden in the Eclipse_AMIP_NCEP simulation is attributed to the largest biogenic SOA burden calculated in this scenario, as well as a better-simulated transport from source regions due to the nudged winds (Fig. S1). The anthropogenic and biogenic contributions to SOA burdens in the coupled Eclipse and CMIP6 recent past simulations imply that the differences in the burdens between the two ensembles can be attributed to the different anthropogenic emissions datasets used in the Eclipse and CMIP6 simulations (Fig. S1), as well as the differences in SOA contributions due to simulated increases in the biogenic emissions (Fig. S5 in the Supplement). The AMIP-type Eclipse run simulates a lower 1990–2010 mean OA burden (50 kt), attributed to the smallest biogenic SOA burden in this scenario. The Eclipse CLE ensemble shows a decrease of 6.6 kt (12 %) in the 2030–2050 mean OA burden compared to the 1990–2010 mean, while the MFR ensemble shows a larger decrease in the same period (15.2 kt: 27 %). The CMIP6 simulations show a much larger decrease in 2030–2050 mean Arctic OA burdens, with a decrease of 8.1 kt (SSP2-4.5) to 17 kt (SSP1-2.6), while the SSP3-7.0 simulation shows an increase in OA burdens in the same period by 1.3 kt (2 %). Similar to BC burdens, the Eclipse CLE and CMIP6 SSP2-4.5 scenarios project similar changes in the 2030–2050 mean OA burden (6.6 and 8.1 kt, respectively).
As shown in Table 6a, the Eclipse ensemble calculated an OA RFARI of −0.05 to −0.08 W m⁻² for the 1990–2010 mean, where the nudged AMIP-type simulation shows the largest RFARI, due to the largest Arctic OA burden calculated for this period (Table 5). For the future, both Eclipse CLE and MFR ensembles show an increase in the negative 2030–2050 mean RFARI by −0.02 W m⁻², which is very close to the increase in the negative forcing calculated for the various CMIP6 simulations (−0.01 to −0.03 W m⁻²). Following the burdens, the largest increase in the 2030–2050 mean OA RFARI is calculated for the SSP3-7.0 (−0.03 W m⁻²) and the lowest for SSP1-2.6 and SSP3-7.0-lowNTCF (−0.01 W m⁻²).

Net aerosol radiative forcing
The coupled simulations in both the Eclipse and the CMIP6 ensemble show an Arctic RFARI of −0.32 to −0.35 W m⁻² for the 1990–2010 mean, slightly lower than recent estimates (e.g., −0.4 W m⁻² by Markowicz et al., 2021). In the Eclipse ensemble, −0.22 ± 0.01 W m⁻² is attributed to anthropogenic aerosols, while the CMIP6 near-past simulations show a contribution of −0.19 to −0.26 W m⁻² from anthropogenic aerosols (Table 6b). The AMIP-type Eclipse simulations calculated a much larger RFARI of −0.47 W m⁻² for the same period, which can mainly be attributed to the increase in the positive forcing of the BC aerosols in the coupled simulations due to larger burdens. This effect is amplified by the larger sea-ice concentration simulated with the coupled model, leading to brighter surfaces compared to the AMIP simulations. For the 2030–2050 period, the Eclipse ensemble simulated an increase in the negative RFARI by −0.07 W m⁻², while the negative anthropogenic RFARI increased by only −0.02 W m⁻², suggesting that the contribution from natural aerosols becomes more important in the future. The results show that the positive dust forcing decreases by 0.03 W m⁻² (from 0.12 to 0.09 W m⁻²), while the negative sea-salt forcing becomes more negative by −0.03 W m⁻² due to the increase in ice-free ocean fraction caused by the melting of sea ice (see Sect. 3.3). For the same period, the CMIP6 future ensemble simulated an increase in the negative RFARI by −0.01 to −0.06 W m⁻², with the largest change being in SSP1-2.6 and SSP2-4.5, mainly driven by the change in BC forcing (Table 6a). Table 6 also shows that SSP1-2.6 simulates no change in the anthropogenic forcing, while SSP2-4.5 shows an increase of −0.01 W m⁻², similar to that in the Eclipse ensemble.
In contrast, the SSP3-7.0 and SSP3-7.0-lowNTCF scenarios simulate a large decrease in the anthropogenic negative RFARI by 0.05 and 0.02 W m⁻², respectively.
The different behavior in the two ensembles is further investigated by looking at the aerosol–radiation forcing calculated for the individual aerosol species of BC, OA, SO₄²⁻, and NO₃⁻ presented in Fig. 8 that shows the box-whisker plots using the similar scenarios in the Eclipse (CLE and MFR) and CMIP6 (SSP2-4.5 and SSP1-2.6) ensembles. The increase in cooling effect of aerosols calculated by the Eclipse ensemble is attributed mainly to the decrease in BC as opposed to other aerosol species (Fig. 8). More negative forcing is calculated for the OA and NO₃⁻, while the SO₄²⁻ forcing is becoming less negative due to large reductions in SO₂ emissions (Fig. 1). The net aerosol forcing is therefore slightly more negative. In the CMIP6 ensemble, the BC forcing does not change as much compared to the Eclipse ensemble to counteract the change in impact from SO₄²⁻, giving a more negative net aerosol forcing, which is slightly smaller compared to the Eclipse ensemble. The CMIP6 ensemble also simulates a larger increase in the negative NO₃⁻ forcing compared to the Eclipse ensemble. Overall, the changes in the different aerosol species lead to a more negative aerosol forcing by mid-century (2030–2050) compared to the 1990–2010 period.
The spatial distributions of the statistically significant change in the Arctic RFARI in the 2030–2050 mean with respect to the 1990–2010 mean in the different ensemble members are presented in Fig. 9. Results show a decrease in the negative RFARI over Europe, and partly over North America, and an increase over the northern Pacific in all ensemble members. Globally, larger changes are simulated over East and South Asia (Fig. S2 in the Supplement), where the largest anthropogenic emission reductions take place. The global net RFARI is dominated by the sea-salt particles, accounting for about 60 % of the 1990–2010 mean forcing of −2 to −2.3 W m⁻² and the 2030–2050 mean forcing of −1.9 to −2.1 W m⁻².

Surface air and sea surface temperatures
The surface air temperature and sea-ice extent are calculated in the different simulations for the 1990–2050 period. As seen in Fig. 10, the Arctic surface air temperatures increase in all scenarios. Between 1990 and 2014, the surface air temperatures over the Arctic increased statistically significantly by 0.5 (Eclipse_CplHist) to 1 °C per decade (CMIP6_Cpl_Hist), with CMIP6 showing larger increases compared to the Eclipse ensemble (Table 7). On the other hand, the observed surface air temperature during 1990–2014 shows a smaller and statistically non-significant increase of 0.2 °C per decade. From 2015 onwards, surface air temperatures continue to increase significantly by 0.3 to 0.6 °C per decade, with larger increases in the Eclipse ensemble, due to larger reductions in the emissions and therefore in the burdens and associated RFARI.
The 2030–2050 mean surface air temperatures are projected to increase by 2.1 and 2.3 °C compared to the 1990–2010 mean temperature (Table 8, Fig. 10) according to the Eclipse CLE and MFR ensembles, respectively, while the CMIP6 simulations calculated an increase of 1.9 (SSP1-2.6) to 2.2 °C (SSP3-7.0). Changes in both ensembles are statistically significant at the 95 % level. These warmings are smaller than the 4.5–5 °C warmer 2040 temperatures, relative to the 1950–1980 average, in the CMIP6 SSP1-2.6, SSP2-4.5, and SSP3-7.0 scenarios reported by Davy and Outten (2020). It should, however, be noted that because the baseline used in the present study (1990–2010) differs from the 1950–1980 baseline used in Davy and Outten (2020), it is not possible to directly compare these datasets. Figure 11 shows the spatial distributions of the statistically significant (as calculated by Student's t test) Arctic surface air temperature change between the 1990–2010 mean and the 2030–2050 mean for the individual Eclipse and CMIP6 future scenarios. All scenarios calculate a warming in the surface air temperatures over the central Arctic, while there are differences over the land areas. The Eclipse CLE and MFR ensembles show similar warming, mainly over the Arctic Ocean as well as North America and Northeast Asia, and cooling over the Greenland Sea. The latter is a well-known feature of observations and future projections, linked to the deep mixed layer in the area and declines in the Atlantic meridional circulation (e.g., IPCC, 2014; Menary and Wood, 2018; Keil et al., 2020). There are also differences between the Eclipse and the CMIP6 ensembles, as seen in Fig. 11. All CMIP6 scenarios show a warming over the central Arctic and a limited cooling over northern Scandinavia, following the changes in RFARI shown in Fig. 9, except for the SSP3-7.0 scenario, which shows no cooling in the region. The SSP3-7.0-lowNTCF scenario shows an additional cooling over Siberia.
These warmings are comparable with earlier studies, such as Samset et al.
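The significance of a change between the 1990–2010 and 2030–2050 means can be illustrated with a two-sample Student's t test. The source names the test but gives no details, so this sketch assumes the pooled-variance form applied to annual means and ignores temporal autocorrelation; the function names are hypothetical. The p value is obtained by numerically integrating the t density, which keeps the example free of third-party dependencies.

```python
import math
from statistics import mean, variance


def student_t_pdf(x, dof):
    """Probability density of Student's t distribution with dof degrees of freedom."""
    log_norm = (math.lgamma((dof + 1) / 2.0) - math.lgamma(dof / 2.0)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_norm - (dof + 1) / 2.0 * math.log1p(x * x / dof))


def two_sided_p_value(t_stat, dof, steps=2000):
    """Two-sided p value from Simpson integration of the density on [0, |t|]."""
    upper = abs(t_stat)
    if upper == 0.0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = student_t_pdf(0.0, dof) + student_t_pdf(upper, dof)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * student_t_pdf(k * h, dof)
    area = total * h / 3.0
    return max(0.0, 1.0 - 2.0 * area)


def period_change(baseline, future):
    """Return (mean change, t statistic, p value) for future minus baseline."""
    n1, n2 = len(baseline), len(future)
    dof = n1 + n2 - 2
    pooled = ((n1 - 1) * variance(baseline) + (n2 - 1) * variance(future)) / dof
    std_err = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    if std_err == 0.0:
        raise ValueError("t statistic is undefined for zero variance")
    change = mean(future) - mean(baseline)
    t_stat = change / std_err
    return change, t_stat, two_sided_p_value(t_stat, dof)


# Synthetic annual temperature anomalies in degrees Celsius.
change, t_stat, p = period_change(
    baseline=[0.0, 0.2, -0.2, 0.1, -0.1],
    future=[2.0, 2.2, 1.8, 2.1, 1.9],
)
```

For maps such as those described above, the same test would be applied independently at each grid cell, and only cells with p below 0.05 would be shown.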

Sea ice
The Arctic sea-ice extent is found to decrease significantly in all simulations (Fig. 10 and Table 7). Similar to the near-surface temperatures, during the 1990–2014 period, the CMIP6 ensemble simulated a larger decrease in sea-ice extent compared to the Eclipse ensemble. On the other hand, the CMIP6_Cpl_Hist largely overestimated the observed decrease of 30 000 km² yr⁻¹. This overestimation has also been reported for some of the CMIP5 and CMIP6 models (Davy and Outten, 2020). After 2015, the Eclipse CLE ensemble projected larger decreases in the sea-ice extent compared to the CMIP6 ensemble (Table 7), in agreement with the changes in the near-surface temperatures. The evolutions of March and September sea-ice extents, representing the Arctic annual maximum and minimum extents, respectively, are also analyzed. The Eclipse ensemble projects a decrease of 23 000 ± 11 000 km² yr⁻¹ in March sea-ice extent during the 2015–2050 period, while the CMIP6 ensemble projects a decrease of 10 000 ± 6000 km² yr⁻¹ for the same period, both statistically significant. In September, much larger decreases are projected by both ensembles. The Eclipse ensemble simulates a decrease of 64 000 ± 10 000 km² yr⁻¹ in the 2015–2050 period, while the CMIP6 ensemble predicts a decrease of 50 000 ± 20 000 km² yr⁻¹.
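A trend in km² yr⁻¹ such as those quoted above can be estimated from an annual series of sea-ice extent. The source does not state which slope estimator was used, so ordinary least squares is assumed here purely for illustration; the function name and the synthetic series are hypothetical.

```python
def linear_trend(years, values):
    """Return (slope per year, intercept) of an ordinary least-squares fit."""
    n = len(years)
    if n != len(values) or n < 2:
        raise ValueError("need at least two paired values")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic September extents (km2) that decline by exactly 64 000 km2 per year.
years = list(range(2015, 2051))
extent = [6.0e6 - 64000.0 * (year - 2015) for year in years]
slope, intercept = linear_trend(years, extent)
```

A rank-based estimator such as the Theil-Sen slope would be a natural alternative, since it pairs with the Mann-Kendall significance test and is less sensitive to single extreme years.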
The 2030–2050 annual mean sea-ice extent (Table 8) is projected to be 1.5 and 1.7 million square kilometers lower compared to the 1990–2010 mean in the Eclipse CLE and MFR scenarios, respectively, both statistically significant at the 95 % level. The CMIP6 simulations predict a smaller decrease in sea-ice extent of 1.2–1.5 million square kilometers; however, these changes are not statistically significant. These results are comparable with the results from the CMIP6 models (Davy and Outten, 2020). In the 2030–2050 March mean, the sea-ice extent is projected to be 925 000 km² lower in the Eclipse ensemble (statistically significant), while the CMIP6 ensemble projects a decrease of 991 000 km² (not statistically significant). A much larger decrease is projected for the 2030–2050 September mean: 2.6 million and 2.3 million square kilometers in the Eclipse and CMIP6 ensembles, respectively. As seen in Fig. 12, the Eclipse ensemble predicts an up to 90 % lower September sea-ice fraction in a band marking the maximum retreat of the sea-ice line at the end of the summer, while the changes simulated by the CMIP6 ensemble are not statistically significant at the 95 % level (therefore not shown in Fig. 11), which can be attributed to the single ensemble member per scenario in the CMIP6 ensemble, as well as the non-significant changes in the near-surface temperatures (not shown). In March (Fig. S3 in the Supplement), the Eclipse ensemble simulated a decrease in maximum sea-ice extent at the end of winter over the northern Pacific, while the CMIP6 ensemble did not show any statistically significant changes in sea ice. In addition, the Eclipse ensemble shows a decrease over the North Atlantic close to Greenland. All simulations show a similar and statistically significant decrease in annual mean sea-ice extent (Fig. S4 in the Supplement) over the central Arctic, with the CMIP6 ensemble also showing some increase in the sea-ice extent over the Canadian Arctic, which is largest in SSP3-7.0. The retreat in sea-ice extent also led to an increase in oceanic emissions of DMS and sea salt (Fig. S5); however, the increases are not significant at the 95 % significance level. The simulated increase, in particular for the DMS emissions, is slightly larger in the Eclipse ensemble compared to the CMIP6 ensemble, due to a larger decrease in sea-ice extent in the Eclipse ensemble. Also note that GISS-E2.1 uses prescribed and fixed maps of DMS concentration in the ocean. When ocean locations that are under sea ice year-round at present become exposed, the DMS that would exist in that seawater is not included in the simulations, likely underestimating the increased flux of DMS into the atmosphere as the sea ice retreats.

Summary and conclusions
The GISS-E2.1 Earth system model has been used to simulate the recent past (1990-2014) and future (2015-2050) aerosol burdens and their climate impacts over the Arctic. An ensemble of 17 simulations has been conducted using historical and future anthropogenic emissions and projections from CMIP6 and Eclipse V6b, with the latter supporting the ongoing Arctic Monitoring and Assessment Programme.
The evaluation of the recent past simulations shows underestimates of Arctic surface aerosol levels by up to 50 %, with the smallest biases calculated for the simulations where winds are nudged and sea surface temperature and sea ice are prescribed (AMIP-type: atmosphere-only). An exception is SO₄²⁻, where the nudged Eclipse AMIP simulation had the highest bias, due to the high cloud bias that leads to more in-cloud sulfate production from SO₂. The model skill analyses indicate slightly better performance of the CMIP6 version of the GISS-E2.1 model in simulating both the aerosol levels and climate parameters compared to the Eclipse version. In addition, the underestimations in summertime cloud fraction suggest missing sources of aerosols, in particular the local marine sources. Except for DMS, GISS-E2.1 does not include marine VOC emissions, which are suggested to be important for the summertime cloud properties over the Arctic (Ornella et al., 2011; Karl et al., 2013; Schmale et al., 2021). Results also suggest that the underestimations of absorbing and scattering aerosol levels partly cancel each other out in their impacts on RFARI and near-surface temperatures, as the temperatures are very well reproduced by the model.
From 2015 onwards, all simulations, except for the worst-case CMIP6 scenario SSP3-7.0, show a statistically significant decrease in the Arctic BC, OA, and SO₄²⁻ burdens. The CMIP6 ensemble simulates larger aerosol burdens than Eclipse, while the Eclipse ensemble shows larger reductions (10 %-60 %) in Arctic aerosol burdens than the CMIP6 ensemble (10 %-45 %). The largest burden reductions are calculated for the scenarios with the most ambitious emission reductions in the two ensembles, i.e., the Eclipse MFR (25 %-60 %) and the CMIP6 SSP1-2.6 (25 %-45 %).
The present-day (1990-2010 mean) CMIP6 and Eclipse simulations calculated a radiative forcing due to aerosol-radiation interactions (RFARI) of −0.32 to −0.35 W m⁻². For the same period, the atmosphere-only (AMIP) Eclipse simulations calculated a much larger negative RFARI of −0.47 W m⁻². The smaller RFARI in the coupled simulations is mainly due to their larger BC burdens, which lead to more positive forcing; this is amplified by the larger albedo effect due to the larger sea-ice extent simulated in the coupled simulations. In the 2030-2050 period, the Eclipse ensemble simulated an RFARI of −0.39 ± 0.01 W m⁻², of which −0.24 ± 0.01 W m⁻² is attributed to the anthropogenic aerosols (BC, OA, SO₄²⁻, and NO₃⁻). For the same period, the worst-case CMIP6 scenario (SSP3-7.0) simulated an RFARI (−0.35 W m⁻²) similar to the 1990-2010 mean, while large emission reductions led to a more negative RFARI (−0.40 W m⁻²), mainly due to the decrease in the positive forcing of the BC aerosols. Overall, the Eclipse ensemble simulated slightly larger changes in the RFARI over the 2015-2050 period, relative to the 1990-2010 mean, than the CMIP6 ensemble, which can be attributed to the larger reductions in burdens in the Eclipse ensemble. The differences between the two ensembles are further attributed to differences in the BC and SO₄²⁻ forcings. The results suggest that the different anthropogenic emission projections lead to only small differences in how the RFARI will evolve in the future over the Arctic.
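The forcing changes summarized above follow from simple differences, sketched below in Python using only the rounded values quoted in this paper. The labels and the helper name are hypothetical and chosen for illustration. Because the quoted values are rounded to two decimals, differences formed from them can deviate by about 0.01 W m⁻² from changes computed from the unrounded model output.

```python
# RFARI values quoted in the text, in W m^-2 (rounded to two decimals there).
BASELINE_1990_2010 = {"Eclipse": -0.32, "CMIP6": -0.35}

# (ensemble, 2030-2050 mean RFARI); labels are illustrative, not official names.
FUTURE_2030_2050 = {
    "Eclipse ensemble": ("Eclipse", -0.39),
    "CMIP6 SSP3-7.0": ("CMIP6", -0.35),
    "CMIP6 large emission reductions": ("CMIP6", -0.40),
}


def rfari_change(future, baseline):
    """Change in RFARI relative to the baseline; negative means more cooling."""
    return round(future - baseline, 2)


changes = {
    label: rfari_change(value, BASELINE_1990_2010[ensemble])
    for label, (ensemble, value) in FUTURE_2030_2050.items()
}

# Share of the Eclipse 2030-2050 RFARI attributed to anthropogenic aerosols.
anthropogenic_share = round(-0.24 / -0.39, 2)

for label, delta in changes.items():
    print(f"{label}: {delta:+.2f} W m^-2 relative to 1990-2010")
print(f"Anthropogenic share (Eclipse): {anthropogenic_share:.0%}")
```

The sign convention matters when reading the result: a negative change means the net aerosol forcing becomes more negative, which here reflects the loss of positive BC forcing outweighing the loss of negative sulfate forcing.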
The future scenarios with the largest aerosol reductions, i.e., MFR in the Eclipse and SSP1-2.6 in the CMIP6 ensemble, project the largest warming and sea-ice retreat. Relative to the 1990-2010 mean, the Eclipse ensemble shows a slightly larger warming of the 2030-2050 mean surface air temperatures (2.1 to 2.5 °C) than the CMIP6 ensemble (1.9 to 2.2 °C). The larger warming in the Eclipse ensemble also resulted in a slightly larger reduction in the 2030-2050 mean sea-ice extent (−1.5 to −1.7 million square kilometers in CLE and MFR, respectively) than in the CMIP6 scenarios (−1.3 to −1.6 million square kilometers in SSP1-2.6 and SSP3-7.0, respectively). However, the changes simulated by the two ensembles are within 1 standard deviation of each other.
The overall results showed that the aerosol burdens will substantially decrease in the short- to mid-term future, implying improvements in impacts on human health and ecosystems. However, the impacts of aerosols on the radiative forcing can be amplified by the sea-ice extent. Results also show that even the scenarios with the largest emission reductions, i.e., Eclipse MFR and CMIP6 SSP1-2.6, lead to a similar impact on the future Arctic surface air temperatures and sea-ice loss compared to scenarios with very little mitigation such as the CMIP6 SSP3-7.0, emphasizing the dominant role played by well-mixed greenhouse gases and underlining the importance of continued greenhouse gas reductions.

Data availability.
HadISST data were obtained from https://www.metoffice.gov.uk/hadobs/hadisst/ (Met Office Hadley Centre, 2021) and have a © British Crown Copyright, Met Office, provided under a Non-Commercial Government Licence (http://www.nationalarchives.gov.uk/doc/non-commercial-government-licence/version/2/; last access: 3 July 2021). UDel_AirT_Precip data were provided by the NOAA/OAR/ESRL PSL, Boulder, Colorado, USA, from their website at https://psl.noaa.gov/. Alert sulfate data were provided by Sangeeta Sharma, and elemental carbon (EC) and OA data were provided by Lin Huang as part of the Canadian Aerosol Baseline Measurement (CABM) program at ECCC. We would like to thank operators and technicians for the collection of filters, calibration, and analysis, as well as the Canadian Forces Services Alert for the operation of the military base. These datasets are also available on the Global Atmosphere Watch Programme, World Data Center for Aerosols, EBAS database (http://ebas.nilu.no/default.aspx, Norwegian Institute for Air Research, 2021). Aside from Alert, Canada's surface air quality data are from the National Atmospheric Pollution Surveillance network (NAPS: https://open.canada.ca/data/en/dataset/1b36a356-defd-4813-acea-47bc3abd859b, Analysis and Air Quality Section Environment Canada, 2021).
Fairbanks aerosol measurements are from William Simpson and KC Nattinger. Aside from Fairbanks, Alaska measurements are from the IMPROVE network. IMPROVE is a collaborative association of state, tribal, and federal agencies, as well as international partners. The US Environmental Protection Agency is the primary funding source, with contracting and research support from the National Park Service. The Air Quality Group at the University of California, Davis, is the central analytical laboratory, with ion analysis provided by the Research Triangle Institute and carbon analysis provided by the Desert Research Institute. European measurements are from the EMEP network and obtained from the EBAS database (http://ebas.nilu.no, last access: 3 July 2021). Other European data include the Gruvebadet measurements, for which we acknowledge Mauro Mazzola (mauro.mazzola@cnr.it), Stefania Gilardoni (stefania.gilardoni@cnr.it), and Angelo Lupi (angelo.lupi@cnr.it) from the Institute of Polar Sciences for Gruvebadet eBC measurements, as well as Rita Traversi (rita.traversi@unifi.it), Mirko Severi (mirko.severi@unifi.it), and Silvia Becagli (silvia.becagli@unifi.it) from the University of Florence (http://www.isac.cnr.it/~radiclim/CCTower/?Data:Aerosol, National Research Council of Italy, 2021); the Zeppelin datasets, for which we acknowledge Vito Vitale and Angelo Lupi (also available on http://ebas.nilu.no); and the Villum Research Station datasets (http://www.villumresearchstation.dk, Aarhus University, 2021) from Henrik Skov (hsk@envs.au.dk; also available at http://ebas.nilu.no). The AERONET AOD measurements were obtained from NASA's Goddard Space Flight Center (https://aeronet.gsfc.nasa.gov/new_web/index.html, NASA, 2021). The authors acknowledge Larisa Sogacheva and the AEROSAT team for satellite-based merged AOD data.
Author contributions. UI coordinated the study; conducted the model simulations, model evaluation, and analyses of the simulations; and wrote the manuscript. KT and GF supported the model simulations and processing of the Eclipse V6b emissions for the GISS-E2.1 model. JPF contributed to the plotting of the spatial distributions by further developing the autoimage R package (French, 2017). RM prepared and provided the AOD measurements, as well as the surface air temperature, sea surface temperature, and sea-ice data. MAT prepared the cloud observation data. CHW prepared the Arctic surface aerosol measurement data. KvS coordinated the experimental setup for the Eclipse simulations in the framework of the ongoing AMAP assessment. ZG prepared and provided the Eclipse V6b anthropogenic emissions. HS and DCT prepared the Villum Research Station aerosol data. JB and PL contributed to analyses of aerosols and climate parameters, respectively, and manuscript writing. All authors contributed to the analyses and interpretation of the results, as well as to the writing of the manuscript.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Special issue statement. This article is part of the special issue "Arctic climate, air quality, and health impacts from short-lived climate forcers (SLCFs): contributions from the AMAP Expert Group (ACP/BG inter-journal SI)". It is not associated with a conference.