Evaluation of preindustrial to present-day black carbon and its albedo forcing from Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP)

. As part of the Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP), we evaluate the historical black carbon (BC) aerosols simulated by 8 ACCMIP models against observations including 12 ice core records, long-term surface mass concentrations, and recent Arctic BC snowpack measurements. We also estimate BC albedo forcing by performing additional simulations using ofﬂine models with prescribed meteorology from 1996–2000. We evaluate the vertical proﬁle of BC snow Published by Copernicus Publications on behalf of the European Geosciences Union.


2608
Y. H. Lee et al.: Evaluation of preindustrial to present-day black carbon concentrations from these offline simulations using the recent BC snowpack measurements.
Despite using the same BC emissions, the global BC burden differs by approximately a factor of 3 among models due to differences in aerosol removal parameterizations and simulated meteorology: 34 Gg to 103 Gg in 1850 and 82 Gg to 315 Gg in 2000. However, the global BC burden from preindustrial to present-day increases by 2.5-3 times with little variation among models, roughly matching the 2.5-fold increase in total BC emissions during the same period. We find a large divergence among models at both Northern Hemisphere (NH) and Southern Hemisphere (SH) high latitude regions for BC burden and at SH high latitude regions for deposition fluxes. The ACCMIP simulations match the observed BC surface mass concentrations well in Europe and North America except at Ispra. However, the models fail to predict the Arctic BC seasonality due to severe underestimations during winter and spring. The simulated vertically resolved BC snow concentrations are, on average, within a factor of 2-3 of the BC snowpack measurements except for Greenland and the Arctic Ocean.
For the ice core evaluation, models tend to adequately capture both the observed temporal trends and the magnitudes at Greenland sites. However, models fail to predict the decreasing trend of BC depositions/ice core concentrations from the 1950s to the 1970s in most Tibetan Plateau ice cores. The distinct temporal trend at the Tibetan Plateau ice cores indicates a strong influence from Western Europe, but the modeled BC increases in that period are consistent with the emission changes in Eastern Europe, the Middle East, South and East Asia. At the Alps site, the simulated BC suggests a strong influence from Europe, which agrees with the Alps ice core observations. At Zuoqiupu on the Tibetan Plateau, models successfully simulate the higher BC concentrations observed during the non-monsoon season compared to the monsoon season but overpredict BC in both seasons. Despite a large divergence in BC deposition at two Antarctic ice core sites, some models with a BC lifetime of less than 7 days are able to capture the observed concentrations.
In 2000 relative to 1850, globally and annually averaged BC surface albedo forcing from the offline simulations ranges from 0.014 to 0.019 W m −2 among the ACCMIP models. Comparing offline and online BC albedo forcings computed by some of the same models, we find that the global annual mean can vary by up to a factor of two because of different aerosol models or different BC-snow parameterizations and snow cover. The spatial distributions of the offline BC albedo forcing in 2000 show especially high BC forcing (i.e., over 0.1 W m −2 ) over Manchuria, Karakoram, and most of the Former USSR. Models predict the highest global annual mean BC forcing in 1980 rather than 2000, mostly driven by the high fossil fuel and biofuel emissions in the Former USSR in 1980.

Introduction
Black carbon (BC) is defined here as the light-absorbing and refractory portion of carbonaceous aerosols that is emitted through the incomplete combustion of fossil fuel, biofuel, and biomass. BC aerosols influence climate in the following ways: (1) BC absorbs and scatters radiation, mainly resulting in warming the atmosphere and reducing the amount of solar radiation reaching the surface of Earth, which is known as aerosol direct effect; (2) the direct effect of BC can affect cloud formation by changing the atmospheric stability and/or relative humidity ("semi-direct effect") (e.g., Ackerman et al., 2000;Koch and Del Genio, 2010); (3) BC can alter Cloud Condensation Nuclei (CCN) concentrations and cloud properties when internally mixed with hydrophilic aerosols such as sulfate, which is known as aerosol indirect effect, and BC can also affect the ice and mixed-phase clouds by acting as ice nuclei (e.g., Cozic et al., 2007Cozic et al., , 2008; and (4) BC reduces the surface albedo when deposited in and on snow and ice surfaces, which is called the BC albedo effect (e.g., Hansen and Nazarenko, 2004;Jacobson, 2004).
Areas covered with snow or ice surface (e.g., the Arctic, the Antarctic, high mountain regions, Northern Canada, and Northern Eurasia) are particularly sensitive to global climate change and have undergone rapid changes in recent decades (e.g., Lubin and Vogelmann, 2006;Kehrwald et al., 2008; J. C. . The radiative effects of BC particles are important in these regions even at relatively low concentrations because of its dominant light absorbing properties; the BC albedo effect can darken the snow/ice surface and cause a further warming via the snow albedo feedback (e.g., Flanner et al., 2007Flanner et al., , 2009). Reduction of BC emissions has been seriously considered as a potential method for mitigating the global warming trend especially over the Arctic and Tibet Plateau (e.g., Quinn et al., 2008;Kopp and Mauzerall, 2010).
To understand BC impacts on climate in these climatically-sensitive regions, a number of BC measurements have been taken, including surface mass concentrations (e.g., Collaud Coen et al., 2007;Chaubey et al., 2010;Gong et al., 2010;Andrew et al., 2011), BC in snow (e.g., Doherty et al., 2010;Hegg et al., 2010;Yasunari et al., 2010), and BC in ice cores (e.g., McConnell et al., 2007;Ming et al., 2008;B. Xu et al., 2009;Bisiaux et al., 2012b). For atmospheric BC measurements, two common methods are filter-based optical and thermal-optical (e.g., Moosmuller et al., 2009). Thermal methods measure the refractory portion of total carbon, which is therefore called elemental carbon (EC) instead of BC. Filter-based optical instruments such as Particle Soot Absorption Photometer (PSAP) and Aethalometer have been frequently used in long-term atmospheric sampling networks (e.g., Bodhaine, 1995;Sharma et al., 2004) because of ease of remote operation. These instruments measure a light absorption coefficient that is converted to BC mass concentrations by assuming a mass absorption cross section and therefore provide an "equivalent BC (EBC)" concentration since this is not a direct measurement of BC. The EBC concentration can suffer from using an improper mass absorption cross section that is highly influenced by the aerosol mixing state and other light absorbing species such as organic matter and mineral dust (e.g., Sharma et al., 2002). Also interference caused by scattering aerosols on the filter can lead to artificial high absorption (e.g., Sharma et al., 2002). For BC in surface snow or ice core measurements, several methods have been used for BC analysis: optical methods (e.g., Doherty et al., 2010), thermo-optical methods (e.g., Lavanchy et al., 1999;Ming et al., 2008;B. Xu et al., 2009), coulometric titration-based method (e.g., Ming et al., 2008), gas chromatography with thermo-chemical treatments (e.g., Thevenon et al., 2009) and a laser-based incandescence (i.e. Single Particle Soot Photometer, SP2) (e.g., McConnell et al., 2007;Kaspari et al., 2011;Bisiaux et al., 2012b). Compared with the atmospheric measurement, the BC in snow or ice generally requires filtration, which is used to collect BC from a melted snow/ice sample before analysis and which can lead to the potential loss of BC. The novel laserbased method does not need the filtration process because it can be used directly to measure the BC mass concentrations in snowmelt (McConnell et al., 2007). However, this method uses the nebulization process to aerosolize the BC from snowmelt, and Schwarz et al. (2012) point out that the nebulizer performance may result in an underestimates of the BC concentration.
As previously mentioned, BC (measured with optical methods) is rather distinct from EC (measured with thermal methods). For example, BC and EC concentrations can differ by a factor of 3-4 depending on the aerosol characteristics (Reisinger et al., 2008). However, most aerosol models tend to use the term BC, EC, or soot interchangeably (Vignati et al., 2010). It is difficult to clearly distinguish these terminologies in a model because emission, physical, chemical, and optical properties used in a model are built/chosen based on various observation sources. Importantly, BC emission inventories are largely based on EC data.
There have been numerous modeling studies to understand BC source attribution and to estimate its climate impact. Many studies have focused on the Arctic including Greenland (e.g., Koch and Hansen, 2005;Stohl, 2006;Law and Stohl, 2007;Shindell et al., 2008;Hirdman et al., 2010a, b;Huang et al., 2010b;Jacobson, 2010;Warneke et al., 2010). These studies consistently show that Eurasian and North American BC emissions contribute significantly to Arctic BC, while Hirdman et al. (2010b) demonstrate the importance of the Arctic BC emissions to the Arctic BC concentrations. Law and Stohl (2007) and Hirdman et al. (2010b) point out that, compared to the rest of the Arctic, Greenland is more influenced by BC originating from southeast Asia and North America due to its high topography. To inter-compare BC models, Koch et al. (2009b) evaluated several BC mod-els with available observations including surface mass concentrations, aircraft measurements, aerosol absorption optical depth from AERONET and Ozone Monitoring Instruments, and BC column estimations based on AERONET under the AeroCom model intercomparison project. Shindell et al. (2008) investigated transport of BC emitted from each continent to the Arctic under the Hemispheric Transport of Air Pollution (HTAP) project. Unlike the Arctic, only a few modeling studies have focused on the Tibetan Plateau (Kopacz et al., 2011;Lu et al., 2012), the Alps (Fagerli et al., 2007), and the Antarctic (Graf et al., 2010). The two modeling studies on the Tibetan Plateau find BC transported from South Asia and East Asia are major sources, although the relative contributions of each source region vary with seasons and the receptor location.
As part of the Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP; Lamarque et al., 2013), our primary goal in this paper is to evaluate preindustrial to present-day BC in ACCMIP models with ice cores sampled from the Arctic, Antarctic, Tibetan Plateau, and Alps. This is especially meaningful for General Circulation Model (GCM) evaluation as ice core records are the only measurements providing BC information from preindustrial to present-day (McConnell, 2010). However, since the number of ice cores is limited, our evaluation includes additional BC observations such as a long-term surface mass concentration and BC in snow measurements from the Arctic regions, which are used to compare with the ACCMIP present-day simulations. Finally, we investigate the BC surface albedo effect with additional modeling. To make it clear, our experiments are not designed to study long-range transport of BC particles. In this paper, we do not include evaluations with remote sensing instruments that are covered in Shindell et al. (2012). Shindell et al. (2012) also include aerosol radiative forcings from the ACCMIP models including BC albedo forcing, but more details of BC albedo forcing are reported in this paper. In Sect. 2, we explain the description of ACCMIP models and simulations used here. Section 3 describes the additional modeling we used to obtain BC albedo forcing and a vertical profile of BC snow concentrations. We present and discuss the model results and BC evaluation using observations in Sect 4 and BC albedo forcing in Sect. 5. We draw conclusions in Sect. 6.

Description of ACCMIP models
Detailed information on the ACCMIP model descriptions and project's experiments are provided in Lamarque et al. (2013). Here we provide the most pertinent information about the models and simulations used in this work. Among a total 15 participating models, 9 models include black carbon as a prognostic tracer in their ACCMIP simulations, but only 8 models are used in this paper; LMDzORINCA is excluded in this study due to the absence of required  1890 1910 1930 (core) 1950 1970 1980 (core) 1990 2000 (core) GISS All the models except CICERO-OsloCTM2 are run as coupled chemistry-climate models (CCMs), driven by monthly sea-surface temperatures and sea-ice coverage that are averaged over 10 yr from the corresponding coupled ocean-atmosphere model integrations submitted to the Coupled Model Intercomparison Project Phase 5 (CMIP5). The ACCMIP proposed several timeslice runs complementing CMIP5, and the historical runs used in this study consists of 4 core timeslice runs (i.e., 1850, 1930, 1980, and 2000) and 5 tier-1 timeslice runs (i.e., 1890, 1910, 1950, 1970, 1990); GFDL-AM3 and HadGEM2 ran 1860 instead of 1850, but these are considered as 1850 for analysis. Model output is not available for all the timeslices. Table 1 summarizes the availability of model BC output for each ACCMIP timeslice run and for the CMIP5 transient historical run. Most models performed the core simulations and only a few models performed the tier-1 simulations. Except for the GISS-E2-R and the CICERO-OsloCTM2, each timeslice ran for 4 to 10 yr in order to reduce the amount of interannual variability; this reduces the noise in the computed changes (between simulations) and therefore increases the likelihood of relating them to the associated forcings. The GISS-E2-R participated with their CMIP5 transient simulations, which covers the entire historical period   Table 2 presents brief descriptions of the BC modeling of emission, aging, and deposition. Among 8 models, only 3 models used aerosol microphysics: GISS-E2-R-TOMAS, NCAR-CAM5.1, and HadGEM2. GISS-E2-R-TOMAS (Lee and Adams, 2011) and NCAR-CAM5.1  used aerosol microphysics to track aerosol number and mass explicitly based on a sectional scheme and a modal scheme, respectively. HadGEM2 used a modal scheme to track aerosol mass only (Bellouin et al., 2011). Although the size assumptions of emissions vary widely among models, this information is unlikely to affect the BC loading in models without aerosol microphysics. The hydrophilic fraction of emitted BC particles is assumed to be either 0 % or 20 % in models with the exception of HadGEM2 assuming 94.6 % hydrophilic fraction for biomass burning emissions. The larger hydrophilic fraction assumed, the faster BC particles are removed by wet scavenging. All models except NCAR-CAM3.5 inject the biomass burning emissions above the lowermost layer, and two models (GFDL-AM3 and NCAR-CAM5.1) inject the BC up to 6 km altitude above the surface. Injection height can play a big role in the lifetime and transport of biomass burning emitted particles (e.g., Sessions et al., 2011). For dry deposition, NCAR-CAM3.5 and CICERO-OsloCTM2 used a constant dry deposition velocity and the other models used the resistance series method, even though the details of the dry deposition parameterization among models vary widely. Wet scavenging of BC particles by ice/mixed-phase clouds can be important in high latitude regions such as the Arctic where most clouds are in ice/mixed-phase. Models account for this scavenging with either 12 % (two GISS models and CICRO-OsloCTM2) or 100 % (GFDL-AM3, HadGEM2, and MIROC-CHEM) of that by liquid clouds. NCAR-CAM5.1 does not allow wet scavenging by cloud ice. Most models treated the aging process simply with a fixed e-folding lifetime (i.e., 1-2.7 days). HadGEM2 assumes that BC from fossil fuel emissions remains hydrophobic even when aged and thus only enters cloud droplets through diffusion scavenging rather than nucleation scavenging: this accounts for the long lifetime of BC in HadGEM2 (see Sect. 4.1). Although NCAR-CAM5.1 has an option to include a separate externally-mixed BC mode that can become internally mixed after coagulation and condensation with the hydrophilic aerosol components such as sulfate, sea-salt, and organic aerosols, for these simulations all BC was assumed to be internally mixed immediately  n/a n/a Takemura et al. (2000Takemura et al. ( , 2002Takemura et al. ( , 2005 after emissions. Instead of a fixed value, BC aging lifetime in CICERO-OsloCTM2 depends on season and latitude, based on simulations using the full tropospheric chemistry version of Oslo CTM2 with the M7 aerosol microphysical module (Lund and Berntsen, 2012). The detailed description of each aerosol model is available in the references listed in Table 2 Takemura et al. (2000Takemura et al. ( , 2002Takemura et al. ( , 2005 for MIROC-CHEM.

ACCMIP BC emission
The ACCMIP simulations use the BC emission inventory covering the historical period (1850-2000) provided by Lamarque et al. (2010), which is built for the climate model simulations in CMIP5. BC emissions are largely distinguished by anthropogenic (i.e., originating from energy use in stationary and mobile sources, industrial processes, domestic and agricultural activities) and open biomass burning but are segregated into 12 sectors by a source type listed in Table 3 in Lamarque et al. (2010). Hereafter, anthropogenic BC emissions will be referred to as FF/BF (i.e., fossil fuel/biofuel) emissions and open biomass burning emissions as BB (i.e., biomass burning) emissions. A few key aspects of the CMIP5 BC emissions are summarized here. This emission inventory is based on previous inventories but has incorporated new information. The FF/BF emissions are mainly based on Bond et al. (2004Bond et al. ( , 2007   of three datasets: the GICC inventory for the period of 1900-1950, the RETRO inventory for the period of 1960-1990(Schultz et al., 2008, and the GFEDv2 inventory for 2000 . This does not provide a vertical profile for BB emissions and it allows the models to use different methods to determine an injection height (see Table 2). Finally, emissions are provided every 10 yr, so a linear interpolation is applied for a transient simulation to reproduce interannual variability; this might be a poor approximation especially for BB emissions. Figure 1a shows the total BC emission changes from 1850 to 2000 used in the ACCMIP historical simulations. All participant models use the same emission rate except the GISS-E2-R, which increases BB emission by 40 % to compensate for the underestimated BC predictions over the biomass burning regions of Africa and South America (Koch et al., 2009b). While having the same emission is useful for model inter-comparison, the 40 % increase in BB emissions results in less than 10 % changes in the spatial distributions of the column burden in the 2000 timeslice simulations except near and downwind from BB sources (based on GISS-E2-R 2000 timeslice experiments with/without 40 % increase in the BB emissions), and the range in emissions used in GISS-E2-R can be considered realistic. The total BC emissions increase almost linearly from 3 Tg (i.e., 1 Tg of FF/BF and 2 Tg of BB) in 1850 to 7.8 Tg (i.e., 5.2 Tg of FF/BF and 2.6 Tg of BB) in 2000, mostly due to anthropogenic sources. The emission between 1910 and 1950 is almost constant due to the economic situation and cleaner technology implementation (Bond et al., 2007). BB emissions between 1850 and 1900 are held constant (i.e., 2 Tg per year) .
The BC snow albedo effect is sensitive to the regional changes in BC emissions from preindustrial to present-day as the areas covered with the snow/ice are localized. Thus, we present the spatially distributed FF/BF and BB emissions in 1850, 1930, 1980; the segregated FF/BF and BB emissions by region are presented in S- Fig. 1 in the Supplement. The historical FF/BF emission evolution is closely related to economic status, air pollution control technology and policy, and fuel switch; details are in Bond et al. (2007).

The offline land and sea-ice models
Among the 8 ACCMIP models, only 3 models (GISS-E2-R, GISS-E2-R-TOMAS, and CICERO-OsloCTM2) calculated BC albedo forcing with their own snow model. In fact, NCAR-CAM5.1 computes the BC albedo forcing, but their albedo forcing is not isolated from other forcing mechanisms. To compute BC albedo forcing and vertically resolved BC snow concentrations, offline land and sea-ice models are used in each core timeslice (i.e., 1850, 1930, 1980, and 2000) for the 8 ACCMIP models. BC and mineral dust deposition fields from each model were prescribed with monthly resolution (annually-repeating) and linearly-interpolated to the model timestep. All runs were conducted using prescribed meteorology from 1994-2000, with spin-up from 1994-1995 and analysis (averaging) over 1996-2000. The land simulations applied the NCAR Community Land Model 4 (CLM4) (Lawrence et al., 2011), using bias-corrected atmospheric forcing data from Qian et al. (2006), and were run at 1.9×2.5 degree resolution. The CLM4 can have 5 snow layers at maximum, and the depth of each layer depends on the number of snow layers in the grid cell. For BC snow concentration evaluation, a constant depth is assumed for each model snow layer, shown in the parentheses after each layer in the following: 1st layer (top surface to 2 cm), 2nd layer (2 cm to 7 cm), 3rd layer (7 cm to 18 cm), and 4th layer (deeper than 18 cm); the 5th (i.e., deepest) snow layer is not used. When the top depth of the snow sample falls in the range, we selected the corresponding layer for the evaluation. The seaice simulations applied the Community Ice CodE 4 (CICE4), using inter-annually varying atmospheric forcing data from different sources. The CICE4 model has two snow layers. The depth of the top snow layer depends on the total snow depth and can be as deep as 4 cm. For the evaluation, the 1st layer is assumed to extend to 2 cm and the 2nd layer includes snow deeper than 2 cm. Sensitivity studies based on  1850, 1930, 1980, and 2000. the CICE4 model indicate that the assumption of fixed layer depths could alter the evaluation by at most several percent for the snow observations over ice surface; the fixed layer assumption could affect the evaluation more for the snow observations over the land surfaces. The land snow treatments of aerosol processes and radiative transfer are described by Flanner et al. (2007) and Lawrence et al. (2011), and the new sea-ice aerosol and radiation treatments are described by Holland et al. (2012). The snow and sea-ice fields generated with these offline configurations agree better with observed conditions during this time period than those simulated with coupled land-ocean-atmosphere simulations, but the precipitation and aerosol deposition fluxes are less compatible with each other than in coupled aerosol-climate simulations. The influence of this incompatibility on simulated surface snow BC concentrations and radiative forcing is somewhat mitigated by the use of temporally smoothed monthly aerosol deposition fields. Note that greater details of the technique of using offline CLM4 and CICE4 models with prescribed deposition fields will be available in Jiao et al. (2013).

Global-average BC budgets and spatial distributions
In this section, we present the global-average BC budgets and spatial patterns of BC simulations from each model as well as spatial patterns of average and relative standard deviation (RSD) of BC predictions from 8 models, which are referred to as "multi-model mean" (MMM) and "RSD" hereafter. Note that RSD is used to show model diversity and is calculated as a ratio of multi-model standard deviation to MMM. A higher value of RSD represents large model diversity and zero RSD means no disagreement among models. Table 3 presents the global-average BC budgets in 1850 and 2000 from each model, MMM and RSD for the corresponding process. Except for GISS-E2-R using 40 % higher BB emissions, all models including GISS-E2-R-TOMAS have the same total BC emissions (i.e., ∼ 3 Tg in 1850 and ∼ 8 Tg in 2000), matching the Lamarque et al. (2010) historical BC emission (see Sect. 2.1; Fig. 1). Global BC burden varies more than a factor of three among the models, ranging from 34 Gg to 103 Gg in 1850 and from 82 Gg to 315 Gg in 2000; excluding HadGEM2, which shows the maximum global BC burden, the range reduces to a factor of two variations (i.e., 34-68 Gg in 1850 and 82-169 Gg in 2000). However, the models consistently show that the global BC burden increases by 2.5-3 times from preindustrial to presentday, which is close to the 2.5 times increase in BC emissions occurring in the same period. This suggests that, for each model, burdens are proportional to emissions. Globalaverage BC lifetime is quite varied from 4 days in NCAR-CAM5.1 to 12-15 days in HadGEM2, but these differ insignificantly between 1850 and 2000 for all models except HadGEM2. The short lifetime in NCAR-CAM5.1 is due to its assumption that all BC is internally mixed with available hygroscopic material. The long lifetime in HadGEM2 is due to the fact that the BC particles from FF sources remain hydrophobic even when aged and thus result in slow wet scavenging -they cannot be removed via nucleation scavenging but still experience wet deposition via the collection by cloud droplets via diffusion and relative sedimentation between aerosol particles and precipitating droplets (Roberts and Jones, 2004;Bellouin et al., 2007). Wet deposition is the dominant removal process for BC, and its contribution to the total removal rate varies from 53 % in GFDL-AM3 to 95 % in GISS-E2-R-TOMAS; without GFDL-AM3 and GISS-E2-R-TOMAS, it ranges from 74 % to 90 %. Some models in Table 3 show the imbalances between the emission and total deposition rates seem to originate from a minor diagnostics problem or a minor undiagnosed term in deposition processes (see the Supplement for details). We believe this does not disqualify the models from the study.
The MMM and RSD of BC budgets in 2000 are compared to Textor et al. (2006) (hereafter, just TXT06), who studied model diversity using MMM and RSD of 16 global aerosol models in the framework of the AEROCOM project. It is important to note that the 16 models in TXT06 did not use consistent aerosol emissions and their BC emission RSD is 0.23 instead of 0.05 in this study. The MMM BC burden in this study is 155 Gg which is about a factor of two less than that in TXT06. This discrepancy can be explained by the higher mean BC emissions (∼ 12 Tg per year) in TXT06 than those here (7.9 Tg per year). The relative variation of BC burden among models is slightly lower (RSD = 0.42) in TXT06 than in the 8 ACCMIP models (RSD = 0.46). For BC lifetime, our MMM is 7.4 days, similar to 7.1 days in TXT06, but our RSD is 0.47, which is higher than 0.33 in TXT06. The lower model diversity in the TXT06 BC lifetime likely results from the fact that 12 of the 16 models used the same year reanalysis meteorology, while only one of the ACCMIP models was driven by reanalysis data, so that the higher variation in our BC lifetime (and thus burden) might be caused by variations in the meteorology simulated in each host model (Lamarque et al., 2013). This could explain why the diversity in the ACCMIP burdens exceeds the TXT06 burden diversity, even though the ACCMIP simulations use the same emissions while the TXT06 BC emission RSD is 0.23. Because all models (except GISS-E2-R) use the same BC emissions, the model diversity of the total deposition flux is negligible. Again, wet deposition is the dominant removal process for BC aerosols in the ACCMIP models. The MMM of the ratio of wet deposition and total deposition is 0.8 with RSD of 0.16, very close to TXT06. Wet deposition rates have small model diversity (RSD = 0.17) but dry deposition rates do not (RSD = 0.63).
The spatial distributions of BC column burden in 2000 for all ACCMIP models are presented in Fig. 2, and the MMM and RSD spatial patterns are shown in Fig. 3a, using a global map, and Fig. 3c, using the Arctic-focused map. MMM and RSD spatial patterns are computed after regridding the model outputs into 2 • in latitude and 2.5 • in longitude. In general, the model BC burden distributions show a large model diversity that tends to increase away from the source areas due to the large influence of the aerosol transport and deposition processes on the burden. The NH high latitude regions and some other remote regions show RSD of ∼ 1 or higher, which means the standard deviation is as large as (or more than) the mean. On the other hand, the high emission regions tend to have RSD of less than 0.3.
The MMM and RSD distributions of the total BC deposition fluxes are shown in Fig. 3b, using a global map, and Fig. 3d, using the Arctic-centered map to provide a better look over the Arctic. In general, the spatial patterns of MMM are roughly similar to the emissions shown in Fig. 1b and the column burden MMM distribution in Fig. 3a. Unlike the large diversity in the column burden distribution in Fig. 3a and 3c, the BC deposition fluxes display much smaller model diversity than the column burden. This is consistent with what we saw in the global-average budgets. Model deposition fluxes diverge significantly over the Southern Hemisphere (SH) high latitude. Figure 3d displays the relatively large model diversity over the Arctic regions especially near Greenland. Interestingly, although the RSD distributions of column burden and deposition fluxes show an increasing RSD away from the source regions, the lowest RSD in the deposition fluxes distribution seems to be over BB emission regions while the lowest RSD in the column burden distribution seems to be over FF/BF emission regions. This feature is not related to the 40 % enhanced BB emissions in GISS-E2-R as this was reproduced without GISS-E2-R (not shown). This might be related to the BB injection height, but we did not investigate this further.
To display the spread among 8 ACCMIP models for the spatial distributions of modeled BC between 1850 and 2000, MMM and RSD are computed based on a ratio of BC column burden in 2000 to that in 1850 from each model (shown in Fig. 4). In the same manner, the MMM and RSD of the deposition flux changes in 2000, 1980, and 1930  decreases over small parts of the US and Europe due to FF/BF emission and some parts of South America and Australia due to BB emissions. The RSD of the BC changes between 2000 and 1850 are mostly lower than 0.3-0.4 except in high latitude regions. Compared to Fig. 3a, Fig. 4 shows relatively small model diversity because each model's removal parameterizations behave similarly in 2000 and 1850. Over SH high latitude areas, we observed high model variations in the deposition flux (in Fig. 3b) in 2000 and also in the deposition flux changes in 2000 and 1980 relative to 1850 (in Fig. 5). Compared to the SH mid-latitude, the SH high latitudes exhibit a greater increase in BC deposition fluxes starting from 1980. Given that the SH high latitude regions are far from the source regions, this suggests increasing BC transport into this region. Thus, the larger model diversity might be due to the substantial diversity in aerosol transport among models, which is especially significant in the 1980 and 2000 simulations with increasing BC emissions in SH subtropics. In Fig. 5, the changes in 1980 relative to 1850 are quite similar to 2000 except for the higher increase ob-    Table  7), shown in red, and atmospheric surface mass concentration sites (listed in Table 4), shown in blue, and (b) Arctic BC snow concentrations sites. Note that Ny-Ålesund and Hyytiälä are shown as Ny-Alesund and Hyytiala, respectively. the deposition decreases over the high latitudes of South America and North America due to reduced BB emissions. data as well as the BC in Alert and Pallas were provided as a mass concentration whereas the BC data from the NOAA-ESRL-GMD were an absorption coefficient and thus were converted into a mass concentration by assuming a mass absorption cross section of 9.7 m 2 g −1 , following Skeie et al. (2011). When BC data were provided at multiple wavelengths including 880nm from Aethalometer, which is the case for most EMEP data, we chose to use BC concentrations at the 880 nm wavelength channel because the BC mass concentration measured at this wavelength is considered to represent the BC in the atmosphere. At this wavelength BC is the principal absorber of light and other known aerosol components have negligible absorption. Within the information we were able to collect, the Prelia BC data at 880 nm from the Aethalometer was converted with a mass absorption cross section of 16.6 m 2 g −1 , a default value set by the manufacturer for a wavelength of 880 nm (V. Ulevicius, personal communication, 2012). The Ny-Ålesund data were treated with 15.9 m 2 g −1 (Eleftheriadis et al., 2009) and for the Mace Head data, 19 m 2 g −1 before May 2005 and 16.6 m 2 g −1 after May 2005 (Junker et al., 2006;G. Jennings, personal communication, 2012).The Jungfraujoch BC data (Collaud Coen et al., 2007) was prepared using the method described in Collaud Coen et al. (2010). The BC data in Alert were converted using 19 m 2 g −1 from October to May and 28 m 2 g −1 from June to September to match with EC measurements (Sharma et al., 2002;S. Sharma, personal communication, 2012), and BC in Pallas (Hyvarinen et al., 2011) used the conversion method followed by Weingartner et al. (2003). Barrow data included only values from the clean air sector in order to avoid local contamination by emissions from the town of Barrow, and Mauna Loa data only included nighttime data to minimize exposure to local pollution sources (Bodhaine, 1995;B. Andrews, personal communication, 2012). Four out of 7 stations used the constant mass absorption cross sections, which vary from 15.9 m 2 g −1 to 19 m 2 g −1 by station (but up to 28 m 2 g −1 for the winter times in Alert). Our major conclusions are little affected by some variations in mass absorption cross section. Except for the Alert winter season, BC concentrations would differ only by −4 % to 14 % if using 16.6 m 2 g −1 as a default conversion value. Some of the BC measurement stations listed in Table 4 also provide a longterm measurement of carbon monoxide (CO) mixing ratio, so the modeled CO seasonality is also compared to the monthly mean of the CO mixing ratios measured between 1996 and 2005. Since CO tracer has a long lifetime (∼ 2 month), we used the seasonality of CO mixing ratio as a rough proxy to diagnose BC aerosol transport. Figure 7 shows the seasonality of BC mass concentrations in the model surface layer compared to the measurements listed in Table 4. For Jungfraujoch and Mauna Loa sites, we used the model BC concentration in the vertical layer from approximately 600-700 mbar in order to match their high altitude location (∼ 3 km). We provide the statistical measures at each station for each model including correlation  Table 5. The measured BC concentrations at the Arctic stations (i.e., Alert, Barrow, and Ny-Ålesund) exhibit a similar magnitude and seasonality with the minimum during the late summer and the maximum during the late winter and early spring. This seasonality pattern can be explained by the seasonal variations of transport pathways and deposition processes (especially wet deposition) within the Arctic polar dome and across the polar front (e.g., Stohl, 2006;Law and Stohl, 2007;Quinn et al., 2007). During winter, the polar dome typically extends toward more southerly latitudes than in summer, allowing low-altitude transport pole-ward (i.e., faster transport compared to the high-altitude transport) and limiting deposition processes because the atmosphere within the polar dome is stable and dry (Law and Stohl, 2007). In contrast, the summer polar dome inhibits direct low-level transport from the surrounding continents and enhances wet scavenging by forming frequent drizzle (Stohl, 2006). Gar-rett et al. (2011) use long-term surface CO and BC measurements to show that enhanced wet scavenging during summer is a more important driver for the Arctic BC seasonality than inhibited transport. At three Arctic stations, HadGEM2 captures the observed seasonality quite well (i.e., over R of 0.7 and LMNB of 0.5) but overestimates BC mass concentrations significantly in summer and fall. The two GISS models are able to capture the seasonal cycle at Alert and Ny-Ålesund with R of ∼ 0.8, but their prediction at Alert is poor especially during winter and spring seasons. Overall, the BC predictions vary by more than two orders of magnitude among models, and most models underestimate the BC concentrations severely in winter and spring, with a difficulty in capturing the seasonality. Table 5 shows all models except HadGEM2 with LMNB of −0.2 to −0.8 for the Arctic region. Unlike BC, the model CO seasonality is very similar among three stations and is fairly good compared to the measurements even with some underpredictions during the late winter and early spring (see Fig. 8). Also, the model diversity is very small. This may suggest that scavenging rather than transport is a primary cause for the poor BC seasonality and  Table 4. For Jungfraujoch and Mauna Loa, the model BC mass concentration in the vertical layer from approximately 600-700 mbar is used to match their high altitude location (∼ 3 km). Note that the observed BC is shown with the black line with the error bar including the minimum and maximum monthly mean during the observation period. Note that Ny-Ålesund and Hyytiälä are shown as Ny-Alesund and Hyytiala, respectively. the large model diversity. In fact, the poor seasonality of BC mass concentrations at these Arctic stations has been also shown in previous studies (Shindell et al., 2008;Huang et al., 2010a;Liu et al., 2011;Skeie et al., 2011;Browse et al., 2012;Wang et al., 2013). Several model studies (Huang et al., 2010a;Liu et al., 2011;Q. Wang et al., 2011;Wang et al., 2013;Browse et al., 2012) showed that the large underprediction of the Arctic BC could be improved by adjusting the treatment of wet scavenging: Liu et al. (2012) used GFDL-AM3 and Wang et al. (2013) used CAM5.1, but the modified scavenging schemes were not applied to their AC-CMIP simulations. However, it is important to mention that the deviation in the Arctic BC mass concentrations among the ACCMIP models cannot be explained by the difference in a single process such as the ice/mixed clouds wet scavenging scheme or the BC aging (see Table 2). Unfortunately, none of these studies investigated the impact of the modification of wet scavenging on BC deposition over snow and ice surfaces in the Arctic. The modeled BC concentrations at European stations (i.e., Pallas located in the sub-Arctic region, Hyytiälä, Mace Head, Preila, Jungfraujoch, and Ispra) show much less model  Table 4 are excluded if the CO measurements are not available, and the measured CO is averaged with available data between 1996 and 2005. NCAR-CAM5.1 and HadGEM2 do not provide CO output. Note that Ny-Ålesund is shown as Ny-Alesund.  Table 5 shows LMNE values of 0.3-0.4 except for NCAR-CAM5.1. At Jungfraujoch, the model to observation agreement was improved significantly when sampling the model values at 600-700 mbar, and models capture the observed seasonality well (6 models with R of over 0.9). However, the seasonal cycles at several stations vary widely among models, as also reflected in the wide range of R-values. The measured BC seasonality at high-latitude European stations (Pallas, Hyytiälä, Mace Head, and Prelia) resembles the Arctic seasonality especially sustaining a higher concentration during the late winter to early spring, although most models cannot capture this. The CO seasonality at those sites also looks similar to the Arctic with very small variations among models. At most North American stations (i.e., Sable Island, Bondville, Southern Great Plains, and Mauna Loa), the model BC concentrations agree within a factor of 2 of the observations (i.e., LMNE < 0.3; see Table 5). In general, they show a very weak seasonality. Similarly, the CO seasonality is quite weak at Southern Great Plains. However, Trinidad Head and Mauna Loa have a distinctive seasonality of CO. At Trinidad Head, most models did not capture the observed seasonality properly as the models peak 1-2 months earlier. This same behavior is also captured in the CO plot in Fig. 8.  Table 6. Log-mean normalized bias (LMNB) and log-mean normalized error (LMNE) of model present-day BC snow concentrations compared with the BC Arctic snowpack observations. Note that Canada sub-Arctic, Canadian Arctic, and Alaska N. slope regions in Fig. 9

Present-day BC snow concentrations in the Arctic
Recent measurements of BC snow concentrations (Hegg et al., 2009;Doherty et al., 2010;Hegg et al., 2010) were performed in Alaska, Canada, Greenland, Svalbard, Norway, Russia, and the Arctic Ocean, mostly in spring, by sampling the full snowpack depth (mostly down to the top few to tens of centimeters). These measurements can provide information on the BC deposition during the corresponding snow season. We obtained the measurement data from http://www.atmos.washington.edu/ sootinsnow/ArcticSnowBC.php. Since BC snow concentrations were not available directly from the ACCMIP models, we performed an additional set of simulations using the offline land and sea-ice models running with prescribed meteorology and aerosol deposition fields from each ACCMIP model (see Sect. 3 for more details). It is important to keep in mind that the ACCMIP simulations did not apply interannually varying emissions and simulated deposition fields are more consistent with each model's meteorology than that used to force the offline land and sea-ice simulations. To obtain a vertical profile of BC snow concentrations, the offline land model used a sophisticated BC-snow model developed by Flanner et al. (2007). For this work, we used the same "inefficient" melt scavenging parameters applied by Flanner et al. (2007), meaning that aerosols accumulate at the surface during snow melting. Limited field observations also show melt-induced impurity accumulation at the snow surface (Conway et al., 1996;Xu et al., 2012). To compare the observed data and the modeled BC, we first prepared the observation and the model data in the following way; (1) the observed data are averaged when falling into in the same model gridcell and snow layer; (2) the modeled data are averaged over 5 simulated years (1996)(1997)(1998)(1999)(2000) and sampled for the months of observations. The observation year was not a critical sampling condition in our case, for reasons provided later. Figure 9 shows scatter plots of the modeled BC snow concentration (ng of BC per g of snow) and the observed values in 8 regions consistent with the regional classification used in the original observation data: Arctic Ocean, Canada sub-Arctic, Canadian Arctic, Alaska N. Slope, Ny-Ålesund, Tromsø (hereafter, just Tromso), Greenland, and Russia. The LMNB and LMNE are computed for each region and each model (see Table 6). Based on the LMNB and LMNE values, the modeled BC concentrations are, on average, within a factor of 2-3 of observed BC and observations, except for in the Arctic Ocean and Greenland. Models overpredict Greenland BC, on average, by a factor of 4 to 8. For the Arctic Ocean, modeled BC is underpredicted, on average, by a factor of 2-5 (but 22 times for NCAR-CAM5.1). Compared to previous modeling studies (Skeie et al., 2011;Q. Wang et al., 2011), our model to observation agreements seem to be poorer. This is likely because the CMIP5 emission is decadal-scale, which is especially inappropriate for BB emissions. In fact, the observations report a significant contribution of BB emissions to the Arctic BC snow concentrations (Hegg et al., 2009(Hegg et al., , 2010. Also, based on GEOS-CHEM simulations, Q.  attribute 60 % of the Arctic BC in snow in spring 2008 (40 % in springs 2007-2009) to BB emissions.
Regarding the sampling method, the observation year was less critical in this study because (a) the ACCMIP simulations do not account for interannual variations in emissions, (b) the prescribed meteorology from 1996-2000 used in the offline models does not cover much of the observation periods (1998 and 2005-2009), and (c) the model results are quite insensitive to sampling with/without the observation year, based on the additional simulations we conducted with prescribed meteorology from 2000-2008. In fact, we initially ran the offline land and sea-ice models using reanalysis meteorology from 2000 to 2008 with deposition fields from 4 ACCMIP models (GFDL-AM3, GISS-E2-R, CICERO-OsloCTM2, and MIROC-CHEM). Using the 2000-2008 meteorology, we found little difference in model results between sampling for the year of observations and averaging from 2000 to 2008. However, we found that model BC snow concentrations decreased somewhat when using 1996-2000 meteorology compared with 2000-2008, especially over Russia and the Arctic Ocean, which is mainly due to the differences in snow/ice conditions between the two periods. However, this did not change the model to observation agreements much, however, in any region except the Arctic Ocean. For Russia, although the BC snow concentrations were reduced enough to change from overprediction to underprediction, for both cases models agree with the observations, on average, within a factor of two (not shown). Even though some of the Arctic Ocean data were actually sampled in 1998, models agree with the observations much better using the 2000-2008 meteorology (i.e., overpredict BC within a factor 2-3 of the observation vs. underpredict within a factor of 2-5 for 1996-2000 meteorology). This suggests that the Arctic BC snow concentrations are not insensitive to the choice of the meteorology period applied in the offline simulations, especially over the Arctic Ocean and Russia, or to interannual variations in BC emissions.
Precipitation rates simulated in the ACCMIP models affect the simulated BC snow concentrations only through their influence on aerosol deposition fluxes. The offline snow and sea-ice simulations all prescribe precipitation fields from reanalysis data spanning the same time period over which the BC measurements were conducted. We evaluated the AC-CMIP models' precipitation rates using gauge-based precipitation measurements that are available for the period of 1995 to 2004 and cover large areas in the Arctic (Yang et al., 2005). We obtained the observation data from http://ine.uaf. edu/werc/people/yang/yang files/MonthlySum/. We selected the measurement sites only when all monthly mean data from 1995 to 2004 were available. Figure 10 shows the seasonally accumulated precipitation rates for the multi-model mean and standard deviation compared to the observation over the high NH latitude region. With some exceptions, models overall have good skill in simulating the observed spatial and temporal patterns of the precipitation. This suggests that biases in model precipitation are likely only a second order source of bias in these evaluations of BC concentrations in snow.

Historical BC ice core concentration (1850-2000) evaluations
Aerosols in GCMs are mostly evaluated with observations from recent years to recent decades. Ice core records, possibly the only datasets to provide long-term historical information on aerosols, are extremely valuable for GCM model evaluations (McConnell, 2010), even if only a few datasets are available. Ice core evaluations in previous studies have been done using either BC deposition fluxes  or BC snow concentrations (Koch et al., 2011;Skeie et al., 2011); Koch et al. (2011) adjusted the modeled BC snow concentrations with the ice core precipitation data. In this study, we use BC snow concentrations, BC deposition fluxes, and precipitation, despite their close relationships. This is because BC snow concentrations are the most relevant for BC albedo forcing, while BC deposition fluxes and precipitation are directly simulated in a model, and therefore, their evaluation is very useful for models. Note that we do not examine matching between the model topography and the altitude of the ice core site, since this information was not available for all models. It might be questionable how well a GCM model performs over the Arctic, the Antarctic, and the high mountain glacier regions, especially with coarser grid resolution, but this topic is beyond the scope of our work. It is also important to mention that our discussions on the model temporal trends are quite limited to the emission patterns because our simulations do not track BC emitted from various source regions separately. Table 7 presents 12 BC ice core sites used to evaluate the historical BC predictions from the ACCMIP models: 4 sites in Greenland, 5 sites in the Tibetan Plateau, 1 site in the Alps (i.e., Colle Gnifetti glacier) and 2 sites in Antarctica. Additionally, we included BC ice core from the Fiescherhorn glacier in the Alps (Jenk et al., 2006), which is presented separately in the Supplement. We present the evaluation of modeled BC deposition fluxes (Fig. 11), BC snow concentrations (Fig. 12) and annual precipitation (Fig. 13) using the ice core data. Note that the same figure as Fig. 12 but with smaller y-axis scale is shown in S- Fig. 2 in the Supplement in order to display the observed BC concentrations more clearly. A few notes are provided here. (1) The Alps data are only used for BC snow concentrations due to the absence of precipitation data (and thus missing BC deposition fluxes). Specifically, the accumulated precipitation is impossible to obtain at this site because of the removal of winter snow by wind (F. Thevenon, personal communication, 2012). (2) For BC deposition fluxes evaluations, we used the CMIP5 transient simulations for NCAR-CAM3.5 and MIROC-CHEM instead of their equivalent timeslice simulations. (3) Modeled BC snow concentrations are computed using total BC deposition fluxes and precipitation rate, assuming all precipitation falls as snow, which is a reasonable assumption at these ice core sites. (4) Ice core observation data in Fig. 11 to Fig. 13 are presented as 5-yr running averages (thick black lines) along with annual-average (black dots) to reduce interannual variations. This helps to make it more comparable to our simulations, which are based on climatological-meteorology and decadal-scale BC emissions. BC emissions in the CMIP5 transient simulations are interpolated linearly between two adjacent decades and therefore might not represent a realistic interannual variation.
For the Antarctic stations, WAIS and Law Dome (Bisiaux et al., 2012a, b) show the lowest BC deposition fluxes (and BC snow concentrations) among sites and little change throughout the period. For example, comparing the average from 1850 to 1859 to that from 1996 to 2001-2002 for the ice core observation, the WAIS ice core records increase from 0.08 to 0.12 ng g −1 in BC snow concentrations and from 15 to 21 µg m −2 yr −1 in BC deposition fluxes; for Law Dome, there is no change in BC snow concentrations but an increase With the exception of GISS-E2-R, models with less than 7 days of BC lifetime (NCAR-CAM3.5, MIROC-CHEM, GFDL-AM3, and NCAR-CAM5.1) agree better with the observations, but MIROC-CHEM underpredicts in the preindustrial period in both Antarctic stations. For GISS-E2-R, GISS-E2-R-TOMAS, HadGEM2 and CICERO-OsloCTM2, their relative changes between preindustrial to present-day are similar to the observation even with their overpredictions. At Law Dome, most models overpredict BC deposition fluxes more than snow concentrations because of overpredicted precipitation (see Fig. 13). Figure 4 in Bisiaux et al. (2012b) shows a decreasing trend during the 1950s to the 1980s between the observed BC and SH grass fire emissions that are used in this study. However, the three ACCMIP models with the CMIP5 transient simulations exhibit increasing BC concentrations during the same period, which seems to resemble the overall SH emissions (or the SH fossil fuel and forest fire emissions) trend rather than grass fire emissions. The five Tibetan Plateau (TP) ice cores (B. ) are widely spaced on the Tibetan Plateau (Fig. 6a) and extend back to the 1950s. The BC deposited in the TP glaciers shows higher BC deposition/concentrations than other regions because of the proximity to the large BC emission areas. They have strong seasonality (i.e., high in winter and low in summer) that can be characterized by a dominant westerly jet stream and dry weather conditions in winter and the South Asian Monsoon and wet weather conditions in summer (Ming et al., 2008;B. Xu et al., 2009;Z. L. Wang et al., 2011). The westerly jet stream brings BC mostly from Europe, the Middle East, the Former USSR and North Africa while the summer monsoon brings BC from the south and southwestern areas including South Asia (Lu et al., 2012). The westerly jet stream plays an important role in transporting BC aerosols all year in the northern and northwestern TP regions (i.e., Muztagh Ata and Taggula).
Regarding BC temporal trends, as shown in Figs. 11 and 12, the TP BC ice core records display strong temporal variations from the 1950s to the 2000s that are not captured in the Fig. 11. Comparison of annual-average BC deposition fluxes in models to ice core observations. Observations are shown in two ways: annual-average shown in black dot and 5-yr running average shown in thick black line. Note that the CMIP5 transient simulations for NCAR-CAM3.5 and MIROC-CHEM are used here (presented as a solid line), instead of their ACCMIP timeslice simulations. For models with the timeslice runs, an error bar is used to present the mean of each timeslice run with its minimum and maximum of the corresponding timeslice runs and are positioned at the beginning of the corresponding timeslice year. CICERO-OsloCTM2 and HadGEM2 is presented with an "X" symbol for their mean. models. B.  pointed out that all TP ice core records except Zuoqiupu show the declining BC concentration during the 1950s-1960s possibly due to the decreasing European emissions. Instead of decreasing, the modeled BC increases steadily since the 1950s except MIROC-CHEM, which shows a very slight declining trend between the 1930s-1940s and the 1960-1970s. For the BC emissions used in this study, we found that the Western European emissions decrease during that period but the Eastern European emissions increase (see S- Fig. 1), which is similar to the model BC deposition history. The total BC emissions in Western and Eastern Europe fluctuate a little but do not clearly decrease during that period. This might indicate that, even though Eastern Europe is located closer to the TP, BC deposition in the TP ice cores is more influenced by Western Europe. In Muztagh Ata, which is most likely influenced by the westerlies all year, the modeled BC in the 1980s is higher than that in the 2000s, which is also consistent with the Eastern European emission trend because the western European emissions keep decreasing. At Zuoqiupu, the BC ice core shows an increasing trend from the 1980s without the high BC concentrations in the 1950s-1960s, possibly because of the unfavorable transport path from the European emissions to that site (B. Xu et al., 2009). The temporal trend at Zuoqi-upu is well captured by the models. In contrast to the temporal trend in the BC ice core record at Rongbuk from Ming et al. (2008) (i.e., shown the red line in Fig. 12), the BC ice core record from Mt. Everest spanning 1860-2000 (Kaspari et al., 2011), which matches the same location of Rongbuk glaciers used here, does not exhibit the high BC concentration in the 1950s-1960s but shows a similar trend as Zuoqiupu. However, this should be viewed cautiously because the BC ice core records in Kaspari et al. (2011) are unusually low (below 1 ng g −1 , close to the Antarctic BC ice cores).
Despite missing the strong temporal variations of BC in the TP areas, the time mean of the modeled BC depositions at Muztagh Ata, Tanggula and Noijin Kangsang are consistent with the observed values, with GISS-E2-R, GISS-E2-R-TOMAS and NCAR-CAM5.1 showing excellent agreement (see Fig. 11). All models overpredict the BC deposition significantly at Rongbuk and Zuoqiupu. Compared to the BC deposition fluxes, BC concentrations by GISS-E2-R and GISS-E2-R-TOMAS agree slightly less at Muztagh Ata, while the model overprediction is much reduced at Rongbuk (see Fig. 12). These changes are explained by the precipitation biases (see Fig. 13).
We also evaluate the simulated BC concentrations and precipitation during the monsoon season (June to September) Fig. 12. Same as Fig. 11 but for BC snow concentrations. The red thick line in Rongbuk is BC ice core data from Ming et al. (2008). Note that, unlike Fig. 11, the ACCMIP timeslice simulations for NCAR-CAM3.5 and MIROC-CHEM are used. and non-monsoon season (October to May) at Zuoqiupu (Fig. 14). Note that we present the same plot as Fig. 14 but with a smaller Y-axis scale in S- Fig. 3 in the Supplement in order to show the observed BC concentrations better. The observed BC concentration in Fig. 14 was treated with a 5yr running average. The observed precipitation data are simply derived using a 5-yr running average of BC concentrations separated into the two seasons, annual BC concentrations and annually accumulated precipitation. Although the BC concentrations are overpredicted regardless of season, most models predict BC concentrations that are 2-4 times higher during the non-monsoon season than during the monsoon season, which is consistent with observations. Precipitation at Zuoqiupu is captured well by the models.
Greenland ice core records at D4 (McConnell et al., 2007), ACT2 (McConnell andEdwards, 2008), andSummit andHumboldt (McConnell, 2010) are most frequently used in previous studies Koch et al., 2011;Skeie et al., 2011). Both models and observations exhibit a large interannual variation in Greenland (Figs. 11 and 12). All Greenland ice cores show a peak in the 1910s-1930s that is consistent with the highest BC emissions in North America and Western Europe in the same period. This signature is also captured very well in the models. Interestingly, the models simulate an increasing trend after the 1960s-1970s using BC deposition fluxes, but this trend does not appear in the BC concentrations or in the ice core records. For BC deposition fluxes, all models except NCAR-CAM5.1 capture the observation quite well (within a factor of two) until approximately the 1960s. The underprediction of BC surface mass concentration and overprediction of BC deposition fluxes in the 2000s suggests a problem with deposition. For BC concentrations, the models agree relatively well with the observations except at Summit. However, model to observation agreement for BC concentrations is not as good as that for deposition fluxes.
Similar to the Greenland ice cores, modeled BC at the Alps peaks in the 1930s, reflecting the influence of neighboring European emissions. From 1850 to 1950, the Alps ice core, from the Colle Gnifetti glacier (Thevenon et al., 2009), shows no clear increase because of particularly high concentrations from the 1850s to the 1890s (e.g., 12 ng g −1 for the 1890-1948 average, 14 ng g −1 for the 1850-1889 average or 11 ng g −1 for the 1850-1889 average excluding years with over 20 ng g −1 ). However, when including the observed BC from 1750 and excluding the high BC concentrations during the 1850-1890 periods, the Alps BC concentrations increase over time (e.g., 9 ng g −1 for the 1750-1889 average). The high concentrations during the 1850s-1890s do not appear in the BC ice core from the same glacier in the Alps in Lavanchy et al. (1999), but the two BC records have quite different BC concentrations. These differences might result from  the different measurement techniques applied to determine BC particles. Nevertheless, both Alps BC ice cores consistently show BC increasing by 2-3 times from the 1750-1890 period to the 1950-1975 period due to the regional emission changes (Lavanchy et al. 1999;Thevenon et al., 2009). Gabrieli et al. (2010) find that organic pollutants deposited in the Colle Gnifetti glacier, the same ice core as Thevenon et al. (2009), are influenced by neighboring European emissions. This agrees with the increase in the modeled BC concentrations, although the models seem to show the higher concentration in the 1930s, which agrees with the emissions changes in the ACCMIP models. We obtained the additional data from the Alps EC ice core extracted from the Fiescherhorn glacier (46.55 • N, 8.07 • E, 3900 m), which are available from Table S2 in Jenk et al. (2006), and we find that our models agree quite well (see S- Fig. 4 under the Supplement). Overall, our models and three Alps BC/EC ice cores suggest that the BC particles deposited in the Alps are most influenced by the neighboring European emissions.

BC albedo forcing
Global annual average BC surface albedo forcing is computed using the NCAR CLM4 and CICE4 models with the offline methodology described in Sect. 3. Results are presented in Fig. 15a for 1930 , 1980, and 2000 relative to 1850 (a) Offline BC albedo forcing (b) Offline BC albedo forcing vs. Online BC albedo forcing Fig. 15. Global annual average of (a) offline BC albedo forcing from 8 ACCMIP models and (b) comparison of the offline forcing to online BC albedo forcing. Note that the offline BC albedo forcing is computed in the offline land and sea-ice models (NCAR CLM4 and CICE4, see Sect. 3 for more details), and the online BC albedo forcing is computed in its own model. In (b), the offline BC albedo forcing is shown in the bar plot and the online forcing, in the line plot.
(hereafter, referred to as "offline" BC albedo forcing). The highest global annual average BC albedo forcing occurs in 1980, and the forcing in 1930 is similar to 2000. Our offline BC albedo forcing in 2000 ranges from 0.014 W m −2 to 0.019 W m −2 among the ACCMIP models, which differ little among models: 40 % difference for the 2000 forcing, 54 % for the 1980, and 68 % for the 1930. This is likely due to our method of using the same offline model and meteorological data (producing identical snow cover) to compute the albedo forcings for all deposition fields. , who applied the same snow albedo model with a coupled atmosphere-land-ocean configuration (CAM3), report a total BC albedo forcing of 0.05 W m −2 (relative to no BC), larger than the ACCMIP range of year 2000 total BC albedo forcings found here (0.024-0.037 W m −2 : relative to no BC instead of 1850). These differences arise partly because the offline methodology, applied here, produces less snow cover over the Tibetan Plateau and other parts of Asia that were subject to large forcing in the Flanner et al. (2007) study, leading to smaller forcing (see Fig. 16). Also, because the forcings for all time periods were quantified using snow and ice states representing the 1996-2000 period, which are likely diminished relative to previous periods, actual BC snow forcing in 1850 may have been greater (Lawrence et al., 2012). Three ACCMIP models (CICERO-OsloCTM2 and two GISS models) compute the albedo forcing in their host model (hereafter, referred to as "online" BC albedo forcing). We compare global annual average online BC albedo forcing with the offline BC albedo forcing in Fig. 15b. They agree well in terms of temporal pattern, with the highest forcing in 1980. Interestingly, even with the same parameterization of BC albedo effect used in GISS-E2-R and GISS-E2-R-TOMAS, the online BC albedo forcing is a factor of two lower than the offline forcing for GISS-E2-R and 20 % higher than the offline forcing for GISS-E2-R-TOMAS. It seems that the parameterization used in the GISS models (Koch et al., 2009a) is much more sensitive to the BC deposition fluxes -the BC-snow parameterization used in GISS models accounts for the impact on snow over the bare land and sea-ice but not over the land ice. Like GISS-E2-R, the CICERO-OsloCTM2 offline BC albedo forcing is also 2 times higher than the estimates from their own model. This suggests that BC-snow parameterizations and model snow cover can strongly influence forcing . It is worth mentioning that CICERO-OsloCTM2 shows little change in the present-day BC albedo forcing when switching from the Bond et al. (2007) anthropogenic BC emission dataset to the Lamarque et al. (2010) dataset. Skeie et al. (2011) reports BC albedo forcing of 0.008 W m −2 based on 2006 meteorology only due to the anthropogenic emissions changes (i.e., FF/BF) based on Bond et al. (2007). However, the time evolution of global annual BC albedo forcing in Skeie et al. (2011) is slightly different as a higher forcing is found in 1930 than in 1980. This shows that BC albedo forcing is sensitive to the changes in regional BC emissions, which might slightly differ between Lamarque et al. (2010) and Bond et al. (2007), though the BC emissions in Lamarque et al. (2010) are heavily based on Bond et al. (2007). Bauer and Menon (2012) estimate a global annual mean BC albedo forcing of 0.016 W m −2 , based on the MA-TRIX aerosol physics model  in the same GISS GCM model as GISS-E2-R and GISS-E2-R-TOMAS and using the same emissions from Lamarque et al.(2010). The main difference among these models is their aerosol representation. The estimate from Bauer and Menon (2012) falls in between 0.009 W m −2 (GISS-E-2-R) and 0.022 W m −2 (GISS-E2-R-TOMAS). Based on the GISS models, we find that differences in aerosol representations within the same GCM can produce more than a factor of two difference in global annual mean BC albedo forcing. Figure 16 shows the spatial patterns of BC albedo forcing in 2000 and 1980 relative to 1850. In 2000, BC albedo forcing is positive everywhere with the highest BC forcing (i.e., over 0.2 W m −2 ) over the Manchuria and Karakoram areas and relatively high forcing (i.e., over 0.1 W m −2 ) over most of the Former USSR. Most Arctic areas show small positive forcing. In 1980, the highest BC forcing (i.e., over 0.8 W m −2 ) is seen over a large area in the Former USSR. This is due to the high FF/BF emissions in the Former USSR in 1980 compared to 2000 (see  in the Supplement). The BC albedo forcing estimates diverge the most over Greenland, North America and the Tibetan Plateau for both the 2000 and 1980 cases (not shown).

Summary and conclusions
The main goal of the Atmospheric Chemistry and Climate Model Intercomparison Project (ACCMIP) is to investigate the influence of atmospheric gases and aerosols on climate change (Lamarque et al., 2013). As part of the ACCMIP, this work aims to evaluate preindustrial to present-day black carbon (BC) aerosols in the 8 ACCMIP models against 12 ice core records from Greenland, Tibetan Plateau, Alps, and Antarctica. Among 15 ACCMIP models, 8 models tracking BC aerosols as a prognostic tracer are used: GFDL-AM3, GISS-E2-R, GISS-E2-R-TOMAS, NCAR-CAM3.5, NCAR-CAM5.1, HadGEM2, CICERO-OsloCTM2, and MIROC-CHEM. We also evaluate the present-day BC predictions with long-term atmospheric BC surface mass concentrations and recent BC snowpack measurements from the Arctic regions. With the monthly mean deposition fields of BC and mineral dust from each model, we estimate BC albedo forcing with additional simulations using the NCAR CLM4 and CICE4 models, which include sophisticated BC-snow and BC-ice treatments Lawrence et al., 2011;Holland et al., 2012). To evaluate with recent snowpack measurements, we obtained a vertical profile of BC snow concentrations from the additional simulations.
We first examined the global annual budgets of BC emission, deposition, and lifetime in 1850 and 2000 from each model. Global annual BC emission rate is increased from ∼ 3 Tg yr −1 to ∼ 8 Tg yr −1 , mostly driven by anthropogenic emissions. Global BC burden differs by a factor of approximately 3 among models: 34 Gg to 103 Gg in 1850 and 82 Gg to 315 Gg in 2000. Similarly, BC lifetime varies from 4 days to 15 days with a negligible change between 1850 and 2000. This large model diversity stems from the differences in aerosol removal parameterizations and also from simulated meteorology among models. The dominant removal process for BC in models is wet deposition, contributing 80 % of the total deposition rate on average among models. For the relative changes between two timeslices (e.g., 1850 vs. 2000), models agree quite well because of the similar behavior in each model's own parameterizations for aerosol removals between two timeslices. For instance, the global BC burden from preindustrial to present-day increases by 2.5-3 times, which is close to the 2.5 times increase in BC emissions, suggesting that emissions are a main driver for the BC burden changes simulated by each model. Comparing spatial distributions of BC burden in the present-day simulations from the models, we notice that models diverge the most at both NH and SH high latitude regions. However, only SH high latitude regions appear to be noticeably divergent for BC deposition fluxes. Compared to 1850 simulations, models show increasing BC deposition over the Antarctic regions in 1980 and 2000 with particularly larger model diversity. This suggests significant model diversity in aerosol transport, which becomes important with increasing BC transport into the Antarctic. This model behavior is also shown in the ice core evaluation.
For BC in the present-day ACCMIP simulations, we find that the simulated Arctic atmospheric BC surface mass concentrations are severely underestimated during the winter and spring, leading to a poor seasonality. Based on our CO evaluations and previous studies (Huang et al., 2010a;Liu et al., 2011;Browse et al., 2012;Wang et al., 2013), improving the wet scavenging scheme seems to be the key to improving the Arctic BC seasonality in models. In general, the models capture the observed BC mass concentrations well in Europe and North America except Ispra. With the recent snowpack measurements, we find that the model's vertically resolved BC snow concentrations are, on average, within a factor of 2-3 of the measurements except for Greenland and the Arctic Ocean. Missing interannual variations in our emission dataset seem to contribute a considerable scatter in model to observation agreements compared to previous studies. During this evaluation, we found that the choice of meteorological period used to simulate the BC snow concentrations in the offline land and sea-ice models (i.e., NCAR CLM4 and CICE4) has a moderate impact on BC snow concentration due to the differences in cryospheric conditions between the two periods and is larger over the Arctic Ocean and Russia.
For preindustrial to present-day BC in the ACCMIP models evaluated with the ice core records, models with relatively longer BC lifetime (i.e., ≥ 7 days) overpredict BC deposition fluxes/concentrations in the Antarctic sites, while models with shorter lifetime (i.e. < 7 days), except GISS-E2-R, capture the observed magnitude quite well throughout the period. Modeled BC deposition fluxes increase during the 1950s to the 1980s, which might be due to rising SH total BC emissions, while the ice core records are correlated better with the declining SH BB emission during this period. Most models do not simulate the decreasing trend of BC from the 1950s to the 1970s that is measured in the Tibetan Plateau. As pointed out by B. , this decreasing trend reflects the strong influence of the Western European emissions to this region; Western European BC emissions declined after the peak in the 1930s. The increasing trend in models after the 1950s is consistent with emission increases in the neighboring regions including Eastern Europe, the Middle East, South Asia and East Asia. Without the high concentrations in the 1950s-1960s, some models simulated the observed magnitudes well except at Zuoqiupu and Rongbuk. Although models severely overestimate BC concentrations at Zuoqiupu, models successfully capture higher BC concentrations during the non-monsoon season than during the monsoon season. For the Greenland ice core records, models follow the observed temporal patterns quite well, with the maximum around the 1930s, and a magnitude similar to the observations. However, compared to the decreasing trend after the 1950s in the ice core records, modeled BC concentrations show either much slower decrease or no decrease. Finally, models tend to capture the temporal trends seen in the Alps ice core records (i.e., two ice core records from the Colle Gnifetti glacier and one from the Fiescherhorn glacier), although some discrepancy is observed between two BC ice core records from the Colle Gnifetti glacier possibly due to different measurement techniques. Both simulated and observed temporal trends indicate a strong influence from Europe.
Globally and annually averaged BC albedo forcing from the offline NCAR Community models ranges from 0.014 W m −2 to 0.019 W m −2 in 2000 relative to 1850 among the ACCMIP models. This is smaller than previously reported forcing  because of our method to compute the forcing; the offline method leads to less snow cover over the Tibetan Plateau and other portions of Asia, where Flanner et al. (2007) estimated large positive forcing. For spatially distributed BC albedo forcing in 2000, we estimate strong positive everywhere with high forcing (i.e., over 0.1 W m −2 ) over Manchuria, Karakoram, and most of the Former USSR. Models predict the highest global annual average BC forcing in 1980 rather than 2000 mostly because of the higher FF/BF emissions in the Former USSR in 1980 compared to 2000. Interestingly, despite the similarity between Lamarque et al. (2010) and Bond et al. (2007), we find that the global annual average BC albedo forcing is higher in 1930 than in 1980, when using Bond et al. (2007). This is based on the comparison between the ACCMIP simulations in CICERO-OsloCTM2 and results from Skeie et al. (2011).
For GISS-E2-R, GISS-E2-R-TOMAS, and CICERO-OsloCTM2, we compare our offline BC albedo forcing to the online BC albedo forcing computed in its own model. They can differ by up to a factor of 2, revealing how the BC-snow parameterizations and model snow cover impact BC albedo forcing. Global annual average BC albedo forcing from two different GISS models varies from 0.015 W m −2 (GISS-E-2-R) to 0.019 W m −2 (GISS-E2-R-TOMAS) with our offline forcing but from 0.009 W m −2 (GISS-E-2-R) and 0.022 W m −2 (GISS-E2-R-TOMAS) with the online forcing computed in the GISS GCM. Given the main difference in the two GISS models is aerosol representation (i.e., GISS-E2-R with no microphysics and GISS-E2-R-TOMAS with sectional aerosol microphysics model), this suggests that different aerosol modeling can lead to approximately a factor of two difference.
Supplementary material related to this article is available online at: http://www.atmos-chem-phys.net/13/ 2607/2013/acp-13-2607-2013-supplement.pdf. through the NASA Center for Climate Simulation (NCCS) at Goddard Space Flight Center. V. Naik and L. W. Horowitz acknowledge the efforts of GFDL's Global Atmospheric Model Development Team in the development of the GFDL-AM3 and the efforts of the Modeling Services Group for assistance with data processing. S. Ghan and J.-H. Yoon were supported by the US Department of Energy Office of Science Decadal and Regional Climate Prediction using Earth System Models (EaSM) program. The Pacific Northwest National Laboratory (PNNL) is operated for the Department of Energy (DOE) by Battelle Memorial Institute under contract DE-AC06-76RLO 1830. W. J. Collins and S. T. Rumbold were supported by the Joint DECC and Defra Integrated Climate Programme (GA01101). The CICERO-OsloCTM2 simulations were done within the projects SLAC (Short Lived Atmospheric Components) and EarthClim funded by the Norwegian Research Council. The CESM project, including NCAR-CAM3.5, is supported by the National Science Foundation and the Office of Science (BER) of the US Department of Energy. The National Center for Atmospheric Research is operated by the University Corporation for Atmospheric Research under sponsorship of the National Science Foundation. The MIROC-CHEM calculations were performed on the NIES supercomputer system (NEC SX-8R) and supported by the Environment Research and Technology Development Fund (S-7) of the Ministry of the Environment, Japan. Development of historical records of BC in Greenland and Antarctic ice cores was funded by the National Science Foundation.