Comment on acp-2020-1296

This study reports results from an extensive set of simulations with the GISS-E2.1 model and two different emission inventories, used to investigate recent past and projected future changes in Arctic aerosols and aerosol-induced climate impacts. I find the study interesting and within the scope of ACP. However, I also think it needs substantial further improvement before it can be accepted for publication. In particular, I find parts of the manuscript difficult to follow (the most notable example being the section on radiative forcing), and in some cases the possible reasons behind particular results could be better discussed. A better description of the experiments is needed for readers outside the AMAP group, and I'm missing some context on the impact of emissions other than aerosols and their precursors. In the introduction, the authors could better motivate why their study is important and timely. Finally, the figures could be made more visually appealing. I think that improving the structure and readability should be quite feasible, and some additional effort will make for a much stronger manuscript.

Specific comments:
Line 30: "have been"? As in historical or in previous modeling work?
Line 33: Why also for climate parameters? What is different in the experimental setup?
Lines 37 onwards: would be useful to have the RF over the 1990-2014 period as well to understand changes in the scenarios.
Line 46-48: Still due to changes in aerosols only? It should be clearer from the abstract how greenhouse gases are treated.
Line 50-54: Similarly to the above comment, the role of aerosols vs. other emissions is a bit unclear.
Line 78: "mostly": what are the remaining effects?
Line 88: "warming effects": here and in the following paragraphs I would suggest the authors be a bit more precise with regard to positive and negative RF versus warming/cooling, with the latter used only when actual temperature estimates are given. Furthermore, perhaps be clear whether it's surface warming or warming in general.
Line 109 onwards: Somewhere this section should mention/discuss long-range transport (LRT). While forcing exerted remotely is an important factor, there is also a lot of literature on the source attribution of Arctic aerosols. Given that Arctic burdens are shown later, LRT needs to be understood in order to interpret changes in burden over time.
Line 111: is this per unit global sulfur emission?
Line 131: I think this paper actually removed aerosols entirely? Relevant for the response.
Section 2.2: perhaps reconsider the number of small paragraphs? The section becomes a bit broken up, and its first sentences are repeated later.
Section 2.3: this is probably clear to people who are familiar with the AMAP runs, but to me it's very unclear how other emissions (CO2, etc.) are treated in these experiments, which in turn makes the results hard to interpret. I think the experiments could be explained better.
Line 303: when I think of IMPROVE, I don't exactly think of the Arctic. Perhaps it could be useful to give the number of stations in each network that are within the relevant region? (yes, there are SI tables, but to help the reader).
Section 2.4.1: In later tables and figures satellite observations of AOD are mentioned, but I can't see those described here? Please clarify.
Line 383-384: I don't understand this sentence and relationship. Please consider rephrasing.
Section 3.1: In general, an indication of the interannual variation around the climatological mean would be very useful, at least for observations when this can be added to the figures.
Section 3.1.1: perhaps discuss the seasonal differences in the underestimation better. Moreover, I'm not convinced by the inclusion of individual ensemble members as separate experiments. I think it adds unnecessary complexity and, given how briefly these results are discussed in the text, they could rather be presented as an average with a ± range. (This goes for climate variables as well.)
Line 423-430: From Figure 3, it seems that OC is very well captured. This seems worth describing and explaining. It surprises me that the seasonal cycle of the observations is so different from that of BC. Is it due to a dominance of biogenic SOA?
Figure 2: Is this an average over all stations? Please be specific.
Lines 555 onwards: I find the discussion around the role of SOA hard to follow. So OC in Figure 6 includes SOA? What is the OA-OC conversion factor? Furthermore, I think more explanation of why these differences exist is needed, rather than just attributing one to the other.
Section 3.2: use the same unit as in Figure 6?
Section 3.3: this section needs some improvement. I would recommend using the terminology RFari and RFaci. I also don't think you need to keep saying TOA radiative forcing; that's in the definition. This will help improve readability as well. The section is difficult to follow with the many different time periods used. For instance, lines 595-602 give a set of numbers that are not quite different from the sum of the aerosol RF in Figure 8. Lines 595-602 seem to give the RF due to changes in aerosols from 1990 to 2010 and then from 2030 to 2050, but what about the RF due to the difference from 2010 to 2030 and 2050?
Figure 8: RF is already a delta, a perturbation vs. a baseline, so it's not clear to me what this figure is showing. Some RF numbers have a ± range, but that doesn't seem to be the case for the numbers in Table 4.
Line 654: here the 2015-2050 forcing is also introduced. To me, this is a more relevant measure than, e.g., the forcing in 2050 relative to 2030, because in many cases the emission changes are not that large from 2030 to 2050. At the very least, give this period in Table 4 and hint to the reader at the beginning of the section that it will be mentioned. And why calculate this relative to 2015 and not 2010?
At the beginning of the paper, you talk about how Arctic climate change is primarily due to remote forcing. For this reason, I think it would be useful to also give the reader an idea of the global mean RF. This can be done in the SI, but would also enable comparison with previous work. One or two figures of the geographical distribution of forcing would also be useful. These can be sub-panels of Figure 7.
Line 667-671: the model gives a 10 degree per decade change compared to 2 degrees from the observations? That seems like a very noticeable difference, which I don't think can just be mentioned briefly like this; it needs more attention. What does this imply for confidence in any of the projections?
Line 698: Not sure lowNTCF has been defined anywhere?
Line 754-755: Here you have 3 big figures in the SI that show hardly anything but white maps, and you then show that whatever they do show is not really significant. I would perhaps reconsider the usefulness of and need for these figures.
Figure S8: I'm not sure it's correct to refer to projected changes as anomalies? And is the isoprene plot referred to anywhere in the text? This is important for the discussion about SOA burden.