Comment on acp-2021-459

However, I believe that additional work would need to be done before publication. I would recommend the authors to provide additional evidences to support their conclusions. Specifically, they argue that "online treatment of eruption dynamics improves the volcanic ash and SO2 dispersion forecast". Unfortunately, the paper fails to explain why the online treatment improves ash/SO2 dispersion forecast.

Overview: =============== This is an interesting study providing valuable insight into the importance of online coupling approach in volcanic ash and SO2 dispersion modelling.
They show that the online approach leads to significant differences in the vertical structure of ash and SO2 plumes when radiative effects are taken into account. This is a novel and important study shedding new light on the processes involved in complex dynamics of volcanic plumes and is suitable for publication in Atmospheric Chemistry and Physics.
However, I believe that additional work would need to be done before publication. I would recommend the authors to provide additional evidences to support their conclusions. Specifically, they argue that "online treatment of eruption dynamics improves the volcanic ash and SO2 dispersion forecast". Unfortunately, the paper fails to explain why the online treatment improves ash/SO2 dispersion forecast.
General comments: =============== * Throughout the article, the authors highlight the impact of aerosol-radiation interaction on the simulations. However, no evidence is provided to conclude that the online approach leads to better agreement with observations.
One of the arguments used by the authors to claim that the online treatment improves model predictions is based on the temporal evolution of the total amount of ash (Fig. 4). For example: l. 260, p. 10: "Thus, we conclude that the online treatment of plume development improves the ash loading prediction during the first hours and days after the eruption." However, this conclusion seems to be incorrect as the Fplume-norad simulation shows the best agreement with satellite data in Fig. 4. Moreover, differences in the total mass loading could be related to a possible underestimation by satellite retrievals or a poor estimation of the fraction of very fine ash used to compute the mass eruption rate.
The authors had the opportunity to show the improvement of the online approach by using the SAL metrics. However, only results for the Fplume-rad simulation are presented in Section 3.2.
Suggestion: Improvement of the online approach could be demonstrated by comparing SAL results for Fplume-rad and Fplume-norad. For example, you could compute the SAL metrics as the sum of the absolute values of S, A and L, which results in an index ranging from 0 (optimal) to 6. The Suzuki distribution should be normalized by the integral of S, instead of the maximum S. For discrete point sources, you should normalize using the sum of S requiring: E = \sum S* * It seems that you distinguish between the terms "eruption phase" and "puff" in some parts of the manuscript, while sometimes are used as synonyms.
In order to avoid confusion, it would probably be more convenient to unify the terminology and use only "eruption phase". In my opinion, "puff" is a bit ambiguous for a complex multi-phase eruption like Raikoke.
* What do you mean by "insensitive" here? Variations of 10% in column heights didn't affect MER estimations? Please clarify * Figure 2 (title): "MER of very fine ash (<30 um)" -> It should be "<32 um", right? * l. 217, p. 9: "The method assesses predefined objects based on a threshold value" You don't say how these objects were defined. What threshold value have you used? * l. 230, p. 9: Where do these gaps come from? The mean averaging you applied to fill gaps conserves the total mass? Or are you adding new mass with this approach? * l. 232, p. 9: Regridding mass loading to a coarser grid by a linear interpolation is not the best approach, as mass conservation is not ensured. Do you have an idea how much is the total mass difference induced by the interpolation method? * l. 260, p. 10: "Thus, we conclude that the online treatment of plume development improves the ash loading prediction during the first hours and days after the eruption." This statement is not correct. It cannot be concluded from the results presented in Section 3.1 that the online treatment improves simulations. In fact, FPlume-norad outperforms Mastin-rad, and you cannot say that FPlume-rad is better than FPlume-norad. I think the only valid conclusion here is: "the FPlume experiments (ie, FPlume-rad and FPlume-norad) showed better agreement with observations"  Fig. A2 are really good.
I think it would be worth including Fig. A2 in the main body of the paper, probably replacing Fig. 3. * l. 295, p. 12: Why are you defining two threshold for volcanic ash instead of using a single threshold for total ash? Why didn't you define a threshold for the giant mode? What threshold have you used for ash in Fig. 6? * l. 296, p. 12: I see no reason to remove those 'steps'. Why top height is greater than zero before the eruption starting time in Fig. 6? This has to do with the smoothing? Since data was smoothed, you should at least indicate the vertical resolution of the model. Is it comparable to the differences found in top height between rad and no-rad experiments? * Figure 6: How is the mass averaged height computed? Is a vertical average? In this case, the average is weighted by mass concentration or by mixing ratios? Or is a horizontal average? In this case, the average is weighted by mass loading (in g/m2).
Are vertical profiles also averages? Or are they computed at specific locations? How this average is performed? Do exclude grid cells without ash/SO2 from the average? * l. 306, p. 13: "in the FPlume-norad case still shows the overlap of the different phase dependent profiles" Where is shown this overlap? Please clarify. * l. 318, p. 14: "the vertical distribution of the total ash mass" What do you mean by "total mass"? Is the total mass within a model layer?
* Figure 7(c): "SO2 mass loading in kg" -> mass loading should be in units of g/m2 as in Fig. 3. Why are you showing "mass" in (a) and "mass loading" in (c)? What is the difference? * Equation (3): Obtaining the median radius for a multimodal log-normal distribution is not a trivial problem. Are you sure the median radius is given by such a simple formula? Or this expression only defines a "characteristic radius"? * Figure A1