The authors have responded to many of the comments from previous reviews by adding Supplementary Material, adding a section on Error in Ash Retrievals and revising the section on Possible Column Collapse and PDCs. As a result, the paper is greatly improved.
I feel that some of the issues regarding the accuracy of satellite retrievals have not been fully addressed, but the revised discussion makes the reasoning behind the current estimates much clearer. I have highlighted two areas below where I think more clarification is required of how variations in the methods used affect the results. If these points are suitably resolved, then I would recommend the paper for publication.
I have also added a list of minor errors / typos for the revised draft.
29 June 2017
## Supplementary material
This is a welcome addition. The photographs highlight the two spreading levels of the plume, as well as the overall complexity.
## Error in ash retrievals
I really appreciate the addition of this extra section. It is useful to highlight the difficulty of quantifying the effect of irregularly shaped particles when there are few data on their optical properties. Comparison with independent measurements is a useful alternative.
I would like to see more detail on the uncertainties due to cloudiness and lack of thermal contrast. In the Response to Reviewers, the authors suggest that I am under the 'common misconception' that ash clouds with high concentrations are not detected. They are detected as some kind of cloud due to their increased optical depth and lower brightness temperature, but cannot be identified as volcanic ash and a retrieval cannot be made. What is not clear from the current manuscript, nor from other similar publications, e.g. Prata and Prata (2012), is how such pixels are identified and incorporated into mass loading estimates. Is it a manual process? Are they excluded from calculations? If so, the mass loadings must be considered minima.
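To make the point about minima concrete, here is a minimal sketch of the bookkeeping I have in mind. It is not the authors' procedure. The function name, the use of NaN to flag pixels without a retrieval, and all of the numbers are my own assumptions for illustration.

```python
import math

def total_mass_lower_bound(loadings_g_m2, pixel_area_m2):
    """Sum ash mass over pixels that have a retrieval.

    loadings_g_m2: per-pixel mass loadings in g/m^2; NaN (or None) marks a
        pixel flagged as cloud where no ash retrieval could be made
        (a hypothetical convention, not taken from the manuscript).
    pixel_area_m2: area of one pixel in m^2 (assumed constant here).

    Returns (mass in grams, number of pixels without a retrieval).
    """
    retrieved = [
        x for x in loadings_g_m2
        if x is not None and not math.isnan(x)
    ]
    n_failed = len(loadings_g_m2) - len(retrieved)
    # Pixels without a retrieval contribute nothing to the sum, so the
    # result can only underestimate the true mass whenever n_failed > 0.
    mass_g = sum(retrieved) * pixel_area_m2
    return mass_g, n_failed

# Invented example: five 1 km^2 pixels, two of them without a retrieval.
example_loadings = [1.0, 2.5, float("nan"), 0.5, float("nan")]
mass_g, n_failed = total_mass_lower_bound(example_loadings, 1.0e6)
```

If pixels without a retrieval are dropped in this way, the reported total is a lower bound, and reporting the number (or area) of excluded pixels alongside it would let readers judge how far it might fall short.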
The response to reviewers states "Stevenson goes into some detail to suggest that satellite retrievals are much less certain than we have stated. We think in general this may be true and there are many assumptions relied upon, which might not be appropriate in all cases and could be improved with further study, and these contribute to the uncertainty in the retrieval." The response then suggests that the uncertainty in the Grímsvötn results is lower because the agreement with validation data is good. Good agreement in one case does not show that the underlying uncertainty is small: just because someone wins the lottery doesn't mean that the odds weren't millions to one!
The brightness temperature difference method assumes that particles are dense spheres, which only exhibit the BTD effect when the size distribution is dominated by particles <10 µm diameter. Thus, any pixel displaying a BTD signal will be interpreted as being dominated by particles <10 µm diameter. If non-spherical and bubbly particles cause a BTD signal at larger grainsizes (as demonstrated by Kylling et al., 2014), and ash grains are not dense spheres (as demonstrated by hundreds of tephrochronology studies), then the grainsize will be underestimated.
The only logical way to argue that this doesn't apply is to present evidence that Kylling is wrong, or that volcanic ash grains ARE dense spheres when airborne.
Neither the response to reviewers nor the updated manuscript addresses the additional source of uncertainty described in Stevenson et al. (2015), namely that the mathematics behind retrieval algorithms biases them towards solutions involving smaller grainsizes. For a given observation, the algorithms prefer solutions with low concentrations of optically active (small diameter) particles.
The method of Prata and Prata (2012) is not a 3-parameter retrieval like that of Francis et al. (2012), but instead uses a lookup table for a specific cloud-top temperature. In this scenario, there is only one possible combination of effective radius and optical depth that matches a given observation. The choice of cloud-top temperature can therefore bias retrievals towards higher or lower grainsizes. I would like to see discussion of how the cloud-top temperatures were chosen and of how varying them contributes to the uncertainty in the retrievals.
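The kind of sensitivity test I am asking for can be sketched as follows. This is not the Prata and Prata (2012) scheme. It assumes a much simpler, absorption-only, single-layer model in which the observed brightness temperature in each channel is T_obs = T_s exp(-tau) + T_c (1 - exp(-tau)), with T_s the surface temperature and T_c the cloud-top temperature. The function names and all temperatures are invented for illustration. In a real retrieval the ratio of the two optical depths would be mapped to an effective radius through a lookup table, which is not reproduced here.

```python
import math

def optical_depth(t_obs_k, t_surface_k, t_cloud_k):
    """Invert T_obs = T_s*exp(-tau) + T_c*(1 - exp(-tau)) for tau."""
    transmittance = (t_obs_k - t_cloud_k) / (t_surface_k - t_cloud_k)
    if not 0.0 < transmittance < 1.0:
        raise ValueError("observation is inconsistent with the assumed T_s and T_c")
    return -math.log(transmittance)

def retrieve(t11_k, t12_k, t_surface_k, t_cloud_k):
    """Return (tau at 11 um, ratio tau_12/tau_11) for one pixel.

    For a fixed cloud-top temperature the solution is unique, so any
    error in t_cloud_k passes directly into both outputs.
    """
    tau11 = optical_depth(t11_k, t_surface_k, t_cloud_k)
    tau12 = optical_depth(t12_k, t_surface_k, t_cloud_k)
    return tau11, tau12 / tau11

# Same (invented) observation, two assumed cloud-top temperatures.
tau_cold, ratio_cold = retrieve(250.0, 255.0, 280.0, 220.0)
tau_warm, ratio_warm = retrieve(250.0, 255.0, 280.0, 230.0)
```

In this toy case a 10 K change in the assumed cloud-top temperature changes both the optical depth and the channel ratio for an identical observation. Repeating the actual retrieval over a plausible range of cloud-top temperatures would show how much of the uncertainty comes from this choice.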
## Possible column collapse and PDCs
This section is much improved and the supplementary images of the plume illustrate the complexity well.
"There are some reasons why we cannot be sure that aggregation is the sole driver of a partial collapse." I'm not clear what argument you are making here. Are you saying that there are many coarse particles and so the plume would have collapsed anyway? Can you rephrase?
Clarisse and F (2016): who is second author?
## Line-by-line comments
3:10 - Did you mean that 16 microns is the largest size?
3:24 - Spelling: glacier
6:9 - Standard atmosphere is important, but the ice/water at the vent at Grímsvötn were probably a bigger factor and are not accounted for in Mastin.
7:30 - Spelling: grain sizes
17:5 - Do you mean fine ash or very fine ash? The two terms seem to be used interchangeably in consecutive sentences. Clarify by repeating definition.
18:6 - Kylling et al. (2014) found errors of 40%, which is greater than the 10-30% cited here.
20:29 - Spelling: Grímsvötneruption
22:7 - Adding meltwater removes energy from the plume.