Comment on acp-2021-99

Lines 18-19: Agreement between growth and nucleation rates described as “remarkably good” and “matched .. well”. A quantitative description of the agreement is needed. Line 19: Formation rate should be independent of the instrumentation used, so I’m not sure that the qualification of “especially given the fact that they were calculated from different instruments” is meaningful, at least not without a thorough explanation of why one might expect them to be different.

this analysis to make instrument design recommendations which are of interest beyond this specific instrument and perhaps beyond chamber experiments. I believe that this manuscript may be better suited for Atmos. Meas. Tech., being of primarily technical interest rather than containing general implications for atmospheric science.
The details of the Bayesian algorithm are currently in review at Geosci. Model Dev. (Ozon et al., 2020). This manuscript focuses on the application to chamber data, which is valuable to the community, and it seems very appropriate to keep this in a separate paper from the one describing the algorithm. Advancement of techniques for aerosol size distribution inversion and calculation of process rates, with the explicit calculation of related uncertainties as presented here, is of value to the aerosol science community. Large uncertainties in growth and nucleation rates, and a lack of mathematically rigorous techniques for calculating them, have made it hard for the community to compare between different experiments and to fully understand the effects of different environments and emissions on aerosol climatic and health effects. This analysis addresses these problems for the specific case of the DMAtrain on the CLOUD chamber, which is of value in and of itself, but also as an example for those developing techniques for other instruments or environments. The implications for instrument design drawn explicitly from the analysis presented here are of great value to the community and have the potential to influence thinking on the design of a range of aerosol instrumentation.
This manuscript relies heavily on the algorithm presented in Ozon et al. (2020), with more limited explanation here. My review is based on the analysis of results presented in this study only, and not the details of the algorithm presented in Ozon et al. (2020). I note that Ozon et al. (2020) is currently in review, and so caution that publication of this work should depend upon successful peer review of the work which it builds upon.
I have three main concerns with this manuscript regarding the science presented:
1. A major claim of this manuscript is that the nucleation rates calculated from this method agree with nucleation rates calculated using a different instrument and method. Figure 4b shows that, within calculated uncertainties, this is not consistently true. The authors mention that disagreement between instruments is often an order of magnitude, as opposed to the much smaller disagreement shown here. This suggests that this method is a great improvement upon previous methods, which is very valuable to our field, but more rigor and accuracy is required in how the result is described, or in explaining why we might not expect these two results to agree within uncertainties.
2. There is a general lack of quantitative comparison between results from different methods.
3. I was unable to fully understand some aspects of the method and remain unconvinced about some method choices made in this study. This may be a clarity of presentation issue, or there may be things that need addressing in the analysis itself.
I will address these specifically, and other issues in the line-by-line review below.
Regarding clarity of presentation, I am concerned that this manuscript relies too heavily on other literature, to the extent that it is not possible to fully understand the work presented without constant reference to other literature. Specifically, this manuscript relies on Ozon et al. (2020) to such an extent that the methods are not understandable without either having a very intimate knowledge of that paper, or keeping both papers open and constantly going back and forth between them. While it makes sense to build on the prior work and not reproduce too much in a new paper, something should be done to make this paper more understandable on its own. I would also like to mention that, while not incorrect, the paper, being aimed at a general atmospheric chem/phys audience in this journal, might be more readable if the authors explained the general idea of Bayesian state estimation problems to a general chemistry audience. Ditto Kalman filters. I also note that this manuscript contains a lot of symbols with definitions placed throughout the text. A table defining all terms somewhere accessible would ease readability.
In terms of adequate referencing of existing literature, there are a number of notable omissions in this manuscript. The discussion of formation and growth rate calculation methods makes no mention of Kurten et al. (2015), despite this method also being developed for the same CLOUD chamber. Another method used frequently on data from the CLOUD chamber is the use of the AeroCLOUD model, as described in the methods section of Kirkby et al. (2016). The work of Fiebig et al. (2005) has also been missed. Without placing the work presented in the context of these other studies, it is hard to assess its importance and relative merits. Lack of knowledge of some of this literature seems to have led to erroneous statements in the text, which I will point out below.

Introduction
Line 27: "their concentration" -it is unclear what this refers to; I suggest being explicit.
Line 68 -The Dada et al. formation rate calculation method is mentioned here but not included in the earlier paragraph describing the different methodologies for calculating formation rates. Why?
The description of why the DMAtrain is used is a little puzzling. Were other instruments available which did not meet the criteria laid out? This section needs clarification in terms of motive along with more specific clarification as follows: Line 70 -high time resolution -what is the time resolution of the measurement and how does this relate to rates of change of key observables in nucleation events in the chamber?
Line 71 -have the collection efficiencies for other candidate instruments not been as carefully characterized?
Line 73 -it would be helpful to explain why it is advantageous to have a new instrument to calculate formation rates from instead of optimizing methods for instruments that have been used previously

Section 2
Line 76 -there are also other quantities of interest, e.g. chemical composition of clusters, formation rate at larger sizes, coagulation rates … it might be better to rephrase this sentence.
Line 83 -"often done by analysis of time evolution of retrieved particle size distributions" this needs references.
Line 103 -states coagulation can be neglected in certain cases with low particle concentrations -in the context of this study this seems misleading. Coagulation has been explicitly shown to matter in the context of chamber new particle formation experiments (Kurten et al., 2015), which can still be considered to be low concentration environments.
Line 110 -"size space" is this particle diameter? Needs to be explicit.
Line 117-118 "The incorporation of positivity constraints for the process rates and the aforementioned second order models require minor modifications in the definition of the state variable." -it is not clear enough what the aforementioned second order models are, and the modification needed should be shown explicitly. I can understand why some of the methodology refers to Ozon et al. (2020), but for instances such as this it makes reading the paper and understanding the method too time consuming and difficult. Readability in general would be greatly improved if the method could be better understood from this paper without needing to refer to other literature as much.
Line 149 -loss rates were defined as lambda above, but lambda doesn't appear in eq. 5 and 6. By loss rates is the author referring to -r 1 r 2 J (k-1) ? This needs to be better defined.
Line 150 -τ_tot and τ_chamber need defining. Also, where does this dilution time come into the calculations presented here?
Line 159 -what is the justification for stronger correlation between closest size bins in this method vs Ozon et al 2020?

Section 3
Line 179 -only sub 10nm size distribution measurements are mentioned. What about particles that grow > 10nm? These will surely contribute to coagulation sinks and therefore need to be accounted for?
Line 190 -steady state charge distribution is achieved for flow rates up to 5 lpm. Flow rates used are 5.5 lpm. How does this affect the collection efficiency? Some justification is needed for going outside of the steady state flow range.
Line 194-199 -description of 2 stage CPCs is a bit confusing. Are two "boosters" used on each channel? Is it a different booster for the different channels? If so, why?
Fig 1 c. -legend refers to fine and coarse models, but caption refers to continuous and discretized forms of the kernel functions. Are these the same? I find this reference to models more confusing than kernel functions, but am aware that could be my own bias from the literature I'm familiar with. I do highly recommend the authors stick to a single terminology to describe the kernel and make sure it is well defined for non-experts. Also, is the uncertainty for the continuous function, or both? This should be clarified.
Line 242 -"chamber is operated in continuous mode" did you mean "continuous flow mode"?
Line 244-246 -"assumed" J uncertainty -is this assumed, or calculated as described in the rest of the sentence? The working for this uncertainty calculation needs to be shown here or in supplementary material, as it is not reproducible with the current level of detail.
Line 255 -"which is distinct from most others" -does this mean that most growth rate calculation methods do not include size and time dependency? This must be clearer and needs references.
Line 256 -"However, compared to the Kalman smoothing, each time step is analysed individually and the analysis framework relies on already inverted size distributions, where a point-by-point inversion procedure is used for the DMA train data of this work (Stolzenburg and McMurry, 2008)." This sentence is confusing and I don't understand what it means.
Line 260 -if INSIDE cannot provide uncertainty on the growth rate, I would argue that another method is not so much needed to "verify" the result as the authors state here, but to provide a growth rate with calculated uncertainty, and the meaning of a rate with no uncertainty is questionable.

Section 4
Line 263 -"modeled the DMA-train instrument numerically" -what does this mean? Do the authors mean that they applied the calibrated instrument kernels to the synthetic data?
Line 272 -simulated and reconstructed size distributions are compared qualitatively as "very good". Given that the y and colour axes are both log scale, it is hard for the reader to get a sense of goodness of fit. A more quantitative assessment of the similarity between simulated and reconstructed size distributions is warranted. Graphical representations to more easily demonstrate this would help, e.g. size distributions at discrete points in time with uncertainties shown, or total number concentrations above given size cuts with uncertainties.
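To illustrate the kind of quantitative assessment I have in mind, here is a minimal sketch in Python. It is not taken from the manuscript or from Ozon et al. (2020); the function names, the concentration floor, and the assumption that the size distribution is given as dN/dlogDp on a diameter grid in nm are all my own hypothetical choices. It computes a log-space RMS difference, the fraction of points where the simulated truth lies inside the reported uncertainty interval, and the total number concentration above a size cut.

```python
import numpy as np

def log_rmse(n_true, n_est, floor=1e-3):
    """RMS difference of log10 concentrations (floor is an assumed lower clip)."""
    a = np.log10(np.maximum(n_true, floor))
    b = np.log10(np.maximum(n_est, floor))
    return float(np.sqrt(np.mean((a - b) ** 2)))

def coverage(n_true, lower, upper):
    """Fraction of points where the truth lies inside [lower, upper]."""
    inside = (n_true >= lower) & (n_true <= upper)
    return float(np.mean(inside))

def number_above(diam_nm, dn_dlogdp, cut_nm):
    """Total number concentration above cut_nm.

    Integrates dN/dlogDp over log10(Dp) with the trapezoidal rule.
    The last axis of dn_dlogdp is assumed to be the size axis.
    """
    mask = diam_nm >= cut_nm
    x = np.log10(diam_nm[mask])
    y = dn_dlogdp[..., mask]
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)
```

Applied to both the simulated and the reconstructed distributions at each time step, numbers of this kind (or equivalent ones of the authors' choosing) would let the reader judge the fit without relying on log-scale colour plots.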
Regarding the oscillations on size distributions and process rates resulting from the DMAtrain channels -is it not possible to apply some smoothing or correction for this?
Line 311 -temporal oscillations in calculated growth rate -some of these appear in the PSM growth rate too, but the authors earlier explain these as a product of the discrete size channels of the DMAtrain. So then why are they mirrored in the PSM derived growth rates? Authors note this as "remarkable" and an indication that these are real oscillations on the growth rate. This does not agree with what they mentioned earlier about it being an instrumental artifact in the simulated data. Some work is needed to explain this -what is real and what is an instrumental artifact. And if some of it is real, can the authors suggest why this would occur?
Line 331 -could the Heinritzi et al. size distribution be reproduced here for direct comparison? It would make the paper much more readable and enable direct comparison of results. A direct comparison including errors would again be of more use than two log-scale colour plots.
Line 336 -While the result presented in fig 4b is indeed an improvement upon the inter-instrument differences of up to an order of magnitude referenced from the literature, it is still clear that the nucleation rates derived from the different instruments do not, within the calculated uncertainties, agree. If the reasons for this are well understood from the referenced literature, the author needs to summarize the argument here, and not rely on the reader having detailed knowledge of these other studies from the CLOUD chamber. As it stands, the results shown in fig 4b do not support one of the major claims of this paper -that the FIKS method is in agreement with other methods for calculating process rates. Is it the case that the known uncertainties for the two methods are actually missing a large source of uncertainty? This needs to be addressed. I would like to point out that, even if within known sources of uncertainty these results are not in agreement, as fig 4b suggests, the reduction in disagreement from previous studies is still a very important result that deserves publication and can assist the community to improve our process rate calculations. My point is that the authors must be more rigorous and accurate in describing what has and has not been achieved here.
Line 340 -This decreasing FIKS growth rate due to a lack of information above 4.3 nm seems problematic. As I understand from the text, there is no information to suggest a decreasing growth rate here, just an assumption of smoothness. Should the algorithm not then be adjusted to take into account where there is not enough information content to calculate a growth rate, and simply not report one, in a similar manner to INSIDE?
Lines 403-407 -As in previous sections, it is difficult to compare size distributions using a log-scale colour plot. More meaningful comparison, which requires uncertainties, could be achieved using a) snapshots in time, b) cumulative concentrations above given limits. The colour plots shown give a helpful graphical indication of how the different inversions work, which is valuable, but more quantitative evaluation is needed in addition to enable comparison between the proposed DMAtrain configurations discussed here.
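As an example of what a more rigorous statement of agreement could look like, the following minimal Python sketch checks whether two rate estimates agree within their stated uncertainties. The function names and inputs are hypothetical and not from the manuscript, and the normalized difference assumes independent, roughly Gaussian uncertainties, which the authors would need to justify for their own data.

```python
import math

def intervals_overlap(lo_a, hi_a, lo_b, hi_b):
    """True if the uncertainty intervals [lo_a, hi_a] and [lo_b, hi_b] overlap."""
    return lo_a <= hi_b and lo_b <= hi_a

def normalized_difference(a, sigma_a, b, sigma_b):
    """|a - b| in units of the combined standard uncertainty.

    Assumes independent Gaussian uncertainties (an assumption of this sketch).
    """
    return abs(a - b) / math.hypot(sigma_a, sigma_b)

def fraction_overlapping(intervals_a, intervals_b):
    """Fraction of paired time steps whose (lo, hi) intervals overlap."""
    pairs = list(zip(intervals_a, intervals_b))
    hits = sum(intervals_overlap(la, ha, lb, hb) for (la, ha), (lb, hb) in pairs)
    return hits / len(pairs)
```

Reporting, for instance, the fraction of time steps at which the two nucleation rate estimates overlap would replace "matched well" with a statement the reader can verify against fig 4b.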