Comment on acp-2021-970

The paper describes the analysis of both particulate matter observations and a multi-model ensemble for two decades of simulation in Europe, showing decreasing trends in particulate matter in both analyses, albeit at different levels of significance for observations versus the model ensemble. As such, the work is of interest to the readers of ACP and worth publishing in the journal, subject to the comments and corrections I provide below.

*Background description of the models is missing from the manuscript.
The authors make use of a 6-member ensemble to analyse the impacts of emissions changes on particulate matter trends in Europe, without providing a summary table in the paper of the main features of the models. A reference to a previous paper is insufficient here: a summary table showing how each model treats particle formation and deposition is critical for the reader to understand the limitations (or lack thereof) of the ensemble. An additional table including references, and a paragraph of explanatory text, is needed in the final version of the paper, with the table including:
- Gas phase mechanism used in each model (and reference), along with number of species and reactions.
- Particle size distribution representation used in each model (sectional, modal and number of bins or modes, etc.).
- Particle chemical composition used in each model: list of chemical components speciated and/or treated as a lumped species (e.g. "SOA").
- Organic particle formation methodology used in each model, e.g. yield approach (reference to yields used), VBS method (reference to specific approach), etc.
- Inorganic particle heterogeneous thermodynamics approach used, along with a reference and the names of the chemical species within the approach used.
- Cloud processing of aerosols: number of aqueous reactions and a reference for same, and which particle species may be formed or removed by cloud processing.
- Particle dry deposition parameterization used and reference for same.
- Wind-blown dust algorithm (if the model has one) + reference.
- Sea-salt emissions algorithm (if the model has one) + reference.
- Mixing state of the aerosols (e.g., homogeneous or heterogeneous mixtures, combinations).
This summary Table is essential for the paper to be publishable: it and the additional text would provide a clear description of the models and their potential weak or strong points, and would aid in the subsequent interpretation of model results.
It also allows the reader to place the work in the context of other current published material. It can also be referred to at different places in the text to help place some of the findings in the context of the model construction (e.g. discussion of Figure A12, paragraph ending line 445, and a few other places in the text, as I describe below).
Additional clarifications needed:
Line 15: The Abstract mentions 2 and 6 ug m-3 m-3 reduction (I think that should be ug m-3 year-1?), but the Abstract is not clear whether the 2 and 6 refer to PM10 and PM2.5 respectively, to the range of decreasing trends seen across all models, or to the range of decreasing trends across all models for PM10 and PM2.5 respectively. Please clarify.
*A general comment: when providing trend levels such as the range from the ensemble, please provide the mean and standard deviation of the ensemble as well (e.g. "between 2 and 6 ug m-3 year-1; mean+/-standard deviation of 4.51 +/-0.53"). Similarly, for the observations, provide the mean and standard deviation. I'm wondering to what extent the ensemble-estimated variability and observation variability overlap each other: are the differences between model and observed trends significant? The authors have dealt with the significance of the model trends and of the observed trends, but I don't think they have addressed the extent to which the model trends and observed trends are significantly different from each other. Similarly, the figures with time series to show trends (Figure 4, Figure 8) should show the 95% confidence interval ((z* sigma)/sqrt(N), where z* is 1.960 for a 95% confidence level), calculated for each year, as a shaded region around the lines: do the model and observed trends lie within each other's shaded regions, i.e. are the modelled and observed trends significantly different from each other, given the variability of both the ensemble and the observations?
Line 25: "significant modelled trends": the authors need to define what is meant by "significant" here (e.g. at the 95% confidence level?).
Line 65, "2008-2014": please include a sentence describing the possible reasons for the spread in results noted by previous authors (if they were provided by those authors). Does the current paper address those possible causes of spread in the results (e.g.
I think that the attempt to harmonize emissions, and especially to use a common grid resolution, helps reduce between-model variability)?
Line 125-127: the different sources of wind-blown dust information are potentially a large source of variability between models. The authors should estimate the relative magnitude of the annual wind-blown dust emissions if possible, and mention the values here, with reference to the new Table requested above. Similarly, can the magnitude of the forest fire emissions differences be quantified here? Ditto for the volcanic SO2 (e.g. compare the mass emitted by volcanoes versus European sources?). What I'm hoping for here is some quantitative statement of the potential relative impact of the differences in emissions between the models on the predictions.
Line 134: need to mention the sampling interval (hourly, daily?) and frequency (every hour, every day, one-day-in-three, etc.) for the PM2.5 and PM10 observations here. Maybe mention the time span of observations (most seem to be daily averages?) with reference to Table A1?
Line 141: what is meant by "rather many"? Better to state the number of sites out of the total.
Line 146: Why were there gap years? An explanation is needed. Also, on the time series Figures (4, 8), include a number above each year giving the number of stations out of the total (e.g. "5/15") which are being gap-filled in that year, and add an explanation of those numbers to the captions. One of the questions I have is whether some of the differences between model and observations in the trends from one year to the next might be influenced by the gap-filling in the observations. Including those numbers would allow the reader to see the potential influence of gap-filling on the observed trends as a function of time. Please also state the reasons for the measurement gaps in the text, if known. This also applies to the text describing Figures 4 and 8: mention the years that have gaps.
Line 158-159: Suggest mentioning here that the extent to which the data series duration, natural variability and weak trends might have affected the authors' analysis is discussed later in the paper. The intent of line 160 (closing sentence) is a bit unclear: maybe "taking into consideration (averaging)" should just be "averaging"?
Line 177: is this R or R2? Please specify.
Line 217: the model results suggest significant trends in some locations while the observations do not, and vice versa. At this point, the reader is wondering why that might be. Either include a few lines of explanation in this paragraph, or a bridging sentence to the later discussion on causes for differences in significance between model and observed trends.
Line 235-236, "large uncertainties in modelling of the coarse fraction of PM": please include some text describing how the models differ in the way this is done (with reference to the Table mentioned above).
Line 249: What are the contributing factors to the model differences? Discuss here or add a sentence mentioning where it is discussed later in the paper.
Line 276: Might also be worth noting that the inter-annual variability introduced by forest fires can be a large addition to the net variability. 2010 was also a year in which very large fires occurred in Russia during the summer (late July to mid-August). The extent to which the models have accurately captured these fires may determine the extent to which they simulate PM2.5 and PM10 correctly in the trends, especially for eastern Europe.
Figure 6 and later on line 319: were all of the observed trends significant? The text elsewhere implies this is not the case, but the Figure only shows model trend significance levels with two colours. If some of the observed trends were not significant (as seems to be implied in the text), please show this using a similar two-tone red pair of bars for the observations, in addition to the two-tone blue bars for the model.
Paragraphs between lines 301-306, 307-313, 319-325, 331-335 state the result, but not the possible reasons for it. Why are there differences in variability? Why might the trends be more/less significant at different sites? How might the differences in the models result in the different trends (e.g. with respect to the Table requested above)?
The authors have some discussion later in the paper: maybe a sentence mentioning this discussion ("The possible causes for these differences are discussed in section …") should be included here.
Line 502: "rather moderate": can this be made more quantitative?
Lines 505-509: Is there a potential explanation? E.g. significant emissions reductions in the first of the two decades?
Line 535: what are the reasons why Spain might have a higher variability than elsewhere? Local emission sources with high variability?
Line 543: Suggest "spatially" should be "spatially and temporally".
Lines 624-628: I found these 3 sentences a bit hard to follow. Clarify? A lack of significance may be due to the low magnitude of the trends (requiring a larger sample size to identify the trends as significant relative to noise) and/or the high magnitude of the variability (e.g. larger standard deviation). The authors identify the latter as the main reason for the non-significant PM trends (I think), though it's really the magnitude of the variability relative to the trend that matters.
Line 680: again, better to include the ensemble mean and standard deviation rather than just the maximum and minimum of the range.
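
To make the confidence-band request for Figures 4 and 8 concrete, here is a minimal sketch of the per-year calculation, (z* sigma)/sqrt(N) with z* = 1.960, followed by a check of whether the model and observation bands overlap. It assumes Python with NumPy; the function names and the (members, years) array layout are my own illustrative choices and are not taken from the manuscript. With only six ensemble members, a Student's t critical value (about 2.57 for five degrees of freedom) would arguably be more appropriate than z*, and can be passed in through the z argument.

```python
import numpy as np

Z_95 = 1.960  # two-sided 95% critical value of the standard normal


def confidence_band(values, z=Z_95):
    """Per-year mean and confidence band.

    values: array of shape (n_members, n_years), e.g. one row per model
    (or per station) and one column per year. Layout is an assumption.
    Returns (mean, lower, upper), each of shape (n_years,).
    """
    values = np.asarray(values, dtype=float)
    n = values.shape[0]
    mean = values.mean(axis=0)
    # Sample standard deviation (ddof=1) across members, per year
    half_width = z * values.std(axis=0, ddof=1) / np.sqrt(n)
    return mean, mean - half_width, mean + half_width


def bands_overlap(band_a, band_b):
    """Boolean array: True for years in which the two bands overlap."""
    _, lo_a, hi_a = band_a
    _, lo_b, hi_b = band_b
    return (lo_a <= hi_b) & (lo_b <= hi_a)
```

Years in which bands_overlap returns False are those where the ensemble and the observations differ by more than their combined variability; the lower and upper arrays can be passed directly to a plotting routine to draw the shaded regions.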

*Missing Process (?)
There are a number of places in the text which imply that the models in the ensemble might not include the reactions of inorganic heterogeneous chemistry associated with base cations (Ca(2+), Mg(2+), Na(+), K(+)). These are sometimes a significant component of mineral dust and sea-salt, and can have a significant impact on particle chemistry, particularly via a competition between the fine mode and coarse mode for nitrate. The text between lines 395 and 412, and again lines 554-558, mentions secondary inorganic aerosol (SIA) only in the context of the SO4(2-), NH4(+), NO3(-) system, which is incomplete. This is why I want the model speciation included in the Table requested earlier: it's not clear from the text whether this speciation is included in the ensemble of models, and its absence could potentially have a big impact on model results. Conversely, if some models in the ensemble do include base cation chemistry, then this might also help explain some of the inter-model variability.
The issue with base cations is that they are "stronger" cations than ammonium, and hence may perturb the balance of nitrate between the fine and coarse modes of the particle distribution. The authors mention NH3 + HNO3 <-> NH4NO3, NH4(+), NO3(-). To this equilibrium, the base cations add reactions such as CaCO3 + 2 HNO3 <-> Ca(NO3)2 + CO2 + H2O, and NaCl + HNO3 <-> NaNO3 + HCl, with these base cation equilibria being strongly biased towards the right and the formation of base cation nitrates. What this can mean, and what has been seen in observational studies (see Anlauf et al, Atm Env., 40, 2662-2675, 2006 for a sea-salt example, and Makar et al, JGR-Atm, https://doi.org/10.1029/98JD00978, 1998 for a calcium nitrate example), is that the nitrate can off-gas as HNO3 from the fine mode ammonium nitrate particles and go to the coarse mode as base cation nitrates.
Models such as CMAQ and GEOS-Chem capture this process through the use of inorganic heterogeneous chemistry solvers that include base cation equilibria, such as Athanasios Nenes' ISORROPIA II. The absence of this process in some or all of the authors' ensemble of models may help account for some of the differences between model and observed trends. For example, on page 13, line 410-412, the authors mention that "The models appear to overestimate the observed negative trends for NO3- and also for NH4+, though to a smaller degree": this is what I would expect if the models in the ensemble have not included coarse mode base cation chemistry. The base cations largely come from sources that have a natural component (wind-blown soil dust, sea salt) and hence are not affected by emissions controls on particulate matter. Particle nitrate in the sulphate, ammonium, nitrate-only system will decrease rapidly if both ammonium and NOx are decreasing. However, if base cations are present, they will slow down the nitrate decrease by providing an additional sink other than ammonium, with the result that the ammonium may be in excess and remain as ammonia gas. That is, the decrease in fine mode nitrate may not be as strong if base cations are present.
Line 434-435, "Thus, as the formation of ammonium sulphate…": while the ammonium was becoming more available, this won't necessarily result in ammonium nitrate formation if the base cations are in excess and remove the available nitric acid. The models are showing what would happen in a base-cation-less world, I suspect. The authors should discuss the base cation issue, in the context of the model speciation Table requested above, and as an addition to the SIA analysis (lines 395-412). The nitrate chemistry of the models may not be simulating these effects (it sounds like it, from the lines mentioned above and lines 456-458), so this should be acknowledged as a source of uncertainty in the analysis and the results.
Conversely, if some of the models in the ensemble do include base cation chemistry, does this explain some of the differences between those models and others in the ensemble?
Comment on paragraph ending line 427: similar effects have been observed in North America, I think: as the sulphate decreases, the available ammonia in the fine mode is more likely to allow HNO3 to enter the fine mode, as long as base cations are not present as an alternative sink for HNO3, and/or there's sufficient HNO3 both to replace the anions in the coarse mode and to charge-balance the excess ammonium in the fine mode. The extent to which ammonia is increasing or decreasing may also play a role. The authors should have a look at the analysis by Robert Vet et al (Atm. Env., 93, 3-100, 2014), particularly Section 4, Figures 4.9 and 4.10 and related text, and the analyses carried out for N in Europe in that paper.
Lines 452-453: "only BSOA have some dependency on anthropogenic emissions": not true, since wind-blown dust and sea-salt can be a significant sink for nitric acid resulting from anthropogenic NOx emissions.
Line 640: note that inorganic heterogeneous chemistry is also highly dependent on meteorological conditions, particularly the temperature, with particle nitrate formation equilibria being biased towards the particulate phase by a factor of 1E6 for a 25C drop in temperature.