Articles | Volume 21, issue 23
Research article
03 Dec 2021

Quantifying the structural uncertainty of the aerosol mixing state representation in a modal model

Zhonghua Zheng, Matthew West, Lei Zhao, Po-Lun Ma, Xiaohong Liu, and Nicole Riemer

Aerosol mixing state is an important emergent property that affects aerosol radiative forcing and aerosol–cloud interactions, but it has not been easy to constrain this property globally. This study aims to verify the global distribution of aerosol mixing state represented by modal models. To quantify the aerosol mixing state, we used the aerosol mixing state indices for submicron aerosol based on the mixing of optically absorbing and non-absorbing species (χo), the mixing of primary carbonaceous and non-primary carbonaceous species (χc), and the mixing of hygroscopic and non-hygroscopic species (χh). To achieve a spatiotemporal comparison, we calculated the mixing state indices using output from the Community Earth System Model with the four-mode version of the Modal Aerosol Module (MAM4) and compared the results with the mixing state indices from a benchmark machine-learned model trained on high-detail particle-resolved simulations from the particle-resolved stochastic aerosol model PartMC-MOSAIC. The two methods yielded very different spatial patterns of the mixing state indices. In some regions, the yearly averaged χ value computed by the MAM4 model differed by up to 70 percentage points from the benchmark values. These errors tended to be zonally structured, with the MAM4 model predicting a more internally mixed aerosol at low latitudes and a more externally mixed aerosol at high latitudes compared to the benchmark. Our study quantifies potential model bias in simulating mixing state in different regions and provides insights into potential improvements to model process representation for a more realistic simulation of aerosols towards better quantification of radiative forcing and aerosol–cloud interactions.

1 Introduction

The direct and indirect climate effects of atmospheric aerosols greatly depend on the particles' spatial distribution in the atmosphere and their climate-relevant properties, including their hygroscopicity, optical properties, and their ability to act as cloud condensation nuclei (CCN) and ice nuclei (Boucher et al.2013). These properties, in turn, are closely related to the aerosol mixing state (Ching et al.2012; Cziczo et al.2009; Fierce et al.2016, 2017). Aerosol mixing state refers to the way in which different aerosol chemical species are distributed among and within the aerosol particles (Riemer et al.2019). As shown in many observational field studies, atmospheric aerosols have complex mixing states (Bondy et al.2018; Healy et al.2014; Lee et al.2019; Ye et al.2018; Yu et al.2020), ranging between the two extremes of an “internal mixture”, where the composition of all particles within the population is identical (and equal to the bulk composition of the aerosol), and an “external mixture”, where each particle in a population consists of only a single species (which may be different for each particle).

This poses a unique challenge for the modeling of aerosols in Earth system models, which, for the sake of computational efficiency, represent aerosols by simplifying the true aerosol mixing state using various mixing-state-related assumptions. For example, bulk aerosol models predict the abundance of individual aerosol chemical species by tracking the species' mass concentrations, inherently treating the aerosol as external mixtures of, e.g., sulfate, black carbon, organic carbon, sea salt, and dust (Ghan et al., 2012). Univariate sectional models are able to represent size-resolved composition but cannot resolve the diversity of the aerosol within a certain size range. For modal models, the ability to resolve mixing state depends on the definition and the placement of the modes. Different approaches for modal models have been developed, ranging from a small number of internally mixed, non-overlapping modes (e.g., three modes in MAM3, Liu et al., 2012; or CMAQv5.2, United States Environmental Protection Agency, 2017) to a larger number of modes that may overlap in a given size range and separate out different aerosol mixtures (e.g., nine modes in MADE3, Kaiser et al., 2014; or 16 modes in MATRIX, Bauer et al., 2008). For these multi-modal models, the aerosol processes of gas–aerosol partitioning and coagulation make it necessary to define rules for how the modes interact (Wilson et al., 2001). Condensation of secondary aerosol on a mode reserved for a pure species (e.g., black carbon or dust) requires moving mass over to a mixed mode when a critical mass fraction of secondary aerosol is exceeded. Transfer terms due to coagulation of particles in different modes can be calculated analytically (Binkowski and Shankar, 1995), and rules need to be defined regarding the destination mode after coagulation. Generally, the transfer of aerosol mass from smaller modes to larger modes during growth can lead to inaccuracies. The removal of particles due to scavenging by cloud activation is another issue that is difficult to reconcile. Hence, the choice of the number of modes, their compositions, and the criteria for transfer between modes are user-defined, which introduces structural uncertainty in aerosol simulations that still needs to be quantified.

Given that modal models are to some extent mixing-state-aware, the following question arises: how well do modal models represent mixing state? Due to the scarcity of relevant observational data, we are not yet at the point where we can comprehensively validate model output of aerosol mixing state as is done for other aerosol-related quantities, such as bulk mass concentrations or aerosol optical depth. However, higher-detail models can serve as benchmarks to perform a verification of simulated aerosol mixing state. This paper aims to verify the global distribution of aerosol mixing state represented by a modal model by using benchmark simulations from the particle-resolved stochastic aerosol model PartMC-MOSAIC (Riemer et al.2009; Zaveri et al.2008). Our usage of the term “aerosol representation” in this paper encompasses the representation of processes that go along with the aerosol representation itself, since the two are in practice tightly coupled.

We used the aerosol mixing state index χ (Riemer and West2013) as a metric to quantify aerosol mixing state. The mixing state index χ can be interpreted as a label for particle populations to rigorously characterize where the population lies on the spectrum from external (χ=0 %) to internal (χ=100 %) mixture. This concept has been successfully applied to observational data (Healy et al.2014; Ye et al.2018) and for error quantification studies (Ching et al.2017, 2018). Particularly relevant for this work is the study by Ching et al. (2017), which showed that assuming an internal mixture when the aerosol is actually not completely internally mixed can result in errors of up to 150 % in CCN predictions.

PartMC-MOSAIC tracks the composition of individual particles and therefore resolves aerosol mixing state explicitly (Riemer et al.2009; Zaveri et al.2008). However, this modeling approach is computationally very expensive and therefore not practical for large-scale simulations of several months or years of simulation time. To estimate the global spatial distribution of mixing state, we recently developed a machine-learned (ML) model based on high-detail particle-resolved simulations (Zheng et al.2021) that uses inputs that are known from global model simulations to predict χ. In this paper, we use this ML model to predict the spatial distribution of the mixing index χ and then compare the results with χ values that are derived from the Community Earth System Model version 2 (CESM2 version 2.1.0; Danabasoglu et al.2020) using the four-mode version of the Modal Aerosol Module (MAM4; Liu et al.2016).

This paper is organized as follows. In Sect. 2 we introduce the setup of the Earth system model simulations. The definition of mixing state indices and the derivation of aerosol mixing state indices for modal models are given in Sect. 3. Section 4 briefly describes the ML model generated with machine learning and particle-resolved modeling for estimating the benchmark aerosol mixing state indices. Section 5 focuses on the comparison of mixing state indices from the particle-resolved and modal models, and Sect. 6 summarizes our findings.

2 Global model simulations

Here we employed CESM2 to provide the global model simulation data. Specifically, we used the component set FHIST to set up the global simulations with aerosols. This component set represents a typical historical simulation in the Community Atmosphere Model (CAM6; Bogenschutz et al., 2018) using an active atmosphere and land with prescribed sea-surface temperatures and sea-ice extent, as well as a 1° finite-volume dynamical core (dycore), with the forcing data available from 1979 to 2015.

MAM4 is the default aerosol module of this component set, which represents the aerosol size distribution with four lognormal modes (Aitken, accumulation, coarse, and primary carbon modes; Liu et al.2016). MAM4 tracks six aerosol species, and these are distributed over the four modes as follows. The Aitken mode consists of dust, sulfate, secondary organic aerosol (SOA), and sea salt. The accumulation mode includes sulfate, SOA, sea salt, primary organic matter (POM), black carbon (BC), and dust. The coarse mode contains sulfate, dust, and sea salt. The primary carbon mode contains only BC and POM, which are supplied by primary aerosol emissions.

The choice of modes in MAM4 is motivated by the desire to treat the microphysical aging of the primary carbonaceous aerosols in the atmosphere (Liu et al.2016) similar to other modal models used in regional or global models (Riemer et al.2003; Vogel et al.2009). In MAM4, mass and number concentrations of BC and POM in the primary carbon mode are transferred to the accumulation mode by the processes of intermodal coagulation and condensation of SOA and sulfuric acid onto the primary carbon mode. The accumulation mode then represents aged BC and POM, as these species are internally mixed with other aerosol species. The MAM4 treatment of aging is critical for improving the long-range transport of carbonaceous aerosols to remote regions such as the polar region, which suffered from a low bias in a prior version of the model when only three internally mixed modes were used (Liu et al.2016).

We ran the model for the year 2011 with 6 years (2005–2010) of spinup. The simulation was conducted at a resolution of 0.9° latitude by 1.25° longitude with CMIP6 emission inventories (Emmons et al., 2020). We stored the instantaneous outputs every 3 h during the simulation, which yields 2920 timestamps (365 d × 8 per day) for each surface-layer grid cell for the entire year of simulation time. The surface layer was chosen to be in line with the PartMC-MOSAIC model scenarios that were used as training data for the ML models of mixing state indices (see Sect. 4) and which were designed to represent conditions in the planetary boundary layer.

3 Aerosol mixing state indices: definition and calculation

3.1 Particle-based aerosol mixing state index

The mixing state index χ (Riemer and West2013) quantifies where an aerosol population lies on the continuum from external to internal mixing – that is, how spread out the chemical species are over an aerosol population. We will focus here on the mixing state of submicron aerosols (PM1.0) due to their relevance for light scattering and absorption (Wang et al.2015) and their contribution to CCN formation (Asmi et al.2011; Pierce et al.2015; Yu and Luo2009).

To summarize, the mixing state index χ is given by the affine ratio of the average particle species diversity, Dα, and bulk population species diversity, Dγ, as

(1) $\chi = \dfrac{D_{\alpha} - 1}{D_{\gamma} - 1}$.

The diversities Dα and Dγ are calculated as follows. First, the per-particle mixing entropies $H_i$ are determined for each particle by

(2) $H_i = \sum_{a=1}^{A} -p_i^a \ln p_i^a$.

Here, A is the number of distinct aerosol species and $p_i^a$ is the mass fraction of species a in particle i. These values are then averaged (mass-weighted) over the entire population and exponentiated to obtain the average particle species diversity Dα by

(3) $H_{\alpha} = \sum_{i=1}^{N_{\mathrm{p}}} p_i H_i$,

(4) $D_{\alpha} = e^{H_{\alpha}}$,

where $N_{\mathrm{p}}$ is the total number of particles in the population and $p_i$ is the mass fraction of particle i in the population. Finally, the bulk diversity Dγ is calculated as

(5) $H_{\gamma} = \sum_{a=1}^{A} -p^a \ln p^a$,

(6) $D_{\gamma} = e^{H_{\gamma}}$,

where $p^a$ is the bulk mass fraction of species a in the population.
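As an illustration of the particle-based definition above, the following minimal Python sketch computes χ from a list of per-particle species masses. The function name `mixing_state_index` and the input layout (one list of species masses per particle) are assumptions made for this example only; they are not taken from PartMC-MOSAIC or CESM2.

```python
import math


def mixing_state_index(particles):
    """Return chi (as a fraction between 0 and 1) for a particle population.

    particles[i][a] is the mass of species a in particle i. This input
    layout is a hypothetical choice made for this sketch.
    """
    total_mass = sum(sum(p) for p in particles)
    n_species = len(particles[0])

    # Mass-weighted average of the per-particle mixing entropies H_i.
    h_alpha = 0.0
    for p in particles:
        m_i = sum(p)
        if m_i == 0.0:
            continue  # empty particles carry no mass weight
        h_i = -sum((m / m_i) * math.log(m / m_i) for m in p if m > 0.0)
        h_alpha += (m_i / total_mass) * h_i

    # Mixing entropy of the bulk composition.
    bulk = [sum(p[a] for p in particles) for a in range(n_species)]
    h_gamma = -sum((m / total_mass) * math.log(m / total_mass)
                   for m in bulk if m > 0.0)

    # Diversities are the exponentials of the entropies; chi is their
    # affine ratio.
    d_alpha, d_gamma = math.exp(h_alpha), math.exp(h_gamma)
    if math.isclose(d_gamma, 1.0):
        raise ValueError("chi is undefined when only one species is present")
    return (d_alpha - 1.0) / (d_gamma - 1.0)
```

For a fully external mixture such as `[[1.0, 0.0], [0.0, 1.0]]` the sketch returns 0, and for a population in which every particle has the bulk composition, such as `[[1.0, 1.0], [2.0, 2.0]]`, it returns 1 (up to rounding), matching the two limiting cases described in the text.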

Note that the definition of “species” for calculating χ is based on application needs. It can be based on operationally defined chemical species (Healy et al.2014; Ye et al.2018), elemental composition (Fraund et al.2017; O'Brien et al.2015), or species groups such as volatile and nonvolatile species (Dickau et al.2016) or hygroscopic and non-hygroscopic species (Ching et al.2017; Hughes et al.2018). Other possibilities include the propensity for aerosols to undergo heterogeneous reactions, quantified by the heterogeneous reaction rate coefficient for a specific reaction. In this paper we consider three different definitions of χ, which we explain in more detail in Sect. 3.2.

3.2 Mode-based aerosol mixing state index

The framework laid out in Sect. 3.1 can be easily generalized to a modal modeling framework (see Fig. 1). The bulk mixing entropy, Hγ, and the bulk diversity, Dγ, can be calculated using the bulk mass fractions, $p^a$, of species a from the MAM4 simulation and Eqs. (5) and (6). To calculate the average particle mixing entropy, Hα, and the average particle species diversity, Dα, we use

(7) $H_m = \sum_{a=1}^{A} -p_m^a \ln p_m^a$,

(8) $H_{\alpha} = \sum_{m=1}^{M} p_m H_m$,

(9) $D_{\alpha} = e^{H_{\alpha}}$,

where $p_m^a$ is the mass fraction of species a in mode m, $p_m$ is the mass fraction of mode m in the population, M is the number of modes considered, and $H_m$ represents the per-mode mixing entropies. Finally, the mixing state index, χ, can be calculated using Eq. (1). Note that Eqs. (7) and (8) are analogous to Eqs. (2) and (3). A detailed derivation of these equations is provided in Appendix A.

Figure 1. Illustration of the mode-based calculation of the aerosol mixing state index. The coarse mode is removed because only modes dominated by submicron particles are used for calculations. Note that the Aitken mode mass fraction is very low compared to the other modes and the caption does not obscure any data.


In this study, we consider the mixing states of submicron aerosols including the Aitken, accumulation, and primary carbon modes, and we do not include the coarse mode because the coarse particles are above 1 µm. Since the mixing entropies are mass-weighted (rather than number-weighted), the mixing state index is more representative of the modes with the larger particles, i.e., the accumulation and primary carbon modes.
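The mode-based calculation described in this section can be sketched in the same way, with each mode playing the role of a single "particle" whose composition is the mode's internal mixture. The mode names, the dictionary layout, and the function name below are hypothetical choices for this example and do not correspond to CESM2 output variable names.

```python
import math

# Only modes dominated by submicron particles enter the calculation;
# the coarse mode is skipped, as described in the text.
SUBMICRON_MODES = ("aitken", "accumulation", "primary_carbon")


def _entropy(masses):
    """Mixing entropy of a composition given as a sequence of masses."""
    total = sum(masses)
    return -sum((m / total) * math.log(m / total) for m in masses if m > 0.0)


def modal_mixing_state_index(mode_masses):
    """Return chi (fraction between 0 and 1) from per-mode species masses.

    mode_masses maps a mode name to a list of species masses, with the same
    species order in every mode (hypothetical layout for this sketch).
    """
    used = [mode_masses[name] for name in SUBMICRON_MODES
            if name in mode_masses and sum(mode_masses[name]) > 0.0]
    total = sum(sum(m) for m in used)

    # Average of the per-mode entropies, weighted by mode mass fraction.
    h_alpha = sum((sum(m) / total) * _entropy(m) for m in used)

    # Bulk entropy from the species masses summed over the retained modes.
    bulk = [sum(column) for column in zip(*used)]
    h_gamma = _entropy(bulk)

    d_alpha, d_gamma = math.exp(h_alpha), math.exp(h_gamma)
    if math.isclose(d_gamma, 1.0):
        raise ValueError("chi is undefined when only one species is present")
    return (d_alpha - 1.0) / (d_gamma - 1.0)
```

Because each mode is internally mixed by construction, χ computed this way can fall below 100 % only if the retained modes differ in composition, which is why the placement of species in modes controls the mixing state a modal model can represent.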

3.3 Grouped surrogate species

Here we compare and contrast the aerosol mixing state indices defined in three different ways, namely based on the mixing of optically absorbing and non-absorbing species (χo), based on the mixing of primary carbonaceous and non-primary carbonaceous species (χc), and based on the mixing of hygroscopic and non-hygroscopic species (χh). Table 1 shows the definitions of these aerosol mixing state indices.

Table 1. Aerosol mixing state index definitions. Six aerosol species (bc: black carbon, dst: dust, ncl: sea salt, pom: primary organic matter, soa: secondary organic aerosol, so4: sulfate) are used in calculating the aerosol mixing state indices based on different species groupings. The mixing state indices χo, χc, and χh are based on two grouped surrogate species.


For χo, we considered two surrogate species: black carbon (strongly absorbing, assigned a mass absorption coefficient in CESM2 at 533 nm and 0 % RH of 8.144 m² g⁻¹) and the five other aerosol species grouped together (less absorbing or non-absorbing, with mass absorption coefficients in CESM2 at 533 nm and 0 % RH of 0.1442, 9.975 × 10⁻², 4.703 × 10⁻², 2 × 10⁻⁶, and 5 × 10⁻⁷ m² g⁻¹ for POM, SOA, dust, sea salt, and sulfate, respectively). Thus, a lower value of χo refers to the case where the strongly absorbing species black carbon (Yang et al., 2009) and the sum of the other species (termed “non-absorbing” here for convenience) are more externally mixed.

The index χc is motivated by the primary carbon treatment of MAM4, where the primary particulate organic matter and black carbon are assigned to a separate primary carbon mode (Liu et al.2016). A lower value in χc refers to the situation where the primary carbonaceous species and all other species exist separately in different particles.

Similarly, χh was also calculated from two surrogate species. We combined black carbon, primary organic matter, and dust as one surrogate species, given their comparatively lower hygroscopicities (kappa values of ∼0, ∼0, and 0.068, respectively). Accordingly, NaCl (1.16), SOA (0.14), and sulfate (0.507) were grouped as the other surrogate species. Here, a lower value in χh represents the case where hygroscopic and non-hygroscopic species tend to be present in separate particles.
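The three groupings described in this section can be written down compactly. The sketch below collapses the six species masses into the two surrogate species used for each index; the dictionary keys `chi_o`, `chi_c`, and `chi_h` and the function name `to_surrogates` are illustrative names introduced here, while the species abbreviations follow Table 1.

```python
# First group / second group of species for each mixing state index,
# following the groupings described in the text.
SURROGATE_GROUPS = {
    # absorbing vs. "non-absorbing"
    "chi_o": (("bc",), ("dst", "ncl", "pom", "soa", "so4")),
    # primary carbonaceous vs. non-primary carbonaceous
    "chi_c": (("bc", "pom"), ("dst", "ncl", "soa", "so4")),
    # non-hygroscopic vs. hygroscopic
    "chi_h": (("bc", "pom", "dst"), ("ncl", "soa", "so4")),
}


def to_surrogates(species_mass, index):
    """Collapse a dict of species masses into two surrogate-species masses."""
    first, second = SURROGATE_GROUPS[index]
    return (
        sum(species_mass.get(name, 0.0) for name in first),
        sum(species_mass.get(name, 0.0) for name in second),
    )
```

Applying `to_surrogates` to the composition of every particle (or mode) before evaluating χ reduces each index to a two-species problem, which is why none of the three indices distinguishes, for example, sea salt from sulfate.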

4 Machine-learned models of mixing state indices

Aerosol mixing state indices can be calculated directly using particle-resolved modeling, but this comes with large computational costs. Alternatively, Zheng et al. (2021) developed ML models, which integrate machine learning and particle-resolved aerosol simulations to estimate aerosol mixing state indices. To generate the training and testing data sets for developing such ML models, an ensemble of particle-resolved model scenarios was created using the particle-resolved model PartMC-MOSAIC (Riemer et al.2009; Zaveri et al.2008). In brief, PartMC-MOSAIC simulates individual aerosol particles within a representative volume of air, including stochastic coagulation, particle-phase thermodynamics, gas- and particle-phase chemistries, and dynamic gas–particle mass transfer. Thus, the composition of the individual particles within a population evolves dynamically, and assumptions about mixing state are not necessary.

The strategy to generate the data was to vary the input parameters (45 in total) for the PartMC-MOSAIC model, including primary emissions of different aerosol types (e.g., carbonaceous aerosol and dust emissions, including contribution from Aitken mode, accumulation mode, and coarse mode size ranges), primary emissions of gas-phase species (e.g., SO2, NO2, and various volatile organic compounds), and meteorological parameters (see Table 1 in Zheng et al., 2021, for more information). For instance, to vary the gas emissions, scaling factors were sampled from 0 % to 200 % for different gas species, based on the emission rates in Riemer et al. (2009). A Latin hypercube sampling approach was employed to sample the parameter space efficiently for the training and testing data sets. We note that new particle formation and growth was not simulated explicitly, but Aitken mode sulfate particles were introduced into the simulation by emission for a subset of scenarios (Zheng et al., 2021) as a proxy for having particles present that originate from new particle formation. While PartMC-MOSAIC includes the process of new particle formation (Tian et al., 2014), the reason for this simplification was that considerable uncertainty exists regarding the subsequent growth of the freshly nucleated particles (Kulmala et al., 2014), which poses a challenge for a highly detailed aerosol model such as PartMC-MOSAIC. Errors in representing this particle type adequately may result in underestimating the abundance of BC-free particles in some regions (Zhang et al., 2017) and thereby overestimating the degree of internal mixture. This would imply that the error in the MAM4 simulations is even larger than currently indicated. Other processes that are not explicitly included in generating training data are aerosol removal by nucleation-scavenging and other cloud processes. However, for the purpose of this study, the emphasis is on the aerosol state, i.e., having a sufficiently comprehensive set of aerosol populations that can serve as training data, not necessarily that all the processes are included.
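To make the sampling strategy concrete, the following is a minimal, generic Latin hypercube sampler written with the Python standard library. It is a sketch of the general technique only, not the sampling code used to build the scenario library; the function name, the seed handling, and the example bounds are assumptions for illustration.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that every dimension is evenly stratified.

    bounds is a list of (low, high) pairs, one per input parameter. Each
    parameter range is split into n_samples equal strata, and every stratum
    is used exactly once per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across dimensions
        columns.append([
            low + (k + rng.random()) / n_samples * (high - low)
            for k in strata
        ])
    # Transpose from per-parameter columns to per-scenario tuples.
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


# Example: 8 scenarios with two emission scaling factors in [0, 2],
# i.e., 0 % to 200 % of a reference emission rate.
scenarios = latin_hypercube(8, [(0.0, 2.0), (0.0, 2.0)])
```

Compared with independent random draws, this stratification guarantees that the full range of each parameter is covered even for a modest number of scenarios, which matters when there are tens of input parameters.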

The ML models were derived by the machine learning algorithm eXtreme Gradient Boosting (XGBoost; Chen and Guestrin2016) from 45 000 particle populations. Each ML model was a tree-based ensemble model that could handle complex nonlinear interactions and collinearity among features. The hyperparameters were determined by grid search with 10-fold cross-validation. The ML models can be expressed as

(10) $\chi_S(x, y, t) = f_S\big(A(x, y, t),\, G(x, y, t),\, E(x, y, t)\big)$,

where χS(x,y,t) is the mixing state index (χo, χc, or χh) at location (x,y) in the model layer nearest the surface at time t, and fS denotes the function for calculating the corresponding mixing state index χS. The set names A (aerosol), G (gas), and E (environmental) represent the predictors (features) used for predicting the mixing state index. The choice of features is determined by the overlap of variables that are present in both PartMC-MOSAIC and CESM2. Aerosol species include black carbon, mineral dust, sea salt, primary organic aerosol, secondary organic aerosol, and sulfate. Of note is that we used the bulk (not the per-mode) concentrations of submicron aerosol species as the features. The gas species include dimethyl sulfide, hydrogen peroxide, sulfuric acid, ozone, semi-volatile organic gas, and sulfur dioxide. The environmental variables are air temperature, relative humidity, and solar zenith angle. Table 2 shows the performance of the ML models when predicting the mixing state indices. The mixing state calculation in this study was purely based on the above six aerosol species (excluding other aerosol species) for a fair comparison with the mode-based aerosol mixing state index, which resulted in slightly different performance of the ML model compared to Zheng et al. (2021). The average error of the ML model (using the hold-out testing samples) is about 5 % for χo and 8 % for χc and χh (measured by mean absolute error).

Table 2. Predictive performance of the ML models using the testing data set. Metrics include the mean absolute error (MAE), root-mean-square error (RMSE), median absolute deviation (MAD), index of agreement (d; Willmott, 1981), Pearson correlation coefficient (PCC), and coefficient of determination (r²).
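For reference, three of the metrics named above can be computed as in the sketch below, which uses the standard textbook definitions of MAE, RMSE, and Willmott's index of agreement. The function names are illustrative, and this is not the evaluation code used to produce the table.

```python
import math


def mean_absolute_error(predicted, observed):
    """MAE: mean of the absolute prediction errors."""
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def root_mean_square_error(predicted, observed):
    """RMSE: square root of the mean squared prediction error."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(squared / len(observed))


def index_of_agreement(predicted, observed):
    """Willmott's d: 1 means perfect agreement with the reference values."""
    obs_mean = sum(observed) / len(observed)
    numerator = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    denominator = sum(
        (abs(p - obs_mean) + abs(o - obs_mean)) ** 2
        for p, o in zip(predicted, observed)
    )
    return 1.0 - numerator / denominator
```

Here `observed` would hold the χ values from the particle-resolved hold-out populations and `predicted` the corresponding ML estimates.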


We would like to emphasize that this ML modeling framework cannot compensate for any biases that the global model (here CESM2) might have in simulating the quantities that serve as the features. Instead, what we can expect from this approach is that it provides the most likely mixing state associated with the species concentrations that CESM2 simulates.

5 Results

5.1 Quantitative comparison of mode-based and particle-based mixing state indices

Let $\chi_{S,t}^{\mathrm{ML}}$ and $\chi_{S,t}^{\mathrm{MAM4}}$ denote the mixing state indices computed by the ML model and by the MAM4 model for each grid cell at timestamp t, respectively. The corresponding time-averaged values for a certain time interval and for each grid cell are $\chi_{S}^{\mathrm{ML}}$ and $\chi_{S}^{\mathrm{MAM4}}$. Here we consider the full year as the time-averaging interval. An analysis of the seasonal variation of mixing state indices can be found in Zheng et al. (2021).

To compare the annual mean values, we calculated the mean difference (ΔχS) and the mean absolute difference (|ΔχS|) for each grid cell of the layer closest to the surface:

(11) $\Delta\chi_S = \dfrac{1}{T}\sum_{t=1}^{T}\left(\chi_{S,t}^{\mathrm{MAM4}} - \chi_{S,t}^{\mathrm{ML}}\right)$,

(12) $|\Delta\chi_S| = \dfrac{1}{T}\sum_{t=1}^{T}\left|\chi_{S,t}^{\mathrm{MAM4}} - \chi_{S,t}^{\mathrm{ML}}\right|$,

where the subscript S refers to the mixing state index (o, c, or h), and the total number of timestamps is T=2920. Since it only makes sense to quantify mixing state when at least two species are present in a given location, areas where the mass fraction of any one surrogate species was higher than 99 % for χo (due to the low mass fraction of black carbon) and 97.5 % for χc and χh were ignored for the calculation and appear as hatched areas in Fig. 3. We will first discuss the overall probability density functions of these quantities (Fig. 2) and then their spatial distributions (Fig. 3).
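The two difference measures and the masking rule described above can be sketched for a single grid cell as follows. The sign convention (MAM4 minus ML) is inferred from the positive mean Δχo reported where MAM4 values exceed the ML values; the function names and the input layout are assumptions for this example.

```python
def mean_differences(chi_mam4, chi_ml):
    """Return (mean difference, mean absolute difference) over all timestamps.

    chi_mam4 and chi_ml are equally long sequences of mixing state indices
    for one grid cell, one value per stored timestamp.
    """
    n_times = len(chi_mam4)
    diffs = [m - b for m, b in zip(chi_mam4, chi_ml)]  # MAM4 minus ML
    mean_diff = sum(diffs) / n_times
    mean_abs_diff = sum(abs(d) for d in diffs) / n_times
    return mean_diff, mean_abs_diff


def is_masked(surrogate_mass_fractions, threshold):
    """True if one surrogate species dominates, so chi is not meaningful.

    threshold would be 0.99 for chi_o and 0.975 for chi_c and chi_h.
    """
    return max(surrogate_mass_fractions) > threshold
```

For example, `mean_differences([80.0, 40.0], [60.0, 60.0])` gives a mean difference of 0 but a mean absolute difference of 20 percentage points, which illustrates how errors of opposite sign cancel in the first measure but not in the second.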

Figure 2. Probability density functions of annually averaged mixing state indices using the MAM4 model and ML model. The thin black lines refer to their mean values.


Figure 3. Global distribution of annually averaged mixing state indices (χo, χc, and χh) using the ML model, MAM4 model, their mean difference (Δχ), and mean absolute difference (|Δχ|). Areas are hatched where the mass fraction of any one surrogate species was higher than 99 % for χo (due to the low mass fraction of black carbon) and 97.5 % for χc and χh.

Figure 2 shows the probability density functions of the annually averaged mixing state indices computed by the ML model ($\chi_{S}^{\mathrm{ML}}$), by MAM4 ($\chi_{S}^{\mathrm{MAM4}}$), their average difference (ΔχS), and their average absolute difference (|ΔχS|) for each surface-layer grid cell. The results show large discrepancies in mixing state indices between the ML model and the MAM4 model, without a clear relationship between them (see Fig. 2d–f).

The annual average of the mixing state index χo estimated by the ML model, $\chi_{\mathrm{o}}^{\mathrm{ML}}$, ranged between 55 % and 96 %, with a mean of 73 %. Calculated by the MAM4 model, $\chi_{\mathrm{o}}^{\mathrm{MAM4}}$ varied spatially from 46 % to 99.76 %, with a higher mean of 86 %. The similar mean values of Δχo (14 %) and |Δχo| (18 %) reflect predominantly higher values of $\chi_{\mathrm{o}}^{\mathrm{MAM4}}$ compared to $\chi_{\mathrm{o}}^{\mathrm{ML}}$, which is confirmed below with Fig. 3. The averaged mixing state index $\chi_{\mathrm{c}}^{\mathrm{ML}}$ ranged between 31 % and 84 % with a mean of 54 %, while $\chi_{\mathrm{c}}^{\mathrm{MAM4}}$ had a wider range (from 9 % to 99.81 %) with a mean of 58 %. Similarly, $\chi_{\mathrm{h}}^{\mathrm{ML}}$ ranged from 21 % to 81 % with a mean of 58 %, while $\chi_{\mathrm{h}}^{\mathrm{MAM4}}$ varied between 10 % and 99.85 % with a mean of 63 %. The large discrepancy between the mean difference (4.8 % for Δχc and 4.7 % for Δχh) and mean absolute difference (30 % for |Δχc| and 38 % for |Δχh|) indicates that the errors in χc and χh were roughly symmetric (positive and negative) but large. The maximal errors in |Δχc| and |Δχh| between the two methods were up to 59 and 76 percentage points, respectively.

The implications of these discrepancies are more easily discussed with Fig. 3, which illustrates the global spatial distribution of annually averaged mixing state indices predicted by the ML model (first column), MAM4 (second column), their mean difference (third column), and their mean absolute difference (fourth column). The differences in mixing state indices between the ML model and MAM4 varied strongly across the globe.

Higher values of $\chi_{\mathrm{o}}^{\mathrm{ML}}$ occurred in the continental regions (77 %) compared to oceans (69 %). Specifically, the ML model predicted high values for χo in central Africa (20° S–15° N, 12–30° E), the Arctic (66.5–90° N), and southern Asia (5–38° N, 60–90° E). These are also the regions with relatively larger mass fractions of black carbon (∼5 %; see Fig. A2). The mixing state index $\chi_{\mathrm{o}}^{\mathrm{MAM4}}$ showed a higher degree of internal mixing over the globe (with a median of 90 %) compared to the ML model. The only exceptions were oceans in the Northern Hemisphere at the mid-latitudes (45–60° N, dominated by sea salt, sulfate, and secondary organic aerosol in the accumulation mode) and Antarctica (66.5–90° S, dominated by sea salt and sulfate in the accumulation mode as well as sulfate in the Aitken mode), where $\chi_{\mathrm{o}}^{\mathrm{MAM4}}$ was 75 %. Qualitatively, the MAM4 model captured the trend that areas with high black carbon concentration (defined here as concentrations above the 95th percentile) tended to have higher χo values.

The ML model estimate $\chi_{\mathrm{c}}^{\mathrm{ML}}$ suggested a rather homogeneous spatial distribution of the annually averaged mixing state, with values of approximately 50 %. Compared to $\chi_{\mathrm{c}}^{\mathrm{ML}}$, $\chi_{\mathrm{c}}^{\mathrm{MAM4}}$ values were lower (primary carbonaceous aerosol more externally mixed) at high latitudes and higher at low and mid-latitudes (primary carbonaceous aerosol more internally mixed). Note that, while $\chi_{\mathrm{c}}^{\mathrm{MAM4}}$ values were similar in the Arctic and Antarctic, the abundance of primary carbonaceous species was predicted to be higher in the Arctic compared to the Antarctic (see Fig. A2).

The spatial distributions of $\chi_{\mathrm{h}}^{\mathrm{MAM4}}$ were similar to those of $\chi_{\mathrm{c}}^{\mathrm{MAM4}}$. That is, the MAM4 model predicted that the hygroscopic species and non-hygroscopic species were more externally mixed at high latitudes and more internally mixed at low latitudes. In contrast, the spatial distribution of $\chi_{\mathrm{h}}^{\mathrm{ML}}$ shows qualitative differences compared to $\chi_{\mathrm{c}}^{\mathrm{ML}}$ in two aspects. First, $\chi_{\mathrm{h}}^{\mathrm{ML}}$ was higher than $\chi_{\mathrm{c}}^{\mathrm{ML}}$ at high latitudes, meaning that hygroscopic and non-hygroscopic species appeared more internally mixed than primary carbonaceous and non-carbonaceous species in this region. Second, areas over the North Atlantic Ocean (0–20° N, 20–45° W), southern Africa (5–32° S, 5–20° E), and Australia (10–30° S, 100–140° E) appeared rather externally mixed. These are areas where mineral dust is the dominant aerosol species (see Fig. A2).

These two facts lead to the overall finding that χh exhibits the largest differences between the two methods. This applies especially to regions where mineral dust was the dominant aerosol species, which points to an important structural issue of the four-mode setup used in MAM4. While the ML model predicted a more external mixture in these regions (dust externally mixed from sea salt and other species), the MAM4 model could not represent this because the accumulation mode included all six aerosol species in an internal mixture. Figure 4 illustrates the relationship of the mean absolute difference of χh and the mass fraction of dust for all model grid points. It confirms that grid points with large dust mass fractions were associated with larger mean absolute differences in χh. These results confirm the tradeoff discussed in Liu et al. (2012): MAM3 (and MAM4 in Liu et al.2016) intentionally combines dust and sea salt in the same mode to reduce the computational burden; however, this simplification does not always realistically reflect the aerosol mixing state in the ambient atmosphere.

It is interesting to note that the areas where sea salt is present, but not dust, are not associated with large errors, even though sea salt – just like mineral dust – is a primary aerosol type. The reason for this lies in our surrogate species definitions (Table 1) for computing the mixing state index. Based on our mixing state definitions, sea salt, secondary organic aerosol, and sulfate are always grouped together. Therefore, none of the mixing state indices as defined here tell us how externally mixed sea salt is when it is considered as a single aerosol type.

Figure 4. Dependence of mean absolute difference of χh on dust mass fractions for all model grid points.


Figure 5 further shows the zonal mean annual aerosol mixing state indices, highlighting that the differences between the two models for χc and χh tended to be zonally structured: relative to the ML model, the MAM4 model overestimated both indices at low latitudes and underestimated them at high latitudes. In contrast, the MAM4 model overestimated χo at all latitudes north of 60° S.

Figure 5. Zonal mean annual aerosol mixing state indices (a) χo, (b) χc, and (c) χh using the MAM4 model and ML model. The bands refer to the standard deviation.


5.2 Interpretation of findings

From Sect. 5.1, the following picture emerges: MAM4 overestimates the mixing state index χo except in regions at high latitudes in the Southern Hemisphere. At the same time, χc and χh are overestimated at low latitudes to mid-latitudes and underestimated at high latitudes. These findings point towards too rapid a transfer from the primary carbon mode to the accumulation mode at low latitudes to mid-latitudes and too slow a transfer at high latitudes.

To conceptually illustrate these relationships, here we use χo and χc as examples and contrast the conditions for high and low latitudes. Figure 6a–f show conditions representative of high latitudes. A grid cell sampled from the CESM2/MAM4 simulation (73° N, 151° W) contains 15 % BC and 37 % POM, distributed over the accumulation and primary carbon modes as shown in Fig. 6a and d. The corresponding value for χo is 80 %. Figure 6b depicts a particle population that was sampled from the MAM4 population in Fig. 6a. All particles, except for the smallest ones (corresponding to Aitken mode particles), contain BC, which results in the relatively high mixing state index value for χo. Note that in MAM4 BC is not included in the Aitken mode by definition. Evaluating the mixing state metric χc, which quantifies the degree of mixing of primary carbon and other species, for the same particle population yields a different picture. The entire primary carbon mode, by definition, consists of POM and BC, which results in an appreciable number of particles that contain only primary carbon (BC + POM), giving a mixing state index χc of only 27 %.

We now compare the MAM4-sampled particle populations above to particle populations that were sampled from our PartMC scenario library. We searched for populations with similar mass fractions of BC and POM as in the MAM4 populations and that were simulated at a similar latitude as the grid point location of the CESM2/MAM4 model output. Figure 6c shows that the PartMC results have comparatively more BC-free particles, and Fig. 6f shows that comparatively more particles are mixtures of primary carbon and other species. Overall, this means that in MAM4 BC appears too internally mixed (because irrespective of whether BC is placed in the primary carbon or accumulation mode, it is by design mixed with other species) and that at high latitudes the primary carbon mode is not transferring mass to the accumulation mode as quickly as is the case in PartMC simulations.
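The effect of BC-free particles on χo can be made concrete with a small numerical sketch. It assumes the entropy-based definition of the mixing state index from Riemer and West (2013), χ = (Dα − 1)/(Dγ − 1) with Dα = exp(Hα) and Dγ = exp(Hγ), and uses two surrogate species (absorbing and non-absorbing). The three toy populations below are invented for illustration and are not taken from MAM4 or PartMC output; they share the same bulk BC mass fraction of 15 %.

```python
import numpy as np


def entropy(p):
    """Shannon entropy of mass fractions, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


def mixing_state_index(masses):
    """chi for a population given per-particle species masses.

    masses: array of shape (n_particles, n_species). Assumes every particle
    has non-zero mass and that the bulk contains more than one species
    (otherwise D_gamma = 1 and chi is undefined).
    """
    masses = np.asarray(masses, dtype=float)
    mu_i = masses.sum(axis=1)                 # mass of each particle
    mu = mu_i.sum()                           # total mass
    p_i = mu_i / mu                           # mass fraction of each particle
    h_i = np.array([entropy(row / row.sum()) for row in masses])
    h_alpha = float((p_i * h_i).sum())        # average per-particle entropy
    h_gamma = entropy(masses.sum(axis=0) / mu)  # bulk entropy
    return (np.exp(h_alpha) - 1.0) / (np.exp(h_gamma) - 1.0)


# Columns: [absorbing (BC), non-absorbing]. All populations have 15 % BC in bulk.
internal = [[0.15, 0.85]] * 4                        # every particle contains BC
partial = [[0.30, 0.70]] * 2 + [[0.0, 1.0]] * 2      # half the particles are BC-free
external = [[0.15, 0.0], [0.0, 0.85]]                # BC and the rest never share a particle

chi_internal = mixing_state_index(internal)
chi_partial = mixing_state_index(partial)
chi_external = mixing_state_index(external)
print(chi_internal, chi_partial, chi_external)
```

With the bulk composition held fixed, the index drops as soon as some particles carry no BC, which is the qualitative difference between the MAM4 and PartMC populations described above.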

The reason why MAM4 behaves in this way can be explained by the treatment of the aging process in MAM4. Aging in MAM4 is formulated using a threshold criterion. That is, BC and POA mass is transferred from the primary carbon mode to the accumulation mode when a certain threshold of sulfate and SOA has condensed. In MAM4 this threshold is set to a relatively large value. This is done to prevent BC from being removed too quickly by wet deposition (the primary carbon mode has a lower hygroscopicity than the accumulation mode and thus a lower wet scavenging efficiency), thereby counteracting a low bias in BC concentrations in the Arctic regions. From Wang et al. (2018) we already know that using such a high threshold may not be appropriate. However, the global model also has biases in other processes that contribute to the low BC bias in the Arctic, and setting the threshold to a high value compensates for these errors. Our results are a reflection of this fact. While adjusting the threshold criterion in MAM4 to a lower value may improve the agreement with the ML simulations in some regions, it may deteriorate the overall results in other areas. This is a good example of how structural uncertainty manifests itself, namely by the fact that adjusting a parameter does not fundamentally fix the issue.
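To see why a larger threshold slows the transfer between modes, consider the following toy model. It is not the MAM4 aging scheme: the function name, the arbitrary units, the threshold values, and the assumption that each time step moves a fraction `coating_per_step / threshold` of the remaining primary carbon mass are all hypothetical choices made only to show the direction of the effect.

```python
def aged_fraction(coating_per_step, threshold, n_steps):
    """Toy threshold aging (illustrative only, not MAM4's formulation).

    Returns the fraction of primary carbon mode mass that has moved to the
    accumulation mode after n_steps. coating_per_step is the amount of
    condensed soluble material per step and threshold is the amount needed
    to age a particle, both in the same arbitrary units.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    per_step = min(1.0, coating_per_step / threshold)  # fraction aged in one step
    remaining = (1.0 - per_step) ** n_steps            # mass still in the primary carbon mode
    return 1.0 - remaining


# Same condensation rate, two hypothetical thresholds.
for threshold in (2.0, 8.0):
    fractions = [round(aged_fraction(1.0, threshold, n), 3) for n in (1, 4, 16)]
    print(f"threshold={threshold}: aged fraction after 1, 4, 16 steps = {fractions}")
```

In this sketch a single global threshold cannot be right everywhere: where condensation is slow, a high threshold keeps mass in the primary carbon mode for a long time, and where condensation is fast, even a high threshold is reached almost immediately.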

Figure 6g–l show conditions representative of low latitudes. A grid cell sampled from the CESM2/MAM4 simulation (20° N, 120° E) contains 11 % BC and 24 % POM, distributed over the accumulation and primary carbon modes as shown in Fig. 6g and j, with most of the mass in the accumulation mode. The corresponding value for χo is therefore 99 %, an almost complete internal mixture. For the same reason, χc is also very high. Similarly to the high-latitude case, Fig. 6i and l show that the comparable PartMC population has comparatively more BC-free particles and more particles that contain very low amounts of primary carbonaceous material, leading to lower values of both χo and χc compared to the MAM4 results.

Figure 6. Illustration to explain the differences in mixing state representation between MAM4 and the ML model at high and low latitudes.


5.3 Comparison to observational data

The question that arises from Sect. 5.1 and 5.2 is of course the following: which spatial distribution of aerosol mixing state reflects reality more closely? The validation of simulated mixing state indices with observational data is still challenging since per-particle mass fractions of species are required for calculating the mixing state indices (see Sect. 3.1). These are in principle obtainable from in situ deployments of single-particle mass spectrometers or by using electron microscopy techniques, but their quantitative derivation comes with challenges and is not routinely done, so that only very few data sets exist that allow for a meaningful comparison (Riemer et al., 2019). Keeping these limitations in mind, Zheng et al. (2021) reported a qualitative comparison of available measurements of mixing state metrics in locations in developed countries (Paris, France; Pittsburgh, USA; various locations in Japan) (Healy et al., 2014; Ye et al., 2018; Ching et al., 2019) with seasonally averaged results from the ML model based on particle-resolved simulations. This showed that the ML model was able to capture the range of values that is consistent with the observations.

We further compared the ML model estimates using recent observations from China. Specifically, we compared χoML and χoMAM4 with χ values from Taizhou (Zhao et al., 2021) and Beijing (Yu et al., 2020) derived from Single Particle Soot Photometer (SP2) measurements. For both locations, χoMAM4 overestimated the observed χ values, while χoML was in the range of the observations. Specifically, the χ measured at the suburban site of Taizhou from 26 May to 18 June 2017 ranged from 62 % to 82 %. During the same time period (but in the year 2011), the values of χoML were between 63 % and 84 %, while χoMAM4 was between 84 % and 96 %. The χ values at the urban site of Beijing ranged between 55 % and 70 % in winter (from 10 November to 10 December 2016) and varied between 60 % and 75 % in summer (from 18 May to 25 June 2017). Using our simulations of the year 2011, χoML varied from 60 % to 88 % in winter and from 59 % to 83 % in summer. As a comparison, χoMAM4 ranged from 92 % to 97 % in winter and from 87 % to 95 % in summer. A caveat when comparing χoML and χoMAM4 with the observations reported in Zhao et al. (2021) and Yu et al. (2020) is that the definition of χoML and χoMAM4 included BC-free particles, while the χ values in the measurements were calculated considering only the subpopulation of BC-containing particles. This might introduce a bias between the χo index used in this paper and the observations, depending on the fraction of BC-free particles present at any given location.
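The ranges quoted above can be compared side by side with a few lines of code. The sketch below only restates the numbers from the text and computes how many percentage points of each modeled range overlap the observed range; the dictionary layout and the `overlap` helper are illustrative choices, and the caveat about BC-free particles applies unchanged.

```python
def overlap(a, b):
    """Length of the overlap between two closed intervals (0 if disjoint)."""
    lo = max(a[0], b[0])
    hi = min(a[1], b[1])
    return max(0.0, hi - lo)


# Ranges in percent, as quoted in the text (observations vs. year-2011 simulations).
ranges = {
    "Taizhou, 26 May to 18 June": {"obs": (62, 82), "ml": (63, 84), "mam4": (84, 96)},
    "Beijing, winter": {"obs": (55, 70), "ml": (60, 88), "mam4": (92, 97)},
    "Beijing, summer": {"obs": (60, 75), "ml": (59, 83), "mam4": (87, 95)},
}

for site, r in ranges.items():
    ml_overlap = overlap(r["obs"], r["ml"])
    mam4_overlap = overlap(r["obs"], r["mam4"])
    print(f"{site}: ML overlap = {ml_overlap}, MAM4 overlap = {mam4_overlap}")
```

In all three cases the ML range overlaps the observed range, while the MAM4 range lies entirely above it.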

We can also relate our χo index qualitatively to the SP2 measurements in the Finnish Arctic during winter 2011–2012 (Raatikainen et al., 2015). Although this study did not provide quantitative mixing state index calculations, it is an important finding that BC-containing particles (with various amounts of coatings) co-existed with BC-free particles. As we saw in Sect. 5.2, this condition can easily be represented with a particle-resolved approach. However, the modal model with modes configured as in MAM4 puts black carbon in all accumulation-sized particles (Fig. 6), which is not consistent with the observations.

6 Conclusions

In this paper we present a framework for evaluating the error in submicron aerosol mixing state induced by aerosol representation assumptions, which is one of the important contributors to structural uncertainty in aerosol models. We quantitatively compared mixing state indices for submicron aerosol predicted by the modal model MAM4 within the global model CESM to a machine-learned model based on high-detail particle-resolved simulations. We focused on the mixing of optically absorbing and non-absorbing species (χo), the mixing of primary carbonaceous with other aerosol species (χc), and the mixing of hygroscopic and non-hygroscopic species (χh).

For χo, the MAM4 modal representation generally overestimated the degree of mixing of BC with other aerosol species. This overestimation is due to the fact that MAM4's choice of modes does not allow for representing BC-free particles in the accumulation and primary carbon modes. This is in contrast to field observations by Brown et al. (2021), which showed that BC and POM may be externally mixed near sources. The implication of this is that, if optical properties are calculated based on the aerosol composition, absorption will be overestimated.

For χc and χh, the error tended to be zonally structured, where the MAM4 model overestimated the mixing state indices at low latitudes and underestimated them at high latitudes compared to the ML model. This behavior could be explained by modeling choices in MAM4, in particular that (1) BC is always emitted with POM, (2) no BC-free particles exist in the submicron modes, and (3) dust is always internally mixed with other aerosol species.

Mixing state is an important emergent property that affects the aerosol radiative forcing and aerosol–cloud interactions, but it is not easy to constrain this property globally. To the best of our knowledge, this is the first study that evaluated the spatial distribution of aerosol mixing state as predicted by a global model. Since errors in mixing state predictions propagate into errors in aerosol climate impacts, our findings provide a framework and reference for Earth system model developers and users regarding simulation reliability. For example, this framework can be used to (1) quantify model bias in simulating mixing state in different regions, identifying model structural deficiencies, and (2) provide insights into potential improvements of model process representations for a more realistic simulation of aerosols.

Appendix A: Derivation of mode-based aerosol mixing state index

Table A1 details the notation for aerosol mass and mass fractions to calculate Hα using modal information.

To explain how to obtain Eqs. (7) and (8) from Eqs. (2) and (3), let us assume that each mode m contains Nm particles and the number of species in the population is A. The mixing entropy of particle i in mode m, Hm,i, is given by

$$H_{m,i} = \sum_{a=1}^{A} -p_{m,i}^{a} \ln p_{m,i}^{a}. \tag{A1}$$

The average particle mixing entropy of the entire population (summed over all modes), Hα, is

$$H_{\alpha} = \sum_{m=1}^{M} \sum_{i=1}^{N_m} p_{m,i} H_{m,i} = \underbrace{p_{1,1} H_{1,1} + p_{1,2} H_{1,2} + \dots + p_{1,N_1} H_{1,N_1}}_{m=1} + \dots + \underbrace{p_{M,1} H_{M,1} + p_{M,2} H_{M,2} + \dots + p_{M,N_M} H_{M,N_M}}_{m=M}. \tag{A2}$$

Given that each mode is assumed to be internally mixed, particles within the same mode have the same composition, and we have

$$p_{m,i}^{a} = \frac{\mu_{m,i}^{a}}{\mu_{m,i}} = \frac{\mu_{m}^{a}}{\mu_{m}} = p_{m}^{a}. \tag{A3}$$

This results in

$$H_{m,i} = \sum_{a=1}^{A} -p_{m,i}^{a} \ln p_{m,i}^{a} = \sum_{a=1}^{A} -p_{m}^{a} \ln p_{m}^{a} = H_{m}. \tag{A4}$$

Therefore, based on Eq. (A4) and the fact that $p_m = \sum_{i=1}^{N_m} p_{m,i}$, Eq. (A2) can be rewritten as

$$H_{\alpha} = \underbrace{p_{1} H_{1}}_{m=1} + \dots + \underbrace{p_{M} H_{M}}_{m=M} = \sum_{m=1}^{M} p_{m} H_{m}. \tag{A5}$$
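For readers who want the intermediate step written out, the reduction from the particle-level sum to the mode-level sum can be sketched as follows. It uses only the internal-mixture assumption (every particle in mode m has entropy H_m) and the definition of p_m as the sum of the particle mass fractions in that mode.

```latex
\begin{aligned}
H_{\alpha}
  &= \sum_{m=1}^{M} \sum_{i=1}^{N_m} p_{m,i}\, H_{m,i}
   && \text{(average particle mixing entropy)} \\
  &= \sum_{m=1}^{M} H_{m} \sum_{i=1}^{N_m} p_{m,i}
   && \text{(since } H_{m,i} = H_{m} \text{ for all } i \text{ in mode } m\text{)} \\
  &= \sum_{m=1}^{M} p_{m}\, H_{m}
   && \text{(since } p_{m} = \textstyle\sum_{i=1}^{N_m} p_{m,i}\text{)}.
\end{aligned}
```

The number of particles per mode, N_m, drops out entirely, so only the mode masses and compositions are needed.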

With the mode-based Hα, the other mixing state quantities can be computed as described in Sect. 3.2.
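A short numerical check can make the mode-based formula more tangible. The sketch below computes Hα from mode masses alone and compares it with the explicit particle-level sum obtained by splitting each mode into identical particles. It also forms χ = (Dα − 1)/(Dγ − 1) with Dα = exp(Hα) and Dγ = exp(Hγ), which is assumed here from Riemer and West (2013) because Sect. 3.2 is not reproduced in this part of the paper. The mode masses, particle counts, and function names are hypothetical values chosen for illustration.

```python
import numpy as np


def entropy(p):
    """Shannon entropy of mass fractions, ignoring zero entries."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())


def modal_mixing_state(mode_masses):
    """Mode-based H_alpha, H_gamma and chi.

    mode_masses: array of shape (M, A), mass of species a in mode m.
    Assumes every mode has non-zero mass and the bulk has more than one species.
    """
    m = np.asarray(mode_masses, dtype=float)
    mu_m = m.sum(axis=1)                          # total mass per mode
    mu = mu_m.sum()                               # total mass
    p_m = mu_m / mu                               # mass fraction of each mode
    h_m = np.array([entropy(row / row.sum()) for row in m])
    h_alpha = float((p_m * h_m).sum())            # mode-level sum over p_m * H_m
    h_gamma = entropy(m.sum(axis=0) / mu)         # bulk entropy
    chi = (np.exp(h_alpha) - 1.0) / (np.exp(h_gamma) - 1.0)
    return h_alpha, h_gamma, chi


def particle_level_h_alpha(mode_masses, n_per_mode):
    """H_alpha from the explicit particle sum, with each mode split into
    n identical particles (the internal-mixture assumption)."""
    m = np.asarray(mode_masses, dtype=float)
    mu = m.sum()
    total = 0.0
    for row, n in zip(m, n_per_mode):
        particle = row / n                        # species masses of one particle
        p_mi = particle.sum() / mu                # mass fraction of that particle
        h_mi = entropy(particle / particle.sum())
        for _ in range(n):
            total += p_mi * h_mi
    return total


# Rows: accumulation, Aitken, primary carbon. Columns: [BC, other]. Values are made up.
modes = [[0.10, 0.60], [0.00, 0.05], [0.05, 0.20]]
h_alpha, h_gamma, chi = modal_mixing_state(modes)
h_alpha_particles = particle_level_h_alpha(modes, n_per_mode=[50, 200, 30])
print(h_alpha, h_alpha_particles, chi)
```

The two values of Hα agree to rounding error regardless of the particle counts, which is the content of the derivation above.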

Table A1. Aerosol mass and mass fraction definition and notation. The number of modes is M (M=3 for MAM4 without the coarse mode), the number of particles in mode m is Nm, and the number of species is A.


Figure A1. Aerosol species mixing ratio (µg kg−1). Accumulation mode: a1; Aitken mode: a2; primary carbon mode: a4. The coarse mode (a3) is not used in this study and therefore omitted in this figure. Black carbon: bc; dust: dst; sea salt: ncl; primary organic matter: pom; secondary organic aerosol: soa; sulfate: so4.

Figure A2. Fraction of aerosol species mixing ratio (%). Accumulation mode: a1; Aitken mode: a2; primary carbon mode: a4. The coarse mode (a3) is not used in this study and therefore omitted in this figure. Black carbon: bc; dust: dst; sea salt: ncl; primary organic matter: pom; secondary organic aerosol: soa; sulfate: so4.

Code and data availability

Notebooks and data to reproduce the global mixing state index analysis are available online (last access: 16 November 2021) and on Zenodo (Zheng, 2021).

Author contributions

ZZ, MW, and NR conceptualized the analysis and wrote the manuscript with input from the co-authors. ZZ developed the code, carried out the simulations, and performed the analysis. LZ, PLM, and XL provided scientific suggestions for the manuscript. All authors were involved in helpful discussions and contributed to the manuscript.

Competing interests

Some authors are members of the editorial board of Atmospheric Chemistry and Physics. The peer-review process was guided by an independent editor, and the authors have no other competing interests to declare.


Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


We would like to acknowledge high-performance computing support from Cheyenne, provided by NCAR's Computational and Information Systems Laboratory, sponsored by the National Science Foundation. The CESM project is supported primarily by the National Science Foundation. This research is part of the Blue Waters sustained-petascale computing project, which is supported by the National Science Foundation (awards OCI-0725070 and ACI-1238993), the State of Illinois, and as of December 2019 the National Geospatial-Intelligence Agency. Blue Waters is a joint effort of the University of Illinois at Urbana-Champaign and its National Center for Supercomputing Applications. Po-Lun Ma and Xiaohong Liu were supported by the Enabling Aerosol-cloud interactions at GLobal convection-permitting scalES (EAGLES) project (74358), funded by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research, Earth System Model Development program. The Pacific Northwest National Laboratory is operated for the U.S. Department of Energy by Battelle Memorial Institute under contract DE-AC05-76RL01830.

Financial support

This research has been supported by the Office of Biological and Environmental Research (grant no. DE-SC0019192), the NSF Division of Atmospheric and Geospace Sciences (grant no. AGS-1254428), the Office of Biological and Environmental Research (Enabling Aerosol-cloud interactions at GLobal convection-permitting scalES (EAGLES) project (grant no. 74358)), the Office of Advanced Cyberinfrastructure (grant no. OCI-0725070), and the Division of Advanced Cyberinfrastructure (grant no. ACI-1238993).

Review statement

This paper was edited by Qiang Zhang and reviewed by three anonymous referees.


Asmi, A., Wiedensohler, A., Laj, P., Fjaeraa, A.-M., Sellegri, K., Birmili, W., Weingartner, E., Baltensperger, U., Zdimal, V., Zikova, N., Putaud, J.-P., Marinoni, A., Tunved, P., Hansson, H.-C., Fiebig, M., Kivekäs, N., Lihavainen, H., Asmi, E., Ulevicius, V., Aalto, P. P., Swietlicki, E., Kristensson, A., Mihalopoulos, N., Kalivitis, N., Kalapov, I., Kiss, G., de Leeuw, G., Henzing, B., Harrison, R. M., Beddows, D., O'Dowd, C., Jennings, S. G., Flentje, H., Weinhold, K., Meinhardt, F., Ries, L., and Kulmala, M.: Number size distributions and seasonality of submicron particles in Europe 2008–2009, Atmos. Chem. Phys., 11, 5505–5538, 2011.

Bauer, S. E., Wright, D. L., Koch, D., Lewis, E. R., McGraw, R., Chang, L.-S., Schwartz, S. E., and Ruedy, R.: MATRIX (Multiconfiguration Aerosol TRacker of mIXing state): an aerosol microphysical module for global atmospheric models, Atmos. Chem. Phys., 8, 6003–6035, 2008.

Binkowski, F. S. and Shankar, U.: The Regional Particulate Matter Model: 1. Model Description and Preliminary Results, J. Geophys. Res.-Atmos., 100, 26191–26209, 1995.

Bogenschutz, P. A., Gettelman, A., Hannay, C., Larson, V. E., Neale, R. B., Craig, C., and Chen, C.-C.: The path to CAM6: coupled simulations with CAM5.4 and CAM5.5, Geosci. Model Dev., 11, 235–255, 2018.

Bondy, A. L., Bonanno, D., Moffet, R. C., Wang, B., Laskin, A., and Ault, A. P.: The diverse chemical mixing state of aerosol particles in the southeastern United States, Atmos. Chem. Phys., 18, 12595–12612, 2018.

Boucher, O., Randall, D., Artaxo, P., Bretherton, C., Feingold, G., Forster, P., Kerminen, V.-M., Kondo, Y., Liao, H., Lohmann, U., Rasch, P., Satheesh, S., Sherwood, S., Stevens, B., and Zhang, X.: Clouds and Aerosols, in: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, edited by: Stocker, T., Qin, D., Plattner, G.-K., Tignor, M., Allen, S., Boschung, J., Nauels, A., Xia, Y., Bex, V., and Midgley, P., book section 7, pp. 571–658, Cambridge University Press, Cambridge, United Kingdom and New York, NY, USA, 2013.

Brown, H., Liu, X., Pokhrel, R., Murphy, S., Lu, Z., Saleh, R., Mielonen, T., Kokkola, H., Bergman, T., Myhre, G., Skeie, R. B., Watson-Paris, D., Stier, P., Johnson, B., Bellouin, N., Schulz, M., Vakkari, V., Beukes, J. P., van Zyl, P. G., Liu, S., and Chand, D.: Biomass Burning Aerosols in Most Climate Models Are Too Absorbing, Nat. Commun., 12, 277, 2021.

Chen, T. and Guestrin, C.: XGBoost: A Scalable Tree Boosting System, in: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM Press, San Francisco, California, USA, pp. 785–794, 2016.

Ching, J., Riemer, N., and West, M.: Impacts of black carbon mixing state on black carbon nucleation scavenging: Insights from a particle-resolved model, J. Geophys. Res.-Atmos., 117, D23209, 2012.

Ching, J., Fast, J., West, M., and Riemer, N.: Metrics to quantify the importance of mixing state for CCN activity, Atmos. Chem. Phys., 17, 7445–7458, 2017.

Ching, J., West, M., and Riemer, N.: Quantifying Impacts of Aerosol Mixing State on Nucleation-Scavenging of Black Carbon Aerosol Particles, Atmosphere, 9, 17, 2018.

Ching, J., Adachi, K., Zaizen, Y., Igarashi, Y., and Kajino, M.: Aerosol Mixing State Revealed by Transmission Electron Microscopy Pertaining to Cloud Formation and Human Airway Deposition, npj Clim. Atmos. Sci., 2, 22, 2019.

Cziczo, D. J., Froyd, K. D., Gallavardin, S. J., Moehler, O., Benz, S., Saathoff, H., and Murphy, D. M.: Deactivation of Ice Nuclei Due to Atmospherically Relevant Surface Coatings, Environ. Res. Lett., 4, 044013, 2009.

Danabasoglu, G., Lamarque, J.-F., Bacmeister, J., Bailey, D. A., DuVivier, A. K., Edwards, J., Emmons, L. K., Fasullo, J., Garcia, R., Gettelman, A., Hannay, C., Holland, M. M., Large, W. G., Lauritzen, P. H., Lawrence, D. M., Lenaerts, J. T. M., Lindsay, K., Lipscomb, W. H., Mills, M. J., Neale, R., Oleson, K. W., Otto-Bliesner, B., Phillips, A. S., Sacks, W., Tilmes, S., Kampenhout, L., Vertenstein, M., Bertini, A., Dennis, J., Deser, C., Fischer, C., Fox-Kemper, B., Kay, J. E., Kinnison, D., Kushner, P. J., Larson, V. E., Long, M. C., Mickelson, S., Moore, J. K., Nienhouse, E., Polvani, L., Rasch, P. J., and Strand, W. G.: The Community Earth System Model Version 2 (CESM2), J. Adv. Model. Earth Syst., 12, e2019MS001916, 2020.

Dickau, M., Olfert, J., Stettler, M. E. J., Boies, A., Momenimovahed, A., Thomson, K., Smallwood, G., and Johnson, M.: Methodology for Quantifying the Volatile Mixing State of an Aerosol, Aerosol Sci. Technol., 50, 759–772, 2016.

Emmons, L. K., Schwantes, R. H., Orlando, J. J., Tyndall, G., Kinnison, D., Lamarque, J.-F., Marsh, D., Mills, M. J., Tilmes, S., Bardeen, C., Buchholz, R. R., Conley, A., Gettelman, A., Garcia, R., Simpson, I., Blake, D. R., Meinardi, S., and Pétron, G.: The Chemistry Mechanism in the Community Earth System Model version 2 (CESM2), J. Adv. Model. Earth Syst., 12, e2019MS001882, 2020.

Fierce, L., Bond, T. C., Bauer, S. E., Mena, F., and Riemer, N.: Black Carbon Absorption at the Global Scale Is Affected by Particle-Scale Diversity in Composition, Nat. Commun., 7, 12361, 2016.

Fierce, L., Riemer, N., and Bond, T. C.: Toward Reduced Representation of Mixing State for Simulating Aerosol Effects on Climate, B. Am. Meteorol. Soc., 98, 971–980, 2017.

Fraund, M., Pham, D., Bonanno, D., Harder, T., Wang, B., Brito, J., de Sá, S., Carbone, S., China, S., Artaxo, P., Martin, S., Pöhlker, C., Andreae, M., Laskin, A., Gilles, M., and Moffet, R.: Elemental Mixing State of Aerosol Particles Collected in Central Amazonia during GoAmazon2014/15, Atmosphere, 8, 173, 2017.

Ghan, S. J., Liu, X., Easter, R. C., Zaveri, R., Rasch, P. J., Yoon, J.-H., and Eaton, B.: Toward a Minimal Representation of Aerosols in Climate Models: Comparative Decomposition of Aerosol Direct, Semidirect, and Indirect Radiative Forcing, J. Climate, 25, 6461–6476, 2012.

Healy, R. M., Riemer, N., Wenger, J. C., Murphy, M., West, M., Poulain, L., Wiedensohler, A., O'Connor, I. P., McGillicuddy, E., Sodeau, J. R., and Evans, G. J.: Single particle diversity and mixing state measurements, Atmos. Chem. Phys., 14, 6289–6299, 2014.

Hughes, M., Kodros, J., Pierce, J., West, M., and Riemer, N.: Machine Learning to Predict the Global Distribution of Aerosol Mixing State Metrics, Atmosphere, 9, 15, 2018.

Kaiser, J. C., Hendricks, J., Righi, M., Riemer, N., Zaveri, R. A., Metzger, S., and Aquila, V.: The MESSy aerosol submodel MADE3 (v2.0b): description and a box model test, Geosci. Model Dev., 7, 1137–1157, 2014.

Kulmala, M., Petäjä, T., Ehn, M., Thornton, J., Sipilä, M., Worsnop, D., and Kerminen, V.-M.: Chemistry of Atmospheric Nucleation: On the Recent Advances on Precursor Characterization and Atmospheric Cluster Composition in Connection with Atmospheric New Particle Formation, Annu. Rev. Phys. Chem., 65, 21–37, 2014.

Lee, A. K., Rivellini, L.-H., Chen, C.-L., Liu, J., Price, D. J., Betha, R., Russell, L. M., Zhang, X., and Cappa, C. D.: Influences of Primary Emission and Secondary Coating Formation on the Particle Diversity and Mixing State of Black Carbon Particles, Environ. Sci. Technol., 53, 9429–9438, 2019.

Liu, X., Easter, R. C., Ghan, S. J., Zaveri, R., Rasch, P., Shi, X., Lamarque, J.-F., Gettelman, A., Morrison, H., Vitt, F., Conley, A., Park, S., Neale, R., Hannay, C., Ekman, A. M. L., Hess, P., Mahowald, N., Collins, W., Iacono, M. J., Bretherton, C. S., Flanner, M. G., and Mitchell, D.: Toward a minimal representation of aerosols in climate models: description and evaluation in the Community Atmosphere Model CAM5, Geosci. Model Dev., 5, 709–739, 2012.

Liu, X., Ma, P.-L., Wang, H., Tilmes, S., Singh, B., Easter, R. C., Ghan, S. J., and Rasch, P. J.: Description and evaluation of a new four-mode version of the Modal Aerosol Module (MAM4) within version 5.3 of the Community Atmosphere Model, Geosci. Model Dev., 9, 505–522, 2016.

O'Brien, R. E., Wang, B., Laskin, A., Riemer, N., West, M., Zhang, Q., Sun, Y., Yu, X.-Y., Alpert, P., Knopf, D. A., Gilles, M. K., and Moffet, R. C.: Chemical Imaging of Ambient Aerosol Particles: Observational Constraints on Mixing State Parameterization, J. Geophys. Res.-Atmos., 120, 9591–9605, 2015.

Pierce, J. R., Croft, B., Kodros, J. K., D'Andrea, S. D., and Martin, R. V.: The importance of interstitial particle scavenging by cloud droplets in shaping the remote aerosol size distribution and global aerosol-climate effects, Atmos. Chem. Phys., 15, 6147–6158, 2015.

Raatikainen, T., Brus, D., Hyvärinen, A.-P., Svensson, J., Asmi, E., and Lihavainen, H.: Black carbon concentrations and mixing state in the Finnish Arctic, Atmos. Chem. Phys., 15, 10057–10070, 2015.

Riemer, N. and West, M.: Quantifying aerosol mixing state with entropy and diversity measures, Atmos. Chem. Phys., 13, 11423–11439, 2013.

Riemer, N., Vogel, H., Vogel, B., and Fiedler, F.: Modeling Aerosols on the Mesoscale-γ: Treatment of Soot Aerosol and Its Radiative Effects, J. Geophys. Res.-Atmos., 108, 4601, 2003.

Riemer, N., West, M., Zaveri, R. A., and Easter, R. C.: Simulating the Evolution of Soot Mixing State with a Particle-resolved Aerosol Model, J. Geophys. Res.-Atmos., 114, D09202, 2009.

Riemer, N., Ault, A. P., West, M., Craig, R. L., and Curtis, J. H.: Aerosol Mixing State: Measurements, Modeling, and Impacts, Rev. Geophys., 57, 187–249, 2019.

Tian, J., Riemer, N., West, M., Pfaffenberger, L., Schlager, H., and Petzold, A.: Modeling the evolution of aerosol particles in a ship plume using PartMC-MOSAIC, Atmos. Chem. Phys., 14, 5327–5347, 2014.

United States Environmental Protection Agency: CMAQ (Version 5.2), Zenodo [software], 2017.

Vogel, B., Vogel, H., Bäumer, D., Bangert, M., Lundgren, K., Rinke, R., and Stanelle, T.: The comprehensive model system COSMO-ART – Radiative impact of aerosol on the state of the atmosphere on the regional scale, Atmos. Chem. Phys., 9, 8661–8680, 2009.

Wang, Y., Ma, P.-L., Peng, J., Zhang, R., Jiang, J. H., Easter, R. C., and Yung, Y. L.: Constraining Aging Processes of Black Carbon in the Community Atmosphere Model Using Environmental Chamber Measurements, J. Adv. Model. Earth Syst., 10, 2514–2526, 2018.

Wang, Y. H., Liu, Z. R., Zhang, J. K., Hu, B., Ji, D. S., Yu, Y. C., and Wang, Y. S.: Aerosol physicochemical properties and implications for visibility during an intense haze episode during winter in Beijing, Atmos. Chem. Phys., 15, 3205–3215, 2015.

Willmott, C. J.: On the Validation of Models, Phys. Geogr., 2, 184–194, 1981.

Wilson, J., Cuvelier, C., and Raes, F.: A Modeling Study of Global Mixed Aerosol Fields, J. Geophys. Res.-Atmos., 106, 34081–34108, 2001.

Yang, M., Howell, S. G., Zhuang, J., and Huebert, B. J.: Attribution of aerosol light absorption to black carbon, brown carbon, and dust in China – interpretations of atmospheric measurements during EAST-AIRE, Atmos. Chem. Phys., 9, 2035–2050, 2009.

Ye, Q., Gu, P., Li, H. Z., Robinson, E. S., Lipsky, E., Kaltsonoudis, C., Lee, A. K., Apte, J. S., Robinson, A. L., Sullivan, R. C., Presto, A. A., and Donahue, N. M.: Spatial Variability of Sources and Mixing State of Atmospheric Particles in a Metropolitan Area, Environ. Sci. Technol., 52, 6807–6815, 2018.

Yu, C., Liu, D., Broda, K., Joshi, R., Olfert, J., Sun, Y., Fu, P., Coe, H., and Allan, J. D.: Characterising mass-resolved mixing state of black carbon in Beijing using a morphology-independent measurement method, Atmos. Chem. Phys., 20, 3645–3661, 2020.

Yu, F. and Luo, G.: Simulation of particle size distribution with a global aerosol model: contribution of nucleation to aerosol and CCN number concentrations, Atmos. Chem. Phys., 9, 7691–7710, 2009.

Zaveri, R. A., Easter, R. C., Fast, J. D., and Peters, L. K.: Model for Simulating Aerosol Interactions and Chemistry (MOSAIC), J. Geophys. Res.-Atmos., 113, D13204, 2008.

Zhang, Y., Su, H., Kecorius, S., Wang, Z., Hu, M., Zhu, T., He, K., Wiedensohler, A., Zhang, Q., and Cheng, Y.: Mixing State of Refractory Black Carbon of the North China Plain Regional Aerosol Combining a Single Particle Soot Photometer and a Volatility Tandem Differential Mobility Analyzer, Atmos. Chem. Phys. Discuss. [preprint], 2017.

Zhao, G., Tan, T., Zhu, Y., Hu, M., and Zhao, C.: Method to Quantify the Black Carbon Aerosol Light Absorption Enhancement with Entropy and Diversity Measures, Atmos. Chem. Phys. Discuss. [preprint], in review, 2021.

Zheng, Z.: zzheng93/code_ms_ml_mam4: First release, Zenodo [data set], 2021.

Zheng, Z., Curtis, J. H., Yao, Y., Gasparik, J. T., Anantharaj, V. G., Zhao, L., West, M., and Riemer, N.: Estimating submicron aerosol mixing state at the global scale with machine learning and Earth system modeling, Earth Space Sci., 8, e2020EA001500, 2021.

Short summary
Aerosol mixing state is an important emergent property that affects aerosol radiative forcing and aerosol–cloud interactions, but it has not been easy to constrain this property globally. We present a framework for evaluating the error in aerosol mixing state induced by aerosol representation assumptions, which is one of the important contributors to structural uncertainty in aerosol models. Our study provides insights into potential improvements to model process representation for aerosols.