Classifying aerosol type using in situ surface spectral aerosol optical properties

Abstract. Knowledge of aerosol size and composition is important for determining radiative forcing effects of aerosols, identifying aerosol sources and improving aerosol satellite retrieval algorithms. The ability to extrapolate aerosol size and composition, or type, from intensive aerosol optical properties can help expand the current knowledge of spatiotemporal variability in aerosol type globally, particularly where chemical composition measurements do not exist concurrently with optical property measurements. This study uses medians of the scattering Ångström exponent (SAE), absorption Ångström exponent (AAE) and single scattering albedo (SSA) from 24 stations within the NOAA/ESRL Federated Aerosol Monitoring Network to infer aerosol type using previously published aerosol classification schemes. Three methods are implemented to obtain a best estimate of dominant aerosol type at each station using aerosol optical properties. The first method plots station medians into an AAE vs. SAE plot space, so that a unique combination of intensive properties corresponds with an aerosol type. The second typing method expands on the first by introducing a multivariate cluster analysis, which aims to group stations with similar optical characteristics and thus similar dominant aerosol type. The third and final classification method pairs 3-day backward air mass trajectories with median aerosol optical properties to explore the relationship between trajectory origin (proxy for likely aerosol type) and aerosol intensive parameters, while allowing for multiple dominant aerosol types at each station. The three aerosol classification methods have some common, and thus robust, results. In general, estimating dominant aerosol type using optical properties is best suited for site locations with a stable and homogeneous aerosol population, particularly continental polluted (carbonaceous aerosol), marine polluted (carbonaceous aerosol mixed with sea salt) and continental dust/biomass sites (dust and carbonaceous aerosol); however, current classification schemes perform poorly when predicting dominant aerosol type at remote marine and Arctic sites and at stations with more complex locations and topography where variable aerosol populations are not well represented by median optical properties. Although the aerosol classification methods presented here provide new ways to reduce ambiguity in typing schemes, more work is needed to find aerosol typing methods that are useful for a larger range of geographic locations and aerosol populations.

Published by Copernicus Publications on behalf of the European Geosciences Union.

Introduction
Although it is well established that aerosol particles affect the radiative forcing of climate both directly by scattering and absorbing sunlight and indirectly by influencing cloud formation and precipitation, aerosols still remain a primary source of uncertainty in assessing the Earth's radiative budget (Boucher et al., 2013). This uncertainty arises from a large range of aerosol chemical and physical properties as well as from the high spatiotemporal variability in aerosol particles. In order to help reduce this uncertainty and be able to better predict climatic effects of aerosols, there is a need for long-term global monitoring of aerosols (Hansen et al., 1996), compiling records not only of aerosol loading but also of aerosol characteristics and type.
Determination of aerosol type (e.g., black carbon, sea salt, dust), which is defined by the size and composition of an aerosol, is important in characterizing the role of aerosols in atmospheric processes and feedbacks, since different aerosol types have different radiative forcing effects and atmospheric behavior. Additionally, knowledge of aerosol type helps identify the aerosol source, which can be useful in implementing controls or policies to reduce aerosols that negatively influence air quality and public health and also to better understand atmospheric dynamics and long-range transport. Constraining aerosol type is also needed for improving aerosol satellite retrieval algorithms and for validating climate models (Russell et al., 2014).
Recent studies, discussed below, present classification schemes to infer aerosol type from intensive optical properties, which are calculated from ratios of extensive properties and thus not directly dependent on the aerosol amount. Successful application of this method could allow for access to aerosol composition information from remote or in situ optical property measurements that do not otherwise provide an indication of aerosol type.

Background
Three optical properties that hold information on aerosol type are the scattering Ångström exponent (SAE), absorption Ångström exponent (AAE) and single scattering albedo (SSA). SAE represents the wavelength dependence of scattering and varies inversely with particle size, so that small values of SAE indicate larger aerosol particles (e.g., dust and sea salt), and large values of SAE indicate relatively smaller aerosol particles (Schuster et al., 2006; Bergin et al., 2000, and references therein). AAE represents the wavelength dependence of absorption and depends on the composition of absorbing aerosols, such that different aerosol materials have characteristic ranges of AAE values (Bergstrom et al., 2002, 2007). Black carbon (BC), for example, has a theoretical AAE value of around 1, while dust aerosol typically has AAE values greater than 2 (Bergstrom et al., 2002, 2007; Kirchstetter et al., 2004), though AAE of ambient aerosol will likely evolve with atmospheric processing and depend strongly on composition (BC-to-OA (organic aerosol) ratio), coating and size (Saleh et al., 2014; Costabile et al., 2017; Moosmüller et al., 2011). SSA is the ratio of scattering to extinction (absorption + scattering); it provides information on aerosol darkness and composition and may determine the net sign of an aerosol's radiative forcing (Hansen et al., 1997). High SSA values near 1 indicate low- or nonabsorbing "white" aerosols, while low SSA values (below 0.85) indicate "darker" highly absorbing aerosols, and thus an SSA value can be used to characterize the aerosol type (Bergstrom et al., 2002; Russell et al., 2010; Gyawali et al., 2012). Equations for calculating these properties from extensive optical parameters are found in Sect. 4. Many studies have used the information inherent in these optical properties to predict aerosol type; Table 1 provides a review of previous studies that have utilized intensive optical property thresholds to identify aerosol type.
The studies listed in Table 1 all take slightly different approaches to show that intensive aerosol optical properties (SAE, AAE and SSA) can be utilized to classify aerosol type. Bahadur et al. (2012) determine a scheme to partition various absorbing aerosol types based on absorbing aerosol optical depth measurements from numerous AERONET sites that represent a single absorbing aerosol and test the proposed scheme using California AERONET sites with mixed aerosols. Cazorla et al. (2013) also make use of California AERONET sites by combining the measured aerosol optical properties with in situ aerosol chemical composition measurements from an aircraft campaign to create a matrix that delineates aerosol type in an AAE vs. SAE plot space. Eleven AERONET sites from around the globe are used in the study by Russell et al. (2010) to show that AAE values from full-column measurements are highly correlated with aerosol type, in general agreement with the two previously mentioned AERONET aerosol typing schemes that suggest AAE values near 1 indicate fossil fuel burning aerosol, higher AAE values indicate absorbing organic carbon (OC)/biomass burning aerosols and the highest AAE values indicate dust aerosols.
In situ measurements have also been used for aerosol classification schemes. In situ optical measurements from the INTEX-NA aircraft campaign are used by Clarke et al. (2007) to separate biomass burning from pollution plumes. Costabile et al. (2013) propose a scheme to classify aerosols based on absorption and scattering values, using 2 years of in situ urban data from Rome, Italy, coupled with numerical simulations to create a paradigm linking key aerosol populations to their unique aerosol optical properties. Lee et al. (2012) use 6 months of optical property measurements from the in situ monitoring site in Gosan, South Korea, categorized by air mass type (either pollution or dust) using chemical composition, back trajectories and meteorological conditions. Their analysis of SAE and AAE values shows that dust air masses have the highest AAE values, with OC-polluted air masses showing the next highest AAE values. Cappa et al. (2016) utilized surface in situ measurements from the CARES field campaign in California to categorize the aerosol they observed and to suggest some modifications to the Cazorla et al. (2013) aerosol classification scheme. Finally, Yang et al. (2009) used the distinct SSA, AAE and SAE values of different air plumes in the EAST-AIRE campaign to identify absorption contributions from desert dust, biomass burning, industrial plumes and clean air in Beijing, China. It is worth mentioning that some studies take into account the spectral dependence of SSA in aerosol classification schemes (Li et al., 2015; Russell et al., 2010). This parameter was calculated for the monitoring stations in this study but was not useful in classifying aerosol type compared to the other optical properties discussed; therefore, the spectral dependence of SSA is not discussed here.
Care must be taken in comparing thresholds from all aforementioned studies, as differences are likely between column-average, ambient AERONET measurements and low-RH, surface in situ measurements. Furthermore, different wavelength pairs are used to calculate AAE and SAE depending on the study. In general, however, all studies suggest similar typing thresholds. Most previous works agree that AAE values of around 1 represent BC and/or fossil fuel burning aerosols and higher AAE values indicate light-absorbing OC (a.k.a. brown carbon; BrC) and/or dust and that high SAE values are associated with small anthropogenic aerosols (e.g., BC, sulfates or nitrates) and low SAE values are associated with large aerosols like sea salt and dust. This paper aims to assess the applicability of previous typing methods/schemes to data from 24 in situ monitoring sites within the NOAA/ESRL Federated Aerosol Monitoring Network and to explore how typing schemes may be improved based on methods using cluster analyses and air mass back trajectories. The following questions are addressed:
The literature on classifying aerosols has been largely dominated by the analysis of ground-based remote sensing or satellite data (Cazorla et al., 2013; Russell et al., 2010, 2014; Omar et al., 2005; Giles et al., 2012; Bergstrom et al., 2007, 2010; Bahadur et al., 2012), with fewer analyses done using surface in situ aerosol optical property measurements (Cappa et al., 2016; Costabile et al., 2013; Yang et al., 2009; Lee et al., 2012). The analyses in this paper utilize ground-based in situ spectral optical data that afford a unique insight into long-term, quality-assured point observations. Furthermore, since the in situ data sets used in this study are not restricted by aerosol optical depth (AOD) thresholds as are AERONET data sets, they offer a more thorough look at regions with relatively clean air.
Unlike most previous studies, this study looks at long-term records of aerosol optical properties and does so at a wide range of geographic locations, including mountaintop, desert, continental and coastal sites. The study not only offers a wide range of aerosol types to be analyzed in an individual geographic location but also provides analysis of the same aerosol type in different geographic locations.

Site descriptions
This study investigates aerosol populations at 24 monitoring stations in the NOAA/ESRL Federated Aerosol Monitoring Network. Sites were selected for the study based on the availability of data: each site had to meet the following criteria: (1) aerosol optical data available at three wavelengths and (2) long-term (> 6 months) continuous measurement records of scattering and absorption coefficients during the 2-year time period 2012-2013, unless otherwise noted (see Table 2 for time range for each site). The ARM Mobile Facility (AMF; part of the US Department of Energy's ARM Climate Research Facility) deployments, indicated in bold in Table 2, are typically 1- to 2-year deployments. Most of the AMF measurement times do not overlap with the 2012-2013 analysis period but should nevertheless be comparable to other sites and are included as a means of broadening the range of geographic locations for the analysis. One advantage of this study is the wide diversity of location types and observed aerosol loadings (which span over 3 orders of magnitude). This study includes sites in both the Northern and Southern Hemispheres, ranging in altitude from sea level to 3800 m above sea level (a.s.l.), with various climate regimes including marine, continental and Arctic. The sites experience different levels of anthropogenic influence ranging from clean remote sites to very polluted urban sites. The 24 stations are described in Table 2, and Fig. 1 shows a map of the stations.
[Table 2: station descriptions. Only fragments of the table survive: "... (Ion et al., 2005)."; "Distinct diurnal patterns in upslope/downslope air flow, with minimal influence from regional aerosol sources (Bodhaine, 1995 ..."; "High-altitude station located on the dry, arid Tibetan Plateau in China. The site experiences clean or dusty air masses coming in from the west and anthropogenically influenced and polluted air masses coming from the east (Kivekäs et al., 2009; Che et al., 2011)."]
Sites are categorized based on the site's geography and surrounding land use. Arctic sites are at latitudes greater than 70° N. Continental polluted sites have influence from urban and industrial pollution. Continental dust/biomass sites are generally more rural with influence from desert dust and/or biomass burning. Marine clean sites are in remote coastal locations, have little influence from pollution sources (except perhaps from long-range transport events) and see an abundance of marine aerosols. Marine polluted sites are also in coastal locations and may measure pollution aerosols (from continental air masses) or marine aerosols (from oceanic air masses) or some combination thereof, depending on the wind direction. Mountaintop classifications indicate sites that are higher than 2800 m in elevation; these high-altitude monitoring stations sample both free-troposphere air and air masses transported from lower elevations due to upslope/downslope flow. Site classification is inherently subjective and not always clear-cut. We acknowledge that sites could be considered to have more than one classification and have multiple aerosol types. However, the classifications were designated based on "best fit" to the site characteristics and are intended to be representative of the dominant aerosol type at each site.

Data and instruments
The data sets used for the analysis comprise in situ scattering and absorption coefficients (σsp and σap, respectively), which are quality assured and used to calculate additional parameters (AAE, SAE and SSA) as described in Eqs. (1)-(3). One-hour averaged data are used for the assessment of aerosol classification schemes and the multivariate cluster analysis. However, we use 6 h averaged optical properties for the back trajectory analysis, since back trajectories are run at 6 h intervals. Data sets from NOAA and collaborators are publicly available from the World Data Center for Aerosols (http://ebas.nilu.no/), with the exception of WLG data, while the AMF data sets are publicly available from the Department of Energy (DOE) (http://www.arm.gov/). Scattering coefficients were obtained with a TSI 3563 integrating nephelometer (TSI Inc.) at all sites, operating at wavelength channels 450, 550 and 700 nm. Absorption coefficients were measured by either a three-wavelength particle soot absorption photometer (PSAP, Radiance Research) or a three-wavelength continuous light absorption photometer (CLAP, NOAA). The PSAP instruments operate at wavelengths 467, 530 and 660 nm, and CLAP instruments operate at wavelengths 467, 528 and 652 nm. In either case, the σap values are corrected to 450, 550 and 700 nm (using AAE) so as to match the wavelengths of the σsp measurements. Table 2 indicates which instruments operate at each station. At MLO and BND, data from both the PSAP and CLAP were utilized, since at both stations the PSAP was replaced with a CLAP in the middle of the study period. An analysis of concurrent PSAP and CLAP measurements shows that the two instruments produce comparable measurements, and thus combining or directly comparing data from both instruments is not expected to affect results (Ogren et al., 2010).
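The wavelength adjustment of the absorption coefficients mentioned above can be sketched as follows. This is a minimal illustration, assuming a power-law dependence of absorption on wavelength (σap proportional to λ^(-AAE)) and assuming that a single AAE from the outermost channel pair is applied to all three channels; the text does not specify which pairs were actually used. The function names and the numerical values are hypothetical.

```python
import math

def angstrom_exponent(coef_1, lam_1, coef_2, lam_2):
    """Ångström exponent from coefficients at two wavelengths (nm)."""
    return -math.log(coef_1 / coef_2) / math.log(lam_1 / lam_2)

def shift_wavelength(sigma_ap, lam_measured, lam_target, aae):
    """Scale sigma_ap (Mm^-1) to lam_target, assuming sigma_ap ~ lambda^(-AAE)."""
    return sigma_ap * (lam_target / lam_measured) ** (-aae)

# Hypothetical PSAP absorption coefficients (Mm^-1), keyed by wavelength in nm.
psap = {467: 5.0, 530: 4.4, 660: 3.5}

# Assumption: one AAE from the outermost channel pair is used for all shifts.
aae = angstrom_exponent(psap[467], 467, psap[660], 660)

# Instrument channel -> nephelometer channel it is adjusted to.
targets = {467: 450, 530: 550, 660: 700}
adjusted = {targets[lam]: shift_wavelength(s, lam, targets[lam], aae)
            for lam, s in psap.items()}

print(round(aae, 2), {lam: round(s, 2) for lam, s in adjusted.items()})
```

For a CLAP, the same sketch would apply with the keys 467, 528 and 652 nm.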
To ensure data sets are comparable across monitoring stations, all data are quality controlled. In order to minimize aerosol hygroscopic effects, measurements at all stations (except SUM and SPL) are made at a reduced relative humidity (RH < 40 %) by heating the inlet air or by diluting with filtered, dry air. At most sites the inlet air is gently heated (heating does not exceed 40 °C) with either a stack heater or a small heater by the impactor, and the heaters are used only if the relative humidity exceeds 40 %. Although heating the sampling inlet can cause loss of organic and volatile aerosol material, which can alter the aerosol spectral optical properties, this is not expected to substantially impact results here. Studies show that the fraction of volatile components removed at 40 °C (by a thermal denuder) is less than 10 % (Mendes et al., 2016; Huffman et al., 2009). For this particular study, we do not have the data necessary to evaluate the extent to which aerosol optical properties are affected by the heating, but evidence from other studies suggests the effect is likely small.
Monitoring station buildings are also temperature controlled, and inlet stacks have protective caps and screens to prevent interference from precipitation, insects or debris. All aerosol scattering coefficient measurements from the TSI nephelometers are corrected for angular non-idealities using corrections from Anderson and Ogren (1998). After the corrections, scattering coefficients measured by the nephelometer have an uncertainty of 9.3 % for the 10 µm size cut, based on the analysis by Sherman et al. (2015). The Sherman et al. (2015) calculations represent median continental conditions and might change at sites with cleaner or more polluted conditions. Aerosol absorption coefficient measurements from PSAP and CLAP instruments are adjusted for flow rate, spot size and aerosol scattering, using the correction from Bond et al. (1999), and further adjusted for wavelength based on corrections from Ogren (2010). After corrections, absorption coefficients measured by the PSAP or CLAP have an uncertainty of ∼ 20 % (Sherman et al., 2015). Finally, all data are passed through a quality-assurance and quality-control editing process in which measurement records are screened for atypical aerosol parameters (see Delene and Ogren, 2002, and Sheridan et al., 2016, for detailed descriptions of quality assurance and quality control procedures). Points that appear anomalous due to local pollution sources (nonrepresentative of regional aerosol), instrument error or excessive noise are not included in this analysis.
The measured scattering and absorption coefficients are extensive aerosol properties because they depend on the amount of aerosol present (Ogren, 1995; Delene and Ogren, 2002). Intensive aerosol optical properties are calculated from ratios of the extensive properties. The aerosol intensive properties, including AAE, SAE and SSA, are of primary interest to this study since they contain information on aerosol size or composition and are calculated as indicated in the following equations:
$$\mathrm{AAE} = -\frac{\log\left(\sigma_{\mathrm{ap},\lambda_1}/\sigma_{\mathrm{ap},\lambda_2}\right)}{\log\left(\lambda_1/\lambda_2\right)}, \tag{1}$$
$$\mathrm{SAE} = -\frac{\log\left(\sigma_{\mathrm{sp},\lambda_1}/\sigma_{\mathrm{sp},\lambda_2}\right)}{\log\left(\lambda_1/\lambda_2\right)}, \tag{2}$$
$$\mathrm{SSA} = \frac{\sigma_{\mathrm{sp}}}{\sigma_{\mathrm{sp}} + \sigma_{\mathrm{ap}}}, \tag{3}$$
where $\sigma_{\mathrm{ap},\lambda_1}$ represents the absorption coefficient at wavelength $\lambda_1$ and $\sigma_{\mathrm{ap},\lambda_2}$ represents the absorption coefficient at wavelength $\lambda_2$. Similarly, $\sigma_{\mathrm{sp},\lambda_1}$ and $\sigma_{\mathrm{sp},\lambda_2}$ represent scattering coefficients at wavelengths $\lambda_1$ and $\lambda_2$, respectively. Unless otherwise indicated, all data presented here refer to the green wavelength channel (550 nm) for SSA, absorption and scattering coefficient values or the blue/red wavelength pair (450 nm/700 nm) for the SAE and AAE values. CLAP and PSAP wavelengths were adjusted to match the nephelometer wavelengths to compute the intensive variables.
Only aerosol measurements where σsp > 1 Mm⁻¹ and σap > 0.5 Mm⁻¹ are included in the analyses. Data below these values are less reliable due to instrument noise at low aerosol loading; thus the constraints are meant to act as noise thresholds. This inherently adds bias to the data, as monitoring sites with consistently low absorption and scattering coefficients may end up with limited data points after the thresholds are applied, leaving measurement records with higher loadings that may not be fully representative of typical aerosol populations at the site. This constraint has the greatest effect on clean sites like ALT, BRW and SUM (which measure Arctic air), BEO and MLO (which sometimes measure free-tropospheric air), and CPR, CPT, PVC, PYE and THD (which sometimes measure clean marine air). At these sites the constraints push the median extensive scattering and absorption values higher. More details on the effect of the thresholds on the analysis of clean stations can be found in Table S5 in the Supplement.
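As an illustration of how the intensive properties and the noise thresholds fit together, the sketch below computes SAE, AAE and SSA for a single hourly record. It assumes the standard two-wavelength Ångström exponent (negative log ratio of coefficients over log ratio of wavelengths) and SSA as scattering over extinction, and it assumes the thresholds are applied to the 550 nm channel; the record layout, field names and values are hypothetical.

```python
import math

SCAT_MIN = 1.0  # Mm^-1, noise threshold on the scattering coefficient
ABS_MIN = 0.5   # Mm^-1, noise threshold on the absorption coefficient

def angstrom_exponent(coef_1, coef_2, lam_1, lam_2):
    """Wavelength dependence of a coefficient between lam_1 and lam_2 (nm)."""
    return -math.log(coef_1 / coef_2) / math.log(lam_1 / lam_2)

def intensive_properties(rec):
    """Return SAE, AAE and SSA for one record, or None if below the thresholds.

    rec maps hypothetical keys 'sp450', 'sp550', 'sp700', 'ap450', 'ap550',
    'ap700' to coefficients in Mm^-1 (absorption already wavelength-adjusted).
    """
    # Assumption: thresholds are checked on the green (550 nm) channel.
    if rec["sp550"] <= SCAT_MIN or rec["ap550"] <= ABS_MIN:
        return None
    return {
        "SAE": angstrom_exponent(rec["sp450"], rec["sp700"], 450, 700),
        "AAE": angstrom_exponent(rec["ap450"], rec["ap700"], 450, 700),
        "SSA": rec["sp550"] / (rec["sp550"] + rec["ap550"]),
    }

# Hypothetical hourly records: one well above the thresholds, one too clean.
polluted = {"sp450": 40.0, "sp550": 30.0, "sp700": 20.0,
            "ap450": 4.0, "ap550": 3.2, "ap700": 2.5}
clean = {"sp450": 0.9, "sp550": 0.7, "sp700": 0.5,
         "ap450": 0.1, "ap550": 0.08, "ap700": 0.06}

print(intensive_properties(polluted))
print(intensive_properties(clean))
```

Records rejected here simply drop out of the station statistics, which is the source of the bias toward higher loadings described above.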
There are some differences in monitoring station data that may affect the results of the following analyses and are noted here. SUM utilizes a 2.5 µm size cut, while all other stations measure at size cuts of both 1 and 10 µm; only the 10 µm data from those stations are used in this study. This size cut discrepancy will bias SUM data towards higher SAE values than would be found with a larger size cut. Since ARM station data records are typically less than 1 year in length, while all other station data are 2 years in length, any site-specific seasonal variations may not be captured in the ARM data records. Furthermore, ARM measurement times and CPT times typically do not overlap with the baseline study period of 2012-2013, so any extreme events specific to those years are not reflected in the CPT (data only from years 2010-2011) or ARM (FKB, GRW, NIM, PGH, PVC, PYE) site measurements.

Data analysis methods
The aerosol classification analysis presented here proceeds in three steps.
1. Application and assessment of previous aerosol typing schemes: presenting station intensive optical property medians in an AAE vs. SAE plot space modeled closely on Cappa et al. (2016) in order to link a combination of AAE and SAE values to aerosol type.
2. Multivariate cluster analysis: performing a multivariate cluster analysis to group stations with like optical properties to better infer a common aerosol type.
3. Back trajectory analysis: combining back trajectories and the land type over which they traveled with aerosol optical properties to better understand the relationship between trajectory origin (proxy for likely aerosol type) and aerosol intensive properties, while allowing more than one dominant aerosol type at each station.
The methods for these analysis techniques are described in detail here.

Methods for application and assessment of previous aerosol typing schemes
Like many previous studies (Cappa et al., 2016; Cazorla et al., 2013; Costabile et al., 2013; Yang et al., 2009; Lee et al., 2012; Bahadur et al., 2012), an AAE vs. SAE plot space is used here to visualize relationships between aerosol optical properties and likely aerosol type. Since SAE indicates aerosol size and AAE holds information on aerosol composition and size (Costabile et al., 2017), a unique combination of the two, and thus where that combination falls within the AAE vs. SAE plot space, suggests a particular aerosol type. Many previous studies use chemical composition data (Costabile et al., 2013; Lee et al., 2012; Cazorla et al., 2013) or numerical simulations (Costabile et al., 2013) [...] Cappa et al. (2016) matrix makes more specific designations of aerosol mixtures (e.g., adds "mixed dust, BC, BrC" and "large-particle-BC mix"). The Cappa et al. (2016) matrix also replaces the Cazorla et al. (2013) matrix designation of "large coated particles" with "large-particle-low-absorption mix or large black particles". Finally, the Cappa et al. (2016) matrix replaces the Cazorla et al. (2013) matrix designation of "EC" with "small-particle-low-absorption mix". We chose to primarily use the Cappa et al. (2016) matrix since it is based on in situ data (the Cazorla et al., 2013, matrix is based on AERONET data) and since the aerosol designations seemed to align most closely with our data. Results are presented in Sect. 6.1.

Methods for multivariate cluster analysis
In order to infer a more accurate representation of aerosol type using intensive optical properties as an indication of aerosol size/composition and extensive optical properties as an indication of loading, a multivariate clustering analysis is performed to build on the first classification method. A cluster analysis is the process of statistical grouping that yields "clusters" with similar characteristics. A few other studies also implement multidimensional clustering as a means of solidifying aerosol property thresholds for different aerosol types (Omar et al., 2005; Levy et al., 2007). In this study, a cluster analysis is used to determine groups of stations with similar aerosol type based on aerosol optical properties. The clusters are then plotted in a 3-D parameter space (AAE vs. SAE vs. log(σsp)) as a means of visualizing any spatial patterns that emerge.
The k-means clustering algorithm was run using medians of four aerosol optical property parameters (SAE, AAE, SSA and the log of the scattering coefficient, log(σsp)) from hourly averaged records at each monitoring station. The scattering coefficient, σsp, is an indication of aerosol loading and is implemented here as an additional parameter to improve the inference regarding aerosol types. The log of σsp (in Mm⁻¹) is used rather than the raw σsp median in order to make the scattering coefficient values more comparable with the magnitude of the optical property values, so the clustering is not dominated by one parameter. While the magnitude of loading (σsp) alone does not correspond to a specific aerosol type (for example, high loadings can be observed for dust, pollution or biomass burning events), it may act as a secondary indicator of aerosol conditions (i.e., frequency of aerosol type occurrence, loading) and source contributions, so it is included in the clustering analysis.
To run the clustering algorithm, a number of clusters k is selected. Choosing k is inherently subjective: in this analysis, k needs to be small enough that the number of stations that fall into each cluster makes for a meaningful grouping and large enough that a distinction between station groups is apparent. The algorithm then takes k initial seed points at random as cluster centroids and assigns each station to the nearest centroid in the four-parameter space. Each subsequent iteration recomputes the k centroids from the stations currently assigned to them and repeats the assignment until the algorithm converges. In this study, six clusters are selected, creating six unique groups each with similar SAE, AAE, SSA and log(σsp) characteristics. Each monitoring station was assigned to one of the six clusters produced from the algorithm, and the groupings were used to further analyze aerosol type and conditions. Results are presented in Sect. 6.2.
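The clustering procedure can be illustrated with a small, self-contained k-means (Lloyd's algorithm) sketch operating on four-parameter station medians. It is not the implementation used in the study: the station labels and values below are hypothetical, Euclidean distance on unscaled parameters is an assumption, and k = 3 is chosen only to keep the toy example small (the study uses six clusters for 24 stations).

```python
import math
import random

def kmeans(points, k, init=None, seed=0, max_iter=100):
    """Lloyd's k-means: returns (labels, centroids) for a list of tuples."""
    rng = random.Random(seed)
    # Seed centroids: caller-supplied, or k distinct points drawn at random.
    centroids = list(init) if init is not None else rng.sample(points, k)
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # converged: assignments no longer change
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster emptied
                centroids[j] = tuple(sum(c) / len(members)
                                     for c in zip(*members))
    return labels, centroids

# Hypothetical station medians: (SAE, AAE, SSA, log10 of sigma_sp in Mm^-1).
stations = {
    "site_A": (1.9, 1.1, 0.88, 1.6), "site_B": (1.8, 1.2, 0.90, 1.5),
    "site_C": (0.4, 1.3, 0.97, 1.2), "site_D": (0.5, 1.2, 0.98, 1.1),
    "site_E": (0.6, 2.4, 0.85, 2.0), "site_F": (0.7, 2.6, 0.86, 2.1),
}
names = list(stations)
labels, centroids = kmeans([stations[n] for n in names], k=3, seed=1)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

Because the result depends on the random seed points, a practical run would typically repeat the algorithm with several seeds and keep the most compact solution; that step is omitted here.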

Methods for back trajectory analysis
The NOAA Air Resources Laboratory Hybrid Single Particle Lagrangian Integrated Trajectory (HYSPLIT) model (Draxler and Rolph, 2003) was utilized to produce 3-day air mass back trajectories at 6 h intervals for the entirety of the measurement period at each station. A cluster analysis was performed in HYSPLIT on the back trajectories from individual stations in order to group air masses of similar speed, direction and altitude. A thorough description of the HYSPLIT cluster analysis methodology can be found in Kelly et al. (2013). The number of back trajectory clusters differs by station, since the selection of cluster numbers is dependent on the individual data set and is somewhat subjective. For this study, and in adherence with typical clustering methodology, a plot of total spatial variance versus number of clusters was used to determine the cluster number; the cluster number just before the total spatial variance increases dramatically is the number of clusters used for analysis at that site. From the cluster analysis, each 6 h (00:00, 06:00, 12:00, 18:00 UTC) trajectory was assigned a cluster number and paired with 6 h averaged aerosol optical property data from the monitoring station for which the back trajectories were produced. For example, the back trajectory at 06:00 UTC was paired with aerosol optical property data averaged over 03:00-09:00 UTC. The paired optical property data were then plotted in the AAE vs. SAE plot space and color-coded based on back trajectory cluster number, individually for each site. The method described assumes that clustered back trajectories may carry similar aerosol type(s) that may be unique compared to aerosol found in other back trajectory clusters; this allows for temporal variation in aerosols at a site that is dependent on the geography from which the air masses arrived at the station. Results are presented in Sect. 6.3.
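The pairing of each trajectory with a 6 h average centered on the trajectory time (e.g., 03:00-09:00 UTC for the 06:00 UTC trajectory) can be sketched as below. This is only an illustration: it assumes hourly values stamped at the start of each hour, averages whatever hours are available in the window, and uses hypothetical names and data. Whether the study averages the intensive properties directly or derives them from averaged coefficients is not stated, so the function simply averages the values it is given.

```python
from datetime import datetime, timedelta
from statistics import mean

def six_hour_average(hourly, center):
    """Mean of hourly values in [center - 3 h, center + 3 h), or None if empty.

    hourly maps a datetime (start of the hour, UTC) to a value.
    """
    start = center - timedelta(hours=3)
    window = [start + timedelta(hours=h) for h in range(6)]
    values = [hourly[t] for t in window if t in hourly]
    return mean(values) if values else None

def pair_with_clusters(hourly, trajectories):
    """Attach the centered 6 h average to each (time, cluster number) pair."""
    paired = []
    for time, cluster in trajectories:
        avg = six_hour_average(hourly, time)
        if avg is not None:  # skip trajectories with no optical data
            paired.append((time, cluster, avg))
    return paired

# Hypothetical hourly SAE record for one day and four clustered trajectories.
day = datetime(2012, 1, 1)
hourly_sae = {day + timedelta(hours=h): 1.0 + 0.01 * h for h in range(24)}
trajectories = [(day + timedelta(hours=h), c)
                for h, c in [(0, 2), (6, 2), (12, 1), (18, 3)]]

for time, cluster, sae in pair_with_clusters(hourly_sae, trajectories):
    print(time.strftime("%H:%M"), "cluster", cluster, "SAE", round(sae, 3))
```

The paired values, grouped by cluster number, are what would then be plotted in the AAE vs. SAE space for each site.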

Application and assessment of previous aerosol typing schemes
The median and interquartile spread of SAE, AAE, SSA, scattering coefficient and absorption coefficient values at each site are presented in Table 3. Additionally, Table 3 indicates the aerosol type as determined by the variation in the Cappa et al. (2016) matrix overlaid on the plot of optical property medians in Fig. 2b ("aerosol type before clustering"), as well as the aerosol type determined from a clustering analysis ("aerosol type after clustering"), as described in the next section. Descriptions of the aerosol types can be found in Cazorla et al. (2013) and Cappa et al. (2016). Median AAE and SAE values for each station are shown in Fig. 2a along with bars that represent the interquartile spread (25th to 75th percentiles) of the data. Points are shaded by median SSA value at that station. Medians are used in order to minimize influence from outliers. There are no strong spatial patterns visible in SSA shading within the AAE vs. SAE plot space in Fig. 2a. Stations with high median SAE (smaller particles) tend to have slightly lower median SSA values (darker particles) than those with low median SAE and vice versa. However, there are exceptions to this tendency, with NIM having a low median SAE value and relatively low median SSA and PVC having a high median SAE value and relatively high median SSA. Previous studies established that SSA and the wavelength dependence of SSA can be used to signify aerosol type (Yang et al., 2009;Russell et al., 2010). A three-dimensional plot space helps visualize the relationships amongst SAE, AAE and SSA. This will be further explored in the next section. Figure 2a shows the wide variance of intensive properties at any one site, with values spanning beyond the optical property signatures of a single aerosol type. 
For example, CPR has interquartile AAE values ranging from 1.16 to 2.65, a spread that encompasses multiple potential aerosol compositions, as outlined by the thresholds in Table 2 and by the classification matrix in Fig. 2b. Interquartile ranges conservatively bound the intensive properties and thus represent the dominant aerosol type at each monitoring site. Some, if not all, of the sites could have multiple aerosol types that are not well represented by the medians illustrated in Fig. 2, as discussed in the next section. Figure 2b shows the same optical property medians that are plotted in Fig. 2a. Station points are colored by station location type (as listed in Table 2 Lee et al., 2012;Yang et al., 2009;Cazorla et al., 2013). Furthermore, both remote/clean marine (e.g., GRW, PYE, THD) sites and dust-influenced sites (e.g., NIM) tend to fall on the left-hand side of the plot with low SAE values, indicative of sea salt, highly processed and coated particles, or dust (Cappa et al., 2016;Cazorla et al., 2013;Lee et al., 2012;Clarke et al., 2007;Yang et al., 2009). The largest median AAE values are observed at NIM and CPR, both of which experience Saharan dust events. NIM is located at the southern edge of the Saharan desert. Dust transport to CPR is predominantly from the African Sahel region    Fig. 2, due to a common low SAE value among the sites, they are not clustered along the AAE axis. All stations in the plot with median SAE values less than or equal to 1.1 are classified as either continental dust/biomass or marine clean, but those classifications cannot be distinguished in the Cazorla et al. (2013) matrix or the modified Cappa et al. (2016) matrix. An improved matrix may include dust, marine aerosol, large coated particles and/or highly processed (aged) particles as possible aerosol types for SAE values less than 1.1. 
Figure 2a shows that marine clean sites exhibit much higher SSA values than the continental dust/biomass sites with similarly low SAE values, which suggests that the addition of more optical parameters, including SSA, into the clustering analysis could yield more optimized aerosol classification results. Consequently, in the next section, results from a multivariate cluster analysis are used to help reduce ambiguity in aerosol classification and further hone potential aerosol type identification. Figure 3 shows median optical property values, plotted in a 3-D AAE vs. SAE vs. log(σsp) parameter space. Station points are color-coded by cluster number and sized by SSA median values. Not only does the 3-D parameter space provide a robust visualization of the clustering results, but it also provides further insight into an aerosol population than the AAE vs. SAE parameter space used previously, since information on loading and SSA are also visible. Table 4 shows median AAE, SAE, SSA and log(σsp) values along with interquartile values for each cluster, plus aerosol type and condition (where applicable) based on cluster optical property medians, thresholds from previous literature and previous knowledge of station characteristics at the sites within each cluster.

Multivariate cluster analysis
In the 3-D plot seen in Fig. 3, stations that fall within the same cluster number are also located near each other in the three-dimensional parameter space, making for an effective visualization of the relationship between aerosol population and optical properties. Furthermore, stations in each cluster generally share similar site characteristics and expected aerosol type. Discussion of results for each individual cluster is available in the Supplement, while more general results are discussed here.
The clusters presented in Fig. 3 generally group together sites that are expected to have similar aerosols, and the expected aerosol characterizations generally agree with the aerosol type inferred with the aerosol classification schemes. The method does particularly well with identifying aerosol type at stations with a more or less stable, homogeneous aerosol population, including continental stations sampling BC-dominated aerosol (i.e., clusters 2 and 3), as well as the continental stations sampling high loads of dust aerosol (i.e., Cluster 4). The method also does a fair job at identifying remote Arctic or mountaintop sites (i.e., Cluster 1) that sample large processed particles (due to aging during transport) and occasional instances of local pollution. These methods do not do as well at identifying the dominant aerosol type at stations with a more complex location and topography, where variable aerosol populations that depend on wind direction and/or occasional extreme aerosol events are not well characterized by median optical properties within the parameter space.
An advantage to the incorporation of log(σsp) into the clustering algorithm and the 3-D parameter space plot is that it allows for a more complete picture of aerosol type and conditions at the station. For example, even though the Cappa et al. (2016) aerosol typing scheme assigns a BC-dominated aerosol to both clusters 1 (remote Arctic and mountaintop stations: ALT, BRW, SUM, MLO, SPL) and 2 (heavily polluted urban coastal sites: AMY, GSN), Fig. 3 shows that these clusters are clearly different, given that Cluster 1 exhibits much lower aerosol loading than Cluster 2. The stations in these clusters are indistinguishable within just the AAE vs. SAE 2-D parameter space. Using σsp in the analysis gives further insight into the frequency of occurrence and loading of the inferred aerosol (stations in Cluster 1 measure less BC-dominated aerosol than stations in Cluster 2).
There are a few weaknesses to the approaches used thus far in typing aerosols using median optical properties and clustering to reduce ambiguity in the aerosol classification. First, knowledge of station location alone cannot accurately determine the type of aerosols found there (Omar et al., 2005). For example, long-range transport or extreme events may result in aerosols being sampled that are not generally representative of the local geographic region. Second, using a climatological mean or median value of an optical property like SAE or AAE can be misleading in the case of two or more differing aerosols being present at different times over the measurement period. For example, a median SAE value of 1 for a site that measures sea salt (low SAE near 0) over half the measurement period and pollution aerosol (high SAE near 2) over the other half of the measurement period, does not provide any real information about the aerosol population, since neither aerosol type has an SAE value of 1. In order to address these concerns, an additional analysis using air mass back trajectories is performed as a means of exploring the spread in optical property data at each site. This analysis also allows for multiple aerosol types to be present at any one location.
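The second weakness can be made concrete with a small numerical sketch. The SAE values below are synthetic and purely illustrative: half of the hypothetical record is assigned a sea-salt-like SAE of 0.1 and the other half a pollution-like SAE of 1.9.

```python
from statistics import median, quantiles

# Synthetic SAE record (hypothetical values): two aerosol regimes, equal duration
sea_salt = [0.1] * 50
pollution = [1.9] * 50
combined = sea_salt + pollution

site_median = median(combined)
q1, _, q3 = quantiles(combined, n=4)

# How many individual values actually lie near the site median?
near_median = sum(1 for v in combined if abs(v - site_median) < 0.5)

# Splitting the record by regime (as a trajectory clustering might) recovers both types
by_group = {"sea salt": median(sea_salt), "pollution": median(pollution)}

print("site median SAE:", site_median)
print("interquartile range:", q1, "to", q3)
print("values within 0.5 of the median:", near_median)
print("group medians:", by_group)
```

In this synthetic case the site median is close to 1 even though no individual value lies near 1, while the interquartile range spans both regimes; splitting the record by regime recovers the two underlying SAE values.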

Back trajectory analysis
The preceding results are derived from the application of aerosol typing schemes to median optical properties at multiple stations, a method that depends on the assumption that each site has only a single dominant aerosol type. Many of the sites in this analysis, however, are likely to have a heterogeneous aerosol population with various aerosol types. Backward air mass trajectories are incorporated into the analysis here as a means of both (1) allowing for the consideration of multiple dominant aerosol types at one station and (2) allowing for attribution of a likely aerosol source, which can help confirm the practicality of using optical properties to infer aerosol type.

Case studies
Due to a need for brevity, the back trajectory analyses for all 24 stations cannot be presented, so we selected four monitoring stations to present here: Mt Waliguan, China (WLG); Cape Cod, Massachusetts, USA (PVC); Niamey, Niger (NIM); and Heselbach, Germany (FKB). The four sites presented here were chosen to represent cases both where back trajectories helped identify aerosol types and where back trajectories did not elucidate information beyond the initial aerosol classification analysis using median optical properties. Shown for each of the four stations (Figs. 4-7) are a map of mean back trajectory paths for each cluster, a plot of trajectory height vs. backward time (color-coded by trajectory cluster number), and a plot of AAE vs. SAE properties for 6 h averaged optical property data, color-coded by paired trajectory cluster number and overlaid by the median optical property values of each cluster in the largest color-coded point. If a station's dominant aerosol type differs with air mass origin, these plots can elucidate a station's various aerosol types.

Mt Waliguan, China
The back trajectories at WLG were grouped into four clusters in HYSPLIT, as shown in Fig. 4. Cluster 1 contains ∼ 33 % of the site's back trajectories and has origins to the west of the station near northern Pakistan and traveling through western China; Cluster 2 contains ∼ 30 % of the site's back trajectories and has origins (on average) to the west of the station in rural China; Cluster 3 contains ∼ 33 % of the site's back trajectories and has origins very near the site itself and slightly to the east; and Cluster 4 contains ∼ 3 % of the site's back trajectories and has origins to the far northwest of the station, traveling to the station at high altitudes from rural Russia. AAE values are similar for each trajectory cluster, though SAE values vary. Furthermore, the median aerosol optical property values from each of the trajectory clusters are unique, suggesting a variety of aerosol types using thresholds from previous literature (Cazorla et al., 2013; Costabile et al., 2013). The optical properties from the aerosols in back trajectory Cluster 1 (from deserts in northern Pakistan and western China) imply a dust mixture. Lower SAE values mean the aerosols from this trajectory cluster are larger, and AAE values near and above 1.5 likely mean a dust and/or carbonaceous aerosol mixture (Cazorla et al., 2013). These results support those of Che et al. (2011) and Kivekäs et al. (2009), which cite deserts as aerosol sources from western wind sectors at WLG. Clusters 1 and 2 are most similar in terms of median SAE and AAE values, though the map shows that Cluster 1 trajectories traveled farther in the 3-day period, and thus had faster wind speeds. Cluster 2 and Cluster 3 have mean trajectory paths that are relatively short and are thus associated with low wind speeds. This means that these clusters are likely to be more influenced by local aerosol sources. 
The optical properties of the aerosols from back trajectory Cluster 3 coming from the east suggest BC, given the AAE value near 1. This is in agreement with findings of Kivekäs et al. (2009) that show that increased particle concentrations from the east of the WLG station indicated anthropogenic pollution. Cluster 4 looks quite different than the other trajectories and has median optical properties indicative of dust (Cazorla et al., 2013), which makes sense given the trajectory cluster's origin to the northwest of the site (Che et al., 2011;Kivekäs et al., 2009).

Niamey, Niger
The back trajectories at NIM were grouped into three clusters in HYSPLIT, as shown in Fig. 5. Cluster 1 contains slightly over half (∼ 53 %) of the back trajectories, with air mass trajectories reaching the site (on average) from the south/southwest, and traveling at a relatively low altitude over populated regions. Cluster 1 differs from clusters 2 and 3 in that it has a lower median AAE value and a higher median SAE value. Given the optical properties of the trajectory cluster 1, along with the knowledge of anthropogenic activities in the source region, the likely dominant aerosol during those trajectories is a biomass-burning-soot-aerosol mixture (e.g., Osborne et al., 2008;MacFarlane et al., 2009). Clusters 2 and 3 constitute slightly less than half (∼ 46 %) of the back trajectories at NIM and originate (on average) from the north and northeast of the site. In Fig. 5, the median optical property values of clusters 2 and 3 are nearly indistinguishable. For these two clusters, the small SAE values and AAE values above ∼ 1.5 suggest dust mixtures (Cazorla et al., 2013;Lee et al., 2012;Yang et al., 2009). Previous observations by Osborne et al. (2008) noted dust during northerly flow due to the proximity of the Sahara desert to the north/northeast of the site, as did MacFarlane et al. (2009). NIM provides a good example of trajectory analysis elucidating two dominant aerosol types that were obscured when only the climatological medians of AAE and SAE values were evaluated. However, it should be noted that local sources and meteorological conditions also have a large influence on aerosol at the site, in addition to trajectory sources.

Cape Cod, Massachusetts, USA
Back trajectories at PVC were clustered into three groups in HYSPLIT, as shown in Fig. 6. Cluster 1 contains almost half (∼ 49 %) of the trajectories and originates (on average) to the south and southeast of the Cape Cod site along the heavily populated eastern US seaboard. Cluster 2 contains ∼ 43 % of the trajectories and (on average) travels to the monitoring station from the northwest over eastern Canada. Cluster and large cities like Boston, it is unsurprising that the site measures both marine and urban aerosols, depending on the wind direction. The pairing of back trajectory analysis with optical property classification gives a more detailed picture of the multiple aerosol populations at PVC, in accord with other aerosol research done at the site (Titos et al., 2014). Since the back trajectories from over the Atlantic make up such a small portion of the air masses that arrive at PVC, this could explain why this station clusters with continental polluted stations instead of marine polluted stations in the first cluster analysis of this study in Sect. 5.1.

Heselbach, Black Forest, Germany
Back trajectories at FKB group into two clusters (Fig. 7), each containing approximately half of the back trajectories. Back trajectories associated with Cluster 1 typically originated from the northwest over the North Atlantic and are associated with higher wind speeds and longer-distance transport than those in Cluster 2. Cluster 2 tended to travel shorter distances in reaching the site, with mean back trajectories originating from the east of the station in southern Germany, as shown in Fig. 7. Despite the very different geographical origins of the two air mass clusters and very different wind speeds (on average), both trajectory groups have similar median optical property signatures and suggest a BC-dominated aerosol type (Cazorla et al., 2013; Yang et al., 2012; Costabile et al., 2013; Lee et al., 2009). The similarity in aerosol properties between the two trajectory clusters arriving at FKB suggests that FKB measures aerosols that are regionally representative of western Europe. Previous analysis of FKB data shows that the site is dominated by anthropogenic aerosol (Jefferson, 2010). Due to the homogeneity of the aerosol population at the FKB site, back trajectory analysis does not provide any additional information useful for aerosol typing.

All stations
A broader understanding of the link between back trajectory clusters, aerosol optical property measurements and aerosol type can be gained by collectively analyzing all trajectory clusters at all stations rather than looking at stations individually. Here, each trajectory cluster from every site is classified based on where the trajectory originated and the geography over which the air mass traveled; then trajectory clusters from all stations are plotted in one AAE vs. SAE plot space. The classifications include continental Arctic, continental dust, continental dust/polluted, continental polluted, polluted marine and remote marine. A trajectory cluster is classified as continental Arctic if it passes over land north of 60° N latitude, continental dust if it passes over remote desert, continental dust/polluted if it passes over populated desert regions with anthropogenic influence, continental polluted if it passes over populated land, polluted marine if it passes over populated coastal regions with anthropogenic influence, and remote marine if it passes over clean, unpopulated ocean regions. Table S6 details classifications of each trajectory cluster at all stations. There is unavoidable subjectivity in this classification method, for a few reasons. For one, some trajectories travel over geography that falls into more than one of the classifications chosen for the analysis. In these cases, other factors, such as underlying geography and typical site aerosol populations, were considered to make the more nuanced classifications. In addition, back trajectory analyses of aerosol type need to account for air mass dispersion, aerosol wet and dry deposition, cloud processing, and additional sources added at low altitudes and locally. A good example of this is long-range transport of African dust over the Atlantic Ocean. A 3-day back trajectory may not be sufficient to identify long-range dust transport from the African continent. 
Here, the delineation of dust from marine aerosol is ambiguous. More information on the aerosol composition and hygroscopicity is needed for more conclusive aerosol identification. The authors acknowledge this weakness of the methodology and its inherent uncertainty and subjectivity.
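The trajectory classification rules can be sketched in code. This is only an illustration of the rules as worded above: the `TrajectorySummary` fields and the way each rule is translated into boolean flags are assumptions of the sketch. Because ambiguous cases in the study were resolved by judgment rather than by a fixed precedence, the function returns every class whose rule matches instead of forcing a single answer.

```python
from dataclasses import dataclass

@dataclass
class TrajectorySummary:
    """Hypothetical description of the surface beneath a mean trajectory path."""
    land_north_of_60n: bool = False
    desert: bool = False
    populated: bool = False   # anthropogenic influence along the path
    coastal: bool = False
    open_ocean: bool = False

def candidate_classes(t):
    """Return all trajectory classes whose rule matches (may be more than one)."""
    classes = []
    if t.land_north_of_60n:
        classes.append("continental Arctic")
    if t.desert and not t.populated:
        classes.append("continental dust")
    if t.desert and t.populated:
        classes.append("continental dust/polluted")
    if t.populated and not t.desert and not t.coastal:
        classes.append("continental polluted")
    if t.populated and t.coastal:
        classes.append("polluted marine")
    if t.open_ocean and not t.populated:
        classes.append("remote marine")
    return classes

print(candidate_classes(TrajectorySummary(open_ocean=True)))
print(candidate_classes(TrajectorySummary(desert=True, populated=True)))
# A path over populated land north of 60 N matches two rules and needs a judgment call
print(candidate_classes(TrajectorySummary(land_north_of_60n=True, populated=True)))
```

A path that returns more than one class corresponds to the ambiguous cases described above. Note also that a 3-day path entirely over open ocean is returned as remote marine regardless of any dust transported from farther upwind, which is exactly the limitation discussed for African dust over the Atlantic.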
Median values of optical properties from each trajectory cluster at all sites are presented in Fig. 8. There are some clear spatial patterns that emerge when visualizing the trajectory cluster classifications and the median optical properties in the AAE vs. SAE plot space. The majority of continental polluted trajectory clusters group tightly in the area of the plot that would be classified as BC dominated by the Cappa et al. (2016) matrix. This is similar to earlier findings in this paper where continental polluted sites were aggregated at higher SAE (smaller size) and at AAE values in the range of ∼ 1-1.5. Trajectories classified as polluted marine show a similar range of AAE values as the continental polluted trajectories, though with lower SAE values, indicative of large sea salt mixed with organic carbon. Trajectory clusters classified as continental dust are best defined by AAE values greater than 1.4, though they are poorly defined by SAE values due to the large variance in SAE for those clusters. Continental dust/polluted trajectory clusters are more or less tightly defined by AAE values between 0.9 and 1.4 and SAE values between 0.5 and 1.2, though it is hard to draw significant conclusions about this trajectory type, since only three trajectories meet this classification. Trajectories identified as continental Arctic are not well defined in this plot space. Both AAE and SAE values of this trajectory type are variable, though median SSA values for this trajectory class are more similar and are close to 0.95.
The range of Arctic optical properties most likely stems from the seasonal transport of European and Siberian continental aerosol to the sites in the winter and spring, contrasted with sea salt from open water in the summer. Remote marine trajectories are the least well defined of all the trajectory cluster classes, with highly variable optical properties. Remote marine trajectories show AAE values that range anywhere from 0 to 2.2, with SAE values slightly more defined. There are some clear outliers within trajectory classification groups that may be explained by misclassification of trajectories. For example, the points labeled 1 and 2 in Fig. 8 are back trajectories from CPR, both with 3-day paths that travel only over the Atlantic Ocean. Although the trajectory classification methodology yielded a class of remote marine for those specific trajectories (the air masses only traveled over unpopulated ocean regions for 3 days before reaching the site), previous studies suggest that these air masses could be heavily influenced by African dust events (Denjean et al., 2016; Kalashnikova and Kahn, 2008; Reid et al., 2003). If indeed the dominant aerosol type in these back trajectories was dust, this would fit in much more neatly with previous dust classification schemes (i.e., Lee et al., 2012; Clarke et al., 2007; Yang et al., 2009) and the Cappa et al. (2016) matrix.
By classifying back trajectory clusters from all station locations and including them in the optical property plot space, we get a clearer idea of what types of trajectories, and thus likely aerosol type, are well defined by median optical properties, and which are poorly defined by median optical properties. Continental polluted and marine polluted trajectories have median optical parameters that are well defined and visually cluster in the plot space. Continental dust and continental dust/biomass are somewhat well defined by optical properties in the plot space. Continental Arctic trajectories appear to be well defined by AAE, with all cluster AAE values of around 1, though the trajectories are not well defined by SAE, which shows a larger range. The remote marine trajectory cluster (presumably clean air masses) is poorly defined by optical properties and thus is not easily visualized in the plot space.
To our knowledge, few previous studies have classified remote marine aerosol (only Costabile et al., 2013, classified a coarse marine mode in the suburbs of Rome, Italy), and no previous studies have classified continental Arctic aerosols using an aerosol classification matrix. Our findings show that at these site types, typing schemes that use aerosol optical properties need more detailed analysis that accounts for seasonal variability and local sources. Using aerosol optical parameters to infer aerosol type works well for certain types of aerosol that fit neatly into matrices like that from Cappa et al. (2016), including BC-dominated aerosol and dust mixtures. Marine aerosol, processed aerosol and highly heterogeneous aerosol populations are much more poorly defined by optical properties and do not fit cleanly in existing matrices without overlap with different aerosol types.

Discussion
The application of previous aerosol classification schemes to the aerosol optical property data from stations in the NOAA/ESRL Federated Aerosol Monitoring Network generally yields a dominant aerosol type that would be expected at that site location. The classification schemes do particularly well at inferring aerosol type from optical properties at continental sites that measure BC mixtures but do not do as well at sites with more complex topography (e.g., mountaintop, coastal) that measure a more heterogeneous aerosol population that changes with wind direction. Including median optical parameters from multiple stations on one AAE vs. SAE plot allows for a comparison of dominant aerosol type at many sites, though the use of median optical properties makes the most sense for sites with a homogenous aerosol population. The single AAE vs. SAE plot can provide ambiguous results for sites with a heterogeneous aerosol population.
The two aerosol classification methods (Sect. 6.1 and 6.2) had varying degrees of success. The first method, a multivariate cluster analysis, generated groups of monitoring sites with similar AAE, SAE, SSA and log(σsp) values. The first classification scheme was applied to median optical properties from all station data within each cluster to produce a new aerosol type for stations within that cluster. One advantage to this approach is that the inclusion of log(σsp) in the clustering analysis, and subsequent visualization of station clusters in the AAE vs. SAE vs. log(σsp) 3-D parameter space, provides insight not only into a cluster's aerosol type but also into how aerosol loading (and thus site conditions) differs between clusters. Although the AAE and SAE aerosol typing schemes yield a similar inferred aerosol type of BC-dominated aerosol for both remote Arctic/mountaintop sites and continental sites, the notable difference in log(σsp) values among these dissimilar stations defines the separate clusters. An anticipated advantage to the multivariate cluster analysis was that it would help to reduce ambiguity in the results of aerosol typing schemes, though this was not the case with every cluster. Rather than falling more surely within the optical property thresholds of one aerosol type, the median optical properties of a few clusters still fell on the cusp of two or more aerosol type thresholds. This left the aerosol type of some clusters uncertain, particularly for clusters with coastal and/or remote sites.
The third method (Sect. 6.3), pairing 6 h averaged optical properties with corresponding back trajectories, provided more detailed insight into the aerosol population at an individual station. This method allowed for the typing of multiple aerosols related to different air masses. At stations where aerosol populations are diverse and varying, such as NIM (dust and biomass burning), WLG (dust, pollution, free-troposphere long-range transport aerosol) and PVC (marine aerosol and pollution), the different aerosol types that were previously obscured using the site's median optical properties were more apparent when using the trajectory cluster approach. At stations where aerosol populations are homogeneous (like FKB; regional pollution), no new information on aerosol type was gained. Consolidating all trajectory clusters and corresponding classifications into one plot space (Sect. 6.3.2) allowed us to see a large variety of back trajectory and likely aerosol type and confirmed previous findings from the paper that some trajectory classes (like continental polluted and marine polluted) are well defined by a unique range and combination of optical properties, while other trajectory classes (like remote marine and continental Arctic) have highly variable ranges and combinations of SAE, AAE and SSA and are thus less likely to be typed by aerosol classification schemes using only optical parameters.
The application of varying classification methods gave satisfactory inferences regarding some aerosol types, in great part due to the quality of previously developed aerosol classification schemes. Despite the differences in optical property thresholds presented from each scheme, many of the schemes' thresholds do have large overlap, making it easy to affirm inferred aerosol type with multiple schemes. Many typing schemes provided satisfactory aerosol typing results for fossil fuel burning aerosol, biomass burning aerosol and dust (Cappa et al., 2016;Cazorla et al., 2013;Lee et al., 2012;Yang et al., 2009;Bahadur et al., 2012;Russell et al., 2010), though fewer schemes were available to type large coated particles (Cazorla et al., 2013), sea salt (Costabile et al., 2013) and mixed aerosol (Cappa et al., 2016;Cazorla et al., 2013). Perhaps the most useful typing schemes were that of Cazorla et al. (2013) and Cappa et al. (2016), which provided thresholds for typing mixed aerosol and large coated particles or a large-particle-low-absorption mix. The Cazorla et al. (2013) and Cappa et al. (2016) schemes also delineated the entirety of the AAE vs. SAE plot space, leaving no combination of optical property values without a category.
It should be mentioned that the success of aerosol classification schemes is largely dependent on uncertainties in AAE attribution (Cappa et al., 2016). The scientific community has yet to fully assess AAE as an indicator of aerosol composition. Although AAE = 1 is often taken within the community to indicate black carbon, some studies show that this largely depends on aerosol composition and size, as well as the age of the particle and atmospheric processing that it endures (Lack and Langridge, 2013; Saleh et al., 2014; Costabile et al., 2017; Moosmüller et al., 2011). Furthermore, the accuracy of these aerosol classification methods is only as good as the extent to which the AAE value is an indication of the aerosol composition. As the scientific community advances our understanding of AAE and its relationship to aerosol composition and size, these aerosol classification schemes should be refined.
A major missing piece of the currently available aerosol classification methods is the identification and validation of optical property thresholds to identify sea salt aerosol. To the authors' knowledge, only one study includes marine aerosol identification; Costabile et al. (2013) provide values of SSA > 0.95, SAE < 0.5, dSSA = 0-0.05 and AAE > 2 for coarse marine mode aerosol. Many studies ignore the contribution of sea salt altogether (or do not use data that would have sea salt aerosol contributions), while other studies do not include sea salt aerosol in their typing scheme because sea salt has negligible absorption and thus poorly defined AAE. The best match with sea salt aerosol in the Cappa et al. (2016) matrix presented here is likely the "large particles, low-absorption" classification. Since sea salt aerosols are dominated by large particles, there is a general consensus that marine particles are characterized by low SAE values and high SSA values (Costabile et al., 2013; Smirnov et al., 2002; Dubovik et al., 2002). Of the 24 stations analyzed in this study, sea salt aerosol is expected at CPT, CPR, GRW, PYE and THD and to a lesser extent at ARN, AMY, GSN and PVC. With the exception of ARN, AMY, GSN and PVC, which often measured polluted air masses (see scattering coefficient values for these four stations in Table 3 and back trajectories for PVC), these coastal stations have median values of SAE < 1 and SSA > 0.95. Median values of AAE, however, range from 0.5 to 2.0. Further back trajectory analysis (not shown here) relating air masses of oceanic origin at these sites to aerosol optical properties does not show specific patterns in AAE values for marine aerosols. Although no new marine aerosol typing information is included here, the authors do encourage consideration of SAE and SSA thresholds for sea salt to be included in future aerosol classification analyses. 
Furthermore, the authors acknowledge that although no sea salt aerosol types are designated here explicitly at coastal stations, some of the aerosol types are likely sea salt aerosol mixed (however slightly) with some absorbing component. Cappa et al. (2016) in some ways account for sea salt aerosol by changing the categorization in the lower left of the box to "large-particle-lower-absorption mix" although in the original matrix they also suggest that this regime could be represented by large black particles.
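As an illustration of how SAE and SSA thresholds for sea salt might be applied, the following minimal Python sketch flags station medians that fall in the low-SAE, high-SSA regime described above. The function name, the default thresholds (SAE < 1 and SSA > 0.95, taken from the coastal-station medians reported here) and the example values are assumptions chosen for illustration; they are not a validated typing scheme. AAE is left out on purpose because no consistent marine AAE pattern was found.

```python
def sea_salt_candidate(median_sae, median_ssa, sae_max=1.0, ssa_min=0.95):
    """Flag medians that fall in the low-SAE, high-SSA regime.

    Defaults follow the coastal-station medians quoted in the text
    (SAE < 1, SSA > 0.95). AAE is deliberately not tested because the
    text reports no consistent AAE pattern for marine air masses.
    """
    return median_sae < sae_max and median_ssa > ssa_min


# Hypothetical station medians (SAE, SSA); illustrative values only.
examples = {
    "coastal_like": (0.4, 0.97),
    "polluted_like": (1.6, 0.90),
}

flags = {
    name: sea_salt_candidate(sae, ssa)
    for name, (sae, ssa) in examples.items()
}
print(flags)
```

Passing `sae_max=0.5` would instead apply the stricter SAE limit given by Costabile et al. (2013) for coarse marine mode aerosol.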
Although this study generally affirms existing aerosol typing schemes, the results here are only applicable given certain conditions and for specific aerosol types. One stipulation of this analysis is that results were compared to aerosol typing schemes from studies that used optical property data from in situ surface measurements, aircraft campaigns and AERONET measurements. There are few studies (e.g., Cappa et al., 2016) that evaluate the differences that may exist in aerosol typing schemes/thresholds based on the type of data (in situ vs. remote sensing, column vs. point, dry vs. ambient measurements) used. The difference in RH between dry (most in situ surface) and ambient (AERONET) measurements could have some effect on the determined thresholds. A higher RH would decrease SAE (larger aerosol), might shift SSA thresholds up (whiter aerosol), would increase scattering coefficients, and might change AAE due to coating on absorbing particles. Future analysis comparing dry and ambient aerosol typing schemes, as well as surface-measured and remotely sensed typing schemes, would be useful for determining the validity of the comparisons made in this study.
An additional caveat in the parameter clustering analysis and back trajectory cluster analysis is the presence of externally mixed aerosol with size-dependent composition that renders the analysis ambiguous for a given aerosol class. Future work on this would add much needed information to the subject of aerosol typing from optical properties.
Another limitation to the classification analyses presented here is that aerosol aging during transport can influence aerosol type. A study by Devi et al. (2016) shows that prior to atmospheric aging, mobile sources and biomass burning sources can have relatively high (∼ 1.2-2.0) AAE values; however, after aging during transport (∼ 1-2 days), the brown carbon signal can go away, reducing the AAE value. There may be a point when source information from aerosol intensive optical properties can be lost during transport. In that case, aerosol classification schemes may no longer be applicable.
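To make the AAE values discussed above concrete, the sketch below computes an Angstrom exponent from coefficients at two wavelengths using the standard power-law relation, AAE = -ln(σ1/σ2) / ln(λ1/λ2). The wavelengths (467 and 652 nm) and the absorption coefficients are assumptions chosen for illustration and are not measurements from this study. The two synthetic cases simply show how a weaker wavelength dependence after aging translates into a lower AAE.

```python
import math


def angstrom_exponent(coef_1, coef_2, wl_1, wl_2):
    """Angstrom exponent from coefficients at two wavelengths.

    Assumes a power law, coef ~ wavelength ** (-exponent).
    """
    return -math.log(coef_1 / coef_2) / math.log(wl_1 / wl_2)


def power_law(coef_ref, wl_ref, wl, exponent):
    """Coefficient at wavelength wl implied by a power-law spectrum."""
    return coef_ref * (wl / wl_ref) ** (-exponent)


# Assumed wavelengths in nm; hypothetical absorption at the blue channel.
WL_BLUE, WL_RED = 467.0, 652.0
abs_blue = 10.0

# Synthetic "fresh" and "aged" spectra with exponents 2.0 and 1.0.
fresh_red = power_law(abs_blue, WL_BLUE, WL_RED, 2.0)
aged_red = power_law(abs_blue, WL_BLUE, WL_RED, 1.0)

aae_fresh = angstrom_exponent(abs_blue, fresh_red, WL_BLUE, WL_RED)
aae_aged = angstrom_exponent(abs_blue, aged_red, WL_BLUE, WL_RED)
print(round(aae_fresh, 3), round(aae_aged, 3))
```

The same function applies to SAE when scattering coefficients are supplied instead of absorption coefficients.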
There are still many ways in which this analysis can be expanded. The incorporation of aerosol shape into the typing analysis could be helpful, particularly in determining the differences between particles with similar optical properties. Further stratification of the measurement data by season, time of day, composition or hygroscopicity would elucidate more about the variability in aerosol type with time. And finally, more analyses of stations that have concurrent chemistry measurements and aerosol optical property measurements could help verify existing aerosol classification schemes (e.g., Cappa et al., 2016;Costabile et al., 2017).

Conclusions
Surface in situ aerosol optical properties obtained at 24 stations in the NOAA/ESRL Federated Aerosol Monitoring Network were used to classify aerosol type at the site, using aerosol classification schemes from the literature, cluster analyses, and general knowledge of station location and characteristics. The monitoring sites utilized for the analysis offered a diverse range of station locations and aerosol types, providing a look at fossil fuel burning, biomass burning, sea salt, dust and regionally mixed aerosols observed at various continental sites. Plotting station optical property medians in an AAE vs. SAE plot space, overlaid by the Cappa et al. (2016) classification matrix, for the most part yielded inferences regarding aerosol types that were to be expected based on knowledge of the monitoring station location. A handful of stations, however, yielded unexpected results that appeared uncharacteristic of the site, which indicated a need for a different visualization or analysis method. Furthermore, the interquartile values of the optical properties from each station in an AAE vs. SAE parameter space showed that there is often large variability in optical properties at any given location, suggesting that a single dominant aerosol type is not realistic at all stations.
A multivariate cluster analysis was performed as a means of grouping together monitoring sites with not only similar aerosol type, but similar site conditions (frequency of aerosol type, loadings, proximity to source, location, etc.). The multivariate cluster analysis yielded six clusters of stations with similar median AAE, SAE, SSA and log(σsp) values. Sites that grouped within the same cluster most often had similar expected aerosol types that aligned with the aerosol type predicted by the aerosol typing scheme. Incorporation of the scattering coefficient into the multivariate cluster analysis improved the inference regarding aerosol type and conditions (i.e., aerosol loading, source) from optical property measurements.
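The general idea of grouping stations by their median AAE, SAE, SSA and log(σsp) can be sketched as follows. This is a minimal illustration and not the procedure used in this study: the average-linkage agglomerative algorithm, the z-score standardization, the station names and all numerical values are assumptions.

```python
import math
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so no single property dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def agglomerate(points, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Mean Euclidean distance between all cross-cluster pairs.
        return mean(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > k:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical station medians: (AAE, SAE, SSA, scattering coefficient).
stations = {
    "coastal_1": (1.0, 0.5, 0.97, 10.0),
    "coastal_2": (1.1, 0.6, 0.96, 12.0),
    "polluted_1": (1.2, 1.8, 0.88, 80.0),
    "polluted_2": (1.3, 1.9, 0.87, 100.0),
}
names = list(stations)

# Use the base-10 logarithm of the scattering coefficient as a feature.
features = [
    [aae, sae, ssa, math.log10(sp)] for aae, sae, ssa, sp in stations.values()
]

clusters = agglomerate(standardize(features), k=2)
grouped = [[names[i] for i in members] for members in clusters]
print(grouped)
```

Standardizing first matters because the four properties have very different numerical ranges; without it, the property with the largest spread would dominate the distances.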
In order to further explore the complexity of aerosol populations and allow for multiple aerosol types at some sites, an additional analysis was presented using air mass back trajectories. Air mass back trajectories were clustered based on similar direction, altitude and speed, and these clusters were paired with optical property data and plotted in the AAE vs. SAE parameter space. More detailed results from 4 of the 24 stations (WLG, NIM, PVC and FKB) were discussed in order to show the range of success (or lack thereof) of this approach. At complex sites like WLG, NIM and PVC, multiple dominant aerosol types emerged, unique to different clusters of air mass back trajectories. The classification of numerous aerosol types, along with the information from the back trajectory clusters on how often those aerosol types were measured, allowed for a more complete picture of the heterogeneous aerosol populations at those sites. In the case of FKB, only one aerosol type is inferred in each of the different trajectory clusters, suggesting a homogenous aerosol population that is readily predicted by the simpler analysis of just the median optical properties in the AAE vs. SAE parameter space.
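The pairing of trajectory clusters with optical properties can be sketched as a simple grouping step: each observation carries the label of the trajectory cluster it belongs to, and medians of AAE and SAE are taken per cluster together with the fraction of observations in that cluster. The record layout, cluster labels and values below are hypothetical, and the trajectory clustering itself is assumed to have been done beforehand.

```python
from collections import defaultdict
from statistics import median


def summarize_by_cluster(records):
    """Per-cluster median AAE and SAE plus the share of observations.

    records: iterable of (cluster_label, aae, sae) tuples.
    """
    grouped = defaultdict(list)
    for label, aae, sae in records:
        grouped[label].append((aae, sae))

    total = sum(len(values) for values in grouped.values())
    summary = {}
    for label, values in grouped.items():
        summary[label] = {
            "fraction": len(values) / total,
            "median_aae": median(aae for aae, _ in values),
            "median_sae": median(sae for _, sae in values),
        }
    return summary


# Hypothetical observations tagged with a trajectory cluster label.
records = [
    ("north", 1.1, 1.8),
    ("north", 1.3, 1.6),
    ("north", 1.2, 1.7),
    ("ocean", 0.9, 0.4),
    ("ocean", 1.5, 0.6),
]

summary = summarize_by_cluster(records)
for label, stats in summary.items():
    print(label, stats)
```

Each cluster's (median SAE, median AAE) pair can then be placed in the AAE vs. SAE parameter space, with the fraction indicating how often that air mass type was sampled.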
Combining back trajectory clusters and classifications from all 24 sites showed that comparing optical characteristics with trajectory characteristics yields results that further inform aerosol typing schemes. While all trajectory clusters that were classified as marine polluted or continental polluted had optical properties that were well defined, other trajectory clusters classified as continental Arctic or remote marine had highly variable optical parameters that were not informative in aerosol typing.
This study has further assessed existing aerosol typing schemes and provided additional methods that can be implemented to reduce ambiguity in typing schemes, elucidate aerosol conditions that accompany aerosol type and allow for the identification of multiple aerosol types at one site. A major conclusion from the analysis, however, is that there is no combination of extensive and/or intensive optical properties that allows for a perfect classification of aerosol types. Prior knowledge of the measurement site can help inform aerosol classification schemes, but ambiguity remains in these techniques. Furthermore, this paper highlights the need for further analyses and suggests specific ideas for future work needed to refine aerosol typing schemes that infer aerosol type from optical properties: repeating this analysis with concurrent aerosol chemical and optical measurements to verify aerosol classification thresholds will be essential to expand and improve aerosol classification schemes.
Data availability. Data for AMF sites are available from the DOE/ARM website (http://www.arm.gov). Data from all other sites (except WLG) are available from the World Data Center for Aerosols (http://ebas.nilu.no/). WLG data are available from Junying Sun at CAMS.