Highly non-linear dynamical systems, such as those found in atmospheric
chemistry, necessitate hierarchical approaches to both experiment and
modelling in order to ultimately identify and achieve fundamental
process-understanding in the full open system. Atmospheric simulation
chambers are intermediate in complexity between a classical
laboratory experiment and the full, ambient system. As such, they can
generate large volumes of difficult-to-interpret data. Here we describe and
implement a chemometric dimension reduction methodology for the deconvolution
and interpretation of complex gas- and particle-phase composition spectra.
The methodology comprises principal component analysis (PCA), hierarchical
cluster analysis (HCA) and partial least-squares discriminant analysis
(PLS-DA). These methods are, for the first time, applied to simultaneous gas-
and particle-phase composition data obtained from a comprehensive series of
environmental simulation chamber experiments focused on biogenic volatile
organic compound (BVOC) photooxidation and associated secondary organic
aerosol (SOA) formation. We primarily investigated the biogenic SOA
precursors isoprene,
Biogenic volatile organic compounds (BVOCs) are ubiquitous in the global
troposphere, being emitted primarily from terrestrial plant life
(Kanakidou et al., 2005). It is estimated
that the total annual emission rate of all (non-methane) BVOCs is roughly
10 times that of all anthropogenic volatile organic compounds (VOCs), being around
750 Tg C yr
Within the troposphere terpenes are able to react with OH, O
Aerosol particles are natural components of the Earth's atmosphere and are responsible for well-documented impacts ranging from visibility impairment on the local scale to climate change, with suspended particles being able to perturb the Earth's radiative budget via both direct and indirect mechanisms (Solomon et al., 2007). Furthermore, fine airborne particles have been shown to have numerous detrimental effects on human health, particularly in vulnerable members of the population (Harrison et al., 2010; Heal et al., 2012).
Biogenic SOA (BSOA) has been estimated to account for a significant fraction
of total global SOA. Modelling studies suggest the annual global production
rate of BSOA is of the order of 16.4 Tg yr
The chemistry of the atmospheric system is highly non-linear and can be studied by experiments ranging from highly controlled laboratory studies of a single process to field studies of the whole complex system. A significant proportion of the findings gained regarding SOA over the last decade and more has come from atmospheric simulation chamber experiments, intermediate in complexity between classical single-process experiments and the fully open system (for various different chamber systems and VOC precursors, see for example, Pandis et al., 1991; Odum et al., 1996; Hoffmann et al., 1997; Griffin et al., 1999; Glasius et al., 2000; Cocker et al., 2001; Jaoui and Kamens, 2003; Kleindienst et al., 2004; Presto et al., 2005; Bloss et al., 2005; Rohrer et al., 2005; Ng et al., 2006, 2007; Dommen et al., 2006; Surratt et al., 2006; Grieshop et al., 2007; Chan et al., 2007; Wyche et al., 2009; Hildebrandt et al., 2009; Rickard et al., 2010; Camredon et al., 2010; Chhabra et al., 2011; Hennigan et al., 2011; Jenkin et al., 2012). Chamber experiments produce a large amount of data, the interpretation of which can often be highly complex and time-consuming, even though the set-up of the chamber constrains the complexity to a large degree.
In the current “big data” age, advanced monitoring techniques are
producing increasingly large, complex and detailed data sets. Modern
chamber experiments, monitored by state-of-the-art gas- and particle-phase
instrumentation, often yield so much data that only a fraction is
subsequently used in a given analysis. For example, during a typical
6 h environmental simulation chamber experiment, VOC monitoring
chemical-ionisation reaction time-of-flight mass spectrometry (CIR-TOF-MS)
will produce roughly 1.1
The work reported here focuses on detailed organic gas-phase and particle-phase composition data, recorded during SOA atmospheric simulation chamber experiments, using CIR-TOF-MS and liquid-chromatography ion-trap mass spectrometry (LC-MS/MS), respectively, as well as broad (i.e. generic composition “type”: oxygenated organic aerosol, nitrated, sulfated) aerosol composition data, recorded by compact time-of-flight aerosol mass spectrometry (cTOF-AMS). The goal of this paper is to demonstrate and evaluate the application of an ensemble reductive chemometric methodology for these comprehensive oxidation chamber data sets, to be used as a model framework to map chemical reactivity from mesocosm systems, thus providing a link from model systems to more “real” mixtures of organics. The intermediate complexity offered by simulation chamber experiments makes them an ideal test bed for the methodology. Application of the methodology to resultant particle-phase data also aims to provide a level of particle composition classification in the context of gas-phase oxidation.
Similar approaches using statistical analyses have been recently applied to both detailed and broad ambient aerosol composition data (e.g. Heringa et al., 2012; Paglione et al., 2014), particularly in the context of source apportionment (e.g. Alier et al., 2013). Different methods have been attempted by several groups to deconvolve organic aerosol spectra measured by the aerosol mass spectrometer (AMS) in particular (e.g. Zhang et al., 2005, 2007; Marcolli et al., 2006; Lanz et al., 2007). Zhang et al. (2005) applied a custom principal component analysis (CPCA) method to extract two distinct sources of organic aerosols in an urban environment using linear decomposition of AMS spectra and later applied a multiple-component analysis technique (MCA; an expanded version of the CPCA) to separate more than two factors in data sets from 37 field campaigns in the Northern Hemisphere (Zhang et al., 2007). Marcolli et al. (2006) applied a hierarchical cluster analysis method to an ambient AMS data set, and reported clusters representing biogenic VOC oxidation products, highly oxidised organic aerosols and other small categories. Receptor modelling techniques such as positive matrix factorisation (PMF) employ similar multivariate statistical methods in order to deconvolve a time series of simultaneous measurements into a set of factors and their time-dependent concentrations (Paatero and Tapper, 1994; Paatero, 1997). Depending on their specific chemical and temporal characteristics, these factors may then be related to emission sources, chemical composition and atmospheric processing. For example, Lanz et al. (2007) and Ulbrich et al. (2009) applied PMF to the organic fraction of AMS data sets and were able to conduct source apportionment analysis identifying factors contributing to the composition of organic aerosol at urban locations. Slowik et al. (2010) combined both particle-phase AMS and gas-phase proton transfer reaction mass spectrometry (PTR-MS) data for the PMF analysis of urban air, and were able to successfully obtain “regional transport, local traffic, charbroiling and oxidative-process” factors. By combining the two data sets, Slowik and colleagues were able to acquire more in-depth information regarding the urban atmosphere than could be derived from the analysis of each of the sets of measurements on their own.
Because receptor models require no a priori knowledge of meteorological conditions or emission inventories, they are ideal for use in locations where emission inventories are poorly characterised or highly complicated (e.g. urban areas), or where atmospheric processing plays a major role. However, because all of the values in the profiles and contributions are constrained to be positive, the PMF model can have an arbitrary number of factors and the user must select the “best” solution that explains the data. This subjective step of PMF analysis relies greatly on the judgment and skill of the user.
The central methodology employed in this work is based around the application of principal component analysis (PCA), hierarchical cluster analysis (HCA) and partial least-squares discriminant analysis (PLS-DA) of single-precursor oxidant chemistry in environmental simulation chambers. Colloquially, we can describe these three approaches as providing dimensions along which the data are separable (PCA), tests of relatedness (HCA) and checks for false positives (PLS-DA).
Such dimension reduction techniques can be very powerful when used in chemometrics, enabling large and often complex data sets to be rendered down to a relatively small set of pattern-vectors to provide an optimal description of the variance of the data (Jackson, 1980; Sousa et al., 2013; Kuppusami et al., 2015). Unlike other statistical techniques such as PMF, the ensemble methodology presented here does not require the use of additional external databases (comprising information regarding different environments/reference spectra), is simpler to use and less labour intensive, and places less importance on user skill in the production of accurate and meaningful results. Moreover, the primary focus of techniques such as PMF is on source identification/separation, whereas here the focus is placed on compositional isolation.
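To make the idea of rendering spectra down to a few pattern-vectors concrete, the following is a minimal sketch of PCA restricted to the first principal component. It uses power iteration on the covariance matrix, chosen here only to keep the example free of external libraries; it is not the software used in this work. The four three-channel spectra are hypothetical.

```python
import math


def mean_centre(rows):
    """Subtract the column mean from every row; return (centred rows, means)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows], means


def first_component(rows, iterations=200):
    """Estimate the first principal component by power iteration."""
    centred, _ = mean_centre(rows)
    n, p = len(centred), len(centred[0])
    # Sample covariance matrix of the mass channels.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Start from a non-uniform vector: normalised spectra sum to a constant,
    # so a uniform start vector would be mapped to zero.
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector has no overlap with the data variance")
        v = [x / norm for x in w]
    # Scores: coordinates of each spectrum along the component.
    scores = [sum(r[j] * v[j] for j in range(p)) for r in centred]
    return v, scores


# Hypothetical normalised spectra: two experiments of one precursor type,
# followed by two of another.
spectra = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
]
loadings, scores = first_component(spectra)
```

In this toy case the two pairs of experiments fall on opposite sides of zero along the first component, which is the kind of separation the scores plots are used to visualise. The overall sign of a principal component is arbitrary.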
Table 1. Summary of experiments conducted.
The analysis conducted in this work shows that “model” biogenic oxidative
systems can be clearly separated and classified according to their gaseous
oxidation products, i.e. isoprene from
The methodology described and the results presented (supported by findings obtained from zero-dimensional box modelling) indicate that the approach could ultimately provide the foundations for a framework onto which it would be possible to map the chemistry and oxidation characteristics of ambient air measurements. This could in turn allow “pattern” typing and source origination for certain complex air matrices and provide a snapshot of the reactive chemistry at work, lending insight into the type of chemistry driving the compositional change of the contemporary atmosphere. There are similarities between this approach to discovery science in the atmosphere and metabolomic strategies in biology (e.g. Sousa et al., 2013; Kuppusami et al., 2015).
Table 2. Key technical features of MAC, EUPHORE and PSISC (Alfarra et al., 2012; Becker, 1996; Bloss et al., 2005; Camredon et al., 2010; Paulsen et al., 2005; Zador et al., 2006).
Six different BVOCs and one anthropogenic VOC were chosen for analysis. The
target compounds, their structures and reaction rate constants with respect
to OH and O
The VOC precursors employed have certain similarities in terms of reaction rate constants with respect to OH and O
Experiments were carried out across three different European environmental simulation chamber facilities over a number of separate campaigns. The chambers used included (1) The University of Manchester Aerosol Chamber (MAC), UK (Alfarra et al., 2012); (2) The European Photoreactor (EUPHORE), ES (Becker, 1996); and (3) The Paul Scherrer Institut Smog Chamber (PSISC), CH (Paulsen et al., 2005). A brief technical description of each facility is given in Table 2.
Table 1 provides a summary of the experiments conducted, which can be
divided into three separate categories: (1) photooxidation, indoor chamber
(Wyche et al., 2009; Alfarra et al., 2012, 2013); (2)
photooxidation, outdoor chamber (Bloss et al., 2005; Camredon et al.,
2010); and (3) mesocosm photooxidation, indoor chamber
(Wyche et al., 2014). In each case the reaction
chamber matrix comprised a temperature (
CIR-TOF-MS was used to make real-time (i.e. 1 min) measurements of the
complex distribution of VOCs (
Aerosol samples were collected on 47 mm quartz fibre filters at the end of
certain experiments and the water-soluble organic content was extracted for
analysis using LC-MS/MS. Reversed-phase LC separation was achieved using an
HP 1100 LC system equipped with an Eclipse ODS-C18 column with 5
For several experiments, real-time broad chemical characterisation of the SOA was
made using a cTOF-AMS (Aerodyne Research Inc., USA). The cTOF-AMS was
operated in standard configuration, taking both mass spectrum (MS) and
particle time-of-flight (PTOF) data; it was calibrated for ionisation
efficiency using 350 nm monodisperse ammonium nitrate particles, the
vaporiser was set to
Filter and cTOF-AMS data were collected only during photooxidation
experiments conducted at the MAC. Repeat experiments conducted at the MAC
were carried out under similar starting conditions (e.g. VOC
Each chamber was additionally instrumented with online
chemiluminescence/photolytic NO
In order to aid analysis, the composition and evolution of the gas-phase
components of the
All CIR-TOF-MS data were recorded at a time resolution of 1 min. In order to remove the time dimension and simultaneously improve the detection limit, the individual mass spectra were integrated over the entire experiment; as such, no account is taken of overall reaction time in the CIR-TOF-MS analysis. Removing the time dimension reduces the dimensionality of the data whilst maintaining the central characteristic spectral fingerprints produced by the photooxidation process. On average across all experiments studied, 98 % of the precursor had been consumed by the conclusion of the experiment; hence, it is assumed that sufficient reaction took place in each instance to provide summed, normalised mass spectra that fully capture first- and higher-generation product formation.
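The integration step can be sketched as follows. This is a minimal illustration only: the per-minute spectra are hypothetical, and the normalisation target `total` is a placeholder parameter rather than the value used in this work.

```python
def integrate_spectra(time_series):
    """Sum per-minute spectra (dicts of m/z -> signal) into one spectrum."""
    summed = {}
    for spectrum in time_series:
        for mz, signal in spectrum.items():
            summed[mz] = summed.get(mz, 0.0) + signal
    return summed


def normalise(spectrum, total=1.0):
    """Scale a spectrum so that its channels sum to `total`.

    `total` is a hypothetical placeholder, not the value used in the paper.
    """
    s = sum(spectrum.values())
    if s == 0:
        raise ValueError("cannot normalise a spectrum with zero total signal")
    return {mz: total * value / s for mz, value in spectrum.items()}


# Hypothetical three-minute experiment: the precursor channel (m/z 69) decays
# while two product channels grow in.
minutes = [
    {69: 10.0, 71: 0.0, 87: 0.0},
    {69: 6.0, 71: 3.0, 87: 1.0},
    {69: 2.0, 71: 5.0, 87: 3.0},
]
fingerprint = normalise(integrate_spectra(minutes))
```

The resulting `fingerprint` carries no time information: two experiments that reach the same product distribution at different rates would give the same summed spectrum, which is the trade-off described above.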
The resultant summed spectra were normalised to 10
The LC-MS/MS signal intensity data for the region 51 <
Before any multivariate analysis was conducted, the processed CIR-TOF-MS,
LC-MS/MS and cTOF-AMS spectra were first filtered to remove unwanted data that
were deemed not to be statistically significant. In order to do this, the
mass spectra were initially grouped by structure of the precursor employed,
giving seven separate groups for the CIR-TOF-MS data and three groups (owing
to the smaller number of precursor species investigated) for the LC-MS/MS
and cTOF-AMS data, respectively. A two-sided Mann–Whitney test was then used to
assess whether signals reported in individual mass channels were
significantly different from the corresponding signals measured during a
blank experiment. SPSS V20 (IBM, USA) was used for the analysis. A
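As an illustration of this filtering step, the sketch below applies a two-sided Mann–Whitney test channel by channel. It is not the SPSS procedure used in this work: the p value comes from the normal approximation without tie or continuity corrections, which is only rough for small samples, and the threshold `ALPHA` and all signal values are hypothetical.

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_two_sided(sample, blank):
    """Return (U statistic for `sample`, approximate two-sided p value)."""
    n1, n2 = len(sample), len(blank)
    ranks = rank_with_ties(list(sample) + list(blank))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mean_u) / sd_u
    # Two-sided tail probability of a standard normal variate.
    return u1, math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical signals per mass channel: chamber experiments vs. blank.
chamber = {71: [5.0, 6.0, 7.0, 8.0, 9.0], 87: [1.0, 2.0, 3.0, 4.0, 5.0]}
blank = {71: [1.0, 2.0, 3.0, 4.0, 4.5], 87: [1.0, 2.0, 3.0, 4.0, 5.0]}
ALPHA = 0.05  # hypothetical significance threshold

kept = [mz for mz in chamber
        if mann_whitney_two_sided(chamber[mz], blank[mz])[1] < ALPHA]
```

Here only the channel whose signals clearly exceed the blank is retained; the channel that is indistinguishable from the blank is dropped before any multivariate analysis.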
To begin with, to reduce the data and identify similarities between the precursor oxidation systems, a PCA was conducted on the BVOC data set. The resulting model was then used to map the reactivity of fig and birch tree mesocosm systems and to investigate the fit of a typical anthropogenic system (toluene) within the PCA space (both introduced into the model as test data sets). An unsupervised pattern recognition technique, hierarchical cluster analysis, was also applied to the data, and a dendrogram was produced to test relatedness, support the PCA and help interpret the precursor class separations achieved. The dendrogram was constructed using PCA scores, the centroid method and Mahalanobis distance coefficients. Finally, a supervised pattern recognition technique, PLS-DA, was employed as a check for false positives and as a quantitative classification tool to test how effectively the model classifies the various systems.
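The dendrogram construction can be sketched as below, assuming the PCA scores are already available. The sketch relies on the fact that the PCA scores of a data set are mutually uncorrelated, so the Mahalanobis distance between two score vectors reduces to the Euclidean distance after each score column is divided by its standard deviation. The score values and the function names are hypothetical, and a statistics package may define the distance and linkage details differently.

```python
import math


def standardise(scores):
    """Scale each score column to unit variance (columns assumed uncorrelated)."""
    n, p = len(scores), len(scores[0])
    out = [list(row) for row in scores]
    for j in range(p):
        mean = sum(r[j] for r in scores) / n
        sd = math.sqrt(sum((r[j] - mean) ** 2 for r in scores) / (n - 1))
        for i in range(n):
            out[i][j] = (scores[i][j] - mean) / sd
    return out


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def centroid_linkage(points):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance)."""
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = euclidean(clusters[a][1], clusters[b][1])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        (members_a, centre_a), (members_b, centre_b) = clusters[a], clusters[b]
        na, nb = len(members_a), len(members_b)
        # The merged cluster is represented by its size-weighted centroid.
        centre = [(na * x + nb * y) / (na + nb) for x, y in zip(centre_a, centre_b)]
        merges.append((sorted(members_a), sorted(members_b), d))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append((members_a + members_b, centre))
    return merges


# Hypothetical (PC1, PC2) scores for six experiments: three precursor types,
# two repeats each.
pca_scores = [[-3.0, 0.2], [-2.8, -0.1], [0.5, 2.0],
              [0.7, 2.2], [2.4, -2.1], [2.2, -2.2]]
merges = centroid_linkage(standardise(pca_scores))
```

The merge history is what a dendrogram draws: repeat experiments join first at small distances, and the precursor groups only join at much larger distances.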
For the superposition of “classification” confidence levels onto the
results of the PCA and HCA and for classification discrimination in the
PLS-DA, prior to analysis the experiments were grouped according to the
structure of the precursor investigated. Group 1
The temporal evolution of various key gas-phase (a) and particle-phase (b) parameters measured during a typical photooxidation experiment is shown in Fig. 1 in order to provide background context. In this instance the precursor was myrcene and the facility employed was the MAC. Full details describing the underlying chemical and physical mechanisms at play within such experiments can be found elsewhere (e.g. Larsen et al., 2001; Bloss et al., 2005; Paulsen et al., 2005; Surratt et al., 2006, 2010; Wyche et al., 2009, 2014; Camredon et al., 2010; Rickard et al., 2010; Eddingsaas et al., 2012a, b; Hamilton et al., 2011; Jenkin et al., 2012; Alfarra et al., 2012, 2013, and references therein).
Of the 191 different mass channels extracted from the CIR-TOF-MS data for
analysis (i.e. 65 <
PCA loadings bi-plot of the second vs. first principal
components derived from the PCA analysis of the isoprene, cyclic monoterpene
(c-m-terpene in the legend;
Figure 2 shows a loadings bi-plot of PC2 vs. PC1. It is clear from Fig. 2
that the model is able to successfully separate the four different classes
of biogenic systems investigated.
The
Moving past the precursors into the detailed chemical information provided by
the oxidation products formed within the chamber, we can see from the data
and Fig. 2 that amongst others,
Having employed the terpene data as a training set to construct a PCA model, a test set of mesocosm data was introduced in order to investigate the ability of the model to map the classification of more complex biogenic mixtures. In this instance the mesocosm test set comprised two birch tree and two fig tree photooxidation experiments, containing a more complex and “realistic” mixture of various different VOCs (Wyche et al., 2014). The resultant scores plot is shown in Fig. 3.
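Introducing a test set means projecting new spectra onto the existing model rather than refitting it. A minimal sketch follows, assuming the training-set mean spectrum and the component loadings have been stored; all numerical values here are hypothetical.

```python
def project(spectrum, training_mean, loadings):
    """Project one spectrum onto stored principal-component loadings.

    The spectrum is centred with the mean of the training set, not its own
    mean, so that it lands in the same coordinate system as the training
    scores.
    """
    centred = [x - m for x, m in zip(spectrum, training_mean)]
    return [sum(c * w for c, w in zip(centred, component))
            for component in loadings]


# Hypothetical model built from training data: mean spectrum over three mass
# channels and two orthonormal loading vectors (PC1, PC2).
training_mean = [0.4, 0.3, 0.3]
loadings = [[0.8, -0.6, 0.0],
            [0.0, 0.0, 1.0]]

# Hypothetical test spectrum, for example from a mesocosm experiment.
test_spectrum = [0.5, 0.2, 0.3]
test_scores = project(test_spectrum, training_mean, loadings)
```

Because the model is held fixed, the position of a test point relative to the training clusters reflects only how its spectrum compares with the training spectra.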
Figure 3 demonstrates that the model can successfully distinguish between the
two different types of mesocosm systems. Moreover, the model correctly
classifies the mesocosm systems within the PCA space, with the birch trees,
which primarily emit monoterpenes and only small quantities of isoprene
(Wyche et al., 2014), grouped with the single-precursor monoterpene cluster
and the fig trees, which primarily emit isoprene and camphor and only a small
amount of monoterpenes (Wyche et al., 2014), grouped between the monoterpene
and isoprene clusters. Investigation of the mesocosm mass spectra and PCA
loadings shows that mass channels 137, 139, 107, 95, 93, 81 and 71 are
amongst features important in classifying the birch tree systems, with the
relatively strong presence of
PCA scores plot of the second vs. first principal components
derived from the PCA analysis of the mesocosm test set using the PCA model
derived from the isoprene, cyclic monoterpene (
PCA scores plot of the second vs. first principal components
derived from the PCA analysis of the toluene test set using the PCA model
derived from the isoprene, cyclic monoterpene (
Table 3. List of certain major product ions integral to the separation of BVOC photooxidation spectra in statistical space, their corresponding tentative assignments and their precursor. See main text (Sects. 4.2 and 4.5) for further information.
Dendrogram showing the grouping relationship between the
various gas-phase matrices of systems examined. Red – isoprene, pink – fig, green – cyclic monoterpenes
(
As a further test of the ability of the technique to distinguish between and to classify
VOCs and the matrix of oxidised organic compounds that may derive from their
atmospheric chemistry, test data from an anthropogenic system were introduced
into the model. In this instance, the toluene photooxidation system was
employed. Toluene is an important pollutant in urban environments,
originating for example from vehicle exhausts and fuel evaporation; furthermore, it
represents a model mono-aromatic, SOA precursor system (e.g. Bloss
et al., 2005). As can be seen from the resultant scores plot in Fig. 4,
the model is also able to discriminate the anthropogenic system from those
of biogenic origin. Besides the protonated toluene parent ion, those ions
contributing to the positioning of the toluene cluster within the PCA space,
include the protonated parent ions
The relationships between the various terpene and mesocosm systems and their
groupings with respect to one another can be explored further via the
implementation of HCA; Fig. 5 gives the dendrogram produced. Inspection of
Fig. 5 provides further evidence that the various systems in the four classes
of terpenes investigated distinctly group together, with overall
relatedness < 1 on the (centroid) distance between clusters scale
using the Mahalanobis distance measure (Mahalanobis, 1936). Figure 5 shows
that the sesquiterpene oxidation system has the most distinct spectral
fingerprint (containing distinctive, higher mass oxidation products, e.g.
In order to advance our chemometric mapping of biogenic systems beyond PCA
and HCA (which do not consider user supplied a priori observation “class”
information) and to provide a degree of quantification to our analysis, a
PLS-DA using six latent variables (LVs) was conducted on the terpene and
mesocosm data. For the PLS-DA, the experiments were grouped into their
respective “classes”, i.e. hemiterpene – isoprene; cyclic monoterpene –
Table 4. PLS-DA model classification sensitivity and specificity for the gas-phase biogenic air matrices.
Scores plot of the first three latent variables derived from
the PLS-DA model analysis of the isoprene, cyclic monoterpene (
As can be seen from inspection of Table 4, model classification sensitivity and specificity were high in each instance. Each of the biogenic systems studied was predicted with 100 % sensitivity (with the exception of birch mesocosm), meaning that every experiment in each set (again, except birch mesocosm) was assigned to its correct class. The relatively low sensitivity obtained for birch mesocosm (50 %) is most likely a result of the use of only two repeat experiments in the model, coupled with experiment limitations and ageing trees producing slightly lower emissions during the final birch mesocosm experiment. All of the systems were predicted with > 90 % specificity (four of the six with 100 % specificity), indicating that experiments from other classes are highly unlikely to be incorrectly assigned to a given class.
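For reference, per-class sensitivity and specificity can be computed from true and predicted class labels as in the sketch below. The labels are hypothetical and do not reproduce Table 4; they are chosen only to show how a class with two repeat experiments drops to 50 % sensitivity when one of them is misassigned.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """Return (sensitivity, specificity) for one class.

    Sensitivity: fraction of `target` experiments assigned to `target`.
    Specificity: fraction of other experiments not assigned to `target`.
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == target:
            if predicted == target:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == target:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classification outcome for six experiments.
true_labels = ["isoprene", "isoprene", "birch", "birch", "fig", "fig"]
predicted = ["isoprene", "isoprene", "birch", "fig", "fig", "fig"]

birch = sensitivity_specificity(true_labels, predicted, "birch")
fig = sensitivity_specificity(true_labels, predicted, "fig")
```

With so few repeats per class, a single misassignment changes the sensitivity by a large step, which is why small class sizes limit how finely these metrics can be resolved.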
In order to explore similar classifications and linkages in the concomitant particle-phase, the PCA, HCA and PLS-DA techniques were also applied to the off-line LC-MS/MS spectra obtained from analysis of filter samples and on-line cTOF-AMS spectra.
As can be seen from inspection of Fig. 7, the detailed LC-MS/MS aerosol
spectra produce PCA results somewhat similar to those of the gas-phase
CIR-TOF-MS spectra, with distinct clusters of cyclic monoterpenes, straight
chain monoterpenes and sesquiterpenes. From inspection of the loadings
components of the bi-plot (Fig. 7a), we can see that
Similarly, those ions (compounds) significant in isolating the cyclic
monoterpenes include,
As with the PCA, the dendrogram produced via cluster analysis of the LC-MS/MS
particle-phase data gave three distinct clusters (Fig. 7b), i.e. cyclic
monoterpene, straight chain monoterpene and sesquiterpene. The corresponding
PLS-DA analysis reported 100 % sensitivity in each case and 100 %
specificity for all systems except sesquiterpenes (i.e.
Despite utilising the somewhat destructive electron impact (EI) ionisation technique, the cTOF-AMS produces spectra of sufficient chemical detail that the PCA and HCA are able to successfully differentiate between the groups of terpenes tested (Fig. 8a and b). However, unlike the outputs from the CIR-TOF-MS and LC-MS/MS PCAs, the cyclic and straight chain monoterpenes in the cTOF-AMS PCA do not group into two distinct classes; instead, they tend to group in their species-specific sub-classes within the upper region of the PCA space. Indeed, the PLS-DA gave 100 % sensitivity and specificity for the cyclic monoterpenes and sesquiterpenes, but only 75 % sensitivity for the straight chain monoterpenes, suggesting that the model does less well at assigning myrcene and linalool cTOF-AMS spectra to their defined class.
As can be seen from inspection of Fig. 8a,
From further inspection of the loadings bi-plot (Fig. 8a), we see that the
four sesquiterpene (
Figure 9 provides a highly simplified overview of the current state of knowledge regarding the atmospheric oxidation of hemi-, sesqui-, cyclic and straight chain monoterpenes, showing selected key steps and intermediates en route to SOA formation. The mechanisms outlined in Fig. 9 underpin the findings reported here and help to explain how the atmospheric chemistry of the various terpene oxidation systems and their SOA can be chemometrically mapped with respect to one another.
From a review of recent literature and from the summary presented in Fig. 9, it can be seen that isoprene can react to form condensable second and
higher-generation nitrates in the presence of NO
Depending on the chemistry involved (Fig. 9), potential SOA forming
monoterpene products will either be (six-member) ring-retaining (e.g. from
reaction with OH) or (six-member) ring cleaved (e.g. from reaction with OH or
O
Simplified schematic illustrating some of the important
mechanistic pathways in the gas-phase oxidation of isoprene,
OH will react with straight chain monoterpenes, such as myrcene, primarily
by addition to either the isolated or the conjugated double bond system.
Reaction at the isolated C
By comparing both the gas- and particle-phase PCA results for cyclic
monoterpenes in Figs. 2 and 7a, it is evident that the dominant loadings
represent compounds of similar MW, i.e.
Within the cyclic monoterpene group there is a small degree of separation
between the limonene and
In order to explore how the PCA technique can be used to investigate product
distributions driven by certain starting conditions, a separate analysis was
conducted on the five toluene experiments. In this instance we investigate
the product distribution dependency on initial VOC
From inspection of the PCA loadings bi-plot in Fig. 10, it is clear that
the toluene photooxidation spectra distribute in statistical space according
to their respective initial VOC
The moderate NO
PCA loadings bi-plot of the second vs. first principal
components derived from the PCA analysis of the toluene experiments.
Experiments were conducted under low NO
The high NO
Having successfully used the mechanistic fingerprints in the chamber data to construct descriptive statistical models of the gas and particle phases, and having applied the methodology to map mesocosm environments, a next logical step would be to use this detailed chemical framework to investigate ambient VOC and SOA composition data, in an attempt to help elucidate and deconvolve the important chemistry controlling the gas- and particle-phase composition of inherently more complex real-world environments.
If ambient biogenic gas/particle composition spectra of unknown origin, uncertain speciated composition and/or a high level of detail and complexity were to be mapped onto the relevant statistical model (i.e. introduced as a separate test set), their resultant vector description in the statistical space could provide information regarding the type of precursors present and the underlying chemical mechanisms at play, as exemplified by the classifying of the mesocosm experiments by the fraction of isoprene, monoterpene and sesquiterpene chemistry in the experimental fingerprints. Furthermore, as shown by the mapping of toluene photooxidation experiments into a separate and distinct cluster, the methodology is potentially robust to the other chemical compositions expected for a real-world environment that is significantly impacted by both anthropogenic and biogenic emissions (e.g. Houston, USA and the Black Forest – Munich, DE). This capability is important when attempting to understand the complex interactions that exist between urban and rural atmospheres and when attempting to identify VOC and SOA sources.
One potential problem in moving from simulation chamber data to real-world systems would be the validity of using “static” experimental spectra (i.e. time averaged) to build a model that must accept “dynamic” data, in which there would be potentially overlapping reaction coordinates and multiple precursor and radical sources.
In order to investigate the impact of a more dynamic system on the
composition of the gas-phase matrix and hence on the composition of the
spectra employed to build the model, a zero-dimensional chamber box model
was constructed for the
The three scenarios were (1) a basic chamber simulation, (2) a spiked chamber simulation and (3) a constant injection chamber simulation.
It should be noted here that the model runs are not idealised. The aim of
these simulations is only to provide systematically more complex chemical systems
with which to compare and contrast a simulation representing the measured
data set. For work regarding the evaluation of the MCM with respect to single
VOC precursor chamber experiments (including model-measurement
intercomparison), see, for example, Bloss et al. (2005) (toluene), Metzger et
al. (2008) and Rickard et al. (2010) (1,3,5-TMB), Camredon et al. (2010)
(
The results of the three different model scenarios are given in Fig. 11, mapped through to the resultant simulated mass spectra (i.e. integrated across the experiment).
Figure 11a and b show the results from scenario (1). Figure 11a gives the evolution of the system over the molecular weight region of interest with time and Fig. 11b gives the scenario summed “model mass spectra”, i.e. the relative abundance of all simulated compounds within the gas-phase molecular weight region of interest (with relative contributions from isobaric species summed into a single “peak”). Scenario (1) and Fig. 11a and b approximate the experimental data employed within this work and constitute the model base case.
Results from MCM
Figure 11c and d show the results from scenario (2). Figure 11c clearly
shows the second
The results from model scenario (3) are given in Fig. 11e and f. As with
scenario (2), there is no dramatic difference between the simulated mass
spectra of scenario (3) and the base-case scenario (1). In this instance
Scenarios (2) and (3) represent complex mixtures with overlapping reaction coordinates, each one step closer to a real-world case than scenario (1) and the chamber data employed within this work. However, despite the increase in complexity of the scenarios, both exhibit very little compositional difference to the base-case scenario and hence the chamber data employed in this work. These results give some confidence that, despite being constructed from summed simulation chamber data, the statistical models employed here represent a solid framework onto which real atmosphere spectra could be mapped and interpreted.
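The way such scenarios differ in set-up can be illustrated with a toy zero-dimensional model. The sketch below is not the box model used in this work and does not use the MCM chemistry referred to in the text: it integrates a generic first-order chain, precursor A to first-generation product B to later-generation product C, with forward Euler steps. The rate constants, spike timing and injection rate are all hypothetical, and reading "spiked" as a second precursor addition part-way through the run is our assumption.

```python
def run_box(k1, k2, a0, spike=None, injection_rate=0.0, t_end=360.0, dt=0.1):
    """Integrate A -> B -> C and return time-integrated fractions [A, B, C].

    k1, k2: first-order rate constants (per minute, hypothetical).
    spike: optional (time in minutes, amount) for one extra addition of A.
    injection_rate: continuous source of A (amount per minute).
    """
    a, b, c = a0, 0.0, 0.0
    integral = [0.0, 0.0, 0.0]
    steps = int(round(t_end / dt))
    spike_step = None if spike is None else int(round(spike[0] / dt))
    for step in range(steps):
        if spike_step is not None and step == spike_step:
            a += spike[1]
        da = (-k1 * a + injection_rate) * dt
        db = (k1 * a - k2 * b) * dt
        dc = k2 * b * dt
        a, b, c = a + da, b + db, c + dc
        # Accumulate the time-integrated abundance of each species,
        # mirroring the summation of spectra across an experiment.
        integral[0] += a * dt
        integral[1] += b * dt
        integral[2] += c * dt
    total = sum(integral)
    return [x / total for x in integral]


# Three set-ups loosely analogous to scenarios (1) to (3).
basic = run_box(0.02, 0.01, a0=1.0)
spiked = run_box(0.02, 0.01, a0=1.0, spike=(120.0, 1.0))
constant = run_box(0.02, 0.01, a0=0.0, injection_rate=1.0 / 360.0)
```

Comparing the three normalised, time-integrated compositions is the toy analogue of comparing the simulated mass spectra in Fig. 11. How similar they turn out depends on the chosen rate constants and timings, so this sketch should not be read as reproducing the result reported above.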
A further step in increasing complexity, and hence a further step towards the real-world system, would be the addition of other (potentially unidentified) precursors to the simulation, which may be at different stages of oxidation or have passed through different reactive environments. Further increases in complexity, beyond the analysis discussed here, will form the focus of future work.
A chemometric dimension reduction methodology, comprising PCA, HCA and PLS-DA, has been successfully applied for the first time to complex gas- and particle-phase composition spectra of a wide range of BVOC and mesocosm environmental simulation chamber photooxidation experiments. The results show that the oxidised gas-phase atmosphere (i.e. the integrated reaction coordinate) of each different structural type of BVOC can be classified into a distinct group according to the controlling chemistry and the products formed. Indeed, a potential major strength of the data analysis methodology described here could lie in the decoding of mechanisms into pathways (i.e. separation within a group on account of different underlying chemistry) and consequently linking chemical pathways to precursor compounds. Furthermore, the methodology was similarly able to differentiate between the types of SOA particles formed by each different class of terpene, both in the detailed and broad chemical composition spectra. In concert, these results show that differences in SOA formation chemistry, starting in the gas phase, go on to govern the differences between the various terpene particle compositions.
The ability of the methodology employed here to efficiently and effectively “data mine” large and complex data sets becomes particularly pertinent when considering that modern instrumentation/techniques produce large quantities of high-resolution temporal and speciated data over potentially long observation periods. Such statistical mapping of organic reactivity offers the ability to simplify complex chemical data sets and provide rapid and meaningful insight into detailed reaction systems comprising hundreds of reactive species. Moreover, the demonstrated methodology has the potential to assist in the evaluation of (chamber and real-world) modelling results, providing easy to use, comprehensive observational metrics with which to test and evaluate model mechanisms and outputs, and thus help advance our understanding of complex organic oxidation chemistry and SOA formation.
The authors gratefully acknowledge the UK Natural Environment Research Council (NERC) for funding the APPRAISE ACES consortium (NE/E011217/1) and the TRAPOZ project (NE/E016081/1); the EU-FP7 EUROCHAMP-2 programme for funding the TOXIC project (E2-2009-06-24-0001); the EU ACCENT Access to Infrastructures program for funding work at the PSI and the EU PEGASOS project (FP7-ENV-2010-265148) for funding used to support this work. A. R. Rickard and M. R. Alfarra were supported by the NERC National Centre for Atmospheric Sciences (NCAS). The authors would like to thank the University of Leicester Atmospheric Chemistry group for assistance throughout all experiments, including Alex Parker, Chris Whyte, Iain White and Timo Carr; co-workers at the University of Manchester for assistance with MAC experiments; co-workers from Fundacion CEAM, Marie Camredon and Salim Alam for assistance with EUPHORE experiments and co-workers from the Laboratory of Atmospheric Chemistry smog chamber facility at the Paul Scherrer Institute (PSI) for assistance with PSISC experiments. The authors are grateful to M. Wiseman from the University of Brighton, for useful discussions regarding various statistical techniques.
Edited by: V. F. McNeill