Seawater Analysis by Ambient Mass Spectrometry-Based Seaomics and Implications on Secondary Organic Aerosol Formation

A transmission mode-direct analysis in real time-quadrupole time of flight-mass spectrometry (TM-DART-QTOF-MS)-based analytical method coupled to multivariate statistical analysis was developed to interrogate lipophilic compounds in seawater samples without the need for desalinization. An untargeted metabolomics approach, addressed here as seaomics, was successfully implemented to discriminate sea surface microlayer (SML) from underlying water (ULW) samples (n=22, 10 paired samples) collected during a field campaign at the Cape Verde islands in September-October 2017. A panel of 11 ionic species detected in all samples allowed sample class discrimination by means of supervised multivariate statistical models. Tentative identification of these species suggests that saturated fatty acids, peptides, fatty alcohols, halogenated compounds, and oxygenated boron-containing organic compounds may be involved in water-air transfer processes and in photochemical reactions at the water-air interface of the ocean. A subset of SML samples (n=5) was subjected to on-site experiments during the campaign using a lab-to-the-field approach to test their secondary organic aerosol (SOA) formation potency. Results from these experiments and the analytical seaomics strategy provide a proof of concept that organic compounds play a key role in aerosol formation processes at the water/air interface.


Introduction
Oceans act as sinks and sources for gases and aerosol particles. The ocean surface chemical composition influences the physicochemical processes occurring at the air-water interface by connecting the ocean biogeochemistry with the atmospheric chemistry in the marine boundary layer (MBL; Donaldson and George, 2012). Therefore, understanding how the organic compounds of marine origin are influencing the formation of aerosols in the MBL with potential impacts on the radiative fluxes, aerosol hygroscopicity, and subsequent cloud condensation nuclei properties is important. It has been suggested that complex photoactive compounds are enhanced at the air-sea interface (Reeser et al., 2009a, b), thus inducing the abiotic production of volatile organic compounds. For instance, experimental photosensitized reactions at the air-water interface by using humic acids as a proxy of dissolved organic matter (DOM) have led to the chemical conversion of linear saturated fatty acids into unsaturated functionalized gas-phase products. Atmospheric photochemistry can even take place in the absence of photosensitizers if the air-water interface is coated with a fatty acid (Rossignol et al., 2016). On a global scale, interfacial photochemistry has recently been suggested to serve as an abiotic source of volatile organic compounds, which is comparable to marine biological emissions (Brüggemann et al., 2018).
Published by Copernicus Publications on behalf of the European Geosciences Union. N. Zabalegui et al.: Seawater analysis by ambient mass-spectrometry-based seaomics.
The sea surface microlayer (SML) covers up to 70 % of the Earth's surface and is enriched with DOM, including organic compounds such as fatty acids, fatty alcohols, sterols, amines, amino acids, proteins, lipids, phenolic compounds, and UV-absorbing humic-like substances derived from oceanic biota (Liss and Duce, 1997). In addition, particulate matter, microorganisms (Donaldson and George, 2012), and colloids and phytoplankton-exuded aggregates mainly constituted by lipopolysaccharides can also be found (Liss and Duce, 1997; Hunter and Liss, 1977; Bayliss and Bucat, 1975; Liss, 1986; Hardy, 1982; Garabetian et al., 1993; Williams et al., 1986; Schneider and Gagosian, 1985; Gershy, 1983; Guitart et al., 2004; Facchini et al., 2008; Kovac et al., 2002). While the identification of these classes of compounds has been achieved in the past, an improved chemical characterization of the SML and its chemical processing is highly desirable to better understand its contribution to atmospheric composition, air quality, and climate change (Liss and Duce, 1997).
Metabolomics is the comprehensive analysis and characterization of all small molecules (MW < 1500) in a biological system (Fiehn et al., 2000; Nicholson and Lindon, 2008), such as the marine metabolome. Mass spectrometry (MS) is one of the primary analytical techniques used to explore the metabolome, as it is highly sensitive and versatile for conducting chemical analyses in targeted and untargeted studies (Clendinen et al., 2017; Weckwerth and Morgenthal, 2005). Targeted metabolomics focuses on detecting and quantifying a preselected set of metabolites. Conversely, untargeted metabolomics attempts to cover the broadest range of detectable compounds in a biological system (Viant et al., 2019), in order to subsequently extract chemical patterns or class fingerprints that can allow for sample classification based on metabolite panels without any a priori hypotheses. Multivariate statistical techniques compute all of the compound features (variables) simultaneously with the aim of reducing data dimensionality, finding underlying trends, and isolating feature panels relevant to class discrimination (Saccenti et al., 2014). Following compound identification, the relative changes of abundances can be analyzed for biological interpretation.
The advancements in the new, soft ambient ion generation techniques offer alternative MS-based applications for surface analysis, with little to no sample preparation, and address high-throughput analytical challenges in untargeted metabolomics workflows (Monge et al., 2013; Harris et al., 2011; Clendinen et al., 2017). In particular, direct analysis in real time (DART) (Cody et al., 2005; Gross, 2014; Jones et al., 2014; Monge and Fernández, 2014), which is a plasma-based ambient ion source, has been successfully applied in untargeted metabolomics studies in different scientific fields (Salter et al., 2011; Ifa et al., 2009; Steiner and Larson, 2009; Fernández et al., 2006; Chernetsova et al., 2010; Cajka et al., 2011; Dove et al., 2012; Jones and Fernández, 2013; Zang et al., 2017); to date, however, no studies exploring oceanic biological systems have been reported. In DART-MS, a stream of metastable atomic or molecular species generated within the heated discharge gas (He or N2) is directed at the sample, and ions are drawn into the mass spectrometer (Cody et al., 2005). Thermally desorbed analytes, typically having MW < 1000, are ionized following atmospheric pressure chemical ionization (APCI)-like pathways (Cody et al., 2005; Song et al., 2009a, b; McEwen and Larsen, 2009). Therefore, a major limitation of DART is that it requires analytes to be volatile or semivolatile, which reduces the metabolome coverage. An important advantage of DART, when compared to electrospray ionization (ESI) for seawater analysis, is that it is less affected by high salt levels (Kaylor et al., 2014; Tang et al., 2004), thus avoiding the desalinization processes that may lead to sample alteration. Conversely, ESI sources allow for the coupling of MS to chromatographic systems that provide an additional parameter to improve confidence in the compound identification when compared to an authentic chemical standard.
In the present work, a transmission mode (TM)-DART-quadrupole time-of-flight (QTOF)-MS-based analytical method was developed to interrogate the seawater DOM composition in SML and underlying water (ULW) samples collected during a field campaign at the Cabo Verde islands during September-October 2017. An untargeted metabolomics approach, addressed here as seaomics, was implemented to successfully discriminate the SML from ULW samples based on a selected panel of 11 ionic species. Tentative identification of the discriminant panel provided insight into the family of compounds that may be involved in air-water transfer processes and photochemical reactions at the air-water interface of the ocean surface. In addition, secondary organic aerosol formation potency from the SML interfacial photochemical products was explored during the field campaign by using a lab-to-field approach. To our knowledge, this is the first study to apply an untargeted TM-DART-QTOF-MS-based seaomics analytical strategy coupled to multivariate statistical analysis to investigate the DOM seawater composition.

Sample collection at the Cabo Verde field campaign
Sea surface microlayer (SML) samples were manually collected by the traditional glass plate (GP) method (van Pinxteren et al., 2012) and with an automatic catamaran named MarParCat (CAT) by using the same sampling principle as GP. The MarParCat is an autonomous catamaran for sampling the SML on rotating glass plates. Larger quantities of SML samples can be collected with this method in a shorter time. Underlying water (ULW) samples were collected from 1.0 m sea subsurface during the same time window as the SML samples, using both strategies, i.e., manual sampling addressed as GP and MarParCat (Table S1 in the Supplement). SML and ULW samples that were collected at the same site are addressed as paired samples (Table S1). The samples analyzed in the present study (n = 22) were collected between 18 September 2017 and 10 October 2017 and stored at −20 °C until processing. Information related to sampling conditions, sample salinity, pH, and temperature is provided in Table S1. Dissolved organic carbon (DOC) levels varied between 1.8 and 3.2 mg L⁻¹ in the SML and between 0.9 and 2.8 mg L⁻¹ in the ULW (van Pinxteren et al., 2019).

Aerosol particle formation experiments at the Cabo Verde islands
A subset of collected SML seawater samples was subjected to on-site experiments using a lab-to-field approach to test whether they were photochemically active. Before each experiment, a 100 mL SML sample was conditioned to room temperature and divided into 12 aliquots. These were centrifuged at 3500 rpm and 4 °C for 25 min to exclude colloids and aggregates (particulate matter), using a 5702R centrifuge (Eppendorf, Hamburg, Germany). Subsequently, 2 mL of surface solution was collected from each centrifugal vessel to isolate closer representations of SML samples considering the dilution factor inherent to the collection process, i.e., SML diluted with the ULW contribution, which led to a total sample volume of 24 mL for subsequent experiments. Centrifugation was aimed at concentrating SML samples as a condition for aerosol formation. Sample irradiation was conducted using a cylindrical quartz cell reactor (2 cm diameter, 10 cm length, and 30 mL volume), half-filled with 14 mL of SML solution, thereby recreating an air-water interface with a maximum area of 20 cm². Experimental details of the reactor can be found elsewhere. This quartz reactor was surrounded by UV lamps in a ventilated box, which maintained the system at a relatively constant room temperature. The interface was irradiated by means of 210 W actinic UV irradiation peaking at 350 nm (the spectrum is displayed in Fig. S1 in the Supplement, Supporting Text 1), which was supplied by seven low-pressure mercury UV lamps (Philips) and one extra UV pen ray (UVP, Philips).
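The reactor figures quoted above can be cross-checked with simple geometry. The sketch below is our own arithmetic, not part of the original protocol; it assumes an ideal cylinder lying horizontally and filled exactly to its axis, so that the air-water interface is a rectangle of width equal to the diameter. The function names are hypothetical.

```python
import math

def cylinder_volume_ml(diameter_cm: float, length_cm: float) -> float:
    """Volume of an ideal cylinder in mL (1 cm^3 = 1 mL)."""
    radius = diameter_cm / 2.0
    return math.pi * radius ** 2 * length_cm

def half_filled_interface_cm2(diameter_cm: float, length_cm: float) -> float:
    """Interface area of a horizontal cylinder filled to its axis.

    Assumption: the liquid surface is a flat rectangle, diameter x length.
    """
    return diameter_cm * length_cm

volume_ml = cylinder_volume_ml(2.0, 10.0)               # geometric volume, about 31.4 mL
interface_cm2 = half_filled_interface_cm2(2.0, 10.0)    # 20 cm^2
pooled_ml = 12 * 2.0                                    # 12 aliquots x 2 mL each

print(f"geometric volume: {volume_ml:.1f} mL")
print(f"maximum interface area: {interface_cm2:.0f} cm^2")
print(f"pooled surface solution: {pooled_ml:.0f} mL")
```

Under these assumptions the geometric volume (about 31 mL) is close to the nominal 30 mL, and the interface area and pooled volume match the 20 cm² and 24 mL stated in the text.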
This experimental approach allowed for the reproduction of the air-sea exchanges under quiescent conditions and for the investigation of particle formation that potentially arises from the reaction between photochemically emitted gaseous products and OH radicals. For this purpose, the quartz cell was continuously flushed with 600 sccm purified air, thus entraining the air-water interfacial-exchange gaseous products into a potential aerosol mass (PAM) oxidation flow reactor with a 254 nm light supply (hereafter referred to as OFR254). Particle formation via OH radical photochemistry in the OFR254 was monitored by using a scanning mobility particle sizer (SMPS, model 3976, TSI Incorporated, Minnesota, USA) and one extra ultrafine condensation particle counter (UCPC, model 3776, TSI Incorporated, Minnesota, USA; d50 > 2.5 nm). A description of the OFR254 operation and a scheme of the experimental setup are detailed in the Supplement (Figs. S2 and S3). Blank experiments were routinely conducted by using ultrapure water (18.2 MΩ cm resistivity).

Sample preparation for DART-MS analysis
Samples were thawed at 4 °C for 5 h; neither desalination nor filtration was performed. Samples were split into 8 mL aliquots using 15 mL conical tubes and were subsequently frozen at −20 °C until lyophilization. Quality-control (QC) samples were prepared by mixing equal volumes of all of the samples including both collection methods before the sample lyophilization (QC ALL ) and after metabolite extraction and reconstitution in acetonitrile (QC MIX22 ). The chemical standard mixtures used for the analytical method development and as system suitability samples (SSSs) were prepared in ultrapure water for sugars and amino acids, in methanol-water mixtures for lipids, and by combining all the standards from the three families of compounds (Table S2). The sample preparation blank was prepared with ultrapure water as follows: fresh ultrapure water was stored for 2 d at −20 °C in a new plastic bottle equivalent to those used for sample collection; it was subsequently thawed, split in 8 mL aliquots, and stored in 15 mL conical tubes at −20 °C until lyophilization. This protocol was also implemented to prepare the commercial seawater samples (CSW) that were used for analytical method development. Blanks, QCs, SSSs, and samples were lyophilized at 0.280 mbar for 48 h by using an Alpha 1-4 LSCbasic freeze dryer (Martin Christ, Göttingen, Germany). The SML samples, ULW samples, QCs, and SSSs were lyophilized with sample blanks in different batches to evaluate possible cross-contamination. Lyophilized samples were shipped from TROPOS (Germany) to CIBION-CONICET (Argentina), where they were stored at −80 °C until the TM-DART-QTOF-MS analysis occurred. Lyophilized residues were reconstituted in 1200 µL of acetonitrile, yielding a concentration factor of 6.67. Reconstituted samples were vortex mixed for 5 min for metabolite extraction and centrifuged for 10 min at 4861 × g and 20 °C to favor the formation of a salt pellet. For each sample, 500 µL of supernatant was collected for further analysis.
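The concentration factor quoted above follows directly from the aliquot and reconstitution volumes. A minimal sketch, assuming complete recovery of the extractable compounds (an idealization; the function name is hypothetical):

```python
def concentration_factor(initial_volume_ml: float, final_volume_ul: float) -> float:
    """Ratio of the original aliquot volume to the reconstitution volume."""
    return initial_volume_ml * 1000.0 / final_volume_ul

# 8 mL seawater aliquot reconstituted in 1200 uL of acetonitrile
factor = concentration_factor(8.0, 1200.0)
print(f"concentration factor: {factor:.2f}")
```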

DART-MS analysis
A DART® SVP ionization source (IonSense Inc., Massachusetts, USA) was coupled to a Xevo G2-S QToF mass spectrometer (Waters Corporation, Wilmslow, UK) by means of a VAPUR® interface flange (IonSense Inc., Massachusetts, USA). The DART source was operated with He as the discharge gas heated to 300 °C, and the data were acquired in the negative ionization mode. A transmission mode (TM)-DART geometry was implemented for sample analysis, by setting a distance of 2.5 cm in the rail holding the source. This allowed for the use of the minimum possible DART-to-sample distance to provide the greatest sensitivity (Zang et al., 2017; Jones and Fernández, 2013). Samples were deposited in a stainless-steel mesh that was subsequently placed in a linear rail-based sampler, which was digitally controlled to minimize variance in sample position. Figure S4 illustrates the experimental design for depositing samples in different spots of the mesh to avoid cross-contamination. A protocol for calibrating the mass spectrometer across the range of m/z 50-850 by using the DART source operated in TM was developed by using a mixture of standards prepared in a water-methanol solution (1 : 1 v/v) that would provide almost equidistant m/z peaks. Signals of different adduct ions from 2-cyanoguanidine, enalapril maleate, mercaptosuccinic acid, 2-amino-4,5-dimethoxybenzoic acid, flecainide acetate, and lacosamide were used for the time-of-flight (TOF) calibration (Table S3). Drift correction was performed after data acquisition by using stearic acid present as an ambient contaminant. The [M-H]− ion with m/z 283.2643 was chosen as a lock mass to have a high degree of accuracy in the exact mass measurement. Data were acquired in the continuum mode in the range of m/z 50-850, and the scan time was set to 1 s. A standard solution of enalapril 3.7 µM was used as an additional SSS and added to each mesh in spot no. 3 (Fig. S4) to evaluate mass accuracy of the [M-H]− ion at m/z 375.1925.
The resolving power and mass accuracy of the TM-DART-QTOF-MS system were 23 000 full width at half maximum and 0.2 mDa at m/z 375.1925, respectively. A total of 12 spots per mesh were utilized for analysis. Each spot contained three droplets of 20 µL of the same sample, which was dried at room temperature before analysis. The mesh holder was moved at a speed of 0.2 mm s⁻¹ for data acquisition. Mesh nos. 1-11 included a solvent (SV) blank (acetonitrile); a commercial seawater control; a sample preparation blank (using ultrapure water); a QC MIX22 (pooled QC sample from all reconstituted samples: 10 SML + 12 ULW); and technical triplicates of all SML and ULW samples (Fig. S4). As indicated in Fig. S4, mesh no. 12 included QC ALL samples (pooled QC sample from all samples before lyophilization: 10 SML + 12 ULW samples). For TM-DART-QTOF-MS/MS experiments, the product ion mass spectra were acquired with collision cell voltages between 10 and 40 V, depending on the analyte. Ultrahigh-purity argon (≥ 99.999 %) was used as the collision gas. Data acquisition and processing were carried out by using MassLynx 4.1 (Waters Corporation, Milford, Massachusetts, USA). Data were acquired for each spot, and acquisition over each mesh was automatically performed through synchronization between the DART software (IonSense, Inc.) and MassLynx (Waters Corporation, Milford, Massachusetts, USA). System suitability procedures were performed to verify that the method and associated instrumentation were fully functioning before and during the analysis of experimental samples.
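To put the reported figures of merit on more familiar scales, the sketch below converts the 0.2 mDa mass accuracy into parts per million and the resolving power into an approximate peak width. It assumes the usual definitions (relative error = Δm/m; resolving power = m/Δm at full width at half maximum); the function names are our own.

```python
def mass_error_ppm(error_mda: float, mz: float) -> float:
    """Convert an absolute mass error in mDa into a relative error in ppm."""
    return (error_mda * 1e-3) / mz * 1e6

def peak_fwhm_da(mz: float, resolving_power: float) -> float:
    """Peak width (FWHM, in Da) implied by R = m / delta_m."""
    return mz / resolving_power

ppm = mass_error_ppm(0.2, 375.1925)       # about 0.53 ppm
fwhm = peak_fwhm_da(375.1925, 23000.0)    # about 0.016 Da

print(f"mass accuracy: {ppm:.2f} ppm")
print(f"peak width at m/z 375.1925: {fwhm * 1000:.1f} mDa")
```

Under these definitions, two ions near m/z 375 closer together than roughly 16 mDa would not be resolved at half maximum, which is one way to read the spectral-overlap limitation of direct (non-chromatographic) analysis.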

Seaomics data analysis
The Progenesis Bridge (Waters Corporation, Milford, Massachusetts, USA) application was used for data preprocessing. This software allowed for the defining of the lock mass for drift correction after acquisition and merged the original data into a Gaussian profile. Spectral features (m/z values) were further extracted from the TM-DART-QTOF-MS data using Progenesis QI version 2.1 (Nonlinear Dynamics, Waters Corporation, Milford, Massachusetts, USA). An absolute ion intensity filter was applied in the peak-picking process for integration, thus defining a threshold for the aggregate run. Only SML and ULW samples were considered for peak picking. This process yielded 889 features (m/z) that were detected within the samples. Subsequently, six features were removed due to high mass defects (potential salt clusters). For the correction of intermesh effects, a quality-control-based robust locally estimated scatterplot smoothing (LOESS) signal correction method (Dunn et al., 2011) was applied by using QC MIX22 samples. This strategy allowed correcting for the temporal signal fluctuation of each feature along the total acquisition time. Subsequently, features with relative standard deviation (RSD) > 30 % in QC MIX22 were discarded, and only those with a 5-fold average intensity in samples compared to the blanks (i.e., sample preparation blanks and solvent blanks) were retained. Manual curation of features was also performed to eliminate redundancy (isotopic peaks from the same feature), to retain signals with a detected isotopic pattern, and to account for resolution limitations in the peak-picking process. Moreover, only those monoisotopic peaks with intensity > 10³ in the continuum spectra were retained. The final curated matrix consisted of 51 features (m/z values) and was normalized by the total ion area. Abundance values from technical triplicates were averaged, except for the SML GP2 sample, for which only two replicates were considered.
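The RSD and blank-ratio filters and the total-area normalization described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' Progenesis QI workflow: the LOESS drift correction and manual curation are not reproduced, the feature names and intensities are invented toy values, and the helper names are hypothetical.

```python
from statistics import mean, stdev

def rsd_percent(values):
    """Relative standard deviation (sample SD / mean) in percent."""
    return 100.0 * stdev(values) / mean(values)

def keep_feature(qc, samples, blanks, max_rsd=30.0, min_fold=5.0):
    """Keep a feature if QC RSD <= max_rsd and mean(samples) >= min_fold * mean(blanks)."""
    if rsd_percent(qc) > max_rsd:
        return False
    return mean(samples) >= min_fold * mean(blanks)

def normalize_by_total(intensities):
    """Scale the feature intensities of one sample so that they sum to 1."""
    total = sum(intensities)
    return [x / total for x in intensities]

# Toy intensities (hypothetical): QC replicates, seawater samples, blanks
features = {
    "mz_A": {"qc": [100, 105, 95], "samples": [120, 130, 110], "blanks": [10, 12, 8]},
    "mz_B": {"qc": [100, 180, 40], "samples": [150, 160, 140], "blanks": [5, 5, 5]},
    "mz_C": {"qc": [50, 52, 48], "samples": [60, 62, 58], "blanks": [30, 28, 32]},
}

kept = [name for name, f in features.items() if keep_feature(**f)]
print(kept)  # mz_B fails the RSD filter, mz_C fails the 5-fold blank filter
print(normalize_by_total([1.0, 1.0, 2.0]))
```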
The matrices obtained before and after averaging the technical replicates (data set S1 in the Supplement) were utilized to build unsupervised and supervised multivariate statistical analysis models using MATLAB R2015a (MathWorks, Natick, Massachusetts, USA) with PLS_Toolbox version 8.1 (Eigenvector Research, Inc., Manson, Washington, USA). Principal component analysis (PCA) (Johnson and Wichern, 2007) and t-distributed stochastic neighbor embedding (t-SNE; Van Der Maaten and Hinton, 2008) techniques were used to track the data quality, reduce the data dimensionality, and identify potential outliers in the data set as well as to identify sample clusters and evaluate the analytical method reproducibility. Orthogonal projections to latent structures discriminant analysis (OPLS-DA; Trygg et al., 2007; Bylesjö et al., 2006; Trygg and Wold, 2002; Shrestha and Vertes, 2010), coupled with a genetic algorithm (GA) variable selection method, were applied to find a feature panel that maximized the classification accuracy for the binary comparison of the SML and ULW samples. The selected group of discriminant features had the lowest root mean square error of cross-validation (RMSECV) at the conclusion of the GA variable selection process. This process was performed five different times. The selected panel yielded the lowest RMSECV and exhibited the largest feature overlap with the other four panels. The parameters for the GA were as follows: population size: 64; variable window width: 1; percent of initial terms (variables): 15; target minimum no. of variables: 5; target maximum no. of variables: 15; penalty slope: 0.03; maximum generations: 100; percent at convergence: 50; mutation rate: 0.005; crossover: double; regression choice: PLS; no. of latent variables: 5; cross-validation: contiguous; no. of splits: 10; no. of iterations: 10; and replicate runs: 10.
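The RMSECV criterion with contiguous-block cross-validation can be illustrated independently of any particular regression method. The sketch below is a generic outline, not the PLS_Toolbox implementation: `fit_predict` is a hypothetical callback standing in for the PLS model, and the placeholder used in the example simply predicts the mean of the training labels.

```python
import math

def contiguous_splits(n_samples, n_splits):
    """Partition sample indices 0..n-1 into n_splits contiguous test blocks."""
    base, extra = divmod(n_samples, n_splits)
    splits, start = [], 0
    for i in range(n_splits):
        size = base + (1 if i < extra else 0)
        splits.append(list(range(start, start + size)))
        start += size
    return splits

def rmsecv(y, fit_predict, n_splits):
    """Root mean square error of cross-validation.

    fit_predict(train_idx, test_idx) must return one prediction per test index.
    """
    n = len(y)
    squared_error = 0.0
    for test_idx in contiguous_splits(n, n_splits):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        predictions = fit_predict(train_idx, test_idx)
        squared_error += sum((p - y[i]) ** 2 for p, i in zip(predictions, test_idx))
    return math.sqrt(squared_error / n)

# Toy class labels (0 = ULW, 1 = SML) and a placeholder model, NOT PLS
y = [0, 0, 0, 0, 1, 1, 1, 1]

def mean_model(train_idx, test_idx):
    train_mean = sum(y[i] for i in train_idx) / len(train_idx)
    return [train_mean] * len(test_idx)

error = rmsecv(y, mean_model, n_splits=4)
print(f"RMSECV of the placeholder model: {error:.4f}")
print([len(block) for block in contiguous_splits(22, 10)])
```

In a GA variable selection, each candidate feature subset would be scored by a value of this kind, and subsets with lower RMSECV would be favored in the next generation.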
The OPLS-DA model was cross-validated using venetian blinds with four data splits and one sample per blind to account for overfitting. The data were preprocessed by autoscaling prior to the PCA or OPLS-DA. The PCA was also performed to inspect the data before and after GA variable selection (i.e., on the curated spectral feature matrix and on the discriminant feature panel). Fold changes were calculated for paired samples for each discriminant feature by comparing the sample replicate average values for the SML and ULW samples. The Wilcoxon paired signed rank test was used to compare SML with ULW samples (p < 0.05). Median fold changes were calculated for each discriminant feature (Table S4).
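The paired comparison described above can be sketched with the standard library alone. The example computes a median fold change and an exact two-sided Wilcoxon signed-rank p-value by enumerating all sign assignments, which is feasible for the 10 paired samples of this study. The abundance values are invented for illustration, and the function names are our own; this is not the authors' MATLAB code.

```python
from itertools import product
from statistics import median

def median_fold_change(sml, ulw):
    """Median of the pairwise SML/ULW abundance ratios."""
    return median(s / u for s, u in zip(sml, ulw))

def wilcoxon_signed_rank_exact(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired data.

    Zero differences are dropped and tied |differences| receive average ranks.
    Returns (W, p), where W is the smaller of the two signed-rank sums.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    total = sum(ranks)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_obs = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if min(w, total - w) <= w_obs:
            extreme += 1
    return w_obs, extreme / 2 ** n

# Hypothetical normalized abundances of one feature in 10 paired samples
sml = [1.5, 5.0, 2.0, 6.0, 1.25, 4.0, 3.0, 7.0, 2.5, 4.5]
ulw = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0]

fold_change = median_fold_change(sml, ulw)
w_stat, p_value = wilcoxon_signed_rank_exact(sml, ulw)
print(f"median fold change = {fold_change}, W = {w_stat}, p = {p_value:.4f}")
```

With 10 pairs, the smallest attainable two-sided p-value is 2/1024, reached when the feature is higher in the SML for every pair, as in this toy data set.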

Metabolite identification procedure
Metabolite identification was attempted for the discriminant features resulting from the GA variable selection process. The elemental formulae were generated based on accurate masses and isotopic patterns and taking the stringent conditions for isotope ratios into account. For those cases in which there was an overlap between isotopic peaks of different features, the isotopic pattern was not considered for molecular formula generation. In addition, fragmentation patterns obtained from TM-DART-QTOF-MS/MS experiments were used for tentative identification.
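Matching a candidate elemental formula against an accurate mass amounts to computing the theoretical m/z of the expected ion. The sketch below does this for singly charged [M-H]− ions, using standard monoisotopic masses and the electron mass; as a check, it reproduces the m/z values quoted in this paper for stearic acid (C18H36O2) and enalapril (C20H28N2O5). Only C, H, N, and O are tabulated here, and the isotope-pattern scoring is not covered; the function names are hypothetical.

```python
import re

# Monoisotopic masses in Da (most abundant isotopes) and the electron mass
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.99491462}
ELECTRON = 0.00054858

def parse_formula(formula):
    """Turn a simple formula such as 'C18H36O2' into {'C': 18, 'H': 36, 'O': 2}."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts

def mz_deprotonated(formula):
    """m/z of [M-H]-: neutral mass minus one H atom plus one electron."""
    counts = parse_formula(formula)
    neutral = sum(MONOISOTOPIC[element] * n for element, n in counts.items())
    return neutral - MONOISOTOPIC["H"] + ELECTRON

def error_mda(measured_mz, theoretical_mz):
    """Mass error in mDa between a measured and a theoretical m/z."""
    return (measured_mz - theoretical_mz) * 1000.0

stearate = mz_deprotonated("C18H36O2")     # lock mass ion
enalapril = mz_deprotonated("C20H28N2O5")  # system suitability ion
print(f"stearic acid [M-H]-: {stearate:.4f}")
print(f"enalapril [M-H]-: {enalapril:.4f}")
```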

Results and discussion

TM-DART-QTOF-MS-based method optimization
Figure 1 illustrates the untargeted TM-DART-QTOF-MS seaomics analytical workflow implemented for the analysis of seawater samples collected during the Cabo Verde field campaign. A TM geometry was implemented to analyze samples in a flow-through fashion to increase the reproducibility with a lower risk of cross-contamination (Zhou et al., 2010a, b; Jones and Fernández, 2013; Perez et al., 2010; Zang et al., 2017; Jones et al., 2014). The analytical method development involved the following: (i) the optimization of the ion source stabilization time, which was accomplished in 60 s, and the synchronization between data acquisition and the linear-rail control; (ii) the selection of He over N2 to generate the plasma, based on higher sensitivity obtained with the former; (iii) the optimization of the He temperature set at 300 °C; (iv) the selection of acetonitrile for metabolite extraction; (v) the optimization of the solvent volume required for extraction to allow for maximum metabolite concentration, considering that the seawater metabolome is comprised of organic compounds with a wide range of physicochemical properties and levels, and to allow for enough sample volume for technical replicates, QCs, and tandem MS analyses; and (vi) the optimization of the sample volume deposited on the mesh to maximize signal-to-noise ratio (number of sample droplets and droplet volume). The selected OM extraction method with acetonitrile as an extracting solvent favored the analysis of lipophilic compounds. In addition, to enhance the detection of organic acids, the analytical method was optimized by operating the DART ion source in negative ionization mode, since it follows negative ionization APCI-like mechanisms including electron capture, dissociative electron capture, proton abstraction, and anion adduction (McEwen and Larsen, 2009; Cody and Dane, 2013; Gross, 2014).

Figure 2 (caption, beginning not recovered): [...] with solvent blanks (squares). The plot can be read by using the following: WB, sample preparation blanks using ultrapure water (gray); QC ALL, pooled sample from all seawater samples before lyophilization (purple); QC MIX22, pooled sample from all reconstituted seawater samples (pink); SML, sea surface microlayer water samples (light blue); ACN, acetonitrile (red); CSW, commercial seawater samples (gold); and ULW, underlying water samples (black). PCA and t-SNE models were built using the 51 extracted features, and all replicates were included.

Seawater sample fingerprinting
The curated data matrix, comprising 51 features (m/z values) and all sample replicates (data set S1 in the Supplement), was used to build a PCA model that accumulated 62.29 % of the total variance in the first two principal components (PCs) (Fig. 2). The 2D score plot illustrated in Fig. 2a shows distinguishable separation between acetonitrile blanks, sample preparation blanks, commercial seawater samples, and seawater samples collected during the field campaign. Since the first PCs of a PCA model point along the eigenvectors of the covariance matrix with the largest eigenvalues, i.e., the directions of maximum data variance, the largest differences are those between seawater samples and blanks. Nevertheless, seawater samples from the Cabo Verde islands were also discriminated from commercial seawater samples. In addition, QC MIX22 replicates clustered together, which indicates reproducibility in the sample preparation method, high data quality, and adequate performance of the analytical platform. Moreover, overlapping of both types of QC samples (QC MIX22 and QC ALL ) suggested reproducibility in the sample extraction protocol. Solvent blanks from different meshes and different positions (spots) clustered together, which suggests negligible cross-contamination in the analysis. Results provided by the t-SNE model (Fig. 2b), which is a nonlinear dimensionality reduction technique, were in agreement with those provided by the linear transformation-based technique of PCA and emphasized the reproducibility of the developed analytical method for seawater sample analysis. This was further evidenced by the visualization of sample replicate clusters in a t-SNE model that only included SML and ULW samples (Fig. S5).
To investigate the possibility of seawater sample clustering, a PCA model was built with the 51 extracted and curated features for averaged technical replicates of SML and ULW samples. Figure 3a shows the PCA score plot, including the first three principal components that accounted for 43.93 %, 25.08 %, and 8.40 % of the variance, respectively. No outliers were detected by this analysis, and no sample clustering was visualized in the score plot. Thus, sample discrimination was further attempted by means of OPLS-DA coupled to a GA variable selection method to find a reduced set of features that would allow for sample classification and class membership prediction. A panel of 11 features with the lowest RMSECV was selected through the GA process. Figure 3b shows the cross-validated prediction plot using the selected feature panel by means of a model that consisted of five latent variables that captured 82.19 % and 95.41 % of the variance from the X block (feature abundances) and Y block (class membership), respectively. This OPLS-DA model resulted in 100 % cross-validated accuracy, sensitivity, and specificity; therefore, there was no sample misclassification. Sample classification was further evaluated by means of an unsupervised method (PCA) by using the 11 discriminant features to rule out possible overfitting by the supervised multivariate model. Figure 3c shows a certain degree of sample separation into clusters in the PC3 dimension according to the seawater sample collection depth, i.e., SML or ULW.
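For readers less familiar with the figures of merit quoted above, the sketch below shows how accuracy, sensitivity, and specificity follow from class predictions, and how the per-component variances add up. It assumes that SML is treated as the positive class (our choice; the paper does not state which class is positive) and uses the class sizes of this study (10 SML, 12 ULW). The single misclassification is hypothetical and included only for contrast.

```python
def classification_metrics(true_labels, predicted_labels, positive="SML"):
    """Return (accuracy, sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)   # fraction of SML samples recognized as SML
    specificity = tn / (tn + fp)   # fraction of ULW samples recognized as ULW
    return accuracy, sensitivity, specificity

truth = ["SML"] * 10 + ["ULW"] * 12

# No misclassification, as reported for the cross-validated OPLS-DA model
perfect = classification_metrics(truth, truth)

# Hypothetical case: one SML sample predicted as ULW
one_miss = classification_metrics(truth, ["ULW"] + truth[1:])

# Variance captured jointly by the first three PCs of the 51-feature model
cumulative_variance = round(43.93 + 25.08 + 8.40, 2)

print(perfect, one_miss, cumulative_variance)
```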

SOA formation potency from SML samples
A subset of SML samples (CAT 8, GP 10, CAT 6, CAT 3, and CAT 4) that were analyzed by the TM-DART-QTOF-MS seaomics strategy were also subjected to on-site experiments during the field campaign by using a lab-to-field approach to test their SOA formation potency. The outcome of a typical SML irradiation experiment is illustrated for sample CAT 8 in Fig. 4. The different time periods (P) when the experimental parameters were modified along the experiment are indicated in the figure. In the absence of light (before P1), no particle formation was detected downstream of the preconditioned OFR (5.0 ppmv initial O3 and half-power UV light supply). However, when SML samples were exposed to actinic irradiation (periods P1-P4), particle formation was detected in the OFR254. Moreover, the particle number concentration exhibited trends that were dependent on OH exposure (OHexp; P2-P3). Gaseous products were probably generated from the photosensitized reactions at the SML interface and subsequently reacted with OH radicals in the OFR254, which led to particle formation.
Because of the difficulty associated with measuring total OHR (OH radical reactivity) from the cell reactor or tracing OHexp in the OFR on-site, we only tested the particle generation rates qualitatively with respect to various oxidation degrees, by changing the UV light intensity or O3 concentration in the OFR. Assuming that the photochemistry occurring at the SML interface was at a steady state, the air-water exchanged gaseous products were constantly entrained into the OFR, and the estimated particle generation rates/OHexp for each period followed the trend of P1 < P4 < P2 < P3. During P1, particle concentration gradually increased with SML illumination, and the final number concentration exceeded 8 × 10³ cm⁻³. These particles exhibited a median diameter of several nanometers at the edge of the lower 10 nm size limit of the SMPS detection system; thus, measuring the particle size distribution was not possible. During P2, the UV light intensity was doubled in the OFR by turning all lamps on. A particle burst was detected by the UCPC, together with a shift towards larger particle sizes. During P3, the oxidation capacity in the OFR was further enhanced by supplying additional external O3 (initial mixing ratio of 7.0 ppmv). The total particle concentration decreased while larger particles were formed. During P4, one UV lamp in the OFR was turned off, and a sharp decrease in particle concentration was observed, but the final concentration was still higher than during P1 (Fig. 4). Particle formation was observed for CAT 8 and GP 10 SML samples. The results from the atmospheric simulation experiments conducted on SML samples were in agreement with previous laboratory studies that demonstrated air-sea interfacial-driven chemistry as a source of marine secondary aerosol (Fu et al., 2015).

Figure 4. Irradiation experiment for SML CAT 8 sample in a quartz cell and subsequent particle formation from the SML interfacial gaseous products via OH radical photochemistry in the OFR. The (a) O3 mixing ratio and humidity in the OFR; (b) particle concentration measured by CPC; and (c) particle size distribution profiles scanned by SMPS downstream of the OFR. The yellow shading represents the time period in which the quartz cell containing the concentrated SML sample was illuminated. P1 to P4 correspond to different operations to the OFR in varying oxidation degrees of the gaseous products from the quartz cell.

Discriminant compound identification and role in aerosol particle formation
Compound identification was attempted for the 11 features of the discriminant panel. The coupling of the DART source to a high-resolution mass spectrometer allowed for the generation of elemental formulae for unknown compounds which, together with tandem MS capability, contributed to their identification. Figure S6 shows the high-resolution continuum mass spectra obtained for each of the discriminant features that were detected in all samples and retained by the GA selection process. The analysis of fragment ions detected in tandem MS experiments, together with neutral loss analysis, provided information regarding the functional groups and helped to filter the molecular formulae obtained by accurate mass and isotopic pattern analysis. Table 1 describes the ionic species associated with the discriminant features and their corresponding molecular formulae, and it provides information about product ions and neutral and/or radical losses identified in TM-DART-QTOF-MS/MS experiments. In addition, the table includes the family of compounds identified with a certain confidence level. In general, discriminant features comprised saturated fatty acids, fatty alcohols, peptides, brominated compounds, and boron-containing organic compounds. An expected limitation of TM-DART-QTOF-MS analysis was spectral overlap; thus, in some cases the isotopic pattern was not considered for compound identification. However, two different quadrupole mass windows of 6 and 1 Da were used in tandem MS experiments to mitigate this problem. The 6 Da window allowed for the investigation of the complete isotopic profile with high sensitivity, at the expense of lower selectivity than the narrower window. In contrast, the 1 Da window provided more confidence in the identification of product ions through higher selectivity, at the expense of lower sensitivity than the broader window. In cases of low precursor ion intensity or quadrupole coselection, the MS/MS spectra were not collected (Table 1).
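The trade-off between the two quadrupole mass windows can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a window centred on a chosen m/z with half of the stated width on each side; the peak list (approximate deprotonated isotopologues of a bromophenol plus a hypothetical interfering ion) and the function name coselected are not taken from the study.

```python
def coselected(peaks, center, width):
    """Return the m/z values transmitted by an isolation window.

    Assumption: the window spans center - width/2 to center + width/2.
    """
    half = width / 2.0
    return [mz for mz in peaks if center - half <= mz <= center + half]


# Approximate m/z values: 79Br isotopologue, a hypothetical interfering ion,
# and the 81Br isotopologue of a deprotonated bromophenol (C6H4BrO-).
peaks = [170.945, 171.845, 172.943]

# 1 Da window on the 79Br peak: selective, but the isotopic profile is lost.
narrow = coselected(peaks, center=170.945, width=1.0)

# 6 Da window: the full isotopic profile passes, and so does the interference.
wide = coselected(peaks, center=171.945, width=6.0)

print(narrow)
print(wide)
```

Under these assumptions the narrow window transmits only the selected precursor, whereas the wide window also transmits the second isotopologue and the interfering ion, which is the coselection problem described above.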
Different types of species were generated for desorbed and ionized analytes (M) by the plasma-based source operated in negative mode, including [M-H]⁻, [M]⁻, and [M]⁻• ionic species. The generation of a radical anion, [M]⁻•, was suggested for feature no. 4 based on the product ions detected in tandem MS experiments and the generated molecular formulae. Based on the tentative identification of feature no. 4, additional experiments were performed with chemical standards, including a dicarboxylic acid (succinic acid) and saturated fatty acids, under the same experimental conditions as for the seawater sample analysis. Several ionic species were detected in these experiments, but no radical anions. However, literature evidence suggests that the production of radical anions by electron-capture mechanisms occurs in He-based plasma sources (Cody and Dane, 2016; Bridoux and Machuron-Mandard, 2013; Jorabchi et al., 2013).
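The distinction between a deprotonated ion and a radical anion can be made concrete with a short Python sketch that computes their theoretical m/z values and a mass error in ppm. Palmitic acid (C16H32O2) is used only as a generic saturated fatty acid for illustration and is not claimed to be one of the discriminant features; the function names and the measured value are hypothetical.

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONO[el] * n for el, n in formula.items())


def mz_deprotonated(formula):
    """[M-H]-: remove one hydrogen atom, then add one electron."""
    return neutral_mass(formula) - MONO["H"] + ELECTRON


def mz_radical_anion(formula):
    """[M]-. : the intact molecule captures one electron."""
    return neutral_mass(formula) + ELECTRON


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


palmitic = {"C": 16, "H": 32, "O": 2}  # illustrative analyte only
mz_dep = mz_deprotonated(palmitic)
mz_rad = mz_radical_anion(palmitic)
err = ppm_error(255.2335, mz_dep)  # 255.2335 is a made-up measured value

print(round(mz_dep, 4), round(mz_rad, 4), round(err, 1))
```

The two species differ by the mass of one hydrogen atom (about 1.0078 u), so they are readily separated at high resolution; assigning one or the other then rests on the candidate formulae and the product ions, as described above.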
Based on the analysis of the isotopic patterns and tandem MS results, several features were identified as oxygenated boron-containing organic compounds. In these compounds, the boron atom is speculated to be functionalized with saturated fatty acids, yielding tetracoordinated boron esters that would generate [M]⁻ anions. Boron-containing compounds are known to be ubiquitous in vascular plants, marine algal species, and microorganisms (Dembitsky et al., 2002). Four out of five features identified as boron-containing organic compounds functionalized with saturated fatty acids, as well as the features identified as fatty alcohols, were enriched in the SML samples compared to the ULW samples (Table S4).
Compounds having a bromine atom in their molecular formula were also tentatively identified in the discriminant panel and are suggested to be halogenated compounds rather than bromine adduct ions. This hypothesis is based on the comparative analysis of an acetonitrile solution saturated with KBr and containing 2 mM phenol, and of an acetonitrile solution of 4-bromophenol (Fig. S7), which was used as a model compound. The [M-H]⁻ ion was detected in the analysis of 4-bromophenol, but the [M+Br]⁻ adduct ion was not observed for the KBr-saturated solution containing phenol. The two features (nos. 21 and 34) that were identified as halogenated compounds were enriched in the SML samples (Table S4). Possible sources of halogenated compounds in the SML samples are photochemical reactions occurring at the air-water interface (Donaldson and George, 2012). It is worth noting that the organic compounds identified in the discriminant panel may have derived from both the secreted metabolites (exometabolome) and the intracellular metabolites (endometabolome) of biological organisms, such as algal species and microorganisms present in seawater, since the samples were not filtered. In a real environment, therefore, some of these compounds may be present at lower levels than those detected in the present work, or they may not be available to participate in the sea surface secondary organic aerosol (SOA) chemistry.

Figure 5. Bidimensional PCA score plot for SML samples using the matrix with 51 features for averaged technical replicates. Samples that were evaluated for particle formation during the Cabo Verde field campaign are indicated with circles (led to SOA formation) and rectangles (did not lead to SOA formation).
Putative identification of the discriminant panel capable of differentiating SML from ULW samples provides further evidence to support the SOA formation detected by the lab-to-field approach during the campaign. The PCA score plot in Fig. 5 shows that SML samples were not distinguished based on the collection method, i.e., GP or CAT, and highlights those SML samples that were also evaluated for SOA formation during the field campaign. As previously discussed, two of these SML samples (CAT 8 and GP 10) yielded SOA formation (Fig. 4). Since CAT 8 and GP 10 were separated in the bidimensional score map from the group formed by CAT 3, CAT 4, and CAT 6, a further PCA model was built only with those samples (n = 5) that were analyzed by both TM-DART-QTOF-MS and the lab-to-field approach (Fig. S8). Figure S8a shows that PC2 clearly separates the samples according to SOA formation. Four out of the seven features that contribute most to sample class separation, i.e., those with the largest absolute values in the PC2 loadings plot (Fig. S8b), were putatively identified as boron-containing organic compounds (Table S5). Despite the limitations associated with the low number of samples used to perform the statistical analysis, the results suggest that the SML samples that led to particle formation were enriched in boron-containing organic compounds and other unidentified molecules (Table S5).
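The ranking of features by the absolute value of their PC2 loadings can be sketched as follows. This is a minimal illustration on synthetic random data (5 samples, 7 features) and not a reproduction of the study's model; it assumes mean-centring only, since the scaling applied to the real data matrix is not restated here, and all variable names are hypothetical.

```python
import numpy as np

# Synthetic stand-in for a samples x features intensity matrix (NOT study data).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 7))

# PCA via singular value decomposition of the mean-centred matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = U * s                     # sample coordinates on each component
loadings = Vt.T                    # feature weights, one column per component
explained = s**2 / np.sum(s**2)    # fraction of variance per component

# Features ordered by decreasing absolute contribution to PC2 (column index 1).
pc2_rank = np.argsort(-np.abs(loadings[:, 1]))

print(pc2_rank)
print(np.round(explained, 3))
```

With only five samples, at most four components carry variance after centring, so loadings obtained from such a small model are best read as indicative, in line with the limitation noted above.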
An untargeted TM-DART-QTOF-MS-based analytical method coupled to multivariate statistical analysis allowed for the analysis of organic compounds present in the SML and ULW seawater samples collected during a field campaign at the Cabo Verde islands without the need for desalinization. This seaomics approach was successfully implemented to discriminate the SML from ULW samples. Tentative identification of the discriminant metabolite panel suggests that halogenated compounds, fatty alcohols, and oxygenated boron-containing organic compounds are available for air-water transfer processes and photochemical reactions at the air-water interface of the ocean. Combined results from TM-DART-QTOF-MS and on-site SOA formation testing experiments on SML samples suggest that organic compounds enriched at the air-water interface may be contributing to the differential SOA-forming ability of SML samples. This strategy, implemented for the first time in this collaborative study, provides new opportunities for improving the characterization of seawater OM content, and discovering compounds involved in aerosol formation processes.
Author contributions. MEM, MvP, HH, and CG designed the collaborative study. MvP and HH designed the sample collection methods. MM processed the samples until they were stored at −80 °C. MEM, MM, and NZ developed the TM-DART-MS-based seaomics strategy and analyzed the data. MEM, NZ, MM, AD, NH, and CG contributed to optimizing the TM-DART-MS-based analytical method. NZ and MM conducted TM-DART-MS and MS/MS experiments. MR, CL, and CG conducted on-site aerosol particle formation experiments. MEM, NZ, and CG wrote the paper. All authors revised the paper.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "Marine organic matter: from biological production in the ocean to organic aerosol particles and marine clouds (ACP/OS inter-journal SI)". It is not associated with a conference.
Acknowledgements. María Eugenia Monge is a research staff member from CONICET (Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina). Nadja Triesch and Sebastian Zeppenfeld from the TROPOS Atmospheric Chemistry Department (ACD) are acknowledged for their support during the SML collection and sample preparation. Coretta Bauer from the UFZ Helmholtz Centre for Environmental Research is acknowledged for assisting with sample lyophilization.
Financial support. This research has been supported by the Marie Skłodowska-Curie Actions (MSCA) Research and Innovation Staff Exchange (RISE), and Horizon 2020 (H2020-MSCA-RISE-2015; grant no. 690958), which finances the European "MARSU" network. (MARSU represents the "MARine atmospheric Science Unravelled: analytical and mass spectrometric techniques development and application".) Funding was also provided by the Argentine National Mass Spectrometry System (SNEM), CONICET, Ministerio de Ciencia, Tecnología e Innovación (MINCyT; grant no. project E-