High levels of primary biogenic organic aerosols are driven by only a few plant-associated microbial taxa

Abstract. Primary biogenic organic aerosols (PBOAs) represent a
major fraction of coarse organic matter (OM) in air. Despite their
implication in many atmospheric processes and human health problems, we
surprisingly know little about PBOA characteristics (i.e., composition,
dominant sources, and contribution to airborne particles). In addition,
specific primary sugar compounds (SCs) are generally used as markers of PBOAs
associated with bacteria and fungi, but our knowledge of microbial
communities associated with atmospheric particulate matter (PM) remains
incomplete. This work aimed at providing a comprehensive understanding of
the microbial fingerprints associated with SCs in PM10 (particles
smaller than 10 µm) and their main sources in the surrounding
environment (soils and vegetation). An intensive study was conducted on
PM10 collected at a rural background site located in an agricultural area
in France. We combined high-throughput sequencing of bacteria and fungi with
detailed physicochemical characterizations of PM10, soil, and plant
samples and monitored meteorological and agricultural activities throughout
the sampling period. Results show that in summer SCs in PM10 are a
major contributor of OM in air, representing 0.8 % to 13.5 % of OM mass. SC
concentrations are clearly determined by the abundance of only a few
specific airborne fungal and bacterial taxa. The temporal fluctuations in the
abundance of only four predominant fungal genera, namely Cladosporium, Alternaria,
Sporobolomyces, and Dioszegia, reflect the
temporal dynamics in SC concentrations. Among bacterial taxa, the abundance
of only Massilia, Pseudomonas, Frigoribacterium, and Sphingomonas
is positively correlated with SC species. These microbes
are significantly enriched in leaf over soil samples. Interestingly, the
overall community structures of bacteria and fungi are similar within
PM10 and leaf samples and significantly distinct between PM10 and
soil samples, indicating that surrounding vegetation is the major source of
SC-associated microbial taxa in PM10 in this rural area of France.



Introduction
Airborne particulate matter (PM) is the subject of high scientific and political interest mainly because of its important effects on climate and public health (Boucher et al., 2013; Fuzzi et al., 2006). Numerous epidemiological studies have significantly related both acute and chronic exposures to ambient PM with respiratory impairments, heart diseases, asthma, and lung cancer, as well as increased risk of mortality (Kelly and Fussell, 2015; Lecours et al., 2017; Pope and Dockery, 2006). PM can also affect the climate directly or indirectly by absorbing and/or scattering both incoming and outgoing solar radiation (Boucher et al., 2013). These effects are modulated by highly variable physical characteristics (e.g., size, specific surface, concentrations) and the complex chemical composition of PM (Fuzzi et al., 2015). PM consists of a complex mixture of inorganic trace elements and carbonaceous matter (organic carbon and elemental carbon), with organic matter (OM) generally being the major but poorly characterized constituent of PM (Boucher et al., 2013; Bozzetti et al., 2016; Fortenberry et al., 2018). A quantitative understanding of OM sources is critically important to develop efficient guidelines for both air-quality control and abatement strategies. So far, considerable efforts have been undertaken to investigate OM associated with anthropogenic and secondary sources, but much less is known about emissions from primary biogenic sources (Bozzetti et al., 2016; China et al., 2018; Yan et al., 2019).
Published by Copernicus Publications on behalf of the European Geosciences Union.
Primary biogenic organic aerosols (PBOAs) are a subset of organic PM that are directly emitted by processes involving the biosphere (Boucher et al., 2013; Elbert et al., 2007). PBOAs typically refer to biologically derived materials, notably including living organisms (e.g., bacteria, fungal spores, protozoa, viruses), nonliving biomass (e.g., microbial fragments), and other types of biological materials like pollen or plant debris (Amato et al., 2017; Elbert et al., 2007). PBOAs are gaining increasing attention notably because of their ability to affect human health by causing infectious, toxic, and hypersensitivity diseases (Huffman et al., 2019). For instance, PBOA components, especially fungal spores and bacterial cells, have recently been shown to cause significant oxidative potential (Samaké et al., 2017). However, to date, the precise role of PBOA components and their interplay in disease mechanisms remain poorly understood (Coz et al., 2010; Hill et al., 2017). Specific PBOA components can also participate in many relevant atmospheric processes like cloud condensation and ice nucleation, thereby directly or indirectly affecting the Earth's hydrological cycle and radiative balance (Boucher et al., 2013; Hill et al., 2017). These diverse impacts are effective at a regional scale due to the transport of PBOAs (Dommergue et al., 2019; Yu et al., 2016). Moreover, PBOAs are a major component of OM found in particles less than 10 µm in aerodynamic diameter (PM10) (Bozzetti et al., 2016; Coz et al., 2010; Samaké et al., 2019b). For instance, Bozzetti et al. (2016) have shown that PBOAs equal the contribution of secondary organic aerosols (SOAs) to OM in PM10 collected at a rural background site in Switzerland during both the summer and winter periods. However, current estimates of global terrestrial PBOA emissions are very uncertain and range between 50 and 1000 Tg yr−1 (Boucher et al., 2013; Coz et al., 2010; Elbert et al., 2007), underlining the critical gap in the understanding of this significant OM fraction.
The recent application of fluorescence-based techniques, such as the ultraviolet aerodynamic particle sizer and the wideband integrated bioaerosol sensor (Bozzetti et al., 2016; Gosselin et al., 2016; Huffman and Santarpia, 2017; Huffman et al., 2019), and of scanning electron microscopy (Coz et al., 2010) has provided valuable information on the abundance of size-segregated ambient PBOAs. Atmospheric sources of PBOAs are numerous and include agricultural activities, leaf abrasion, and soil resuspension (Coz et al., 2010; Medeiros et al., 2006; Pietrogrande et al., 2014). To date, the detailed constituents of PBOAs, their predominant sources, and their atmospheric emission processes, as well as their contributions to total airborne particles, remain poorly documented and quantified (Bozzetti et al., 2016; Coz et al., 2010; Elbert et al., 2007). Such information would be important for investigating the properties and atmospheric impacts of PBOAs, as well as for a future optimization of source-resolved chemical transport models (CTMs), which are still generally unable to accurately simulate important OM fractions (Ciarelli et al., 2016; Heald et al., 2011; Kang et al., 2018).
Primary sugar compounds (SCs, defined as sugar alcohols and saccharides) are ubiquitous water-soluble compounds found in atmospheric PM (Gosselin et al., 2016;Medeiros et al., 2006;Pietrogrande et al., 2014;Jia et al., 2010b). SC species are emitted from biologically derived sources (Medeiros et al., 2006;Verma et al., 2018) and have sometimes been detected in aerosols taken from air masses influenced by smoke from biomass burning (Fu et al., 2012;Yang et al., 2012). However, recent studies conducted at several sites across France revealed a weak correlation between daily concentrations of SC and levoglucosan in PM 2.5 and PM 10 collected throughout the year (Golly et al., 2018;Samaké et al., 2019a). This suggests that the open burning of biomass is not a significant source of SCs in the environments studied here. In this context, specific SC species are still extensively viewed as powerful markers for tracking sources and estimating PBOA contributions to OM mass (Bauer et al., 2008;Gosselin et al., 2016;Jia et al., 2010b;Medeiros et al., 2006). For example, glucose is the most common monosaccharide in vascular plants, and it has been predominantly used as an indicator of plant material (such as pollen or plant debris) from several areas around the world (Jia et al., 2010b;Medeiros et al., 2006;Pietrogrande et al., 2014;Verma et al., 2018). Trehalose (a.k.a. mycose) is a common metabolite of various microorganisms, serving as an osmoprotectant accumulating in a cell's cytosol during harsh conditions (e.g., dehydration and heat) (Bougouffa et al., 2014). It has been proposed as a generic indicator of soil-borne microbiota (Jia et al., 2010b;Medeiros et al., 2006;Pietrogrande et al., 2014;Verma et al., 2018). Similarly, mannitol and arabitol are two very common sugar alcohols (also called polyols), serving as storage and transport solutes in fungi (Gosselin et al., 2016;Medeiros et al., 2006;Verma et al., 2018). 
Their atmospheric concentration levels have frequently been used to investigate fungal spore contributions to PBOA mass in different environments (urban, rural, coastal, and polar) around the world (Barbaro et al., 2015; Gosselin et al., 2016; Jia et al., 2010b; Verma et al., 2018; Weber et al., 2018).
Despite the relatively vast literature using the atmospheric concentration levels of SCs as potential suitable markers of PBOAs associated with bacteria and fungi, our understanding of associated airborne microbial communities (i.e., diversity and community composition) remains poor. This is due in particular to the lack of high-resolution (i.e., daily) data sets characterizing how well the variability of these microbial communities may be related to that of primary sugar species. Such information is of paramount importance to better understand the dominant atmospheric sources of SCs (and then PBOAs), as well as their relevant effective environmental drivers, which are still poorly documented (Bozzetti et al., 2016).
Our recent works discussed the size distribution features and the spatial and temporal variability in atmospheric particulate SC concentrations in France (Golly et al., 2018;Samaké et al., 2019a, b). As a continuation, in this study we present the first daily temporal-concurrent characterization of ambient SC species concentrations and both bacterial and fungal community compositions for PM 10 collected at a rural background site located in an intensive agricultural area. The aim of this study was to use a DNA metabarcoding approach (Taberlet et al., 2018) to investigate PM 10 -associated microbial communities, which can help answer the following research questions. (i) What are the microbial community structures associated with PM 10 ? (ii) Is the temporal dynamics of SC concentrations related to changes in the airborne microbial community composition? (iii) What are the predominant sources of SC-associated microbial communities at a continental rural field site? Since soil and vegetation are currently believed to be the dominant sources of airborne microorganisms in most continental areas (Bowers et al., 2011;Jia et al., 2010a;Rathnayake et al., 2016), our study focused on these two potential sources.

Site description
The Observatoire Pérenne de l'Environnement (OPE) is a continental rural background observatory located about 230 km east of Paris at an altitude of 392 m (Fig. 1). This French critical zone observatory (CZO) is part of a long-term multidisciplinary project monitoring the state of environmental variables, including among others fluxes, abiotic and biotic variables, and their functions and dynamics (http://ope.andra.fr/index.php?lang=en, last access: 10 December 2019). It is largely impacted by agricultural activities. It is also characterized by a low population density (fewer than 22 inhabitants per square kilometer within an area of 900 km2) with no industrial activities or major transport roads nearby. The air monitoring site itself lies in a "reference sector" of 240 km2 in the middle of a field crop area (tens of kilometers in all directions). This reference sector is composed of vast farmlands interspersed with wooded areas. The area is further defined by a homogeneous soil type with a predominantly superficial clay-limestone composition. The daily agricultural practices and meteorological data (including wind speed and direction, temperature, rainfall level, and relative humidity) within the reference sector are recorded and made available by ANDRA (Agence nationale pour la gestion des déchets radioactifs). The agricultural fields of the area are generally managed under a 3-year crop-rotation system. The major crops during the campaign period were pea and oilseed rape.

Sample collection
An intensive field campaign was conducted at this site for the present study. The aerosol sampling campaign lasted from 12 June to 21 August 2017, covering the summer period in France. During this period, ambient PM10 was collected daily (from 09:00 UTC to 09:00 UTC the next day) onto prebaked quartz fiber filters (Pall Tissuquartz 2500QAT-UP, Ø = 150 mm) using high-volume samplers (Aerosol Sampler DHA-80, DIGITEL; 24 h at 30 m3 h−1). After collection, all filter samples were wrapped in aluminum foil, sealed in zipper plastic bags, and stored at <4 °C until further analysis. More details on the preparation, storage, and handling of these filter samples can be found in Samaké et al. (2019b). A total of 69 samples and 6 field blanks were collected.
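As a minimal sketch of the arithmetic implied by the sampling setup above, the following Python snippet computes the air volume sampled per filter (24 h at 30 m3 h−1) and converts a mass measured on a filter into an atmospheric concentration. The function names, the blank-subtraction step, and the example masses are illustrative assumptions and are not taken from the study.

```python
# Sketch only: flow rate and duration come from the text; everything else is hypothetical.
FLOW_M3_PER_H = 30.0   # high-volume sampler flow rate (m3 per hour)
DURATION_H = 24.0      # daily sampling duration (hours)


def sampled_volume_m3(flow_m3_per_h=FLOW_M3_PER_H, duration_h=DURATION_H):
    """Air volume drawn through one daily filter, in m3."""
    return flow_m3_per_h * duration_h


def concentration_ng_per_m3(mass_on_filter_ng, blank_ng=0.0, volume_m3=None):
    """Atmospheric concentration from the mass found on the whole filter.

    Subtracting a field-blank mass is an assumption of this sketch; negative
    results are clipped to zero.
    """
    if volume_m3 is None:
        volume_m3 = sampled_volume_m3()
    return max(mass_on_filter_ng - blank_ng, 0.0) / volume_m3


if __name__ == "__main__":
    print(sampled_volume_m3())                       # volume per daily sample
    print(concentration_ng_per_m3(72000.0, 720.0))   # hypothetical masses in ng
```

With these settings each daily filter corresponds to 720 m3 of sampled air, so a measured mass in nanograms is divided by 720 to obtain ng m−3.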
Surface soil samples (0-5 cm depth, 15 cm×15 cm area) were simultaneously collected from two fields within the pea and oilseed rape growing areas. The fields are located in the immediate vicinity of the PM 10 sampler and under the prevailing wind directions (Fig. 1). To represent as closely as possible the local soil microbial communities, we randomly collected five subsamples (about 100 g per sampling unit) within each parcel and pooled them. Topsoil sampling took place on a weekly basis during the campaign period. After collection and homogenization, 15 g of each subsample was stored in airtight containers (sterile bottles, Schott, GL45, 100 mL) containing the same weight of sterile silica gel (around 15 g). Such a soil desiccation method is a straightforward approach to prevent any microbial growth and change in community over time at room temperature (Taberlet et al., 2018). A total of eight topsoil samples were collected for each parcel.
Finally, leaf samples were collected from the major types of vegetation within the reference sector. These include leaves of oilseed rape, pea, oak, maple, beech, and herbs (Fig. 1). A total of eight leaf samples were analyzed. These samples were also stored in airtight containers (sterile bottles, Schott, GL45, 100 mL) containing 15 g of silica gel. It should be noted that leaf samples were collected only once, 4 weeks after the end of PM and soil sampling, while the major crops were still on site.

Chemical analyses
Daily PM10 samples were analyzed for various chemical species using subsampled fractions of the collection filters and a large array of analytical methods. Detailed information on all the chemical-analysis procedures has been reported previously (Golly et al., 2018; Samaké et al., 2019b; Waked et al., 2014). Briefly, SCs (i.e., polyols and saccharides) and water-soluble ions (including Ca2+) were systematically analyzed in all samples, using high-performance liquid chromatography with pulsed amperometric detection (HPLC-PAD) and ionic chromatography (IC; Thermo Fisher ICS 3000, USA), respectively. Free-cellulose concentrations were determined using an optimized enzymatic hydrolysis (Samaké et al., 2019a) followed by analysis of the resulting glucose units by HPLC-PAD (Golly et al., 2018; Samaké et al., 2019b; Waked et al., 2014). Organic and elemental carbon (OC, EC) were analyzed using a Sunset thermal-optical instrument and the EUSAAR2 protocol (Cavalli et al., 2010). This analytical method requires a high temperature, which constrains the choice of sampling filter material to quartz. OM content in PM10 samples was then estimated using an OM-to-OC conversion factor of 1.8: OM = 1.8 × OC (Samaké et al., 2019a, b). This value of 1.8 for the OM/OC ratio was chosen on the basis of previous studies carried out in France (Samaké et al., 2019b, and references therein).

Biological analyses: DNA extraction in PM10 samples
Aerosol samples typically contain very low DNA concentrations, and the DNA-binding properties of the quartz fibers of aerosol collection filters make its extraction with traditional protocols challenging (Dommergue et al., 2019; Jiang et al., 2015; Luhung et al., 2015). In the present study, we were also constrained by the limited daily collection filter surface for the simultaneous chemical and microbiological analyses of the same filters.
To circumvent issues of low efficiency during genomic DNA extraction, several technical improvements have been made to optimize the extraction of high-quality DNA from PM 10 samples (Dommergue et al., 2019;Jiang et al., 2015;Luhung et al., 2015). This includes thermal water bath sonication helping the lysis process in thick cell walls (e.g., fungal spores and Gram-positive bacteria), which might not be effectively lysed solely by means of bead beating (Luhung et al., 2015). Some consecutive (2 d at maximum) quartz filter samples with low OM concentrations were also pooled when necessary. Detailed information regarding the resultant composite samples (labeled A1 to A36) is presented in Table S1 in the Supplement. Figure S1 in the Supplement presents the average concentration levels of SC species in each sample. The results clearly show that air samples can be categorized from low (background, from A1 to A4 and A21 to A36) to high (peak, from A5 to A20) PM 10 -SC concentration levels.
In terms of DNA extraction, one-quarter (about 38.5 cm2) of each filter sample was used. First, filter aliquots were aseptically inserted into individual 50 mL Falcon tubes filled with sterilized saturated phosphate buffer (Na2HPO4, NaH2PO4, 0.12 M; pH ≈ 8). PM10 was desorbed from the filter samples by gentle shaking for 10 min at 250 rpm. This pretreatment allows the separation of the collected particles from the quartz filters thanks to the high competing interaction between the saturated phosphate buffer and the charged biological materials (Jiang et al., 2015; Taberlet et al., 2018). After gentle vortex mixing, the resulting suspension was filtered with a polyethersulfone membrane disk filter (Supor® PES 47 mm 200, 0.2 µm, Pall, USA). We repeated this desorbing step three times to enhance the recovery of the biological material from the quartz filters. Each collection PES membrane was then shredded into small pieces and used for DNA extractions using the DNeasy PowerWater kit (Qiagen, Germantown, MD, USA). The standard protocol of the supplier was followed with only minor modifications: 30 min of thermal water bath sonication at 65 °C (EMAG, Emmi-60 HC, Germany; 50 % efficiency) and 5 min of bead beating before and after sonication were added. Finally, DNA was eluted in 50 µL of EB buffer. Such an optimized protocol has been recently shown to produce a 10-fold increase in DNA extraction efficiency (Dommergue et al., 2019; Luhung et al., 2015), thereby allowing high-throughput sequencing of air samples. Note that all the steps mentioned above were performed under laminar flow hoods and that materials (filter funnels, forceps, and scissors) were sterilized prior to use.
Atmos. Chem. Phys., 20, 5609-5628, 2020 www.atmos-chem-phys.net/20/5609/2020/

Biological analyses: DNA extraction from soil and leaf samples
The soil sample pretreatment and the extracellular DNA extraction were achieved following an optimized protocol proposed elsewhere (Taberlet et al., 2018). Briefly, this protocol involves thoroughly mixing and extracting 15 g of soil in 15 mL of sterile saturated phosphate buffer for 15 min. About 2 mL of the resulting extracts was centrifuged for 10 min at 10 000 g, and 500 µL of the resulting supernatant was used for DNA extraction using the NucleoSpin Soil Kit (Macherey-Nagel, Düren, Germany) following the manufacturer's original protocol after skipping the cell lysis step. Finally, DNA was eluted with 100 µL of SE buffer.
To extract DNA from either endophytic or epiphytic microorganisms, aliquots of leaf samples (about 25-30 mg) were extracted with the DNeasy Plant Mini Kit (QIAGEN, Germany) according to the supplier's instructions with the following minor modifications: after the resuspension of powdered samples in 400 µL of AP1 buffer, the samples were incubated for 45 min at 65 °C with RNase A. Finally, DNA was eluted with 100 µL of AE buffer.

Biological analyses: polymerase chain reaction amplification and sequencing
Bacterial and fungal community compositions were surveyed using, respectively, the Bact02 (Forward 5′-KGCCAGCMGCCGCGGTAA-3′ and Reverse 3′-GGACTACCMGGGTATCTAA-5′) and Fung02 (Forward 5′-GGAAGTAAAAGTCGTAACAAGG-3′ and Reverse 3′-CAAGAGATCCGTTGYTGAAAGTK-5′) published primer pairs (see Taberlet et al., 2018, for details on these primers). The primer pair Bact02 targets the V4 region of the bacterial 16S rDNA, while the Fung02 primer pair targets the nuclear ribosomal internal transcribed spacer region 1 (ITS1). Four independent PCR (polymerase chain reaction) replicates were carried out for each DNA extract. Eight-nucleotide tags were added to both primer ends to uniquely identify each sample, ensuring that each PCR replicate was labeled by a unique combination of forward and reverse tags. The tag sequences were created with the oligotag command within the open-source OBITools software suite (Boyer et al., 2016) so that all pairwise tag combinations were differentiated by at least five different base pairs (Taberlet et al., 2018).
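The tag-design constraint described above can be illustrated with a short check. The sketch below, written in Python, verifies that every pair of tags in a set differs at five or more positions (Hamming distance). It reads the text as describing tags of eight nucleotides that must differ pairwise by at least five bases; that reading, the function names, and the example tags are assumptions of this sketch, and it does not reproduce the oligotag command of OBITools.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length tags differ."""
    if len(a) != len(b):
        raise ValueError("tags must have the same length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags):
    """Smallest Hamming distance over all pairs of tags."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


def tags_are_valid(tags, length=8, min_distance=5):
    """True if all tags have the expected length and differ enough pairwise."""
    if any(len(t) != length for t in tags):
        return False
    return min_pairwise_distance(tags) >= min_distance


# Hypothetical tags, chosen only to exercise the check.
example_tags = ["AAAAAAAA", "CCCCCAAA", "AAACCCCC", "GGGGGGGG"]

if __name__ == "__main__":
    print(min_pairwise_distance(example_tags), tags_are_valid(example_tags))
```

A large minimum distance means that several sequencing errors would be needed to turn one tag into another, which is why such a constraint limits sample misassignment.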
DNA amplification was performed in a 20 µL total volume solution containing 10 µL of AmpliTaq Gold 360 Master Mix (Applied Biosystems, Foster City, CA, USA), 0.16 µL of 20 mg mL−1 bovine serum albumin (BSA; Roche Diagnostics, Basel, Switzerland), 0.2 µM of each primer, and 2 µL of diluted DNA extract. DNA extracts from soil and filters were diluted eight times, while DNA extracts from leaves were diluted four times. Amplifications were performed using the following thermocycling program: an initial activation of DNA polymerase for 10 min at 95 °C; x cycles of 30 s denaturation at 95 °C, 30 s annealing at 53 °C and 56 °C for bacteria and fungi, respectively, and 90 s elongation at 72 °C; and a final extension at 72 °C for 7 min. The number of cycles x was determined by qPCR and set at 40 for all markers and DNA extract types, except for the Bact02 amplification of soil and leaf samples (30 cycles) and the Fung02 amplification of filter samples (42 cycles). After amplification, about 10 % of amplification products were randomly selected and verified using a QIAxcel Advanced device (QIAGEN, Hilden, Germany) equipped with a high-resolution cartridge for separation.
After amplification, PCR products from the same marker were pooled in equal volumes and cleaned with the MinElute PCR purification kit (Qiagen, Hilden, Germany), following the manufacturer's instructions. The two pools were sent to Fasteris SA (Geneva, Switzerland; https://www.fasteris.com/dna/, last access: 10 December 2019) for library preparation and MiSeq Illumina 2 × 250 bp paired-end sequencing. The two sequencing libraries (one per marker) were prepared according to the PCR-free MetaFast protocol (https://www.fasteris.com/dna/?q=content/metafast-protocol-amplicon-metagenomic-analysis, last access: 10 May 2020), which aims at limiting the formation of chimeras.
To monitor any potential false positives inherent to tag jumps and contamination (Schnell et al., 2015), the sequencing experiment included both extraction and PCR negatives, as well as unused tag combinations.

Bioinformatic analyses of raw reads
The Illumina raw sequence reads were processed separately for each library using the OBITools software suite (Boyer et al., 2016), specifically dedicated to metabarcoding data processing. First, the raw paired-end reads were assembled using the illuminapairedend program, and the sequences with a low alignment score (fastq average quality score <40) were discarded. The aligned sequences were then assigned to the corresponding PCR replicates with the program ngsfilter by allowing zero and two mismatches on tags and primers, respectively. Strictly identical sequences were dereplicated using the program obiuniq, and a basic filtration step was performed with the obigrep program to select sequences within the expected length range (i.e., longer than 65 and 39 bp for fungi and bacteria, respectively, excluding tags and primers), without ambiguous nucleotides, and observed at least 10 times in at least one PCR replicate.
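The dereplication and basic filtering criteria listed above can be summarized in a few lines of Python. This is a sketch of the logic only, not of the OBITools programs themselves: the data layout (a dictionary of reads per PCR replicate) and all function names are hypothetical, while the length thresholds and the minimum count of 10 come from the text.

```python
from collections import Counter

# Thresholds quoted in the text: sequences must be longer than these lengths (bp).
MIN_LENGTH = {"fungi": 65, "bacteria": 39}
MIN_COUNT = 10  # observed at least 10 times in at least one PCR replicate


def dereplicate(reads_by_replicate):
    """Collapse strictly identical reads; keep a per-replicate count for each sequence."""
    counts = {}
    for replicate, reads in reads_by_replicate.items():
        for seq in reads:
            counts.setdefault(seq, Counter())[replicate] += 1
    return counts


def keep_sequence(seq, per_replicate_counts, marker):
    """Apply the three filters: length, no ambiguous bases, minimum abundance."""
    long_enough = len(seq) > MIN_LENGTH[marker]
    unambiguous = set(seq.upper()) <= set("ACGT")
    abundant = max(per_replicate_counts.values(), default=0) >= MIN_COUNT
    return long_enough and unambiguous and abundant


def filter_sequences(reads_by_replicate, marker):
    """Return {sequence: per-replicate counts} for sequences passing all filters."""
    counts = dereplicate(reads_by_replicate)
    return {s: c for s, c in counts.items() if keep_sequence(s, c, marker)}
```

For example, a 40 bp bacterial sequence seen 12 times in one replicate is kept, whereas a sequence containing an N, a 20 bp sequence, or a sequence seen only three times in every replicate is dropped.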
The remaining unique sequences were grouped and assigned to molecular operational taxonomic units (MOTUs) with a 97 % sequence identity using the Sumatra and Sumaclust programs (Mercier et al., 2013). The Sumatra algorithm computes pairwise similarities among sequences based on the length of the longest common subsequence, and the Sumaclust program uses these similarities to cluster the sequences (Mercier et al., 2013). The abundances of sequences belonging to the same cluster were summed, and the cluster center was defined as the MOTU representative of the cluster (Mercier et al., 2013).
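To make the clustering idea concrete, here is a simplified Python sketch of similarity based on the longest common subsequence (LCS) followed by greedy clustering at a 97 % threshold, with abundances summed per cluster. It is a stand-in for illustration and not the Sumatra or Sumaclust implementation: normalizing the LCS length by the longer sequence and processing sequences in decreasing order of abundance are assumptions of this sketch.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def similarity(a, b):
    """LCS length divided by the length of the longer sequence (assumed normalization)."""
    return lcs_length(a, b) / max(len(a), len(b))


def cluster(abundances, threshold=0.97):
    """Greedy clustering of {sequence: abundance}.

    Sequences are visited from most to least abundant. Each one joins the first
    existing center it matches at or above the threshold, otherwise it becomes
    a new center. Returns {center sequence: summed abundance}.
    """
    clusters = {}
    for seq in sorted(abundances, key=lambda s: (-abundances[s], s)):
        for center in clusters:
            if similarity(seq, center) >= threshold:
                clusters[center] += abundances[seq]
                break
        else:
            clusters[seq] = abundances[seq]
    return clusters
```

In this sketch a 100 bp sequence carrying a single substitution relative to an abundant center has a similarity of 0.99 and is merged into that center, while an unrelated sequence starts its own cluster.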
The taxonomic classification of each MOTU was performed using the ecotag program (Boyer et al., 2016), which uses full-length metabarcodes as references. The ecoPCR program (Ficetola et al., 2010) was used to build the metabarcode reference database for each marker. Briefly, ecoPCR performs an in silico amplification within the EMBL public database (release 133), using the Fung02 and Bact02 primer pairs and allowing a maximum of three mismatches per primer. The resultant reference database was further refined by keeping only sequence records assigned at the species, genus, and family levels.
After taxonomic assignment, the data sets were further processed with the open-source R software (RStudio interface, version 3.4.1) to filter out chimeras, potential contaminants, and failed PCR replicates. More specifically, MOTUs that were highly dissimilar to any reference sequence (sequence identity <0.95) were considered chimeras and discarded. Secondly, MOTUs whose abundance was higher in extraction and PCR negatives were also excluded. Finally, PCR replicates that were unusually distant from the barycenter of the four PCR replicates corresponding to the same sample were considered dysfunctional and discarded. The remaining PCR replicates were summed up per sample.
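The three cleaning rules above can be sketched as follows in Python. The record layout, the function names, and the use of Euclidean distance to the mean profile are assumptions made for illustration; the text gives the identity threshold (0.95) but does not specify how abundances in negatives were compared or which cutoff was applied to the replicate distances, so no cutoff is hard-coded here.

```python
import math

CHIMERA_IDENTITY = 0.95  # identity threshold quoted in the text


def filter_motus(motus):
    """Keep MOTU ids that pass the chimera and contaminant rules.

    Each MOTU is a dict with hypothetical keys: 'id', 'best_identity',
    'reads_in_samples', and 'reads_in_negatives'.
    """
    kept = []
    for m in motus:
        if m["best_identity"] < CHIMERA_IDENTITY:
            continue  # too dissimilar to any reference: treated as a chimera
        if m["reads_in_negatives"] > m["reads_in_samples"]:
            continue  # more abundant in negative controls: treated as a contaminant
        kept.append(m["id"])
    return kept


def barycenter_distances(replicates):
    """Euclidean distance of each PCR replicate profile to the mean profile.

    'replicates' is a list of equal-length lists of relative abundances.
    Choosing the cutoff that flags a replicate as dysfunctional is left to the user.
    """
    n = len(replicates)
    center = [sum(column) / n for column in zip(*replicates)]
    return [math.dist(profile, center) for profile in replicates]
```

A replicate whose distance stands well apart from those of the other replicates of the same sample would be the candidate for removal before the remaining replicates are summed.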

Data analysis
Unless specified otherwise, all exploratory statistical analyses were performed with R. Rarefaction and extrapolation curves were obtained with the iNEXT 2.0-12 package (Hsieh et al., 2016) to investigate the gain in species richness as we increased the sequencing depth for each sample. Alpha diversity estimators including Shannon and Chao1 were calculated with the phyloseq 1.22-3 package (McMurdie and Holmes, 2013) on data rarefied to the same sequencing depth per sample type (see Table S2 for details on the rarefaction depths). Nonmetric multidimensional scaling (NMDS) ordination analysis was performed to decipher the temporal patterns in airborne microbial community structures (phylum or class taxonomic group) in air samples. These analyses were performed with the metaMDS function within the vegan package (Oksanen et al., 2019) with the number of random starts set to 500. The NMDS ordinations were obtained using pairwise dissimilarity matrices based on the Bray-Curtis index. The envfit function implemented in vegan was used to identify the airborne microbial communities that could explain the temporal dynamics of ambient SC species concentrations. Pairwise analysis of similarity (ANOSIM) was performed to assess similarity between groups of PM10 aerosol samples. This was achieved using the anosim function of vegan (Oksanen et al., 2019) with the number of permutations set to 999. Spearman's rank correlation analysis was used to investigate further the relationship between airborne microbial communities and SC species.
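For readers unfamiliar with the alpha diversity estimators named above, the following Python sketch shows random rarefaction to a common depth, the Shannon index (natural logarithm), and the bias-corrected form of the Chao1 richness estimator. It illustrates the standard formulas and is not the phyloseq or iNEXT code used in the study; the function names and the seeded random generator are assumptions of this sketch.

```python
import math
import random


def rarefy(counts, depth, seed=0):
    """Randomly subsample 'depth' reads without replacement from {taxon: reads}."""
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    rarefied = {}
    for taxon in rng.sample(pool, depth):
        rarefied[taxon] = rarefied.get(taxon, 0) + 1
    return rarefied


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values() if n > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + f1 * (f1 - 1) / (2 * (f2 + 1)).

    f1 and f2 are the numbers of taxa observed exactly once and exactly twice.
    """
    observed = sum(1 for n in counts.values() if n > 0)
    f1 = sum(1 for n in counts.values() if n == 1)
    f2 = sum(1 for n in counts.values() if n == 2)
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))
```

Rarefying all samples of one type to the same depth before computing these indices keeps differences in sequencing effort from being mistaken for differences in diversity.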
To gain further insight into the dominant source of SC-associated microbial communities, NMDS analysis based on Horn distance was performed to compare the microbial community composition similarities between PM10 aerosol, soil, and leaf samples.
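The two dissimilarity indices named in this section, Bray-Curtis for the PM10 ordinations and Horn for the comparison between PM10, soil, and leaf samples, can be written out as below. This Python sketch uses the common textbook definitions (with the Horn index taken in its Morisita-Horn form); it is offered as an illustration under that assumption and is not the vegan implementation.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two {taxon: abundance} profiles."""
    taxa = set(x) | set(y)
    numerator = sum(abs(x.get(t, 0) - y.get(t, 0)) for t in taxa)
    denominator = sum(x.values()) + sum(y.values())
    return numerator / denominator


def horn(x, y):
    """Morisita-Horn dissimilarity between two {taxon: abundance} profiles.

    d = 1 - 2 * sum(x_i * y_i) / ((lambda_x + lambda_y) * X * Y),
    where X and Y are the total abundances and lambda = sum(x_i^2) / X^2.
    """
    taxa = set(x) | set(y)
    total_x, total_y = sum(x.values()), sum(y.values())
    lambda_x = sum(v * v for v in x.values()) / (total_x * total_x)
    lambda_y = sum(v * v for v in y.values()) / (total_y * total_y)
    cross = sum(x.get(t, 0) * y.get(t, 0) for t in taxa)
    return 1 - 2 * cross / ((lambda_x + lambda_y) * total_x * total_y)
```

Both indices equal 0 for identical profiles and 1 for profiles sharing no taxa. The Horn form weights abundant taxa more heavily, so it is less sensitive to rare taxa and to differences in sequencing depth between sample types.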

Primary sugar compounds and relative contributions to OM mass
Temporal dynamics of daily PM10 carbonaceous components (e.g., primary sugar compounds, cellulose, and OM) are presented in Fig. 2. Nine SCs, including seven polyols and two saccharide compounds, have been quantified in all ambient PM10 samples collected at the study site. Ambient SC concentration levels peaked on 8 August 2017, in excellent agreement with the daily harvest activities around the study site (Fig. 2a). The average concentration (average ± SD) of total SCs during the campaign is 259.8 ± 253.8 ng m−3, with a range of 26.6 to 1679.5 ng m−3, contributing on average 5.7 ± 3.2 % of total OM mass in PM10, with a range of 0.8 %-13.5 % (Fig. 2b). The total measured polyols present an average concentration of 26.3 ± 54.4 ng m−3. Among all the measured polyols, arabitol (67.4 ± 83.1 ng m−3) and mannitol (68.1 ± 75.3 ng m−3) are the predominant species, followed by lesser amounts of sorbitol (10.9 ± 7.6 ng m−3), erythritol (7.0 ± 8.8 ng m−3), inositol (2.3 ± 2.0 ng m−3), and xylitol (2.3 ± 3.0 ng m−3). Glycerol was also observed in our samples but with concentrations frequently below the quantification limit. The average concentration of saccharide compounds is 51.2 ± 45.0 ng m−3. Trehalose (55.8 ± 51.9 ng m−3) is the most abundant saccharide species, followed by glucose (46.9 ± 37.1 ng m−3). The average concentration of calcium is 251.1 ± 248.4 ng m−3. A Spearman's rank correlation analysis based on the daily dynamics was used to examine the relationships between SC species. As shown in Table 1, sorbitol and inositol are well correlated (R = 0.57, p<0.001). Both sorbitol (R = 0.59, p<0.001) and inositol (R = 0.64, p<0.001) are also significantly correlated with Ca2+. It can also be noted that all other SC species are highly correlated with each other (p<0.001) and that they are weakly correlated with the temporal dynamics of sorbitol and inositol (Table 1).
Figure caption (partial): Contribution of SCs to organic matter mass. Results for 9-week daily measurements indicate that SCs together represent a large fraction of OM, contributing between 0.8 % and 13.5 % of OM mass in summer. Glycerol is not presented because its concentration was generally below the quantification limit.
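The contribution of SCs to OM mass follows from the conversion OM = 1.8 × OC given in the chemical analyses section. The Python sketch below shows this calculation under the assumption that OC is expressed in µg m−3 and SCs in ng m−3; the unit choice, the function names, and the example OC value are illustrative and are not taken from the study.

```python
OM_TO_OC = 1.8  # OM-to-OC conversion factor stated in the text


def om_from_oc(oc_ug_m3):
    """Organic matter concentration (µg m-3) estimated from organic carbon (µg m-3)."""
    return OM_TO_OC * oc_ug_m3


def sc_share_of_om_percent(sc_ng_m3, oc_ug_m3):
    """Share of OM mass carried by sugar compounds, in percent.

    SCs are given in ng m-3 and OC in µg m-3, so OM is converted to ng m-3 first.
    """
    om_ng_m3 = om_from_oc(oc_ug_m3) * 1000.0
    return 100.0 * sc_ng_m3 / om_ng_m3


if __name__ == "__main__":
    # Hypothetical day: 260 ng m-3 of total SCs and 2.5 µg m-3 of OC.
    print(round(sc_share_of_om_percent(260.0, 2.5), 1))
```

Applying this per day and then averaging gives the kind of mean and range reported for the campaign.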

Microbial characterization of samples, richness and diversity
The structures of bacterial and fungal communities were generated for the 62 collected samples, consisting of 36 aerosol, 18 surface soil, and 8 leaf samples. After paired-end assembly of sequence reads, sample assignment, filtering based on sequence length and quality, and discarding rare sequences, we are left with 2 575 857 and 1 647 000 reads for fungi and bacteria, respectively, corresponding to 4762 and 5852 unique sequences. After clustering the high-quality sequences and removing potential contaminants and chimeras, the final data sets (all samples pooled) consist of 597 and 944 MOTUs for fungi and bacteria, respectively, with 1 959 549 and 901 539 reads. The average number of reads (average ± SE) per sample is 31 607 ± 2072 and 14 563 ± 1221 for fungi and bacteria, respectively. The rarefaction curves of MOTU diversity showed common logarithmic shapes approaching a plateau in all cases (Fig. S2). This indicates an overall sufficient sequencing depth to capture the diversity of sequences occurring in the different types of samples. To compare the microbial community diversity and species richness, data normalization was performed by randomly selecting from  (Fig. S3a). In contrast, PM10 and soil samples showed higher values of the Shannon index (p<0.05), indicating a higher fungal diversity in these ecosystems. The soil harbors higher bacterial richness and diversity than PM10 (p<0.05), which in turn harbors greater richness and diversity compared to leaf samples (p<0.05) (Fig. S3b).

Relationship between airborne microbial community abundances and PM 10 SC species
The NMDS (nonmetric multidimensional scaling) ordination exploring the temporal dynamics of microbial community beta diversity among all PM10 aerosol samples revealed significant temporal shifts in community structure for both fungi and bacteria (Fig. 4). An NMDS (two dimensions, stress = 0.15) based on fungal class-level compositions (Fig. 4a) results in three distinct clusters of PM10 samples. With one exception (A23), all air samples with higher SC concentration levels (A5 to A20, see Table S2 and Fig. S1) are clustered together and are distinct from those with background levels of atmospheric SC concentrations. This pattern is further confirmed with the analysis of similarity, which shows a significant separation of clusters of samples (ANOSIM, R = 0.31, p<0.01). As evidenced in Fig. 4a, this difference is […]
The circle from the inner to the outer layer represents classification from kingdom to order successively. Further details on fungal and bacterial taxa at genus level are provided in Fig. S4. The node size represents the average relative abundance of taxa. Only nodes with relative abundance ≥ 1 are highlighted in bold.
Given the distinct clustering patterns of airborne PM10 microbial beta diversity structures according to SC concentration levels, a Spearman's rank correlation analysis has been performed to further examine the relationship between individual SC profiles and airborne microbial community abundance at phylum and class levels. This analysis reveals that for class-level fungi, the abundances of Dothideomycetes, Tremellomycetes, and Microbotryomycetes are highly positively correlated (p<0.05) to the temporal evolutions of the individual SC species concentration levels (Fig. S5a). Likewise, ambient SC species concentration levels are significantly correlated (p<0.05) to the Proteobacteria phylum (Fig. S5b). To gain further insight into the airborne microbial fingerprints associated with ambient SC species, correlation analyses were also performed at a finer taxonomic level. These analyses show that the temporal dynamics of SC species correlate best (p<0.05) with the Cladosporium, Alternaria, Sporobolomyces, and Dioszegia fungal genera (Fig. 5a). Similarly, the time series of SC species are positively correlated (p<0.05) with Massilia, Pseudomonas, and Frigoribacterium and, to a lesser degree (nonsignificant), with the Sphingomonas bacterial genus (Fig. 5b).

Sources of airborne microbial communities at the study site
As shown in Fig. 6, the airborne microbial genera most positively correlated with SC species are also distributed in the surrounding environmental samples of surface soils and leaves. In addition, microbial taxa of PM10 associated with SC species are generally more abundant in the leaf than in the topsoil samples (Fig. 6). In order to further explore and visualize the similarity of species compositions across local environment types, we conducted an NMDS ordination analysis (Fig. 7). As evidenced in Fig. 7, the beta diversities of fungal and bacterial MOTUs are more similar within the same habitat (PM10, plant, or soil) and are grouped across habitats as expected. Interestingly, the beta diversities of fungal and bacterial MOTUs in leaf samples and those in airborne PM10 are generally not readily distinguishable, with similarity becoming more prominent during atmospheric peaks of SC concentration levels (Fig. 7). However, the overall beta diversities in airborne PM10 and in leaf samples are significantly different from those from topsoil samples (ANOSIM, R = 0.89 and 0.80 and p<0.01 for fungal and bacterial communities, respectively), without any overlap regardless of whether or not harvesting activities are performed around the sampling site.
This observation is also confirmed by an unsupervised hierarchical cluster analysis, which reveals a pattern similar to that observed in the NMDS ordination, where taxa from leaf samples and airborne PM10 are clustered together regardless of whether ambient concentration levels of SC peaked or not. They are clustered separately from those of topsoil samples (Fig. S7).
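To make the ANOSIM R statistic reported above more concrete, the sketch below computes it from a small abundance table. It is an assumption-laden illustration, not the authors' analysis: the Bray-Curtis dissimilarity is assumed as the distance measure (the text does not state which one was used here), the function names and the four-sample table are hypothetical, and the permutation test that yields the p value is omitted.

```python
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def anosim_r(samples, groups):
    """ANOSIM R = (mean between-group rank - mean within-group rank) / (n(n-1)/4)."""
    pairs = list(combinations(range(len(samples)), 2))
    ranks = average_ranks([bray_curtis(samples[i], samples[j]) for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    n = len(samples)
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


# Hypothetical MOTU counts (rows: samples, columns: MOTUs), for illustration only.
table = [[10, 0, 0], [9, 1, 0], [0, 0, 10], [0, 1, 9]]
habitat = ["leaf", "leaf", "soil", "soil"]
r_stat = anosim_r(table, habitat)
```

In this toy table every between-habitat dissimilarity exceeds every within-habitat one, so R reaches its maximum of 1; values near 0 indicate no separation between groups.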

Discussion
Very few studies exist on the interactions between the air microbiome and PM chemical profiles (Cao et al., 2014; Elbert et al., 2007). In this study, we used a comprehensive multidisciplinary approach to produce for the first time airborne microbial fingerprints associated with SC species in PM10 and to identify the dominant sources of SCs in an extensively cultivated, continental rural area.

SCs as a major source of organic matter in PM 10
SC species have recently been reported to be ubiquitous in PM10 collected in several areas in France (Golly et al., 2018; Samaké et al., 2019b). In this study, the total SCs presented an average concentration of 259.8 ± 253.8 ng m−3 with a range of 26.6 to 1679.5 ng m−3 in all air samples. These concentration values are on average 5 times higher than those typically observed in urban areas in France (average values during summer 48.5 ± 43.6 ng m−3) (Golly et al., 2018; Samaké et al., 2019a, b). However, these concentration levels are in agreement with a previous study conducted in a similar environment, i.e., continental rural sites located in large crop fields (Yan et al., 2019). The total concentrations of SCs quantified in atmospheric PM10 over our study site accounted for 0.8 % to 13.5 % of the daily OM mass. This is remarkable considering that less than 20 % of total particulate OM mass can generally be identified at the molecular level. Hence, our results for a 9-week-long period indicate that SCs could be a major identified molecular fraction of OM for agricultural areas during summer, which is in agreement with several previous studies conducted worldwide (Jia et al., 2010b; Verma et al., 2018; Yan et al., 2019). Further, it has been shown (Samaké et al., 2019a) that the identified polyols probably represent only a small fraction of the emission flux from this PBOA source and that a large fraction of the co-emitted organic material remains unknown. Hence, the PBOA source can potentially represent, for part of the year, a major source of atmospheric OM unaccounted for in CTMs.
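The two quantities compared above, the rural-to-urban concentration ratio and the SC share of OM mass, are simple ratios, as the short sketch below shows. The mean concentrations are those quoted in the text; the function name and the single-day SC and OM values are hypothetical and serve only to illustrate the percentage calculation.

```python
def sc_fraction_percent(sc_ng_m3, om_ng_m3):
    """Share of OM mass carried by sugar compounds, in percent."""
    return 100.0 * sc_ng_m3 / om_ng_m3


# Mean total SC concentrations quoted in the text (ng m-3).
rural_mean = 259.8
urban_summer_mean = 48.5
fold_increase = rural_mean / urban_summer_mean  # roughly 5 times higher

# Hypothetical single-day values (ng m-3), for illustration only.
example_share = sc_fraction_percent(sc_ng_m3=250.0, om_ng_m3=5000.0)
```

Both concentrations must be expressed in the same unit before the share is computed; OM is often reported in µg m−3 while SCs are reported in ng m−3.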

Composition of airborne fungal and bacterial communities
In this study, 597 (39-132 MOTUs per sample) and 944 (31-129 MOTUs per sample) MOTUs were obtained for the fungal and bacterial libraries, respectively, reflecting the high richness of airborne microbial communities associated with ambient PM10 in this rural agricultural zone in France. Airborne fungi were dominated by Ascomycota (AMC) followed by Basidiomycota (BMC) phyla, consistent with the natural feature of many Ascomycota, whose single-celled or hyphal forms are small enough to be rapidly aerosolized, in contrast to many Basidiomycota that are typically too large to be easily aerosolized (Moore et al., 2011; Womack et al., 2015). Many members of AMC and BMC are known to actively eject ascospores and basidiospores, as well as aqueous jets and droplets containing a mixture of carbohydrates and inorganic solutes, into the atmosphere (Elbert et al., 2007; Womack et al., 2015). The prevalence of Ascomycota and Basidiomycota is consistent with results from previous studies also indicating that the Dikarya subkingdom (Ascomycota and Basidiomycota) represents about 98 % of known species in the biological kingdom of Eumycota (i.e., fungi) in the atmosphere (Elbert et al., 2007; James et al., 2006; Womack et al., 2015; Xu et al., 2017). Airborne bacteria in this study belonged mainly to the Proteobacteria, Bacteroidetes, Actinobacteria, and Firmicutes phyla, consistent with previous studies (Liu et al., 2019; Maron et al., 2005; Wei et al., 2019b). Gram-negative Proteobacteria constitute a major taxonomic group among prokaryotes (Itävaara et al., 2016; Yadav et al., 2018), including bacterial taxa which are very diverse, important in agriculture, and capable of fixing nitrogen in symbiosis with plants (Itävaara et al., 2016; Yadav et al., 2018).
Proteobacteria can survive under conditions with very low nutrient content, which explains their atmospheric versatility (Itävaara et al., 2016; Yadav et al., 2018). These results are similar to those observed in previous studies conducted in different environments around the world, where Proteobacteria, Actinobacteria, and Firmicutes have also been reported as dominant bacterial phyla (Liu et al., 2019; Maron et al., 2005; Wei et al., 2019a). In particular, the most frequent Gram-negative (Proteobacteria and Bacteroidetes) and Gram-positive (Actinobacteria and Firmicutes) bacteria and filamentous fungi (Ascomycota and Basidiomycota) have been previously linked to raw straw handling activities. For instance, it has been suggested that straw combustion during agricultural activities could be a major source of airborne microorganisms in PM2.5 in the northern plains of China (Wei et al., 2019a, b). However, in our study, SC species are not correlated (R = −0.09, p = 0.46; Fig. S7) with levoglucosan during the campaign period, indicating that biomass burning is not an important source of airborne microbial taxa associated with SCs in our PM10 series. Bubble bursting associated with sea spray could also potentially be a source of bacteria, fungi, and water-soluble organic species, along with sea salts, to PM10 (Prather et al., 2013; Zhu et al., 2015). However, SC species were not found to be significantly related to Cl− (R = −0.14, p = 0.28) or Na+ (R = −0.18, p = 0.16), which are two inorganic tracers typical of marine sources, nor did they correlate with methanesulfonic acid (R = −0.05, p = 0.69), a well-known tracer of biogenic marine activity (Arndt et al., 2017; Gaston et al., 2010). It therefore seems unlikely that marine sources of SCs were significant at this site. This point is further discussed in Sect. 4.4.

Atmospheric concentration levels of SC species in PM10 are associated with the abundance of a few specific airborne taxa of fungi and bacteria
SCs are widely produced in large quantities by many microorganisms to cope with environmental stress conditions (Medeiros et al., 2006). SC species are known to accumulate in high concentrations in microorganisms at low water availability to reduce intracellular water activity and prevent enzyme inhibition due to dehydration (Hrynkiewicz et al., 2010). In addition, the temporal dynamics of ambient polyol concentrations has been suggested as an indicator to follow the general seasonal trend in airborne fungal spore counts (Bauer et al., 2008; Gosselin et al., 2016). Although this strategy has allowed the introduction of conversion ratios between specific polyol species (i.e., arabitol and mannitol) and airborne fungal spores in general (Bauer et al., 2008), the structure of the airborne microbial community associated with SC species has not yet been studied. Our results provide culture-independent evidence that the airborne microbiome structure and the combined bacterial and fungal communities largely determine the SC species concentration levels in PM10. Temporal fluctuations in the abundance of only a few specific fungal and bacterial genera reflect the temporal dynamics of ambient SC concentrations. For fungi, genera that show a significant positive correlation (p<0.05) with SC species include Cladosporium, Alternaria, Sporobolomyces, and Dioszegia. Cladosporium and Alternaria are fungal genera that contribute on average to 47.9 % of total fungal sequence reads in our air sample series. These are asexual fungal genera that produce spores by dry-discharge mechanisms, wherein spores are detached from their parent colonies and easily dispersed by the ambient airflow or other external forces (e.g., raindrops, elevated temperature), as opposed to actively discharged spores with liquid jets or droplets in the air (Elbert et al., 2007; Wei et al., 2019b; Womack et al., 2015).
Our results are consistent with the well-known seasonal behavior of airborne fungal spores: levels of Cladosporium and Alternaria have been shown to reach their maximum from early to midsummer in a rural agricultural area of Portugal (Oliveira et al., 2009).
Similarly, bacterial genera positively correlated with SC species are Massilia, Pseudomonas, Frigoribacterium, and Sphingomonas. Although it is the prevalent bacterial genus at the study site, Sphingomonas is not significantly positively correlated with SC species. The genus Sphingomonas is known to include numerous metabolically versatile species capable of using carbon compounds usually present in the atmosphere (Cáliz et al., 2018). The atmospheric abundance of species affiliated with Massilia has already been linked to the change in the stage of plant development (Ofek et al., 2012), which can be attributed to the capacity of Massilia to promote plant growth through the production of indoleacetic acid (Kuffner et al., 2010) and siderophores (Hrynkiewicz et al., 2010), which makes it antagonistic towards Phytophthora infestans (Weinert et al., 2010).
As far as we know, this is the first study evaluating microbial fingerprints associated with SC species in atmospheric PM; hence, it is not possible to compare our correlation results with those of previous works. However, it has already been suggested that the types and quantities of SC species produced by fungi under culture conditions are specific to microbial species and to external conditions such as carbon source, drought, and heat (Hrynkiewicz et al., 2010). In future studies, we intend to apply a culture-dependent method to directly characterize the SC contents of some species amongst the dominant microbial taxa identified in this study after growth in several laboratory chambers reproducing controlled environmental conditions in terms of temperature, water vapor, and carbon sources.

Local vegetation as a major source of airborne microbial taxa of PM 10 associated with SC species
There are still many challenging questions on the emission processes leading to fungi and bacteria being introduced into the atmosphere together with their chemical components. In particular, the potential influence of soil and vegetation and their respective roles in structuring airborne microbial communities are still debated (Lymperopoulou et al., 2016; Rathnayake et al., 2016; Womack et al., 2015). This knowledge is essential for the precise modeling of PBOA emission processes to the atmosphere within chemical transport models. The characterization of the temporal dynamics of SC species concentrations could provide important information on PBOA sources in terms of composition, environmental drivers, and impacts. The results obtained over a 9-week period of daily PM10 SC measurements clearly show that the temporal dynamics of sorbitol (R = 0.59, p<0.001) and inositol (R = 0.64, p<0.001) are well-correlated linearly with that of calcium, a typical inorganic water-soluble ion from crustal material. This indicates a common atmospheric origin for these chemicals. Sorbitol and inositol are well-known reduced sugars that serve as a carbon source for microorganisms when other carbon sources are limited (Ng et al., 2018; Xue et al., 2010). In microorganisms, sorbitol and inositol are mainly produced by the reduction of intracellular glucose by aldose reductase in the cytoplasm (Ng et al., 2018; Welsh, 2000; Xue et al., 2010). Moreover, significant concentrations of both sorbitol and inositol have already been measured in surface-soil samples from five cultivated fields in the San Joaquin Valley, USA (Jia et al., 2010b; Medeiros et al., 2006). Therefore, sorbitol and inositol are most likely associated with microorganisms from soil resuspension.
With the exception of sorbitol and inositol, all other SC species measured in air samples at our sampling site are strongly correlated with each other, indicating a common origin. Daily calcium concentration peaks are not systematically associated with those of these other SC species. Interestingly, the highest atmospheric levels of these SC species occurred on 8 August 2017, coinciding well with daily harvesting activities around the site. This is also consistent with the multi-year monitoring of the dominant SCs in PM10 at this site, where ambient SCs showed a clear seasonal trend with higher values recorded in early August and in good agreement with harvesting activities around the study area every year from 2012 to 2017 (Samaké et al., 2019a). This suggests that the processes responsible for the dynamics of atmospheric concentrations of SCs are replicated annually and are most likely effective over large areas of field crops (Golly et al., 2018; Samaké et al., 2019a). Interestingly, glucose, the most common monosaccharide present in vascular plants and microorganisms, has already been proposed as a molecular indicator of biota emitted into the atmosphere by vascular plants and/or by the resuspension of soil from agricultural land (Jia et al., 2010b; Pietrogrande et al., 2014). Therefore, all other SC species measured in our series can be considered to be most likely the result of the mechanical resuspension of crop residues (e.g., leaf debris) and microorganisms attached to them. This interpretation is further supported by the excellent daily covariations observed in PM10 between SC species levels and ambient cellulose, which is widely considered a reliable indicator of plant debris sources in PM studies (Bozzetti et al., 2016; Hiranuma et al., 2019).
Microbial abundance and community structure in samples from the surrounding environment can provide further useful information on source apportionment and importance. Our data indicate that the airborne microbial genera most positively correlated to SC species are also distributed in the surrounding environmental samples of both surface soils and leaves, suggesting a dominant influence of the local environment for microbial taxa associated with SC species as opposed to long-range transport. This observation makes sense since actively discharged ascospores and basidiospores are generally relatively large airborne particles with a short atmospheric residence time (Elbert et al., 2007;Womack et al., 2015), limiting the possibilities of long-range dissemination. Accordingly, the majority of previous studies investigating the potential sources of air microbes identified local surface environments (e.g., leaves, soils) as having more important effects on the airborne microbiome structure in field crop areas (Bowers et al., 2011;Wei et al., 2019b;Womack et al., 2015). This is all the more the case in our study with homogeneous crop activities for tens to hundreds of kilometers around the site.
In the present study, microbial diversity and richness observed in the surface soils are generally higher than those on leaf surfaces. Microbial taxa which are most positively correlated with PM10 SC species are generally more abundant in leaf than in topsoil samples. These results were unexpected and show the possible importance of leaf surfaces in structuring the airborne taxa associated with SC species. Considering the general grouping of leaf samples and airborne PM10 regardless of harvesting activities around the study site, in addition to the separate assemblies of rarefied MOTUs in airborne PM10 and topsoil samples, it can be argued that the aerial parts of plants are the major source of microbial taxa associated with SC species. Such an observation is most likely related to increased vegetative surfaces (e.g., leaves) in summer, which provide sufficient nutrient resources for microbial growth. In previous studies, Alternaria and Epicoccum, which made up 30 % of total fungal sequence reads in all air samples in this study, have been shown to be common saprobes or weak pathogens of leaf surfaces (Andersen et al., 2009). Similarly, Cladosporium, which accounted for 32.9 % of total fungal genera in all air samples, has also been shown to be a common saprotrophic fungus, inhabiting decayed tree or plant debris (Wei et al., 2019b). The high relative abundance of Sphingomonas and Massilia, accounting for 28.4 % of total bacterial genera in all air samples, is also noticeable. These two phyllosphere-inhabiting bacterial genera are well-known for their plant protective potential against phytopathogens (Aydogan et al., 2018; Rastogi et al., 2013).
Altogether, these observations support our interpretation that leaves are the major direct source of airborne fungi and bacteria during the summer months at this site of high agricultural activity. Endophytes and epiphytes can be dispersed in the air and transported vertically as particles by air currents, much faster and more widely than by other mechanisms, such as direct dissemination from surface soil, which is generally controlled by soil moisture (Jocteur Monrozier et al., 1993). The most wind-dispersible soil constituents are indeed the smallest soil particles (i.e., clay-size fraction), which contain the largest number of microorganisms (Jocteur Monrozier et al., 1993) and can only be released into the atmosphere under conditions of prolonged drought. This interpretation is also consistent with previous studies (Bowers et al., 2011;Liu et al., 2019;Lymperopoulou et al., 2016;Mhuireach et al., 2016), which also show the extent to which endophytes and epiphytes can serve as quantitatively important sources of airborne microbes during summertime when vegetation density is highest. For example, Lymperopoulou et al. (2016) observed that bacteria and fungi suspended in the air are generally 2 to more than 10 times more abundant in air that passed over 50 m of a vegetative surface than in air that is immediately upwind of the same vegetative surface. However, the relative abundance of taxa associated with SCs in surface soils in this study could also be indicative of a feedback loop, in which the soil may serve as a source of microbial endophytes and epiphytes for plants, while the local vegetation in turn may serve as a source and sink of microbes for local soils during leaf senescence.

Conclusions
Primary biogenic organic aerosols (PBOAs) affect human health, climate, agriculture, etc. However, the details of microbial communities associated with the temporal and spatial variations in atmospheric concentrations of SCs, which are tracers of PBOAs, remain unknown. The present study aimed at identifying the airborne fungi and bacteria associated with SC species in PM10 and their major sources in the surrounding environment (soils and vegetation). To that end, we combined high-throughput sequencing of bacteria and fungi with detailed physicochemical characterizations of PM10, soil, and leaf samples collected at a continental rural background site located in a large agricultural area in France.
The main results demonstrate that the identified SC species are a major contributor of OM in summer, accounting together for 0.8 % to 13.5 % of OM mass in air. The atmospheric concentration peaks of SC coincide with the daily harvest activities around the sampling site, pointing towards direct resuspension of biological materials, i.e., crop residues and associated microbiota, as an important source of SC in our PM10 series. Furthermore, we have also discovered that the temporal evolutions of SC in PM10 are associated with the abundance of only a few specific airborne fungal and bacterial taxa. These microbial taxa are significantly enhanced in the surrounding environmental samples of leaves over surface soils. Finally, the excellent correlation of SC species and cellulose, a marker of plant materials, implies that local vegetation is likely the most important source of fungal and bacterial taxa associated with SC in PM10 at rural locations directly influenced by agricultural activities in France.
Our findings are a first step in the understanding of the processes leading to the emission of these important chemical species and the large OM fraction of PM in the atmosphere, and in the parametrization of these processes for their introduction in CTMs. They could also be used for planning efforts to reduce both the PBOA source strengths and the spreading of airborne microbial and derivative allergens such as endotoxins and mycotoxins. However, it remains to be investigated how well different climate patterns and sampling site specificities, in terms of land use and vegetation cover, could affect our main conclusions.
Data availability. The sequencing data files are available from the DRYAD repository (https://doi.org/10.5061/dryad.2fqz612m4, Samaké et al., 2020). All relevant chemical and environmental data sets are archived at the IGE (Institut des Géosciences de l'Environnement) and are available upon request from the co-author (Jean-Luc Jaffrezo).
Author contributions. JLJ, JMFM, and GU supervised the thesis of AS, and JLJ, JMFM, GU, and AS designed the research project. PT gave advice for soil and leaf sampling. SC supervised the sample collections and provided the agricultural activity records. VJ developed the analytical techniques for SC species and cellulose measurements. AS and AB performed the experiments. AB performed the bioinformatic analyses. AS performed statistical analyses and wrote the original draft. SW produced the circular phylogenetic trees. All authors reviewed and edited the final paper.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. We acknowledge the work of many engineers in the lab at the Institut des Géosciences de l'Environnement for the analyses (Anthony Vella, Vincent Lucaire). The authors would also like to thank the many other people at the sampling site and in the laboratories for their dedicated efforts in collecting and analyzing the samples.
The PhD of Abdoulaye Samaké is funded by the government of Mali. We gratefully acknowledge the LEFE-CHAT and EC2CO programs of the CNRS for financial support of the CAREM-BIOS multidisciplinary project, with ADEME funding. Chemical and microbiological analytical aspects were supported at IGE by the Air-O-Sol and MOME platforms, respectively, within Labex OSUG@2020 (ANR10 LABX56).
Review statement. This paper was edited by Alex Huffman and reviewed by two anonymous referees.