New Particle Formation Events Detection with Deep Learning

Atmospheric new particle formation (NPF) is an important source of climate-relevant aerosol particles and has been observed at many locations globally. To study this phenomenon, the first step is to identify whether an NPF event occurs on a given day. In practice, NPF event identification is performed visually by classifying days as NPF event or non-event days from surface plots of the particle number size distribution. Unfortunately, this day-by-day visual classification is time-consuming and labor-intensive, and it yields subjective results. To detect NPF events automatically, we regard the visual signature (banana shape) of NPF events as a special kind of object [...]. The start and end times determined by the new method illustrated that the start times may be controlled by normal distributions, but the end times presented more than one peak in their histograms for the Värriö and Hyytiälä stations. This study promotes the development of the automatic detection of NPF events. The method presented can help to study the statistical properties of the strongest NPF events with minimal human participation, especially for the datasets acquired over a lengthy period. Besides a deeper understanding of the mechanisms of NPF, we will try to further improve the level of automation for NPF studies in the future.


Introduction
Atmospheric aerosols have profound impacts on air quality, human health, ecosystems, weather, and climate (Asmi et al., 2011a; Hirsikko et al., 2011; Joutsensaari et al., 2018; Chu et al., 2019; Lee et al., 2019). New particle formation (NPF) is an important source of atmospheric aerosols and has been observed in a variety of locations around the world, such as different types of forests, semi-polluted or heavily polluted cities, high-altitude sites, coastal sites, and polar regions (Kulmala et al., 2004; Kuang et al., 2010; Kulmala et al., 2012; Nieminen et al., 2018; Dada et al., 2018; Lee et al., 2019). Beyond this spatial coverage, NPF events have been observed both at long-established sites (Dal Maso et al., 2005; Järvi et al., 2009; Asmi et al., 2011b) and at newly built sites (Chu et al., 2019; Liu et al., 2020; Yan et al., 2021). To analyze NPF events, the first step is to determine whether an NPF event has occurred or not (Kulmala et al., 2012).
Previous studies on detecting NPF types can be roughly divided into three categories: vision-based, rule-based, and data-driven.
Vision-based methods visually classify the NPF types day by day according to criteria based on surface plots of the size distribution time series (Mäkelä et al., 2000; Dal Maso et al., 2005; Hirsikko et al., 2011). The advantage of vision-based methods is that experts can explicitly tell which region in a surface plot is regarded as evidence of an NPF event. The drawbacks are that they are labor-intensive and time-consuming, and that the classification process is subject to human bias.
Rule-based methods classify NPF types with several explicit steps in which thresholds on the particle number concentrations are used as prior knowledge (Kulmala et al., 2012; Dada et al., 2018). Rule-based methods can classify NPF types automatically, but their drawback is that particle number concentrations can vary considerably between environments, meaning that the prior knowledge used at one site may fail at other sites or in complex situations. Data-driven methods utilize the measured particle number size distributions and annotated NPF types (labels) to establish a model that can identify NPF types. For instance, neural networks (NNs) have been used to classify NPF types both with handcrafted features (Nanni et al., 2017; Zaidan et al., 2018) and without them (Joutsensaari et al., 2018). The advantages of data-driven or NN-based methods are that they do not need any specific threshold on particle number concentration and that the classification process is automatic. However, annotated NPF labels are required to train the NNs, and since the label annotation process is subjective, [...] we use NPF images to represent surface plots without axes. Though surface plots have clear physical meanings, we can apply different image transformations to NPF images without any restriction. In this study, we use an instance segmentation method called Mask R-CNN, a deep learning model, to localize NPF events by predicting a mask that covers the spatial layout (the banana shape) of each NPF event. In other words, we try to answer the NPF classification problem by directly localizing the visual signature of NPF events. Since Mask R-CNN only focuses on the banana shape, which has been observed globally, it can be applied automatically to datasets collected at different sites. For more information about object detection and instance segmentation, please refer to Appendix A.
To verify the generality of the presented method, we test the Mask R-CNN model on three SMEAR stations (Station for Measuring Ecosystem and Atmospheric Relations I, II, and III) in Finland and one station located in San Pietro Capofiume in the Po Valley basin in Italy (SPC station). The datasets collected at the four stations sum up to approximately 73 years of measurements. Besides solving the classification problem, the accurate localization of events makes it easier to determine the growth rates, start times, and end times automatically. Our code is released so that it can be tested on datasets collected at other sites and facilitate future research. Our aims in this study are (1) to automatically localize the globally observed visual signature (banana shape) of regional NPF events, which can identify NPF types (whether an event occurs or not, especially for the strongest events) and determine the growth rates, start times, and end times, and (2) to investigate the statistical characteristics of growth rates, start times, and end times of the strongest NPF events for the three SMEAR stations in Finland and the SPC station in Italy.

Measurement sites
We utilized aerosol size distribution data from three observation sites in Finland and one in Italy. All the sites operated similar instrumentation, and the observations followed the guidelines set by ACTRIS on in situ aerosol number size distribution measurements (Wiedensohler et al., 2012). The observation sites and instruments are briefly described below.
The SMEAR I station is located at the Värriö Subarctic Research Station of the University of Helsinki (67°46′ N, 29°36′ E, 390 m a.s.l.) in northern Finland. The station is surrounded by 70-year-old Scots pine (Pinus sylvestris) boreal forest on Kotovaara hill, while some small lakes and mires lie in valleys 60 m lower and more than 1 km away. The measurements of particle number size distribution started in 1997 at the SMEAR I station. For more details about the site and measurements, please refer to Vana et al. (2016), Kyrö et al. (2014), and Hari et al. (1994). The analyzed particle number size distribution dataset collected in Värriö covers 8189 days from 10 December 1997 until 14 January 2021 (8436 days in total; the days with missing data were omitted from this study).
The SMEAR II station is located at the Hyytiälä Forestry Research Station of the University of Helsinki in central Finland (61°51′ N, 24°17′ E, 130 m a.s.l.), within pine-dominated boreal forest with some deciduous birch (Betula pubescens) and aspen (Populus tremuloides) trees. Comprehensive measurements including particle, radiation, gas, meteorological, and complementary data have been collected for more than 20 years (Hari and Kulmala, 2005; Dada et al., 2017, 2018). The location is considered a semi-clean boreal forest environment according to the level of anthropogenic pollutants (Nieminen et al., 2015). The analyzed particle number size distribution dataset collected in Hyytiälä covers 8642 days from 31 January 1996 until 21 January 2020 (8756 days in total).
The SMEAR III station is located on the Kumpula campus of the University of Helsinki in southern Finland (60°12′ N, 24°58′ E, 26 m a.s.l.). The station has accumulated approximately 17 years of measurements of, for example, air pollution, meteorology, and turbulent exchange (Järvi et al., 2009). The site lies within an urban environment, surrounded by campus buildings, [...]

The aerosol particle number size distributions were measured by a differential mobility particle sizer (DMPS) (Aalto et al., 2001) at all four stations (Fig. 1). The particle number size distribution datasets collected from the four stations are termed the Värriö dataset, Hyytiälä dataset, Kumpula dataset, and SPC dataset. The DMPS systems installed at the different stations have different detection ranges for particle sizes, and particle sizes ranging from 3 to 1000 nm are considered in this work. Note that the detected particle size does not have to reach 1000 nm for all DMPS systems.

NPF types
According to the guidelines reported in previous studies, the particle number size distributions can be classified into six different types (Dal Maso et al., 2005; Kulmala et al., 2012; Joutsensaari et al., 2018):
- Class Ia events. Ia-type events show clear and strong formation of small particles (especially 3-6 nm), with few or no pre-existing particles in the smallest size ranges (Fig. 2a).
- Class Ib events. Ib-type events show the same behavior as class Ia but with less clarity (Fig. 2b).
- Class II events. II-type events do not show clear evidence of growth. That is, the growth rate cannot be determined without a large uncertainty (Fig. 2c).
- Class Non-Event (NE). NE days do not show any evidence of new particle formation in the nucleation particle size range (Fig. 2d).
- Class Undefined (Undef). Undef days are difficult to classify as events or NEs since some but not all features of events can be seen (Fig. 2e).
- Class Bad-Data (BD). BD days are caused by instrument malfunction. Generally, missing data or particle concentrations that are too high or too low can be observed (Fig. 2f).

Figure 2 shows example surface plots for the different NPF types. The banana shape can be seen clearly for Ia-type and Ib-type NPF events because they are so consistent throughout the day and are little influenced by local wind fields. Ia-type and Ib-type NPF events are usually connected with phenomena happening at large (regional) spatial scales. However, for II-type NPF events, interruptions in surface plots are often associated with more local sources of variability. The banana shape is not very clear for II-type NPF events and can be observed even in some Undef types.

To fill the research gap mentioned in the introduction, we used an object instance segmentation technique called Mask R-CNN, which can accurately localize an NPF event's spatial layout. Mask R-CNN extends the object detection method [...] (Lin et al., 2014) with only 358 annotated masks. These 358 masks were created with the labeling tool LabelMe (Russell et al., 2008) from 358 NPF images (78 Ia-type, 202 Ib-type, and 78 II-type). The 358 NPF images were generated from the Hyytiälä dataset and cover the period from 1996 to 2003. During training, 300 NPF images with masks were randomly selected as the training set, and the remaining 58 NPF images with masks formed the validation set (Fig. 4). The learning rate was 5 × 10⁻³ and was decreased every 3 epochs by a factor of 0.10. The stochastic gradient descent optimizer was used, with a weight decay of 5 × 10⁻⁴ and a momentum of 0.90. The Mask R-CNN model was fine-tuned for 10 epochs. All the NPF images and masks were resampled to 256 × 256 pixels, and with an NVIDIA V100 GPU, the training process lasted around five minutes. Code and more results are available at https://github.com/cvvsu/maskNPF.git.
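As a minimal sketch of the data split and learning-rate schedule described above (not the released implementation), the following standard-library Python reproduces the 300/58 split and the step decay. The function names, the random seed, and the assumption that the decay is applied at zero-based epochs 3, 6, and 9 are ours.

```python
import random

BASE_LR = 5e-3      # initial learning rate (from the text)
LR_DECAY = 0.10     # multiplicative factor applied at each decay step
DECAY_EVERY = 3     # number of epochs between decay steps
NUM_EPOCHS = 10     # length of fine-tuning


def learning_rate(epoch: int) -> float:
    """Step-decayed learning rate for a 0-based epoch index (assumed convention)."""
    return BASE_LR * LR_DECAY ** (epoch // DECAY_EVERY)


def split_indices(n_total: int = 358, n_train: int = 300, seed: int = 0):
    """Randomly split image indices into training and validation sets."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # the seed is a hypothetical choice
    return indices[:n_train], indices[n_train:]


train_ids, val_ids = split_indices()
schedule = [learning_rate(e) for e in range(NUM_EPOCHS)]
```

Under these assumptions the learning rate is 5 × 10⁻³ for epochs 0 to 2, 5 × 10⁻⁴ for epochs 3 to 5, and so on.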
Since Mask R-CNN only focuses on the banana shape, some regions in NPF images that are not events can also be localized, so more than one mask may be detected for one NPF image (Fig. 3). For each mask, there is an objectiveness score in [0, 1] showing the probability of an event occurrence. In addition to the objectiveness score, a bounding box is also obtained.
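The handling of several scored detections per image can be illustrated with a short sketch. The `Detection` container, the box layout, and the rule that a day counts as an event day when at least one detection passes the threshold are assumptions made for illustration, not a description of the released code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    score: float                    # objectiveness score in [0, 1]
    box: Tuple[int, int, int, int]  # (left, top, right, bottom), hypothetical layout


def keep_confident(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep detections at or above the threshold, highest score first."""
    kept = [d for d in detections if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


def is_event_day(detections: List[Detection], threshold: float) -> bool:
    """Assumed rule: an event day has at least one sufficiently confident detection."""
    return len(keep_confident(detections, threshold)) > 0


# Toy example: one strong detection and one weak, spurious detection.
day = [Detection(0.93, (60, 150, 180, 250)), Detection(0.15, (200, 180, 240, 250))]
```

With a threshold of 0.20, only the first detection of the toy day is kept, and the day is treated as an event day.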
[Figure caption fragment: ... (Lin et al., 2017). RPN is the region proposal network (Ren et al., 2016). RoIAlign is the layer that properly aligns the features.]

Assuming that the time resolution of the DMPS systems is 10 minutes and that there are 52 size bins for particle sizes ranging from 3 to 1000 nm, the recorded particle number size distribution for one day is a data matrix with the shape 52 × 144 (3 to 1000 nm from the bottom row to the top row and 0 o'clock to 24 o'clock from the first column to the last column). We resampled the predicted masks to the size of 52 × 144 to match the shape of the collected data (Fig. 5).
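The resampling of a 256 × 256 mask to the 52 × 144 data grid can be sketched as below. The interpolation scheme actually used is not stated in the text; nearest-neighbour lookup is assumed here for simplicity, and `resample_nearest` is a hypothetical helper.

```python
from typing import List

Grid = List[List[float]]


def resample_nearest(grid: Grid, out_rows: int, out_cols: int) -> Grid:
    """Resample a 2-D grid to (out_rows, out_cols) by nearest-neighbour lookup."""
    in_rows, in_cols = len(grid), len(grid[0])
    out = []
    for r in range(out_rows):
        # Map the centre of each output cell back to an input row index.
        src_r = min(in_rows - 1, int((r + 0.5) * in_rows / out_rows))
        row = []
        for c in range(out_cols):
            src_c = min(in_cols - 1, int((c + 0.5) * in_cols / out_cols))
            row.append(grid[src_r][src_c])
        out.append(row)
    return out


# Dummy 256 x 256 mask whose right half is "event" (probability 1.0).
mask_256 = [[1.0 if c >= 128 else 0.0 for c in range(256)] for _ in range(256)]
mask_day = resample_nearest(mask_256, 52, 144)
```

After resampling, each of the 144 columns corresponds to one 10-minute time step and each of the 52 rows to one size bin.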
The value of a pixel in a mask represents the probability that the pixel belongs to an event. Each predicted mask was binarized at a threshold of 0.50. The left and right edges of the bounding boxes determine the start and end times, respectively. The bottom and upper edges of the bounding boxes automatically provide a size window that covers the related NPF event (Fig. 3 and Fig. 5).
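A minimal sketch of this step, assuming a 52 × 144 probability mask with 10-minute columns: the mask is binarized, a bounding box is derived from the binary mask, and the column edges are converted to clock times. The helper names are hypothetical, and whether the comparison at 0.50 is inclusive is an assumption.

```python
from typing import List, Optional, Tuple

MINUTES_PER_STEP = 10  # assumed DMPS time resolution, as in the text


def binarize(mask: List[List[float]], threshold: float = 0.5) -> List[List[bool]]:
    """Turn per-pixel event probabilities into a boolean mask."""
    return [[p >= threshold for p in row] for row in mask]


def bounding_box(binary: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (row_min, row_max, col_min, col_max) of the True pixels, or None."""
    rows = [r for r, row in enumerate(binary) if any(row)]
    cols = [c for c in range(len(binary[0])) if any(row[c] for row in binary)]
    if not rows:
        return None
    return rows[0], rows[-1], cols[0], cols[-1]


def column_to_clock(col: int) -> str:
    """Convert a time-step index to an HH:MM string."""
    minutes = col * MINUTES_PER_STEP
    return f"{minutes // 60:02d}:{minutes % 60:02d}"


# Toy 52 x 144 mask: probability 0.9 inside a rectangle, 0.1 elsewhere.
prob = [[0.9 if 2 <= r <= 20 and 60 <= c <= 96 else 0.1 for c in range(144)]
        for r in range(52)]
r0, r1, c0, c1 = bounding_box(binarize(prob))
start_time, end_time = column_to_clock(c0), column_to_clock(c1)
```

For the toy mask, the column edges 60 and 96 translate to a start time of 10:00 and an end time of 16:00, while the row edges give the size window.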

Growth rate
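The particle growth rate (GR) is the rate of change of the diameter of a given particle:

$$\mathrm{GR} = \frac{\mathrm{d}D_p}{\mathrm{d}t} = \frac{\Delta D_p}{\Delta t} = \frac{D_{p2} - D_{p1}}{t_2 - t_1},$$

where $D_{p2}$ and $D_{p1}$ are the particle diameters at times $t_2$ and $t_1$, respectively.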
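As a purely illustrative instance with made-up numbers (not taken from any of the datasets), a mode growing from 3 nm at 09:00 to 25 nm at 14:30 would give:

```latex
\mathrm{GR} = \frac{D_{p2} - D_{p1}}{t_2 - t_1}
            = \frac{25\,\mathrm{nm} - 3\,\mathrm{nm}}{14.5\,\mathrm{h} - 9.0\,\mathrm{h}}
            = \frac{22\,\mathrm{nm}}{5.5\,\mathrm{h}}
            = 4\,\mathrm{nm\,h^{-1}}
```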

The maximum concentration method and the log-normal distribution function (mode fitting) method are two widely used methods to calculate the growth rate (GR) for an NPF event (Kulmala et al., 2012; Dada et al., 2020a). The GRs determined by these two methods are of the same order and show the same seasonal variations (Dal Maso et al., 2005; Hirsikko et al., 2005; Yli-Juuti et al., 2011).
Since the NPF events can be localized, we can accordingly calculate the GR of an NPF event automatically using the maximum concentration method. We used the random sample consensus (RANSAC) algorithm (Choi et al., 2009) instead of ordinary least squares fitting to determine GRs. Compared to ordinary least squares fitting, the RANSAC algorithm is robust to outliers. In addition to GRs, the predicted masks can also be used to analyze the characteristics of the start times and end times of the strongest NPF events.
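The robustness argument can be illustrated with a small, self-contained RANSAC line fit on synthetic (time, diameter) points. This is a sketch of the general technique rather than the released implementation; the iteration count, the 1 nm inlier tolerance, the seed, and the synthetic data are all assumptions.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]  # (time in hours, diameter in nm)


def least_squares(points: List[Point]) -> Tuple[float, float]:
    """Ordinary least-squares line through the points: (slope, intercept)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_d = sum(d for _, d in points) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in points)
    cov = sum((t - mean_t) * (d - mean_d) for t, d in points)
    slope = cov / var_t
    return slope, mean_d - slope * mean_t


def ransac_line(points: List[Point], n_iter: int = 200, tol: float = 1.0,
                seed: int = 0) -> Tuple[float, float]:
    """Fit a line with RANSAC; tol is the inlier residual in nm."""
    rng = random.Random(seed)
    best_inliers: List[Point] = []
    for _ in range(n_iter):
        (t1, d1), (t2, d2) = rng.sample(points, 2)
        if t1 == t2:
            continue  # a vertical candidate line has no defined slope
        slope = (d2 - d1) / (t2 - t1)
        intercept = d1 - slope * t1
        inliers = [(t, d) for t, d in points
                   if abs(d - (slope * t + intercept)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the largest consensus set with ordinary least squares.
    return least_squares(best_inliers)


# Synthetic data: growth of 3 nm/h starting from 3 nm at 09:00, plus two outliers.
clean = [(9.0 + 0.5 * i, 3.0 + 1.5 * i) for i in range(12)]
outliers = [(10.0, 40.0), (13.0, 2.0)]
gr_ransac, _ = ransac_line(clean + outliers)
gr_ols, _ = least_squares(clean + outliers)
```

On this synthetic example the RANSAC slope recovers the underlying 3 nm/h, whereas the ordinary least squares slope is pulled away from it by the two outliers.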
Results and discussion

According to the classification results on the Hyytiälä dataset (Table 1) [...]

According to the classification results shown in Table 1 and Table 2, there is a trade-off between the classification accuracy of NPF events and the number of misclassified days, which is controlled by the threshold. Re-training the Mask R-CNN model on masks derived from the SPC dataset may improve the classification accuracy on the SPC dataset and make the classification results stable regardless of the chosen threshold. We did not re-train the Mask R-CNN model, in order to demonstrate the generality of our method (Table 2). Once a small threshold such as 0.20 for the objectiveness score is selected, on the SPC dataset and without annotated masks or class labels, the classification accuracy is 94.80% for Ia-type NPF events, 87.94% for Ib-type NPF events, and 90.57% for the combination of Ia-type and Ib-type NPF events (Table 2), which is higher than the results reported in Joutsensaari et al. (2018), where an NN-based method was applied. The classification results on the SPC dataset support the idea that regarding the banana shape in NPF images as a special object is reasonable.
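The threshold trade-off can be made concrete with a toy evaluation. The labels and scores below are invented for illustration and do not reproduce Table 1 or Table 2; the `evaluate` helper and the rule that a day is flagged when its highest score reaches the threshold are assumptions.

```python
from typing import Dict, List, Tuple

# (human label, highest objectiveness score predicted for that day); toy values.
days: List[Tuple[str, float]] = [
    ("Ia", 0.98), ("Ia", 0.91), ("Ib", 0.85), ("Ib", 0.35),
    ("II", 0.40), ("II", 0.10), ("NE", 0.25), ("NE", 0.05),
]
EVENT_CLASSES = {"Ia", "Ib", "II"}


def evaluate(days: List[Tuple[str, float]], threshold: float) -> Dict[str, float]:
    """Return the fraction of event days detected and the number of false alarms."""
    events = [s for label, s in days if label in EVENT_CLASSES]
    non_events = [s for label, s in days if label not in EVENT_CLASSES]
    detected = sum(s >= threshold for s in events)
    false_alarms = sum(s >= threshold for s in non_events)
    return {"event_accuracy": detected / len(events), "false_alarms": false_alarms}


low = evaluate(days, 0.20)   # small threshold: more events found, more false alarms
high = evaluate(days, 0.50)  # large threshold: fewer false alarms, more missed events
```

In this toy case, lowering the threshold from 0.50 to 0.20 raises the fraction of detected event days from 3/6 to 5/6 at the cost of one misclassified non-event day.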
According to the classification results on the four datasets, this method will save plenty of time and effort for scientists who are only concerned with identifying Ia and Ib event types. Since the II-type events usually do not present a clear banana shape in the NPF images, the Mask R-CNN model fails to find some of these NPF events. However, the detection results of Mask R-CNN can be used as auxiliary information to help scientists determine the II types. Furthermore, the detection process is more consistent than human-made determination.

Growth rate
In this study, we show that, combined with the detected masks, the maximum concentration method can be used to calculate the GRs automatically (Fig. 5 and Fig. 6). If not specified, we only focus on determining the GRs, start times, and end times for the strongest NPF events.
Daytime hours between 6:00 and 18:00 (local time) were used as a prior for the traditional maximum concentration method to calculate the GRs. However, when this prior is not satisfied or a particle burst is present in the surface plots, scientists need to select the start and end times manually. With the detected masks, the proposed method can automatically determine the time window (left and right edges of the bounding boxes, Fig. 3 and Fig. 5), and there is no need to manually adjust the start and end times. Usually, different size windows are applied to calculate GRs, and we selected 3-25 nm as the size range for GR calculation (Fig. 6). However, other size ranges are also possible; for more information, please refer to our code. To avoid confusion, the maximum concentration and mode fitting methods are termed traditional methods in this work.
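How a detected time window and the 3-25 nm size window restrict the maximum concentration method can be sketched as follows. This simplified version takes the raw argmax of each size bin's concentration within the window, which is an assumption; `max_concentration_points` and its parameters are hypothetical names.

```python
from typing import List, Tuple


def max_concentration_points(
    psd: List[List[float]],     # psd[size_bin][time_step], particle concentrations
    diameters_nm: List[float],  # diameter of each size bin
    col_start: int,             # first time step of the event (left box edge)
    col_end: int,               # last time step of the event (right box edge)
    d_min: float = 3.0,
    d_max: float = 25.0,
    minutes_per_step: int = 10,
) -> List[Tuple[float, float]]:
    """For each size bin in [d_min, d_max], find when its concentration peaks."""
    points = []
    for row, d in zip(psd, diameters_nm):
        if not d_min <= d <= d_max:
            continue  # outside the size window used for the GR calculation
        window = row[col_start:col_end + 1]
        peak = max(range(len(window)), key=window.__getitem__)
        hours = (col_start + peak) * minutes_per_step / 60.0
        points.append((hours, d))
    return points


# Toy data: three size bins whose peaks arrive one hour apart.
diameters = [5.0, 10.0, 50.0]
psd = [[0.0] * 144 for _ in diameters]
psd[0][60], psd[1][66], psd[2][72] = 100.0, 80.0, 60.0
points = max_concentration_points(psd, diameters, col_start=54, col_end=96)
```

The resulting (time, diameter) points are the input to the line fit whose slope is the GR; the 50 nm bin is excluded because it lies outside the 3-25 nm window.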
As shown in Fig. 7, an obvious downward trend of GRs for the SPC station can be seen, and the medians of GRs for the SPC station [...] GRs automatically leads to consistent results and avoids human errors.

Start time and end time
In addition to the GR, the start time and end time of an event can also be determined automatically with the detected mask; these times are reported in very few publications (Dada et al., 2018). Figure 8 shows the start and end times of the NPF events for the different datasets. For the SPC dataset, the automatic method summarized the start times for events that occurred from 2002 to 2017, whereas the human-annotated results summarized the start times for events that occurred from 2011 to 2017. Nevertheless, the histograms of the start times and end times determined by the different methods show similar shapes (Fig. 8), illustrating the validity of the automatic method. Because the end time of an event is difficult to determine in some cases, the end time of an NPF event cannot be identified as clearly as the start time.
Generally, the histograms of the start times for the four datasets are bell-shaped, which suggests that they may be controlled by normal distributions (Fig. 8). The histograms of end times for the SPC station also show a bell shape, but there is more than one peak in the histograms of end times for the Värriö and Hyytiälä stations (Fig. 8). For NPF events that last for more than one day, interactions between particles from the two days make the end times much more difficult to determine.
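A bell-shaped start-time histogram can be summarized by a mean and a standard deviation, as in the sketch below. The start times are hypothetical values chosen for illustration, and fitting a normal distribution by its sample moments is our assumption about how such a comparison might be made.

```python
import statistics
from collections import Counter
from typing import List

# Hypothetical start times in decimal hours (local time), one per event.
start_times: List[float] = [8.5, 9.0, 9.5, 9.5, 10.0, 10.0, 10.0, 10.5, 10.5, 11.0, 11.5]

mu = statistics.mean(start_times)
sigma = statistics.stdev(start_times)                 # sample standard deviation
hourly_counts = Counter(int(t) for t in start_times)  # one-hour histogram bins


def normal_pdf(x: float, mean: float, std: float) -> float:
    """Density of a normal distribution, for overlaying on the histogram."""
    return statistics.NormalDist(mean, std).pdf(x)
```

A histogram with several peaks, as reported for the end times at Värriö and Hyytiälä, would not be described well by a single pair of moments like this.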
The event durations of NPF events at the SPC station are generally shorter than those at the Värriö, Hyytiälä, and Kumpula stations (Fig. 8 and Table 3). The possible reason is that the atmospheric environment at the SPC station is much more [...]

The median start time is almost the same for the Hyytiälä and Kumpula stations (Table 3), which is consistent with these two stations being located close to each other and further supports the view that the intensity of solar radiation reaching the Earth's surface seems to be the most important factor affecting whether an NPF event occurs or not.

Figure 7. Comparison of growth rates calculated by different methods. GR-T means that growth rates are determined by the traditional methods (manually selecting the start and end times when necessary), and GR-P means that growth rates are determined by the proposed automatic method. R is the Pearson correlation coefficient between GRs calculated by different methods. The density scatter plots in the bottom row show the ranges in which the growth rates usually lie.
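The Pearson correlation coefficient R used to compare GR-T and GR-P can be computed as in the following sketch. The growth-rate values are invented for illustration and are not taken from Fig. 7.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical growth rates in nm/h for five events.
gr_traditional = [2.1, 3.4, 4.0, 5.2, 6.8]
gr_proposed = [2.3, 3.1, 4.4, 5.0, 7.1]
r = pearson_r(gr_traditional, gr_proposed)
```

A value of R close to 1 indicates that the two methods rank and scale the growth rates similarly, although it does not by itself rule out a systematic offset between them.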

Advantages, limitations, and future studies
[...] to the sizes and aspect ratios of the input NPF images since the model has already "seen" the related image transformations [...]

One limitation of using the Mask R-CNN model is that some days that are not NPF event days will be misclassified as event days. Therefore, for scientists focusing on the comparison between event and non-event types, manual work is still required to sort out the misclassified days. In this case, the Mask R-CNN model can only be used as an auxiliary tool.
The key to determining the correct start time, end time, and GR for an event is that the detected mask accurately depicts the spatial layout of the NPF event. Since the Mask R-CNN model used in this study was only trained on 358 annotated masks, [...]

To detect NPF events automatically, we presented a method based on Mask R-CNN for identifying regional (banana-type) NPF events (especially the strongest events); this method can also determine the growth rates, start times, and end times of events automatically. We tested the method on the SMEAR I, II, and III stations (Värriö, Hyytiälä, and Kumpula, respectively) in Finland as well as the SPC station in Italy, and it generalized well across these stations. Altogether, approximately 73 years of measurements collected at the four stations were processed.

The proposed automatic method achieved the highest classification results for Ia-type and Ib-type events on the SPC station without any annotated information, showing the potential to apply the new method on other stations. The automatically determined growth rates by the new method are consistent with the manually calculated growth rates. The start times and end times determined by the new method illustrated that the start times may be controlled by normal distributions, but the end times presented more than one peak in their histograms for the Värriö and Hyytiälä stations.
This study promotes the development of the automatic detection of NPF events. The method presented can help to study the statistical properties of the strongest NPF events with minimal human participation, especially for datasets acquired over a lengthy period. Besides pursuing a deeper understanding of the mechanisms of NPF, we will try to further improve the level of automation for NPF studies in the future.
Code and data availability. Code is available at https://github.com/cvvsu/maskNPF.git. Datasets collected at the three SMEAR stations are available at https://smear.avaa.csc.fi/. The dataset collected at the San Pietro Capofiume station is available from Jorma Joutsensaari on request (Joutsensaari et al., 2018).

Appendix A: Object detection and instance segmentation
Object detection is one of the fundamental and challenging tasks in computer vision. Generally, some object detection techniques focus on detecting different kinds of objects such as cats and cars, while others focus on specific scenarios such as face detection (Zou et al., 2019). With the development of deep learning, object detection has achieved unprecedented improvements.
The techniques can be roughly divided into one-stage detection, such as the single-shot multi-box detector (Liu et al., 2016), and two-stage detection, such as Faster R-CNN (Ren et al., 2016). Usually, one-stage detection is much faster, while two-stage detection can achieve better detection accuracy. Instance segmentation, however, tries to delineate each distinct object of interest in a more precise manner. In other words, instance segmentation segments an object according to its spatial layout. Compared with a bounding box, which needs only four corner positions to cover an object, an instance segmentation model needs to find all the pixels that belong to the object.