Long-term observations of aerosol size distributions in semi-clean and polluted savannah in South Africa

This study presents a total of four years of submicron aerosol particle size distribution measurements in the southern African savannah, an environment with few previous observations covering a full seasonal cycle and the size range below 100 nm. During the first 19 months, July 2006–January 2008, the measurements were carried out at Botsalano, a semi-clean location, whereas during the latter part, February 2008–May 2010, the measurements were carried out at Marikana (approximately 150 km east of Botsalano), which is a more polluted location with both pyrometallurgical industries and informal settlements nearby. The median total concentration of aerosol particles was more than four times as high at Marikana as at Botsalano. In the size ranges of 12–840 nm, 50–840 nm and 100–840 nm the median concentrations were 1856, 1278 and 698 particles cm−3 at Botsalano and 7805, 3843 and 1634 particles cm−3 at Marikana, respectively. The diurnal variation of the size distribution for Botsalano arose as a result of frequent regional new particle formation. However, for Marikana the diurnal variation was dominated by the morning and evening household burning in the informal settlements, although regional new particle formation was even more frequent than at Botsalano. The effect of the industrial emissions was not discernible in the size distribution at Marikana although it was clear in the sulphur dioxide diurnal pattern, indicating the emissions to be mostly gaseous. Seasonal variation was strongest in the concentration of particles larger than 100 nm, which was clearly elevated at both locations during the dry season from May to September. In the absence of wet removal during the dry season, the concentration of particles larger than 100 nm had a correlation above 0.7 with CO for both locations, which implies incomplete burning to be an important source of aerosol particles during the dry season.
However, the sources of burning differ: at Botsalano the rise in concentration originates from regional wild fires, while at Marikana domestic heating in the informal settlements is the main source. Air mass history analysis for Botsalano identified four regional scale source areas in southern Africa and enabled the differentiation between fresh and aged rural background aerosol originating from the clean sector, i.e., the western sector with very few large anthropogenic sources. Comparison to size distributions published for other comparable environments in the Northern Hemisphere shows the southern African savannah to have a unique combination of sources and meteorological parameters. The observed strong link between combustion and seasonal variation is comparable only to the Amazon basin; however, the lack of long-term observations in the Amazonas does not allow a quantitative comparison. All the data presented in the figures, as well as the time series of monthly mean and median size distributions, are included in numeric form as a Supplement to provide a reference point for the aerosol modelling community. Published by Copernicus Publications on behalf of the European Geosciences Union. V. Vakkari et al.: Long-term observations of aerosol size distributions


Introduction
Atmospheric aerosol particles impact our lives in several ways. They moderate climate directly by scattering and absorbing solar radiation and indirectly by modifying the properties of clouds, thereby affecting the global radiation budget (e.g., Seinfeld and Pandis, 2006). In addition to climate impacts, aerosol particles cause adverse health effects and deteriorate visibility (e.g., Charlson, 1969; Pope and Dockery, 2006). Atmospheric aerosol particles are recognised as the source of the largest uncertainty in the current global climate models (Forster et al., 2007). Reducing the uncertainty in the global climate models requires temporally and spatially representative datasets on a global scale, preferably including both chemical and physical properties of the aerosol particles. One of the key physical parameters of aerosols is the number size distribution, and especially for the climate effects, the size distribution in the sub-micron range.
Aerosol size distribution measurements covering at least one full year have been conducted in a number of locations, e.g., the dataset compiled by Spracklen et al. (2010) for comparison with a global aerosol model. However, most of the long-term observations are from the Northern Hemisphere and very few from continental locations in the Southern Hemisphere (Spracklen et al., 2010). Since the study by Spracklen et al. (2010) even more datasets have been published for the continental boundary layer in the Northern Hemisphere. For example, Asmi et al. (2011) presented two years of size distributions from 20 European stations, Hyvärinen et al. (2011) more than two years of size distributions from two sites in India and Shen et al. (2011) more than a year of size distributions from the North China Plain.
However, for the Southern Hemisphere there is a very limited number of long-term datasets of sub-micron aerosol particle size distributions in the continental boundary layer (Laakso et al., 2006, 2008, 2012; Hirsikko et al., 2012). Even from the Amazon basin, where several intensive measurement campaigns have been carried out during recent years, no size distribution measurements that cover a full year have been published (see, for example, the review by Martin et al. (2010) and references therein). From Australia the majority of observations are from coastal or urban locations (e.g., Gras, 1995; Mejia and Morawska, 2009; Cheung et al., 2011).
The locations of currently available (published or accessible via databases) long-term datasets representing the continental boundary layer are illustrated in Fig. 1. The figure is based on the dataset by Spracklen et al. (2010) and follows the division into marine, high altitude and continental boundary layer locations by Spracklen et al. (2010). Figure 1 was updated with this study and other recently published aerosol size distribution observations in the continental boundary layer extending below 100 nm and covering at least one full seasonal cycle (Asmi et al., 2011; Hyvärinen et al., 2011; Shen et al., 2011; Hirsikko et al., 2012; Laakso et al., 2012).
Fig. 1. Long-term observations of aerosol particle number size distribution or total concentration extending below 100 nm in the continental boundary layer excluding urban environments. The observations from the locations indicated with red circles are compared with the results of this study. Note that the South American comparison site had only two months of measurements.
The only exception to the rule of one full seasonal cycle was made to have one comparison location in South America. The dataset in South America in Fig. 1 refers to Rissler et al. (2006), who measured size distributions extending below 100 nm during the transition from dry to wet season in the southwestern Amazon basin. The Rissler et al. (2006) dataset was selected from the Amazon basin also because of its location in an agricultural (i.e., deforested) area comparable to savannah.
For southern Africa the main source of information on atmospheric aerosol particles has been the SAFARI 2000 campaign (Swap et al., 2003). However, as SAFARI 2000 focused on wild fires, the size distribution measurements were biased toward measuring emissions from fires. The measurements were conducted onboard aircraft to enable the tracking of fire plumes and, hence, the datasets gathered typically covered only some tens of hours of flight time (Haywood et al., 2003a, b; Hobbs et al., 2003; Ross et al., 2003). The campaign was conducted in September 2000, which is usually part of the peak period for wild fires, i.e., the late dry season. Furthermore, the wild fires were exceptionally intense in September 2000 (Swap et al., 2003) compared with an average burning season. Considering all the above, the measurements during SAFARI 2000 are not temporally or spatially representative of the typical aerosol particle size distribution in southern Africa.
Since SAFARI 2000 the next observations on sub-micron aerosol particle size distribution in southern Africa were published by Laakso et al. (2008), who presented the monthly statistics of the number concentration of 10 to 840 nm aerosol particles measured at Botsalano from July 2006 to July 2007. Laakso et al. (2008) were the first to cover one full seasonal cycle in southern Africa; however, the paper did not give any more detail than the monthly medians and percentiles of the observed total concentration. Recently Hirsikko et al. (2012) briefly discussed the diurnal and seasonal variations of the sub-micron aerosol size distribution in connection to new particle formation at Marikana village, 150 km east of Botsalano, from February 2008 to May 2010. Laakso et al. (2012) included seasonal variation of aerosol number concentration from 10 nm to 10 µm in an overview of measurements at Elandsfontein, 200 km east of Marikana, from February 2009 to January 2011, that were part of the South African component of the European Union sponsored project EUCAARI (Kulmala et al., 2009, 2011).
To partially fill the gap in the size distribution data for southern Africa we present here nearly four years of submicron aerosol size distribution measurements in South Africa. The first part of the measurements was conducted in a semi-clean background location (Laakso et al., 2008) and the second in a more polluted area with mixed industrial sources and some informal settlements (Hirsikko et al., 2012; Venter et al., 2012), enabling a comparison between clean and polluted size distributions. The aforementioned measurement campaigns were also supported by the EUCAARI South African component (Kulmala et al., 2009, 2011; Laakso et al., 2012).
The diurnal, seasonal and spatial variations of sub-micron aerosol size distribution were analysed and the differences between the semi-clean and the polluted environment discussed. Additionally, we have included all the data used in the figures, as well as the monthly averaged time series of the size distributions in full size resolution, as a Supplement to provide a reference point for future modelling studies.
The Botsalano game reserve has no local sources and has an extensive clean sector to the west, with very little anthropogenic activity (Laakso et al., 2008). However, the prevailing air mass origin for Botsalano is from the anticyclonic recirculation path, which accumulates emissions from the entire industrialised Highveld (Tyson et al., 1996; Vakkari et al., 2011). The median SO2 concentration is, therefore, nearly 1 ppb at Botsalano and, hence, the site is described as semi-clean instead of a clean background site (Laakso et al., 2008). Botsalano is also occasionally affected by direct plumes from the megacity of Johannesburg-Pretoria (Lourens et al., 2012) and the surrounding industry (Vakkari et al., 2011).
Marikana, on the other hand, is in the middle of the pyrometallurgical industry surrounding the city of Rustenburg, approximately 100 km northwest of the Johannesburg-Pretoria megacity and 150 km east of Botsalano (Hirsikko et al., 2012; Venter et al., 2012). The pyrometallurgical industry around Marikana consists mainly of platinum group metal smelters that have high SO2 emissions (Venter et al., 2012) and ferrochrome smelters (Beukes et al., 2010). In addition to the industrial sources, there are also informal settlements and low-cost housing in the Marikana village and the site is impacted daily by the household cooking and space heating activities in these areas (Hirsikko et al., 2012; Venter et al., 2012). Because of the significance of the industrial sources the area surrounding the measurement site at Marikana has been proposed as a third legislatively proclaimed national air pollution hotspot in South Africa (Scott, 2010), in addition to the two existing priority areas, i.e., the Vaal Triangle and the Highveld priority areas (Yako, 2005; van Schalkwyk, 2007). Wild fires are frequent during the dry season in southern Africa and may occur in any direction from the aforementioned measurement sites.
Marikana is located close to the transitional zone of the grassland biome to the savannah biome (Mucina and Rutherford, 2006). The areas surrounding Marikana that are not in industrial or residential use are mainly used for farming activities, including both grazing and cash crop (e.g., maize) production (Venter et al., 2012). The Botsalano game reserve is situated entirely in the savannah biome, which also supports agricultural use (Friedl et al., 2002; Laakso et al., 2008). The clean sector west of Botsalano changes within approximately 100 km into semi-arid shrublands with low biomass and biological activity (Friedl et al., 2002; Mucina and Rutherford, 2006). This biome continues in the sector between north and west from Botsalano into the neighbouring countries of Botswana and Namibia. This region is commonly referred to as the Kalahari. To the southwest of Botsalano the mixed grassland/savannah vegetation changes within a couple of hundred kilometres into the dry Karoo biome, which has even less biomass and biological activity than the Kalahari.

Instrumentation
The measurements were carried out at both Botsalano and Marikana with a mobile measurement trailer, which has been described in detail by Petäjä et al. (2007) and Laakso et al. (2008). In this study, we utilise only the aerosol particle size distributions from the differential mobility particle sizer (DMPS) (Hoppel, 1978; Jokinen and Mäkelä, 1997). Data coverage of the aerosol particle size distribution measurements from the DMPS was 75 % for the entire measurement period combined at both sites. For Botsalano, problems at the beginning of the measurement campaign, i.e., irregularities in the incoming power and a CPC breakdown in September 2006, decreased the average data coverage to 69 %. However, from January 2007 until the end of the measurements at Botsalano on 5 February 2008 the data coverage was on average 82 %.
For Marikana the average data coverage was 80 %. From the start of the measurements on 10 February 2008 until the end of July 2009 the data coverage was good, on average 90 %, but during the second half of the Marikana measurement campaign some technical problems occurred: a CPC breakdown at the end of July 2009, a virus infection in the measurement PC in December 2009–January 2010 and a leak in the DMPS sheath air pump in March–April 2010. Nevertheless, the monthly data coverage for both campaigns presented in Fig. 2 shows that the gaps in the measurements do not hinder studying seasonal variation.
A TSI 3772 CPC was running in parallel with the trailer DMPS for one week in August 2011. While the concentration was below 10 000 particles cm−3, i.e., the 3772 CPC operating in single particle count mode, the DMPS total concentration was on average (median) 2 % higher than the TSI 3772 concentration. The 25th and 75th percentiles of the ratio of the DMPS to the TSI 3772 concentration were 0.97 and 1.10, respectively, which are comparable to the counting accuracy of a CPC.

Size distribution parameters
Number concentrations of aerosol particles larger than 50 or 100 nm in diameter are quite commonly used as proxies of cloud condensation nuclei (CCN), when direct measurements of CCN do not exist (e.g., Asmi et al., 2011). In this study, these number concentrations are approximated by integrating the measured size distribution concentrations from 50 to 840 nm (N50) and from 100 to 840 nm (N100) in particles cm−3. To provide a more comprehensive overview of the data, the size distribution number concentrations are also integrated from 12 to 840 nm (N12) and from 12 to 25 nm (N<25) [cm−3] and, in addition to the sectional number concentrations, log-normal size distribution fits are also calculated.
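The integration described above can be sketched as follows. This is a minimal illustration, not the processing code used in the study: it assumes the size distribution is available as dN/dlog10(Dp) on a grid of diameters, and it uses trapezoidal integration in log10(Dp) with linear interpolation at the integration limits. The function name and the example values are hypothetical.

```python
import math

def integrate_number(diam_nm, dn_dlogdp, lower_nm, upper_nm):
    """Integrate dN/dlog10(Dp) over log10(Dp) between two diameters (trapezoidal rule)."""
    total = 0.0
    for k in range(len(diam_nm) - 1):
        x0, x1 = math.log10(diam_nm[k]), math.log10(diam_nm[k + 1])
        # Clip the current bin to the requested diameter limits.
        a = max(x0, math.log10(lower_nm))
        b = min(x1, math.log10(upper_nm))
        if a >= b:
            continue
        # Linear interpolation of the distribution in log10(Dp) within the bin.
        slope = (dn_dlogdp[k + 1] - dn_dlogdp[k]) / (x1 - x0)
        ya = dn_dlogdp[k] + slope * (a - x0)
        yb = dn_dlogdp[k] + slope * (b - x0)
        total += 0.5 * (ya + yb) * (b - a)
    return total

# Illustrative (made-up) size distribution, particles cm-3 per log10 unit.
diam = [12, 25, 50, 100, 300, 840]
spectrum = [2000, 3000, 2500, 1000, 200, 10]
n12 = integrate_number(diam, spectrum, 12, 840)
n_lt25 = integrate_number(diam, spectrum, 12, 25)
n50 = integrate_number(diam, spectrum, 50, 840)
n100 = integrate_number(diam, spectrum, 100, 840)
```

By construction N12 ≥ N50 ≥ N100, since each range is a subset of the previous one.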
The fitting of the log-normal size distribution to the measured aerosol particle size distribution was done with the method described by Vartiainen et al. (2007). An n-modal log-normal size distribution is here defined with 10-base logarithm as
\[ \frac{dN}{d\log_{10} D_p} = \sum_{i=1}^{n} \frac{N_i}{\sqrt{2\pi}\,\log_{10}\sigma_i} \exp\left[-\frac{\left(\log_{10} D_p - \log_{10}\mu_i\right)^2}{2\log_{10}^2\sigma_i}\right] \qquad (1) \]
where $D_p$ is the particle diameter in nm, $N_i$ is the particle number in mode $i$, $\mu_i$ is the geometric mean of mode $i$ in nm and $\sigma_i$ is the (geometric) standard deviation of mode $i$ (Seinfeld and Pandis, 2006). In this method (Vartiainen et al., 2007) the number of modes per size distribution, $n$, is allowed to vary freely from one to three to obtain the best fit. The geometric means of the modes ($\mu_i$) are also left unconstrained.
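For illustration, a multi-modal log-normal distribution of this kind can be evaluated as in the sketch below. It assumes the standard base-10 log-normal form given by Seinfeld and Pandis (2006), with each mode described by a number concentration, a geometric mean diameter and a geometric standard deviation. The function name and the mode parameters are hypothetical, and the fitting algorithm of Vartiainen et al. (2007) itself is not reproduced here.

```python
import math

def lognormal_modes(dp_nm, modes):
    """Return dN/dlog10(Dp) at diameter dp_nm for a sum of log-normal modes.

    modes: list of (N_i, mu_i_nm, sigma_i) tuples, where sigma_i is the
    geometric standard deviation (dimensionless, > 1).
    """
    total = 0.0
    for n_i, mu_i, sigma_i in modes:
        log_sigma = math.log10(sigma_i)
        z = (math.log10(dp_nm) - math.log10(mu_i)) / log_sigma
        total += n_i / (math.sqrt(2.0 * math.pi) * log_sigma) * math.exp(-0.5 * z * z)
    return total

# Hypothetical two-mode example (values chosen for illustration only).
example_modes = [(1000.0, 50.0, 1.6), (500.0, 150.0, 1.5)]
value_at_100nm = lognormal_modes(100.0, example_modes)
```

Integrating each mode over all of log10(Dp) returns its number concentration N_i, which is a convenient sanity check on any implementation.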
While the modal fits presented in this study describe the size distribution in more detail than the number concentrations N12, N<25, N50 and N100, they are also a potential source of error. The error of the modal fits is estimated as the mean relative error of the fitted size distribution in percent, MRE. MRE is calculated as the arithmetic mean of the relative error of the fitted distribution $n_\mathrm{fit}$ compared to the measured size distribution $n_\mathrm{obs}$
\[ \mathrm{MRE} = \frac{100\,\%}{i_2 - i_1 + 1} \sum_{i=i_1}^{i_2} \frac{n_\mathrm{fit}(D_{p,i}) - n_\mathrm{obs}(D_{p,i})}{n_\mathrm{obs}(D_{p,i})} \qquad (2) \]
where $D_p$ is the size resolution vector of the measured distribution $n_\mathrm{obs}$, given as $dN/d\log_{10} D_p$, and the limits of the summation $i_1$ and $i_2$ can be selected to cover the entire range of the size distribution or only a fraction of it. The $n_\mathrm{fit}$ here is calculated from Eq. (1).
Especially at the larger particle sizes the size distribution was quite often not log-normal for both Botsalano and Marikana, which led to over- or underestimation of the size distribution by the modal fits. To account for this, we have calculated the MRE separately for the distribution both below and above 300 nm in addition to the overall error.
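The error estimate can be sketched as below: the arithmetic mean of (n_fit − n_obs)/n_obs over a chosen range of size bins, expressed in percent. The signed form is an assumption, chosen because the text speaks of over- and underestimation; the function name, the zero-based inclusive indices and the example values are hypothetical.

```python
def mean_relative_error(n_fit, n_obs, i1=0, i2=None):
    """Signed mean relative error in percent over bins i1..i2 (inclusive, zero-based)."""
    if i2 is None:
        i2 = len(n_obs) - 1
    terms = [(n_fit[i] - n_obs[i]) / n_obs[i] for i in range(i1, i2 + 1)]
    return 100.0 * sum(terms) / len(terms)

# Made-up example on four size bins; the first two are below 300 nm.
diam_nm = [100, 200, 400, 800]
observed = [1000.0, 500.0, 100.0, 10.0]
fitted = [1050.0, 500.0, 130.0, 13.0]

mre_below_300 = mean_relative_error(fitted, observed, 0, 1)
mre_above_300 = mean_relative_error(fitted, observed, 2, 3)
mre_overall = mean_relative_error(fitted, observed)
```

Splitting the range at 300 nm, as done in the text, keeps a poor fit in the sparsely populated large sizes from dominating the overall figure.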
For the Botsalano median distribution, Fig. 3, the modal fits represented the distribution quite well below 300 nm with a mean relative error of 5 %, but above 300 nm the fits overestimated the concentration by on average 30 %. For Marikana the modal fits were closer to the median distribution with a 0.5 % mean relative error below 300 nm and 8 % mean relative error above 300 nm.
Since the concentrations above 300 nm were not large, even a 30 % overestimate of the size distribution did not affect the total concentration significantly. For example, for the Botsalano median distribution, using the log-normal modal fits instead of the measured data to calculate N100 led to an error of 0.9 %. However, we recommend using the primary data instead of the modal fits whenever possible.
The number size distributions are presented as $dN/d\log_{10} D_p$ with units of particles cm−3 throughout this paper in prevailing conditions, but we have included in the Supplement the atmospheric pressure measured at the sites and the temperature of the DMPS system to facilitate comparison with concentrations given in STP conditions.
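A conversion from prevailing conditions to STP can be sketched with the ideal gas law, as below. The reference values of 1013.25 hPa and 273.15 K are an assumption (the text does not state which STP definition is intended), and the function name and example inputs are hypothetical.

```python
def to_stp(conc_ambient, pressure_hpa, temperature_k,
           p_ref_hpa=1013.25, t_ref_k=273.15):
    """Scale a number concentration from ambient to reference conditions.

    Uses the ideal gas law: the same air parcel occupies a smaller volume at
    higher pressure and lower temperature, so the concentration scales with
    (p_ref / p) * (T / T_ref).
    """
    return conc_ambient * (p_ref_hpa / pressure_hpa) * (temperature_k / t_ref_k)

# Illustrative inputs only: 1000 particles cm-3 measured at 850 hPa and 298.15 K.
conc_stp = to_stp(1000.0, 850.0, 298.15)
```

At pressures below and temperatures above the reference values the STP concentration is higher than the ambient one.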

Ancillary data
The spatial variability of the size distributions was studied with air mass history from back-trajectories similarly to Vakkari et al. (2011). The 96-h back-trajectories for each hour throughout the measurement period were calculated with the HYSPLIT 4.8 model (Draxler and Hess, 2004). The HYSPLIT model was run with the GDAS meteorological archive produced by the US National Weather Service's National Centers for Environmental Prediction (NCEP) and archived by the National Oceanic and Atmospheric Administration (NOAA) Air Resources Laboratory (ARL) (Air Resources Laboratory, 2011).
In the simple approach used by Vakkari et al. (2011), a 0.5° × 0.5° grid is first defined over southern Africa. Each back-trajectory is then assigned the parameters observed at the measurement site when the trajectory arrived, in this case the N12, N50 and N100 number concentrations. Each grid cell is then allocated an average value of the observed parameters assigned to the trajectories passing over it, i.e., the value of each grid cell represents the average value observed at the measurement site when air masses passed over that point.
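The gridding approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Vakkari et al. (2011): each trajectory is counted once per grid cell it crosses, and the cell value is the arithmetic mean. The function name, the data layout and the example coordinates are hypothetical.

```python
import math
from collections import defaultdict

def grid_average(trajectories, cell_deg=0.5):
    """Average the value observed at the site over all trajectories crossing each cell.

    trajectories: list of (value, [(lat, lon), ...]) pairs, where value is the
    concentration measured at the site when the trajectory arrived.
    Returns {(lat_index, lon_index): mean value}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, points in trajectories:
        # A set ensures each trajectory contributes once per cell (assumption).
        cells = {(math.floor(lat / cell_deg), math.floor(lon / cell_deg))
                 for lat, lon in points}
        for cell in cells:
            sums[cell] += value
            counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}

# Two made-up trajectories that share one grid cell.
example = [
    (1000.0, [(-25.5, 25.9), (-25.6, 25.2)]),
    (3000.0, [(-25.5, 25.9), (-26.2, 27.0)]),
]
cell_means = grid_average(example)
```

Cells crossed by few trajectories carry little statistical weight, so in practice a minimum count per cell would be a sensible filter.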
The accuracy of trajectories depends on the quality of the underlying meteorological data in use (Stohl, 1998) and the errors accompanying single trajectories are currently estimated as 15 to 30 % of the trajectory distance travelled (Stohl, 1998; Riddle et al., 2006). However, Vakkari et al. (2011) demonstrated that the aforementioned approach gives a fairly representative picture of the regional patterns around Botsalano.
The seasonality of wild fires in southern Africa was studied using the MODIS collection 5 Burned Area product (Roy et al., 2008). The MODIS Burned Area product provides an estimate of when a specific 500 m × 500 m land area has been burned based on rapid changes in the surface reflectance (Roy et al., 2008). The monthly number of fire observations within a 500 km radius of each measurement location was calculated for the entire measurement period.
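Counting fire observations around a site can be sketched as below. The sketch assumes each burned pixel is reduced to a single (year, month, latitude, longitude) record and uses the haversine great-circle distance on a spherical Earth. The function names, the site coordinates and the fire records are hypothetical and are not taken from the MODIS product.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def monthly_fire_counts(site_lat, site_lon, fires, radius_km=500.0):
    """Count fire records per (year, month) within radius_km of the site."""
    counts = Counter()
    for year, month, lat, lon in fires:
        if haversine_km(site_lat, site_lon, lat, lon) <= radius_km:
            counts[(year, month)] += 1
    return counts

# Made-up records around an illustrative site at 25.0 S, 26.0 E.
fires = [
    (2007, 9, -25.0, 26.0),   # at the site
    (2007, 9, -28.0, 26.0),   # about 330 km south
    (2007, 9, -31.0, 26.0),   # about 670 km south, outside the radius
    (2007, 10, -25.5, 26.5),
]
counts = monthly_fire_counts(-25.0, 26.0, fires)
```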

Median size distributions
The individual overall median distributions with modal fits are presented in Fig. 3 for both semi-clean savannah at Botsalano and polluted savannah at Marikana. The median total concentration from 12 to 840 nm (N12) was 1856 and 7805 particles cm−3 for Botsalano and Marikana, respectively. The median and mean number concentrations for both Botsalano and Marikana in all four size ranges (N12, N<25, N50 and N100) are given in Table 1 together with the six reference datasets indicated in Fig. 1. Two datasets from southern Africa covering a full seasonal cycle are also included in Table 1, although they did not present size resolved concentrations. Log-normal size distribution parameters fitted to both median and mean size distributions are presented in Table 2 for both Botsalano and Marikana.
The fitted Aitken mode concentration of Marikana (Table 2) is over four times higher than for Botsalano, although the fitted accumulation mode number concentrations are quite close to each other. The nucleation mode concentration is also relatively high for Marikana, whereas the Botsalano median distribution below 25 nm is fairly well represented as the tail of the Aitken mode. This indicates that the measurements at Marikana were much closer to the sources of the aerosol particles than at Botsalano.

Diurnal variation of the size distribution
Figure 4 illustrates the median diurnal variation of aerosol particle size distribution for Botsalano and Marikana. Both surface plots have been calculated from the measurements so that each 10 min size distribution is a median for that specific time interval over the entire measurement period for each location. The edges of the new particle formation event in the left panel of Fig. 4 (Botsalano) are not as sharp as in a typical event, since the onset of the new particle formation follows sunrise, which varies from 05:18 to 07:01 local time (LT) at Botsalano. This time-dependent variation, as well as the shape of the typical event, can be seen in Fig. A1, where median diurnal variation for Botsalano is plotted for each month.
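The construction of such a median diurnal cycle can be sketched as follows: every measured size distribution is assigned to a 10 min slot of the day, and the median is taken per slot and per size bin over all days. The function name and the record layout are hypothetical, and the example values are made up.

```python
from collections import defaultdict
from statistics import median

def diurnal_median(records, slot_minutes=10):
    """Median size distribution for each time-of-day slot.

    records: iterable of (minutes_since_midnight, [concentration per size bin]).
    Returns {slot_start_minute: [median concentration per size bin]}.
    """
    by_slot = defaultdict(list)
    for minute, spectrum in records:
        slot = (minute // slot_minutes) * slot_minutes
        by_slot[slot].append(spectrum)
    # zip(*spectra) regroups the values bin by bin across all days in the slot.
    return {slot: [median(column) for column in zip(*spectra)]
            for slot, spectra in sorted(by_slot.items())}

# Made-up records with two size bins; three fall into the 06:00 slot.
records = [
    (360, [100.0, 10.0]),
    (363, [300.0, 30.0]),
    (365, [200.0, 20.0]),
    (370, [50.0, 5.0]),
]
cycle = diurnal_median(records)
```

Using the median rather than the mean keeps occasional strong plumes from dominating the composite day.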
For the semi-clean Botsalano new particle formation is the main driving force of the diurnal variation in the size distribution. Furthermore, the accumulation mode concentration does not appear to drop at the onset of the event, if the median diurnal behaviour or data from March to November is considered. However, during summer, i.e., December to February, there is a drop in the accumulation mode in the morning, as seen in monthly median diurnal plots in Fig. A1 in Appendix A; for the mean diurnal variation the drop is stronger.
Considering the median diurnal distribution, the accumulation mode concentration increased at the onset of the new particle formation event: the increase in N100 in Fig. 4 is concurrent with the appearance of the nucleation mode. Even in the one hour median size distribution parameters in Table 3 the N100 increased from 06:00 to 12:00 LT. This is due to the growth of the pre-existing Aitken mode particles, as is seen in the modal fitting parameters in Fig. 5. The mode 2 mean diameter in Fig. 5 starts to increase rapidly from 70 nm already at 12:00 LT, which is clearly before the mode 1 particles could have reached this size as the median growth rate in new particle formation at Botsalano was 8.9 nm h−1 (Vakkari et al., 2011). At 18:00 LT the mode 2 has grown out from the Aitken mode size range and merges with the previous accumulation mode (mode 3). The growth of the pre-existing Aitken mode, therefore, seems to be an important process producing CCN-sized particles in a semi-clean savannah environment such as Botsalano.
Note also that the modal fittings in Fig. 5 are calculated independently for each size distribution, letting the modal fitting algorithm decide the number of modes from one to three. The diameters of the modes are also allowed to vary freely (Vartiainen et al., 2007), which may in some cases lead to more than one mode in, e.g., the Aitken mode size range. The division into three (or four for Marikana) modes in Fig. 5 is then done independently of the modal fittings to better illustrate the diurnal patterns in the size distribution.
For Marikana, the polluted savannah, the size distribution also presents a strong regional new particle formation event in the midday (Fig. 4). However, in addition to this the aerosol particle concentration also increased in the early morning at sunrise (after 06:00 LT) and again in the evening at sunset (after 18:00 LT). The morning and evening peaks originate from domestic space heating and cooking in the surrounding informal and semi-formal settlements (Venter et al., 2012; Hirsikko et al., 2012), which is seen also as two peaks in the CO concentration during corresponding time periods in Fig. 6.
Table 3. Size distribution parameters (fractional concentrations and modal fitting parameters) for median distributions at 06:00, 12:00, 18:00 and 24:00 LT for Botsalano and Marikana.
The seasonal variation of sunrise and sunset times affects the Marikana median diurnal variation (Fig. 4) in the same way as discussed previously for Botsalano. The dependency on sunrise and sunset in Marikana is clearly seen in Fig. A2 in Appendix A, where diurnal variation is plotted separately for each month.
The regional new particle formation in Fig. 4 dominates the diurnal variation to a degree where other daytime regional phenomena may be suppressed by it. However, as new particle formation frequency is so high in Botsalano and Marikana, the diurnal variation of only non-event days would not be representative: 6 % of days at Botsalano and only 0.3 % (i.e., two days) at Marikana were classified as non-event.
The effect of the pyrometallurgical industry around Marikana is seen as a steep rise in the SO2 concentration after sunrise (Fig. 6). The peak in SO2 concentration at Marikana does not reflect a change in the emissions, as the industrial processes are continuous, but can rather be attributed to the development of the boundary layer (Venter et al., 2012). As the top of the boundary layer reaches the effective stack height, the SO2 emissions from the stacks reach the ground level and at the same time the mixing volume of the emissions is at its minimum, which leads to a peak in the ground level concentration (Venter et al., 2012). If Figs. 4 and 6 are compared, it seems that the emissions from the industry are mainly gaseous, since there is no simultaneous increase in the concentrations of N50 and N100. The onset of the regional new particle formation event does occur at the same time as the increase in the SO2 concentration; however, the calculated sulphuric acid proxy cannot explain the observed growth at Marikana (Hirsikko et al., 2012). In addition to the regional new particle formation during daytime the nucleation mode is also present during night-time at Marikana, as seen in the fitted log-normal distribution parameters in Fig. 7. The night-time nucleation mode appears simultaneously with the evening household combustion peak, as seen in Fig. 7. However, the increase in the nucleation mode particle concentration is approximately only 10 % of the total particle concentration increase, which is comparable to previous measurements on residential wood combustion (e.g., Tissari et al., 2008). Figure 7 also shows that the growth of pre-existing Aitken mode particles during the new particle formation event may contribute significantly to the concentration of CCN-sized particles at Marikana.
The morning and evening peaks at Marikana do not, however, necessarily give a regionally representative picture of the size distribution for the entire mining and metallurgical industrial region around Marikana, known as the western Igneous Bushveld Complex (Venter et al., 2012; Hirsikko et al., 2012). This is because the nights in Marikana are calm and, therefore, the emissions from the household combustion accumulate close to the surface, which is seen also as dilution of the concentration after sunrise in Figs. 4, 6 and 7. Even though the early morning and evening peaks might not represent the regional aerosol, they characterise the emissions from informal settlements that have not received much attention (Hirsikko et al., 2012), notwithstanding that such informal settlements are common around the cities in South Africa (Venter et al., 2012).

Seasonal variation of the size distribution
The seasonality in the size distribution at both semi-clean and polluted savannah sites is strongest for the N100 concentration, i.e., the accumulation mode. For Marikana N50 also exhibits seasonal variation in addition to the underlying N100 seasonality. Comparing Figs. 8 and 9 shows how the highest N50 and N100 concentrations are concurrent with the highest CO concentrations.
The seasonality of log-normal modal parameters was studied by fits to the monthly median distributions presented in Appendix A. Interestingly, from the modal fitting point of view the seasonality in N100 for both semi-clean and polluted savannah (Fig. 8) originates in an increase in the Aitken mode (mode 2) number concentration rather than in the accumulation mode (mode 3) concentration.
The frequency of occurrence of mode 4 at Marikana has a clear seasonality with a maximum during the colder months, as seen in Fig. 10. The frequency of occurrence of mode 3 at Botsalano also seems to have a seasonality with a minimum from July to August (Fig. 10), which is when the number concentration of mode 2 is elevated. The monthly median modal fitting parameters are included in the Supplement.
The correlation coefficient of hourly mean CO and N100 for each month (Fig. 10) shows that the months with higher N100 concentration also have a higher correlation between CO and N100, which implies that the seasonal variability of the size distribution is closely associated with incomplete burning for both Botsalano and Marikana. During the dry season, from May to September, N100 and CO are continuously relatively highly correlated, with a correlation coefficient above 0.7 for both locations.
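A per-month correlation of this kind can be sketched as below. The text does not state which correlation coefficient was used, so the Pearson coefficient here is an assumption; the function names, the record layout and the hourly values are hypothetical.

```python
import math
from collections import defaultdict

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def monthly_correlation(hourly):
    """hourly: iterable of (month, co, n100) hourly means. Returns {month: r}."""
    groups = defaultdict(lambda: ([], []))
    for month, co, n100 in hourly:
        groups[month][0].append(co)
        groups[month][1].append(n100)
    return {month: pearson(co, n100) for month, (co, n100) in groups.items()}

# Made-up hourly means: July tracks CO closely, January does not.
hourly = [
    (7, 100.0, 1000.0), (7, 150.0, 1500.0), (7, 200.0, 2000.0), (7, 250.0, 2500.0),
    (1, 100.0, 900.0), (1, 150.0, 700.0), (1, 200.0, 950.0), (1, 250.0, 800.0),
]
r_by_month = monthly_correlation(hourly)
```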
The monthly average number of MODIS Burned Area product (Roy et al., 2008) fire observations within a 500 km radius of each measurement location shown in Fig. 11 indicates that for Botsalano the highest N100 concentrations (Fig. 8) are reached at the peak of wild fire occurrence, i.e., September. In contrast, for Marikana the highest N100 and N50 concentrations were observed already in July, which is the coldest month of the year (Fig. 11). Therefore, it seems that the seasonality of N100 and N50 for Marikana during the cold winter months is determined by domestic space heating rather than regional wild fires. Additionally, the seasonal peak in wild fires results in continued high correlation of CO and N100, even after the diminishing need for domestic space heating in September. Even if only daytime data is selected for Marikana, the shape of the correlation between CO and N100 and the N100 seasonal variation stay unchanged; only the correlation between CO and N100 during the wet season (from October until April) is lower. The daytime N50 and N12 indicate the seasonality of the formation and growth rates (Hirsikko et al., 2012), which is reflected as increased concentrations during the wet season. However, the dry season peak in N50 and N100 still stays in July, which indicates that the household space heating and cooking remain a stronger source of particles than regional wild fires at Marikana even considering only daytime data. The monthly median CO concentration, Fig. 9, has two peaks at Marikana: one in July, the coldest month, and a secondary one in September, which is the wild fire peak month, Fig. 10. The N100, however, does not peak in September but only in July (Fig. 8). In contrast, in Botsalano September is the peak month for both CO and N100, which is approximately 560 particles cm−3 higher than during summer (cf. Table 4).
In Marikana the N100 in September is approximately 470 particles cm⁻³ higher than during summer (Table 4), not far from the increase at Botsalano, but still 1350 particles cm⁻³ lower than in July. The intensity of the evening burning peak in September is, however, close to what it is in February and March (Fig. A2), which is reasonable as the mean temperature is 19.4 °C and there is little or no need for heating in the evenings. Therefore, the elevated N100 concentration at Marikana in September is due to regional wild fires; the concentration only appears low compared to the July peak from domestic heating and cooking.
Why does the CO then peak in September at Marikana? The reason is that CO has a lifetime of 30-90 days in the troposphere, while aerosol particle lifetime varies from a few days to a few weeks (Seinfeld and Pandis, 2006). Therefore, CO accumulates in the atmosphere over a longer period than aerosol particles, which leads to an increase in the ratio of CO to N100 towards the end of the dry season and especially in the early wet season, when increased wet removal decreases aerosol particle concentration, but not CO concentration.
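The lifetime argument can be illustrated with a simple one-box budget. This is only a sketch under strong assumptions (a well-mixed box, first-order removal, and illustrative lifetimes of 60 days for CO and 7 days for aerosol particles chosen from within the ranges quoted above); it is not a calculation made in this study.

```latex
% One-box budget for a species with concentration C, source S and lifetime \tau
\[
  \frac{\mathrm{d}C}{\mathrm{d}t} = S(t) - \frac{C}{\tau}
\]
% After the source weakens (S \approx 0), the concentration decays as
\[
  C(t) = C_0 \, e^{-t/\tau}
\]
% Fraction remaining 30 days later, for the assumed lifetimes:
\[
  \text{CO } (\tau = 60\ \mathrm{d}):\; e^{-30/60} \approx 0.61,
  \qquad
  \text{aerosol } (\tau = 7\ \mathrm{d}):\; e^{-30/7} \approx 0.014
\]
```

Under these assumptions most of the CO emitted a month earlier is still present while nearly all of the co-emitted particles have been removed, so the CO to N100 ratio grows as the dry season progresses.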
The impact of burning, as wild fires for semi-clean savannah or as a combination of wild fires and household combustion for polluted savannah, cannot fully explain the relative increase in the concentrations during the dry season. The absence of wet removal, combined with lower formation and growth rates during the dry season (Vakkari et al., 2011; Hirsikko et al., 2012), increases the relative importance of the aforementioned combustion source of aerosol particles during the dry season. This is seen as a drop in the N100 and CO correlation at the beginning of the wet season (Fig. 11), which usually starts after mid-October. The seasonal median size distribution parameters for both Botsalano and Marikana are collated in Table 4. Here summer is defined as December to February, autumn as March to May, winter as June to August and spring as September to November.

Comparison to previous observations
Four of the sites chosen for the comparison in Table 1, i.e., Mukteshwar, India (Hyvärinen et al., 2011), Shangdianzi, China (Shen et al., 2011), Southern Great Plains, US (Sheridan et al., 2001) and K-Puszta, Hungary (Asmi et al., 2011), are surrounded by grassland or cropland with no local sources, although none of them is a truly remote location.
Mukteshwar lies approximately 200 km from the megacity of New Delhi (Hyvärinen et al., 2011) and Shangdianzi approximately 150 km from the megacity of Beijing (Shen et al., 2011). The nearest coal-fired power plants lie within 50 km of the Southern Great Plains site (Rissman et al., 2006) and K-Puszta is located only 80 km from Budapest with 1.5 million inhabitants (Asmi et al., 2011). The fifth site characterised as crop- or grassland in Table 1 is located at Fazenda Nossa Senhora Aparecida in Rondonia, Brazil, approximately 50 km from the closest city of Ji-Parana with 100 000 inhabitants (Rissler et al., 2006). The Rondonia site is impacted by extensive biomass burning during the dry season (Rissler et al., 2006) and, thus, the concentrations in Rondonia are clearly higher than in a natural environment in the Amazon basin (Martin et al., 2010).
The Gual Pahari site in India was included for comparison with Marikana, because both are affected by biomass burning for household heating and cooking (Hyvärinen et al., 2011; Hirsikko et al., 2012). However, Gual Pahari is only 25 km from New Delhi and is thus impacted by the megacity (Hyvärinen et al., 2011). The Ispra site in Italy was selected to represent a more industrially polluted location (Asmi et al., 2011).

Semi-clean savannah
The total concentration at Botsalano is slightly lower than or comparable to that at the other semi-clean grass- or cropland sites in Table 1, except for the Shangdianzi site (Shen et al., 2011) and the Rondonia site during the dry season (Rissler et al., 2006), which have clearly higher total concentrations. However, the concentration of particles larger than 100 nm is clearly lower at Botsalano than at the other semi-clean sites except for the Southern Great Plains (Sheridan et al., 2001). One plausible explanation is that the prevailing anticyclonic recirculation of air masses for Botsalano (Vakkari et al., 2011) forces air masses from the industrial sources around Johannesburg to travel considerably longer than the direct distance to Botsalano. Longer transport allows more time for removal processes and dilution. On the other hand, the concentration below 100 nm and, therefore, also the total concentration are kept relatively high by the extremely high frequency of new particle formation observed at Botsalano (Vakkari et al., 2011).
Of the previously published datasets in Table 1, the only site besides Botsalano where new particle formation dominates the diurnal variation all year round is Shangdianzi in China (Shen et al., 2011). The Southern Great Plains site also has a higher total concentration during midday, and Sheridan et al. (2001) speculate this to be due to new particle formation. However, there is no size-resolved information available for the Southern Great Plains below 100 nm and, therefore, the source of the diurnal variation remains unknown (Sheridan et al., 2001). In Gual Pahari, India, new particle formation is also seen as an increase in the total concentration during pre-monsoon and monsoon seasons, although the morning peak from traffic and the evening peak from heating and cooking dominate the diurnal variation (Hyvärinen et al., 2009; Raatikainen et al., 2011).
The seasonality in Botsalano originates in wild fires and agricultural biomass burning, which have been recognised as a major source of aerosol particles during the dry season also in the Amazon basin (Martin et al., 2010). However, there are no size distribution measurements covering the complete seasonal cycle in the Amazon basin and, therefore, the full effect of the fires cannot be quantified (Martin et al., 2010). Rissler et al. (2006) reported the total aerosol concentration in Rondonia to be on average five times as high at the end of the dry season as during the early wet season. Furthermore, the measurements in Rondonia were conducted in an area with very intensive biomass burning and, therefore, the results cannot be considered representative of more pristine regions in the Amazon basin, although they are affected by the fires as well (Rissler et al., 2006; Martin et al., 2010). In Shangdianzi the seasonality is driven by new particle formation, leading to the highest concentrations in spring (Shen et al., 2011), and for the Southern Great Plains the highest concentrations of particles larger than 100 nm occur during late summer, which has been attributed to windblown dust (Sheridan et al., 2001).

Polluted savannah
The total aerosol particle concentration at Marikana is clearly lower than at Gual Pahari and only slightly higher than at semi-clean Shangdianzi, but higher than for Ispra (Table 1). The concentration above 100 nm in Marikana is actually lower than at any of these three sites. Therefore, it seems that for Marikana the total concentration is largely due to new particle formation, which occurs with record-high frequency (Hirsikko et al., 2012). The two comparison datasets from southern Africa in Table 1, Elandsfontein (Laakso et al., 2012) and Gaborone (Jayaratne and Verma, 2001), lie between the observed concentrations in Botsalano and Marikana, which is reasonable considering the anthropogenic sources at these locations.
The morning and evening peaks at Marikana are relatively similar to the diurnal variation of the size distribution at Gual Pahari, but the origin and timing of the morning peak are different (Raatikainen et al., 2011). In Mukteshwar, India, the evening concentrations are also elevated before and after the monsoon season (Hyvärinen et al., 2009), but at that site the diurnal variation seems to depend mainly on the boundary layer evolution rather than changes in the sources (Raatikainen et al., 2011). In Rondonia, Brazil, the highest concentrations from biomass burning occur during evening and night-time, which is similar to Marikana. However, there is little or no new particle formation during daytime in Rondonia (Rissler et al., 2006). The best resemblance to the Marikana morning and evening peaks has been reported by Jayaratne and Verma (2001) from Gaborone, Botswana, although their measurements only covered the size range above 100 nm. Jayaratne and Verma (2001) also interpreted the increase in evening concentration to originate in biomass burning for space heating.
In addition to the size distribution seasonality at Marikana, the Asian Brown Cloud has also been shown to originate largely in biomass burning (e.g., Gustafsson et al., 2009), but in Gual Pahari and Mukteshwar the seasonality seems to be due to the monsoon seasonality rather than changes in the sources (Hyvärinen et al., 2011; Raatikainen et al., 2011). Of the other measurement sites listed in Table 1, higher concentrations were reported during winter in Gaborone (Jayaratne and Verma, 2001) and Ispra (Asmi et al., 2011). However, the origin of the seasonality for Ispra is not discussed by Asmi et al. (2011) and, therefore, the only comparison locations where seasonality has previously been attributed to biomass burning are Rondonia (Rissler et al., 2006) and Gaborone, where the burning seems to originate in space heating (Jayaratne and Verma, 2001).
Therefore, barring the limited spatial coverage of the comparison data, outside of southern African savannah, whether semi-clean or polluted, combustion seems to be the most important source of seasonal variation in the size distribution only in the Amazon basin, although there are no long-term datasets from the Amazon to quantify the effect over the complete seasonal cycle (Rissler et al., 2006;Martin et al., 2010).

Spatial variation of the size distribution
Spatial variability of the size distributions was studied by combining the size distribution measurements with back-trajectories, following Vakkari et al. (2011). However, for Marikana anthropogenic sources in the surrounding 60 km long and 30 km wide valley are so strong that they dominate the air mass history and, therefore, only the data from the semi-clean Botsalano could be used for this purpose.
Figure 12 illustrates how at Botsalano the clean, semi-arid western sector supports clearly lower particle concentrations than the eastern sector with higher biological and anthropogenic activity. The source areas of N12, N50 and N100 at Botsalano can be divided further into four regions, as indicated in Fig. 12. The first two lie in the clean sector west of Botsalano, which is divided into two sub-regions, i.e., the Karoo region southwest of Botsalano and the Kalahari region northwest of Botsalano. The third is the industrial hub of South Africa located around the Johannesburg-Pretoria megacity and the fourth is the anticyclonic re-circulation path (Tyson et al., 1996; Vakkari et al., 2011) that encircles the industrial hub of South Africa.
In order to obtain a more detailed picture of the size distribution within the source regions, hourly back-trajectories were used to select a subset of the measurements best representing each source region. For the selection, the time spent over each source region in Fig. 12 was first calculated for each back-trajectory. The calculated time-over-source-region was then linearly interpolated to the DMPS time stamps, thus attributing to each size distribution the time the air mass had spent over each of the source regions. In this manner, each 10 min size distribution could be classified according to the criteria listed in Table 5, which resulted in a total of 17 000 10 min size distributions with well-defined source region origins. The criteria in Table 5 were set to select the air masses best representing each source region, while simultaneously minimising the contribution from other source regions.
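The selection procedure above can be sketched in a few lines. This is an illustration only: the function names, the data layout and the two thresholds (`min_hours_in`, `max_hours_other`) are hypothetical placeholders, since the actual criteria of Table 5 are not reproduced here.

```python
import numpy as np

def time_over_regions_at_dmps(traj_times, traj_hours_over, dmps_times):
    """Linearly interpolate time-over-region to the DMPS time stamps.

    traj_times : arrival times of the hourly back-trajectories (hours, increasing)
    traj_hours_over : dict {region: hours spent over that region, per trajectory}
    dmps_times : time stamps of the 10 min size distributions (hours)
    Note: np.interp holds the end values constant outside traj_times.
    """
    return {region: np.interp(dmps_times, traj_times, hours)
            for region, hours in traj_hours_over.items()}

def classify(hours_at_dmps, min_hours_in, max_hours_other):
    """Assign each size distribution to one region, or None if no criterion is met.

    Hypothetical criterion: enough time over the region itself and
    little time over all other regions combined.
    """
    regions = list(hours_at_dmps)
    n = len(next(iter(hours_at_dmps.values())))
    labels = [None] * n
    for i in range(n):
        for r in regions:
            inside = hours_at_dmps[r][i]
            others = sum(hours_at_dmps[q][i] for q in regions if q != r)
            if inside >= min_hours_in and others <= max_hours_other:
                labels[i] = r
                break
    return labels
```

Size distributions labelled `None` are simply left out of the region-specific statistics, which mirrors the idea of keeping only air masses with a well-defined origin.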
The median distributions for the four source regions are shown in Fig. 13, which confirms the differences between the regions defined in Fig. 12. In the clean sector the Karoo size distribution is dominated by the nucleation mode and the Kalahari size distribution by the accumulation mode. This can be explained by the different origin of air masses from the Karoo and the Kalahari source regions. The air masses from the Karoo source region frequently originate over the ocean, especially when cold fronts sweep over southern Africa from the south-west. In contrast, the Kalahari source region air masses originate over land in all of the four-day back-trajectory calculations. In addition, the Karoo source region air masses have to pass over the coastal mountains reaching up to 2000 m a.s.l., which is likely to lead to increased wet removal of the aerosol. Therefore, the Kalahari source region can be considered as an aged clean sector and the Karoo source region as a fresh clean sector for Botsalano.
To the east of Botsalano both the re-circulation and the industrial hub source regions have high N50 and N100 concentrations compared to the clean source regions. The industrial hub differs from the re-circulation by having a higher Aitken mode concentration and slightly lower N100 concentration, as is seen in Table 6. This indicates that the industrial hub aerosol is fresher than the re-circulation aerosol, which is reasonable since it contains most of the large point sources, e.g., at least 13 coal-fired power stations without de-SOx and de-NOx, several petrochemical plants, at least 13 pyrometallurgical smelters and the megacity of Johannesburg-Pretoria, with more than 10 million inhabitants (Lourens et al., 2012). There are some individual large anthropogenic point sources also in the re-circulation source region, including one coal-fired power plant approximately 300 km northeast of Botsalano and the city of Gaborone 50 km north of Botsalano. However, there are certainly far fewer large anthropogenic point sources in the re-circulation source region and they are also less concentrated geographically.
Notwithstanding the relatively high total number concentration of the industrial hub source region, its concentration is more than three times lower than the measured concentration in Marikana, cf. Tables 1 and 2. At Marikana the mean diameter is also clearly lower than for the industrial hub source region at Botsalano, as seen in Fig. 13. This demonstrates that after 200 km of transport over a relatively clean area, the industrial hub size distribution observed at Botsalano no longer represents the size distribution at the sources.
While the differences in the anthropogenic activities are clear between the western and eastern source regions, the difference in the aerosol size distribution is partly of natural origin as well. The re-circulation and industrial hub source regions lie in the savannah and grassland biomes while the Karoo and Kalahari source regions are mostly semi-arid regions, with limited coverage of the grassland biome. Vakkari et al. (2011) concluded that the amount of biological activity in the western and eastern sectors has an effect on the growth rates of aerosol particles in regional new particle formation, which cannot be distinguished from anthropogenic sources in this analysis.
Considering the median distributions in Fig. 13, the differences observed in the median diurnal variation for the defined source regions in Fig. 14 are not surprising. The re-circulation and industrial hub source regions have distinct new particle formation events at midday. The main difference is that the new mode concentration of the industrial hub source region is higher than that of the re-circulation source region. The Kalahari is the only source region that does not exhibit regional new particle formation in the median diurnal variation. This is probably due to smaller concentrations of both biogenic and anthropogenic precursors compared to the re-circulation and industrial hub source regions, because the condensation sink (CS) in the Kalahari (2.5 × 10⁻³ s⁻¹) is lower than that of the eastern source regions (Vakkari et al., 2011). In the Karoo source region the combination of an even lower condensation sink (1.4 × 10⁻³ s⁻¹) than in the Kalahari source region with lower growth rates (Vakkari et al., 2011) results in the nucleation mode being continuously present.
Despite a lower accumulation mode concentration than the re-circulation and industrial hub source regions, the Kalahari source region has a higher AOD, as seen in Fig. 15. Most likely the increase in AOD over the Kalahari source region originates in desert dust in the coarse mode size range. However, the PM2.5 and PM10 mass concentrations observed at Botsalano are not elevated for the Kalahari source region (Fig. 15), which implies that the desert dust from the Kalahari is not transported to Botsalano. As the condensable vapours have a short lifetime (approximately CS⁻¹), it seems that the Kalahari desert dust cannot explain the lack of new particle formation at Botsalano for the Kalahari source region, although dust storms have been shown to effectively scavenge sub-100 nm aerosol particles (Jayaratne et al., 2011). Now, if the condensable vapour source rate and aerosol formation rate at 2 nm are assumed equal during new particle formation for the Karoo and Kalahari source regions, even the difference in the submicron CS is enough to suppress new particle formation events for the Kalahari source region. The Kerminen and Kulmala (2002) formulation connects the ratio of the observed formation rate to the nucleation rate at a lower diameter with CS and growth rate, and from the assumptions above and the observations for the Karoo (Vakkari et al., 2011) it follows that the J10 for the Kalahari source region would be lower than the J10 from the Karoo by a factor of 10¹³, i.e., the nucleated particles will be lost by coagulation before they reach the 10 nm detection limit.
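For orientation, the Kerminen and Kulmala (2002) relation is commonly written in the form sketched below. The notation is an assumption made here for illustration (it is recalled from the cited formulation rather than taken from this text): η is a survival parameter, γ a proportionality factor, CS′ a quantity proportional to the condensation sink and GR the growth rate.

```latex
% Apparent formation rate at 10 nm from the formation rate at 2 nm
\[
  J_{10} = J_{2}\,
  \exp\!\left[\eta\left(\frac{1}{10\ \mathrm{nm}} - \frac{1}{2\ \mathrm{nm}}\right)\right],
  \qquad
  \eta = \gamma\,\frac{CS'}{GR}
\]
% With equal J_2 in both source regions, the ratio depends only on the difference in \eta
\[
  \frac{J_{10}^{\mathrm{Kalahari}}}{J_{10}^{\mathrm{Karoo}}}
  = \exp\!\left[-0.4\ \mathrm{nm^{-1}}
    \left(\eta^{\mathrm{Kalahari}} - \eta^{\mathrm{Karoo}}\right)\right]
\]
```

Because η enters through an exponential, a higher condensation sink combined with a lower growth rate reduces the fraction of nucleated particles surviving to 10 nm very steeply, which is the behaviour the comparison above relies on.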

Conclusions
We have presented here a total of four years of submicron aerosol particle size distribution measurements from semi-clean and polluted southern African savannah. Very few previous observations extending below 100 nm and covering a full seasonal cycle exist for this region. The median total concentration from 12 to 840 nm in the semi-clean Botsalano was 1856 particles cm⁻³. In the more polluted Marikana the total concentration was more than four times higher, with a median of 7805 particles cm⁻³. The difference between the semi-clean and polluted median distributions was largest in the nucleation mode, partly because the nucleation mode was present for Marikana also at night-time.
Regional new particle formation frequency for both Botsalano and Marikana is the highest ever recorded (Vakkari et al., 2011; Hirsikko et al., 2012) and at Botsalano the diurnal behaviour of the size distribution is dominated by new particle formation. In Marikana, however, the effect of regional new particle formation is overshadowed by the effect of heating and cooking in the informal and semi-formal settlements. Surprisingly, the industry in Marikana does not have discernible direct effects on the size distribution, although the SO2 clearly shows the emissions from the industry.
The seasonal variation of the size distribution is driven by emissions from incomplete combustion at both Botsalano and Marikana. At Botsalano the source of the combustion is the regional wild fires and the highest concentrations of N100 are in September, i.e., the end of the dry season and the peak of wild fire occurrence. In Marikana, however, the seasonal variation in N100 and N50 originates from the domestic heating and cooking practices in the informal and semi-formal residential areas. Consequently, the highest concentrations occur in July, which is the coldest month of the year. In both locations the N100 and CO are correlated throughout the dry season from May to September.
Comparison of the data presented here to previously published long-term aerosol particle size distribution measurements carried out in comparable environments shows that Botsalano and Marikana have unique combinations of aerosol particle sources and meteorological conditions.Especially the strong seasonal dependency on incomplete burning differentiates the semi-clean and polluted savannah from the previous observations.The Amazon basin seems to be the only location outside southern Africa where seasonality of the aerosol particle size distribution is dominated by wild fires and biomass burning, but the lack of measurements covering a full seasonal cycle does not allow quantifying the effect of the combustion in the Amazon basin (Martin et al., 2010).
The air mass history study revealed four different source regions for size distributions for Botsalano. For Marikana the large local sources made it impossible to distinguish source regions from the air mass history. Two of the source regions for Botsalano lie in the clean western sector: the northwestern Kalahari region and the southwestern Karoo region. Because of the different meteorological patterns transporting air from the Karoo and the Kalahari to Botsalano, these two clean sector source regions differ substantially. The Karoo represents fresh clean background air with very low accumulation mode concentration and a continuously present nucleation mode. The Kalahari, on the other hand, represents aged clean background air and is dominated by the accumulation mode, with a nearly complete absence of the nucleation mode. The concentrations from the Kalahari are lower than from the eastern sector.
In the sector east of Botsalano the difference between the re-circulation and the industrial hub source regions is that the industrial hub has a higher concentration in the Aitken mode, a sign of fresh aerosol. The N100 concentration from the eastern source regions is at least twice as high as the N100 in the western source regions and, compared to the fresh clean air from the Karoo source region, up to four times as high. However, the difference between the clean and polluted source regions is not only anthropogenic, but partly also natural, as the eastern sector has higher biological activity and, therefore, higher aerosol particle formation and growth rates (Vakkari et al., 2011).
Fig. 1. Long-term observations of aerosol particle number size distribution or total concentration extending below 100 nm in the continental boundary layer excluding urban environments. The observations from the locations indicated with red circles are compared with the results of this study. Note that the South American comparison site had only two months of measurements.

Fig. 3. Median distributions for Botsalano and Marikana. The dots indicate the median measured distribution and the lines show the fitted log-normal distribution from Table 2. Shaded areas indicate the upper and lower quartiles.

Fig. 4. Median diurnal variation of the aerosol particle size distribution for Botsalano and Marikana. In the lower panels the upper and lower quartiles are indicated by the shaded areas.

Fig. 6. Median SO2 and CO diurnal variations for Botsalano and Marikana. Shaded areas indicate the upper and lower quartiles.

Fig. 10. Monthly frequency of occurrence of each mode in Botsalano (upper panel) and Marikana (lower panel). The frequency is based on modal fits to the monthly median diurnal plots in Appendix A.

Fig. 11. On the top are presented the monthly correlation coefficients for CO and N100 concentrations for Botsalano and Marikana. The correlation coefficient has been calculated for hourly median CO and N100. On the bottom are presented the monthly median temperature and the number of MODIS burned area fire observations within 500 km radius of Botsalano (left panel) and Marikana (right panel).

Fig. 12. Mean N12, N50 and N100 for the four defined source regions at Botsalano. Black dots indicate Botsalano on the left and the later measurement location, Marikana, on the right.

Fig. 13. Median size distributions for the four source regions defined for Botsalano (left axis) and the Marikana median size distribution (right axis). Modal fitting parameters are given in Table 6 for the source regions.

Fig. 14. Median diurnal variation of the size distribution for the four source regions derived for Botsalano.

Fig. 15. Top: median AOD over southern Africa from July 2006 to January 2008 from MODIS aerosol product at 550 nm (Remer et al., 2005). In the middle the mean PM10 mass concentration and at the bottom the mean PM2.5 mass concentration for the four defined source regions at Botsalano. Black dots indicate Botsalano (on the left) and Marikana (on the right).

Table 1. Median and mean aerosol number concentrations for Botsalano and Marikana in different size ranges. Medians are indicated with bold font. Number concentrations, locations and short descriptions of the selected comparison measurements are also included.

Table 2. Size distribution parameters for Botsalano and Marikana. Negative mean relative error (MRE) values indicate overestimation by the modal fits.

Table 4. Seasonal median distribution parameters for Botsalano and Marikana. Summer is December to February, autumn is March to May, winter is June to August and spring is September to November.

Table 5. Criteria for source region characterisation and the number of size distribution measurements obtained for each source region. The average number of observations per 10 min average in the average diurnal variation in Fig. 13 is also included.

Table 6. Size distribution parameters for the four source regions defined for Botsalano.