Are dense networks of low-cost nodes really useful for monitoring air pollution? A case study in Staffordshire

Air pollution exhibits hyper-local variation, especially near emission sources. Together with people's time-activity patterns, this variation is the most critical element determining exposure. Pollution exposure is time-activity and path-dependent, with specific behaviors such as mode of commuting and time spent near a roadway or in a park playing a decisive role. Compared to conventional air pollution monitoring stations, nodes containing low-cost air pollution sensors can be deployed with very high density. In this study, a network of 18 nodes using low-cost air pollution sensors was deployed in Newcastle-under-Lyme, Staffordshire, UK, in June 2020. Each node measured a range of species including nitrogen dioxide (NO2), ozone (O3) and particulate matter (PM2.5 and PM10); this study focuses on NO2 and PM2.5 over a 14-month period from August 1, 2020 to October 1, 2021. A simple and effective temperature, scale and offset correction was able to overcome data quality issues associated with temperature bias in the NO2 readings. In its recent update, the World Health Organization dramatically reduced annual exposure limit values from 40 to 10 µg m−3 for NO2 and from 10 to 5 µg m−3 for PM2.5. We found the average annual mean NO2 concentration for the network was 17.5 µg m

1 Introduction

According to the World Health Organization (WHO), seven million premature deaths every year can be attributed to poor air quality (Lelieveld, 2015; WHO, 2021). In response to the adverse health effects caused by air pollution, the WHO developed Air Quality Guidelines (AQGs) for a set of key air pollutants, including nitrogen dioxide (NO2) and particulate matter with an aerodynamic diameter ≤ 2.5 µm (PM2.5) (WHO, 2021). Since WHO's 2015 recommendation, evidence has accumulated showing many additional negative impacts of air pollution on health (Abdo, 2016; Sun, 2016; Chen, 2018; Ai, 2019; Wang, 2019; Wu, 2019; Zhang, 2020). After a comprehensive review of the evidence, the WHO has recently recommended a much stricter set of standards and warned that exceeding the new air quality guideline levels is associated with significant health risks. Table 1 shows the previous and revised AQGs for the pollutants of focus within this study along with the EU standards. The EU standards are legally binding, while the WHO values are indicative.
Traditionally, air quality monitoring is based on static air quality monitoring stations (AQMS) with calibrated high-precision instruments. However, due to their purchase and maintenance costs, conventional AQMSs are generally sparsely located (Kumar, 2015; Maag, 2018). This monitoring strategy is suited to characterizing regional air quality but could fail to account for elevated concentrations near sources. Moreover, the temporal and spatial resolution of such monitoring station networks is limited (Motlagh, 2020). For example, there are a total of 18 AQMSs in the nation of Denmark, responsible for measuring concentrations at street level, urban background and regions (Danish National Monitoring Program for Water and Nature (NOVANA) (Ellermann, 2018)).
Meanwhile, field studies have shown that pollution levels, especially in urban environments, can vary substantially within a few meters due to localized air pollution sources (Kingham, 2000; Monn, 2001; Zou, 2009; Wang, 2018; Li, 2019; Wilson, 2019). The local component can often be an important factor contributing to people's exposure, for example, for those who commute in a vehicle and/or work as professional drivers, street police, bicycle delivery etc. (Frederickson, 2020a), or live or work in buildings near busy roads. Low-cost air pollution sensors and sensor networks have evolved rapidly during the last few decades, enabled by technological progress and the development of fast and inexpensive wireless communication systems (Snyder, 2013). While the technologies are still evolving, low-cost air pollution sensors are becoming available and are starting to become a valuable supplement to the sparse conventional AQMS. Low-cost sensor (LCS) networks are not a substitute for networks of conventional AQMSs, since high-quality monitoring data are necessary for checking compliance with guidelines and for validating less expensive mapping obtained from modelling and/or LCS-based monitoring.
Table 1. EU standards (Gemmer, 2013) and WHO's global air quality guidelines (AQGs) from 2015 and 2021 (WHO, 2021). All concentrations are in µg m−3.
Networks of low-cost air pollution sensors are becoming more common. On a device level, clearly the sensor elements cannot compete with commercial instruments regarding The Three 'S's: Sensitivity, Stability, and Selectivity (Borrego, 2016; Castell, 2017; Frederickson, 2020b). However, this may be more than compensated because LCSs enable greatly increased site density and temporal resolution, facilitating new insights into patterns and sources of air pollution. In addition, LCSs can not only supplement coarse-scale monitoring networks but also add substantial value to mappings provided by mathematical models. Dense networks of LCSs can be used for source apportionment and to distinguish local from non-local pollution (Heimann, 2015), and as an aid in interpreting mathematical models that are often an integrated part of air quality monitoring (Hertel, 2007).

Within this study, electrochemical LCSs are used to measure gaseous pollutants and laser-based particle counters are used to quantify particulate matter. Electrochemical sensor technology offers a number of advantages including linear response, small size, low cost in fabrication, relatively fast response, and low power consumption (Frederickson, 2020b). While low-cost air pollution sensors bring new opportunities for monitoring, important issues remain regarding data quality. Studies show that sensor data can be influenced by environmental factors such as temperature and confounding gases (Spinelle, 2015, 2017; Mead, 2013; Bulot, 2020). Considerable efforts have been made to understand these factors, with varying success. Field work presents a complex and dynamic environment, greatly complicating the task of calibration. Experience shows that it is crucial to test each individual sensor and correct for multiple ambient factors (Popoola, 2016).
While a time series analysis based on summary statistics is a simple and effective tool, more sophisticated techniques are necessary to better understand the ultimate causes of these variations (Hwang, 2000). Spectral analysis using the Fourier transform can provide a deeper understanding of time series, because transformation into the frequency domain allows characterization of sources according to their periodicity and rate of change (Percival, 1998). While spectral analysis has long been used for meteorological variables because of its ability to distinguish synoptic and seasonal signals (Van der Hoven, 1957; Lyons, 1975; Eskridge, 1997), studies applying the Fourier transform to air pollution data emerged much later (Rao, 1976; Hogrefe, 2006; Choi, 2008; Lazi, 2016).

Figure 1. Spatial distribution of the AirNode network (left) and an overview of the location of the network relative to the two closest reference stations (right). The urban background station at Stoke-on-Trent Centre is highlighted with a red marker, whereas the roadside monitoring station at Stoke-on-Trent A50 Roadside is highlighted with a blue marker. The last AQMS used in this study (regional background monitoring station at Ladybower) is located 54 km from the network and is for clarity not included on the map. Maps obtained from ©OpenStreetMap contributors 2021. Distributed under the Open Data Commons Open Database License (ODbL) v1.0 (OpenStreetMap, 2021).
There is a relation between temporal and spatial scales of air pollution (Brasseur, 2017). Analysis of air quality data in the frequency domain contributes to the understanding of periodic behaviors and yields information about spatial and temporal scales of the hidden, underlying mechanisms (Marr, 2002). Short-term fluctuations of the pollutant concentrations are related to local-scale phenomena, including local dispersion conditions and patterns in local emissions and chemistry. Conversely, seasonal changes and the long-range transport and emissions of pollutants contribute to the spectrum at very low frequencies (Tchepel, 2009). On the time scale of days, there are the motions of weather systems, for example a high pressure system with well-developed photochemical air pollution. Pollution arriving from a distant source is characterized by a slowly rising and falling signal due to the effects of transport time and atmospheric mixing. Regional emissions are of course regional in scale and photochemical pollution typically develops in a synoptic air mass. In contrast, local sources (e.g. traffic) more often present as a sharp spike in concentration. Even an instantaneous puff of pollution will broaden with time based on the vertical and horizontal eddy diffusion coefficients, K, which are on the order of 100 m2 s−1 (Seinfeld and Pandis, 2016).
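As a rough worked illustration of this broadening (a sketch assuming one-dimensional Fickian diffusion with a constant coefficient K, which simplifies real atmospheric mixing), the standard deviation of an initially point-like puff grows with time as

$$\sigma(t) = \sqrt{2 K t}$$

With K = 100 m2 s−1 this gives

$$\sigma(1\ \mathrm{h}) = \sqrt{2 \times 100 \times 3600}\ \mathrm{m} \approx 0.85\ \mathrm{km}, \qquad \sigma(24\ \mathrm{h}) = \sqrt{2 \times 100 \times 86400}\ \mathrm{m} \approx 4.2\ \mathrm{km}$$

Under these assumptions, a plume that has travelled for a day is already several kilometres wide, so it can only appear as a slowly varying signal at a fixed site.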
In this paper, we show how low-cost air pollution sensors provide additional insights into the patterns and sources of air pollution when deployed as a network rather than as individual sensors. A low-cost air pollution sensor network consisting of 18 low-cost air pollution sensor nodes (called AirNode4PX) was deployed in Newcastle-under-Lyme, UK, in the area centered around the ring road (see Figure 1). The variation in road width, the different types of road structure, and highly variable traffic patterns all impact pollutant dispersion, resulting in significant spatiotemporal variation of pollution in the area. Each AirNode measured a range of species including nitrogen dioxide (NO2), ozone (O3) and particulate matter (PM2.5 and PM10); in this paper we focus on NO2 and PM2.5. This paper does not attempt to demonstrate that the low-cost air pollution sensors meet specific air quality monitoring standards. Rather, we argue that data obtained from such a network are able to provide useful additional information about local air pollution that extends what can be learned from conventional air quality monitoring stations. The data obtained from the low-cost air pollution sensor network are used for time series analysis in the frequency domain to obtain information on the variability of air pollution concentrations and to distinguish local sources from regional ones.
The network, together with the analysis approach, has allowed pollutant emissions attributable solely to local sources to be distinguished from other regional or long-range transport sources. The approach of frequency domain analysis will be further evaluated in subsequent studies.
In June 2020, a network of 18 air pollution sensor nodes containing low-cost electrochemical and metal oxide gas sensors and optical particle counters was deployed in Newcastle-under-Lyme in Staffordshire, UK, in the area centered around the ring road. In addition, an anemometer was installed to record wind speed and direction. The initial 14-day installation, stabilization and testing period of the measurement campaign is excluded from the analysis. In all, the study covers a 14-month period from August 1, 2020 to October 1, 2021.

Nodes of low-cost air pollution sensors
The nodes include low-cost air pollution sensors, signal processing and communications. The units, 88 × 88 × 90 mm, are assembled by AirLabs into weatherproof enclosures with full exposure to ambient air, and are set up to report measurements to a cloud hosted by Amazon Web Services. The low-cost air pollution sensor nodes are generation 4P and are referred to as AirNode, AirNode4PX or 4PX, with X being the node number. Each AirNode includes sensors for measuring NO2 (NO2-B43F from Alphasense Ltd.) and O3 (MiCS-6814 from SGX Sensortech) as well as PM2.5 and PM10 (SDS-011 from Nova Fitness Co.) at a 1-min time resolution. In addition, each node is equipped with a control board and micro-controller unit (ESP32) for programming the sensors. The AirNodes were laboratory tested in Copenhagen, Denmark, to validate their response and obtain laboratory-based calibration coefficients, which are used to interpret the preliminary data. After laboratory calibration the AirNodes were shipped to Newcastle-under-Lyme in Staffordshire, United Kingdom, and were mounted 2.5 to 3 m above street level on lamp posts, which also provided power, as shown in Figure 1. Since the study focuses on NO2 and PM2.5, a brief description of the sensors is given below.
The SDS-011 sensor (Nova Fitness Co. Ltd, 2015) is a low-cost air pollution sensor measuring PM2.5 and PM10. Its principle of operation is based on light scattering (van de Hulst, 1981), where particle density distribution is determined using the intensity distribution patterns produced when particles scatter a laser beam (Liu, 2019). The sensor module includes a fan to ensure a continuous flow of air through the sensor chamber (Genikomsakis, 2018). An algorithm converts the particle density distribution into particle mass, and it can measure the particle density distribution between 0.3 and 10 µm (Bulot, 2020; Budde, 2018).
For NO2 measurements, the NO2-B43F sensor (Alphasense, 2019) is used. This is an amperometric electrochemical gas sensor containing four electrodes, where the principle of operation is based on electrochemistry (Frederickson, 2020b). When the Working Electrode (WE) is exposed to ambient air, the target gas can diffuse onto the surface of the electrode, where it is chemically reduced, resulting in a change in current. The Counter Electrode balances the current, and the Reference Electrode sets the operating potential of the WE. The fourth electrode is an Auxiliary Electrode (AE) and has the same structure as the WE but is not exposed to ambient air, hence is not affected by the target gas concentration, only by environmental parameters such as temperature. Therefore, the difference in voltage between the WE and AE corresponds to changes in target gas concentration at the electrochemical cell surface. A trans-impedance amplifier converts the currents from the electrochemical cell into a voltage.
The voltage is amplified further by a non-inverting operational amplifier, then a 16-bit analogue to digital (A/D) converter (ADS1115) samples the output and produces a digital reading of the voltage level. This is used by the microprocessor to calculate the actual gas concentration (Cross, 2017; Stetter, 2008; Mead, 2013). To minimize possible cross-interference from ozone, the NO2 sensors were fitted with integrated catalytic ozone filters (MnO2 filters). The performance of these filters was verified in the laboratory, and the NO2 sensors showed no significant response to ozone in the range of 0 to 100 ppb. Cross-interferences from other common gas pollutants were not considered important based on prior studies (Sun, 2017; Mead, 2013).

The calibration of the electrochemical sensors measuring NO2 is known to vary at high (> 20 °C) and low (< 0 °C) temperatures and with rapid temperature change (Alphasense, 2019; Popoola, 2016; Li, 2021). Therefore we apply a correction with coefficients determined by using a linear regression model:

$$\mathrm{NO_2(cor_T)} = a_0 + a_1 T + a_2 \frac{dT}{dt} + a_3\,\mathrm{NO_2(dv)}$$

where NO2(dv) is the raw output obtained by the Alphasense NO2 cell. The NO2(dv) readings are found from the voltage change in cell 2, which is determined by the difference between the WE and AE outputs, WE2_v and AE2_v:

$$\mathrm{NO_2(dv)} = WE2_v - AE2_v$$

T is filtered temperature data obtained from the nearest reference station. Filtered temperature represents the temperature reading when ambient temperature exceeds 10 °C and is transformed according to [...], and dT/dt is the rate of change of the filtered temperature. The temperature threshold of 10 °C was chosen because the internal temperatures of the LCS nodes often exceed the ambient temperatures and the performance of the correction was sufficient.
The linear regression coefficients, a0, a1, a2 and a3, are calculated using the method of multiple least squares, separately for each AirNode (Spinelle, 2017). In this formula a3 is a measure of the sensor's sensitivity and a0 is the offset of the sensor, whereas a1 and a2 are temperature correction coefficients.
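The least-squares fit described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names, the use of a co-located reference NO2 series as the regression target, and the assignment of a1 to T and a2 to dT/dt are assumptions. The filtered temperature is taken as an input, since its exact transformation is not reproduced here.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        # Pick the row with the largest pivot to limit round-off error
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_coefficients(T, dTdt, dv, ref):
    """Least-squares fit of ref ~ a0 + a1*T + a2*dT/dt + a3*dv (assumed form)."""
    rows = [[1.0, t, d, v] for t, d, v in zip(T, dTdt, dv)]
    # Normal equations: (X^T X) a = X^T y
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    Xty = [sum(r[i] * y for r, y in zip(rows, ref)) for i in range(4)]
    return solve(XtX, Xty)


def apply_correction(T, dTdt, dv, a):
    """Temperature-corrected readings from fitted coefficients a = [a0..a3]."""
    return [a[0] + a[1] * t + a[2] * d + a[3] * v
            for t, d, v in zip(T, dTdt, dv)]
```

In this sketch the fit would be run once per node, giving one coefficient set per AirNode. For long time series a numerical library would be a more practical choice than the hand-written solver.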

All electrochemical sensors have a different inherent sensitivity, hence the NO2 readings need to be scale-corrected. The scale correction is carried out by multiplying the temperature-corrected NO2 readings (NO2(cor_T)) from each AirNode by α, which is the ratio between the difference of the 0.80 and 0.20 quantiles of the NO2 readings obtained from the reference (Q_diff,Reference) and from the AirNode (Q_diff,AirNode). The reference is the NO2 readings obtained by chemiluminescence from the reference-grade instrument at the AQMS at Stoke-on-Trent Centre, 4.1 km from the network, over the same period as the measurements. All data from the reference station are used for the correction. The difference between the 0.80 and 0.20 quantiles is a proxy for the variation in the measurements.
The offsets of the readings are determined by calculating the difference between the 0.25 quantile (Q_0.25) obtained from each AirNode (Q_0.25,AirNode) and from the reference (Q_0.25,Reference). Hence, the temperature- and scale-corrected reading (NO2(cor_T,S)) is adjusted by subtracting the calculated offset (β). The reference used in the offset correction is the same as the one used for the scale correction. The 0.25 quantile is a proxy for measured background concentration.

$$\alpha = \frac{Q_{\mathrm{diff,Reference}}}{Q_{\mathrm{diff,AirNode}}}, \qquad \mathrm{NO_2(cor_{T,S})} = \alpha\,\mathrm{NO_2(cor_T)}$$

$$\beta = Q_{0.25,\mathrm{AirNode}} - Q_{0.25,\mathrm{Reference}}, \qquad \mathrm{NO_2(cor)} = \mathrm{NO_2(cor_{T,S})} - \beta$$

where NO2(cor_T,S) is the temperature- and scale-corrected NO2 reading, and NO2(cor) is the temperature-, offset- and scale-corrected NO2 reading.
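A minimal Python sketch of the quantile-based scale and offset correction is given below. It assumes that α is the reference spread divided by the node spread (so that multiplying by α matches the reference variation) and that the 0.25 quantile of the node is taken after scaling. The function names and the linear-interpolation quantile are illustrative choices, not taken from the study.

```python
def quantile(values, q):
    """Quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def scale_offset_correct(node, reference):
    """Return (corrected readings, alpha, beta) for one node."""
    # alpha: ratio of the 0.80-0.20 quantile spreads (reference / node)
    q_diff_node = quantile(node, 0.80) - quantile(node, 0.20)
    q_diff_ref = quantile(reference, 0.80) - quantile(reference, 0.20)
    alpha = q_diff_ref / q_diff_node
    scaled = [alpha * x for x in node]
    # beta: difference of the 0.25 quantiles (scaled node minus reference)
    beta = quantile(scaled, 0.25) - quantile(reference, 0.25)
    return [x - beta for x in scaled], alpha, beta
```

Because only quantiles of the two distributions are compared, the node and reference series do not need to be matched sample by sample, which suits a reference station several kilometres away.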
Regarding the SDS-011 PM2.5 readings, outliers were removed by discarding all values exceeding 5 times the standard deviation. Scale and offset correction was performed for PM2.5 in the same way as for the NO2 readings. However, there was no significant difference between the corrected and uncorrected PM2.5 readings, since the PM2.5 readings were already highly correlated (mean R2 = 0.72) with the reference readings from Stoke-on-Trent Centre.
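The outlier filter can be sketched as follows. The text does not state what the 5 standard deviations are measured from; this sketch assumes the distance from the series mean and uses the population standard deviation, and the function name is hypothetical.

```python
from statistics import mean, pstdev


def remove_outliers(values, n_sigma=5.0):
    """Drop values further than n_sigma standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    return [v for v in values if abs(v - mu) <= n_sigma * sigma]
```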

Comparison with regulatory air quality monitoring stations
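The data obtained from the network are compared with data from the three nearest regulatory air quality monitoring stations: the regional background station at Ladybower, the urban background station at Stoke-on-Trent Centre, and the roadside station at Stoke-on-Trent A50 Roadside. The latter is located between the main road and a parallel side road, near a pedestrian footbridge, beside the dual carriageway A50 through Stoke.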
All three AQMSs are equipped with instruments for measuring NO2 by chemiluminescence, but only Stoke-on-Trent Centre measures PM2.5. Hourly air pollution data from each monitoring station were manually downloaded using the UK-Air data selector (DEFRA, 2022).

Spectral analysis
Spectral analysis is widely used for investigating cycles and variations of pollutants in time series to reveal the sources of pollution (Marr, 2002; Lazi, 2016). Within spectral analysis, the Fourier transform is a powerful tool for analyzing time series including periodicities and rate of change. To use the method it is necessary to overcome obstacles including the often unevenly spaced time points in time series due to technical and practical problems during monitoring (Sun, 1996, 1997). The problem of unequally spaced or missing data can be circumvented by applying the fast Fourier transform after filling the gaps and missing values with the mean. In addition, the linear trend in the time series is removed by subtracting the average concentration obtained by each LCS. The periodogram for a finite time series is calculated as the square of the magnitude of X:

$$I(\nu_k) = |X(\nu_k)|^2 = \left| \frac{1}{\sqrt{N}} \sum_{t=0}^{N-1} x_t\, e^{-2\pi i \nu_k t} \right|^2$$

where k = 0, 1, ..., N−1, N is the number of observations, x_t is the time series, and ν_k = k/N. The periodogram indicates the strength of the signal as a function of frequency, while its integral over the frequency range corresponds to the variance of the time series data. Parseval's Theorem (Parseval, 1806; Narayanan, 2003) states that the energy, or in this case intensity, is conserved during Fourier transformation. Thus, the contribution of the different pollution sources can be quantified by integrating the peaks in the periodograms (Marr, 2002).
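The gap filling, mean removal and periodogram calculation can be sketched in Python as below. This is an illustrative sketch: the 1/N normalization and the representation of missing samples as `None` are assumptions, and a direct discrete Fourier transform is used for clarity where a real analysis would use a fast Fourier transform routine.

```python
import cmath


def periodogram(series):
    """Periodogram of an evenly sampled series; gaps are given as None."""
    valid = [x for x in series if x is not None]
    mean = sum(valid) / len(valid)
    # Filling gaps with the mean and then subtracting it leaves zeros at gaps
    x = [(v - mean) if v is not None else 0.0 for v in series]
    n = len(x)
    freqs = [k / n for k in range(n)]  # cycles per sample
    power = []
    for k in range(n):
        X = sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        power.append(abs(X) ** 2 / n)
    return freqs, power
```

With this normalization the sum of the periodogram equals the sum of squared deviations of the series, which is the discrete form of Parseval's theorem. For hourly data, `freqs` is in units of h−1 and the entries above 0.5 mirror the lower half.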
Figure 2. ... and the fluctuations in-between (blue). B. Schematic illustration of air pollutant contribution from regional transport (yellow), the urban area (blue), and the street (red). The relative concentration of the contributions depends on the considered pollutant and the dispersion conditions.
There is a relationship between temporal and spatial scales of the different air pollutants. Rapid fluctuations contribute to the periodogram at frequencies above 0.0417 h−1, i.e., events with a period shorter than one day. This is referred to as the 'local' contribution to the pollutant concentration. The local cutoff is chosen based on the European Environment Agency's definition of local time scale (EEA, 2008). The seasonal changes in the emissions and long-range transport of the pollution contribute to the periodogram at low frequencies (< 0.0139 h−1), i.e., events with a period longer than three days. This is referred to as the 'regional' contribution to the pollutant concentration. In this model, intermediate frequencies are due to the 'urban' contribution to the pollutant concentration. The cutoff frequency for the regional contribution is based on intercontinental transport, which occurs on timescales on the order of three days to one month (Stohl, 2002). As noted in the introduction, the mixing of pollution with time provides an upper limit on frequency for distant sources; only local sources can give a high frequency signal. The above-mentioned definitions are illustrated in Figure 2. One of the properties of diffusion is that a pulse of pollution will spread into a Gaussian concentration profile whose width depends on the diffusion constant and time. Under the Fourier transform, a Gaussian is mapped onto another Gaussian with a different width. The transform of a wide function is narrow and vice versa.
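The inverse relation between widths can be made explicit with the standard Fourier transform pair for a Gaussian. As an illustration (the symbols σ_t and σ_ν are introduced here and are not from the study), a Gaussian pulse in time with standard deviation σ_t transforms as

$$f(t) = e^{-t^2 / (2\sigma_t^2)} \quad \longrightarrow \quad F(\nu) = \sigma_t \sqrt{2\pi}\; e^{-2\pi^2 \sigma_t^2 \nu^2}$$

so the transform is a Gaussian in frequency with standard deviation

$$\sigma_\nu = \frac{1}{2\pi \sigma_t}$$

For example, a pulse with σ_t = 12 h has σ_ν ≈ 0.013 h−1, comparable to the regional cutoff of 0.0139 h−1, whereas a sharp pulse with σ_t = 1 h has σ_ν ≈ 0.16 h−1, well inside the local band.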
By integrating the periodograms over the three frequency bins using the equations below, the relative contributions of local, urban and regional pollution in the LCS data can be quantified:

$$\text{relative contribution} = \frac{\int_{\nu_{\mathrm{start}}}^{\nu_{\mathrm{end}}} g(\nu_k)\, d\nu_k}{\int g(\nu_k)\, d\nu_k}$$

where ν_start and ν_end are the start and end frequencies of the chosen frequency bin, as elaborated below:

$$\text{regional: } \nu_k < 0.0139\ \mathrm{h^{-1}}, \qquad \text{urban: } 0.0139\ \mathrm{h^{-1}} \le \nu_k \le 0.0417\ \mathrm{h^{-1}}, \qquad \text{local: } \nu_k > 0.0417\ \mathrm{h^{-1}}$$

and g is non-negative and square integrable with respect to Lebesgue measure on ν_k. After the relative contributions are calculated for each LCS node, the average together with the standard deviation can be calculated across the AirNode network to illustrate how much local pollution the network is seeing on average and how much variation is seen across all AirNodes.
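The band integration can be sketched in Python as a simple sum over periodogram values. The function expects one-sided frequencies in h−1 (from zero up to the Nyquist frequency). Skipping the zero-frequency term, assigning the boundary frequencies to the urban band, and normalizing by the total are assumptions of this sketch, as are the names.

```python
LOCAL_CUTOFF = 1 / 24      # h^-1, about 0.0417 (period of one day)
REGIONAL_CUTOFF = 1 / 72   # h^-1, about 0.0139 (period of three days)


def band_contributions(freqs, power):
    """Split periodogram power into regional, urban and local fractions."""
    bands = {"regional": 0.0, "urban": 0.0, "local": 0.0}
    for f, p in zip(freqs, power):
        if f == 0:
            continue  # the zero-frequency term carries only the mean
        if f < REGIONAL_CUTOFF:
            bands["regional"] += p
        elif f <= LOCAL_CUTOFF:
            bands["urban"] += p
        else:
            bands["local"] += p
    total = sum(bands.values())
    return {name: value / total for name, value in bands.items()}
```

Applying this to each node and then taking the mean and standard deviation of the `"local"` fraction across nodes gives the network-level summary described above.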

Results and discussion
In the following section we present the results of our study and of the data analysis.

3.1 Sensor data quality

The first requirement is to establish the fidelity of the monitoring network.

Missing data
The data completeness of the AirNodes varies between sites. In the monitoring network, apart from four AirNodes (4P04, 4P06, 4P08 and 4P20), all AirNodes have more than 80% data completeness during the sampling period. The four AirNodes with data completeness below 80% were excluded from the analysis. Across the network of AirNodes, the mean data completeness is 95%, which is sufficient to investigate the local variation of air pollution. The main reasons for data gaps are irregularities in the line power and lapses in the wireless internet connection. In addition, spiders had in a few cases entered through the small holes at the base of the AirNodes and nested in the housing, leading to sensor failure.
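The completeness screening amounts to comparing the number of valid samples per node with the number expected over the sampling period. A minimal sketch follows, with hypothetical function names and the 80% threshold from the text:

```python
def completeness(n_valid, n_expected):
    """Data completeness in percent."""
    return 100.0 * n_valid / n_expected


def select_nodes(valid_counts, n_expected, threshold=80.0):
    """Names of nodes whose data completeness exceeds the threshold."""
    return sorted(node for node, n in valid_counts.items()
                  if completeness(n, n_expected) > threshold)
```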

Correction of NO2 readings
It is necessary to account for temperature bias while deploying electrochemical NO2 sensors (Alphasense, 2014). For our study, this correction was crucial in order to get meaningful readings from the electrochemical sensors, since the raw readings showed unphysical behavior. The typical NO2 patterns during weekdays (Monday to Friday) and weekends (Saturday and Sunday) measured by AirNode4P01 as an example are shown before and after the correction in Figure 3. All AirNodes have the same tendencies, so Figure 3 is characteristic of all AirNodes. For clarity, the NO2_raw represents the Alphasense NO2 cell output and/or correction. The correction coefficient, a0, or offset of the sensor, is higher than the average concentration and it has a relatively high standard deviation. This is a property of the Alphasense cell and can vary significantly from cell to cell, which underscores the importance of calibrating and correcting the raw data from low-cost sensors in order to obtain accurate concentrations.
It is known that in cities the temperature can vary strongly over small distances (Cao, 2021); therefore it would have been more accurate to measure the internal temperature of the AirNodes and use that information for the correction. Nevertheless, the correction methodology, even with the modeled temperature data, yields corrected readings that follow expected trends, giving confidence in sensor accuracy. As seen in Figure 3, there is still a relatively large discrepancy between the reference and the corrected AirNode readings on weekdays between 8 and 12 h, which can be attributed to the large distance between the reference instrument and the AirNodes (4.1 km) and the fact that the concentration of NO2 can have different profiles at different locations, depending on the traffic modes and sources. Sensor performance is validated below.

Inter-sensor variability
Inter-sensor variability has been used as a metric of sensor reliability in recent studies (Liu, 2020). Figure 4 displays the correlation heatmap of the Pearson correlation coefficient for NO2 and PM2.5, in which the respective reference measurements from Stoke-on-Trent Centre are included. The Pearson correlation coefficients for PM2.5 among the AirNodes ranged from 0.87 to 0.99 with a mean of 0.95. In contrast, the Pearson correlation coefficients for NO2 ranged from 0.30 to 0.88 with a mean of 0.64. For PM2.5, the lowest Pearson correlation coefficients were above our quality criterion of 0.85. We did not choose a similar criterion for NO2 since we expect much higher variation between the sensors due to localized sources. The AirNode network readings rose and fell simultaneously as ambient concentrations and conditions changed, confirming that the sensors are operating as expected and giving confidence in sensor measurement reliability. The AirNode readings generally followed the same trends as seen at the reference instruments at Stoke-on-Trent Centre. The mean Pearson correlation coefficient for NO2 was 0.40 with a range of 0.30 to 0.47, whereas the mean Pearson correlation coefficient for PM2.5 was 0.85 with a range of 0.80 to 0.87. Again, the larger discrepancy between the reference and the corrected NO2 readings can be attributed to the more spatially variable nature of NO2.
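The pairwise comparison behind such a correlation heatmap can be sketched as follows. The function names and the dictionary layout (one equally long, time-aligned series per node or reference) are assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(series):
    """Pearson coefficient for every pair of named series."""
    names = sorted(series)
    return {(a, b): pearson(series[a], series[b]) for a in names for b in names}
```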

Descriptive statistics
Air quality data for NO2 and PM2.5 measured at the different sites during 2020 and 2021 were analyzed. For this section, only one year of data (August 1, 2020 to August 1, 2021) is used to compare with official guidelines. Descriptive statistics of the air quality measurements are presented in Table 2. We are aware that the measurement uncertainty is significantly higher for low-cost air pollution sensors than for reference air quality monitoring measurements. However, EU air quality guidelines approve low-cost sensor data as indicative but not quantitative data, in line with calculations with air quality models.
Table 2. Statistics for air quality data measured from Aug 01, 2020 to Aug 01, 2021. For comparison, the descriptive statistics from the three regulatory air quality monitoring stations (Regional = Ladybower, urban background = Stoke-on-Trent Centre, roadside = Stoke-on-Trent A50 Roadside) are shown for the corresponding period. Neither Ladybower nor Stoke-on-Trent A50 Roadside has instruments for monitoring PM2.5. Abbreviations: SD = standard deviation, max = maximum value.

Temporal trends
Figure 3 shows the temporal variation in NO2. On weekdays, the NO2 concentration increases in two time periods during the day, with peaks at 7:00 and 18:00. On weekends, the NO2 concentration rises steadily throughout the day. There is a notable decrease in concentration during the weekend compared to the weekdays at all sites. For both weekends and weekdays the NO2 concentration is lowest at night. The two time periods with increased NO2 concentration during the weekdays are typically periods of increased traffic during morning and afternoon rush hours when people commute to and from work (Berkowicz, 1996). Thus, traffic likely drives this observed variation, in line with the declining NO2 concentration during the night and over the weekend.
In terms of monthly trends, Figure 5 displays the monthly average of the NO2 concentration measured by one of the AirNodes together with the monthly readings from the nearest urban background AQMS (Stoke-on-Trent Centre). All AirNodes have the same tendencies, so Figure 5 is characteristic of all AirNodes. The readings from the AirNode and the AQMS follow the same trends, with the highest NO2 concentrations in the spring and the winter.
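A weekday and weekend diurnal profile of this kind can be computed by averaging readings per hour of day. The sketch below assumes the records are (timestamp, concentration) pairs; the function name and the output layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def diurnal_profiles(records):
    """Mean concentration per hour of day, split into weekday and weekend."""
    sums = defaultdict(lambda: [0.0, 0])
    for ts, value in records:
        # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6
        kind = "weekend" if ts.weekday() >= 5 else "weekday"
        acc = sums[(kind, ts.hour)]
        acc[0] += value
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}
```

The result is keyed by `("weekday", hour)` or `("weekend", hour)`, so the two curves can be drawn directly from it.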

Spatial trends
Wind speed and direction have been shown to provide essential information that can help identify source location (Carslaw, 2006; Westmoreland, 2007). Concentrations on a specific street vary with wind direction and wind speed (the so-called street canyon effect). Bivariate polar plots are a powerful tool for source characterization, showing mean pollutant concentrations for specific wind speed and direction bins (Uria-Tellaetxe, 2012; Grange, 2016; Carslaw, 2012, 2013). In these plots wind direction is displayed from 0 to 360° clockwise on the angular axis and wind speed is shown on the radial scale.
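The binning step underlying a bivariate polar plot can be sketched as below. This is a simplified illustration: published polar plot tools typically also smooth the binned surface, which is omitted here, and the bin widths and function name are assumptions.

```python
from collections import defaultdict


def polar_bins(records, dir_width=30.0, speed_width=1.0):
    """Mean concentration per (wind direction, wind speed) bin.

    records: iterable of (direction in degrees, speed in m/s, concentration).
    Keys of the result are the lower edges of each bin.
    """
    acc = defaultdict(lambda: [0.0, 0])
    for direction, speed, conc in records:
        d_bin = int((direction % 360.0) // dir_width) * dir_width
        s_bin = int(speed // speed_width) * speed_width
        cell = acc[(d_bin, s_bin)]
        cell[0] += conc
        cell[1] += 1
    return {key: total / count for key, (total, count) in acc.items()}
```

Each key maps to one cell of the plot: the first element gives the angular position and the second the radial position.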
The wind speed and direction data used in this study are shown as a windrose in Figure 6. The windrose shows that the prevailing winds come from the south and northwest during the measurement period. To assess spatially resolved source patterns, bivariate polar plots of NO2 and PM2.5 are investigated. The bivariate polar plots for each pollutant for all sites are shown in Figure 7. Reddish colors represent higher values compared to the blueish ones. The bivariate polar plots show patterns that depend on deployment location. AirNode4P23, AirNode4P19 and AirNode4P02 are located in the southern part of the ring road, and they display similar patterns in their bivariate polar plots. Their surroundings are almost identical and the traffic influence on their readings is similar. The nodes located in the northern part of the ring road have different patterns relative to the ones in the southern area. They experience the highest values at lower wind speeds.
When peak concentrations occur at low wind speeds, it suggests local sources. For example, in a street canyon there is both a direct and a recycled contribution to the concentration, where the relative size of these two contributions depends on whether the measuring site at a given time is on the leeward or windward side of the street. AirNode4P10 is located in front of a school, and at lower wind speeds or with westerly wind, elevated levels of NO2 were observed. In general, the highest concentrations are observed at low wind speeds, where no whirlwind is formed inside the street, independent of wind directions, or when the measuring site is on the leeward side of the street (in relation to the whirlwind). In the latter case, pollution from the traffic in the specific section of the street will be led directly to the measuring site, at the same time as there is a contribution due to trapping of pollution within the limited volume of the whirlwind.
Higher NO 2 values are correlated with wind speed and the orientation of the road. The traffic comes from the ring road

The bivariate polar plots for AirNode4P01 and AirNode4P05 show similar patterns, with the highest concentrations found for easterly and southwesterly winds, whereas the lowest concentrations were seen with westerly and southwesterly winds. Higher-speed southwesterly winds contributed to the peak concentrations at these locations. A wide-open parking area is located next to the ring road in that direction, which could explain the elevated concentrations. The wind speed dependence of concentrations in a street canyon can be complex as there are opposing effects: higher wind speeds lead to more O 3 but also more dilution of NO x (NO + NO 2 ). High wind speeds will therefore lead to lower NO 2 , while at low wind speeds NO 2 formation is limited by O 3 , which goes towards zero in the street. Bivariate polar plots are good at revealing these interrelationships. The wind speed dependence can help distinguish sources from one another.

When several measurement sites are available, polar plots can triangulate different sources (Carslaw, 2006). As expected, NO 2 is dominated by local emissions, and peak values mainly occur for low wind speeds, where elevated concentrations were observed due to accumulation and lack of dispersion. The most obvious features of NO 2 bivariate polar plots are that the elevated levels are attributable to the orientation of the road or the place with the highest traffic density.
Relative to the bivariate polar plots of NO 2 , the bivariate polar plots of PM 2.5 do not show as much variation across the network. Generally, the highest concentrations of PM 2.5 are seen for southeasterly winds and higher wind speeds. This indicates that the particles originate from long-range transport, which is confirmed by the frequency spectrum showing slow changes consistent with large air masses. The bivariate polar plots for PM 2.5 also suggest that locally sourced particulate matter is present, shown by the elevated concentrations at low wind speeds, where the atmospheric conditions are more stable. In general, sites across the sensor network show a variation in their bivariate polar plots (more for NO 2 than for PM 2.5 ) due to the different pollution sources. Thus, there are additional benefits of multi-sensor node measurements for characterizing sources in detail, especially when combining them with meteorological information.

With a time resolution of 30 minutes, we see more local variability in the data, compared to the readings from the reference station. The data have a measurement density in both time and space that cannot be achieved using current conventional measurement methods. As seen in Figure 8, the readings from the AirNode and the AQMS follow the same trend, but the coefficient of determination is only 0.28. This is expected since the AQMS is located around 4 km from the AirNode network.
However, averaging over longer intervals increases the coefficient of determination: an averaging time of 3 hours results in a coefficient of determination of 0.38, and an averaging time of 1 day yields 0.63.
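The effect of the averaging time on the coefficient of determination can be illustrated with synthetic data. The sketch below assumes that the coefficient is the squared Pearson correlation between two series, and that both series share a slow common signal with independent short-term noise. The series, noise level and helper names are hypothetical and do not reproduce the values reported above.

```python
import math
import random

def block_mean(x, width):
    """Average consecutive non-overlapping blocks of `width` samples."""
    n = len(x) // width
    return [sum(x[i * width:(i + 1) * width]) / width for i in range(n)]

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

rng = random.Random(0)
n = 48 * 56  # 56 days of 30-minute samples

# Shared slow (weekly) signal plus independent noise at each site
shared = [20 + 5 * math.sin(2 * math.pi * t / (48 * 7)) for t in range(n)]
node = [s + rng.gauss(0, 5) for s in shared]
ref = [s + rng.gauss(0, 5) for s in shared]

# Block widths of 1, 6 and 48 samples correspond to 30 min, 3 h and 1 day
r2_by_width = {w: r_squared(block_mean(node, w), block_mean(ref, w))
               for w in (1, 6, 48)}
```

Averaging suppresses the uncorrelated short-term component while largely preserving the shared slow component, so the coefficient rises with the averaging time in this toy setting.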

Spectral analysis
Spectral analysis is performed on the air pollution data to investigate its hidden periodicities and quantify their magnitude. The contributions of local and regional sources to the pollution concentrations are determined based on the resulting amplitudes and frequencies. The local sources are shown in the high-frequency periodogram, and the regional or long-range sources are revealed in the low-frequency periodogram. Note however that local sources may be present in both the low and high frequency regions. For example, in an urban street, traffic follows stable patterns with daily and weekly periodicities. Holiday periods follow their own pattern, and for wood smoke, emissions follow variations in outdoor temperature. By comparing the spectra for the different pollutants measured by the same AirNode, information on the sources can be revealed. If the emission sources for the different pollutants are the same, similarly cyclic patterns can be expected. The differences in the pollution

Spectral analysis is performed on NO 2 data from three different types of AQMSs to illustrate how periodograms vary depending on location. These AQMSs are 1) regional background (Ladybower), 2) urban background (Stoke-on-Trent Centre) and 3) street (Stoke-on-Trent A50 Roadside). The three periodograms are shown in Figure 9. While all three periodograms have significant peaks in the low-frequency region, only the urban background and street AQMSs have significant peaks in the high-frequency region. We conclude that these high-frequency peaks are due to the proximity and strength of local NO 2 sources.

Figure 10. Periodogram of NO2 (left) and PM2.5 (right) obtained by AirNode4P01.

Figure 10 displays the periodograms for NO 2 and PM 2.5 measured by AirNode4P01. The periodogram of NO 2 features three distinct peaks at 0.125, 0.084 and 0.042 h −1 corresponding to periods of 8, 12 and 24 h, respectively.
In addition, one peak is identifiable in the high-frequency region at 0.17 h −1 (6 h). In the low-frequency region, there are multiple peaks close to each other; however, the peaks corresponding to 5 days (0.0083 h −1 ), 1 week (0.0061 h −1 ) and 1 month (0.00135 h −1 ) can still be identified. All these cycles can be related to local sources of pollution, e.g. traffic, or to meteorological changes. Peaks located in the low-frequency region can originate from changes on synoptic or larger scales. The highest intensity occurs in the high-frequency region since most of the NO 2 originates from local sources. The daily changes in NO 2 concentrations can be associated with the daily changes in traffic from nearby roads and the diurnal variation caused by sunlight. Weekly periodicity may also originate from changes in traffic.
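How peaks in a periodogram map to periods can be sketched with a direct discrete Fourier transform of a synthetic hourly series. This is an illustrative assumption about the computation, not the implementation used in the study: the `periodogram` helper, the normalisation and the test signal are hypothetical.

```python
import cmath
import math

def periodogram(x, dt_hours=1.0):
    """Return (frequencies in 1/h, power) of a mean-removed series via a direct DFT."""
    n = len(x)
    mean = sum(x) / n
    y = [v - mean for v in x]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        coef = sum(y[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k / (n * dt_hours))
        power.append(abs(coef) ** 2 / n)
    return freqs, power

# Synthetic hourly series: 10 days with a 24 h cycle and a weaker 12 h cycle
n = 240
series = [20 + 6 * math.sin(2 * math.pi * t / 24)
          + 3 * math.sin(2 * math.pi * t / 12) for t in range(n)]

freqs, power = periodogram(series)

# Indices of the two strongest peaks, converted from frequency to period
top = sorted(range(len(power)), key=power.__getitem__, reverse=True)[:2]
peak_periods = [round(1 / freqs[i]) for i in top]
```

The period is the reciprocal of the peak frequency, which is how a peak at 0.042 h −1 corresponds to roughly 24 h and one at 0.084 h −1 to roughly 12 h.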
The periodogram for PM 2.5 (see Figure 10) features one distinct peak in the high-frequency region at 0.042 h −1 (24 h), and a prominent peak at 0.084 h −1 (12 h). Besides these two peaks, most peaks are seen in the low-frequency region of the periodogram, which is expected since PM 2.5 is dominated by long-range transport and non-local sources. However, traffic on a nearby road can also contribute PM, since vehicles re-suspend particles from the road into the air, and abrasion from brakes and tires also produces PM (Grigoratos, 2015).
Periodograms for the rest of the AirNodes in the network show results similar to the ones shown in Figure 10, with small changes in position and amplitude at specific locations. Conclusions regarding trends in pollution sources can be drawn by examining the relative contributions from local, urban and regional sources. Figure 11 shows the calculated percentages of local, urban and regional contributions for the AirNodes as well as for the three different types of AQMSs. The results for the network indicate that local emissions are the most important source of NO 2 with an average of 54.3 ± 4.3 %, whereas PM 2.5 is mainly due to regional sources (62.1 ± 3.5 %). For NO 2 , urban sources contribute 14.3 ± 1.9 % and regional sources 31.2 ± 4.5 %. For PM 2.5 , urban sources contribute 20.0 ± 1.2 % and local sources 17.9 ± 3.2 %.
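One way to turn a periodogram into such percentages is to sum the spectral power within frequency bands. The sketch below only illustrates the idea. The band cut-offs (periods of 120 h and 24 h), the function name `band_shares` and the input values are hypothetical placeholders, not the limits or data used in this study.

```python
def band_shares(freqs, power, urban_min=1 / 120, local_min=1 / 24):
    """Percentage of total periodogram power in regional, urban and local bands.

    The cut-offs (in 1/h) are hypothetical placeholders: frequencies below
    urban_min count as regional, those at or above local_min as local,
    and everything in between as urban.
    """
    totals = {"regional": 0.0, "urban": 0.0, "local": 0.0}
    for f, p in zip(freqs, power):
        if f < urban_min:
            totals["regional"] += p
        elif f < local_min:
            totals["urban"] += p
        else:
            totals["local"] += p
    grand = sum(totals.values())
    return {name: 100.0 * value / grand for name, value in totals.items()}

# Made-up peak powers at periods of 30 days, 1 week, 2 days, 24 h and 12 h
freqs = [1 / 720, 1 / 168, 1 / 48, 1 / 24, 1 / 12]
power = [30.0, 20.0, 10.0, 25.0, 15.0]
shares = band_shares(freqs, power)
```

With these made-up inputs the shares sum to 100 %, and shifting either cut-off moves power between categories, so the choice of band limits directly determines the apportionment.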
As expected, the regional background AQMS shows the highest relative contribution of regionally sourced NO 2 , and the street AQMS has the highest level of locally sourced NO 2 . The AirNodes in the network show a distribution of contributions.
The results obtained for both NO 2 and PM 2.5 reveal contributions of short-term (12 h and 24 h) and long-term fluctuations.
The contributions at low frequencies are significantly different between the two pollutants, indicating that temporal variations are influenced by different processes. The methodology is a powerful tool for analyzing the causes of air pollution.

Air pollution can be hyper-local and low-cost air pollution sensors are capable of accurately describing variation close to pollution sources. This study assessed more than one year of NO 2 and PM 2.5 data with high spatiotemporal resolution (1-min) obtained using a network of 18 low-cost air pollution sensor nodes. Initially there were significant calibration issues associated with temperature bias in the NO 2 readings, but a simple and effective temperature, scale and offset correction was able to overcome this problem. Therefore, this study, like many others, clearly indicates that while low-cost air pollution sensors can be useful, calibration and correction are far from trivial and require supporting data from reference stations. The corrected NO 2 concentrations have a strong connection with the reference station used, so the results reflect both the low-cost air pollution sensor data and the reference station data.
In its recent update and revision of the air quality guidelines for Europe, the WHO has proposed annual NO 2 and PM 2.5 exposure guideline thresholds of 10 and 5 µg m −3 , respectively. The annual mean NO 2 and PM 2.5 concentrations across the network exceed the updated WHO guidelines by 7 µg m −3 for NO 2 and 3 µg m −3 for PM 2.5 . However, none of the sites had values exceeding the legally binding UK/EU standards. An excess concentration of 12.5 µg m −3 of NO 2 in the network was seen relative to background levels measured by the regional monitoring station at Ladybower reservoir. This highlights the risk of pollution exposure for individuals due to local sources and supports the use of local monitoring to characterize the risk.

Figure 11. Histogram of percentages of contribution (%) of local (red), urban (yellow) and regional sources (blue) for NO2 (top) and PM2.5 (bottom) measured by all AirNodes in the network as well as for the three AQMS. Ref_Reg is the regional background AQMS, Ladybower, Ref_Street is the street AQMS, Stoke-on-Trent A50 Roadside, and Ref_Urban is the urban background AQMS, Stoke-on-Trent Centre.
Spectral analysis is found to be a good method for studying the variation within the time series. This approach enabled the detection of different underlying periodicities in time series data and allowed the pollution signal to be apportioned to different categories of pollution source, whether local, urban or regional. The results highlighted the advantages of having a densely deployed sensor network over the sparse conventional air quality monitoring stations. The highly increased spatiotemporal resolution of low-cost sensors combined with their dense placement near pollution sources makes it possible to provide additional information on the patterns and sources of air pollution, which in turn provides a better description of the highly variable and complex nature of pollution.
Data availability. All raw data is available upon request.

Competing interests. RS, JAS and MSJ are employees of AirLabs and LBF is partly funded by AirLabs.