Articles | Volume 18, issue 18
Research article
28 Sep 2018

Observing local CO2 sources using low-cost, near-surface urban monitors

Alexis A. Shusterman, Jinsol Kim, Kaitlyn J. Lieschke, Catherine Newman, Paul J. Wooldridge, and Ronald C. Cohen

Urban carbon dioxide comprises the largest fraction of anthropogenic greenhouse gas emissions, but quantifying urban emissions at subnational scales is highly challenging, as numerous emission sources reside in close proximity within each topographically intricate urban dome. In attempting to better understand each individual source's contribution to the overall emission budget, there exists a large gap between activity-based emission inventories and observational constraints on integrated, regional emission estimates. Here we leverage urban CO2 observations from the BErkeley Atmospheric CO2 Observation Network (BEACO2N) to enhance, rather than average across or cancel out, our sensitivity to these hyperlocal emission sources. We utilize a method for isolating the local component of a CO2 signal that accentuates the observed intra-urban heterogeneity and thereby increases sensitivity to mobile emissions from specific highway segments. We demonstrate a multiple-linear-regression analysis technique that accounts for boundary layer and wind effects and allows for the detection of changes in traffic emissions on scale with anticipated changes in vehicle fuel economy – an unprecedented level of sensitivity for low-cost sensor technologies. The ability to represent trends of policy-relevant magnitudes with a low-cost sensor network has important implications for future applications of this approach, whether as a supplement to existing, sparse reference networks or as a substitute in areas where fewer resources are available.

1 Introduction

Initiatives to curb greenhouse gas emissions and thereby reduce the extent of climate-change-related damages are gaining momentum from city to global scales (United Nations, 2015). To support this effort, there is a clear need for monitoring strategies capable of describing emission changes and attributing those changes to the relevant policy measures (Pacala et al., 2010). Currently, an estimated 70 %–80 % of global CO2 emissions are urban in origin, and this fraction is expected to grow as migration to urban areas continues and intensifies with the industrialization of developing nations (United Nations, 2011). However, cities also present the largest atmospheric monitoring challenge in that many disparate emission sources combine with complex topography.

A considerable amount of emission estimation work has been invested in the development of activity-based emission inventories for selected metropolitan areas, such as Indianapolis (Gurney et al., 2012), Paris (Bréon et al., 2015), Los Angeles (Newman et al., 2016), Salt Lake City (Patarasuk et al., 2016), and Toronto (Pugliese et al., 2018), as well as other inventories constructed and maintained by individual air management agencies for internal use. These inventories, when updated regularly, offer the possibility of direct source attribution without the use of computationally intense and/or heavily parameterized atmospheric transport models; they do, however, typically rely on interpolations, generalizations, or proxies to generate the necessary input activity data. The Fuel-based Inventory for Vehicle Emissions (FIVE) developed by McDonald et al. (2014), for example, uses a representative 7 days of highway traffic flow measurements to drive the weekly cycle of CO2 emissions from mobile sources on roads of all sizes year-round. While traffic patterns and residential and commercial energy usage are known to vary by day of week (Harley et al., 2005), the specific timing and magnitude of these variations are likely to be heterogeneous in space and time. Mobile emission estimates constructed using an average week of highway observations therefore neglect the impact of anomalous events as well as the variety of vehicle fleets, commute practices, and congestion patterns that occur at the neighborhood level. As knowledge of emission factors and fuel efficiency grows, activity data will become one of the largest sources of uncertainty in bottom-up inventory products.

Ambient atmospheric measurements offer the opportunity to observe nuanced variations in CO2 emission activities directly without generalizing across space and time. In order to document baseline conditions in and upcoming changes to urban greenhouse gas emissions, surface-level monitoring campaigns in cities using varied approaches are being pursued (e.g., Bréon et al., 2015; Chen et al., 2016; McKain et al., 2012, 2015; Shusterman et al., 2016; Turnbull et al., 2015; and Verhulst et al., 2017). These networks, typically consisting of 2–15 instruments, attempt to constrain and supplement activity-based emission inventories with observation-based estimates. Most previous work on observation-based emission estimates has focused on domain-wide emission totals over monthly to annual timescales (e.g., Kort et al., 2013). This emphasis on integrated signals has led to site selection and data analysis techniques that minimize sensitivity to local emissions, thus discarding a large portion of the information contained in the datasets collected at individual measurement sites and the differences between them (Shusterman et al., 2016; Turner et al., 2016).

We hypothesize that, if trends in the specific small-scale CO2 sources implicated in most mitigation strategies are to be resolved from atmospheric monitoring datasets, site-to-site heterogeneity must be sought out and retained. Here we present an initial characterization of the degree of spatial heterogeneity present in an urban monitoring dataset and offer these direct observations of intracity heterogeneities as a possible strategy for providing direct constraints on CO2 emissions from individual sectors. We provide an initial approach to quantifying changes in the mobile sector and separating the influence of that sector from other emissions.

2 Measurements

2.1 The BErkeley Atmospheric CO2 Observation Network

The BErkeley Atmospheric CO2 Observation Network (BEACO2N; see Shusterman et al., 2016) is an ongoing greenhouse gas and air quality monitoring campaign operating in the San Francisco Bay Area since late 2012. The current network comprises ∼50 “nodes” stationed on top of schools and museums at approximately 2 km intervals (Fig. 1). The nodes contain a variety of commercially available, low-cost sensor technologies: a Vaisala CarboCap GMP343 for CO2; a Shinyei PPD42NS for particulate matter; a suite of Alphasense B4 electrochemical devices for O3, CO, NO, and NO2; and meteorological sensors for pressure, temperature, and relative humidity. Data are collected every 2–10 s and transmitted wirelessly or via an on-site Ethernet connection to a central server, where they are made publicly available in near-real time. The distributed low-cost dataset is supplemented by a “supersite” at the RFS location featuring a Picarro G2401 cavity ring-down spectroscopy analyzer for CO2, CO, and H2O; a TSI Optical Particle Sizer 3330 for particulate matter; a Thermo Fisher Scientific 42i-TL NOx analyzer for NO and NO2; a Teledyne 703E photometric calibrator for O3; a Pandora spectrometer system for total column O3 and NO2; a Lufft CHM 15k ceilometer for cloud and aerosol layer height; and various instruments for meteorological measurements (i.e., a Vaisala WXT520 weather transmitter, a Campbell Scientific CS500 temperature and relative humidity probe, and a Davis Vantage Pro2 system with a Davis 6410 anemometer and Davis 6450 solar radiation sensor). This high-cost, reference-grade instrumentation serves as a high-accuracy anchor point within the network domain. Atmospheric boundary conditions are monitored by the Bay Area Air Quality Management District's Greenhouse Gas Measurement Program, which maintains its own reference instruments at four background sites to the northwest, east, southeast, and south.
A description of the design, deployment, and evaluation of the BEACO2N approach can be found in Shusterman et al. (2016) and Kim et al. (2018).

Table 1. List of site geo-coordinates, relevant traffic monitor IDs, and approximate distances from a highway.

a Sites with data available in winter 2017 only. b Sites with data available in summer 2017 only.


Figure 1. Map of BEACO2N node locations (black dots). Nodes used in this study are labeled. Map data ©2017 Google.


Here we utilize CO2 observations from the 20 BEACO2N sites operating most consistently during the summer and/or winter of 2017 (Table 1), defined as 1 June 2017 through 30 September 2017 and 1 November 2017 through 31 January 2018, respectively. The raw 2 s CO2 concentrations are averaged to 1 min means, which are subsequently converted to bias-corrected dry-air mole fractions using site-specific meteorological observations and in-network reference measurements (see Shusterman et al., 2016). The processed 1 min averages are assumed to have an instrumental uncertainty of less than ±4 ppm. The longer averaging timescales used hereafter reduce the error of the mean (e.g., ±1.8 ppm at 5 min resolution, ±0.5 ppm at hourly resolution, ±0.06 ppm for a given hour of the day over an entire season), although the concomitant increase in the influence of atmospheric variability cannot be quantified. Any long-term drift in the sensors is accounted for via a combination of periodic (i.e., every 12–24 months) laboratory recalibration and a post hoc data treatment based on the supersite situated within the network domain. This procedure allows us to confidently compare measurements taken multiple years apart, thus enabling interannual changes in CO2-related phenomena to be monitored. The exact details of the calibration and post hoc data treatment are provided in Shusterman et al. (2016).
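
The quoted reduction in the error of the mean can be reproduced approximately with a short calculation. The sketch below assumes that the 1 min errors are independent, so that the uncertainty of an n-minute mean scales as 1/sqrt(n); the function name is illustrative, and this scaling is an assumption rather than a description of the full calibration procedure.

```python
import math

def error_of_mean(sigma_1min_ppm: float, n_minutes: int) -> float:
    """Standard error of the mean of n independent 1 min values (assumed scaling)."""
    return sigma_1min_ppm / math.sqrt(n_minutes)

SIGMA_1MIN = 4.0  # ppm, 1 min instrumental uncertainty quoted in the text

five_min = error_of_mean(SIGMA_1MIN, 5)   # 5 min averages
hourly = error_of_mean(SIGMA_1MIN, 60)    # hourly averages

print(f"5 min: ±{five_min:.1f} ppm")   # ±1.8 ppm
print(f"hourly: ±{hourly:.1f} ppm")    # ±0.5 ppm
```

Under the same assumption, averaging a given hour of the day over a whole season divides the hourly value further by the square root of the number of days sampled.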

2.2 Traffic counts

Traffic count data are collected by the California Department of Transportation as part of the Caltrans Performance Measurement System (PeMS; last access: 23 September 2018). Hourly passenger vehicle flow data (in vehicles per hour) are obtained from the road monitors nearest to the relevant BEACO2N site with > 50 % directly observed (as opposed to modeled) data and are summed across all lanes and directions. Due to limited data coverage, in some cases it is necessary to sample road monitors upstream or downstream of the desired roadway segment; here we assume the sampled traffic conditions to be reasonable approximations of those on the desired segment. The specific monitor IDs used in each analysis are given in Table 1.

Figure 2. Optimal correlation coefficients for every possible pairing of summer 2017 sites as a function of their separation distance during all hours (a), daytime hours (11:00–18:00 LT, b), and nighttime hours (21:00–04:00 LT, c). Solid lines show exponential decay of the correlation coefficients.


3 Results & discussion

To quantify the spatial heterogeneity present across the network, we examine the degree of correlation between every possible pairing of sites in a given season as a function of the distance between them, borrowing from a similar analysis used by McKain et al. (2012). For straightforward comparison with the McKain et al. results, we first average the total CO2 mole fractions to 5 min resolution. Then, for every pairwise combination of two sites, we perform an ordinary least squares linear regression between the two 5 min time series and calculate the Pearson correlation coefficient. We repeat this procedure after offsetting the two time series by ±5 min, ±10 min, etc., allowing for up to a ±3 h lag, and choose the optimal r² value from the possible offsets. We plot the thus-optimized pairwise correlations as a function of the distance separating the two relevant sites (Figs. 2 and 3) and fit the results to a single-term exponential decay on top of a constant background, defined by the mean correlation observed at inter-site distances greater than 20 km.
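
The lag search described above can be sketched as follows. This is a minimal illustration, assuming two equally long, time-aligned 5 min series with missing values stored as NaN; the function name and the synthetic demonstration data are hypothetical and are not taken from the BEACO2N processing code.

```python
import numpy as np

def best_lagged_r2(a, b, max_lag_steps=36):
    """Largest squared Pearson correlation between a and b over integer lags.

    With 5 min data, 36 steps corresponds to the +/-3 h range used in the text.
    A positive lag pairs a[t + lag] with b[t].
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    best_r2, best_lag = -np.inf, 0
    for lag in range(-max_lag_steps, max_lag_steps + 1):
        if lag >= 0:
            x, y = a[lag:], b[:len(b) - lag]
        else:
            x, y = a[:len(a) + lag], b[-lag:]
        ok = np.isfinite(x) & np.isfinite(y)   # skip missing observations
        if ok.sum() < 3:
            continue
        r = np.corrcoef(x[ok], y[ok])[0, 1]
        if np.isfinite(r) and r * r > best_r2:
            best_r2, best_lag = r * r, lag
    return best_r2, best_lag

# Synthetic demonstration: series a is series b delayed by four time steps.
rng = np.random.default_rng(1)
b_demo = rng.normal(size=200)
a_demo = np.roll(b_demo, 4)
r2_demo, lag_demo = best_lagged_r2(a_demo, b_demo)
```

Applying this function to every pair of sites and plotting the result against the inter-site distance gives the kind of scatter shown in Figs. 2 and 3.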

In the summer months, there appears to be some relationship between the proximity of the sites and the correlation of their observations at all hours, with higher correlations between neighboring sites decaying into more modest, but still significant, correlations at longer inter-site distances. The characteristic length scale of this correlation is 2.9 km (defined as the e-folding distance of the exponential fits in Fig. 2; 3.6 km during the day and 2.2 km at night), which we interpret as an indicator of the distance at which various emission sources exert influence over a site's measurements. Shorter correlation lengths indicate sensitivity to near-field emissions, while longer correlation lengths imply sensitivity to far-field phenomena.
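
One simple way to extract an e-folding distance from such a scatter is sketched below. It assumes the functional form r²(d) = A exp(−d/L) + c, with the background c fixed at the mean r² beyond 20 km as described above, and fits the remaining excess by linear regression in log space. The log-linear fit is a simplification chosen for brevity, not necessarily the fitting routine used to produce Fig. 2, and the data are synthetic.

```python
import numpy as np

def efolding_length(dist_km, r2, far_km=20.0):
    """Fit r2 = A * exp(-d / L) + c, with c set to the mean r2 beyond far_km."""
    dist_km = np.asarray(dist_km, dtype=float)
    r2 = np.asarray(r2, dtype=float)
    background = r2[dist_km > far_km].mean()
    excess = r2 - background
    use = (dist_km <= far_km) & (excess > 0)   # log requires positive excess
    slope, intercept = np.polyfit(dist_km[use], np.log(excess[use]), 1)
    return -1.0 / slope, np.exp(intercept), background

# Synthetic check: decay length of 2.9 km on a constant background of 0.2.
d = np.linspace(0.5, 40.0, 80)
r2 = np.where(d <= 20.0, 0.2 + 0.6 * np.exp(-d / 2.9), 0.2)
length_km, amplitude, background = efolding_length(d, r2)
```

With noisy real correlations, points whose excess over the background is near zero dominate the log-space fit, so a nonlinear least squares fit may be preferable in practice.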

The winter months exhibit lower pairwise correlations overall and shorter correlation lengths relative to the summertime (2.4 km; 2.6 km during the day and 2.1 km at night). Some portion of the summer–winter differences may be attributable to seasonal differences in dominant wind patterns, although this effect is difficult to disentangle from the slightly different collection of sites sampled during the two seasons; the winter sample, for example, contains fewer pairs with separation lengths less than 5 km, which affects the perceived overall trend. In either season, the correlation lengths are, as expected, considerably longer than the previously observed ∼100–1000 m e-folding distances of reactive urban pollutants that are also lost via chemical pathways (e.g., Zhu et al., 2006; Beckerman et al., 2008; Choi et al., 2014), thus validating the original choice of 2 km as the desirable inter-site separation in the design of the BEACO2N instrument.

Figure 3. Optimal correlation coefficients for every possible pairing of winter 2017 sites as a function of their separation distance during all hours (a), daytime hours (11:00–18:00 LT, b), and nighttime hours (21:00–04:00 LT, c). Solid lines show exponential decay of the correlation coefficients.


The 24 h findings (top panels of Figs. 2 and 3) compare well to those presented by McKain et al., who also documented a decaying but nevertheless persistent correlation with increasing site separation. However, McKain et al. saw very little correlation after restricting their analysis to daytime hours, even at very short (< 5 km) inter-site distances, which implies that daytime observations reflect hyperlocal phenomena only. In contrast, we observe moderate to high correlations during the day, which illustrates that information about emissions and transport phenomena on a variety of scales is preserved. A spatial visualization of the daytime correlation coefficients at four representative winter sites is shown in Fig. 4. We see that PER is well correlated with its neighbors only, suggesting the presence of local phenomena that do not affect other parts of the network. LCC, however, also exhibits relationships with more distant sites, indicating a sensitivity to more regional-scale (10–30 km) influences. Meanwhile, HRS and OHS each possess at least one near neighbor with whom they are poorly correlated, perhaps due to hyperlocal events specific to those sites. While the region-wide phenomena can be characterized using sparser networks of high-cost, conventional monitoring equipment, the ability to capture these local processes is unique to the high-density approach.

Figure 4. Optimal correlation coefficients representing network-wide correlation with 5 min mean total CO2 concentrations at four representative sites during daytime hours (11:00–18:00 LT) of winter 2017. Yellow spot (r² = 1) on each subplot shows the location of the site at which the correlation is examined.


We posit that the true strength of a high-density, surface-level monitoring network lies in its characterization of hyperlocal phenomena unique to a given site or subset of sites. In order to directly examine signals attributable to these specific local CO2 emission processes, we separate each site's observations into a “regional” and “local” component. The regional component is, by definition, the same at all sites network-wide, calculated from the bottom 10th percentile of all BEACO2N readings collected during the surrounding 1 h window. The bottom 10th percentile is chosen (rather than the absolute minimum) to account for measurement error (±4 ppm at 1 min resolution; see Shusterman et al., 2016) as well as any near-field drawdown from the local biosphere; negative values in the local signals are likely attributable to some combination of these effects. While many different sites contribute to this bottom 10th percentile over the course of the data record, some sites located in close proximity to emission sources are never represented in the bottom 10th percentile and always exhibit some enhancement (i.e., a nonzero local component) over the regional background signal. The regional component is allowed to vary throughout the data record and will therefore reflect domain-wide changes in response to day of week, synoptic weather events, etc.
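
The regional and local decomposition can be sketched in a few lines. The example assumes a regular 1 min grid stored as a (time, site) array with NaN for missing data and interprets the "surrounding 1 h window" as a window centered on each time step; both are assumptions made for illustration, and the function name is hypothetical.

```python
import numpy as np

def split_regional_local(co2, window_steps=60, pct=10.0):
    """Split a (time, site) array of CO2 mole fractions (ppm) into components.

    regional[t]: pct-th percentile of all readings from all sites within a
                 window of roughly window_steps time steps centered on t.
    local[t, s]: co2[t, s] minus regional[t].
    """
    co2 = np.asarray(co2, dtype=float)
    n_time = co2.shape[0]
    half = window_steps // 2
    regional = np.full(n_time, np.nan)
    for t in range(n_time):
        window = co2[max(0, t - half): t + half + 1]
        if np.isfinite(window).any():
            regional[t] = np.nanpercentile(window, pct)
    local = co2 - regional[:, None]
    return regional, local

# Synthetic demonstration: one background-like site and one enhanced site.
co2_demo = np.column_stack([np.full(120, 400.0), np.full(120, 420.0)])
regional_demo, local_demo = split_regional_local(co2_demo)
```

Because the regional component is a low percentile rather than the minimum, individual local values can be slightly negative, as noted in the text.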

The diel profiles of the regional signal measured in summer and winter 2017 are shown in Fig. 5, reflecting the typical convolution of background concentrations, emission processes, and dynamics experienced across the entire BEACO2N domain. In both seasons, we see an increase in the regional signal beginning around 04:00 local time (LT), followed by a decrease in concentrations at 08:00 LT in the winter months and 11:00 LT in the summer, and another increase in early to late afternoon, depending on the season. This diurnal profile corresponds well with known patterns in traffic emissions – which are largely consistent across seasons – superimposed on diel fluctuations in boundary layer height and/or biosphere activity that vary in timing and magnitude according to the season. Namely, these results might be interpreted to conclude that the nighttime boundary layer in the BEACO2N domain is shallower during the winter months, producing a larger regional increase in response to rush hour traffic. The wintertime layer also appears to expand and re-contract earlier in the day than the summertime layer, resulting in both an earlier minimum and an earlier rise in afternoon–evening concentrations. The larger amplitude of the wintertime diurnal cycle may also reflect the greater influence of daytime photosynthesis and nighttime respiration during the San Francisco Bay Area's rainy winter season. An analysis of the regional signals calculated for similar periods in 2013 revealed qualitatively similar results (Fig. S1 in the Supplement), although it should be noted that the 2013 analysis uses observations from a significantly different subset of sites in the BEACO2N network.

Figure 5. Hourly median values of the network-wide, regional CO2 signals calculated for summer (orange) and winter (blue) periods in 2017. Lighter colored curves indicate the standard error; note the difference in y scale.


We isolate the local signals by subtracting the network-wide regional component from the data record at each site. Median 1 min local CO2 signals range from 0.3 to 40.2 ppm during the day (11:00–18:00 LT) and 1.1 to 38.5 ppm at night (21:00–04:00 LT) during the summer months, although the distributions are skewed, with the 10th- to 90th-percentile ranges stretching from −2.4 to 69.0 ppm during the day and −2.0 to 45.0 ppm at night. During the winter months, the daytime medians range from 3.6 to 34.8 ppm (−7.0 to 90.8 ppm 10th- to 90th-percentile range), while the nighttime medians range from −0.8 to 58.7 ppm (−15.0 to 90.6 ppm 10th- to 90th-percentile range). A full picture of the overall distributions is shown in Figs. S2 and S3, confirming a much greater frequency of high CO2 concentrations during the winter months. In both seasons, the distribution of the local enhancements is typically unimodal with a heavy right-hand tail, although some sites exhibit more complex bi- or multi-modal distributions.

By definition, we expect these local signals to represent a unique combination of emission sources and atmospheric dynamics specific to a given site. Here we endeavor to determine whether measurements of local CO2 enhancements can be used to monitor a single urban emission source, despite the complex landscape of CO2 sources and sinks present within the study domain. We choose to focus on mobile CO2 emissions as these are estimated to comprise approximately 40 % of the San Francisco Bay Area's annual CO2 emissions (Claire et al., 2015). This is the largest source sector in the CO2 emission inventory and likely to represent an even larger fraction within the urban core, where the next-largest source sectors (industrial/commercial and electricity/co-generation) are less abundant. However, as noted in the discussion of the regional signals above, direct observation of the magnitude and variation of traffic emissions via ambient CO2 concentrations is complicated by the coincident variation in turbulent mixing and boundary layer height as the earth's surface warms and cools at sunrise and sunset (Fig. S4).

In order to more directly examine the relationship between highway traffic flow and urban CO2 concentrations, we begin by analyzing the subset of observations collected between 04:00 and 08:00 LT at the LAN site, located less than 40 m from Interstate 880. During this period, traffic emissions are high, but the boundary layer is relatively shallow, thus increasing the sensitivity of the surface-level monitor to the traffic signal. The resultant strong positive correlation between rush hour traffic flow and local CO2 concentrations is shown in Fig. 6. An alternative analysis using traffic density – obtained by dividing the traffic flow by the average vehicle speed – yields almost identical results (Fig. S5), revealing a factor-of-2 increase in local CO2 mole fraction enhancements during congestion (high traffic flow/density) relative to free-flowing conditions (low traffic flow/density), similar to that observed by a previous on-road mobile monitoring study by Maness et al. (2015). Also shown in Fig. 6 are the median CO2 concentrations observed in each 500 vehicles h−1 traffic flow increment and the ordinary least squares linear regression through these binned medians.
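
The binning and regression steps described above can be illustrated as follows. The sketch assumes hourly arrays of traffic flow and local CO2 without missing values; the function names are hypothetical, and the demonstration data are synthetic rather than PeMS or BEACO2N observations.

```python
import numpy as np

def traffic_density(flow_veh_per_h, speed_km_per_h):
    """Traffic density (vehicles per km) as flow divided by average speed."""
    return np.asarray(flow_veh_per_h, dtype=float) / np.asarray(speed_km_per_h, dtype=float)

def binned_median_fit(flow, co2_local, bin_width=500.0):
    """Median local CO2 in each traffic flow bin and an OLS line through the medians."""
    flow = np.asarray(flow, dtype=float)
    co2_local = np.asarray(co2_local, dtype=float)
    bins = np.floor(flow / bin_width).astype(int)
    centers, medians = [], []
    for b in np.unique(bins):
        centers.append((b + 0.5) * bin_width)          # bin midpoint
        medians.append(np.median(co2_local[bins == b]))
    slope, intercept = np.polyfit(centers, medians, 1)
    return np.array(centers), np.array(medians), slope, intercept

# Synthetic demonstration: a perfectly linear flow-CO2 relationship.
flow_demo = np.repeat(np.arange(250.0, 5000.0, 500.0), 3)
co2_demo = 2.0 + 0.004 * flow_demo
centers, medians, slope, intercept = binned_median_fit(flow_demo, co2_demo)
density_demo = traffic_density(2000.0, 100.0)   # 20 vehicles per km
```

Fitting through binned medians rather than the raw points reduces the influence of the heavy right-hand tail in the local CO2 distribution.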

In addition to this first-order sensitivity to vehicle emissions at the near-roadway LAN site, we find that relatively subtle emission changes can also be detected using nodes stationed greater distances from the highway by controlling for the confounding impacts of dispersion and the biosphere. To do so, we decompose the CO2 signals into terms that represent the influence of meteorology (which is correlated with both dispersion and biosphere activity) and emissions separately via a multiple-linear-regression (MLR) approach analogous to that described by de Foy (2018). Briefly, we use an ordinary least squares linear regression to calculate the best fit of the relationship between a site's CO2 signal and temperature, specific humidity, wind, boundary layer height, time of day, day of week, and time of year. Hourly measurements of temperature, specific humidity, wind speed, and wind direction are taken from a single NOAA Integrated Surface Database weather station at the Port of Oakland International Airport (last access: 23 September 2018), and 3 h boundary layer heights are provided at 0.125° by 0.125° resolution by the ECMWF's ERA-Interim model (Dee et al., 2011; last access: 23 September 2018). Although the low spatiotemporal resolution of these datasets limits their ability to capture hyperlocal meteorologies, here we follow the example of de Foy, who was nonetheless able to derive meaningful results from similarly coarse weather products.

The nonlinear relationship between CO2 concentrations and wind or boundary layer height is captured by dividing these meteorological datasets into quartiles and assigning each observation a value between 0 (at the maximum of the quartile) and 1 (at the minimum) using piecewise linear interpolation. The wind speed quartiles are further subdivided by wind direction and reassigned values of 0–1 accordingly before fitting a linear coefficient to each subset. The time of year is represented as a sum of sines and cosines with annual or semiannual periodicities whose values also vary between 0 and 1 and whose amplitudes are determined by the linear regression. Zeroes and ones are used to designate each hour of each type of day of the week as well. For example, time steps corresponding to 08:00 LT on a Monday may be assigned a 1 while all other time steps are set to zero before the linear regression is performed. As a result, the MLR factors derived for each of the preceding explanatory variables can be interpreted in units of ppm CO2. Meanwhile, the temperature and specific humidity variables are treated by calculating their difference from their mean values and dividing by their respective standard deviations before each is fit to CO2 with a single linear coefficient, which will have units of ppm K−1 and ppm (kg water (kg air)−1)−1, respectively.
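
The explanatory variables described above can be constructed along the following lines. This sketch reflects one possible reading of the text: each quartile receives its own column that ramps linearly from 1 at the quartile minimum to 0 at the quartile maximum and is zero elsewhere. The exact construction in de Foy (2018) may differ, the subdivision of wind speed by wind direction is omitted, and all function names are hypothetical.

```python
import numpy as np

def quartile_ramps(x):
    """One column per quartile: 1 at the quartile minimum, 0 at its maximum."""
    x = np.asarray(x, dtype=float)
    edges = np.quantile(x, [0.0, 0.25, 0.5, 0.75, 1.0])
    cols = np.zeros((x.size, 4))
    for q in range(4):
        lo, hi = edges[q], edges[q + 1]
        inside = (x >= lo) & ((x <= hi) if q == 3 else (x < hi))
        if hi > lo:
            cols[inside, q] = (hi - x[inside]) / (hi - lo)
    return cols

def hour_daytype_dummies(hour, daytype, n_daytypes=5):
    """0/1 indicator for each hour (0-23) of each day type (0 to n_daytypes - 1)."""
    hour = np.asarray(hour, dtype=int)
    daytype = np.asarray(daytype, dtype=int)
    cols = np.zeros((hour.size, 24 * n_daytypes))
    cols[np.arange(hour.size), daytype * 24 + hour] = 1.0
    return cols

def annual_harmonics(day_of_year, year_length=365.25):
    """Annual and semiannual sines and cosines rescaled to the range 0-1."""
    phase = 2.0 * np.pi * np.asarray(day_of_year, dtype=float) / year_length
    waves = [np.sin(phase), np.cos(phase), np.sin(2 * phase), np.cos(2 * phase)]
    return np.column_stack([(w + 1.0) / 2.0 for w in waves])

def standardize(x):
    """Difference from the mean divided by the standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

# Small demonstration with made-up inputs.
ramps_demo = quartile_ramps(np.arange(8.0))
dummies_demo = hour_daytype_dummies([8], [0])   # 08:00 LT on day type 0 (e.g., Monday)
harmonics_demo = annual_harmonics([0, 100])
z_demo = standardize([1.0, 2.0, 3.0])
```

Stacking these columns side by side gives a design matrix whose fitted coefficients for the ramp, harmonic, and indicator columns are directly in ppm.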

The independent variable leading to the greatest square of the Pearson correlation coefficient is then combined with each of the remaining variables, and a second regression is performed. The two-input combination leading to the largest increase in the correlation coefficient is then combined with each of the remaining variables, and so on, until the addition of a new independent variable no longer increases the r² value by at least 0.005.
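
The variable selection procedure above is a form of forward stepwise regression, which can be sketched as follows. The example assumes each explanatory variable is supplied as a named group of one or more columns and that r² is computed from an ordinary least squares fit with an intercept; the function names and the synthetic data are illustrative only.

```python
import numpy as np

def r_squared(X, y):
    """r² of an OLS fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid.var() / y.var()

def forward_select(groups, y, min_gain=0.005):
    """Add variable groups one at a time until r² improves by less than min_gain."""
    y = np.asarray(y, dtype=float)
    chosen, gains, best_r2 = [], [], 0.0
    remaining = dict(groups)
    while remaining:
        scores = {}
        for name, cols in remaining.items():
            X = np.column_stack([groups[c] for c in chosen] + [cols])
            scores[name] = r_squared(X, y)
        name = max(scores, key=scores.get)
        gain = scores[name] - best_r2
        if gain < min_gain:
            break
        chosen.append(name)
        gains.append(gain)
        best_r2 = scores[name]
        del remaining[name]
    return chosen, gains

# Synthetic demonstration: y depends on a and b but not on c.
rng = np.random.default_rng(0)
n = 500
a, b, c = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
y_demo = 3.0 * a + b + 0.1 * rng.normal(size=n)
chosen_demo, gains_demo = forward_select({"a": a, "b": b, "c": c}, y_demo)
```

The list of gains returned here corresponds to the kind of per-variable r² increases reported in Table 2.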

Table 2. Explanatory variables included in the multiple-linear-regression analysis of each site; values indicate the correlation coefficient increase achieved by subsequent inclusion of each variable.


Figure 6. Morning (04:00–08:00 LT) local summertime CO2 concentrations at LAN shown as a function of nearby highway traffic flow. Darker points indicate the median CO2 concentration observed in each 500 vehicles h−1 traffic flow increment; black solid line indicates the linear regression through the binned medians (equation given above plot), and gray dashed lines show the uncertainty in the regression slope.


For this analysis, we use hourly total CO2 concentrations (the sum of the local and regional components) measured at five sites between 15 February 2017 and 15 February 2018. For each site, the optimal set of explanatory variables and their relative contributions to the correlation coefficient are given in Table 2. Summing the products of each of the MLR factors with their respective independent variables (e.g., time of day, wind speed) gives the mixing ratio predicted by the MLR model; a representative week of observed and modeled CO2 concentrations is shown in Fig. 7. We find generally good agreement, with some significant hour-by-hour model–observation differences, especially at RFS. These do not, however, appear to be systematic either in sign or in timing (e.g., the rush hour peak in CO2 may be poorly modeled on one day but well predicted on another).

Figure 7. Representative week of total CO2 concentrations observed (thick gray curve) and modeled (dashed blue curve) at five sites using a multiple-linear-regression approach based on de Foy (2018).


Multiple-linear-regression coefficients are derived for each hour of the day during five types of days of the week (Mondays, Tuesdays through Thursdays, Fridays, Saturdays, and Sundays); for clarity, Fig. 8 shows the regression coefficients for Tuesdays through Thursdays and Sundays. Other days of the week are shown in Fig. S6. These MLR “factors” signify the average CO2 enhancement or depletion (in ppm) uniquely associated with a particular hour of a particular day of the week. The dependencies on time of day and day of week derived via this method are hypothesized to primarily reflect the changes in emissions, as the influence of the coincident changes in atmospheric dynamics has been at least partially controlled for. For reference, the corresponding Tuesday–Thursday and Sunday diel cycles in the total CO2 observed at each site are shown in Fig. 9. Indeed, we do observe some of the same intuitive patterns in the linear regression coefficients, such as higher coefficients on weekday mornings corresponding to higher rush hour traffic emissions on those days, but with greater opportunity to differentiate between days of the week, especially around noon, when raw concentrations are generally similar. As expected, the Tuesday–Thursday enhancement in the MLR factors is larger at the sites located close to a freeway (e.g., up to 520 % higher than the corresponding Sunday MLR factor at FTK) but is less pronounced at LBL (70 %), which is farther away from major mobile sources. For reference, the 1 km by 1 km FIVE mobile emission inventory developed for the San Francisco Bay Area by McDonald et al. (2014) predicts a ∼210 % weekday enhancement on average, peaking around 05:00 LT, much earlier in the day than is observed here.

Figure 8. Multiple-linear-regression coefficients for five sites derived for each hour of the day on Tuesdays through Thursdays (orange solid line) and Sundays (blue dashed line) between 15 February 2017 and 15 February 2018.


Figure 9. Hourly median CO2 concentrations observed at five sites on Tuesdays through Thursdays (orange solid line) and Sundays (blue dashed line) between 15 February 2017 and 15 February 2018; lighter curves indicate the standard error in the medians.


When we examine the relationship between these multiple-linear-regression coefficients and morning traffic flow as we did at LAN (Fig. 10), we again find positive correlations. This is an interesting result, given that the traffic flow measured on a single highway likely provides only a first-order approximation of the total traffic emissions influencing a single CO2 monitor, especially those situated at greater distances from said highway, which may be sensitive to additional highways, as well as local roads. Although the predominance of a single highway's emissions (or at least its correlation with those from other sources) is not a necessary condition of our MLR analysis, the strong positive correlations we observe suggest that this methodology may nonetheless be useful in monitoring emissions from individual highways such as these.

Figure 10. Morning (04:00–08:00 LT) multiple-linear-regression coefficients shown as a function of summertime traffic flow; black solid lines indicate the linear regression through the MLR factors (equations given above each subplot), and gray dashed lines show the uncertainty in the regression slope.


The standard error of the slope of the linear regression is calculated as the standard deviation of the model–data CO2 residuals divided by the square root of the sum of the squared differences between each traffic flow increment and the mean traffic flow. The 1σ uncertainty in the slopes (i.e., the 68 % confidence interval, assuming a Gaussian error distribution) is thus found to be 11 %–30 %, indicating that analysis of a single site could be used to detect changes in average emissions per vehicle as small as 11 %, an improvement upon the 17 % slope uncertainty calculated for the near-highway LAN site. For reference, under the Corporate Average Fuel Economy standards, the state of California aims to achieve a fleet-wide average fuel economy of 23.2 km L−1 by the year 2025 (US EPA, 2012), corresponding to a 35 % decrease in emissions relative to the 15.1 km L−1 economy of 2012–2016 model year vehicles. Assuming a steady decrease in emissions of 3.5 % yr−1, an 11 % decrease would be achieved after approximately 3 years; one BEACO2N site is therefore sufficiently sensitive to detect such a trend with 68 % confidence in as little as 3 years. By leveraging multiple independent sites, even greater confidence and/or shorter timescales could be achieved.
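
The slope uncertainty and the fuel economy arithmetic above can be checked with a short calculation. The sketch assumes that the residual standard deviation is computed with n − 2 degrees of freedom (the usual convention for a two-parameter fit, not stated explicitly in the text) and that CO2 emissions per kilometer scale inversely with fuel economy; the function name and the small test arrays are illustrative.

```python
import numpy as np

def slope_standard_error(x, y):
    """OLS slope and its standard error: s_resid / sqrt(sum((x - mean(x))^2))."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    s_resid = np.sqrt(np.sum(resid ** 2) / (x.size - 2))   # assumed n - 2 dof
    return slope, s_resid / np.sqrt(np.sum((x - x.mean()) ** 2))

# Made-up points, used only to exercise the function.
slope_demo, se_demo = slope_standard_error([0, 1, 2, 3], [0, 1, 2, 4])

# Emissions per km scale as 1 / fuel economy (assumption).
reduction = 1.0 - 15.1 / 23.2       # about 0.35, i.e., a 35 % decrease
years_to_detect = 0.11 / 0.035      # about 3.1 years at 3.5 % per year
print(f"reduction: {reduction:.0%}, years to an 11 % change: {years_to_detect:.1f}")
```

The relative slope uncertainty quoted in the text corresponds to the ratio of the standard error to the slope itself.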

It is likely that sensitivity could be further enhanced with more accurate meteorological datasets. While the single weather station and relatively coarse (0.125° by 0.125°) reanalysis product we use here may be adequate to represent the meteorological conditions across some domains, the San Francisco Bay Area is at the high end of complexity in terms of terrain and microclimatology. Higher-resolution boundary layer heights and neighborhood-specific wind observations may improve the results of our multiple linear regression, but these types of measurements are rarely available on the spatial scale of the BEACO2N instrument and are difficult to simulate with accuracy (Jiménez et al., 2013; Banks et al., 2016). In future work, high-density networks like BEACO2N may therefore be useful not just in source attribution but also in providing a much-needed observational constraint on our understanding of near-surface transport.

Future work will also make use of the ancillary datasets provided by the BEACO2N platform, such as the concurrent NOx and CO concentrations. Prior studies have demonstrated a methodology for detecting plume-like events in the BEACO2N NOx and CO observations (Kim et al., 2018), and the ratio of these species to CO2 provides a unique signature for each different CO2 source (e.g., Ban-Weiss et al., 2008; Harley et al., 2005; Lopez et al., 2013; Nathan et al., 2018; Turnbull et al., 2015). These ratios allow subsets of the data record to be directly attributed to specific (e.g., mobile) source types and allow the relationship between these specific activities and CO2 mixing ratios to be derived more precisely. With such a precise methodology for converting between emissions and concentrations, subtler interannual trends in emissions could be detected, for example changes in vehicle emissions following construction of new housing.
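To illustrate how a tracer-to-CO2 ratio might be computed for a single plume-like event, the sketch below estimates an enhancement ratio from co-located observations. It is an illustration under stated assumptions, not a method taken from the studies cited above: the function name, the fixed background values supplied by the caller, and the choice of a least-squares fit through the origin are all hypothetical.

```python
def enhancement_ratio(tracer, co2, tracer_background, co2_background):
    """Ratio of tracer enhancement to CO2 enhancement during one event.

    tracer, co2: concurrent observations (e.g., CO in ppb, CO2 in ppm).
    tracer_background, co2_background: assumed background levels, which
    the caller must supply (how to choose them is outside this sketch).

    The ratio is the least-squares slope, forced through the origin, of
    the tracer enhancement against the CO2 enhancement.
    """
    if len(tracer) != len(co2) or not tracer:
        raise ValueError("tracer and co2 must be non-empty and equal in length")
    d_tracer = [t - tracer_background for t in tracer]
    d_co2 = [c - co2_background for c in co2]
    denom = sum(c * c for c in d_co2)
    if denom == 0:
        raise ValueError("no CO2 enhancement above background")
    return sum(t * c for t, c in zip(d_tracer, d_co2)) / denom
```

Comparing the ratio returned for an event against ratios reported for known source types would be one way to label that event, although any thresholds used for such labeling would need to come from the literature rather than from this sketch.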

4 Conclusions

We have described the heterogeneity measured at the individual sites of a high-density, surface-level urban CO2 monitoring network. Network-wide correlation length scales are found to be slightly longer during daytime during the summer and generally shorter during winter months, but they fall in the range of values reported previously based on other stationary observation networks and mobile monitoring campaigns. High near-field correlations are thought to be driven by shared sensitivity to local emission events, while moderate far-field correlations reflect regional episodes, suggesting that a given site's data record is likely a convolution of both phenomena. We therefore present a methodology for separating the observed CO2 concentrations into local and regional components and observe distinct distributions (i.e., unimodal vs. bimodal) of local CO2 enhancements within single neighborhoods. A clear relationship is seen between morning rush hour traffic counts and local CO2 concentrations, allowing for the detection of changes in vehicle emissions within 3 years if those changes proceed at a rate consistent with policy objectives.

Most prior studies of urban CO2 emissions (e.g., McKain et al., 2012; Kort et al., 2013; Wu et al., 2018) have favored sparser networks of high-quality instruments, finding this approach to be better suited for resolving trends in total region-wide emissions. It is hypothesized that the ideal monitoring strategy depends on the particular goals and location of a given application, with certain locales and emission sources necessitating high-cost, low-density instrumentation, complemented by other domains where low-cost, high-density platforms are more effective. The potential trade-offs between measurement quality and instrument quantity specific to the San Francisco Bay Area have been investigated previously using an ensemble of observing system simulations by Turner et al. (2016), who found BEACO2N-like observing systems to outperform smaller, higher-quality networks in estimating regional as well as more localized emission phenomena there. While Turner et al. saw significant benefits to achieving an hourly instrument precision of 1 ppm, further increases in measurement quality offered little advantage in constraining emissions, especially those from line and point sources.

This work thus provides an important data-based validation of the conclusions of Turner et al.'s theoretical analysis. Not only do we demonstrate the ability of low-cost sensors to sufficiently constrain policy-relevant trends in line source (i.e., highway traffic) emissions, but we do so without the use of computationally intense and heavily parameterized atmospheric transport models. Furthermore, we show that a multiple-linear-regression analysis allows the signature of highway traffic to be extracted from sites located throughout the network, enabling trends in mobile emissions to be quantified without specially situated roadside monitors. Although this approach requires real-time traffic count information that is not yet available at all locations, our finding is nonetheless an important result, as deriving and implementing a particular a priori network layout is a non-trivial task. Domain-specific transport patterns prevent the development of general principles of optimal sensor placement, and, even if ideal locations can be identified, cooperation from facilities in the area cannot be guaranteed. By establishing for the first time that an ad hoc, opportunistic sensor siting approach can nonetheless provide sensitivity to emission sources of interest, we thus improve the prospects for widespread adoption of distributed monitoring systems in the future.

Progress toward evaluating the capabilities and proper use of low-cost sensors has particular relevance for nations with rapidly developing economies, where CO2 emissions are increasing much faster than the resources needed to monitor them by conventional means. Domestically, citizen science and environmental justice groups are also adopting these technologies (Snyder et al., 2013) as an economically accessible means of advocating for greater public health and ecological wellbeing. While the specific correlation lengths and emission estimates we derive here are unique to the San Francisco Bay Area domain, the sensor performance capabilities and data analysis techniques we outline provide guidance more generally to any future studies attempting to interpret similar datasets around the world. High-resolution surface networks enabled by low-cost technologies offer a unique opportunity to provide ground truth constraints on difficult-to-model near-surface dynamics as well as on the individual CO2 sources and sinks that comprise the strategic backbone of greenhouse gas mitigation regulation.

Data availability

All BEACO2N CO2 observations used in this analysis can be downloaded from Zenodo (Shusterman and Cohen, 2018). Traffic counts are available on the California Department of Transportation website (last access: 23 September 2018); wind, temperature, and humidity observations are available on the NOAA Integrated Surface Database website (last access: 23 September 2018); and boundary layer heights are available on the ECMWF website (last access: 23 September 2018).


The supplement related to this article is available online.

Author contributions

AS, JK, KL, CN, and PW collected the data used in this analysis. AS designed and executed said analysis and composed the manuscript. JK, KL, and RC provided additional manuscript feedback and RC supervised the project.

Competing interests

The authors declare that they have no conflict of interest.


Acknowledgements

This work was funded by the National Science Foundation (1035050; 1038191), the National Aeronautics and Space Administration (NAS2-03144), the Bay Area Air Quality Management District (2013.145), and the Environmental Defense Fund. Additional support was provided by an NSF Graduate Research Fellowship to Alexis A. Shusterman, a Kwanjeong Lee Chonghwan Educational Fellowship to Jinsol Kim, and a Hellman Fund Fellowship to Kaitlyn J. Lieschke. We acknowledge the use of datasets maintained by the California Department of Transportation, the National Oceanic and Atmospheric Administration, and the European Centre for Medium-Range Weather Forecasts.

Edited by: Andreas Engel
Reviewed by: Jocelyn Turnbull and one anonymous referee


References

Banks, R. F., Tiana-Alsina, J., Baldasano, J. M., Rocadenbosch, F., Papayannis, A., Solomos, S., and Tzanis, C. G.: Sensitivity of boundary-layer variables to PBL schemes in the WRF model based on surface meteorological observations, lidar, and radiosondes during the HygrA-CD campaign, Atmos. Res., 176, 185–201, 2016.

Ban-Weiss, G. A., McLaughlin, J. P., Harley, R. A., Lunden, M. M., Kirchstetter, T. W., Kean, A. J., Strawa, A. W., Stevenson, E. D., and Kendall, G. R.: Long-term changes in emissions of nitrogen oxides and particulate matter from on-road gasoline and diesel vehicles, Atmos. Environ., 42, 220–232, 2008.

Beckerman, B., Jerrett, M., Brook, J. R., Verma, D. K., Arain, M. A., and Finkelstein, M. M.: Correlation of nitrogen dioxide with other traffic pollutants near a major expressway, Atmos. Environ., 42, 275–290, 2008.

Bréon, F. M., Broquet, G., Puygrenier, V., Chevallier, F., Xueref-Remy, I., Ramonet, M., Dieudonné, E., Lopez, M., Schmidt, M., Perrussel, O., and Ciais, P.: An attempt at estimating Paris area CO2 emissions from atmospheric concentration measurements, Atmos. Chem. Phys., 15, 1707–1724, 2015.

Chen, J., Viatte, C., Hedelius, J. K., Jones, T., Franklin, J. E., Parker, H., Gottlieb, E. W., Wennberg, P. O., Dubey, M. K., and Wofsy, S. C.: Differential column measurements using compact solar-tracking spectrometers, Atmos. Chem. Phys., 16, 8479–8498, 2016.

Choi, W., Winer, A. M., and Paulson, S. E.: Factors controlling pollutant plume length downwind of major roadways in nocturnal surface inversions, Atmos. Chem. Phys., 14, 6925–6940, 2014.

Claire, S. J., Dinh, T. M., Fanai, A. K., Nguyen, M. H., and Schultz, S. A.: Bay Area emissions inventory summary report: greenhouse gases, Tech. rep., Bay Area Air Quality Management District, San Francisco, CA, USA, 2015.

Dee, D. P., Uppala, S. M., Simmons, A. J., Berrisford, P., Poli, P., Kobayashi, S., Andrae, U., Balmaseda, M. A., Balsamo, G., Bauer, P., Bechtold, P., Beljaars, A. C. M., van de Berg, L., Bidlot, J., Bormann, N., Delsol, C., Dragani, R., Fuentes, M., Geer, A. J., Haimberger, L., Healy, S. B., Hersbach, H., Hólm, E. V., Isaksen, L., Kållberg, P., Köhler, M., Matricardi, M., McNally, A. P., Monge-Sanz, B. M., Morcrette, J.-J., Park, B.-K., Peubey, C., de Rosnay, P., Tavolato, C., Thépaut, J.-N., and Vitart, F.: The ERA-Interim reanalysis: configuration and performance of the data assimilation system, Q. J. Roy. Meteor. Soc., 137, 553–597, 2011.

de Foy, B.: City-level variations in NOx emissions derived from hourly monitoring data in Chicago, Atmos. Environ., 176, 128–139, 2018.

Gurney, K. R., Razlivanov, I., Song, Y., Zhou, Y., Benes, B., and Abdul-Massih, M.: Quantification of fossil fuel CO2 emissions on the building/street scale for a large U.S. city, Environ. Sci. Technol., 46, 12194–12202, 2012.

Harley, R. A., Marr, L. C., Lehner, J. K., and Giddings, S. N.: Changes in motor vehicle emissions on diurnal to decadal time scales and effects on atmospheric composition, Environ. Sci. Technol., 39, 5356–5362, 2005.

Jiménez, P. A., Dudhia, J., González-Rouco, J. F., Montávez, J. P., García-Bustamante, E., Navarro, J., Vilà-Guerau de Arellano, J., and Muñoz-Roldán, A.: An evaluation of WRF's ability to reproduce the surface wind over complex terrain based on typical circulation patterns, J. Geophys. Res.-Atmos., 118, 7651–7669, 2013.

Kim, J., Shusterman, A. A., Lieschke, K. J., Newman, C., and Cohen, R. C.: The BErkeley Atmospheric CO2 Observation Network: field calibration and evaluation of low-cost air quality sensors, Atmos. Meas. Tech., 11, 1937–1946, 2018.

Kort, E. A., Angevine, W. M., Duren, R., and Miller, C. E.: Surface observations for monitoring urban fossil fuel CO2 emissions: minimum site location requirements for the Los Angeles Megacity, J. Geophys. Res.-Atmos., 118, 1577–1584, 2013.

Lopez, M., Schmidt, M., Delmotte, M., Colomb, A., Gros, V., Janssen, C., Lehman, S. J., Mondelain, D., Perrussel, O., Ramonet, M., Xueref-Remy, I., and Bousquet, P.: CO, NOx and 13CO2 as tracers for fossil fuel CO2: results from a pilot study in Paris during winter 2010, Atmos. Chem. Phys., 13, 7343–7358, 2013.

Maness, H. L., Thurlow, M. E., McDonald, B. C., and Harley, R. A.: Estimates of CO2 traffic emissions from mobile concentration estimates, J. Geophys. Res.-Atmos., 120, 2087–2102, 2015.

McDonald, B. C., McBride, Z. C., Martin, E. W., and Harley, R. A.: High-resolution mapping of motor vehicle carbon dioxide emissions, J. Geophys. Res.-Atmos., 119, 5283–5298, 2014.

McKain, K., Wofsy, S. C., Nehrkorn, T., Eluszkiewicz, J., Ehleringer, J. R., and Stephens, B. B.: Assessment of ground-based atmospheric observations for verification of greenhouse gas emissions from an urban region, P. Natl. Acad. Sci. USA, 109, 8423–8428, 2012.

McKain, K., Down, A., Raciti, S. M., Budney, J., Hutyra, L. R., Floerchinger, C., Herndon, S. C., Nehrkorn, T., Zahniser, M. S., Jackson, R. B., Phillips, N., and Wofsy, S. C.: Methane emissions from natural gas infrastructure and use in the urban region of Boston, Massachusetts, P. Natl. Acad. Sci. USA, 112, 1941–1946, 2015.

Nathan, B., Lauvaux, T., Turnbull, J., and Gurney, K.: Investigations into the use of multi-species measurements for source apportionment of the Indianapolis fossil fuel CO2 signal, Elem. Sci. Anth., 6, 21, 2018.

Newman, S., Xu, X., Gurney, K. R., Hsu, Y. K., Li, K. F., Jiang, X., Keeling, R., Feng, S., O'Keefe, D., Patarasuk, R., Wong, K. W., Rao, P., Fischer, M. L., and Yung, Y. L.: Toward consistency between trends in bottom-up CO2 emissions and top-down atmospheric measurements in the Los Angeles megacity, Atmos. Chem. Phys., 16, 3843–3863, 2016.

Pacala, S. W., Breidenich, C., Brewer, P. G., Fung, I., Gunson, M. R., Heddle, G., Law, B., Marland, G., Paustian, K., Prather, M., Randerson, J. T., Tans, P., and Wofsy, S. C.: Verifying Greenhouse Gas Emissions: Methods to Support International Climate Agreements, The National Academies Press, Washington, D.C., 2010.

Patarasuk, R., Gurney, K. R., O'Keeffe, D., Song, Y., Huang, J., Preeti, R., Buchert, M., Lin, J. C., Mendoza, D., and Ehleringer, J. R.: Urban high-resolution fossil fuel CO2 emissions quantification and exploration of emission drivers for potential policy applications, Urban Ecosyst., 19, 1013–1039, 2016.

Pugliese, S. C., Murphy, J. G., Vogel, F. R., Moran, M. D., Zhang, J., Zheng, Q., Stroud, C. A., Ren, S., Worthy, D., and Broquet, G.: High-resolution quantification of atmospheric CO2 mixing ratios in the Greater Toronto Area, Canada, Atmos. Chem. Phys., 18, 3387–3401, 2018.

Shusterman, A. A. and Cohen, R. C.: Selected CO2 Data from BErkeley Atmospheric CO2 Observation Network [Data set], Zenodo, 2018.

Shusterman, A. A., Teige, V. E., Turner, A. J., Newman, C., Kim, J., and Cohen, R. C.: The BErkeley Atmospheric CO2 Observation Network: initial evaluation, Atmos. Chem. Phys., 16, 13449–13463, 2016.

Snyder, E. G., Watkins, T. H., Solomon, P. A., Thoma, E. D., Williams, R. W., Hagler, G. S. W., Shelow, D., Hindin, D. A., Kilaru, V. J., and Preuss, P. W.: The changing paradigm of air pollution monitoring, Environ. Sci. Technol., 47, 11369–11377, 2013.

Turnbull, J. C., Sweeney, C., Karion, A., Newberger, T., Lehman, S. J., Tans, P. P., Davis, K. J., Lauvaux, T., Miles, N. L., Richardson, S. J., Cambaliza, M. O., Shepson, P. B., Gurney, K., Patarasuk, R., and Razlivanov, I.: Toward quantification and source sector identification of fossil fuel CO2 emissions from an urban area: Results from the INFLUX experiment, J. Geophys. Res.-Atmos., 120, 292–312, 2015.

Turner, A. J., Shusterman, A. A., McDonald, B. C., Teige, V., Harley, R. A., and Cohen, R. C.: Network design for quantifying urban CO2 emissions: assessing trade-offs between precision and network density, Atmos. Chem. Phys., 16, 13465–13475, 2016.

United Nations, Human Settlement Programme: Hot Cities: Battle-Ground for Climate Change, 2011.

United Nations, Framework Convention on Climate Change: Adoption of the Paris Agreement, 21st Conference of the Parties, Paris, 2015.

United States Environmental Protection Agency: 2017 and Later Model Year Light-Duty Vehicle Greenhouse Gas Emissions and Corporate Average Fuel Economy Standards, Washington, D.C., 2012.

Verhulst, K. R., Karion, A., Kim, J., Salameh, P. K., Keeling, R. F., Newman, S., Miller, J., Sloop, C., Pongetti, T., Rao, P., Wong, C., Hopkins, F. M., Yadav, V., Weiss, R. F., Duren, R. M., and Miller, C. E.: Carbon dioxide and methane measurements from the Los Angeles Megacity Carbon Project – Part 1: calibration, urban enhancements, and uncertainty estimates, Atmos. Chem. Phys., 17, 8313–8341, 2017.

Wu, K., Lauvaux, T., Davis, K. J., Deng, A., Lopez Coto, I., Gurney, K. R., and Patarasuk, R.: Joint inverse estimation of fossil fuel and biogenic CO2 fluxes in an urban environment: An observing system simulation experiment to assess the impact of multiple uncertainties, Elem. Sci. Anth., 6, 17, 2018.

Zhu, Y., Kuhn, T., Mayo, P., and Hinds, W. C.: Comparison of daytime and nighttime concentration profiles and size distributions of ultrafine particles near a major highway, Environ. Sci. Technol., 40, 2531–2536, 2006.

Short summary
We describe the diversity and heterogeneity of urban CO2 levels observed using the BErkeley Atmospheric CO2 Observation Network, a distributed instrument of > 50 CO2 sensors stationed every ~ 2 km across the San Francisco Bay Area. We demonstrate that relatively simple mathematical techniques, applied to these observations, can be used to detect the small changes in highway CO2 emissions expected to result from upcoming fuel economy regulations, affirming the policy relevance of low-cost sensors.