Assessing Filtering of Mountaintop CO 2 Mole Fractions for Application to Inverse Models of Biosphere-atmosphere Carbon Exchange

There is a widely recognized need to improve our understanding of biosphere-atmosphere carbon exchanges in areas of complex terrain including the United States Mountain West. CO 2 fluxes over mountainous terrain are often difficult to measure due to unusual and complicated influences associated with atmospheric transport. Consequently, deriving regional fluxes in mountain regions with carbon cycle inversion of atmospheric CO 2 mole fraction is sensitive to filtering of observations to those that can be represented at the transport model resolution. Using five years of CO 2 mole fraction observations from the Regional Atmospheric Continuous CO 2 Network in the Rocky Mountains (Rocky RACCOON), five statistical filters are used to investigate a range of approaches for identifying regionally representative CO 2 mole fractions. Test results from three filters indicate that subsets based on short-term variance and local CO 2 gradients across tower inlet heights retain nine-tenths of the total observations and are able to define representative diel variability and seasonal cycles even for difficult-to-model sites where the influence of local fluxes is much larger than regional mole fraction variations. Test results from two other filters that consider measurements from previous and following days using spline fitting or sliding windows are overly selective. Case study examples showed that these windowing-filters rejected measurements representing synoptic changes in CO 2 , which suggests that they are not well suited to filtering continental CO 2 measurements. We present a novel CO 2 lapse rate filter that uses CO 2 differences between levels in the model atmosphere to select subsets of site measurements that are representative on model scales. Our new filtering techniques provide guidance for novel approaches to assimilating mountain-top CO 2 mole fractions in carbon cycle inverse models.


Introduction
The Western United States is suspected to have substantial carbon sinks with uptake that is strongly determined by ecosystem dynamics in complex terrain above 750 m (Schimel et al., 2002; Hu et al., 2010). Carbon cycle inverse models that assimilate CO 2 mole fractions to infer land-atmosphere CO 2 fluxes present an excellent opportunity for identifying the magnitude and climate sensitivity of these different carbon sinks in the Mountain West (Raupach, 2011). There are, however, two major issues when using carbon cycle inversion models in complex terrain. First, model topographies are often too coarsely gridded to represent complex terrain, resulting in large mismatches (e.g. 10^3 m) between the actual surface elevation and the model surface elevation. Second, winds used to drive inversion models are not always accurate, particularly in complex terrain, and may incorrectly inform the model about the source region of assimilated measurements.
One method for dealing with scale representativeness of atmospheric transport inversions is to use a high resolution modeling framework (Lauvaux et al., 2008; Göckede et al., 2010; Pillai et al., 2011; Gourdji et al., 2012; Lauvaux et al., 2012). However, these studies are limited in scope because they are strongly sensitive to proper specification of lateral inflow fluxes, do not cover CO 2 exchange on continental to global scales, and do not span multiple years, which limits our capability to make inferences about the spatiotemporal variability of regional terrestrial carbon sources and sinks that are highly variable from year to year.
Another approach is to assimilate CO 2 mole fraction measurements from mountaintop locations where the causes of CO 2 variability have been well studied (e.g. Pérez-Landa et al., 2007; Sun et al., 2010). However, airflow patterns at such sites cannot be assumed to be representative of all variability across the region. CO 2 mole fractions must be precisely measured by a network of sites and filtered to remove observations that are strongly influenced by local sources and sinks. Although filtering (selecting representative subsets) reduces the number of observations available for use as assimilation constraints, filtering is necessary in order to distinguish the model-resolvable biotic changes in regional CO 2 fluxes caused, for example, by photosynthesis, respiration, and disturbance (Boisvenue and Running, 2010; Medvigy et al., 2010) from potentially larger diel and seasonal variations that are driven by complex terrain transport (Stewart et al., 2002; Yi et al., 2008; Burns et al., 2011).
Until recently, much of the Mountain West region between Colorado and Nevada (Fig. 1) represented a large gap in the monitoring coverage of continuous CO 2 mole fraction measurements, which has limited our ability to determine its relative importance as a carbon sink. The Mountain West region spans a large portion of the western US, where ecoregions are abruptly divided by physiographic barriers that give rise to heterogeneous plant distributions and complex CO 2 transport and climate drivers.
Although site-scale eddy flux towers such as the Niwot Ridge AmeriFlux tower can capture local (e.g. 1 km^2) net ecosystem exchange (NEE; Monson et al., 2002; Hu et al., 2010), these measurements may not be representative of regional changes (e.g. 10 000 km^2). Regional scale boundary layer budgets are difficult to construct (Desai et al., 2011), which leaves atmospheric tracer-transport inversion modeling as one of few ways to constrain regional carbon budgets.
There is a need to identify well-mixed regional air mass measurements corresponding to the resolution of one model grid cell over smoothed terrain for accurate retrievals of CO 2 fluxes by tracer-transport inversion (Denning et al., 2002; Gurney et al., 2002; Gerbig et al., 2003; Lin et al., 2004; de Wekker et al., 2009; Gurney and Eckels, 2011).
In this study we examine how partitioning the complete set using different methods selects subsets of the data that have different representativeness, and which method(s) are most likely to produce subsets that imply well mixed air on spatial scales corresponding to the transport model. We use measurements from the Regional Atmospheric Continuous CO 2 Network in the Rocky Mountains (http://raccoon.ucar.edu/). Datasets from the still-growing RACCOON network date back to August 2005. However, as described earlier, the complete set of these data contains samples representing both local and regional air influences, which necessitates filtering. Analyses of the subsets selected by the filters permit us to: (1) determine if hourly-statistical filters of CO 2 time series, which do not consider past and future CO 2 variability, are sufficient for identifying local or regional air masses as a way to "flag" data prior to assimilation into an inverse model; and (2) investigate how these filters compare to CO 2 filters that utilize preceding and following CO 2 mole fractions and variability to determine cutoff ranges.

Importance of regionally representative CO 2 mole fractions
When estimating carbon cycle flux parameters and magnitudes by inverse techniques (e.g. Bayesian synthesis inversion, geostatistical inverse modeling, ensemble Kalman filtering), unfiltered data that include measurements representative of small-scale local influences (especially in complex terrain, e.g. Turnipseed et al., 2004) can lead to model parameters that do not accurately represent the process of interest. Mountaintop observations of CO 2 mole fractions are particularly important because stations at high elevation can frequently be subject to descending well-mixed air masses that may be suitable for assimilation by inverse models. Regional representativeness of the data can also be improved by selectively partitioning measurements that are likely to be representative of well mixed air masses on scales of 10 000 km^2. Filtering data, however, poses challenges. The spatial and temporal scale coverage of automated regional observation networks makes flagging measurements "by hand" impossible and requires robust autonomous filtering approaches.
Given that co-located meteorological data are not always available, our goal was to use statistical filters that operate on the CO 2 mole fractions alone.
Until recently, most carbon cycle inversion models avoided much of the need for filtering observations because they assimilated monthly or annually averaged CO 2 often sampled from remote marine boundary layer sites (Tans et al., 1990; Enting and Mansbridge, 1991; Fan et al., 1998). This has changed with the present class of inversion models (e.g. Göckede et al., 2010; Peters et al., 2010; Schuh et al., 2010), which assimilate CO 2 mole fractions and compute fluxes on sub-daily scales using high frequency observations taken from many locations including continental sites. In dealing with spatial representativeness issues of high frequency observations, ensemble assimilation strategies, including variance inflation (Hamill et al., 2001; Zupanski et al., 2007), are used to mitigate some but not all of the model error. No matter the correction strategy, removing certain observations that do not match model resolved processes is necessary to ensure that posterior fluxes optimized based on measured CO 2 mole fractions are physically realistic.
Variability in the CO 2 mole fractions due to local influences not resolvable by inverse models can be several ppm to tens of ppm (van der Molen and Dolman, 2007). Our goal through this study is to partition CO 2 mole fractions so that they correspond to a given model resolution. We diagnose the performance of filters at rejecting observations representative of local-scale flux heterogeneities and unresolved topographic airflows using synoptic frontal passages, comparisons to aircraft CO 2 profiles and model CO 2 lapse rates. CarbonTracker (Peters et al., 2007, 2010) is one example of an inverse data assimilation system that incorporates CO 2 mole fractions including observations from Rocky RACCOON and will be referred to and used for comparison throughout this paper. As mentioned earlier, high resolution inversion studies are limited in space and time; thus we use a global scale inversion system. Coarse models such as CarbonTracker suffer from large discrepancies between the representativeness of the measurements they assimilate and their average grid cell size and therefore are most in need of filters that are specific to their model resolutions.

Causes of variability in CO 2 mole fractions in complex terrain
The causes of CO 2 variability beyond the diel and seasonal cycles of carbon dioxide measured at Rocky Mountain locations (see Fig. 2) have been a topic of study for several decades (Gillette and Steele, 1983). Deviations from the signal of well-mixed free-tropospheric carbon dioxide can be difficult to model for several reasons. For example, upwind sources and sinks of CO 2 typically have a primary influence on mole fractions at the measurement sites. However, in complex terrain this is often found not to be the case during the morning transition when prevailing winds slacken and upslope flows become more influential (Stewart et al., 2002; de Wekker et al., 2009; Bowling et al., 2011). Strong upslope flows or weak winds can result in prominent CO 2 spikes in time series, often on the order of several ppm and lasting a few hours or less. On the other hand, some terrain flows actually provide favorable sampling conditions characterized by CO 2 signals that do not deviate substantially.
Although several studies within the RACCOON domain have been able to identify the principal atmospheric transport mechanism causing variability at particular sites (Turnipseed et al., 2004; Sun et al., 2010), consistently robust methods capable of identifying problematic airflows across the entire RACCOON domain are not easily made autonomous. Therefore it is necessary to test and use filters that reject observations with small spatial representativeness (relative to the spatial resolution of the data assimilation system used to evaluate the data) based on statistical identifiers in the time series of CO 2. Our objective in applying filters directly to time series of CO 2 observations is to remove observations that do not communicate useful information to an inversion model about the regional carbon cycle without resorting to other information about terrain flows.

Filters of mountaintop CO 2 mole fractions
Previous methods for filtering mountaintop observations of CO 2 have operated on either statistical bases for rejecting paired flask observations (i.e. detection error, Keeling et al., 1976), fixed rejection criteria about an interpolated curve (Gillette and Steele, 1983), or combined low-pass interpolation schemes for rejecting outliers (i.e. statistical interpolation, Thoning et al., 1989). Keeling et al. (1976) recognized that in order to improve the synoptic scale representativeness of measurements made on Mauna Loa (Hawaii) it would be necessary to remove observations from the complete CO 2 time series that appeared to be the consequence of local anthropogenic emissions, volcanic outgassing, and vegetation from the lower slopes of the mountain and around the island. Thoning et al. (1989) controlled for these observations by interpolating through the data points and rejecting outliers as well as using low-pass spectral filtering.
The statistical interpolation filter used by Thoning et al. (1989) was developed to filter measurements for a remote marine mountaintop location with influences very different from most continental sites. This filter was used to select a subset of the measurements made atop a volcano where pulses of CO 2 with small-scale representativeness were relatively infrequent and most measurements reflected large-scale well mixed marine air masses. Statistical interpolation (an example of which appears in Sect. 3.5) relies on rejection limits that are determined a priori and are specific to remote marine mountaintop locations. On the other hand, these kinds of sliding window filters may have an advantage in constraining seasonal or diel variability because they take into consideration the previous and following CO 2 variability when filtering the data. We use a similar statistical interpolation filter in this study to examine its performance with continental data.
Another filtering approach used by current carbon cycle inversion systems that assimilate observations on sub-daily time steps in areas of complex terrain (e.g. Peters et al., 2010) is time-of-day filtering. Time-of-day filtering assimilates only nocturnal observations during hours when the station is most likely to sample descending air from the free troposphere (ca. 00:00-04:00 LT). Although 00:00-04:00 LT filtering does not distinguish observations in any way aside from the measurement time, it can be used in combination with other filters. Because such time-of-day filters are frequently used for inversions instead of or in combination with statistical filtering, our results in several places present both the full subsets (all hours) and the subset when further constrained by time-of-day (00:00-04:00 LT).
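As a minimal sketch of time-of-day filtering, the following Python keeps only hourly records whose local hour falls in the 00:00-04:00 LT window. The record layout (tuples of local hour and CO 2 value), the function name, and the choice to exclude the 04:00 hour itself are assumptions made for illustration, not details taken from any inversion system.

```python
def time_of_day_filter(records, start_hour=0, end_hour=4):
    """Keep (local_hour, co2_ppm) records with start_hour <= local_hour < end_hour.

    Assumption: hours are integers 0..23 in local time, and the window is
    half-open, so the hour beginning at 04:00 LT is excluded.
    """
    return [(hour, co2) for (hour, co2) in records if start_hour <= hour < end_hour]


# Hypothetical hourly means (ppm), for illustration only.
records = [(0, 390.1), (3, 390.4), (4, 390.9), (13, 385.2), (23, 391.0)]
night = time_of_day_filter(records)
```

Because the filter looks only at the measurement time, it can be applied before or after any of the statistical filters without changing the combined result.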

Site descriptions, instrumentation, and sampling protocol
The Autonomous, Inexpensive, Robust CO 2 Analyzer (AIRCOA; Stephens et al., 2006, 2011) is the atmospheric carbon dioxide sampling system developed for use at each RACCOON site. At the heart of the AIRCOA system is a single-cell infrared gas analyzer (IRGA). To compensate for moderate short-term noise and instrument drift, AIRCOA employs signal averaging and frequent calibrations using multiple reference gases tied to the World Meteorological Organization CO 2 scale.
Each AIRCOA system samples CO 2 across multiple inlet heights, which provides vertical CO 2 profiles across the height of the station. The complete description of these methods can be found in Stephens et al. (2006, 2011); see also: http://www.eol.ucar.edu/~stephens/RACCOON/AIRCOADIST/.
Rocky RACCOON is an ongoing campaign to record atmospheric CO 2 across a topographically complex landscape using a network of six sites in Colorado, Arizona and Utah (Fig. 1). The NWR RACCOON site is located above tree-line on Niwot Ridge, 5 km to the west of and 470 m higher than the AmeriFlux forest site. Storm Peak Lab (SPL), Hidden Peak (HDP) and Roof Butte (RBA) are mountaintop facilities. Entrada Field Station (EFS) is located in a desert canyon, and observations here were stopped after 2 yr because of inadequate mixing within the canyon.
The Fraser Experimental Forest site (FEF) is situated within a high elevation valley and subalpine coniferous forest about 100 km west of Denver, Colorado. Of the six RACCOON sites, FEF has the strongest diel CO 2 cycle, where summertime respiration can elevate nighttime CO 2 to 460 ppm within the valley. Due to strong diel variability at FEF (see Fig. 2) and local influences at EFS, RACCOON network results and statistics presented here are based on the other four sites. Specific diagnostic tests of filters were conducted with FEF data separately, as will be discussed later.

Methods
Here we describe five site-independent filtering methods chosen to represent a range of filters that are presently used to filter CO 2 mole fraction data. All filters operate on hourly RACCOON CO 2 mole fraction means derived from 2.5 min measurements, each with a 1σ precision of 0.1 ppm. Methods 1, 2 and 3 represent filters that consider the statistics of the hourly observation being evaluated. Methods 4 and 5 represent methods that filter based on the observed variability over preceding and following hours. As discussed in Sect. 2.3, results for each of the five filters will be compared with and without time-of-day (00:00-04:00 LT) filtering. To provide clear examples of each filtering protocol we have also included a Supplement spreadsheet with this paper that exactly demonstrates the filtering procedures.

Method 1: short-term variance filtering
The short-term variance filter (SV) is a simple routine for flagging measurements with excessive hourly CO 2 variance under the assumption that regionally-representative conditions can be characterized by low CO 2 variance. Observations are retained by the SV filter if they have hourly standard deviations less than 1 ppm. The 1 ppm hourly standard deviation limit is determined subjectively by considering monthly distributions of hourly variance and excluding obvious locally influenced data. We should note that the 1 ppm standard deviation limit, which is used in this and the following two statistical filters, is calculated from ∼3-min means for each hourly measurement, which is important for evaluating observations representing synoptic changes in CO 2. This is examined further in Sect. 4.3.
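The SV rule can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: each hour is represented as a list of its sub-hourly means, the sample standard deviation is used (the text does not say whether the sample or population form applies), and hours with fewer than two values are skipped.

```python
from statistics import mean, stdev


def sv_filter(hours, sd_limit=1.0):
    """Return (hour_index, hourly_mean) for hours whose standard deviation is below sd_limit.

    hours: list of lists, each holding the sub-hourly CO2 means (ppm) for one hour.
    Assumption: hours with fewer than two values cannot be assessed and are dropped.
    """
    kept = []
    for index, samples in enumerate(hours):
        if len(samples) < 2:
            continue
        if stdev(samples) < sd_limit:
            kept.append((index, mean(samples)))
    return kept


# Hypothetical values: the middle hour contains a 6 ppm excursion.
hours = [[390.0, 390.2, 390.1], [390.0, 393.0, 396.0], [388.5, 388.6, 388.4]]
kept = sv_filter(hours)
kept_indices = [index for index, _ in kept]
```

In this example the second hour has a standard deviation of 3 ppm and is rejected, while the two quiet hours are retained.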

Method 2: short-term variance local gradient filtering using two inlets
The Short-term Variance Local Gradient filter (SVLG) is similar to the SV filter but adds one additional constraint that rejects observations at each time step with vertical CO 2 gradients (across the upper two inlets) larger than 0.5 ppm. Constraining for observations that represent small vertical gradients attempts to reject bias caused, for example, by strong local sources of poorly mixed air.
The formation of each SVLG subset of measurements is formally described in terms of time series signals. The related discrete time signal x(n) is extracted from the original signal X(n) where the hourly mean standard deviation at the top inlet height σ_{x_h} is less than 1 ppm and the absolute difference in CO 2 mixing ratios between the top two inlet heights |X_h(n) − X_{h−1}(n)| is less than 0.5 ppm. Like the 1 ppm hourly standard deviation limit, the 0.5 ppm limit is determined subjectively. For RACCOON data CarbonTracker uses this SVLG filter in combination with time-of-day filtering to form subsets of observations that are suitable for assimilation.
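The two SVLG conditions translate directly into a predicate. The sketch below is illustrative only; the function name, the dictionary keys, and the example values are assumptions, while the 1 ppm and 0.5 ppm limits come from the text.

```python
def svlg_keep(sd_top, co2_top, co2_below, sd_limit=1.0, grad_limit=0.5):
    """True if an hourly observation passes both SVLG conditions.

    sd_top:    hourly standard deviation at the top inlet (ppm)
    co2_top:   hourly mean CO2 at the top inlet (ppm)
    co2_below: hourly mean CO2 at the next inlet down (ppm)
    """
    return sd_top < sd_limit and abs(co2_top - co2_below) < grad_limit


# Hypothetical hourly records, for illustration only.
hours = [
    {"sd_top": 0.3, "co2_top": 390.0, "co2_below": 390.2},  # passes both tests
    {"sd_top": 0.3, "co2_top": 390.0, "co2_below": 391.0},  # gradient too large
    {"sd_top": 1.4, "co2_top": 390.0, "co2_below": 390.1},  # too variable
]
kept = [h for h in hours if svlg_keep(**h)]
```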

Method 3: short-term variance lapse rate filtering
Although filtering for excessive hourly variance and large CO 2 gradients has the advantage of excluding measurements that are clearly not representative of well mixed air or may indicate strongly stratified air, it is not clear that the cutoff values (i.e. σ < 1 ppm, gradient < 0.5 ppm) are well suited for a given measurement site or inversion model resolution. Past research (Bergamaschi et al., 2006) has shown differences in near surface gradients of atmospheric tracers between transport models up to a factor of 3, which suggests a notable spread in capability between models at simulating vertical mixing. We addressed this issue using a new short-term variance lapse rate filter (SVLR) that connects the filter selectivity to the discretization (or vertical resolution) of the inversion model being used to assimilate the data. The protocol for SVLR filtering is exactly the same as SVLG except that rather than using the 0.5 ppm difference cutoff, SVLR uses minimum and maximum CO 2 lapse rates (in ppm m^-1) that are determined from the near-surface CO 2 lapse rate in the model atmosphere above each measurement site over the entire model record (e.g. 2005-2009).
To develop SVLR subsets, near-surface CO 2 lapse rate ranges (min and max) were queried for each site's location from CarbonTracker-2009 output for afternoon (09:00-20:00 LT) and nocturnal (00:00-09:00, 21:00-23:00 LT) times of day. Filter lapse rate limits were set as the smallest and largest rates at which CO 2 mole fractions lapsed with decreasing elevation. For each site minimum and maximum lapse rate limits were computed for afternoon and nocturnal times of day. These were calculated as the change in CO 2 between the model surface and the interface of the lowest model atmosphere level, which ranged between 42 and 52 m depending on the location and time of day. Also, we interpolated across CarbonTracker's North American 1° × 1° model domain to the spot corresponding to the locations of 3 of the RACCOON sites (NWR, RBA, SPL). Additional commentary on issues of horizontal grid coarseness of the CarbonTracker North American 1° × 1° grid and the vertical atmosphere levels appears later in the Discussion. Lapse rates for the station data were computed between the upper two inlet levels of each station, which differed in height by 1 to 8 m. This differs from the ∼45 m range over which the filter lapse rate limits were computed from the model data. Station lapse rates were computed for each hourly measurement as the difference in CO 2 between the upper two inlets divided by the difference in their heights (ppm m^-1). The SVLR filter thus rejected all hourly site measurements with hourly standard deviations equal to or greater than 1 ppm or with lapse rates larger than the model-specified lapse rate limits for that time of day. All SVLR results are based on data from 3 sites (NWR, RBA, SPL) rather than 4 like other subsets. SVLR statistics do not include data from HDP because its inlets are horizontally separated rather than vertically separated, and vertical lapse rates cannot be computed.
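A minimal Python sketch of the SVLR test follows. Several things here are assumptions for illustration: the sign convention (upper inlet minus lower inlet, divided by the positive height difference), the reading that an hour is rejected when either the variance test or the lapse rate test fails, and the numerical limits in LIMITS, which are made-up placeholders and not CarbonTracker values.

```python
def station_lapse_rate(co2_top, co2_below, z_top, z_below):
    """CO2 difference between the upper two inlets divided by their height difference (ppm per m)."""
    return (co2_top - co2_below) / (z_top - z_below)


def svlr_keep(sd_top, lapse, lr_min, lr_max, sd_limit=1.0):
    """True if the hourly standard deviation is below sd_limit and the
    station lapse rate lies within the model-derived [lr_min, lr_max] range."""
    return sd_top < sd_limit and lr_min <= lapse <= lr_max


# Hypothetical per-site limits (ppm per m) by time of day; placeholders only.
LIMITS = {"afternoon": (-0.02, 0.01), "nocturnal": (-0.10, 0.02)}

# Inlets 4 m apart with the upper inlet 0.2 ppm lower: about -0.05 ppm per m.
lapse = station_lapse_rate(390.0, 390.2, 9.0, 5.0)
keep_at_night = svlr_keep(0.3, lapse, *LIMITS["nocturnal"])
keep_in_afternoon = svlr_keep(0.3, lapse, *LIMITS["afternoon"])
```

With these placeholder limits the same station lapse rate is accepted under the wider nocturnal range and rejected under the narrower afternoon range, which is how the filter ties its selectivity to the model rather than to a fixed 0.5 ppm cutoff.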

Method 4: filtering of outliers using a weighted median smoother
An effective method that has been used in many signal filtering applications is the weighted median smoother (cf. Tukey, 1974). Because our intent is not necessarily to smooth but to reject CO 2 mole fractions that are not regionally representative when present in the data, we have modified Tukey's method. We use a Weighted Median Filter (WM) that rejects an observation if its residual from the daily median is in excess of the summed and weighted inter-day variance for the previous two weeks.
The WM filter slides a backward-looking window over the time sequence of daily medians X(N), obtained from hourly measurements X(n), to create the related subset of hourly values S(n). A range of acceptable mole fractions centered about the daily median value X(N) is computed dynamically at each step (day) N. The limits of the range are a function of the sum of differences between each daily median value in the sequence [X(N) − X(N − j)] over the previous two weeks and are weighted using a geometrical decay function to favor more recent variability. Thus the difference between today and yesterday is weighted at 1/2, the residual between yesterday and the day before yesterday is weighted at 1/4, and so on. The upper or lower limit, L, is computed using a difference equation in which X(N) is the daily median at day N in the series.
Then for each day, N, the set S(N) of hourly data (h) to keep consists of those hourly values whose residual from the daily median X(N) does not exceed the limit L.
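The WM procedure can be sketched in Python under one reading of the verbal description. The specific form used here is an assumption: the limit for day N is taken as the sum over j = 1..14 of 2^-j times the absolute difference between consecutive daily medians, |X(N − j + 1) − X(N − j)|, and an hourly value is kept when its absolute residual from the daily median does not exceed that limit. Function names and the handling of the first days of a record (fewer than 14 prior days) are also illustrative choices.

```python
from statistics import median


def wm_limit(daily_medians, day, window=14):
    """Geometrically weighted sum of absolute day-to-day changes in the daily median.

    Weight 1/2 for today vs. yesterday, 1/4 for yesterday vs. the day before, and so on.
    Assumption: terms that would reach before the start of the record are omitted.
    """
    limit = 0.0
    for j in range(1, window + 1):
        if day - j < 0:
            break
        change = abs(daily_medians[day - j + 1] - daily_medians[day - j])
        limit += change / 2 ** j
    return limit


def wm_filter(days, window=14):
    """days: list of lists of hourly CO2 values (ppm), one list per day.
    Returns, for each day, the hourly values within the limit of that day's median."""
    medians = [median(day) for day in days]
    kept = []
    for n, hourly in enumerate(days):
        limit = wm_limit(medians, n, window)
        kept.append([x for x in hourly if abs(x - medians[n]) <= limit])
    return kept


# Hypothetical three-day record; the last value on day 2 is a local spike.
days = [[390.0, 390.0, 390.0], [391.0, 391.0, 391.0], [392.0, 392.2, 395.0]]
kept = wm_filter(days)
```

For the third day the limit is roughly 1.2/2 + 1.0/4 = 0.85 ppm, so the 395.0 ppm value (2.8 ppm from the median) is rejected. Note that under this reading a run of days with identical medians shrinks the limit toward zero, which illustrates how a backward-looking window can become highly selective.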

Method 5: iterative filtering of outliers from a fitted polynomial (statistical interpolation)
We also include a statistical interpolation filter (SI) that is duplicated from the method used by Thoning et al. (1989) at Mauna Loa to identify well-mixed background CO 2 (see the original publication for a complete explanation of the filtering protocol). SI considers past and future observations through a sliding window to reject outliers from a fitted spline. The SI filter used here has one key difference from Thoning et al. (1989), which is that it does not use a low-pass spectral filter. Following the protocol outlined by Thoning et al. (1989), our SI filter works by passing a ten day sliding window over the original time series of hourly values X(n) to create subset S(n) that consists initially of non-afternoon samples (the 15 hourly CO 2 mole fractions for the day excluding hours 11, 12, ..., 19). Daytime values were removed in the first steps by Thoning et al. (1989) in order to fit the spline to values not strongly influenced by afternoon photosynthesis. For each ten day window a cubic spline S(X) is fitted through the daily means X(N), which exclude afternoon samples. In the first phase of filtering, if the daily standard deviation (σ_{X(N)}) exceeds 0.5 ppm the filter will reject the hourly observation X(n) with the largest residual from spline curve S(X), where the residual is computed as the absolute difference between the spline curve and the hourly measurement.
In the second phase of filtering the window advances across all days, re-fitting a new ten day spline with each new window and rejecting no more than one observation per day with each iteration over the entire time series. After no more than 14 iterations (the maximum number of hourly observations that can be rejected for each day), or when the standard deviation of all daily means is less than 0.5 ppm (e.g. σ of X(N) ≤ 0.5), the excluded daytime observations from the original time series (i.e. hours 11, 12, ..., 19) are incorporated back into S(n). A final spline is refitted and those observations that are within 0.5 ppm of the spline form the final subset. As mentioned above, to simplify exposition of these filters we also provide a Supplement spreadsheet that can be referred to in order to duplicate our methods. Figure 3 illustrates the impact of our various filters on one year of measurements for one RACCOON site (Storm Peak).
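The two rejection rules in the SI filter can be illustrated with a simplified Python sketch. It deliberately leaves out the spline fit and the sliding window: the value of the fitted curve for a given day is passed in by the caller as curve_value, which is a simplification introduced here. The sketch also assumes that the daily standard deviation is the sample standard deviation of that day's currently retained hourly values. Function names are hypothetical.

```python
from statistics import stdev


def reject_largest_residual(day_values, curve_value, sd_limit=0.5):
    """One first-phase pass for a single day.

    If the day's standard deviation exceeds sd_limit, drop the single hourly
    value with the largest absolute residual from the fitted curve; otherwise
    return the values unchanged.
    """
    if len(day_values) < 2 or stdev(day_values) <= sd_limit:
        return list(day_values)
    worst = max(range(len(day_values)), key=lambda i: abs(day_values[i] - curve_value))
    return [v for i, v in enumerate(day_values) if i != worst]


def final_subset(day_values, curve_value, tolerance=0.5):
    """Final step: keep hourly values within tolerance (ppm) of the refitted curve."""
    return [v for v in day_values if abs(v - curve_value) <= tolerance]


# Hypothetical day with one 393.0 ppm spike; curve value chosen for illustration.
day = [390.0, 390.2, 393.0, 390.1]
after_one_pass = reject_largest_residual(day, curve_value=390.1)
after_two_passes = reject_largest_residual(after_one_pass, curve_value=390.1)
```

The first pass removes the spike; the second pass leaves the day unchanged because its standard deviation has dropped below 0.5 ppm. In the full procedure the curve would be refitted between passes, so a sustained synoptic shift that the spline does not follow could be rejected in the same way as a local spike.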

Meteorological data
To test each filter under various scenarios including synoptic scale frontal systems we selected case studies for the NWR station where both CO 2 and meteorological data were available. These meteorological data were obtained from the Saddle climate station located 150 m up-ridge from the NWR RACCOON site. We extracted variables (barometric pressure, dew point, wind direction, wind speed) and focused on frontal passages that showed longer-lived CO 2 shifts resulting from synoptic weather changes.

Site filtering statistics
CO 2 inversion results are strongly affected not only by the number and density of observations, but also by the trends and seasonality of those data. We began by analyzing the general statistics of collective subsets representing measurements from 4 of our RACCOON sites (FEF was excluded because it reflects a special topographic setting not intended to sample well-mixed air). Combining time-of-day filtering for the complete set and the five filters results in a ∼70 % reduction in the number of observations available as constraints in assimilation, but the subset means increase only by 0.2 to 0.3 ppm for the complete set and the hourly-statistical filters (SV, SVLG, SVLR), and decrease by 0.1 ppm for the two windowing-filters (WM, SI).
The statistical spread (distribution) of each subset is shown by the deseasonalized variance that appears in Table 2. The deseasonalized variance shows that most filtering methods constrain subset variance to a narrower distribution about the mean than in the complete set, except for SI (and SVLR but only when all hours are used). Also, time-of-day filtering alone generally has little effect on subset variability. The larger variance for SI (and smaller number of retained observations) suggests that this filtering method produces relatively sparse subsets of widely distributed values when applied to RACCOON data. SVLR, which is starred because it does not include data from the HDP site, retains most observations but still has a relatively large variance. We further tested to see whether this could be due to a bias in the number of observations during certain months or times of day, but found no significant difference between SVLR and the complete set. We can only infer from this that SVLR filtering results in subsets that have about twice the variability of the complete set.
www.atmos-chem-phys.net/12/2099/2012/
Table 3 shows that the choice of filtering methods affects the CO 2 seasonality of subset observations, which may have implications for the strength and timing of retrieved NEE seasonality and carbon budgets. Subsets from statistical filters (SV, SVLG, SVLR) have seasonal amplitudes that differ by no more than −0.3 ppm from the complete set. The seasonal amplitudes of windowing-filters (WM, SI) differ slightly more, on the order of −0.4 ppm from the complete set, implying a slightly weaker seasonal amplitude than the complete set.
In situ CO 2 lapse rates can be used to infer local CO 2 stratification/mixing, and can be an important consideration for model-observation representativeness issues. We investigated how lapse rates would differ between subsets, and summarized our results in Fig. 4. These subsets of RACCOON measurements broadly break into three groups that can be characterized as consistently well-mixed, reasonably well-mixed, and biased groups. Largely due to the way these filters are defined, SVLG and SI subsets (well-mixed group) are the least likely to include measurements representing stratified air. For the intermediate group, SVLR and SV subsets include slightly more observations representing local stratification, although SVLR appears more like the consistently well-mixed subsets except for its final downtick near +4σ. The 00:00-04:00 subset appears to be an intermediary between reasonably well-mixed and biased groups because of a final uptick near +4σ. For the biased group, the complete set and WM subset are the most likely to include stratified measurements and should be generally regarded as having measurements likely to incorrectly inform most carbon cycle inversion models, particularly from high CO 2 values.

Comparisons to aircraft observations
We expanded our investigation by comparing filtered subsets from the Niwot Ridge (Colorado) RACCOON site (NWR) to CO 2 mole fraction measurements from NOAA's bi-weekly airborne flights over Carr, Colorado, about 100 km northeast of NWR.From the 255 flights between years 2006 and 2009, 37 of them represented vertical CO 2 gradients less than ±1 ppm across the bottom 1500 m of the atmosphere.24 of these 37 corresponded to hours when data were collected by the nearby NWR station.A time-line of Carr CO 2 measurements and corresponding observations from NWR are given in the Supplement spreadsheet that accompanies this paper.Fig. 4. CO 2 lapse rates binned by standard deviations from the deseasonalized subset mean.These show the average degree of CO 2 stratification in the vicinity of the measurement stations.On the vertical axis (CO 2 lapse rate) measurements representing wellmixed conditions appear near-zero, while measurements representing strongly stable conditions have large negative values.On the horizontal axis (standard deviations) measurements typical of afternoon CO 2 uptake appear to the left, while nocturnal measurements (tending to have larger CO 2 values) appear to the right.
We standardized our filtered subsets in order to remove bias from subjectively chosen limits, window sizes, standard deviations etc. in our filter implementations. WM and SI subsets from NWR were standardized by relaxing the filtering criteria of the windowing-filters in order to reject one-third of NWR measurements. This resulted in 20 common hourly CO 2 measurements between NWR and Carr. Specifically, these filter criteria were slackened by expanding the ppm range limits of their sliding windows by factors of 2.0 for WM and 1.68 for SI. SV, SVLG, and SVLR were standardized by increasing filter criteria (vertical CO 2 gradient and/or hourly standard deviation) until 20 common measurements were obtained between each subset and Carr.
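One simple way to standardize a threshold-based filter to a common subset size is to choose the cutoff from the sorted values of the filter statistic. The sketch below is only an illustration of that idea and is not the procedure used here; the function name `cutoff_for_retained_fraction` is hypothetical, and it assumes a single statistic (for example hourly standard deviation) where smaller values pass the filter.

```python
from math import ceil


def cutoff_for_retained_fraction(values, fraction):
    """Smallest cutoff c such that at least `fraction` of values satisfy v <= c.

    values:   filter statistic per observation (e.g. hourly std in ppm).
    fraction: target share of observations to retain, in (0, 1].
    """
    if not values:
        raise ValueError("no observations supplied")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(values)
    n_keep = ceil(fraction * len(ordered))  # number of observations to retain
    return ordered[n_keep - 1]
```

Applying the returned cutoff to every subset's statistic gives subsets of comparable size, so that differences between filters reflect which observations they select rather than how many.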
In this approach we assume that small vertical CO 2 gradients over Carr reflect strong vertical mixing and large spatial homogeneity in CO 2. Therefore, these 20 "well-mixed" reference points served as a baseline for computing biases for filtered subsets from NWR. We caution however that these common reference points are not evenly distributed across seasons. Note that there is a slight seasonal bias that tends to under-represent months June through September. These months are only represented by 3 out of the 20 reference points, which is due to both missing hourly measurements from NWR RACCOON site in some cases and large vertical CO 2 gradients in the Carr data in other cases.
We noticed in our analysis that three subsets had similarly small biases from the Carr reference points. SVLG, SVLR and WM had errors (RMSE) that differed by no more than 0.05 ppm from each other and by two data points or fewer; we therefore simplified their exposition in Fig. 5 by showing SVLR as an example of all three. Figure 5 indicates that when SVLG, SVLR, and WM subsets are standardized to a common subset size they are nearly equally likely to represent well-mixed air in this case study (i.e. small estimator bias (error) from Carr). Consequently these subsets have nearly the same error (RMSE ≈ 0.5 ppm) from the 20 well-mixed reference points. This is in contrast to the SI and SV subsets, which selected observations deviating more from Carr, and thus could be regarded as less representative of spatially homogeneous/well-mixed conditions for this case study.
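The error metric used in this comparison is a root-mean-square error between paired site and reference values. A minimal sketch is given below; the function names `rmse` and `mean_bias` are hypothetical, and the pairing of hourly NWR values with Carr reference points is assumed to have been done beforehand.

```python
from math import sqrt


def rmse(site_values, reference_values):
    """Root-mean-square error between paired site and reference CO2 (ppm)."""
    if len(site_values) != len(reference_values) or not site_values:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(s - r) ** 2 for s, r in zip(site_values, reference_values)]
    return sqrt(sum(squared) / len(squared))


def mean_bias(site_values, reference_values):
    """Mean signed difference (site minus reference) in ppm."""
    if len(site_values) != len(reference_values) or not site_values:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [s - r for s, r in zip(site_values, reference_values)]
    return sum(diffs) / len(diffs)
```

Reporting the signed bias alongside RMSE would help distinguish a subset that is systematically offset from the reference from one that is merely noisy.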

Synoptic case studies
In the above case, although three of the five filters (SVLG, SVLR, WM) were roughly equal in filtering observations representing well-mixed air, each contained slightly different selections of observations, which might steer inversions differently. To investigate these selection preferences in detail we looked at several synoptic case studies representing cold front events at Niwot Ridge, two of which appear in Figs. 6 and 7. Synoptic changes in the origin of air can be critical to inversions, as they can have a larger impact on carbon dioxide mole fractions than diel changes due to local fluctuations in boundary layer height and fluxes, and carry important information on differences in upwind fluxes over large regions. Frontal passages were identified using associated meteorological data (Sect. 3.6) and retained CO 2 measurements were compared between subsets. Cold front systems were identified by prolonged troughs in barometric pressure coupled to decreases in temperature, humidity, and abrupt wind direction shifts. Figure 6 shows two of the meteorological variables used (dew point, wind speed) to identify the winter frontal system that passed over NWR in February 2007. A notable feature is the transient jump in CO 2 mixing ratios that is synchronous with a near 180° wind direction shift near 09:00 LT and 12:00 LT on 13 February. Particle back trajectories also indicated that the CO 2 jump reflects a switch in the origin of surface air from west to north-east accompanied by strong vertical shear (data not shown). This change in surface wind direction may include information important to a carbon cycle inversion model, but that depends on the transport model's ability to resolve such airflows in the inversion.
As discussed in Sect. 4.2, in order to standardize sample sizes for the NWR site we constrained SV, SVLG and SVLR subsets down to two-thirds of the total observations, but scaled-up WM and SI subsets (by relaxing cutoff ranges) to retain two-thirds. For this case study, subsets from windowing-filters WM and SI did not retain any of the 9 measurements during the 9 h synoptic shift in wind direction. Subsets from hourly-statistical filters SV, SVLG, and SVLR retained 3, 3, and 2 of the 9 measurements in different combinations (Fig. 6), indicating a general similarity between subsets. Still, these subsets do not exactly agree, which is the result of differences between using hourly standard deviation, local gradient, or model-specified lapse rate to filter.
For this case study subsets derived using hourly-statistical filters retained hourly CO 2 measurements made during the synoptic event.This indicates that SV, SVLG, and SVLR filters are capable of filtering without rejecting transient measurements that might be informative to the inversion model.On the other hand subsets from windowing filters (WM, SI) may not sufficiently represent abrupt regional-scale changes in CO 2 that are relevant to most state-of-the-art inversion model systems.
Filtering by time-of-day alone (00:00-04:00 LT) would result in 0 of the 9 synoptic shift observations being assimilated because the event occurs during daytime hours.Alternatively, combining time-of-day after filtering by statistical (SV, SVLG, SVLR) or windowing (WM, SI) would result in only one different observation, which occurs at 00:00 LT on 14 February in Fig. 6.
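Time-of-day filtering of the kind discussed above can be sketched in a few lines. This is an illustration only: the function name `time_of_day_subset` is hypothetical, timestamps are assumed to be in local time already, and the window is treated as half-open (00:00 included, 04:00 excluded), which is an assumption rather than something specified in the text.

```python
from datetime import datetime


def time_of_day_subset(records, start_hour=0, end_hour=4):
    """Keep (timestamp, co2) records whose local hour lies in [start_hour, end_hour).

    records: iterable of (datetime in local time, CO2 mole fraction in ppm).
    The half-open window is an assumed convention for this sketch.
    """
    return [(t, c) for t, c in records if start_hour <= t.hour < end_hour]
```

Applied to a daytime event such as the 13 February wind shift, a filter of this form returns an empty subset, which is the behaviour described in the paragraph above.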
Figure 7 presents a second synoptic case study of standardized subset sizes from 4-8 June 2007, this time characterized by a shorter ∼4 h wind direction shift followed by a 2 day decline in barometric pressure and dew point. These synoptic changes resulted in a different scenario of gradually rising CO 2 during diel oscillations. Subsets from SVLG and SVLR retained nearly identical collections of observations during the frontal passage near 12:00 LT on 4 June. Also SVLG and SVLR subsets, and to a limited extent SV, retained observations near daily minimum and maximum values. By contrast WM and SI subsets favored observations with CO 2 values near the daily mean/median and rejected values near daily extremes (00:00-04:00 and 12:00-16:00 LT). For inversion systems that use time-of-day filtering, this suggests that when combined with statistical filters (e.g. SVLG and SVLR) more observations would be retained than if combined with windowing filters, which may be underrepresented in measurements during hours when daily extremes occur.
Fig. 6. A synoptic case study from NWR site (13-15 February 2007) comparing filtered subsets standardized to retain two-thirds of NWR measurements. The upper plot of meteorology data has dual vertical axes that are listed on the left and right. The number of observations comprising each case study subset is listed in the legend box. The gray band locates the 00:00-04:00 LT interval, which is used by some inversion model systems in place of statistical filtering prior to assimilation.
Observations that pass multiple filters could allow for greater statistical weighting to be applied to those assimilated observations. This could be an important consideration for future inversion model systems, for example those that employ multiple-grid resolutions (e.g. Wu et al., 2011) for computationally efficient representation of areas that change relatively little, but effective representation at higher resolution of complex areas.

Fraser Experimental Forest
The majority of our focus until now has been on filtering measurements made at mountaintop locations that are intended for assimilation by carbon cycle inversion models. However, not all measurement sites are so ideally located, but they may still offer measurements useful as constraints for carbon cycle inversion model systems. Here we shift our attention to examine the effectiveness of these filters for a complicated case using mole fractions from Fraser Experimental Forest (FEF), which is located in an alpine valley at 2745 m a.s.l. where summertime respiration can push nocturnal CO 2 mole fractions above 460 ppm. Local influences at FEF are difficult to model, thus our goal is to determine if CO 2 observations useful to biosphere-atmosphere inversion models (on scales that can be modeled) can be extracted from FEF despite its topographic setting. Our intent is not to suggest that data from such sites be assimilated, but to diagnose and compare filters when presented with data from an extreme case.
Figure 8 shows that windowing-filters that rely on constrained diel CO 2 ranges (cf. WM, SI methods in Sects. 3.4 and 3.5) fail to locate realistic diel and seasonal cycles. When diel variability in CO 2 mole fractions is high at FEF (June through November) windowing-filter subsets do not follow realistic seasonal cycles, and the subset range wanders dramatically from day to day. On the other hand, statistical-filter subsets (SV, SVLG, SVLR) that do not depend on constrained diel variability, or observations from other time steps, identify more regionally consistent diel and seasonal cycles despite high diel CO 2 variability.

Choosing the appropriate filter
In the previous section we showed that variously filtered subsets have distributions with different CO 2 stratifications (Fig. 4). When standardized to equal sample sizes and compared to cases of well-mixed air from airborne CO 2 profiles, SVLG, SVLR and WM filters were similarly capable of selecting for spatially homogeneous CO 2 mole fraction measurements for the Niwot Ridge RACCOON site (Fig. 5). However, when we used case study analysis to scrutinize subset differences during frontal passages, contrasts between standardized subsets from statistical and windowing-filters became evident (Figs. 6 and 7). For prolonged shifts (ca. 9 h) in air mass source regions statistical-filters (SV, SVLG, SVLR) were able to identify and retain CO 2 measurements despite abrupt 4-6 ppm CO 2 changes, while windowing-filters (WM, SI) retained none.
The synoptic case studies represented in Figs. 6 and 7 also show that subsets from SI and WM (the two filters that use preceding and following data to determine cutoff ranges) do not capture the diel variability that is particular to many continental sites. These filters would be of greater use for filtering CO 2 measurements from remote marine locations where diel variability is smaller (which was the intended use for SI in Thoning et al., 1989). Also, in the above case studies it seems likely that subsets from statistical-filters (SV, SVLG, SVLR) are more informative to carbon cycle inversion models during events that bring about synoptic scale changes in CO 2; however, it is not clear how selective a filter should be for measurements from a given site and a given inversion model. We addressed this problem in the SVLR filter, which specifies the filter selectivity for each site and time of day by using the model as a benchmark for determining filter selectivity. As opposed to SVLG, SVLR uses near-surface CO 2 lapse rate limits that come from the model minimum and maximum lapse rates for each measurement location and time of day. As with SV and SVLG, SVLR does not favor measurements near daily mean values (cf. Fig. 7), and is still able to filter even when diel variability is large (cf. Fig. 8). Filtering in this way retains a majority of observations (90 %). Figure 9 shows that these 90 % of network observations fall within narrow lapse rate ranges corresponding to lapse rates our example model can represent (see magenta colored region in Fig. 9). The remaining 10 % constitute the gray regions in Fig. 9 that represent RACCOON measurements with lapse rates larger than can be represented in the discretized model output and thus should be filtered.
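The selection logic of the SVLR filter described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this study: the names `Observation`, `is_daytime` and `svlr_keep` are hypothetical, the day window (09:00-21:00 LT) and the 1 ppm hourly variance cutoff are taken from the figure descriptions in this paper, and the model lapse rate limits passed in are placeholders that would have to come from the model output for each site.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    hour: int          # local hour of day, 0-23
    co2: float         # hourly mean CO2 mole fraction (ppm)
    hourly_std: float  # within-hour standard deviation (ppm)
    lapse_rate: float  # station CO2 lapse rate (ppm per m)


def is_daytime(hour):
    """Day is taken as 09:00-21:00 LT; all other hours count as night."""
    return 9 <= hour < 21


def svlr_keep(obs, day_limits, night_limits, max_std=1.0):
    """Return True if the observation passes both SVLR criteria.

    day_limits, night_limits: (min, max) model lapse rates in ppm per m,
    assumed to be precomputed from model output for this site.
    max_std: hourly variability cutoff in ppm.
    """
    low, high = day_limits if is_daytime(obs.hour) else night_limits
    passes_variance = obs.hourly_std <= max_std
    passes_lapse_rate = low <= obs.lapse_rate <= high
    return passes_variance and passes_lapse_rate
```

Because the limits are supplied per site and per time of day, the same function can be reused across stations without hand-tuning a gradient threshold, which is the main design difference from a local gradient filter such as SVLG.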
Fig. 8. Extracting useful observations from a "difficult-to-model" mountaintop location using one year (2010) of observations from the FEF site, which is strongly influenced by local (within-valley) circulations and strong summertime respiration. This figure reveals an important limitation of windowing-filters, which are not able to identify realistic seasonal CO 2 cycles when diel variability is high. This is because SI and WM filters rely on daily mean/median values (or trends) to filter observations. This suggests that windowing-filters (WM, SI) by themselves may not be suitable for continental CO 2 measurements.
Fig. 9. SVLR: RACCOON station CO 2 lapse rate as a function of CO 2 mole fraction. Also included are the CT-2009 lapse rate ranges for the day (09:00-21:00 LT) and night (all other hours). Lapse rate rejection ranges for day appear in red for each RACCOON station (NWR, RBA, SPL) on the left. Nocturnal rejection ranges appear on the right in blue. Using Niwot Ridge (NWR) as an example this figure shows that CarbonTracker lapse rate ranges fail to capture the full variability in CO 2 stratification, particularly for strongly stable CO 2 conditions (negative lapse rates) at night caused for example by respiration and CO 2 pooling. Furthermore, during the day CarbonTracker appears less able to represent positive lapse rates than at night.
The SVLR protocol, however, assumes that model CO 2 lapse rates are valid indicators for model resolvable atmospheric transport and can be compared to station lapse rates. To test this assumption we performed a sensitivity test of the effect of lapse rate uncertainty on subset size. For this test we assume that, due to unconstrained error, station lapse rates underestimate the actual lapse rate. Thus we inflated each lapse rate randomly between 0 % and 20 % across a uniform distribution. We repeated this same method for unconstrained error up to 40 %, 80 %, 200 %, 400 %, and 1000 % larger than the measured lapse rate. For the 20 % trial, a negative lapse rate of −1 ppm m −1 would be inflated to −1.2 ppm m −1 if a 20 % random error were assigned. Sensitivity test results indicate that a random 20 % underestimation of station lapse rates across the 4 RACCOON towers (excluding FEF and HDP) could result in a 1 % reduction in subset size. A 200 % random uncertainty could reduce subset size by 6 %, and 1000 % shows a 25 % reduction (see Supplement spreadsheet for full results). Random error trials for station lapse rates are used in this sense as an indirect indication of the uncertainty in matching station lapse rates to model atmosphere lapse rates, and suggest that systematic underestimates of station lapse rates on the order of hundreds of percent are necessary to substantially diminish SVLR subset size.
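The random-inflation sensitivity test on station lapse rates can be sketched as below. This is an illustrative reconstruction under stated assumptions rather than the code behind the reported numbers: the function names are hypothetical, the maximum error is expressed as a fraction (0.2 for 20 %, 10.0 for 1000 %), and a single pair of lapse rate limits stands in for the site- and time-specific model limits.

```python
import random


def inflate_lapse_rates(lapse_rates, max_error, rng):
    """Scale each lapse rate by (1 + u), with u drawn uniformly from [0, max_error]."""
    return [g * (1.0 + rng.uniform(0.0, max_error)) for g in lapse_rates]


def retained_count(lapse_rates, low, high):
    """Number of lapse rates inside the model limits [low, high]."""
    return sum(low <= g <= high for g in lapse_rates)


def subset_reduction(lapse_rates, low, high, max_error, seed=0):
    """Fractional loss of retained observations after random inflation."""
    rng = random.Random(seed)  # seeded so that a trial is repeatable
    before = retained_count(lapse_rates, low, high)
    inflated = inflate_lapse_rates(lapse_rates, max_error, rng)
    after = retained_count(inflated, low, high)
    return (before - after) / before if before else 0.0
```

Since inflation only moves lapse rates away from zero, observations can leave the accepted range but not enter it when the limits bracket zero, so the reduction grows with the assumed error, in line with the trend reported above.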
Figure 10 illustrates the application of the SVLR filter using CarbonTracker CO 2 lapse rates for a new case study during June 2007. The SVLR subset in the upper panel of Fig. 10 shows the variability in CO 2 mole fractions during frontal passages near 12:00 LT on 13 June (abrupt NW to NE wind shift), near 15:00 LT on 15 June (W to S wind shift), and another event near 00:00 LT on 18 June when the only substantial meteorological shift is a slackening of wind speed from 9 m s −1 to less than 1 m s −1 as well as large negative CO 2 lapse rates. The SVLR filter indicates that 157 of 192 CO 2 measurements could be represented by the discretized model atmosphere. In the lower panel of Fig. 10 open circles within the magenta shaded area indicate observations rejected due to excessive statistical variance (> 1 ppm), while open circles outside the shaded area represent observations rejected due to lapse rates smaller than occur in CarbonTracker output. Although the inversion model may be able to assimilate and use the rejected observations, doing so would mean assimilating observations representing CO 2 stratifications (and presumably atmospheric transport processes) that cannot be represented by the model atmosphere.
Time-of-day filtering (i.e. 00:00-04:00 LT) as shown in Figs. 6, 7, 10, by itself may be a useful filtering method if no other reliable statistical filters can be employed. However, synoptic changes in CO 2 can be much larger than diel variability (cf. Fig. 6) and may occur outside of subset sampling hours thus excluding them from these subsets.

Sources of error in lapse rate filter
There are a few potential sources of error when extrapolating the results of our study that merit discussion here. The advantage of inversion model systems is that they optimize first-guess-fluxes using CO 2 mole fraction observations, but this requires observations that are representative on spatial scales similar to the model's resolution. Although SVLR filtering uses model specified lapse rates to constrain measurements representing model resolvable air flows, there is potential for false positives if SVLR falsely rejects an observation due to an excessive lapse rate when in fact the top inlet height is measuring well-mixed air. This kind of error could occur for example when unusually strong gradients near the surface (e.g. horizontal advection, van Gorsel et al., 2009) influence the lower inlets but not the uppermost inlet.
To implement the SVLR lapse rate filter we adopted a philosophy assuming that CarbonTracker's predicted lapse rates could be used to reliably reject measurements.The weakness in this approach is that CT may in fact be able to assimilate data with lapse rates greater than what it predicts.This weakness is mitigated somewhat because we use only two lapse rate limits, one for day and one for night, that were based on multiple seasons and years.This allowed a majority (90 %) of observations to be used, which is within 4 % of other statistical filters.
Another point of consideration for applications of the SVLR filter to a site should be the height across which lapse rates are calculated.Large height differences, or where the lower inlet is well within the canopy or close to the surface, are more likely to show large differences in CO 2 mole fractions (and CO 2 lapse rates calculated across that length).Therefore it may be necessary to apply some added flexibility to lapse rate cutoffs that are used to subset measurements from such a site.
When implementing the SVLR filter we might instead have used lapse rate limits that were seasonally specific or even specific to the month being filtered. We tried such implementations and found that they reduced the number of subset observations by one-third to one-half. Consequently we decided to use lapse rates that were specific only to the time of day because our intent with this filter is to represent the total range of possible lapse rates reproducible in the model, which has the largest difference between day and night.
Another caveat to consider is that we did not account for discrepancies in the lengths across which we computed our lapse rates from the model and the sites. Site lapse rates were computed over lengths ranging from 1 to 8 m, while model lapse rates were computed over lengths ranging from 42-52 m (if calculated as the difference between the surface and the first model atmosphere interface levels, as was used here) or 82-89.5 m (if using the middle of the model atmosphere levels). A robust implementation of our lapse rate filter would provide uncertainty in the lapse rate calculation that was based on the difference in lengths between the station and the model.
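To make the length dependence concrete, the sketch below computes a CO 2 lapse rate as the mole fraction difference between two levels divided by their height separation. The sign convention (upper minus lower, so that CO 2 pooling near the surface gives a negative value) is an assumption chosen to agree with the negative nocturnal lapse rates described in this paper, and the function name `co2_lapse_rate` is hypothetical.

```python
def co2_lapse_rate(co2_lower, co2_upper, z_lower, z_upper):
    """CO2 lapse rate in ppm per m between two levels.

    co2_lower, co2_upper: mole fractions (ppm) at the lower and upper level.
    z_lower, z_upper:     heights of the two levels (m).
    Assumed sign convention: negative when CO2 decreases with height.
    """
    if z_upper <= z_lower:
        raise ValueError("upper level must be above lower level")
    return (co2_upper - co2_lower) / (z_upper - z_lower)
```

For the same 8 ppm difference, a separation of 8 m gives −1 ppm m −1 while a separation of 50 m gives −0.16 ppm m −1, which illustrates why station and model lapse rates computed over different lengths are not directly comparable without an uncertainty term.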
These lapse rates computed from model simulations are dependent on the model's vertical mixing, which can differ substantially between models (Bergamaschi et al., 2006). For example a model with pronounced stratification at night will appear to have larger lapse rate limits that will result in different SVLR limits as compared to another model with the same horizontal resolution but different vertical mixing. Furthermore, vertical CO 2 gradients in TM5 (CarbonTracker's atmospheric transport model) may prove to be unreliable indicators of vertical mixing (Williams et al., 2011).
Our simulated lapse rates are taken from CarbonTracker's North American 1° × 1° model domain. The coarseness of horizontal 1° × 1° orography also impacts the reliability of lapse rates computed across vertical levels in the atmosphere. Overall, the benefits of specifying model-based lapse rate limits may be outweighed when a coarsely discretized model is used to determine CO 2 gradients.

Towards estimating model-data mismatch
A final point of consideration is that we have not used a model to assimilate and compare the model output between the five different subsets.Model-data mismatch (used in this way) may not necessarily indicate better filter performance.One reason is that models typically apply less statistical weighting to measurements from complex terrain.Also coarsely gridded atmospheric transport models are known to incorrectly attribute the source region of assimilated CO 2 measurements in mountainous regions particularly during strong vertical wind shear.Our ongoing work is testing the implementation of various subsets of RACCOON data in CT, to estimate the strength of this effect.

Conclusions
It is suspected that mountain ecosystems of the Western US are not only large stores of carbon but serve as carbon sinks; however, there is no consensus on the location or variability of potential sinks. Inverse models of biosphere-atmosphere carbon exchange that assimilate CO 2 mole fractions are one of few ways to retrieve CO 2 fluxes in complex terrain. But model representation of important carbon cycle changes and feedbacks in complex terrain is limited by an inability to accurately identify and assimilate CO 2 mole fractions that are model resolvable (Gerbig et al., 2009; Pillai et al., 2011). In some cases this may have resulted in inversions that are optimized from measurements reflecting CO 2 gradients that the model itself cannot represent. In other cases when time-of-day filtering is used (e.g. filtering for 00:00-04:00 LT measurements targeting descending air from the free troposphere) other times of day are not used to optimize prior flux estimates. Our goal in this study has been to evaluate filters in terms of their capabilities in selecting measurements that correspond to the resolution of carbon cycle inversion models.
Of the five filters of mountaintop measurements analyzed in this study each had its own selectivity, which resulted in: subsets of different sizes (Table 2), subsets representing air masses with different CO 2 stratifications (Fig. 4), and subsets that disagreed on which CO 2 measurements should be retained during synoptic-scale frontal passages (Figs. 6 and 7).Two filters employed here, lapse rate (SVLR) and local gradient (SVLG), provide two choices to address these issues and constrain the spatial representativeness of in situ mountaintop CO 2 mole fraction measurements for stations with multiple inlet heights.The lapse rate filter (SVLR) does so by isolating a subset of measurements that correspond to the range of represented lapse rates from the inversion model, which can be an advantage of this method because it works with the limitations of the model.The local gradient filter (SVLG) performs similarly well and retains nearly the same number and kind of observations, but depends on subjective knowledge in order to establish vertical CO 2 gradient limits for filtering in situ measurements.
Lapse rate or local gradient filtering can be implemented for any carbon cycle inversion model system using assimilated CO 2 mole fractions measured across multiple inlet heights. These subsets of locally well mixed air (cf. Fig. 4) are more likely to reflect regionally well mixed air corresponding to the transport model resolution of state-of-the-art carbon cycle inversion models (e.g. 10 000 km 2). Lapse rate and local gradient filtering of RACCOON data resulted in subsets with the smallest errors from vertically well-mixed airborne CO 2 profiles from Carr, Colorado (Fig. 5). The choice of SVLR or SVLG will depend on the inversion model system.
For sites where multi-inlet measurements are not available our results show that subsets of the data selected according to hourly CO 2 variance criteria (i.e. SV) are helpful when diel variability is high (Fig. 8), and during synoptic changes in CO 2 (Figs. 6 and 7). However, filtering by statistical variance alone requires that the variance limit be specified using subjective knowledge of the measurement site and in our case studies resulted in subsets that were not quite as representative of well-mixed air as SVLR, SVLG, and WM subsets (Fig. 5) and may include stratified conditions not resolvable by a model.
Supplementary material related to this article is available online at: http://www.atmos-chem-phys.net/12/2099/2012/acp-12-2099-2012-supplement.zip.

Fig. 1. Map of RACCOON domain. The complementary positioning of the RACCOON mountaintop network of autonomous CO 2 mole fraction surface sites is shown with reference to NOAA's CO 2 mole fraction measurement network, Penn State's Midcontinental Ring 2 sites, the ORCA network, and other well-calibrated in situ CO 2 mole fraction sites in the Continental US.

Table 1. Site details. Listed are the site coordinates, elevations in m a.s.l., inlet height, and installation year. All sites are topographically situated on mountaintops except FEF (alpine valley), and NWR (ridgetop). As discussed previously, measurements from FEF are only used in a specific diagnostic test and are otherwise not used.

Diel and seasonal variability in CO 2 mole fractions from all five filters are contrasted against the complete set of observations using one year of data (2006) from one RACCOON site (Storm Peak, Colorado). This figure shows that hourly statistical filters (SV, SVLG, SVLR) retain a majority of the diel variability in CO 2, while windowing-filters produce subsets with fewer observations constrained to narrower diel ranges.

Table 2. Filter statistics for the complete set (CS) and each subset (SV, SVLG, SVLR, WM, SI) representing four RACCOON sites HDP, NWR, RBA and SPL (asterisk indicates SVLR statistics exclude HDP data). The retained fraction is computed as the proportion of observations remaining after filtering. Subset means represent the average subset value, and deseasonalized variability indicates the average variability for each subset after removing the seasonal cycle. The topmost sub-table compares subsets when no time-of-day filtering is used, while the lower two sub-tables compare filtering methods when used in combination with time-of-day fil-

Table 3. The strength of seasonal cycle of CO 2 mole fractions is listed by the difference between seasonal means representing the annual maximum (February, March, April) and minimum (August, September, October) values in ppm CO 2 (asterisk indicates SVLR statistics exclude HDP data).
Colorado are compared against the corresponding hourly mean measurements from the Niwot Ridge RACCOON site for each filtered subset. Subsets are standardized to equal numbers of measurements. The top panel shows (using SVLR) that SVLG, SVLR and WM subsets have roughly similar error from Carr measurements (RMSE ≈ 0.5 ppm) from well-mixed cases at Carr. SI and SV subsets show slightly larger bias (RMSE ≈ 0.7 ppm) over the same 20 cases. This suggests that when standardized to a common sample size filters are roughly similar in filtering observations representing well-mixed air, and that this does not resolve how stringent the filter criteria should be.
Lapse rate filter case study (12-20 June 2007 at Niwot Ridge). SVLR works by first identifying the range of model CO 2 lapse rates (indicated by the diurnally varying magenta band in the lower panel). In the lower panel open circles represent observations with either excessive hourly variance or that fall outside model lapse rate ranges. The upper panel shows the corresponding CO 2 mole fractions.