Interactive comment on “Atmospheric transport simulations in support of the Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE)”

Abstract. This paper describes the atmospheric modeling that underlies the Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) science analysis, including its meteorological and atmospheric transport components (polar variant of the Weather Research and Forecasting (WRF) and Stochastic Time Inverted Lagrangian Transport (STILT) models), and provides WRF validation for May–October 2012 and March–November 2013 – the first 2 years of the aircraft field campaign. A triply nested computational domain for WRF was chosen so that the innermost domain with 3.3 km grid spacing encompasses the entire mainland of Alaska and enables the substantial orography of the state to be represented by the underlying high-resolution topographic input field. Summary statistics of the WRF model performance on the 3.3 km grid indicate good overall agreement with quality-controlled surface and radiosonde observations. Two-meter temperatures are generally too cold by approximately 1.4 K in 2012 and 1.1 K in 2013, while 2 m dewpoint temperatures are too low (dry) by 0.2 K in 2012 and too high (moist) by 0.6 K in 2013. Wind speeds are biased too low by 0.2 m s−1 in 2012 and 0.3 m s−1 in 2013. Model representation of upper level variables is very good. These measures are comparable to model performance metrics of similar model configurations found in the literature. The high quality of these fine-resolution WRF meteorological fields inspires confidence in their use to drive STILT for the purpose of computing surface influences ("footprints") at commensurably increased resolution. Indeed, footprints generated on a 0.1° grid show increased spatial detail compared with those on the more common 0.5° grid, better allowing for convolution with flux models for carbon dioxide and methane across the heterogeneous Alaskan landscape. 
Ozone deposition rates computed using STILT footprints indicate good agreement with observations and exhibit realistic seasonal variability, further indicating that WRF-STILT footprints are of high quality and will support accurate estimates of CO2 and CH4 surface–atmosphere fluxes using CARVE observations.


Introduction
Increased concentrations of greenhouse gases (GHGs), including carbon dioxide (CO2) and methane (CH4), are warming the atmosphere (IPCC, 2013). The Arctic exhibits amplified signs of this warming, with unprecedented changes over the past 2 decades, including warmer surface temperatures and a moistening of the Arctic boundary layer (Cohen et al., 2012), as well as a dramatic decline in sea ice extent in late summer and early fall (Stroeve et al., 2012). The interplay between these processes creates feedbacks that further amplify environmental change, e.g., a decrease in sea ice reduces the surface albedo and increases latent and sensible heat fluxes into the atmosphere, resulting in warmer surface temperatures.
Published by Copernicus Publications on behalf of the European Geosciences Union.
A warmer and moister Arctic is expected to cause carbon (CO2 and CH4) releases from the vast shallow continental and marine mega pool reservoirs (ACIA, 2004), but conclusive evidence for these releases is still lacking (Bergamaschi et al., 2013). This creates the need for continuous monitoring and modeling of concentrations and surface emissions of GHGs, leading to a comprehensive quantification and process-based understanding of the Arctic carbon budget that ultimately will enable prescient discussion and action in this vulnerable and strategically important region. Responding to this need, the objective of the NASA Carbon in Arctic Reservoirs Vulnerability Experiment (CARVE) is to "quantify correlations between atmospheric and surface state variables for the Alaskan terrestrial ecosystems through intensive seasonal aircraft campaigns, ground-based observations, and analysis sustained over a 5-year mission" (Miller and Dinardo, 2012). The aircraft campaigns obtain measurements during the spring through autumn months, while instrumented towers provide year-round observations.
As described in the companion papers by Miller et al. (2015) and Karion et al. (2015), the overall approach to achieving the science objectives of CARVE is two-pronged: (1) direct fieldwork using aircraft and ground measurements of atmospheric GHG concentrations, and (2) "top-down" (inverse) estimates of surface fluxes using concentration measurements as inputs. The basic components of the inverse method include measurements of atmospheric CH4 and CO2 (dry air mole fractions), an "invertible" transport model computing the sensitivity of the measurements to fluxes in the upwind source regions (footprints), and a priori flux models to be optimized by minimizing the model-data mismatch. CARVE also measures a suite of environmental variables to evaluate and improve the transport model, develop empirical flux models, and enable a physically based flux aggregation across spatial scales. This paper presents the atmospheric transport model that underpins the CARVE GHG flux inversion work, with a focus on two aspects: (1) validation against observations of the meteorological fields driving the transport model, and (2) demonstration of the benefits of high-resolution transport modeling for simulating concentrations of GHGs measured by CARVE and the collaborating tower sites.
The organization of this paper is as follows: Sect. 2 describes the STILT model that is used for atmospheric transport. Section 3 describes a customized version of the Weather Research and Forecasting (WRF) numerical weather prediction (NWP) model that provides meteorological input fields to drive STILT. Section 4 describes WRF model verification for both surface and upper-air variables during the 2012 and 2013 CARVE campaigns. Examples of the impact of model resolution on the representation of meteorology in the WRF model and on STILT footprint calculations are described in Sect. 5. Conclusions are drawn in Sect. 6.

Atmospheric transport model
A central goal of CARVE is to infer surface-atmosphere fluxes of CH4 and CO2 from space- and time-varying atmospheric concentration measurements. This problem may be solved using inverse approaches ranging in complexity from simple scaling (Press et al., 1992) to formal Bayesian or geostatistical inversions (e.g., Matross et al., 2006; Gourdji et al., 2012; Miller et al., 2013). At the heart of flux inversions is the availability of an accurate transport model that is capable of computing the sensitivity of atmospheric concentrations to surface fluxes upwind. In an Eulerian (gridded) approach, the inverse method involves the calculation or approximation of model adjoints, a complex and demanding approach that must be performed for each version of the nonlinear model, or the computationally intensive calculation of "basis functions" or ensembles of forward model runs for a set of predefined aggregated fluxes. Because of numerical diffusion (e.g., Eluszkiewicz et al., 2000), the Eulerian approach has inherent limitations in dealing with localized (subgrid-scale) sources, such as lakes, and in situ measurements, such as those made from aircraft or towers. These shortcomings are addressed by applying a Lagrangian particle dispersion model (LPDM), which is computationally more flexible when applied to a shared computational resource. In an LPDM, atmospheric dispersion is simulated by advecting tracer particles (500 per receptor in these transport simulations) by the three-dimensional gridded wind field from an NWP model, plus a turbulent velocity component represented as a stochastic process (Markov chain). The inclusion of both the mean and stochastic wind components (Uliasz, 1994) sets LPDMs apart from trajectory models that only employ mean winds and thus cannot simulate dispersion or surface interactions (Stohl, 1998; Fuelberg et al., 2010).
When applied backward in time from a measurement location (i.e., the "receptor" location), the LPDM creates the adjoint of the transport model in the form of a "footprint" field. The footprint, with units of mixing ratio / (micromole m−2 s−1), quantifies the influence of upwind surface fluxes on concentrations measured at the receptor and is computed by counting the number of particles in a surface-influenced volume (defined as the lower half of the planetary boundary layer; PBL) and the time spent in that volume (Lin et al., 2003). When multiplied by an a priori flux field (units of micromole m−2 s−1), the footprint gives the associated contribution to the mixing ratio (units of ppm) measured at the receptor. Lagrangian methods minimize numerical diffusion, and through coupling to a mesoscale weather model, meteorological realism and mass conservation are achieved. These aspects enable the Lagrangian approach to compute realistic surface fluxes and their uncertainties for measurements from a variety of platforms including towers, aircraft, and satellites. Furthermore, the collection of footprints forms a library of sensitivity functions that can be applied to a variety of trace gases without the need to rerun the transport model.
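The multiplication of a footprint by an a priori flux field can be illustrated with a short numerical sketch. This is not the CARVE processing code: the function name, the array layout (time step, latitude, longitude), and all numbers are hypothetical, and the sketch assumes that the footprint (ppm per micromole m−2 s−1) and the flux (micromole m−2 s−1) are already on the same grid and time steps.

```python
import numpy as np

def receptor_enhancement(footprint, flux):
    """Return the mixing-ratio contribution (ppm) at one receptor.

    footprint : array (time, lat, lon), ppm per (micromole m-2 s-1)
    flux      : array of the same shape, micromole m-2 s-1
    Both inputs are hypothetical; the result is the sum of the
    element-wise product over all time steps and grid cells.
    """
    footprint = np.asarray(footprint, dtype=float)
    flux = np.asarray(flux, dtype=float)
    if footprint.shape != flux.shape:
        raise ValueError("footprint and flux must share the same grid")
    return float(np.sum(footprint * flux))

# Hypothetical example: 2 time steps on a 2 x 2 grid, uniform flux.
footprint = np.array([[[0.01, 0.02], [0.00, 0.03]],
                      [[0.00, 0.01], [0.02, 0.00]]])
flux = np.full(footprint.shape, 0.5)
enhancement_ppm = receptor_enhancement(footprint, flux)
```

Because the footprint does not depend on the flux, the same array can be reused with a different flux field or trace gas by changing only the second argument.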
The flux inversion work for CARVE relies on the Stochastic Time-Inverted Lagrangian Transport (STILT) model (Lin et al., 2003, www.stilt-model.org), an LPDM rigorously tested by Hegarty et al. (2013) and widely used in regional GHG flux inversions (e.g., Zhao et al., 2009; Kort et al., 2008, 2010; Gourdji et al., 2012; Miller et al., 2012, 2013; McKain et al., 2012). STILT is an enhanced version of the NOAA Air Resources Laboratory's HYSPLIT model (Draxler and Hess, 1998) that combines the powerful features of HYSPLIT, such as the ability to make optimal use of highly nested, high-resolution meteorological input fields from a large number of data sources, with enhancements aimed at mass conservation, a critical consideration for inversion work. These enhancements include a reflection/transmission scheme and the use of customized meteorological fields (time-averaged mass fluxes and convective mass fluxes) in the dispersion calculations (Lin et al., 2003; Nehrkorn et al., 2010). Studies by Brioude et al. (2012) and Hegarty et al. (2013) have confirmed the benefits of using the customized meteorological output for LPDM computations, particularly in complex terrain. A sample comparison between STILT footprints and HYSPLIT trajectories (Fig. 1) shows general agreement between footprints and trajectories, but, in general, the convolution of footprints with surface-atmosphere flux models reveals details of flux distributions not captured by mean trajectories. Furthermore, the footprints are flux- and species-independent and can be efficiently applied to different flux models and species and incorporated into formal inversion frameworks. In all these aspects, our transport modeling for CARVE goes beyond previous airborne campaign meteorological and transport modeling that was predominantly focused on studying the origin of air masses through forward computation of mean trajectories or particle dispersion for selected species (e.g., Fuelberg et al., 2010 for the Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) mission).

WRF v3.4.1 baseline simulations for 2012 campaign
The STILT runs for CARVE carbon flux inversions are driven by customized meteorological fields from the Advanced Research version of the Weather Research and Forecasting (WRF) model (ARW; Skamarock et al., 2008). The atmospheric model was configured to generate high-quality, high-resolution meteorological fields over Arctic and boreal Alaska. This is a region of challenging atmosphere-ocean-cryosphere interactions that is subjected to numerous unique physical processes (e.g., Vihma et al., 2013). The WRF fields form the starting point of many lines of research by the CARVE team and were made available in a timely manner through use of NASA supercomputer resources. Initially (see Sect. 3.2 for additional runs), for the 2012 campaign, we used Polar version 3.4.1 of the WRF model (for brevity, hereafter, simply WRF), with features, primarily related to the Noah land surface model (Chen and Dudhia, 2001), optimized for polar applications (Wilson et al., 2011). The fields generated by WRF v3.4.1 were used by the CARVE team in their subsequent 2012 analyses (see Sect. 5.3), and thus our analysis of model performance in this paper will focus on output from this configuration.
Placement of the modeling domains, and specification of STILT receptor locations, was dictated mainly by the locations of the 2012 CARVE aircraft flight tracks (Fig. 2a) over mainland Alaska. The 2013 aircraft flight tracks (Fig. 2b) show a similar geographical extent. A triply nested computational domain on a polar stereographic grid (Fig. 2c) was chosen so that the innermost domain with 3.3 km grid spacing encompasses the entire mainland of Alaska and enables the substantial orography of the state to be represented by the underlying topographic input field. The edges of this domain are positioned distant from the location of the CARVE aircraft flights to avoid deleterious model domain edge effects. Domain 2 is similarly positioned so that the domain edges are a considerable distance from any tower locations, while domain 1 is sufficiently large for effects from the lateral boundary conditions to be minimized. Table 1 summarizes the WRF v3.4.1 model configuration and physics options used for the baseline 2012 simulations. Our choice of major physics options follows the WRF configuration selected for the Arctic System Reanalysis (ASR; Bromwich et al., 2012) except that we used the ensemble Grell-Devenyi cumulus parameterization that is coupled with STILT. Cloud microphysics were parameterized in all three domains. Soil moisture and a binary (i.e., open water versus complete ice cover) sea ice field from the NCAR/NCEP Reanalysis product (Kalnay et al., 1996) were applied in our simulations. The default sea ice thickness of 3 m and snow depth on sea ice of 0.05 m were used in the Noah land surface model.
CARVE 2012 science flights took place between 23 May and 1 October (Alaska time). WRF runs were generated for 10 May through 2 October 2012 (a total of 146 days), thus accommodating back trajectory simulations for measurements starting on 23 May. Meteorological fields for the entire period were formed from daily 30 h WRF runs that were initiated at 00:00 UTC every 24 h with the first 6 h of each simulation dropped to avoid spin-up errors. That is, the WRF fields of forecast lengths 6-30 h from individual model simulations are retained for each 06:00 to 06:00 UTC period. The 00:00 to 06:00 UTC period is thus represented by the 24 to 30 h forecast fields from the prior day's simulation. This approach was found to enhance STILT accuracy and computational efficiency (Nehrkorn et al., 2010). The practice of  (Lo et al., 2008). The use of 30 h simulations also minimizes the magnitude of the seams between model runs compared with longer runs. The boundary and initial conditions are derived from the NASA Modern Era Retrospective-analysis for Research and Applications (MERRA, Rienecker et al., 2011). Grid nudging above the boundary layer every 3 h in domain 1 further prevents model drift. These initial WRF v3.4.1 results for 2012 represent the baseline against which later efforts aimed at optimizing the WRF performance through numerical and physics options are evaluated.
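The splicing of daily 30 h runs into a continuous record can be made concrete with a small lookup that maps a valid time to the run that supplies it. This is only an illustrative sketch: the function name is hypothetical, and the assignment of the 06:00 UTC field itself to the prior day's 30 h forecast (rather than to the 6 h forecast of the current run) is an assumption, since both are retained in principle.

```python
from datetime import datetime, timedelta

SPINUP_HOURS = 6  # first 6 h of each daily run are discarded

def source_run(valid_time):
    """Return (initialization time, forecast hour) for a UTC valid time.

    Runs start at 00:00 UTC every day. Valid times up to and including
    06:00 UTC are taken from the previous day's run (forecast hours
    24-30); later times come from the same day's run. Treating exactly
    06:00 UTC as forecast hour 30 is an assumption of this sketch.
    """
    midnight = valid_time.replace(hour=0, minute=0, second=0, microsecond=0)
    hours_since_midnight = (valid_time - midnight).total_seconds() / 3600.0
    if hours_since_midnight <= SPINUP_HOURS:
        init_time = midnight - timedelta(days=1)
    else:
        init_time = midnight
    forecast_hour = (valid_time - init_time).total_seconds() / 3600.0
    return init_time, forecast_hour

# Example: a field valid at 03:00 UTC on 23 May 2012.
init_time, forecast_hour = source_run(datetime(2012, 5, 23, 3))
```

With this convention the seam between consecutive runs falls between the 06:00 and 07:00 UTC fields.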

WRF v3.5.1 simulations for 2012 and 2013 campaigns
The preliminary meteorological fields from WRF v3.4.1 (described above) were ingested by STILT. An example of CARVE chemical analysis facilitated by these data is presented in Sect. 5.3. Subsequently, to improve the scientific fidelity of the CARVE modeling efforts, WRF simulations were repeated for the 2012 campaign, and then extended to the 2013 CARVE campaign, using the polar variant of WRF v3.5.1. The modeling period for 2013 was expanded in time to accommodate the CARVE 2013 science flights that took place between 2 April and 28 October (Alaska time), and a subset of observations from the CARVE tower in Fox, AK, and other existing towers. WRF runs for 2013 were generated for 1 March to 30 November (a total of 275 days). Using v3.5.1 of WRF, more rigorous implementation of cryospheric fields was enabled. The fractional MERRA snow cover field was used over land. Over water bodies in our modeling domains, daily Pan-Arctic Ice Ocean Modeling and Assimilation System (PIOMAS) v2.1 ice thickness and depth of snow cover over ice (Zhang and Rothrock, 2003) supplemental data sets were implemented as part of the model preprocessing following Hines et al. (2015). Sea ice thickness in the Noah land surface model was restricted to within 0.1 to 10 m, while snow depth over ice was restricted to within 0.001 to 1 m. Sea ice albedo was prescribed following Hines et al. (2015) to vary by Julian day and latitude. The specified v3.5.1 model configuration, otherwise, was unchanged from that of the baseline v3.4.1, and software refinements and corrections inherent to any model update are not anticipated to have significant effect.
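The bounds placed on sea ice thickness and on snow depth over ice amount to a simple clamping of the input fields. The sketch below shows one way this could be expressed; it is not the WRF preprocessing code, the function name is hypothetical, and it assumes the inputs contain only ice-covered grid points (open-water points would need to be masked out beforehand).

```python
import numpy as np

def constrain_sea_ice_fields(ice_thickness_m, snow_depth_m):
    """Clamp ice thickness to 0.1-10 m and snow depth on ice to 0.001-1 m.

    Inputs are assumed to hold values at ice-covered points only.
    """
    ice = np.clip(np.asarray(ice_thickness_m, dtype=float), 0.1, 10.0)
    snow = np.clip(np.asarray(snow_depth_m, dtype=float), 0.001, 1.0)
    return ice, snow

# Hypothetical values at three ice-covered grid points.
ice, snow = constrain_sea_ice_fields([0.05, 2.0, 12.0], [0.0, 0.3, 2.0])
```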

WRF verification
In this section we evaluate domain 3, the innermost WRF domain with 3.3 km grid spacing, against surface and upper air observations to quantify the accuracy of the WRF meteorological fields. This high-resolution domain is expected to generate more realistic meteorological features than conventional meso- and global-scale atmospheric NWP models that utilize horizontal grids at least an order of magnitude larger. The two-way grid configuration that we adopted permits feedback from domain 3 to the coarser-resolution domains. While this results in more realistic simulations, it does not allow for an independent evaluation of model performance with respect to grid resolution in the outer domains; they are consequently excluded from the current verification.
The meteorological evaluation of our WRF runs was performed using the Model Evaluation Toolkit (MET, version 4.1). MET is the official WRF validation software maintained by the NCAR Developmental Testbed Center (DTC, http://www.dtcenter.org/met/users/) and is tailored to ingest a variety of observation types available from the NCAR Research Data Archive data set 337.0 (http://rda.ucar.edu/datasets/ds337.0/) that subsequently is used in the NCEP Global Data Assimilation System. MET interpolates meteorological fields to observation locations to form matched pairs for a range of variables, including temperature, dewpoint, relative humidity, specific humidity, horizontal wind speed, and horizontal wind components. Extensive quality control was performed to identify unphysical observations and remove stations that do not meet data availability thresholds. The latter includes the exclusion of platforms, mesonets, and locations with fewer than one-third of the expected hourly availability and those with data of unknown quality (e.g., due to poor exposure of the instruments). With these procedures in place, the remaining observations form the basis of a quantitative analysis aimed at ensuring that our WRF fields are adequate to drive STILT.

Monthly bias and RMSE
WRF model performance is quantified against surface observations through two summary statistics, bias and RMSE, computed for three different modeling periods and configurations: (a) v3.4.1 for 2012, (b) v3.5.1 for 2012, and (c) v3.5.1 for 2013. The 2 calendar years of 2012 and 2013 had substantially different growing seasons: 2012 was cooler and wetter than the 1980-2010 mean, while 2013 was warmer and drier. In 2013, a very late thaw (∼ 20 May in Fairbanks) transitioned rapidly into summer, while the autumn refreeze occurred very late (∼ 1 November). Statistics provided here are computed using approximately 120 land sites located in domain 3, predominantly in mainland Alaska, with some on offshore Alaskan islands and in the Yukon and Northwest Territories of Canada. The following model fields are interpolated every hour to the location of the observations and included in the evaluation: 2 m temperature, 2 m dewpoint temperature, and 10 m wind speed. No vertical correction is applied to account for mismatch between model topographic heights and the true station elevation. Such an adjustment is needed less for high-resolution grids at 3.3 km grid spacing than for coarser-resolution grids. Furthermore, the choice of lapse rate used in these adjustments is often not representative of the environmental lapse rate, and, for our purposes, obscures evaluation of the WRF fields as provided to the STILT model.
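For reference, the two summary statistics can be written out for a set of matched model-observation pairs. The following is a minimal sketch rather than the MET implementation: the function name and sample values are hypothetical, and bias is taken here as the mean of model minus observation, so that a negative value means the model is too cold, too dry, or too slow.

```python
import numpy as np

def bias_and_rmse(model, obs):
    """Return (bias, RMSE) over matched pairs, skipping missing values.

    bias = mean(model - obs); RMSE = sqrt(mean((model - obs)**2)).
    """
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    valid = ~(np.isnan(model) | np.isnan(obs))
    diff = model[valid] - obs[valid]
    bias = float(np.mean(diff))
    rmse = float(np.sqrt(np.mean(diff ** 2)))
    return bias, rmse

# Hypothetical 2 m temperatures (K) at four matched times.
model_t2m = [271.0, 274.5, 280.0, 283.0]
obs_t2m = [273.0, 275.5, 281.0, 283.0]
bias, rmse = bias_and_rmse(model_t2m, obs_t2m)
```

Here the differences are −2, −1, −1, and 0 K, giving a bias of −1 K and an RMSE of the square root of 1.5, about 1.22 K.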
Table 2 shows that WRF v3.4.1 exhibits a bias of −1.40 K in 2 m temperature for the 2012 campaign, and negative temperature biases are also present in each month. The model temperatures are relatively warmer and thus closer to observations during August and September, but remain too low overall. Dewpoint temperature has a negative bias of −0.16 K across the 2012 campaign. The bias changes from positive (moist) early in the campaign in May and June to negative (dry) from July onwards. This evolving model performance with a change in sign may indicate challenges related to inaccurate representation of the underlying soil conditions and the melting of snow and ice cover. More evidence is provided later in this section. Ten-meter wind speed has a small bias of −0.17 m s−1, which is encouraging given the primary importance of the wind field as an input to STILT. A prolonged negative bias in wind speed decreases in magnitude from −0.56 m s−1 during the spring and summer and becomes positive by September. We retain model-observation pairs when the observed wind speeds are greater than 3 kn (∼ 1.5 m s−1). Below this value, mechanical and logistical influences, such as wind sensor starting thresholds, rounding, and administrative limits (Bellinger, 2011) complicate scientific interpretation of the errors. Indeed, standard Automated Surface Observing System (ASOS) sensors report variable direction under certain meteorological conditions. The count of observed wind speeds by wind speed bin for the entire 2012 campaign demonstrates an unphysical distribution (Table 3). The imposition of a wind speed threshold by definition focuses the statistics on higher wind speeds that are more important to transport errors. Fox (2013) summarized the effects of ASOS implementation on wind speed observations, including a note that there was a factor of 2.4 increase in the number of pre- versus post-ASOS incidences of calm observations. Indeed, inclusion in the statistics of an artificially large number of calm observations compared to large numbers of non-zero model values introduces an apparent positive bias in the model wind speeds. Additionally, the ambiguous wind direction coding in the observation database for calm, and light and variable, winds also results in inflated wind direction RMSE. For all months of the 2012 campaign, wind direction biases are small (less than or equal to 5°), with July exhibiting the smallest values. Transport uncertainties related to the small wind speed and direction biases are similarly expected to be small.
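The wind-speed screening and the treatment of wind direction errors can be sketched as follows. This is an illustration only, not the verification code used for the tables: the function names and sample values are hypothetical, the threshold is applied here as "observed speed greater than or equal to 3 kn" (an assumption about how the cutoff is implemented), and direction errors are wrapped to the range −180° to 180° so that, for example, 350° versus 10° counts as a 20° difference rather than 340°.

```python
import numpy as np

KNOT_TO_MS = 0.514444  # 1 kn expressed in m/s

def filter_light_winds(model_speed, obs_speed, min_obs_knots=3.0):
    """Keep matched pairs whose observed speed meets the threshold.

    Speeds are in m/s; the >= comparison is an assumption of this sketch.
    """
    model_speed = np.asarray(model_speed, dtype=float)
    obs_speed = np.asarray(obs_speed, dtype=float)
    keep = obs_speed >= min_obs_knots * KNOT_TO_MS
    return model_speed[keep], obs_speed[keep]

def direction_error(model_dir_deg, obs_dir_deg):
    """Signed model-minus-observed direction difference in [-180, 180)."""
    diff = np.asarray(model_dir_deg, dtype=float) - np.asarray(obs_dir_deg, dtype=float)
    return (diff + 180.0) % 360.0 - 180.0

# Hypothetical pairs: the two calm or near-calm observations are dropped.
model_kept, obs_kept = filter_light_winds([1.2, 1.5, 2.5, 4.0],
                                          [0.0, 1.0, 2.0, 5.0])
wrapped = direction_error([350.0, 10.0], [10.0, 350.0])
```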
RMSEs for WRF 2012 v3.4.1 reported in Table 4 are 2.97, 2.52 K, and 2.19 m s−1 for temperature, dewpoint temperature, and wind speed, respectively. Temperature and moisture RMSE values decrease month to month from May-June (∼ 3.0 to 3.5 K) to September (∼ 2.0 K). Wind speed errors, however, remain relatively constant throughout the campaign, with an increase in September perhaps related to the passage of strong extratropical cyclones during this month. Such storms, while typical for the region as autumn commences, pose a significant modeling challenge, especially for the wind field. As detailed in Sect. 5.1, model representation of these events is fraught with timing and position errors associated with strong horizontal thermal and moisture advection from the south, mixing of strong low-level winds aloft to the surface, and the occurrence of downslope wind events to the lee of mountain ranges. Strong cyclones affected Alaska around 5, 16, and 26 September 2012. Model performance during a downslope wind event on 16-17 September follows in Sect. 5.1. Despite the varying amounts of cyclonic activity by month, wind direction RMSEs decrease slightly from May through September with a 2012 campaign average of 52.1°.

Validation of arctic modeling in literature: surface variables
As noted earlier, the purpose of this manuscript is to assess the suitability of CARVE WRF simulations for use in transport modeling. While there are many recent studies that have illustrated both subjective and objective characteristics of Arctic modeling (e.g., Cassano et al., 2011; Jakobson et al., 2012; Tilinina et al., 2014; Jung and Matsueda, 2015; Simmonds and Rudeva, 2012; Glisan and Gutowski Jr., 2014), we put our model performance in perspective by noting the most relevant recent modeling studies that share many components similar to our study. We include comparisons against recent reanalysis products that are the result of intensive efforts to reduce errors in simulated fields by coupling the forecast model to a data assimilation system, in contrast to a free-running forecast. When making these comparisons, the evolving nature of WRF and polar modifications, differing choices of observational data sets, grid resolution, modeling domain, and simulation periods must be kept in mind.
It should be noted that the different lengths of the CARVE 2012 and 2013 modeling periods strongly influence how seasonal and monthly differences in errors affect the campaign averages. This should also be kept in mind when comparing model performance against those from the literature listed below where periods of performance vary greatly. Hines et al. (2011) used Polar WRF version 3.0.1.1 in a 25 km western Arctic domain during 2006-2007 and compared model performance to ten observing sites over Alaska. Model biases of 2 m temperature averaged over all sites for the months of May, June, and July (i.e., the months that overlap with the 2012 CARVE campaign) were 1.7, 2.5, and 1.4 K, respectively. RMSEs for the same months were 3.4, 3.9, and 3.5 K, respectively. The biases (RMSEs) in 10 m wind speed for Barrow for May, June, and July were −0.4 (1.5), 0.3 (1.8), and −0.3 m s−1 (1.4 m s−1), respectively. Wilson et al. (2011), for an ASR-like domain with 60 km grid spacing during December 2006 to November 2007, reported an Arctic-wide bias of −1.3 and an RMSE of 4.4 K for temperature. For dewpoint temperature, values were −0.4 and 4.4 K, respectively. Individual sites experienced substantial biases ranging from −7.0 to 5.9 K for temperature and −7.2 to 6.5 K for dewpoint temperature. Wind speed bias was 0.5 m s−1, with an RMSE of 2.7 m s−1.
More recently, Hines et al. (2015) used Polar WRF v3.5 on a 20 km grid to perform a number of sensitivity experiments related to varying sea ice treatment. In the current work, we closely follow their baseline implementation of Polar WRF and use of supplemental sea ice thickness, snow depth over sea ice, and sea ice albedo. Using observations from the Surface Heat Budget of the Arctic Ocean (SHEBA; Uttal et al., 2002) drifting ice station for January 1998, they obtained surface temperature and wind speed bias (RMSE) values of −1.2 K (3.2 K) and 0.3 m s−1 (1.3 m s−1), respectively. Bromwich et al. (2015) reported the performance of the ERA-Interim global reanalysis (ERA) and Arctic System Reanalysis (ASR) version 1 based on Polar WRF v3.3.1 against a large number of standard observations during the period December 2006-November 2007. They reported annual biases (RMSEs) for 2 m temperature for the ASR and ERA of 0.10 (1.33 K) and 0.29 K (1.99 K), respectively. For 2 m dewpoint temperature, they reported annual biases (RMSEs) for the ASR and ERA of −0.02 (1.72 K) and 0.32 K (2.04 K), respectively. For 10 m wind speed, they reported annual biases (RMSEs) for the ASR and ERA of −0.24 (1.78 m s−1) and 0.41 m s−1 (2.13 m s−1), respectively. Errors were larger regionally where topography is complex and were compounded by the use of a coarse grid spacing of 30 km underlying the reanalysis product.
Wesslén et al. (2014) evaluated the performance of the global ERA-Interim reanalysis (∼ 80 km grid spacing) and two versions of the developmental Arctic System Reanalysis (ASR1 and ASR2) against ship-based observations obtained by the Arctic Summer Cloud Ocean Study (ASCOS) in the mostly ice covered Arctic Ocean in August and September 2008. For 2 m temperature, they obtained biases for the ASR1, ASR2, and ERA-I of −0.8, −1.3, and 1.3 K, respectively. Corresponding RMSEs were 2.3, 2.5, and 1.9 K, respectively. For wind speed, values were −1.4, −1.6, and −0.4 m s−1, respectively, and RMSEs were 2.2, 2.3, and 1.6 m s−1. The above details strongly suggest that the current CARVE modeling effort has generated near-surface meteorological fields that are of comparable quality to those in the recent literature. It should be kept in mind that, while realism is improved, skill scores from traditional verification techniques are often degraded due to imperfect timing and placement of small-scale features (Mass et al., 2002).

Diurnal cycle of surface variables
We now investigate model bias of the WRF v3.4.1 simulations for 2012 as a function of time of day (all times are UTC unless otherwise indicated; subtract 8 h for AKDT; e.g., 12:00 AKDT is 20:00 UTC) for the 2012 campaign and also each month of the 2012 runs. All subsequent figures and discussion, unless otherwise noted, refer to the WRF v3.4.1 model simulations for 2012, since these fields were used by CARVE in preliminary analyses and in the examples that follow in Sect. 5.3. When assessing model representation of the true diurnal cycle, it should be kept in mind that the first 6 h of each daily 30 h simulation is discarded to avoid model spin-up errors, with the 00:00-06:00 UTC period formed from the 24 to 30 h simulation period from the previous day's run. This splicing together of short model simulations results in a seam between adjacent model runs.
Fields from forecast hour zero (not shown) exhibit very small bias and RMSE values. Since these hour-zero WRF fields are simply a spatial interpolation of the MERRA reanalysis to the WRF grid, the minimal error demonstrates the accuracy of the MERRA fields when used for initial conditions. Conversely, deficiencies that are inherent to any numerical weather prediction model are responsible for the subsequent growth of errors with respect to the observations over the 30 h simulation periods. These are responsible for any discontinuities seen between 06:00 and 07:00 UTC. It should also be noted that the diurnal cycle is less well defined above the Arctic circle around the summer solstice when solar insolation persists 24 h per day, with resulting effects on the PBL moisture and temperature structure that can differ from those in the mid-latitudes (Tjernström, 2007). In general, the summer Arctic PBL is strongly modulated by surface processes and the presence, and change of phase, of ice and snow. This is in contrast to the free troposphere, which retains characteristics of air masses advected from lower latitudes (e.g., Tjernström et al., 2004).
Performance of the WRF v3.4.1 simulations as a function of time of day for the entire 2012 campaign is similar to patterns seen for individual months, with exceptions noted below. A rather pronounced diurnal cycle in model performance for temperature (Fig. 3a) results in increasingly negative biases starting around 15:00-16:00 UTC as the solar zenith angle decreases. This indicates a lack of sufficient sensible heating at the surface through the time of peak heating across all months. Of note is a pronounced increase in daytime positive bias (too moist) for dewpoint temperature for May (Fig. 3b), in contrast to other months that exhibit only small biases and are relatively insensitive to the time of day. This may indicate misrepresentation in the model of surface processes, such as changes of state, when (relatively) large diurnal cycles of solar radiation interact with extensive snow and ice cover that is starting to melt. Modeling of the timing and duration of freeze/thaw cycles during the spring thaw is challenging, and small errors in timing can result in large errors in both temperature and dewpoint temperature. Wind speed bias is generally negative across all hours for all months, with several months showing a reduction in the negative bias starting around 14:00-15:00 UTC with increasing insolation. Of all months during the 2012 campaign, September exhibits the smallest cold bias in temperature and is unique in having a positive wind speed bias across all hours (Fig. 3c). An increase in RMSE values of temperature (Fig. 4a) occurs in the afternoon as the bias becomes more negative. RMSEs for dewpoint temperature (Fig. 4b) show a marked discontinuity between 06:00 and 07:00 UTC, reflecting the seam between model runs. The discontinuity is most pronounced in May and steadily decreases through the summer. Again this suggests that the quality of the model representation of surface dewpoint temperature degrades substantially with forecast length due to model deficiencies, including those related to changes of state in the spring months. This increase in error accelerates around 18:00 UTC (10:00 AKDT) and remains large through the end of each 30 h simulation. A relatively marked seam of approximately 0.3 m s−1 between 06:00 and 07:00 UTC in wind speed RMSE for September (Fig. 4c) is larger than for all other months, though still modest in magnitude.
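As an illustration of the time-of-day statistics discussed above, the following minimal sketch groups model-observation pairs by UTC hour and computes bias and RMSE for each hour. It is not the CARVE verification code: the function name `stats_by_hour` and the sample values are hypothetical, and the sign convention (model minus observation, so that a cold model gives a negative bias) is assumed from the usage in the text.

```python
import math
from collections import defaultdict

def stats_by_hour(pairs):
    """Return {hour: (bias, rmse)} from (hour_utc, model, observed) tuples.

    Bias is the mean of (model - observed); a negative value means the
    model is too low (e.g., too cold). This convention is an assumption.
    """
    errors = defaultdict(list)
    for hour, model, observed in pairs:
        errors[hour].append(model - observed)
    result = {}
    for hour, errs in sorted(errors.items()):
        n = len(errs)
        bias = sum(errs) / n
        rmse = math.sqrt(sum(e * e for e in errs) / n)
        result[hour] = (bias, rmse)
    return result

# Hypothetical 2 m temperatures (K): two pairs at 18:00 UTC, one at 06:00 UTC.
sample = [(18, 280.0, 282.0), (18, 281.0, 282.0), (6, 275.0, 275.5)]
hourly = stats_by_hour(sample)
```

Applying the same function separately to each month and to the full campaign would give one curve per period, as plotted in Figs. 3 and 4.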

Aggregate wind rose plots
Wind rose plots showing modeled and observed winds are shown for all stations during the 2012 campaign in Fig. 5. The plots contain the same thirty-six 10° wind direction bins used for encoding by the ASOS instrument to avoid the wind direction bias noted by Droppo and Napier (2008). As noted earlier, model-observation wind pairs are retained and plotted only if the observed wind speed is at least 3 kn. The wind roses demonstrate that the model represents the wind direction frequency distribution well, with subtle favored wind directions from the southwest and east in both the model (Fig. 5a) and observations (Fig. 5b). Of note is the higher frequency in the observations of the highest wind speeds (light green and orange).
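The binning and filtering described above can be sketched as follows. This is a hedged illustration only: the helper names are hypothetical, and the centring of the thirty-six bins on multiples of 10° (so that bin 0 spans 355° to 5°) is an assumption about the ASOS encoding rather than something stated in the text.

```python
def direction_bin(direction_deg):
    """Map a wind direction in degrees to one of 36 bins of 10 degrees.

    Bins are assumed to be centred on multiples of 10 degrees, so bin 0
    covers [355, 5), bin 1 covers [5, 15), and so on.
    """
    return int((direction_deg % 360.0 + 5.0) // 10.0) % 36

def wind_rose_counts(pairs, min_obs_speed_kn=3.0):
    """Count model and observed directions per bin.

    pairs: iterable of (model_dir_deg, obs_dir_deg, obs_speed_kn).
    A pair is kept only if the observed speed is at least the threshold.
    """
    model_counts = [0] * 36
    obs_counts = [0] * 36
    for model_dir, obs_dir, obs_speed_kn in pairs:
        if obs_speed_kn < min_obs_speed_kn:
            continue  # drop light-wind pairs, as described in the text
        model_counts[direction_bin(model_dir)] += 1
        obs_counts[direction_bin(obs_dir)] += 1
    return model_counts, obs_counts

# Hypothetical pairs: the second is discarded because the observed speed is 2 kn.
model_counts, obs_counts = wind_rose_counts(
    [(232.0, 228.0, 8.0), (90.0, 100.0, 2.0), (358.0, 3.0, 5.0)]
)
```

Using identical bins for model and observations means that any difference between the two roses reflects the wind fields rather than the encoding.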

Spatial distribution of bias and RMSEs
The spatial performance of WRF v3.4.1 shows expansive areas of negative bias across domain 3 for the 2012 campaign (Fig. 6a). The 2 m model temperatures are too cold across all of Alaska, with sites on the North Slope adjacent to the Beaufort Sea and in the Yukon and Northwest Territories of Canada showing the largest negative biases. The persistence during June, July, and August (monthly plots for the 2012 campaign are provided in the Supplement) of substantial negative biases on the North Slope and over other coastal sites results in these locations exhibiting the largest campaign-averaged cold bias. Similar negative biases are present in interior Alaska in May, but diminish substantially in the summer months following snowmelt.
Dewpoint temperature (Fig. 6b) shows a small positive bias for most areas, except on the North Slope where a small negative bias exists. Similar to temperature, a pronounced seasonal change occurs over the 2012 campaign. Monthly plots for the 2012 campaign are provided in the Supplement. During May, a large positive bias of 2-4 K exists across most interior stations. As summer progresses, the magnitude decreases everywhere, with a bias of −3 to −5 K developing and persisting across North Slope sites. The bias is minimal by September at most sites. The coastal stations at Unalakleet (PAUN) and Point Hope (PAPO), however, have substantial positive biases of moisture, despite generally good model performance for other variables at these sites. Wind speed (Fig. 6c) is biased slightly high at many inland sites, but shows small negative bias for coastal regions and sites on the North Slope. There is no substantial seasonal signal, except that many sites in southern Alaska in September exhibited a noticeable positive wind speed bias of approximately 1-2 m s−1.
RMSEs in temperature for the overall 2012 campaign (Fig. 7a) are substantial (up to 6 K) and extensive along the coast of the North Slope, with smaller values elsewhere in Alaska. For these coastal locations, it is conceivable that errors in resolving the coastline on the 3. tions that are likely challenged by the arrival from the south of warm and moist air masses during the boreal summer over land surfaces still influenced by snow, ice, or meltwater. RMSEs decrease noticeably across all sites during August and remain small (approximately 2 K) in September. RMSE values for dewpoint temperature (Fig. 7b) do not exhibit substantial geographical variation when averaged over the entire 2012 campaign, despite monthly values following a pattern similar to those for temperature, though with less extreme and more transient maximum errors on the North Slope in July of approximately 7 K. Wind speed RMSEs (Fig. 7c) are more uniform across all regions, with the most pronounced feature being a substantial regional increase in September in southeast Alaska. The increase in RMSE may be related to poor timing of high wind speeds associated with the strong cyclones that occurred in this mountainous region of Alaska during the month. Of note are persistent and large errors in dewpoint temperature for stations PAUN and PAPO. Evidently erroneous low dewpoint temperatures (not shown) observed at PAPO are the primary cause for the moisture errors (both RMSE and bias) at that station. These unphysical values are not present in the 2013 observations at these sites. The source of the large errors at PAUN, however, is not obvious and may be related to the fact that the coastline is not sufficiently resolved in the model.

Model performance at representative stations
We now show WRF v3.4.1 performance for three sites in Alaska often visited by the CARVE aircraft: McGrath (PAMC), Deadhorse (PASC), and Barrow (PABR). These regions are low-lying wetlands with abundant seasonal CO2 and CH4 fluxes and substantial evapotranspiration. We expect model performance at these stations to be representative of neighboring flight locations. Indeed, there is relatively small spatial variation (see Fig. 7) of errors in the vicinity of these three sites.
McGrath (62.95° N, 155.58° W) is located along the upper Kuskokwim River in interior Alaska. At McGrath, a negative temperature bias (Fig. 8a) and (possibly related) positive (moist) bias (Fig. 8b) in dewpoint temperature exist from May through late June. Extensive snow cover ended in the vicinity of McGrath around 10 May, the date of the first model simulation. Wind speed (Fig. 8c) in the model is too high, in part because of frequent calm observations. The effect on the magnitude of wind speed errors will be small due to the low wind speeds involved here. Wind direction is well represented in the model, with prevailing winds from the south-southwest in both the model (Fig. 9a) and observations (Fig. 9b). Secondary peaks in the wind direction distribution from the west-northwest, north-northwest, and east are also well modeled. The varied directions (compare to subsequent analyses for Deadhorse and Barrow) in part suggest complex source regions of air masses in the interior of Alaska. The frequency of lower wind speeds (blue colors) is higher in the observations.
Deadhorse (70.21° N, 148.51° W) is located on the North Slope adjacent to the Beaufort Sea amid low-lying tundra. While experiencing a very cold and dry climate, it is susceptible to marine influences during periods of open water. Local variability in the warmer months of the year is enhanced by its location at approximately 70° N, as incursions of warm, moist air from the south result in brief and infrequent but pronounced departures from a cold annual state. Timing of such frontal passages and also local cooling effects of the nearby ocean are challenging for NWP models to reproduce. Temperatures (Fig. 10a) and dewpoint temperatures (Fig. 10b) in WRF generally exhibit a strong negative bias between the disappearance of extensive snow cover around 10 June 2012 and retreat of sea ice by the middle of July, with forecast errors especially large during and following a mid-June warm spell. The warm spell was associated with temperature and moisture advection from the south in the absence of low clouds and snow cover. Degradation in model performance, however, appears strongly related to thawing of the top layer of soil. The appearance of a pronounced negative temperature and dewpoint temperature bias between 12 and 15 June 2012 coincides with an abrupt increase in near-surface soil moisture from snowmelt as soil temperatures rise above the melting point at a nearby Natural Resources Conservation Service Snow Telemetry (SNOTEL) instrument at Prudhoe Bay, AK (http://www.wcc.nrcs.usda.gov/nwcc/site?sitenum=1177; "near-surface" measurements are obtained at a depth of 5 cm). A similar bias that appears in the 2013 time series during 9-11 June also coincides with thawing of the soil. A potential source of error is that the Noah land surface model in WRF may not be completely spun up, and therefore not reach equilibrium with the diurnal cycle at small spatial scales (e.g., Chen et al., 2007). Daily use of coarse-resolution soil temperature and moisture inputs from MERRA and NNRP presumably contributes to the degraded surface temperature simulations near the coastline in the presence of moisture from snowmelt and thawing soil layers. This is despite the fact that the initial soil and temperature fields from MERRA and NNRP are likely in equilibrium for their respective much coarser-resolution analysis systems, though biases may still be present. Indeed, to minimize the effect of spin up in the land surface model, Hines et al. (2011) performed an offline 10-year cycling of soil mois-  2011) also adopted a longer spin up of 24 h (compared to our 6 h) in an attempt to improve representation of the model atmosphere, specifically the surface interface and the boundary layer. However, their results, given earlier in Sect. 4.1.2, show bias and RMSEs comparable to results from CARVE. It is to be noted that the sign of the dewpoint temperature bias following melting of snow cover is different here than in McGrath. Model temperatures improve noticeably around the middle of July, likely related to the retreat of coastal sea ice, and errors remain small in August and September. The large negative bias in dewpoint temperature also decreases around the middle of July, but frequent episodes of negative bias persist into September. A time series plot of wind speed (Fig. 10c) shows a negative bias, especially during May, with peak speeds during the mid-June warm spell also too low. Wind roses show excellent agreement between model (Fig. 11a) and observations (Fig. 11b), with the predominant direction being east-northeast. A secondary peak is seen for winds from the southwest. The strongest observed winds (light green and orange in Fig. 11b) occurred just prior to the June warm spell but are absent in the model (Fig. 11a).
Barrow (71.29° N, 156.77° W) is located on a peninsula extending into the Beaufort Sea. It is the northernmost point in Alaska and is surrounded to the south by wetlands and low-lying tundra. Results for Barrow are comparable to those for Deadhorse. Of note is the large negative temperature bias during the summer months starting in the middle of June (Fig. 12a), while there also are occurrences of excessive nighttime cooling in May and early June when snow cover is present. Extensive snow cover ended by the middle of June in Barrow and extensive sea ice in close proximity to the shoreline receded by the middle of August. In addition to the possibility of inadequate representation in the model of saturated soil conditions, poor representation of the extent and frequency of low clouds and coastal fog, which are modulated by the retreat of coastal sea ice starting in July, can affect modeled temperatures (e.g., Dong and Mace, 2003) and so may play an important role in the negative bias along the North Slope including Deadhorse and Barrow. Interestingly, simulation of Barrow summer temperatures by Hines et al. (2011) exhibited a positive bias (in contrast to the negative bias reported here) of over 10 K on many occasions following snowmelt in June. Coastal marine influences also appeared to be poorly modeled in their study, with much better agreement farther inland at, for example, Atqasuk, which lies approximately 110 km to the south of Barrow and experiences a more continental climate (McFarlane et al., 2009). Unfortunately, there is a dearth of non-coastal sites in our observational database with which to investigate further. Very low dewpoint temperatures in our simulations (Fig. 12b) in May were modeled on nights with excessive radiational cooling. A pronounced low (dry) bias from late June through late August was also seen. By September, low-level moisture in the model showed much improvement compared to observations. Wind speed (Fig. 12c) in the model is slightly too low, but does not exhibit any egregiously bad time periods. Similar to the wind roses for Deadhorse, those for Barrow indicate good agreement between model (Fig. 13a) and observations (Fig. 13b), with predominantly east-northeast winds and a secondary maximum from the southwest. The highest wind speeds (light green color) are more frequently seen in the observations (Fig. 13b).

Campaign bias and RMSEs
To quantify model performance aloft, model bias and RMSE statistics are computed for all 00:00 and 12:00 UTC upper air observations in the 2012 and 2013 campaigns at 850, 700, 500, 300, and 200 hPa. Model values were interpolated to the location of 11 balloon-launch sounding sites (10 in Alaska and 1 in the Northwest Territories of Canada). For WRF v3.4.1 in Table 5, model representation of upper-level variables is very good, with acceptably small bias and RMSEs. The 200 hPa level, which is frequently above the tropopause in the Arctic, often has a bias of a different sign than in the troposphere, but the magnitude remains modest. Temperature bias values are small (magnitude less than 0.2 K) and negative at lower levels. The largest bias is 0.39 K at 300 hPa. Geopotential height bias errors range from −2.5 to −6.4 m up to 300 hPa and are small and positive at 200 hPa. Model relative humidity is too high at all levels, ranging from +4.3 to +10.8 %, except at 200 hPa, where the bias is −7.6 %. Low-level wind speeds have a small positive bias of up to +0.17 m s−1, with the bias remaining small but negative at 300 and 500 hPa. A positive bias of +0.36 m s−1 exists at 200 hPa. RMSEs are modest and reasonably uniform at all levels: 0.98 to 1.80 K for temperature, 11.4 to 18.1 m for geopotential height, 14.1 to 22.7 % for relative humidity, and 3.00 to 4.88 m s−1 for wind speed. We note that near-surface temperature errors are larger than those from each level of the free atmosphere. This may be a result of deficiencies in boundary layer and surface energy processes or the representation of clouds above the PBL that affects surface radiation budgets. Each of these potential sources may contribute to the episodes of large temperature errors at Deadhorse (Fig. 10a) and Barrow (Fig. 12a).

Validation of Arctic modeling in literature: upper-air variables
Bromwich et al. (2015) evaluated version 1 of the ASR and ERA-Interim against upper-level observations for December 2006 to November 2007. For temperature (values here refer to levels at 200 hPa and below to aid in direct comparison), they report mostly small negative biases of magnitude less than 0.2 K at all levels, with RMSEs ranging from 0.69 to 1.94 K. For geopotential height, bias values are approximately 2 m or less in magnitude, with RMSEs between 7 and 22 m. For relative humidity, biases are of mixed sign but generally less than 2 %, while RMSEs range from approximately 9 to 23 %. For wind speed, biases are almost exclusively negative with magnitudes less than 0.5 m s−1; RMSEs are largest at high altitudes at approximately 3 m s−1. While these values generally are in good agreement with the CARVE WRF v3.4.1 values reported in Table 5, the CARVE biases for relative humidity and geopotential height are larger. CARVE errors are similar to those reported by Wilson et al. (2011). They reported a magnitude of temperature bias through the entire column of less than 1 K, with RMSEs of 1-2 K. Bias for geopotential height ranged from −11 to +40 m and RMSEs ranged from 20 to 50 m. For relative humidity, they note that obtaining accurate measurements in cold conditions, such as often found in the Arctic, is challenging, thus making this field difficult to verify. Limiting to those (warmer) levels below 500 hPa, they report biases of less than 5 %, with RMSEs of 15-20 %. Biases in wind speed range from −1.1 to 1.9 m s−1.

Representation of small-scale features
Recent years have seen the emergence of reanalysis data sets with increasingly high resolution. These products typically have grid spacing of approximately 30 km and, as described above, generally compare favorably to observations. Bromwich et al. (2015) do note, however, that regions of complex topography still pose a modeling challenge, in part due to poor model representation of terrain and local wind effects. Here we demonstrate the ability of the WRF model on a 3.3 km grid to reasonably represent a damaging windstorm that is tied closely to mountain wave activity, while the downslope windstorm is absent on a coarse 30 km grid with resolution comparable to current reanalyses. It is our intent to illustrate the increased realism possible with higher-resolution modeling and suggest that transport studies will, in general, benefit from the increased detail in flow fields.
September 2012 featured the passage of several strong extratropical cyclones through Alaska. During the evening of 16 September, damaging downslope winds (e.g., 62 kn at 13:53 AKST at Delta Junction and an unofficial report of 99 kn around 23:00 AKST at Dry Creek) were reported across extensive regions to the north of the Alaska Range (NOAA, 2012). The winds were associated with a deep low-pressure system of central pressure 975 hPa that moved along the west coast of Alaska. For 5 h starting around 23:00 AKST (08:00 UTC 17 September), the village of Tanacross observed severe wind gusts from a generally southerly direction. This event was noteworthy because of the presence of strong winds both at high elevations and also in the valleys where the village of Tanacross is located. It is hypothesized that weak static stability during the autumn months (relative to winter) and the absence of strong surface-based temperature inversions may have contributed to the occurrence of the damaging wind event at Tanacross, which is located farther east than the favored high elevation locations of wind storms in the region (R. Thoman, personal communication, 2013).
Here we demonstrate the improved realism of the downslope wind event that is afforded by use of high-resolution modeling. Mass et al. (2002) discussed similar benefits related to flow over orography that are associated with decreasing WRF grid spacing from 36 to 12 to 4 km. They also note that, while realism is improved, skill scores from traditional verification techniques are often degraded due to imperfect timing and placement of small-scale features.
To quantify the impact of model resolution on the ability of WRF to reproduce the mountain waves and subsequent strong surface winds associated with this episode, we have performed additional one-way nested runs of the non-polar WRF v3.5.1 for the period 11-17 September 2012. (Use of WRF v3.5.1 ensured that we took advantage of any software refinements and corrections inherent to any new model update, but these changes are not anticipated to have a significant effect.) The placement of 61 vertical levels follows the most recent configuration (available from http://www2.mmm.ucar.edu/rt/amps/) of the Antarctic Mesoscale Prediction System (AMPS; Powers et al., 2012). The use of one-way nesting enables a direct comparison of the effects of grid spacing, whereas the production runs reported on earlier employed two-way nesting aimed at optimizing model performance. Recent studies (e.g., Moore, 2013) have used the Interim Arctic System Reanalysis (ASRI) product (that has the same 30 km grid spacing as domain 1 in this study) to investigate the climatology in polar WRF of high-speed wind events. Here, however, we demonstrate that use of coarse resolution at 30 km grid spacing versus 3.3 km grid spacing leads to a much more diffuse representation of model orography, mountain waves, and low-level flow fields.
Figure 14 shows the model representation of a downslope windstorm along the Alaska Range. The 10 m wind field in the 3.3 km grid indicates extensive strong winds over and near the Alaska Range (Fig. 14c), including some locations over 70 kn, while the 30 km wind field offers a much more muted representation with maximum surface winds of about 30 kn (Fig. 14d). The spatial variation in the wind fields is considerably greater and more realistic in the higher-resolution domain. The cross section for domain 3 (Fig. 14a) clearly suggests the presence of a mountain wave in both the potential temperature and wind fields, while mountain waves are absent in domain 1 (Fig. 14b). The unrealistically smooth cross-barrier terrain profile with insufficient vertical extent is a consequence of spatial averaging on a horizontal grid of insufficient resolution to fully resolve mountain waves (e.g., Chow et al., 2012), despite the presence aloft of strong antecedent cross-barrier flow during the evening of 16 September (not shown). Indeed, a minimum of the Froude number (e.g., Vosper, 2004)

Examples of STILT footprint calculations
To illustrate the effect of WRF resolution on STILT output, we have computed footprints using the WRF configuration in Sect. 5.1 (v3.5.1 with one-way nesting) for receptors placed at the CARVE tower located to the northeast of Fairbanks (64.986° N, 147.600° W) every hour from 00:00 UTC 13 September to 23:00 UTC 17 September 2012.
Strong tropospheric-deep wind fields associated with migratory cyclones during September likely maximized the influence of orography on the near-surface receptors. The receptors in STILT are positioned at an altitude of 301 m a.g.l., as well as at the actual altitude of 32 m a.g.l., to account for the 269 m discrepancy in model height (even as represented in domain 3) versus the true height of the base of the CARVE tower. STILT footprints for both heights are computed at 0.1° horizontal resolution using two sets of WRF fields: first (a) from all domains (i.e., 30, 10, and 3.3 km grid spacing) and then (b) from just the outermost domain 1 with 30 km grid spacing. The resulting near-field footprints are summed every hour over each 24 h trajectory period. The corresponding footprints are substantially different, particularly for the 301 m receptors (Fig. 16a and b). For the receptors at 32 m (Fig. 16c and d), the footprint field is confined closer to the tower location as might be expected, making the differences appear smaller. The footprint fields generated using only the domain 1 wind field (Fig. 16b and d) are more diffuse because of the coarse grid spacing of the input wind field. While there is, obviously, no "ground truth" for establishing which footprints are more accurate, those derived from high-resolution WRF fields intrinsically can contain more horizontal spatial detail and will benefit from refined boundary layer processes and the more realistic representation of orography. At a minimum, these differences contribute to the transport uncertainties entering the measurement error budget for inverse flux estimates.
Another aspect of resolution is the impact of grid size employed in the STILT footprint calculations. For this purpose, we have utilized winds from all three domains of the v3.4.1 two-way nested runs and computed the footprints at 0.1 and 0.5° resolution for a subset of 79 receptors during the same 5-day period as above. For a direct comparison, the 0.5° footprints have been scaled by a factor of 1/25, resulting in an effective resolution of 0.1°. The two "native" STILT resolutions (0.5° coarse grid in Fig. 17a and c; 0.1° fine grid in Fig. 17b and d) result in noticeable differences in both magnitude and spatial patterns of the footprints. The 0.1° footprints (at 301 m in Fig. 17b and at 32 m in Fig. 17d) exhibit substantially more detail, making them more suitable for applications in the heterogeneous Alaskan landscape. Note the color scale has been standardized across all four panels to highlight the "washed out" look to the coarse-grid footprints.
Chang et al. (2014) used the WRF-STILT footprints (based on WRF v3.4.1, described in Sect. 3.1) with vertical profiles of the CARVE aircraft methane mixing ratios to determine methane fluxes for Alaska for 2012. This set of vertical profiles comprises receptors located from near the surface to over 5000 m a.s.l. The vertical profiles of six chemical and dynamic tracers measured by the CARVE aircraft (CH4, CO2, CO, O3, water vapor, and potential temperature) were used to identify the depth of the atmospheric column enhancement, which is defined as the well-mixed surface-influenced air from the ground to the bottom of the free troposphere. An independent estimation of the depth of the column enhancement is also provided by the height (a.s.l.) at which the WRF-STILT surface-influence function (i.e., the footprint) becomes vanishingly small (< 0.1 ppm/(µmol m−2 s−1)). The 10-day long footprint multiplied by a land mask for each of the receptors within a profile was summed to determine the total surface influence from land for that profile. Typically each profile contains between 200 and 400 individual receptors. For each of the 30 vertical profiles used by Chang et al. (2014), the WRF-STILT transport framework identifies the top of the column enhancement to within 500 m of the value identified by the CARVE aircraft in 67 % of the profiles. In Fig. 18a, we show a sample aircraft-observed vertical profile of CH4 over interior Alaska near Fairbanks with the top of the atmospheric column enhancement at approximately 2400 m a.s.l. We see a well-mixed surface-influenced layer, with free tropospheric methane mixing ratios above the top of the column enhancement. The vertical profile of the WRF-STILT influence for receptors within this flight segment (Fig. 18b) is in good agreement and demonstrates that WRF-STILT is able to capture the shape of the CH4 enhancement throughout the column, as well as the approximate depth of the column enhancement. This also ensures that the volume of air that is affected by surface emissions is well estimated, which ultimately is an important aspect of the simulation of GHG concentrations.
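The footprint-based estimate of the column-enhancement depth can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of Chang et al. (2014): the function names are hypothetical, footprints are represented as small nested lists, and the top of the enhancement is taken here to be the altitude of the highest receptor whose land-masked surface influence still reaches the 0.1 ppm/(µmol m−2 s−1) threshold.

```python
def land_influence(footprint, land_mask):
    """Sum a gridded footprint over land cells (mask is 1 for land, 0 for water)."""
    return sum(
        value * mask
        for fp_row, mask_row in zip(footprint, land_mask)
        for value, mask in zip(fp_row, mask_row)
    )

def column_top(altitudes_m, influences, threshold=0.1):
    """Altitude (m a.s.l.) of the highest receptor with influence >= threshold.

    Returns None if no receptor reaches the threshold. How the threshold is
    applied to a real profile is an assumption of this sketch.
    """
    influenced = [alt for alt, inf in zip(altitudes_m, influences) if inf >= threshold]
    return max(influenced) if influenced else None

# Hypothetical five-receptor profile with precomputed land influences.
altitudes = [500.0, 1000.0, 1500.0, 2000.0, 2500.0]
influences = [2.0, 1.5, 0.8, 0.05, 0.01]
top = column_top(altitudes, influences)

# Hypothetical 2 x 2 footprint with one water cell.
total = land_influence([[0.2, 0.3], [0.1, 0.4]], [[1, 0], [1, 1]])
```

In practice each profile holds a few hundred receptors, so the resolution of the estimated top is set by the vertical spacing of the receptors.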

Impact on CARVE chemical simulations
We also compare modeled ozone loss with measured concentrations. Ozone can be used as a chemical tracer for land influence when dry deposition is the major loss process and photochemical sources and sinks are negligible. This is the case in the lowest 1.5 km of the Arctic atmosphere, where photochemistry is approximately 10 times slower than dry deposition (Jacob et al., 1992b; Walker et al., 2012). It is also more likely in the spring and fall, when incoming solar radiation is lower, resulting in less photochemistry and lower vegetative emissions of volatile organic compounds, both of which lead to lower ozone production. By studying the lowest layer of the atmosphere in which ozone flux can be assumed constant, the ozone loss can be calculated using WRF-STILT footprints and compared to measured ozone concentrations.
Model ozone loss was calculated at each receptor location by summing the portion of the footprint that had been in contact with land and multiplying it by an initial estimate of the deposition velocity (−0.3 cm s−1) and the surface ozone concentration (determined for each flight by taking the mean ozone concentration below 500 m). Monthly results are shown in Fig. 19 for the lowest 1.5 km a.g.l. after the mean for each flight was subtracted for both the model and the observations. The best-fit lines are determined using standard major axis regression and the slope can be used to calculate the dry deposition velocity required to match the model with the observations. These computed values range from 0.1 to 0.5 cm s−1, with a mean over the 5 months of 0.3 ± 0.1 cm s−1, where the uncertainty is the 95 % confidence interval (2 times standard error). This range of deposition velocities is consistent with those measured over tundra, sub-Arctic fen, and sub-Arctic Norwegian spruce and Scots pine forests (0.18-0.60 cm s−1) (Jacob et al., 1992a; Mikkelsen et al., 2004; Tuovinen et al., 1998, 2004). Furthermore, our simulated deposition rates exhibit a realistic seasonal cycle, with lower velocities in May (Fig. 19a) when the surface is snow covered and leaves are still unexposed, and higher velocities in the warmer months (e.g., July and August, respectively, in Fig. 19c and d). This analysis provides confidence in WRF-STILT as implemented for CARVE and lends credence to applying the WRF-STILT footprints in the science analysis and flux inversions.
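The regression step can be illustrated with a short sketch. Standard major axis regression gives a slope equal to the ratio of the standard deviations of the two variables, with the sign of their covariance. The function names and sample anomalies below are hypothetical, and two assumptions are made that the text does not spell out: the observed anomalies are regressed on the modeled ones, and the implied deposition velocity is the slope multiplied by the magnitude of the initial estimate (0.3 cm s−1), since the modeled loss is proportional to that estimate.

```python
import math

def sma_slope(x, y):
    """Standard major axis slope: sign(cov(x, y)) * sd(y) / sd(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return math.copysign(math.sqrt(syy / sxx), sxy)

def implied_deposition_velocity(model_anomaly, observed_anomaly, initial_vd_cm_s=0.3):
    """Scale the initial deposition velocity by the SMA slope (assumed orientation)."""
    return initial_vd_cm_s * sma_slope(model_anomaly, observed_anomaly)

# Hypothetical flight-mean-subtracted ozone anomalies (arbitrary units).
model_anomaly = [-1.0, 0.0, 1.0]
observed_anomaly = [-1.5, 0.0, 1.5]
vd = implied_deposition_velocity(model_anomaly, observed_anomaly)
```

Unlike ordinary least squares, the standard major axis slope treats both variables symmetrically, which is a reasonable choice when model and observations both carry error.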

Conclusions
We have presented a detailed description and validation of the atmospheric transport model used to estimate surface-atmosphere CO2 and CH4 fluxes from CARVE airborne and tower observations. Polar WRF was run on a 3.3 km grid centered over Alaska to generate high-resolution atmospheric fields for input to the STILT transport model. Aircraft and tower-based receptor locations from 2012 and 2013 formed the starting points of backward trajectory computations by the STILT transport model. Model upgrades are ongoing. Related papers provide more details about research enabled by observations and modeling from the CARVE campaigns (Chang et al., 2014; Miller et al., 2015; Karion et al., 2015).
While the bulk statistics computed here for 2012 and 2013 CARVE model fields cannot be compared directly with published values that use different time periods and observations, the error magnitudes are in general agreement with others in the recent literature. The grid spacing of 3.3 km used for CARVE represents an order of magnitude increase in model resolution compared to standard reanalysis products. The high resolution permits more realistic depiction of flow, including the explicit modeling of downslope windstorms that are absent in coarser-scale model grids. The substantial influence of high-resolution model wind fields input to STILT on footprint fields is demonstrated, as is the increased detail when the footprints themselves are computed on a finer-scale grid. These approaches are likely to be beneficial in the complex orography and surface flux patterns of Alaska and the Arctic in general. These preliminary modeling results will be refined in future CARVE years to form a consistent modeling database of WRF simulations and STILT-based footprints that extends from the beginning to the end of the CARVE field campaigns.
The measurement-observation system developed for CARVE ultimately is applicable to other regions of the Arctic, such as the Mackenzie Delta in the Northwest Territories of Canada, Scandinavia, and Siberia. The entire data set is publicly available from NASA through the CARVE data portal (https://ilma.jpl.nasa.gov/portal/). The modeling framework is available to the general Arctic research community and the planned ABoVE NASA mission.
The Supplement related to this article is available online at doi:10.5194/acp-15-4093-2015-supplement.

Figure 2. CARVE flight tracks from (a) 2012 and (b) 2013 superimposed on WRF innermost domain 3, and (c) placement of nested WRF domains used for CARVE modeling, with model topography field (shaded, m) for outermost domain 1 (30 km grid spacing). Nested subdomains are shown by green rectangles: domain 2 (d2) with 10 km grid spacing covers eastern Russia and western Canada, while innermost domain (domain 3, d3, 3.3 km grid spacing) covers mainland Alaska. Innermost domain model topography field (shaded, m) is shown in (a) and (b).
.1 for 2013. Hereafter, we denote the "2012 campaign" as being the period 10 May-2 October 2012 and the "2013 campaign" as being the period 1 March-30 November 2013. Statistics are also compiled separately for each month (the 2 days in October 2012 are included only in the full 2012 campaign statistics). Detailed statistics based on the 2012 WRF v3.4.1 simulations are provided below. Results for v3.5.1 WRF simulations are, in general, similar and tabular statistics for the v3.5.1 2012 and 2013 campaigns are available in the Supplement.

Figure 3. WRF v3.4.1 model bias by time of day (UTC; subtract 8 h for Alaska Daylight Time, AKDT) for (a) temperature (K), (b) dewpoint temperature (K), and (c) wind speed (m s−1). Each curve and color indicates a different period of time (either monthly or the full 2012 campaign) over which the errors are computed.
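A bias-by-time-of-day curve of the kind shown in Fig. 3 can be sketched as the mean of model minus observation within each UTC hour. The snippet below is a minimal illustration only: the function names `bias_by_hour` and `utc_to_akdt` and the sample values are hypothetical and are not taken from the CARVE verification software or data.

```python
from collections import defaultdict


def bias_by_hour(records):
    """Mean (model - observation) for each UTC hour.

    records: iterable of (utc_hour, model_value, observed_value) tuples.
    Returns a dict mapping UTC hour -> mean bias, sorted by hour.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for hour, model, obs in records:
        sums[hour] += model - obs
        counts[hour] += 1
    return {h: sums[h] / counts[h] for h in sorted(sums)}


def utc_to_akdt(utc_hour):
    """Convert a UTC hour (0-23) to AKDT by subtracting 8 h, wrapping at midnight."""
    return (utc_hour - 8) % 24


# Hypothetical 2 m temperature pairs (K); illustrative values, not CARVE data.
sample = [
    (0, 285.0, 286.5),
    (0, 284.0, 285.5),
    (12, 275.0, 275.5),
]

hourly_bias = bias_by_hour(sample)
print(hourly_bias)      # negative values mean the model is colder than observed
print(utc_to_akdt(0))   # local (AKDT) hour corresponding to 00:00 UTC
```

Running the same grouping separately on each month and on the full campaign would give one curve per period, as in the figure.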

Figure 14. Cross sections valid at 11:00 UTC 17 September 2012 of model potential temperature (contoured every 2 K) and horizontal wind speed (shaded in knots) for (a) domain 3 and (b) domain 1. Plan-view maps for (c) domain 3 and (d) domain 1 of sea level pressure (SLP, contoured) and 10 m wind field (shaded in knots with standard representation of wind barbs; every second grid point shown for domain 3). Red lines near 143° W in (c) and (d) denote locations of cross sections along grid column 439 of domain 3 and grid column 220 of domain 1, respectively. Model surface pressure trace shown as joined dark blue circles on (a) and (b). Locations of reference orange and light blue circles on surface pressure trace (a and b) are shown in (c) and (d). Line near 144.5° W on (c) shows location of cross section in Fig. 15. Locations of Tanacross (63.38° N, 143.36° W) and Dry Creek (63.68° N, 144.60° W) denoted by a black-outlined circle and solid black circle, respectively, in (c) and (d).

Figure 17. As in Fig. 16, except for the aggregate of 79 CARVE tower footprints on a 0.5° grid (a and c) and 0.1° grid (b and d). The 0.5° footprints are scaled by a factor of 1/25. Winds from all three domains of two-way nested WRF v3.4.1 runs are used.
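The caption does not state why the factor is 1/25, but it matches the number of 0.1° cells in one 0.5° cell (5 × 5 = 25). If the footprint value of a cell accumulates the influence of everything inside it, a coarse-cell value is roughly the sum over its 25 fine cells, and dividing by 25 puts both grids on a comparable scale. The sketch below illustrates that arithmetic under this assumption; `coarsen_sum` and the uniform test field are hypothetical and are not part of STILT.

```python
def coarsen_sum(fine, factor):
    """Sum non-overlapping factor x factor blocks of a 2-D grid (list of lists)."""
    n_rows, n_cols = len(fine), len(fine[0])
    if n_rows % factor or n_cols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    coarse = []
    for i in range(0, n_rows, factor):
        row = []
        for j in range(0, n_cols, factor):
            # Accumulate every fine cell that falls inside this coarse cell.
            row.append(sum(fine[a][b]
                           for a in range(i, i + factor)
                           for b in range(j, j + factor)))
        coarse.append(row)
    return coarse


FACTOR = 5  # 0.5 degree / 0.1 degree cells per side

# Hypothetical uniform footprint on a 10 x 10 patch of 0.1-degree cells.
fine = [[2.0] * 10 for _ in range(10)]

coarse = coarsen_sum(fine, FACTOR)                          # 2 x 2 grid of block sums
scaled = [[v / FACTOR ** 2 for v in row] for row in coarse]  # apply the 1/25 factor

print(coarse)
print(scaled)
```

For a uniform field the scaled coarse values equal the fine-grid values, which is the sense in which the 1/25 factor makes the two sets of panels comparable.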

Figure 18. Vertical profiles near Fairbanks, AK, on 21 August 2012 of (a) methane mixing ratio based on observations from the CARVE aircraft (in black), and (b) enhancements to the methane mixing ratio based on an aggregation of 370 WRF-STILT footprints (in red). Dashed black line at approximately 2400 m in each panel represents the top of the atmospheric column enhancement as determined using observations from panel (a). Red dashed line in panel (b) at approximately 2000 m denotes the top of the atmospheric column enhancement as defined by WRF-STILT.

Figure 19. Deviation of modeled and measured ozone concentrations (ppb) from the means of each flight for each month of the 2012 campaign (a-e). Computed deposition velocity (cm s−1) is shown in the lower right corner of each panel.
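The deviations plotted in Fig. 19 are taken relative to the mean of each flight, which removes flight-to-flight offsets before modeled and measured values are compared. A minimal sketch of that step follows; the function name, flight labels, and ozone values are hypothetical and only illustrate the subtraction.

```python
from collections import defaultdict


def deviations_from_flight_mean(samples):
    """Subtract each flight's mean from its own samples.

    samples: list of (flight_id, value) pairs.
    Returns the deviations in the same order as the input.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for flight, value in samples:
        totals[flight] += value
        counts[flight] += 1
    means = {f: totals[f] / counts[f] for f in totals}
    return [value - means[flight] for flight, value in samples]


# Hypothetical ozone samples (ppb) from two flights; not CARVE data.
ozone = [
    ("flight_a", 30.0),
    ("flight_a", 34.0),
    ("flight_b", 40.0),
    ("flight_b", 44.0),
    ("flight_b", 42.0),
]

print(deviations_from_flight_mean(ozone))
```

The same function would be applied separately to the modeled and the measured series so that each is expressed relative to its own per-flight mean.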

Table 3. Count of observed wind speeds (ws) by wind speed (in knots) for the 2012 campaign.