Atmospheric carbon cycle dynamics over the ABoVE domain: an integrated analysis using aircraft observations (Arctic-CAP) and model simulations (GEOS)

The Arctic Carbon Atmospheric Profiles (Arctic-CAP) project conducted six airborne surveys of Alaska and northwestern Canada between April and November 2017 to capture the spatial and temporal gradients of northern high-latitude carbon dioxide (CO2), methane (CH4) and carbon monoxide (CO) as part of NASA's Arctic-Boreal Vulnerability Experiment (ABoVE). The Arctic-CAP sampling strategy involved acquiring vertical profiles of CO2, CH4 and CO from the surface to 5 km altitude at 25 sites around the ABoVE domain at 4- to 6-week intervals. We observed vertical gradients of CO2, CH4 and CO that vary by ecoregion and over the course of the sampling period, which spanned the majority of the seasonal cycle. All Arctic-CAP measurements were compared to a global simulation using the Goddard Earth Observing System (GEOS) modeling system. Comparisons with GEOS simulations of atmospheric CO2, CH4 and CO highlight the potential of these multi-species observations to inform improvements in surface flux estimates and the representation of atmospheric transport. GEOS simulations provide estimates of the near-surface average CO2 and CH4 enhancements that are well correlated with aircraft observations (R=0.74 and R=0.60, respectively), suggesting that GEOS has reasonable fidelity over this complex and heterogeneous region. This model-data comparison over the ABoVE domain reveals that while current state-of-the-art models and flux estimates are able to capture broad-scale spatial and temporal patterns in near-surface CO2 and CH4 concentrations, more work is needed to resolve the fine-scale flux features that are observed. The study also provides a framework for benchmarking a global model at regional scales, which is needed to use climate models as tools to investigate high-latitude carbon-climate feedbacks.
https://doi.org/10.5194/acp-2020-609. Preprint. Discussion started: 11 September 2020. © Author(s) 2020. CC BY 4.0 License.


Introduction
Accelerated Arctic system change (Hinzman et al., 2013), coupled with the vast quantities of carbon sequestered in the permafrost soils of the northern high latitudes (Hugelius et al., 2014), has led to concerns about the potential for significant carbon emissions due to changes in ecosystems, permafrost and large-scale disturbances like fires (Schuur et al., 2015; McGuire et al., 2018; Turetsky et al., 2020). Our understanding of the magnitude and behavior of the carbon system response to these changes is rudimentary. For instance, release of carbon from the permafrost pool could result in increased emissions of CH4 from anaerobic degradation; increased emissions of CO2 from aerobic degradation; increased uptake of carbon due to new availability of nutrients and above-ground ecosystem growth; or an increase in mobilization of carbon through runoff. Alternatively, increases in disturbances such as fires may significantly impact below-ground carbon storage, uptake of CO2 and emissions of CH4, CO, and CO2. Limitations in our understanding of the accuracy of modeled fluxes of CO2, CO and CH4 have increased uncertainties in predictions of the magnitude of Arctic carbon-climate feedbacks (e.g., Koven et al., 2011; Lawrence et al., 2015; Schaefer et al., 2014; Schneider von Deimling et al., 2012; Schuur et al., 2015). The lack of observations represents a significant limitation that leads to both enhanced uncertainty and reduced fidelity in our model simulations. In general, land- and ocean-atmosphere fluxes from climate models are most commonly evaluated using flux measurements made with eddy covariance or flux chamber techniques.
While flux measurements of these types are widely available over many ecosystem types, they represent the impact of limited spatial domains that are rarely more than a 1000 m radius around a given site (Gockede et al., 2005; Schmid, 2002) and may be significantly smaller depending on topography, wind direction and boundary layer stability. Land surface inhomogeneities within these small footprints (Baldocchi et al., 2005) and regional-scale (100-1000 km) variability of these ecosystems can lead to significant biases when eddy covariance measurements are scaled up to represent large areas. This is especially true in the Arctic, where microtopography can result in fluxes varying by orders of magnitude on a scale of 1-100 meters (Johnston et al., 2014). While flux towers can be found in many ecosystem types, they do not necessarily represent landscape-scale heterogeneity within a broadly defined ecosystem such as the boreal forest, peatlands, or tundra regions of the Arctic. An alternative to the "bottom-up" evaluation approach is the "top-down" approach, which makes use of atmospheric measurements of species like CO2, CH4 and CO and modeled atmospheric transport patterns to infer the surface fluxes needed to reproduce observed atmospheric concentrations. This inverse approach generally takes a forward-flux model, or a set of observations that are likely correlated with the flux, as a prior or first guess. The inverse approach then estimates the flux by scaling the prior. While the inverse approach results in a flux estimate that meets the constraint of the trace gas measurements and modeled transport, the variability in surface flux from these analyses cannot be directly attributed to mechanisms such as temperature changes, CO2 fertilization and water stress.
Also, inverse methods are influenced by errors in atmospheric transport and assumptions about error covariances, which are difficult to characterize (Gourdji et al., 2012;Lauvaux et al., 2012;Mueller et al., 2018;Chatterjee and Michalak, 2013).
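The prior-scaling step of the inverse approach can be illustrated with a deliberately minimal sketch. It assumes a single scalar scaling factor and a precomputed vector of enhancements obtained by transporting the prior flux; both are simplifications introduced here for illustration, and the function name `scale_prior` is hypothetical. Real inversions solve for many scaling factors and carry explicit error covariances, which this sketch ignores.

```python
import numpy as np

def scale_prior(obs_enh, prior_enh):
    """Least-squares scalar that best maps prior-driven enhancements onto observations.

    obs_enh   : observed enhancements (e.g. ppm above background)
    prior_enh : enhancements simulated by transporting the prior flux
    A single scalar and unweighted least squares are illustrative assumptions.
    """
    obs_enh = np.asarray(obs_enh, dtype=float)
    prior_enh = np.asarray(prior_enh, dtype=float)
    return float(prior_enh @ obs_enh / (prior_enh @ prior_enh))

# Synthetic, noise-free example in which the "true" flux is 1.5 times the prior.
prior_enh = np.array([1.0, 2.0, -0.5, 3.0])
obs_enh = 1.5 * prior_enh
scale = scale_prior(obs_enh, prior_enh)
```

Because the recovered factor only rescales the prior, it says nothing about which process (temperature, fertilization, water stress) caused the adjustment, which is the attribution limit noted above.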
In this study, we explore a hybrid approach using atmospheric trace gas mole fractions from aircraft profiles and an advanced global tracer transport model to evaluate the ability of current state-of-the-art bottom-up land-surface flux models to capture complex carbon cycle dynamics over the northern high latitudes. NASA's Goddard Earth Observing System (GEOS) general circulation model (GCM) is used with a unique combination of surface flux components for CO2, CH4 and CO to create 4D atmospheric fields; these fields are subsequently evaluated using profiles collected during the Arctic Carbon Atmospheric Profiles (Arctic-CAP) airborne campaign. Both the Arctic-CAP project and the GEOS model runs for the domain are part of NASA's Arctic Boreal Vulnerability Experiment (ABoVE, www.above.nasa.gov), a decade-long research program focused on evaluating the vulnerability and resiliency of the Arctic tundra and boreal ecosystems in western North America. One of the primary objectives of the ABoVE program is to better understand the major processes driving observed trends in Arctic carbon cycle dynamics, in order to understand how the ecosystem is responding to environmental changes and to characterize the impact of climate feedbacks on greenhouse gas emissions. ABoVE has taken two approaches to better understand critical ecosystem processes vulnerable to change. The first is through ground-based surveys and monitoring sites in representative regions of the ABoVE domain. These multi-year studies provide a backbone for intensive investigations, such as airborne deployments. The Arctic-CAP campaign discussed here was one such airborne deployment, conducted during the spring, summer and fall of 2017 (Section 2.1). The subsequent analysis described here illustrates how improvements in surface models developed through ground-based surveys and monitoring sites can be evaluated and tested over larger spatial scales using such aircraft profiles (Section 3).
This study uses Arctic-CAP aircraft profiles to directly evaluate both the transport model and the terrestrial surface flux models of CO2, CH4 and CO. For the sake of demonstration, we rely on one transport model and one flux scenario for each tracer (i.e., CO2, CH4 and CO) to show the utility of the three carbon species to diagnose and identify deficiencies in both flux and transport models. Ongoing and future studies build upon the results discussed here and further diagnose transport and flux patterns from multiple models based on additional aircraft and ground-based observations throughout the ABoVE domain.

Arctic-CAP Flight Planning and Sampling Strategy
Arctic-CAP was designed to measure vertical profiles of atmospheric CO2, CH4 and CO mole fraction to capture the spatial and temporal variability of carbon cycle dynamics (Parazoo et al., 2016; Sweeney et al., 2015) across the ABoVE domain. Flights were conducted aboard a Scientific Aviation Mooney Ovation 3 (N617DH). Six campaigns were performed during 2017: late April to early May, June, July, August, September, and late October to early November. Arctic-CAP flights surveyed the ABoVE Study Area and were organized around an Alaskan circuit and a Canadian circuit (Fig. 1). [...] measurements of atmospheric CO2 and CH4. The Arctic-CAP Canadian circuit focused on flying over sites in and around the Inuvik and Yellowknife areas in the Canadian Arctic. In the Inuvik region, the aircraft overflew the Trail Valley Creek and Havikpak Creek research sites, and the Daring Lake and Scotty Creek flux tower sites were overflown on the way to and from the Yellowknife area. The Canadian circuit expands upon the ecoregions covered in the CARVE missions to include the Boreal Cordillera, Taiga Plain, Taiga Shield and the Southern Arctic Tundra ecoregions. Approximately 25 vertical profiles were acquired during each campaign (Fig. 2). The majority of each flight day was spent in the well-mixed boundary layer, with 2-4 vertical profiles up to altitudes of 5000 m above sea level (masl). Using missed approaches to get as near to the ground as possible, the profiles diagnosed the temporal change in the boundary layer as well as the residual layers above, where surface fluxes may have recently (< 3 days) influenced that atmospheric column. The focus of this study will be on the CO2, CH4 and CO data acquired during Arctic-CAP.

Aircraft and Payload
Arctic-CAP flights were performed in a Mooney Ovation 3 (N617DH, Scientific Aviation). The Mooney operated at a cruise speed of 170 kts and reached profile altitudes of 5 km (17,000 feet) on each flight, with most legs lasting 4-5 hours and covering an average distance of ~1350 km. The average ascent and descent rates were limited to ~100 m/min to minimize hysteresis in the temperature and relative humidity measurements. The basic research payload flown on all six research missions included continuous in situ CO2, CH4, CO, H2O, temperature and horizontal winds. The in situ measurements followed the methodology described in Karion et al. (2013), and wind measurements followed the protocol outlined in Conley et al. (2013). Programmable flask packages (PFPs; Sweeney et al., 2015) provided an independent check of the calibration scale of the continuous in situ CO2, CH4 and CO measurements, as well as samples for more than 50 different species including N2O, SF6, and a variety of hydrocarbons, halocarbons and isotopes of carbon (Sweeney et al., 2020). Carbonyl sulfide measured in the flask samples can be used as a tracer of gross primary productivity (GPP) (Montzka et al., 2007), while ethane, propane and the C-13 isotope of CH4 provide another constraint on the source of the CH4 emissions. Each flight sampled a single 12-flask package, providing a total of ~84 flasks per research mission to better understand the factors controlling local fluxes of CO2, CH4 and CO and the long-range transport of these species from low latitudes.
[...] Modern-Era Retrospective Analysis for Research and Applications (MERRA) (Rienecker et al., 2011) and MERRA-2 (Bosilovich et al., 2015; Gelaro et al., 2017). The GEOS Forward Processing (GEOS FP) system produces atmospheric analyses and 10-day forecasts in near real-time, which are used to provide forecasting support to NASA field campaigns and satellite instrument teams (e.g., Strode et al., 2018). GEOS has also been used extensively to study atmospheric carbon species (e.g., Allen et al., 2012; Ott et al., 2015; Weir et al., 2020).
The GEOS setup utilized in this work simulates CO2, CO and CH4 simultaneously at a nominal 0.5° horizontal resolution with 72 vertical layers (up to ~0.1 hPa), with trace gas output saved every 3 hours. For CO2, the surface fluxes consist of five components from a Low-order Flux Inversion (LoFI) package (Weir et al., 2020): 1) net ecosystem exchange (NEE) from the Carnegie Ames Stanford Approach-Global Fire Emissions Database (CASA-GFED) model with a parametric adjustment applied to match the atmospheric growth rate (Weir et al., 2020), 2) anthropogenic biofuel burning emissions, i.e., harvested wood product (Van Der Werf et al., 2003), [...] year 2017. As shown later, this is a reasonable assumption considering that for the majority of the ABoVE domain, the most critical CH4 emissions are from the wetlands sector. On the other hand, care was taken to use a version of the LPJ-wsl model that includes a state-of-the-art hydrology subroutine (TOPMODEL) to determine wetland area and its inter- and intra-annual dynamics (Zhang et al., 2016), and a permafrost and dynamic snow model (Wania et al., 2009) with explicit representation of the effects of snow and freeze/thaw cycles on soil temperature and moisture, and thus on the CH4 emissions. Table 1 provides a summary of the flux components, their specifications and associated references.
(Tables 2 and 3) and compared against existing studies and estimates to establish the fidelity of the model fluxes for large-scale assessments.

CO2 flux estimates for 2017 (PgC)
Region                Land Sink    Fuel Sources
ABoVE Domain            -0.32          0.11
pan-Arctic (>48 N)      -1.84          1.37
Global                  -3.28         11.08

CO2 flux estimates indicate that the ABoVE domain is a 0.32 PgC sink for our study year, 2017. This represents about 17% of the calculated pan-Arctic terrestrial carbon sink, which is consistent with the fraction of the land area > 48 N represented by the ABoVE domain (~16%). Perhaps more significantly, the 1.84 PgC pan-Arctic sink represents 56% of the global sink for 2017. We attribute this large uptake to the vast boreal forests > 48 N, particularly in Siberia (Sasakawa et al., 2013), whereas the contemporary Arctic tundra is thought to be nearly carbon neutral, with uncertainties allowing for a small to moderate sink or a small source (McGuire et al., 2016). These findings are also consistent with Wunch et al. (2013), who used GOSAT satellite data and TCCON ground-based column measurements to determine that interannual variability in Northern Hemisphere CO2 uptake was dominated by changes in the boreal forest. More recent studies, such as Welp et al. (2016) and Commane et al. (2017), have also used atmospheric inversions to highlight that >90% of the carbon sink in the northern high latitudes resides in the boreal forests. Our simple forward model simulations and the Arctic-CAP data provide a unique opportunity to assess the validity of these previous findings over the ABoVE domain. Sub-regional flux estimates within the ABoVE domain are part of ongoing investigations and will be captured in future studies.
[...] 1 in Saunois et al., 2019); instead, we adopt a much more simplistic approach of repeating the EDGARv4.3.2 emissions from 2012 for the year 2017. In contrast to the emissions from the coal, oil and gas sector, our wetland methane emissions are obtained from the LPJ-wsl model (Table 1). LPJ-wsl is one of the prognostic models that provide wetland emission estimates to the global methane budget [...] (Table 4). Figure 4 presents the composite vertical profile data for each campaign.
The monthly composite CO2, CH4 and CO vertical profiles capture the expected variations in the seasonal cycle. For CO2 and CH4, the composite profiles also show more variability in the boundary layer (altitudes < 3000 masl), both within each month and across months, than in the free troposphere (altitudes > 3000 masl). Unlike CO2 and CH4, CO variability in the free troposphere is significantly greater in July and October than in the boundary layer, indicating either long-range transport of CO or CO injected high (>3000 masl) into the troposphere by local wildfires. A clearer picture of the vertical gradients between the free troposphere and the boundary layer can be seen by subtracting free tropospheric means from measurements below 3000 masl. The CO2 gradients between the measurements below 3000 masl and average daily free troposphere values show a drawdown in the boundary layer for most of the profiles, starting in June and lasting until the end of the September campaign (Fig. 5). The drawdown signal in CO2 over the Northern Alaska Tundra (often referred to as the "North Slope") was most pronounced in mid-July and continued through the September campaign. The CO2 drawdown in the more southerly regions of the Boreal Cordillera and Alaskan Boreal Interior peaked in August. By the October campaign, many regions were showing significant enhancements in the boundary layer CO2 mole fraction relative to the free troposphere. On the other hand, for both CH4 and CO, significant enhancements were observed from June through early November. Over the Northern Alaska Tundra, CH4 enhancements were observed from July onward, consistent with patterns observed at the long-term surface monitoring station in Utqiaġvik. Similarly, boundary layer CO2 and CH4 are both most enhanced in September and October on the Northern Alaska Tundra. Due to the high variability in CO above 3000 masl during July and October (Fig. 4), it is more difficult to use this approach to derive CO enhancements from surface fluxes. To avoid the impact of fire-based CO that has been injected into the free troposphere, the mean background value for CO is taken from measurements above 4000 masl. This analysis shows that the Canadian Taiga and Alaskan Boreal Interior are the predominant sources of boundary layer CO emissions, likely reflecting fires in these regions at that time. It should be noted that large enhancement values for CO2, CH4 and CO were observed within the Alaskan Boreal Interior; these were the result of samples taken in the early morning (10:00 local time), before the boundary layer had fully developed (typically around 11:00-12:00 local time). This trapping of night-time emissions results in significant enhancements that quickly taper off with altitude. These measurements were typically taken during the first profile out of Fairbanks, where the majority of the Arctic-CAP flights originated.
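The background subtraction described above can be sketched as follows. This is a minimal illustration on synthetic numbers, not the processing code used for the campaign; the function name and the example profile are hypothetical, while the 3000 masl and 4000 masl thresholds are the ones quoted in the text.

```python
import numpy as np

def boundary_layer_gradient(alt_m, mole_frac, bg_floor_m=3000.0, bl_ceiling_m=3000.0):
    """Subtract the mean of samples above bg_floor_m from samples below bl_ceiling_m.

    Returns the altitudes below the ceiling and their differences from background.
    """
    alt_m = np.asarray(alt_m, dtype=float)
    mole_frac = np.asarray(mole_frac, dtype=float)
    background = mole_frac[alt_m > bg_floor_m].mean()
    below = alt_m < bl_ceiling_m
    return alt_m[below], mole_frac[below] - background

# Synthetic CO2 profile (ppm) with summer drawdown near the surface.
alt = np.array([200.0, 1000.0, 2000.0, 3500.0, 4500.0])
co2 = np.array([398.0, 400.0, 403.0, 405.0, 407.0])

# CO2/CH4-style background: mean of everything above 3000 masl.
_, grad_3000 = boundary_layer_gradient(alt, co2)
# CO-style background: only samples above 4000 masl, to step over lofted fire plumes.
_, grad_4000 = boundary_layer_gradient(alt, co2, bg_floor_m=4000.0)
```

Negative values indicate drawdown relative to the free troposphere and positive values indicate enhancement. Raising the background floor changes only the reference value, not which samples are treated as boundary layer.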

Model Data Comparisons
Aircraft profiles that measure the gradient from the boundary layer into the free troposphere are particularly useful for evaluating atmospheric models and for separating errors and uncertainties related to atmospheric transport and surface flux model simulations. This is demonstrated by comparing surface flux models for CO2, CH4, and CO using a single GCM to evaluate both the transport features and the flux model specifications. For the data-model comparison, the aircraft observations are aggregated to 10 s averages and the model 4D fields (latitude, longitude, altitude and time) are sampled at the location and time of those 10 s averages. Sampling with the 10 s averages reduces the spatial representativeness error between the model grid cell and the aircraft observations.
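The two steps of this comparison, averaging the observations to 10 s and sampling the model at the resulting points, can be sketched as below. The text does not state how the model fields were interpolated, so the nearest-neighbour lookup is an assumption made here for simplicity, and both function names and the toy grid are hypothetical.

```python
import numpy as np

def ten_second_means(t_s, values, window_s=10.0):
    """Average samples into consecutive window_s-second bins, skipping empty bins."""
    t_s = np.asarray(t_s, dtype=float)
    values = np.asarray(values, dtype=float)
    bin_id = np.floor((t_s - t_s[0]) / window_s).astype(int)
    n_bins = bin_id.max() + 1
    sums = np.bincount(bin_id, weights=values, minlength=n_bins)
    counts = np.bincount(bin_id, minlength=n_bins)
    keep = counts > 0
    return sums[keep] / counts[keep]

def sample_nearest(field, axes, point):
    """Nearest-neighbour lookup in a gridded field (an assumed sampling scheme).

    field : array indexed as [time, alt, lat, lon]
    axes  : tuple of 1-D coordinate arrays in that same order
    point : (time, alt, lat, lon) of one aircraft average
    """
    idx = tuple(int(np.abs(np.asarray(ax) - p).argmin()) for ax, p in zip(axes, point))
    return field[idx]

# Twenty 1 Hz samples collapse to two 10 s means.
obs_means = ten_second_means(np.arange(20.0), np.arange(20.0))

# Toy 2x2x2x2 model field with 3-hourly time steps and 0.5 degree spacing.
field = np.arange(16).reshape(2, 2, 2, 2)
axes = (np.array([0.0, 3.0]), np.array([0.0, 1000.0]),
        np.array([60.0, 60.5]), np.array([-150.0, -149.5]))
model_value = sample_nearest(field, axes, (2.0, 900.0, 60.1, -149.6))
```

A production comparison would more likely interpolate in time and space rather than snap to the nearest cell; the sketch only shows where the observation coordinates enter.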

Point by Point Comparison
In the GEOS model run used for these comparisons, an effort was made to match the global atmospheric burdens of CO2, CH4 and CO; however, given the uncertainties in the sources and sinks of these trace gases and in the representation of long-range and local atmospheric transport, it is not uncommon to have mean offsets between the observed and the modeled mole fractions. To evaluate surface fluxes in the ABoVE domain, it is important to consider both the impact of regional-scale fluxes and the long-range transport processes that control the mole fractions of CO2, CH4 and CO throughout the ABoVE domain. A time series comparison of the modeled and the observed CO2, CH4 and CO mole fractions (Fig. 6) suggests that gross features of the seasonal cycles are matched, although some significant differences require detailed analysis that considers different elements of each vertical profile.

Free Troposphere Comparisons
As demonstrated by the analysis of the boundary layer enhancements, it is useful to subtract the average free tropospheric mole fraction from each profile to better understand the local influences within a particular profile. Differences in the mean free tropospheric values, however, can be a valuable indicator of how large-scale biases in the model influence point-to-point comparisons.
In the case of CO2, the mean daily CO2 mole fraction in the observed free troposphere increases faster than the modeled values over the course of the six research missions. The largest offset exceeds a mean value of ~2 ppm (observed minus modeled) during the September campaign (Fig. 7). Based on the available model runs, it is difficult to diagnose what causes this offset, although a few hypotheses can be put forward. Given the decreasing latitudinal gradient for CO2 in the free troposphere at this time of year, the offset could be explained by sluggish meridional transport in the model. Alternatively, exaggerated biological uptake in the model in regions outside the study area could be pulling down the CO2 in the modeled free troposphere more rapidly than the drawdown observed over the ABoVE domain. Measured CH4 also increases faster than modeled CH4 over the course of the campaign. Given the decreasing meridional gradient for CH4 that exists during the summer months, sluggish transport could explain the difference between model and observations. Alternatively, modeled June-July-August emissions of CH4 in areas surrounding the ABoVE domain could be underestimated, leading to a slower increase in modeled free tropospheric CH4. Finally, the difference between modeled and observed mole fractions of CO in the free troposphere is mainly driven by inaccuracies in the modeled CO from fire plumes both within and outside the ABoVE domain. Figures 4, 6 and 7 show observations of large CO enhancements above 4000 masl during the July, August and October/November campaigns. Given the large excursions in free tropospheric CO between different profiles, local fires were likely responsible for these enhancements. Accurately simulating the injection height of fire plumes is challenging (Freitas et al., 2007; Strode et al., 2018).
The GEOS model distributes biomass burning emissions throughout the planetary boundary layer (PBL) to represent injection above the surface layer, but this method can result in underestimated local emissions for fire plumes detraining in the free troposphere. In regions remote from the ABoVE domain, emissions can be mixed and lofted by large-scale weather systems, which may explain why the model performs better in simulating long-range CO plume transport than it does in capturing the CO enhancements from local fires. The observation-model mismatch is likely compounded by the inability of the model to accurately simulate the subgrid-scale vertical mixing necessary for capturing vertical profiles for local sources.

Accurately modeling boundary layer mole fractions of CO2, CH4 and CO depends on an accurate representation of two key factors. First, the local surface-atmosphere flux must be modeled accurately; second, the physical evolution of the PBL, as well as horizontal transport and vertical mixing out of the PBL into the free troposphere, must be modeled correctly. GCMs have limited horizontal and vertical resolution and require parameterizations to predict both the rate of change and the absolute value of the PBL height over the course of the day. Errors in PBL mixing directly impact the tracer mole fraction estimate. Overestimation of the PBL height causes an artificial dilution of the impact of a surface flux. Conversely, underestimation of the PBL height results in amplification of the impact of a surface flux on the simulated PBL mole fraction. Additionally, GCMs typically simulate large-scale horizontal gradients more accurately than PBL height unless there are large topographic changes that occur on horizontal scales smaller than the model resolution (for GEOS, 0.5 degree). This is because such large-scale patterns are generally well constrained by the millions of in situ and satellite observations incorporated into meteorological analyses, while PBL mixing is represented by highly simplified parameterizations. The three carbon species that we investigate in this study provide different diagnostic information about the model transport and flux specifications. In the case of a gas like CO that often comes from a specific point source in the Arctic, accurate placement of the emissions, both in the horizontal and the vertical, and the modeled wind direction are critical factors. The ABoVE domain is made up of large expanses of forest and tundra in which CO2 fluxes are more uniformly distributed, making the transport accuracy of individual plumes a less critical factor for simulating CO2.
Accurately estimating CH4 mole fractions may be more sensitive to horizontal transport in the PBL if CH4 emissions are dominated by specific features such as lakes or wetlands, or by anthropogenic point sources from oil and gas production such as those observed on the North Slope (Floerchinger et al., 2019). However, we observed consistent PBL CH4 enhancements throughout each campaign (Fig. 5), suggesting spatial homogeneity in CH4 emissions rather than emissions from specific point sources.
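The dilution and amplification effect of PBL height errors can be made concrete with a single-box budget. This is a textbook-style sketch rather than anything taken from GEOS: it assumes a constant surface flux, a perfectly mixed PBL of fixed height, and a uniform air density computed from near-surface pressure and temperature. The flux, duration and heights in the example are hypothetical.

```python
R_GAS = 8.314  # universal gas constant, J mol-1 K-1

def pbl_enhancement_ppm(flux_umol_m2_s, hours, pbl_height_m,
                        pressure_pa=101325.0, temp_k=288.0):
    """Mole-fraction change (ppm) from a constant surface flux in a well-mixed box.

    Air density is treated as uniform with height, a simplifying assumption.
    """
    air_mol_m3 = pressure_pa / (R_GAS * temp_k)          # molar density of air
    added_umol_m2 = flux_umol_m2_s * hours * 3600.0      # tracer added per m2 of surface
    return added_umol_m2 / (air_mol_m3 * pbl_height_m)   # umol per mol of air, i.e. ppm

# Hypothetical daytime CO2 uptake of 5 umol m-2 s-1 sustained for 6 hours.
true_pbl = pbl_enhancement_ppm(-5.0, 6.0, pbl_height_m=1500.0)
deep_pbl = pbl_enhancement_ppm(-5.0, 6.0, pbl_height_m=2000.0)  # PBL height overestimated
dilution = deep_pbl / true_pbl
```

In this idealized box the simulated signal scales inversely with PBL height, so a PBL that is one third too deep recovers only three quarters of the drawdown even when the flux itself is correct.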

The advantage of vertical trace gas gradients
While individual mole fraction measurements are challenging to reproduce given errors in both modeled surface fluxes and transport, the vertical profile provides a unique opportunity for removing significant uncertainties in transport in order to better assess the surface flux model of a specific long-lived tracer.
Assuming that horizontal transport is a relatively small source of bias and that the upper part of the free troposphere (>3000 masl) is largely unaffected by local processes, it is possible to use the information in the vertical profile to reduce the effects of vertical transport. The total effect of a surface flux can be estimated by vertically integrating the net change it produces from the surface to a specific altitude that is well above the boundary layer. For this study, almost all the enhancements for CO2 and CH4 were observed below 3000 masl. By subtracting the average free tropospheric (FT) values in both the model and the measurements and averaging the resulting enhancements or depletions for each profile, mapped onto equal altitude bins from the surface to 3000 masl, we quantify a total enhancement resulting from the surface flux (Fig. 8). The resulting measured and modeled boundary layer enhancements show good matches for all three gases. The average measured enhancement in CO2 and CH4 below 3000 masl is correlated with the forward model such that more than 50% and 36%, respectively, of the observed variability is captured by the model (Fig. 8). The average CO enhancement in the lower 3000 masl is captured by the model with lesser accuracy; in fact, the model captures only 26% of the observed variability, along with a significant bias throughout the growing season.
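A minimal sketch of the profile-averaged enhancement is given below. The 3000 masl ceiling comes from the text; the 500 m bin width, the function name and the synthetic CH4 profile are assumptions made for illustration. The last lines relate the correlation coefficients quoted in the abstract (R=0.74 and R=0.60) to explained variance through R squared, which is our reading of how the percentages for CO2 and CH4 arise.

```python
import numpy as np

def mean_enhancement(alt_m, mole_frac, top_m=3000.0, bin_m=500.0):
    """Mean enhancement below top_m after removing the free-tropospheric mean.

    Samples are first averaged within equal altitude bins so that densely
    sampled layers do not dominate the profile mean. Empty bins are skipped.
    """
    alt_m = np.asarray(alt_m, dtype=float)
    mole_frac = np.asarray(mole_frac, dtype=float)
    ft_mean = mole_frac[alt_m > top_m].mean()
    edges = np.arange(0.0, top_m + bin_m, bin_m)
    bin_means = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (alt_m >= lo) & (alt_m < hi)
        if in_bin.any():
            bin_means.append(mole_frac[in_bin].mean() - ft_mean)
    return float(np.mean(bin_means))

# Synthetic CH4 profile (ppb): enhanced near the surface, flat aloft.
alt = np.array([100.0, 200.0, 1200.0, 2600.0, 3500.0, 4500.0])
ch4 = np.array([1960.0, 1950.0, 1940.0, 1930.0, 1920.0, 1920.0])
ch4_enh = mean_enhancement(alt, ch4)

# Fraction of variance explained implied by the correlation coefficients in the abstract.
explained = {"CO2": 0.74 ** 2, "CH4": 0.60 ** 2}
```

The same function would be applied to the observed profile and to the model sampled along it, so that both enhancements share the same background definition and altitude weighting.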

Average CO2 enhancement
To understand the true value of the aircraft profile in evaluating the ability of the surface flux model to reproduce observed fluxes over large regional expanses, it is useful to rigorously compare the differences between modeled and observed near-surface enhancements. The enhancements of CO2 below 3000 masl shown in Fig. 8 for both the data and the GEOS model are well correlated. As expected, during April/May we see very little change in the average enhancements below 3000 masl, while June, July and August show significant drawdown, followed by enhancements in September and October/November (Figs. 6 and 8). The modeled enhancements in the lower 3000 masl reproduce the observations, suggesting that the surface flux of CO2 throughout most of the ABoVE domain is accurately modeled by GEOS. Despite the overall agreement indicated by aggregated statistics, a closer look shows significant differences in observed and modeled CO2 enhancements for many individual flight days (Fig. 9).
Inspection of individual profiles (Fig. 10) reveals that in some cases the model is not capturing near-ground stratification. This is not surprising given that the observations have a much higher vertical resolution than the model, whose vertical resolution is ~100 m in the PBL. Consequently, the observed mole fraction values are much higher than the model estimates because the model is not able to capture the stratification in the river valleys in the interior parts of the ABoVE domain. However, the overall modeled vertical gradients in CO2 match the observations, suggesting that the large-scale vertical transport of emissions is accurately simulated above ~1000 masl. As an example, the set of profiles from July 10 (Fig. 10) demonstrates that, although infrequent, high PBL heights and emissions from fires (as indicated by large (>400 ppb) enhancements in CO) add some uncertainty to the average boundary layer enhancement values. Both of these factors impact the mean free tropospheric correction and the altitude of integration that we have chosen to accurately capture the total CO2 enhancement from the surface fluxes.

Average CH4 enhancement
Although the correlation between the observed and modeled average enhancements of CH4 is significant, there are some key deviations that should be noted. In particular, we see some clear biases in the seasonality, where the enhancements in the early part of the season are underestimated by the model while the enhancements in the later part of the season are overestimated. This is demonstrated both by the comparisons of average enhancements (Fig. 8) and of mole fraction enhancements below 3000 masl (Fig. 9), where the mean difference (observed minus modeled) switches from positive to negative over the course of the study period. The Arctic-CAP profile observations provide a critical point of comparison against which future surface flux models of CH4 can be evaluated, helping to identify areas where process improvements are needed.
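The sign change in the seasonal bias can be checked with a per-campaign mean of the observed minus modeled enhancement. The sketch below uses synthetic numbers chosen only to show the pattern described above; the function name and values are hypothetical and are not campaign results.

```python
import numpy as np

def mean_bias_by_campaign(campaign, observed, modeled):
    """Mean (observed minus modeled) enhancement for each campaign label, in input order."""
    campaign = np.asarray(campaign)
    diff = np.asarray(observed, dtype=float) - np.asarray(modeled, dtype=float)
    labels = dict.fromkeys(campaign.tolist())  # unique labels, first-seen order
    return {c: float(diff[campaign == c].mean()) for c in labels}

# Synthetic CH4 enhancements (ppb) for two profiles in each of two campaigns.
campaign = ["Apr/May", "Apr/May", "Aug", "Aug"]
observed = [10.0, 14.0, 20.0, 22.0]
modeled = [6.0, 8.0, 25.0, 27.0]
bias = mean_bias_by_campaign(campaign, observed, modeled)
```

A positive value means the model underestimates the observed enhancement and a negative value means it overestimates it, so a switch in sign between early and late campaigns is the seasonal bias in question.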

Average CO enhancement
The comparison of observed and modeled average enhancements of CO is less useful because some of the critical assumptions we make for this comparison are designed to shed light on surface processes affecting CO2 and CH4. The biggest limitation in the CO simulation for interpreting vertical profile observations appears to be the accuracy of the vertical distribution of CO emissions. While the model shows an increase in mole fractions during the July and October/November campaigns, the extreme mole fractions in the observations are twice those of the model (Fig. 6). A good example of how the modeled and observed mole fractions differ can be seen on July 10, 2017 (Fig. 10), during a flight up the Mackenzie River in the Northwest Territories of Canada. Here, large enhancements of CO (>400 ppb) are observed at altitudes between 3000 and 5000 masl, while CH4 and CO2 boundary layer enhancements are observed below 3000 masl in most of the profiles measured that day. The ~100 ppb CO/ppm CO2 ratio and the large CO enhancement support the idea not only that a fire is the source but also that the fire is nearby (<100 km). Both the magnitude and the altitude of the CO enhancement point to a few critical limitations in the model that were less important for CO2 and CH4. First, most GCMs, including GEOS, do not take into account the massive heat source that fires provide, which is needed to correctly model the injection of fire emissions above the boundary layer. Second, the fire radiative power observations used to estimate emissions can be obscured by thick clouds or aerosols, resulting in the emissions estimates missing some fire hotspots. Third, the heterogeneous nature of fires as a surface source of CO means that any inaccuracies in horizontal transport or in the location of the fire will play a large role in the ability of the model to accurately reproduce the observations.
Fourth, the lack of diurnal cycle in biomass burning emissions from the emission database (QFED; Table 1) may result in 'temporal aggregation errors', whereby the model simulations may miss the high emission values that coincide 430 with the daytime aircraft observations.
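The CO to CO2 enhancement ratio quoted above can be sketched as a simple calculation. The plume and background mole fractions below are hypothetical values chosen only to reproduce a ratio of about 100 ppb/ppm; they are not measurements from the July 10, 2017 flight, and the function name is an assumption made for illustration.

```python
def enhancement_ratio(co_ppb, co2_ppm, co_background_ppb, co2_background_ppm):
    """Return the CO/CO2 enhancement ratio in ppb CO per ppm CO2."""
    delta_co = co_ppb - co_background_ppb
    delta_co2 = co2_ppm - co2_background_ppm
    if delta_co2 == 0:
        raise ValueError("CO2 enhancement is zero; the ratio is undefined")
    return delta_co / delta_co2


# Hypothetical plume sample and background values (illustration only).
ratio = enhancement_ratio(
    co_ppb=450.0,
    co2_ppm=408.5,
    co_background_ppb=100.0,
    co2_background_ppm=405.0,
)
print(f"{ratio:.0f} ppb CO per ppm CO2")
```

In practice the ratio would more likely be estimated from many samples across the plume, for example as a regression slope, rather than from a single point.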

Model-data mismatch over ecoregions
It is also helpful to break the model-data mismatch into regional domains (Fig. 11) to obtain more insight into where the observed and modeled concentrations diverge, and whether the difference can be attributed to the underlying fluxes or to deficiencies in the model's atmospheric transport. For most regions and times of year, the difference in average CO2 enhancements is not statistically significant; however, there are certain regions, such as the Northern Tundra of Alaska, where the modeled average CO2 enhancements are significantly different and amplify a pattern that is observed over other regions.
In early spring, the model slightly overestimates observed boundary layer enhancements but a month later the model underestimates drawdown. Figures 6 and 11 suggest that the model drawdown in CO2 is preceding the observed early-summer CO2 drawdown. The difference between observed and modeled enhancements changes sign again during the July flight in Northern Tundra Alaska, with an underestimation of the drawdown. Similar patterns can be observed in the Canadian Boreal Cordillera, suggesting that the timing of the summertime drawdown is too early in the model in this region. Over the same period, however, comparisons over the Western Alaska Tundra depict opposite (although far more subtle) patterns. While the offsets in the fall months are smaller, there is the suggestion that the enhancements in the Southern Arctic and Canadian Taiga ecoregions are both underestimated in the model. For CH4, the seasonal bias (underestimation in the spring and overestimation between July and September) in the integrated enhancements between observations and models stands out as the most significant feature. The notable exceptions are again the Northern Tundra of Alaska and Canadian Boreal Cordillera, where CH4 enhancements in July and at the end of October are significantly underestimated. For reasons explained earlier, the CO comparison is less informative. However, if one were to analyze data from the month of September, which had no significant influence from fires in the free troposphere, it would suggest that the model continues to underestimate the impact of CO emissions across all regions.
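The kind of per-ecoregion summary discussed above can be sketched as follows. The difference values are invented for illustration, and the significance check (mean difference larger than two standard errors) is an assumption made here; the text does not specify which statistical test was used.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical observed-minus-modeled CO2 enhancement differences (ppm)
# for two ecoregions. These values are invented for illustration only.
differences_by_region = {
    "Northern Tundra Alaska": [-4.0, -5.0, -6.0, -5.0],
    "Western Alaska Tundra": [1.0, -1.0, 0.5, -0.5],
}


def summarize(differences, n_stderr=2.0):
    """Return mean, standard error, and a rough significance flag.

    The flag uses an assumed rule of thumb: |mean| > n_stderr * stderr.
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences to estimate spread")
    avg = mean(differences)
    stderr = stdev(differences) / sqrt(n)
    return {
        "mean": avg,
        "stderr": stderr,
        "significant": abs(avg) > n_stderr * stderr,
    }


summary = {
    region: summarize(values)
    for region, values in differences_by_region.items()
}
for region, stats in summary.items():
    print(region, stats)
```

With so few profiles per region and campaign, a formal test based on the t distribution would be more appropriate than this rule of thumb.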

Separating local, regional and global vertical gradients
By extracting enhancements below 3000 masl from the observations and the model we have largely separated two major sources of bias and uncertainty in a model-data comparison: vertical transport and offsets in background mole fraction. However, it should be acknowledged that gradients between the boundary layer and free troposphere are not controlled exclusively by local fluxes and that in the Arctic, in particular, vertical gradients can be controlled by non-local influences. To explore the impact of long-range transport, Parazoo et al. (2016) performed three simulations to better understand the drivers of the vertical gradient over Alaska and found that 48% of the amplitude (April/May-July/August) in the seasonal vertical gradient was driven by local fluxes from Alaska, while the rest was driven by fluxes from the rest of the Arctic (11%) and low latitudes (<60N, 41%). For CO2, the impact of long-range transport on the vertical gradient is complicated by the difference in timing of the initial drawdown in the spring and the uptick in the fall at low latitudes versus that at high latitudes. The earlier drawdown of CO2 at low latitudes and the transport of that air via the free troposphere to the Arctic significantly reduces the negative vertical gradient in the Arctic. At the same time, the early uptick of CO2 mole fraction in the Arctic relative to the low latitudes enhances the positive vertical gradient in the early fall (Parazoo et al., 2016).
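The enhancement extraction referred to at the start of this section can be sketched as below. This is a simplified illustration: it uses a single profile and an unweighted mean, whereas the analysis references a daily mean free-tropospheric value and may weight samples differently. The profile values and the function name are hypothetical.

```python
from statistics import mean


def average_enhancement(altitudes_masl, mole_fractions, split_masl=3000.0):
    """Mean mole fraction below split_masl minus the mean above it.

    A split of 3000 masl is used for CO2 and CH4; for CO the reference
    layer starts at 4000 masl, which can be passed as split_masl.
    """
    pairs = list(zip(altitudes_masl, mole_fractions))
    reference = [x for z, x in pairs if z >= split_masl]
    lower = [x for z, x in pairs if z < split_masl]
    if not reference or not lower:
        raise ValueError("profile must have samples on both sides of the split")
    background = mean(reference)
    return mean([x - background for x in lower])


# Hypothetical CO2 profile (ppm) showing summertime drawdown near the surface.
altitudes = [500.0, 1500.0, 2500.0, 3500.0, 4500.0]
co2_ppm = [400.0, 402.0, 404.0, 406.0, 406.0]

enhancement = average_enhancement(altitudes, co2_ppm)
print(enhancement)
```

Applying the same function to observed and modeled profiles removes any common offset in background mole fraction before the two are compared.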
To account for the background vertical gradient in CH4 entering the contiguous US, Baier et al. (2020) and Lan et al. (2019) subtracted 12-15 ppt from the observed vertical gradient. Analysis of the background gradient suggests that this preexisting vertical gradient arises from a combination of upstream emissions and wind shear, which separates the origin of the boundary layer air from that of the free troposphere. Large meridional gradients in CH4, such as those observed in the mid latitudes, will drive depletion of the free troposphere relative to the boundary layer over the Arctic. Similarly, CO vertical gradients will also be affected by non-local fluxes and wind shear between the boundary layer and the free troposphere. In the case of CO and CH4 there is also likely to be a vertical gradient that is influenced by the oxidation of these molecules. However, given the relatively long residence time of these molecules and the low sampling altitude in the free troposphere (between 3000 and 5000 masl) in this experiment, this effect is small. From this perspective, the preexisting vertical gradient outside the domain of interest illustrates the importance of model accuracy in non-local fluxes and the importance of long-range transport in the analysis. One approach to ensuring better boundary conditions is to use a global inversion (e.g. CarbonTracker (Peters et al., 2007)) to initialize the local region, where the prognostic flux model is then run to simulate local fields, as is done to initialize regional Lagrangian inversion models (e.g. Hu et al., 2019).

Conclusions
The Arctic-CAP campaign was composed of 6 different research missions from April to November 2017. It sampled CO2, CH4 and CO vertical profiles from the surface to 5000 masl across the ABoVE domain in Alaska and Northwestern Canada, covering 6 major Arctic ecoregions. Arctic-CAP airborne surveys included large Tundra and Boreal ecosystems that are the likely sources of large changes in the seasonal cycle of CO2 and have been the subject of great speculation about future emissions of CH4. Arctic-CAP's CO2, CH4 and CO profiles provide an excellent basis for evaluating the surface flux models used within state-of-the-art atmospheric transport models, and thus are an important tool for understanding carbon cycle feedbacks. Comparisons of Arctic-CAP CO2, CH4 and CO observations against GEOS model results show that: (a) for CO2, the flux model (land and ocean biosphere and fossil fuel) reproduces seasonal and regional depletions and enhancements observed by aircraft profiles after adjusting for small systematic offsets; (b) for CH4, the model simulations agree reasonably well with the observed vertical profiles, but the model underestimates CH4 in the spring and overestimates it in the fall. Modeled North Slope CH4 is underestimated throughout the measurement period, pointing to deficiencies in the wetland flux specifications over this ecoregion; and (c) for CO, the comparison between modeled and observed values was confounded by large biomass burning enhancements in the free troposphere that were not captured in the model. Despite these minor shortcomings, the forward model estimates for CO2 and CH4 represent a marked improvement in model-data differences compared to those done previously for CARVE (Chang et al., 2014; Commane et al., 2017). Results and the flux budgets demonstrate that model representation of CO2 and CH4 for northern high-latitude ecosystems has advanced significantly since the state-of-the-science survey by Fisher et al. (2014).
Inversions of the Arctic-CAP data using these fluxes as the prior estimate should further refine the flux estimates and the budget for the ABoVE domain. We note that our comparisons used only GEOS forward model values, and slightly different model-data mismatches may be obtained by using a different transport model. Finally, this study highlights the value of collocated airborne CO2, CH4 and CO vertical profiles for quantifying model strengths and weaknesses. This feedback is essential to improve model characterization of both surface-atmosphere fluxes and atmospheric transport and to improve our confidence in the accuracy of projections of future conditions. We strongly recommend regular, systematic CO2, CH4 and CO vertical profile observations across the Arctic as an important and cost-effective method to monitor the Arctic system and abrupt transformations or potential tipping points in the permafrost carbon feedback.

Author contributions
CS, KM, CM and AC designed the experiment. CS, TN, SW and KM carried out the experiment. CS, AC, CM, RB, SW, LS and LH contributed to the manuscript.

Competing interests
The authors declare that they have no conflict of interest.

Acknowledgements
This research was supported by the NASA Terrestrial Ecology Program award #NNX17AC61A, "Airborne Seasonal Survey of CO2 and CH4 Across ABOVE Domain", as part of the Arctic-Boreal Vulnerability Experiment (ABoVE). A portion of the research presented in this paper was performed at the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration. GEOS model runs and the work of AC were supported by funding from the NASA ROSES-2016 Grant/Cooperative Agreement NNX17AD69A.

(Fig. 7). Colors show the altitude of each deviation. Dark blue indicates differences near the surface while yellow indicates differences near 3000 masl.

Figure 11. Average observation-model integrated enhancement differences by ecoregion. Standard deviations of the differences for each region are shown with black and red bars. Red (black) bars signify a negative (positive) average enhancement below 3000 meters relative to the daily mean tropospheric value above 3000 masl for CO2 and CH4 and above 4000 masl for CO.