Calibrating satellite-derived carbon fluxes for retrospective and near real-time assimilation systems

The ability to monitor and understand natural and anthropogenic variability in atmospheric carbon dioxide (CO2) is a growing need of many stakeholders across the world. Systems that assimilate satellite observations, given their short latency and dense spatial coverage, into high-resolution global models are valuable, if not essential, tools for addressing this need. A notable drawback of modern assimilation systems is the long latency of many vital input datasets, e.g., inventories, in situ measurements, and reprocessed remote-sensing data can trail the current date by months to years. This paper describes techniques 5 for calibrating surface fluxes derived from satellite observations of the Earth’s surface to be consistent with constraints from inventories and in situ CO2 datasets. The techniques are applicable in both short-term forecasts and retrospective simulations, thus taking advantage of the coverage and short latency of satellite data while reproducing the major features of long-term inventory and in situ records. Our approach begins with a standard collection of diagnostic fluxes which incorporate a variety of remote-sensing driver data, viz. vegetation indices, fire radiative power, and nighttime lights. We then apply an empirical 10 sink to calibrate the diagnostic fluxes to match given atmospheric and oceanic growth rates for each year. This step removes coherent, systematic flux errors that produce biases in CO2 which mask the signals an assimilation system hopes to capture. Depending on the simulation mode, the empirical sink uses different choices of atmospheric growth rates: estimates based on observations in retrospective mode and projections based on seasonal forecasts of sea surface temperature in forecasting mode. The retrospective fluxes, when used in simulations with NASA’s Goddard Earth Observing System (GEOS), reproduce marine 15 boundary layer measurements with comparable skill to those using fluxes from a modern inversion system. The forecasted fluxes show promising accuracy in their application to the analysis of changes in the carbon cycle as they occur. Copyright statement. The author’s copyright for this publication is transferred to the National Aeronautics and Space Administration.

assimilating satellite data, thus increasing the ability of the system to capture other signals of interest in the observations (Dee, 2005). For example, in any given year, estimates of global net biospheric exchange (NBE) from the TRENDY ensemble of terrestrial biosphere models (Sitch et al., 2015) have a range of roughly 4 petagrams of carbon (Pg C). Transported through 60 the atmosphere, this produces a 2 ppm spread in CO 2 , which is twice the size of the atmospheric growth rate perturbation of the 2015-2016 El Niño. In addition, an under/over-estimation of NBE leads to an under/over-prediction of the seasonal cycle amplitude of both surface and column-average mixing ratios (Yang et al., 2007;Keppel-Aleks et al., 2012). Among flux inversion systems that ingest in situ data, e.g., those analyzed by Gaubert et al. (2019), the spread in NBE is usually much smaller, about 0.25 Pg C. Nevertheless, flux inversions typically achieve this agreement through their reliance on long-term data records and transport simulations, and can trail the current date by a year or more. Our goal is to achieve a similar error reduction without the resulting latency, allowing the subsequent assimilation of satellite CO 2 data to focus on regional and seasonal errors rather than global budgeting errors. For now, we are willing to accept that the empirical sink adds a biophysical inconsistency into our system, but hope to better address this in the future. This paper presents a collection of surface fluxes having both retrospective and forecasting modes that reproduce background 70 CO 2 measurements with comparable skill to a modern flux inversion system. The collection includes an additional, empirically derived land sink to our baseline flux collection to ensure that global flux totals are consistent with observed atmospheric CO 2 growth rates. Empirical adjustments of this kind date back at least to the work of Tans et al. (1990), who showed that their simulations recreated the North-South gradient of CO 2 when they increased terrestrial uptake in the Northern Hemisphere Extratropics. Later, Chevallier et al. (2009) tuned their fluxes to match observed atmospheric growth rates, Keppel-Aleks 75 et al. (2012) adjusted Northern Hemisphere mid-latitude uptake to improve their simulated seasonal cycle amplitude, and Agustí-Panareda et al. (2016) derived an adjustment to fluxes from their prognostic model (i.e., one which does not ingest satellite vegetation data) based on comparisons to a flux inversion system. The method described here is an extension to that of Chevallier et al. (2009), while sharing some features of each of the previously cited works. In retrospective mode, it applies an atmospheric growth rate based on in situ observations in the marine boundary layer (MBL; Dlugokencky and Tans, 2016b). 80 In forecasting mode, when many of the diagnostic flux products and observationally constrained growth rates are unavailable, the fluxes use extrapolation and a method for predicting the growth rate based on sea surface temperature forecasts (Jones and Cox, 2005;Betts et al., 2016). It differs from previous works in its functional form and its application to fluxes diagnosed from satellite measurements instead of those from a prognostic model.
In the work that follows, the construction of the empirical sink is detailed in Section 2. The evaluation of the fluxes against 85 other products and comparisons of transported mixing ratios to in situ measurements is presented in Section 3. In particular, we compare GEOS simulations using the calibrated fluxes and using fluxes from an inversion system to MBL measurements and show the difference in skill is negligible. These findings and their potential scientific impact are summarized in Section 4.
The Low-order Flux Inversion (LoFI) is a collection of carbon fluxes driven by remote-sensing data calibrated to reproduce 90 given atmospheric and oceanic growth rates. We use the term "low-order" to distinguish it from a modern flux inversion system.
To demonstrate the impact of the empirical sink, we also consider a "baseline" flux collection similar to that used in Ott et al. (2015) with the empirical land sink removed. All products in both the LoFI and baseline collections are conservatively regridded to the 0.5 • × 0.625 • dateline and pole centered grid used by GEOS for many applications including the MERRA-2 reanalysis, which is used for the meteorological inputs needed by the fluxes. All fluxes have a daily timestep, except the 95 land biosphere which has a 3-hourly timestep to resolve the diurnal cycle. In retrospective mode, the LoFI collection trails the current date by up to a year. It forecasts fluxes for the current and next year using the approach described in Section 2.1.
As a rough metric of truth/plausibility, we compare both collections to widely-used ensembles of terrestrial biospheric, ocean biogeochemical models, and flux inversions. These choices and the components of the LoFI fluxes are detailed below and summarized in Table 1.

100
The LoFI flux collection consists of the following six components: Net Ecosystem Exchange (NEE) -An implementation of the Carnegie Ames Stanford Approach (CASA; Randerson et al., 1996) and Global Fire Emissions Dataset version 3 (GFED 3; van der Werf et al., 2003;2010) referred to here as CASA-GFED 3. CASA-GFED 3 uses satellite-based measurements of land cover and vegetation changes along with meteorology from MERRA-2 to constrain carbon stocks and fluxes, viz. net primary productivity (NPP), which is de-105 termined from measurements of normalized difference vegetation index (NDVI; Pinzon and Tucker, 2014), and biomass burning, which is determined from Moderate Resolution Imaging Spectrometer (MODIS) burned area estimates (Giglio et al., 2010). This version is available on a 0.5 degree grid with a monthly timestep. More details about our particular implementation and its use in GEOS are available in Ott et al. (2015).
Biofuel -CASA-GFED 3 also produces an estimate of the anthropogenic burning of harvested wood (van der Werf et al.,110 2010), which we refer to here as biofuel. The emissions have no seasonality and are calculated as the population density times national per capita fuel consumption estimates while being constrained by the total available coarse woody debris at each model time step.
Biomass burning -The Quick Fire Emissions Dataset (QFED; Darmenov and da Silva, 2015), an NRT product, which determines emissions based on MODIS fire radiative power (FRP) estimates using a technique similar to the Global 115 Fire Assimilation System (GFAS; Kaiser et al., 2012). QFED is produced on a 0.1 degree grid for every day with a climatological diurnal cycle applied.
Fossil fuel combustion -The Open-source Data Inventory for Anthropogenic CO 2 (ODIAC; Maksyutov, 2011, 2015;Oda et al., 2018). ODIAC is a global, monthly, high-resolution (1 km × 1 km) fossil fuel CO 2 gridded emission data product based on the disaggregation of country-level fossil fuel CO 2 emission estimates using a global power 120 plant database and satellite observations of nighttime lights. It is updated on an annual basis upon the availability of updated global fuel statistical data. This work uses the 2016 version which covers 2000-2015. For all but the two most recent years (here, 2014 and 2015), ODIAC uses global and country estimates from the Carbon Dioxide Information and Analysis Center (CDIAC; Boden et al., 2018), while estimates for the two most recent years are projected using BP's Statistical Review of World Energy 2016 (Oda et al., 2018).

125
Ocean exchange -An extension to the monthly climatology of Takahashi et al. (2009) that restores inter-annual variability.
This approach reapplies the global mean climatological growth rate estimate of 1.5 µatm/yr of the partial pressure of CO 2 in seawater (pCO sw 2 ) that Takahasi et al. (2009) use to derive their climatology. For the partial pressure in the atmosphere (pCO atm 2 ), we use weekly values of zonal-mean surface CO 2 from the NOAA MBL reference (Masarie and Tans, 1995;Dlugokencky and Tans, 2016a) for the partial pressure of CO 2 in the atmosphere (pCO atm 2 ). Given the two 130 partial pressures, estimates of the surface flux follow from the expression (Wanninkhof, 2014) where k is a constant, U 10 is the 10-meter wind speed, and C is the fractional sea-ice coverage. To complete the flux calculation, we use daily, observationally-constrained estimates of U 10 and C from MERRA-2. This approach has been used by a number of previous studies and is derived from one of the ocean priors in the NOAA CarbonTracker System 135 (see footnote b, Table 1). More information is available in Sections 3.1 & A3.
Empirical land sink -An additional, empirical sink following the "poor man's inversion" approach of Chevallier et al. (2009) that constrains the global atmospheric growth rate of the combined LoFI flux package. The empirical sink decreases heterotrophic respiration (HR) in months where the 2-meter air temperature (T ), meant as a rough proxy for soil temperature, increases from the previous month. This is designed to concentrate the correction to the Northern Extratropics during 140 the spring and summer, where the neutral biosphere assumption of CASA is thought to be most problematic (discussed in Section 3.2). For the m-th month of each year, at every point on the surface, the sink S m has the form where ∆ + T m denotes the temperature increase from the previous month, and α is a constant scaling factor computed 145 such that the global total fluxes for the year match a specified atmospheric growth rate. In retrospective years (those preceding the current), we use growth rates derived from the NOAA MBL reference (Dlugokencky and Tans, 2016b), and in NRT years (the current and following) we use projections based on seasonal forecasts of sea surface temperature described in Section 2.1. For more information about the construction and evaluation of the empirical sink, see  This separation assumes that, added together, biomass burning and biofuel emissions account for emissions from both naturally occurring wildfires and burning due to gross land-use/land-cover change. Furthermore, emissions from ethanol, biodiesel, and other short-cycle fuels used for transportation are potentially underestimated since they are not included in the ODIAC fossil fuel and CASA-GFED 3 biofuel emissions, yet the removal of carbon due to the corn and soybean harvest in the Midwestern United States is included in CASA-GFED 3 (derived from USDA National Agricultural Statistics Service data for 155 2005). Excluding the lateral transport of significant amounts of carbon over continental scales, we do not expect these assumptions to have a noticeable effect on the comparisons to follow since they consider only NBE, the sum of all land fluxes except fossil fuel emissions, and not the individual terms.

Forecasting fluxes
A number of products used in the above fluxes are unavailable until a few months to years following the end of a given year.

160
In particular, the fossil fuel inventory data used by ODIAC and NOAA MBL growth rate require data collection and analysis that must necessarily trail real time. Simulations during the current year, in particular NRT runs, are thus only possible with some type of extrapolation and/or statistical model applied to past values. For all flux components except biomass burning and the empirical sink, we produce estimates during NRT years by first extrapolating a linear fit through the retrospective values for each point and each month. This choice allows different regions and different seasons to have different trend lines.
165 While more complex choices are possible, this is meant as a simple first step as we work to reduce the latency of our baseline satellite-derived flux products.
The primary motivation for using QFED biomass burning instead of GFED was QFED's ability to produce NRT estimates.
This ability is particularly important for biomass burning emissions whose inter-annual variability is comparable in magnitude to its annual mean and seasonal cycle amplitude and is much greater than its long-term trend. While the removal of terrestrial 170 carbon by QFED fires does not match the corresponding fire loss in CASA carbon stocks, QFED emissions are calibrated to GFED (Darmenov and da Silva, 2015), resulting in a difference that is minor compared to that of the empirical sink (see the Supplementary Section A3).
For the empirical sink, we switch from using an atmospheric growth rate derived from in situ data in retrospective years to a linear functional fit to seasonal SST anomalies and anthropogenic emissions (Jones and Cox, 2005;Betts et al., 2016). This 175 approach estimates the growth rate of CO 2 in ppm, ∆CO 2 as where N is the average of SST anomalies in the Niño 3.4 region (5 Historical data records tend to show that the long-term trend in the airborne fraction is insignificant compared to its inter-annual variability (Knorr 2009;Ballantyne et al., 2012), but there is some indication of significant multi-decadal trends, including the possibility of a recent decrease (Keenan et al., 2016).

Flux and transport simulation analysis
Our primary method for evaluating the LoFI flux package, along with the baseline fluxes, are comparisons to the ensembles 190 of bottom-up terrestrial biosphere and ocean biogeochemistry models and top-down flux inversions outlined in Table 1 and described in more detail in the Supplemental Section A2. It is important to note that the bottom-up ensembles and top-down ensemble are not directly comparable: riverine input of carbon from the terrestrial biosphere to the ocean causes the top-down ensemble to infer a greater terrestrial sink and smaller oceanic sink than the biospheric model ensemble (Le Quéré et al., 2018). Jacobson et al. (2007) used fluxes from a global erosion model (Amiotte Suchet and Probst, 1995;Ludwig et al., 1996) 195 to estimate a riverine contribution of 0.45±0.18 Pg C (all uncertainties reported here are 1σ). Recently, Resplandy et al. (2018) showed that relationships between ocean heat and carbon transport to derive an estimate of 0.78 ± 0.20 Pg C. Since there is so much uncertainty about this discrepancy, in particular about its distribution in time and space, we do not make any corrections to the ensemble ranges used in the comparisons, and this choice should be kept in mind in the interpretation of the following results. In any case, the most appropriate ensemble for our purposes is the top-down ensemble, while the bottom-up ensembles 200 indicate a greater range of plausibility.
There are a number of other possible flux evaluation metrics, each with different limitations. For example, fluxes can be measured directly from towers with eddy covariance techniques (Dabberdt et al., 1993), but the spatial footprint of a flux tower is typically much smaller than ten kilometers (Raczka et al., 2013), and upscaling flux tower data to a global, gridded product has thus far been unable to produce reliable global NBE budgets (Jung et al., 2011). Eddy covariance measurements 205 from aircraft (Desjardins et al., 1989), e.g., NASA's Carbon Airborne Flux Experiment (CARAFE; Wolfe et al., 2018), are representative of much longer horizontal length scales than towers, but the currently available campaign data are sparse in space and time. As shown in Section 3.3, another choice is to transport our fluxes through the atmosphere with the GEOS general circulation model (GCM) and compare the simulated CO 2 mixing ratios with atmospheric observations. While this evaluation is able to test the ability of the fluxes to reproduce large-scale, long-term signals, it is susceptible to the misinterpretation of a 210 transport error as a flux error.

Ocean exchange
Over the past few decades, global pCO sw 2 has increased by roughly 1.5 µatm/yr, yet with considerable regional differences (Takahashi et al., 2009). At the same time, pCO atm 2 has increased at an even greater pace, driving an increasing ocean sink.
Deviations from these trends are predominantly limited to the Tropical Pacific, where the El Niño, Southern Oscillation (ENSO) 215 is the dominant driver of interannual variability , and the Southern Ocean, where the net ocean sink switched from increasing to decreasing in the early 1990s and switched back in the early 2000s .
By imposing observed trends in pCO sw 2 and pCO atm 2 , our ocean exchange fluxes produce a sink that is generally consistent with the inversion ensemble and GCP 2018 ocean biogeochemical model ensemble (Figures 1 and A1), again with the provision that the biogeochemical model ensemble does not include outgassing due to riverine input. Averaged over 2000-2010, our mean 220 ocean sink is 1.55 Pg C. This is consistent with annual budgets based on atmospheric measurements of a combination of of O 2 and CO 2 called atmospheric potential oxygen (APO; Stephens et al., 1998): Keeling and Manning (2014) estimate an ocean sink of 2.50 ± 0.60 Pg C for 2000-2010, which reduces to 1.72 ± 0.60 Pg C after removing 0.78 Pg C to account for riverine input (see discussion above). The pCO sw 2 based products of Landschützer et al. (2016) and Rödenbeck et al. (2014), which do not require a riverine adjustment, produce sinks of 1.37 and 1.74 Pg C (averages from GCP 2018; Le Quéré et al., 2018), 225 also in line with our budgets (several more comparable estimates are available in Jacobson et al., 2007). Even during the strong ENSO of 2015-2016, the constant growth in pCO sw 2 that we impose is able to reproduce a global sink within the ranges of the top-down and bottom-up ensembles (Figure 1). While linear growth is not appropriate for simulations of several decades, over which pCO sw 2 increases exponentially (Raupach et al., 2014), it is sufficient for our 15-year study period and something we hope to address in future developments.

The empirical land sink
In comparison to the flux inversion ensemble and the TRENDY Version 7, Simulation 3 (Sitch et al., 2015) ensemble of dynamical global vegetation models (DGVMs), the baseline fluxes consistently underestimate NBE-the sum of NEE, biofuels, and biomass burning-by over 3 Pg C (see Figure 2a and Table 2). This is due to the assumption of a neutral biosphere in CASA-GFED, which thus has no long-term net sink. In the Northern Hemisphere, this assumption causes CASA to system-235 atically underpredict seasonal cycle amplitudes in comparison to measurements from flux towers (Carvalhais et al., 2008) and a combination of aircraft and ground-based remote-sensing retrievals (Yang et al., 2007). This effect is seen in the diagnostic fluxes as an underprediction of the global sink strength in March through July (Figure 2) limited primarily to the Northern Extratropics ( Figure 3, first column). In contrast, including the empirical sink in the LoFI fluxes moves its annual totals and seasonal cycle significantly closer to the ranges of the comparison ensembles. with the difference being due to the decision to average observational data over month-long intervals (Peylin et al., 2013).  Table 2 for comparable ranges from the TRENDY and inversion ensembles). The net terrestrial sink found by inversions has been shown to be consistent with growth in temperate and boreal terrestrial ecosystems driven primarily by carbon fertilization and forest regrowth (Schimel et al., 2015;Fernández-Martinez et al., 2019) with the possibility of an additional contribution from agriculture (Zeng et al., 2014). Notably, the forest inventory analysis in shrublands and savannas than to forest regrowth (Liu et al., 2015) and estimating a significantly smaller tropical regrowth sink (Pugh et al., 2019).
Taken together, the findings described above suggest an empirical sink proportional to the CO 2 growth rate (e.g., driven by 265 carbon fertilization) and monthly temperature increase (e.g., focused to the extratropical growing season). The choice of the last remaining factor in Equation 1, heterotrophic respiration (HR) over, for example, Net Primary Production (NPP), requires further investigation. Since boreal forests allocate a much greater percentage of biomass below ground than tropical forests do (Pan et al., 2011), it seems more suitable to focus the correction on HR rather than NPP. Furthermore, the NDVI driver data of CASA tends to put a strong, observationally-driven constraint on NPP. In any case, adjustments to HR or NPP are both likely 270 to drive the long-term, hemispheric and seasonal corrections we look for in the empirical sink.
Even after adding the empirical sink, the LoFI fluxes have some discrepancies with the comparison ensembles worth noting.
In particular, the LoFI fluxes predict a sink from 0 • S to 15 • S during JJA, while the comparison ensembles predict neutral fluxes ( Figure 3, first column, fourth row difference between blue line and grey shading). The opposite difference, although less noticeable, is present during DJF (Figure 3, first, column, second row). It is unclear, however, how accurate either ensemble 275 is in this case: the inversions are hindered by the limitation of in situ data and uncertainties in transport over this latitude band, while the biospheric models are potentially hindered by inaccuracies in meteorological driver data and deficiencies in the representation of disturbance (Molina et al., 2015;van der Laan-Luijkx et al., 2015). This approach may overestimate the net sink over the Midwestern United States, where our version of CASA-GFED 3 already includes a corn and soybean harvest.
Understanding the interaction of these two adjustments is the subject of ongoing work. replay approach to reproduce the effect of the meteorological data assimilation system without having to rerun it (see Orbe et al., 2017 for the most up to date description). In this configuration, the large scale circulation, temperature, and moisture are constrained by analysis fields every six hours, while physical processes such as convection, turbulence, and radiative transfer are recalculated at a high temporal resolution. This computationally efficient framework provides the ability to simulate realistic meteorology with a tight coupling between fine scale atmospheric transport processes and trace gas emissions. For the analyzed 295 meteorology, we use MERRA-2, resulting in a transport simulation sharing many of the properties of that used by Ott et al. (2015).
The evaluation of the two model runs (one with LoFI fluxes, the other with CT2016 fluxes) against the NOAA MBL surface sites is shown in Figure 4. Neither of the runs is clearly superior. This suggests that, at least in terms of aggregate statistics over multiple years, the empirical sink is able to reproduce a correction to the baseline, diagnostic fluxes with a similar skill 300 as running a formal inversion system based on MBL data. While our approach may degrade the skill of CT2016 by using a different transport model, and may also improve it by running at a higher resolution, our calculated differences are consistent

Conclusions
This paper presented an adjustment to a diagnostic collection of surface fluxes designed to calibrate its sink strength to observations.
In each year, we adjusted the magnitude of the sink so that the fluxes match the global atmospheric and oceanic growth rates. Among the inversion ensemble, the spread in estimated growth rates is quite small, usually less than 0.25 Pg C.
On the other hand, the spread in growth rates among biospheric models can be 4 Pg C or more. In a transport simulation, a Using the fluxes calibrated with the empirical sink in transport simulations reproduced atmospheric measurements with the same skill as transport simulations using one of the fluxes in the inversion ensemble. In particular, the annual total errors at a 335 collection of sites in the marine boundary layer of the transport simulations with the calibrated fluxes were consistently less than a ppm, with the true value falling within a quartile of the differences. Globally, the errors are on the order of a few tenths of a ppm. The empirical sink thus enables the study of carbon cycle anomalies, like the 2015-2016 El Niño, whose perturbation to atmospheric mixing ratios is just a couple ppm.
system, it is exceedingly simple to implement. It is also less susceptible to errors due to the particular transport model, data selection, and error covariance models. For example, constraining the global growth rate alone requires few, if any, assumptions about atmospheric transport or decorrelation times and lengths. When used in simulations or as a prior in an assimilation system, this approach significantly reduces biases due to mis-specification of the growth rate. Failing to remove such a bias before assimilating data limits the ability of the assimilation system to account for other signals of interest in the observations 345 (Dee, 2005), e.g., synoptic-scale variations due to passing weather systems, regional and seasonal anomalies due to drought, and changes in anthropogenic emissions.
Since observational estimates of the global growth rate are currently only available at the end of each year, using the empirical sink developed here in an NRT atmospheric monitoring system requires a prediction of the global growth rate. We projected growth rates in 2016 and 2017 based on forecasts and analyses of SST (Jones and Cox, 2005;Betts et al., 2016).  (Denning et al., 1995). To resolve the diurnal cycle of the fluxes, we first downscale the monthly LoFI fluxes to daily using the the algorithm described below. We stop at daily for all fluxes other than terrestrial NEE, which is the difference of ecosystem respiration (ER) and GPP. Daily terrestrial ER and GPP are downscaled to 3-hourly following the approach of Olsen and Randerson (2004) with the slight modification of starting from daily instead of monthly fluxes. This approach is quite similar to the downscaling approach used by NOAA's CarbonTracker system. It has the advantage 365 of avoiding the noticeable discontinuities at monthly boundaries that are present in the monthly to 3-hourly downscaling, but the disadvantage of possibly missing synoptic scale disturbances that occur over multiple days since the downscaling from monthly to daily uses interpolation.
Fluxes are downscaled to a higher resolution either spatially or temporally by finding the smoothest interpolant that preserves the averages at the original, coarser resolution. This interpolant is found by minimizing a quadratic cost function subject to a 370 linear constraint. The quadratic cost function is the square of the discrete approximation to the Laplacian of the downscaled field. When interpolating in space, we use the spherical Laplacian, which takes the shrinking distance between grid boxes near the poles into account. The linear constraint is that the averages of the downscaled field at the coarser resolution be the original values. It is imposed using Lagrange multiplier, which transforms the problem into an unconstrained quadratic optimization problem on a higher-dimensional space. This approach is used to downscale monthly NEE to daily as described above and is 375 applied and to downscale ocean pCO sw 2 from its native 4 • × 5 • spatial grid to the 0.5 • × 0.625 • grid of MERRA-2. we compute NBE by subtracting a common fossil fuel product, which we use in our diagnostic fluxes, from the total surface flux. No distinction is made here between changes in NEE and biomass burning due to natural effects or to anthropogenic activities such as land use change. To avoid issues associated with statistics on small sample sizes or bias due to over representation of certain model configurations, we use only the range of the minimum and maximum values over each ensemble.

395
A3 LoFI ocean and land fluxes: further details Figure A1 shows the differences of the LoFI ocean fluxes with the inversion ensemble. In general, the LoFI fluxes are within the range of values from the inversions.
The main motivation behind replacing GFED in CASA-GFED with QFED is to avoid having to extrapolate biomass burning emissions in the forward processing product. Unlike other flux components, we do not expect the construction of a month-by-400 month linear climatology to have much skill for biomass burning. Rather than switching from GFED to QFED in the switch from reanalysis to forward processing fluxes, we chose to use QFED at all times. This prevents the introduction of a jump in the fluxes and allows a more direct comparison of averages, but has the downside of introducing a biomass burning component that is not in balance with the carbon stocks of CASA. In any case, as is shown in Figures A2 and A3, the difference between QFED and GFED biomass burning emissions is quite small at almost every time and place. The only noticeable difference is 405 that the baseline fluxes are slightly lower in the Amazon and Congo rain forests during JJA. Still, this difference is minor in comparison to the empirical sink. Figure A1. Climatologies (2003Climatologies ( -2015 of ocean exchange from the LoFI fluxes. The first row represents annual averages and each subsequent row represents averages over different seasons. The left column depicts zonal means from the LoFI fluxes (dash-dot red) and the ranges of zonal means from the inversion ensemble (dark grey); and the right column shows seasonal gridded maps of the diagnostic fluxes.  The left column depicts zonal means of the baseline fluxes using GFED and QFED, the range of the inversion ensemble (dark grey), and range of the TRENDY V7 ensemble (light grey); the middle column depicts the distance of the baseline fluxes using GFED to the range of the inversion ensemble; and the right column the difference of the baseline fluxes using QFED with those using GFED, i.e., the difference of QFED with GFED.