Interannual variability of summertime formaldehyde (HCHO) vertical column density and its main drivers in northern high latitudes

The northern high latitudes (50-90°N, mostly including boreal forest and tundra ecosystems) have been undergoing rapid climate and ecological changes over recent decades, leading to significant variations in volatile organic compound (VOC) emissions from biogenic and biomass burning sources. HCHO, a widely used indicator of VOC emissions, exhibits high climate sensitivity. However, the interannual variability of HCHO and its main drivers over the region remain unclear. In this study, we use the GEOS-Chem chemical transport model and satellite retrievals from the Ozone Monitoring Instrument (OMI) and the Ozone Mapping and Profiler Suite (OMPS) to examine HCHO vertical column density (VCD) interannual variations in summertime


Introduction
Volatile organic compounds (VOCs) are the main precursors of tropospheric ozone and secondary organic aerosols, strongly impacting air quality and climate (Atkinson, 2000; Kroll and Seinfeld, 2008; Mao et al., 2018; Zheng et al., 2020). Formaldehyde (HCHO) is mainly produced from atmospheric VOC oxidation and has a short photochemical lifetime on the order of hours, so it serves as an indicator of non-methane VOC (NMVOC) emissions and photochemical processes (Fu et al., 2007; Millet et al., 2008). Understanding the interannual variability of HCHO is important for quantifying long-term trends of VOC emissions in response to climate change and air quality control measures.
Several studies suggest that biogenic VOC emissions are largely responsible for interannual variabilities of HCHO on a global scale (Palmer et al., 2001;De Smedt et al., 2008;González Abad et al., 2015;De Smedt et al., 2018).
T. Zhao et al.: Interannual variability of summertime formaldehyde

Stavrakou et al. (2009) consider biogenic VOC (BVOC) emissions to be the predominant source of global HCHO columns, with isoprene alone contributing 30 % of global HCHO. Isoprene emissions were also found to be the major driver of HCHO interannual variability (Bauwens et al., 2016; Stavrakou et al., 2018; Morfopoulos et al., 2022). During wildfire seasons, pyrogenic emission is the second most important controlling factor of HCHO over the whole Amazon (Zhang et al., 2019) and contributes 50 %-72 % of the HCHO total column in Alaskan summer fire seasons (Zhao et al., 2022). Over the Antarctic region, HCHO is produced mainly from methane oxidation by hydroxyl radicals (OH), with possible unknown HCHO sources and long-range transport (Riedel et al., 1999). The interannual variability of HCHO over this region is still unclear.
Wildfire is another important source of HCHO (Permar et al., 2021). A number of studies have shown positive trends and strong interannual variability of wildfires over Arctic regions in the past few decades (Kelly et al., 2013; Giglio et al., 2013; Descals et al., 2022). Several modeling studies suggest that wildfires can become the main source of HCHO over Alaska (Zhao et al., 2022), Siberia and Canada (Stavrakou et al., 2018). In fact, the contribution from wildfires could be even larger as models tend to underestimate the secondary production of HCHO from other VOC precursors (Alvarado et al., 2020; Zhao et al., 2022; Jin et al., 2023). To what extent wildfires contribute to HCHO interannual variability remains unclear.
Solar-induced fluorescence (SIF) could potentially provide additional constraints on the biogenic-related HCHO column over northern high latitudes due to their similar dependence on temperature and light availability (Foster et al., 2014; Zheng et al., 2015). SIF is the re-emission of light by plants as a result of absorbing solar radiation during photosynthesis and is widely used to estimate vegetation productivity and health (Porcar-Castell et al., 2014; Magney et al., 2019). Isotopic labeling studies show that 70 %-90 % of isoprene production is from chloroplasts, directly linked to photosynthesis (Delwiche and Sharkey, 1993; Karl et al., 2002; Affek and Yakir, 2003). As SIF is directly linked to flux-derived gross primary productivity (GPP) and because HCHO can be largely explained by isoprene emissions (Zheng et al., 2017), we expect SIF to be a valuable tool to constrain biogenic emissions from boreal forests at northern high latitudes.
The new retrievals of HCHO from the Ozone Monitoring Instrument (OMI) and the Ozone Mapping and Profiler Suite (OMPS) provide a continuous long-term record on a global scale, with improved calibration and updates in spectral fitting and air mass factor calculations (González Abad et al., 2022; Nowlan et al., 2023). Here, we use the newly retrieved HCHO vertical column density (VCD) products from OMI and OMPS, combined with the GEOS-Chem chemical transport model, to examine summertime HCHO spatiotemporal variability over northern high latitudes from 2005 to 2019. The satellites and the model are introduced in Sect. 2. In Sect. 3, we evaluate the spatial variability of HCHO VCD using satellite retrievals, and we evaluate BVOC emissions with previous in situ measurements. In Sect. 4, we evaluate the interannual variability of HCHO VCD using satellite retrievals, and we present model sensitivity tests to demonstrate how background HCHO, wildfire and biogenic VOC emissions influence HCHO interannual variability across Alaska, Siberia, northern Canada and eastern Europe. In Sect. 5, we evaluate biogenic HCHO interannual variability using satellite SIF data. A summary and discussion are in Sect. 6.

Observational data sets
We use satellite observations of tropospheric HCHO columns from OMI and OMPS to evaluate summertime HCHO variability at northern high latitudes. OMI is a UV-Vis backscatter spectrometer on board the Aura satellite launched in July 2004, with global daily coverage at an overpass time of 13:30 LT. OMI provides a long-term record of HCHO VCD but was discontinued in 2023. OMPS is the continuation of OMI HCHO measurements over polar regions. OMPS is a spectrometer on board two satellites: NASA/NOAA Suomi NPP (hereafter SNPP) and NOAA-20, which were launched in October 2011 and November 2017, respectively. Compared to OMI, OMPS-SNPP has a lower nadir spatial resolution (OMI: 13 × 24 km², OMPS-SNPP: 50 × 50 km²) (de Graaf et al., 2016; Levelt et al., 2006) but has an improved signal-to-noise ratio (González Abad et al., 2016). OMI and OMPS HCHO products share a similar concept and retrieval approach; thus, the joint evaluation by the two satellites can examine the consistency between OMI and OMPS and, more importantly, provide the capability to study HCHO interannual variability on a decadal timescale. Here, we use monthly mean HCHO VCD from the OMI HCHO VCD retrieval (OMHCHO version 4) product (González Abad et al., 2022) during the 2005-2019 summertime and the OMPS-SNPP Level-2 HCHO total column V1 product (Nowlan et al., 2023) during the 2012-2019 summertime, provided by the Smithsonian Astrophysical Observatory.
The OMI and OMPS HCHO retrievals use a three-step procedure to calculate the HCHO VCD (Nowlan et al., 2023). First, the slant column density (SCD) is determined through spectral fitting of a backscattered radiance spectrum collected in the wavelength region of 328.5 to 356.5 nm. This fit uses a daily reference spectrum (one for each cross-track position) determined from radiances collected over a relatively clean area of the Pacific between latitudes 30°S and 30°N. The area used for this reference calculation is referred to as the reference sector. Second, scene-by-scene radiative transfer calculations are performed to determine vertically resolved scattering weights, which can be used to determine the air mass factor (AMF) in combination with the trace gas profile (Palmer et al., 2001). This AMF describes the path of light and is used for converting the SCD to a VCD (VCD = SCD/AMF). Third, the background reference slant column (SCD_R) in the radiance sector region is determined using a model to correct the retrieved SCD, which is, in fact, the differential SCD determined from the ratio of the observed radiance and the reference radiance. A further bias correction (SCD_B) is applied to reduce high-latitude biases, which mostly affect OMPS-SNPP (Nowlan et al., 2023).
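To make the second step concrete, the Python sketch below computes an AMF as the profile-weighted mean of the scattering weights, in the spirit of Palmer et al. (2001), and then converts a slant column to a vertical column with VCD = SCD/AMF. It is only an illustration under simplifying assumptions, not the operational retrieval code: the function names, the three-layer profile and all numerical values are hypothetical.

```python
import numpy as np

def air_mass_factor(scattering_weights, partial_columns):
    """AMF as the sum over layers of scattering weight times shape factor.

    The shape factor is the layer partial column divided by the total column,
    so it sums to one over the profile.
    """
    w = np.asarray(scattering_weights, dtype=float)
    n = np.asarray(partial_columns, dtype=float)
    shape_factor = n / n.sum()
    return float(np.sum(w * shape_factor))

def slant_to_vertical(scd, amf):
    """Convert a slant column density to a vertical column density."""
    return scd / amf

# Hypothetical three-layer example (illustrative values only).
weights = [0.5, 1.0, 1.5]              # scattering weights, surface to top
partials = [6.0e15, 3.0e15, 1.0e15]    # HCHO partial columns, molec. cm^-2
amf = air_mass_factor(weights, partials)
vcd = slant_to_vertical(8.0e15, amf)
print(f"AMF = {amf:.2f}, VCD = {vcd:.3e} molec. cm^-2")
```

Because the hypothetical profile places most HCHO near the surface, where the scattering weights are lowest, the AMF is below 1 and the vertical column exceeds the slant column.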
To compare with modeled results, OMI and OMPS-SNPP HCHO retrievals are reprocessed following a three-step procedure. This is primarily done to replace the climatology used in the OMI and OMPS-SNPP products with our own GEOS-Chem simulations. First, we remove the data points fulfilling any of the following criteria: (1) main quality flag > 0, (2) cloud cover fraction ≥ 40 %, (3) solar zenith angle (SZA) ≥ 70°, and (4) ice or snow flag = 1. After filtering, we regrid the level-2 swath data in the local time window of 12:00-15:00 LT to 0.5° × 0.625° horizontal resolution. Second, we calculate the air mass factor (AMF_GC) based on the local GEOS-Chem HCHO vertical profile and satellite scattering weights (Palmer et al., 2001). Third, we calculate the slant column density of HCHO in the reference sector (SCD_R,SAT) using the modeled HCHO reference sector column and the satellite air mass factor over the same location (VCD_R,GC and AMF_R,SAT) (De Smedt et al., 2018; Zhu et al., 2016):

$$\mathrm{SCD_{R,SAT}} = \mathrm{VCD_{R,GC}} \times \mathrm{AMF_{R,SAT}}. \tag{1}$$

VCD_R,GC is calculated from a global monthly climatology of hourly HCHO profiles at the time of overpass based on a 2018 GEOS-Chem high-performance (GCHP) run at 0.5° × 0.5° resolution (Bindle et al., 2021; Eastham et al., 2018). AMF_R,SAT is the AMF from the satellite product, which is calculated using the VLIDORT radiative transfer model as described in Nowlan et al. (2023). We rearrange the satellite vertical column as follows:

$$\mathrm{VCD_{SAT,reprocessed}} = \frac{\mathrm{SCD_{SAT}} + \mathrm{SCD_{B,SAT}} + \mathrm{SCD_{R,SAT}}}{\mathrm{AMF_{GC}}}. \tag{2}$$

Here, SCD_SAT is the fitted HCHO slant column, and SCD_B,SAT is the bias correction term for unexplained background patterns in the HCHO retrievals, which may be due to instrument or retrieval issues (Nowlan et al., 2023). The single-scene precision of the retrieval is 1 × 10¹⁶ molec. cm⁻² (absolute) for OMI and 3.5 × 10¹⁵ molec. cm⁻² for OMPS-SNPP from spectral fitting and 45 %-105 % (relative) from the AMF (González Abad et al., 2015; Nowlan et al., 2023). The spectral fitting error is primarily random in individual measurements, while the AMF error has both random and systematic components. The precision can be improved by spatial and temporal averaging (De Smedt et al., 2008; Zhu et al., 2016; Boeke et al., 2011). Our analyses in this work are based on monthly data; thus, the absolute uncertainty in the HCHO column is reduced to < 1 × 10¹⁵ molec. cm⁻² (De Smedt et al., 2018).

We utilize high-resolution SIF estimates derived from OCO-2 and MODIS (https://doi.org/10.3334/ORNLDAAC/1863, Yu et al., 2021). These data sets provide globally contiguous daily SIF estimates at a spatial resolution of approximately 0.05° × 0.05° (around 5 km at the Equator) and a temporal resolution of 16 d for the period of September 2014 to July 2020. The data set was estimated using an artificial neural network (ANN) trained on the native OCO-2 SIF observations and MODIS BRDF-corrected seven-band surface reflectance along orbits of OCO-2. The ANN model was subsequently used to predict daily mean SIF (mW m⁻² nm⁻¹ sr⁻¹) in the gap regions based on MODIS reflectance and land cover. In our study, the OCO-2 SIF estimates are monthly averaged and regridded to 0.1° × 0.1° spatial resolution for the comparison with OMI HCHO VCD and are regridded to 2° × 2.5° spatial resolution when comparing with GEOS-Chem results.

Global GEOS-Chem simulations
(https://github.com/geoschem/geos-chem/issues/906, last access: 10 August 2022). The simulations encompass 15 summers (1 May to 31 August) from 2005 to 2019 at a horizontal resolution of 2° × 2.5° and 72 vertical layers from the surface to 0.01 hPa. For all model runs, we use a standard restart file from the GEOS-Chem 1-year benchmark simulation, followed by an additional spinup period of several days to allow adequate representation of HCHO production and loss in the model.
Biomass burning emissions in our simulation are derived from the Global Fire Emission Database (GFED4.1s) inventory (van der Werf et al., 2017; Randerson et al., 2017). A year-specific GFED4.1s inventory is used for each year of the simulation so that the interannual variability in wildfire emissions is represented accurately. Emissions on a 3 h basis are obtained from MODIS satellite observations, which provide information on fire detection and burning area (Mu et al., 2011; van der Werf et al., 2017). The GFED4.1s inventory reports HCHO emission factors of 1.86 and 2.09 g kg⁻¹ dry matter for boreal-forest and temperate-forest fires, respectively (Akagi et al., 2011).
BVOC emissions in the study are calculated online (emission factor maps computed online) using the Model of Emissions of Gases and Aerosols from Nature (MEGAN, v2.1) (Guenther et al., 2006, 2012) as implemented by Hu et al. (2015). Terrestrial vegetation for BVOC emissions is based on the plant functional type (PFT) distribution derived from the Community Land Model (CLM4) (Lawrence et al., 2011; Oleson et al., 2013). Utilizing online MEGAN simplifies the investigation of the relationship between BVOC emission patterns and PFTs. CLM4 output (Fig. S1 in the Supplement) suggests two major PFTs over northern high latitudes: broadleaf deciduous boreal shrubs (mainly over northern and southern Alaska, northern Canada and northern Siberia) and needleleaf evergreen boreal trees (mainly over interior Alaska, northern Canada, south Siberia and the northern part of eastern Europe), both with high emission factors for isoprene and low emission factors for monoterpenes. The southern part of eastern Europe is dominated by croplands and broadleaf deciduous temperate trees. In this work, "monoterpenes" from model calculation are lumped monoterpenes, including α-pinene, β-pinene, sabinene and carene.
We conducted a model sensitivity test to assess the difference in BVOC emissions and HCHO dVCD_Bio,GC due to online versus offline MEGAN applications. The results of the tests show that the use of online MEGAN has a modest impact on monthly ISOPe and MONOe (25 %-53 % for ISOPe in Alaska, northern Canada and eastern Europe; 53 % for ISOPe in Siberia; 17 %-24 % for MONOe across the four domains) and provides a similar, isoprene-dominated BVOC emission regime over Alaska, central Siberia, northern Canada and eastern Europe in comparison to results from using offline MEGAN. The difference in dVCD_Bio,GC between using online and offline MEGAN is approximately 13 %-26 %, suggesting a minor impact on dVCD_Bio,GC and VCD_GC variability over northern high latitudes when using online or offline MEGAN.
In this study, we use the detailed O₃-NOₓ-HOₓ-VOC chemistry ("tropchem" mechanism) (Park et al., 2004; Mao et al., 2010, 2013), incorporating updates on isoprene chemistry (Fisher et al., 2016). The performance of this version of isoprene chemistry in GEOS-Chem has been extensively evaluated using recent field campaigns and satellite observations over the southeastern US (Fisher et al., 2016; Travis et al., 2016), including HCHO production from isoprene oxidation (Zhu et al., 2016, 2020; Kaiser et al., 2018). GEOS-Chem with this chemistry has been shown to reproduce the vertical profiles of HCHO observed during the Alaskan summer in the ATom-1 in situ campaign (Zhao et al., 2022). Under high-NOₓ conditions (1 ppbv), HCHO production is rapid, reaching 70 %-80 % of its maximum yield within a few hours, whereas, under low-NOₓ conditions (0.1 ppbv or lower), it takes several days to reach the maximum yield, and the cumulative yield is approximately 2-3 times lower than that under high-NOₓ conditions (Marais et al., 2012).
To examine the influence of different sources on HCHO columns at northern high latitudes, we conducted four GEOS-Chem simulations, as described in Table 1, to separate the modeled HCHO total column (VCD_GC) into three parts, namely the background column (VCD_0,GC), the biogenic-emission-induced column (dVCD_Bio,GC) and the wildfire-emission-induced column (dVCD_Fire,GC):

$$\mathrm{VCD_{GC}} = \mathrm{VCD_{0,GC}} + \mathrm{dVCD_{Bio,GC}} + \mathrm{dVCD_{Fire,GC}}. \tag{3}$$

VCD_0,GC is the VCD_GC from the GEOS-Chem simulation in which both biogenic and wildfire emissions are turned off. VCD_0,GC, dVCD_Fire,GC and dVCD_Bio,GC are derived using Eqs. (4a) to (4c). To assess the linearity assumption in Eq. (3), we conducted model sensitivity tests over a 1-month period to evaluate the disparity between VCD_GC and VCD_0,GC + dVCD*_Fire,GC + dVCD*_Bio,GC (derived from Eqs. 4a, d and e). The difference between these two terms is less than 14 % at northern high latitudes, suggesting that nonlinear effects are of minor importance in this area.
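The linearity check can be expressed as a relative residual between the full-simulation column and the sum of the separately derived components. The Python sketch below is a minimal illustration of that comparison only; it does not reproduce the definitions in Eqs. (4a) to (4e), and the function name and the input values are hypothetical.

```python
import numpy as np

def nonlinearity_percent(vcd_total, vcd_background, dvcd_fire, dvcd_bio):
    """Relative difference (%) between the total column and the sum of parts.

    All inputs are arrays (or scalars) of HCHO columns on the same grid.
    """
    total = np.asarray(vcd_total, dtype=float)
    reconstructed = (
        np.asarray(vcd_background, dtype=float)
        + np.asarray(dvcd_fire, dtype=float)
        + np.asarray(dvcd_bio, dtype=float)
    )
    return 100.0 * np.abs(total - reconstructed) / total

# Hypothetical single-grid-cell example (molec. cm^-2, illustrative values).
residual = nonlinearity_percent(5.0e15, 3.0e15, 1.0e15, 0.5e15)
print(f"nonlinear residual: {residual:.1f} %")
```

A residual well below the size of the individual components supports treating the three terms as additive.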
Figure 1a defines the four domains focused on in this work. The selection of the Alaska domain follows Zhao et al. (2022), the selections of the eastern Europe and Siberia domains follow Bauwens et al. (2016), and the selection of the northern Canada domain follows that of the North America domain in Bauwens et al. (2016) but excludes Alaska. To emphasize the key drivers of HCHO interannual variability, we categorize the years from 2005 to 2019 into two distinct groups, "high-HCHO years" and "low-HCHO years", within each of the four specified domains. For each domain, the years that have an above-average May-August sum of regionally averaged monthly OMI VCD_SAT,reprocessed are categorized as high-HCHO years; the years that have a below-average value are categorized as low-HCHO years (shown in Table S1).
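The year classification described above amounts to comparing each year's May-August sum with the multi-year mean of those sums. A minimal Python sketch follows, assuming the regional monthly means are already available as a years-by-months array; the function name and the toy numbers are hypothetical and are not taken from Table S1.

```python
import numpy as np

def classify_years(years, monthly_vcd):
    """Split years into high- and low-HCHO groups for one domain.

    monthly_vcd has shape (n_years, 4): regional mean columns for
    May, June, July and August of each year.
    """
    seasonal_sum = np.asarray(monthly_vcd, dtype=float).sum(axis=1)
    threshold = seasonal_sum.mean()  # multi-year average of the seasonal sums
    high = [y for y, s in zip(years, seasonal_sum) if s > threshold]
    low = [y for y, s in zip(years, seasonal_sum) if s < threshold]
    return high, low

# Hypothetical three-year example in arbitrary column units.
years = [2005, 2006, 2007]
monthly = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 4]]
high_years, low_years = classify_years(years, monthly)
print(high_years, low_years)
```

In this sketch a year whose sum equals the average exactly falls in neither group; how such a tie would be handled is not stated in the text.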
We use the coefficient of variation (CV) to quantify the interannual variability of summertime HCHO VCD. CV is defined as the ratio of the standard deviation to the mean ($\mathrm{CV} = \sigma/\mu$) and is a measure of interannual variability (Giglio et al., 2013). Assuming VCD_0,GC, dVCD_Bio,GC and dVCD_Fire,GC are three independent components of VCD_GC, we have

$$\sigma^2_{\mathrm{VCD_{GC}}} = \sigma^2_{\mathrm{VCD_{0,GC}}} + \sigma^2_{\mathrm{dVCD_{Bio,GC}}} + \sigma^2_{\mathrm{dVCD_{Fire,GC}}};$$

thus, the contribution of each component to the CV of VCD_GC (CVcontribution for dVCD_Bio,GC and for dVCD_Fire,GC) can be calculated from these variance terms.
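As a rough illustration of these quantities, the Python sketch below computes the CV of a time series and then apportions the total variance among independent components as each component's variance divided by the sum of the variances. The variance-share definition is an assumption made here for illustration, since the exact contribution formula is not given in the text above; the use of the population standard deviation, the function names and the toy series are likewise hypothetical.

```python
import numpy as np

def coefficient_of_variation(series):
    """CV = standard deviation / mean (population standard deviation)."""
    x = np.asarray(series, dtype=float)
    return float(x.std() / x.mean())

def variance_shares(components):
    """Fraction of the summed variance carried by each component.

    Assumes the components are independent, so their variances add up
    to the variance of the total column.
    """
    variances = {name: float(np.var(values)) for name, values in components.items()}
    total = sum(variances.values())
    return {name: v / total for name, v in variances.items()}

# Hypothetical four-year series in arbitrary column units.
parts = {
    "background": [3.0, 3.0, 3.0, 3.0],
    "biogenic": [1.0, 1.2, 1.0, 1.2],
    "fire": [0.0, 2.0, 0.0, 2.0],
}
shares = variance_shares(parts)
print(shares)
print(coefficient_of_variation([2.0, 4.0]))
```

With these made-up numbers the fire term carries nearly all of the variance even though its mean is smaller than the background, which is the kind of behavior the CV decomposition is meant to expose.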

Evaluation of the spatial distribution of HCHO VCD and BVOC emissions
Figure 1 shows the July mean HCHO VCD over northern high latitudes during 2012-2019 from reprocessed OMI and OMPS-SNPP retrievals, as well as GEOS-Chem model output. We show that OMI and OMPS-SNPP HCHO VCD have consistent spatial patterns and that their magnitudes agree within 15 % (Fig. 1a, b and d). OMPS-SNPP does show lower values in some regions, perhaps due to several cloud and surface reflectance assumptions made in OMPS-SNPP retrievals or biases that may persist at high latitudes and large solar zenith angles (Nowlan et al., 2023). While GEOS-Chem reproduces well the spatial pattern of HCHO VCD that OMI and OMPS-SNPP capture (Fig. 1c), we find that GEOS-Chem HCHO VCD is lower than that of OMI by 40 %, particularly over wildfire-impacted areas (Fig. 1e). The model-satellite discrepancies in wildfire areas can be due in part to model underestimates of VOC emissions and HCHO production from wildfire plumes (Jin et al., 2023) and in part to the uncertainties in air mass factor calculation for satellite HCHO retrievals in the presence of wildfire smoke (Jung et al., 2019). The model-satellite discrepancies outside wildfire areas could also be due to model underestimates of oxygenated VOCs (OVOCs), biogenic VOC emissions and biases in satellite HCHO retrieval products. For example, Selimovic et al. (2022) found that GEOS-Chem underestimates OVOCs, including HCHO, by a factor of 3-12 at Toolik Field Station in northern Alaska. Stavrakou et al. (2015) show model underestimations of biogenic isoprene emissions and wildfire emissions over eastern Europe and Alaska. Recent studies suggest that TROPOMI HCHO retrievals may have a positive bias under low-HCHO conditions (Vigouroux et al., 2020). OMPS-SNPP HCHO shows a similar positive bias at clean sites but has a closer agreement with FTIR HCHO columns at polluted sites (Nowlan et al., 2023; Kwon et al., 2023).
We use model sensitivity tests to characterize the spatial variability of HCHO VCD_0,GC, dVCD_Bio,GC and dVCD_Fire,GC over northern high latitudes. Figure 2 shows a similar spatial pattern of VCD_0,GC and dVCD_Bio,GC, with a distinctive spatial pattern of dVCD_Fire,GC. The enhancement of VCD_0,GC is mainly shown over eastern Europe, eastern Siberia and central Canada, around 2-4 × 10¹⁵ molec. cm⁻²; dVCD_Fire,GC exhibits increases mainly over Alaska, northern Canada and central Siberia, with values larger than 5 × 10¹⁵ molec. cm⁻² at fire hotspots; the dVCD_Bio,GC spatial pattern corresponds mainly to isoprene emissions over vegetated areas and is enhanced over eastern Europe (2.4 × 10¹⁵ molec. cm⁻²) and eastern Siberia (1.1 × 10¹⁵ molec. cm⁻²). The model suggests that eastern Europe is covered by needleleaf evergreen temperate trees and broadleaf deciduous boreal trees, while eastern Siberia is mainly covered by needleleaf evergreen boreal trees (Fig. S1). We note that dVCD_Bio,GC : ISOPe (isoprene emission flux; unit: 10¹⁶ molec. cm⁻² per 10¹³ atoms C cm⁻² s⁻¹) over northern high latitudes is around 0.24, a factor of 10 lower than VCD_GC : ISOPe over the southeastern US (Millet et al., 2008). This indicates a much lower HCHO production efficiency from isoprene oxidation at northern high latitudes compared to middle latitudes, possibly resulting from the availability of NOₓ and the difference in temperature, photolysis and oxidant levels (Marais et al., 2012; Mao et al., 2013; Li et al., 2016; Wolfe et al., 2016). Our modeled ISOPe is ∼ 1-2 times higher than MONOe (monoterpene emission) in the Alaskan, European, northern Canadian and central Siberian boreal-forest zones, as shown in Fig. 4d and e. Our model shows isoprene surface mixing ratios comparable to the in situ measurements along the Trans-Siberian Railway within Russian boreal forests (generally < 1 ppb in our model and around 0.31-0.48 ppb in the in situ campaign in Timkovskys et al., 2010; both can reach ∼ 4 ppb in eastern Siberia). Our model also shows comparable monoterpene surface mixing ratios over the Alaska North Slope (0.009 ppbv in our model and ∼ 0.014 ppbv in Selimovic et al., 2022). In comparison to Stavrakou et al. (2018), our modeled ISOPe results over eastern Europe, Alaska and northern Canada agree within 20 %, but our modeled MONOe result is around 40 % lower, likely because we are using online MEGAN as well as different PFT maps and canopy models (Guenther et al., 2012).

https://doi.org/10.5194/acp-24-6105-2024, Atmos. Chem. Phys., 24, 6105-6121, 2024
A remarkable feature is the heterogeneity of BVOC emissions at northern high latitudes revealed by measurements. We show in Table 2 that, while isoprene dominates BVOC emission over the Arctic tundra and broadleaf forests, monoterpene becomes the dominant species over coniferous forests. This includes a large portion of the European boreal zone, such as at Hyytiälä in Finland (Rinne et al., 2000; Bäck et al., 2012; Rantala et al., 2015; Zhou et al., 2017; Ciarelli et al., 2024), Bílý Kříž in the Czech Republic (Juráň et al., 2017) and Norunda research station in Sweden (Wang et al., 2017). However, this large-scale heterogeneity is not reproduced by our model. We find from Fig. 2f that modeled BVOC emissions are dominated by isoprene in most parts of the northern high latitudes, except in eastern Siberia and eastern Greenland. As shown in Fig. S1, the isoprene dominance in the region is mainly due to the broadleaf deciduous boreal shrubs and needleleaf evergreen boreal trees that are assumed in the model and the fact that the region exhibits higher emission factors for isoprene than for monoterpenes; in contrast, eastern Siberia is covered predominantly by needleleaf deciduous boreal trees, leading to higher monoterpene than isoprene emissions (Guenther et al., 2012).

Figure caption (fragment): An isoprene-dominated regime in a pixel means that isoprene emission is significantly higher than monoterpene emission (p < 0.05 in a t test) for May-August in 2005-2019.

Examination of the interannual variabilities of HCHO VCD
Figure 3 shows that, in Alaska, northern Canada and Siberia, high-HCHO years are often associated with strong wildfire VOC emissions (R² = 0.78-0.89) and, to a lesser extent, with biogenic VOC emissions (R² = 0.21-0.47). The interannual variability of wildfire VOC emission is further supported by CO emissions from both GFED4 and satellite-based estimation (Yurganov and Rakitin, 2022). The high correlation between OMI HCHO VCD and GFED wildfire VOC emissions in Alaska, Siberia and northern Canada indicates a strong wildfire impact on the interannual variability of HCHO VCD in these domains. In eastern Europe, high-HCHO years are associated with large biogenic emissions (with wildfire VOC emissions: R² = 0.51; with biogenic emissions: R² = 0.72), indicating the important role of biogenic emissions in the interannual variability of HCHO in eastern Europe.
Figure 4a to c show that wildfire is the main driver of HCHO VCD_GC interannual variability over Siberia, northern Canada and Alaska. In the low-HCHO years of these three domains, the dVCD_Fire,GC contribution is ∼ 2 %-11 % of the HCHO total column, which is less than that of VCD_0,GC and dVCD_Bio,GC; in high-HCHO years, the dVCD_Fire,GC contribution to the total column rises to ∼ 20 %-34 %. This is consistent with Fig. 3 in that HCHO VCD interannual variability has significantly higher correlations with wildfire emissions (R² = 0.78-0.89) than with biogenic emissions (R² = 0.21-0.47) over Siberia, northern Canada and Alaska. These findings highlight the role of wildfire in driving HCHO interannual variability in the three domains.
In eastern Europe, biogenic emissions and background HCHO account for the majority of HCHO VCD interannual variability, largely due to the relatively higher surface temperature, stronger photolysis, higher oxidant levels and higher availability of NOₓ than in the other three domains. On a regional scale, BVOC emissions and methane oxidation by hydroxyl radicals (OH) both depend on temperature (Guenther et al., 2012; Holmes et al., 2013). In Fig. 4d, the surface temperature in eastern Europe is higher than that in Alaska, northern Canada and Siberia by 5-7 K, leading to an increase in BVOC emissions and in VCD_0,GC through methane oxidation. HCHO VCD is further enhanced by the higher NOₓ level in eastern Europe (0.4-1 ppbv) than in the other three domains (0.1-0.5 ppbv), as the HCHO yield from isoprene photooxidation increases with NOₓ level. The high NOₓ level in eastern Europe results from its large urban areas and high anthropogenic emissions. The large contribution of BVOC to HCHO VCD is consistent with Fig. 5, which shows that the CV of dVCD_Bio,GC + VCD_0,GC accounts for > 90 % of the CV of VCD_GC in eastern Europe. Similarly, Fig. 3d shows that biogenic emission has a higher correlation (R² = 0.72) with VCD_GC than wildfire emission does (R² = 0.51). These results suggest that biogenic emissions and background are the main contributors to HCHO interannual variability in eastern Europe.

Figure caption (fragment): The R² between reprocessed OMI HCHO VCD and biogenic VOC emissions (green) and wildfire VOC emissions (black) is shown at the top right of each panel.
We further examine the contribution from background, biogenic and pyrogenic emissions to the interannual variability of HCHO VCD_GC over each region. We find from model results that biogenic emissions and background signals contribute 90 % of the interannual variability of HCHO VCD_GC in eastern Europe, while wildfire accounts for over 90 % of CV in Alaska, Siberia and northern Canada, which is consistent with previous work (Stavrakou et al., 2018; Zhao et al., 2022). We use the Mann-Kendall test, a non-parametric statistical test used to detect trends in time series data, to test the significance of the trend of the monthly HCHO VCD_GC time series over a specific domain (Gilbert, 1987). We found no significant trend of HCHO VCD_GC over eastern Europe, northern Canada and Alaska from either satellites or the model. On the other hand, we find that the trend of HCHO VCD_GC over Siberia is significant (p < 0.05) and increasing (1.7 % per year). VCD_0,GC and dVCD_Bio,GC show no significant trend, while the trend of dVCD_Fire,GC is significant and increasing in Siberia (12 % per year), suggesting that wildfires are responsible for the VCD_GC trends in Siberia. In contrast to Bauwens et al. (2016), we find that the HCHO VCD_GC trend over Siberia is largely driven by the increasing wildfires in recent years and, to a lesser extent, by biogenic VOC emissions, highlighting the important role of wildfires in HCHO VCD interannual variability.
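For readers unfamiliar with the Mann-Kendall test, the Python sketch below shows its basic form: the S statistic counts concordant minus discordant pairs, and a normal approximation with continuity correction gives a two-sided p value. This is a simplified illustration rather than the analysis code used here; it omits the variance correction for tied values and any treatment of seasonality or autocorrelation, and the function name and the test series are hypothetical.

```python
import math

def mann_kendall(series):
    """Return (S, Z, two-sided p) for a basic Mann-Kendall trend test.

    Simplification: the variance of S assumes no tied values.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)  # sign of each later-minus-earlier pair
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return s, z, p

# Hypothetical steadily increasing series of ten values.
s_stat, z_stat, p_value = mann_kendall(list(range(1, 11)))
print(s_stat, round(z_stat, 2), p_value < 0.05)
```

A significant result only indicates a monotonic tendency; the magnitude of the trend (for example, in % per year) has to be estimated separately.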

SIF evaluation of dVCD Bio,GC interannual variability
In Fig. 6a to d, we find a good linear relationship (R = 0.6-0.7) between OCO-2 monthly SIF and dVCD_Bio,GC in Alaska, Siberia, northern Canada and eastern Europe. Foster et al. (2014) show a highly linear correlation between seasonal variations of the satellite HCHO column (fire free) and GPP at northern high latitudes. This is consistent with our findings over most continental areas at northern high latitudes (Fig. S2) since SIF is a widely used proxy of GPP (Frankenberg et al., 2011). In Fig. 6g to j, SIF and ISOPe show a linear relationship when SIF is within 0-0.25 W m⁻² µm⁻¹ sr⁻¹ but tend to decouple when SIF > 0.25 W m⁻² µm⁻¹ sr⁻¹, possibly due to the different temperature optimums of isoprene emission and photosynthesis (Harrison et al., 2013; Zheng et al., 2015). Despite the difference in the distribution of vegetation types, the dVCD_Bio,GC-SIF slope is homogeneous over Siberia, northern Canada and eastern Europe (slope = 0.28-0.45; unit: 10¹⁶ molec. cm⁻² per W m⁻² µm⁻¹ sr⁻¹), suggesting SIF as a tool to understand biogenic HCHO variability in these regions. The dVCD_Bio,GC-SIF slope in Alaska is 3-5 times lower than in the other three domains, which warrants further investigation. In contrast to high latitudes, we find that both the ISOPe : SIF slope and the dVCD_Bio,GC : SIF slope are a factor of 2-10 higher in the southeastern US and the Amazon (Fig. 6e-f, k-l) than at northern high latitudes, indicating that the dVCD_Bio,GC-SIF slope over northern high latitudes and lower latitudes could be very different.
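The slope and correlation quoted above are the outputs of an ordinary least-squares fit of the biogenic column against SIF. A minimal Python sketch of such a fit is given below, assuming monthly regional means are paired one to one; the function name and the synthetic data are hypothetical and do not reproduce the values in Fig. 6.

```python
import numpy as np

def linear_fit(sif, dvcd_bio):
    """Least-squares slope, intercept and Pearson R of dVCD_Bio against SIF."""
    x = np.asarray(sif, dtype=float)
    y = np.asarray(dvcd_bio, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)  # first-degree polynomial fit
    r = np.corrcoef(x, y)[0, 1]
    return float(slope), float(intercept), float(r)

# Synthetic, perfectly linear example (columns in 1e16 molec. cm^-2,
# SIF in W m^-2 um^-1 sr^-1); real data would scatter around the line.
sif = [0.0, 0.1, 0.2, 0.3]
dvcd = [0.05, 0.09, 0.13, 0.17]
slope, intercept, r = linear_fit(sif, dvcd)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R = {r:.2f}")
```

Comparing the fitted slope between domains, as done in the text, is only meaningful when the same units and averaging are used for each domain.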
SIF offers an independent evaluation of the interannual variability of HCHO dVCD_Bio,GC. As SIF shows a linear relationship with dVCD_Bio,GC at northern high latitudes (Fig. 6a to d), it is reasonable to infer from Fig. 4 that the low interannual variability shown in SIF (CV = 1 %-9 %) is expected for dVCD_Bio,GC (CV = 1 %-2 %) in Alaska, Siberia and northern Canada. In contrast, we find that dVCD_Fire,GC has a much weaker correlation with SIF (Fig. S2c) and shows a higher interannual variability (CV = 8 %-13 %). As wildfire emission is highly correlated (R² = 0.78-0.89) with OMI HCHO VCD over northern Canada, Siberia and Alaska (Fig. 3), the high interannual variability of OMI HCHO VCD (CV = 10 %-16 %) in these domains is likely driven by wildfires instead of biogenic emissions.

Conclusions and discussions
We use reprocessed new retrievals of HCHO from OMI and OMPS-SNPP to evaluate the interannual variability of HCHO VCD from GEOS-Chem over northern high latitudes in 2005-2019 summers. The reprocessed OMI and OMPS-SNPP HCHO VCDs show a high consistency in the spatial pattern and interannual variability. GEOS-Chem reproduces the interannual variability of HCHO VCD, but the magnitude is biased low in comparison to satellite retrievals.
Our modeled HCHO VCD can be biased low due to large underestimations of HCHO production and emission factors in wildfire smoke. Previous in situ campaigns show underestimated emission factors of VOCs in the GFED4.1s emission inventory for temperate forests in the western US (Liu et al., 2017; Permar et al., 2021), while the bias in VOC emission factors in boreal-forest wildfires remains unclear. HCHO underestimation can also be due to the missing HCHO secondary production in wildfire-impacted conditions (Liao et al., 2021; Jin et al., 2023). GEOS-Chem is found to underestimate oxygenated VOCs by a factor of 3 to 12 in some Arctic regions, which could contribute to the bias in modeled HCHO at northern high latitudes (Selimovic et al., 2022). More measurements in the Arctic region are needed to reconcile the model-observation discrepancies.
Wildfire accounts for the majority of the HCHO interannual variability in Alaska, northern Canada and Siberia. Compared to biogenic emissions and background HCHO, wildfire emission shows a better correlation with HCHO VCD despite the fact that biogenic and background HCHO can dominate HCHO VCD in the low-HCHO years of these three regions. We also find an increasing trend (p < 0.05) in wildfire emission and HCHO VCD over northern Canada and Siberia. With rapid Arctic warming, wildfire frequency and intensity have risen rapidly in recent decades and will continue to do so in the near future (Descals et al., 2022). We expect that wildfire will continue to dominate HCHO interannual variability in the three regions.
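The trend test behind the p < 0.05 threshold is not stated in this section. As one possible way to check a 15-year series for a monotonic trend, the following sketch applies a Mann-Kendall test with a normal approximation. The choice of test, the function name and the input values are assumptions made for illustration, and the sketch does not correct for tied values.

```python
import math


def mann_kendall(series):
    """Return (S, two-sided p-value) for a monotonic trend, assuming no tied values."""
    n = len(series)
    s = 0
    # S counts concordant minus discordant pairs over all i < j.
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis of no trend (no-ties formula).
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected standard normal statistic.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return s, p_value


# Hypothetical summer wildfire emissions for 15 years (illustrative only).
emissions = [1.0, 1.4, 1.2, 1.9, 1.7, 2.3, 2.0, 2.8, 2.6, 3.1, 2.9, 3.6, 3.3, 4.0, 3.8]
s_stat, p_value = mann_kendall(emissions)
print(f"S = {s_stat}, significant at 0.05: {p_value < 0.05}")
```

A rank-based test is used here because it makes no assumption that the trend is linear; a linear regression slope with a t test would be an equally plausible choice.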
Eastern Europe is the only one of the four studied regions where HCHO interannual variability is dominated by biogenic emission and background HCHO. This is due to a combination of lower wildfire activities, higher surface temperatures and anthropogenic NOx emissions in this region. No significant trends in terms of biogenic emission, biogenic-related HCHO and background HCHO are found in the four regions during the summertime of 2005-2019. However, model estimates of HCHO from biogenic emissions are largely uncertain as the model-calculated VOC speciation is at odds with field measurements (Fig. 2f and Table 2). Previous work shows good performance by the model in capturing the long-term variability of biogenic emissions in response to climate variables (Stavrakou et al., 2018), but the model underestimates biogenic and fire emissions over northern high latitudes, especially over eastern Europe and Alaska (Stavrakou et al., 2015). Future research is warranted to examine the HCHO signal from biogenic emissions in this region.
The OCO-2 satellite SIF provides an additional constraint on the interannual variability of biogenic emissions and is independent of wildfire emissions. As a proxy of vegetation photosynthesis and GPP, SIF is expected to have a good correlation with isoprene emission and HCHO VCD in the northern boreal regions, though this correlation can be worse at middle latitudes and tropical regions (Foster et al.,

https://doi.org/10.5194/acp-24-6105-2024 Atmos. Chem. Phys., 24, 6105-6121, 2024

Figure 2. Panels (a) to (e) show GEOS-Chem HCHO VCD 0,GC, dVCD Bio,GC, dVCD Fire,GC and isoprene and monoterpene emission fluxes over northern high latitudes, averaged for July from 2005 to 2019. (f) BVOC emission regimes over northern high latitudes in GEOS-Chem simulation for 2005-2019 summers and from in situ measurements (references listed in Table 2). Isoprene-dominated regime in a pixel means isoprene emission is significantly higher than monoterpene emission (p < 0.05 in t test) for May-August in 2005-2019.

Figure 3. Time series of HCHO VCD and biogenic and wildfire emissions over (a) Alaska, (b) Siberia, (c) northern Canada and (d) eastern Europe for 1 May-31 August in 2005-2019. The blue lines are monthly HCHO VCD from reprocessed OMI, cyan lines are from reprocessed OMPS-SNPP, and gray lines are from GEOS-Chem. Red and black bars are area-normalized wildfire CO and VOC emissions during the summer of each year; green bars are area-normalized biogenic VOC emissions. Wildfire emissions are calculated from the GFED4.1s inventory; biogenic VOC emissions are calculated by the MEGAN2.1 model. Pink shading indicates high-HCHO-VCD years (for a definition, see Sect. 2.2 and Table S1). The R² between reprocessed OMI HCHO VCD and biogenic VOC emissions (green) and wildfire VOC emissions (black) is shown at top right of each panel.

Figure 4. Interannual variability of monthly HCHO VCD 0,GC, dVCD Bio,GC and dVCD Fire,GC, as well as near-surface temperature, over (a) Alaska, (b) Siberia, (c) northern Canada and (d) eastern Europe in 2005-2019 summers. For each year, only the monthly values in May, June, July and August are shown. The indigo, green and red shades are background HCHO VCD 0,GC, dVCD Bio,GC and dVCD Fire,GC based on GEOS-Chem sensitivity tests (Table 1). The orange curves are monthly surface temperature from the MERRA-2 data set. The black curves are OCO-2 monthly SIF.

Table 1. Configuration of GEOS-Chem global simulations in this study.

Table 2. In situ measurements of BVOC in Fig. 2f. An asterisk (*) marks the most abundant BVOC by mixing ratio; entries without an asterisk are the most emitted BVOC.
