Estimating European volatile organic compound emissions using satellite observations of formaldehyde from the Ozone Monitoring Instrument

Abstract. Emissions of non-methane Volatile Organic Compounds (VOCs) to the atmosphere stem from biogenic and human activities, and their estimation is difficult because of the many and not fully understood processes involved. In order to narrow down the uncertainty related to VOC emissions, which negatively reflects on our ability to simulate the atmospheric composition, we exploit satellite observations of formaldehyde (HCHO), a ubiquitous oxidation product of most VOCs, focusing on Europe. HCHO column observations from the Ozone Monitoring Instrument (OMI) reveal a marked seasonal cycle with a summer maximum and winter minimum. In summer, the oxidation of methane and other long-lived VOCs supplies a slowly varying background HCHO column, while HCHO variability is dominated by the most reactive VOCs, primarily biogenic isoprene followed in importance by biogenic terpenes and anthropogenic VOCs. The chemistry-transport model CHIMERE qualitatively reproduces the temporal and spatial features of the observed HCHO column, but displays regional biases which are attributed mainly to incorrect biogenic VOC emissions, calculated with the Model of Emissions of Gases and Aerosol from Nature (MEGAN) algorithm. These "bottom-up" or a-priori emissions are corrected through a Bayesian inversion of the OMI HCHO observations. Resulting "top-down" or a-posteriori isoprene emissions are lower than "bottom-up" by 40% over the Balkans and by 20% over Southern Germany, and higher by 20% over the Iberian Peninsula, Greece and Italy. We conclude that OMI satellite observations of HCHO can provide a quantitative "top-down" constraint on the European "bottom-up" VOC inventories.


Introduction
Non-methane volatile organic compounds (VOCs) contribute to the oxidizing capacity and the optical properties of the atmosphere, through the formation of ozone and secondary particulate matter (Finlayson-Pitts and Pitts, 1997). They also play a role in feedbacks inside the climate system related to the carbon cycle (Kulmala et al., 2004) and to land-use management (Purves et al., 2004; Lathière et al., 2006).
Global emissions of anthropogenic VOCs (AVOCs), estimated to be ∼180 Tg C/y (EDGAR3.2), are small compared to emissions of VOCs from biogenic activity (BVOCs), which account for ∼1150 Tg C/y (Guenther et al., 1995). Isoprene is the most abundantly emitted BVOC with ∼500 Tg C/y (Arneth et al., 2008), followed by oxygenated VOCs (OVOCs) and monoterpenes (Guenther et al., 1995). Once in the atmosphere, VOCs may deposit or undergo chemical degradation, normally initiated by reaction with OH, O3 or NO3 (Atkinson, 2000), which leads to the formation of other VOCs (e.g. aldehydes and ketones) and finally CO2 and/or secondary organic aerosol (Goldstein and Galbally, 2007). Because many BVOCs are extremely reactive (Fuentes et al., 2000; Atkinson and Arey, 2003), they can contribute significantly to episodes of elevated surface-level ozone in NOx-rich conditions (e.g. Pierce et al., 1998). Formaldehyde (HCHO) is a common intermediate product of the oxidation of most VOCs. Its concentration in the remote atmosphere is determined by the oxidation of methane (CH4), and it can be significantly higher in the continental boundary layer owing to the oxidation of non-methane hydrocarbons (Wiedinmyer et al., 2005; Possanzini et al., 2002; Lee et al., 1998). HCHO also has a direct source from incomplete combustion, e.g., biomass burning (Andreae and Merlet, 2001). The main sinks of HCHO are photolysis and reaction with OH, resulting in a lifetime of a few hours during summertime conditions. Previous work showed that HCHO concentrations measured from satellites can be used to estimate emissions of VOCs (Palmer et al., 2003). The efficacy of this approach in determining the emission of a particular VOC depends on two factors: (1) the parent VOC having a significant HCHO yield and (2) the parent VOC having a sufficiently short lifetime, such that there exists a local relationship between the emission of the VOC and the observed HCHO column.
Early work illustrated this approach for isoprene by using column observations of HCHO from the Global Ozone Monitoring Experiment (GOME), in combination with the GEOS-Chem model, over North America during summertime, when isoprene explained most of the observed variability of the column (Palmer et al., 2003). They showed that GOME-derived isoprene emissions are well correlated with in situ flux measurements over a Michigan forest, and they found a bias of −30%, which is within the estimated uncertainty of satellite-derived emissions. More recent work has applied the general methodology to (1) East Asia (Fu et al., 2007), where AVOCs and fires complicate interpretation of HCHO columns; (2) South America, where low-NOx conditions and fires prevail (Barkley et al., 2008); and (3) North America using higher spatial and temporal resolution data from the Ozone Monitoring Instrument (OMI) (Millet et al., 2008). OMI HCHO observations have subsequently also been used to investigate the relationship of isoprene emission with surface temperature over the South Eastern United States (Duncan et al., 2009). Other work chose to interpret these data on a global scale using a Bayesian approach (Shim et al., 2005; Stavrakou et al., 2009). To our knowledge, there has been only one study focused on European VOC emissions using SCIAMACHY HCHO columns. HCHO columns over Europe are typically much lower than over the other regions mentioned, and satellite HCHO measurements are close to the detection limit. However, Dufour et al. (2009) showed that monthly averaging of SCIAMACHY data decreases the observational error to the degree that the data may reduce the a-priori uncertainty on isoprene emissions.
Unlike at the global scale, annual European AVOC emissions (estimated to be ∼19 Tg/y, Simpson et al., 1999) are comparable to BVOC emissions (estimated to be ∼13 Tg/y, Simpson et al., 1999; Steinbrecher et al., 2009; Karl et al., 2009). BVOC emissions generally have a more pronounced seasonal cycle, peaking in the hotter summer months, and therefore still have the potential to play a considerable role in O3 and SOA chemistry. Recent multi-year assessments (Karl et al., 2009) reported that 30-40% of European BVOC emissions are concentrated in July, almost equally shared among isoprene, terpenes and OVOCs. Emissions during June and August each represent 25-30% of the annual emissions. Isoprene and monoterpene emissions are dominated by a relatively small number of forest species with the largest coverage (Keenan et al., 2009), while OVOCs also have important contributions from crops (Karl et al., 2009). During the European growing season (April-September) BVOCs are estimated to contribute about 2.5 ppbv to the average surface ozone maximum over continental Europe, with peaks of 15 ppbv and 5 ppbv over Portugal and the Mediterranean basin, respectively. During severe pollution episodes, BVOC emissions can contribute 30-75% to ozone production (Duane et al., 2002; Solmon et al., 2004). In contrast, BVOCs lead to a net ozone loss through the year in the Northern European boundary layer. The uncertainty related to modelling European BVOC emissions at the regional scale (<100 km) is estimated to be a factor of 2-3 for isoprene and a factor of at least 5 for monoterpenes (Simpson et al., 1999; Steinbrecher et al., 2009), with plant emission potentials being the single most important factor of uncertainty (Arneth et al., 2008). These emission uncertainties correspond to a BVOC-derived ozone uncertainty of about 50%.
Considering the sparsity of emission flux measurements that can be used both to develop and validate biogenic emission models, the availability of satellite data as a potential additional source of information is certainly worth exploring, because of their global coverage with a homogeneous characterization of measurement error.
Here, we use satellite observations of the formaldehyde column from the OMI instrument aboard NASA Aura to constrain VOC emissions over Europe. In Sect. 2, we briefly describe the CHIMERE chemistry-transport model, which serves as a tool for quantifying the atmospheric budget of HCHO and as the forward model for the interpretation of the satellite data. In Sect. 3, we interpret the monthly mean HCHO column distributions observed by OMI in relation to the emissions of HCHO precursors. In Sect. 4, we describe the inversion method, present our results, and discuss the robustness of our VOC emission estimates. We conclude the paper in Sect. 5.

CHIMERE chemistry-transport model
We use the CHIMERE chemistry-transport model (version 200709C, http://www.lmd.polytechnique.fr/chimere/) to help interpret observed HCHO columns and to act as the forward model in the inversion by providing the relationships between VOC emissions and the HCHO columns. The model simulates the gas and aerosol phases (Bessagnet et al., 2008), but here we focus on the gas phase. The model is set up on a 0.5° × 0.5° horizontal grid covering Europe (35°-58° N; 15° W-25° E) with 20 hybrid-sigma vertical layers extending to 200 hPa, in order to simulate the tropospheric abundance of chemical species. The model has been applied to simulate and analyze pollution episodes (Drobinski et al., 2007; Hodzic et al., 2006; Vautard et al., 2005), for long-term O3 trend analysis, for diagnostics or inverse modelling of emissions (Deguillaume et al., 2007; Konovalov et al., 2007) and for operational forecast of pollutant levels over Western Europe (Rouil et al., 2009, http://www.prevair.org).
Meteorological input is provided by the PSU/NCAR MM5 model (Dudhia, 1993), run at 45 × 45 km² horizontal resolution with 32 vertical sigma layers extending up to 100 hPa, and regridded onto the 0.5° × 0.5° CHIMERE grid. The model is forced by ECMWF analyses using the grid nudging (grid FDDA) option implemented in MM5.
Anthropogenic emissions are derived from the Cooperative Programme for Monitoring and Evaluation of the Long-range Transmission of Air pollutants in Europe (EMEP) annual totals (Vestreng, 2003), scaled to hourly emissions by applying temporal profiles provided by the University of Stuttgart (IER) (Friedrich, 1997), as described in Schmidt et al. (2001). VOC emissions are aggregated into 11 model classes following the mass and reactivity weighting procedure proposed by Middleton et al. (1990).
Biogenic emissions of isoprene and monoterpenes are calculated with the MEGAN model (Guenther et al., 2006, v. 2.04) and implemented in CHIMERE as described in Bessagnet et al. (2008). The old biogenic emission module, derived from the European inventory proposed by Simpson et al. (1999) and implemented in CHIMERE as described by Derognat et al. (2003), is retained here to estimate uncertainty on BVOC emissions (Sect. 4).
Chemical boundary conditions for long-lived species are provided by a monthly mean global climatology from the LMDz-INCA model (Hauglustaine et al., 2004).
The gas-phase chemical mechanism MELCHIOR (Lattuati, 1997) includes about 80 species and more than 300 reactions. Isoprene oxidation is derived from the work of Paulson and Seinfeld (1992). α-pinene is chosen as a representative for terpenes and its oxidation pathway is based on that included in the RACM mechanism (Stockwell et al., 1997). We are thus assuming that all monoterpenes have a HCHO yield equivalent to that of α-pinene. According to the review of HCHO yields from the oxidation of monoterpenes by Atkinson and Arey (2003), α-pinene is at the lower end of the yield range among monoterpenes that form an appreciable amount of HCHO. Other monoterpenes do not have a reported HCHO yield. The effect of monoterpene oxidation may thus be underestimated in our present work but, as shown in the following, it generally contributes very little to the HCHO column over Europe. The model does not contain the latest findings on isoprene chemistry in low-NOx conditions, such as enhanced HOx recycling (Lelieveld et al., 2008; Hofzumahaus et al., 2009) and epoxide formation (Paulot et al., 2009). Those mechanisms have been proposed for "pristine" atmospheres, such as Amazonia, and are expected to have negligible impact on the polluted European atmosphere (Lelieveld et al., 2008; Paulot et al., 2009). The OH-recycling mechanism hypothesized for polluted atmospheres by Hofzumahaus et al. (2009) is still very uncertain and would require a comprehensive database of measurements (Pugh et al., 2010) in order to be constrained for the European case. Dufour et al. (2009) presented a detailed comparison of HCHO yields from biogenic isoprene and α-pinene and from anthropogenic VOCs simulated with MELCHIOR against results from complete explicit mechanisms such as the Master Chemical Mechanism (MCM) (Jenkin et al., 2003) and the Self-Generated Master Mechanism (SGMM) (Aumont et al., 2005).
The MELCHIOR HCHO yield from isoprene oxidation is about 10% lower than in the MCM and SGMM under high-NOx conditions and within 8% under low-NOx conditions. The MELCHIOR HCHO yield from α-pinene oxidation is within 20% of the MCM and SGMM under both high- and low-NOx conditions. The MELCHIOR HCHO yield from the degradation of anthropogenic VOCs is consistent with the MCM and SGMM within 20% and 50% under high- and low-NOx conditions, respectively, generally displaying an overestimation of HCHO yields from the more reactive VOCs. Figure 1 shows the comparison of observed and modelled surface concentrations of HCHO and isoprene at EMEP stations with available measurements for the period under analysis (May-September 2005). Surface concentrations of formaldehyde and other VOCs are measured twice a week at a few European sites by EMEP (Solberg et al., 2001; Solberg, 2008, and references therein), with an 8-hour sampling time centred at noon. HCHO measurements show a winter minimum (monthly mean concentration 0.3-2 ppbv) and a summer maximum (0.9-4 ppbv), and HCHO concentrations generally increase with decreasing latitude. The model generally shows a very small bias (<0.10 ppbv, or ∼10%) and good correlation (0.65-0.85) with respect to EMEP surface measurements, with the exception of the Campisabalos site in Central Spain, where the model shows a positive bias (0.58 ppbv, or ∼40%) and poor correlation (0.21). Modelled isoprene, one of the main HCHO precursors (see Sect. 1 and Sect. 3.3), is also in good agreement with EMEP measurements, with the exception of a few episodes of enhanced isoprene in France. A previously reported comparison of CHIMERE simulations and EMEP HCHO measurements for the summer of 2003 shows similar results for French and Czech stations.

HCHO columns observed by the OMI satellite instrument
The Ozone Monitoring Instrument (OMI) (Levelt et al., 2006) is one of the instruments aboard the NASA EOS-Aura spacecraft (http://aura.gsfc.nasa.gov/), launched in July 2004 on a near-polar sun-synchronous orbit. In the daytime ascending direction, Aura crosses the equator around 13:45 local time (LT), so that OMI observes Europe between 10:00 and 14:00 UTC. The instrument detects Earth's backscattered radiation with two Charge-Coupled Devices (CCDs) in the UV/Vis spectral range (270-500 nm), scanning the atmosphere in the nadir direction with 60 across-track pixels along a swath of 2600 km that permits near-global coverage in one day. The finest spatial resolution is 13 × 24 km² at nadir, degrading towards the swath edges. We use the OMI 1-Orbit level 2 swath HCHO product (version 003, algorithm version 2.0) publicly released in May 2008 through NASA's DAAC (http://disc.gsfc.nasa.gov/). The HCHO column is retrieved by direct fitting of radiances and irradiances in the 327.5-356.5 nm UV spectral window (OMI-ATBD, 2002) with a procedure based on the algorithm developed for GOME (Chance et al., 2000) and including a new sampling correction. The slant columns resulting from the fitting procedure are converted to vertical columns by dividing by air mass factors (AMFs), which are a function of viewing geometry, surface albedo, atmospheric Rayleigh (air molecules) and Mie (aerosol and clouds) scattering, and HCHO profile (Palmer et al., 2001; Martin et al., 2002). Scene-dependent atmospheric scattering is described by scattering weights (Palmer et al., 2001) calculated with the LIDORT radiative transfer model. The UV surface albedo database is derived from several years of GOME satellite observations (Koelemeijer et al., 2003). We use the cloud information provided by the OMI product (OMI-ATBD, 2002). We use the GEOS-Chem global chemistry-transport model (http://www.geos-chem.org, v7-04-11) to calculate aerosol optical depths.
The same model is used to compute normalized HCHO distributions for the AMF (Palmer et al., 2001) using the simulated HCHO profiles. The uncertainty on a single HCHO slant column observation ranges from 40 to 100%, with the lower end over hot-spots. The uncertainty on the AMF calculated with GEOS-Chem is estimated to be about 30% for cloud fractions less than 0.2 (Millet et al., 2006; Palmer et al., 2006). The total uncertainty of a single HCHO vertical column observation, adding the individual sources of uncertainty in quadrature, ranges from 50 to 105%. OMI has a limit of detection for HCHO of about 8 × 10¹⁵ molecules cm⁻², which is about double the value for the GOME instrument due to a lower signal-to-noise ratio (but with much higher spatial resolution). We exclude data that do not satisfy fit convergence, as well as statistical outliers (columns negative within 2σ uncertainty and column values >1 × 10¹⁹ molecules cm⁻²), as indicated by quality checks. We also exclude scenes where the cloud fraction is >20% or the solar zenith angle is >84° (Millet et al., 2008).
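The quadrature sum quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the slant-column and AMF errors are independent relative errors; the function name is illustrative and not part of any retrieval software.

```python
import math

def total_relative_uncertainty(slant_rel, amf_rel):
    """Combine two independent relative errors in quadrature."""
    return math.sqrt(slant_rel ** 2 + amf_rel ** 2)

# Slant column uncertainty of 40-100%, AMF uncertainty of about 30%.
low = total_relative_uncertainty(0.40, 0.30)
high = total_relative_uncertainty(1.00, 0.30)
print(f"{low:.0%} to {high:.0%}")  # prints "50% to 104%"
```

The upper bound of about 104% agrees with the quoted 105% to within rounding.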
To simplify the comparison between OMI and CHIMERE we average OMI data with daily frequency onto the same regular 0.5° × 0.5° grid used for model simulations (see Sect. 2). Model output is sampled at the same time and location as the OMI overpass. Data availability largely depends on the presence of clouds: the number N of gridded daily observations goes from a minimum of about 10 per month over the British Isles to almost 30 per month over Southern Europe. In the following analysis we will focus only on grid cells over land. Table 1 summarises the distribution of HCHO columns and related uncertainties observed by OMI from May to September 2005 over continental Europe. In July, when average HCHO columns are highest, about half of the data fall below the detection limit of 8 × 10¹⁵ molecules cm⁻² and the standard error averaged over the domain is 8.4 × 10¹⁵ molecules cm⁻². About 33% and 23% of the data are above the detection limit in June-August and May-September, respectively, while average uncertainties are 7.9-9.7 × 10¹⁵ molecules cm⁻².
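The screening and daily gridding step can be sketched as follows. This is only an illustration under stated assumptions: the function name, the argument names and the three example pixels are hypothetical, and a simple unweighted mean of pixel centres per grid cell is assumed (the averaging weights actually used are not specified in the text).

```python
import numpy as np

def screen_and_grid(lat, lon, column, cloud_fraction, sza, lat_edges, lon_edges):
    """Drop cloudy / high-SZA pixels, then average the rest per grid cell."""
    keep = (cloud_fraction <= 0.2) & (sza <= 84.0)
    lat, lon, column = lat[keep], lon[keep], column[keep]
    bins = [lat_edges, lon_edges]
    total, _, _ = np.histogram2d(lat, lon, bins=bins, weights=column)
    count, _, _ = np.histogram2d(lat, lon, bins=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        mean = np.where(count > 0, total / count, np.nan)  # NaN where no data
    return mean, count

# 0.5 x 0.5 degree grid over the model domain (35-58 N, 15 W-25 E).
lat_edges = np.arange(35.0, 58.5, 0.5)
lon_edges = np.arange(-15.0, 25.5, 0.5)

# Three made-up pixels; the third is rejected by the cloud screen.
lat = np.array([40.1, 40.3, 50.7])
lon = np.array([10.2, 10.4, 2.1])
column = np.array([6e15, 10e15, 4e15])        # molecules cm^-2
cloud_fraction = np.array([0.05, 0.10, 0.60])
sza = np.array([30.0, 30.0, 40.0])

mean, count = screen_and_grid(lat, lon, column, cloud_fraction, sza,
                              lat_edges, lon_edges)
print(mean[10, 50], count[10, 50])
```

Repeating this for each day and counting the non-empty days per cell would give the monthly number N of gridded daily observations.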
In other months observed columns are mostly below the detection limit (not shown). We tested the sensitivity to the 20% cloud fraction threshold for OMI scenes by repeating the calculations of Table 1 with a threshold of 40%. Results are presented in the online Supplement in Table S1. OMI HCHO columns differ by 0.1-0.2 × 10¹⁵ molecules cm⁻² on average, at the expense of an increased error of 0.4-0.9 × 10¹⁵ molecules cm⁻². Figure 2 shows monthly mean spatial distributions of OMI HCHO observed in 2005. We find that OMI clearly shows a seasonal cycle, peaking in summer, which is qualitatively in phase with the main growing season (April to September). Generally, HCHO columns are below 8 × 10¹⁵ molecules cm⁻² in colder months (October to April, not shown). In May two slightly enhanced features appear above the industrialized Po Valley (Northern Italy) and Benelux; both these features are associated with columns of 8-10 × 10¹⁵ molecules cm⁻².
From October, observed columns return to low winter values.
HCHO columns observed from GOME show a similar seasonal cycle over Europe, with vertical columns going from 3-4 × 10¹⁵ molecules cm⁻² in winter to 8-10 × 10¹⁵ molecules cm⁻² in summer (Wittrock, 2006; De Smedt et al., 2008). A clear seasonal cycle is not observed by SCIAMACHY, but this is probably due to a too low signal-to-noise ratio of measurements over Europe (De Smedt et al., 2008).
Airborne profile measurements of formaldehyde over Europe were collected during the "Mediterranean Intensive Oxidant Study" (MINOS) aircraft campaign (Kormann et al., 2003) over the South Eastern Mediterranean near Crete in August 2001, and during the "Upper Tropospheric Ozone: processes Involving HOx and NOx" (UTOPIHAN II) aircraft campaign (Stickler et al., 2006) over Central Europe in July 2003. Observed vertical HCHO profiles from these campaigns are generally "C-shaped", with HCHO mixing ratios of approximately 1.5 ppbv in the boundary layer, decreasing rapidly to approximately 0.3 ppbv in the free troposphere (4-8 km altitude) and then slightly increasing again in the upper troposphere. Elevated HCHO concentrations in the free and upper troposphere were attributed to the influence of long-range transport from North America and South Asia (Kormann et al., 2003), or to air masses from the continental boundary layer recently lofted by large-scale convection (Stickler et al., 2006). The column calculated from the mean HCHO profile observed during MINOS (Table 1 in Kormann et al., 2003) is of the order of 10 × 10¹⁵ molecules cm⁻², consistent with those determined by OMI during summer 2005. Ground-based MAX-DOAS measurements over the Netherlands (Wittrock, 2006) and Northern Italy (Heckel et al., 2005) also show this steep decrease of HCHO from the surface to the free troposphere, with associated columns ranging from 5 to 20 × 10¹⁵ molecules cm⁻².

CHIMERE model columns of HCHO
In our inversion analysis we use our CTM results to help interpret the variability and drivers of the European HCHO column. We evaluated the simulation of HCHO at ground level against measurements at EMEP monitoring stations (Solberg, 2008). Here we compare the simulated HCHO column against the OMI observations described in Sect. 3.1.
The monthly mean picture derived from OMI (Fig. 2) shows an annual cycle in phase with the growing season (April-September) and winter values well below the detection limit (8 × 10¹⁵ molecules cm⁻²). We focus on the period from May to September because a significant fraction of observed HCHO values are above the instrument detection limit (Sect. 3.1) and thus potentially useful to constrain underlying VOC emissions. Figure 2 shows a comparison between the monthly average HCHO column observed by OMI from May to September 2005 and that simulated by the CHIMERE model. Table 1 summarises the mean statistics of the comparison. The model qualitatively reproduces the seasonal cycle with a maximum in July, but it underestimates the amplitude of the HCHO cycle with respect to OMI observations. We found a significant mean spatial correlation of 0.42-0.62 between the OMI and CHIMERE HCHO columns in summer. CHIMERE overestimates OMI HCHO over the Balkans and Southern Germany from May to September, and it underestimates OMI HCHO over Spain, France and Italy in July. The EMEP sites are located in regions where the OMI minus model HCHO difference is relatively small. The model bias with respect to EMEP has the same sign as the model bias with respect to OMI at the same location (Fig. 1).
Over the ocean, model HCHO columns are systematically lower than observations. Observed HCHO concentrations over the South Eastern Mediterranean during the MINOS campaign are a factor of 3 larger than those expected over remote marine environments (Kormann et al., 2003). A definitive reason for this difference has not been clearly identified and deserves further investigation. We speculate that several combined factors, peculiar to the Mediterranean Sea, may play a role: (1) the Mediterranean is a relatively closed sea, warmer and saltier than the nearby Atlantic Ocean, which should affect marine organisms and their related emissions; (2) in summer, the basin continuously receives polluted air masses from the continent, rich in VOCs and NOy; (3) in spring-summer, large amounts of Saharan dust are deposited over the sea, together with their minerals (potential nutrients); (4) there is intense ship traffic that may provide additional NOx to a generally NOx-poor photochemical environment, stimulating photo-oxidation of available VOCs to HCHO and other compounds. These factors may not be properly understood and represented in our emission and chemistry-transport models, leading to the large underestimate of satellite HCHO observations.
Over the Balkans, the model predicts an enhanced HCHO feature not seen by OMI: this is due to overestimated biogenic isoprene emissions, because extended broadleaf forests are present in this region. Recent studies support the hypothesis of too high isoprene emissions in MEGAN over the Balkans for July 2003, most probably because of too high emission factors at standard conditions. Over the Iberian Peninsula, the model underpredicts the OMI column in July and overestimates the column in May and September. The model also has a negative bias over Western France and a positive bias over Southern Germany. In the next section, these regional biases are shown to be most likely due to incorrect prescription of biogenic emission estimates; they illustrate the importance of satellite observations as a potential constraint on emissions.

Drivers of observed variability of HCHO columns
We now investigate factors that control the production and variability of HCHO columns over Europe. In summer months, the HCHO budget is largely controlled by photochemistry (NO + peroxy radicals), while the main loss pathways were identified as reaction with OH and photolysis in rural (Solberg et al., 2001; Borbon et al., 2004), polluted (Duane et al., 2002; Possanzini et al., 2002), and free tropospheric environments (Stickler et al., 2006). Formaldehyde photochemical production terms over Europe were quantified for the summer of 2003 by means of a tagged-tracers version of the CHIMERE model, which keeps track of the HCHO produced from the oxidation of individual VOCs separately. The study found that oxidation of methane and other long-lived VOCs contributes a slowly varying HCHO column background making up 55-85% of the total column, with a higher contribution over the sea. Variability was found to be driven by non-methane VOCs. Isoprene oxidation was estimated to contribute an average of 20% of the HCHO column, with peaks of 50% over strong source regions. The contribution from monoterpenes was found to be 8% on average and up to 20% over source regions. Anthropogenic reactive VOCs were found to make a small average contribution of 11%, but may contribute up to 40% of HCHO features such as columns over the Po Valley. Solberg et al. (2001) calculated that the HCHO production rate is very sensitive to isoprene emissions at 6 EMEP sites, including those considered in this study. Duane et al. (2002) reported that in the Po Valley isoprene contributes 30-60% of HCHO production in summer, while Borbon et al. (2004) estimated a contribution to HCHO production from isoprene of <10% at Donon in France. The Po Valley was studied in two "Formaldehyde as a Tracer of Oxidation in the Troposphere" (FORMAT) campaigns (Liu et al., 2007a, b). During the 2002 campaign held in summer, Liu et al. (2007a) found a significant influence of isoprene on HCHO production, while Liu et al. (2007b) reported a significant influence of anthropogenic VOCs on HCHO during the 2003 campaign held from September to October. This might indicate a switch from biogenic to anthropogenic control on HCHO when going from July-August to September-October.
We further test the control on HCHO column production from isoprene, terpenes and reactive anthropogenic VOCs using five simulations of the CHIMERE model arranged as follows. Our reference calculation includes all emissions (CTRL simulation). A second simulation (BKGD) excludes emissions of BVOCs and reactive AVOCs (RAVOCs), such as ethene (C2H4), propene (C3H6), and xylenes. The remaining three simulations sequentially impose reductions of 30% on isoprene (ISOP30), terpene (TERP30) and RAVOC (RAVOC30) emissions to estimate the sensitivity of HCHO columns to parent VOC emissions. Figure 3 shows the monthly time series of the HCHO column averaged over selected regions of inland Europe. The comparison of OMI and model confirms the model bias discussed previously: the seasonal cycle is qualitatively reproduced by the model, but the amplitude is generally underestimated, and a systematic high bias is found over the Balkans and Southern Germany. Consistent with previous work, CHIMERE predicts that the bulk of the European HCHO column is made up of a background supplied by the oxidation of long-lived hydrocarbons. This background represents more than 90% of the column during colder months (September-May) and about 50% during summer. In summer, isoprene oxidation supplies about 30% of the HCHO content, while terpene and RAVOC oxidation share the rest almost equally.
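One way to turn the 30% perturbation runs into the contributions discussed above is to scale each column change linearly to a full (100%) contribution. The sketch below assumes this linear scaling, which the text does not spell out; the column values are made up, chosen only so that the result resembles the summer split quoted above.

```python
def contributions(ctrl, perturbed, reduction=0.30):
    """Scale the column change from a fractional emission cut to the full
    contribution of each VOC class, assuming a linear response."""
    return {name: (ctrl - col) / reduction for name, col in perturbed.items()}

# Illustrative regional-mean columns in units of 1e15 molecules cm^-2.
ctrl = 10.0                                          # CTRL run
perturbed = {"ISOP": 9.1, "TERP": 9.7, "RAVOC": 9.7}  # ISOP30, TERP30, RAVOC30

contrib = contributions(ctrl, perturbed)
# Background inferred as the residual of the total column.
background = ctrl - sum(contrib.values())

for name, value in contrib.items():
    print(f"{name}: {value:.1f} ({value / ctrl:.0%} of column)")
print(f"BKGD (residual): {background:.1f} ({background / ctrl:.0%} of column)")
```

Comparing this residual background with the column of the BKGD run gives a rough measure of how far the response departs from linearity.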
Our results suggest that HCHO production over Northern Italy is most sensitive to local emissions, because the difference between the observed column and the model background column remains relatively high even in shoulder months. This can be explained by the extremely oxidative environment found there, favoured by vertical dispersion inhibited by thermal inversion above the PBL, and by recirculation of pollutants by mountain breezes (Dosio et al., 2002). Moreover, the model suggests that oxidation of RAVOCs may be an HCHO source almost as important as isoprene during May and September.
The key points that can be drawn to set up an inversion analysis of the OMI HCHO column can be summarized as follows. Oxidation of methane is expected to be the dominant "background" source of formaldehyde. Due to its high reactivity, isoprene is expected to dominate the temporal and spatial variability of the column, with the exception of the industrialized Po Valley and North-Western Europe, where a potentially important contribution from the most reactive anthropogenic VOCs is expected. Biogenic monoterpenes make a minor contribution.

Fig. 3. OMI and model monthly HCHO column (10¹⁵ molecules cm⁻²) over selected continental European regions during May-September 2005. The black line denotes OMI observations, the red line denotes the CHIMERE model; the continuous red line denotes the model with standard (a-priori) emissions, and the dashed red line denotes the model with OMI-corrected (a-posteriori) emissions. Other lines represent contributions to the HCHO column from the oxidation of long-lived hydrocarbons (blue, background, BKGD), isoprene (green, ISOP), terpenes (magenta, TERP), and anthropogenic reactive VOCs (cyan, RAVOC: ethene, propene, and xylenes). The blue dotted line denotes the background HCHO calculated by subtracting the sum of contributions from reactive VOCs from the total model column, and is used to evaluate the uncertainty related to the estimate of the Jacobian matrix (see text).

Constraining European VOC emissions with OMI HCHO columns
Here we apply Bayes' theorem to obtain a Maximum A Posteriori (MAP) solution (Rodgers, 2000) of the inverse problem relating satellite observation of formaldehyde and a state vector of isoprene and terpenes biogenic emissions and anthropogenic reactive VOC emissions (ethene, propene, and xylenes, see Sect. 3.3).

Inversion method
We assume a local (i.e. we neglect transport) and linear relationship between parent VOC emissions and the resulting HCHO columns. Palmer et al. (2003) showed that the assumption is valid in summer at midlatitudes for the high HCHO-yield (∼2.5 per mol) and reactive (lifetime ∼30 min.) isoprene on a length scale of O(100 km). Following our analysis of the HCHO production terms and the calculations of Dufour et al. (2009, Tables 1 and 2), in addition to biogenic isoprene we include in our inverse analysis the reactive anthropogenic VOCs ethene (C2H4), propene (C3H6), and xylenes (C8H10), (1) because of their potential influence on the HCHO column over strong source regions, (2) because they have relatively short lifetimes (a few hours) and high HCHO yields (1.6-1.8 per mol), and (3) because they have a very different spatial and temporal distribution with respect to isoprene emissions. Biogenic monoterpenes are not included, because their sources mostly overlap with those of isoprene (forests) and the inversion problem to estimate their source would be strongly ill-conditioned (Rodgers, 2000). We use the CHIMERE CTM as the forward model to obtain a local, linear relationship between the vector of daily HCHO observations y and the two-element state column vector of monthly emissions (of isoprene and RAVOC) x:

$$ y = K x + b + \varepsilon $$

where ε is the observational error vector and K is the Jacobian matrix, derived from the CTM, which represents the sensitivity of the observation variable y to the state variables x (K = ∂y/∂x). b is the daily HCHO background due to all contributing factors other than local emissions of isoprene and RAVOCs; it is also estimated from the CTM and is treated as a parameter.
We define as a-priori knowledge the state vector $x_a$ and its associated error covariance matrix $S_a$. Assuming Gaussian error distributions, the Maximum A Posteriori (MAP) solution gives an optimal estimate $\hat{x}$ of the state vector:

$$\hat{x} = x_a + G\,(y - K x_a - b) \qquad (1)$$

where G is the gain matrix given by:

$$G = S_a K^T \left(K S_a K^T + S_\varepsilon\right)^{-1} \qquad (2)$$

which represents the sensitivity of the optimal state vector to the observation ($G = \partial\hat{x}/\partial y$). $S_\varepsilon$ is the observational error covariance matrix. The second term on the right-hand side of Eq. (1) represents the correction to the a-priori on the basis of the measurement y, accounting for the relative magnitude of the a-priori and the observational errors. The error covariance matrix $\hat{S}$ of the optimal state vector $\hat{x}$ is given by:

$$\hat{S} = \left(K^T S_\varepsilon^{-1} K + S_a^{-1}\right)^{-1}$$

The a-posteriori error is always equal to or less than the a-priori error, again depending on the relative magnitude of the a-priori and the observational errors.
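As an illustration of the MAP update described above, the following minimal Python sketch solves the problem for one grid cell with a two-element state and diagonal error covariances. It uses the algebraically equivalent form $\hat{x} = x_a + \hat{S} K^T S_\varepsilon^{-1}(y - K x_a - b)$, which only requires a 2×2 inversion. The function name `map_estimate` and all numerical values are hypothetical and chosen for illustration; this is not the code used for the analysis.

```python
def map_estimate(y, K, b, sigma_obs, x_a, sigma_a):
    """MAP estimate for a two-element state (isoprene, RAVOC).

    y, b, sigma_obs: lists with one entry per daily observation.
    K: list of [k_isoprene, k_ravoc] rows, one per observation.
    x_a, sigma_a: a-priori emissions and their 1-sigma errors.
    Diagonal S_eps and S_a are assumed, as in the text.
    Returns (x_hat, S_hat).
    """
    # h accumulates S_a^-1 + K^T S_eps^-1 K; g accumulates K^T S_eps^-1 r
    h = [[1.0 / sigma_a[0] ** 2, 0.0], [0.0, 1.0 / sigma_a[1] ** 2]]
    g = [0.0, 0.0]
    for y_i, k_i, b_i, s_i in zip(y, K, b, sigma_obs):
        w = 1.0 / s_i ** 2
        r = y_i - (k_i[0] * x_a[0] + k_i[1] * x_a[1]) - b_i  # y - K x_a - b
        for p in range(2):
            g[p] += k_i[p] * w * r
            for q in range(2):
                h[p][q] += k_i[p] * w * k_i[q]
    # A-posteriori covariance: inverse of the 2x2 matrix h
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    S_hat = [[h[1][1] / det, -h[0][1] / det],
             [-h[1][0] / det, h[0][0] / det]]
    # x_hat = x_a + S_hat K^T S_eps^-1 (y - K x_a - b)
    x_hat = [x_a[p] + S_hat[p][0] * g[0] + S_hat[p][1] * g[1] for p in range(2)]
    return x_hat, S_hat


# Hypothetical grid cell: four noise-free observations of a "true" state [3, 1]
K = [[2.0, 0.5], [1.5, 0.4], [2.5, 0.6], [1.8, 0.7]]
b = [5.0, 5.0, 5.0, 5.0]
x_true = [3.0, 1.0]
y = [k[0] * x_true[0] + k[1] * x_true[1] + b_i for k, b_i in zip(K, b)]
x_hat, S_hat = map_estimate(y, K, b, [0.5] * 4, x_a=[4.0, 1.0], sigma_a=[1.0, 0.4])
```

In this synthetic case the isoprene estimate moves from the a-priori value of 4 toward the assumed true value of 3, and both diagonal elements of `S_hat` are smaller than the corresponding a-priori variances.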
A compact measure of the information added by the observation to the knowledge of the state vector is the concept of "pieces of information" (Rodgers, 2000), equivalently called "degrees of freedom for signal" ($d_s$). It can be shown that this quantity equals the trace of the averaging kernel matrix A:

$$d_s = \mathrm{tr}(A) = \mathrm{tr}(G K) \qquad (3)$$

where $A = \partial\hat{x}/\partial x$ represents the sensitivity of the MAP solution to the true state. The pieces of information $d_s$ are always less than the dimension of the observational vector, in our case the number of available observations in a month, reaching equality only in the limiting case of no error (ε = 0). In contrast to the original work by Palmer et al. (2003, 2006), our top-down estimate of VOC emissions builds on a-priori knowledge of the magnitude of emissions. Over Europe this additional information is necessary because the satellite observations are generally close to the detection limit, as illustrated in previous sections. We apply the top-down inversion on a monthly basis from May to September 2005, solving separately for each grid cell using daily OMI observations.
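The trace of the averaging kernel can be sketched for the same two-element, diagonal-covariance setting. The snippet below relies on the identity $A = GK = I - \hat{S} S_a^{-1}$, so only the diagonal of $\hat{S}$ is needed. The function name `degrees_of_freedom` and the sensitivities and errors are hypothetical.

```python
def degrees_of_freedom(K, sigma_obs, sigma_a):
    """d_s = tr(A) for a two-element state with diagonal S_eps and S_a."""
    # h = S_a^-1 + K^T S_eps^-1 K (inverse of the a-posteriori covariance)
    h = [[1.0 / sigma_a[0] ** 2, 0.0], [0.0, 1.0 / sigma_a[1] ** 2]]
    for k_i, s_i in zip(K, sigma_obs):
        w = 1.0 / s_i ** 2
        for p in range(2):
            for q in range(2):
                h[p][q] += k_i[p] * w * k_i[q]
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    s_hat_diag = [h[1][1] / det, h[0][0] / det]
    # A = G K = I - S_hat S_a^-1, so tr(A) = sum_p (1 - S_hat_pp / sigma_a_p^2)
    return sum(1.0 - s_hat_diag[p] / sigma_a[p] ** 2 for p in range(2))


# Hypothetical sensitivities; compare precise and very noisy observations
K = [[2.0, 0.5], [1.5, 0.4], [2.5, 0.6], [1.8, 0.7]]
ds_precise = degrees_of_freedom(K, [0.5] * 4, [1.0, 0.4])
ds_noisy = degrees_of_freedom(K, [50.0] * 4, [1.0, 0.4])
```

With very noisy observations the result approaches zero, meaning the solution stays at the a-priori. In this two-element sketch the trace is additionally capped by the state dimension, so it lies between 0 and 2.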

Estimate of inversion parameters
The Jacobian matrix K is estimated using results from four simulations of the CHIMERE chemistry-transport model, similar to those reported in Sect. 3.2. The background term b is calculated from the simulation in which the emissions we want to constrain (BVOCs and RAVOCs) are switched off (BKGD).

Fig. 5. Estimated observational error related to OMI HCHO column observations ($10^{15}$ molecules cm$^{-2}$), and a-priori and a-posteriori errors related to biogenic and anthropogenic VOC emissions ($10^{11}$ molecules cm$^{-2}$ s$^{-1}$), for July 2005. A-posteriori errors are reduced by exploiting the OMI information content through the MAP solution.

The two simulations with isoprene and RAVOC emissions alternatively reduced by 30% (ISOP30 and RAVOC30) are used to estimate the sensitivity of the HCHO column to parent emissions (elements of K). We perform simulations at 0.5° × 0.5° resolution and then degrade the resolution to 1° × 1° in order to apply the inversion on a length scale of ∼100 km, which allows us to neglect atmospheric transport. Simulations are started on 15 March 2005 to ensure sufficient model spin-up, and results from May to September are used to estimate the monthly sensitivity to VOC emissions.
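The sensitivity estimate from the perturbed simulations can be sketched as a one-sided finite difference. The snippet assumes that each element of K is obtained by differencing a reference run against the run with a 30% emission reduction; this differencing scheme, the function name `column_sensitivity` and the numbers are assumptions made for illustration.

```python
def column_sensitivity(col_ref, col_reduced, emis_ref, reduction=0.30):
    """One-sided finite-difference estimate of d(HCHO column)/d(emission)."""
    delta_emis = reduction * emis_ref
    if delta_emis == 0.0:
        return 0.0  # no source in this cell: the column is insensitive
    return (col_ref - col_reduced) / delta_emis


# Hypothetical values for one grid cell and day
col_ref = 10.0       # reference run, 10^15 molecules cm^-2
col_isop30 = 9.1     # isoprene emissions reduced by 30%
col_ravoc30 = 9.85   # RAVOC emissions reduced by 30%
e_iso, e_ravoc = 3.0, 1.0  # 10^11 molecules cm^-2 s^-1

# One row of the Jacobian K: [sensitivity to isoprene, sensitivity to RAVOC]
k_row = [column_sensitivity(col_ref, col_isop30, e_iso),
         column_sensitivity(col_ref, col_ravoc30, e_ravoc)]
```

Each daily observation in a grid cell would contribute one such row, so K has as many rows as there are valid observations in the month.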
The error associated with the observing system (ε) includes (1) the fitting error from the retrieval procedure, (2) the uncertainty related to AMF calculations, and (3) the uncertainty of the HCHO background b. The fitting error is provided in the standard OMI HCHO product and monthly mean values are reported in Table 1. We add an uncertainty of 30% associated with the AMF (see Sect. 3.1). We further add an uncertainty of 15% associated with the background b, following error analysis on CHIMERE results. The squares of the elements of ε form the diagonal of the observational error covariance matrix $S_\varepsilon$. We assume $S_\varepsilon$ is diagonal, i.e. we assume errors on observations are independent. This assumption is conservative and does not introduce a bias. If the off-diagonal elements of the error covariance matrix were known, they would favour convergence toward the "true state" with fewer observations (Rodgers, 2000). On the other hand, if off-diagonal elements are erroneously specified they may introduce a bias in the MAP solution.
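A minimal sketch of how the diagonal of $S_\varepsilon$ could be assembled from the three error terms is given below. The text does not state how the terms are combined; the sketch assumes they are independent and adds them in quadrature, with the AMF term taken as a fraction of the observed column. The function name `observation_sigma` and the numbers are hypothetical.

```python
import math


def observation_sigma(fit_error, column, background, amf_frac=0.30, bkg_frac=0.15):
    """1-sigma observational error, assuming independent terms added in quadrature."""
    return math.sqrt(fit_error ** 2
                     + (amf_frac * column) ** 2       # AMF uncertainty (30% of column)
                     + (bkg_frac * background) ** 2)  # background uncertainty (15% of b)


# Hypothetical daily values for one grid cell, 10^15 molecules cm^-2
sigma = observation_sigma(fit_error=4.0, column=10.0, background=6.0)

# Diagonal of S_eps for a month with three valid observations (same sigma reused here)
s_eps_diag = [s ** 2 for s in [sigma, sigma, sigma]]
```

Under a different combination rule (for example a linear sum) the resulting errors would be larger and the observations would carry less weight in the inversion.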
The error associated with the a-priori MEGAN biogenic emissions is estimated from the straight difference with another biogenic emissions inventory, described by Derognat et al. (2003) and also implemented in the CHIMERE model (see Sect. 2). In this way we obtain a gridded estimate of the uncertainty associated with current knowledge of biogenic emissions, following the approach of e.g. Steinbrecher et al. (2009) and Poupkou et al. (2010). The difference between the two datasets is non-zero everywhere, as shown in Fig. 5 for July and in Table 3 for annual country-scale emissions. This point is further discussed in Sect. 4.3. The error associated with the a-priori EMEP reactive anthropogenic VOC emissions is set to 40%, as assumed in the inverse analysis by Deguillaume et al. (2007). We assume that the a-priori error covariance matrix is diagonal.

Figure 4 shows a-priori and a-posteriori BVOC and RAVOC emissions for July 2005. Emission maps for the other months under analysis are shown in the online Supplement (Figs. S2 to S6). The continental integral of monthly emissions is given in Table 2, while annual totals of isoprene emissions are reported for each country in the domain in Table 3. VOC emissions are corrected by the MAP solution according to the differences between HCHO columns simulated by CHIMERE and observed by OMI. In July, MEGAN isoprene emissions are reduced in the a-posteriori scenario by up to 40% over the Balkans and by up to 20% in Southern Germany and South-Western Spain. The country with the largest percent change is Croatia, where isoprene emissions are reduced from 90 to 64 Gg/yr (−28%). The emissions are increased by up to 20% at most locations over the Iberian Peninsula, Greece and Italy.

Table 3. A-priori and a-posteriori (OMI-derived) European annual isoprene emissions in 2005 at country scale. The label MEGAN-F3 refers to the inversion performed assuming a flat uncertainty of a factor of 3 for isoprene emissions. Emissions in units of Gg/year.
RAVOC emissions respond to the MAP solution over the largest urban regions of North-Western Europe, Spain and Italy, where they are increased or reduced by up to 10%. On the continental scale, as reported in Table 2, we find that a-posteriori total monthly reactive VOC emissions are reduced for all months from May to September 2005, with the exception of July, when emissions are slightly increased. The largest changes in emissions are found in May, when isoprene emissions decrease by 10% and RAVOC emissions by 1%. The degrees of freedom for signal, or pieces of information, $d_s$ (Eq. 3) indicate how much information from OMI enters the MAP solution. We find a maximum of $d_s$ in July because, as noted previously, the highest BVOC emissions coincide with the lowest OMI observational relative error. For the opposite reason we find a minimum of $d_s$ in the shoulder months.

VOC emissions over Europe constrained by OMI HCHO
Before further analysis of the emission estimates, we look at the errors used to build the matrix G (Eq. 2), which defines the gain of information introduced by the MAP solution. Figure 5 shows maps of the observational error and of the a-priori and a-posteriori errors for July 2005. The number of available observations N, which defines the dimensionality of y for each grid cell and month, is closely tied to cloud cover, and reaches a maximum over the Iberian Peninsula and a minimum over the British Isles. The uncertainty on BVOC emissions (isoprene and terpenes) is highest over the major source regions of the Iberian Peninsula, Italy, Southern Germany and the Balkans. The uncertainty on RAVOC emissions is also highest over major source regions, i.e. the Po Valley and North-Western Europe.
We find that the MAP solution reduces the uncertainty on a-priori emissions mainly over regions where low observational error coincides with high VOC emissions. The uncertainty on isoprene emissions is reduced by 20-40% over major source regions. The MAP solution is not able to significantly reduce the uncertainty on RAVOC emissions (reduction <1%).
The spatial patterns of a-priori and a-posteriori errors are the same for the other months under analysis (not shown). The improvement gained from using the MAP solution is smaller in June and August and smallest in May and September. As mentioned earlier, this can be seen in a compact way from the average "pieces of information" $d_s$, summarized in the last two columns of Table 2, which reach a maximum in July and a minimum in May and September.

Going back in more detail to the emission results, in Fig. 6 we show time series of monthly mean emissions averaged over the same selected regions as in Fig. 3. Isoprene emissions display the most pronounced seasonal variation, with a peak in July and a steep decrease in the shoulder months. Anthropogenic RAVOC emissions have very little month-to-month variation, with a minimum in August. The region most affected by the correction induced by OMI observations is the Balkan Mountains, with BVOC emissions reduced in all months under investigation and by up to 25% in July. Systematic reductions of a-priori emissions are also found in Southern Germany and Northern Italy, the latter displaying very small variations. Over the Iberian Peninsula, OMI observations correct the seasonal cycle of BVOC emissions, increasing them in July and decreasing them in other months. Isoprene emission estimates at country level carried out with MEGAN may be compared with estimates reported by Simpson et al. (1999), as shown in Table 3. For several countries we find a discrepancy between the two datasets of a factor of 2-3. The largest differences are found over Italy, where MEGAN emissions are 8 times higher than in Simpson et al. (1999). We further discuss this result in Sect. 4.3.
The monthly ratios of a-posteriori to a-priori emissions of isoprene and RAVOC are applied as multiplicative gridded factors to hourly emissions in the CHIMERE model, which is then used to calculate the HCHO column from May to September 2005 with OMI-corrected reactive VOC emissions. In Table 1, we summarize the comparison of CHIMERE simulations with OMI observations on a monthly basis at continental scale.
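The rescaling step can be sketched for a single grid cell as follows. The function name `scale_hourly_emissions` and the emission values are hypothetical, and the handling of cells with zero a-priori emissions (left unchanged) is an assumption.

```python
def scale_hourly_emissions(hourly, prior_monthly, posterior_monthly):
    """Apply the monthly a-posteriori/a-priori ratio to every hourly value of one cell."""
    if prior_monthly > 0.0:
        factor = posterior_monthly / prior_monthly
    else:
        factor = 1.0  # nothing to scale where the a-priori has no emissions
    return [factor * e for e in hourly]


# Hypothetical cell where the inversion reduces the monthly emission by 25%
hourly_prior = [0.0, 4.0, 8.0, 2.0]
hourly_post = scale_hourly_emissions(hourly_prior, prior_monthly=2.0, posterior_monthly=1.5)
```

Because a single factor is used for the whole month, the diurnal and day-to-day shape of the a-priori emissions is preserved and only their magnitude changes.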
Introduction of a-posteriori emissions into the model reduces the root mean square error and improves the spatial correlation (e.g. from 0.49 to 0.59 in July), since the HCHO variability is expected to be dominated by local reactive VOC emissions, which are improved here through the MAP solution. The bias is not improved, probably because it is determined more by background HCHO concentrations (not corrected by OMI here) than by local reactive VOC oxidation.
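The three comparison statistics mentioned above (bias, root mean square error and spatial correlation) can be sketched in a few lines. The definitions below are the standard ones and are assumed to match those used for Table 1; the sample values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def bias(model, obs):
    """Mean model-minus-observation difference."""
    return mean([m - o for m, o in zip(model, obs)])


def rmse(model, obs):
    """Root mean square error of model with respect to observations."""
    return math.sqrt(mean([(m - o) ** 2 for m, o in zip(model, obs)]))


def correlation(model, obs):
    """Pearson correlation coefficient across grid cells."""
    mm, mo = mean(model), mean(obs)
    cov = sum((m - mm) * (o - mo) for m, o in zip(model, obs))
    var_m = sum((m - mm) ** 2 for m in model)
    var_o = sum((o - mo) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)


# Hypothetical columns: the model has the right pattern but a uniform offset
obs = [1.0, 2.0, 3.0, 4.0]
model = [2.0, 3.0, 4.0, 5.0]
stats = (bias(model, obs), rmse(model, obs), correlation(model, obs))
```

The example shows a model that is perfectly correlated with the observations yet biased, which illustrates why a correction to the spatial pattern of emissions can improve the correlation while leaving a background-driven bias in place.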
In Supplement Fig. S1, we compare observed and simulated monthly HCHO columns over Europe. A-posteriori emissions correct the model bias over the Balkans, especially in summer. Southern Germany and the Iberian Peninsula are two other regions that are significantly affected. The comparison of HCHO and isoprene concentrations at EMEP ground stations with a-priori and a-posteriori emissions, shown in Fig. 1, reveals only small changes. The use of EMEP ground observations as a third term of comparison is severely limited by the small dataset and by the location of the sites in places where the model-to-OMI bias is relatively small, which implies small changes to a-priori emissions.

Sensitivity tests
The Maximum A Posteriori solution of the inverse problem of constraining reactive VOC emissions from satellite HCHO columns has been shown to reduce the uncertainty on a-priori knowledge of emissions. However, the MAP solution involves uncertainties related (a) to the estimation of the parameters in Eq. (2) and (b) to the choice of the a-priori. We now investigate how these uncertainties affect the MAP solution, i.e. how robust our a-posteriori estimate of emissions is.
The gain matrix G, which weights the relative importance of observational and a-priori errors in evolving to a less uncertain a-posteriori, relies on the estimate of three main quantities: (1) the observing system error (ε); (2) the a-priori covariance matrix ($S_a$); (3) the Jacobian matrix (K).
The observing system error ε, as explained in Sect. 4.1.1, is calculated from the retrieval fit error, plus minor contributions from the model error associated with the Air Mass Factor and the background term b in Eq. (1). The uncertainty related to the estimate of ε can thus be regarded as small compared to the uncertainty of the other terms of the MAP solution.
The diagonal terms of the error covariance matrix $S_a$ are the errors associated with a-priori knowledge of emissions. The error on biogenic emissions is assumed to be equal to the difference with a different inventory, while the error on RAVOC emissions is assumed to be 40% (Sect. 4.1.1). We test the impact of these assumptions on the MAP solution by alternately doubling these errors and looking at the changes in a-posteriori emissions. We find that the maximum differences from the reference emissions occur for isoprene emissions in July, but these do not exceed 2% on the continental total, and are always <10% at any specific location. This suggests that the MAP solution is mainly influenced by the measurements rather than by the a-priori information.
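The logic of the error-doubling test can be illustrated with a scalar version of the MAP solution (one emission, N independent observations with equal sensitivity and error). All names and numbers below are hypothetical and were chosen so that the observations dominate; they are not taken from the inversion itself.

```python
def scalar_map(y, k, b, sigma_obs, x_a, sigma_a):
    """MAP estimate for a single emission constrained by independent observations."""
    precision = 1.0 / sigma_a ** 2  # a-priori contribution, 1 / sigma_a^2
    forcing = 0.0
    for y_i in y:
        precision += k ** 2 / sigma_obs ** 2
        forcing += k * (y_i - k * x_a - b) / sigma_obs ** 2
    return x_a + forcing / precision


# 25 hypothetical daily columns consistent with a "true" emission of 3.0
k, b = 1.0, 5.0
y = [k * 3.0 + b] * 25

x_ref = scalar_map(y, k, b, sigma_obs=1.0, x_a=4.0, sigma_a=1.0)
x_doubled = scalar_map(y, k, b, sigma_obs=1.0, x_a=4.0, sigma_a=2.0)  # a-priori error doubled
relative_change = abs(x_doubled - x_ref) / x_ref
```

In this constructed case the a-posteriori estimate changes by less than 1% when the a-priori error is doubled, the behaviour one expects when the measurement term outweighs the a-priori term in the gain.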
We use a brute-force approach to calculate the Jacobian matrix K, i.e. the derivatives $\partial y/\partial E_i$ of the HCHO column with respect to the emissions $E_i$ of each species. A more accurate calculation of the gradients would account for the non-linear effect of tropospheric photochemistry. However, we show in Fig. 3 that this effect is small. We compare the average background HCHO b in selected regions calculated with the full chemistry model (blue solid lines) and by subtracting the HCHO contributions of single sources from the total model column (blue dotted lines). Contributions from single VOC sources (isoprene and RAVOC) are calculated by linear extrapolation of the HCHO sensitivities to emissions, i.e. the elements of K. The difference attributable to non-linearity in photochemistry is <5%. A more important source of uncertainty in estimating K is the HCHO yield from isoprene in our chemical mechanism. As discussed in Sect. 2, the uncertainty on the HCHO yield can be estimated to be ∼10%. We estimate the uncertainty this introduces in the MAP solution by perturbing the elements of K by a conservative −15%, and find a maximum difference from the reference emissions of <5% in July.
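The linearity check described above can be sketched as a comparison between the background from the full-chemistry run and the background reconstructed by subtracting linearly extrapolated source contributions from the total column. The function name `linear_background` and the numbers are hypothetical.

```python
def linear_background(col_total, sensitivities, emissions):
    """Background obtained by removing linear source contributions k_i * E_i."""
    contribution = sum(k * e for k, e in zip(sensitivities, emissions))
    return col_total - contribution


# Hypothetical regional averages, columns in 10^15 molecules cm^-2
col_total = 10.0                # run with all emissions
b_full = 6.3                    # run with BVOC and RAVOC emissions switched off
k_iso, k_ravoc = 1.0, 0.5       # elements of K
e_iso, e_ravoc = 3.0, 1.0       # emissions, 10^11 molecules cm^-2 s^-1

b_linear = linear_background(col_total, [k_iso, k_ravoc], [e_iso, e_ravoc])
nonlinearity = abs(b_linear - b_full) / b_full  # fractional departure from linearity
```

A small value of `nonlinearity` (about 3% with these made-up numbers) indicates that the finite-difference sensitivities can be extrapolated linearly over the range of emissions considered.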
The overall uncertainty introduced by the estimation of inversion parameters is thus expected to be small (∼10%). The other source of uncertainty is the choice of the a-priori state vector $x_a$. We test this by swapping the roles of the two BVOC emission inventories used in this study: we recalculate the inversion parameters driving the model with the Derognat et al. (2003) inventory and use the MEGAN inventory to assess its uncertainty. Results are shown in Supplement Figs. S7 to S11 and Tables 3 and S2. A-priori isoprene emissions display significant differences, but the corrections brought by the MAP solution generally lead the two a-posteriori estimates to converge. In July, continental a-priori isoprene emissions differ by 215 Gg (22%), while a-posteriori emissions differ by only 1 Gg.
We performed an additional sensitivity test on the a-priori error assumption by running the inversion with a flat uncertainty of a factor of 3 for isoprene emissions. This is the value recommended by Simpson et al. (1999) and the same assumption made by Dufour et al. (2009) in their inversion. Emission estimates at country level (Table 3) are generally stable against this drastic change to the error covariance matrix. A notable exception is France, where the annual total estimated with MEGAN increases from 463 to 622 Gg/yr, approaching the D03 value of 634 Gg/yr. The error on a-priori emissions was probably too small in our approach for this country. We also point out the case of Italy, where the solution seems to diverge toward very high values. Its position in the middle of the Mediterranean Sea, where the model displays a systematic negative bias, probably complicates the inversion analysis. This may deserve specific future analysis in combination with as complete a suite as possible of in-situ observations that characterize the photochemical environment and help the interpretation of model bias. Data from the recently reported ACCENT-VOCBAS campaign (Fares et al., 2009) may represent a good opportunity.
We analyzed satellite observations of the formaldehyde column over Europe (35°-58° N; 15° W-25° E) from the OMI instrument onboard the EOS-Aura spacecraft during 2005. We find a clear HCHO seasonal cycle with a summer maximum, associated with columns generally above the detection limit (8 × $10^{15}$ molecules cm$^{-2}$) over the Iberian Peninsula, France, Italy and Southern Germany during May-September. Elevated HCHO concentrations are observed over the Mediterranean Sea in summer; their origin is not clear and warrants future work. In cold months, OMI HCHO values are mostly below the detection limit.
From simulations of the CHIMERE chemistry-transport model, and consistently with previous findings, we find that the bulk of the HCHO column over Europe is made up of a slowly varying background supplied by the oxidation of methane and other long-lived VOCs. In summer, isoprene oxidation provides about 30% of the total column and dominates HCHO temporal and spatial variability owing to its high reactivity and abundant emission, while the oxidation of biogenic terpenes and of anthropogenic reactive VOCs (ethene, propene and xylenes) makes a minor contribution (about 10% each).
The chemistry-transport model is able to qualitatively reproduce the spatial variability of the OMI HCHO column over land (correlation ∼0.5), but it underestimates the amplitude of the observed seasonal cycle. The model overestimates OMI HCHO over the Balkans and Southern Germany from May to September and underestimates OMI HCHO over Spain, France and Italy in July. Differences are mostly attributed to incorrect specification of biogenic VOC (BVOC) emissions, calculated with the MEGAN algorithm. Anthropogenic reactive VOC (RAVOC) emissions only play a minor role over Northern Italy.
Building on the work by Palmer et al. (2003, 2006), we apply Bayes' theorem to obtain a Maximum A Posteriori (MAP) solution of the inverse problem relating OMI HCHO observations to European emissions of BVOCs and RAVOCs. The uncertainty on "bottom-up" or a-priori emissions is reduced in the "top-down" or a-posteriori emissions. Estimated uncertainties on isoprene and RAVOC emissions are reduced by up to 40% and <1%, respectively, over major source regions. The root mean square error and the spatial bias of the model HCHO column with respect to OMI are reduced (RMSE decreases by 0.1, or 5%, and correlation increases to ∼0.6), owing to the correction to BVOC emissions. In particular, MEGAN isoprene emissions are found to be too high by 40% over the Balkans and by 20% over Southern Germany, and too low by 20% over the Iberian Peninsula, Greece and Italy.
We tested the robustness of our Bayesian "top-down" estimate of VOC emissions, and concluded that a relatively small (∼10%) uncertainty is related to the estimation of inversion parameters. However, we found an appreciable dependence on the assumed a-priori and on the a-priori error covariance matrix for some regions (e.g. France and Italy), for which further analysis is desirable. We conclude that satellite observations of formaldehyde can be usefully exploited as a constraint on "bottom-up" European BVOC and AVOC inventories.