SCIAMACHY formaldehyde observations: constraint for isoprene emission estimates over Europe?

Formaldehyde (HCHO) is an important intermediate compound in the degradation of volatile organic compounds (VOCs) in the troposphere. Sources of HCHO are largely dominated by its secondary production from VOC oxidation, methane and isoprene being the main precursors in unpolluted areas. As a result of the moderate lifetime of HCHO, its spatial distribution is determined by reactive hydrocarbon emissions. We focus here on Europe and investigate the influence of the different emissions on HCHO tropospheric columns with the CHIMERE chemical transport model in order to interpret the comparisons between SCIAMACHY and simulated HCHO columns. Europe has not previously been studied specifically for these purposes using satellite observations. The bias between measurements and model is less than 20% on average. The differences are discussed in the light of the errors on the model and on the observations, and the remaining discrepancies are attributed to a misrepresentation of biogenic emissions. This study requires the characterisation of: (1) the model errors and performances concerning formaldehyde. The errors on the HCHO columns, mainly related to chemistry and mixed emission types, are estimated at 2×10¹⁵ molecule/cm², and the model performances evaluated using surface measurements are satisfactory (∼13%); (2) the observation errors, which define the spatial and temporal averaging needed for meaningful comparisons. Using SCIAMACHY observations as a constraint for biogenic isoprene emissions in an inverse modelling scheme reduces their uncertainties by about a factor of two in regions of intense emissions. The retrieved correction factors for the isoprene emissions range from 0.15 (North Africa) to 2 (Poland, the United Kingdom) depending on the region.
Correspondence to: G. Dufour (dufour@lisa.univ-paris12.fr)


Introduction
Volatile organic compounds (VOCs) play an important role in tropospheric chemistry, in particular in the production of ozone and organic aerosols and in the cycling of odd hydrogen radicals and nitrogen species. They have a significant impact on pollution and climate change. Large uncertainties remain in the knowledge of their emissions at continental scales as a result of the difficulty in extrapolating local VOC emission measurements to a larger scale. Considerable effort has been made to improve the parameterization of the factors that influence the emissions, especially for biogenic emissions, such as past and present temperature and radiation, soil moisture stress, and leaf age (e.g. Guenther et al., 2006; Lathière et al., 2006; Boissard et al., 2008; Müller et al., 2008, and references therein), but large uncertainties still remain.
Formaldehyde is one of the most important intermediate compounds in the degradation of VOCs in the troposphere. Their oxidation by OH leads to the formation of organic peroxy radicals that produce HCHO either directly or through the degradation of higher carbonyls (Atkinson, 1994). The main source of HCHO in the background troposphere is methane oxidation. However, in the continental boundary layer, oxidation of non-methane VOCs (NMVOCs) dominates over the methane source and can make a large contribution to the HCHO columns. The anthropogenic sources of VOCs are the most significant in urban areas, but biogenic sources dominate elsewhere, especially during the growing season, with the largest contribution coming from isoprene (Palmer et al., 2003, 2006; Abbot et al., 2003; Millet et al., 2008, and figures therein). The sinks of HCHO are mainly photolysis and reaction with OH. The lifetime of HCHO is variable, depending on the OH concentration and on its photolysis. The lifetime of HCHO with respect to OH reaction is close to 0 at night and varies from 60 h for background OH concentrations (5×10⁵ molecule/cm³) to 3 h for OH levels reached in photochemically active plumes (10⁷ molecule/cm³). Overall, the lifetime of HCHO is short enough that the HCHO column maps the emission fields of its parent VOCs, provided that their reactivity to OH is large enough, which is essentially true for unsaturated compounds.
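The two daytime lifetimes quoted above follow from the usual first-order relation, lifetime = 1/(k[OH]). The sketch below illustrates this; the rate constant for HCHO + OH is an assumed room-temperature value that is not given in the text, and the function name is introduced here for illustration only.

```python
# Assumed rate constant for HCHO + OH near 298 K (cm^3 molecule^-1 s^-1).
# This value is NOT from the text; it is an illustrative choice.
K_OH_HCHO = 8.5e-12


def hcho_lifetime_hours(oh_concentration, k=K_OH_HCHO):
    """Lifetime of HCHO against reaction with OH, in hours: tau = 1 / (k [OH])."""
    if oh_concentration <= 0:
        raise ValueError("OH concentration must be positive")
    return 1.0 / (k * oh_concentration) / 3600.0


# OH levels quoted in the text (molecule/cm^3).
background = hcho_lifetime_hours(5e5)  # background troposphere
plume = hcho_lifetime_hours(1e7)       # photochemically active plume
print(f"background: {background:.0f} h, plume: {plume:.1f} h")
```

With this assumed rate constant the results are of the same order as the 60 h and 3 h given in the text; photolysis, the other main sink, is not included in this estimate.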
Satellite observations of formaldehyde, as reported e.g. in Chance et al. (2000), Wittrock et al. (2006) and De Smedt et al. (2008), have been used as independent constraints for emissions (Palmer et al., 2003, 2006; Abbot et al., 2003; Shim et al., 2005; Fu et al., 2007; Millet et al., 2008; Stavrakou et al., 2008). A first top-down approach for inferring isoprene emissions from HCHO column measurements was developed by Palmer et al. (2001, 2003) and extensively applied to constrain isoprene emissions in North America using observations from the Global Ozone Monitoring Experiment (GOME) UV-visible sounder. In particular, the seasonal and interannual variability of the North American isoprene emissions has been studied for the 1995-2001 period (Abbot et al., 2003; Palmer et al., 2006). More recently, Millet et al. (2008) used the Ozone Monitoring Instrument (OMI) to derive isoprene emissions in North America (June-August 2006) with 1°×1° resolution. The use of space-based HCHO data has been extended to other regions of the world in order to infer biogenic isoprene emissions but also biomass burning and industrial HCHO sources. Fu et al. (2007) used GOME data to constrain NMVOC emissions from east and south Asia, where a complex overlap between high anthropogenic, biogenic and biomass burning emissions occurs. Meyer-Arnek et al. (2005) performed Lagrangian studies over Africa in September 1997 and showed that the major contributions to the measured GOME HCHO columns are biomass burning and biogenic sources. Shim et al. (2005) applied a Bayesian inversion of GOME HCHO column measurements and derived new estimates of biogenic isoprene emissions as well as biomass burning and anthropogenic HCHO sources for eight regions of the world (North America, Europe, east Asia, India, southeast Asia, South America, Africa, and Australia). Except for this last study, no detailed studies are available on the possibility of inferring HCHO sources in Europe. The use of satellite data to improve European emissions is not straightforward considering that (i) HCHO columns are relatively low, leading to measurements often close to the detection limit of the current space-based instruments, and (ii) formaldehyde sources (biogenic and anthropogenic) are well mixed, so that their contributions to HCHO columns are difficult to separate.
The aim of the present study is to evaluate the possibility of using satellite data (HCHO tropospheric columns) for improving isoprene emission estimates over Europe. Data from the SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY (SCIAMACHY) sensor aboard the Envisat satellite (Wittrock et al., 2006) and the regional chemical transport model (CTM) CHIMERE (Schmidt et al., 2001) are used. The results are discussed with respect to the influence of the emissions, accounting for the model and observation errors. This first requires characterizing the performance of the model in reproducing formaldehyde. To this end, the chemical scheme included in CHIMERE has been tested against three reference schemes, and the simulated surface concentrations of formaldehyde have been compared to surface measurements (Sect. 2). Secondly, the influence of the emissions has to be quantified: a tagging scheme has been included in the CTM and the results are presented in Sect. 3. Formaldehyde columns observed with SCIAMACHY are compared to the columns simulated by CHIMERE for two summers (2003 and 2005) with different climatic conditions. The comparisons are discussed taking into account the uncertainties in both the measurements and the simulations, and the need for specific averaging is stressed (Sect. 4). Finally, in Sect. 5, satellite observations of HCHO are used as a constraint on the European isoprene emission estimates.

Model description
CHIMERE is an Eulerian multi-scale three-dimensional chemistry-transport model dedicated to air quality issues. It is designed for simulating fields of various pollutants and related species from the urban scale (e.g. Vautard et al., 2001; Menut, 2003; Beekmann and Derognat, 2003; Blond et al., 2003) to the continental scale (e.g. Blond and Vautard, 2004; Bessagnet et al., 2004). More recently, CHIMERE has been applied to long-term ozone trend analysis (Vautard et al., 2006), and to diagnostics and inverse modelling of emissions (Deguillaume et al., 2007; Konovalov et al., 2006, 2008; Pison et al., 2007). The model is also used for forecasts of pollutant levels as part of the French national air pollution forecasting system, Prev'Air (www.prevair.org). CHIMERE has been extensively validated with surface and airborne measurements at the different scales (Schmidt et al., 2001; Blond and Vautard, 2004; Menut et al., 2000; Vautard et al., 2003; Blond et al., 2007; Honoré et al., 2008). As an example, the averaged bias for daily ozone maxima over Europe is smaller than 10% (Honoré et al., 2008). A detailed description of the model is available on the web (http://euler.lmd.polytechnique.fr/chimere/).
For this study, the continental version of CHIMERE (Schmidt et al., 2001), vertically extended to the whole troposphere (Blond et al., 2007), has been used. In this configuration the model covers Western Europe (from 10.5°W to 22.5°E and from 35°N to 57.5°N). The model runs with a horizontal resolution of 0.5°×0.5° and with 17 vertical layers from the surface up to 200 hPa. The model is driven by the European Centre for Medium-Range Weather Forecasts (ECMWF) meteorological analyses. The convective scheme of Tiedtke (1989) is used to rediagnose the convective fluxes.
Top and lateral boundary concentrations for 8 species (O₃, NOₓ, CO, PAN, CH₄, C₂H₆, HCHO, and HNO₃) are fixed according to the climatological monthly means provided by the MOZART model (Horowitz et al., 2003). Anthropogenic emissions are derived from the EMEP annual totals for 2001 (Vestreng et al., 2004) for NOₓ, SO₂, CO, and non-methane volatile organic compounds (NMVOC). These emissions are scaled to hourly emissions by applying temporal profiles provided by IER (Friedrich, 1997) as described by Schmidt et al. (2001). VOC emissions are distributed into 11 model classes following the mass and reactivity weighting procedure proposed by Middleton et al. (1990): 9 classes are considered for anthropogenic species (n-C₄H₁₀, C₂H₆, o-xylene, C₂H₄, C₃H₆, methylethylketone (MEK), CH₃OH, HCHO and CH₃CHO) and 2 classes for biogenic species (isoprene, terpenes). The biogenic emissions of isoprene and terpenes (represented only by α-pinene) and of NO are calculated on-line following the Simpson et al. (1999) and Stohl et al. (1996) methodologies, respectively. The determination of the isoprene and terpene emissions is based on the widely used parameterization of Guenther et al. (1995), which defines the flux as a function of the emission potential, the foliar density for European vegetation (Simpson et al., 1999) and an environmental correction factor that accounts for temperature and radiation dependencies. The emission potential depends on land use and on a country-averaged tree species distribution, distinguishing several tens of isoprene and/or terpene emitting tree species (Simpson et al., 1999). In the absence of specific data on tree species distributions over North Africa, the distribution from Greece was used. This introduces additional uncertainty in the emission calculation for this region (see below). Emission potentials are also included for agricultural land, grassland, and shrubs.
The complete MELCHIOR chemical mechanism (Lattuati, 1997) is used to perform the simulations of the present study. The photolysis rates are calculated using the Tropospheric Ultraviolet and Visible model (TUV) (Madronich and Flocke, 1998) and are tabulated as a function of altitude and zenith angle. They are corrected for cloudiness, using cloud cover data for low, medium, high and convective clouds delivered by ECMWF. The mechanism includes a simplified NMHC chemistry and considers a total of 82 gaseous active species and 333 reactions. Deposition of stable intermediates is included. The hydrocarbon species containing three carbons or fewer are treated explicitly. Larger compounds are represented by lumped species. Biogenic VOCs are represented by isoprene and α-pinene (which represents terpenes). Table 1 summarizes the typical lifetimes and emissions of the NMVOCs emitted in CHIMERE during summertime in Europe. The estimated lifetimes are similar to the lifetimes given by Palmer et al. (2003) for North America, but the emissions, especially of isoprene, are much smaller in Europe than in other regions of the world (Palmer et al., 2003; Shim et al., 2005; Müller et al., 2008).
The CHIMERE model is used here for the first time to study formaldehyde on a continental scale and has not yet been evaluated for this specific molecule. Thus the reliability of CHIMERE with respect to formaldehyde has been checked. The chemistry implemented in MELCHIOR has been evaluated by comparison of simulated HCHO yields with results from other chemical mechanisms. In a second step, the CHIMERE model has been evaluated by comparison of simulated HCHO with surface measurements.

Evaluation of the MELCHIOR chemical scheme
We compare HCHO formation from the oxidation of all the VOCs emitted in CHIMERE (isoprene, α-pinene, anthropogenic VOCs) using MELCHIOR with results from well-established chemical mechanisms: (i) the SAPRC99 scheme, developed and validated using smog chamber data (for high NOₓ conditions) (Carter et al., 2000), (ii) the Master Chemical Mechanism (MCM) (Saunders et al., 2003; Jenkin et al., 2003), which is one of the most detailed schemes available in the literature and is often used as a reference, and (iii) the Self-Generated Master Mechanism (SGMM) (Aumont et al., 2005), which is a unique fully explicit chemical scheme. These three "reference" mechanisms provide a representation of the uncertainties remaining in the knowledge of NMVOC oxidation and allow us to evaluate how well the chemistry leading to HCHO production is reproduced by MELCHIOR.
Simulations are performed using a time-dependent box model (Aumont et al., 2005; Camredon et al., 2007). We use conditions representative of those encountered during summer in Europe: initialization at 09:00 LT for midlatitude summer conditions with 1 ppbv of the VOC studied, 40 ppbv of O₃, 100 ppbv of CO, and either 0.1 or 1 ppbv of NOₓ, representing low and high NOₓ conditions. The O₃, CO and NOₓ concentrations are fixed to their initial values during the simulation time period. Simulations are performed over 24 h because only the HCHO produced during the first day can potentially be related to the local VOC emissions (Palmer et al., 2006) and thus provide pertinent information on the emissions.
Figure 1 shows the time evolution of the HCHO yields (formed HCHO molecules per C-atom of a parent VOC) obtained in high NOₓ conditions (1 ppb). The differences between the three reference schemes (MCM, SAPRC99, and SGMM) are usually within 15%. However, a large discrepancy between MCM, SGMM and SAPRC99 is found for the HCHO production from α-pinene oxidation. SAPRC99 gives an HCHO yield about half that of MCM and SGMM, whereas these two schemes agree within ∼20%. Concerning the MELCHIOR mechanism, the HCHO yields derived under high NOₓ conditions are overall in fair agreement (differences lower than 20%) with the reference mechanisms, especially with MCM and SGMM (Fig. 1) [...] agree within 5% with the three references. The yield from α-pinene oxidation is in good agreement with that derived with MCM (differences of about 15%) and with SGMM (differences lower than 10%). Concerning isoprene oxidation, MELCHIOR leads to a slightly smaller yield: about 10% smaller on average compared to MCM, 7% compared to SAPRC99 and 13% compared to SGMM. In low NOₓ conditions, the agreement between the chemical mechanisms is degraded compared to the high NOₓ conditions (Table 2). The agreement remains fairly good for methanol, CH₃CHO, isoprene and MEK. In addition to the disagreement for α-pinene between the reference mechanisms, the largest differences occur for anthropogenic species. However, these species are emitted in regions of high NOₓ, where the consistency between the different chemical schemes is better (Fig. 1). For the low NOₓ conditions, the HCHO yields simulated with MELCHIOR are in the range of the yields simulated by the three reference schemes. It is important to note that large uncertainties remain in the oxidation schemes of many VOCs (e.g. α-pinene); the error estimate deduced from the differences between the chemical schemes only indicates how consistent the different mechanisms are and cannot be taken as an absolute error determination.

CHIMERE evaluation with HCHO surface measurements
Formaldehyde simulated by CHIMERE is evaluated using surface measurements gathered within the EMEP ground-based monitoring network (http://www.nilu.no/projects/ccc/emepdata.html). Observations are performed on a once- or twice-a-week basis. The sampling time is about 8 h centered at noon. Chemical analysis of the samples is done by reversed-phase high-performance liquid chromatography followed by UV detection (Solberg et al., 1995). The precision of the HCHO measurements has been estimated by applying parallel sampling and analysis at different stations between 1995 and 2001 (Solberg et al., 1998, 2001a, b, 2005). Three consecutive years of parallel sampling at the Birkenes station (Norway) showed a measurement precision of 5-6% (Solberg et al., 1998). A similar analysis conducted at the French Donon station showed a precision of 7-8% in 1997 (Solberg et al., 1998) and about 13% in 1999 (Solberg et al., 2001a). However, the precision can be much poorer in some cases: an analysis between 1999 and 2001 at the German Waldhof station shows a precision of about 52% (Solberg et al., 2005).
The continental version of CHIMERE has been used to simulate HCHO surface concentrations over Europe during summer 2003. The simulated surface concentrations of HCHO are interpolated to the locations of the surface measurements at eight rural monitoring stations of the EMEP network (see Fig. 2). The station type has been characterized using the tagged simulation results (Sect. 3), plotted in Fig. 3. The different emission contributions are of similar magnitude for the stations of Košetice, Waldhof, Brotjacklriegel, and Schmücke, ranging between 15 and 30%. Isoprene oxidation is, by a small margin, the largest contribution at Brotjacklriegel and Schmücke, whereas terpene oxidation is the largest at Košetice. The station of Zingst, located at a remote site on the Baltic Sea coast, is largely dominated by the background oxidation of methane, and the stations at Donon and Peyrusse Vieille, located in remote forested areas, by the isoprene contribution. The HCHO concentrations simulated at the La Tardière station are influenced to a similar extent by methane and isoprene oxidation.
The HCHO surface concentrations observed at each station of the EMEP network between June and August 2003 are compared to simulations in Fig. 3. The statistical results of the comparison averaged over the measurement time period are summarized in Table 3. An overall good agreement is obtained between simulations and observations, taking into account both the uncertainties of the observations and the errors in the simulations: the mean bias over the 3 months is 13%. The daily variability and the mean concentration are rather well reproduced by the model for most of the stations. This indicates that the model captures reasonably well the emissions, transport and photochemistry leading to the observed HCHO concentrations. The performance of the CHIMERE model is similar to that of the EMEP model on average (Solberg et al., 2001b). There is a tendency towards smaller biases and larger correlation coefficients for the three French sites with small anthropogenic contributions. In contrast, the simulations underestimate surface formaldehyde at the station of Zingst (by 40%). Significant discrepancies at a coastal site are not surprising: Honoré et al. (2008) pointed out a general problem in the scores of the continental version of CHIMERE at coastal sites when simulating daily ozone maxima. A large bias appears for 2 stations (Waldhof and Brotjacklriegel), with simulations larger than observations by a factor of between 2 and 3. As the contributions from different sources are well balanced at these two sites, it is not obvious which process is responsible for that difference. Several well-known sources of model error can be cited as potential explanations: uncertainties in the different types of VOC emissions (see below), uncertainties in the chemical scheme, uncertainties in OH concentrations affecting HCHO formation and loss, and uncertainties in the meteorology affecting transport patterns.
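The kind of statistics summarized in such a model-versus-station comparison (relative mean bias and correlation coefficient) can be sketched as follows. The exact definitions used for Table 3 are not given in the text, so the formulas below are assumptions, and the concentration values are invented placeholders rather than EMEP or CHIMERE data.

```python
from math import sqrt


def mean_bias_percent(simulated, observed):
    """Relative mean bias, (mean(sim) - mean(obs)) / mean(obs), in percent.

    Assumed definition; the text does not state which one was used.
    """
    mean_sim = sum(simulated) / len(simulated)
    mean_obs = sum(observed) / len(observed)
    return 100.0 * (mean_sim - mean_obs) / mean_obs


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Placeholder HCHO surface concentrations (arbitrary units), NOT real data.
observed = [1.0, 2.0, 3.0, 4.0]
simulated = [1.2, 2.1, 3.5, 4.2]

bias = mean_bias_percent(simulated, observed)
r = pearson_r(simulated, observed)
print(f"bias = {bias:.1f} %, r = {r:.3f}")
```

In practice the simulated series would first be sampled at the station location and averaged over the roughly 8 h sampling window of each measurement before these statistics are computed.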

Contribution of the emissions to the HCHO tropospheric columns
The aim of the present section is to evaluate how the different types of emissions and the tropospheric columns of HCHO [...] It is verified that the tagging procedure does not alter the results for the original species. In the case of the 4th pathway, several HCHO precursors with anthropogenic sources included within the CHIMERE chemical mechanism are treated simultaneously. Only those directly emitted without any secondary sources are considered, in order to avoid overcomplicating the tagged scheme. In particular, CH₃CHO, CH₃OH, and MEK are emitted but can also be produced secondarily, and thus are not considered. This simplification is reasonable because the primary emissions of these 3 species are negligible compared to those of the other anthropogenic VOCs (Table 1). However, methanol has a large biogenic source that is not accounted for in CHIMERE: the biogenic part (7.4×10¹⁰ molecule/cm²/s, Granier et al., 2005) is more than one order of magnitude larger than the anthropogenic part (0.3×10¹⁰ molecule/cm²/s). To estimate the impact of neglecting biogenic methanol on the HCHO column, we calculate the total potential production of HCHO as the sum of the individual potential productions of each precursor (defined as the yield of the precursor multiplied by the corresponding mean European emissions). The total potential production of HCHO obtained is about 10% larger when biogenic emissions of methanol are included. The relatively long lifetime of methanol (Table 1) suggests that the formaldehyde produced by the biogenic methanol emitted in the model domain would be widely dispersed over the domain and would not provide any information on the emissions. Neglecting biogenic methanol therefore introduces an error of the order of 10% on the simulated tropospheric columns of HCHO.
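The potential-production estimate described above (a sum over precursors of yield times mean emission, evaluated with and without biogenic methanol) can be sketched as follows. Only the two methanol emission values come from the text; all yields and the other emissions are invented placeholders, so the printed increase is illustrative and is not meant to reproduce the 10% figure.

```python
def total_potential_production(precursors):
    """Sum of (HCHO yield x mean emission) over all precursors."""
    return sum(hcho_yield * emission for hcho_yield, emission in precursors.values())


# name: (HCHO yield, mean emission in molecule/cm^2/s)
# Yields and non-methanol emissions are PLACEHOLDERS, not values from the paper.
precursors = {
    "isoprene": (0.5, 50e10),
    "ethene": (0.8, 10e10),
    "methanol_anthropogenic": (1.0, 0.3e10),  # emission value from the text
}

without_biogenic = total_potential_production(precursors)

# Add the biogenic methanol source quoted in the text (Granier et al., 2005).
with_methanol = dict(precursors)
with_methanol["methanol_biogenic"] = (1.0, 7.4e10)
with_biogenic = total_potential_production(with_methanol)

increase_percent = 100.0 * (with_biogenic - without_biogenic) / without_biogenic
print(f"increase in potential HCHO production: {increase_percent:.0f} %")
```

The relative increase depends entirely on the yields and emissions assumed for the other precursors, which is why the actual values of Tables 1 and 2 would be needed to recover the estimate given in the text.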
The simulated 2003 summer means of the different contributions of HCHO sources are presented in Fig. 5. The results are presented at 10:00 LT, i.e. at the overpass time of SCIAMACHY (see Sect. 4). The background contribution represents about 55% of the HCHO tropospheric columns on average but can be as low as a few percent in some grid cells and reaches up to 85% in some others, especially over sea. The mean isoprene contribution to the HCHO tropospheric column is about 17% and 21% with and without over-sea columns, respectively. Isoprene oxidation is the largest source of formaldehyde over the south-western half of France, over Greece and over North Africa. It also dominates contributions from other emissions over Poland and the Balkans. Note that the impact of the isoprene emissions on the HCHO columns is mainly localized above the source regions. The spatial correlation (r²) between the emissions and the isoprene-tagged formaldehyde columns is 64%. This reflects the fact that isoprene is oxidized rather quickly to HCHO, i.e. largely within the same grid cell. The terpene emissions can contribute up to 25% of the HCHO column, but their mean contribution is rather low (8% on average). The maxima of the contribution are also localized close to the main source regions, but the spatial correlation of the tagged columns with the emissions is smaller (45%), most likely as a result of the lower contribution to the HCHO columns compared to that of isoprene. The mean contribution of the anthropogenic emissions to the HCHO column (11% on average) is also smaller than the isoprene contribution. The spatial correlation between the anthropogenic emissions and the tagged columns is also reduced (44%) compared to isoprene, and only the unsaturated species contribute significantly to this correlation. Alkane emissions, represented by n-butane, are too unreactive to be oxidized in the same grid cell (Table 2). A large contribution (up to 40%) is observed above Northern Italy, with the largest columns localized above the most intense source regions. This implies that the influence on HCHO of the most reactive species, such as C₂H₄, C₃H₆ or o-xylene (Fig. 1), is being observed in this case. These molecules are emitted in amounts of the same order of magnitude in the north-west of Europe, but with a much smaller impact on the HCHO columns. This behaviour is attributed to two factors: (i) the longer residence time of pollutants over the Po Valley (lower winds), (ii) the larger oxidative capacity of the air in this region, which results from larger amounts of actinic radiation, relatively high ozone and water vapour concentrations (increased OH production) and lower radical losses due to lower NOₓ levels. Finally, the sum of all the contributions (background + emissions) is shown in Fig. 5. This map shows the influence of the boundary conditions on the HCHO tropospheric column (from the difference to 100%). In agreement with the dominant winds in Europe, the boundary conditions have a significant impact mainly close to the western and northern boundaries of the domain. The contribution ranges from 20 to 30% in the boundary regions to more than 70% in the extreme west of the domain. Ireland and the north of the United Kingdom are the most affected areas. The contribution decreases rapidly towards the interior of the domain, consistent with the short formaldehyde lifetime of several hours.
It is also worth noting that the temporal and spatial variability of the HCHO tropospheric columns is driven, on average over the European domain, by isoprene variability, followed by the background variability. The influence of anthropogenic VOCs and terpenes is limited except in regions with large emissions. The temporal variability (during summertime) is estimated by the standard deviation of the mean tagged column over the European domain and is therefore likely underestimated. The standard deviation is 0.37×10¹⁵ molecule/cm² for isoprene-tagged columns, 0.27×10¹⁵ molecule/cm² for background-tagged columns, and 0.10 and 0.17×10¹⁵ molecule/cm² for anthropogenic- and terpene-tagged columns, respectively. The spatial variability is estimated by the standard deviation of the European tagged columns averaged over the June-August period. The standard deviations are larger than for the temporal variability, with a value of 1.11×10¹⁵ molecule/cm² for isoprene (1.01, 0.51 and 0.42×10¹⁵ molecule/cm² for background, anthropogenic and terpenes, respectively). In conclusion, methane oxidation is the dominant source of formaldehyde (considering the total tropospheric columns). The amount of formaldehyde produced by isoprene oxidation is significantly larger than that produced by the other emitted VOCs and dominates the HCHO production from NMVOCs except in very localized areas (e.g. the north of Italy). Isoprene is also the main driver of the temporal and [...] (Bovensmann et al., 1999).
The maximum scan width in the nadir view is 960 km and global coverage is achieved within 6 days. The horizontal resolution in the nadir mode is 30×60 km². The retrieval of vertical columns of HCHO is based on the DOAS technique, which relies on the separation of narrow-band absorption signatures from broad-band absorption and scattering features. The retrieval consists of the determination of the slant column density of the species considered and its conversion to a vertical column amount by applying an air mass factor (AMF). The AMF accounts for the path of light through the atmosphere and takes the vertical profiles of scattering and absorbing species into account. The spectral region of 334-348 nm was selected for the retrieval to avoid any correlation with an instrument grating polarization structure around 360 nm. In comparison to HCHO measurements from GOME (Global Ozone Monitoring Experiment, e.g. Wittrock et al., 2000) this leads to a reduced signal-to-noise ratio. The AMF calculation uses three standard profiles based on observations reported in the literature, which are assigned to the individual pixels and times based on external information (Wittrock, 2006). Only ground scenes with less than 20 percent cloud cover are considered. Prior to conversion to vertical columns, the slant columns were normalized by assuming a mean value of 3.5×10¹⁵ molecule/cm² in the region between 180°W and 160°W, in agreement with climatological values from the MOZART (Horowitz et al., 2003) and LMDz-INCA (Hauglustaine et al., 2004) models. This normalization is necessary to compensate for offsets introduced by the solar reference measurements (Richter and Burrows, 2002) and for interferences by other absorbers. The uncertainty of the HCHO vertical column retrieved from SCIAMACHY measurements depends mainly on the fitting accuracy. For a single pixel (30×60 km² footprint) it is about 10¹⁶ molecule/cm². By averaging both in space and time this can be significantly reduced (see section below). The surface spectral reflectance, the assumptions made for the aerosol vertical distribution and optical thickness, and the assumed vertical distribution of formaldehyde in the lowermost troposphere contribute to the uncertainty of the AMF calculation in the retrieval. This error is strongly scene-dependent but typically varies in the range of 10 to 30% of the vertical column. Systematic error sources are the temperature dependence of the HCHO cross section and inaccuracies in its absolute value, spectral interference with other trace gases (in particular ozone), and the normalization of the slant columns using a reference sector. These sources sum to about 3 to 5×10¹⁵ molecule/cm² depending on latitude, which is in reasonable agreement with the error budget reported in De Smedt et al. (2008). That study also found its HCHO columns and the Bremen HCHO columns to be consistent within 10% above source regions. A detailed discussion of the satellite data evaluation used here can be found in Wittrock (2006). SCIAMACHY data are initially regridded onto a 0.5°×0.5° grid to match the model resolution.
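The slant-to-vertical conversion described in this section can be written compactly. In the sketch below the symbols V (vertical column), S (normalized slant column) and A (air mass factor) are introduced here for illustration, and the error-propagation expression assumes independent errors on S and A, which the text does not state.

```latex
V = \frac{S}{A},
\qquad
\frac{\delta V}{V} \approx
\sqrt{\left(\frac{\delta S}{S}\right)^{2} + \left(\frac{\delta A}{A}\right)^{2}}
```

Under that assumption, the 10 to 30% AMF uncertainty quoted above would translate directly into a 10 to 30% contribution to the relative uncertainty of the vertical column, in addition to the fitting error on the slant column.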

Results
The comparison of daily SCIAMACHY observations and simulations above Europe is not meaningful because the mean European column of formaldehyde (∼7×10¹⁵ molecule/cm²) is smaller than the error (random error for one pixel about 10¹⁶ molecule/cm², see above). Thus, it is necessary to average the observations either temporally or spatially (or both) to reduce the errors. In a first step, we choose to temporally average the observations on a seasonal (summer months) and a monthly basis. The random error is reduced by the square root of the number of SCIAMACHY overpasses during the time period considered. On average the number of overpasses is about 6.25 per month and 18.75 per season for one pixel. A detailed calculation for each pixel indicates that the total error decreases to 3.8×10¹⁵ molecule/cm² on average when observations are averaged over the 3 summer months and to 5.4×10¹⁵ molecule/cm² when the average is made over one month. This latter error is still large compared to the mean HCHO column. In order to have a comparable error level over 3 months and over one month, the observations have been regridded on a 1°×1° grid for the monthly means. The summer and monthly tropospheric formaldehyde columns measured by SCIAMACHY in 2003 and 2005 are displayed in Fig. 6 (middle and lower panels) on a 0.5°×0.5° grid for the JJA period and on a 1°×1° grid for individual months. For 2005, a rather normal year in terms of climatic conditions, the HCHO columns observed by SCIAMACHY decrease from June to August (Fig. 6 and Table 5), but they are too close to the detection limit of the instrument to be unambiguously compared to model simulations, especially in July and August. In 2003, the mean European column (7.33×10¹⁵ molecule/cm²) for the JJA period is 50% larger than in 2005 (Table 5). The exceptionally high temperatures that persisted throughout the summer, with a maximum at the beginning of August, led to much larger biogenic emissions (Curci et al., 2008). Combined with stagnant anticyclonic conditions, this leads to a much larger amount of HCHO produced. The temporal evolution of HCHO columns differs from 2005: even if the columns observed in July are smaller than those observed in June, unusually large columns are measured in August during the heat wave period (Fig. 6 and Table 5). The continental columns are generally larger than the oceanic columns, in agreement with the significant share of continental emissions leading to HCHO production. The oceanic columns are quite similar between 2003 and 2005 (Table 5) except for August, when the heat wave influence was the largest. Moreover, the spatial variability (standard deviation) of the columns is also larger in continental areas than over ocean (Table 5). One expects that the enhanced variability is mainly due to an increased variability of the sources over land. However, some observations have to be taken with caution, especially when monthly averages are considered. Their uncertainty can be in the range of their absolute value, for instance in some regions in Spain during June and July. Some of the large columns observed over sea are also questionable because they might be artifacts due to interfering contributions that can perturb the HCHO retrieval (Wittrock, 2006). De Smedt et al. (2008) have shown that using a slightly different window for the retrieval avoids some interferences and reduces the large values observed over sea in some but not all cases. However, large wild fires occurred in August 2003 in Portugal and Spain, and their emissions (not included in CHIMERE) could explain the large columns observed in the Gulf of Gascony (e.g. Fig. 12 in Tressol et al., 2008).
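The averaging argument above can be checked with a few lines of arithmetic. The sketch below assumes that the random error of a single pixel (10¹⁶ molecule/cm²) scales with the inverse square root of the number of overpasses, that it adds in quadrature to a latitude-independent systematic error of 3×10¹⁵ molecule/cm² (the lower end of the range quoted earlier), and that a 1°×1° cell pools four 0.5°×0.5° pixels. The function and variable names are illustrative, and the pixel-by-pixel calculation behind the quoted values is not reproduced here.

```python
import math

def total_error(random_per_obs, n_obs, systematic):
    """Total error after averaging n_obs observations.

    The random part shrinks as 1/sqrt(n_obs); the systematic part does not
    average out and is added in quadrature (an assumption of this sketch).
    """
    random_avg = random_per_obs / math.sqrt(n_obs)
    return math.hypot(random_avg, systematic)

RANDOM_PER_PIXEL = 1.0e16  # molecule/cm^2, single-pixel random error (from the text)
SYSTEMATIC = 3.0e15        # molecule/cm^2, assumed constant over the domain

# Mean number of overpasses per 0.5 x 0.5 degree pixel (from the text)
N_MONTH = 6.25
N_SEASON = 18.75

seasonal = total_error(RANDOM_PER_PIXEL, N_SEASON, SYSTEMATIC)
monthly = total_error(RANDOM_PER_PIXEL, N_MONTH, SYSTEMATIC)
# Assumption: a 1 x 1 degree cell pools four 0.5 x 0.5 degree pixels
monthly_1deg = total_error(RANDOM_PER_PIXEL, 4 * N_MONTH, SYSTEMATIC)

print(f"JJA mean, 0.5 deg grid:   {seasonal:.2e} molecule/cm^2")
print(f"monthly mean, 0.5 deg:    {monthly:.2e} molecule/cm^2")
print(f"monthly mean, 1 deg grid: {monthly_1deg:.2e} molecule/cm^2")
```

Under these assumptions the seasonal value comes out close to the 3.8×10¹⁵ molecule/cm² quoted above, and regridding the monthly means to 1°×1° brings the monthly error down to a comparable level.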
The standard deviation of the mean European HCHO columns observed by SCIAMACHY is indicated between parentheses; d The bias and the RMSE are given in absolute value; e Spatial correlation

SCIAMACHY are qualitatively reproduced by the model (Fig. 6, JJA period). Unusually large formaldehyde values during August are also simulated by the model, in agreement with the strong heat wave conditions encountered in Europe for this period. However, the magnitude of these enhanced columns is smaller in the model and the spatial distribution is not well reproduced cell by cell. A small tendency of the model to underestimate formaldehyde is also found for June but not for July (Table 5). The mean biases for individual months are smaller than 25% and are clearly within the measurement and model error limits. The relative RMSE (root mean square of the error) of 40-50% is large but is of the order of the SCIAMACHY uncertainties. The differences between the observations and the simulations are larger for the oceanic columns (Table 5), but these columns have to be considered with caution as mentioned previously. If one focuses on the continental part, the differences are significantly reduced: the mean bias for the JJA period decreases by a factor of 2, while RMSE and spatial correlations are rather similar. The main differences between the observations and the simulations occur in the South-East of France, in North Africa and more generally at latitudes higher than 50°N, whatever the time period considered. The discrepancy in North Africa is not surprising because the biogenic emissions used there are more uncertain: the tree species distribution is extrapolated from a typical Mediterranean distribution, taken from that of Greece. Discrepancies at higher latitudes are more difficult to explain. They could be partly related to emissions but also to a reduced retrieval quality in these areas due to a higher contribution of the ozone absorption to the HCHO column uncertainty. Note that the model-to-observation differences over North-Eastern Germany have different signs when tropospheric columns or surface values are compared (positive bias for simulated surface values at Waldhof, negative bias for tropospheric columns). In addition, as discussed in Sect. 2.3, the simulated columns in the North of Europe (UK, Poland) are the most affected by the boundary conditions. The use of climatological monthly means to force the model at the boundaries may also explain some of the differences in these regions. However, the small spatial correlations obtained when model and observations are compared cell by cell (Table 5) suggest that such comparisons are not the most satisfying because significant noise remains in the measurements. Additional spatial averaging over regions with similar formaldehyde columns and emission types helps to reduce the measurement noise and allows more pertinent comparisons with simulations. Instead of further degrading the horizontal resolution without any relation to the source regions, we chose to define subdomains for their consistency in terms of the influence of biogenic emissions on HCHO and for the specific differences noted between SCIAMACHY and CHIMERE (e.g. South-East of France or Poland). Nineteen subdomains have finally been selected (Fig. 7). The SCIAMACHY and CHIMERE formaldehyde columns have been spatially and temporally averaged over each subdomain and compared. The total error on the SCIAMACHY columns is now reduced to the systematic error (3×10¹⁵ molecule/cm²): the random error contributes less than 1%. The total error corresponds to 30 to 70% of the tropospheric columns depending on the subdomain.
The comparison results between SCIAMACHY and CHIMERE mean columns are shown in Fig. 8 for the JJA period. The agreement between SCIAMACHY and CHIMERE is better than 20% for the majority of the subdomains. The differences are the largest in the northernmost subdomains (5, 6, 17, and 19), where simulated HCHO columns are underestimated in comparison to the SCIAMACHY measurements. In contrast, the observed columns are smaller (by more than 20%) than the simulated ones in subdomains 2, 11 and 12. Subdomains 11 and 12 are also slightly impacted by the boundary conditions. Uncertainty in biogenic emissions is larger for subdomain 12 (North Africa), as seen previously. Thus, on the one hand, the differences between observations and simulations are within the SCIAMACHY errors for almost all the subdomains; on the other hand, they can be explained by the large uncertainties, especially in biogenic emissions (see below for a more detailed discussion).
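The statistics used in this comparison (mean bias, RMSE and spatial correlation) can be sketched as follows. This is a minimal illustration, assuming the bias is defined as simulation minus observation and the relative quantities are normalised by the mean observed column; the sample values are hypothetical and are not taken from Table 5.

```python
import math

def comparison_stats(obs, sim):
    """Bias (sim - obs), RMSE and Pearson correlation between two series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    bias = mean_sim - mean_obs
    rmse = math.sqrt(sum((s - o) ** 2 for o, s in zip(obs, sim)) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return {
        "bias": bias,
        "rel_bias": bias / mean_obs,
        "rmse": rmse,
        "rel_rmse": rmse / mean_obs,
        "r": cov / math.sqrt(var_obs * var_sim),
    }

# Hypothetical columns in units of 1e15 molecule/cm^2, one value per grid cell
observed = [8.0, 6.0, 10.0, 4.0]
simulated = [7.0, 6.5, 8.0, 5.0]

stats = comparison_stats(observed, simulated)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

The same function applies to cell-by-cell comparisons and to subdomain means; only the input lists change.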

Potential gain of satellite data use for emission estimates improvement
In the previous section, we defined the temporal and spatial averaging needed to obtain reasonable uncertainties on the observations and meaningful comparisons with simulations. Here we use these averaged observations as constraints for biogenic emissions of isoprene, which are the strongest contributor to tropospheric HCHO columns. For this purpose, we set up a simplified inversion scheme, i.e. we applied a Bayesian least squares method (Rodgers, 2000) to optimize the a priori source strength of biogenic isoprene in the reduced space of the 19 subdomains defined in Sect. 4. Corrections of the mean emission estimates per subdomain for the summer 2003 period are inferred from the corresponding mean HCHO columns. We choose a matrix-based formulation for the inverse modeling because the problem to solve is of reduced dimension. The optimized parameters are not directly the emissions of biogenic isoprene in the 19 subdomains but a vector of multiplicative control parameters, named correction factor vector in the following. The a posteriori solution x_ap and the corresponding a posteriori error covariance matrix P_ap are given by:
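In the standard linear Gaussian formulation of Rodgers (2000), which is assumed here to be the form intended for Eqs. (1) and (2), the solution can be written with the symbols of this section as:

```latex
\begin{align}
  x_{ap} &= x_b + B H^{T} \left( H B H^{T} + R \right)^{-1} \left( y - H x_b \right) \tag{1} \\
  P_{ap} &= B - B H^{T} \left( H B H^{T} + R \right)^{-1} H B \tag{2}
\end{align}
```

In this linearised form, the term H x_b is to be understood as the HCHO columns simulated with the a priori emissions, so that y − H x_b is the difference between observed and simulated subdomain means.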

Fig. 9. Observation matrix H used for the inverse modelling exercise. This matrix represents the Jacobian of the model in the observation space and was determined using a perturbative method (see text for details).
Here, x_b is the a priori or background correction factor vector, with unbiased Gaussian error statistics described by the covariance matrix B. y is the observation vector of HCHO columns, with error statistics also assumed to be unbiased and Gaussian and represented by the covariance matrix R. Note that in our case the model error statistics are included in the covariance matrix R. It is assumed that the observation errors, the model errors and the background errors are each uncorrelated between subdomains. This implies that the matrices R and B are diagonal; the diagonal elements of R are the sums of the variances of the observation errors and of the model errors, and those of B are the variances of the background errors. H is the observation matrix that represents the Jacobian of the model with respect to the control parameters (multiplicative factors of the emissions) combined with a projection operator mapping the model state space onto the observation space (19 subdomains). The observation matrix (H) is evaluated using a perturbative method: the emissions were increased by 20% for each subdomain separately and the sensitivity of HCHO columns was deduced for the same subdomain (diagonal terms of H) as well as for the other subdomains (off-diagonal terms of H). This means that 19 simulations are necessary to build the H matrix. The results are averaged over the summer 2003 period. The off-diagonal terms of the matrix are much smaller than the diagonal terms. Indeed, the impact of one subdomain on the others is always less than one third of its self-impact (Fig. 9). The background errors (B) are set to a factor of 3, in line with the upper bound of the uncertainties in biogenic emissions given by Simpson et al. (1999). The observation errors are close to the systematic errors derived in Sect. 4.2 (about 3×10¹⁵ molecule/cm²). The observation error considered in the R matrix is the quadratic sum of this systematic part and of the random part calculated for each subdomain. The model errors related to the chemistry scheme have been discussed in Sect. 2.3: an error of 10% on the HCHO production from isoprene oxidation is reasonable. In addition, it is necessary to account for the terpene contribution, which is largely uncertain, the uncertainty in emissions being the major contributing factor. Considering the mean contribution to the column (8%, Sect. 3) and an uncertainty of a factor of 3 for the emissions, the averaged error linked to terpenes can be estimated at 1.75×10¹⁵ molecule/cm² (note that this error is again explicitly calculated for each subdomain). Combined with the uncertainty on the HCHO production from isoprene oxidation and the uncertainty resulting from the omission of biogenic methanol, the model error amounts to 2×10¹⁵ molecule/cm² on average. Note that the contribution of the uncertainty on the anthropogenic part of the column is negligible compared to the other error sources and is thus not accounted for in the calculation of the model error.
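The procedure described above (perturbative Jacobian, diagonal B and R, Bayesian least squares update) can be sketched on a toy problem. Everything numerical below is hypothetical: two subdomains instead of 19, a linear stand-in for CHIMERE called `toy_model`, a background standard deviation of 2 on the correction factors, and synthetic observations. Only the 20% perturbation and the error levels of 3 and 2×10¹⁵ molecule/cm² in R come from the text. The update uses the standard linear Gaussian form, which is assumed to match Eqs. (1) and (2).

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def inverse(a):
    """Gauss-Jordan inversion with partial pivoting (adequate for small matrices)."""
    n = len(a)
    aug = [list(map(float, a[i])) + [1.0 if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def jacobian_by_perturbation(model, x_ref, step=0.2):
    """Increase each control parameter by `step` (20%) in turn, one run each."""
    y_ref = model(x_ref)
    columns = []
    for j in range(len(x_ref)):
        x_pert = list(x_ref)
        x_pert[j] *= 1.0 + step
        y_pert = model(x_pert)
        columns.append([(yp - yr) / (step * x_ref[j])
                        for yp, yr in zip(y_pert, y_ref)])
    return transpose(columns)  # rows: observed subdomains, columns: parameters

def bayesian_update(x_b, y_obs, y_sim, H, b_var, r_var):
    """x_ap = x_b + K (y_obs - y_sim), P_ap = B - K H B, K = B H^T (H B H^T + R)^-1."""
    B, R = diag(b_var), diag(r_var)
    BHt = matmul(B, transpose(H))
    S = matmul(H, BHt)
    S = [[S[i][j] + R[i][j] for j in range(len(R))] for i in range(len(R))]
    K = matmul(BHt, inverse(S))
    dx = matmul(K, [[yo - ys] for yo, ys in zip(y_obs, y_sim)])
    x_ap = [xb + d[0] for xb, d in zip(x_b, dx)]
    KHB = matmul(K, matmul(H, B))
    P_ap = [[B[i][j] - KHB[i][j] for j in range(len(B))] for i in range(len(B))]
    return x_ap, P_ap

def toy_model(x):
    """Hypothetical HCHO columns (1e15 molecule/cm^2): background + isoprene part."""
    return [3.0 + 4.0 * x[0] + 0.5 * x[1],
            3.0 + 0.8 * x[0] + 3.0 * x[1]]

x_b = [1.0, 1.0]                    # a priori correction factors
H = jacobian_by_perturbation(toy_model, x_b)
y_sim = toy_model(x_b)
y_obs = toy_model([0.5, 1.5])       # synthetic "observations"
b_var = [2.0 ** 2] * 2              # hypothetical background variance
r_var = [3.0 ** 2 + 2.0 ** 2] * 2   # observation (3) and model (2) errors in quadrature

x_ap, P_ap = bayesian_update(x_b, y_obs, y_sim, H, b_var, r_var)
print("correction factors:", [round(v, 3) for v in x_ap])
print("a posteriori std:  ", [round(P_ap[i][i] ** 0.5, 3) for i in range(2)])
```

In this toy case the correction factors move from 1 toward the values used to build the synthetic observations, and the a posteriori variances are smaller than the background ones. The real problem differs only in size (19 subdomains, hence 19 perturbed simulations) and in the per-subdomain values of R.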
The correction factors and the a posteriori errors obtained when Eqs. (1) and (2) are applied are shown in Table 6. The new inventory of isoprene emissions obtained with the correction factors is compared to the reference inventory in Fig. 10. The inverse modeling results suggest that isoprene emission estimates in the eastern part of France (subdomains 2 and 3), in Greece and in North Africa (the most sensitive regions for isoprene emissions) should be reduced significantly. The corrections have the same sign as those obtained when comparing our reference emission inventory with the recent NATAIR inventory (Steinbrecher et al., 2009; Curci et al., 2009). For the French regions, the absolute values of the corrections obtained in this study agree well with those suggested by the NATAIR inventory. In contrast, emission estimates in the northern part of Europe (subdomains 5, 6, 7, 8, 17, and 19) need to be increased, especially for Poland and the United Kingdom, by a factor of up to 2. These corrections go in the opposite direction to what the comparison between the NATAIR inventory and the reference inventory used here would suggest. Note also that confidence in the observations at these latitudes is reduced. Further investigation of both the observations and the available inventories would be necessary to reach a firm conclusion.
The improvement in the uncertainties of the inverted emissions is reported for each subdomain in Table 6. The largest improvement is obtained for the most sensitive regions (Southern and Eastern France, Greece, North Africa). In these cases, constraining isoprene emission estimates with HCHO columns from SCIAMACHY allows a reduction of the uncertainties by about a factor of two. On the other hand, the improvement is limited for the regions moderately sensitive to isoprene emissions, even where large corrections are obtained.

a To be compared to the a priori error (factor 3) considered for the biogenic isoprene emissions in this study.

In order to check the consistency of the corrections, the model was run with the inverted emission estimates. The differences between the HCHO columns from SCIAMACHY and those simulated with the new estimates are plotted in red in the lower panel of Fig. 8. The difference is systematically reduced compared to the reference case and does not exceed 20% for any of the subdomains except two (5 and 19).
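The link between sensitivity and uncertainty reduction can be made explicit in a simplified case. As a sketch, assume that the off-diagonal terms of H are neglected for a given subdomain, so that the problem reduces to a scalar one with sensitivity h, background error variance b² and combined observation and model error variance r² (these three symbols are introduced here for illustration only). The standard linear Gaussian expression for the a posteriori variance then reduces to:

```latex
\begin{align}
  \sigma_{ap}^{2} &= b^{2} - \frac{b^{4} h^{2}}{h^{2} b^{2} + r^{2}}
                   = \frac{b^{2} r^{2}}{h^{2} b^{2} + r^{2}}
                   = \left( \frac{1}{b^{2}} + \frac{h^{2}}{r^{2}} \right)^{-1}, \\
  \frac{\sigma_{ap}}{b} &= \left( 1 + \frac{h^{2} b^{2}}{r^{2}} \right)^{-1/2}.
\end{align}
```

In this approximation, a reduction of the uncertainty by a factor of two corresponds to h b / r = √3, whereas a subdomain with a small sensitivity h keeps an a posteriori error close to b regardless of the size of the correction, consistent with the limited improvement noted for moderately sensitive regions.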

Conclusions
Comparisons of HCHO tropospheric column measurements for June-August 2003 from SCIAMACHY aboard the Envisat satellite with simulated columns, with a specific focus on Europe, are presented. The columns observed in 2003 were well above the detection limit of the instrument and we showed that the most important observed spatial structures in HCHO columns are reproduced by the model (mean bias <20%). However, differences likely related to uncertainties in biogenic isoprene emissions remain. The discussion of the results required: (1) The evaluation of the model performance and errors (mainly related to HCHO chemistry). Comparisons with surface measurements show that the model succeeds in reproducing the basic photochemistry leading to the observed HCHO concentrations for most of the sites. Unfortunately, the small number of sites performing in-situ formaldehyde measurements does not allow a complete model evaluation. On the other hand, no strong inconsistencies concerning HCHO chemistry in the MELCHIOR mechanism used have been revealed. The error on the HCHO production from isoprene oxidation is estimated at 10%. Addition of other error contributions (terpenes, methanol) leads to a model error of ∼2×10¹⁵ molecule/cm².
(2) The definition of regions of interest in Western Europe for the isoprene emissions. This was achieved by implementing tagged chemical schemes in the model in order to infer the contribution of the different sources of formaldehyde to the European tropospheric columns, and from there the regions that are most sensitive to isoprene emissions.
(3) The quantification of observation errors, which also controlled the definition of the regions of interest and stressed the need for specific spatial and temporal averaging over coherent HCHO sources.
Finally, the prospects for using satellite observations as a constraint on isoprene emissions in Europe are investigated using an optimal estimation method. The inverse modeling results show that isoprene emissions in the eastern half of France, in Greece and in North Africa are overestimated in the reference inventory used. In contrast, the a priori emission estimates seem too low in the northern part of Europe, but the higher uncertainties in the observations at these latitudes do not allow a firm conclusion. In general, SCIAMACHY observations used as a constraint could reduce the errors on the emission estimates by about a factor of two in the most sensitive regions. Thus satellite formaldehyde columns are shown in this paper to give useful constraints on isoprene emissions even over Europe, where these emissions are much weaker than over North America.

Fig. 1. Cumulative HCHO yields per carbon from the oxidation of the VOCs emitted in CHIMERE compared for different chemical mechanisms under high NOx conditions (1 ppb). Solid line: MELCHIOR scheme; dashed line: MCM scheme; dotted line: SAPRC scheme; dashed-dotted line: SGMM scheme. The shaded area corresponds to ±20% deviation around the yield simulated with MCM.

Fig. 2. Location of the EMEP stations used for the comparison of surface HCHO concentrations.

Fig. 3. Time series of surface HCHO concentrations during summer 2003 as measured at the 8 EMEP stations and simulated by CHIMERE. The different emission contributions to the simulated surface concentrations are also given with grey symbols.
are connected over Europe and then to define for which emission types and in which regions satellite data could provide a meaningful constraint on the emission estimates. The mean spatial distributions of the VOC emissions considered in CHIMERE in summer 2003 are represented in Fig. 4. Anthropogenic emissions dominate the other emissions mainly in the North-West of Europe (North of France including the Paris area, Belgium, the Netherlands, West of Germany and England), in the Po Valley and along the East coast of Italy. Anthropogenic and biogenic emissions are of comparable magnitude over Spain. Isoprene emissions dominate in the other European regions, except in the Czech Republic and Slovakia where terpene emissions dominate.

Fig. 5. Contribution (%) to the tropospheric column of HCHO from methane (background), isoprene, α-pinene, and anthropogenic VOC oxidation. The sum of the different contributions is also presented in order to provide information on the influence of the boundary conditions (difference to 100%).

Fig. 6. HCHO tropospheric columns simulated by CHIMERE in summer 2003 (top panel), observed by SCIAMACHY in summer 2003 (middle panel) and 2005 (bottom panel). Values are averaged over the 3 summer months (JJA period) and individually over each summer month. Columns are given with a 0.5°×0.5° resolution for the JJA period and with a 1°×1° resolution for the monthly means (see text for details).
The formaldehyde columns observed by SCIAMACHY in summer 2003 are compared to the columns simulated by CHIMERE (upper panel) in Fig. 6, and the statistical results of the comparison are summarized in Table 5. The CHIMERE fields are taken at the time and location of the SCIAMACHY measurements for the comparison. Most of the main continental spatial structures observed by
www.atmos-chem-phys.net/9/1647/2009/ Atmos. Chem. Phys., 9, 1647-1664, 2009

Fig. 8. Comparison of the mean HCHO tropospheric column measured by SCIAMACHY and simulated by CHIMERE in each subdomain (upper panel). The relative differences are given in the bottom panel. The differences obtained with the corrected emission set are presented in red.

Fig. 10. Comparison between the reference and the corrected biogenic isoprene emissions (in molecule/cm²/s).

Table 1. NMVOC lifetimes and emissions over Europe in summer.

Table 2. One-day yields of formaldehyde per C reacted from the oxidation of the parent VOCs emitted in CHIMERE for 1 and 0.1 ppbv of NOx.
a Oxidation schemes of aromatics (here o-xylene) are not available from SGMM.

Table 3. Statistics inferred from the comparison between surface observations and simulations for the June-July-August 2003 period at each EMEP station: obs and sim represent the mean surface concentrations of HCHO observed and simulated, respectively; R is the correlation coefficient; n is the number of comparison points; the bias is given as sim-obs; and RMSE is the root mean square of the error between the observations and the simulations.

Table 4. Principle of the tagging technique.
the spatial variability. This suggests that if satellite data can be used to constrain emissions in Europe, isoprene emission estimates are likely the only estimates that could be constrained. In the following, we therefore concentrate on discussing whether satellite observations of HCHO can help to improve isoprene emission estimates.

Table 5. Statistics derived from the comparison of HCHO tropospheric columns (10¹⁵ molecule/cm²) observed by SCIAMACHY and simulated by CHIMERE for summer 2003 and 2005. The statistics are computed on the total domain, and on the land and sea columns, respectively.

Table 6. Mean initial or a priori emissions of biogenic isoprene in each subdomain and the corrections obtained by inverse modelling (correction factor and a posteriori errors). The improvement obtained for the uncertainties on the emissions is indicated in the last column.