Estimation of volatile organic compound emissions for Europe using data assimilation

The emissions of non-methane volatile organic compounds (VOCs) over western Europe for the year 2005 are estimated via inverse modelling by assimilation of in situ observations of concentration and then subsequently compared to a standard emission inventory. The study focuses on 15 VOC species: five aromatics, six alkanes, two alkenes, one alkyne and one biogenic diene. The inversion relies on a validated fast adjoint of the chemical transport model used to simulate the fate and transport of these VOCs. The assimilated ground-based measurements over Europe are provided by the European Monitoring and Evaluation Programme (EMEP) network. The background emission errors and the prior observational errors are estimated by maximum-likelihood approaches. The positivity assumption on the VOC emission fluxes is pivotal for a successful inversion, and this maximum-likelihood approach consistently accounts for the positivity of the fluxes. For most species, the retrieved emissions lead to a significant reduction of the bias, which underlines the misfit between the standard inventories and the observed concentrations. The results are validated through a forecast test and a cross-validation test. An estimation of the posterior uncertainty is also provided. It is shown that the statistically consistent non-Gaussian approach based on a reliable estimation of the errors offers the best performance. The efficiency in correcting the inventory depends on the lifetime of the VOCs and the accuracy of the boundary conditions. In particular, it is shown that the use of in situ observations using a sparse monitoring network to estimate emissions of isoprene is inadequate because its short chemical lifetime significantly limits the spatial radius of influence of the monitoring data. For species with a longer lifetime (a few days), successful, albeit partial, emission corrections can reach regions hundreds of kilometres away from the stations. Domain-wide corrections of the emission inventories of some VOCs are significant, with underestimations of the order of a factor of 2 for propane, ethane, ethylene and acetylene.


Introduction
Non-methane volatile organic compounds (NMVOCs: further abbreviated VOCs in the following) are of particular environmental concern because they are precursors of secondary pollutants, such as ozone and fine particulate matter (PM 2.5 ), and some VOCs are pollutants in their own right due to their adverse carcinogenic and/or non-carcinogenic health effects.Therefore, it is essential to have accurate emission inventories of VOCs to conduct air quality modelling studies for the development of optimal emission control strategies and for air quality forecasting as well as to follow their temporal emission trends over the years as emission control strategies become implemented.The large number of emission sources, both anthropogenic and biogenic, and processes leading to those emissions (combustion, evaporation and vegetation metabolism) make the development of accurate VOC emission inventories difficult.Furthermore, VOC emissions cannot be derived from mass balances of the emission process (as can be done for example for sulfur or heavy metal emissions) and they must be obtained from measurements conducted at the source of the emissions.Also, emission testing is costly and some emission factors are developed for total VOCs rather than for individual VOCs.Therefore, a chemical speciation must be applied to total VOC emission factors using speciation data that are limited and uncertain.Uncertainties in anthropogenic emissions have been reduced over the years as a result of better characterisation of major emission sources; for example, tunnel VOC measurements conducted in California have shown underestimates in VOC emissions from vehicles decreasing from a factor of 4 in 1987 (Pierson et al., 1990) to a factor of 2.5 in 1994 (Kirchstetter et al., 1996).Nevertheless, uncertainties still remain (Sawyer et al., 2000).Furthermore, uncertainties of the order of a factor of 2 are associated with isoprene emissions, and greater for other biogenic organic compounds (Guenther et al., 2000(Guenther et al., , 2006;;Warneke et al., 2010), due to the difficulty of estimating the meteorology-dependent emission rates for a large number of vegetation species as well as characterising the land use/land cover of the area of interest.Therefore, several approaches have been used to evaluate the accuracy of VOC emission inventories and, if appropriate, apply some correction.
Uncertainties in emission inventories have been estimated, for example, by comparing ambient air measurements in tunnels with vehicle exhaust emission estimates (e.g.Staehelin et al., 1998;Sawyer et al., 2000;Touaty and Bonsang, 2000;Stemmler et al., 2005;Ho et al., 2007).However, such experiments characterise only one source category (on-road traffic) and focus on a single location and time period.Satellite measurements in combination with various types of models have been used to assess VOC emissions.However, such techniques are limited by the number of VOCs that can be measured via satellite-borne instruments (Vijayaraghavan et al., 2008) and to areas that are specific to a major source category, e.g. the use of formaldehyde (an oxidation product of isoprene) to estimate isoprene emissions in remote areas where biogenic emissions dominate (e.g.Shim et al., 2005;Fu et al., 2007;Millet et al., 2008;Dufour et al., 2009).Measurements of VOC concentrations aloft have also been used to estimate fluxes of VOCs originating from an area (e.g.Hopkins et al., 2009); however, there are uncertainties associated with the mass balance method used to estimate the atmospheric transport flux and relate it to an emission flux.Furthermore, such an approach is limited to the estimation of an emission flux for a given region over a given period.Comparisons of the output concentrations of air quality model simulations with observations aloft (e.g.Xiao et al., 2008) or at ground level (e.g.Harley and Cass, 1995) provide some estimates of uncertainties in VOC emissions; however, such information is typically also limited to a specific region and period for which those measurements are available.Inverse modelling has been conducted using ground-level ambient concentrations to estimate emission inventories for some air pollutants, but such studies (e.g.Quélo et al., 2005;Elbern et al., 2007;Koohkan and Bocquet, 2012) have focused so far on regulated air pollutants with ambient concentration data available from routine monitoring networks and have not yet included a large number of speciated VOC ground-level concentration measurements.Use of glyoxal satellite data (e.g.Liu et al., 2013) and a combination of formaldehyde and glyoxal satellite data (e.g.Stavrakou et al., 2009) have also been used to estimate emission fluxes of isoprene and other biogenic and anthropogenic VOC precursors of those oxidised VOC.
It is, therefore, of great interest to investigate the current status of VOC emissions using an approach that provides information for several major VOCs with spatial distribution over a large domain and for a long period of time.To that end, we present here the first assessment of the emissions of several VOCs measured routinely at several remote sites that covers all VOC sources over western and central Europe for an entire year.
In Sect.2, the chemical transport model (CTM) used for modelling the VOCs is introduced and its reduced counterpart is described.The source-receptor relationship is built using an approximate adjoint model, which is validated.The control variables (i.e. the emission fluxes), and the inversion modelling method are described.The inversion method crucially depends on the prior statistics of errors in the system, and a method to estimate the so-called hyperparameters that parameterise these statistics is introduced.In Sect.3, the setup of the numerical experiments is described.Details about the observations and the background inventory are provided.The optimal values of the hyperparameters to be used in the inversions are computed.In Sect.4, the results of the inversions are presented and discussed.A forecast test and a cross-validation test are provided to validate the corrected emission fluxes using independent observations.Some objective results on the uncertainty attached to the emission corrections are provided.Conclusions are presented in Sect. 5.

Full chemical transport model and reduced VOC model
The chemical transport model (CTM) POLAIR3D (Sartelet et al., 2007) of the POLYPHEMUS air quality modelling system (Mallet et al., 2007) is chosen to model the atmospheric concentrations of chemical species.The numerical discretisation of the model for chemistry and transport, based on a first-order time-splitting algorithm, can be summarised as follows: where Table 1 lists the 15 VOC species for which measurements are available from the European Monitoring and Evaluation Programme (EMEP) database.In order to simulate the concentrations of these species, the RACM 2 (Regional Atmospheric Chemistry Mechanism, version 2) chemical kinetic mechanism (Goliff et al., 2013) is used within the CTM.The model has been satisfactorily evaluated with a similar configuration by Kim et al. (2009).The chemical reactions considered for these species and their typical lifetime are also presented in Table 1.After undergoing oxidation, these 15 primary species result in secondary species.The latter are not presented in Table 1 because they are not measured in the EMEP network and therefore cannot be assimilated; they are, however, included in RACM 2 either explicitly or via surrogate species.Some of the primary VOC species; isoprene (ISO), acetylene (ACE), ethane (C 2 H 6 ), ethylene (C 2 H 4 ) and benzene (BEN) are treated explicitly in RACM 2. The others are represented through a lumped molecule approach and therefore need a specific treatment to be followed separately.They are added as explicit species with their own oxidation reactions written in a way that does not affect RACM 2. The emissions of the explicit species continue to be treated as part of the lumped species of RACM 2 and the oxidant species involved in the additional oxidation reactions of the explicit species (see Table 1) are included as a product of the oxidation reaction so that their concentrations are not affected by those additional reactions.This approach has been used previously and has been shown not to affect the concentrations of the original mechanism, here RACM 2 (e.g.Pun et al., 2003;Kim et al., 2011).
This chemical mechanism involves more than 300 reactions that result in nonlinear interactions among the chemical species.As a result, the computational burden of inverse modelling studies is very large.To address this issue, we developed a reduced chemical mechanism, denoted X k , which uses the concentration fields of the oxidants, hydroxyl radicals (OH), ozone (O 3 ) and nitrate radicals (NO 3 ), provided as external data.The oxidant concentration fields are pre-computed with RACM 2 and used later in the reduced mechanism.This approximation makes sense if The validation of this approximations is checked a posteriori in Sect.4.1.1.When replacing X k by X k , Eq. ( 1) becomes linear with respect to the emission fields and the computational cost of inverse modelling becomes manageable.

The source-receptor model
The source-receptor model provides the relationship between the emissions and the observations.For species s, this can be written as follows: where µ s ∈ R d s represents the vector of the observations (d s is the number of observations for species s).H s is the Jacobian operator with respect to e s , and e s = (e s 0 , e s 1 , . . ., e s k , . . .e s N t ) ∈ R E defines the hourly and spatially discretised emission vector, where E = N t × N x × N y × N z .N t is the total number of time steps, and N x , N y and N z are the total number of elements (grid cells) in the x, y and z directions.λ s ∈ R d s is the vector of the concentrations induced by the initial and the boundary conditions for species s.If µ s,i k is the observation of species s at time t k at station i, then λ s i k is the concentration at the same time and location, computed with the full CTM -i.e.Eq. ( 1), with e s = 0.The vector s represents the errors: representativeness error, model error and instrumental error of the observations.
The Jacobian operator H s can be built following two different methods.The first method consists in using the CTM as it is.Let us assume that e s ≡ δ l,h,k is the unity source at the surface coordinate l ∈ [1; and equals 0 anywhere else.Assuming clean-air boundary and initial conditions, and this source term, the CTM-simulated concentrations at time t k and station i are stored in H s,i k l,h .In order to compute the H s operator with this method, the CTM model needs to be run E times, which is computationally intensive.That is why this method is often restricted to point-wise emission sources.
The second method consists in using the adjoint model of the CTM (Marchuk, 1974;Uliasz, 1983, and many more references since the late 90's).For a monitoring site i at time t k , using the linearity of the CTM for the VOCs, the adjoint solution can be written as follows: where π i k = δ i,k is the sampling function related to the concentration measurement at station i and time t k , X † k is the adjoint of X k and M † k is the adjoint of M k .At the final time N t , φ N t i is chosen to be 0. The adjoint model is also computed for clean air boundary conditions.Then, the Jacobian matrix is given by H . The adjoint model is run d times (d ≤ s d s ).This method is of great interest to reduce the computational time when d E. It is appropriate for estimating the emissions originating from spatially distributed sources.Because the many adjoint solutions are independent, it is straightforward to parallelise their computation.
The Jacobian matrix of the present study is computed row by row -i.e. using adjoint solutions.

Control space
In order to reduce the dimension of the control space, which is the space of the fluxes to be estimated via inverse modelling, we introduce a relation between the effective control variables α s and the emission e s : In this equation, e s,b is the background vector of emission for species s.Obviously the background value of the scaling factors α s is α s b = 1, where 1 = (1, . . ., 1) T and T denotes transposition of a vector or of a matrix.Indices k, l and h are respectively related to the time sequence, the horizontal space grid and the vertical grid.The α s factors, rather than the full 3D time-dependent emission fields of Eq. (1), will be optimised.This choice of control parameters, which is only a function of l, implies that the correction of the emission fluxes is spatially distributed in the horizontal directions but that the vertical and temporal distributions of the emission fluxes are not modified by the data assimilation analysis.Indeed, the relatively low number of observations available restricts the number of parameters that can be considered for the inversion.The fact that α s does not depend on time means that the inversion relies on the a priori temporal distribution of the emissions (given by activity sectors), which is not modified.

Inversion method
Combining Eq. (2) with Eq. (4), one obtains where H s is the Jacobian matrix that relates µ s to α s : In order to optimise the α parameters, the following objective function with a regularisation term is used: The vector α s b = 1 = (1, . . ., 1) T is the first guess (mean state of the background) of α s .R s is the observation error covariance matrix.B s is the background error covariance matrix.For each species, these two matrices are both chosen to be diagonal with uniform variances; that is, Here, I N x ×N y is the identity matrix of control space and I d s is the identity matrix of the observation space.Error standard deviations r s and m s respectively give the magnitude of the prior uncertainty on the observation and the background.They will be both discussed further in the next section.
These statistical assumptions imply that we neglect any spatial and temporal correlations between grid cells in the errors.Because of the way the anthropogenic inventory surveys are performed, the anthropogenic emissions of VOCs are not in principle expected to induce long-range correlation (a few tens of kilometres) in the errors.Yet, scaling relationships used to infer emissions of pollutants from others in bottomup approaches may have biases that would be systematic over large geo-political regions.Moreover, potential important reasons for this diagonal hypothesis to fail are when the biogenic VOC emissions have correlated errors due for some VOCs to common sources and similar emission model formulations, or when transport model errors induce temporal correlation in the error covariance matrix.That is why our diagonal assumption should be understood as a convenient approximation.
In the following, two types of solution for Eq. ( 7) are considered.The first one assumes that the errors are Gaussian distributed and the analysed parameters α s a are given by the best linear unbiased estimator (BLUE): where K s is the gain matrix: The second solution is obtained by minimising Eq. (7) on α s , under the positivity constraint of each one of its entry A bounded quasi-Newton gradient-based minimisation scheme will later be used.As opposed to the Gaussian case, the retrieved scaling factors [α s ] l cannot be negative.This solution implicitly assumes a truncated Gaussian distribution for the background error statistics.

Estimation of hyperparameters
The parameters of the prior statistics, such as r s and m s , give a measure of the magnitude of the prior uncertainties.For a reliable emission inverse modelling, these often need to be estimated, in addition to the emission fluxes, because their initial guess is usually inaccurate, while the dependence of the retrieval on these parameters can be dramatic (Davoine and Bocquet, 2007).When those r s and m s are considered as parameters to be themselves estimated on top of the flux parameters, they are usually coined hyperparameters.
The estimation method for the hyperparameters depends on the statistical assumptions underlying Eq. ( 7).The observation error vector s (in Eq. ( 5)) is assumed to be Gaussian distributed, s ∼ N (0, R s ).A similar assumption applies to the control parameters: α s ∼ N (1, B s ).The probability density function (pdf) of the observation vector can be computed as follows: This pdf is also proportional to the likelihood of r s and m s .In order to estimate the hyperparameters r s and m s , Desroziers and Ivanov (2001) suggested an iterative method to converge towards a fixed point.Chapnik et al. (2006) showed that this approach converges to one local maximum of the likelihood.The maximisation of ln (p(µ s |r s , m s )) with respect to the two hyperparameters gives the stationary conditions: where • is the Euclidean norm.However, the Desroziers method relies on Gaussian assumptions, and, for the sake of consistency, one needs another approach to compute the likelihood under the truncated Gaussian assumption (Winiarek et al., 2012).In this case, the prior on the scaling factors is where 1,B s (0) is the Gaussian cumulative density function (cdf) of N (1, B s ).I α s ≥0 is a function equal to 1 when [α s ] l ≥ 0 for each l, and equal to 0 otherwise.This pdf is referred to as N (1, B s , 0) or the truncated Gaussian distribution.From Eq. ( 10) and Eq. ( 13), one can derive where r + s and m + s refer respectively to the standard deviation of the observation error and of the background error (departure of the surface fluxes from their a priori values) according to the truncated Gaussian distribution.P a s is the a posteriori error covariance matrix of the control variables of the Gaussian case, Even though it is of formal use in the truncated Gaussian case, P a s is not the error covariance matrix of the truncated Gaussian case.A mathematical hardship is that Eq. ( 15) requires the computation of a 1768-dimensional (N x × N y ) integral over the space of all positive emission fields.To overcome this difficulty, we resort to the stochastic sampling technique used by Winiarek et al. (2012).

Observations
The in situ observations used in this study are extracted from the EMEP database1 .The EMEP monitoring network covers most of Europe.Eleven stations of western Europe are used in this study, resulting in a rather sparse network.These stations measure the concentrations of 14 different VOCs.Note that the m-xylene and p-xylene are combined in a lumped mp-xylene category.The observations are performed from 11 January 2005 to 29 December 2005.The sample duration ranges from 20 min to 24 h depending on stations and species2 .
Table 2 gives the number of observations used in the inversion per species and per station, for a total of 18 675 observations from 11 stations.A forecast test will also be performed using 19 746 observations of the year 2006.The VOC station Kollumerwaard, in the Netherlands (code NL0009R), does not provide any observation in 2005.However, this station provides 26 732 observations in 2006, and they will be used for cross validation.The locations of the EMEP sampling stations for VOCs are shown in Fig. 1.
The uncertainties associated with the measurements have been analysed in detail by Sauvage (2008) for three stations of the monitoring network considered here (stations 9, 10 and 11 of Fig. 1).For the VOC considered here, the measurement uncertainties were estimated to consist of a heteroscedastic component ranging from 11 % (ethane) to 25 % (m-& pxylene) of the measured concentration and a constant component of 0.01 ppb associated with the detection limit.

Inversion and validation setup
Two simulation periods are considered.The first one is the assimilation time window of the study, from 11 January to 29  ] l used in the inversion for each species s is 1768.The meteorological data are generated from the reanalysis fields of the European Centre for Medium Range Weather Forecast (ECMWF), delivered in 60 vertical levels and every 3 h, with a horizontal resolution of 0.36 • × 0.36 • .For anthropogenic emissions, the background emissions over the whole domain are provided by the EMEP inventory for the years 2005 and 2006 (Tarrasón et al., 2007;Fagerli et al., 2008).The disaggregation of the EMEP NMVOC emissions among the many VOC species simulated in the CTM is performed following Passant (2002).
The anthropogenic emissions of EMEP have a resolution of 0.5 • × 0.5 • .These emissions are modulated in time with the help of the hourly, weekly and monthly distribution coefficients, provided by the GENEMIS project (GENEMIS, 1994).The biogenic emissions of isoprene are also taken into account using the model proposed by Simpson et al. (1999).All these emissions are used as a first guess (mean state of the background), e b , in the data assimilation experiments.
The initial and boundary conditions (top and lateral faces of the domain) concentration fields are obtained from the global chemistry transport model MOZART 2 (Horowitz et al., 2003).Since the species we are interested in are not all explicitly present in MOZART 2, the values for the VOCs not included in MOZART 2 were inferred from the concentration fields of some species present in MOZART 2. The factors applied for this inference are given in Table 3 (see Rudolph and Ehhalt, 1981;Rudolph and Johnen, 1990;Penkett et al., 1993).
Our inverse modelling methodology estimates the magnitude of errors of all types.However, it does not handle a possible systematic bias, for instance in the boundary conditions.To get a rough idea of the uncertainties in the boundary conditions, we have performed a comparison of the results of the MOZART 2 model, from which are extracted the boundary conditions, to the EMEP observations used in the inversion.From this comparison, rough average relative errors are estimated, ranging from 1 % (for C 3 H 8 ) to 116 % (for C 3 H 6 ).For the long-lived species (ACE, C 2 H 6 and C 3 H 8 ) rough average relative errors are respectively −60 %, 8 % and 1 %.Thus, we believe that the inversion of acetylene emissions, with a lifetime of about 110 days and presumably biased boundary conditions, must be interpreted with caution.

Verification of the adjoint solutions
In order to generate the adjoint solutions at a low computational cost, we have used an approximate adjoint model following the construction of Bocquet (2005).The approximate adjoint is built by discretising the adjoint of the continuous transport equation.Since, in general, discretisation and adjointisation are operators that do not commute, this adjoint is only approximate.However, a benefit of this approach is that its implementation re-uses the forward CTM but with reversed wind fields and a backward integration in time.
Moreover, we have assumed the lifetime of the species within the domain to be less than 10 days.After 10 days, the VOCs are assumed to be out of the domain or consumed In order to check these approximations, the concentration fields from the adjoint model, obtained from the contribution of the source (Hα when α = 1), were compared with the concentration fields from the direct simulation (for e = e b , the EMEP inventory first guess, with clean air boundary and initial conditions).This is the so-called duality test (Davoine and Bocquet, 2007).
Figure 2 shows the comparison between the source contribution to the observations estimated with the forward model and with the adjoint model.Table 4 gives the Pearson correlation, the mean values and the root-mean-square error that result from the comparison of the two data sets.These results indicate that, with a view to inverse modelling, the adjoint model is accurate enough for most species.However, the approximation is not as good for isoprene (fractional bias of 17 %, a correlation of 0.97 and a normalised root-meansquare error of about 0.25) due to its short lifetime that induces large numerical errors in the adjoint computation because of the resulting weak correlations between the sources and the receptors.

Values of the hyperparameters
In the Gaussian case, the optimal values of the hyperparameters r s and m s are obtained by value screening of the pdf Eq. ( 11).This means that the couples (r s , m s ) run over all points of a regular grid of the (r s , m s ) plane, and the corresponding pdf values have been computed.Their optimal values are close to those maximising the likelihood pdf.They are reported in Table 5 for each species.As an example, the left panel of Fig. 3 displays p(µ s |r s , m s ) for the species nbutane (NBUT).The coordinates are normalised with respect to the values of the hyperparameters obtained from the fixedpoint solutions of Eq. ( 12).It can be seen that the optimal value of the likelihood Eq. ( 11) is equal to the result of the fixed-point method.
A comparison between the likelihoods Eq. ( 11) and Eq. ( 14) is also shown in Table 5.The optimal value of r + s and m + s computed with Eq. ( 14) are obviously different from r s and m s .Remarkably r + s is always larger than r s .Indeed, the Gaussian assumption inversion incorrectly As an example, the right panel of Fig. 3 displays the likelihood of the hyperparameters for ethane in the truncated Gaussian case.Again, the variables are normalised with respect to the values obtained from the Desroziers method.

Inversion results
Four types of simulations conducted for the year 2005 are reported here: -In case A, the VOC concentrations are simulated using the EMEP inventories, e k = e b k in Eq. ( 1).This is a free run serving as a reference since the observations are not assimilated.
-In cases B1 and B2, the concentrations are simulated using the emissions obtained from Eq. ( 4).In case B1, the scaling factors α s used in this equation are obtained from the BLUE matrix formula Eq. ( 8).By construction, case B1 can lead to negative emission fluxes.This may be considered unphysical but it might have some potential use for air quality forecast.
In case B2, the scaling factors α s are obtained from the minimisation of Eq. ( 7).The L-BFGS-B minimisation tool (Byrd et al., 1995) is used under the constraints that α s ≥ 0. However, it is assumed in both cases that the errors are essentially Gaussian, so the hyperparameters used in the inversion are computed with the Gaussian Fig. 3.These density plots display a monotonic transform of the likelihood of the hyperparameters for NBUT in the Gaussian case (left) and C 2 H 6 in the truncated Gaussian case (right).The monotonic transform is used to obtain a better contrast in the density plot.The abscissa and ordinate are normalised according to the optimal parameters obtained from the fixed-point method.likelihood following Sect.2.5.In case B2, the statistical assumptions are different in the estimation of the fluxes and the estimation of the hyperparameters, which may lead to inconsistencies.However, the fact that the r + and r hyperparameters are not too different proves that these inconsistencies are small.
We have also considered a third case B3, which takes the results of B1 and artificially sets all negative values to 0. However, we found that because it is not a minimum of Eq. ( 7), it leads to a poorly performing estimation.Therefore, case B3 was ruled out and is not reported here.
-Case C is similar to case B2 except that the hyperparameters are obtained using the truncated Gaussian likelihood following Sect.2.5.In this case the estimation of the emission fluxes and the estimation of the hyperparameters are statistically fully consistent.

A posteriori verification of the model linearisation
In the full CTM, which is used in the present study, the chemical kinetics of the reactions can be written as follows: where δX (c) denotes the variation of the VOC concentrations field with respect to the variation of the oxidant concentration field.Until now, we assumed that δX (c) X(c), and that the reduced model was linear.
In order to check that hypothesis, the a priori and the a posteriori (case C) oxidant concentration fields calculated by the full CTM are compared.Because the inversion of isoprene emission will be later deemed unreliable, the EMEP emissions (a priori emissions) have then been kept for ISO to perform this test.The mean value of the OH concentration is 1.882 × 10 6 # cm −3 for the a priori fields and 1.877 × 10 6 # cm −3 for the a posteriori fields.The Pearson correlation between the a priori and a posteriori concentration fields is about 1.00.The standard deviation of the difference between the a priori and the a posteriori fields (computed for all the time steps and cells) is 2.705 × 10 4 # cm −3 .Figure 5 shows the comparison of the concentration fields of OH before and after data assimilation for two representative periods of hundreds of hours: the first one in winter, the second one in summer.For NO 3 and O 3 , the average values of the a priori concentrations are 1.418 × 10 8 # cm −3 and 1.090 × 10 12 Normalised emitted mass correction case B1 case B2 case C Fig. 5.The total emitted mass correction, normalised with respect to the total emitted mass of the EMEP inventory for cases B1, B2 and C. The normalised emitted mass correction represents the correction to the emitted mass ∆E divided by the original emitted mass E0.Therefore, there is no correction if ∆E/E0 is equal to zero.If it is equal to 1, the corrected emitted mass is twice the original emitted mass.If it is equal to -1, the corrected emitted mass is zero.If it is less than -1, then the corrected emitted mass is negative.
28 Fig. 5.The total emitted mass correction, normalised with respect to the total emitted mass of the EMEP inventory for cases B1, B2 and C. The normalised emitted mass correction represents the correction to the emitted mass E divided by the original emitted mass E 0 .Therefore, there is no correction if E/E 0 is equal to 0. If it is equal to 1, the corrected emitted mass is twice the original emitted mass.If it is equal to −1, the corrected emitted mass is 0. If it is less than −1, then the corrected emitted mass is negative.molec cm −3 , respectively.They are 1.418 × 10 8 # cm −3 and 1.092 × 10 12 molec cm −3 , respectively, for the a posteriori concentrations.The Pearson correlations between the a priori and a posteriori concentration fields are about 1.00 for both NO 3 and O 3 .The standard deviation of the difference between a priori and a posteriori fields is 2.627×10 4 # cm −3 for NO 3 and 4.218×10 9 molec cm −3 for O 3 .One additional iteration of the linearisation and inversion cycle barely changes these results.
Furthermore, an examination of the a posteriori VOC concentration fields shows that the results obtained with the reduced linear model are very close to those obtained with the complete model.The relative bias between the two sets of concentrations for all of the VOC species and over the entire spatial domain is 1 % for the year 2005 and the correlation is 1.00.Therefore, since the oxidant concentration fields are little affected by the VOC data assimilation, we consider that the hypothesis that the reduced model is about linear is verified.

Comparison to the observations
The four runs A, B1, B2 and C are compared with the observations of the analysis period (year 2005).Statistical indicators for this comparison are reported in Table 6.For most species the bias between the concentrations and the observations decreases with data assimilation, except when the mean of the simulation is already close to the measurement mean.The root-mean-square errors (RMSEs) and the normalised mean square errors (NMSEs) are systematically improved Table 6.Scores from the comparison between the observations and the simulated concentrations for four simulations.For each species, the first line represents the scores for the simulations with the background fluxes (case A).The scores of the second line and third lines are related to the simulations with the a posteriori emissions from Gaussian hyperparameters estimation (case B1 and case B2, respectively).The scores of the fourth line are related to the simulations performed with the a posteriori emissions under the truncated Gaussian assumption (case C).The means and the RMSE are in µg m −3 .Bold numbers correspond to the best agreement with the observations.in the reanalysis runs, which is consistent with the fact that our inversion scheme minimises the quadratic error.For all species, except acetylene (ACE), the Pearson correlation coefficients R, FA 2 and FA 5 are remarkably improved in the reanalyses.FA x is the fraction of the simulated concentrations within a factor x of the corresponding observations.In the very few cases where an indicator is not improved, other indicators are improved.The decrease of the correlation in the ACE case is due to a very large bias, which is compensated by a very significant improvement of the other indicators, particularly the bias.Considering all species ( 14) and all statistical indicators (6) together, we have counted how often the forecast runs B1, B2 and C beat the free run A: 81 times out of 84 in case B1, 80 times out of 84 in case B2 and 79 times out of 84 in case C.
The fact that run B1 is slightly closer to the observations than B2 and C is consistent with the fact that the optimisation on which B1 relies is less constrained than that of B2 or C (emission fluxes can be negative in case B1).However, it does not prove that method B1 is better than method B2 or C since a comparison with the (already assimilated) observations is merely a check of consistency, not a validation.

Estimated inventories
For each of the 15 species, the total emitted mass of the EMEP inventory is compared to the a posteriori emission obtained from data assimilation following Gaussian assumption (B1 and B2) and truncated Gaussian assumption (C).The results are reported in Table 7. Caution must be used on the interpretation of the results of the statistically consistent Gaus-sian case B1 since fluxes are allowed to be negative in this case (e.g.isoprene emissions).
The results of case B2 and case C indicate that the EMEP inventories may underestimate the true emissions for propane (C 3 H 8 ), isobutane (IBUT), isopentane (IPEN), propene (C 3 H 6 ), acetylene (ACE), ethane (C 2 H 6 ), ethylene (C 2 H 4 ) and benzene (BEN).They may overestimate the true emissions for the other VOC species.For all the species, the total mass obtained from the inversion based on the truncated Gaussian assumption (case C) is between the EMEP inventory and the total mass estimated with the Gaussian assumption (case B2).Comparison between cases A and C shows a strong correction for C 3 H 8 , ACE, C 2 H 6 , and C 2 H 4 .It is less than 20 % for n-butane (NBUT), n-pentane (NPEN), isopentane (IPEN), toluene (TOLU), o-xylene (OXYL) and m-xylene (MXYL).Figure 5 presents the ratio between the correction of the emission (i.e. the total posterior emission minus the total prior EMEP emissions) and the total prior emission for the Gaussian cases (B1 and B2) and the truncated Gaussian case (C).

Spatial distribution
The spatial extent of the corrections from the EMEP network depends on the nature of the species.As an example, Fig. 6 displays the spatial ratio between the posterior emissions and the prior EMEP emissions for acetylene (ACE, with an average lifetime of 110 days), ethane (C 2 H 6 , with an average lifetime of about 60 days), propane (C 3 H 8 , with an average lifetime of about 14 days), isobutane (IBUT, with an average lifetime of about 7.5 days), ethylene (C 2 H 4 , with an average lifetime of about 1.45 days) and isoprene (ISO, with an average lifetime of about 1.7 h).Obviously, the corrections extend much farther from the monitoring stations for the longlived species, such as ACE and C 2 H 6 , than for the short-lived species.
As can be seen, with the B2 approach based on a Gaussian estimation of the prior errors, the magnitude of the corrections is significantly higher than when using the fully consistent non-Gaussian approach C. One way to understand this is that, with the Gaussian assumption, part of the error (noise) is misinterpreted as valuable information (signal), such that the Gaussian assumption leads to over-corrections.A similar phenomenon was put forward by Koohkan and Bocquet (2012) in the inversion of carbon monoxide emission fluxes: the proper identification of representativeness errors leads to smaller corrections of the emission fluxes.
Several studies have performed inverse modelling of isoprene emissions over Europe.To do so, they assimilate satellite observations of formaldehyde (e.g.Curci et al., 2010;Dufour et al., 2009).Specifically, they exploit the observations of SCIAMACHY (instrument operating onboard the sunsynchronous Envisat satellite) with a resolution of 30 × 60 km 2 .The study of Curci et al. (2010) shows an increase of about 5 % for the MEGAN (Guenther et al., 2006)   results show that the emissions of isoprene decrease by about 24 % in case B2 and by 13 % in case C for the emissions obtained from Simpson et al. (1999).This discrepancy is explained in part by the fact that the emission inventories of isoprene by Simpson et al. (1999) and MEGAN differ significantly (Bessagnet et al., 2008;Sartelet et al., 2012), with the former leading to greater isoprene emissions on average over Europe -for example by a factor of about 2.5 for the July-August 2006 period (Sartelet et al., 2012).This discrepancy is also explained by the very short lifetime of isoprene (1-2 h).For this species, assimilation of in situ observations seems quite inoperative because the information is not spread far enough by the (adjoint) model.Our results are only valid near the five stations measuring isoprene.On the contrary, satellite observations are well suited for this short-lived species as remote sensing offers a (indirect) spatially well-resolved snapshot of the concentrations, although uncertainties are associated with using oxidation products (formaldehyde, glyoxal) of isoprene to retrieve the isoprene emissions.In addition, the errors made by the approximate adjoint model (or automatic adjoint up to some numerical precision) are larger for short-lived species (Bocquet, 2012).As a consequence, the case of isoprene in this study is somehow singular because its lifetime is so short.Nevertheless, most results of isoprene are given in this study for the sake of comparison and to document the inadequacy of the in situ observations for regional inverse modelling in such a case.

Forecast test
In order to test and possibly validate the corrected emissions obtained from inverse modelling, one needs observations that have not been assimilated.One stringent test is to perform a forecast using the corrected emissions over a period of time different from the data assimilation window, assuming some time persistence of the VOC inventories.
Four inventories are generated with Eq. ( 4), using the background EMEP emission e s b , or α s b = 1 (case A), and the scaling factors of cases B1, B2 and C. In each case, a forecast is performed for the year 2006.These simulations are then compared with the independent observations of year 2006.
Note that because some of the fluxes retrieved in case B1 are negative, and because numerical schemes of CTMs often rely on the positivity of the concentrations, we had to circumvent the difficulty.One solution consists in decomposing the scaling factors into a positive part and a negative part: α s = α s + +α s − , such that, invoking the linearity of physics, the concentrations are given by H s (α s ) = H s (α s + ) − H s (−α s − ).The statistical indicators are reported in Table 8.The results indicate that, for most species, the scores are improved using the retrieved scaling factors of case B2 and case C, whereas they are degraded in case B1.Considering all species and all statistical indicators together, we have counted how often the forecast runs B1, B2 and C beat the free run A: 41 times out of 84 in case B1, 69 times out of 84 in case B2 and 77 times out of 84 in case C.
The fully consistent truncated Gaussian approach C performs best and beats the Gaussian-based but positively constrained approach B2.Both positively constrained approaches beat the fully consistently Gaussian approach B1.Since B1 partially leads to unphysical negative fluxes, this could have been expected.However, as was shown by Bocquet (2012), an unconstrained optimisation of parameters that are allowed to take unphysical values may sometimes lead to valuable better forecasts because it compensates for other sources of model error.Obviously, this is not the case here.
According to Table 8, an improvement of all the scores can be seen for C 3 H 8 , NBUT, NPEN, C 3 H 6 , C 2 H 6 , C 2 H 4 and BEN using the optimal scaling factors.However, the bias is increased for OXYL (in case B2), MPXYL (in cases B2 and C) and ISO (in case B2), using the corrected emissions.Despite that degraded bias, the other indicators are improved.For IBUT, the RMSE is increased in case B2, and the correlation is decreased in cases B2 and C. The statistical indicators show that the RMSE increases for IPEN using the corrected emissions.For that species, the Pearson correlation coefficient is also deteriorated in case B2.For TOLU, the bias, NMSE and FA 5 are degraded using scaling factors.The RMSE is also increased in case B2.The scores also show a decrease of the Pearson correlation coefficient for ACE when using the corrected emissions.

Cross-validation test
Since the EMEP Netherlands VOC station Kollumerwaard does not provide any observation for the year 2005, that station is not included in the list of stations used for the analysis of 2005.Yet, for the year 2006, 11 out of 14 VOCs were measured at Kollumerwaard, excluding NBUT, ACE and C 2 H 6 .Kollumerwaard is located far from the other stations whose observations lead to the corrections in the emission inventories.Therefore, we do not expect the simulated concentrations of the VOC with a short lifetime, at this station, to be very sensitive to the correction of the emissions.Indeed, as shown in Table 9, the scores obtained from the comparison of the simulated concentrations in case A of ISO (lifetime of about 1.7 h) with the measurements are similar to those obtained from the comparison of the concentrations in cases B and C with the observations.
For species with a longer lifetime (about 1 to 2 days), for C 3 H 6 (lifetime of about 15 h), OXYL (lifetime of about 25-26 h), MPXYL (lifetime of about 14-25 h) and TOLU (lifetime of about 2.4 days), the corrections of the inventories are more spread out over the domain.The scores indicate that the results in case C are in a better agreement with the observations.The results also show that for the species with a lifetime between 4 and 7 days (IBUT, NPEN and IPEN), the scores are deteriorated.It is likely that the corrections performed close to the Kollumerwaard station are not reliable enough.For the species with a lifetime longer than 10 days (C 3 H 8 and BEN), the scores are remarkably improved.The emission inventories of C 3 H 8 are corrected almost all over the domain.
Considering all species and all statistical indicators together, we have counted how often the forecast runs B2 and C beat the free run A at the Kollumerwaard station: 34 times out of 66 in case B2 and 43 times out of 66 in case C. The use of the corrected emissions from case C has a positive impact on the forecast at the station, even though the station is far away from the other stations whose observations were assimilated.

Uncertainty of the retrieved emissions
Although the forecast and cross-validation tests are sufficient to validate the added value of the retrieved emissions, they do not yield the expected uncertainty of these fields.
To do so, one approach would consist in performing sensitivity analysis of the retrieved emissions on several factors such as boundary conditions, oxidant species or transport model.However, that would require knowing the uncertainty of those influential factors, which are not always available for VOC, or that would be difficult to estimate, for instance, for the boundary conditions (Sect.3.2).Another approach relies on the fact that prior errors were objectively assessed via the estimation of two hyperparameters (Sect.2.5).It is then possible to estimate the posterior uncertainty.In the B1 case, the uncertainty is given by the a posteriori error covariance ma-trix of Eq. ( 15).Due to the positivity constraint, there is no analytical formula in cases B2 and C for the uncertainty of the retrieved emissions.However, it can be computed using a Monte Carlo approach.One first draws a new set of observations and a new emission background according to their prior statistics.Then an inversion is performed and constitutes one draw of the Monte Carlo analysis.A large number of Monte Carlo draws are performed until convergence of the statistics, for instance the variance of the emission field.In a first experiment, using these Monte Carlo statistics, the standard deviation of the total mass of the emission as well as the arithmetic and quadratic mean of the standard deviation of a flux in a grid cell are computed.From experimenting, the number of draws necessary to reach an acceptable convergence for the total mass and its standard deviation is 300.Because, for most VOCs, the correction does not extend far from the EMEP stations, the retrieved emission relaxes to the background away from them, such that the resulting total mass uncertainty is small.Besides, it does not mean that the estimated uncertainly cannot be locally large.Therefore, in this first experiment, we only perturb the observations, so the uncertainty that is computed is a measure of the uncertainty of the correction to the background, assuming the background is reliable wherever the observation does not lead to a correction.For each VOC species, the resulting standard deviation of the total mass is either normalised by the total retried mass or by the absolute difference between the retrieved and the background emission total mass.The second normalisation is more enlightening as it tells how reliable the correction to the background due to the assimilation of observations is.Moreover, the arithmetic and quadratic means of the standard deviation of individual emission fluxes averaged over all grid cells are also computed, and show the expectedly much stronger uncertainty of individual fluxes.The quadratic mean strengthens even more the heterogeneity of the local correction uncertainties.
Table 10 gives these standard deviations in case C and for each VOC.Each of these uncertainties needs to be interpreted with care.Let us look at three species.The correction to the total emitted mass of isoprene (ISO) has a large uncertainty of 30 %, which is consistent with the findings of this study.The correction to the total emitted mass of acetylene (ACE) has an uncertainty of 11 %, but the possible large bias that was deemed possible in the ACE because of a possible bias in the boundary conditions case cannot be diagnosed here because our methodology does not handle biases (i.e.systematic errors).Therefore, this value of 11 % does not represent the possible bias in the boundary condition (but it does account for errors in the boundary conditions without bias).Since these corrections extend far away from the stations, the mean standard deviation of individual emission fluxes is as large as 111 %.The correction to the benzene (BEN) total emission mass is 42 %, which although large is an estimation relative to the small correction brought about by the assimilation of EMEP observations.Species In a second experiment, a map of the gridded flux standard deviations obtained from the Monte Carlo analysis is displayed.As opposed to the previous experiment, both the background and the observations are perturbed according to their diagnosed statistics, such that an absolute uncertainty of the estimated emission fluxes is assessed instead of the uncertainty on the correction.Furthermore, the number of draws is chosen to be 3 × 10 4 instead of 300 because one needs to obtain convergence of the variance in each grid cell.As examples, two maps are displayed in Fig. 7: for propane (C 3 H 8 ) and isobutane (IBUT) chosen among the species of Fig. 6.Far from the stations, the uncertainty relaxes to some finite value driven by the uncertainty of the background, hence the value of hyperparameter m + s .Had we only perturbed the observations, the uncertainty would have relaxed to 0 far from the stations.As expected, close to the stations, the uncertainty decreases.As for the inverse modelling corrections, the spatial distribution of the uncertainty also strongly depends on the lifetime of the species.
Finally, let us mention that unaccounted model errors make these uncertainty estimations overly optimistic.However, the estimation of the diagonal part of the observation error covariance matrix R s performed in this study partially accounts for model errors, so this may alleviate the issue.

Conclusions
The goal of this study was to estimate the emission inventories of 15 VOC species using ground-based in situ measurements.The concentration observations at 11 stations from the EMEP network over western Europe were assimilated to perform inverse modelling of the emission field for each of the 15 species for the year 2005.
For that purpose, the Jacobian matrix -i.e. the sourcereceptor relationship -was built using the POLAIR3D CTM.To compute that matrix, a fast version of this CTM as well as its validated approximate adjoint model have been developed.The chemistry module of this fast version only includes the chemical reactions between the VOC species and three oxidants (OH, NO 3 and O 3 ), the concentrations of which are pre-computed with the full CTM.
For each species and each grid cell, a scaling factor that multiplies the local EMEP emission flux is computed.The uncertainty attached to the prior scaling factors and the covariance matrix of the observation errors, which are crucial statistical components of the inversion, are obtained using the maximum-likelihood principle.The principle was implemented using two different assumptions: (1) the errors attached to the scaling factors follow a Gaussian pdf or (2) the scaling factor follows a truncated Gaussian pdf.
In the Gaussian case, the simulated concentrations for the year 2005 using the corrected emissions lead to a significant improvement in most statistical indicators.However, the fact that the VOC fluxes are positive is not statistically accounted for, and this is shown to lead to a probable over-fitting to the observations and to over-corrections of the EMEP emissions.Using a fully consistent truncated Gaussian assumption for the emission fluxes, including the use of a non-Gaussian likelihood for the estimation of the hyperparameters, the corrections are significantly smaller.
For short-lived species, it is shown that information cannot propagate far from the monitoring stations, such that the corrections are rather local to the stations.That is why we deem the isoprene inversion to be unreliable.That is a typical case where remote sensing assimilation is necessary to offer a satisfying coverage.
Long-lived species modelling may be impacted by bias in the boundary conditions.We found that with a lifetime of 4 months and highly uncertain boundary conditions, the inversion of the acetylene emission should be interpreted cautiously as the inversion may try to correct for a bias in the boundary conditions.In such circumstances, the inversion of the boundary conditions, in addition to the emissions, may be required (Roustan and Bocquet, 2006).
The corrected emissions have been partly validated thanks to a forecast conducted for the year 2006 using independent observations.The simulations using the corrected emissions often led to significant improvements in the statistical indicators.Considering all statistical indicators, the fully consistent truncated Gaussian approach emerged as the best approach from this test.
The 2006 forecasts have also been compared to the observations at the Kollumerwaard station in the Netherlands.The Kollumerwaard station is not part of the 11 stations used in the analysis of 2005.Even though this station is far away from the 2005 network and its surroundings have emission fluxes that are little affected by the 2005 analysis, some improvements are noticed for several long-lived VOC species using the statistically consistent positively constrained inversion.
The estimation of the uncertainty of the retrievals has been discussed.Using a Monte Carlo technique, an objective estimation of the uncertainty of the retrieval was provided for each species.However, this estimation does not account for possible systematic errors in the boundary conditions.
For anthropogenic VOC (alkanes, alkenes, acetylene and aromatics), the inversion modelling suggests that the current EMEP emission inventories are correct within 33 % on average for n-butane, isobutane, n-pentane, isopentane, propylene and aromatics (benzene, toluene and xylenes).However, there are large discrepancies (about 1 order of magnitude) in specific regions.The emission inventories appear to be significantly underestimated (about a factor of 2) for ethane, npropane, ethylene and acetylene.Revisions to VOC emission inventories appear to be warranted and experimental investigations of speciated VOC emissions from major source categories (e.g.on-road traffic, petrochemical industry) are recommended.
As an extension to this study, we have tried but failed to disaggregate the inversion among the different emissions sectors in this study using a positive matrix factorisation approach.The chemical profiles of the sectors in regard to the considered species were too collinear to get a robust segregation.
It would be of interest to assess the method potential efficiency in more specific situations.The methodology developed here could be applied for specific source categories using tracer measurements such as levoglucosan for biomass burning.Its efficiency could also be investigated for shortlived species but for local emission inventories -for instance at the urban scale.Of course, this extension of the method implies the availability of a large and consistent set of measurements of specific species.
c k is the field of the concentrations of all simulated species at time step k, -M k is the linear advection-diffusion operator.It also includes the deposition processes, -X k is the chemical reaction operator, is the emission field at time step k.

Fig. 1 .
Fig. 1.The 11 monitoring stations of the EMEP network for volatile organic compounds whose observations are assimilated are indicated with a circle.The Kollumerwaard station in the Netherlands used for validation only is indicated by a rhombus.

Fig. 2 .Fig. 2 .
Fig. 2. Comparison of the source contribution to the observations estimated with the direct model and the adjoint model.

Fig. 4 .
Fig.4.Time evolution of the horizontal mean over the whole domain of OH hourly concentrations for different levels (ground level in green, third level in blue and top level in red).Full lines are used for a priori simulations and crossed ones are used for a posteriori simulations.The dotted lines represent the standard deviation computed for the corresponding levels.

Fig. 6 .Fig. 6 .
Fig. 6.Gridded EMEP time-integrated flux (in t.km −2 year −1 , left column) and gridded ratios of time-integrated retrieved flux to EMEP time-integrated flux for, from top to bottom, acetylene (ACE), ethane (C 2 H 6 ), propane (C 3 H 8 ), isobutane (IBUT), ethylene (C 2 H 4 ) and isoprene (ISO).The species are ordered by decreasing lifetime.The central column corresponds to case B2 and the right column corresponds to case C. Green and blue colours correspond to reduction of the emission fluxes, whereas red and pink colours correspond to an increase of the emission fluxes.

Table 10 .
Monte Carlo estimation of the uncertainty of the correction to the EMEP emission inventory yielded by data assimilation in case C. M EMEP : the total 2005 emitted mass for EMEP inventory (in Tg), M C ; the total mean reconstructed emitted mass (in Tg); std(M C ) M C : the standard deviation (std.) of the reconstructed mass normalised by the mean reconstructed emitted mass; std(M C ) | M C −M EMEP | : the standard deviation of the reconstructed mass normalised by the difference in absolute value between the reconstructed mass and the EMEP emission mass; std(m C ) m C : the arithmetic mean of the standard deviations of the individual emission fluxes normalised by the mean emitted mass in a grid cell; √ var(m C ) m C : the quadratic mean of the standard deviations of the individual emission fluxes, normalised by the mean emitted mass.

Fig. 7 .
Fig. 7. Plot of the gridded unitless standard deviations of the scaling variables α that represent a posteriori uncertainties of the gridded emission fluxes of propane (C 3 H 8 , left) and isobutane (IBUT, right), in case C.

Table 1 .
The volatile organic compounds, their (indicative) lifetime and reactions.
Measured jointly in the EMEP monitoring network and represented with the symbol MPXYL.

Table 2 .
Number of observations for each species (N obs species ) and number of observation dates for each station (N obs station ).The numbers 1-12 are given to help locate the stations on the map of Fig.1.

Table 3 .
Factors applied to MOZART 2 explicit species concentrations to determine the initial and boundary conditions of the model species.

Table 4 .
Statistics of the discrepancy between the values of the observations as generated by the forward model, or generated with the approximate adjoint model: Pearson correlation (R), mean direct value, mean adjoint-based value and normalised mean square error (NMSE).The mean values are in µg m −3 .

Table 5 .
Estimated standard deviations of the observation error (r s and r + s ) and background error (m s and m + s ), under the Gaussian likelihood and truncated Gaussian likelihood.r s and m s are related to case B1 and case B2, while r + s and m + s are related to case C. The units of r s and r + s are µg m −3 , while m s and m + s are unitless.

Table 7 .
For all species, the total emitted mass (in Gg) for the EMEP inventory run (case A), the a posteriori emissions under Gaussian assumption (cases B1 and B2) and the a posteriori emissions under the truncated Gaussian assumption (case C).

Table 8 .
Scores of the forecast test(year 2006)from the comparison of the observations and the simulated concentrations for four simulations: case A, case B1, case B2 and case C. The means and the RMSEs are in µg m −3 .Bold numbers correspond to the best agreement with observations.

Table 9 .
Scores at the Kollumerwaard station (for the year 2006) from the comparison of the observations and the simulated concentrations for three simulations: case A, case B2 and case C. The means and the RMSEs are in µg m −3 .Bold numbers correspond to the best agreement with observations.