Assimilation of ground versus lidar observations for PM 10 forecasting


Abstract. This article investigates the potential impact of future ground-based lidar networks on analysis and short-term forecasts of particulate matter with a diameter smaller than 10 µm (PM 10). To do so, an Observing System Simulation Experiment (OSSE) is built for PM 10 data assimilation (DA) using optimal interpolation (OI) over Europe for one month from 15 July to 15 August 2001. First, using a lidar network with 12 stations and representing the "true" atmosphere by a simulation called the "nature run", we estimate the efficiency of assimilating the lidar network measurements in improving PM 10 concentrations for analysis and forecast. It is compared to the efficiency of assimilating concentration measurements from the AirBase ground network, which includes about 500 stations in western Europe. Assimilating the lidar observations decreases the root mean square error (RMSE) of PM 10 concentrations by about 54 % after 12 h of assimilation and during the first forecast day, against 59 % for the assimilation of AirBase measurements. However, the assimilation of lidar observations leads to scores similar to AirBase's during the second forecast day. The RMSE of the second forecast day is improved on average over the summer month by 57 % by the lidar DA, against 56 % by the AirBase DA. Moreover, the spatial and temporal influence of the assimilation of lidar observations is larger and longer. The results show a potentially powerful impact of future lidar networks. Secondly, since a lidar is a costly instrument, a sensitivity study on the number and location of required lidars is performed to help define an optimal lidar network for PM 10 forecasts. With 12 lidar stations, an efficient network for improving PM 10 forecasts over Europe is obtained by regularly spacing the lidars. Data assimilation with a lidar network of 26 or 76 stations is compared to DA with the previously used lidar network. During the first forecast day, the assimilation of measurements from 76 lidar stations leads to a better score (the RMSE decreased by about 65 %) than AirBase's (the RMSE decreased by about 59 %).

Introduction
Aerosols have an impact on regional and global climates (Ramanathan et al., 2001; Léon et al., 2002; Sheridan et al., 2002; Intergovernmental Panel on Climate Change (IPCC), 2007) as well as on ecological equilibrium (Barker and Tingey, 1992) and human health by penetrating the respiratory system and leading to respiratory and cardiovascular diseases (Lauwerys et al., 2007; Dockery and Pope, 1996). Aerosols influence the photo-dissociation of gaseous molecules (Randriamiarisoa et al., 2004) and can thus have a significant impact on photochemical smog (Dickerson et al., 1997). The accurate prediction of aerosol concentration levels therefore has significant human and economic cost implications.
Various chemistry transport models are used to simulate or predict aerosol concentrations over Europe, e.g. EMEP (European Monitoring and Evaluation Programme) (Simpson et al., 2003), LOTOS (Long Term Ozone Simulation)-EUROS (European Operational Smog) (Schaap et al., 2004), CHIMERE (Hodzic et al., 2006), DEHM (Danish Eulerian Hemispheric Model) (Brandt et al., 2007) and POLYPHEMUS (Sartelet et al., 2007). However, uncertainties in modelling atmospheric components, in particular aerosols, are high (Roustan et al., 2010), which leads to significant differences between model simulations and observations (Sartelet et al., 2007). Data assimilation (DA hereafter) can reduce the uncertainties in input data such as the initial conditions or the boundary conditions by coupling models to observations (Bouttier and Courtier, 2001). In meteorology, DA has been traditionally applied to improve forecasts (Kalnay et al., 2003; Lahoz et al., 2010). In air quality, Zhang et al. (2012) review chemical DA techniques developed to improve regional real-time air quality forecasting model performance for ozone, PM 10, and dust. However, applications of DA to PM 10 forecasts are still sparse. They include Tombette et al. (2009) and Denby et al. (2008) over Europe and Pagowski et al. (2010) over the United States of America. These studies demonstrated the feasibility and the usefulness of DA for aerosol forecasts.
As in Tombette et al. (2009), in situ surface measurements are often assimilated, e.g. AirBase, BDQA (Base de Données de la Qualité de l'Air) or EMEP. However, they do not provide information on vertical profiles. Niu et al. (2008) used both satellite retrieval data and surface observations to assimilate dust for sand and dust storm (SDS) forecasts. They found that information on the vertical profiles of the SDS was needed for the SDS forecasts. Although satellite passive remote sensing can provide vertical observations, it is very expensive and data are often limited to low horizontal (e.g. 10 × 10 km² for the Moderate Resolution Imaging Spectroradiometers (MODIS) (Kaufman et al., 2002)) and temporal resolutions (e.g. approximately twice a day for polar orbiting satellites). Passive instruments can only retrieve column-integrated aerosol concentration (Kaufman et al., 2002). Spaceborne lidar promises to improve the vertical resolution of aerosol measurements at the global scale (Winker et al., 2003; Berthier et al., 2006; Chazette et al., 2010). Nevertheless, the spaceborne lidar measurements are only performed along the satellite ground track.
Thanks to the new generation of portable lidar systems developed in the past five years, accurate vertical profiles of aerosols can now be measured (Raut and Chazette, 2007; Chazette et al., 2007). Such instruments document the mid and lower troposphere by means of aerosol optical properties. Lidar measurements were used in several campaigns, such as ESQUIF (Étude et Simulation de la Qualité de l'air en Île-de-France) (Chazette et al., 2005), the MEGAPOLI (Megacities: Emissions, urban, regional and Global Atmospheric POLlution and climate effects, and Integrated tools for assessment and mitigation) summer experiment in July 2009 (Royer et al., 2011) and during the eruption of the Icelandic volcano Eyjafjallajökull on 14 April 2010 (Chazette et al., 2012). Raut and Chazette (2009) established a reliable relation between the mass concentration and the optical properties of PM 10. Because the surface-to-mass ratio for fine particles (PM 2.5, particulate matter with a diameter smaller than 2.5 µm) is high, they largely contribute to the measured lidar signal. However, the contribution of coarse particles may not be negligible, as shown by Randriamiarisoa et al. (2006), who estimated it to be about 19 %. The relative contribution of PM 2.5 may increase with altitude (Chazette et al., 2005), but it is difficult to quantify. As a result, the PM 10 concentrations above urban areas can be retrieved from a ground-based lidar system with an uncertainty of about 25 %.
Because a lidar network with continuous measurements does not yet exist, lidar observations have not yet been used for DA. This work aims to investigate the usefulness of future ground-based lidar networks for analyses and short-term forecasts of PM 10 and to help future lidar network projects design their networks, e.g. the number and locations of lidar stations. Building and maintaining observing systems with new instruments is very costly, especially for ground-based lidars. Therefore, an Observing System Simulation Experiment (OSSE) can be used to effectively test proposed observing strategies before a field experiment takes place, and it can provide valuable information for the design of field experiments (Masutani et al., 2010).
An OSSE consists of a nature run, simulated observations, and DA experiments. The nature run is usually a simulation from a high-resolution state-of-the-art model forecast, and is used to create observations and validate DA experiments (Chen et al., 2011). Many applications use OSSEs, such as investigating the accuracy of diagnostic heat and moisture budgets (Kuo et al., 1984), studying carbon dioxide measurements from the Orbiting Carbon Observatory using a four-dimensional variational assimilation (Chevallier et al., 2007; Baker et al., 2010), demonstrating the data impact of Doppler wind lidar (Masutani et al., 2010; Tan et al., 2007), defining quantitative trace carbon monoxide measurement requirements for satellite missions (Edwards et al., 2009), comparing the relative capabilities of two geostationary thermal infrared instruments to measure ozone and carbon monoxide (Claeyman et al., 2011), evaluating the contribution of column aerosol optical depth observations from a future imager on a geostationary satellite (Timmermans et al., 2009), and studying the impact of observational strategies in field experiments on weather analysis and short-term forecasts (Chen et al., 2011).
This paper is organised as follows. Section 2 provides a description of the DA methodology used in this study. Section 3 describes the experiment setup, i.e. the chemistry transport model used and real observations. An OSSE is built in Sect. 4. Results of the OSSE are shown in Sects. 5 and 6. Sensitivity studies with respect to the number and locations of lidar stations are conducted in Sect. 7. The findings are summarised and discussed in Sect. 8.

Choice of DA method
Data assimilation couples the model with simulated observations in an OSSE. Different DA algorithms may be used, e.g. OI, reduced-rank square root Kalman filter, ensemble Kalman filter (EnKF) and four-dimensional variational assimilation (4D-Var). Wu et al. (2008) have illustrated their limitations and potentials. They found that in the air quality context the OI provides overall strong performance and is easy to implement. In terms of performance, the reduced-rank square root Kalman filter is quite similar to the EnKF. Denby et al. (2008) compared two different DA techniques, the OI and EnKF, for assimilating PM 10 concentration at the European scale. They showed that OI can be more effective than the EnKF. Although aerosol assimilation could be performed with 4D-Var (Benedetti and Fisher, 2007), it may be limited to the use of a simplified aerosol model, as it is computationally quite expensive.
In this paper, we use the OI as it is the simplest method for PM 10 DA and it performs well (Denby et al., 2008; Wu et al., 2008). Furthermore, the OI method can be used in operational mode for real-time forecasts, as the computational cost of OI is low. It was used by Tombette et al. (2009) and Pagowski et al. (2010) for DA of conventional aerosol ground observations. In the OI method, DA is performed at the frequency of measurements to produce analysed concentrations, which are closer to reality (measurements) than forecasts and which are used as initial conditions for the next model iteration. The equation to compute the analysed concentrations from the model concentrations is given by:

$$\mathbf{x}^a = \mathbf{x}^b + \mathbf{B}\mathbf{H}^T \left(\mathbf{H}\mathbf{B}\mathbf{H}^T + \mathbf{R}\right)^{-1} \left(\mathbf{y} - H(\mathbf{x}^b)\right),$$

where $\mathbf{x}^a$ is the analysed concentrations, $\mathbf{x}^b$ is the model concentrations, $\mathbf{y}$ is the observation vector, $H$ is an operator that maps $\mathbf{x}^b$ to the observational data, $\mathbf{H}$ is the tangent linear operator of $H$ (in the following, the observation operator is linear), and $\mathbf{B}$ and $\mathbf{R}$ are respectively the background and observation error covariance matrices. The analysis therefore requires the specification of the background and observation error covariance matrices (see Sects. 4.2, 4.4 and 5). The background error covariance matrix determines how the corrections of the concentrations should be distributed over the domain during DA. The observation error covariance matrix specifies instrumental and representativeness errors. As in Tombette et al. (2009), after DA of PM 10 concentrations, the analysed PM 10 concentrations are redistributed over the model variables following the initial chemical and size distributions.
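As an illustration of the analysis step, the following is a minimal sketch of an OI update with a linear observation operator, written in the standard best-linear-unbiased-estimator form x_a = x_b + B Hᵀ (H B Hᵀ + R)⁻¹ (y − H x_b). The function name `oi_analysis`, the three-cell state and all numerical values are hypothetical and only serve the example; this is not the POLYPHEMUS implementation.

```python
import numpy as np

def oi_analysis(x_b, y, H, B, R):
    """Return the OI analysis x_a = x_b + B H^T (H B H^T + R)^{-1} (y - H x_b)."""
    innovation = y - H @ x_b              # observation minus background
    S = H @ B @ H.T + R                   # innovation covariance
    # Solve S z = innovation rather than forming the inverse explicitly.
    z = np.linalg.solve(S, innovation)
    return x_b + B @ H.T @ z

# Toy example (hypothetical values): three grid cells, one observed cell.
x_b = np.array([10.0, 20.0, 30.0])        # background concentrations
H = np.array([[0.0, 1.0, 0.0]])           # observe the middle cell only
B = np.array([[4.0, 2.0, 0.0],
              [2.0, 4.0, 2.0],
              [0.0, 2.0, 4.0]])           # background error covariance
R = np.array([[4.0]])                     # observation error covariance
y = np.array([26.0])                      # observation

x_a = oi_analysis(x_b, y, H, B, R)
print(x_a)
```

In this toy case the observed cell moves halfway towards the observation because its background and observation error variances are equal, and the off-diagonal terms of B spread part of the correction to the neighbouring cells.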
3 Experimental setup

Model
For our study, the chemistry transport model POLAIR3D (Sartelet et al., 2007) of the air-quality platform POLYPHEMUS, available at http://cerea.enpc.fr/polyphemus/ and described in Mallet et al. (2007), is used. Aerosols are modelled using the SIze-REsolved Aerosol Model (SIREAM-SuperSorgam), which is described in Debry et al. (2007) and Kim et al. (2011). SIREAM-SuperSorgam includes 20 aerosol species: 3 primary species (mineral dust, black carbon and primary organic species), 5 inorganic species (ammonium, sulphate, nitrate, chloride and sodium) and 12 organic species. It models coagulation and condensation. Five bins logarithmically distributed over the size range 0.01 µm-10 µm are used. The gas chemistry is solved with the chemical mechanism CB05 (Carbon Bond version 5) (Yarwood et al., 2005). POLAIR3D/SIREAM has been used for several applications. For example, it was compared to measurements for gas and aerosols over Europe by Sartelet et al. (2007) and Kim et al. (2010), and it was compared to lidar measurements over Greater Paris by Royer et al. (2011).

Input data
The modelling domain covers western and part of eastern Europe ([10.5° W, 23 ) with a horizontal resolution of 0.5° × 0.5°. Nine vertical levels are considered from the ground to 12 000 m. The heights of the cell interfaces are 0, 40, 120, 300, 800, 1500, 2400, 3500, 6000 and 12 000 m [...] (Horowitz et al., 2003). For aerosol boundary conditions, daily means are based on outputs of the Goddard Chemistry Aerosol Radiation and Transport model (GOCART) for the year 2001 for sulphate, dust, black carbon and organic carbon (Chin et al., 2000; Sartelet et al., 2007).

Observational data
In this paper, as in Sartelet et al. (2007) and Tombette et al. (2009), we use the locations of stations of two ground databases for the comparisons to ground measurements: EMEP and AirBase. The EMEP network is only used for the performance assessment of the nature run, whereas the AirBase network is used for both the performance assessment of the nature run and assimilations in the OSSE. Figure 1 shows the location of the EMEP and AirBase stations used in this study.
In this work, a network of 12 fictitious ground-based lidar stations covering western Europe is defined, as shown in Fig. 1, based on the lidar locations of existing observation stations, e.g. a subset of stations from the European Aerosol Research Lidar Network (http://www.earlinet.org/). A relation between PM 10 mass concentration and optical properties of aerosols is assumed to exist, although it has so far only been determined for pollution aerosols over Greater Paris (Raut and Chazette, 2009) and it needs to be generalised to other measurement sites.

4 Observing system simulation experiment

Nature run
Observation impact experiments for not-yet-existing observing systems require an atmospheric state from which the hypothetical observations can be generated. Since the true atmosphere is inherently unknown, a synthetic atmospheric state, denoted "truth" in the remainder, needs to be defined. In an OSSE, the "true" state is used to create the observational data from existing and future instruments. In this paper, the "truth" is obtained from a simulation, called the nature run, performed from 00:00 UTC 15 July to 00:00 UTC 15 August 2001 using the model (Kim et al., 2010, 2011) and the input data described in the previous section. Here, we first evaluate the results of this simulation with the AirBase and EMEP networks.
For an OSSE study, the accuracy of the nature run compared with real observations is important, and the nature run should produce typical features of the phenomena of interest. According to Boylan and Russel (2006) 1, for PM 10, the model performance criterion is met for the two networks, whereas for PM 2.5 the model performance goal is met for both networks, suggesting that this simulation compares well to observations. Furthermore, as shown in Fig. 2, the spatial distribution of PM 10 concentration corresponds to previously published results (Sartelet et al., 2007). This "true" simulation is subsequently used for the creation of observations from the observing system under investigation and will also be used to evaluate the results of DA experiments, for example the calculation of the Root Mean Square Error (RMSE) and correlation over land grid points from the ground level to the sixth level (1950 m above the ground) against the nature run.

Simulated observations and error modelling
The "true" state of the atmosphere (the nature run) is used to calculate the concentrations at the stations of both the AirBase network and the future ground-based lidar network. For example, for the lidar network, Fig. 3 shows the "true" state of PM 10 at two arbitrarily chosen lidar stations: Madrid (Perez et al., 2004) and Saclay (Raut and Chazette, 2009). We find that the high PM 10 concentrations in Madrid are mostly made of Sahara dust. Because the AirBase network covers western Europe well and provides in situ surface measurements (which have been used for the performance assessment of the nature run in Sect. 4.1) and because AirBase measurements have been used for DA of PM 10 (Denby et al., 2008; Tombette et al., 2009), we took AirBase as an assimilation reference network in order to quantitatively show the potential impact of future ground-based lidar networks on analysis and short-term forecasts of PM 10. However, real observations at AirBase stations are not used for the assimilation; instead, the "truth" is used to calculate the "true" states (e.g. concentrations), in order to be consistent with the lidar data.
The "true" state at each station is perturbed depending on estimated observation errors. For the AirBase network, the observation errors mainly correspond to the representativeness errors, and they are estimated to be about 35 %. For the ground-based lidar network, the observation errors include the representativeness errors (about 35 %) and the instrumental errors, which are estimated to be about 25 % for PM 10 concentrations obtained from lidar observations (Raut and Chazette, 2009). These instrumental errors are linked to errors in estimating the extinction coefficients using the inversion of the lidar signal (Klett et al., 1981) and extinction coefficient cross sections. The covariance between the representativeness and instrumental errors is set to zero since they are independent. Finally, the observation errors of the concentrations obtained from the lidar network are estimated to be about 43 % (the square root of the sum of the representativeness error variance and the instrumental error variance, $\sqrt{(35\,\%)^2 + (25\,\%)^2}$). Note that when comparing the nature run to the real data, the errors include both the representativeness errors and the model errors. They are therefore different from the observation errors used to perturb the simulated observations.
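The 43 % figure follows from adding independent relative errors in quadrature. A minimal check, with the hypothetical helper name `combined_relative_error`:

```python
import math

def combined_relative_error(*relative_errors):
    """Combine independent relative errors: square root of the sum of squares."""
    return math.sqrt(sum(e ** 2 for e in relative_errors))

# Representativeness error (35 %) and lidar instrumental error (25 %).
lidar_error = combined_relative_error(0.35, 0.25)
print(f"{100 * lidar_error:.0f} %")
```

The quadrature sum is only valid because the covariance between the two error sources is set to zero; correlated errors would require an additional cross term.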
After defining the observation errors, the observations obtained from the "true" state are perturbed. For each station, let $x$ be a vector whose component $x_i$ is an hourly mean concentration, where $i$ depends on vertical level and time. The perturbation is implemented as follows:

- Define the observational error covariance matrix $\Sigma$ by the Balgovind approach (Balgovind et al., 1983). The error covariance between two points is

$$f(d_v, d_t) = e \left(1 + \frac{d_v}{L_v}\right) \exp\left(-\frac{d_v}{L_v}\right) \left(1 + \frac{d_t}{L_t}\right) \exp\left(-\frac{d_t}{L_t}\right),$$

where $e$ is the observational error variance, $d_v$ is the vertical distance between the two points, $d_t$ is the temporal difference between the two points, and $L_v = 200$ m and $L_t = 2$ h are the vertical and temporal correlation lengths. Each component of the covariance matrix may be written as $\Sigma_{ij} = f\left(d_v(x_i, x_j), d_t(x_i, x_j)\right)$. Each component of the covariance matrix depends smoothly on the altitude of the points and time.

- Use the Cholesky decomposition:

$$\Sigma = C C^T,$$

where $C$ is a lower triangular matrix with strictly positive diagonal entries.

The perturbation of $x$ is then

$$\tilde{x} = x + C \gamma,$$

where $\gamma$ is a random vector whose components follow a standard normal distribution (of mean 0 and variance 1). Figure 4 shows an example of perturbations at an arbitrarily chosen station. We can see that the perturbations depend continuously on the vertical level and the time thanks to matrix $C$. The perturbed observations are subsequently used for the assimilation of the ground-based lidar network and AirBase data.
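The perturbation procedure can be sketched as follows. The sketch assumes that the covariance is the product of a vertical and a temporal Balgovind function, (1 + d/L) exp(−d/L), scaled by the error variance; the level heights, the "true" concentration of 20 and the variance of 25 are illustrative values only, and the function names are hypothetical.

```python
import numpy as np

def balgovind(d, L):
    """Balgovind correlation function (1 + |d|/L) * exp(-|d|/L)."""
    r = np.abs(d) / L
    return (1.0 + r) * np.exp(-r)

def balgovind_covariance(z, t, e, L_v=200.0, L_t=2.0):
    """Covariance between all pairs of points with heights z (m) and times t (h)."""
    d_v = z[:, None] - z[None, :]   # pairwise vertical distances
    d_t = t[:, None] - t[None, :]   # pairwise time differences
    return e * balgovind(d_v, L_v) * balgovind(d_t, L_t)

def perturb(x, cov, rng):
    """Return x + C @ gamma, with cov = C C^T and gamma ~ N(0, I)."""
    C = np.linalg.cholesky(cov)     # lower triangular factor
    gamma = rng.standard_normal(x.size)
    return x + C @ gamma

# Illustrative station profile: 4 heights x 6 hourly values (hypothetical).
heights = np.array([210.0, 550.0, 1150.0, 1950.0])
hours = np.arange(6.0)
z, t = [a.ravel() for a in np.meshgrid(heights, hours, indexing="ij")]

x_true = np.full(z.size, 20.0)               # constant "true" concentration
cov = balgovind_covariance(z, t, e=25.0)     # error variance of 25 (std of 5)
x_obs = perturb(x_true, cov, np.random.default_rng(0))
```

Because the random vector is multiplied by the Cholesky factor, perturbations at neighbouring heights and hours are correlated, which is what produces smoothly varying errors rather than independent noise at each point.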

Control run
The control run is a simulation that is meant to represent the best modellers' simulation of the atmosphere. If the same model is used for both the nature run and the control run, this is called an identical twin OSSE; if the nature run model is a different version of the control run model, the OSSEs are called fraternal twin OSSEs (Liu et al., 2007; Masutani et al., 2010). We follow a "perfect model" OSSE setup, in which the model used to generate the "true" observations is the same as the one used in the control run and DA. The identical twin OSSEs are easy to set up. However, input data, such as meteorological fields, emissions (Edwards et al., 2009) or initial conditions (Liu et al., 2007) have to be perturbed. In order to interpret the results more easily, we choose to perturb only initial conditions. This allows us to avoid the complications of defining model errors, and the only source of forecast errors comes from the initial conditions. With the identical twin scenario, the numerical model becomes perfect (i.e. no model error); this is counter to what happens in reality (i.e. models are never perfect) and the identical twin OSSEs usually overestimate the impact of observations on model forecasts (Chen et al., 2011).
Although the impact of PM 10 DA may be over-optimistic, it will be so for both ground observations and lidar observations (the assimilation of both ground and lidar observations leads to corrections at high vertical levels, as discussed in Sect. 5). As in Sect. 4.2, we use the Balgovind approach (Balgovind et al., 1983), the Cholesky decomposition and the normal distribution to perturb all model concentrations (gaseous and aerosols). In air quality models, the impact of initial conditions on PM 10 concentrations lasts for a few hours to a few days at most. For this impact to last as long as possible, both gaseous and aerosol concentrations are perturbed. As shown in Fig. 5, the differences between "true" and perturbed PM 10 concentrations are higher in certain parts of Europe than in others. This is due to the normal distribution, which can produce very high or low concentrations in one grid cell. Although the perturbed initial conditions are not necessarily consistent with the true state of the atmosphere, they are suitable for our experiments with DA.

Parameters of the DA runs
The experiments consist of two steps: the DA analysis part and the forecast. During the assimilation period, say [t 0 , t N ], the observations are assimilated at each time step. During the subsequent forecast period, say [t N+1 , t T ], the aerosol concentrations are obtained from the model simulations initialised from the analysed model state at t N .
Since only the initial conditions are perturbed in our experiments (see Sect. 4.3), the difference between two forecasts initialised with different initial conditions only lasts for a few days. For the choice of t N , Fig. 6 compares the RMSE between the true observations and the forecast concentrations from 18 July at 01:00 UTC to 20 July at 00:00 UTC, obtained for different assimilation periods varying from 6 h to 3 days and always ending at 00:00 UTC 18 July. The longer the assimilation period is, the lower the RMSE is. An assimilation period of 12 h seems a good compromise between a low RMSE and a short assimilation time.
Two different types of DA runs are performed in our OSSE, depending on whether ground or vertical observations are assimilated. The simulations use the same setup as the one of the control run. We use the perturbed PM 10 observations that are produced by the nature run (see Sect. 4.2). In the OI method, the background and observation error covariance matrices need to be set. The observation error covariance matrix depends on the observational error variance, which varies with vertical levels. For ground measurements, we set the error variance to be 20 µg² m⁻⁶, the square of 35 % (see Sect. 4.2) of the PM 10 concentration averaged over AirBase stations. For lidar measurements, we set the error variance to be the square of 43 % ($\sqrt{(35\,\%)^2 + (25\,\%)^2}$, see Sect. 4.2) of the PM 10 concentration averaged over lidar stations for each level from the third level to the sixth level, which is respectively 28, 24, 16 and 5 µg² m⁻⁶.
In the Balgovind parametrisation of the background error covariance matrix (Wu et al., 2008; Tombette et al., 2009), the variance v is set to 60 µg² m⁻⁶, which is obtained from the difference between the nature run and the control run. The correct specification of the background error correlations is crucial to the quality of the analysis, because they determine to what extent the fields will be corrected to match the observations. The horizontal correlation length and the vertical correlation length are two parameters of the Balgovind approach. While the definition of background error correlations is straightforward, since they correspond to the difference between the background state and the true state, the true atmospheric state is never exactly known. The next section details the choice of the horizontal and vertical correlation lengths.

Choice of the horizontal and vertical correlation lengths
The National Meteorological Center (NMC) method (Parrish and Derber, 1992) is used for the choice of the horizontal correlation length L h and the vertical correlation length L v . The background error is estimated by the differences of PM 10 concentrations between two simulations. The two simulations start with the same initial conditions and last 24 h. A 24 h forecast is performed in the first simulation, while AirBase data of PM 10 concentrations are assimilated hourly in the second simulation. In the analysis, the background error covariance matrix is assumed to be a diagonal matrix to avoid adding special error correlations (e.g. the Balgovind approach with a given horizontal and vertical correlation length) in the NMC method. In order to eliminate potential bias due to the diurnal cycle, 24 h forecasts are issued at 00:00 UTC and 12:00 UTC. This estimation of the background error is performed for 27 consecutive days from 15 July 2001 at 00:00 UTC and 12:00 UTC.
To estimate the horizontal correlation length, at each model level, we calculate the covariance value for each site pair. We then obtain a cloud of covariance values. The covariance clouds are averaged within continuous tolerance regions. The length of the tolerance region is set to 4 grid units, so that there are enough site pairs for each tolerance region. Thus, L h is estimated at all model levels by a least-square fitting of Balgovind functions to the curves of the regionalized covariances (the covariance clouds averaged within tolerance regions). Figure 7 shows the horizontal correlation length L h of the background error covariance matrix at 00:00 UTC and 12:00 UTC. The variation of the horizontal correlation length is comparable to that of meteorology (Daley, 1991). The horizontal correlation length is relatively constant in the boundary layer, and it is about 4 grid units (200 km). Above the boundary layer, the horizontal correlation length decreases. This is a consequence of the prescribed aerosol boundary conditions and the numerical algorithm. Because the background error is estimated by the differences between a simulation with a 24 h forecast and a simulation assimilating ground measurements in the NMC method (the error sources are the ground measurements) and the same boundary conditions are used for both simulations, the background errors at the upper levels are very small. By contrast, the numerical noise can become significant and leads to short correlation lengths at high levels. A similar behaviour is shown in Benedetti and Fisher (2007) and Pagowski et al. (2010). In the DA experiments, we should therefore use a horizontal correlation length scale of 200 km. The Lidar In-Space Technology Experiment (LITE) (Winker et al., 1996) data suggest that aerosol fields have a horizontal correlation length scale of 200 km. Similarly to the horizontal correlation length, we find that the vertical correlation length L v is about 250 m at the ground level.

Fig. 7. The blue (resp. red) line shows the horizontal correlation length L h (grid unit) at 00:00 UTC (resp. 12:00 UTC) versus altitude. Note that a grid unit is about 50 km.
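The least-squares fitting of a Balgovind function to binned covariances can be sketched as below. This is a simple grid search over candidate correlation lengths, with the variance solved in closed form for each candidate; it is not necessarily the fitting procedure used in the study. The bin centres and the synthetic covariances (variance 60, length 200 km) are made up for the example.

```python
import numpy as np

def balgovind(d, L):
    """Balgovind correlation function (1 + |d|/L) * exp(-|d|/L)."""
    r = np.abs(d) / L
    return (1.0 + r) * np.exp(-r)

def fit_balgovind(dist, cov, lengths):
    """Fit cov ~ v * balgovind(dist, L) by least squares over candidate L values."""
    best = None
    for L in lengths:
        g = balgovind(dist, L)
        v = (g @ cov) / (g @ g)               # optimal variance for this L
        misfit = np.sum((cov - v * g) ** 2)   # squared residual
        if best is None or misfit < best[0]:
            best = (misfit, v, L)
    return best[1], best[2]

# Synthetic binned covariances (hypothetical): variance 60, length 200 km.
dist = np.arange(0.0, 1000.0, 50.0)           # bin centres in km
cov = 60.0 * balgovind(dist, 200.0)

v_fit, L_fit = fit_balgovind(dist, cov, np.arange(50.0, 501.0, 10.0))
print(v_fit, L_fit)
```

With real covariance clouds the binned values are noisy, so the recovered length depends on the bin width and on the range of separations included in the fit.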
Although the NMC method gives us estimates of the horizontal and vertical correlation lengths, DA tests with different correlation lengths are performed to assess the optimum lengths, i.e. the lengths which lead to the best forecast. The different tests performed are summarised in Table 2. Assimilation is performed with three different horizontal lengths: L h = 50 km, L h = 200 km and L h = 400 km. For AirBase DA, assimilation is also performed with three different vertical correlation lengths: L v = 250 m, L v = 1500 m and L v varying between nighttime and daytime. Because lidar provides us with vertical profiles, the lidar DA can directly correct PM 10 concentrations at each model level (higher than 200 m above the ground). Therefore, we do not consider L v in the background error covariance matrix (we assume L v = 0). Moreover, column DA tests with different L v show that a non-zero L v does not lead to a better forecast for the column DA run. The scores (RMSE and correlation) calculated over land grid points from the ground level to the sixth level (1950 m above the ground) are shown in Fig. 8. Because only the initial conditions (pollutant concentrations) are different between the nature run and the control runs (see Sect. 4.3), and because the influence of initial conditions fades out with the forecast time, all control runs converge (RMSEs decrease to 0 and correlations increase to 1 in Fig. 8). The role of DA is to accelerate this convergence, to make RMSEs decrease and correlations increase faster. For AirBase DA, choosing L v = 1500 m (DA test "AB 200km 1500m") leads to better scores (lower RMSE and higher correlation) than choosing L v = 250 m (DA test "AB 200km 250m"), as estimated from the NMC method. Choosing L v = 50 m in the nighttime and L v = 1500 m in the daytime (DA test "AB 200km 50/1500m") does not lead to better scores than L v = 1500 m (DA test "AB 200km 1500m"). A possible explanation is that the particles are mixed by turbulence more effectively in the model than in the true state of the atmosphere. The comparison of DA tests "AB 50km 1500m", "AB 200km 1500m" and "AB 400km 1500m" for AirBase and DA tests "Col. 50km 0m", "Col. 200km 0m" and "Col. 400km 0m" for the lidar network shows that L h = 200 km, as estimated from the NMC method, leads to good scores. The scores are better than with L h = 50 km, and similar to those obtained with L h = 400 km.
However, during the forecast period, the RMSE of the column DA run decreases faster than that of the AirBase DA run (to the right of the black line). After a 24 h forecast, the column DA run has better scores than the AirBase DA run. This is mostly because the impact of the column DA run is higher than that of the AirBase DA run at high levels.
Figure 9 shows the RMSE for the PM 10 forecast without DA, with the AirBase DA and with the column DA for each one-day forecast period between 15 July and 10 August. Assimilation improves the forecast RMSE for each forecast. The averaged RMSE over all forecasts is 9.1 µg m⁻³ without DA, 3.7 µg m⁻³ (59 % less) with the AirBase DA and 4.2 µg m⁻³ (54 % less) with the column DA. Although the AirBase DA leads to a lower RMSE than the column DA for most forecasts in Fig. 9, the column DA can also lead to a lower or similar RMSE compared with the AirBase DA for some forecasts, e.g. the forecasts starting on 19, 20, 21, 23, 26 July and 3, 5, 8 August. This is mostly because the lidar network provides more accurate information than AirBase on those days at high altitude, e.g. Sahara dust in Madrid as shown in Fig. 3 (upper panel).
Figure 10 shows the RMSE for the PM 10 forecast without DA, with the AirBase DA and with the column DA during the second forecast day for each experiment between 15 July and 10 August. The averaged RMSE over all forecasts is 6.1 µg m⁻³ without DA, 2.7 µg m⁻³ (56 % less) with the AirBase DA and 2.6 µg m⁻³ (57 % less) with the column DA.
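The percentages quoted for the two forecast days are relative reductions of the averaged RMSE with respect to the run without DA. A short check using the averaged values given in the text (the helper name `rmse_reduction` is hypothetical):

```python
def rmse_reduction(rmse_without_da, rmse_with_da):
    """Relative RMSE reduction, in percent, with respect to the run without DA."""
    return 100.0 * (rmse_without_da - rmse_with_da) / rmse_without_da

# Averaged RMSE values reported in the text.
first_day = {
    "AirBase": rmse_reduction(9.1, 3.7),
    "lidar": rmse_reduction(9.1, 4.2),
}
second_day = {
    "AirBase": rmse_reduction(6.1, 2.7),
    "lidar": rmse_reduction(6.1, 2.6),
}

for label, scores in (("first day", first_day), ("second day", second_day)):
    for network, value in scores.items():
        print(f"{label}, {network}: {value:.0f} % lower RMSE")
```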
The results show that assimilating data from a lidar network with 12 stations and assimilating data from the AirBase ground network with 488 stations have similar impacts on the PM 10 forecast in terms of scores, although AirBase (resp. lidar) DA leads to slightly better scores for the first (resp. second) forecast day. The sensitivity to the number and locations of lidars is studied in the next section.

Sensitivity to the number and position of lidars
In this section, we study the sensitivity of the results to the number and locations of lidars. Forecasts after DA with four different lidar networks are compared to forecasts after DA with the previously-used lidar network (blue discs in Fig. 11). Data assimilation is performed with another network of 12 lidar stations (denoted Network 12, yellow discs in Fig. 11), with a network of 26 stations (denoted Network 26, magenta diamonds in Fig. 11), with a network of 76 stations (denoted Network 76, cyan thin diamonds in Fig. 11) and with a lidar network made of all AirBase stations over western Europe (denoted Network 488, red triangles in Fig. 1).
Figures 12 and 13 show the time evolution of the RMSE and of the correlation respectively, averaged over all land grid points and over the vertical for the different tests. Comparing the previously-used lidar network with Network 12 in Fig. 11, we can see that although they have the same number of stations, the locations are very different. The stations of Network 12 are more regularly spaced, and therefore better spread out over Europe, than those of the previously-used lidar network. Network 12 leads to better scores in the first forecast day than the previously-used (reference) network. This shows that the lidar stations need to be regularly distributed over Europe for an overall improvement of the PM 10 forecast. The lidar networks 26, 76 and 488, which have more lidar stations, perform better (lower RMSE, higher correlation) than the two 12-station networks. The RMSE of the Network 26 DA run is less than 0.15 µg m −3 higher than that of the AirBase DA run at the beginning of the forecast window, and its scores become better than those of the AirBase DA run after several hours of forecast. If the number of lidar stations is increased from 26 to 76, the Network 76 DA run has better scores than the AirBase DA run both at the beginning of the forecast window and during the forecast days. If the number of lidar stations is increased to 488 (the number of AirBase stations), the Network 488 DA run has much better scores than the AirBase DA run during the forecast days. Although increasing the number of lidars gives better forecast scores, such lidar networks may be too expensive.
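For reference, the two scores compared in this section can be sketched as follows. The sketch assumes the standard definitions of the RMSE and of the Pearson correlation, applied to forecast and "true" values flattened over grid points and levels; the sample values are hypothetical and are not taken from the experiments.

```python
import math


def rmse(forecast, truth):
    """Root mean square error between forecast and true values."""
    n = len(truth)
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(forecast, truth)) / n)


def correlation(forecast, truth):
    """Pearson correlation between forecast and true values."""
    n = len(truth)
    mean_f = sum(forecast) / n
    mean_t = sum(truth) / n
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(forecast, truth))
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    return cov / math.sqrt(var_f * var_t)


# Hypothetical PM 10 concentrations (µg m-3) at four grid points
truth = [12.0, 18.0, 25.0, 9.0]
forecast = [10.0, 20.0, 22.0, 11.0]

print(f"RMSE = {rmse(forecast, truth):.2f} µg m-3")
print(f"correlation = {correlation(forecast, truth):.2f}")
```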

Conclusions
In order to investigate the potential impact of a ground-based lidar network on short-term forecasts of PM 10 , an OSSE has been implemented. Because the AirBase network covers western Europe well and provides in situ surface measurements, and because AirBase measurements have already been used for DA of PM 10 , we took AirBase as the reference assimilation network. We have compared the impact of assimilating ground-based lidar network data to that of assimilating the AirBase surface network data.
Because we made several simplifying assumptions (an identical twin scenario, i.e. a perfect model, and uncorrelated observational errors), the PM 10 improvements from assimilating lidar and ground observations may be overoptimistic. Compared to the RMSE for one-day forecasts without DA, the RMSE between one-day forecasts and the true states is improved on average over the summer month from 15 July to 15 August 2001 by 54 % by the lidar DA with 12 lidars, and by 59 % by the AirBase DA. For the second forecast days, compared to the RMSE without DA, the RMSE is improved on average over the same period by 57 % by the lidar DA, and by 56 % by the AirBase DA. Although AirBase DA can correct PM 10 concentrations at high levels because of the long vertical correlation length of the background errors, the lidar DA corrects PM 10 concentrations at high levels more accurately than the AirBase DA. The spatial and temporal influence of the assimilation of lidar observations is larger and longer. The results shown in this paper suggest that the assimilation of lidar observations would improve PM 10 forecasts over Europe.
As lidar stations are developing over Europe following volcanic eruptions in Iceland (Chazette et al., 2012; Pappalardo et al., 2010), a sensitivity analysis has also been conducted on the number and locations of lidars. We found that spreading out the lidars regularly over Europe can improve the PM 10 forecast. Compared to the RMSE for one-day forecasts without DA, the RMSE between one-day forecasts and the true states is improved on average over the summer month from 15 July to 15 August 2001 by 57 % by the lidar DA with 12 optimised lidars, and by 59 % by the AirBase DA. Increasing the number of lidars improves the forecast scores. For example, the improvement of the RMSE becomes as high as 65 % (compared to the RMSE for one-day forecasts without DA) if 76 lidars are used, but a lidar network with many stations may be too expensive.
In future work, we will use real measurements from lidar stations, directly assimilate the lidar signals in the chemistry transport model and perform DA with a combination of lidar and AirBase observations.

Fig. 1. The green squares show the locations of EMEP stations, the red triangles show the locations of AirBase stations, and the blue discs show the locations of the lidar network.

Fig. 4. Perturbation at a random AirBase station from 15 July to 15 August 2001, from the first vertical level in the model (top left) to the last vertical level in the model (bottom right). The blue lines show the "true" PM 10 concentrations (µg m −3 ). The green lines show the simulated PM 10 concentrations (µg m −3 ).

Fig. 5. Differences between "true" and perturbed PM 10 concentration at 0000 UTC 15 July 2001, which is the initial time of the first five-day experiment, from the first vertical level in the model (top left) to the last vertical level in the model (bottom right).Differences (µg m −3 ) vary from negative values in dark blue colour to positive values in dark red colour.

Fig. 7. The blue (resp. red) line shows the horizontal correlation length L h (grid unit) at 0000 UTC (resp. 1200 UTC) versus altitude. Note that a grid unit is about 50 km.

Fig. 8. Top (resp. bottom) figure shows the time evolution of the RMSE in µg m −3 (resp. correlation) of PM 10 averaged over the different DA tests from 15 July to 10 August 2001. The scores are computed over land grid points from the ground to the sixth level (1950 m above the ground). The forecast is performed either without DA (red lines), or after AirBase DA or after column DA. The vertical black lines denote the separation between the assimilation period (to the left of the black lines) and the forecast (to the right of the black lines).

Fig. 9. RMSE (in µg m −3 ) computed over land grid points from the ground to the sixth level (1950 m above the ground) for PM10 one-day forecast without DA (white columns), with the AirBase DA (grey columns) and with the column DA (blue columns).

Fig. 10. RMSE (in µg m −3 ) computed over land grid points from the ground to the sixth level (1950 m above the ground) for PM 10 second forecast day without DA (white columns), with the AirBase DA (grey columns) and with the column DA (blue columns).

Fig. 11. Four potential lidar networks in Europe. The blue discs in the top figure show the locations of the reference lidar network. The yellow discs in the top figure show the locations of the lidar Network 12. The magenta diamonds in the bottom figure show the locations of the lidar Network 26. The cyan thin diamonds in the bottom figure show the locations of the lidar Network 76.

Fig. 13. Hourly evolution of the PM 10 correlation averaged over the different experiments from 15 July to 10 August 2001. The correlation is computed over land grid points from the ground to the sixth level (1950 m above the ground). The runs are performed without DA (red line), with AirBase DA (green line), with the reference lidar network DA (12 stations, blue line), with Network 12 DA (12 stations, yellow line), with Network 26 DA (26 stations, magenta line), with Network 76 DA (76 stations, cyan line) and with Network 488 DA (488 stations, black line).

Table 1. Statistics (see Appendix A) of the simulation results for the AirBase and EMEP networks from 15 July to 14 August. Ammon. stands for ammonium. Obs. stands for observation. Sim. stands for simulation. Corr. stands for correlation.

Table 2. DA tests with different configurations for Balgovind Scale Parameters. AB stands for AirBase. Col. stands for column. × indicates the type of DA runs used (AirBase DA or Column DA).

Fig. 12. Hourly evolution of the RMSE (in µg m −3 ) of PM 10 averaged over the different experiments from 15 July to 10 August 2001. The RMSE is computed over land grid points from the ground to the sixth level (1950 m above the ground). The runs are performed without DA (red line), with AirBase DA (green line), with the reference lidar network DA (12 stations, blue line), with Network 12 DA (12 stations, yellow line), with Network 26 DA (26 stations, magenta line), with Network 76 DA (76 stations, cyan line) and with Network 488 DA (488 stations, black line). Net. stands for network.