How much information do extinction and backscattering measurements contain about the chemical composition of atmospheric aerosol?

Michael Kahnert (michael.kahnert@smhi.se, https://orcid.org/0000-0001-5695-1356) and Emma Andersson (https://orcid.org/0000-0003-1730-4599)

1 Research Department, Swedish Meteorological and Hydrological Institute, Folkborgsvägen 17, 601 76 Norrköping, Sweden

2 Department of Earth and Space Science, Chalmers University of Technology, 412 96 Gothenburg, Sweden

Correspondence: Michael Kahnert (michael.kahnert@smhi.se)

Atmospheric Chemistry and Physics (Atmos. Chem. Phys.), volume 17, issue 5, pages 3423–3444, 2017. ISSN 1680-7324. Copernicus Publications, Göttingen, Germany. DOI: 10.5194/acp-17-3423-2017

Published: 9 March 2017. Manuscript history dates: 10 October 2016, 10 November 2016, 9 February 2017, 11 February 2017.

This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/


We theoretically and numerically investigate the problem of assimilating multiwavelength lidar observations of extinction and backscattering coefficients of aerosols into a chemical transport model. More specifically, we consider the inverse problem of determining the chemical composition of aerosols from these observations. The main questions are how much information the observations contain to determine the particles' chemical composition, and how one can optimize a chemical data assimilation system to make maximum use of the available information. We first quantify the information content of the measurements by computing the singular values of the scaled observation operator. From the singular values we can compute the number of signal degrees of freedom, $N_s$, and the reduction in Shannon entropy, $H$. As expected, the information content as expressed by either $N_s$ or $H$ grows as one increases the number of observational parameters and/or wavelengths. However, the information content is strongly sensitive to the observation error. The larger the observation error variance, the lower the growth rate of $N_s$ or $H$ with increasing number of observations. The right singular vectors of the scaled observation operator can be employed to transform the model variables into a new basis in which the components of the state vector can be partitioned into signal-related and noise-related components. We incorporate these results in a chemical data assimilation algorithm by introducing weak constraints that restrict the assimilation algorithm to acting on the signal-related model variables only. This ensures that the information contained in the measurements is fully exploited, but not overused. Numerical tests show that the constrained data assimilation algorithm provides a solution to the inverse problem that is considerably less noisy than the corresponding unconstrained algorithm. This suggests that the restriction of the algorithm to the signal-related model variables suppresses the assimilation of noise in the observations.

Introduction

Atmospheric aerosols have a substantial, yet highly uncertain impact on climate; they can also cause respiratory health problems, degrade visibility, and even compromise air-traffic safety. The physical and chemical properties of aerosols play a key role in understanding these effects. The aerosol properties are determined by a complex interplay of different chemical, microphysical, and meteorological processes. These processes are investigated in environmental modelling by use of chemical transport models (CTMs). However, modelling aerosol processes is plagued by substantial biases and errors. It is, therefore, fundamentally important to evaluate and constrain CTMs by use of measurements.

Measurements from satellite instruments provide consistent long-term data sets with global coverage. However, it is notoriously difficult to compare measured radiances to modelled aerosol concentrations. An alternative to using radiances is to make use of satellite retrieval products. For instance, one of the products of the CALIPSO lidar instrument (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations) is a rough classification of the aerosol types (i.e. dust, smoke, clean/polluted continental, and clean/polluted marine). This retrieval product is based on lidar depolarization measurements. For the evaluation of aerosol transport models this provides us with a qualitative check for the chemical composition of aerosols. However, this is of limited practical use, since what we really need is quantitative information on the particles' chemical composition (which can be size-dependent). The most popular approach in evaluating and constraining aerosol transport models is the use of retrieved optical properties, such as aerosol optical depth, or extinction and backscattering coefficients. Yet another idea is to provide the particles' refractive index as a retrieval product. However, the use of such retrieval products still leaves us with the challenge of solving an ill-posed inverse problem, namely, of determining the particles' chemical composition from their retrieved optical or dielectric properties.

A systematic class of statistical methods for solving this inverse problem is known as data assimilation. Recent studies have applied data assimilation to aerosol models with varying degrees of sophistication, ranging from simple dust models and mass transport models to microphysical aerosol models based on modal or sectional descriptions of the aerosol size distribution. The assimilation techniques that have been used comprise variational methods, such as 2-D, 3-D, and 4-D variational methods, as well as ensemble approaches. Assimilation of satellite products for trace gases is relatively straightforward, since observed and modelled trace gas concentrations are almost directly comparable. However, aerosol optical properties observed from satellites are not directly comparable to the modelled size distribution and chemical composition of the aerosols. Solving this problem amounts to regularizing a severely under-constrained inverse problem. Previous aerosol assimilation attempts have been mainly based on educated guesses about the information content of the observations. For instance, there have been studies on the assimilation of aerosol optical depth (AOD) in which all chemical aerosol components in all size classes and at all model layers were used as independent control variables. This approach largely disregards the problems involved in inverse modelling. By contrast, it has been proposed to only allow for the total aerosol mass concentration to be corrected by data assimilation of AOD. This is a more prudent approach based on the plausible assumption that a single optical variable only contains enough information to control a single model variable. There have also been intermediate approaches in which the total aerosol mass per size bin has been used as a control variable.

In all such approaches the choice of control variables is based on ad hoc assumptions. Numerical assimilation experiments suggest that observations of several aerosol optical properties at multiple wavelengths may allow us to constrain more than just the total mass concentration, but certainly not all aerosol parameters. However, it remains an open question how much information a given set of observations actually contains about the size distribution and chemical composition of aerosols, and exactly which model variables are related to the observed signals, and which ones are related to noise. Thus a prerequisite for assimilating remote sensing observations into aerosol transport models is to thoroughly understand the information content of the observations as well as the relation between the model variables and the signal degrees of freedom.

In numerical weather prediction (NWP) modelling, several studies have discussed the information content of satellite observations for meteorological variables. For instance, a singular-value decomposition (SVD) approach has been applied in order to reduce the effect of prior information in the analysis, so that the retrieval and forecast errors can be assumed to be uncorrelated. Another study considered the assimilation of IR sounders, which typically provide a large number of different channels, and applied methods of information and retrieval theory in order to decide which channels contain most information about the vertical variation of temperature and humidity. The influence matrix has been employed to compute diagnostics of the impact of observations in a global NWP data assimilation system. Filtering and interpolation aspects in a 4DVAR assimilation system have been investigated by use of an SVD approach, together with Tikhonov regularization theory to optimize the signal-to-noise regularization parameter in order to maximize the information that can be extracted from observations. Different metrics, namely the relative entropy and the Shannon-entropy difference, have been compared as measures of the information content of radar observations assimilated into a coupled atmosphere–ocean model. Finally, methods of information theory have been used to address the question of how to determine an optimum spatial resolution of the discretized space of control variables in geophysical data assimilation.

Recent work has investigated the information content of “3β+2α” lidar measurements, i.e. observations of backscattering at three wavelengths and extinction at two wavelengths, where the information content was analysed with regard to the refractive index and number distribution of the aerosol particles. Similar analyses have been performed for the information content of multiwavelength Raman lidar measurements with regard to the complex refractive index and the effective radius of the aerosol particles. As mentioned earlier, the refractive index is a very useful retrieval product of remote sensing observations. However, from the point of view of chemical transport modelling, the main quantities of interest are the concentrations of the different chemical species of which the aerosol particles are composed. Although the chemical composition determines the refractive index, the inversion of this relationship is still under-determined, hence an ill-posed problem. In the present paper, we want to investigate the inverse problem that goes all the way from optical properties to the chemical composition of particles.

The two main goals of this paper are (i) to apply a systematic method for analysing the information content of aerosol optical properties with regard to the particles' chemical composition, and (ii) to test an algorithm for making an automatic choice of control variables in chemical data assimilation such that all control variables are signal related, while the noise-related variables remain unchanged by the assimilation procedure. The main hypothesis is that by constraining the data assimilation algorithm to acting on the signal-related variables only, the output will be less noisy than in an unconstrained assimilation. The focus of our study will be on spectral observations of extinction and backscattering coefficients, which can be retrieved from lidar observations.

In addition to lidar measurements from ground-based and aircraft-carried instruments, there are currently two space-borne lidar instruments in orbit. The CALIOP instrument on board the CALIPSO satellite was launched in April 2006; it has three receiver channels: one at 1064 nm and two channels at 532 nm to measure orthogonally polarized components. The CATS instrument on board the International Space Station has been operational since January 2015. It measures backscattering at 355, 532, and 1064 nm, where the latter two have two orthogonal polarization channels. It is also capable of performing high-spectral-resolution measurements at 532 nm. A third instrument is planned to be launched in 2018 (ATLID on board EarthCARE).

We will not restrict this analysis to any fixed choice of wavelengths, such as 3β+2α. Instead, we will investigate the information content for varying combinations of the three main wavelengths of the commonly used neodymium-doped yttrium aluminium garnet (Nd:YAG) laser. However, it should be mentioned that extinction measurements at the lowest harmonic of 1064 nm can be difficult and plagued by high errors; in practice, this will affect the observation error, resulting in a low information content of this particular measurement.

The paper is organized as follows. The Methods section gives a concise introduction to the modelling tools and to the numerical approach employed for studying the information content of extinction and backscattering observations. The Results section presents the main results of this study, and the final section offers concluding remarks. To make this paper self-contained, we include an appendix that gives a brief introduction to some essential concepts of data assimilation, and a detailed explanation of the methods we used for quantifying the information content of aerosol optical observables.

Methods

This study consists of two parts. In the first part we quantify the information content of extinction and backscattering coefficients at multiple wavelengths. In the second part we perform a numerical test to investigate to what extent the concentrations of different chemical aerosol components can be constrained by observations of extinction and backscattering coefficients. The modelling tools required for this study are (i) a chemical transport model, (ii) an aerosol optics model, and (iii) a data assimilation system.

Multiple scale Atmospheric Transport and CHemistry modelling system (MATCH)

We employ the chemical transport model MATCH, which is an off-line Eulerian CTM with flexible model domain. It has been previously used from regional to hemispheric scales. Here we use a model version that contains a photochemistry module with 64 chemical species, among them four secondary inorganic aerosols (SIAs) – namely ammonium sulfate, ammonium nitrate, other sulfates, and other nitrates. It also contains a module with 16 primary aerosol variables – namely sea salt, elemental carbon (EC), organic carbon (OC), and dust particles, each emitted in four different size bins. Thus, the model contains 20 different aerosol variables. The particle-radius ranges of the four bins are as follows:

size bin 1: 10–50 nm;

size bin 2: 50–500 nm;

size bin 3: 500–1250 nm;

size bin 4: 1250–5000 nm.
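The bookkeeping behind the 20 aerosol variables can be sketched in a few lines. This is only an illustration of how the count arises (4 SIA species plus 4 primary species in 4 size bins each); the variable names are hypothetical labels and are not identifiers from the MATCH code.

```python
# Hypothetical labels; only the counts and the bin limits are taken from the text.
SIA = ["ammonium_sulfate", "ammonium_nitrate", "other_sulfates", "other_nitrates"]
PRIMARY = ["sea_salt", "EC", "OC", "dust"]
BIN_RADII_NM = [(10, 50), (50, 500), (500, 1250), (1250, 5000)]

def aerosol_variables():
    """Return one label per aerosol variable of the mass transport model."""
    names = list(SIA)                      # SIA species carry no size-bin index here
    for species in PRIMARY:
        for bin_index in range(1, len(BIN_RADII_NM) + 1):
            names.append(f"{species}_bin{bin_index}")
    return names

variables = aerosol_variables()
print(len(variables))  # 4 SIA + 4 species x 4 bins
```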

The model reads in emission data, meteorological data, and land use data and computes transport processes, chemical transformation, and dry and wet deposition of the various trace gases and aerosols. As output, it provides concentration fields of gases and aerosols, the deposition of these chemical species to land and water-covered areas, as well as the temporal evolution of these variables.

We mention that there exists another model version that includes aerosol microphysical processes, such as nucleation, condensational growth, and coagulation. In that model version the aerosol size distribution evolves dynamically. The model has 20 size bins and seven chemical species (EC, OC, dust, sea salt, particulate sulfate (PSOX), particulate nitrate (PNOX), and particulate ammonium (PNHX)), although not all species are encountered in all size bins. The total number of model variables currently in that version is 82.

More complete descriptions of the mass transport model, the sea salt module, and the aerosol microphysics module are available in the literature.

For the sake of simplicity we here use the mass transport model without aerosol microphysical processes (see next section). The model is set up over Europe, covering 33° in the longitudinal and 42° in the latitudinal direction in a rotated lat-long grid with 0.4° × 0.4° horizontal resolution. In the vertical direction the model domain extends up to 13 hPa, using 40 terrain-following coordinates. The meteorological input data are taken from the numerical weather prediction model HIRLAM. For the emissions of all aerosol components we used EMEP data for the year 2007, where EC and OC emissions were computed from total primary particle emissions.

Aerosol optics model

We have two different optics models coupled to MATCH: one to the mass transport module, and another to the aerosol microphysics module. The former assumes that all aerosol species are homogeneous spheres, and that each chemical species is contained in separate particles. Under these assumptions the optics model is linear, i.e. the optical properties are linear functions of the concentrations of the chemical aerosol species. The latter model accounts for the fact that in reality different chemical species can be internally mixed, i.e. they can be contained in one and the same particle. That model also accounts for the inhomogeneous internal structure of black carbon mixed with other aerosol components, and for the irregular fractal aggregate morphology of bare black carbon particles. Under these assumptions the optics model becomes nonlinear, which introduces additional complications in the inverse-modelling problem. This is the main reason why we chose to use the simpler mass transport optics model in this study. Much of the theory explained in Appendix B relies on the assumption that the optics model is either linear, or that it is only mildly nonlinear, so that it can be linearized.
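The linearity of the externally mixed optics model can be illustrated with a small sketch. The coefficient matrix below is a hypothetical placeholder (its values are not taken from the MATCH optics model); the point is only that a linear operator acts as a constant matrix on the vector of concentrations, so that it coincides with its own Jacobian.

```python
import numpy as np

# Hypothetical mass-specific optical coefficients, for illustration only.
# Rows: observed parameters; columns: aerosol variables.
H = np.array([
    [3.0, 0.5, 8.0],     # extinction per unit concentration
    [0.10, 0.02, 0.05],  # backscattering per unit concentration
])

def optics(concentrations):
    """Linear optics model: optical properties as a linear map of concentrations."""
    return H @ concentrations

c1 = np.array([1.0, 2.0, 0.1])
c2 = np.array([0.5, 0.0, 0.3])

# Linearity check: the response to a combination of states equals
# the same combination of the individual responses.
lhs = optics(2.0 * c1 + c2)
rhs = 2.0 * optics(c1) + optics(c2)
print(np.allclose(lhs, rhs))
```

For an internally mixed model this check would in general fail, which is why such a model has to be linearized about the background state before the methods of this paper apply.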

The table below lists the refractive indices in the mass-transport optics model at the three lidar wavelengths considered in this study. More information about the aerosol optics models implemented in MATCH is available in the literature.

Refractive indices at the three harmonics of the Nd:YAG laser assumed in the MATCH mass-transport optics model.

| Species | 0.355 µm | 0.532 µm | 1.064 µm |
|---|---|---|---|
| SIA | 1.53+5.0e-3i | 1.53+5.6e-3i | 1.52+1.6e-2i |
| Dust | 1.53+1.7e-2i | 1.53+6.3e-3i | 1.53+4.3e-3i |
| NaCl | 1.51+2.9e-7i | 1.50+1.0e-8i | 1.47+2.0e-4i |
| OC | 1.53+5.0e-3i | 1.53+5.6e-3i | 1.52+1.6e-2i |
| EC | 1.66+7.2e-1i | 1.73+6.0e-1i | 1.82+5.9e-1i |

Three-dimensional variational data assimilation (3DVAR)

Data assimilation is a class of statistical methods for combining model results and observations. The algorithm weighs these two pieces of information according to their respective error variances and covariances. As output the assimilation returns a result in model space of which the error variances are smaller than those of the original model estimate. In our case the model variables are the mass mixing ratios of aerosol components in a three-dimensional discretized model domain. These model variables are summarized in a vector $\mathbf{x}$. The model provides us with a background (or first guess) estimate $\mathbf{x}_b$ (with an error $\boldsymbol{\epsilon}_b$). The observations, summarized in a vector $\mathbf{y}$, are related to the model state $\mathbf{x}$ by

$$\mathbf{y} = \hat{H}(\mathbf{x}) + \boldsymbol{\epsilon}_o,$$

where $\hat{H}$ is known as the observation operator, and $\boldsymbol{\epsilon}_o$ denotes the vector of observation errors. The problem is to determine the most likely state vector $\mathbf{x}_a$ given $\mathbf{x}_b$ and $\mathbf{y}$, and given the background error covariance matrix $\mathbf{B} = \langle \boldsymbol{\epsilon}_b \cdot \boldsymbol{\epsilon}_b^T \rangle$ and the observation error covariance matrix $\mathbf{R} = \langle \boldsymbol{\epsilon}_o \cdot \boldsymbol{\epsilon}_o^T \rangle$. Here $\langle \cdots \rangle$ denotes the expectation value. In the three-dimensional variational method (3DVAR), the maximum-likelihood solution is found by numerically minimizing the cost function

$$J = \frac{1}{2} (\mathbf{x} - \mathbf{x}_b)^T \cdot \mathbf{B}^{-1} \cdot (\mathbf{x} - \mathbf{x}_b) + \frac{1}{2} \left[ \hat{H}(\mathbf{x}) - \mathbf{y} \right]^T \cdot \mathbf{R}^{-1} \cdot \left[ \hat{H}(\mathbf{x}) - \mathbf{y} \right].$$
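For a linear observation operator the minimizer of the cost function has a closed form, which allows a compact numerical check. The following is a minimal sketch under stated assumptions, not the MATCH 3DVAR code: the operator is taken to be linear, $\hat{H}(\mathbf{x}) = \mathbf{H}\cdot\mathbf{x}$, and the matrices `H`, `B`, `R` as well as all numbers are hypothetical.

```python
import numpy as np

def cost(x, xb, y, H, B, R):
    """3DVAR cost function J(x) for a linear observation operator H."""
    dx = x - xb
    dy = H @ x - y
    return 0.5 * dx @ np.linalg.solve(B, dx) + 0.5 * dy @ np.linalg.solve(R, dy)

def gradient(x, xb, y, H, B, R):
    """Gradient of J with respect to x."""
    return np.linalg.solve(B, x - xb) + H.T @ np.linalg.solve(R, H @ x - y)

def analysis(xb, y, H, B, R):
    """Closed-form minimizer of J; valid only for a linear operator."""
    S = H @ B @ H.T + R                      # covariance of y - H xb
    return xb + B @ H.T @ np.linalg.solve(S, y - H @ xb)

# Hypothetical toy problem: 3 model variables, 2 observations
H = np.array([[1.0, 2.0, 0.5],
              [0.3, 0.1, 1.0]])
B = np.diag([1.0, 0.5, 2.0])
R = np.diag([0.1, 0.1])
xb = np.array([1.0, 1.0, 1.0])
y = np.array([4.0, 1.0])

xa = analysis(xb, y, H, B, R)
print(cost(xb, xb, y, H, B, R), cost(xa, xb, y, H, B, R))
print(gradient(xa, xb, y, H, B, R))          # vanishes at the minimum
```

In an operational system the state vector is far too large for this direct solve, which is why the cost function is minimized numerically instead.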

Data assimilation is commonly employed for constraining model results by use of observations. However, one can also employ data assimilation as an inverse-modelling tool, i.e. for retrieving a model state from measurements. A summary of the theoretical basis of variational data assimilation is given in Appendices B–D.

Many authors distinguish between data assimilation and data analysis. In data analysis one merely post-processes model results by incorporating the information provided by observations. In data assimilation, the data analysis process is part of the time integration of the CTM. Thus, in each time step the result of the analysis becomes the new initial state for the next model forecast. Our 3DVAR code can be used in either analysis or assimilation mode. However, in this study we only perform numerical tests at a fixed point in time. Thus we use the 3DVAR code as a data analysis tool.

The MATCH model contains a 3DVAR data assimilation module. This module uses a spectral method, i.e. the model state vector is Fourier transformed in the two horizontal coordinates. All error correlations in the horizontal direction are assumed to be homogeneous and isotropic. The background error covariance matrix is modelled with a method that follows similar principles to the NMC method. A more complete description of our 3DVAR program has been published elsewhere.

Analysis of the information content of aerosol optical parameters

The questions we ask are these:

Suppose we have an $n$-dimensional model space. Given $m$ observations (e.g. $m_1$ different parameters at $m_2$ different wavelengths, so that $m_1 \cdot m_2 = m$), how many independent model variables $N \le n$ can we constrain with the observations? Obviously, the best we can achieve would be $N = \min\{m, n\}$, but often we will have $N < \min\{m, n\}$.

Which are the N model variables (or linear combinations of model variables) that can be constrained by the measurements?

Here we only give a summary of the most essential theoretical tools for answering these questions. A more thorough explanation of these concepts is given in Appendix C.

First we want to explain what we mean by signal degrees of freedom and noise degrees of freedom, closely following an example from the literature. Suppose we have a direct measurement $y$ of a scalar variable $x$ with error $\epsilon_o$, i.e. $y = x + \epsilon_o$. Suppose further that we have a background estimate $x_b$ with background error variance $\sigma_b^2$, and that the error $\epsilon_o$ has variance $\sigma_o^2$. The prior variance of $y$ is given by $\sigma_y^2 = \sigma_b^2 + \sigma_o^2$, assuming that background and observation errors are uncorrelated. One can show that the best estimate $x_a$ of $x$ will be

$$x_a = \frac{\sigma_b^2\, y + \sigma_o^2\, x_b}{\sigma_b^2 + \sigma_o^2}.$$

Hence, if $\sigma_b^2 \gg \sigma_o^2$, then $x_a$ will be close to $y$, and the measurement provides information for estimating $x_a$, i.e. it provides a degree of freedom for signal. However, if $\sigma_b^2 \ll \sigma_o^2$, then $x_a$ will be close to $x_b$, and $y$ provides little information for estimating $x_a$. The measurement mostly contains information on $\epsilon_o$, i.e. it provides a degree of freedom for noise.
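The two limiting cases of the scalar example can be reproduced with a short sketch. The numerical values of the background, the measurement, and the variances are hypothetical and serve only to illustrate the weighting.

```python
def scalar_analysis(xb, y, var_b, var_o):
    """Best estimate of a scalar from a background xb and a direct measurement y."""
    return (var_b * y + var_o * xb) / (var_b + var_o)

xb, y = 10.0, 20.0  # hypothetical background estimate and measurement

# Accurate measurement (var_b >> var_o): the analysis follows the measurement.
signal_case = scalar_analysis(xb, y, var_b=100.0, var_o=1.0)

# Noisy measurement (var_b << var_o): the analysis stays near the background.
noise_case = scalar_analysis(xb, y, var_b=1.0, var_o=100.0)

print(signal_case, noise_case)
```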

In a more general case we have to consider a state vector $\mathbf{x}$ and a set of measurements $\mathbf{y}$ with errors $\boldsymbol{\epsilon}_o$. The number $N_s$ of signal degrees of freedom is a measure for the information content of the set of measurements. It provides us with an estimate of the number $N$ of model variables that can be controlled by assimilating measurements.

The mapping from model space to observation space, $\mathbf{y} = \hat{H}(\mathbf{x}) + \boldsymbol{\epsilon}_o$, can be Taylor expanded to first order according to

$$\mathbf{y} = \hat{H}(\mathbf{x}_b) + \mathbf{H} \cdot \delta\mathbf{x} + \boldsymbol{\epsilon}_o,$$

where $\hat{H}$ is the observation operator, $\mathbf{H}$ denotes its Jacobian, and $\delta\mathbf{x} = \mathbf{x} - \mathbf{x}_b$. The background or prior estimate $\mathbf{x}_b$ is often obtained from a model run. The (in general non-square) matrix $\mathbf{H}$ is the main quantity we need to investigate in order to address the questions formulated at the beginning of this section. It is transformed to the so-called observability matrix

$$\tilde{\mathbf{H}} = \mathbf{R}^{-1/2} \cdot \mathbf{H} \cdot \mathbf{B}^{1/2},$$

where $\mathbf{R}$ is the observation error covariance matrix, and $\mathbf{B}$ denotes the error covariance matrix of the background estimate. Subsequently, one performs a singular-value decomposition (SVD)

$$\mathbf{R}^{-1/2} \cdot \mathbf{H} \cdot \mathbf{B}^{1/2} = \mathbf{V}_L \cdot \mathbf{W} \cdot \mathbf{V}_R^T,$$

where the matrices $\mathbf{V}_L$ and $\mathbf{V}_R$ contain the left and right singular vectors, respectively, and $\mathbf{W}$ is a matrix that contains the singular values along the main diagonal, while all other matrix elements are zero. It turns out that the singular values $w_i$ can be employed to compute the number of signal degrees of freedom $N_s$ according to

$$N_s = \sum_{i=1}^{\min\{n,m\}} \frac{w_i^2}{1 + w_i^2}.$$
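The chain from Jacobian to signal degrees of freedom can be sketched numerically as follows. This is an illustration under assumptions, not the procedure as implemented in MATCH: the Jacobian `H` and the covariance matrices are hypothetical, and the matrix square roots are computed from an eigendecomposition, which presumes symmetric positive-definite $\mathbf{B}$ and $\mathbf{R}$.

```python
import numpy as np

def sym_power(M, p):
    """Power of a symmetric positive-definite matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(M)
    return (vecs * vals**p) @ vecs.T

def signal_dof(H, B, R):
    """Number of signal degrees of freedom from the observability matrix."""
    H_tilde = sym_power(R, -0.5) @ H @ sym_power(B, 0.5)
    w = np.linalg.svd(H_tilde, compute_uv=False)   # singular values only
    return float(np.sum(w**2 / (1.0 + w**2)))

# Hypothetical Jacobian (2 observations, 3 model variables) and background errors
H = np.array([[1.0, 2.0, 0.5],
              [0.3, 0.1, 1.0]])
B = np.diag([1.0, 0.5, 2.0])

# Increasing the observation error standard deviation lowers N_s
for sigma_o in (0.01, 0.1, 1.0, 10.0):
    R = sigma_o**2 * np.eye(2)
    print(sigma_o, signal_dof(H, B, R))
```

With two observations $N_s$ can never exceed 2, and it approaches that limit only when the observation errors are small compared to the background errors mapped into observation space.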

Another useful measure is obtained by expressing our incomplete knowledge of the atmospheric aerosol state by use of the Shannon entropy. The use of measurement information reduces the entropy, and this entropy reduction $H$ can be expressed in terms of the singular values:

$$H = \frac{1}{2} \sum_{i=1}^{\min\{n,m\}} \log_2\left(1 + w_i^2\right).$$
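Given the singular values, the entropy reduction is a one-line computation. The sketch below uses hypothetical singular values; it shows that a component with $w_i = 1$ contributes half a bit, while a component with $w_i = 0$ contributes nothing.

```python
import math

def entropy_reduction(singular_values):
    """Reduction in Shannon entropy (in bits) from the singular values w_i."""
    return 0.5 * sum(math.log2(1.0 + w**2) for w in singular_values)

print(entropy_reduction([1.0]))        # one component with w = 1
print(entropy_reduction([1.0, 0.0]))   # adding a component with w = 0 changes nothing
print(entropy_reduction([30.0, 2.0]))  # hypothetical two-parameter case
```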

Both $N_s$ and $H$ allow us to quantify the information content of a set of measurements. More detailed explanations of these concepts are given in Appendix C. A comprehensive discussion of information aspects and inverse methods for atmospheric sounding is available in the literature.

By performing the transformation $\delta\mathbf{x}' = \mathbf{V}_R^T \cdot \mathbf{B}^{-1/2} \cdot \delta\mathbf{x}$ we go from our physical model space to an abstract phase space (see the appendix). In this phase space the components of $\delta\mathbf{x}'$ can be separated into signal-related and noise-related variables. The signal-related components can be controlled by the measurements, the noise-related components cannot. We therefore introduce constraints into our 3DVAR program such that only the $N_s$ signal-related components of $\delta\mathbf{x}'$ are allowed to be adjusted in the data analysis procedure, while the noise-related components are not altered. This is accomplished by adding an extra term $J_G$ to the 3DVAR cost function $J$, where

$$J_G = \frac{1}{2}\, \delta\mathbf{x}^T \cdot \mathbf{B}^{-1/2} \cdot \mathbf{V}_R \cdot \mathbf{B}_G^{-1} \cdot \mathbf{V}_R^T \cdot \mathbf{B}^{-1/2} \cdot \delta\mathbf{x},$$

and where $\mathbf{B}_G$ is a diagonal matrix which we assume to have the form

$$\mathbf{B}_G = \sigma_G\, \mathrm{diag}(w_1, w_2, \ldots, w_K, c, \ldots, c).$$

Here $K = \min\{n, m\}$, and the number $c$ is assumed to be much smaller than the smallest singular value. We note that this formulation of the constraint term $J_G$ is by no means unique. Other possible choices of the matrix $\mathbf{B}_G$ are discussed in the appendix. However, we performed preliminary tests which indicate that the constrained 3DVAR approach is not very sensitive to exactly how one chooses to formulate the matrix $\mathbf{B}_G$, as long as it behaves in such a way that the noise-related phase-space variables are tightly constrained, while the signal-related variables can be varied relatively freely by the analysis. The free parameters $\sigma_G$ and $c$ should be tuned in such a way that the constraints are neither too hard nor too soft. In the former case, the analysis will stay too close to the background estimate. In the latter case, it will not differ much from the unconstrained analysis.
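The effect of the constraint term can be illustrated with a small sketch. It assumes, purely for illustration, that $\mathbf{B}$ and $\mathbf{R}$ are identity matrices, so that the observability matrix equals a hypothetical Jacobian; the values of `sigma_G` and `c` are arbitrary choices and not the tuned values used in the study.

```python
import numpy as np

def constraint_diagonal(w, n, sigma_G, c):
    """Diagonal of B_G = sigma_G * diag(w_1, ..., w_K, c, ..., c)."""
    diag = np.full(n, c, dtype=float)
    diag[: len(w)] = w
    return sigma_G * diag

def constraint_cost(dx, B_inv_sqrt, V_R, bg_diag):
    """J_G = 1/2 dx'^T B_G^{-1} dx' with dx' = V_R^T B^{-1/2} dx."""
    dx_prime = V_R.T @ B_inv_sqrt @ dx
    return 0.5 * float(np.sum(dx_prime**2 / bg_diag))

# Hypothetical observability matrix: 2 observations, 3 model variables
H_tilde = np.array([[1.0, 2.0, 0.5],
                    [0.3, 0.1, 1.0]])
n = H_tilde.shape[1]
_, w, V_R_T = np.linalg.svd(H_tilde, full_matrices=True)
V_R = V_R_T.T                        # columns are the right singular vectors
B_inv_sqrt = np.eye(n)               # assumption: B is the identity matrix
bg_diag = constraint_diagonal(w, n, sigma_G=1.0, c=1e-3)

# Unit increments along a signal-related and a noise-related direction
J_signal = constraint_cost(V_R[:, 0], B_inv_sqrt, V_R, bg_diag)
J_noise = constraint_cost(V_R[:, 2], B_inv_sqrt, V_R, bg_diag)
print(J_signal, J_noise)
```

An increment of the same size is penalized far more strongly along the noise-related direction, because the corresponding diagonal element of $\mathbf{B}_G$ is the small number $c$ rather than a singular value.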

Numerical test of the constrained assimilation algorithm

We study the performance of the 3DVAR system by performing a numerical test. To this end, we first perform a reference run by driving the MATCH model with analysed meteorological data. These reference results are taken as the “true” chemical state of the atmosphere. We apply the optics model to the model output to generate synthetic “observations”, i.e. a vertical profile at a selected observation point of extinction and backscattering coefficients at three typical lidar wavelengths. Next we run the MATCH model again, this time driven with 48 h forecast meteorological data. The results are taken as a proxy for a background model-estimate that is impaired by uncertainties. Finally, we perform a 3DVAR-analysis of the “observations” and the background estimate in an attempt to restore the reference results. In this numerical test we have perfect knowledge of the true state, and we assume that our optics model is nearly perfect, thus providing nearly perfect observations (we assumed that the observation error standard deviation is 10 % of the measurement value). The only factor that may prevent us from fully restoring the reference state is a lack of information in the observed parameters. Thus, comparison of the retrieval and reference results gives us an indication of how strongly different model variables can be controlled by the information contained in the observations.
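The logic of such a twin experiment can be mimicked with a toy problem. The sketch below is not the MATCH experiment: the linear optics matrix, the reference and background states, and the assumed 50 % background error are hypothetical; only the 10 % observation error standard deviation follows the text. The analysis is computed in closed form, which presumes a linear observation operator.

```python
import numpy as np

H = np.array([[1.0, 2.0, 0.5],
              [0.3, 0.1, 1.0]])      # hypothetical linear optics model
x_true = np.array([2.0, 1.0, 3.0])   # stands in for the reference run
x_b = np.array([1.5, 1.4, 2.0])      # stands in for the forecast-driven run
y = H @ x_true                       # synthetic "observations"

R = np.diag((0.10 * y) ** 2)         # 10 % observation error standard deviation
B = np.diag((0.50 * x_b) ** 2)       # assumed 50 % background error (hypothetical)

# Closed-form analysis for a linear observation operator
S = H @ B @ H.T + R
x_a = x_b + B @ H.T @ np.linalg.solve(S, y - H @ x_b)

def misfit(x):
    """Observation-space misfit, weighted by the observation error covariance."""
    d = H @ x - y
    return float(d @ np.linalg.solve(R, d))

print(misfit(x_b), misfit(x_a))
print(np.abs(x_b - x_true), np.abs(x_a - x_true))
```

The analysis always fits the observations at least as well as the background does. Whether each individual model variable moves closer to the reference state is a separate matter, since two observations cannot constrain three variables independently; this is precisely the question the numerical test addresses.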

We perform this test (i) with the unconstrained 3DVAR algorithm and (ii) with the constrained 3DVAR algorithm. We compare both runs in order to make a first assessment of the impact of the constraints. In particular, we are interested in the prospect of reducing the risk of assimilating noise in such a highly under-constrained inverse problem.

Results

Analysis of the information content of aerosol optical parameters

We consider the set of parameters $\{k_\mathrm{ext}(\lambda_1), k_\mathrm{ext}(\lambda_2), \beta_\mathrm{sca}(\lambda_1), \beta_\mathrm{sca}(\lambda_2), \beta_\mathrm{sca}(\lambda_3)\}$, where $k_\mathrm{ext}$ and $\beta_\mathrm{sca}$ denote the extinction and backscattering coefficients, respectively, and the wavelengths $\lambda_1 = 1064$ nm, $\lambda_2 = 532$ nm, and $\lambda_3 = 355$ nm denote the first three Nd:YAG harmonics. Hereafter, we will abbreviate these parameters by $k_\mathrm{ext}(\lambda_i) = k_i$, $\beta_\mathrm{sca}(\lambda_j) = \beta_j$, $i = 1, 2$, $j = 1, 2, 3$. Out of this five-parameter set we pick different subsets and analyse the singular values of the corresponding observability matrices. From those we compute the number of signal degrees of freedom as well as the change in Shannon entropy for each subset of measurements. We will focus on those parameter subsets that are technically relevant in practical lidar applications.

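Number of signal degrees of freedom $N_s$ and reduction in entropy $H$ as a function of observation standard deviation (Obs. SD), taken from the lowest model layer (closest to the surface). Results are shown for different subsets of $k_1$, $k_2$, $\beta_1$, $\beta_2$, $\beta_3$, where $k_i$ and $\beta_i$ represent the extinction and backscattering coefficient, respectively, at the wavelengths $\lambda_1 = 1064$ nm, $\lambda_2 = 532$ nm, and $\lambda_3 = 355$ nm.

| No. | Parameters | $N_s$ (1 %) | $H$ (1 %) | $N_s$ (5 %) | $H$ (5 %) | $N_s$ (10 %) | $H$ (10 %) | $N_s$ (50 %) | $H$ (50 %) | $N_s$ (100 %) | $H$ (100 %) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1. | β3 | 1.00 | 10.9 | 1.00 | 8.58 | 1.00 | 7.58 | 1.00 | 5.26 | 1.00 | 4.26 |
| 2. | β1+β2 | 2.00 | 20.6 | 2.00 | 15.99 | 2.00 | 13.98 | 1.97 | 9.36 | 1.90 | 7.42 |
| 3. | β1+β2+β3 | 3.00 | 27.3 | 3.00 | 20.3 | 2.99 | 17.3 | 2.72 | 10.5 | 2.33 | 8.00 |
| 4. | β3+k3 | 2.00 | 19.4 | 2.00 | 14.8 | 2.00 | 12.8 | 1.92 | 8.21 | 1.74 | 6.37 |
| 5. | β1+β2+k2 | 3.00 | 28.0 | 3.00 | 21.0 | 2.99 | 18.0 | 2.77 | 11.2 | 2.42 | 8.63 |
| 6. | β1+β2+β3+k2+k3 | 5.00 | 40.0 | 4.97 | 28.4 | 4.91 | 23.5 | 3.89 | 12.9 | 2.97 | 9.49 |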

The table above shows the number of signal degrees of freedom $N_s$ and the reduction in Shannon entropy $H$ for different values of the observation standard deviation $\sigma_o$. For low values of $\sigma_o$, the number of signal degrees of freedom is identical to the number of observational parameters. However, as we increase $\sigma_o$ we observe a decrease in $N_s$. For instance, for $\sigma_o = 100$ % the five parameters β1+β2+β3+k2+k3 (last row) only provide roughly $N_s = 3$ signal degrees of freedom. The reduction in Shannon entropy $H$ displays an analogous behaviour. For instance, for $\sigma_o = 1$ % we see that $H$ consistently increases as one increases the number of observational parameters. This is much less pronounced for $\sigma_o = 100$ %. In that case, $H$ does increase as one goes from a single parameter to two parameters (compare the first to the second and fourth rows). However, as one adds more parameters, the increase in $H$ slows down considerably. For five parameters (last row), $H$ is only about twice as high as for a single parameter (first row).
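The single-parameter row of the table (β3) can be checked by hand. The following worked arithmetic assumes that the one singular value satisfies $w \gg 1$ and that it scales inversely with the observation standard deviation, $w \propto 1/\sigma_o$, as follows from the factor $\mathbf{R}^{-1/2}$ in the observability matrix. Under these assumptions, multiplying $\sigma_o$ by a factor $f$ lowers $H$ by $\log_2 f$ bits.

```latex
\begin{align*}
H &= \tfrac{1}{2}\log_2\left(1 + w^2\right) \approx \log_2 w \quad (w \gg 1), \\
H(f\,\sigma_o) &\approx H(\sigma_o) - \log_2 f, \\
1\,\% \to 5\,\%:   &\quad 10.9 - \log_2 5 \approx 10.9 - 2.32 = 8.58, \\
5\,\% \to 10\,\%:  &\quad 8.58 - \log_2 2 = 7.58, \\
10\,\% \to 50\,\%: &\quad 7.58 - \log_2 5 \approx 5.26, \\
50\,\% \to 100\,\%: &\quad 5.26 - \log_2 2 = 4.26.
\end{align*}
```

These values agree with the first row of the table. Even at $\sigma_o = 100$ % the entropy reduction of 4.26 bits corresponds to $w^2 \approx 2^{8.52} \gg 1$, which is consistent with $N_s$ remaining at 1.00 for this row.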

This illustrates the pivotal importance of the observation error for the amount of information that can be obtained from measurements. It is important to understand that the observation error $\boldsymbol{\epsilon}_o$ is not the same as the measurement error $\boldsymbol{\epsilon}_m$. Rather, in our case we have $\boldsymbol{\epsilon}_o = \boldsymbol{\epsilon}_m + \boldsymbol{\epsilon}_f$, where $\boldsymbol{\epsilon}_f$ denotes the forward-model error. Any simplifying assumptions in the optics model or incomplete knowledge of the particle size distribution, morphology, chemical composition, or dielectric properties can contribute to $\boldsymbol{\epsilon}_f$. Such assumptions enter into our relatively simple optics model.

A more realistic optics model, such as one investigated in earlier work, would help to reduce the observation standard deviation. For future studies, such a model should be linearized and investigated in a similar way.

Note also that in operational applications there may be other terms contributing to ϵo. For instance, if a point measurement is taken at a location that does not provide a good representation of the grid-cell average, then one would have to add a representativity error ϵr to the observation error.

The strong impact of the observation errors on the information content of measurements suggests two conclusions.

In order to make the forward-model error ϵf as small as possible, it is essential to develop accurate and realistic aerosol optics models. The most accurate measurements may intrinsically contain a wealth of information on aerosol properties. But we can only make use of this information to the extent that our observation operator is able to accurately describe the relation between the physical and chemical particle characteristics and their optical properties.

It is equally essential to accurately estimate the contribution of the uncertainties in the aerosol optics model, i.e. to estimate the forward-model error ϵf. If we underestimate this error, we will rely more on the measurements than we should, thus assimilating noise. If we overestimate this error, we will waste information contained in the observations. In practice, one way to estimate ϵf is to compute optical properties while varying the particles' size, morphology, and dielectric properties within typical ranges. The resulting variation in the optical properties then allows us to estimate ϵf. (For a review of aerosol optics modelling see , and references therein).

In Table we sorted the results for $N_s$ and $H$ by different values of the observation standard deviation. However, it is important to realize that the results also depend on the background error standard deviation, or, more precisely, on how large the background error standard deviations are compared to the observation error standard deviations. made this point very explicit. They discussed an idealized case with diagonal background error covariance matrix $B=\sigma_b^2\,\mathbf{1}$ and observation error covariance matrix $R=\sigma_o^2\,\mathbf{1}$, where $\mathbf{1}$ denotes the unit matrix. They considered the case of direct measurements, i.e. the model variables and the observed parameters are the same type of variables. Under such idealized conditions, they showed that one can maximize the amount of information that can be obtained from the observations by optimizing the regularization parameter $\sigma_b/\sigma_o$ (or, equivalently, the regularization parameter $\sigma_o^2/\sigma_b^2$). In our more general case, instead of $\sigma_b$ we need to consider the full matrix $B^{1/2}$, instead of $\sigma_o^{-1}$ we need to consider $R^{-1/2}$, and in order to compare the two matrices we need to first transform $B^{1/2}$ from model to observation space according to $H\cdot B^{1/2}$. Thus, in place of $\sigma_b/\sigma_o$ we need to consider the more general quantity $R^{-1/2}\cdot H\cdot B^{1/2}$, and we need to diagonalize it by a singular value decomposition according to Eq. (). Thus the singular values $w_i$ generalize the parameter $\sigma_b/\sigma_o$. The latter applies to the case of direct observations and error covariance matrices that are proportional to unit matrices. The former apply to the general case of non-diagonal error covariance matrices and indirect observations.
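The reduction to the scalar ratio can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming a toy dimension and arbitrary illustrative values for the two standard deviations (none of them taken from this study). It forms the scaled operator for direct observations, where the observation operator is the unit matrix, and computes its singular values, which should all equal the ratio of background to observation standard deviation.

```python
import numpy as np

# Illustrative values only (assumptions, not taken from the study).
n = 4            # number of directly observed model variables
sigma_b = 0.6    # background error standard deviation
sigma_o = 0.2    # observation error standard deviation

B = sigma_b**2 * np.eye(n)   # background error covariance matrix
R = sigma_o**2 * np.eye(n)   # observation error covariance matrix
H = np.eye(n)                # direct observations: unit observation operator

# For diagonal matrices the matrix square root is the element-wise square root.
B_sqrt = np.sqrt(B)
R_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(R)))

# Scaled observation operator R^{-1/2} H B^{1/2} and its singular values.
H_scaled = R_inv_sqrt @ H @ B_sqrt
w = np.linalg.svd(H_scaled, compute_uv=False)

print(w)                                   # singular values
print(np.allclose(w, sigma_b / sigma_o))   # compare with sigma_b / sigma_o
```

For non-diagonal covariance matrices or indirect observations the singular values are no longer identical, and each one plays the role of the ratio for one direction in the transformed space.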

From this we learn that the singular values $w_i$ provide us with a (however abstract) means to quantify how the background standard deviations compare to the observation standard deviations. We pick one of the columns in Table , namely the one for $\sigma_o=50$ %, and expand it in Table . We show the singular values $w_i$, as well as their contributions $N_{si}=w_i^2/(1+w_i^2)$ and $H_i=0.5\log_2(1+w_i^2)$ to the sums in Eqs. () and (), respectively. The results reveal that the singular values $w_i$ can decrease quite rapidly from the largest to the smallest value (see, e.g., case no. 6 in the table). However, the corresponding contribution $N_{si}$ to the number of signal degrees of freedom changes rather smoothly. Even those singular values that are only slightly larger than 1 make contributions $N_{si}$ that lie close to 1 (see, e.g., $i=4$ in case no. 6). However, once $w_i$ falls below 1, the corresponding contribution $N_{si}$ becomes much smaller than 1 (see $i=5$ in case no. 6).
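As a worked illustration of these two formulas, the short Python sketch below evaluates the per-singular-value contributions and their sums for the five singular values listed for case no. 6 at an observation standard deviation of 50 %. The function names are hypothetical and chosen only for this example. The sums should come out close to the tabulated totals of 3.89 signal degrees of freedom and an entropy reduction of 12.9.

```python
import math

def signal_dof(w):
    """Contribution of one singular value w to the signal degrees of freedom."""
    return w**2 / (1.0 + w**2)

def entropy_reduction(w):
    """Contribution of one singular value w to the entropy reduction (bits)."""
    return 0.5 * math.log2(1.0 + w**2)

# Singular values of case no. 6 (sigma_o = 50 %), as listed in the table.
w_case6 = [153.0, 9.52, 1.94, 1.63, 0.79]

for i, w in enumerate(w_case6, start=1):
    print(i, round(signal_dof(w), 2), round(entropy_reduction(w), 2))

ns_total = sum(signal_dof(w) for w in w_case6)
h_total = sum(entropy_reduction(w) for w in w_case6)
print(round(ns_total, 2), round(h_total, 1))
```

The smooth saturation of the first function towards 1 for large singular values, and its rapid drop once the singular value falls below 1, is what separates signal-related from noise-related directions.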

Let us now compare the different subsets of parameters in Tables and . In case no. 1 we observe a single parameter that provides a single degree of freedom. In cases no. 2 and 4 we observe two parameters, which nearly doubles Ns. Comparison of these two cases shows that it does not make a significant difference whether we observe backscattering coefficients at different wavelengths, or both extinction and backscattering coefficients each at a single wavelength. In either case the measurements provide roughly the same amount of information (in terms of Ns or H). The same is true when considering three observational parameters (compare case nos. 3 and 5). The 3β+2α case (no. 6) clearly provides the largest amount of information in comparison to the other cases. However, as we saw in Table , observation errors that are large in comparison to the background errors can significantly reduce the effective information that can be assimilated into a model.

Signal degrees of freedom Ns and change in entropy H for the lowest model layer (closest to the surface). Also shown are the singular values wi and their contributions Nsi and Hi to Ns and H, respectively. The results have been obtained by assuming an observation standard deviation of 50 %.

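| No. | Parameters | i | wi | Nsi | Hi | Ns | H |
|---|---|---|---|---|---|---|---|
| 1 | β3 | 1 | 38.2 | 1.00 | 5.26 | 1.00 | 5.26 |
| 2 | β1, β2 | 1 | 108 | 1.00 | 6.76 | 1.97 | 9.36 |
| | | 2 | 6.00 | 0.97 | 2.61 | | |
| 3 | β1, β2, β3 | 1 | 115 | 1.00 | 6.84 | 2.71 | 10.5 |
| | | 2 | 6.54 | 0.98 | 2.73 | | |
| | | 3 | 1.68 | 0.74 | 0.97 | | |
| 4 | β3, k3 | 1 | 83.3 | 1.00 | 6.38 | 1.92 | 8.22 |
| | | 2 | 3.43 | 0.92 | 1.84 | | |
| 5 | β1, β2, k2 | 1 | 128 | 1.00 | 7.00 | 2.77 | 11.24 |
| | | 2 | 8.71 | 0.99 | 3.13 | | |
| | | 3 | 1.90 | 0.78 | 1.10 | | |
| 6 | β1, β2, β3, k2, k3 | 1 | 153 | 1.00 | 7.26 | 3.89 | 12.9 |
| | | 2 | 9.52 | 0.99 | 3.26 | | |
| | | 3 | 1.94 | 0.79 | 1.13 | | |
| | | 4 | 1.63 | 0.73 | 0.93 | | |
| | | 5 | 0.79 | 0.38 | 0.35 | | |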

Numerical inverse-modelling test

We integrated the findings of Sect. into our 3DVAR program by constraining the algorithm to varying only the signal-related model variables. To illustrate the method we conduct a numerical test as described in Sect. . We perform a 3DVAR analysis by assimilating “3β+2α” profiles, i.e. synthetic lidar measurements of βsca at the three wavelengths 1064, 532, and 355 nm together with kext at the two wavelengths 532 and 355 nm. Thus in our case the number of singular values in each vertical layer is K=5. We assume an idealized situation in which the observation standard deviation is only 10 %. As we see in Table (case no. 6), the number of signal degrees of freedom is Ns=4.9 in this case. Thus, we roughly have as many signal degrees of freedom as we have measurements.
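The quoted value of 4.9 signal degrees of freedom can be made plausible with a back-of-the-envelope calculation. The sketch below rests on an assumption that is not stated in the text: that changing the relative observation standard deviation rescales every singular value by the inverse ratio, which would hold if the observation error covariance matrix is proportional to the squared observation standard deviation while the background error covariance matrix and the observation operator stay fixed. Under that assumption, the singular values listed for case no. 6 at 50 % can be rescaled to 10 % and 100 %. The helper names are hypothetical.

```python
def total_signal_dof(singular_values):
    """Sum of w^2 / (1 + w^2) over all singular values."""
    return sum(w**2 / (1.0 + w**2) for w in singular_values)

def rescale(singular_values, sigma_ref, sigma_new):
    """Rescale singular values, ASSUMING they are proportional to 1/sigma_o."""
    return [w * sigma_ref / sigma_new for w in singular_values]

# Singular values of case no. 6 at sigma_o = 50 %, as listed in the table.
w_50 = [153.0, 9.52, 1.94, 1.63, 0.79]

ns_10 = total_signal_dof(rescale(w_50, 50.0, 10.0))     # sigma_o = 10 %
ns_100 = total_signal_dof(rescale(w_50, 50.0, 100.0))   # sigma_o = 100 %
print(round(ns_10, 1), round(ns_100, 1))
```

Under the stated scaling assumption the two results are close to the values quoted in the text for 10 % (about 4.9) and for 100 % (roughly 3).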

Vertical profiles of selected aerosol components in different size bins. From top to bottom: organic carbon in the third size bin (OC-3), OC in the fourth size bin (OC-4), elemental carbon in the third size bin (EC-3), and dust in the first size bin (DUST-1). The reference results are shown in black, and the background (first guess) estimate is shown in green. The unconstrained 3DVAR analysis results are presented in the left panels in blue, the constrained 3DVAR analysis results are shown in the right panels in red.

Figure shows vertical profiles of selected aerosol components, namely (from top to bottom): organic carbon (OC) in the third size bin (OC-3), OC in the fourth size bin (OC-4), elemental carbon (EC) in the third size bin (EC-3), and mineral dust in the first size bin (DUST-1). The reference and background mixing ratios are shown in black and green, respectively. The 3DVAR analysis was first performed without any constraints; the results are shown in the left column by the blue line. Then the 3DVAR analysis was repeated with the constraints in Eqs. () and (); the results are represented in the right column by the red line. Clearly, the unconstrained analysis (blue lines in the left panels) yields results that oscillate quite erratically in the vertical direction. Also, the unconstrained analysis can yield conspicuously high values at higher altitudes, even though the reference and background values are both close to zero. By contrast, the constrained analysis (red lines in the right panels) yields results that better agree with the reference results. The noisiness in the vertical direction is significantly reduced, and the results at higher altitudes are generally lower than those obtained with the unconstrained analysis.

As Fig. , but for the total mass mixing ratio (summed over all size bins). The components are (from top to bottom): EC, OC, mineral dust, sea salt, secondary inorganic aerosols (sum of all sulfate, nitrate, and ammonium species), and PM10 (sum of all aerosol components).

Figure shows analogous results for the mass mixing ratios of different aerosol components, each summed over all size bins. The aerosol components are (from top to bottom): elemental carbon (EC), organic carbon (OC), mineral dust (DUST), sea salt (NaCl), secondary inorganic aerosols (SIA, i.e. the sum over all sulfate, nitrate, and ammonium species), and PM10 (i.e. the sum over all aerosol components). Clearly, the constrained analysis faithfully retrieves both PM10 and SIA. The unconstrained analysis performs almost equally well for these two variables. Sea salt and mineral dust are not well retrieved from the measurements in either the constrained or unconstrained approach. EC and OC are very well retrieved by the constrained analysis. For these components, the unconstrained analysis has a very small bias compared to the reference results, but it is considerably more noisy (i.e. oscillating in the vertical direction) than the constrained analysis. We also see, again, that the mixing ratios at higher altitudes obtained with the unconstrained analysis can be unreasonably high. This is especially pronounced for OC. In general, however, the problems we encounter in the unconstrained analysis are less pronounced in Fig. than in Fig. . A possible explanation is that SIA may be most strongly related to the measurement signal, and SIA is dominating the aerosol mass in this case. We will return to this point shortly. Another possible factor is that the noise in the analysis can be damped by summing up results over several size bins.

Figure shows the observations (black) as well as the observation-equivalents of the background estimate (green) and the unconstrained (blue) and constrained (red) 3DVAR analysis for all five observations. We learn from this figure that the analysis follows the observations faithfully. The reason for this is that we assumed that the observations were highly accurate with an error standard deviation of only 10 %. In fact, the observation-equivalent of the analysis deviates from the observations by even less than 10 %. However, our tests confirmed that as the observation error increases, the observation-equivalent of the analysis deviates increasingly from the observations (not shown).

Observations (black solid line), and observation-equivalents of the background estimate (green), and of the unconstrained (blue) and constrained (red) 3DVAR analysis. The optical parameters and wavelengths are indicated above each panel.

Vertical profiles of the transformed model variables δx′, defined in Eq. (). The figure shows results obtained with the constrained (red) and unconstrained (blue) 3DVAR analysis.

We have seen that the analysis provides a reasonable, but, as expected, not a perfect answer to the inverse problem. We have further seen that at the observation site it relies more on the observations than on the background estimate. Most importantly, we have seen that the constraints introduced in the 3DVAR algorithm suppress noise in the analysis, especially in EC and OC. However, the previous figures do not provide us with any direct insight into how exactly the constraints accomplish this. To learn more about that we need to inspect the analysis in the abstract phase space of the transformed model variables $\delta x'$. (Recall that we defined this variable in Eq. as $\delta x'=V_R^T\cdot B^{-1/2}\cdot(x-x_b)$.) Figure shows vertical profiles of a selection of the, in total, 20 variables $\delta x_i'$. The background estimate corresponds to $\delta x_i'=0$ and is represented by the green line. The unconstrained 3DVAR analysis increment is represented by the blue line, and the constrained 3DVAR analysis increment is shown by the red line. The first five phase space elements in the top row are the signal-related control variables. Generally, the magnitude of the constrained increments (red) is larger than that of the unconstrained increments (blue). The noise-related phase space elements, five of which are shown in the bottom row, display the opposite behaviour. The constrained increments are close to zero, as they should be. The unconstrained elements consistently show higher magnitudes than the constrained elements. However, we also see that the unconstrained analysis does produce increments that are largest for the two elements $\delta x_1'$ and $\delta x_2'$, which most strongly relate to the measurement signal. Based on our single test case we cannot say if this is a lucky coincidence or a consistent property. If the latter, it may indicate that we are using rather reasonable background error statistics, so that the analysis increment in observation space is distributed to the different variables in model space in a sensible way. If the former, it could be the case that the success of the unconstrained analysis depends largely on whether the aerosol components that most strongly relate to the signal degrees of freedom also dominate the total aerosol mass. (In our case the total mass is dominated by SIA, which is very well retrieved by the analysis.)
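The partition of the transformed variables into signal-related and noise-related elements can be illustrated with a toy example. The sketch below is not the 3DVAR code used in this study. It assumes small random matrices, diagonal error covariances, and takes the transformation matrix to be the full set of right singular vectors of the scaled observation operator. It then shows that setting the noise-related elements to zero leaves the observation-equivalent of an increment unchanged, which is why only the signal-related elements can be constrained by the measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 2   # toy sizes (assumed): 6 model variables, 2 observations

# Toy inputs (assumed): random Jacobian and diagonal error statistics.
H = rng.normal(size=(m, n))
b_std = rng.uniform(0.5, 2.0, size=n)   # background standard deviations
r_std = np.full(m, 0.1)                 # observation standard deviations

B_sqrt = np.diag(b_std)
B_inv_sqrt = np.diag(1.0 / b_std)
R_inv_sqrt = np.diag(1.0 / r_std)

# Scaled observation operator and its singular value decomposition.
H_scaled = R_inv_sqrt @ H @ B_sqrt
_, w, Vt = np.linalg.svd(H_scaled, full_matrices=True)

# Transformation into the abstract phase space: dx' = V^T B^{-1/2} dx.
T = Vt @ B_inv_sqrt
dx = rng.normal(size=n)      # some increment x - xb in model space
dx_prime = T @ dx

# Keep only the first m (signal-related) elements, zero the rest.
dx_prime_signal = dx_prime.copy()
dx_prime_signal[m:] = 0.0

# Back-transform to model space: dx = B^{1/2} V dx'.
dx_signal = B_sqrt @ Vt.T @ dx_prime_signal

# The observations cannot distinguish the filtered from the full increment.
print(np.allclose(H @ dx_signal, H @ dx))
```

In this toy setting the noise-related elements span the null space of the scaled operator exactly; any non-zero value the analysis assigns to them is not supported by the measurements.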

The first five rows (from top to bottom) of the matrix $V_R^T\cdot B^{-1/2}$ at the observation site, and for model layers 2 (left) and 22 (right). The y values are normalized by dividing them by the maximum element. The x axis indicates the aerosol components in model space to which the elements of the row vectors correspond, namely, sea salt (NaCl), EC, OC, and dust, each in four size bins, as well as the four SIA components: sulfates (SOx) other than (NH4)2SO4, ammonium sulfate (AS), ammonium nitrate (AN), and nitrates (NOx) other than NH4NO3.

Finally, we want to obtain a better understanding of how the aerosol components $x$ in model space, or their increments $\delta x$, are linked with the signal-related phase-space elements $\delta x'$. To this end we inspect the first five row vectors of the transformation matrix $V_R^T\cdot B^{-1/2}$ in Eq. (). The magnitude of these elements can be taken as a measure for how much each aerosol component of $\delta x$ in model space contributes to the signal-related elements of $\delta x'$. Figure shows $|(V_R^T\cdot B^{-1/2})_{ij}|$ for $i=1,\ldots,5$ and for $j=1,\ldots,20$, where 5 is the number of signal-related phase-space elements, and 20 is the number of aerosol components in model space. Results are shown for model layers 2 (left column) and 22 (right column), which correspond to altitudes of about 100 m and 6 km, respectively. The x axis shows sea salt (NaCl), EC, OC, and dust, each in four size bins, as well as the four SIA components, i.e. sulfates (SOx) other than (NH4)2SO4, ammonium sulfate (AS), ammonium nitrate (AN), and nitrates (NOx) other than NH4NO3.

Comparison of the two columns clearly demonstrates that the elements of the transformation matrix can vary considerably with vertical layer (or, more generally, with location). This is because the error covariance matrix B varies with location, and the matrix R varies from one observation site to another (in our case, from one altitude to another). Hence the matrix VR is also dependent on location – see Eq. (). Consequently, it is very difficult to draw general conclusions about which aerosol components make a dominant contribution to the signal-related phase-space variables; this can vary with location, and it can vary for different data sets.

However, in our case the SIA components consistently make a strong contribution to the first signal-related element δx1′. Since SIA is dominating the aerosol mass mixing ratio in this test case, the analysis was able to retrieve PM10. We also see that the dust components make only a weak contribution to most of the signal-related elements δxi′, especially to the first one. This is a likely explanation for the difficulties encountered in retrieving the dust mass mixing ratio. Sea salt is more complicated. Size bins 3 and 4 do contribute considerably to δx1′, and also to some of the other four increments, while size bins 1 and 2 do not make a significant contribution to most of the five signal-related control variables. In our case the sea salt mass is strongly dominated by the second size bin (not shown). This explains the difficulties we encountered in the retrieval of sea salt.

Summary and conclusions

We have quantified the information content of multiwavelength lidar measurements with regard to the chemical composition of aerosol particles. Different combinations of extinction and backscattering observations at several wavelengths have been investigated by determining the singular values of the scaled observation operator, by computing the number of signal degrees of freedom Ns, and by calculating the reduction in Shannon entropy H caused by taking measurements. We first quantified Ns and H as a function of observation standard deviation σo. The information content of the observations, as expressed by Ns and H, decreased as σo was increased. This became the more pronounced the larger the number of simultaneously observed parameters was.

The observation error depends not only on the measurement error, but also on the forward-model error. The latter depends on the uncertainties in the aerosol optics model. This highlights the importance of developing accurate aerosol optics models and of obtaining an accurate estimate of the observation error, especially of the uncertainty in the aerosol optics model. This is a prerequisite for extracting as much information as possible from the measurements, while avoiding the extraction of noise rather than signal. More often than not, computational limitations and lack of knowledge force us to introduce simplifying assumptions about the particles' morphologies. However, we know that aerosol optical properties can be highly sensitive to the shape, small-scale surface roughness, inhomogeneity, aggregation, irregularity, porosity, and combinations thereof. We need to know how much these sources of uncertainty contribute to the observation standard deviation. One way of estimating this is to compare aerosol optical properties computed with simple shape models to either measurements or to computations based on more realistic particle shape models – see for a recent review and a more detailed discussion.

The singular values of the scaled observation operator provide us with an abstract measure to compare the standard deviations of the background (prior) estimate to those of the observations. This measure is rather abstract because background and observation errors are, in general, in different spaces and cannot be directly compared. However, we constructed a mapping that transforms the state vector in physical (model) space to an abstract phase space in which the components of the state vector can be partitioned into signal-related and noise-related components. The singular values indicate to what extent the signal-related phase-space variables can be constrained by the measurements. We exploited this fact by constructing weak constraints in a 3DVAR data assimilation code, which limited the assimilation algorithm to acting on the signal-related phase-space variables only (hereafter referred to as the constrained analysis). The idea was to maximize the use of information, while avoiding the risk of assimilating noise by overusing the measurements. Thus, our main hypothesis was that the constrained analysis will yield less noisy results than the unconstrained analysis. Numerical tests confirmed this hypothesis. Notably in the case of elemental carbon (EC) and organic carbon (OC) the unconstrained analysis gave mixing ratios that oscillated considerably in the vertical direction. The constrained analysis results were considerably less noisy.

When mapped into observation space, the analysis result closely reproduced the measurements. When viewed in the abstract phase space, we found that the constrained analysis did, indeed, yield noise-related components that were close to zero, as they should be. This was not so in the unconstrained analysis. Also, the magnitude of the signal-related phase-space components was generally larger in the constrained analysis than in the unconstrained analysis. This confirms that the constraints we introduced work as intended.

In our specific test case secondary inorganic aerosol components were most faithfully retrieved by the inverse modelling solution, followed by organic and black carbon. Dust and sea salt mass mixing ratios were more challenging to retrieve. We could explain this by inspecting the linear coefficients in the transformation from physical space to the abstract phase space. We found that those aerosol components that had the largest weight in the transformation were most faithfully retrieved by the analysis. However, these linear coefficients depend on the background error covariances (which can change with location), and on the observation error variances. Therefore, it is difficult to draw general conclusions about which aerosol components are most easily retrieved by a given set of measurements.

The results presented here suggest further questions for future studies. We have performed this investigation with a mass transport model, thus focusing on the information content of optical measurements with respect to the chemical composition of aerosols. When we include aerosol microphysical processes, then the model delivers the aerosols' size distribution, as well as their size-resolved chemical composition. This makes the problem quite different from the one we investigated here. First, the dimension of the model space is considerably larger for an aerosol microphysics transport model. Constraining such a model with limited information from measurements becomes even more challenging than in the case of a mass transport model. On the other hand, an aerosol microphysics model delivers information on the particles' size distribution and mixing state. Therefore, this would require us to make fewer assumptions in the aerosol optics model, which may reduce the observation error. The present study could be extended to investigate the information contained in extinction and backscattering measurements for simultaneously constraining the chemical composition and the size of aerosol particles.

Another important issue concerns the choice of the aerosol optics model. In the present study we employed a simple homogeneous-sphere model in which all chemical components were assumed to be externally mixed. There is little one can put forward in defence of this model other than pure convenience. (Regarding the applicability of simplified model particles in atmospheric optics see the review by ). As a result of the external-mixture assumption, the observation operator is linear, which is a prerequisite for much of the theoretical foundations of this study – see Appendices B–D for details. However, it has been demonstrated that drastically simplifying assumptions, such as the external-mixture approximation, can give model results for aerosol optical properties that differ substantially from those obtained with more realistic nonlinear optics models . It would therefore be important to extend the present study to include more accurate and realistic optics models. A first step could be to analyse the degree of nonlinearity of optics models that account for internal mixing of different aerosol species. If they turn out to be only mildly nonlinear, then one can linearize them and work with the Jacobian of the nonlinear observation operator. Otherwise the theoretical methods employed in this paper would have to be extended in order to accommodate nonlinear observation operators.

The data used in this study are included in the Supplement.

Inverse problems

Suppose we have a system described by a set of variables $x_1,\ldots,x_n$, summarized in a vector $x$. Suppose also that we have an operator $\hat{H}:\mathbb{R}^n\to\mathbb{R}^m$, $x\mapsto y=\hat{H}(x)$ that allows us to compute a set of variables $y_1,\ldots,y_m$, summarized in a vector $y$. To take a specific example, we may think of $x$ as a vector of mass mixing ratios of chemical aerosol species, $y$ as a set of aerosol optical properties, and $\hat{H}$ as an aerosol optics model. The operator $\hat{H}$ maps from model space into observation space, which allows us to compare model output and observations. We consider the following two problems:

Direct problem: given $x$ and $\hat{H}$, calculate $y=\hat{H}(x)$.

Inverse problem: given $y$ and $\hat{H}$, solve $y=\hat{H}(x)$ for $x$.

A pair of such problems is inverse to each other; it is, therefore, somewhat arbitrary which problem we choose to call the direct problem, and which one we call the inverse problem. However, one of the problems is usually well posed, while the other one is ill-posed. Such is also the case in aerosol optics modelling. It is customary to call the well-posed problem the direct problem, and the ill-posed one the inverse problem.

An equation $y=\hat{H}(x)$ is called well posed if it has the following properties:

Existence: for every $y\in\mathbb{R}^m$, there is at least one $x\in\mathbb{R}^n$ for which $y=\hat{H}(x)$.

Uniqueness: for every $y\in\mathbb{R}^m$, there is at most one $x\in\mathbb{R}^n$ for which $y=\hat{H}(x)$.

Stability: the solution $x$ depends continuously on $y$.

If any of these properties is not fulfilled, then the problem is called ill-posed.
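A failure of the uniqueness property is easy to demonstrate for a linear operator with fewer observations than unknowns. The following minimal Python sketch uses a made-up 2×3 matrix as a stand-in for the observation operator (it has no physical meaning and is not an optics model). Two different state vectors produce exactly the same observations, so the inverse problem cannot be solved uniquely without additional information.

```python
def apply(H, x):
    """Apply a linear operator, given as a list of rows, to a vector."""
    return [sum(h * xi for h, xi in zip(row, x)) for row in H]

# Hypothetical linear operator mapping 3 unknowns to 2 observations.
H = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]

x1 = [1.0, 2.0, 3.0]
null_vector = [1.0, -1.0, 1.0]   # H applied to this vector gives zero

# A second state that differs from x1 by a multiple of the null vector.
x2 = [a + 2.0 * b for a, b in zip(x1, null_vector)]

y1 = apply(H, x1)
y2 = apply(H, x2)
print(x1, "->", y1)
print(x2, "->", y2)
```

Both states map to the same observation vector, which is the situation an assimilation algorithm faces when the number of observations is much smaller than the number of model variables.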

Three-dimensional variational data assimilation

Data assimilation is usually employed for constraining models by use of measurements, but it can also be used to solve inverse problems. Here we focus on one specific data assimilation method known as three-dimensional variational data assimilation, or 3DVAR.

In a CTM we discretize the geographic domain of interest into a three-dimensional grid. In each grid cell, the aerosol particles are characterized by the mass mixing ratio of each chemical component in the aerosol phase, such as sulfate, nitrate, ammonium, mineral dust, black carbon, organic carbon, and sea salt. Suppose we summarize all these mass mixing ratios from all grid cells into one large vector x∈Rn. The model provides us with a first guess of the atmospheric aerosol state, known as a background estimate xb.

In the remote sensing and inverse modelling community, the background estimate is more commonly referred to as the a priori estimate.

Suppose also that we have $m$ observations, which we summarize in a vector $y\in\mathbb{R}^m$. We further have an observation operator $\hat{H}:\mathbb{R}^n\to\mathbb{R}^m$, $x\mapsto\hat{H}(x)$ that maps the state vector $x$ from model space to observation space.

The optics model $\hat{H}$ usually has to invoke assumptions about physical aerosol properties that are relevant for the optical properties, but not provided by the CTM output, e.g. assumptions about the morphology of the particles. If the CTM is a simple mass-transport model without aerosol microphysics, then it is also necessary to invoke assumptions about the size distribution of the aerosols.

We further denote by $x_t$ the true state of the atmosphere, by $\epsilon_b=x_t-x_b$ the error of the background estimate, and by $\epsilon_o=\hat{H}(x_t)-y$ the observation error.

We stress, once more, that the observation error must not be confused with the measurement error ϵm. The latter contributes to the former, but the observation error also contains other sources of error. For instance, if we deal with morphologically complex particles, but our lack of knowledge forces us to make assumptions and invoke approximations about the particle shapes, then this forward-model error ϵf contributes to the observation error. The same is the case if we lack information about the particles' size distribution. In operational applications the representativity error ϵr can also make a substantial contribution to ϵo.

The background and observation errors are assumed to be unbiased and uncorrelated with each other. Then their joint probability distribution becomes separable, i.e. $P(\epsilon_b,\epsilon_o)=P_b(\epsilon_b)\,P_o(\epsilon_o)$.

The true state of the atmosphere is, of course, unknown. Therefore, our definition of the errors and their probability distribution is only of conceptual use, but not of any practical value. However, we can reinterpret the probability distributions by replacing $\epsilon_b$ in the argument of $P_b$ with $x-x_b$, and by replacing $\epsilon_o$ in the argument of $P_o$ with $\hat{H}(x)-y$. We further assume that both the background and the observation errors are normally distributed. Thus we may write

$$P_b(x)=(2\pi|B|)^{-1/2}\exp\left[-\frac{1}{2}(x-x_b)^T\cdot B^{-1}\cdot(x-x_b)\right],$$

$$P_o(x)=(2\pi|R|)^{-1/2}\exp\left[-\frac{1}{2}(\hat{H}(x)-y)^T\cdot R^{-1}\cdot(\hat{H}(x)-y)\right].$$

Here $B$ and $R$ denote the covariance matrices of the background and observation errors, respectively, and $|\cdot|$ denotes the matrix determinant. In this form, $P_b(x)$ represents the probability that the atmospheric aerosol particles are found in state $x$, given a background estimate $x_b$ with error covariance matrix $B$. Similarly, $P_o(x)$ is the probability that the system is found in state $x$, given measurements $y$ with error covariances $R$.

The observation errors are often assumed to be uncorrelated (this is not always true). In such a case the matrix R is diagonal, where the diagonal elements are the observation error variances.

Equations ()–() can be summarized in the form

$$P(x)=\frac{1}{2\pi(|B|\cdot|R|)^{1/2}}\exp\left(-J(x)\right),$$

$$J(x)=\frac{1}{2}\left[(x-x_b)^T\cdot B^{-1}\cdot(x-x_b)+(\hat{H}(x)-y)^T\cdot R^{-1}\cdot(\hat{H}(x)-y)\right],$$

where $J$ is suggestively called the cost function, since it can be interpreted as a measure for how “costly” it is for a state $x$ to simultaneously deviate from the background estimate and the measurements within the permitted error bounds. The deviations are weighted with the inverse error covariance matrices. For instance, this means that for measurements with a small error variance, a deviation $\hat{H}(x)-y$ becomes “more costly”.

We are interested in the most probable aerosol state of the atmosphere, i.e. in that state xa for which the probability distribution attains its maximum. This is the case when the cost function J in the argument of the exponential in Eq. () assumes a minimum. Thus we seek to minimize the cost function J. The variational method is based on computing the gradient of the cost function, ∇xJ, and using it in a descent algorithm to iteratively search for the minimum of J.

In practice it is common to introduce the variable $\delta x=x-x_b$, and use the first-order Taylor expansion of the observation operator, $\hat{H}(x)=\hat{H}(x_b)+H\cdot\delta x$, where the $(m\times n)$ matrix $H$ denotes the Jacobian of $\hat{H}$ at $x=x_b$. If $\hat{H}$ is only mildly nonlinear, and if the components of $\delta x$ are sufficiently small, then we can substitute this first-order approximation into Eq. (), which yields

$$J=J_b+J_o,$$

$$J_b(\delta x)=\frac{1}{2}\,\delta x^T\cdot B^{-1}\cdot\delta x,$$

$$J_o(\delta x)=\frac{1}{2}\left[\hat{H}(x_b)+H\cdot\delta x-y\right]^T\cdot R^{-1}\cdot\left[\hat{H}(x_b)+H\cdot\delta x-y\right].$$

The components of the vector $\delta x$ are the control variables that are iteratively varied by the algorithm until the minimum of the cost function is found.
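The incremental cost function and its gradient can be written down in a few lines. The sketch below is a toy illustration in Python with NumPy, not the 3DVAR code used in this study. It assumes symmetric covariance matrices, random toy inputs, and uses the hypothetical symbol `d` for the innovation, i.e. the observations minus the observation-equivalent of the background. The analytic gradient that a descent algorithm would use is checked against central finite differences.

```python
import numpy as np

def cost(dx, B_inv, R_inv, H, d):
    """Incremental cost J = Jb + Jo, with d = y - H^(xb)."""
    r = H @ dx - d                      # equals H^(xb) + H dx - y
    return 0.5 * dx @ B_inv @ dx + 0.5 * r @ R_inv @ r

def grad(dx, B_inv, R_inv, H, d):
    """Gradient of J with respect to dx (B_inv and R_inv symmetric)."""
    return B_inv @ dx + H.T @ R_inv @ (H @ dx - d)

# Toy inputs (assumed values, not taken from the study).
rng = np.random.default_rng(1)
n, m = 5, 3
H = rng.normal(size=(m, n))
B_inv = np.diag(1.0 / rng.uniform(0.5, 2.0, size=n) ** 2)
R_inv = np.diag(np.full(m, 1.0 / 0.3**2))
d = rng.normal(size=m)
dx = rng.normal(size=n)

# Central finite differences along each coordinate direction.
eps = 1e-6
fd = np.array([
    (cost(dx + eps * e, B_inv, R_inv, H, d)
     - cost(dx - eps * e, B_inv, R_inv, H, d)) / (2.0 * eps)
    for e in np.eye(n)
])
print(np.allclose(fd, grad(dx, B_inv, R_inv, H, d), rtol=1e-5, atol=1e-5))
```

Because the linearized cost function is quadratic in the control variables, the gradient is linear in them, which is what makes iterative descent methods well suited to this problem.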

The solution to the equation $\nabla_x J=0_n$ is a solution to the inverse problem (where $0_n$ denotes the null vector in $n$-dimensional model space); we input the observations $y$ into the algorithm, and as output we obtain a result in model space that is consistent with the measurements (within the given error bounds).

By solving the equation $\nabla J|_{x=x_a}=0_n$ for the analysed state $x_a$ it can be shown that the solution to the inverse problem is given by $x_a=x_b+K\cdot(y-\hat{H}(x_b))$, where $K=B\cdot H^T\cdot(H\cdot B\cdot H^T+R)^{-1}$ is known as the gain matrix. This illustrates that the analysis updates the background estimate $x_b$ by mapping the increment $(y-\hat{H}(x_b))$ from observation space to model space by use of the gain matrix. The correlations among the model variables enter into the gain matrix through the matrix $B$. In our case the vertical correlations are rather weak in comparison to correlations among different aerosol species.
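The closed-form solution can be verified numerically for a linear observation operator. The following sketch uses random toy matrices (assumed values only) and treats the observation operator as exactly linear, so that its action on the background is a plain matrix product. It computes the gain matrix, forms the analysed state, and checks that the gradient of the cost function vanishes there.

```python
import numpy as np

# Toy inputs (assumed values, not taken from the study).
rng = np.random.default_rng(2)
n, m = 5, 3
H = rng.normal(size=(m, n))                        # linear observation operator
B = np.diag(rng.uniform(0.5, 2.0, size=n) ** 2)    # background error covariances
R = np.diag(np.full(m, 0.3**2))                    # observation error covariances
xb = rng.normal(size=n)                            # background estimate
y = rng.normal(size=m)                             # observations

# Gain matrix K = B H^T (H B H^T + R)^{-1}.
# solve() avoids forming the inverse; the transpose uses the symmetry
# of B and of (H B H^T + R).
S = H @ B @ H.T + R
K = np.linalg.solve(S, H @ B).T

# Analysed state.
xa = xb + K @ (y - H @ xb)

# Gradient of the cost function at xa: B^{-1}(xa - xb) + H^T R^{-1}(H xa - y).
g = np.linalg.solve(B, xa - xb) + H.T @ np.linalg.solve(R, H @ xa - y)
print(np.abs(g).max())   # should be zero up to rounding errors
```

In an operational setting the matrices are far too large for a direct solve, which is one reason why the minimum is searched for iteratively instead.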

What if the measurements contain insufficient information about the state x? The algorithm will still provide an answer to the inverse problem, but the missing information will be supplemented by the background estimate xb. The weighting of the two pieces of information, xb and y, is controlled by the respective error covariance matrices. Thus data assimilation is a statistical approach, which can be expected to give good results on average, but not in every single time step of the model run. This can become highly problematic if we only have very few observations, i.e. m≪n, where n is the dimension of the model space. If we allow all model variables to be freely adjusted by the assimilation algorithm in such a severely under-constrained case, then the algorithm may just assimilate noise from the measurements rather than signal, resulting in unreasonable solutions to the inverse problem. To avoid such problems, one needs to systematically analyse the information content of the observations and constrain the assimilation algorithm to only operate on the signal degrees of freedom.

Information content of measurements

Our ultimate goal is to formulate the data assimilation problem in such a way that the information contained in the measurements is fully exploited, but not overused. To this end, we first need to know how many independent quantities can be determined from a specific set of measurements. We investigate this question by borrowing ideas from retrieval and information theory – see for more detailed explanations.

The main idea is to compare the variances of the model variables to those of the observations. Only those model variables whose variance is larger than that of the observations can be constrained by measurements. However, actually making such a comparison poses two problems. The first problem is that one cannot readily compare error covariance matrices. The second problem is that model variables and measurements are in different spaces. We first address the second problem.

When we account for observation errors $\epsilon_o$, the basic relation between model variables and observations is, to first order,

$$y = \hat{H}(x_b) + H\cdot\delta x + \epsilon_o.$$

The error covariance matrices are given by the expectation values $B = \langle\delta x\cdot\delta x^T\rangle$ and $R = \langle\epsilon_o\cdot\epsilon_o^T\rangle$, where the dot denotes a dyadic product.

The expectation value of a discrete variable $a$ that assumes values $a_1, a_2, \dots, a_n$ with corresponding probabilities $p_1, p_2, \dots, p_n$ is given by $\langle a\rangle = \sum_{i=1}^{n} p_i a_i$.

From Eq. () we see that the covariance matrix of $\delta y = y - \hat{H}(x_b)$ is given by

$$\langle\delta y\cdot\delta y^T\rangle = H\cdot B\cdot H^T + R,$$

where we assumed that background and observation errors are uncorrelated. This last equation suggests that we can compare model and observation errors in the same space by transforming the background error covariance matrix from the space of $(n\times n)$ matrices to the space of $(m\times m)$ matrices, namely $H\cdot B\cdot H^T$.

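To address the first problem, we diagonalize the covariance matrices by making the following change of variables:

$$\delta\tilde{x} = B^{-1/2}\cdot\delta x,$$

$$\delta\tilde{y} = R^{-1/2}\cdot\left(y - \hat{H}(x_b)\right),$$

$$\tilde{H} = R^{-1/2}\cdot H\cdot B^{1/2}.$$

Here $B^{1/2}$ denotes the positive square root of the matrix $B$, and $B^{-1/2}$ denotes its inverse. (A matrix $A$ is called a square root of a matrix $B$ if $A^T\cdot A = B$. The positive square root of $B$, which is denoted by $B^{1/2}$, has the property $x^T\cdot B^{1/2}\cdot x \geq 0$ for all $x$. If $B$ is itself positive and symmetric, as is the case for covariance matrices, then the positive square root exists and is unique.) The scaled observation operator $\tilde{H}$ is sometimes referred to as the observability matrix. In the new basis, the cost function in Eqs. ()–() becomes

$$J = \tfrac{1}{2}\,\delta\tilde{x}^T\cdot\delta\tilde{x} + \tfrac{1}{2}\,\left(\tilde{H}\cdot\delta\tilde{x} - \delta\tilde{y}\right)^T\cdot\left(\tilde{H}\cdot\delta\tilde{x} - \delta\tilde{y}\right).$$

The covariance matrices are now unit matrices. This can also be seen by considering the transformed errors, e.g. $\tilde{\epsilon}_o = R^{-1/2}\cdot\epsilon_o$, and computing $\langle\tilde{\epsilon}_o\cdot\tilde{\epsilon}_o^T\rangle = R^{-1/2}\cdot\langle\epsilon_o\cdot\epsilon_o^T\rangle\cdot R^{-1/2} = 1_{m\times m}$, since $\langle\epsilon_o\cdot\epsilon_o^T\rangle = R$. (Here, $1_{m\times m}$ denotes the unit matrix in m-dimensional observation space.) Similarly, we find $\langle\delta\tilde{x}\cdot\delta\tilde{x}^T\rangle = 1_{n\times n}$. The covariance matrix of the transformed measurement vector $\delta\tilde{y}$ is given by

$$\langle\delta\tilde{y}\cdot\delta\tilde{y}^T\rangle = \tilde{H}\cdot\tilde{H}^T + 1_{m\times m}.$$

The first term is the model error covariance term transformed into observation space, while the second term (the unit matrix) is the diagonalized observation error covariance matrix.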

We are still not in a position to make a meaningful comparison of model and observation errors, since the first term, $\tilde{H}\cdot\tilde{H}^T$, is still not diagonal. To make it so we need to perform one more transformation. To this end, we consider the singular value decomposition of the matrix $\tilde{H}$:

$$\tilde{H} = R^{-1/2}\cdot H\cdot B^{1/2} = V_L\cdot W\cdot V_R^T.$$

Here $\tilde{H}$ is an $(m\times n)$ matrix, the matrix of the left-singular vectors $V_L$ is an $(m\times m)$ matrix, the matrix $V_R$ containing the right-singular vectors is an $(n\times n)$ matrix, and the $(m\times n)$ matrix $W$ consists of two blocks. If $m<n$, then the left block of $W$ is an $(m\times m)$ diagonal matrix containing the m singular values $w_1,\dots,w_m$ on the diagonal; the right block is an $(m\times(n-m))$ null matrix. Similarly, if $m>n$, then the upper block of $W$ is an $(n\times n)$ diagonal matrix containing the n singular values on the diagonal, while the lower block is an $((m-n)\times n)$ null matrix.

We now make another change of variables:

$$\delta x' = V_R^T\cdot\delta\tilde{x},$$

$$\delta y' = V_L^T\cdot\delta\tilde{y},$$

$$H' = V_L^T\cdot\tilde{H}\cdot V_R.$$

The matrices $V_L$ and $V_R$ are orthogonal, i.e. $V_L^T\cdot V_L = 1_{m\times m}$, and similarly for $V_R$. Thus, substitution of Eqs. ()–() into () yields

$$J = \tfrac{1}{2}\,\delta x'^T\cdot\delta x' + \tfrac{1}{2}\,\left(H'\cdot\delta x' - \delta y'\right)^T\cdot\left(H'\cdot\delta x' - \delta y'\right).$$

Evidently, the transformation given in Eqs. ()–() preserves the diagonality of the background and observation error covariance matrices. What about the covariance matrix $\langle\delta y'\cdot\delta y'^T\rangle$ in the new basis? Using $\epsilon_o' = V_L^T\cdot\tilde{\epsilon}_o = V_L^T\cdot R^{-1/2}\cdot\epsilon_o$, as well as Eqs. (), ()–(), and ()–(), we obtain

$$\langle\delta y'\cdot\delta y'^T\rangle = H'\cdot H'^T + 1_{m\times m}.$$

The contribution of the background error covariances in this coordinate system is $H'\cdot H'^T$, which is a diagonal matrix. This becomes clear from Eqs. () and (), which yield $H'\cdot H'^T = W\cdot W^T$, an $(m\times m)$ diagonal matrix. Thus in this coordinate system we can readily compare the diagonal elements of the transformed background error covariance matrix $H'\cdot H'^T$ to the diagonal (unit) elements of the observation error covariance matrix $1_{m\times m}$. Roughly, those singular values $w_i$ on the diagonal of $W$ that are larger than unity correspond to model variables $\delta x_i'$ that can be controlled by the measurements. Those singular values smaller than unity correspond to model variables that are only related to noise.
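The chain of transformations above can be checked numerically on a small example. The sketch below is illustrative only: the helper names `sym_sqrt` and `observability_svd` and the toy matrices are assumptions, and the positive square root is computed by eigendecomposition, which is not necessarily how an operational code would proceed.

```python
import numpy as np

def sym_sqrt(A):
    """Positive square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.sqrt(vals)) @ vecs.T

def observability_svd(H, B, R):
    """Return (H_tilde, V_L, w, V_R) with H_tilde = R^-1/2 H B^1/2 = V_L W V_R^T."""
    H_tilde = np.linalg.inv(sym_sqrt(R)) @ H @ sym_sqrt(B)
    VL, w, VRt = np.linalg.svd(H_tilde, full_matrices=True)
    return H_tilde, VL, w, VRt.T

# Hypothetical toy problem: n = 3 model variables, m = 2 observations.
B = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
H = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])
R = np.diag([0.25, 4.0])   # accurate first observation, noisy second one

H_tilde, VL, w, VR = observability_svd(H, B, R)

H_prime = VL.T @ H_tilde @ VR        # equals W, an (m x n) matrix
cov_bg_obs = H_prime @ H_prime.T     # background term in observation space
n_signal = int(np.sum(w > 1.0))      # singular values exceeding unity
```

For these toy matrices the two singular values are 4 and 0.5, so only one of the three transformed model variables would be regarded as signal-related under the rough criterion stated above.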

In the above discussion we relied on plausibility arguments. We mention that there are more systematic ways of approaching the problem. Here we merely state some key results without going into details. The interested reader is referred to chapter 2 in . However, in all approaches the main quantities of interest are always the singular values of the observability matrix $R^{-1/2}\cdot H\cdot B^{1/2}$.

One can compute the number of signal degrees of freedom $N_s$ from the expectation value of $J_b$ in Eq. (). The result can be expressed in terms of the singular values $w_i$ of the observability matrix:

$$N_s = \sum_{i=1}^{\min\{m,n\}} \frac{w_i^2}{1 + w_i^2},$$

where n is the dimension of model space, and m is the dimension of observation space.

Another approach is based on information theory. Given a system described by a probability distribution function $P(x)$, one defines the Shannon entropy

$$S(P) = -\int P(x)\,\log_2\frac{P(x)}{P_0(x)}\,\mathrm{d}x,$$

where $P_0$ is a normalization factor needed to make the argument of the logarithm dimensionless. A decrease in entropy expresses an increase in our knowledge of the system. For instance, if we initially describe the system by $P_i(x)$, and, after taking measurements, by $P_f(x)$, then the measurement process has changed the entropy by an amount $H = S(P_i) - S(P_f)$. In our case, we assume that all errors are normally distributed. In that case, one can show that

$$H = \frac{1}{2}\sum_{i=1}^{\min\{m,n\}} \log_2\left(1 + w_i^2\right).$$

$H$ can be interpreted as a measure for the information content of a set of measurements.
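Both measures are simple functions of the singular values, as the following sketch illustrates. The function names and the example singular values are hypothetical. The second call assumes that all observation error variances are multiplied by four, which halves every singular value because the observability matrix scales with $R^{-1/2}$.

```python
import math

def signal_degrees_of_freedom(w):
    """N_s = sum_i w_i^2 / (1 + w_i^2)."""
    return sum(wi**2 / (1.0 + wi**2) for wi in w)

def entropy_reduction_bits(w):
    """H = 1/2 sum_i log2(1 + w_i^2), in bits (Gaussian errors assumed)."""
    return 0.5 * sum(math.log2(1.0 + wi**2) for wi in w)

# Hypothetical singular values of an observability matrix.
w = [3.0, 1.0, 0.0]
Ns = signal_degrees_of_freedom(w)
Hbits = entropy_reduction_bits(w)

# Four times larger observation error variances: every w_i is halved.
w_noisy = [0.5 * wi for wi in w]
Ns_noisy = signal_degrees_of_freedom(w_noisy)
Hbits_noisy = entropy_reduction_bits(w_noisy)
```

A singular value of exactly one contributes half a signal degree of freedom and half a bit, which reflects the absence of a sharp boundary between signal and noise.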

Our findings so far suggest a general strategy for how to optimize the amount of information that can be extracted from measurements. First, we need to compute the singular value decomposition in Eq. (), as well as the transformation given in Eqs. () and (), which we can summarize as

$$\delta x' = V_R^T\cdot B^{-1/2}\cdot\delta x.$$

Then we want to formulate the minimization of the cost function in such a way that only those components of $\delta x'$ are adjusted by the assimilation algorithm that correspond to the largest singular values of the matrix $W$ in Eq. (). All other elements of $\delta x'$ should be left alone. In other words, we want to constrain the minimization of the cost function to the subspace of the signal degrees of freedom of the state vector. Thus, in order to implement this idea, we first need to discuss how to incorporate constraints into the theory.

Minimization of the cost function with constraints

In the minimization of the cost function all elements of the control vector δx are independently adjusted until the minimum of J is found. This may not be a prudent approach if the information contained in the observations is insufficient to constrain all model variables. In such a case one should introduce constraints that reduce the number of independent control variables. However, this needs to be done in a clever way; the goal is to neither underuse the measurements (thus wasting available information), nor to overuse them (thus assimilating noise).

For reasons we will explain later we formulate the constraints as weak conditions. However, for didactic reasons as well as for the sake of completeness, we will also mention how to formulate constraints as strong conditions.

Minimization of the cost function with strong constraints

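Given k constraints in the form $g_i(\delta x) = 0$, $i = 1,\dots,k$, the most general way of finding the minimum of $J(\delta x)$ under the constraints $g_i$ is the method of Lagrange multipliers. More specifically, one introduces k Lagrange multipliers $\lambda_1,\dots,\lambda_k$ and defines the function

$$L(\delta x_1,\dots,\delta x_n,\lambda_1,\dots,\lambda_k) = J(\delta x_1,\dots,\delta x_n) + \sum_{i=1}^{k}\lambda_i\,g_i(\delta x_1,\dots,\delta x_n);$$

then one solves the minimization problem

$$\nabla L(\delta x_1,\dots,\delta x_n,\lambda_1,\dots,\lambda_k) = 0_{n+k},$$

where $\nabla = \nabla_{\delta x_1,\dots,\delta x_n,\lambda_1,\dots,\lambda_k}$ is now an (n+k)-dimensional gradient operator, and where $0_{n+k}$ denotes the null vector in an (n+k)-dimensional space. Note that in this general formulation of the problem the constraints can even be nonlinear. We are specifically interested in linear constraints, which can be expressed in the form $G\cdot\delta x = 0_k$. Then the constrained minimization problem becomes

$$L(\delta x,\lambda) = J(\delta x) + \lambda^T\cdot G\cdot\delta x,$$

$$\nabla_{\delta x,\lambda}L(\delta x,\lambda) = \begin{pmatrix}\nabla_{\delta x}J(\delta x) + G^T\cdot\lambda \\ G\cdot\delta x\end{pmatrix} = 0_{n+k}.$$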

Compared to the unconstrained minimization problem, the introduction of k constraints has increased the dimension of the problem from n to n+k. Naively, one may have expected that the dimension would, on the contrary, be reduced to n-k. This is indeed the case if the constraints are linear, and if the function J is quadratic, as is the case in Eqs. ()–(). To see this, let us first write those equations more concisely in the form

$$J = \tfrac{1}{2}\left[\delta x^T\cdot Q_1\cdot\delta x + Q_2^T\cdot\delta x + \delta x^T\cdot Q_2 + Q_3\right],$$

$$Q_1 = B^{-1} + H^T\cdot R^{-1}\cdot H,$$

$$Q_2 = H^T\cdot R^{-1}\cdot\left(\hat{H}(x_b) - y\right),$$

$$Q_3 = \left(\hat{H}(x_b) - y\right)^T\cdot R^{-1}\cdot\left(\hat{H}(x_b) - y\right).$$

Note that the covariance matrices and their inverses are symmetric (i.e. $R^T = R$, etc.). The unconstrained minimization problem requires us to solve the equation

$$\nabla_{\delta x}J = Q_1\cdot\delta x + Q_2 = 0_n.$$

Now we want to minimize the cost function subject to the linear constraints $G\cdot\delta x = 0_k$, where $G$ is a $(k\times n)$ matrix, $\delta x$ is an n-vector, and $0_k$ is the null vector in $\mathbb{R}^k$. Let us denote the kernel of $G$ by ker(G). (The kernel or null space of a matrix $G$ is the set of all vectors $z$ such that $G\cdot z = 0$. The kernel is a subspace of the full vector space $\mathbb{R}^n$; if the k constraints are linearly independent, then dim ker(G) = n-k.) Let further $z_1,\dots,z_{n-k}$ denote a basis of ker(G). We define the $(n\times(n-k))$ matrix $Z = (z_1 \cdots z_{n-k})$, the column vectors of which are just the basis vectors of ker(G). Obviously, $G\cdot Z = 0_{k\times(n-k)}$, where $0_{k\times(n-k)}$ denotes the $(k\times(n-k))$ null matrix. If $\delta x$ is a vector in $\mathbb{R}^n$ for which there exists a vector $\xi\in\mathbb{R}^{n-k}$ such that $Z\cdot\xi = \delta x$, then we automatically have $G\cdot\delta x = 0_k$, i.e. $\delta x$ satisfies the linear constraints. Thus we can formulate the constrained minimization problem by substitution of $\delta x = Z\cdot\xi$ into Eq. (), which yields

$$J = \tfrac{1}{2}\left[\xi^T\cdot Z^T\cdot Q_1\cdot Z\cdot\xi + Q_2^T\cdot Z\cdot\xi + \xi^T\cdot Z^T\cdot Q_2 + Q_3\right],$$

$$0_{n-k} = \nabla_{\xi}J = Z^T\cdot Q_1\cdot Z\cdot\xi + Z^T\cdot Q_2.$$

Thus we have reduced the (n+k)-dimensional constrained minimization problem given in Eq. () to a problem consisting of the following two steps:

1. Determine a basis of the null space ker(G); this yields the matrix $Z$.

2. Solve the unconstrained (n-k)-dimensional optimization problem given in Eq. (). From the (n-k)-vector $\xi$ that minimizes the cost function in Eq. (), we then obtain the solution $\delta x = Z\cdot\xi$ that minimizes the cost function in Eq. () subject to the constraint Eq. ().
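The two steps can be sketched as follows for a small quadratic cost function. This is an illustration under stated assumptions rather than the implementation used here: the null-space basis is taken from an SVD of G, the function names and toy numbers are hypothetical, and the result is compared against a direct solution of the (n+k)-dimensional Lagrange system.

```python
import numpy as np

def null_space_basis(G, tol=1e-12):
    """Step 1: orthonormal basis Z of ker(G) from the right-singular vectors of G."""
    _, s, Vt = np.linalg.svd(G, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    return Vt[rank:].T                      # (n x (n - rank)) matrix

def minimize_strong(Q1, Q2, G):
    """Step 2: solve Z^T Q1 Z xi = -Z^T Q2 and return dx = Z xi."""
    Z = null_space_basis(G)
    xi = np.linalg.solve(Z.T @ Q1 @ Z, -Z.T @ Q2)
    return Z @ xi

# Hypothetical toy problem: B = 1, H = (1 1 1), R = 1, H^(xb) - y = -3.
n, k = 3, 1
Q1 = np.eye(n) + np.ones((n, n))            # B^-1 + H^T R^-1 H
Q2 = -3.0 * np.ones(n)                      # H^T R^-1 (H^(xb) - y)
G = np.array([[0.0, 0.0, 1.0]])             # constraint: dx_3 = 0

dx_unconstrained = np.linalg.solve(Q1, -Q2)
dx_strong = minimize_strong(Q1, Q2, G)

# Reference: the (n+k)-dimensional Lagrange-multiplier system.
kkt = np.block([[Q1, G.T], [G, np.zeros((k, k))]])
rhs = np.concatenate([-Q2, np.zeros(k)])
dx_lagrange = np.linalg.solve(kkt, rhs)[:n]
```

Both routes give the same constrained increment, but the null-space route only requires the solution of an (n-k)-dimensional linear system.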

Minimization of the cost function with weak constraints

In the approach described in the previous section the solution satisfies the constraints exactly. Therefore, this approach is known as the minimization of the cost function with strong constraints. In the weak-constraint approach the constraints only need to be satisfied within specified error bounds.

The formulation of the weak-constraint approach is conceptually quite simple. One incorporates the constraints by adding an extra term to the cost function Eq. (), i.e.

$$J = J_b + J_o + J_G,$$

$$J_G = \tfrac{1}{2}\,\delta x^T\cdot G^T\cdot B_G^{-1}\cdot G\cdot\delta x,$$

which also gives an extra term in the gradient of the cost function,

$$\nabla_{\delta x}J_G = G^T\cdot B_G^{-1}\cdot G\cdot\delta x.$$

We will assume that the matrix $B_G = \mathrm{diag}(\sigma_1^G,\dots,\sigma_k^G)$ is diagonal, where k is the number of constraints. The “error variances” $\sigma_i^G$ along the diagonal of $B_G$ allow us to fine-tune the influence of each constraint on the solution. If $\sigma_i^G$ is small, then the ith constraint is relatively strong, and vice versa. Typically, if the $\sigma_i^G$ are made too large, then there is a risk that the minimization algorithm ignores the constraints altogether. In that case the solution will be very similar to the unconstrained solution. On the other hand, if the $\sigma_i^G$ are made too small, then $J_G$ can make the dominant contribution to $J$. In that case, there is a risk that the minimization routine largely ignores the observations and returns a solution that lies quite close to the background estimate.
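The two limiting regimes described above can be illustrated with a small quadratic example. The sketch assumes the linearized cost function, whose gradient is $Q_1\cdot\delta x + Q_2$ with $Q_1 = B^{-1} + H^T\cdot R^{-1}\cdot H$ and $Q_2 = H^T\cdot R^{-1}\cdot(\hat{H}(x_b) - y)$, so that the minimum of $J + J_G$ follows from a single linear solve. The function name `minimize_weak` and all numbers are hypothetical.

```python
import numpy as np

def minimize_weak(Q1, Q2, G, sigma):
    """Minimize J + J_G, where J_G = 1/2 dx^T G^T B_G^-1 G dx and B_G = diag(sigma)."""
    BG_inv = np.diag(1.0 / np.asarray(sigma, dtype=float))
    # Setting the gradient Q1 dx + Q2 + G^T B_G^-1 G dx to zero gives:
    return np.linalg.solve(Q1 + G.T @ BG_inv @ G, -Q2)

# Hypothetical toy problem: B = 1, H = (1 1 1), R = 1, H^(xb) - y = -3.
n = 3
Q1 = np.eye(n) + np.ones((n, n))
Q2 = -3.0 * np.ones(n)
G = np.array([[0.0, 0.0, 1.0]])             # weakly constrain dx_3 towards 0

dx_loose = minimize_weak(Q1, Q2, G, [1e8])  # constraint practically ignored
dx_mid = minimize_weak(Q1, Q2, G, [1.0])    # intermediate regime
dx_tight = minimize_weak(Q1, Q2, G, [1e-8]) # practically a strong constraint

# If every direction is tightly constrained, the increment collapses to zero,
# i.e. the analysis stays at the background estimate.
dx_background = minimize_weak(Q1, Q2, np.eye(n), 1e-8 * np.ones(n))
```

With a single constrained component, the tight limit reproduces the strong-constraint solution; the collapse onto the background only occurs once all components are tightly constrained.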

Constraints designed for making optimum use of the information contained in the observations

We now want to incorporate the results of Appendix into the variational data assimilation method. More specifically, we want to formulate weak constraints, Eq. (), based on the singular values of the observation operator in Eq. (). To this end, we make the change of variables given in Eq. (). We assume, without loss of generality, that the first $\ell$ singular values are greater than unity. Thus we only want to use the corresponding components $\delta x_1',\dots,\delta x_\ell'$ as independent control variables in the 3DVAR algorithm, while the remaining components remain unchanged, at least approximately, within specified error bounds. If we were to formulate this requirement as a strong constraint, as in Eq. (), then it would take the form

$$\delta x' = V_R^T\cdot B^{-1/2}\cdot\delta x = \left(\delta x_1',\dots,\delta x_\ell',0,\dots,0\right)^T.$$

Thus the matrix expressing the constraints is given by $G = V_R^T\cdot B^{-1/2}$, which is an $(n\times n)$ matrix.

The weak-constraint approach is, arguably, more suitable in our case. We have, in the preceding text, frequently used the terms signal degrees of freedom and noise degrees of freedom. Although it was conceptually useful to make this distinction, it is important to stress that there is no sharp boundary between the two. Rather, there is a smooth transition from singular values $w_1 > w_2 > \dots > w_\ell \geq 1$ to singular values $1 > w_{\ell+1} > w_{\ell+2} > \dots > w_K$ ($K = \min\{n,m\}$). For this reason we choose to formulate the constraints as weak constraints. This allows us to make a smooth transition from free to constrained control variables, where the transition from one regime to the other can be controlled by the singular values.

In order to apply the weak-constraint approach, we need to substitute the constraint matrix $G = V_R^T\cdot B^{-1/2}$ into Eq. (), which yields

$$J_G = \tfrac{1}{2}\,\delta x^T\cdot B^{-1/2}\cdot V_R\cdot B_G^{-1}\cdot V_R^T\cdot B^{-1/2}\cdot\delta x,$$

where $B_G$ is an $(n\times n)$ matrix. We want to set up this matrix in such a way that we obtain a smooth transition from freely adaptable control variables $\delta x_1',\dots,\delta x_\ell'$ to increasingly constrained variables $\delta x_{\ell+1}',\dots,\delta x_k',\dots,\delta x_n'$. One possible choice of the matrix $B_G$ would be

$$B_G = \sigma_G\,\mathrm{diag}(w_1, w_2, \dots, w_\ell, \dots, w_k, c, \dots, c),$$

where $\sigma_G$ is a free scaling factor, and where the last n-k diagonal elements are equal to a constant $c \ll w_k$ chosen to be much smaller than the smallest singular value $w_k$.

Clearly, how we set up the matrix $B_G$ is not unique. For instance, a more general choice would be

$$B_G = \sigma_G\,\mathrm{diag}(w_1^p, w_2^p, \dots, w_\ell^p, \dots, w_k^p, c, \dots, c),$$

where $c \ll w_k^p$, and where the exponent p would be another parameter that can be employed to tune how steeply the transition from unconstrained to constrained control variables takes place. Yet another choice would be

$$B_G = \sigma_G\,\mathrm{diag}(\mu_1, \mu_2, \dots, \mu_\ell, \dots, \mu_k, c, \dots, c), \qquad \mu_i = \frac{w_i^2}{1 + w_i^2},$$

where $c \ll \mu_k$. This ansatz is suggested by Eq. (), i.e. each of the elements $\delta x_1',\dots,\delta x_k'$ is weighted with its corresponding contribution to the number of signal degrees of freedom. We tested all three approaches (the one in Eq. () for p=2). These tests showed that the different approaches often yield analysis results that are quite similar. However, in each approach the free parameters $\sigma_G$ and c are tuned to different values. If they are not well tuned, then the analysis tends either toward the background estimate or toward the unconstrained analysis, as explained earlier in the text following Eq. ().
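The three choices of the diagonal of $B_G$ differ only in how each singular value is mapped to a constraint variance. The helper below is a hypothetical sketch: the function name, the keyword `kind`, and the numerical values of $\sigma_G$ and c are assumptions for illustration and are not the tuned values used in the tests mentioned above.

```python
import numpy as np

def constraint_variances(w, n, sigma_G=1.0, c=1e-3, kind="power", p=1):
    """Diagonal of B_G for n control variables, given the singular values w.

    kind="power": sigma_G * (w_1^p, ..., w_k^p, c, ..., c)
    kind="dof":   sigma_G * (mu_1, ..., mu_k, c, ..., c), mu_i = w_i^2 / (1 + w_i^2)
    """
    w = np.asarray(w, dtype=float)
    if kind == "power":
        d = w**p
    elif kind == "dof":
        d = w**2 / (1.0 + w**2)
    else:
        raise ValueError("kind must be 'power' or 'dof'")
    if c >= d.min():
        raise ValueError("c should be much smaller than the smallest entry")
    diag = np.full(n, c)          # last n-k entries: strongly constrained
    diag[:w.size] = d             # first k entries: controlled by singular values
    return sigma_G * diag

# Hypothetical example with k = 2 singular values and n = 4 control variables.
w = [4.0, 0.5]
bg_linear = constraint_variances(w, n=4, kind="power", p=1)
bg_square = constraint_variances(w, n=4, kind="power", p=2)
bg_dof = constraint_variances(w, n=4, kind="dof")
```

Relative to the linear choice, p=2 widens the gap between the two leading entries (16 versus 0.25), whereas the third choice caps every entry below one, so the most suitable values of $\sigma_G$ and c differ between the variants.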

Practical aspects of the implementation

We will here discuss some practical aspects that are mainly interesting for model developers.

One of the main practical problems is the dimension n of the model space. The grid size is typically of the order $N_x\times N_y\times N_z \sim 100\times 100\times 10$, and the number of aerosol components is of the order of $N_c \sim$ 10–100. Hence the dimension of the model space is $n \sim 10^6$–$10^7$. In our case, the matrix $\tilde{H}$ in Eq. () is an $(m\times n)$ matrix. To numerically perform a singular value decomposition of such a large matrix would be a formidable task.

In variational data assimilation we encounter a similar problem in the inversion of the matrix $B$. In our 3DVAR code this problem is alleviated by using a so-called spectral formulation. The idea is to make a Fourier transformation in the horizontal coordinates and to assume that all horizontal error correlations are homogeneous and isotropic. Under these assumptions one obtains one background error covariance matrix for each horizontal wavenumber; each of these matrices has dimension $N_z\times N_c \sim 10^3$–$10^4$. This can further be reduced to about $10^2$ by making a reduced eigenvalue diagonalization. The details are explained in .

In our case we are primarily interested in constraining the aerosol components. Therefore, we formulate our weak constraints in a suitable subspace of the physical space. Suppose, for simplicity, that we have reduced all data to the vertical resolution of our model. Let $\nu_l = 1,\dots,m_l$ label all measurements that lie within model layer $l$. Suppose further that $(i_\alpha, j_\alpha)$ is the horizontal grid point belonging to observation $\nu_l$ (so that the index $\alpha$ depends on the layer $l$ and the observation $\nu_l$). Consider the reduced background error covariance matrix with elements

$$B^{(\alpha,l)}_{k,k'} = B_{i_\alpha j_\alpha l k,\; i_\alpha j_\alpha l k'}, \qquad k,k' = 1,\dots,N_c,$$

where $N_c$ is the number of aerosol components. Consider further the reduced observability matrix with elements

$$\tilde{H}^{(l)}_{\nu_l,k} = \sum_{k'=1}^{N_c} R^{-1/2}_{\nu_l,\nu_l}\, H_{m,\; i_\alpha j_\alpha l k'}\, \left\{\left(B^{(\alpha,l)}\right)^{1/2}\right\}_{k',k},$$

where $m = m(l,\nu_l)$ labels the $\nu_l$th observation in model layer $l$. Analogous to Eq. (), we now perform a singular value decomposition in the reduced space:

$$\tilde{H}^{(l)}_{\nu_l,k} = \sum_{s=1}^{\min\{m_l,N_c\}} \left(V_L^{(l)}\right)_{\nu_l,s}\, w_s^{(l)}\, \left(V_R^{(l)}\right)_{k,s}.$$

The dimension of this SVD problem is now considerably reduced. The number of singular values is equal to $K = \min\{N_c, m_l\}$. The constraint matrix $G = V_R^T\cdot B^{-1/2}$ reduces to

$$G_{s,k} = \sum_{k'=1}^{N_c} \left(V_R^{(l)}\right)_{k',s}\, \left\{\left(B^{(\alpha,l)}\right)^{-1/2}\right\}_{k',k}.$$

We now invoke the assumption that the constraints computed at the observation site are also valid at neighbouring points, i.e. we apply the constraint matrix given in Eq. () in Eq. () according to

$$J_G = \frac{1}{2}\sum_{ijlkk's} \delta x_{ijlk'}\, G_{s,k'}\, \left(B_G^{-1}\right)_s\, G_{s,k}\, \delta x_{ijlk},$$

where $(B_G)_s$ denotes the diagonal elements of the matrix given in Eq. ().

For those readers interested in spectral formulations of 3DVAR we refer to Eqs. (28)–(30) in . Expressed by the spectral control vector $\chi = U\cdot\delta x$, the weak constraint in the cost function takes the spectral form

$$J_G = \tfrac{1}{2}\,\chi^{\dagger}\cdot U^{-\dagger}\cdot G^T\cdot B_G^{-1}\cdot G\cdot U^{-1}\cdot\chi,$$

and its contribution to the gradient of the cost function becomes

$$\nabla_{\chi}J_G = U^{-\dagger}\cdot G^T\cdot B_G^{-1}\cdot G\cdot U^{-1}\cdot\chi.$$

We see that these expressions involve the computation of the variable $\delta x = U^{-1}\cdot\chi$ in physical space. Thus, even when using a spectral formulation of the 3DVAR method, one can still compute the constraints in physical space and add their contributions to $J$ and $\nabla J$. The advantage of this is, as explained above, that the SVD of the observability matrix can be computed in the reduced subspace, which substantially reduces the dimension of the numerical SVD problem.

Another aspect concerns the positive square root of the background error covariance matrix, which appears in essential parts of the theory, namely in Eqs. () and (). In theoretical developments it is, arguably, didactically expedient to work with the matrix $B^{1/2}$. But in practice there are numerically more efficient formulations. One such approach is discussed in in the context of a spectral formulation of the variational method. The spectral formulation is applied to the full $B$ matrix in order to reduce the dimension of the problem of diagonalizing this matrix. This method is our method of choice in the formulation of the background and observation terms in the cost function given in Eqs. () and (), respectively. However, in the formulation of the constraint term given in Eq. () we can substantially reduce the dimension of the matrix $B$ by working in the reduced space in which only the covariances $B^{(\alpha,l)}$ among aerosol components are considered. One could compute the matrix $(B^{(\alpha,l)})^{-1/2}$ in Eq. () by diagonalizing the matrix $B^{(\alpha,l)}$. However, a numerically much more efficient approach is to work not with the positive square root, but with the so-called Cholesky decomposition of the $B$ matrix,

$$B^{(\alpha,l)} = C_u^T\cdot C_u,$$

where $C_u$ is an upper triangular matrix. (The Cholesky decomposition is, essentially, a special case of an LU decomposition, which applies to symmetric real (or Hermitian complex), positive definite matrices.) Thus the actual algorithm we used for formulating the constrained minimization of the cost function is obtained by replacing in the preceding formulas all instances of the matrix $B^{1/2}$ with the matrix $C_u^T$ (and, similarly, by replacing the inverse matrix $B^{-1/2}$ by the inverse of the Cholesky factor, $C_u^{-T}$).
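That this replacement leaves the singular values unchanged can be verified on a small example. The sketch below is illustrative only: the toy matrices and variable names are assumptions, and NumPy's `np.linalg.cholesky` returns the lower triangular factor, whose transpose plays the role of the upper triangular matrix in the text.

```python
import numpy as np

def sym_sqrt(A):
    """Positive square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(A)
    return (vecs * np.sqrt(vals)) @ vecs.T

# Hypothetical reduced problem: 3 aerosol components, 2 observations.
B = np.array([[1.0, 0.5, 0.0],
              [0.5, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
H = np.array([[1.0, 1.0, 1.0],
              [1.0, -1.0, 0.0]])
R = np.diag([0.25, 4.0])
R_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(R)))   # valid because R is diagonal

L = np.linalg.cholesky(B)   # lower triangular, B = L L^T
Cu = L.T                    # upper triangular, B = Cu^T Cu

# Singular values with the positive square root and with the Cholesky factor.
w_sqrt = np.linalg.svd(R_inv_sqrt @ H @ sym_sqrt(B), compute_uv=False)
VL, w_chol, VRt = np.linalg.svd(R_inv_sqrt @ H @ Cu.T)

# Constraint matrix G = V_R^T Cu^-T, obtained with a solve instead of an inverse.
G = np.linalg.solve(Cu, VRt.T).T

# Background error covariance of the transformed variables dx' = G dx.
cov_transformed = G @ B @ G.T
```

The right-singular vectors generally differ between the two formulations, but the singular values coincide and the transformed background error covariance is the unit matrix in both cases, which is all that the constraint term requires.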

The Supplement related to this article is available online at doi:10.5194/acp-17-3423-2017-supplement.

Michael Kahnert worked on the theoretical developments and the numerical implementation; Emma Andersson performed the testing of the method.

The authors declare that they have no conflict of interest.

Acknowledgements

This work was funded by the Swedish National Space Board through project nos. 100/16 (MK) and 101/13 (EA).

Edited by: M. Tesche
Reviewed by: five anonymous referees

References

Andersson, C., Langner, J., and Bergström, R.: Interannual variation and trends in air pollution over Europe due to climate variability during 1958–2001 simulated with a regional CTM coupled to the ERA40 reanalysis, Tellus B, 59, 77–98, 2007.

Andersson, C., Bergström, R., Bennet, C., Robertson, L., Thomas, M., Korhonen, H., Lehtinen, K. E. J., and Kokkola, H.: MATCH-SALSA – Multi-scale Atmospheric Transport and CHemistry model coupled to the SALSA aerosol microphysics model – Part 1: Model description and evaluation, Geosci. Model Dev., 8, 171–189, 10.5194/gmd-8-171-2015, 2015.

Andersson, E. and Kahnert, M.: Coupling aerosol optics to the MATCH (v5.5.0) chemical transport model and the SALSA (v1) aerosol microphysics module, Geosci. Model Dev., 9, 1803–1826, 10.5194/gmd-9-1803-2016, 2016.

Benedetti, A., Morcrette, M. J.-J., Boucher, O., Dethof, A., Engelen, R. J., Huneeus, M. F. H. F. N., Jones, L. S., Kinne, J. W. K., Mangold, A., Razinger, M., Simmons, A. J., and Suttie, M.: Aerosol analysis and forecast in the European Centre for Medium-Range Weather Forecasts Integrated Forecast System: 2. Data assimilation, J. Geophys. Res., 114, D13205, 10.1029/2008JD011115, 2009.

Bi, L., Yang, P., Kattawar, G., and Kahn, R.: Modeling optical properties of mineral aerosol particles by using nonsymmetric hexahedra, Appl. Optics, 49, 334–342, 2010.

Bocquet, M.: Toward optimal choices of control space representation for geophysical data assimilation, Mon. Weather Rev., 137, 2331–2348, 2009.

Burton, S. P., Hair, J. W., Kahnert, M., Ferrare, R. A., Hostetler, C. A., Cook, A. L., Harper, D. B., Berkoff, T. A., Seaman, S. T., Collins, J. E., Fenn, M. A., and Rogers, R. R.: Observations of the spectral dependence of linear particle depolarization ratio of aerosols using NASA Langley airborne High Spectral Resolution Lidar, Atmos. Chem. Phys., 15, 13453–13473, 10.5194/acp-15-13453-2015, 2015.

Burton, S. P., Chemyakin, E., Liu, X., Knobelspiesse, K., Stamnes, S., Sawamura, P., Moore, R. H., Hostetler, C. A., and Ferrare, R. A.: Information content and sensitivity of the 3β+2α lidar measurement system for aerosol microphysical retrievals, Atmos. Meas. Tech., 9, 5555–5574, 10.5194/amt-9-5555-2016, 2016.

Cardinali, C., Pezzulli, S., and Andersson, E.: Influence-matrix diagnostic of a data assimilation system, Q. J. Roy. Meteor. Soc., 130, 2767–2786, 2004.

Foltescu, V., Pryor, S. C., and Bennet, C.: Sea salt generation, dispersion and removal on the regional scale, Atmos. Environ., 39, 2123–2133, 2005.

Fuller, K. A. and Mackowski, D. W.: Electromagnetic scattering by compounded spherical particles, in: Light scattering by nonspherical particles, edited by: Mishchenko, M. I., Hovenier, J. W., and Travis, L. D., 226–273, Academic Press, San Diego, 2000.

Johnson, C., Hoskins, B. J., and Nichols, N. K.: Very large inverse problems in atmosphere and ocean modelling, Int. J. Numer. Meth. Fl., 47, 759–771, 2005a.

Johnson, C., Nichols, N. K., and Hoskins, B. J.: A singular vector perspective of 4D-Var: Filtering and interpolation, Q. J. Roy. Meteor. Soc., 131, 1–19, 2005b.

Joiner, J. and da Silva, A. M.: Efficient methods to assimilate remotely sensed data based on information content, Q. J. Roy. Meteor. Soc., 124, 1669–1694, 1998.

Kahnert, F. M.: Reproducing the optical properties of fine desert dust aerosols using ensembles of simple model particles, J. Quant. Spectrosc. Ra., 85, 231–249, 2004.

Kahnert, M.: Variational data analysis of aerosol species in a regional CTM: background error covariance constraint and aerosol optical observation operators, Tellus B, 60, 753–770, 2008.

Kahnert, M.: On the observability of chemical and physical aerosol properties by optical observations: Inverse modelling with variational data assimilation, Tellus B, 61, 747–755, 2009.

Kahnert, M.: Modelling radiometric properties of inhomogeneous mineral dust particles: Applicability and limitations of effective medium theories, J. Quant. Spectrosc. Ra., 152, 16–27, 2015.

Kahnert, M. and Devasthale, A.: Black carbon fractal morphology and short-wave radiative impact: a modelling study, Atmos. Chem. Phys., 11, 11745–11759, 10.5194/acp-11-11745-2011, 2011.

Kahnert, M., Nousiainen, T., Lindqvist, H., and Ebert, M.: Optical properties of light absorbing carbon aggregates mixed with sulfate: assessment of different model geometries for climate forcing calculations, Opt. Express, 20, 10042–10058, 2012a.

Kahnert, M., Nousiainen, T., Thomas, M. A., and Tyynelä, J.: Light scattering by particles with small-scale surface roughness: comparison of four classes of model geometries, J. Quant. Spectrosc. Ra., 113, 2356–2367, 2012b.

Kahnert, M., Nousiainen, T., and Lindqvist, H.: Models for integrated and differential scattering optical properties of encapsulated light absorbing carbon aggregates, Opt. Express, 21, 7974–7992, 2013.

Kahnert, M., Nousiainen, T., and Lindqvist, H.: Review: Model particles in atmospheric optics, J. Quant. Spectrosc. Ra., 146, 41–58, 2014.

Kahnert, M., Nousiainen, T., and Markkanen, J.: Morphological models for inhomogeneous particles: light scattering by aerosols, cometary dust, and living cells, in: Light Scattering Reviews 11, edited by: Kokhanovsky, A., Springer, Berlin, 299–339, 2016.

Khade, V. M., Hansen, J. A., Reid, J. S., and Westphal, D. L.: Ensemble filter based estimation of spatially distributed parameters in a mesoscale dust model: experiments with simulated and real data, Atmos. Chem. Phys., 13, 3481–3500, 10.5194/acp-13-3481-2013, 2013.

Kupiainen, K. and Klimont, Z.: Primary emissions of submicron and carbonaceous particles in Europe and the potential for their control, Tech. Rep. IR-04-079, IIASA, Laxenburg, Austria, 2004.

Kupiainen, K. and Klimont, Z.: Primary emissions of fine carbonaceous particles in Europe, Atmos. Environ., 41, 2156–2170, 2007.

Kylling, A., Kahnert, M., Lindqvist, H., and Nousiainen, T.: Volcanic ash infrared signature: porous non-spherical ash particle shapes compared to homogeneous spherical ash particles, Atmos. Meas. Tech., 7, 919–929, 10.5194/amt-7-919-2014, 2014.

Lindqvist, H., Muinonen, K., and Nousiainen, T.: Light scattering by coated Gaussian and aggregate particles, J. Quant. Spectrosc. Ra., 110, 1398–1410, 10.1016/j.jqsrt.2009.01.015, 2009.

Lindqvist, H., Nousiainen, T., Zubko, E., and Muñoz, O.: Optical modeling of vesicular volcanic ash particles, J. Quant. Spectrosc. Ra., 112, 1871–1880, 2011.

Lindqvist, H., Jokinen, O., Kandler, K., Scheuvens, D., and Nousiainen, T.: Single scattering by realistic, inhomogeneous mineral dust particles with stereogrammetric shapes, Atmos. Chem. Phys., 14, 143–157, 10.5194/acp-14-143-2014, 2014.

Liu, L. and Mishchenko, M. I.: Scattering and radiative properties of complex soot and soot-containing aggregate particles, J. Quant. Spectrosc. Ra., 106, 262–273, 2007.

Liu, Z., Liu, Q., Lin, H.-C., Schwartz, C. S., Lee, Y.-H., and Wang, T.: Three-dimensional variational assimilation of MODIS aerosol optical depth: Implementation and application to a dust storm over East Asia, J. Geophys. Res., 116, D23206, 10.1029/2011JD016159, 2011.

McKeen, S., Chung, S. H., Wilczak, J., Grell, G., Djalalova, I., Peckham, S., Gong, W., Bouchet, V., Moffet, R., Tang, Y., Carmichael, G. R., Mathur, R., and Yu, S.: Evaluation of several PM2.5 forecast models using data collected during the ICARTT/NEAQS 2004 field study, J. Geophys. Res., 112, D10S20, 2007.

Mishchenko, M. I., Travis, L. D., Kahn, R. A., and West, R. A.: Modeling phase functions for dustlike tropospheric aerosols using a shape mixture of randomly oriented polydisperse spheroids, J. Geophys. Res., 102, 16831–16847, 1997.

Mishchenko, M. I., Dlugach, Z. M., and Zakharova, N. T.: Direct demonstration of the concept of unrestricted effective-medium approximation, Opt. Lett., 39, 3935–3938, 2014.

Muinonen, K.: Light scattering by stochastically shaped particles, in: Light scattering by nonspherical particles, edited by: Mishchenko, M. I., Hovenier, J. W., and Travis, L. D., 323–354, Academic Press, San Diego, 2000.

Müller, D., Wandinger, U., and Ansmann, A.: Microphysical particle parameters from extinction and backscatter lidar data by inversion with regularization: theory, Appl. Optics, 38, 2346–2357, 1999.

Omar et al.(2009)Omar, Winker, Vaughan, Hu, Trepte, Ferrare, Lee, Hostetler, Kittaka, Rogers, Kuehn, and Liu

Omar, A. H., Winker, D. M., Vaughan, M. A., Hu, Y., Trepte, C. R., Ferrare, R. A., Lee, K.-P., Hostetler, C. A., Kittaka, C., Rogers, R. R., Kuehn, R. E., and Liu, Z.: The CALIPSO automated aerosol classification and lidar ratio selection algorithm, J. Atmos. Ocean. Tech., 26, 1994–2014, 2009.

Parrish and Derber(1992)

Parrish, D. F. and Derber, J. C.: The National Meteorological Center's spectral statistical-interpolation analysis system, Mon. Weather Rev., 120, 1747–1763, 1992.

Rabier et al.(2002)Rabier, Fourrié, Chafaï, and Prunet

Rabier, F., Fourrié, N., Chafaï, D., and Prunet, P.: Channel selection methods for infrared atmospheric sounding interferometer radiances, Q. J. Roy. Meteor. Soc., 128, 1011–1027, 2002.

Rodgers(2000)

Rodgers, C. D.: Inverse methods for atmospheric sounding, World Scientific, Singapore, 2000.

Rubin and Collins(2014)

Rubin, J. I. and Collins, W. D.: Global simulations of aerosol amount and size using MODIS observations assimilated with an Ensemble Kalman Filter, J. Geophys. Res., 119, 12780–12806, 2014.

Saide et al.(2013)Saide, Carmichael, Liu, Schwartz, Lin, da Silva, and Hyer

Saide, P. E., Carmichael, G. R., Liu, Z., Schwartz, C. S., Lin, H. C., da Silva, A. M., and Hyer, E.: Aerosol optical depth assimilation for a size-resolved sectional model: impacts of observationally constrained, multi-wavelength and fine mode retrievals on regional scale analyses and forecasts, Atmos. Chem. Phys., 13, 10425–10444, 10.5194/acp-13-10425-2013, 2013.

Sandu et al.(2005)Sandu, Liao, Carmichael, Henze, and Seinfeld

Sandu, A., Liao, W., Carmichael, G. R., Henze, D. K., and Seinfeld, J. H.: Inverse modeling of aerosol dynamics using adjoints: Theoretical and numerical considerations, Aerosol Sci. Tech., 39, 677–694, 2005.

Sekiyama et al.(2010)Sekiyama, Tanaka, Shimizu, and Miyoshi

Sekiyama, T. T., Tanaka, T. Y., Shimizu, A., and Miyoshi, T.: Data assimilation of CALIPSO aerosol observations, Atmos. Chem. Phys., 10, 39–49, 10.5194/acp-10-39-2010, 2010.

Undén et al.(2002)Undén, Rontu, Järvinen, Lynch, Calvo, Cats, Cuxart, Eerola, Fortelius, Garcia-Moya, Jones, Lenderlink, McDonald, McGrath, Navascues, Nielsen, Ødegaard, Rodriguez, Rummukainen, Rõõm, Sattler, Sass, Savijärvi, Schreur, Sigg, The, and Tijm

Undén, P., Rontu, L., Järvinen, H., Lynch, P., Calvo, J., Cats, G., Cuxart, J., Eerola, K., Fortelius, C., Garcia-Moya, J. A., Jones, C., Lenderlink, G., McDonald, A., McGrath, R., Navascues, B., Nielsen, N. W., Ødegaard, V., Rodriguez, E., Rummukainen, M., Rõõm, R., Sattler, K., Sass, B. H., Savijärvi, H., Schreur, B. W., Sigg, R., The, H., and Tijm, A.: HIRLAM-5 Scientific Documentation, available at: http://www.hirlam.org (last access: 30 March 2009), 2002.

Veselovskii et al.(2002)Veselovskii, Kolgotin, Griaznov, Müller, Wandinger, and Whiteman

Veselovskii, I., Kolgotin, A., Griaznov, V., Müller, D., Wandinger, U., and Whiteman, D. N.: Inversion with regularization for the retrieval of tropospheric aerosol parameters from multiwavelength lidar sounding, Appl. Optics, 41, 3685–3699, 2002.

Veselovskii et al.(2004)Veselovskii, Kolgotin, Griaznov, Müller, Franke, and Whiteman

Veselovskii, I., Kolgotin, A., Griaznov, V., Müller, D., Franke, K., and Whiteman, D. N.: Inversion of multiwavelength Raman lidar data for retrieval of bimodal aerosol size distribution, Appl. Optics, 43, 1180–1195, 2004.

Veselovskii et al.(2005)Veselovskii, Kolgotin, Müller, and Whiteman

Veselovskii, I., Kolgotin, A., Müller, D., and Whiteman, D. N.: Information content of multiwavelength lidar data with respect to microphysical particle properties derived from eigenvalue analysis, Appl. Optics, 44, 5292–5303, 2005.

Vilaplana et al.(2006)Vilaplana, Moreno, and Molina

Vilaplana, R., Moreno, F., and Molina, A.: Study of the sensitivity of size-averaged scattering matrix elements of nonspherical particles to changes in shape, porosity and refractive index, J. Quant. Spectrosc. Ra., 100, 415–428, 10.1016/j.jqsrt.2005.11.068, 2006.

Wang et al.(2014)Wang, Sartelet, Bocquet, and Chazette

Wang, Y., Sartelet, K. N., Bocquet, M., and Chazette, P.: Modelling and assimilation of lidar signals over Greater Paris during the MEGAPOLI summer campaign, Atmos. Chem. Phys., 14, 3511–3532, 10.5194/acp-14-3511-2014, 2014.

Xu(2006)

Xu, Q.: Measuring information content from observations for data assimilation: relative entropy versus Shannon entropy difference, Tellus A, 59, 198–209, 2006.

Zhang et al.(2014)Zhang, Campbell, Hyer, Reid, Westphal, and Johnson

Zhang, J., Campbell, J. R., Hyer, E. J., Reid, J. S., Westphal, D. L., and Johnson, R. S.: Evaluating the impact of multisensor data assimilation on a global aerosol particle transport model, J. Geophys. Res., 119, 4674–4689, 2014.

</app></app-group></back> </article>