Technical Note: Spectral representation of spatial correlations in variational assimilation with grid point models and application to the Belgian Assimilation System for Chemical Observations (BASCOE)

The formulation of the background error covariances represented in the spectral space is discussed in the context of univariate assimilation relying on a grid point model, leaving out all the aspects of balances between the different control variables needed in meteorological assimilation. The spectral transform operations are discussed in the case of a spherical harmonics basis and we stress that there is no need for an inverse spectral transform and of a Gaussian grid. The analysis increments are thus produced directly on the model grid. The practice of producing analysis increments on a horizontal Gaussian grid and then interpolating to an equally spaced grid is also shown to produce a degradation of the analysis. The method discussed in this paper allows the implementation of separable and non-separable spatial correlations. The separable formulation has been implemented in the Belgian Assimilation System for Chemical ObsErvations (BASCOE) and its impact on the assimilation of O3 observed by the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) is shown. To promote the use of this method by other non-meteorological variational systems and in particular chemistry, the Fortran code developed is made available to the community.


Introduction
One of the critical aspects of any assimilation system is the formulation of the background spatial error covariances matrix, denoted the B matrix.It needs to be sufficiently com-pact to be numerically implemented and sufficiently complex to represent correctly the real error covariances of the first guess field.There are several approaches to achieve this goal.For variational systems, which are the systems we will focus on, three approaches have been developed.The first one historically (Parrish and Derber, 1992), which we shall discuss at length in this paper, uses a spectral representation of the correlation matrix.It is based on the fact that a homogeneous and isotropic horizontal correlation matrix can be represented by a diagonal matrix in the spectral space.This property was discovered in the late sixties by Russian scientists working in statistical fluid mechanics and turbulence theory (see the works of Monin andYaglom, 1971, 1975;Panchev, 1971).These concepts were extended to the sphere and applied successfully to large-scale atmospheric dynamics and analyses by Boer (1983) and Boer and Shepherd (1983) using spherical harmonics as the orthogonal basis.It is only in the late nineties that this formulation became the cornerstone of 3-dimensional variational meteorological assimilation systems (Courtier et al., 1998), and was applied later to chemical data assimilation (Dethof and Hólm, 2004).More recently, efforts have been made to develop spectral representation of inhomogeneous and flow-dependent correlations (Fisher, 2003).
The second approach for variational systems is based on a diffusion operator which uses the model's diffusion to generate the effect of a Gaussian correlation function.This method was introduced by Derber and Rosati (1989) and further developed by Weaver and Courtier (2001).This method Published by Copernicus Publications on behalf of the European Geosciences Union.
Q. Errera and R. Ménard: Spectral representation of background error covariances is particularly useful when the domain has complex boundaries (e.g.ocean data assimilation) where it is difficult to define positive definite correlations.This method is also well suited for models that have non uniform grid on a sphere (e.g.icosahedral grids) because it avoids the interpolation of analysis increments onto that grid (Elbern et al., 2010;Schwinger and Elbern, 2010).The diffusion approach primarily gives rise to a Gaussian covariance model but can be generalized to more complex (i.e.inhomogeneous and anisotropic) correlations.However, the diffusion approach is highly demanding in computer resources (both CPU and memory), especially if one needs to simulate non-Gaussian correlations.
A third approach in variational data assimilation which does not compute explicitly the error covariance is the recursive filter (Purser et al., 2003).Like the diffusion operator approach, the recursive filter approach attemps to compute Gaussian correlations.The method consists in evaluating the effect of a Gaussian correlation model on a state vector, by applying a sequence of 1-D finite difference operators in different directions on the state vector.Repeated applications of these finite difference operators in carefully chosen directions can lead to approximate the smoothing effect of Gaussian homogeneous and isotropic correlations.Although positive-definitness can be obtained, the scheme is approximate and computationally complex.
This paper discusses the spectral representation of background error correlations onto a spherical harmonic basis.Most of the mathematical background presented in this paper has already been published (see the references cited in Sects.2-4).However, none of those publications provide a complete picture of the problem.For instance, these publications rarely give an introduction on spectral transformations.As numerical weather prediction (NWP) systems usually rely on spectral models, this aspect is supposed to be known.Moreover, those publications describe the method in the meteorological context, that is the use of balance operators between meteorological variables.This makes the background theory of this method difficult to be acquired for people not involved in meteorology.As a consequence, one of the aims of this paper is to describe the spectral representation of the background error correlations for univariate assimilation with systems relying on a grid point model.
A code for this purpose has been developed and implemented in the Belgian Assimilation System for Chemical ObsErvation (BASCOE).In doing so we have identified a rather important property that seemed to have been overlooked in meteorological data assimilation: in variational assimilation only the adjoint spectral transform is necessary, not the inverse transform.As a result the analysis spectral increments are produced directly on the model grid.In other words, there is no need to extract the increments on the (nonequally spaced) Gaussian grid and then to perform a transformation from that grid to the model grid -a transformation which necessarily degrades the analysis.While this method is limited to homogeneous and isotropic correlations, it has the advantage of being very low demanding in computer resources.Moreover, while not studied in this paper, nonseparable three-dimensional error correlations could be implemented easily, meaning that vertical correlations can be different for large and small horizontal spectral features.
This paper is organized as follows.Section 2 and Appendix A introduce representations in spherical harmonics and the spectral representation of homogeneous and isotropic correlations.Section 3 discusses the variational formulation and the control variable change from the model basis to the spherical harmonic basis.The implementation of the spectral transform operators and of the spatial correlations is described in Sect. 4. Section 5 discusses the numerical implementation of the method and illustrates it by the results of an assimilation of a single pseudo-observation.In Sect.6, the method is applied to real ozone observations using the BAS-COE system.The results of this paper are summarized in the conclusions.A Fortran code of this method is provided in the Supplement of this paper.This code is introduced in the Appendix B.

Introduction on spectral transform on the sphere
This section provides some useful information about twodimensional spectral representation on the sphere.For an exhaustive introduction, we refer to the books of Satoh (2004) and Krishnamurti et al. (2006), and the unpublished lectures of P. Swarztrauber * .
In this section, we denote a 2-D field on the sphere by (λ, µ) where λ is the longitude, µ = sin φ and φ is the latitude.Alternatively, the notation will be used in place of (λ, µ).The field (λ, µ) can be represented by a series of spherical harmonics where the coefficients are noted m n .The values of n and m are specified below.In this section, the term grid will be associated with the horizontal grid and not the usual three dimensional grid of the model.

Spherical harmonics
Spherical harmonics, denoted by Y m n , are defined as solutions of the Laplace equation.They take the following form: where N m n are the normalization constants, P m n (µ) are the associated Legendre functions and e imλ are the Fourier exponentials where i 2 = −1.Spherical harmonics form an orthogonal basis and their normalization is defined by the normalization constants.This is discussed in detail in Appendix A1.Here the geodesy 2π normalization is adopted (while Courtier et al., 1998, uses the geodesy 4π) such that: where [.] * denotes the complex conjugate and δ k k is the Kronecker symbol (δ k k = 1 if k = k and 0 otherwise).When m = 0, P 0 n are called the Legendre polynomials.In such a case the value of m is omitted in the notation (P 0 n ≡ P n ).When spherical harmonics are used in the context of spectral transforms, as in this study, it is appropriate to include the normalization constant in the associated Legendre functions.In this case, we define the normalized associated Legendre functions as P m n (µ) ≡ N m n P m n (µ).

Direct spectral transform
In general, a smooth function on a sphere (λ, µ) can be represented by a series of spherical harmonics with coefficients m n : In this transformation, and recalling the definition of the spherical harmonics (Eq.1), the dependence along the longitude is taken by the Fourier exponentials.For this reason, m is called the zonal wavenumber.In addition, n is called the total wavenumber.m n are complex coefficients.If (λ, µ) is real, which is the case in atmospheric sciences, then the following property holds: This suggests that it might be possible to define real coefficients using Fourier sine and cosine instead of the complex coefficients using Fourier exponentials.In this study, the complex representation is kept.The transform Eq. ( 3) cannot be evaluated exactly and requires to be truncated at some reasonable number of terms.Several truncation methods exist and here the triangular truncation is adopted.In this case, the relationship between the spectral coefficients and the physical field is given by (5) where N is the degree of truncation (and should not be confused with the normalization factor N m n ) and where j and k are respectively the longitude and latitude indices of the target grid of the spectral transform.For practical purposes, Eq. ( 6) is split in a Fourier transform followed by a Legendre transform: Note that usually, operational centers use fast Fourier transforms instead of Eq. ( 8).This is not the case in the code provided in the Supplement as the CPU time necessary to operate the Fourier transform is found sufficiently fast.

Inverse spectral transform
Given a 2-D field (λ, µ) on the sphere, the question is now how we can calculate its spectral coefficients m n .Using the orthogonality of the spherical harmonics, the inverse of Eq. ( 3) is given by In order to solve this equation, one usually starts with a discrete 2-D field given on a grid (λ j , µ k ) and looks to invert the sequence (Eqs.7, 8).The inverse of the Fourier transform (Eq.8) is given by: where M is the number of longitudes.The inverse of the Legendre transform (Eq.7) is much more difficult to implement.One aspect that has not been addressed yet is the specification of the target grid of the spectral transform.The direct transform (Eqs.7, 8) and the inverse of the Fourier transform (Eq.10) are applicable to any type of grid, in particular to the equally spaced (in degree) model grid.However, this is not true for the inverse of the Legendre transform.If one needs to calculate the exact inverse of Eq. ( 7) without loss of information, the choice of spectral grid is limited to the Gaussian grid.On this grid, the latitudes correspond to the roots of the Legendre polynomials P N (µ k ) = 0, which are not equally spaced.The longitudes, on the other hand, are equally spaced.On the Gaussian grid, the inverse spectral transform is defined by the Gaussian quadrature or Gauss-Legendre quadrature.
Let f (µ) be a polynomial of degree (2L-1).The Gaussian quadrature specifies that the integral of f (µ) between -1 and 1 can be solved exactly by the following quadrature (see the proof in Krishnamurti et al., 2006, Sect. 6.6;or Satoh, 2004 where w(µ l ) are the Gaussian weights which can be calculated analytically.Using (11), the inverse of the Legendre transform ( 7) is given by: where K is the number of latitudes.
An equally spaced grid can also be a target grid for the spectral transform.While the Gaussian grid allows one to define an exact inverse transform, an accurate (but not exact) inverse transform can also be defined from the equally spaced grid.This is reported in the Appendix A2.
In the following, the target grid of the spectral transform will be simply denoted by the target.

Adjoint spectral transform
In variational assimilation where the B matrix is defined in the spectral space, the adjoint of the spectral transform (Eqs.7, 8) is needed.Let us denote the spectral transform operator by S. To obtain the gradient of the cost function (see Sect. 3), one needs to define the adjoint of S, denoted by S * (that must not be confused with the conjugate operator [.] * ).One way to implement the adjoint is to (re)define the spectral transform as unitary operations such that S −1 = S * .This is possible by re-writing the direct spectral transform (Eqs.7, 8) as and the inverse spectral transform (Eqs.10, 12) as However, this formulation requires the use of the Gaussian grid as the target grid of the spectral transform.If one uses a model defined on an equally spaced grid, as usually done in atmospheric modeling, a mapping operation will be necessary to map the analysis increments from the target grid to the model grid.As mapping operations introduce a degradation of the increments, it is better to find an alternative to the unitary definition of the spectral transforms.For this reason, the adjoint of the code of the direct spectral transform sequence (Eqs.7, 8) has been written by hand in BASCOE.
In terms of equation, the operator S * takes the form: where ˆ denote the adjoint variable of , i.e. ˆ = ∂J /∂ .

Degree of truncation, target grid and aliasing
To complete the definition of the spectral operators, we have to define a degree of truncation N and the corresponding dimensions of the horizontal grid M and K. First, let us consider the unitary definition of the spectral transforms.In order to keep the information content equal in a sequence of direct and inverse Fourier transforms, one needs to impose that M = 2(N + 1) (Krishnamurti et al., 2006, Chap. 7; Swarztrauber, Lecture 1).If one tries M > 2(N + 1), a false representation, or aliasing is introduced.On the other hand, if M < 2(N +1), there is a loss of information during the spectral transform (i.e. the spectral representation has a higher resolution than the horizontal grid).Aliasing in the associated Legendre functions also exists but is much more difficult to derive.Associated Legendre functions can be obtained by a sum of Fourier coefficients in the latitude direction (Swarztrauber, Lecture 2).In this case, the aliasing is prevented if K ≤ N +1 and the equality is required for no loss of information in a sequence of direct and inverse Legendre transform.
If one bases their code on the non-unitary spectral transforms, the condition is less strict.It can be verified with the code provided in the Supplement of this paper that the following conditions must hold: N ≥ max(K, M/2) − 1.This allows one to implement longitutes and latitudes that have not necessarilly the same resolution, which is usually the case in atmospheric modeling.
In the case of the BASCOE horizontal grid of 2 • × 2 • that includes the poles (as done in Sect.6), the model resolution is 91 latitudes and 180 longitudes.If the target grid of the spectral transform is the BASCOE model grid, K = 91, M = 180 and the degree of truncation is fixed to N = 90.If the target grid is the Gaussian grid, a mapping operator G must be introduced to account for the transformation from the Gaussian grid to the BASCOE grid.In this case, the size of the Gaussian grid is K = 90 and M = 180, the degree of truncation is N = 89 and the dimensions of G are 91 × 90.Let ( ) be an error field on the surface sphere where = (λ, µ).Let us assume that the error field is unbiased and normalized, i.e. ( ) 2 = 1 where denotes the mathematical expectation.Let us assume that the correlations are homogeneous and isotropic.This means that the correlations between the points and depends only on the distance d between the two points.Gaspari and Cohn (1999) have shown that, on a sphere of radius A, d corresponds to the chordal or geodetic distance between the two points

Atmos
where θ is their angular separation as defined in Eq. (A14).
It follows that the correlations between the two points might be expressed as a function of θ: The concept of homogeneous and isotropic correlations on the surface of a sphere is however not independent as noted by Gaspari and Cohn (1999), and needs to be somewhat revised.Isotropy can be obtained by translation invariance back to the same point along great circle with translations in any directions -so basically the correlation function only needs to be homogeneous.Also as discussed in the same paper, any homogeneous correlation function defined in R 3 is also homogeneous on any continuous manifold embedded in R 3 , such as the surface of a sphere.Thus an homogeneous (translation invariant) correlation function on the surface of a sphere can be defined from a correlation using as the chordal distance d defined in R 3 (and not the great circle distance), and which has the consequence to be implicitly periodic from all directions.
Let us come back to Eq. ( 20).As homogeneous (and isotropic) correlations over the sphere are invariant with rotation, let us suppose that one of the two points is at the north pole.Then θ is the co-latitude angle, i.e. θ ≡ π 2 − φ.Consequently, we have cos θ = µ.In this configuration, the correlations between the two points are independent of the longitude.If f m n represents the spectral coefficients of f (θ ), we have f m n = 0 for m = 0.It follows that where f n ≡ f 0 n .Boer (1983) and Gauthier et al. (1993) have shown that the correlations are represented by a diagonal matrix on a spherical harmonics basis, i.e. (1 − cos θ ) as reported in Gaspari and Cohn (1999, Eq. 2.33).
where b n ≡ f n /N 0 n (see also the proof in Appendix A4).

Variational assimilation and control variable transform
Variational methods (three-dimensional -3D-Var -and fourdimensional -4D-Var) aim at calculating the model state that minimizes the objective function J (x) (Talagrand, 1997): where where x b and B are respectively the background model state (i.e. the initial guess) and its associated error covariance matrix; y and R are respectively the observational vector and its associated error covariance matrix; x is the model state vector and H is the non-linear observation operator that projects the model state in the observation space.In the 3D-Var case the observations and the model state correspond to the same time.In the 4D-Var case, observations span over an assimilation time window, the model state x is defined at the beginning of that window (or initial time) and the observation operator H includes an evolution model operator that projects the model initial state x at the observation time.
The minimization of Eq. ( 24) is usually done with a quasi-Newton minimizer that requires the knowledge of the gradient of the objective function: As the typical dimension of x is around 10 6 , the matrix B is of size 10 12 .This is far too large to be stored and inverted by modern computers.Moreover, the specification of the elements of such a matrix requires a huge amount of a priori information, more than available (Dee, 1995).For those reasons, it is necessary to reduce the problem.Here, we follow the same strategy as developed at ECMWF in late nineties (Courtier et al., 1998).A review on the formulation of background error covariance matrix in meteorology might be found in Bannister (2008a,b).
In order to avoid the problem of inverting B, a control variable transform is introduced: The objective function thus becomes: where is the innovation vector.The background term (Eq.32) can be easily calculated.If the observation operator H is linear, the term d−H (Lχ ) in Eq. ( 33) is equal to the term y−H (x) in Eq. ( 26).As a result, J o (x) = J o (χ).If H is non-linear but the difference x − x b is small, then H can be linearized.
It follows that the calculation of J o is practically obtained by the resolution of Eq. ( 26), which is much easier to implement numerically than Eq. ( 33).
The gradient of the objective function is now given by: where L * is the adjoint of the operator L. Again, the calculation of the background term (Eq.35) of the gradient is no longer an issue and it can be verified that the gradient of the observation term (Eq.36) corresponds to the operator L * applied to the results of Eq. ( 29), i.e. ∇ χ J o = L * ∇ x J o .With χ as the control variable, the iterative sequence of operations necessary to minimize J is: By definition, the background error covariance matrix B is the product of a standard deviation error matrix (diagonal) and a correlation matrix C (non diagonal): or In this study, we will only deal with univariate correlations such that no species-species correlations are considered in BASCOE.As the matrix C 1/2 is huge, it is necessary to make assumptions and/or transformations to allow its implementation in a variational system (see Sect. 1).In the NWP community, it was suggested to build the correlation matrix in the spectral space (see e.g.Courtier et al., 1998).As mentioned in Sect.2.6, assuming homogeneous and isotropic horizontal correlations at the surface of a sphere allows one to get a diagonal correlation matrix in the spectral space.This means that considering a 3-D model on the sphere, the spatial correlations are then represented by a block diagonal matrix.Let 1/2 denote the spectral representation of C 1/2 and S denote the spectral transform operator.Hence, the transformation C 1/2 can be rewritten as S 1/2 , so that L becomes: The operator G has been introduced in Eq. ( 39) to account for the transformation from the target grid of the spectral transform to the model grid.In the case where the analysis increments are calculated on the Gaussian grid, the transformation G is represented by a weighting average over the latitudes.In the case where the spectral grid is the model grid, no transformation is implemented (G is the identity matrix).The adjoint operator L * is given by Recall that is a diagonal matrix, so the transpose sign (.) T has been omitted in the equations.The block diagonal elements of have the dimensions L × L where L is the number of vertical levels of our model.Each element of a block m n takes the form where the coefficients b n are those of Eq. ( 23) (Berre, 2000, see also the Appendix A4 and p are the level indices of the model) .This is the non-separable formulation of the spatial correlations.By this formulation of , the horizontal correlation coefficients b n depend on the altitude.Moreover, the matrices C v n allow one to take into account different vertical correlations for different wavenumbers.In other words, features with large horizontal scale could have different vertical correlations than features of small horizontal scale (Fisher, 2003).This specification is necessary in meteorology in order to get correct correlations between the mass and wind fields (Phillips, 1986;Bartello and Mitchell, 1992) while this non-separability has not yet been studied in chemistry data assimilation.Note that the vertical correlation matrices C v n are defined independently of the zonal wavenumber m, as found for the horizontal correlation coefficients b n .This formulation of the background matrix is then very low demanding in computer resources.
Although this non-separability formulation could be implemented in the present formulation of the correlation matrix in BASCOE, this has not been done in this study.Here, we have assumed that horizontal and vertical correlations are separable, i.e.
where the horizontal correlation coefficients are taken independent of the altitude and where the vertical correlation matrices are independent of the wavenumber n.
Usually, the b n coefficients and the vertical correlations matrices C v n are estimated or calibrated in order to match the statistical a priori errors of an assimilation system.This can be done using methods that allow one to derive an ensemble of error fields from which these statistics are calculated.We will not describe all these calibration methods and will refer to Bannister (2008a, Sect. 5) for a complete review.It is important to note that the calibration method will require an inverse of the operation S in order to estimate the correlation spectra b n .If one is using an equally spaced model grid, the exact inverse of S is not necessary and the method described in Appendix A2 can be used.Even more simpler is to interpolate the error fields on the Gaussian grid before the inversion.
In the current study, the correlation matrix of BASCOE has not been calibrated.Instead, a priori correlation functions have been used to build it.In the code provided in the Supplement, two functions have been coded for the horizontal correlations: a Gaussian function and a second-order autoregressive (SOAR) function where L = L h /A 0 , A 0 is the Earth's radius and L h is a horizontal correlation length scale.
For the vertical correlations, again, two functions are coded: a hat function and a Gaussian function For the Gaussian case, the distance between two levels |p i − p j | can be measured either in level indices or in kilometers.Moreover, a correlation length scale L v must be provided in the same units as p.
Finally, the square root of C v , which is necessary to build 1/2 (see Eqs. 39 and 40), is obtained by a singular value decomposition (SVD).

Assimilation of a single pseudo-observation
The operators described in Sect. 4 have been implemented in Fortran and tested.This code is provided in a Supplement to this paper.It includes the code for the different operators that define L and L * , as well as two tests.All this material is introduced in Appendix B.
This section presents the results of an assimilation of a single pseudo-observation using the 3D-Var method.The goal is to evaluate the implementation of the B matrix by an assimilation for which the result can be calculated analytically.Here, we choose to assimilate a pseudo-observation with a value of 1.2 and located at a model grid point.The background field x b has a constant value of 1.Both the observation and the background field have an error standard deviation of √ 0.02.With this configuration, the value of J after assimilation is expected to be 1/2.The optimization is performed by the quasi-Newton algorithm M1QN3 (Gilbert and Lemarechal, 1989).
In the assimilation experiments discussed here, the model grid is equally spaced and two different kinds of systems have been used.In the first one, the spectral transform operates form the spectral space to the equally spaced model grid.In the second case, the spectral transform operates from the spectral space to the Gaussian grid and a mapping operation from the Gaussian grid to the model grid is used.In the following part of this section, the analyses obtained with the two systems will be denoted, respectively, LL (i.e.lat/lon grid) and GG (i.e.Gaussian grid).
Several model resolutions, locations of the observation and (horizontal and vertical) correlation functions have been considered in these experiments.Here are shown only the results obtained with specifics but representative configuration of the experiment.This configuration considers a Gaussian correlation function with correlation length scales L h of 600 km horizontally and L v of 3 model levels vertically.The resolution of the model is 120 longitudes by 60 latitudes by 31 levels.Accordingly, the degree of truncation is 59.Three locations of the observation have been considered, namely the Equator, 40 • N and 80 • N. In such a configuration, the theoretical analysis at the observation location is 1.1.From that point, the analysis should decrease according to a Gaussian Table 1.Scores of the single-observation assimilations using two different target grids of the spectral transform: the model lat/lon grid (LL) and the Gaussian grid (GG).The experiments have been performed considering three latitudes of the observations: the Equator, 40 • N and 80 • N. We report the value of the objective function J for the analysis, the analysis at the observation location H (x), the analysis correlation length scales obtained by fitting the analysis cross-sections (see Fig. 1) by a Gaussian function, and the root mean square (RMS) between the analysis and the expected Gaussian function.
Equator 40  bell curve with a standard deviation that corresponds to the correlation length scales, i.e. 600 km horizontally and 3 levels vertically.
Figure 1 displays the results of the experiments that consider the observation at the Equator.It shows three crosssections of the analyses (upper panels) -along the latitude, the longitude and the vertical directions, all of them crossing the observed location.The differences of the analysis cross-sections with the expected cross-sections are shown in the lower panels of Fig. 1.Both analyses present a maximum at the location of the observation.As the distance to the observation location increases, the analyses decrease with a Gaussian shape.The GG analysis has a value at the observation location which is significantly different than the expected value of 1.1.On the other hand, the LL analysis is (visually) closer to this expected value.This is confirmed by the Fig. 1d- where it can be seen that the LL analysis is usually closer to theoretical cross-sections than the GG analysis.
Table 1 reports several scores obtained by the different experiments and their expected theoretical values.For each (computed or theoretical) analysis, the table provides the value of the objective function J , the analysis value at the observation point H (x) and the correlation length scales estimated by a fit of the cross-section lines in Figs.1a-c by a Gaussian function.The table also provides the value of the root mean square (RMS) of the differences between the analyzed and the theoretical cross-section.Compared with the GG analysis, the values of J , H (x) and RMS from the LL analysis are closer to the expected value.The correlation length scales obtained by both analyses are close to each other.However, correlation length scales found by the latitude and longitude cross-sections from the LL experiments are more consistent -they do not differ from more than 1 km.This is not the case for the GG experiments where the correlation length scales from the latitude cross-section is around 620 km while those from the longitude cross-section are around 575 km -around 45 km of differences.
We have also noted that the "fitted" correlation length scales are slightly different from the expected values, around 575 km instead of 600 horizontally and 2.88 levels instead of 3 vertically.The reason for these deviations remains unknown.This issue has not been investigated as we found satisfactory the results of the single assimilation experiments and real data assimilation (see Sect. 6).
Another way of comparing both analyses is to look at the power spectrum of their increments, which are displayed at the model level of the observation location (Fig. 2a).Recall that the analysis increments are the difference between the analysis and the background field and measure the impact of the observations on the analyses.We see that the information provided by the increments of both analyses differs according to the wavenumber.This is even clearer when the horizontal correlation length scale decreases up to 300 km, i.e. a value closer to the resolution of the grid in the region of the observation (Fig. 2b).Taking the LL experiments as references, Fig. 2a-b tell us how the mapping operation from the Gaussian grid to the model grid degrades the GG experiments.At low frequencies (i.e.high horizontal scales), the GG analysis provides more information but decreases more rapidly than the LL analysis, this later providing more information at high frequencies (i.e.small horizontal scales).Therefore, the mapping operation tends to overestimate the spatial correlations for high horizontal scale features and to underestimate the spatial correlations for small horizontal scale features.
Finally, the power spectra of the analysis increments of two additional LL and GG experiments (one each) are displayed on Fig. 2c.These experiments consider a model horizontal resolution four times higher (240 longitudes by 120 latitudes) than the previous experiments.Those experiments have been run in order to get the optimal size of the model resolution.Considering L v = 600 km, one sees that the increments does not provide additional information [%] above wavenumber ∼85.As a consequence, a model resolution of about 180 longitudes by 90 latitudes appears to be optimal.While this result is valid for the LL and GG grid, it is restricted to the correlation length scales of 600 km.
On the other hand, the the optimal resolution increase as L v decrease.For example, with L v = 300 km, one finds (not shown) an optimal truncation of ∼180 or a resolution of 360 longitudes by 180 latitudes.In the next section, assimilation of real data will then be performed using a model horizontal resolution of 180 longitudes by 90 latitudes with a correlation length scales of 600 km.

Assimilation of real data
In this section, the results of two chemical data assimilation experiments are presented.The first one includes correlations in the B matrix while the second one considers a diagonal B. The experiments have been performed by means of the 4D-Var Belgian Assimilation System for Chemical ObsErvations (BASCOE, Errera et al., 2008).In its usual configuration, this system considers 57 stratospheric species advected by the Flux-Form Semi-Lagrangian scheme (Lin and Rood, 1996), 200 chemical reactions and a parameterization of the physico-chemical processes due to the so-called Polar Stratospheric Clouds (PSCs).However, in this study, only the advection of ozone is considered (i.e. the chemical and PSC schemes have been turned off) in order to reduce the CPU time.By doing this, it is assumed that ozone behaves like an inert tracer.This is a fair assumption by choosing an assimilation window of one day and by excluding observations above 1 hPa where the ozone time scale is shorter than the assimilation window.
The horizontal resolution is set to 2 • ×2 • lat/lon grid.This is higher than past experiments based on 3.75 • × 5 • lat/lon (Geer et al., 2006;Errera et al., 2008;Viscardy et al., 2010) or 2.5 • ×3.75 • lat/lon grid (Lahoz et al., 2011, and the BASCOE Near Real Time service -http://macc.aeronomie.be).The vertical grid is represented by the BASCOE usual 37 vertical levels from the surface to 0.1 hPa, these levels being a subset of the ECMWF levels.In these experiments, the dynamics is provided by the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim analyses (Dee et al., 2011).The assimilated ozone data are taken by the Michelson Interferometer for Passive Atmospheric Sounding (MIPAS) ESA off-line level 2 profiles between August and October 2003, as in BASCOE past experiments discussed by Geer et al. (2006) and Errera et al. (2008).MIPAS is a limb fourier transform spectrometer onboard the Envisat platform operating between 2002 and 2012.Measuring in the infrared, limb scans are inverted to provide profiles of numerous trace gases, including ozone (Fischer et al., 2008).
Both experiments described in this paper consider 30 % standard deviation in the background fields.The experiment in which spatial correlations are introduced -denoted COR-REL -considers a Gaussian correlation function in the horizontal and vertical directions with a correlation length scale L h = 600 km and L v = 1 level, respectively.Note that this combination of model horizontal resolution and horizontal correlation length scale is optimal as mentioned in Sect. 5.The experiment using a diagonal B is denoted DIAG.
Figure 3 shows the Observations-minus-Forecast (OmF) statistics for both runs and for the period of the 2003 Antarctic ozone hole (September-October 2003) and for five latitude bands as explained in Errera et al. (2008, Sect. 3.4).Generally, the biases of the OmF obtained by the CORREL run are lower than those obtained by the DIAG run.This is significant in the upper troposphere/lower stratosphere (UTLS) region, between 70 and 300 hPa, particularly so in the southern polar region.Regarding the standard deviations, the statistics of both runs are very similar except at the South Pole.
To understand the origin of the differences between both runs, their analysis increments on September 15 at ∼44 hPa (i.e. at model level 21) are shown in Fig. 4. In the DIAG increments, the track of the satellite is clearly visible and the increments are confined around the observation locations.On the other hand, the CORREL increments are spread over a much larger area, especially in southern high latitudes, thanks to the non-diagonal nature of B. As shown by the OmF statistics, this improves the assimilation.
To assess the improvements at the southern high latitudes brought by the non-diagonal B, both experiments have been compared with ozonesonde observations at three NDACC (Network for the Detection of Atmospheric Composition Change) stations in Antarctica (Amundsen-Scott, Mac Murdo and Neumayer).Figure 5 presents the time series of the ozone partial column between 10 and 100 hPa during the formation of the ozone hole in the 2003 Antarctic winter, along with the corresponding partial column from the analyses.As expected, the CORREL run agrees much better with the observations than the DIAG run.As a result, CORREL exhibits a deeper ozone hole than DIAG which is in better agreement with observations provided by the Total Ozone Mapping Spectrometer (TOMS) (Fig. 6).
The comparison of both runs against ozone sondes in the UTLS has not revealed any "winner" or "looser" (not shown).found that MIPAS O 3 values are 5 % to 25 % larger compared to the majority of the other datasets (Cortesi et al., 2007).Removing the bias in the MIPAS data could help to improve the analyses in this region.Nevertheless, this task has been left for future studies.Moreover, the parameters of B in the COR-REL run have not been optimized.For instance, nothing tells us that the background error standard deviation is 30 % of the first guess field, that the correlations have a Gaussian shape neither that the length scales are 600 km horizontally at every model level or 1 level vertically.In principle, a calibrated B should improve the analyses.Several methods have been developed by the NWP centers and are reviewed by Bannister (2008a, Sect. 5).This optimization effort has not been implemented in the present study.Indeed, we aimed to focus on the implementations of the correlations and not essentially on the BASCOE results.We nevertheless plan to present the optimization of the parameters of B in future papers.

Conclusions
There are several outcomes from this work.First, we have gathered together all the material published so far in order to provide a stand alone document describing the spherical harmonic representation of the background error covariance matrix (the B matrix) in the case of univariate assimilation.The motivation of a spherical harmonic representation of B is based on the fact that, if horizontal correlations are assumed homogeneous and isotropic, B can be represented in the spectral space by a block diagonal matrix with repeated matrices for each zonal wavenumber m, which makes its numerical computation feasible.Horizontal and vertical correlations can be implemented in two ways.The first one consists of providing correlation functions for the vertical and the horizontal directions (not necessarily the same) along with correlation length scales.In the second method, these correlation matrices are calibrated using optimization methods such as those summarized by Bannister (2008a, Sect. 5).This latter method allows the implementation of non-separable spatial correlations.This has proved to be important in meteorology but has never been studied in chemistry.
The second outcome of this study is to provide the numerical Fortran code of the formulation of the B matrix (see the Supplement of this paper).During the course of this implementation, we realized that only the adjoint of the spectral transform operator was necessary for variational assimilation, not the inverse.The problem with the inverse operator is to restrict the target grid of the spectral transform to the (nonequally spaced) Gaussian grid.On the other hand, the adjoint operator does not suffer this limitation thus allowing the use of the (equally spaced) model grid.In this way, there is no need to implement a grid transformation from the Gaussian grid to the lat/lon (model) grid, which degrades the analyses.
The spherical harmonic representation of B has been implemented in the 4D-Var stratospheric chemical assimilation system BASCOE using the Michelson Interferometer for Passive Atmosphere Sounding (MIPAS) ozone observations between August and October 2003.Two experiments were performed: the one considering a spectral formulation of B and the other considering the diagonal B (i.e.spatially uncorrelated) implemented so far in BASCOE.In those experiments, the chemical and PSC schemes of BASCOE were turned off such that the influence of ozone observations on other modeled constituents has not been studied.At southern high latitudes and during the ozone hole period, the spatially correlated B allows a great improvement of the analyses with respect to the uncorrelated one.The correlations allow one to increase the size of the analysis increments which allow one to produce ozone fields in good agreement against ozonesondes and total ozone observations, which is not the case for the analyses calculated with the diagonal B.
In the future, two studies are planned.The first one is to adapt this formulation for BASCOE in the case where the chemistry module is turned on and to evaluate the improvements to constituents other than ozone.The second study consists of estimating the correlation matrices by one of the methods reviewed by Bannister (2008a, Sect. 5) for separable and non-separable spatial correlations.In the atmosphere, the constituent's concentrations can vary with height by several orders of magnitude and chemical regimes can completely change with altitude, for instance across the tropopause and stratopause.For those reasons, the horizontal and vertical lengths scales are likely to be strongly dependent on height.Thus a non-separable three-dimensional error correlation matrix could provide a better representation of the background error for chemical variables than a separable matrix.
m n [ m n ] * = b n δ n n δ m m (23) † And not 2 √ and R. Ménard: Spectral representation of background error covariances where χ is a new control variable, δx is the analysis increment and L is the square root of B: B = LL T .(31) 1. Initialization: read x b and y, set χ = 0 2. Calculate J b by Eq. (32) 3. Get x = Lχ + x b and calculate J o by Eq. (26) 4. Get J = J b + J o 5. Calculate ∇ χ J b by Eq. (35) 6. Calculate ∇ x J o by Eq. (29) 7. Get ∇ χ J = ∇J b + L * ∇ x J o 8. Provide χ , J and ∇ χ J to the minimizer and update χ 9.If convergence, get x = Lχ +x b and start a forecast until the next analysis time.Otherwise, go to step 2.4 Formulation of L and its adjoint L *

−Fig. 1 .
Fig.1.Cross-sections of the analyses obtained by the assimilation of a single pseudo-observations located at the Equator using Gaussian spatial correlation function and two target grids of the spectral transform: the lat/lon grid (blue line) and the Gaussian grid (red line).The correlation length scales are L h = 600 km horizontally and L v = 3 model levels vertically.The three top panels show the cross-section of theses analyses along the latitude (a), the longitude (b) and the altitude (c), all crossing the observation location.The horizontal black lines at value 1.1 indicate the theoretical analyses at the observation location.The bottom panels display the differences between the analysed cross-sections and the expected cross-sections.

Fig. 2 .
Fig. 2. Power spectra of the analysis increments χ at the observation level obtained using two different spectral grids -the lat/lon grid (blue line) and the Gaussian grid (red line).Panels (a) and (b) are obtained using a correlation length scales of L h = 600 km (a) and 300 km, respectively.In both cases, the experiments are based on a model resolution of 120 longitudes by 60 latitudes and 31 levels.Results on panel (c) are obtained with L h = 600 km and a model resolution of 240 longitudes by 120 latitudes and 31 levels.

Fig. 3 .Fig. 4 .
Fig. 3. Bias (top row) and standard deviation (bottom row) of the Observations-minus-Forecast (OmF) statistics between the analyses and the MIPAS data, in percent.Statistics are done at five latitude bands denoted in the title of each plot.Blue and red lines represent the statistics obtained, repectively, by the CORREL and DIAG runs.

Fig. 5 .
Fig. 5. Time series of the ozone partial column (10-100 hPa) between August and October 2003 obtained above three NDACC stations in Antarctica by the ozone sondes (black circles), and the runs DIAG (red line) and CORREL (blue line), in Dobson unit (DU).

Fig. 6 .
Fig. 6.Total ozone in the Southern Hemisphere on 1 October 2003 obtained by the runs DIAG (a) and CORREL (b), and observed by the Total Ozone Measurement Satellite (TOMS) (c), in Dobson unit (DU).