Modelling representation errors of atmospheric CO2 mixing ratios at a regional scale

Inverse modelling of carbon sources and sinks requires an accurate quality estimate of the modelling framework to obtain a realistic estimate of the inferred fluxes and their uncertainties. So-called “representation errors” result from our inability to correctly represent point observations with simulated average values of model grid cells. They may add substantial uncertainty to the interpretation of atmospheric CO2 mixing ratio data. We simulated detailed variations in the CO2 mixing ratios with a high resolution (2 km) mesoscale model (RAMS) to estimate the representation errors introduced at larger model grid sizes of 10–100 km. We found that meteorology is the main driver of representation errors in our study causing spatial and temporal variations in the error estimate. Within the nocturnal boundary layer, the representation errors are relatively large and mainly caused by unresolved topography at lower model resolutions. During the day, convective structures, mesoscale circulations, and surface CO2 flux variability were found to be the main sources of representation errors. Interpreting observations near a mesoscale circulation as representative for air with the correct footprint relative to the front can reduce the representation error substantially. The remaining representation error is 0.5–1.5 ppm at 20–100 km resolution.


Introduction
Correspondence to: L. F. Tolk (lieselotte.tolk@falw.vu.nl)
Understanding the variation in atmospheric CO2 concentration is key to the prediction and quantification of global climate change. Terrestrial CO2 fluxes have a major impact on the CO2 mixing ratios and it is therefore important to understand their spatial and temporal variation. Since the atmosphere is, in the short term, an incomplete mixer of the CO2 surface fluxes, observations of CO2 mixing ratios can be used to quantify the magnitude and strength of the surface fluxes. At the global scale, such inversion studies have increased our knowledge about the terrestrial source-sink distribution, but exact estimates of the sources and sinks still vary considerably (e.g. Fan et al., 1998; Bousquet et al., 1999; Gurney et al., 2002; Rödenbeck et al., 2003; Baker et al., 2006). The differences between the results of the various studies are associated with errors in the simulated atmospheric transport (Gurney et al., 2002; Yang et al., 2007; Stephens et al., 2007), aggregation of the surface fluxes over large areas (Kaminski et al., 2001), errors due to a poor representation of the diurnal and seasonal covariance of the surface fluxes with the boundary layer height, i.e. "rectification" errors (Denning et al., 1996; Perez-Landa et al., 2007; Ahmadov et al., 2007), and errors introduced by the assumption that a point observation can be represented by the average CO2 mixing ratio in a model grid box, i.e. representation errors (Gerbig et al., 2003a, b; Lin et al., 2004; Van der Molen and Dolman, 2007; Corbin et al., 2008).
These errors may be reduced by increasing the resolution of global atmospheric transport models or by employing high resolution regional models (e.g. Peters et al., 2004; Karstens et al., 2006; Geels et al., 2007; Perez-Landa et al., 2007; Sarrat et al., 2007a, b; Ahmadov et al., 2007). With higher resolutions the simulated CO2 mixing ratios are potentially more accurate, because more small scale phenomena that cause variations in the CO2 distribution are explicitly resolved. This becomes increasingly important as observations in the boundary layer are used to constrain surface fluxes in more detail at the regional scale (e.g. Carouge, 2006; Lauvaux et al., 2008; Peters et al., 2007; Zupanski et al., 2007).
One important error associated with the use of continental CO2 mixing ratio observations in inversions is studied here in more detail: the representation error (RE). Previous studies showed that the error can be substantial when a grid cell at relatively coarse resolutions over the continent is assumed to be representative of a point observation (Gerbig et al., 2003a, b; Van der Molen and Dolman, 2007). In global scale inversions, large REs can be avoided by selecting "background" observations and rejecting observations that are influenced too strongly by local sinks and sources (Houweling et al., 2000). However, in smaller scale inversions, observations over the continent are used to constrain the fluxes. The RE due to subgrid variability in the mixing ratios over the continent must thus be taken into account. From an analysis of aircraft profiles in the COBRA experiment in North America, Gerbig et al. (2003a) suggested that models may require horizontal resolution smaller than 30 km to capture the most important spatial variations of atmospheric CO2 in the boundary layer over the continent. Van der Molen and Dolman (2007) found comparable results in a modelling study over Siberia.
In this paper, REs are studied in more detail using a high resolution mesoscale model. This provides the opportunity to assess the spatial and temporal distribution of the RE and its variability due to meteorological circumstances and surface properties. We aim to determine the main features and the major causes of REs at resolutions from 10 to 100 km. In Sect. 2, the model configuration and the calculation of the RE are described. Section 3 shows the results of the simulations and the main contributors to REs. These are discussed in Sect. 4, where some options to reduce the RE are also addressed.

Representation error calculation
In this study, the RE is estimated based on the spatial variability of the CO2 mixing ratio simulated with the Regional Atmospheric Modelling System (RAMS) at 2 km resolution. We calculate the error introduced when trying to represent the values in the 2 km resolution set with the mean value at a coarser resolution of 10, 20, 50 and 100 km. The calculations are performed at an hourly time step, independently for each of the vertical model levels, using terrain following grid boxes. The RE is defined by the standard deviation of the CO2 mixing ratio simulated at 2 km resolution within the coarser grid boxes of 10×10, 20×20, 50×50 and 100×100 km:

$$\sigma_{\mathrm{CO_2}} = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - x_{\mathrm{coarse}}\right)^2} \tag{1}$$

where $\sigma_{\mathrm{CO_2}}$ is the RE, $n$ is the number of 2 km resolution grid cells within the coarser grid cell, $x_i$ is the CO2 mixing ratio of the 2 km resolution grid cell, and $x_{\mathrm{coarse}}$ is the average CO2 mixing ratio within the coarser grid cell. Additionally, the RE is calculated based on linear interpolation of the low resolution values. In these calculations $x_{\mathrm{coarse}}$ is replaced by the interpolated CO2 mixing ratio $x_{\mathrm{interpol}}$. This is a linear interpolation between the mean of the grid cell and its adjoining cells. We excluded the calculation of RE interpol at 100 km resolution because the boundary cells required for this calculation cover almost the full domain. The size of the statistical sample ($n$) from which the RE is calculated ranges from a minimum of 25 in the 10×10 km case to a maximum of 2500 in the 100×100 km case. The calculation of the RE follows the approach of Gerbig et al. (2003a), Lin et al. (2004) and Van der Molen and Dolman (2007). Note that in the studies of Gerbig et al. (2003a) and Lin et al. (2004) the size of the statistical sample was not limited due to the large number of observations, and the relatively small measurement error was included, which does not exist in our model study.
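The block-wise calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the code used in the study: the function name `representation_error`, the list-of-lists field layout, and the use of the sample (n − 1) form of the standard deviation are assumptions made for this example.

```python
import math


def representation_error(field, block):
    """Standard deviation of fine-grid values inside each coarse grid box.

    field: 2-D list (rows x cols) of fine-resolution CO2 mixing ratios in ppm.
    block: fine cells per coarse-box side (e.g. 5 for 2 km cells in a 10 km box).
    Returns a 2-D list with one RE value (ppm) per coarse box.
    Assumption: the sample (n - 1) normalisation is used.
    """
    if block < 2:
        raise ValueError("block must be at least 2")
    n_rows, n_cols = len(field), len(field[0])
    if n_rows % block or n_cols % block:
        raise ValueError("field dimensions must be multiples of block")

    result = []
    for r0 in range(0, n_rows, block):
        row_out = []
        for c0 in range(0, n_cols, block):
            # Collect all fine cells that fall inside this coarse box.
            values = [field[r][c]
                      for r in range(r0, r0 + block)
                      for c in range(c0, c0 + block)]
            n = len(values)
            mean = sum(values) / n  # coarse-box mean mixing ratio
            variance = sum((x - mean) ** 2 for x in values) / (n - 1)
            row_out.append(math.sqrt(variance))
        result.append(row_out)
    return result
```

With 2 km cells, `block=5` gives the 25-cell sample of the 10×10 km case and `block=50` the 2500-cell sample of the 100×100 km case. In the study the calculation is repeated for every hour and every vertical model level; here that would mean calling the function once per horizontal slice.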

Simulation setup
The atmospheric simulations are performed with the nonhydrostatic mesoscale model RAMS (Pielke et al., 1992), which has been used to simulate the behaviour of CO2 in the atmosphere in a number of studies (e.g. Denning et al., 2003; Nicholls et al., 2004; Sarrat et al., 2007b; Perez-Landa et al., 2007; Corbin et al., 2008). The version used in this study is BRAMS-3.2, including the adaptations to secure mass conservation (Medvigy et al., 2005; Meesters et al., 2008). The surface fluxes are calculated using Leaf-3 (Walko et al., 2000), which was extended with the Farquhar photosynthesis model (Farquhar et al., 1980) to calculate surface fluxes of CO2. The standard vegetation parameters of Leaf-3 are used and complemented with the maximum rate of Rubisco activity (V_cmax) based on values from Wullschleger (1993). Respiration is simulated with an exponential (Q10) temperature-respiration relationship, in which the Q10 and R0 values as estimated by Van Dijk and Dolman (2004) are used. Further specifications of the simulations are given in Table 1.
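For readers unfamiliar with the exponential Q10 relationship mentioned above, a common textbook form is R = R0 · Q10^((T − T_ref)/10). The sketch below implements that form. The function name, the reference temperature of 10 °C, and the numbers in the usage note are illustrative assumptions; they are not the parameter values of Van Dijk and Dolman (2004), and the exact formulation in the model may differ.

```python
def q10_respiration(temp_c, r0, q10, t_ref_c=10.0):
    """Respiration flux from an exponential Q10 temperature relationship.

    temp_c:  temperature in degrees Celsius.
    r0:      respiration at the reference temperature (any flux unit).
    q10:     factor by which respiration increases per 10 degree warming.
    t_ref_c: reference temperature in degrees Celsius (assumed 10 here).
    """
    return r0 * q10 ** ((temp_c - t_ref_c) / 10.0)
```

For example, with the hypothetical values `r0=2.0` and `q10=2.0`, a 10 °C warming above the reference temperature doubles the respiration flux to 4.0 in the same unit.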
The simulations are performed for two days of the CERES experiment in south-western France (Fig. 1) in spring 2005. See Dolman et al. (2006) for further details of this experiment. The 300×300 km domain with a resolution of 2 km is nested in a 1200×1200 km domain at 10 km resolution. It is bounded in the west by the Atlantic Ocean and in the south by the Pyrenean mountain massif, with peaks higher than 3000 m. The area is characterized by several large areas of homogeneous land cover, with the Les Landes pine forest in the west, woods and pastures in the northeast, and large areas of cultivated plots in the rest of the domain (Fig. 1). Two major cities are located in the southeast (Toulouse) and northwest (Bordeaux) corners of the domain.
In this study two different days were simulated to compare the influence of different meteorological circumstances on the REs. The first day, 27 May 2005, was very warm with anti-cyclonic clear sky conditions. The wind came mainly from the southeast and allowed the formation of a sea breeze in the afternoon. On the second selected day, 6 June 2005, north-western winds prevailed. This day was cooler, and some cumulus clouds formed in the afternoon in the northern part of the domain. The two selected days were part of intensive observation periods within the CERES campaign. The simulations were compared to the available observations and the most important findings were described in the model intercomparison of 5 mesoscale models by Sarrat et al. (2007b). That comparison shows the ability of the models to represent the atmospheric CO2 distribution satisfactorily, in general agreement with the observations. They conclude that the complex spatial distribution as well as the temporal evolution of CO2 in interaction with the surface fluxes are realistically simulated compared to the aircraft observations. Our model, BRAMS-3.2, performed satisfactorily in most aspects. Any possible further influences of discrepancies between the simulations and the observations on the estimate of the RE are addressed in the discussion section.
The surface fluxes in the standard simulations are calculated based on the Pelcom land use map with a homogeneous LAI per land use class (http://www.geo-informatie.nl/projects/pelcom/). To test the sensitivity of the RE to the formulation of the surface cover, land use maps derived from the Modis satellite data were used (http://modis-land.gsfc.nasa.gov/vi.htm), where the LAI can vary per pixel. Additionally, simulations were performed in which the CO2 flux was prescribed as spatially homogeneous and varying only with time.
The standard simulations for the two days were thus kept very similar, i.e. with similar land use maps and LAI and similar initialization of the CO2 mixing ratio, so that the differences between the two days represent the influence of meteorology on the RE.

Results
Our simulations show that a number of processes contribute to the total RE, and their relative contribution differs between the two simulated days. REs are always associated with strong horizontal gradients in the CO2 distribution. There are large variations across the spatial domain due to differences in land-surface type and topography. In the next sections, we will separately discuss each process contributing to the total RE, which spans a range from 0.5 ppm to as much as 10 ppm. First, we will illustrate the dominant mesoscale circulation patterns to provide appropriate background for the analysis.

Mesoscale circulations
The simulations show that the RE has a large spatial and temporal variability. The two simulated days show a clear distinction, with the REs during the day on 27 May exceeding those on 6 June. The largest difference between the two days is the synoptic wind direction, which is from the southeast on 27 May and from the west on 6 June. On 27 May mesoscale circulations formed, whereas these were suppressed on 6 June.
During the night of 27 May the south-eastern wind moved air with a high CO2 mixing ratio, caused by respiration, from the land over the sea. Since the CO2 fluxes over the sea are relatively small, the CO2 mixing ratios remain high there during the following day. In the course of the day a sea breeze developed. On 27 May the direction of the sea breeze was opposite to the synoptic wind direction. The converging winds led to the formation of a front (Fig. 2a). A gradient of about 10 ppm formed perpendicular to the coast line, between the high nocturnal mixing ratios over the ocean and the depleted mixing ratios over the land. This was also described by Dolman et al. (2006), Sarrat et al. (2007a) and Ahmadov et al. (2007).
Additionally, the relatively high sensible heat flux above the forest resulted in a deep boundary layer compared to its surroundings. On 27 May both the large scale and sea breeze wind directions are directed towards the forest. This prevented advection of the deep boundary layer that formed above the forest over the rest of the domain.
In contrast, on 6 June the synoptic wind was directed from sea to land and similar to the main direction of the sea breeze. Neither advection of the nocturnal high mixing ratios from land to sea, nor the formation of a convergence zone during the day takes place (Fig. 2b). The effects of the sea breeze on the CO2 mixing ratio are thus suppressed by the westerly wind on 6 June. Also, the high boundary layer over the forest is advected. This eliminates the strong contrast between the depth of the boundary layer over the forest and its surroundings. On 6 June air with background CO2 mixing ratios is advected from the ocean over the land, where it is depleted due to CO2 uptake at the surface during the day (Fig. 2b).

Representation errors due to mesoscale circulations
The large mixing ratio contrasts over small distances induced by mesoscale circulations may lead to a large RE. On 27 May a higher RE is simulated than on 6 June (Fig. 3). At locations that are not affected by mesoscale circulations, the REs on 27 May are comparable to those on 6 June. The high RE on 27 May is located in grid cells near the edges of the convergence zone (Fig. 4). Near the front an RE of ∼2.5 ppm is found at 10 km resolution, and ∼5.5 ppm at 100 km resolution.
On 27 May, the CO2-depleted air from the boundary layer is lifted in the convergence zone. This leads to a band with high REs along the eastern edge of the convergence zone, which dominates the average RE over the domain. The highest REs during the day are found around the top of the boundary layer next to the convergence zone (Fig. 5).
Also in the rest of the domain and on 6 June the RE is high around the mean height of the top of the boundary layer (Fig. 5b). This is a result of the difference between convection cells and their surroundings, causing strong horizontal gradients between the depleted boundary layer and free tropospheric air. On 6 June this effect stretches over a larger vertical range (Fig. 5b), causing horizontal variations in the CO2 mixing ratio up to 3000 m height during the day.
Over the sea the boundary layer is very shallow and therefore the largest REs are limited to the lower part of the atmosphere. For example, the RE at 250 m height as shown in influenced by nocturnal land fluxes and transported from land during the night causing high REs. The high mixing ratios from the land contrast here with the background mixing ratios over the sea. On 6 June, the wind from the ocean brings "background" air which is not influenced by any strong near field terrestrial fluxes. The REs over most of the sea are therefore very small on 6 June.

Representation errors in the free troposphere
The free tropospheric RE is caused by CO2 mixing ratio gradients in the residual boundary layer (Fig. 5). During the evening of 27 May, the convergence zone due to the sea breeze remains intact until the temperature of the land decreases towards the sea water temperature. Until then, the boundary layer air is forced to rise to higher altitudes. Combined with the influence of the upwind mountains, this leads to an ongoing increase in the RE at higher altitudes (Fig. 5a). In this case, the variations in the free troposphere will still exist at the beginning of the next day.
On the other hand, analyses of the RE for 6 June show that after the collapse of the boundary layer at the end of the day, the relatively strong horizontal winds above the nocturnal boundary layer cause mixing. This diminishes the horizontal variations in the CO2 mixing ratio at altitudes that are no longer influenced by the surface fluxes, and the RE in the residual boundary layer gradually decreases during the evening and the night (Fig. 5b). In this case, the RE for the next day will hardly be affected by the residual boundary layer. This is typical for cases without special disturbances. It was also implicitly assumed when a homogeneous CO2 mixing ratio was chosen to initialize our simulation.

Representation errors due to topography
During the night the presence of the high mountains of the Pyrenees in the south of the domain strongly influences the RE at lower altitudes. In the simulations for both days a band with high CO2 mixing ratios accumulates at the foot of the Pyrenees. It strongly contrasts with the lower mixing ratios in the flatter areas and at the tops of the mountains. This leads to a high RE near the surface during the night. After sunrise, over the land the high CO2 mixing ratios at the foot of the Pyrenees are decreased due to the growth of the boundary layer, entrainment and CO2 uptake at the surface. Over the sea, the high CO2 mixing ratios are preserved in the shallow boundary layer and lead to enhanced REs during the whole day.
Smaller scale topographic features also induce RE. During the night, accumulation of CO2 is simulated in valleys with up to a hundred meters altitude difference. The CO2 mixing ratio variations induced by small scale variations in the surface altitude, remote from the Pyrenees, cause an RE of 0.5–3 ppm at 10 km resolution, and ∼3 ppm at 100 km resolution. The exact RE due to topography will depend on the specific meteorological circumstances and on the strength of the surface CO2 fluxes. After sunrise, the gradients in the CO2 mixing ratios formed during the night decrease and consequently the representation error is reduced near the surface.

Representation errors due to flux variability
During the day, areas of contrasting vegetation types and accompanying contrasting CO2 fluxes cause large scale variations. An increase in the stomatal conductance causes a simultaneous increase of the CO2 flux and the transpiration, at the expense of the sensible heat flux. This leads to an inverse relation between the CO2 fluxes and the boundary layer depth. The surface signal is diluted more strongly at locations with a deep boundary layer. CO2 mixing ratio gradients introduced by CO2 flux heterogeneity strengthen this effect and increase the RE. The CO2 flux variability also strengthens the effect of the mesoscale circulations. Ahmadov et al. (2007) showed the covariance (3D-rectifier) between the mesoscale circulations and the CO2 fluxes. The unresolved flux variability within one grid cell increases with the size of the grid cells. Its influence therefore becomes more pronounced at lower resolutions.

Relative contribution of the representation error components
The relative importance of the different sources of RE depends on the resolution, time and location. When a mesoscale circulation develops, as on 27 May, it overwhelms the other sources of RE at all resolutions in the grid cells near the front (e.g. Fig. 4). With decreasing resolution, the high RE caused by the sea breeze front influences a larger area as the size of the grid cells increases. Without sea-breeze circulation, the main source of RE along the coast (see Fig. 1) is the gradient between the mixing ratios over land and sea. Here linear interpolation can reduce the RE substantially, as is shown for 6 June in Fig. 6. Further inland (Fig. 1), linear gradients are less important and RE interpol differs only slightly from the RE based on the standard means (Fig. 6).
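Why linear interpolation helps where the variability is mostly a smooth gradient, such as across a coastline, can be illustrated with a one-dimensional sketch. This is a simplified illustration under stated assumptions, not the two-dimensional calculation used in the study: the function names, the 1-D transect, the piecewise-linear interpolation between coarse-box centres, and the (n − 1) normalisation are all choices made for this example.

```python
import math


def coarse_means(values, block):
    """Mean of each consecutive group of `block` fine cells along a transect."""
    return [sum(values[i:i + block]) / block
            for i in range(0, len(values), block)]


def re_interpolated(values, block):
    """Deviation of fine values from linearly interpolated coarse means.

    values: fine-resolution mixing ratios (ppm) along a 1-D transect whose
            length is a multiple of `block`.
    block:  fine cells per coarse box (at least 2).
    Returns one value per interior coarse box; edge boxes are skipped
    because they lack a neighbour on one side.
    """
    if block < 2 or len(values) % block:
        raise ValueError("need block >= 2 and len(values) a multiple of block")
    means = coarse_means(values, block)
    result = []
    for k in range(1, len(means) - 1):
        squared_sum = 0.0
        for j in range(block):
            # Offset of the fine-cell centre from the coarse-box centre,
            # measured in coarse-box widths (between -0.5 and 0.5).
            t = (j + 0.5) / block - 0.5
            if t < 0:
                interpolated = means[k] + t * (means[k] - means[k - 1])
            else:
                interpolated = means[k] + t * (means[k + 1] - means[k])
            squared_sum += (values[k * block + j] - interpolated) ** 2
        result.append(math.sqrt(squared_sum / (block - 1)))
    return result
```

For a transect with a perfectly linear gradient, the interpolated value matches every fine cell and the function returns zero for each interior box, whereas the deviation from the plain coarse-box mean stays non-zero. Convective or front-like structures are not linear at the coarse scale, so interpolation removes much less of their contribution.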
We performed two simulations with spatially homogeneous land CO2 fluxes: one with land CO2 fluxes comparable to the forest fluxes and one with fluxes comparable to those of the crops. The REs in the simulations with homogeneous fluxes are mainly caused by convective structures. Gradients between the updraft and downdraft mixing ratios hamper the estimate of the mean boundary layer mixing ratio (e.g. Weckwerth et al.; Lothon et al., 2007). The strength of the gradient, and thus the RE, depends on the strength of the surface fluxes (Fig. 7). The simulations showed that the RE due to convective structures is about 0.05–0.5 ppm at 10 km resolution (Fig. 7). These values are comparable to the uncertainty of 0.2 ppm estimated from the COBRA data (Gerbig et al., 2003a). At high resolutions the RE is dominated by these CO2 mixing ratio variances due to convective structures.
At lower resolutions, the RE due to CO2 surface flux variability is, next to mesoscale circulations, the major source of REs. At 100 km resolution the RE simulated with heterogeneous fluxes is higher than the RE from either of the simulations with homogeneous CO2 fluxes (Fig. 7). On 6 June, the mean CO2 flux over the domain is comparable to the average of the forest and crop CO2 fluxes. The mean of the mixing ratio variances (RE²) of the two different homogeneous flux simulations compares well with the RE² of the heterogeneous flux simulation (Fig. 7). Their difference increases with resolution. An estimate of the relative contribution of the flux variability to the RE is indicated in Fig. 7 with the grey area.
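One possible way to turn the comparison of variances described above into a number is sketched below. It is an interpretation rather than the authors' procedure: it assumes that the excess of the heterogeneous-flux RE² over the mean RE² of the two homogeneous-flux simulations can be attributed to flux variability. The function name and the values in the usage note are hypothetical.

```python
def flux_variability_share(re_het, re_forest, re_crop):
    """Fraction of the heterogeneous-flux variance (RE squared) that is not
    explained by the mean variance of the two homogeneous-flux simulations.

    All inputs are representation errors in ppm for the same grid box,
    time and resolution. Returns a fraction between 0 and 1.
    """
    var_het = re_het ** 2
    var_hom = 0.5 * (re_forest ** 2 + re_crop ** 2)
    if var_het <= 0.0:
        return 0.0
    # Negative differences are treated as "no flux-variability contribution".
    excess = max(var_het - var_hom, 0.0)
    return excess / var_het
```

With hypothetical REs of 2.0 ppm (heterogeneous), 1.0 ppm (forest-like) and 1.0 ppm (crop-like), the share is 0.75, i.e. three quarters of the variance would be attributed to flux variability under this assumption.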

Discussion and conclusions
The RE of the atmospheric CO2 mixing ratio at regional scales is found to be substantial. We concur with earlier studies (Gerbig et al., 2003a, b; Lin et al., 2004; Van der Molen and Dolman, 2007; Corbin et al., 2008) that it is a source of uncertainty that should be taken into account in inversion studies to avoid biased results. The order of magnitude of the RE above the continent appears to be comparable to other sources of uncertainty such as transport and rectification errors (Gurney et al., 2002; Yang et al., 2007; Stephens et al., 2007; Denning et al., 1996; Perez-Landa et al., 2007; Ahmadov et al., 2007). The most important contributions to the RE in our small domain during the day come from surface flux variability, convective atmospheric structures and mesoscale transport phenomena such as the land-sea breeze.
Our numbers are similar to the REs estimated in previous studies (Fig. 3). In the study by Gerbig et al. (2003a, b) the RE was estimated for many different areas within the US based on experimental and theoretical evidence. Our results for 6 June are slightly higher but comparable to these more general estimates of the RE. Coastal areas, like the Les Landes area studied here, are special because of the often occurring sea-breeze circulations as seen on 27 May. The higher domain averaged REs in that situation are more comparable with the REs found in the model study by Van der Molen and Dolman (2007) over Siberia, where mesoscale circulations also played an important role.
The results do not depend on the choice of the land cover map; simulations with the Pelcom and Modis land use maps (Table 1) give the same main sources of RE. The boundary layer height on 6 June was simulated correctly compared to the observations. On 27 May the boundary layer height was underestimated by the model at some locations (for details see Sarrat et al., 2007b). Since the strength of the vertical mixing determines the dilution of the surface signal in the atmosphere, the occasional underestimation of the boundary layer depth may have led to a slight overestimation of the REs in this study. The assumption that the simulation at 2 km resolution captures all variability in the CO2 mixing ratio may on the other hand lead to a small underestimation of the RE. Hence, the absolute values of the RE in this study must be handled with caution. The processes we found to cause the RE are robust and the difference of the RE between the two days is larger than the sensitivity to the model settings. Therefore, it seems justified to use the simulations as a basis for a qualitative analysis of the RE.
Because of the heterogeneity of the RE in time and space, we recommend using a time and place dependent error estimate in inversions. Within the boundary layer, the RE is lowest during the day in the well mixed part. This is thus the best location and time to get a representative sample. Observations around the top of the boundary layer should be avoided as the RE is high there. Near land-sea and other surface cover contrasts the RE can be reduced substantially by the use of linear interpolation instead of a simple mean of the coarse grid results (Fig. 6).
During the day the largest REs in our simulations were associated with the sea breeze front caused by the sharp contrast between air masses with different flow histories. A correct interpretation of a CO2 mixing ratio observation as representative for air with a terrestrial footprint or as representative for the sea breeze can reduce the RE. The measured wind direction and a possible change in the observed CO2 mixing ratio as the sea breeze reaches the observation location may be used as indicators.
At night, unresolved topography is the main source of REs. The simulated accumulation of CO2 in the valleys is in line with the findings of previous model (Nicholls et al., 2004; Van der Molen and Dolman, 2007) and observational studies (Eugster and Siegrist, 2000; Araújo et al., 2008; Goulden et al., 2006; Aubinet et al., 2003) which show that near surface cooling leads to katabatic drainage flow of CO2-rich air. Nocturnal observations at high mountains may after data selection be taken as representative of the CO2 mixing ratio at their height above sea level (e.g. Schmidt et al., 2003; Geels et al., 2007) with accompanying relatively low free tropospheric REs.
Katabatic drainage due to small scale topography of up to 100 m altitude leads to mixing ratio gradients within the nocturnal boundary layer. Mixing ratios in the valleys are enhanced by accumulation of respired CO2, while mixing ratios at higher parts are reduced. The signal may be advected and can also affect the observations downwind of the small scale topography. With moderate unresolved topography we therefore advise being aware of a possible bias in the observations, and taking an RE of ∼3 ppm, depending on the flux intensity and the stability of the boundary layer, into account during the night.
Extra towers can give a better constraint on the average CO2 mixing ratio and reduce the RE. Our simulations indicate that even at a relatively high resolution (10 km) the RE over the land exceeds the error introduced by the measurement accuracy aimed for at high accuracy stations. To reduce the RE an observation network with a number of clustered towers may be favourable over a regularly spaced network. The ring of towers around the WLEF tower is an example of such a tower cluster (Zupanski et al., 2007), which may be applied at smaller scales too.
Increasing the model resolution is the most straightforward way of decreasing the RE. How much the resolution must be increased to resolve the main CO2 mixing ratio variability depends on the strength and the horizontal extent of the surface CO2 flux variability and the meteorology. The results of this study suggest that much can be gained by increasing the resolution from relatively coarse scales of 100 km toward finer resolutions (Fig. 3), especially when relatively large scale phenomena like the sea breeze cause contrasts in the CO2 mixing ratio.
Although the simulations in this study resolve some formation of atmospheric eddies, it is difficult to simulate the location of the updrafts and downdrafts correctly. In the absence of mesoscale circulation, this means that increasing the resolution to scales below the size of convective structures does not necessarily reduce the RE. High resolution simulations may better simulate the formation of a mesoscale front such as that on 27 May. Therefore, when mesoscale circulations develop, a further increase of the resolution may reduce the RE. This confirms the suggestions of previous studies to use a resolution of 30 km (Gerbig et al., 2003a, b) and a possible further refinement of the grid when mesoscale circulations develop, to reduce the error associated with mesoscale processes (Van der Molen and Dolman, 2007). If observations are associated with the proper influence history, our simulations suggest that the RE in the boundary layer during the afternoon can be limited to below 1 ppm up to at least 20 km resolution, or a coarser resolution when the circumstances are favourable, as in this study on 6 June.