Error characterization of CO 2 vertical mixing in the atmospheric transport model WRF-VPRM

Abstract. One of the dominant uncertainties in inverse estimates of regional CO 2 surface-atmosphere fluxes is related to model errors in vertical transport within the planetary boundary layer (PBL). In this study we present the results from a synthetic experiment using the atmospheric model WRF-VPRM to realistically simulate transport of CO 2 for large parts of the European continent at 10 km spatial resolution. To elucidate the impact of vertical mixing error on modeled CO 2 mixing ratios we simulated a month during the growing season (August 2006) with different commonly used parameterizations of the PBL (Mellor-Yamada-Janjic (MYJ) and Yonsei-University (YSU) scheme). To isolate the effect of transport errors we prescribed the same CO 2 surface fluxes for both simulations. Differences in simulated CO 2 mixing ratios (model bias) were on the order of 3 ppm during daytime with larger values at night. We present a simple method to reduce this bias by 70–80% when the true height of the mixed layer is known.


Introduction
Anthropogenic emissions of greenhouse gases (GHGs) have changed the atmospheric composition substantially (IPCC, 2007).Carbon dioxide (CO 2 ) and methane (CH 4 ) have been identified as the most important GHGs for the anthropogenic alteration of the global climate system.The carbon cycle is linked to climate directly by affecting the energy budget of the Earth and indirectly via feedback processes (Heimann and Reichstein, 2008).Oceans and the terrestrial biosphere are slowing down this process by absorbing about half of the anthropogenic carbon emissions (Canadell et al., 2007).The spatial distribution, strength and temporal development of these sinks is the subject of active research.However, multilateral treaties like the Kyoto protocol aim at the reduction and management of the anthropogenic emissions in order to mitigate the human impact on the climate system.Thus, improved knowledge of carbon budgets is required to project future climate, to support political discourse, to verify emission inventories, and to monitor management techniques of the carbon cycle.
In the past inversions of global CO 2 measurements have been used in a "top-down" approach (Nisbet and Weiss, 2010) to infer carbon fluxes for regions the size of continents (e.g.Schimel et al., 2001;Gurney et al., 2002) down to spatial scales of 100 km (Rödenbeck et al., 2003;Peters et al., 2007).To relate carbon source/sink processes to external forcing on the sub-continental scale, where the observed CO 2 signals near vegetated areas are highly variable and flux signatures degrade rapidly due to mixing in the lower troposphere, inversion frameworks with even higher spatiotemporal resolutions (<100 km, diurnal) are necessary (Gerbig et al., 2003;Karstens et al., 2006;Gerbig et al., 2009;Lauvaux et al., 2009;Göckede et al., 2010).
In inversions the relation between CO 2 mixing ratios and sources/sinks is approximated with atmospheric transport models.It has been known for quite some time that these models are subject to uncertainties that significantly hamper reliable estimations of surface fluxes (Law et al., 1996(Law et al., , 2008;;Gerbig et al., 2003;Stephens et al., 2007;Lauvaux et al., 2009;Houweling et al., 2010).Especially at high spatio-temporal resolution different sources of transport error were reported.Lin and Gerbig (2005) estimate the impact of misrepresenting wind flows in the transport model on CO 2 concentrations during the active growing season to ∼6 ppm.

Inaccurate representation of meteorological processes at the
Published by Copernicus Publications on behalf of the European Geosciences Union.meso-and microscale, e.g.land-sea breezes, mountain-valley circulations, heat island effects and surface energy fluxes can cause errors of ∼2-3 ppm (Ahmadov et al., 2007;van der Molen and Dolman, 2007;Tolk et al., 2008).
One of the dominant transport uncertainties is related to vertical mixing of CO 2 associated with atmospheric turbulence near the surface where most CO 2 observations are made.Errors in vertical transport, that exceed the targeted measurement precision for CO 2 of 0.1 ppm by more than an order of magnitude, are inflicted by the seasonal and diurnal covariances of CO 2 fluxes and turbulence in the planetary boundary layer (PBL) (Denning et al., 1995(Denning et al., , 1996(Denning et al., , 1999(Denning et al., , 2008;;Stephens et al., 2000;Yi et al., 2001Yi et al., , 2004;;Gerbig et al., 2008;Tolk et al., 2009).
On the diurnal scale, several vertical mixing processes can affect the concentration of CO 2 in the PBL.During daytime in situations when a convective boundary layer (CBL) develops (summer, clear sky) photosynthetic uptake is diluted up to the height of a turbulent mixed layer (ML) within CBL on hourly time scales.Entrainment of air situated above the ML (free troposphere or residual layer) is caused by vertical advection and overshooting thermals.Such processes affect time-mean CO 2 concentrations in the mixed layer on the order of several ppm, but also alter other properties of the ML like moisture, temperature, and the mixing height itself (McGrath-Spangler and Denning, 2010).
Due to the vigorous mixing in the CBL on small timescales variables like potential temperature, water vapor and CO 2 are approximating constant vertical profiles (Stull, 1988).In fact, various studies used this simplification for column mass budgeting approaches to directly determine CO 2 exchange fluxes (e.g.Wofsy et al., 1988;Chou et al., 2002;Laubach and Fritsch, 2002;Bakwin et al., 2004;Helliker et al., 2004;Aubinet et al., 2005b).
When turbulence is reduced as less radiation heats the surface after sunset, the colder layer of air near the surface is decoupled from the warmer well-mixed part of the ML.During these times the latter part, called residual layer (RL), is not directly affected by surface forcing, because air parcels are confined to the lower part of the PBL by a capping temperature inversion.Hence, tracer profiles in the RL stay relatively constant with time (Yi et al., 2001).In the night, when respiration fluxes dominate and there is only weak mixing in a Stable Boundary Layer (SBL) CO 2 can accumulate near the surface.In the SBL mixing can still occur due to wind shear and surface friction up to several hundred meter (Stull, 1988).When sun rises again the capping inversion becomes weaker due to increased heat fluxes into the SBL.During the growth of the new mixed layer RL air is entrained, which causes a rapid dilution of CO 2 molecules (Gibert et al., 2007).
The mixing height (MH, z i ), normally a property of the CBL, is an intuitive measure for vertical mixing strength because it is defined as the level of most negative heat flux or also as the height at which air parcels rising from the surface become neutrally buoyant (Stull, 1988).The MH not only determines the volume of a column of air in which the fluxes contribute to the CO 2 concentration, model mismatches in this property can also lead to bias in CO 2 concentrations (Denning et al., 1996(Denning et al., , 2008;;Yi et al., 2004;Pérez-Landa et al., 2007;Ahmadov et al., 2009).Ramonet et al. (2009) and Aulagnier et al. (2009) demonstrated that the multiannual trend of the MH could be an important driver for a recently observed build up of CO 2 over the European continent.In an intercomparison study of five mesoscale tracer models Sarrat et al. (2007) conclude that MHs between models revealed considerable discrepancies.Gerbig et al. (2008) (hereafter G08) demonstrated that MH uncertainties relate to CO 2 transport errors of typically 3 ppm during summertime in the temperate zones of NW Europe.
In this paper we further investigate the relationship between mixing height and CO 2 concentrations in the PBL.Similar to G08 we assume that errors in simulated PBL CO 2 concentrations are at first order caused by a wrong vertical distribution of CO 2 in a given atmospheric column, such that the column-integrated concentration is unaffected.Thus, one might correct for the error in simulated PBL CO 2 by redistributing or reshuffling CO 2 from the free troposphere to the PBL or vice versa, compensating for the mismatch between true and observed mixing height, while keeping the total column mass unchanged.We present the results of a pseudodata experiment using the high resolution chemical transport model WRF-VPRM (Ahmadov et al., 2007).It was set up with two parameterizations of the PBL: the Mellor-Yamada-Janjić scheme (MYJ, Janjic, 2002) and the Yonsei-University scheme (YSU, Hong et al., 2006).Hereafter we refer to the simulations with the different schemes as MYJ and YSU respectively.Both schemes are commonly-used parameterizations which are able to realistically simulate PBL dynamics (e.g.Borge et al., 2008).In addition, we diagnose biospheric fluxes from satellite reflectance data at 500 m horizontal resolution updated every 8 days and use the recent 2005 EDGAR emission inventory (0.1 × 0.1 degree grid) to consider anthopogenic flux contributions (Source: EC-JRC/PBL.EDGAR version 4.1.http://edgar.jrc.ec.europa.eu/,2010).In line with previous findings (Ahmadov et al., 2007;Sarrat et al., 2007;Ahmadov et al., 2009;Pillai et al., 2010) we therefore assume the simulated CO 2 fields to realistically capture the dominant mixing and flux processes that determine the diurnal variation of atmospheric CO 2 concentration.However, it is also known that MYJ produces weaker vertical mixing compared to YSU and other schemes (Hu et al., 2010), thus these two schemes seem appropriate for the purpose of our study as significant divergence in simulated transport of CO 2 can be expected as well.
The goal of our study is to investigate the following scientific questions:  2. What is the quantitative impact of theses differences on the simulated CO 2 (i.e.transport model error)?
3. Can we use the true mixing height to reduce this model error?
We diagnose and compare the simulated mixing height to determine errors in vertical mixing.The simulation using the YSU scheme is defined as the known truth, and the MYJ simulation as the potentially error-prone model.The choice of the YSU simulation to represent the "truth" is arbitrary.Any of the two schemes might be closer to reality in some or all instances which does not affect the mentioned goals of the paper.Regarding the third question, this paper can be seen as a preparatory study to elucidate the potential for assimilating observation-based mixing heights into the modeling system to better constrain vertical tracer transport.We organized the paper as follows.In the Methodology section we describe the WRF-VPRM modeling system and present the reshuffling method used to vertically redistribute CO 2 .To test this method we apply it to a 1-D conceptual model of the PBL and to the WRF-VPRM simulated 3-D CO 2 fields.In the Results section we show the comparison of the WRF-VPRM simulations in terms of mixing heights and CO 2 concentrations together with the results of the pseudo-data experiment.The discussion focuses on shortcomings of our method as well as application to current regional inversions.

WRF-VPRM modeling system and its setup
We applied the WRF-VPRM modeling system (Ahmadov et al., 2007) to simulate biospheric fluxes and transport of CO 2 for large parts of the European continent during the period of August 2006 in forecast mode (30 h short range forecasts).The chosen horizontal resolution is 10 km at 41 ver-tical levels with the model top at 50 hPa.The lowest vertical level has a thickness of ∼35 m and 19 levels are located within 2 km of the surface.The model domain is shown in Fig. 1.Each of the 30 h short-term forecasts includes 6 h of meteorology spin-up.We excluded water grid cells from the analysis of the simulation output since the focus of our study is on PBL simulation over land.In addition, 10 grid cells (∼100 km) at each domain border were excluded to minimize direct influence of the lateral boundary conditions.An overview of the physic options which were used is given in Table 1.
What follows is a brief review of the main model components; for a more detailed model description the reader is referred to Ahmadov et al. (2007).
WRF-VRPM couples the Weather Research and Forecasting model (http://www.wrf-model.org) to the Vegetation Photosynthesis and Respiration Model (VPRM, Mahadevan et al., 2008).WRF is a non-hydrostatic mesoscale numerical weather prediction model that was extended by Grell et al. (2005) to allow for atmospheric transport of chemical compounds and aerosols on-line (WRF-Chem).Ahmadov et al. (2007) added transport of CO 2 as a passive tracer to WRF-Chem in such a way that separation of the different CO 2 components (background, anthropogenic, biospheric) is possible.
VPRM calculates hourly biosphere-atmosphere CO 2 exchange fluxes diagnostically as a function of WRF surface temperature (T2) and short wave radiation (SWDOWN) fields and two indices: the Enhanced Vegetation Index (EVI) and the Land Surface Water Index (LSWI).These indices are derived from MODIS (Moderate Resolution Imaging Spectroradiometer) satellite surface reflectance retrievals given globally at 500 m horizontal resolution at 8 day intervals.The VPRM calculations are scaled by parameters specific to the vegetation coverage of the model grid cell.These parameters were optimized against eddy covariance flux measurements for the North American continent (Mahadevan et al., 2008).Here we use updated parameters suitable for the European continent (Pillai et al., 2011).The fractional vegetation coverage for the model grid is obtained from the SYN-MAP (Jung et al., 2006) with a spatial resolution of 1 km.
To isolate the effect of vertical mixing mismatches on the CO 2 concentrations we prescribed the same VPRM fluxes for both simulations.Technically, VPRM fluxes were first calculated within the YSU simulation and then added to the MYJ simulation as flux input fields similar to the anthropogenic emissions.To reduce the complexity of the study we neglect feedback effects between vertical mixing, surface temperature, soil moisture, latent head and cloud cover (differences in radiation) which may influence NEE.
Hourly CO 2 fluxes from anthropogenic sources were prescribed with 2005 data obtained from the Emissions Database for Global Atmospheric Research (EDGAR) version 4.1 (0.1 × 0.1 degree grid, http://edgar.jrc.ec.europa.eu).Oceanic flux contributions are considered solely as part of the initial and later boundary conditions from the global model (see paragraph below), i.e. they are not explicitly prescribed in the short-term forecasts, because compared to the biospheric fluxes, oceanic fluxes are small and vary only slightly and because our analysis excludes ocean grid cells entirely.
Lateral boundary conditions of the meteorology, sea surface temperature and soil initialization fields were taken from ECMWF analyses (European Center for Medium-Range Weather Forecasts, 6 hourly, 35 km horizontal resolution).CO 2 initial and boundary conditions were obtained from TM3 inversion results driven by NCEP meteorology (Rödenbeck, 2005) 1 .To account for CO 2 flushing time of the domain we excluded the first 8 days from the simulated period (2-30 August 2006) from our analysis.

Reshuffling method for mixed layer CO 2
In this section we describe a method to relate model-model (or model-truth) differences of mixed layer CO 2 and mixing height.Figure 2 illustrates two simplified profiles of CO 2 in the atmosphere.This concept is based on a simple slab model assuming convective conditions in the afternoon and an infinitesimally thin entrainment zone (Stull, 1988).
We define one profile as the truth (mixing height z i,truth ) and the other one as modeled column (z i,model ).Assuming the same total mass and net CO 2 fluxes (NEE) for both columns (photosynthetic uptake during day), the mean mole fraction in the model ML (C m,model ) is lower than the true mean mixing ratio (C m,truth ).This is because in the model photosynthesis removes CO 2 molecules at the same rate from a shallower volume (surface to z i,model and unit area).Knowing the true mixing height one can adjust C m,model to match the true mixing ratio by reshuffling air between the mixed layer and the air just above the ML (with mixing ratio C + ), while keeping the total column integrated mass of CO 2 fixed (G08).Mathematically this is expressed as follows: with where ρ(z) and C(z) are profiles of molar air density and mole fraction mixing ratio of CO 2 respectively.Note, in this paper we refer to dry air densities and CO 2 mole fractions.Effectively this method entrains the air between z i,model and z i,truth into the ML which gets perfectly mixed instantaneously to yield C m,truth .Similarly, when the model overestimates the mixing height, the method "unmixes" by turning the air between z i,truth and z i,model into free tropospheric or residual layer air, depending on the layer above the mixing height.During nighttime when respirative fluxes dominate (net gain of CO 2 molecules) the relationship between MH and mixing ratio reverses.
We tested the correction method with a conceptual 1-D model of the PBL which will be discussed in the following section.

Fig. 2.
Idealized CO 2 profiles in a column of air with unit area.A model with weak vertical mixing (blue lines) has a lower mixing height (z i,model ) than the truth (z i,truth , red lines).The shaded areas represent the mixed layer column mass assuming a constant air density profile.The reshuffling method (Eq. 1) scales the modeled ML CO 2 such that the column mass is unchanged.

Test of the reshuffling method
The reshuffling method (Eq. 1) was tested with a conceptual model of the PBL.We base our concept model on a simplified mass budget equation as used in the past to constrain CO 2 fluxes and isotopes of CO 2 from observed concentrations and meteorological properties (e.g.Laubach and Fritsch, 2002;Helliker et al., 2004;Aubinet et al., 2005a).Although this model is subject to several simplifying assumption (discussed in detail in the cited literature), it can realistically describe the diurnal evolution of CO 2 concentration in the PBL.
Here we use a height integration formulation which was applied to analyze data from the Amazon Boundary Layer Experiment (ABLE) (Stephens et al., 2000;Chou et al., 2002;Wofsy et al., 1988): The mean mixing ratio (C m ) of CO 2 in the mixed layer with height (z i ) is balanced by the surface biospheric fluxes F NEE , entrainment (∂z i /∂t) and vertical advection (subsidence, wz i ).For simplicity we keep the density ρ vertically and temporally constant, and contributions to the mass balance from horizontal advection are neglected.The subsidence is included in the last term on the RHS, and accounts for vertical advection at the top of the ML (with mean velocity wz i ) that mixes concentrations directly above the ML (C + ) and the ML concentration (C m ).
A simple model based on Eq. ( 2) was time-integrated for several days to compute the variation of the CO 2 within a column of air with unit area and a fixed height of h = 3.3 km.The whole column is advected downwards with a temporally fixed subsidence rate which is set to typical values for the mid-latitudes (Stull, 1988;Yi et al., 2001).At the column top (h) this is set to wh = −0.02m s −1 and linearly drops to w0 = 0 m s −1 at the surface, representing constant horizontal divergence.This profile is used to obtain subsidence velocity at the MH ( wz i ).We prescribe the flux of CO 2 (F NEE ) with typical values for mid-latitude continental vegetation (Dolman et al., 2006;Ahmadov et al., 2007;Gibert et al., 2007) as a net sink over 24 h (∼−3.4 µmol s −1 , Fig. 3a).The background CO 2 mixing ratio is set to the constant value of 380 ppm (C t ).Growth and decay of the mixed layer over 24 h are prescribed as shown in Fig. 3a.
In order to test the reshuffling method (Eq.1), a second run with a 30 % low bias in mixing heights was performed (blue line in Fig. 3a).In this section we will refer to this run as the model, while the original run (red line in Fig. 3a) represents the truth with stronger vertical mixing (referred to in this section as truth).
Both runs start with a constant CO 2 profile of 380 ppm at each model layer (264 layers each 12.5 m thick).After 14 days of integration (time steps of 150 s) the model reaches steady state, i.e. fluxes and storage of ML CO 2 in Eq. ( 2) are balanced.
The resulting model mismatch of mixed layer CO 2 , i.e. the differences between truth and model (C m,model − C m,truth ), is shown in Fig. 3b (black line).According to the prescribed CO 2 fluxes the difference is positive during nighttime and negative during daytime.In the following we will refer to these differences generally as CO 2 mismatch.
Applying the reshuffling method (Eq. 1) based on the true MH (z i,truth ) to correct the model ML CO 2 mixing ratio (C m,model , Fig. 3b red line), the CO 2 mismatch is completely removed during daytime (10:00 to 14:00 local time).In the late afternoon, during nighttime, and in the early morning the correction can reduce the difference, but exhibits overcompensation.The reason is the compensating effect of night and daytime differences due to sign reversal.The simulated profiles in Fig. 4 show that during these times a residual layer (RL) is situated above the ML, which determines the mean CO 2 mixing ratio above the ML (C + in Eq. 1).The RL concentration is lower in the model according to equally lower ML values during daytime.As such it preserves the daytime CO 2 mismatch over time.Thus, the temporally local reshuffling is obviously insufficient in these cases.Consequently the information the RL mixing ratios contain can be used to extend the reshuffling method.An improved correction method addresses the biased RL concentration before correcting the ML concentration.This correction can be accomplished with the same reshuffling method applied first to the RL (reshuffle air between free troposphere and RL) and afterwards using the corrected RL mixing ratio in Eq. ( 1) to also adjust the ML, given that the true and modeled height of the RL is known.Figure 3b (blue line) shows the result of applying such a second order correction which removes the CO 2 mismatch for the full period.Thus we conclude that the reshuffling method can fully compensate the CO 2 mismatch in a 1-D model.We hypothesize that errors made due to the simplifications of the 1-D model are small compared to the mixing height induced CO 2 mismatches.In the remainder of the paper we will test this hypothesis in the complex 3-D WRF simulation.As a prerequisite it is necessary to diagnose the mixing heights from the simulated meteorology.This is discussed in the following section.5. Illustration of the effect of differences between the mixing height (z i ) and a diagnosed PBL top (z PBL ) on the ML mean mole fraction (C m ).The column mixing ratio integrated from the surface to z PBL has a higher value than a similar column integrated to the top of the well mixed part (z i ).The difference between these two mean mixing ratios is shown as C m .

Estimation of mixing height
Estimating the simulated mixing height (MH) is a non-trivial task.The height diagnosed from the model state (here referred to as PBL top, z PBL ) usually differs from the height up to which the CO 2 profile is constant (z i ).The PBL top determines the amount of entrainment zone air that is integrated into the PBL mean CO 2 mole fraction C m (schematically shown in Fig. 5).We can regard this difference of mixing ratios as an error C m : Since both PBL schemes use their own method to diagnose the PBL top, (see Appendix A for details) C m may even be inconsistent between both simulations, which would affect the results of our study.This could be circumvented by using the CO 2 gradients between ML and free troposphere to determine the MH, but in reality this information is usually not available, except for stable situations where tall tower data (∼300 m) is available.Accordingly, we restrict ourselves to a method that more generally diagnoses the PBL top rather than the MH itself, knowing that this approach will cause some error ( C m = 0 ppm).If C m is consistently arising in both simulations, the CO 2 mismatch will be unaffected.We tested independent methods to diagnose the MH from the simulated meteorology (offline, i.e. after simulation finished).We applied different formulations of the Bulk-Richardson number method (Vogelezang and Holtslag, 1996).Our offline MHs differed only slightly from the YSU online diagnosed ones, which is not surprising since the YSU scheme is based on the Bulk-Richardson method (Appendix A).In contrast the offline diagnosed heights of the MYJ simulation were in general much higher than online calculated ones.For the nighttime the differences between MYJ and YSU MHs even changed sign when using different Richardson formulations.We compared C m to see which MH is most suitable.For the calculation of the second term on the RHS of Eq. ( 3) we used half the diagnosed PBL top z PBL for z i in cases with a well developed ML.This seems to be a reasonable choice because the entrainment zone can be 0.4 z i thick (Stull, 1988) and in well mixed conditions the mean mole fraction C m is equal C(z) for any height z below the ML top.We used the criterion z PBL > 600 m for the choice of well developed ML cases.
Figure 6 shows an example of monthly averaged C m at 12:00 UTC.For YSU C m is similar for online and offline MHs whereas for MYJ the online diagnosed MH seems more appropriate.A possible explanation might be the combination of how the threshold of the TKE (turbulent kinetic energy) that defines the PBL top in the MYJ scheme was chosen and how the transport of scalars in the PBL is related to this height (Appendix A).Similar results were reported  by Hu et al. (2010).Therefore we decided to proceed using the WRF online diagnosed PBL heights for our study since it seems to relate better to the effective MH and C m is consistent between the two schemes.For the remainder of this paper we will refer to the online diagnosed height as the mixing height (MH).The MH for the YSU and the MYJ simulations will be denoted with z i,ysu (corresponding to z i,truth in the pseudo-data experiment) and z i,myj (z i,model ) respectively.

Simulated mixing heights
In this section we present differences in diagnosed mixing heights between the two WRF simulations.Point comparisons are shown for 12 grid cells that contain the locations of existing observation sites continuously measuring CO 2 in Fig. 7.As observed concentrations from these sites are used in inversions, model errors in representing the vertical mixing at these sites directly influence flux estimates.These sites are placed throughout the simulation domain (marked with squares in Fig. 1).We also show results for all land cells in the domain (Fig. 8a and b).The bias is calculated by time averaging the MH differences ( z i,myj − z i,ysu ).
For the selected sites the bias is typically in the range of 200-400 m (∼30-60 % relative to YSU, Fig. 7).The relative bias is higher at nighttime and decreases as the ML deepens.During night in more than one third of the cases the MYJ MH equals 0 m (for further analysis the MH was set to 20 m which equals approximately half the height of the first vertical layer).This indicates very weak turbulence, i.e.TKE was falling below the threshold of 0.4 m s −2 as mentioned in Appendix A, leading to a rather large bias during nighttime when the YSU scheme produced on average heights of 300 m.
Timing of the mixed layer growth generally agree well.However, a closer look at some sites reveals 1-2 h timing differences regarding the maximum MH, i.e.MYJ ML reaches the maximum earlier (e.g.SCH, BIK) and in most cases turbulence in MYJ ceases earlier in the afternoon (e.g.BIK, PDM).We also found higher growth rates in the morning for MYJ, i.e. when averaged from 00:00 UTC to the time of maximum MH, for most sits the MYJ MH grows 1-2 % of maximum MH per hour faster than YSU.
An example of the spatial patterns of bias during day and night is shown in Fig. 8a and b.The spatial distribution of bias reveals larger bias during daytime over elevated terrain (e.g. the Alps) and a north-south gradient is visible.Compared to MYJ the YSU MH was usually much deeper over the ocean, which seems to affect the bias in maritime areas.For example, the deeper YSU MH advected over the British Isles and coastal France during nighttime.The bias standard deviation is usually in the range of 150-200 m (nighttime) and 350-450 m (daytime).
In summary, a general underestimation of the MYJ MH by 30 % (∼500 m) during daytime and at least a factor of two larger during nighttime (∼150 m) is evident.This finding is in line with previous studies showing that Mellor-Yamada based schemes tend to produce less turbulence than comparable PBL parameterizations (e.g.Sun and Ogura, 1980;Janjic, 2002;Steeneveld et al., 2008;Nakanishi and Niino, 2009;Hu et al., 2010).
These results confirm the initial assumption of quite noticeable differences in vertical mixing between both schemes with much lower MH in MYJ in line with the considerations in the 1-D conceptual model (Sect.2.3).Rather different from the conceptual model is the diurnally-varying bias in the WRF simulations (roughly 30 % for daytime, 60 % for nighttime) whereas in the 1-D case we assumed a constant 30 % low bias of MH.In our numerical experiment this timevarying MH bias combined with the sign reversal of surface fluxes can lead to a strong diurnal rectifier effect.In the following section the impact on simulated CO 2 fields will be presented.

CO 2 mismatch
Here we show the bias in simulated CO 2 and the bias reduction after applying the reshuffling method (Eq. 1) to each land pixel and the full simulation period of the MYJ fields.
The correction is applied offline (after the simulation has finished) to the hourly model output.For the reshuffling method we computed averaged ML concentration by integrating over all n vertical model levels below the diagnosed MH, i.e.C m = n i=1 m i C i / n i=1 m i where m i is mass and C i the CO 2 mixing ratio of grid cell i.In case the MH was below 20 we used the top of the first model level instead.We will refer to these ML averaged mixing ratios in the following as C m,ysu and C m,myj respectively.Note, the reshuffling is a local method (1-D) and thus applied on a per pixel basis, which neither gets any information from surrounding grid cells (space time) nor does it affect them.Similarly to the previous section, we show the results for the selected sites and the entire simulation domain.
Figure 9 shows the average diurnal variation of the CO 2 mismatch, i.e. the bias at each of the 12 grid cells containing the observation sites.In keeping with the conceptual model experiment we estimate the CO 2 mismatch by computing the differences C m,myj − m,ysu .The bias is determined as time mean of these differences ( C m,myj − C m,ysu ) and the random error refers to the standard deviation (sd(C m,myj − C m,ysu )).The relative bias is calculated from C m,myj − C m,ysu / C m,ysu × 100.The evolution of the bias over 24 h (monthly mean diurnal variation of the mismatch) reflects the combined effect of the discrepancies in the simulated MH and the dominating direction of the surface fluxes (CO 2 release at night, CO 2 uptake during the day) at these sites with much larger bias during nighttime (typically 4 to 10 ppm) than daytime (−3 to −1 ppm).
A large span of the bias is related to differences in the timing of the growth (morning) and decay (afternoon) of the mixed layer and the general weak mixing in MYJ during stable conditions.For instance the high altitude station Pic Du Midi has a negative peaks (−6 ppm, −2 % relative bias) at 10:00 local time (LT) and at 19:00 LT which are dominated by the MYJ CO 2 PBL concentration (not shown).A higher MYJ ML growth rate can be seen at PDM (cf.Fig. 7).In this case the maximum height for both schemes is around 13:00 LT, but in YSU the deepening of the mixed layer starts earlier (YSU: 05:00 LT, MYJ: 09:00 LT) and is at that time already at 20 % of the maximum height (MYJ 09:00 LT at 5 %).NEE at PDM becomes dominated by photosynthesis at around 07:00 LT (not shown), about 2 h before mixing starts to increase in MYJ, thus the MYJ CO 2 mixing ratio is dropping much faster.concentration is caused by entrainment with air above the mixed layer.At 10:00 LT when the MYJ z i growth rate is at its maximum (MYJ: 30 % of maximum z i per hour) this entrainment flux dominates the photosynthesis flux and causes the CO 2 concentration to rise, because the air above the fastgrowing ML has a relatively higher CO 2 concentration.Sudden bias increases in the late afternoon may be caused by fast decaying turbulence in the MYJ simulation while the photosynthesis still dominates fluxes for 1-2 h (cf.PDM 19:00 LT, Fig. 9).Because in the MYJ simulation the mixing is weaker, the air masses advected in the PBL to the station are depleted in CO 2 .The comparably large night bias at Cabauw (8 % relative bias) is caused by two factors.Firstly, ther is a larger bias in MH which is between 300 and 400 m in the early morning (01:00-05:00 LT), likely caused by too vigorous mixing in the YSU simulation, whereas the bias at all the other sites in the range of 200 to 300 m (cf.Fig. 7).Second and more importantly the EDGAR anthropogenic emissions in our model set up, which contribute more than 25 ppm to the surface CO 2 mixing ratio between 01:00-07:00 LT at CBW, whereas for all other sites this contribution is less than 5 ppm.
To summarize the effect of the reshuffling method (Eq.1), we report the bias reduction which is calculated as the relative difference of the absolute bias before and after applying the correction: Thus, a perfect correction would yield a bias reduction of 100 %.The bias after applying the correction (Eq. 1) is reduced substantially for all stations (Fig. 9).During daytime the bias is reduced with 60-90 % (Fig. 10a) for most stations.Throughout the domain the bias is mostly reduced for night and day time (shown in Fig. 11).During nighttime the MYJ concentrations are much higher, consistent with the net release of CO 2 and the very shallow MYJ ML, which located in the first model layer (about 20 m a.g.l.) for 30 % of land pixels, compared to less than 1 % of land pixels for the YSU scheme.The correlation of the temporal evolution of the ML CO 2 between both simulations is high (70-80 %, Fig. 10c) and the corrected fields can not improve this correlation.
The bias pattern is mainly controlled by the surface fluxes and the orography.On the one hand in areas where surface fluxes are weak the CO 2 bias is smaller, and on the other hand mixing increases in MYJ over areas with elevated terrain (enhanced heat flux and surface friction) to heights closer to YSU (Fig. 8a, e.g.Alps, Pyrenees, Scottish Highlands, Anatolia) causing the bias to decrease as well.
The daytime (11:00-14:00 UTC) bias for all land pixels is usually in the range of 3 ppm before and 1 ppm after applying the correction.Standard deviations of the bias are generally in range of 1 to 3 ppm during daytime and more than a factor of two larger during nighttime (01:00-04:00 UTC, not shown).
This spatially and temporally consistent bias clearly shows the sensitivity of simulated diurnal variation of mixed layer CO 2 to the choice of the PBL parameterization in highly resolved transport models.
Figure 12 summarizes the reshuffling performance for all land pixels and hour of the day by averaging the bias (a) and its standard deviation (b) spatially.Like the bias, the random error (standard deviation of bias) is reduced, albeit at lower rates of 10-20 % during daytime (Fig. 10b and 12a), and 30-50 % during nighttime (Fig. 12b).The CO 2 mismatch spatially more uniformly distributed at night (cf.Fig. 11a and  c), and therefore conceptual deficiencies (i.e.1-D correction of the 3-D simulation) of the correction seem to play a minor role compared to daytime (more on this in the discussion section).
In summary, we conclude that the overall effect of the reshuffling is a substantial improvement in the diurnal amplitude of MYJ ML CO 2 .
Obviously the performance of the correction breaks down during transition times of ML development, i. morning growth and during afternoon when turbulence ceases (cf.Fig. 10 ORL, OXK and Fig. 12).The reason for this effect is twofold: (1) the bias contains two flux regimes (release, uptake), resulting in a sign change of the CO 2 mismatch.The incoming solar radiation causes the transition from one regime to the other (sunrise and sunset).At these transition times the bias approaches 0 ppm and thus the reshuffling method has on average a minor effect or even increases the CO 2 mismatch (cf.Sect.2.3, Fig. 3b 5-9 h and 18-20 h); (2) inconsistencies between the CO 2 profiles and the diagnosed mixing height.For instance we often found that the ML was clearly evident from the simulated CO 2 profiles (well mixed in the first several model layers) during morning growth while the diagnosed MH for both simulations was near 20 m (somewhere in the ML instead at the top of it).As a consequence the correction had almost no effect (z i,model /z i,truth ≈ 1 in Eq. 1).This is reflected in Fig. 9 e.g. at the stations Fyodorovskoje (07:00-10:00 LT), Ochsenkopf (10:00-12:00 LT), Puy De Dôme (07:00-10:00 LT).
For MYJ a reason for MH/CO 2 inconsistencies might be a too large TKE threshold (cf.Sect.2.4), which can lead to diagnosed MH that are lower than gradients in the vertical profiles of the simulated scalars themselves suggest (Hu et al., 2010).YSU, on the other hand, uses a bulk Richardson method to diagnose z i that is very sensitive to the choice of the critical Richardson number (Ri c , cf.Appendix A).This resulted in large fluctuations (several 100 m) in z i time series especially during transition times (well mixed to stable and vice versa) when Ri c changes from 0 to 0.25 (Hong, 2007).The fluctuations were not reflected in the CO 2 profiles as they do not seem to adapt quickly to sudden MH changes.In such cases the concentration gradient in the CO 2 profile between the MH and the layer above would be a better diagnostic for the MH.But as stated in Sect.2.4 this information is usually not available.
These results highlight the dependence of the correction on a method that accurately estimates the effective mixed layer height.Other reasons for such inconsistency can be related to conceptual deficiencies of the reshuffling method and are discussed further in the following section.

Discussion
The results support the initial hypothesis that the diurnal CO 2 bias is to first order controlled by local differences in vertical mixing and that these differences are reflected in the mixing layer heights.With our rather indirect reshuffling method the gain in information represented by the known MH could be translated to a reduced bias in tracer space.These findings underline the potential to successfully constrain the transport model by assimilating observation-based mixing heights.This would improve convective tendencies directly.More importantly, such approaches can lead to improvements of the respective PBL parameterization and thus enhanced process understanding.Ways to assimilate mixing height data are currently being explored (McGrath-Spangler and Denning, 2010).
Regarding the presented reshuffling method, not all of the bias is explained and random error is only slightly improved.This result does not come as a surprise, as the 1-D correction neglects changes in the column due to horizontal advection (cf.Sect.2.3).Thus the reshuffling method cannot account for errors that arise from differences in horizontal transport of CO 2 .Such differences can include (1) horizontal advection within the mixed layer, (2) deep cumulus convection combined with high altitude horizontal winds, and (3) wind shear above the nighttime SBL.
To investigate the effect of these different causes we analyzed the simulation fields to find situations in which the correction fails.For simplicity we define failure of the correction as cases when the absolute CO 2 error after applying the correction is significantly larger than before: As an example we show in Fig. 13a a map for 10 August of the simulation period where each pixel that is classified as failure is marked.Possible reasons for these apparent inconsistencies are (1) the diagnosed MH does not represent the effective top of the ML (cf.previous section), (2) the CO 2 fluxes into the model column are different for YSU and MYJ.To circumvent the first trivial cause for failure one can either try to improve the methods to diagnose the MLH correctly or choose one from a set of available methods that is better suited for the current synoptic condition, which is still a matter of ongoing research (Seibert et al., 2000;Seidel et al., 2010).
The second source of failure, which is related to conceptual deficiencies of the reshuffling method, seems to be closely related to differences in the convective tendencies that are calculated within the PBL scheme.For illustration, Fig. 13b and c shows the Convective Available Potential Energy (CAPE).Although both simulations use the same cumulus parameterization (Table 1), CAPE differs substantially near areas where the correction consistently fails (arrows in Fig. 13).For these areas mass fluxes into the cloud base and vertical velocities are considerably higher in the MYJ simulation (not shown).This suggest that the CO 2 signature within the ML is transported to higher altitudes differently in both simulations.Because above the ML wind shear is likely to occur and wind speed increases from subgeostrophic within the ML to geostrophic above, differences in vertical transport of mass also leads to altered horizontal trajectories of the CO 2 molecules.Therefore, tracer transport might be more local in one simulation than in the other, causing greater errors in 1-D budgets in columns downstream, because these mass contributions by advection are neglected.
To identify synoptic events in which horizontal advection dominates the column mass budget one might use particle trajectories from a Lagrangian transport model together with a term for horizontal advection in the budget formulation (e.g.Aubinet et al., 2005a).But these trajectories can only be as accurate as the underlying meteorological drivers allow and the required computational efforts make such method unfeasible for inversions.Thus, assimilation approaches like 3D/4DVAR or Extended Kalman Filters that use observed MH to optimize the model-state for tracer transport seem preferable.
However, our results give reason to hope that already a relatively simple correction can help to improve the flux estimates of regional inversions: 1.The derived model-data mismatch statistics (error covariances) can be propagated through the inversion system as demonstrated by G08.This approach estimates posterior flux uncertainties that are more consistent with the truth, which is the basis for any reliable model-data fusion system (Gerbig et al., 2009).
2. Observation based mixing heights could be used to constrain the simulated CO 2 fields by applying the introduced reshuffling method before cost functions in the inversion framework are minimized.In combination with the before mentioned error propagation to account for the remaining CO 2 mismatch this can lead to improved posterior uncertainties and flux estimates.
The basis for any of these methods to improve regional inversions is reliable estimates of the mixing heights from observations of the atmospheric state.Mixing heights can be obtained from several observational sources, in-situ  measurements like radio soundings (G08, Seibert et al., 2000;Seidel et al., 2010) and a variety of acoustic, electromagnetic and optical remote sensing techniques, like SO-DAR (Sonic Detecting And Ranging) or RADAR (Radio Detection and Ranging) (Emeis et al., 2004).Aerosol backscatter profiles measured by LIDARs (LIght Detection And Ranging) and ceilometers combined with sophisticated retrieval algorithms can provide MH estimates as well.Ceilometers in particular, originally designed to detect cloud bases, promise a relatively cheap and easy to deploy method to acquire large quantities of such data and thus are planed to be part of measurement frameworks like the Integrated Carbon Observation System (Haeffelin et al., 2011;Milroy et al., 2011; http://www.icos-infrastructure.eu/).A number of ceilometer networks are operated by weather services throughout Europe 2 .For example, 36 sites of the German weather service (DWD) were recently used in the study of Flentje et al. (2010) to measure the volcanic ash plume of the 2010 Eyjafjallajökull eruption.

Atmos
Another promising source of data can be obtained from satellites like CALIPSO (Cloud-Aerosol LIDAR and Infrared Pathfinder Satellite Observations).Similar to groundbased ceilometers they provide aerosol backscatter profiles of the atmosphere that can be used together with sophis-2 Denmark, Great Britain, France, Germany, the Netherlands, Iceland, Sweden, Switzerland ticated retrieval algorithms to infer mixing heights (Jordan et al., 2010).The advantage of such satellite observations is a dense spatio-temporal coverage which can complement the other mentioned data streams to constrain transport over large areas.

Conclusions
In this study we presented results from a synthetic experiment using the atmospheric transport model WRF coupled to the diagnostic biosphere model VPRM and anthropogenic emissions to realistically simulate transport of CO 2 for large parts of the European continent at 10 km horizontal resolution.To elucidate the impact of uncertainties in vertical mixing on modeled CO 2 transport, we simulated the period of August 2006 with different commonly used parameterizations of the PBL (YSU and MYJ), while keeping CO 2 fluxes the same.We used diagnosed mixed layer heights of the modeled meteorology to quantify differences in vertical mixing strength between both simulations.A reshuffling method to relate differences in mixing height to resulting differences (errors) in CO 2 was presented.The method was tested with a conceptual 1-D model (neglecting horizontal advection) and with the complex 3-D WRF simulations.
Following the initial questions of this paper, the main results are: www.atmos-chem-phys.net/12/2441/2012/Atmos.Chem.Phys., 12, 2441-2458, 2012 1. We found significant differences in the amplitude in simulated mixing heights.The MYJ MHs were generally 30-40 % lower relative to the YSU MHs for daytime.The low bias was by a factor of two larger for nighttime.The MYJ ML often developed later in the morning and turbulence ceased earlier than in the YSU simulation (time lags of 1-2 h).These subtle timing differences might be used in comparisons to observed MHs to validate the PBL schemes in the future.
2. The simulated diurnal cycle of mixed layer CO 2 mole fractions differed considerably in amplitude during daytime (1-3 ppm differences) and even more during nighttime (4-10 ppm differences), with standard deviations on the same order.Both simulations agreed well in phasing (r squared ∼0.8).However, large peaks in bias were found in point comparisons (grid cells including observational sites) related to timing differences of turbulence (−6 ppm during daytime, −2 % relative bias) as well as influence of nearby anthropogenic emission sources (Cabauw, 30 ppm nighttime, 8 % relative bias).
3. The reshuffling method (Eq. 1) substantially reduced the bias in mixed layer CO 2 by 70-80 % during daytime (>80 % nighttime) when using information about the true mixing height (YSU).This result underlines the potential of observation-based mixing height data to constrain transport in order to improve regional surfaceatmosphere flux estimates by CO 2 inversions.Failure of our simple correction method was shown to be due to inconsistencies in methods to diagnose mixing heights, and to conceptual deficiencies related to neglecting the horizontal advection.More sophisticated ways to assimilate mixing height data are needed to gain process understanding of PBL dynamics in order to improve PBL parameterizations.
We conducted an idealized experiment with two different PBL parameterizations, so one can question if the mismatches found in MH and CO 2 concentration are a realistic estimate of the true uncertainty.However, we would like to stress that both PBL parameterizations are commonly used and several studies validated these schemes in combination with WRF against meteorological observations, clearly showing their ability to realistically simulate PBL dynamics in different seasonal and synoptic conditions (e.g.Borge et al., 2008;Hu et al., 2010).In this regard our study underlines the need to further investigate transport errors in vertical mixing to infer reliable regional flux estimates.Independent from the absolute values of model error, the presented concept can be used to infer vertical transport error statistics that can be propagated through inversions.In addition, a direct application of the reshuffling method presented in this paper to reduce the CO 2 model-data mismatch prior to the flux optimization will be tested in future inversion studies.
Observation-based mixing heights to constrain transport will be obtained from radiosondes, ceilometers or satellites.
Ultimately one is interested in the impact of these CO 2 biases on the flux inversions.What is needed are sensitivity tests with an inversion set up incorporating the full error characteristics like it was demonstrated by Rödenbeck et al., 2006.However, to get a first idea for our study, we can scale the tagged tracer corresponding to NEE in order to compensate for the reported CO 2 bias.The monthly average signal from NEE within our domain (0.45 PgC/month) causes a regional signal (draw down relative to the lateral boundary condition) of 2 ppm.Taking a transport model bias of 1 ppm, which is 50 % of the signal, the corresponding error in fluxes would be 0.225 PgC/month in NEE.
Future work should also comprise ensemble simulations with a broader range of model/PBL scheme combinations for different seasons and years.Beside CO 2 these studies should include other GHGs like CH 4 and N 2 O. Observations of trace gases like SF 6 and 222 Rn with better constrained surface fluxes could be used to isolate vertical transport mismatches from other error sources (G08).To better exploit these efforts a closer collaboration with experts from weather services and numerical weather prediction centers should be fostered.

Fig. 3 .Fig. 4 .
Fig. 3.The surface fluxes (F NEE ) used for the 1-D conceptual model (Eq.2) with 24 h mean ∼3.4 µmol s −1 , also shown are the prescribed mixing heights for the truth (red line) and the model with a 30 % low bias (blue line) (a).Also shown in (b) is the Diurnal variation of the CO 2 difference (C m,model − C m,truth ) (black line) resulting from 1-D conceptual model (Eq.2).The green line shows the same error after applying the reshuffling method (Eq. 1) to the model ML CO 2 (C m,model ).Taking into account the residual layer mole fraction completely compensates for the difference (purple line).
Fig.5.Illustration of the effect of differences between the mixing height (z i ) and a diagnosed PBL top (z PBL ) on the ML mean mole fraction (C m ).The column mixing ratio integrated from the surface to z PBL has a higher value than a similar column integrated to the top of the well mixed part (z i ).The difference between these two mean mixing ratios is shown as C m .

Fig. 6 .
Fig.6.Box and whisker plot of C m for monthly averaged mean ML CO 2 for YSU (a) and MYJ (b) at 12:00 UTC.All grid cells were included in the boxplot where the diagnosed PBL top was higher than 600 m in both simulations.The upper boxes represent the differences of mean CO 2 integrated over the offline diagnosed MH (VH96 Eq. 1, Ri c = 0.25) and the mean CO 2 integrated over 300 m (within the well mixed part of the ML).The lower boxes show similar differences but using online diagnosed WRF PBL top and the 300 m height.The whiskers denote the central 90 % of the data points, the boxes the central 50 %, the vertical lines within the boxes are the median and the black dots represent the mean.

Fig. 7 .
Fig. 7. Mean diurnal MH in m (a.g.l.) at grid cells containing the stations (from top to bottom, left to right): BIK, CBW, TVE, GRI, HUN, KAS, CMN, OXK, ORL, PDM, PUY and SCH.Results from the YSU and MYJ simulations are shown in red lines and cyan lines respectively.

Fig. 9 .
Fig.9.Same as Fig.7but for the bias in ML CO 2 (< C m,myj − C m,ysu >, red line) and the bias after applying the reshuffling method (Eq. 1) to the MYJ CO 2 fields (cyan line).

Fig. 12 .
Fig. 12. Summary barplot of the bias reduction (a) and reduction of random error (b) in percent.One bar represents the average over all land grid cells in the domain at one hour of the day.
Figure 14 Grid cells for which the reshuffling method failed calculated for August 10, 2006, 12:00 UTC a).Also shown is the Convective Available Potential Energy (CAPE) for YSU b) and MYJ c) at the same time.Details are given in the text.

Fig. 13 .
Fig. 13.Grid cells for which the reshuffling method failed calculated for 10 August 2006, 12:00 UTC (a).Also shown is the Convective Available Potential Energy (CAPE) for YSU (b) and MYJ (c) at the same time.Details are given in the text.

Table 1 .
Setup of WRF options.