Articles | Volume 23, issue 4
Research article
27 Feb 2023

Evaluation of simulated CO2 power plant plumes from six high-resolution atmospheric transport models

Dominik Brunner, Gerrit Kuhlmann, Stephan Henne, Erik Koene, Bastian Kern, Sebastian Wolff, Christiane Voigt, Patrick Jöckel, Christoph Kiemle, Anke Roiger, Alina Fiehn, Sven Krautwurst, Konstantin Gerilowski, Heinrich Bovensmann, Jakob Borchardt, Michal Galkowski, Christoph Gerbig, Julia Marshall, Andrzej Klonecki, Pascal Prunet, Robert Hanfland, Margit Pattantyús-Ábrahám, Andrzej Wyszogrodzki, and Andreas Fix

Power plants and large industrial facilities contribute more than half of global anthropogenic CO2 emissions. Quantifying the emissions of these point sources is therefore one of the main goals of the planned constellation of anthropogenic CO2 monitoring satellites (CO2M) of the European Copernicus program. Atmospheric transport models may be used to study the capabilities of such satellites through observing system simulation experiments and to quantify emissions in an inverse modeling framework. How realistically the CO2 plumes of power plants can be simulated and how strongly the results may depend on model type and resolution, however, is not well known due to a lack of observations available for benchmarking. Here, we use the unique data set of aircraft in situ and remote sensing observations collected during the CoMet (Carbon Dioxide and Methane Mission) measurement campaign downwind of the coal-fired power plants at Bełchatów in Poland and Jänschwalde in Germany in 2018 to evaluate the simulations of six different atmospheric transport models. The models include three large-eddy simulation (LES) models, two mesoscale numerical weather prediction (NWP) models extended for atmospheric tracer transport, and one Lagrangian particle dispersion model (LPDM) and cover a wide range of model resolutions from 200 m to 2 km horizontal grid spacing. At the time of the aircraft measurements between late morning and early afternoon, the simulated plumes were slightly (at Jänschwalde) to highly (at Bełchatów) turbulent, consistent with the observations, and extended over the whole depth of the atmospheric boundary layer (ABL; up to 1800 m a.s.l. (above sea level) in the case of Bełchatów). The stochastic nature of turbulent plumes puts fundamental limitations on a point-by-point comparison between simulations and observations. Therefore, the evaluation focused on statistical properties such as plume amplitude and width as a function of distance from the source. 
LES and NWP models showed similar performance and sometimes remarkable agreement with the observations when operated at a comparable resolution. The Lagrangian model, which was the only model driven by winds observed from the aircraft, quite accurately captured the location of the plumes but generally underestimated their width. A resolution of 1 km or better appears to be necessary to realistically capture turbulent plume structures. At a coarser resolution, the plumes disperse too quickly, especially in the near-field range (0–8 km from the source), and turbulent structures are increasingly smoothed out. Total vertical columns are easier to simulate accurately than the vertical distribution of CO2, since the latter is critically affected by profiles of vertical stability, especially near the top of the ABL. Cross-sectional flux and integrated mass enhancement methods applied to synthetic CO2M data generated from the model simulations with a random noise of 0.5–1.0 ppm (parts per million) suggest that emissions from a power plant like Bełchatów can be estimated with an accuracy of about 20 % from single overpasses. Estimates of the effective wind speed are a critical input for these methods. Wind speeds in the middle of the ABL appear to be a good approximation for plumes in a well-mixed ABL, as encountered during CoMet.

1 Introduction

According to a recent compilation of sectoral greenhouse gas emissions for the year 2018, approximately 34 % of global anthropogenic CO2 emissions are attributable to the energy sector and 24 % to the industrial sector (Minx et al., 2021). Emissions from these sectors primarily originate from power plants, industrial combustion plants, and other large industrial facilities. The concentrated plumes of these sources may be detectable from satellite observations (Nassar et al., 2017), which makes the quantification of these emissions an attractive target for observation-based CO2 emission monitoring. Quantifying the emissions of large point sources is indeed one of the main goals of the anthropogenic CO2 Emissions Monitoring and Verification Support Capacity (CO2MVS) currently being developed under Copernicus, Europe's Earth observation program (Janssens-Maenhout et al., 2020). This is important not only because of the large global share of these sources but also because it will help to better quantify the remaining, more dispersed emissions, which are not necessarily visible as plumes but rather as contributions to regional CO2 enhancements.

Emissions from large combustion plants are sometimes measured directly within the stacks, especially in economically more developed countries, but these numbers are not always readily and publicly available (or only with large delays). Moreover, a complete global record of power plant emissions will not realistically be available in the near future. One of the main goals of the planned European Copernicus Anthropogenic Carbon Dioxide Monitoring satellite mission (CO2M) is therefore to quantify the CO2 emissions of large point sources globally by providing images of total column dry-air mole fractions (XCO2) at a spatial resolution of about 2 km × 2 km over a 250 km wide swath (Janssens-Maenhout et al., 2020).

A growing body of scientific literature has demonstrated the feasibility of quantifying CO2 emissions from power plants using satellite observations. These studies were based either on theoretical considerations combined with synthetically generated (simulated) CO2 observations (Bovensmann et al., 2010; Kuhlmann et al., 2019; Strandgren et al., 2020; Kuhlmann et al., 2021b) or on real observations from existing CO2 satellites, such as the Orbiting Carbon Observatory-2 (OCO-2; Nassar et al., 2017; Reuter et al., 2019; Nassar et al., 2021; Hakkarainen et al., 2021; Kiemle et al., 2017; Chevallier et al., 2022) and OCO-3 (Nassar et al., 2022), and from hyperspectral imagers such as PRISMA (Cusworth et al., 2021).

Numerous methods have been proposed to quantify point source emissions from satellite observations using mass balance considerations or by fitting a simulated plume to the observations (Krings et al., 2013; Varon et al., 2018; Beirle et al., 2019; Kuhlmann et al., 2021b; Fioletov et al., 2015). Plume fitting methods often rely on Gaussian plume models, taking advantage of their simplicity and computational efficiency (Wang et al., 2020). An alternative but less-explored option is to simulate the plume with a full 3D atmospheric transport model. Such models can describe atmospheric transport and mixing more realistically than a Gaussian plume model and thereby better capture the structure of real plumes. They can also better represent complex flow conditions and temporal changes associated with the evolution of the atmospheric boundary layer. However, accurately representing small-scale plumes is extremely challenging because small errors in wind direction may create a simulated plume that does not overlap with the real plume (Zheng et al., 2019). Furthermore, plumes are often turbulent, in which case even a perfect model will never be able to exactly match the observed plume due to the stochastic nature of turbulence. Traditional inverse emission estimation methods relying on a point-by-point comparison between simulated and observed CO2 may therefore be inappropriate, and more advanced non-local methods, as suggested by Farchi et al. (2016), may be required instead.

In May–June 2018, the CO2 plumes of two large coal-fired power plants, Bełchatów in Poland and Jänschwalde in Germany, were observed with aircraft in situ and remote sensing measurements in the context of the CoMet campaign (Carbon Dioxide and Methane Mission; Fix et al., 2018; Gałkowski et al., 2021; Fiehn et al., 2020; Krautwurst et al., 2021; Wolff et al., 2021). These measurements provide a unique opportunity to study the capability of atmospheric transport models to simulate such plumes in a realistic manner and to define optimal sampling and modeling strategies for emission quantification.

Since several research groups were already performing or planning to perform simulations for these power plants, a coordinated effort was undertaken to compare the different models operated by the groups. A joint modeling protocol (File S2 in the Supplement) was created to harmonize the setup of the models (simulation periods, domains, location, and intensity of the source) and the output (data format, variables, and output grid) as much as possible in order to simplify the data analysis and to make the results comparable. Finally, six research groups operating six different atmospheric transport models agreed to perform simulations following this protocol and to contribute to the present study. Our study includes five different Eulerian transport models but only one Lagrangian dispersion model. A similar model evaluation study including other Lagrangian models was recently presented by Karion et al. (2019). The present study complements their analysis by focusing specifically on emissions from power plants rather than on surface emissions. Another related study was conducted by Angevine et al. (2020), who simulated the plume of a power plant with a single Lagrangian transport model to analyze different sources of error in top-down emission estimates, including uncertainties in winds and boundary layer heights, as also addressed in our study.

The overall aims of this study are as follows:

  • Evaluate the model simulations against in situ and remote sensing observations with respect to selected meteorological parameters and CO2 concentrations.

  • Analyze how the spatiotemporal variability and dispersion of the plumes are represented by the different models operating on a wide range of resolutions and provide recommendations for an optimal model setup.

  • Analyze how well emissions can be quantified from future CO2M satellite observations using two well-established methods, namely the cross-sectional flux and integrated mass enhancement method.

  • Provide recommendations for future measurement campaigns to optimally support the validation of model simulations and satellite observations.

2 Aircraft measurements of power plant plumes

In May–June 2018, the CoMet 1.0 (Carbon Dioxide and Methane Mission) intensive measurement campaign was conducted to study CH4 and CO2 emissions from hot spots in Europe. A particular focus was placed on methane emissions from coal mining and other industrial activities in the Upper Silesian Coal Basin in Poland (Fiehn et al., 2020; Kostinek et al., 2021; Krautwurst et al., 2021). Three aircraft were operated during the campaign, two by the German Aerospace Center (DLR) and one by Freie Universität Berlin (FUB). One of the goals of the campaign was to evaluate the lidar system CHARM-F (Amediek et al., 2017), an airborne demonstrator of the upcoming satellite mission MERLIN (Methane Remote Sensing Lidar Mission; Ehret et al., 2017), and to investigate its capabilities to detect atmospheric gradients in vertical columns of CO2 and CH4 and plumes of individual sources. Another goal was to evaluate the synergistic use of airborne remote sensing and in situ measurements for source detection and quantification.

References cited in Table 1: Fiehn et al. (2020); Mallaun et al. (2015); Gałkowski et al. (2021); Amediek et al. (2017); Gerilowski et al. (2011).

Table 1. An overview of the measurements used for model evaluation. The time range indicated for each aircraft is not the time between takeoff and landing but the interval between the start of the first plume transect and the end of the last transect, in coordinated universal time (UTC). The availability of an instrument is indicated by a × symbol. Note that ppm is parts per million.

a Time of CHARM-F transects; plume not detected by the in situ instrument JIG. b First transects at 08:50–10:00 UTC for MAMAP (Methane Airborne Mapper); last transects at 10:00–10:50 UTC for in situ measurements.


One of the aircraft, the DLR Cessna, was equipped with in situ instruments and mostly flew in the atmospheric boundary layer (ABL). The two other aircraft, the DLR HALO and the FUB Cessna, primarily flew at constant altitudes above the ABL to measure vertical columns of CH4 and CO2 with active and passive remote sensing, using the CHARM-F lidar and the MAMAP (Methane Airborne Mapper) spectrometer (Gerilowski et al., 2011; Krautwurst et al., 2021), respectively. An overview of the airborne instruments and of the measurements used in this study is presented in Table 1.

To evaluate the capability of quantifying emissions of large CO2 point sources with different measurement techniques and sampling strategies, the plumes of two of the largest coal-fired power plants in Europe were sampled on 3 different days during the campaign. The Jänschwalde power plant in Germany was visited on 23 May 2018 by the two aircraft equipped with remote sensing instruments, i.e., the DLR HALO and the FUB Cessna. The FUB Cessna aircraft also carried an in situ instrument and performed multiple transects through the plume at different altitudes in the ABL. The Bełchatów power plant in Poland was visited twice, on 29 May 2018 by the DLR HALO only and on 7 June 2018 by all three aircraft. The in situ and remote sensing data collected on 7 June are the most comprehensive data set used in this study, while the comparatively small data set collected on 29 May 2018 is not analyzed here.

Table 2. Power plants and their 2018 annual mean CO2 emissions and actual emissions during the observation periods. All model results are based on actual emissions. Flue gas temperature and effluent flux were estimated from published power plant statistics for Germany (Pregger and Friedrich, 2009), as these were not officially reported.


The Bełchatów plant is the largest coal-fired power plant in Europe and one of the five largest in the world. In 2018, it released a total of 38.4 Mt CO2 yr−1 to the atmosphere, according to the European Pollutant Release and Transfer Register (E-PRTR), which is approximately the same amount as the emissions of the country of Switzerland, as officially reported to UNFCCC (United Nations Framework Convention on Climate Change). CO2 is released through two 299 m tall stacks. The power plant at Jänschwalde is the third-largest in Germany and the fourth-largest in Europe, with a total reported emission of 23.1 Mt CO2 yr−1 in 2018. Unlike at Bełchatów, its emissions are not released through stacks but through six of its nine cooling towers because the modern flue gas treatment reduces the exhaust temperatures to a level that is too low to be vented through stacks (Busch et al., 2002). An overview of the two power plants, their total CO2 emissions in 2018, and the stack parameters used for plume rise calculations in this study is presented in Table 2. It should be noted that the coordinates are not identical to those reported in E-PRTR but were selected as the center of all emitting stacks or cooling towers. The difference between the reported address and the true location of the source was about 800 m for Bełchatów and about 300 m for Jänschwalde.

2.1 Model systems

Simulations were conducted with six different state-of-the-art atmospheric transport and dispersion models with horizontal resolutions between 2 km and 200 m. Two of the models, COSMO-GHG (Consortium for Small-scale Modeling–Greenhouse Gas) and WRF-GHG (Weather Research and Forecasting–Greenhouse Gas), are mesoscale non-hydrostatic numerical weather prediction (NWP) models extended with the capability to simulate the transport, emissions, and atmosphere–biosphere exchange fluxes of greenhouse gases. Three of the models, WRF-LES (large-eddy simulation), ICON-LEM (ICOsahedral Non-hydrostatic large-eddy model), and EULAG (Eulerian/semi-Lagrangian fluid solver), are LES models in which turbulent eddies larger than a certain filter width are explicitly resolved, whereas the smaller, less energetic scales are parameterized (Heus et al., 2010). The last model, the Atmospheric Radionuclide Transport Model (ARTM), is a Lagrangian particle dispersion model driven by prescribed vertical profiles of wind and turbulence, depending on atmospheric stability. Although all models are able to resolve the plumes in quite some detail (i.e., model grid spacings are small compared to the size of the plume), the different types of models and the wide range of resolutions, varying over 2 orders of magnitude in terms of grid cell area, allow us to study the capabilities and limitations of different model concepts and to investigate the impact of resolution on the characteristics of the plumes. The LES models may be considered to be a reference, as they have the most realistic representation of atmospheric turbulence. However, they are computationally expensive, and their results critically depend on the specific setup and forcing data. A summary of the participating models is presented in Table 3, and brief descriptions are provided in the following.

References cited in Table 3: Brunner et al. (2019); Jähn et al. (2020); Ahmadov et al. (2007); Beck et al. (2011); Wolff et al. (2021); Kern and Jöckel (2016); Prusa et al. (2008); Hanfland et al. (2022).

Table 3. Overview of participating model systems. The model version is the version of the meteorological core (COSMO, WRF, and ICON). The ID is the identifier used to distinguish between different model systems and simulations. The column “IC/BC” denotes the source of the meteorological data used as initial and boundary conditions. The models WRF-GHG, ICON-LEM, and WRF-LES were run in configurations with multiple nests, in which case the column “IC/BC” describes the initial and boundary conditions for the outermost domain. Note: LPDM is the Lagrangian particle dispersion model, and HRES is high resolution.


COSMO-GHG is based on the NWP and regional climate model COSMO (Baldauf et al., 2011), which was developed by a consortium of European weather services under the lead of the German Weather Service (DWD). The GHG extension allows for the simulation of the transport of passive trace gases and their emissions and surface exchange fluxes (Liu et al., 2017; Brunner et al., 2019; Kuhlmann et al., 2019). An online emissions module was developed for a flexible treatment of anthropogenic emissions from point and area sources (Jähn et al., 2020) and was used here to prescribe vertical emission profiles for the two power plants. COSMO-GHG was run in a version optimized for execution on graphics processing units (GPUs; Fuhrer et al., 2014; Brunner et al., 2019). Meteorological initial and boundary conditions were taken from operational COSMO-7 analyses of the Swiss weather service MeteoSwiss at approximately 7 km horizontal and 1 h temporal resolution. The domain of COSMO-7 covers much of Europe and is nested into operational IFS analyses of the European Centre for Medium-Range Weather Forecasts (ECMWF). Within the model domain, the meteorology of COSMO-GHG was nudged toward observations from surface stations, radiosondes, and commercial aircraft, using the scheme of Schraff (1998).

In this study, two distinctly different configurations of WRF were used, WRF-GHG and WRF-LES. The backbone of both is the Weather Research and Forecasting model (WRF; Skamarock et al., 2008) operated with the Advanced Research WRF (ARW) core. WRF-ARW is a state-of-the-art Eulerian NWP model developed in a collaboration of several U.S. research institutions, led by the National Center for Atmospheric Research (NCAR), and integrates the non-hydrostatic, fully compressible Euler equations in flux form on a terrain-following mass-based vertical coordinate. The governing equations are expressed as perturbations from a hydrostatically balanced reference state. The WRF model can be applied from global scale to microscale, where atmospheric processes can be effectively downscaled through one- or two-way nesting. In both configurations, the system is operated as a limited-area model, using meteorological boundary conditions from a larger-scale modeling system, namely the operational high-resolution (HRES) IFS forecast from ECMWF, downloaded at 0.125° × 0.125° horizontal and L137 vertical resolution. A detailed summary of the model domains and parameterizations used in all WRF-GHG and WRF-LES simulations is provided in the Supplement (file S5).

For the WRF-GHG configuration, the WRF-Chem add-on with the GHG option (Beck et al., 2011; Zhao et al., 2019) was used, allowing for the online simulation of the emission, transport, and mixing of CO2. All GHG tracers are treated as chemically inert (i.e., passive). WRF-GHG was run with WRF version in a one-way nested setup, with a parent domain spanning Europe at 10 km × 10 km horizontal resolution, an intermediate nested domain at 2 km × 2 km resolution, and a fine-grid domain run at 0.4 km × 0.4 km horizontal resolution (hereafter labeled WRF-GHG-HR). The fine domain was run with the same parameterizations as the intermediate domain. We used the classic WRF pressure-based terrain-following vertical coordinate, with the model top at 5 kPa (approximately 21 km above mean sea level, a.m.s.l.) and 85 vertical eta levels, with increased level density in the ABL. There were typically 33 levels below an altitude of 2 km. The internal time steps of the three domains were set to 60, 12, and 2 s, respectively. The applied configuration comprised the WRF single-moment five-class microphysics scheme, the Rapid Radiative Transfer Model (RRTM) longwave radiation scheme, the Dudhia shortwave radiation scheme, and the Grell–Freitas ensemble cumulus parameterization (in the 10 km domain only). The land surface was simulated using the Community Land Surface Model version 4. The planetary boundary layer (PBL) was parameterized using the Mellor–Yamada Nakanishi and Niino 2.5 (MYNN) scheme, with the MM5 similarity surface layer parameterization. Similar to the setup of Ahmadov et al. (2007, 2009), the computations were performed as a series of 30 h simulations, with reinitialization of meteorological fields every 24 h (at 18:00 UTC), using the last available IFS forecast data, with the subsequent recycling of the tracer fields at midnight (00:00 UTC), using the output from the end of the previous cycle.

In the WRF-LES configuration, WRF version was used for Bełchatów but WRF version 3.8.1 for Jänschwalde. WRF-LES was also operated with three one-way nested domains with horizontal resolutions of 5, 1, and 0.2 km, respectively, with the outer domain covering a portion of central Europe (see file S5 in the Supplement). Vertically, 57 eta levels were introduced with a top layer pressure of 200 hPa and with increased level density in the ABL (i.e., at least 20 levels below 2 km). The internal time step of the domains was set to 30, 5 and 1 s, respectively. Parameterizations applied in WRF-LES configuration are a Morrison two-moment microphysics scheme, Rapid Radiative Transfer Model radiation (RRTMG) scheme (for short- and longwave radiation), the Unified Noah land surface model and the revised MM5 surface layer physics scheme. For the coarser two domains, the planetary boundary layer (PBL) was parameterized using the same setup as in WRF-GHG, i.e., the Mellor–Yamada Nakanishi and Niino 2.5 (MYNN) scheme with the MM5 similarity surface layer parameterization. The innermost domain was run as a large-eddy simulation (LES) to resolve local turbulence. This configuration uses full 3D diffusion for turbulent mixing and a prognostic equation for turbulent kinetic energy. The implementation of passive CO2 tracers in WRF-LES was applied following the methodology of Blaylock et al. (2017) and used in Wolff et al. (2021) for simulations of the Jänschwalde coal-fired power plant.
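The two nested WRF setups described above pair each grid spacing with a time step, and the listed numbers can be checked for internal consistency. The sketch below computes the parent-to-child refinement ratio and the time step per kilometre of grid spacing for each nest. The threshold of 6 s per km is a commonly quoted WRF rule of thumb and is an assumption here, not something stated in the model descriptions; the dictionary and function names are illustrative only.

```python
# Grid spacing (km) and time step (s) per nest, outermost first, as listed in the text.
NESTS = {
    "WRF-GHG": [(10.0, 60.0), (2.0, 12.0), (0.4, 2.0)],
    "WRF-LES": [(5.0, 30.0), (1.0, 5.0), (0.2, 1.0)],
}

# ASSUMPTION (not from the text): rule of thumb dt [s] <= 6 * dx [km].
MAX_SECONDS_PER_KM = 6.0


def refinement_ratios(domains):
    """Ratio of parent to child grid spacing for consecutive nests."""
    return [round(parent[0] / child[0], 6)
            for parent, child in zip(domains, domains[1:])]


def seconds_per_km(domains):
    """Time step divided by grid spacing for every nest."""
    return [round(dt / dx, 6) for dx, dt in domains]


def within_rule_of_thumb(domains, limit=MAX_SECONDS_PER_KM):
    """True if every nest respects the assumed dt/dx limit."""
    return all(ratio <= limit for ratio in seconds_per_km(domains))


if __name__ == "__main__":
    for name, domains in NESTS.items():
        print(name, refinement_ratios(domains), seconds_per_km(domains),
              within_rule_of_thumb(domains))
```

With the values from the text, both setups use a refinement ratio of 5 between consecutive nests, and every nest stays at or below 6 s per km.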

The ICON (ICOsahedral Non-hydrostatic) model (Zängl et al., 2015) is a joint project of DWD, the Max Planck Institute for Meteorology (MPI-M), and their partners. For this study, ICON 2.4.0, coupled to the Modular Earth Submodel System (MESSy; Kern and Jöckel, 2016), was used. The spatial and temporal variation in the passive tracer emission in the simulation was controlled by the MESSy interface, whereas the transport of the tracers was handled by ICON. The simulations were performed in a limited-area configuration, with ICON running in LES mode (ICON-LEM; Dipankar et al., 2015). The large-eddy simulations were driven by limited-area ICON simulations over Germany and Poland, respectively, with a grid spacing of approximately 2.5 km. Initial and boundary conditions for these simulations were provided from operational IFS analyses every 6 h (ECMWF, 2020).

EULAG is NCAR's generic numerical framework for solving geophysical flow equations for a wide range of scales and applications. It allows solving the equations of fluid motion in either the Eulerian or the semi-Lagrangian mode (Prusa et al., 2008). The code has been used, in particular, to simulate turbulent flows in LES mode. EULAG is a research code that allows multiple adaptations based on particular user needs. The LES version used here solves the anelastic Navier–Stokes equations in the Eulerian form (Wyszogrodzki et al., 2012). Further model adaptations were performed for the needs of CO2 modeling. In particular, the model was coupled with output from the mesoscale model COSMO-GHG, with several meteorological output fields from COSMO-GHG used to initialize the EULAG simulation and to force the model throughout the simulation. The COSMO-GHG fields provided on a 1 km × 1 km grid every 60 min (every 15 min during the period of CoMet flights) were interpolated to the EULAG spatial grid and time steps. The model domain included 400 × 300 grid points, with a resolution of 0.003° (longitude) × 0.002° (latitude), which corresponds roughly to 208 m × 220 m. The domain was centered on the power plant to allow the buildup of high-resolution upwind circulation in the model domain. The vertical resolution was 50 m. With 60 model levels, the model extended to 3000 m above the surface, which is well above the top of the ABL. The time step was 2 s, with model output stored every 15 min.
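The conversion from the EULAG grid increments in degrees to approximate distances in metres can be reproduced with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km and a latitude of about 51.3° N for the model domain; both values are assumptions for illustration and are not taken from the text, and the function name is hypothetical.

```python
import math

# ASSUMPTION: spherical Earth with mean radius 6371 km.
EARTH_RADIUS_M = 6_371_000.0


def grid_spacing_m(dlon_deg, dlat_deg, lat_deg, radius_m=EARTH_RADIUS_M):
    """Approximate east-west and north-south spacing (m) of a lat-lon grid cell."""
    metres_per_degree = math.pi * radius_m / 180.0
    dx = dlon_deg * metres_per_degree * math.cos(math.radians(lat_deg))
    dy = dlat_deg * metres_per_degree
    return dx, dy


# Grid increments from the text; the latitude of 51.3 N is an assumed value.
dx, dy = grid_spacing_m(dlon_deg=0.003, dlat_deg=0.002, lat_deg=51.3)

# Horizontal extent of the 400 x 300 grid and height of 60 levels of 50 m each.
extent_x_km = 400 * dx / 1000.0
extent_y_km = 300 * dy / 1000.0
model_top_m = 60 * 50

if __name__ == "__main__":
    print(f"dx = {dx:.0f} m, dy = {dy:.0f} m")
    print(f"domain = {extent_x_km:.0f} km x {extent_y_km:.0f} km, top = {model_top_m} m")
```

Under these assumptions the spacing comes out close to the rounded values quoted in the text, and the 60 levels of 50 m reproduce the stated model top of 3000 m.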

The Atmospheric Radionuclide Transport Model (ARTM) is a Lagrangian particle dispersion model (LPDM) developed by Gesellschaft für Anlagen- und Reaktorsicherheit (GRS) gGmbH (Germany's central expert organization in the field of nuclear safety) in 2007 on behalf of the Federal Office for Radiation Protection (BfS) in Germany. It is based on AUSTAL2000 (Janicke and Janicke, 2013), a widely used dispersion model for conventional tracers in Germany, and is designed for modeling the dispersion of radionuclides emitted from nuclear facilities under routine operation on an annual timescale. Here, ARTM version 3.0.0 was used, which employs the same wind and turbulence models as version 2.8.0 (Hanfland et al., 2022) but with the ability to specify the mixing layer depth as given in the modeling protocol. The spatial resolution of ARTM was limited by the maximum horizontal number of grid cells (300 × 300) and their maximum horizontal size (666 m × 666 m). The temporal resolution was limited to 1 h. ARTM runs a diagnostic wind field model, creating wind and turbulence fields with homogeneous density for the simulation, using meteorological data at a single location within the simulation domain. For the Jänschwalde simulation, these meteorological data were taken from COSMO-GHG. For the Bełchatów simulation, the COSMO-GHG data were only used before and after the measurement flight. During the flight, data derived from the in situ wind measurements were used to drive the model with two different wind directions in order to mimic the broad probability distribution of the measured wind directions.

2.2 Modeling protocol

The protocol is provided in the Supplement (file S2), and only the main points are summarized here. Each simulation needed to include a minimum set of three passive CO2 tracers, CO2_PP_H, CO2_PP_M, and CO2_PP_L, representing CO2 emitted by the power plant (PP) according to three different scenarios in terms of vertical release height. In the low-release scenario L, the emissions were released at stack height without additional plume rise. In the reference scenario M, CO2 was released according to a fixed vertical profile calculated using a plume rise model, as described by Brunner et al. (2019). The plume rise model accounts for stack height and stack parameters such as flue gas temperature and volume flow (see Table 2) and for the specific meteorological conditions (wind speed and vertical stability) during the time of the aircraft flights. Meteorological conditions were taken from hourly COSMO-7 analyses of MeteoSwiss at the position of the respective power plant. The scenario H was similar to M but corresponded to a release at a higher altitude computed as the maximum of all hourly plume rise calculations for the day of the flight and the previous day. The vertical profiles for the three scenarios (in meters above surface) are provided in the Supplement (file S3). Each modeling group had to translate these profiles to the respective vertical coordinate system of the model. Constant emission rates corresponding to the annual means reported to E-PRTR for the year 2018 were used in all simulations (Table 2). However, the actual emission rates during the observation periods were different. Following Nassar et al. (2022), we estimated hourly CO2 emissions by comparing actual energy production during the observation periods with annual mean energy production by the two power plants. We assume that the period of power generation relevant for the observations at Bełchatów was 7 June 12:00–14:00 UTC. 
For the observations at Jänschwalde, the corresponding period was 23 May 08:00–10:00 UTC. Based on these assumptions, we estimate that the actual CO2 emission rate at Bełchatów was 47.4 Mt yr−1, i.e., 23 % higher than the annual mean. At Jänschwalde, the actual emission rate was 29.5 Mt yr−1, i.e., 28 % higher than the annual mean. Details of the computation, including tables of annual and hourly energy production and references to the data sources, are provided in the Supplement (file S4). To account for the higher CO2 emission rates during the observation periods, all model-simulated CO2 fields were scaled by a factor 1.23 for Bełchatów and by a factor 1.28 for Jänschwalde.
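The scaling step described above can be illustrated with a few lines of arithmetic. The sketch below converts the annual emission rates to kg s−1 and derives the scaling factors from the annual mean and actual emission rates quoted in the text. It relies on the assumption that the simulated enhancement of a passive tracer scales linearly with the source strength, so that rescaling the output fields is equivalent to rescaling the emissions; the function and variable names are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 2018 was not a leap year

# (annual mean, actual during observations) in Mt CO2 per year, from the text.
PLANTS = {
    "Belchatow": (38.4, 47.4),
    "Janschwalde": (23.1, 29.5),
}


def mt_per_year_to_kg_per_s(rate_mt_yr):
    """Convert an emission rate from Mt per year to kg per second."""
    return rate_mt_yr * 1e9 / SECONDS_PER_YEAR


def scaling_factor(annual_mean_mt_yr, actual_mt_yr):
    """Factor by which fields simulated with annual mean emissions are scaled."""
    return actual_mt_yr / annual_mean_mt_yr


def scale_tracer(enhancements_ppm, factor):
    """Scale passive-tracer enhancements (assumes linearity in the source strength)."""
    return [value * factor for value in enhancements_ppm]


if __name__ == "__main__":
    for name, (annual, actual) in PLANTS.items():
        factor = scaling_factor(annual, actual)
        print(f"{name}: {mt_per_year_to_kg_per_s(annual):.0f} kg/s annual mean, "
              f"scaling factor {factor:.2f}")
```

Rounded to two decimals, the ratios reproduce the factors 1.23 and 1.28 applied to the simulated CO2 fields.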

Optionally, additional tracers could be simulated representing background CO2, CO2 emitted by all other anthropogenic sources within the model domain, and CO2 from biospheric uptake and release. Summing up these tracers with one of the three power plant tracers should allow for a direct comparison with the in situ CO2 measurements. In this study, however, we focus on the analysis of the power plant tracers only and compare the simulations with observed CO2 plume enhancements above a local background. The XCO2 remote sensing data from MAMAP and CHARM-F were already provided as deviations from a local background.

Table 4. Overview of the two model simulations, the minimum time period to be covered, and the longitude and latitude range and resolutions of the two output grids.


All simulations were required to cover at least the day of the flight and the previous day but were free to include additional days of spin-up. Model output had to be reported on a prescribed latitude–longitude grid for both a large domain (approx. 200 km × 200 km) with about 1 km resolution and a small domain (approx. 60 km × 60 km) with about 200 m resolution. The small domain was selected to be sufficiently large to cover all aircraft transects. The large domain also captures parts of the plume more than 30 km downwind of the source that may still be detectable by a future satellite such as CO2M (Kuhlmann et al., 2021b). Models running at very high resolution were only able to cover the small domain. In contrast to the horizontal direction, no grid was specified for the vertical direction, and output was reported in the native vertical coordinate system of each model. The output had to be produced in a standardized netCDF format and had to include both meteorological variables (pressure, temperature, specific humidity, horizontal wind components, and geopotential) and the different CO2 tracers. An overview of the two mandatory simulations and the corresponding small and large output grids is presented in Table 4. A map of the domains and the ground tracks of the three aircraft is shown in Fig. 1.

Figure 1. Overview of the large (blue) and small (red) model output domains for Jänschwalde (left) and Bełchatów (right). Overlaid are the flight tracks of HALO (white), DLR Cessna (dark gray; only at Bełchatów), and FUB Cessna (black) on the corresponding measurement days. The background map shows the contrast between land (green shading) and sea areas (blue shading).

2.3 Model performance assessment

The model simulations were compared with each other and with the in situ and remote sensing observations. The comparison between models allows for assessing the influence of different model types, configurations, and resolutions. It also allows investigating how differences in meteorology such as wind speeds and depth and stability of the ABL affect the model results. The comparison with observations allows evaluating how well the main characteristics of the plumes are reproduced and how well the simulated meteorology captures the true situation.

It is important to note that, in the presence of atmospheric turbulence, the comparability between models and observations is fundamentally limited due to the stochastic and chaotic nature of the turbulence (Lorenz, 1969). The observations only provide snapshots, and each simulation only represents a single realization. Repeating a simulation with slightly perturbed initial conditions would produce a different plume evolution with different patterns of meandering, stretching, and thinning that characterize a turbulent plume. It is therefore more meaningful to compare statistical properties such as width and amplitude of the plumes rather than comparing models and observations point by point. Other properties could also have been investigated, such as probability density functions or spectra of concentration fluctuations, but these properties are more sensitive to measurement uncertainties, which differed strongly between remote sensing and in situ measurements.

To compare plumes between the models and in situ and remote sensing observations, we divided the flights into individual plume transects (see Figs. S1–S5 in the Supplement) and fitted a Gaussian distribution to the CO2 data after subtracting a linearly changing background for each transect. The background was computed as a line connecting the 10th percentile of the first one-fifth of data points with the 10th percentile of the last one-fifth of data points in the transect. This was only done for the in situ measurements since the remote sensing data were already provided as enhancements above background. The Gaussian distribution can be described as follows:
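The background subtraction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text specifies the percentile and the fraction of data points, but not the percentile interpolation rule or where along the transect the two background values are anchored. Here, linear interpolation between order statistics is used and the anchors are placed at the centers of the first and last fifth. The function names `percentile` and `subtract_linear_background` are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0-100) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def subtract_linear_background(co2):
    """Return CO2 enhancements above a linearly varying background."""
    n = len(co2)
    k = max(1, n // 5)                       # size of the first/last fifth
    b_start = percentile(co2[:k], 10.0)      # background near transect start
    b_end = percentile(co2[-k:], 10.0)       # background near transect end
    # Assumption: anchor each background value at the center of its fifth
    i_start = (k - 1) / 2.0
    i_end = n - 1 - (k - 1) / 2.0
    slope = (b_end - b_start) / (i_end - i_start) if i_end > i_start else 0.0
    return [c - (b_start + slope * (i - i_start)) for i, c in enumerate(co2)]
```

Because the 10th percentile lies below the median of each end segment, the enhancements of plume-free samples come out slightly positive rather than centered on zero.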

(1) $c_p(y) = \frac{A}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right),$

where cp is either the CO2 mole fraction (in parts per million, ppm) or the column integral (mol cm−2). The three fit parameters are the area integral A (either in ppm m or mol cm−2 m (=100 mol cm−1)), the plume width σ (m), and the plume position shift μ (m). Flight coordinates (latitude and longitude) were translated into a Cartesian coordinate system (units of m), with its origin placed at the position of the power plant. The variable y describes the distance (m) flown from the starting point of each transect. The parameters A, μ, and σ were estimated using a nonlinear least squares Levenberg–Marquardt minimization method starting from an initial guess. The initial value of μ was set to the center of the transect, σ to 2000 m, and the area integral A to 500 ppm m for in situ measurements and 0.1 mol cm−2 m for column measurements. When no solution was found, a 3 times larger initial value of σ was chosen. In this way, the method almost always converged to a solution, though sometimes with large uncertainties. Uncertainties in all three parameters were also obtained from the fit procedure.
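The fitting procedure can be illustrated with a minimal, self-contained Python sketch. It is not the implementation used for this study: the Levenberg–Marquardt loop is hand-written with a simple multiplicative damping schedule, the convergence test (a negligibly small proposed step) is an assumption, and parameter uncertainties are not computed. Only the initial values (μ at the transect center, σ = 2000 m, A = 500 ppm m) and the retry with a 3 times larger σ are taken from the text. All function names are hypothetical.

```python
import math


def gaussian(y, area, mu, sigma):
    """Eq. (1): Gaussian with area integral `area`, center `mu`, width `sigma`."""
    z = (y - mu) / sigma
    return area / (math.sqrt(2.0 * math.pi) * sigma) * math.exp(-0.5 * z * z)


def _solve3(m, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for i in range(3):
        p = max(range(i, 3), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(i + 1, 3):
            f = a[r][i] / a[i][i]
            for c in range(i, 4):
                a[r][c] -= f * a[i][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (a[i][3] - sum(a[i][c] * x[c] for c in range(i + 1, 3))) / a[i][i]
    return x


def _cost(ys, cs, p):
    """Sum of squared residuals; infinite for a non-positive width."""
    if p[2] <= 0.0:
        return math.inf
    total = 0.0
    for y, c in zip(ys, cs):
        r = c - gaussian(y, p[0], p[1], p[2])
        total += r * r
    return total


def fit_gaussian(ys, cs, p0, max_iter=200, tol=1e-10):
    """Return ([A, mu, sigma], converged) from a basic Levenberg-Marquardt loop."""
    p, lam = list(p0), 1e-3
    cost = _cost(ys, cs, p)
    for _ in range(max_iter):
        jtj = [[0.0] * 3 for _ in range(3)]
        jtr = [0.0] * 3
        for y, c in zip(ys, cs):
            shape = gaussian(y, 1.0, p[1], p[2])
            g = p[0] * shape
            z = (y - p[1]) / p[2]
            # Partial derivatives with respect to A, mu, and sigma
            jac = (shape, g * z / p[2], g * (z * z - 1.0) / p[2])
            for i in range(3):
                jtr[i] += jac[i] * (c - g)
                for j in range(3):
                    jtj[i][j] += jac[i] * jac[j]
        for i in range(3):
            jtj[i][i] *= 1.0 + lam            # Marquardt damping of the diagonal
        try:
            step = _solve3(jtj, jtr)
        except ZeroDivisionError:             # singular normal equations
            return p, False
        if all(abs(s) <= tol * (abs(q) + 1.0) for s, q in zip(step, p)):
            return p, True                    # proposed step is negligible
        trial = [q + s for q, s in zip(p, step)]
        trial_cost = _cost(ys, cs, trial)
        if trial_cost < cost:                 # accept step, relax damping
            p, cost, lam = trial, trial_cost, lam / 10.0
        else:                                 # reject step, increase damping
            lam *= 10.0
    return p, False


def fit_transect(ys, cs, area0=500.0, sigma0=2000.0):
    """Fit with the initial guess from the text; retry once with 3 * sigma0."""
    mu0 = 0.5 * (ys[0] + ys[-1])              # center of the transect
    params, ok = fit_gaussian(ys, cs, (area0, mu0, sigma0))
    if not ok:
        params, ok = fit_gaussian(ys, cs, (area0, mu0, 3.0 * sigma0))
    return params, ok
```

For column measurements, `area0` would be set to 0.1 (mol cm−2 m) instead of the in situ default. In practice, a library routine that also returns the parameter covariance would be preferable to this hand-written loop.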

We estimate the true plume width from the Gaussian fit as σ⋅cf, where the geometric correction factor cf=cos(atan(yc/xc)) accounts for the fact that transects were not perfectly perpendicular to the plume axis. Here, yc and xc denote the coordinates of the plume center in the coordinate system centered on the power plant. Finally, for each transect, the start and end coordinates and times, the total length of the transect, the fit parameters and their uncertainties, and the location and distance of the plume center from the power plant were stored in a text file, following the YAML (Yet Another Markup Language) specifications. Plume amplitude was computed as the maximum of the Gaussian curve at the location y=μ. In order to make the amplitude comparable between in situ and column observations, the columns were converted from moles per square centimeter to parts per million, assuming that the plume extends uniformly over the full depth of the ABL, which was estimated from the observations to be 175 hPa deep (from the surface at 200 m to the top of the ABL at 1800 m a.s.l.) for Bełchatów and 160 hPa deep (60–1520 m) for Jänschwalde.
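The geometric correction, the plume amplitude, and the column-to-mole-fraction conversion can be written down compactly. The following Python sketch is illustrative only. The conversion assumes hydrostatic balance and dry air, so that the number of moles of air per unit area in the ABL is Δp/(g·M_air); the text states the ABL depths in hPa but not the exact conversion formula, and the constants and function names below are assumptions.

```python
import math

G = 9.80665          # m s-2, standard gravity (assumed constant)
M_AIR = 0.0289647    # kg mol-1, molar mass of dry air (water vapor neglected)


def geometric_correction(xc, yc):
    """cf = cos(atan(yc / xc)) for a plume center at (xc, yc), plant at origin."""
    return math.cos(math.atan(yc / xc))


def plume_amplitude(area, sigma):
    """Maximum of the fitted Gaussian, reached at y = mu."""
    return area / (math.sqrt(2.0 * math.pi) * sigma)


def column_to_ppm(column_mol_cm2, abl_depth_hpa):
    """Convert a CO2 column enhancement (mol cm-2) to a mean ABL mole fraction (ppm)."""
    air_mol_m2 = abl_depth_hpa * 100.0 / (G * M_AIR)   # hydrostatic air column
    air_mol_cm2 = air_mol_m2 * 1.0e-4
    return column_mol_cm2 / air_mol_cm2 * 1.0e6
```

Under these assumptions, a 175 hPa deep ABL contains about 6.2 mol cm−2 of air, so a column enhancement of roughly 6.2 µmol cm−2 corresponds to 1 ppm averaged over the ABL. Note that `geometric_correction` is undefined for xc = 0.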

3 Results and discussion

3.1 Maps of total column XCO2

In order to compare the representation of the Bełchatów plume between the different models, Figs. 2 and 3 show the evolution of total column XCO2 fields (CO2_PP_M tracer) on 7 June 2018 from the early morning to the early afternoon. In all models except ICON-LEM (Fig. 3d–i), the plume is transported in a northwesterly direction at all times. In the early morning at 05:00 UTC (approx. 06:00 local time, where local time corresponds to Central European Time, CET), the plume is compact and laminar in almost all models. A fanning out is visible in some models, which is a consequence of the advection of the plume in different directions due to vertical wind shear. After sunrise at 04:28 CET on 7 June, the ABL slowly starts to grow and eventually encompasses the plume release height. At this point in time, the plume starts to become turbulent.

Figure 2. Time evolution of the Bełchatów total column XCO2 plume on 7 June 2018 from 05:00 to 13:00 UTC in the NWP models of COSMO (a–c) and WRF-GHG (d–i) and in the Lagrangian model ARTM (j–l).


Figure 3. Time evolution of the Bełchatów total column XCO2 plume on 7 June 2018 from 05:00 to 13:00 UTC in the LES models of WRF-LES (a–c), ICON-LEM (d–i), and EULAG (j–l).


The onset of turbulence is clearly visible at 09:00 UTC in the LES models (Fig. 3b, e, h) and the high-resolution WRF-GHG simulation (Fig. 2h), whereas turbulence is still moderate in COSMO-GHG and the low-resolution WRF-GHG simulation (Fig. 2b, e). The plume reaches a highly turbulent state by 13:00 UTC, around the time of the aircraft flights. The widest plumes at this time of the day are simulated by COSMO-GHG, WRF-LES, and EULAG (Figs. 2c, 3c, l). In COSMO-GHG, this is due to mixing of a small portion of the plume into the free troposphere, where the wind direction was nearly opposite to that in the ABL. The same effect, though less pronounced, is also seen in WRF-LES.

The size spectrum of the turbulence is wide enough that even NWP models like COSMO-GHG or WRF-GHG running at 1–2 km resolution are able to resolve the largest eddies. However, the variability in XCO2 clearly grows with resolution, which is especially evident when comparing the two WRF-GHG simulations at 2 and 0.4 km resolution, respectively (Fig. 2f, i). Another impact of the resolution is that the plume expands much more quickly in the initial phase upon release in a low-resolution simulation, which is again best seen by comparing the two WRF-GHG runs. No turbulent structures are visible in ARTM. Instead, the plumes mostly have a Gaussian shape, except for a fanning out at 05:00 UTC due to vertical wind shear. ARTM is forced with constant vertical wind profiles every 60 min. As a result, the plume can slightly change direction with distance from the source. In ARTM, the plume is only slightly wider at 13:00 UTC than at 09:00 UTC but much wider compared to nighttime (not shown). This shows that ARTM also accounts for increased turbulent dispersion during daytime, though the plume is significantly more compact than in other models. Tests with different turbulence parameterizations indicate that the standard configuration of ARTM tends to produce too narrow plumes (Hanfland et al., in preparation, 2023). Except for resolution, there seems to be no clear difference between NWP and LES models. The plume simulated by WRF-GHG at high resolution, for example, is structurally very similar to the plumes simulated by WRF-LES and ICON-LEM at comparable resolution.

Similar maps for the plume of the Jänschwalde power plant on 23 May 2018 are presented in Fig. S7 in the Supplement. Only results for 10:00 UTC are shown, which roughly corresponds to the time of the aircraft overpasses. At this time of the day, the turbulent structures of the plume were not yet as wide (only visible in the LES simulations), and the plume itself was less dispersed than the plume observed around noon at Bełchatów.

3.2 Qualitative comparison with in situ observations

The DLR Cessna flew a total of 12 transects through the Bełchatów plume at multiple levels and at three distances from the source (Fig. S1 in the Supplement). The CO2 measurements along these transects provide detailed insights into the horizontal and vertical extent of the plume. To compare the simulations with the observations, meteorological quantities and CO2 mole fractions were interpolated to the flight track. In a first step, the 4-D model fields were interpolated in time and latitude–longitude space to produce vertical curtains along the flight track. In a second step, these curtains were interpolated vertically to the flight altitude to produce a time series corresponding to the observations.
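The two-step sampling of the model fields can be illustrated with a simplified Python sketch. It assumes that the horizontal (latitude–longitude) interpolation has already been carried out, so that for each output time a vertical profile above the aircraft position is available; only the interpolation in time (yielding one column of the curtain) and the subsequent vertical interpolation to the flight altitude are shown. Linear interpolation and the function names are assumptions, not a description of the code used in this study.

```python
from bisect import bisect_right


def interp1d(x, xs, fs):
    """Piecewise-linear interpolation of fs(xs) at x, clamped at both ends."""
    if x <= xs[0]:
        return fs[0]
    if x >= xs[-1]:
        return fs[-1]
    i = bisect_right(xs, x) - 1
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return (1.0 - w) * fs[i] + w * fs[i + 1]


def sample_along_track(times, profiles, level_heights, t_obs, z_obs):
    """Sample a tracer at observation time t_obs and flight altitude z_obs.

    profiles[k][j] is the tracer value at output time times[k] and model
    level j (height level_heights[j]) above the aircraft position.
    """
    # Step 1: interpolate each level in time -> one column of the curtain
    column = [
        interp1d(t_obs, times, [p[j] for p in profiles])
        for j in range(len(level_heights))
    ]
    # Step 2: interpolate the column vertically to the flight altitude
    return interp1d(z_obs, level_heights, column)
```

Applying `sample_along_track` to every observation time yields a simulated time series that can be compared directly with the measurements.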

Figure 4. Curtains of the Bełchatów CO2 plume along the DLR Cessna flight on 7 June 2018. Figures show the middle release tracer CO2_PP_M.


Curtains of CO2 along the flight track are presented in Fig. 4 for all model simulations. The corresponding in situ measurements are overlaid as colored circles with the same color scale. A constant background of 399.8 ppm was subtracted from the observations, which is 1 ppm higher than the lowest observed mole fraction. The first transect was flown at an altitude of 1000 m close to the source at a distance of about 9 km, followed by seven transects at 14 km distance, starting at an altitude of 800 m and rising step by step to 1900 m. Another four transects were then flown at 26 km distance between 800 and 1450 m above sea level. Finally, the aircraft rose to an altitude of 2200 m, well above the ABL.

The plume enhancements are clearly visible in the observations, typically near the center of the horizontal transects. Elevated CO2 mole fractions were also measured at the highest altitudes above about 1800 m at the beginning of the flight at around 13:05 UTC, later at around 14:15 UTC, and especially after 14:50 UTC, when the aircraft rose to 2200 m. These enhancements were due to higher background CO2 above the ABL, which is typical of summertime when biospheric uptake by photosynthesis reduces CO2 in the continental ABL (Sweeney et al., 2015). These elevated values are therefore not reproduced by the simulated power plant tracers. An exception is the situation at around 14:15 UTC, where likely a mixed signal of elevated CO2 from background air above the ABL and from the plume was measured. Peak values of both CO2 and other species like CH4 were somewhat higher than observed elsewhere at similar altitudes in the free troposphere.

The multiple plume crossings are also visible in the simulated curtains (Fig. 4a–h). In all simulations, the plume essentially extends from the surface to the top of the ABL, which suggests rapid vertical mixing in an unstable, convective ABL. Since the closest transect was at 9 km, and typical wind speeds in the ABL were around 5 m s−1 (18 km h−1), the simulated timescale of vertical mixing in the ABL was only of the order of 30 min. As a consequence, there is no clear difference between the tracers CO2_PP_H, CO2_PP_M, and CO2_PP_L released at different altitudes. Figure 4 shows the results for the reference tracer CO2_PP_M. Results for the other two tracers are presented in the Supplement (Figs. S8, S9).
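The timescale quoted above follows from simple advection arithmetic, which the short Python sketch below reproduces. It assumes straight-line transport at a constant wind speed; the function names are hypothetical.

```python
def advection_time_min(distance_km, wind_speed_ms):
    """Travel time (min) over distance_km at a constant wind speed (m s-1)."""
    return distance_km * 1000.0 / wind_speed_ms / 60.0


def ms_to_kmh(speed_ms):
    """Convert a speed from m s-1 to km h-1."""
    return speed_ms * 3.6
```

With a distance of 9 km and a wind speed of 5 m s−1 (18 km h−1), `advection_time_min(9.0, 5.0)` gives 30 min, which is an upper bound on the time available for the plume to mix vertically before reaching the closest transect.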

The shape, width, and vertical extent of the plume vary quite substantially between the models. The plumes are more strongly dispersed horizontally and less well confined at the top in COSMO-GHG and WRF-LES (Fig. 4a, c), compared to WRF-GHG and ICON-LEM (Fig. 4b, d, e, f). In the latter two models, the plumes are sharply capped at the top of the ABL, suggesting little exchange with the free troposphere aloft. In the high-resolution version of WRF-GHG, the ABL is about 100 m deeper than in the low-resolution version. The plume extends too high, to about 2400 m in ARTM (Fig. 4g), because of the coarse vertical resolution of the model output grid in the upper part of the domain. The top layer in ARTM extends from 2100 to 2400 m. The assumed mixing layer top of 2000 m above surface for the period of the aircraft flight allowed the plume to mix into the top layer and in this way to reach 2400 m. A finer vertical resolution in the upper part of the domain would likely have prevented this. In contrast to ARTM, the plume stays comparatively low in EULAG, mostly below 1500 m (Fig. 4h). This can be compared to the COSMO-GHG model, which provides the lateral forcing data (and surface sensible heat fluxes) for EULAG. Both models show the main part of the plume at rather low altitudes, but the plume is even lower in EULAG and also more sharply capped at the top of the ABL compared to COSMO-GHG. EULAG simulated a more compact plume than COSMO-GHG in the horizontal direction as well.

Figure 5. Curtains of potential temperature along the DLR Cessna flight on 7 June 2018.


Many of the differences between the models can be explained by differences in the structure of the ABL. Figure 5 presents curtains of potential temperature for all models with the observations overlaid. Consistent with the CO2 curtains, the capping inversion at the top of the ABL is much sharper in WRF-GHG and ICON-LEM (Fig. 5b, d, e, f) than in COSMO-GHG, WRF-LES, and EULAG (Fig. 5a, c, g). In the latter models, the inversion is not only weaker but also more diffuse, suggesting significant entrainment–detrainment at the interface between the ABL and the free troposphere. This likely explains why parts of the CO2 plume are advected in the reverse direction, especially in COSMO-GHG (see Fig. 2c). Compared to the observations, the top of the ABL is too high and too weakly stratified in WRF-LES (Fig. 5c). Vertical stability already starts increasing at about 1500 m, so that only a small fraction of the plume mixes into the top of the ABL at 1900–2000 m. A similar conclusion can be drawn for COSMO-GHG and especially EULAG, where stability already starts increasing well below 1500 m. In contrast, WRF-GHG and especially ICON-LEM show an almost perfectly neutral ABL up to the capping inversion. Despite a comparatively low ABL, the core of the plume extends higher up in these models. No curtain is shown for ARTM since turbulent mixing in this model is not constrained by a temperature profile but is prescribed depending on stability class. The measurements indicate a top of the ABL at about 1800 m that is capped by a sharp inversion. WRF-GHG is the model that captures the ABL structure most accurately. Similar curtains of wind speed are presented in the Supplement (Fig. S10).

Figure 6. Curtains of the Jänschwalde CO2 plume along the FUB Cessna flight on 23 May 2018. Figures show the middle release tracer CO2_PP_M.


Plume transects at multiple vertical levels in the ABL were also performed by the FUB Cessna aircraft at Jänschwalde during the second part of its flight on 23 May 2018. In the first part, the aircraft had flown above the ABL (close to its top) to sample vertical column transects of the plume with MAMAP. Curtains of CO2 along the second part of the flight are compared with the in situ measurements in Fig. 6. No simulations are available for this flight from the high-resolution version of WRF-GHG and from EULAG. Compared to the observations, the plume is too wide and amplitudes are too low in COSMO-GHG and WRF-GHG (Fig. 6a, b), which are the two models with comparatively low resolution. These models also underestimate the vertical extent of the plume, which was clearly detectable in the observations up to about 1500 m, whereas the simulated plumes only extend to about 1300–1400 m. Somewhat surprisingly, the observed plume was stronger during the first three transects at the highest flight levels in the ABL, which is opposite to the strengths of the plumes simulated by COSMO-GHG and WRF-GHG. This behavior is quite well reproduced by WRF-LES, which, however, simulated a plume with a more complex structure compared to the observations, suggesting that it might have overestimated turbulence intensity. A similarly complex structure with two or more sub-plumes was also simulated by ICON-LEM. As for Bełchatów, the plume is displaced in the ICON-LEM model, suggesting that winds were not accurately captured. Curtains of wind speed (see Fig. S12) show that ICON-LEM behaves markedly differently from the other models. Note that no temperature and wind measurements are available for this flight.

Figure 7. Time series of CO2 (tracer CO2_PP_M) along the DLR Cessna flight at Bełchatów on 7 June 2018. The gray line is the flight altitude (second y axis).


Time series of observed and simulated CO2 along the DLR Cessna flight at Bełchatów are presented in Fig. 7. Both observations and simulations were averaged over 5 s intervals along the flight track, which corresponds to a distance of about 350 m. The observations reveal sharp peaks of more than 40 ppm in the first transect and gradually wider and lower peaks down to about 10–15 ppm in the last four transects. The width and amplitude of the simulated peaks vary considerably. COSMO-GHG consistently underestimates the peaks (Fig. 7a), especially at the higher flight levels, due to insufficient mixing into the upper ABL, as mentioned before. WRF-GHG underestimates the plume amplitude in the low-resolution setup (Fig. 7b) but captures and partly overestimates the amplitudes in the high-resolution setup (Fig. 7d). The plume transects are remarkably well represented in WRF-LES (Fig. 7c), except for the first transect, where the peak amplitude is underestimated. A similar underestimation for this transect is also present in other models, which may indicate that a turbulent structure of unusually high CO2 concentrations was encountered during the flight. In ICON-LEM, the plumes tend to be narrower than observed (Fig. 7e, f), and they are displaced due to the erroneous wind direction. ARTM reproduces plume location and amplitude quite accurately, but the plumes tend to be narrower than observed, despite the use of two alternating wind directions in the simulations, which generated additional plume spread. Finally, EULAG captures the plume quite well at the lowest flight level but fails to reproduce the observed peaks at the higher levels due to the insufficient vertical extent of the plume, as mentioned before.
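The 5 s averaging applied to both observations and simulations can be sketched as follows. The windowing convention (consecutive, non-overlapping windows counted from the first sample) is an assumption, as the text only specifies the window length; the function name is hypothetical.

```python
def block_average(times, values, window_s=5.0):
    """Average (time, value) samples in consecutive windows of window_s seconds.

    Returns a list of (mean_time, mean_value) tuples, one per non-empty window.
    """
    sums = {}
    for t, v in zip(times, values):
        k = int((t - times[0]) // window_s)      # index of the window
        acc = sums.setdefault(k, [0.0, 0.0, 0])
        acc[0] += t
        acc[1] += v
        acc[2] += 1
    return [(a[0] / a[2], a[1] / a[2]) for _, a in sorted(sums.items())]
```

A distance of about 350 m per 5 s window implies an aircraft ground speed of roughly 70 m s−1.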

Corresponding time series of potential temperature and wind speed are presented in the Supplement (Figs. S15 and S16). While average wind speeds are quite accurately captured by COSMO-GHG and WRF-GHG, they are slightly overestimated by WRF-LES and strongly overestimated by ICON-LEM. In EULAG, mean wind speeds are close to observations, but fluctuations are too large, suggesting that turbulence was too strong in this model.

Figure 8. Time series of CO2 (tracer CO2_PP_M) along the FUB Cessna flight at Jänschwalde on 23 May 2018. The gray line is the flight altitude (second y axis).


Time series of in situ observed and simulated CO2 along the FUB Cessna flight at Jänschwalde are presented in Fig. 8. The plumes are quite well represented by WRF-LES. They are generally too wide and underestimated by WRF-GHG. COSMO-GHG and ARTM underestimate the plumes during the first three transects but overestimate them during the last two. ICON-LEM misplaces the plumes. Similar to Bełchatów, there is a large variability between model results, suggesting that it is very difficult to represent the observed plumes in all details.

3.3 Qualitative comparison with vertical column XCO2 observations

Models may be more successful in reproducing vertical columns as these are much less sensitive to vertical transport and mixing in the ABL. Time series of vertically integrated CO2 (µmol cm−2) from the different models interpolated in time and space to the flight tracks of the FUB Cessna at Bełchatów (7 June 2018) and Jänschwalde (23 May 2018) are compared in Figs. 9 and 10, with corresponding vertical columns measured by MAMAP.

Figure 9. Time series of CO2 (tracer CO2_PP_M) column enhancements simulated and observed by MAMAP along the FUB Cessna flight at Bełchatów on 7 June 2018. The plumes observed around 12:20 and 13:45 UTC, which are not reproduced by any of the models, were measured upwind of the power plant. These plumes are caused by retrieval issues over water surfaces rather than by real CO2 enhancements.


Figure 10. Time series of XCO2 (tracer CO2_PP_M) along the FUB Cessna flight at Jänschwalde on 23 May 2018.


In contrast to the comparison with in situ CO2, COSMO-GHG reproduces the observed total columns at Bełchatów quite accurately (Fig. 9a). ICON-LEM, however, tends to underestimate the total column amounts (area under the curve) and partly misses the plumes due to a wrong wind direction (Fig. 9e, f). The underestimation can be explained by a strong overestimation of wind speeds in the ABL (see Fig. S16). Furthermore, the plumes are much narrower than observed, which was already noticed in the comparison with the in situ measurements and could also be a consequence of too high wind speeds. Too narrow plumes are also simulated by ARTM during the first half of the flight (first seven transects in Fig. 9g between 12:20 and 13:27 UTC), but, unlike in ICON-LEM, this leads to an overestimation of peak amplitudes. During the second half (last seven transects), the plumes simulated by ARTM agree much better with the observations. WRF-LES and WRF-GHG capture the plume transects quite well and mostly at the correct position (Fig. 9b, c, d). Peak amplitudes match the observations better in the high-resolution version of WRF-GHG. EULAG reproduces the total columns much better than the in situ CO2 because the underestimation of the vertical extent of the plume does not affect the columns. The plume widths and amplitudes are well matched, except for an overestimation of the amplitude and underestimation of the width during the first two transects closest to the source. For Jänschwalde (Fig. 10), the overall quality of the agreement with the observations is similar, but the results for the individual models are somewhat different. WRF-LES and especially ICON-LEM misplaced the plume and therefore missed it on selected transects.

Figure 11. Time series of CO2 (tracer CO2_PP_M) column enhancements simulated and observed by CHARM-F along the HALO flight at Bełchatów on 7 June 2018.


A comparison with the columns measured by the CHARM-F lidar at a distance of only about 3–4 km downwind of the Bełchatów power plant is presented in Fig. 11. To convert differential absorption optical depths (DAODs) as measured by CHARM-F into vertical columns, a differential absorption cross section of 7.27 × 10−23 cm2 was assumed (Wolff et al., 2021). The observations are rather noisy, but the enhancements during the four plume transects are clearly visible. Except for ARTM, EULAG, and the high-resolution version of WRF-GHG, the models tend to underestimate the plume amplitudes, mainly due to too wide plumes. The corresponding figure for Jänschwalde is shown in Fig. S17.
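The conversion from DAOD to a vertical column can be illustrated with the Beer–Lambert relation, in which the optical depth equals the cross section times the number of molecules per unit area. The Python sketch below assumes that the DAOD enhancement refers to a single (one-way) pass through the plume and that the stated cross section is per molecule; neither detail is given in the text, and the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
SIGMA_DIFF_CM2 = 7.27e-23     # differential absorption cross section (cm2)


def daod_to_column_mol_cm2(daod, sigma_cm2=SIGMA_DIFF_CM2):
    """Convert a one-way DAOD enhancement into a CO2 column (mol cm-2)."""
    molecules_cm2 = daod / sigma_cm2      # Beer-Lambert: DAOD = sigma * N
    return molecules_cm2 / AVOGADRO
```

Under these assumptions, a DAOD enhancement of 0.001 corresponds to a column enhancement of about 23 µmol cm−2.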

3.4 Evaluation of statistical properties of the plume

In order to compare characteristic properties of the plumes between simulations and observations, a Gaussian curve was fitted to each aircraft transect, as described in Sect. 2.3. Although most of the plume transects did not reveal a classic bell shape, it was often possible to determine the fit parameters of the Gaussian distribution with reasonably low uncertainty. Examples are presented in Fig. S6.

Figure 12. Comparison between observed and simulated characteristics of the CO2 plume of the Bełchatów power plant on 7 June 2018 as a function of distance from the source. Plume characteristics were determined by fitting a Gaussian distribution to the individual plume transects. (a) Plume widths (σ⋅cf). (b) Plume amplitudes (maximum of the Gaussian distribution). (c) Plume areas A. Observations are shown as black open symbols and models as filled colored symbols. Symbols are only shown when the Gaussian fit was sufficiently robust (uncertainty in plume width < 10 %) and the plume was not too close to the border of the transect. Gray lines describe plume width of an analytical Gaussian plume model, following Briggs (1973), for highly (dashed) and weakly unstable (dotted) conditions.


A summary of the observed and simulated plume characteristics is presented in Fig. 12 as a function of distance from the Bełchatów power plant. Width (σ⋅cf), amplitude (maximum), and integral area (A) of the fitted Gaussian were determined for both in situ CO2 along DLR Cessna transects and for column CO2 enhancements along the FUB Cessna and HALO transects. The corresponding measurements are shown as open circles, squares, and diamonds, respectively. The model results are presented as filled colored symbols. Although the same transects were considered, the distance from the source varies between observations and models because, for each plume, the geometric distance between (fitted) plume center and power plant was determined. As described in Sect. 2.3, plume widths were geometrically corrected to represent the width perpendicular to the plume axis, and vertical columns were converted to mole fractions to enable a joint analysis with the in situ measurements.

For both observations and models, the plume width generally increases and the amplitude decreases with distance, as expected. However, between about 13 and 26 km, there is no clear tendency in plume width, neither in the observations nor in the model simulations. A possible reason could be that the plume was not fully covered by the transects at 26 km. This is true for some of the simulated plumes due to the limited model domain, but it is not obvious for the observations. However, the fact that plume amplitude changed only little suggests that the plume indeed did not grow between 13 and 26 km. Overlaid in the figure are plume width estimates from a classical Gaussian plume model, following Briggs (1973). The two lines describe an average behavior of turbulent plumes under highly unstable (stability class A) and weakly unstable (stability class C) atmospheric conditions. The observed plume growth up to a distance of 15 km is quite consistent with the Gaussian plume model for very unstable conditions (dashed gray line), but, at 26 km distance, the observed plume is only about half as wide as expected.
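For orientation, the reference curves can be reproduced with the widely tabulated Briggs (1973) open-country formula for the horizontal dispersion parameter, σy = a·x·(1 + 0.0001·x)^(−1/2), with a = 0.22 for stability class A and a = 0.11 for class C. The following Python sketch assumes that this open-country variant applies; the text does not state which variant was used. These formulas are commonly quoted for downwind distances up to about 10 km, so values at larger distances are extrapolations. The function name is hypothetical.

```python
import math

# Briggs (1973) open-country coefficients for sigma_y (assumed variant)
BRIGGS_SIGMA_Y_COEFF = {"A": 0.22, "C": 0.11}


def briggs_sigma_y(x_m, stability_class):
    """Horizontal plume width sigma_y (m) at downwind distance x_m (m)."""
    a = BRIGGS_SIGMA_Y_COEFF[stability_class]
    return a * x_m / math.sqrt(1.0 + 1.0e-4 * x_m)
```

Under this assumption, class A yields a width of about 2.1 km at 15 km and about 3.0 km at 26 km, and class C yields exactly half of these values.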

The model results show a wide range in both width and amplitude, but the mean model behavior is quite consistent with the observations. In the near-field range up to distances of about 8 km, models with lower resolutions (COSMO-GHG and WRF-GHG) tend to show wider plumes than models with higher resolutions (ICON-LEM, WRF-GHG-HR, WRF-LES, and EULAG). The Lagrangian model ARTM, which can represent the source as a true point release without averaging over the extent of a grid cell, simulated a very compact plume in the near-field range that is clearly narrower than the plume observed by both MAMAP and HALO. Also, the Eulerian models with very high resolutions simulated a too narrow plume in the near-field range. A possible reason is that the plumes were released at a single point above the power plant, whereas, in reality, the release occurred from two stacks separated by 350 m. Furthermore, the plume had likely spread horizontally already during plume rise, a process that was not considered in the simulations where CO2 was released from a single horizontal location (or grid cell) above the chimney.

The observations in the near-field range, which primarily originate from MAMAP and HALO, show a rapid growth of the plume up to a width of about 2 km at a distance of 5 km, suggesting a strongly turbulent nature of the plume. In fact, the second- and third-closest transects from MAMAP show a split of the plume into two and three parts, respectively. Also, the closest transect observed from the DLR Cessna at 9 km shows a double-structured plume. The observed plume was strongly displaced to the north, away from the main plume axis, suggesting that a turbulent eddy had pushed it northwards upon release from the power plant. Since this was not reproduced by any of the models, the model symbols corresponding to this transect appear in the figure at a much shorter distance of around 6–7 km.

The evolution of plume amplitudes shows a somewhat more robust behavior than plume widths, with a clearly decreasing trend up to 13 km but only a small further decrease up to 26 km. A possible explanation for the higher robustness could be that the fitting of plume amplitude is less sensitive to the incomplete coverage of the plume within a transect. Again, the models with resolutions of 1 km or coarser show a much faster dilution in the near-field range and a corresponding underestimation of plume amplitude. This is especially evident when comparing the results of WRF-GHG, which was run at 2 km and 400 m resolution. The high-resolution version is much more consistent with the observations. The high-resolution models ARTM and EULAG tend to overestimate the amplitudes in the near-field range, which is consistent with their underestimation of plume width in this range.

At larger distances, the plume amplitudes are largely consistent between the models and the observations. However, COSMO-GHG consistently underestimates plume amplitudes, suggesting a too rapid dispersion not only near the source but also at larger distances. Despite the fact that the plumes simulated by ICON-LEM are too narrow, their amplitudes are quite comparable to the observations, which is likely due to the too high wind speeds of this model, as mentioned earlier.

The plume integrals (i.e., the areas under the Gaussian curves) presented in Fig. 12c correspond to the integrated amount of CO2 along each transect in units of parts per million times kilometer. Since CO2 is transported as a passive gas, the plume integrals are expected to stay constant with distance unless (i) the wind speed or direction changes with distance (or with time, since the transects were flown at different times), (ii) the plume extent is not fully covered by all transects, or (iii) the plume is not yet homogeneously mixed over the full depth of the ABL, such that a mole fraction measured by an in situ instrument at a given altitude is not representative of the ABL column mean. The figure suggests that the plume integrals are indeed not constant but decrease with distance, which is seen more clearly in the measurements than in the simulations. The reason for this could be any combination of the above points. The integrals also enable a quantitative comparison between observations and models. The mean (and standard error of the mean) averaged over all models (excluding ICON-LEM and points with unrealistically low values below 10 ppm km) and all distances is 105.6±2.8 ppm km (n=126). The corresponding mean over all observations is 111.9±11.1 ppm km (n=26). The two values agree within their combined uncertainties, suggesting that the simulations are consistent with the observations.
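The comparison of mean plume integrals can be made explicit with a short Python sketch. It assumes that "combined uncertainties" means the two standard errors added in quadrature, which is a common convention but is not spelled out in the text; the function names are hypothetical.

```python
import math
import statistics


def mean_and_sem(values):
    """Mean and standard error of the mean of at least two values."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def agree_within_combined(mean1, sem1, mean2, sem2):
    """True if the means differ by no more than the quadrature sum of the errors."""
    return abs(mean1 - mean2) <= math.hypot(sem1, sem2)
```

For the Bełchatów values quoted above, the difference between the model and observation means is 6.3 ppm km, compared to a combined uncertainty of about 11.4 ppm km, so `agree_within_combined(105.6, 2.8, 111.9, 11.1)` evaluates to true.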

Figure 13. Same as Fig. 12 but for the FUB Cessna measurements collected at the Jänschwalde power plant on 23 May 2018.


The same analysis was also performed for the measurements collected during the FUB Cessna flight at Jänschwalde on 23 May 2018 (Fig. 13). No results from CHARM-F on HALO are included here, as it was difficult to fit a Gaussian distribution to these observations. To support the visual comparison with the results at Bełchatów, the same axis ranges were used. In comparison to Bełchatów, the plume at Jänschwalde remained more compact in both the observations and the simulations, which is likely due to a combination of lower turbulence and higher wind speeds. The evolution of plume width is quite consistent with a Gaussian plume model for weakly unstable conditions (dotted gray line). Even more clearly than at Bełchatów, the two comparatively coarse models WRF-GHG and COSMO-GHG overestimate plume width in the near-field range but agree better at distances larger than 15 km from the source. For the in situ transects (between 10 and 12 km), WRF-GHG overestimates the plume widths and underestimates the amplitudes quite substantially, whereas the agreement for the vertical column transects is much better. ICON-LEM tends to underestimate the plume width, though no comparison could be performed at distances larger than 10 km because the simulated plume moved out of the measurement transects rather quickly due to the wrong wind direction. WRF-LES performed the best, matching both the observed plume widths and amplitudes quite accurately.

Different from Bełchatów, the plume integrals remain approximately constant with distance. The mean averaged over all models, except ICON-LEM, is 55.5±2.1 ppm km (n=92), and the corresponding mean over all observations is 57.0±5.6 ppm km (n=13). Again, the two values agree within their combined uncertainties. Using the annual mean instead of the actual CO2 emission rates in the simulations would have resulted in too low plume integrals that are inconsistent with the observations for both Bełchatów and Jänschwalde. This finding agrees with a recent study by Nassar et al. (2022), who demonstrated that it is necessary to account for actual power generation to explain day-to-day variations in CO2 emissions from Bełchatów estimated from individual OCO-2 and OCO-3 satellite overpasses.
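The consistency statements for the two power plants can be checked with a few lines of arithmetic. The sketch below is not part of the original analysis; it assumes that "combined uncertainty" means the two standard errors added in quadrature, and the function name is chosen here for illustration only.

```python
import math

def agree_within_combined_uncertainty(mean_a, sem_a, mean_b, sem_b):
    """Return (absolute difference, combined standard error, agreement flag).

    Assumption: the two standard errors are independent and are
    combined in quadrature.
    """
    diff = abs(mean_a - mean_b)
    combined = math.sqrt(sem_a ** 2 + sem_b ** 2)
    return diff, combined, diff <= combined

# Values quoted in the text (ppm km):
# model mean, model standard error, observed mean, observed standard error
cases = {
    "Belchatow": (105.6, 2.8, 111.9, 11.1),
    "Janschwalde": (55.5, 2.1, 57.0, 5.6),
}

for name, values in cases.items():
    diff, combined, ok = agree_within_combined_uncertainty(*values)
    print(f"{name}: |diff| = {diff:.1f} ppm km, "
          f"combined SEM = {combined:.1f} ppm km, agree = {ok}")
```

Under this assumption the model-observation differences (about 6.3 and 1.5 ppm km) are well below the combined standard errors (about 11.4 and 6.0 ppm km), and the combined error is dominated by the much smaller number of observed transects.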

3.5 Emission quantification with a CO2M-like satellite

In this section, we generate synthetic total column CO2 observations from the model outputs, mimicking those of a future CO2M satellite, and analyze two popular emission quantification methods applied to these synthetic satellite images. The main purpose is to determine how well the true emissions can be estimated from single CO2M satellite overpasses, assuming that the models provide a realistic representation of such plumes. We also analyze how diurnal variability in ABL structure and measurement noise affect the ability to quantify emissions. In order to translate vertical columns into fluxes, both methods require the estimation of an effective wind speed (or transport speed) of the plume. Here, it is determined separately for each model, using the respective 3D model wind fields. In the case of real satellite observations, however, the transport speed would be estimated from a meteorological analysis (see, e.g., Nassar et al., 2017, 2021, 2022), which comes with an additional uncertainty because the analyzed winds will be different from reality. Although the wind fields are perfectly known in our case, we will show that the estimation of the effective wind speed is affected by uncertainties due to turbulent wind fluctuations and due to the fact that it is not known exactly at what altitude the plume is located.

The synthetic observations are generated by reducing the resolution of the output to 2 km × 2 km (through averaging over multiple output grid cells) and adding Gaussian random noise corresponding to a low- (0.5 ppm) and a high-noise (1.0 ppm) instrument scenario of CO2M (Sierk et al., 2021). Assuming a surface pressure of 950 hPa for the atmospheric column, this corresponds to a noise of 1.67 × 10^−5 and 3.34 × 10^−5 mol cm−2 in total column CO2, respectively. The two quantification approaches are the cross-sectional flux and the integrated mass enhancement (IME) method, which were identified by Varon et al. (2018) as being comparatively robust methods.
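The conversion from a noise level in parts per million to a column amount, and the two processing steps (block averaging and adding noise), can be sketched as follows. This is an illustration under stated assumptions rather than the code used in the study: the air column is taken as surface pressure divided by gravity and the molar mass of dry air (water vapor is ignored), and the helper names `ppm_to_column`, `block_average`, and `add_noise` are hypothetical.

```python
import random

G = 9.80665            # standard gravity (m s-2)
M_DRY_AIR = 0.0289647  # molar mass of dry air (kg mol-1), assumed value

def ppm_to_column(noise_ppm, surface_pressure_hpa=950.0):
    """Convert a column-mean mole fraction (ppm) to a CO2 column in mol cm-2."""
    air_column = surface_pressure_hpa * 100.0 / (G * M_DRY_AIR)  # mol m-2 of air
    return noise_ppm * 1e-6 * air_column * 1e-4                  # mol cm-2 of CO2

def block_average(field, factor):
    """Average a 2D list of lists over non-overlapping factor x factor blocks."""
    ny, nx = len(field) // factor, len(field[0]) // factor
    return [
        [
            sum(field[j * factor + dj][i * factor + di]
                for dj in range(factor) for di in range(factor)) / factor ** 2
            for i in range(nx)
        ]
        for j in range(ny)
    ]

def add_noise(field, sigma, rng):
    """Add independent Gaussian noise with standard deviation sigma to each pixel."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in field]

low_noise = ppm_to_column(0.5)   # about 1.67e-5 mol cm-2
high_noise = ppm_to_column(1.0)  # about 3.34e-5 mol cm-2
print(f"{low_noise:.3e} {high_noise:.3e}")

# Example: a 400 m model grid coarsened by a factor of 5 to 2 km pixels
rng = random.Random(0)
model_field = [[0.0] * 50 for _ in range(50)]
synthetic = add_noise(block_average(model_field, 5), low_noise, rng)
```

With these constants the conversion reproduces the two noise values quoted above to three significant figures, which suggests that the 950 hPa figure refers to the pressure at the bottom of the column.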

Figure 14. Illustration of cross-sectional flux and IME methods for the Bełchatów CO2 plume (mol cm−2) on 7 June 2018 12:00 UTC, as simulated by the WRF-GHG high-resolution (HR) model downsampled to 2 km × 2 km resolution for a (a) low-noise (0.5 ppm) and (b) high-noise (1 ppm) CO2M instrument noise scenario. For the IME method, pixels above a threshold of 0.4 mol m−2 are marked as white crosses. For the cross-sectional flux method, fluxes through 10 north–south cross sections (thin dashed gray lines) downwind of the power plant were computed and averaged. The centers and north–south extensions (±2σ) of the plumes, as determined by a Gaussian plume fit, are marked with black circles and thick dashed black lines, respectively.


The two methods are illustrated in Fig. 14, for the example of a plume at Bełchatów on 7 June 2018 12:00 UTC, as simulated by the WRF-GHG model in the high-resolution (HR) configuration. In the low-noise scenario (Fig. 14a), the plume signal clearly stands out from the background noise. In the high-noise scenario (Fig. 14b), in contrast, the noise partly obscures the plume signal. Since the simulated plume amplitude linearly scales with the emission strength, the high-noise scenario would be identical to a low-noise scenario for a 2 times smaller emission source.

The cross-sectional flux method integrates total column CO2 (kg m−2) along a cross section approximately perpendicular to the plume axis and obtains the emission as the product of this line density (kg m−1) with an effective wind speed perpendicular to the cross section (m s−1). For simplicity, we chose exact north–south cross sections together with the east–west wind component U. In order to obtain a representative wind speed, the wind component U was evaluated in the center of the plume transect (filled black circles Fig. 14) and averaged over the pressure range 925–875 hPa (approx. 800–1200 m above sea level), which approximately corresponds to the center of the daytime ABL. Similar to Kuhlmann et al. (2021b), we computed the emission as being the average over multiple cross sections (dashed lines) in order to make better use of the imaging capability of a future CO2M satellite. Only cross sections for which the fitted Gaussian curve was fully (±2σ) inside the model output domain were included in the average. In the example, all cross sections fulfilled this criterion.
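The principle of the cross-sectional flux method can be illustrated with a short sketch. It is a simplification, not the implementation used in the study: the line density is obtained by summing pixel values directly rather than from a fitted Gaussian, the wind is supplied as a single number per cross section, and the synthetic Gaussian test plume and all function names are hypothetical.

```python
import math

def cross_sectional_flux(columns_kg_m2, pixel_size_m, wind_u_m_s):
    """Emission estimate (kg s-1) from one north-south cross section."""
    line_density = sum(columns_kg_m2) * pixel_size_m  # kg m-1
    return line_density * wind_u_m_s

def mean_cross_sectional_flux(cross_sections, pixel_size_m, winds_u_m_s):
    """Average the single-section estimates over several cross sections."""
    fluxes = [
        cross_sectional_flux(columns, pixel_size_m, wind)
        for columns, wind in zip(cross_sections, winds_u_m_s)
    ]
    return sum(fluxes) / len(fluxes)

def gaussian_cross_section(emission_kg_s, wind_u_m_s, sigma_m,
                           pixel_size_m, half_width_m):
    """Hypothetical test plume: Gaussian column enhancement across the plume."""
    n = int(half_width_m // pixel_size_m)
    line_density = emission_kg_s / wind_u_m_s
    norm = line_density / (math.sqrt(2.0 * math.pi) * sigma_m)
    return [
        norm * math.exp(-0.5 * (k * pixel_size_m / sigma_m) ** 2)
        for k in range(-n, n + 1)
    ]

# Two synthetic cross sections with different plume widths and local winds,
# both generated from a prescribed emission of 1000 kg s-1
sections = [
    gaussian_cross_section(1000.0, 5.0, 3000.0, 2000.0, 20000.0),
    gaussian_cross_section(1000.0, 4.0, 4000.0, 2000.0, 30000.0),
]
estimate = mean_cross_sectional_flux(sections, 2000.0, [5.0, 4.0])
print(f"estimated emission: {estimate:.1f} kg s-1")
```

In this noise-free, steady-state test the prescribed emission is recovered almost exactly. With turbulence and measurement noise, the individual cross sections would scatter around the true value, which is why several of them are averaged.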

In the case of IME, the integrated mass enhancement (i.e., the total mass of CO2 within the plume) was determined from all pixels above a given threshold (white crosses in Fig. 14). As recommended by Varon et al. (2018), the image was first smoothed with a Gaussian filter of 200 m width (1 σ) in order to limit erroneous detection of pixels outside the plume due to measurement noise. The filtering substantially stabilized the detection of the plume, especially for the high-noise case (Fig. 14b). The emission Q was then computed as follows (Varon et al., 2018):

$$ Q = \frac{U_{\mathrm{eff}}}{L}\,\mathrm{IME}, \tag{2} $$

where Ueff is the effective wind speed and L a characteristic length scale of the plume. The ratio L/Ueff represents the residence time of CO2 within the detected plume. A possible measure of the length scale L is the square root of the area of the detected pixels (Varon et al., 2018). It is important to note that the exact choice of the threshold and length scale affects the effective wind speed. In contrast to the cross-sectional flux method, the effective wind speed may be a nonlinear function of the true transport speed of the plume and first needs to be calibrated to obtain an unbiased estimate of Q. Varon et al. (2018) suggested performing LES model simulations to determine this relationship. Here, we took the wind speed (square root of sum of squared U and V) at the position of the power plant averaged over the same altitude range as for the cross-sectional flux method. In order to bring the estimated emissions in close agreement with the truth, this wind speed had to be multiplied by a factor of 0.75. This potential caveat of the method will be discussed later.
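The IME calculation described above can be sketched in a few lines. The sketch is illustrative only: it omits the Gaussian smoothing step, takes L as the square root of the detected area, applies the empirical factor of 0.75 to the wind speed as described in the text, and uses a hypothetical function name, units, and example values.

```python
import math

def ime_emission(columns_kg_m2, pixel_size_m, threshold_kg_m2,
                 wind_speed_m_s, wind_scaling=0.75):
    """Estimate Q = (U_eff / L) * IME from a 2D field of column enhancements.

    columns_kg_m2   : 2D list of CO2 column enhancements (kg m-2)
    threshold_kg_m2 : detection threshold for plume pixels (kg m-2)
    wind_scaling    : factor converting wind speed to U_eff (0.75 in the text)
    """
    detected = [v for row in columns_kg_m2 for v in row if v > threshold_kg_m2]
    if not detected:
        return 0.0  # no plume pixels detected
    pixel_area = pixel_size_m ** 2
    ime = sum(detected) * pixel_area                # total plume mass (kg)
    length = math.sqrt(len(detected) * pixel_area)  # length scale L (m)
    u_eff = wind_scaling * wind_speed_m_s           # effective wind speed (m s-1)
    return u_eff / length * ime                     # emission (kg s-1)

# Toy example: four 1 km pixels with 1 kg m-2 each and a 4 m s-1 wind
toy_field = [[1.0, 1.0], [1.0, 1.0]]
print(ime_emission(toy_field, 1000.0, 0.5, 4.0))
```

In the toy example the plume mass is 4 × 10^6 kg, L is 2000 m, and Ueff is 3 m s−1, giving 6000 kg s−1. Because both the mass and L depend on which pixels exceed the threshold, the result is sensitive to the threshold choice, which is one reason the effective wind speed has to be calibrated.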

Figure 15. Emissions quantified for the Bełchatów plume for all hours of 7 June 2018 by the (a, c) cross-sectional flux method and (b, d) integrated mass enhancement method. (a, b) Low-noise CO2M scenario (0.5 ppm) and (c, d) high-noise scenario (1.0 ppm). The dashed line shows the true emissions. Effective wind speeds were obtained as vertically averaged wind speed between 925 and 875 hPa (see text for further details).


Figure 15 presents a comparison of the results of the two methods. To enable a fair comparison, the image was first smoothed before applying the cross-sectional flux method in the same way as for the IME method. The figure shows the emissions estimated from the simulated plumes at Bełchatów for all 24 h of 7 June 2018. For both methods, the scatter between the models is lower around noontime than at night, which is a result of the strong vertical mixing during daytime. Approximating the effective transport speed by a wind speed in the middle of the ABL seems to be a good approach under these conditions. At night, conversely, the results are more sensitive to the altitude range over which the wind speeds are averaged because of vertical wind shear and a more limited vertical extent of the plume.

A summary of the performance of the two methods for midday-averaged (09:00–15:00 UTC) fluxes is presented in Table 5. Overall, the results of the two methods are comparable, with the cross-sectional flux method producing slightly more robust results (smaller scatter between the model results). For both methods, the multi-model mean bias is mostly well below 10 %, and the standard deviation is of the order of 20 %, slightly higher for the high-noise scenario than for the low-noise scenario. One reason for the fluctuations between the model results is measurement noise. However, even without any noise, the standard deviation is still about 17 % of the true value (see Table 5). A second reason is that the assumed 925–875 hPa average wind speed is only an approximation of the true transport speed. Finally, the fluxes through vertical cross sections are not constant in time and space due to turbulent fluctuations. Averaging over multiple cross sections reduces this variability but does not eliminate it. For the low-noise scenario, the standard deviation of the emissions estimated for the 10 individual transects is of the order of 20 % to 30 %, depending on the model. Averaging over 10 transects reduces this uncertainty by roughly a factor of 3 (√10). For a satellite like OCO-2 with a narrow swath of only 8 km, the possibilities for averaging are much more limited, such that substantial uncertainties of the order of 10 %–20 % due to turbulent fluctuations alone have to be expected. The same applies to the planned lidar satellite MERLIN, which will measure along a very narrow ground track. Wolff et al. (2021) therefore concluded that emissions can be better quantified from MERLIN under less turbulent conditions at night and in the early morning than at midday. However, our results suggest that this is only true if the height of the plume and the corresponding wind speed are well known. These parameters are likely more difficult to estimate for a vertically structured atmosphere at night than for a well-mixed ABL during the daytime.
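The reduction obtained by averaging over 10 transects follows from the standard error of a mean. The short sketch below makes the arithmetic explicit; it assumes that the flux estimates from the individual transects are statistically independent, which is an idealization because neighboring cross sections of the same plume are likely correlated. The function name is hypothetical.

```python
import math

def uncertainty_of_mean(single_transect_std_percent, n_transects):
    """Standard error (%) of the mean over n independent transect estimates."""
    return single_transect_std_percent / math.sqrt(n_transects)

# Single-transect scatter of 20 % to 30 %, as quoted in the text
for std in (20.0, 30.0):
    reduced = uncertainty_of_mean(std, 10)
    print(f"{std:.0f} % per transect -> {reduced:.1f} % after averaging 10")
```

Under the independence assumption, a scatter of 20 % to 30 % per transect shrinks to roughly 6 % to 9 % for the 10-transect mean. Correlation between transects would make the actual reduction smaller.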

Table 5. Emissions from the Bełchatów power plant estimated with the cross-sectional flux (X Flux) and integrated mass enhancement (IME) methods. Results are presented for 09:00–15:00 UTC averaged fluxes from eight different models, as shown in Fig. 15, for a low and a high CO2M measurement noise scenario.


As mentioned earlier, the wind speed had to be scaled for the IME method by a factor of 0.75 to obtain emissions close to the truth. The estimation of this scaling factor comes at the price of an additional uncertainty that is not present in the cross-sectional flux method. In practice, the relationship between true and effective wind speeds may be determined from multiple observations over a known source or from realistic simulations with a high-resolution transport model. However, this (nonlinear) relationship likely depends not only on wind speed but also on the turbulent state of the atmosphere, which makes the calibration a challenging multi-dimensional problem.

4 Conclusions

Six atmospheric transport models differing in type and resolution were used to simulate the CO2 exhaust plumes of two large coal-fired power plants, Bełchatów in Poland and Jänschwalde in Germany, following a common protocol. The simulations were compared among each other and evaluated against a comprehensive data set of airborne in situ and remote sensing observations collected on 2 fair-weather days in May and June 2018 by the CoMet measurement campaign. The CO2 emissions assumed in the simulations correspond to values officially reported for the year 2018 but are scaled by a factor 1.23 for Bełchatów and 1.28 for Jänschwalde to account for the fact that hourly energy production rates were higher during the observations than annual mean production rates. On average, the amount of CO2 integrated along individual plume transects was highly consistent between simulations and observations when the emissions were scaled in this way.

The simulations indicate that, with the growth of the ABL in the morning, the plumes evolved from compact laminar plumes at night into much wider, highly turbulent plumes during the day. The turbulent nature of the daytime plumes was not only captured by the high-resolution (200–600 m) LES models but also by the mesoscale NWP models operating at 400 m–2 km horizontal resolution, though turbulent structures were increasingly smoothed out and not well represented anymore at 2 km resolution.

Characteristic properties of the plumes, such as vertical extent and horizontal dispersion, differed substantially between the models. Consistent with the observations, the simulated plumes extended over almost the whole depth of the ABL during daytime. As a consequence, the exact altitude of the release of CO2 in the models did not have a strong impact on the results in the early afternoon when most of the measurements were collected. Nevertheless, differences in the vertical stability in the upper parts of the ABL and the strength of the capping inversion had a significant effect on the simulations near the top of the ABL. WRF-GHG and ICON-LEM simulated an almost neutral ABL with a sharp inversion for the Bełchatów case, in good agreement with the observations, whereas WRF-LES, COSMO-GHG, and EULAG showed a fuzzier and wider ABL top and an increase in stability already well below the capping inversion. This dampened the vertical expansion of the plume and led to an underestimation of plume height in COSMO-GHG and especially EULAG. The vertical plume extent was also underestimated by some models at Jänschwalde, including WRF-GHG, which performed much better at Bełchatów. Differences in vertical dispersion between different models were also found, in the study of Karion et al. (2019), to be a major driver of differences in simulated concentrations and emission sensitivity. Similarly, Katharopoulos et al. (2022) identified the turbulence description of the LPDM FLEXPART-COSMO (where FLEXPART is the FLEXible PARTicle dispersion model) as being a main source of error when operating at high-resolution using the Bełchatów plume as a benchmark.

Simulations at resolutions coarser than about 1 km showed a too rapid dispersion of the plumes in the near-field range up to about 8 km downwind of the source, but the further dispersion was not systematically different from higher-resolution models. The high-resolution LES models WRF-LES and ICON-LEM and the Lagrangian model ARTM, in contrast, simulated a too narrow plume in the near-field range, possibly because the plumes were released in the simulations from a single point rather than from multiple stacks, and horizontal mixing (and displacement) during plume rise was not accounted for.

Overall, the COSMO-GHG model overestimated the dispersion. ARTM, in contrast, simulated a generally too compact plume for the Bełchatów case when the ABL was very unstable but performed better for Jänschwalde. The plumes were also generally too narrow in ICON-LEM, especially at Bełchatów, probably due to a significant overestimation of wind speeds for this case. Plume width was generally well represented at all distances by WRF-LES, but wind speeds were slightly overestimated at Bełchatów. WRF-GHG showed too wide plumes when run at a coarse resolution (2 km × 2 km) but agreed much better with the observations when run at a high resolution (400 m × 400 m).

The agreement with total column CO2 measurements was usually better than with in situ measurements because errors in the vertical distribution have only a minor impact on total columns. EULAG, for example, which showed a poor agreement with in situ measurements at Bełchatów due to the underestimation of plume height, showed very good agreement with total columns from the MAMAP spectrometer and the CHARM-F lidar.

Based on the limited sample of only 2 measurement days, it is difficult to draw general conclusions on model performance or to even rank the models. Several simulations, for example, the high-resolution version of WRF-GHG for Bełchatów and WRF-LES for Jänschwalde, showed remarkable consistency with the observations, suggesting that power plant plumes can be simulated by both LES and NWP models in a very realistic way. However, the stochastic nature of turbulence puts fundamental limits on any point-by-point comparison. Good or bad agreement in a point-by-point comparison can be a matter of luck. In the ESA-funded project SMARTCARB2, an ensemble of 18 COSMO-GHG simulations with slightly different settings was performed for the Bełchatów case, each producing a different realization of the turbulent plume. The results revealed a large spread in model performance, with correlation coefficients ranging from 0 to 0.8, depending on whether or not the simulated plume was structurally similar to the observed plume (Kuhlmann et al., 2021a).

Nevertheless, a few general conclusions can be drawn. Models with resolutions of 1 km or coarser tend to simulate wider plumes and significantly overestimate plume width in the near-field range at distances up to about 8 km from the source. Realistically representing turbulent structures of the plumes requires simulations at 1 km resolution or better. The model resolution appears to have a larger impact on the results than differences in the treatment of turbulence between LES and NWP models. When run at a comparable resolution, LES and NWP models showed a very similar performance. The agreement of a model with observations critically depends on the setup and forcing of the simulation. Initial and boundary conditions from a meteorological analysis improve the representation of the meteorological situation. Additional assimilation of meteorological observations within the model domain (as in COSMO-GHG) or frequent reinitialization of the simulations from analyzed meteorological fields (as in WRF-GHG and WRF-LES) can further improve the performance. ICON-LEM did not capture the weather situation well, probably because it was forced by a free-running regional ICON simulation that was not sufficiently constrained by meteorological observations. The ARTM simulation at Bełchatów, in contrast, accurately captured the position of the plume as it was forced by observed winds from the aircraft.

The model simulations were used to generate synthetic CO2M satellite observations in order to analyze the capability of CO2M to quantify emissions using two popular emission estimation methods, the cross-sectional flux method and the integrated mass enhancement (IME) method. Assuming that winds in the middle of the ABL are a good approximation of the true transport speed of CO2 in the plume, the emissions from Bełchatów can be estimated from a single overpass of CO2M with an uncertainty of about 20 % and a bias of no more than a few percent. Because the satellite image was first smoothed (with a Gaussian filter) before applying the flux estimation, the uncertainty was only slightly higher for a CO2M instrument scenario with high measurement noise (1.0 ppm) than for a low-noise scenario (0.5 ppm). Our estimate of a 20 % uncertainty is higher than the average value of about 12 % recently estimated by Nassar et al. (2022) for single images of Snapshot Area Maps (SAMs) from the OCO-3 satellite over Bełchatów. They acknowledge that their value could be an underestimate of the total uncertainty, but on the other hand, it was consistent with absolute differences between estimated and expected (from actual power generation) emissions. A 20 % uncertainty may thus be a conservative estimate.

The performance of the IME and cross-sectional flux methods was very similar, but the IME method suffered from an additional uncertainty introduced by the fact that wind speeds had to be translated into an effective wind speed. Averaging over multiple transects substantially improves the estimates in the case of the cross-sectional flux method because fluxes through individual 2 km wide transects fluctuate by 20 %–30 % due to turbulence. Such averaging will be possible for the upcoming CO2M satellite constellation owing to its wide swath. Because turbulence is much reduced at nighttime, it seems attractive for an active lidar instrument like CHARM-F or the future MERLIN satellite to quantify emissions from measurements at night (Wolff et al., 2021). However, because the plumes are much more confined in the vertical at night, the results will critically depend on an accurate estimation of plume height in situations where vertical wind shear is strong.

The combination of in situ (chemical tracers and meteorological parameters) and remote sensing observations at varying distances from the source collected during CoMet provided an excellent data set for evaluating the vertical and horizontal structure of the plumes as simulated by the models. Nevertheless, for future campaigns, it would be desirable to sample power plant plumes under different meteorological conditions with stronger and weaker winds and turbulence and at different times of the day, including measurements at night. Furthermore, imaging spectrometry, as planned for a forthcoming CoMet campaign, could reveal much more detail of the horizontal structure of the turbulent plumes, and multiple overpasses could provide critical insights into their dynamic nature.

A potentially important application of high-resolution model simulations, as performed in this study, is the estimation of point source emissions from satellite observations through inverse modeling. However, accurately simulating the location and structure of the corresponding plumes will remain a challenge, especially in the presence of turbulence. Simple Bayesian inversions, where simulations and observations are compared locally on a pixel-by-pixel basis, therefore seem ill-suited, and more advanced methods, e.g., using non-local metrics as proposed by Vanderbecken et al. (2022), will be necessary. Whether such methods can outperform simpler methods, such as Gaussian plume matching and mass balance approaches that do not require any expensive model simulations, remains to be seen.

High-resolution simulations are invaluable, however, for testing the capabilities of future satellites or other measurement platforms in observing system simulation experiments as shown, e.g., by Kuhlmann et al. (2019, 2021b). Our study shows that simulations performed at a resolution of 1 km or better are able to provide a highly realistic representation of real plumes.

Data availability

The MAMAP CH4 column anomalies are available from the authors upon request. The airborne in situ measurements acquired by the DLR Cessna, the FUB Cessna, and the DLR HALO and the CHARM-F measurements can be obtained from the authors or downloaded from the HALO database (Deutsches Zentrum für Luft- und Raumfahrt, 2021).


The supplement related to this article is available online.

Author contributions

DB wrote the paper, contributed to the protocol, and conducted the analyses. SH and GK performed, processed, and analyzed the COSMO-GHG simulations and, together with EK, developed the comparison and emission estimation concepts in the SMARTCARB2 and CoCO2 projects. BK and PJ performed and analyzed the ICON-LEM simulations, SW and CK did the WRF-LES runs, MG and JM did the WRF-GHG runs, AK, AW, and PP did the EULAG runs, and RH, MPA, and CV did the ARTM runs. AlF, AnF, AR, CK, SK, KG, HB, MG, and CG conducted the CoMet campaign in 2018 and processed the measurements used in this study. AnF was the principal investigator (PI) of the CoMet campaign and contributed to the measurements of DLR. AlF and AR performed the DLR Cessna in situ measurements and MG and CG the HALO in situ measurements. CK and SK performed and analyzed the CHARM-F measurements. JB conducted the FUB Cessna MAMAP and in situ measurements. SK performed the analysis of the MAMAP remote sensing data. All authors contributed to the interpretation of the results and to the writing and improvements of the article.

Competing interests

At least one of the co-authors is an editor of Atmospheric Chemistry and Physics. The peer-review process was guided by an independent editor, and the authors also have no other competing interests to declare.


Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Special issue statement

This article is part of the special issue “CoMet: a mission to improve our understanding and to better quantify the carbon dioxide and methane cycles”. It is not associated with a conference.


Acknowledgements

We acknowledge support by the EU project CoCO2 funded through the European Union's Horizon 2020 Research and Innovation program (grant no. 958927) and by the project SMARTCARB funded by the European Space Agency (ESA). We also acknowledge funding for the CoMet campaign by the BMBF (German Federal Ministry of Education and Research) through AIRSPACE (grant no. FK 01LK1701B), the State of Bremen, and the Max Planck Society. The work has been further supported by the German Research Foundation (DFG) within the Priority Program (grant no. SPP 1294) Atmospheric and Earth System Research with the Research Aircraft HALO (High Altitude and Long Range Research Aircraft; grant no. 1731/1-1). Simulations with COSMO-GHG have been supported by a grant from the Swiss National Supercomputing Centre (CSCS; grant no. s1091) and by the Center for Climate Systems Modeling (C2SM). We thank ECMWF for providing operational analysis and forecast data as initial and boundary conditions for the high-resolution models.

Financial support

This research has been supported by the Horizon 2020 (CoCO2; grant no. 958927), the European Space Agency (grant no. 4000119599/16/NL/FF/mg), the Bundesministerium für Bildung, Wissenschaft, Forschung und Technologie (grant no. FK 01LK1701B), the Centro Svizzero di Calcolo Scientifico (grant no. s1091), and the Deutsche Forschungsgemeinschaft (grant no. BO 1731/1-1).

Review statement

This paper was edited by Jens-Uwe Grooß and reviewed by Ray Nassar and Anna Karion.


References

Ahmadov, R., Gerbig, C., Kretschmer, R., Koerner, S., Neininger, B., Dolman, A. J., and Sarrat, C.: Mesoscale covariance of transport and CO2 fluxes: Evidence from observations and simulations using the WRF-VPRM coupled atmosphere-biosphere model, J. Geophys. Res.-Atmos., 112, D22107, 2007.

Ahmadov, R., Gerbig, C., Kretschmer, R., Körner, S., Rödenbeck, C., Bousquet, P., and Ramonet, M.: Comparing high resolution WRF-VPRM simulations and two global CO2 transport models with coastal tower measurements of CO2, Biogeosciences, 6, 807–817, 2009.

Amediek, A., Ehret, G., Fix, A., Wirth, M., Büdenbender, C., Quatrevalet, M., Kiemle, C., and Gerbig, C.: CHARM-F – a new airborne integrated-path differential-absorption lidar for carbon dioxide and methane observations: measurement performance and quantification of strong point source emissions, Appl. Optics, 56, 5182–5197, 2017.

Angevine, W. M., Peischl, J., Crawford, A., Loughner, C. P., Pollack, I. B., and Thompson, C. R.: Errors in top-down estimates of emissions using a known source, Atmos. Chem. Phys., 20, 11855–11868, 2020.

Baldauf, M., Seifert, A., Forstner, J., Majewski, D., Raschendorfer, M., and Reinhardt, T.: Operational convective-scale numerical weather prediction with the COSMO model: description and sensitivities, Mon. Weather Rev., 139, 3887–3905, 2011.

Beck, V., Koch, T., Kretschmer, R., Marshall, J., Ahmadov, R., Gerbig, C., Pillai, D., and Heimann, M.: The WRF Greenhouse Gas Model (WRF-GHG), Technical Report No. 25, Max Planck Institute for Biogeochemistry, Jena, Germany, 2011.

Beirle, S., Borger, C., Dörner, S., Li, A., Hu, Z., Liu, F., Wang, Y., and Wagner, T.: Pinpointing nitrogen oxide emissions from space, Sci. Adv., 5, eaax9800, 2019.

Blaylock, B. K., Horel, J. D., and Crosman, E. T.: Impact of Lake Breezes on Summer Ozone Concentrations in the Salt Lake Valley, J. Appl. Meteorol. Clim., 56, 353–370, 2017.

Bovensmann, H., Buchwitz, M., Burrows, J. P., Reuter, M., Krings, T., Gerilowski, K., Schneising, O., Heymann, J., Tretner, A., and Erzinger, J.: A remote sensing technique for global monitoring of power plant CO2 emissions from space and related applications, Atmos. Meas. Tech., 3, 781–811, 2010.

Briggs, G. A.: Diffusion estimation for small emissions, Preliminary report, Tech. Rep. TID-28289, National Oceanic and Atmospheric Administration, 1973.

Brunner, D., Kuhlmann, G., Marshall, J., Clément, V., Fuhrer, O., Broquet, G., Löscher, A., and Meijer, Y.: Accounting for the vertical distribution of emissions in atmospheric CO2 simulations, Atmos. Chem. Phys., 19, 4541–4559, 2019.

Busch, D., Harte, R., Krätzig, W. B., and Montag, U.: New natural draft cooling tower of 200 m of height, Eng. Struct., 24, 1509–1521, 2002.

Chevallier, F., Broquet, G., Zheng, B., Ciais, P., and Eldering, A.: Large CO2 Emitters as Seen From Satellite: Comparison to a Gridded Global Emission Inventory, Geophys. Res. Lett., 49, e2021GL097540, 2022.

Cusworth, D. H., Duren, R. M., Thorpe, A. K., Eastwood, M. L., Green, R. O., Dennison, P. E., Frankenberg, C., Heckler, J. W., Asner, G. P., and Miller, C. E.: Quantifying Global Power Plant Carbon Dioxide Emissions With Imaging Spectroscopy, AGU Advances, 2, e2020AV000350, 2021.

Deutsches Zentrum für Luft- und Raumfahrt: HALO database, Deutsches Zentrum für Luft- und Raumfahrt [data set], 2021.

Dipankar, A., Stevens, B., Heinze, R., Moseley, C., Zängl, G., Giorgetta, M., and Brdar, S.: Large eddy simulation using the general circulation model ICON, J. Adv. Model. Earth Sy., 7, 963–986, 2015.

ECMWF: IFS Documentation CY47R1 – Part III: Dynamics and Numerical Procedures, no. 3 in IFS Documentation, ECMWF, 2020.

Ehret, G., Bousquet, P., Pierangelo, C., Alpers, M., Millet, B., Abshire, J. B., Bovensmann, H., Burrows, J. P., Chevallier, F., Ciais, P., Crevoisier, C., Fix, A., Flamant, P., Frankenberg, C., Gibert, F., Heim, B., Heimann, M., Houweling, S., Hubberten, H. W., Jöckel, P., Law, K., Löw, A., Marshall, J., Agusti-Panareda, A., Payan, S., Prigent, C., Rairoux, P., Sachs, T., Scholze, M., and Wirth, M.: MERLIN: A French-German Space Lidar Mission Dedicated to Atmospheric Methane, Remote Sens., 9, 1052, 2017.

Farchi, A., Bocquet, M., Roustan, Y., Mathieu, A., and Quérel, A.: Using the Wasserstein distance to compare fields of pollutants: application to the radionuclide atmospheric dispersion of the Fukushima-Daiichi accident, Tellus B, 68, 31682, 2016.

Fiehn, A., Kostinek, J., Eckl, M., Klausner, T., Gałkowski, M., Chen, J., Gerbig, C., Röckmann, T., Maazallahi, H., Schmidt, M., Korbeń, P., Neçki, J., Jagoda, P., Wildmann, N., Mallaun, C., Bun, R., Nickl, A.-L., Jöckel, P., Fix, A., and Roiger, A.: Estimating CH4, CO2 and CO emissions from coal mining and industrial activities in the Upper Silesian Coal Basin using an aircraft-based mass balance approach, Atmos. Chem. Phys., 20, 12675–12695, 2020.

Fioletov, V. E., McLinden, C. A., Krotkov, N., and Li, C.: Lifetimes and emissions of SO2 from point sources estimated from OMI, Geophys. Res. Lett., 42, 1969–1976, 2015.

Fix, A., Amediek, A., Bovensmann, H., Ehret, G., Gerbig, C., Gerilowski, K., Pfeilsticker, K., Roiger, A., and Zöger, M.: CoMet: an airborne mission to simultaneously measure CO2 and CH4 using lidar, passive remote sensing, and in situ techniques, EPJ Web Conf., 176, 02003, 2018.

Fuhrer, O., Osuna, C., Lapillonne, X., Gysi, T., Cumming, B., Bianco, M., Arteaga, A., and Schulthess, T.: Towards a performance portable, architecture agnostic implementation strategy for weather and climate models, Supercomput. Front. Innov. Int. J., 1, 45–62, 2014.

Gałkowski, M., Jordan, A., Rothe, M., Marshall, J., Koch, F.-T., Chen, J., Agusti-Panareda, A., Fix, A., and Gerbig, C.: In situ observations of greenhouse gases over Europe during the CoMet 1.0 campaign aboard the HALO aircraft, Atmos. Meas. Tech., 14, 1525–1544, 2021.

Gerilowski, K., Tretner, A., Krings, T., Buchwitz, M., Bertagnolio, P. P., Belemezov, F., Erzinger, J., Burrows, J. P., and Bovensmann, H.: MAMAP – a new spectrometer system for column-averaged methane and carbon dioxide observations from aircraft: instrument description and performance analysis, Atmos. Meas. Tech., 4, 215–243, 2011.

Hakkarainen, J., Szeląg, M. E., Ialongo, I., Retscher, C., Oda, T., and Crisp, D.: Analyzing nitrogen oxides to carbon dioxide emission ratios from space: A case study of Matimba Power Station in South Africa, Atmos. Environ. X, 10, 100110,, 2021. a

Hanfland, R., Pattantyús-Ábrahám, M., Richter, C., Brunner, D., and Voigt, C.: Atmospheric Radionuclide Transport Model (ARTM) – Development, Description and Sensitivity Analysis, Air Qual. Atmos. Hlth., 249,, 2022. a, b

Heus, T., van Heerwaarden, C. C., Jonker, H. J. J., Pier Siebesma, A., Axelsen, S., van den Dries, K., Geoffroy, O., Moene, A. F., Pino, D., de Roode, S. R., and Vilà-Guerau de Arellano, J.: Formulation of the Dutch Atmospheric Large-Eddy Simulation (DALES) and overview of its applications, Geosci. Model Dev., 3, 415–444,, 2010. a

Jähn, M., Kuhlmann, G., Mu, Q., Haussaire, J.-M., Ochsner, D., Osterried, K., Clément, V., and Brunner, D.: An online emission module for atmospheric chemistry transport models: implementation in COSMO-GHG v5.6a and COSMO-ART v5.1-3.1, Geosci. Model Dev., 13, 2379–2392,, 2020. a, b

Janicke, L. and Janicke, U.: Entwicklung eines modellgestützen Beurteilungssystems für den anlagenbezogenen Immissionsschutz, Tech. rep., Federal Environmental Agency, (last access: 22 February 2023), 2013. a

Janssens-Maenhout, G., Pinty, B., Dowell, M., Zunker, H., Andersson, E., Balsamo, G., Bézy, J.-L., Brunhes, T., Bösch, H., Bojkov, B., Brunner, D., Buchwitz, M., Crisp, D., Ciais, P., Counet, P., Dee, D., Denier van der Gon, H., Dolman, H., Drinkwater, M., Dubovik, O., Engelen, R., Fehr, T., Fernandez, V., Heimann, M., Holmlund, K., Houweling, S., Husband, R., Juvyns, O., Kentarchos, A., Landgraf, J., Lang, R., Löscher, A., Marshall, J., Meijer, Y., Nakajima, M., Palmer, P., Peylin, P., Rayner, P., Scholze, M., Sierk, B., Tamminen, J., and Veefkind, P.: Towards an operational anthropogenic CO2 emissions monitoring and verification support capacity, B. Am. Meteorol. Soc., 101, E1439–E1451,, 2020. a, b

Karion, A., Lauvaux, T., Lopez Coto, I., Sweeney, C., Mueller, K., Gourdji, S., Angevine, W., Barkley, Z., Deng, A., Andrews, A., Stein, A., and Whetstone, J.: Intercomparison of atmospheric trace gas dispersion models: Barnett Shale case study, Atmos. Chem. Phys., 19, 2561–2576,, 2019. a, b

Katharopoulos, I., Brunner, D., Emmenegger, L., Leuenberger, M., and Henne, S.: Adaptations to FLEXPART-COSMO for Simulations at 1 km Grid Resolution, Bound.-Lay. Meteorol., 185, 129–160,, 2022. a

Kern, B. and Jöckel, P.: A diagnostic interface for the ICOsahedral Non-hydrostatic (ICON) modelling framework based on the Modular Earth Submodel System (MESSy v2.50), Geosci. Model Dev., 9, 3639–3654,, 2016. a, b

Kiemle, C., Ehret, G., Amediek, A., Fix, A., Quatrevalet, M., and Wirth, M.: Potential of Spaceborne Lidar Measurements of Carbon Dioxide and Methane Emissions from Strong Point Sources, Remote Sens., 9, 1137,, 2017. a

Kostinek, J., Roiger, A., Eckl, M., Fiehn, A., Luther, A., Wildmann, N., Klausner, T., Fix, A., Knote, C., Stohl, A., and Butz, A.: Estimating Upper Silesian coal mine methane emissions from airborne in situ observations and dispersion modeling, Atmos. Chem. Phys., 21, 8791–8807,, 2021. a

Krautwurst, S., Gerilowski, K., Borchardt, J., Wildmann, N., Gałkowski, M., Swolkień, J., Marshall, J., Fiehn, A., Roiger, A., Ruhtz, T., Gerbig, C., Necki, J., Burrows, J. P., Fix, A., and Bovensmann, H.: Quantification of CH4 coal mining emissions in Upper Silesia by passive airborne remote sensing observations with the Methane Airborne MAPper (MAMAP) instrument during the CO2 and Methane (CoMet) campaign, Atmos. Chem. Phys., 21, 17345–17371,, 2021. a, b, c

Krings, T., Gerilowski, K., Buchwitz, M., Hartmann, J., Sachs, T., Erzinger, J., Burrows, J. P., and Bovensmann, H.: Quantification of methane emission rates from coal mine ventilation shafts using airborne remote sensing data, Atmos. Meas. Tech., 6, 151–166,, 2013. a

Kuhlmann, G., Broquet, G., Marshall, J., Clément, V., Löscher, A., Meijer, Y., and Brunner, D.: Detectability of CO2 emission plumes of cities and power plants with the Copernicus Anthropogenic CO2 Monitoring (CO2M) mission, Atmos. Meas. Tech., 12, 6695–6719,, 2019. a, b, c

Kuhlmann, G., Henne, S., Brunner, D., Löscher, A., and Meijer, Y.: SMARTCARB2 – Use of satellite measurements of auxiliary reactive trace gases for fossil fuel carbon dioxide emission estimation (Phase 2), Zenodo,, 2021a. a

Kuhlmann, G., Henne, S., Meijer, Y., and Brunner, D.: Quantifying CO2 Emissions of Power Plants With CO2 and NO2 Imaging Satellites, Frontiers in Remote Sensing, 2, 689838,, 2021b. a, b, c, d, e

Liu, Y., Gruber, N., and Brunner, D.: Spatiotemporal patterns of the fossil-fuel CO2 signal in central Europe: results from a high-resolution atmospheric transport model, Atmos. Chem. Phys., 17, 14145–14169,, 2017. a

Lorenz, E. N.: The predictability of a flow which possesses many scales of motion, Tellus, 21, 289–307,, 1969. a

Mallaun, C., Giez, A., and Baumann, R.: Calibration of 3-D wind measurements on a single-engine research aircraft, Atmos. Meas. Tech., 8, 3177–3196,, 2015. a, b, c

Minx, J. C., Lamb, W. F., Andrew, R. M., Canadell, J. G., Crippa, M., Döbbeling, N., Forster, P. M., Guizzardi, D., Olivier, J., Peters, G. P., Pongratz, J., Reisinger, A., Rigby, M., Saunois, M., Smith, S. J., Solazzo, E., and Tian, H.: A comprehensive and synthetic dataset for global, regional, and national greenhouse gas emissions by sector 1970–2018 with an extension to 2019, Earth Syst. Sci. Data, 13, 5213–5252,, 2021. a

Nassar, R., Hill, T. G., McLinden, C. A., Wunch, D., Jones, D. B. A., and Crisp, D.: Quantifying CO2 Emissions From Individual Power Plants From Space, Geophys. Res. Lett., 44, 10045–10053,, 2017. a, b, c

Nassar, R., Mastrogiacomo, J.-P., Bateman-Hemphill, W., McCracken, C., MacDonald, C. G., Hill, T., O'Dell, C. W., Kiel, M., and Crisp, D.: Advances in quantifying power plant CO2 emissions with OCO-2, Remote Sens. Environ., 264, 112579,, 2021. a, b

Nassar, R., Moeini, O., Mastrogiacomo, J.-P., O’Dell, C. W., Nelson, R. R., Kiel, M., Chatterjee, A., Eldering, A., and Crisp, D.: Tracking CO2 emission reductions from space: A case study at Europe's largest fossil fuel power plant, Frontiers in Remote Sensing, 3, 1–15,, 2022. a, b, c, d, e

Pregger, T. and Friedrich, R.: Effective pollutant emission heights for atmospheric transport modelling based on real-world information, Environ. Pollut., 157, 552–560,, 2009. a

Prusa, J. M., Smolarkiewicz, P. K., and Wyszogrodzki, A. A.: EULAG, a computational model for multiscale flows, Comput. Fluid., 37, 1193–1207,, 2008. a, b

Reuter, M., Buchwitz, M., Schneising, O., Krautwurst, S., O'Dell, C. W., Richter, A., Bovensmann, H., and Burrows, J. P.: Towards monitoring localized CO2 emissions from space: co-located regional CO2 and NO2 enhancements observed by the OCO-2 and S5P satellites, Atmos. Chem. Phys., 19, 9371–9383,, 2019. a

Schraff, C. H.: Mesoscale data assimilation and prediction of low stratus in the Alpine region, Meteorol. Atmos. Phys., 64, 21–50,, 1998. a

Sierk, B., Fernandez, V., Bézy, J.-L., Meijer, Y., Durand, Y., Courrèges-Lacoste, G. B., Pachot, C., Löscher, A., Nett, H., Minoglou, K., Boucher, L., Windpassinger, R., Pasquet, A., Serre, D., and te Hennepe, F.: The Copernicus CO2M mission for monitoring anthropogenic carbon dioxide emissions from space, in: International Conference on Space Optics – ICSO 2020, edited by: Cugny, B., Sodnik, Z., and Karafolas, N., International Society for Optics and Photonics, SPIE, vol. 11852, 30 March–2 April 2021, virtual conference, 1563–1580,, 2021. a

Skamarock, W. C., Klemp, J. B., Dudhia, J., Gill, D. O., Barker, D. M., Wang, W., and Powers, J. G.: A description of the Advanced Research WRF version 3, NCAR Technical note -475+STR,, 2008. a

Strandgren, J., Krutz, D., Wilzewski, J., Paproth, C., Sebastian, I., Gurney, K. R., Liang, J., Roiger, A., and Butz, A.: Towards spaceborne monitoring of localized CO2 emissions: an instrument concept and first performance assessment, Atmos. Meas. Tech., 13, 2887–2904,, 2020. a

Sweeney, C., Karion, A., Wolter, S., Newberger, T., Guenther, D., Higgs, J. A., Andrews, A. E., Lang, P. M., Neff, D., Dlugokencky, E., Miller, J. B., Montzka, S. A., Miller, B. R., Masarie, K. A., Biraud, S. C., Novelli, P. C., Crotwell, M., Crotwell, A. M., Thoning, K., and Tans, P. P.: Seasonal climatology of CO2 across North America from aircraft measurements in the NOAA/ESRL Global Greenhouse Gas Reference Network, J. Geophys. Res.-Atmos., 120, 5155–5190,, 2015. a

Vanderbecken, P. J., Dumont Le Brazidec, J., Farchi, A., Bocquet, M., Roustan, Y., Potier, É., and Broquet, G.: New plume comparison metrics for the inversion of passive gases emissions, Atmos. Meas. Tech. Discuss. [preprint],, in review, 2022.  a

Varon, D. J., Jacob, D. J., McKeever, J., Jervis, D., Durak, B. O. A., Xia, Y., and Huang, Y.: Quantifying methane point sources from fine-scale satellite observations of atmospheric methane plumes, Atmos. Meas. Tech., 11, 5673–5686,, 2018. a, b, c, d, e, f

Wang, Y., Broquet, G., Bréon, F.-M., Lespinas, F., Buchwitz, M., Reuter, M., Meijer, Y., Loescher, A., Janssens-Maenhout, G., Zheng, B., and Ciais, P.: PMIF v1.0: assessing the potential of satellite observations to constrain CO2 emissions from large cities and point sources over the globe using synthetic data, Geosci. Model Dev., 13, 5813–5831,, 2020. a

Wolff, S., Ehret, G., Kiemle, C., Amediek, A., Quatrevalet, M., Wirth, M., and Fix, A.: Determination of the emission rates of CO2 point sources with airborne lidar, Atmos. Meas. Tech., 14, 2717–2736,, 2021. a, b, c, d, e, f

Wyszogrodzki, A. A., Miao, S., and Chen, F.: Evaluation of the coupling between mesoscale-WRF and LES‐EULAG models for simulating fine-scale urban dispersion, Atmos. Res., 118, 324–345,, 2012. a

Zängl, G., Reinert, D., Rípodas, P., and Baldauf, M.: The ICON (ICOsahedral Non-hydrostatic) modelling framework of DWD and MPI-M: Description of the non-hydrostatic dynamical core, Q. J. Roy. Meteor. Soc., 141, 563–579,, 2015. a

Zhao, X., Marshall, J., Hachinger, S., Gerbig, C., Frey, M., Hase, F., and Chen, J.: Analysis of total column CO2 and CH4 measurements in Berlin with WRF-GHG, Atmos. Chem. Phys., 19, 11279–11302,, 2019. a

Zheng, T., Nassar, R., and Baxter, M.: Estimating power plant CO2 emission using OCO-2 XCO2 and high resolution WRF-Chem simulations, Environ. Res. Lett., 14, 085001,, 2019. a

Short summary
We evaluated six atmospheric transport models for their capability to simulate the CO2 plumes from two of the largest power plants in Europe by comparing the models against aircraft observations collected during the CoMet (Carbon Dioxide and Methane Mission) campaign in 2018. The study analyzed how realistically such plumes can be simulated at different model resolutions and how well the planned European satellite mission CO2M will be able to quantify emissions from power plants.