Estimation of power plant SO2 emissions using the HYSPLIT dispersion model and airborne observations with plume rise ensemble runs
Tianfeng Chai
Xinrong Ren
Fong Ngan
Mark Cohen
Alice Crawford
Download
- Final revised paper (published on 13 Oct 2023)
- Preprint (discussion started on 13 Mar 2023)
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-329', Anonymous Referee #1, 04 Apr 2023
Review of “Estimation of power plant emissions…” by Chai et al.
The paper describes attempts to estimate SO2 emissions from power plants by use of a Lagrangian dispersion model and aircraft measurements. It emphasizes the uncertainty in plume rise due to stack heat input, which is treated as unknown. Two methods are used to find the optimal heat input. There is some good information here, but the presentation could be clearer, and the implications should be more clearly stated.
General comments:
1. The objective of the paper seems to be to find ways to determine the optimum simulation to produce the correct (known) emissions. Two methods are suggested, one based on correlation between the observed and simulated time series, and the other based on the RMS difference of that same time series. Unfortunately I have just explained the objective more clearly than the paper ever does. These are reasonable proposals for how to determine the optimum simulation, but they both have flaws, which are evident in the data. For example, both the correlation and the RMS are sensitive to misplacement of the plume, whereas the inversion may not be sensitive to that misplacement.
2. The heat input to the plume rise calculation is treated as a free parameter. There must be reasonable estimates of the real value available, based on the CEMS data and the characteristics of the plants, for example whether they have scrubbers or not. If the optimization process finds values that are well outside a reasonable range, that may indicate that the plume rise calculation is inadequate, which would be valuable information.
3. Only two of the many possible sources of uncertainty are explored here. That’s fine if it is clearly stated. The two sources examined are the plume rise and the background specification. Errors in wind direction are present, and get some attention. Errors in vertical placement and mixing of the plume are probably also present, but are not discussed at all. As it stands, varying the heat input amounts to looking for the value that best compensates other errors in the model. That’s not wrong as an empirical method, but again, it should be stated.
4. The vertical structure of the simulated plumes should get more emphasis. Some of the figures in the Appendix should be promoted to the main text. It looks like the flights were rather close to the plants, that is, in the region where the plume is not well-mixed in the vertical. This is arguably a mistake in the flight planning, unless it is an error in the model (too slow mixing). In theory, the inversion should recover the correct emissions as long as the observation samples a reasonable amount of the plume, but this is a very strong constraint on the precision of the simulation.
5. It seems that we are to take the set of differences between the inverted and known emissions as a measure of the uncertainty of the method, but this is never stated. Although a formal uncertainty analysis is not really possible with such a small number of samples, some statement should be made. Clearly the differences are not Gaussian, and the large differences (which may or may not be “outliers”) are of concern.
6. More detail is needed on the WRF runs. WRF has many options. The chosen options, initial and boundary data, etc. must be stated in enough detail that WRF experts can judge whether they are reasonable, and others can plausibly replicate the results. In particular, whether using the mixed layer depth (PBLH?) out of WRF directly is reasonable depends on the physics options chosen.
7. The SO2 background is clearly important in this region. I recommend simplifying the presentation by removing the parts where background is not used. Furthermore, I am concerned that the method used to derive the background may have been chosen primarily because it gives the best results (given compensating errors). The explanation is not perfectly clear, but the 25th percentile within the plume seems like it should yield a value considerably higher than the background, which is usually taken to be outside the plume (an illustrative sketch of this comparison is included below, following this review's reply links).
8. The conclusions should state the authors’ recommendations for future studies. This should include a recommendation for which optimization method to use, or that another method is needed. Guidelines for deciding whether a given set of observations is useful or should be discarded would be helpful. Does a large RMS relative to the mean imply that a flight should be discarded? Implications for flight planning should be included. Do the authors recommend using single deterministic meteorology, or should ensembles be used?
Specific comments:
1. The abstract is long and detailed, but does not clearly state the objectives or method. It is more of an introduction than an abstract.
2. Line 273: The standard deviation of the 1-s observations is not a reasonable estimate of the observational uncertainty. It does not take into account sampling error (probably dominant here). How is the process affected by a larger observation uncertainty estimate?
Citation: https://doi.org/10.5194/egusphere-2023-329-RC1
- AC1: 'Reply on RC1', Tianfeng Chai, 31 Jul 2023
- AC3: 'Comment on egusphere-2023-329', Tianfeng Chai, 31 Jul 2023
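To illustrate the concern in general comment 7 above, the following is a minimal sketch, using a purely synthetic SO2 series, of how a background taken as the 25th percentile of the in-plume samples can compare with a background taken from out-of-plume samples. None of the values are from the campaign data; the plume shape, noise level, and 1 ppb background are assumptions for illustration only.

```python
# Synthetic comparison: in-plume 25th-percentile background vs. out-of-plume background.
# All numbers are illustrative assumptions, not campaign data.

import numpy as np

rng = np.random.default_rng(1)
background_true = 1.0                                      # assumed background [ppb]
so2 = background_true + rng.normal(0.0, 0.2, 600)          # out-of-plume variability
so2[200:280] += 20.0 * np.exp(-0.5 * ((np.arange(80) - 40) / 15.0) ** 2)  # one plume crossing

in_plume = so2[200:280]
out_of_plume = np.concatenate([so2[:200], so2[280:]])

bg_in_plume_p25 = np.percentile(in_plume, 25)   # 25th percentile within the plume
bg_out_of_plume = np.median(out_of_plume)       # conventional out-of-plume background

print(f"25th percentile within plume: {bg_in_plume_p25:.2f} ppb")
print(f"Median outside the plume:     {bg_out_of_plume:.2f} ppb")
```

In this synthetic case the in-plume 25th percentile sits well above the out-of-plume value, which is the behaviour the comment is cautioning about.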
RC2: 'Comment on egusphere-2023-329', Anonymous Referee #2, 19 Jun 2023
My overall rating is accept only after major revisions. I think that the use of the retrieval algorithm has merit, and that’s the key reason I’m not recommending rejection. However, there are several issues with the paper that need to be resolved before I can recommend acceptance. Examples (details follow):
- The reason for needing Qh estimates, rather than stack emission temperatures and exit velocities, is unclear. The latter are usually part of CEMS observations, and emissions inventories usually include these parameters as time-dependent values or annual averages. I have included suggested sources of information and the email address of an EPA emissions inventory staff person to contact for information on the sources the authors studied.
- Details on the plume-rise calculation need to be given – there are different ways of implementing Briggs algorithms (see references given below).
- The extent to which the observations indicate conditions suitable for attempting retrievals (steady-state of the observation data) is unclear – but this could be determined from the observation data (references to consult are included).
- The extent to which the HYSPLIT model provides sufficient process detail for a reactive gas such as SO2 for determination of emissions estimates is unclear. There is a risk that this model is too simplified to adequately simulate SO2 concentrations, and the lack of process detail may contribute to emissions estimate errors.
- The measurement data did not attempt to bracket the individual sources, which results in ambiguity regarding upwind concentrations and the possibility of meteorological conditions changing in the source region. The authors cannot do much about this at this stage, but the issue should be acknowledged, and the Introduction should include a review of the flight-path methodology of other aircraft studies and a discussion of how this may affect retrieval results.
My more detailed comments follow.
Abstract, pages 1 and 2, lines 1 to 30: Some rewording of the abstract is needed. The abstract provides a step-by-step description of “what was done” but does not describe the goal of the project. For example, was the objective of the work to test the TCM retrieval method on aircraft observations (that is, to determine whether or not the method works and, if so, how well), was it to determine the circumstances under which retrievals can be carried out, etc.? The reader of the abstract needs to be told why the work was being done, whether or not the project was successful (and why or why not), and the main conclusions resulting from the project. The authors have focused on the fine-grained detail of the work and not on the big picture, which should be the focus.
Introduction, page 2, line 45, line 62: The authors’ list of “source term estimation applications [that] have been developed using various dispersion models and inverse modeling schemes” misses a few recent ones appearing in Atmospheric Chemistry and Physics and other journals such as Nature. For example, Fathi et al. (2021) (https://acp.copernicus.org/articles/21/15461/2021/) describe meteorological conditions under which retrievals of SO2 emissions from aircraft observations are likely to result in significant errors in retrieved emissions, as well as some of the implementation details of dispersion models which may lead to errors in retrievals if they are not recognized and taken into account. That is, some meteorological conditions may result in erroneous emissions estimates – these may explain some of the authors’ problems with some of their aircraft retrievals. One underlying concept for the successful retrievals explored in the above-referenced work is that the meteorology approximates a steady state during the time of the retrieval – the direction and speed of the winds, the change in wind speed with height, and the atmospheric stability are all invariant with time as the observations are taking place. Do the authors have sufficient data to determine whether the meteorological conditions were steady during the observation time (and does variability in those conditions explain, for example, the negative correlation between retrievals and CEMS values for one of the flights the authors examined)?
Introduction, page 2, lines 48-56: One difference between retrievals of cesium, volcanic ash, wildfire particulate emissions and unreactive tracer transport, and emissions of SO2, is that SO2 may undergo oxidation by the OH radical, as well as uptake and oxidation within cloud droplets, creating sulphuric acid and particle sulphate. The paper needs to recognize this loss process and at least attempt to estimate its magnitude (e.g. state whether or not the observations were carried out under cloud-free conditions, hence eliminating the loss through cloud processing, and estimate the gas-phase oxidative losses via OH, preferably through an independent observation-derived estimate of OH concentrations and, if not, via typical OH concentrations). My expectation is that the OH loss will be a relatively minor term, but this needs to be confirmed. Another loss process that needs to be explored is dry deposition of SO2 (the authors mention dry deposition in terms of its Henry’s Law dependence, but not how HYSPLIT makes use of different vegetation types in its calculation of deposition; see also Hayden et al., 2021, ACP, https://acp.copernicus.org/articles/21/8377/2021/ for a description of how aircraft retrievals of SO2 may be used to estimate SO2 deposition velocities directly). Note that the accuracy of Briggs’ equations may also depend on the manner in which they are implemented (see for example Gordon et al., 2018: https://acp.copernicus.org/articles/18/14695/2018/ and/or Akingunola et al., 2018: https://acp.copernicus.org/articles/18/8667/2018/). The latter suggests that applying Briggs’ formulae layer by layer, to account for changes in the atmospheric temperature profile with height, may be more accurate. The authors should provide a bit more implementation detail on how they used the Briggs equations: does HYSPLIT or the driving WRF meteorology include information on temperature in the vertical, hence allowing for a layer-by-layer approach instead of assuming that the surface conditions are sufficient to determine the plume height?
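As a rough indication of the magnitude involved, the sketch below estimates the gas-phase SO2 loss to OH over a plume transport time. The rate constant (~1e-12 cm3 molecule-1 s-1 at about 1 atm), the daytime OH concentration (2e6 molecules cm-3), and the one-hour transport time are assumed typical values, not quantities taken from the paper or the campaign.

```python
# Back-of-envelope estimate of gas-phase SO2 oxidation by OH during plume transport.
# All inputs below are assumed "typical" values for illustration only.

import math

k_oh_so2 = 1.0e-12         # effective SO2 + OH rate constant near 1 atm [cm3 molecule-1 s-1] (assumed)
oh_conc = 2.0e6            # assumed typical daytime OH concentration [molecules cm-3]
transport_time_s = 3600.0  # assumed ~1 h transport from stack to flight track

loss_rate = k_oh_so2 * oh_conc                   # first-order loss rate [s-1]
lifetime_h = 1.0 / loss_rate / 3600.0            # e-folding lifetime [h]
fraction_lost = 1.0 - math.exp(-loss_rate * transport_time_s)

print(f"SO2 lifetime against OH: ~{lifetime_h:.0f} h")
print(f"Fraction oxidized over {transport_time_s/3600:.1f} h: {fraction_lost*100:.2f} %")
```

Under these assumptions the loss over an hour is well under one percent, consistent with the expectation above that OH oxidation is a minor term; cloud processing and dry deposition would need separate treatment.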
Methods, section 2.1, line 90: It appears that there are sufficient wind speed and direction, temperature, etc., observations to determine the likelihood of retrieval success (e.g. see the three meteorology-based metrics used to describe conditions for accurate retrievals in Fathi et al., 2021). These checks can be carried out a priori to eliminate some flight data as being unlikely to provide good retrievals. The paper should include, for each flight, a brief description with a few explanatory figures explaining why the flight is likely to be a good candidate for successful SO2 emissions retrieval.
Figure 1, page 4: A general comment: I was surprised that the flight tracks were apparently a single downwind screen or wall, and did not attempt to bracket the source at multiple levels (box or oval flight around the source) or provide an upwind screen and downwind screen. A single downwind screen will fail to allow the effects of upwind sources to be removed from the retrieval, as well as preventing tests of meteorological variability over space to be carried out (the latter in turn providing information on the extent to which the steady-state requirement for successful retrievals is taking place). This is a fundamental drawback of the sampling methodology in the flights the authors are using to test their retrieval methodology.
Methods, section 2.2, HYSPLIT model: The authors are using HYSPLIT as their model for retrievals. The authors need to make an argument for why HYSPLIT is an appropriate modelling platform for carrying out retrievals, as opposed to a public-domain Eulerian model such as CMAQ. For example, they mention that HYSPLIT passively advects tracers in their configuration (line 100). SO2 is a reactive gas, being oxidized in both gas-phase and aqueous reactions. The authors make no estimates of these potential oxidative losses, and do not mention whether the aircraft observations took place under clear-sky conditions (i.e. whether or not aqueous removal may be likely). Upwind sources of SO2 have not been included in the model. The potential for depositional losses of SO2 needs to be described in more detail – how does this depend on land use, and what deposition algorithm is being used (a reference should be given)? The model’s ability to simulate turbulent mixing of pollutants, in addition to advective transport, has not been described. Later (lines 108 to 111) the authors mention that WRF turbulence variances are used by HYSPLIT, that a fixed horizontal-to-vertical turbulence ratio is imposed, and that boundary layer heat and momentum fluxes are used to calculate boundary layer fluxes, but it is not clear how HYSPLIT uses this information. Does the model include turbulent diffusive transport, for example? HYSPLIT has the advantage of being computationally fast – but I am not convinced, based on the authors’ description, that HYSPLIT will capture enough of the relevant physics and chemistry to be a good proxy for the real atmosphere in the retrievals process. Had an Eulerian model such as CMAQ been used for the retrieval process, the vertical diffusion, upwind sources of emissions, and oxidative removal of SO2 would have been included by default. Or were all these processes included in the authors’ HYSPLIT implementation? If so, the authors need to describe them in this section.
Page 4, line 106: Large variations in the observed wind direction on time scales of one minute imply that an accurate SO2 emissions retrieval may not be possible (see the a priori meteorological criteria described in Fathi et al, 2021). A successful retrieval is dependent on the meteorology being relatively constant during the aircraft flight period. If this is not the case, estimates of emissions fluxes are likely to be in error (see Fathi et al 2021).
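One simple a priori steadiness check along these lines is the circular standard deviation of the observed wind direction over a flight segment. The sketch below is illustrative only: the wind-direction values are invented, and the 20-degree threshold is an assumption for illustration, not a criterion taken from Fathi et al. (2021).

```python
# Circular (Yamartino) standard deviation of wind direction as a crude steadiness check.
# Input data and the decision threshold are illustrative assumptions.

import numpy as np

def circular_std_deg(wind_dir_deg):
    """Yamartino-style circular standard deviation of wind direction [degrees]."""
    theta = np.deg2rad(np.asarray(wind_dir_deg, dtype=float))
    s, c = np.sin(theta).mean(), np.cos(theta).mean()
    eps = np.sqrt(max(1.0 - (s**2 + c**2), 0.0))
    return np.rad2deg(np.arcsin(eps) * (1.0 + 0.1547 * eps**3))

# Hypothetical 1-minute wind directions during a flight segment
wd = [212, 215, 209, 221, 218, 230, 226, 219, 224, 233]
sigma = circular_std_deg(wd)
print(f"Circular std of wind direction: {sigma:.1f} deg")
if sigma > 20.0:   # illustrative threshold only
    print("Large direction variability: a retrieval for this segment may be unreliable")
```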
Page 5, lines 119-121: A better and slightly more recent reference for Briggs would be his 1984 book chapter: Briggs, G. A.: Plume rise and buoyancy effects, in: Atmospheric Science and Power Production, DOE/TIC-27601 (DE84005177), edited by: Randerson, D., Technical Information Center, US Dept. of Energy, Oak Ridge, TN, USA, 327–366, 1984.
Page 7, lines 132-140: The authors mention here that the gas exit temperature of the three stacks could not be obtained, without explanation, whereas they later mention (line 244) that Continuous Emissions Monitoring System data for the stacks were available. In my experience, CEMS data usually include stack temperatures – and while there is no official NEI 2019 year, the 2017 NEI is available, and all three facilities mentioned by the authors are present (cf. https://www.epa.gov/air-emissions-inventories/2017-national-emissions-inventory-nei-data#dataq). My point here is that gas emissions temperature usually is part of CEMS, is included in USA inventory reporting, and should be available to the authors. The authors have made use of a range of estimates of QH: this may not have been necessary if the stack temperatures and exit velocities are available, since these may be used to generate Fb values as well (see Briggs, 1984). More justification/description of the efforts made to determine stack gas exit temperatures, or typical values for same from another year (and why these were not part of the CEMS records), needs to be provided in the text (or alternatively, make use of those temperatures to estimate the Fb terms in the equations). The authors might also contact George Pouliot at the US EPA (the EPA's emissions guru) to ask for this information if it is not available online (pouliot.george@epa.gov). The use of approximations for QH is a serious limitation of the authors’ work here, and I am not sure it is necessary.
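For reference, the sketch below shows the standard Briggs relations for converting stack exit parameters to a buoyancy flux Fb, and for converting a sensible heat emission rate QH to Fb. The formulas are the commonly quoted Briggs (1984) expressions; the stack values and the 50 MW heat input are hypothetical, not data for the facilities studied here.

```python
# Briggs buoyancy flux F_b from stack exit parameters, and from a heat emission rate Q_H.
# Example stack values are hypothetical; ambient conditions are assumed.

import math

g = 9.81       # gravity [m s-2]
cp = 1004.0    # specific heat of air [J kg-1 K-1]
rho_a = 1.2    # ambient air density [kg m-3] (assumed)
T_a = 293.0    # ambient temperature [K] (assumed)

def buoyancy_flux_from_stack(v_s, d_s, T_s, T_amb=T_a):
    """F_b [m4 s-3] from exit velocity v_s [m s-1], inner stack diameter d_s [m],
    and stack gas temperature T_s [K]: F_b = g * v_s * d_s^2 * (T_s - T_a) / (4 * T_s)."""
    return g * v_s * (d_s / 2.0) ** 2 * (T_s - T_amb) / T_s

def buoyancy_flux_from_heat(Q_H):
    """F_b [m4 s-3] from sensible heat emission rate Q_H [W]:
    F_b = g * Q_H / (pi * cp * rho_a * T_a), i.e. roughly 8.8e-6 * Q_H."""
    return g * Q_H / (math.pi * cp * rho_a * T_a)

# Hypothetical stack: 20 m/s exit velocity, 7 m diameter, 330 K exit temperature
fb_stack = buoyancy_flux_from_stack(v_s=20.0, d_s=7.0, T_s=330.0)
fb_heat = buoyancy_flux_from_heat(Q_H=50.0e6)   # assumed 50 MW heat input
print(f"F_b from stack parameters: {fb_stack:.0f} m4 s-3")
print(f"F_b from 50 MW heat input: {fb_heat:.0f} m4 s-3")
```

This is only meant to make concrete the point that, if exit temperatures and velocities are reported, Fb can be computed directly without scanning a range of QH values.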
Section 2.4: It is worth noting here the potential impact of model deficiencies on the emissions estimates generated using the cost function and HYSPLIT output (see my note above). This is a generic concern with a data assimilation approach – the accuracy of the model used may influence the accuracy of the resulting retrieved emissions. The emissions generated will be those required to create the most accurate HYSPLIT predictions – but if HYSPLIT itself does not do a good job of transport, reaction, etc., of the emissions, then the resulting emissions estimates may be inaccurate (especially if some of the physical processes known to be present in the actual atmosphere are absent in the model). This role of model physical parameterization detail and prediction accuracy in the retrieval process via data assimilation methods such as cost-function minimization needs to be acknowledged in the text. The authors should also contrast this with other methods of observation-based emissions estimation which do not have this limitation but may have other limitations (e.g. see Fathi et al., 2021 for some examples and references).
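To make this concern concrete, the following is a minimal sketch, on synthetic data, of a transfer-coefficient-based least-squares retrieval and of how a biased model is absorbed into the retrieved emissions. It is illustrative only and is not the authors' exact TCM cost-function formulation; the transfer coefficients, emission rates, background, and noise level are all invented.

```python
# Illustrative least-squares source-term estimate: transfer coefficients H from
# unit-emission dispersion runs, observations y, assumed background b.
# If H is biased (model error), the bias is absorbed into the retrieved emissions.

import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

n_obs, n_src = 200, 3                      # e.g. 1-s samples along a transect, 3 stacks
H = rng.random((n_obs, n_src)) * 1e-3      # transfer coefficients [ppb per (kg h-1)] (synthetic)
q_true = np.array([800.0, 500.0, 300.0])   # hypothetical "true" emission rates [kg h-1]
b = 1.0                                    # assumed background mixing ratio [ppb]
y = H @ q_true + b + rng.normal(0.0, 0.5, n_obs)   # synthetic observations with noise

# Non-negative least squares for the emission rates, after background subtraction
q_hat, _ = nnls(H, y - b)
print("Retrieved emission rates [kg h-1]:", np.round(q_hat, 1))

# A biased model (e.g. 20 % low transfer coefficients) shifts the retrieved emissions
q_biased, _ = nnls(0.8 * H, y - b)
print("With a 20 % low-biased model:     ", np.round(q_biased, 1))
```

The second retrieval illustrates the point above: the fit can remain good while the emissions compensate for errors in the model itself.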
Lines 181-183, page 8: the variation in background SO2 mixing ratio would presumably have been captured with a reaction-transport model such as CMAQ.
Line 184, page 8: I’m used to measurement campaigns where the aircraft flights are determined based on a model forecast to limit the amount of time the aircraft samples air that is not from the sources. A few words on why there was no contribution from the power plants before 15Z or after 21Z should be added here: was this due to the aircraft flying to/from the plumes during those times, or some other consideration?
Lines 188-189, page 8: the use of the given QH value allows separation of the three plumes – is there also evidence from the observations that the three plumes were separated (e.g. in a flight screen, you would get three different hotspots in the wall, separated by low concentrations)? The uncertainty in the interpretation in turn suggests that getting observed stack gas temperatures is critical (see earlier comment).
Figure 5, page 11: presumably the observations can be used to estimate both plume and PBL heights – they should be included on this image. How well did the model perform relative to observations?
Section 3.2, a general comment: I understand the value of a sensitivity run of a model to determine its sensitivity to a key parameter (in this case QH). There is also a question regarding the accuracy of the methodology used to retrieve emissions. Here, we have a sensitivity run that suggests that the retrieval accuracy is highest for particular QH values. Can an argument be made whereby deficiencies in the model or the retrieval algorithm can be ruled out as alternative causes for the retrieval working well at the given QH value? This is why I am hoping that typical stack gas temperatures are available; if they are used instead of a set of QH sensitivity values to estimate Fb, then the estimate of QH can be ruled out as a cause of error, and the focus can be on the accuracy of the retrieval and the model. My concern here is that HYSPLIT lacks sufficient process detail to do a good job of representing SO2 removal and transport, in which case the choice of QH may compensate for that lack of detail… with an impact on the estimated emissions. A more detailed description of how SO2 is modelled in the authors’ HYSPLIT implementation might help to alleviate this concern. Figure 6: OK, the correlation coefficient improves depending on the choice of QH value – how do we know that the QH value is actually correct? Lines 259-260: the correlation between model and observations has improved for certain QH. The authors seem to assume that this means that the given QH values are correct… what if it means that the given QH values compensate for other issues in the model, giving a correct result but for the wrong reason? Is there any additional information that can be brought to bear to indicate that the QH values for which the model performs well (and hence the corresponding Fb values) are in fact accurate? See above comments regarding EPA data, etc.
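The kind of QH scan being discussed can be sketched as below, where simulate() is a placeholder for running the dispersion model with a given heat input (not a HYSPLIT API call). The sketch simply checks whether the correlation and RMSE criteria prefer the same QH; the toy model and the candidate QH values are invented for illustration.

```python
# Sketch of a QH scan evaluated with two criteria (correlation and RMSE).
# simulate(qh) stands in for a model run with heat input qh; the toy model below is synthetic.

import numpy as np

def evaluate(obs, sim):
    r = np.corrcoef(obs, sim)[0, 1]
    rmse = np.sqrt(np.mean((obs - sim) ** 2))
    return r, rmse

def pick_best_qh(obs, qh_values, simulate):
    """Return the QH preferred by each criterion; simulate(qh) -> simulated series."""
    scores = {qh: evaluate(obs, simulate(qh)) for qh in qh_values}
    best_by_r = max(scores, key=lambda qh: scores[qh][0])     # highest correlation
    best_by_rmse = min(scores, key=lambda qh: scores[qh][1])  # lowest RMSE
    if best_by_r != best_by_rmse:
        print("Correlation and RMSE prefer different QH values:", best_by_r, best_by_rmse)
    return best_by_r, best_by_rmse

# Illustrative usage with a toy "model" in which QH shifts the plume along the transect
obs_series = np.exp(-0.5 * ((np.arange(100) - 50) / 8.0) ** 2)
toy_model = lambda qh: np.exp(-0.5 * ((np.arange(100) - 50 - (qh - 60) / 10.0) / 8.0) ** 2)
print(pick_best_qh(obs_series, [20, 40, 60, 80, 100], toy_model))
```

Agreement between the two criteria in such a scan still says nothing about whether the preferred QH is physically correct, which is the central concern raised above.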
Table 1, page 13: The authors show that the model performs better in the morning than in the afternoon. Why is this the case? They mention errors in the WRF wind fields – I am wondering about other issues, such as the wind direction and stability changing over time. Were the modelled stability and boundary layer height steady during the time period (see the criteria for a successful retrieval in Fathi et al., 2021 – these might provide additional explanations regarding why the model performance is poor, and why retrievals during those times are less likely to be successful)?
Section 3.3.1, 3.3.2: given the impact of background SO2, it would be worthwhile to mention that the flight patterns themselves could have been better constructed, to sample upwind as well as downwind air as has been done in other studies (see above references and references quoted therein). This would have removed background SO2 levels as a source of uncertainty in the emissions estimates.
Line 295, page 14: why was the height of the plumes not estimated from the aircraft observations and included in Figure 5? If the “best performance” QH values are correct, and the underlying plume rise algorithm and model are also correct, then the plume heights should match fairly well. Do they?
Section 3.3.3: The variation in model results based on correlation coefficient versus RMSE shows the sensitivity of the model to the chosen QH, and that the “optimal” QH may depend on the statistic used for the evaluation. Can this comparison provide the reader with any evaluation of the accuracy of the plume rise method itself?
Figure 9: it is difficult to tell from the plots as presented whether the model is doing a good or poor job of simulating the plume height. It would be better if the authors aggregated the observations to allow individual plumes to be resolved (e.g. by interpolation between the observation values) and applied the same aggregation to the model output: that is, how well do the average plume heights for each plume resolved in the observations compare to the average plume heights determined by the model in each case?
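One possible aggregation, sketched below with invented data and illustrative variable names, is an SO2-weighted mean altitude per plume crossing, computed identically from the observed and simulated series so that the two can be compared plume by plume.

```python
# SO2-weighted mean plume height for a single crossing; inputs are illustrative,
# and the same function would be applied to both observed and simulated series.

import numpy as np

def weighted_plume_height(alt_m, so2_enhancement_ppb):
    """SO2-weighted mean altitude [m] for one plume crossing, using the
    background-subtracted SO2 as the weight."""
    w = np.clip(np.asarray(so2_enhancement_ppb, dtype=float), 0.0, None)
    return float(np.sum(w * np.asarray(alt_m, dtype=float)) / np.sum(w))

# Hypothetical crossing: enhancement peaks near 600 m
alt = [400, 450, 500, 550, 600, 650, 700, 750]
so2 = [0.1, 0.5, 2.0, 6.0, 9.0, 5.0, 1.5, 0.2]
print(f"Weighted plume height for this crossing: {weighted_plume_height(alt, so2):.0f} m")
```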
Summary/Discussion:
Based on the study results, what would the authors recommend for future aircraft and emissions-estimate follow-up work (e.g. flight planning to include upwind SO2 measurements, measuring sources for which CEMS data including stack parameters are available, a priori decision making for when the data are suitable for retrievals and when they are not, etc.)?
Is the use of RMSE better than correlation coefficient in determining emissions? I’m not clear on that by the end of the paper.
Lines 431-435: the authors need to include in this discussion the need for direct observations of stack gas temperatures – which are included in most large point source reporting (I am hoping they can get these data from the web links and email contact address I have included above). The authors apparently believe that plume observations do not include stack temperature observations – I have found the contrary to be the case in my experience, looking at emissions inventories. It would be unusual for the power plants mentioned not to also have stack parameters, so some follow-up by the authors is worthwhile.
Lines 441-443: “We speculate…” – please clarify and expand on this statement. Do you mean plume placement in the vertical dimension? And why would the method be less sensitive to a height error?
Citation: https://doi.org/10.5194/egusphere-2023-329-RC2
- AC2: 'Reply on RC2', Tianfeng Chai, 31 Jul 2023
- AC3: 'Comment on egusphere-2023-329', Tianfeng Chai, 31 Jul 2023