This work is distributed under the Creative Commons Attribution 4.0 License.
High-resolution air quality simulations of ozone exceedance events during the Lake Michigan Ozone Study
R. Bradley Pierce
Monica Harkey
Allen Lenzen
Lee M. Cronce
Jason A. Otkin
Jonathan L. Case
David S. Henderson
Zac Adelman
Tsengel Nergui
Christopher R. Hain
- Final revised paper (published on 30 Aug 2023)
- Preprint (discussion started on 02 Feb 2023)
Interactive discussion
Status: closed
RC1: 'Comment on egusphere-2023-152', Kirk Baker, 16 Feb 2023
This manuscript is well written, and the Figures are well organized and clearly convey pertinent information relevant for this assessment. It is very useful to have more modeling assessments that show skill at capturing complex meteorology and ozone chemistry in the Lake Michigan area. Some substantive comments and suggestions are provided to strengthen the paper, followed by smaller suggestions.
First, some more clarity around the “EPA run” would be appreciated. EPA does not have a recommended set of options for the application of WRF or CMAQ but tends to use a certain configuration for both of these models that has traditionally performed well over the contiguous U.S. at 12 km grid resolution. The name is also a little confusing in that it might suggest that EPA provided the WRF simulation for this project. Perhaps that simulation could be called “ACM2/PX” or something along that line which identifies it by certain major PBL physics options.
Second, it is difficult to fully review this manuscript due to key aspects referencing a paper (Otkin, 2022) that is not in the reference list. A quick Google Scholar search did not turn up this paper. If this has not been published perhaps it could be provided as part of this review.
Third, some rationale would be appreciated about why the lake breeze timing and strength was only analyzed at Sheboygan and the vertical wind measurements made at Zion were not included as part of this assessment. Both sites are important for ozone impacts in southeast Wisconsin and since there are only 2 sites with vertical wind profiles it seems within scope for a paper focused on that particular process.
Fourth, how did the modeling system predict isoprene and isoprene oxidation products? These model simulations did much better at capturing formaldehyde at Sheboygan compared to other assessments that used the same biogenic emissions model. Perhaps the authors could provide a little more information about the hemispheric scale RAQMS simulation that was used to provide lateral boundary inflow. That seems to be a potentially big difference from other simulations for this field study (e.g., Baker et al., 2023).
Fifth, please provide some more details in the methods section about the version of CMAQ applied for this assessment and which version of the Carbon Bond chemical mechanism (CB6r3, etc.) was used for these model simulations. Was the GLSEA product used as part of the EPA simulation? Better resolved lake temperatures seem important for capturing thermal gradients at the lake/land interface and if only one of the simulations used that product that could be as much or more important than the different physics options selected.
More specific comments follow.
Lines 62-63. Is there a reference for 1-10 km, or is this the opinion of the authors? This paper does not conclusively show that 12 km cannot capture high O3 in this region that would be relevant for SIP demonstrations.
Line 94. Please define RMSE or provide a reference.
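For reference, the conventional definition (a standard textbook formula, not necessarily the exact one used in the manuscript) is

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(M_i - O_i\right)^2},$$

where $M_i$ and $O_i$ are the paired modeled and observed values over $N$ samples.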
Figure 2. Please comment on the over-prediction (orange) in upper right panel in Figure 2 that looks to be in Chicago.
Figure 3. It would be helpful to replace the Julian day scale with the calendar dates (same comment for Figures 4, 5, and 6).
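A minimal sketch of the relabeling being suggested, assuming the time axes are day-of-year values for 2017 (hypothetical helper function, not the authors' plotting code):

```python
from datetime import datetime, timedelta

def doy_to_date(year: int, doy: int) -> str:
    """Convert a day-of-year (Julian day) value to a calendar-date label."""
    return (datetime(year, 1, 1) + timedelta(days=doy - 1)).strftime("%d %b")

# Example: day 153 of 2017 corresponds to 02 Jun (the June 2, 2017 episode)
print(doy_to_date(2017, 153))
```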
Figure 3. YNT_SSNG NO2 is greater than EPA. Does this increase in NO2 translate to more O3 in the YNT_SSNG model? If so, how does that impact contemporaneous O3 performance? Is there a better way to show this in figures or explain in text?
Figures 5 and 6 are not easy to read; consider shortening the x-axis to include only the A, B, and C time periods the authors wanted to highlight. For Figures 5 and 6 there does not seem to be a notable difference in performance. Some days one configuration is better and then the other is better on other days. Some days both are poor. I think the aggregated metrics might be over-emphasized in the text. Do they include nighttime minimums that are not SIP relevant?
Line 211. “While the overall hourly ozone statistics at Sheboygan KA and Chiwaukee Prairie are relatively similar between the EPA and YNT_SSNG simulations at these sites, the simulations during high ozone events are quite different. This is illustrated by looking at composite statistics during events A, B, and C.” How are they different? The subsequent figures and paragraphs generally seem to say the two sims are more similar than different.
Line 311. “Given the relatively high precision of GeoTASO NO2 compared to the column amounts observed during high ozone events A, B and C, we conclude that the high bias in NO2 columns in the EPA simulation is meaningful.” Meaningful how? Does this translate to differences in O3 production and therefore performance in the model?
Figure 12. How does the penetration compare for B and C?
Line 389. “The YNT_SSNG simulation captures the thermal structure of the nocturnal boundary layer and timing of the maritime boundary layer but underestimates the surface temperatures (by roughly how much? Please quantify or provide model perf statistics here) within the convective boundary layer. The YNT_SSNG simulation captures the vertical structure of the lake breeze wind speed and direction, but the timing of the switch in wind direction is about 3 hours too early. In contrast, the EPA simulation shows no thermodynamic signature of a nocturnal or marine boundary layer and underestimates (by how much? quantify or provide model perf statistics here) the sharp change in the observed windspeed and direction.”
REFERENCES
Baker, K. R., Liljegren, J., Valin, L., Judd, L., Szykman, J., Millet, D. B., Czarnetzki, A., Whitehill, A., Murphy, B., and Stanier, C.: Photochemical model representation of ozone and precursors during the 2017 Lake Michigan ozone study (LMOS), Atmospheric Environment, 293, 119465, 2023.
Citation: https://doi.org/10.5194/egusphere-2023-152-RC1
RC2: 'Comment on egusphere-2023-152', Anonymous Referee #2, 14 Mar 2023
General comments:
I would rate this paper as “minor revisions”. I have a lot of comments, but reading them through, they are along the lines of clarifications needed in the text, some additional references worth citing (recent papers on the same topic in the same region that the authors have missed), and the need for some caveats on some of the conclusions, as opposed to serious methodological issues. I also made some suggestions for additional possible causes of model differences that might be worth investigating (for example, a check on the inputs for ozone deposition velocity and whether the model changes between the two versions might have affected the simulated deposition velocities). Definitely worth publishing, subject to the clarifications etc. below in my specific comments.
Specific Comments:
Introduction:
Line 53, list of references: A few on the Canadian side of the border worth referencing, since they are on the same topic, and give insight into model processes (including two studies at 2.5km resolution):
Stroud et al., 2020: Chemical analysis of surface-level ozone exceedances during the 2015 Pan American Games, Atmosphere, 11(6), 572, https://doi.org/10.3390/atmos11060572. Identifies the updraft region of lake breeze fronts as the region where ozone production is occurring, and a transition from VOC- to NOx-sensitive O3 formation within that region. Lake Ontario water temperature and strength of the large-scale flow shown to be critical in timing of the observed and modelled LBFs.
Brook et al., 2013: https://acp.copernicus.org/articles/13/10461/2013/. Overview of the main findings of the BAQS-Met 2007 study, including interactions of LBFs and ozone formation.
Makar et al., 2010: Mass tracking for chemical analysis: the causes of ozone formation in southern Ontario during BAQS-Met 2007, https://acp.copernicus.org/articles/10/11151/2010/. 2.5 km resolution analysis; lake breeze, recirculation, and other effects studied.
He et al., 2011: https://acp.copernicus.org/articles/11/2569/2011/. Observation-based analysis of O3.
Methods:
Line 80: please mention the vertical resolution of the model for the first few layers up from the surface, as part of this discussion. Later, in the conclusions, mention is made of a disconnect between the ACM2 physics and the modelling setup for YNT_SSNG: please provide some background information, either in the Introduction or Methods sections, on this disconnect and why it might impact model results.
Line 83: mention is made of the Morrison, Thompson microphysics as well as the Kain-Fritsch cumulus parameterization. Explicit microphysics schemes are usually thought of as applying only at high resolution, while cumulus parameterizations such as KF are usually used at lower resolutions. Which type of cloud approach was applied at which resolution, in the authors’ setup of the nested model, and why? Is this the disconnect mentioned by the authors I mention above with reference to line 80 and later in the text?
Line 93: The authors have discussed the model setup, but not the input analyses used to drive it, or how often these were updated. That is, they mention the EPA meteorology, and the optimized WRF configuration: were both of these meteorological models driven from the same meteorological analysis or from different analyses? Depending on the source of driving met information used for those two sources of information, different results for the meteorology might occur, and may explain some of the differences between the two CMAQ simulations (if they were different – e.g. different analysis hours, different analyses from different sources). Please give some more information on the source of analysis information used to generate the meteorology for the two simulations – was it the same information or different – and how often the meteorological models were updated using a new analysis. This is a key part of the meteorological setup for this sort of experiment; needs to be included in Methods.
Line 95: I didn’t see a description of whether either the CMAQ or meteorological models include a parameterization for horizontal diffusion when operating at high resolution. Is this process included (please state whether or not in the text)?
Line 108: the paper would benefit from a figure showing the nested domains used in the simulations, please add.
Line 114: the information that the 4 km emissions were interpolated and downscaled by 1/9 (I assume that this was actually 1.3²/4², or about 1/9.47?), rather than generated by SMOKE using area source polygons gridded directly to the higher resolution spatial allocation, was a bit surprising. The use of 4 km data at 1.3 km will negate some of the advantages of going to 1.3 km resolution. Why was this approach taken? Had 1.3 km resolution emissions been generated directly but resulted in poor performance, or is this a stage still to be carried out? The text needs to note at this point that the linear interpolation and scaling of 4 km emissions to 1.3 km resolution will prevent the model from being able to capture, e.g., area sources smaller than a 4x4 km grid cell, effectively diluting the 1.3 km emitted pollutant mass on input to the 1.3 km model. This could have a substantial impact on model performance and needs to be acknowledged in the Methods and Conclusions sections (ideally, rerunning the model with emissions at the higher resolution would be a better approach). Were the biogenic area emissions also downscaled in this fashion?
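A worked version of that per-cell scaling factor, assuming the 4 km fluxes were interpolated onto the 1.3 km grid and the per-cell mass was rescaled by the ratio of grid-cell areas:

$$\frac{(1.3\,\mathrm{km})^2}{(4\,\mathrm{km})^2} = \frac{1.69}{16} \approx \frac{1}{9.47}$$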
- Results
Line 155: The authors have stated that “the biases and RMSE in 8-h maximum ozone are generally smaller by 2 ppbv in the YNT_SSNG simulation”. This needs to be corrected - I think what’s happening is more nuanced and should be qualified a bit by the authors. What I see in Figure 1’s mean bias panel is that the concentration biases in the YNT_SSNG parameterization are uniformly more positive than the EPA simulation, aside from the 80-90 ppbv values where the EPA simulation has the least negative bias. That is, the YNT_SSNG concentrations are likely in general higher, across the entire < 80 ppbv concentration range. For example, relative to the zero bias line, the YNT_SSNG results for 20-30 ppbv and for 30-40 ppbv are actually worse (higher positive bias) than for the EPA model. Similarly, the EPA model has the lower magnitude bias for the upper end concentrations between 80 and 90 ppbv. The general upward shift in bias towards more positive numbers in the YNT_SSNG model versus the EPA model made me wonder if this offset is associated with a different deposition algorithm being used between the two models, or, if the same deposition algorithm is used, whether the differences in GVF, soil temperature, and moisture may have resulted in different deposition velocities for O3. That kind of difference in bias may be an indication of a different loss flux for O3 between the two models.
Similarly, the RMSE would be better described as “better for the YNT_SSNG model between 40 and 80 ppbv”, since between 20 and 40 ppbv and between 80 and 90 ppbv, the EPA model has a lower RMSE than the YNT_SSNG model.
So – mention the range of concentrations where each model is outperforming the other, rather than the blanket statement currently in use.
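A minimal sketch of the kind of bin-wise summary being asked for here, assuming paired observation/model arrays (hypothetical variable names, not the authors' code):

```python
import numpy as np

def binwise_stats(obs, mod, edges):
    """Mean bias and RMSE of model vs. observations within observed-concentration bins."""
    obs, mod = np.asarray(obs, float), np.asarray(mod, float)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (obs >= lo) & (obs < hi)
        if in_bin.any():
            diff = mod[in_bin] - obs[in_bin]
            rows.append((lo, hi, diff.mean(), np.sqrt((diff ** 2).mean())))
    return rows  # (bin_low, bin_high, mean_bias, rmse) per bin

# e.g. 10 ppbv bins spanning 20-90 ppbv, matching the Figure 1 discussion
edges = np.arange(20, 100, 10)
```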
Figure 2: agree, YNT_SSNG definitely looks better on average for both mean bias and RMSE in these maps (which will give the average performance across all concentration ranges).
Line 188: maybe that should be “hence better representation of biogenic VOCs”. Another possibility here: were the same land use category maps used for deposition in each of the models, or were they also updated/different between the two simulations? Land use type and leaf area index also have a key impact on deposition velocity - if those have changed, they could well account for the offset in ozone bias noted above. It’s worth adding a few lines to the methodology regarding whether any of the inputs to the model’s gas-phase dry deposition code have changed as a result of the different configurations.
Figure 4: An aside: I actually thought that both of these are pretty good, for HCHO, since it is so dependent on the secondary chemistry as well as the primary VOC emissions.
Line 208: Discussion on Figures 5 and 6: it’s rather hard to tell the two panels in each of these figures apart at a glance; more useful information is provided by the summary values of r, bias, and RMSE on the panels. Aside from establishing that the two simulations do have differing performance at two different sites, I’m not sure if they add to the paper - either remove them, put them in an SI, or try zooming in on the period encompassing the start of A through the end of C to better show the differences in the region of interest (start at Julian day 152, end at 169).
Also, regarding the Figure 5 and 6 discussion: can the authors make some statement on whether the model results are different from each other in a statistically significant way (e.g. 90% confidence limits calculated for the two model runs and compared, etc.)? Can something quantitative be stated regarding the model differences not being due to chance, etc.?
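One way such a statement could be constructed, sketched under the assumption of paired hourly model-observation series (hypothetical function; a block bootstrap would be more appropriate if temporal autocorrelation is a concern):

```python
import numpy as np

def bootstrap_bias_ci(obs, mod, n_boot=10000, alpha=0.10, seed=0):
    """Bootstrap confidence interval (90% for alpha=0.10) on the mean model bias."""
    rng = np.random.default_rng(seed)
    diff = np.asarray(mod, float) - np.asarray(obs, float)
    means = np.array([rng.choice(diff, size=diff.size, replace=True).mean()
                      for _ in range(n_boot)])
    return np.percentile(means, [100 * alpha / 2, 100 * (1 - alpha / 2)])

# If the 90% intervals for the EPA and YNT_SSNG biases do not overlap,
# that is conservative evidence the difference is not due to chance.
```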
Line 233: differences in the meteorology had me wondering whether the two sets of driving meteorology were generated from different analyses, see comment above.
Line 258: “provide an overall comparison of the ”. They provide a comparison at two locations. Are there other meteorological stations in the region which can be used to provide an overall evaluation table of meteorological performance? That would be better than a two-station comparison. Figure 1 provides part of this, I think, and is a better indication of model performance than Figure 7.
Figure 9: There needs to be some explanation, or at least speculation, of why both models are showing much greater variability in the simulated wind direction relative to the observed wind direction for wind directions < 200 degrees. I’m wondering, for example, what the wind magnitudes were here; the Figure 7 wind roses give frequency but not wind speed. Are the winds “light and variable” when they are coming from the < 200 degree directions, hence possibly explaining why the direction has such higher variability? That is, are the models having issues resolving wind direction at low wind speeds?
Lines 313 to 315: Looking at Figure 11, the YNT_SSNG has a larger number of points clustering off of the 1:1 line than the EPA simulation in the upper end of the range. The figures need some addition showing the range of variability that might be expected in the observations, to better support the point that the HCHO results for the two models are hard to distinguish given the uncertainties in the obs. Plotting the expected range of variability of the obs (e.g. factor-of-2 lines if the obs are as much as a factor of 2 off) on the figures would make a better case, and I suggest the authors do this - I’m not convinced by the text and by looking at the figures that the HCHO values are sufficiently uncertain that both EPA and YNT_SSNG are doing equally well. I suspect that Figure 11 has a large number of points in the red zone outside of the expected error of the instrument for both simulations.
Figures 10 and 11 lack colour bar scales for the number of points in a given cell, and the dividing line between points clustered by cell and individual obs/model pairs plotted as dots is not clear. Colour bar scales need to be added to the two figures.
Section 3.3:
General comment: this section was more convincing – for this particular case, the YNT_SSNG model was doing better than the EPA model.
Line 330: Why was this particular time chosen? One thing I’ve seen from high-resolution AQ model runs in the Great Lakes area in past research (see above references) is that the timing of the front arrival can be off. In that respect, with regard to the EPA simulation - did the EPA front ever reach as far inland as the YNT front during the event, or was it always close to the shoreline?
Line 334, left panel in Figure 12: do the authors have access to the original imagery so they could alter the grey scale to a colour scale making the differences a bit easier to see in the image?
Line 339: However, it is also clear from Figure 12 that the ozone at several of the stations is more accurately captured by the EPA model setup (e.g. the northernmost station, the two stations near the "-68" and "43" on the figure, and the two stations near Chiwaukee); worth adding a caveat to that effect. YNT_SSNG gets the southern group of three stations better, however. A scatterplot of O3 across all stations at this time might also help to make a better case for YNT_SSNG here. I agree that the EPA simulation is not getting the inland penetration right and YNT_SSNG is doing a better job, however.
Line 358: agree, good case that YNT_SSNG is doing better for column NO2. Figure 13: both simulations appear to be missing some of the column NO2 over the lake on the eastern side of the observations. EPA could arguably be doing better there, though it shows an erroneous peak to the SE not in the obs. Any thoughts on what’s happening further out over the lake and why the models are missing it? Upper atmosphere NO2 coming from somewhere else?
Lines 367 to 370: I don't follow the explanation here - a few more sentences describing how/why the drift relative to the baseline happens are needed. The EPA simulation seems to be closer to the obs in this eastern part of the domain, but the authors are saying we don't trust the data there: they need to better explain the reasoning why the observation results are less trustworthy here, preferably with some references. Note that the NO2 columns also show higher values on the east side, which would argue for a real effect as opposed to an artifact.
Figure 15: YNT_SSNG definitely better than EPA for temperature and windspeed here. Both seem to be off for the wind direction (centre panels). I note that this corresponds to an underprediction by both models of the wind speed for the central region at about 10 GMT; see my earlier note regarding Figure 9 and wind direction uncertainties being exacerbated at low wind speeds.
Conclusions:
Some further summary comments, based on the Figures (authors should note if these take-away messages differ from their intent and modify the text if so):
Figures and text of section 3.3 make the best case that YNT_SSNG is doing a better job.
Figure 1 suggests that YNT_SSNG biases have been shifted more positive uniformly across the concentration bins; RMSE is better for YNT_SSNG for the 40-80 ppbv range and worse than EPA for the lower and higher concentration ranges. Figure 2: YNT_SSNG doing a better job.
Figure 3: toss-up.
Figure 4: YNT_SSNG better.
Figure 5 & 6: together, a toss-up (and two stations are not necessarily meaningful - I wonder what a time series of the average of all stations for the obs and the two models would look like?).
Figure 7: hard to say which is better.
Figure 8: lots of variability, but EPA seems to be doing better for the highest O3 events.
Figure 9: the models have so much variability in the wind direction at low angles that it's very hard to see which is better. This needs to be combined with wind speeds somehow.
Figures 10 and 11 could use lines showing the upper and lower range associated with the satellite observations. I'm not convinced by the authors' statement that the HCHO values are sufficiently suspect that both models' performance relative to those observations is similar. Show it on the figures.
Figure 12 - 15: EPA doing better for some stations - but YNT_SSNG doing better for getting the lake breeze penetration distance and the shape of the front, etc.
Other comments on the Conclusions:
Lines 417-418: Figure 11 seems to indicate that column HCHO biases are in places more negative in YNT_SSNG than EPA? See earlier comment about adding some lines indicating the observation variability on that Figure.
Line 422: biogenic emissions differ, don’t they? Add a qualifier here; emissions are not necessarily identical. Re: differences being linked to differences in horizontal and vertical transport: it might be better to say that they are linked to differences in meteorology, but not necessarily just transport. For example, are the cloud fields the same for the two simulations (aside from aqueous chemistry effects, whether or not convection has kicked off may also influence photolysis rates)?
Line 437: mismatch between ACM2 vertical diffusion and YNT_SSNG mentioned here for the first time – see my note above – some prior discussion of this issue and what the mismatch consists of needs to appear earlier in the text.
Lines 443-444: The authors' statement here is too broad - the YNT_SSNG shows improved timing and extent of the lake breeze front for one high ozone episode, that of section 3.3. The statement should be rewritten as, e.g., “in the YNT_SSNG simulation for the June 2, 2017 ozone episode”. The abstract had it right, in that respect; this line in the conclusions needs to match the abstract.
The authors mention that going to two-way coupling is worth looking at for this problem. I would agree, with some caveats and advice (since our group has gone there). One of the first things that happens with full coupling is that all of the cloud fields move, since the activation of clouds is linked to the locations where hygroscopic aerosols are located. However, these movements may not be significant: individual convective cells kicking off or not in adjacent grid cells give a large local difference in water and chemistry, but don’t necessarily affect things like the timing of frontal passage. You need to look at multiday averages to see whether the feedback effects have a net effect; it is much more difficult to show a significant effect with single case studies. Use of confidence ratios (cf. Figures 8, 12, 13 of Makar et al., 2021, https://acp.copernicus.org/articles/21/12291/2021/) is one way to estimate the significance of changes like these. Calculating similar confidence ratios might be another way to see the extent to which the YNT_SSNG and EPA simulations are significantly different in the current study.
Citation: https://doi.org/10.5194/egusphere-2023-152-RC2
AC1: 'Comment on egusphere-2023-152', R. Bradley Pierce, 28 Jun 2023
The comment was uploaded in the form of a supplement: https://egusphere.copernicus.org/preprints/2023/egusphere-2023-152/egusphere-2023-152-AC1-supplement.pdf