Lagrangian transport simulations using the extreme convection parametrization: an assessment for the ECMWF reanalyses

Atmospheric convection plays a key role in tracer transport from the planetary boundary layer to the free troposphere. Lagrangian transport simulations driven by global meteorological input data such as the European Centre for Medium-Range Weather Forecasts’ (ECMWF’s) ERA5 and ERA-Interim reanalyses typically lack proper explicit representations of convective up- and downdrafts because of the limited spatiotemporal resolution of the input data. Lagrangian transport simulations for the troposphere can be improved by applying parametrizations to better represent the effects of unresolved convective transport in the global meteorological reanalysis data. Here, we implemented and assessed the effects of the extreme convection parametrization (ECP) in the Massive Parallel Trajectory Calculations (MPTRAC) model. The ECP is conceptually simple. It requires the convective available potential energy (CAPE) and the height of the equilibrium level (EL) of the meteorological data for input. Assuming that unresolved convective events yield well-mixed vertical columns of air, the ECP randomly redistributes the air parcels vertically between the surface and the EL, if CAPE is present. We analyzed statistics of explicitly resolved and parametrized convective updrafts and found that the frequencies of strong updrafts due to the ECP, i.e., a potential temperature increase of 20 K or more over 6 h, increase by 2 to 3 orders of magnitude for ERA5 and 3 to 5 orders of magnitude for ERA-Interim compared to the explicitly resolved updrafts. To assess the effects of the ECP on tropospheric tracer transport, we conducted transport simulations for the artificial tracer e90, which is released globally near the surface and which has a constant e-folding lifetime of 90 days throughout the atmosphere. The e90 simulations were conducted for the year 2017 with both ERA5 and ERA-Interim data.
Next to sensitivity tests on the choice of the CAPE threshold, an important tuning parameter of the ECP, we suggest a possible improvement of the ECP method, i.e., to take into account the convective inhibition (CIN) indicating the presence of warm, stable layers that prevent convective updrafts in the real atmosphere. While ERA5 has higher spatiotemporal resolution and explicitly resolves more convective updrafts than ERA-Interim, we found there is still a need for both reanalyses to apply a convection parametrization such as the ECP to better represent tracer transport from the planetary boundary layer into the free troposphere on the global scale.

no significant overhead on the total runtime of the Lagrangian transport simulations, which is an advantage for conducting large-scale ensemble or long-term simulations.
In this study, we assess whether the improvements of explicitly resolved convective features from ERA-Interim to ERA5 are sufficient to properly represent global tracer transport and to what extent Lagrangian transport simulations for the free troposphere and stratosphere can benefit from applying a convection parametrization such as the ECP. First, we compared statistical distributions of explicitly resolved and parametrized convective updrafts for both ERA5 and ERA-Interim. This of MPTRAC to prescribe the e90 concentrations of the air parcels at the lower boundary of the model and a module to simulate exponential loss of the concentrations using a given, fixed e-folding lifetime (Sect. 3.3 and following).

The extreme convection parametrization
Atmospheric convection is characterized by various fundamental physical quantities. Among the most important quantities are the convective available potential energy and the convective inhibition (Blanchard, 1998; Riemann-Campe et al., 2009). The convective available potential energy (CAPE) is the integrated amount of work that the upward buoyancy force can perform on an air parcel if it rose vertically through the atmosphere. CAPE is calculated from

$$\mathrm{CAPE} = \int_{p_\mathrm{EL}}^{p_\mathrm{LFC}} R_d \left( T_{vp} - T_{ve} \right) \, d\ln p, \qquad (1)$$

where p is pressure, T_vp is the virtual temperature of the lifted air parcel moving upward moist adiabatically from the level of free convection (LFC) to the equilibrium level (EL), T_ve is the virtual temperature of the environment, and R_d is the specific gas constant for dry air. Similarly, the convective inhibition (CIN) is calculated from

$$\mathrm{CIN} = -\int_{p_\mathrm{LFC}}^{p_\mathrm{SFC}} R_d \left( T_{vp} - T_{ve} \right) \, d\ln p. \qquad (2)$$

Conceptually, CIN is the opposite of CAPE. It indicates the amount of energy that will prevent an air parcel from rising from the surface (SFC) via the lifted condensation level to the LFC. Physically, CIN indicates the presence of warm, stable layers that will effectively hinder the formation of convective updrafts. The definitions of both CAPE and CIN take into account the virtual temperature, T_v = T (1 + εq), with temperature T, ε = 0.608, and specific humidity q. The virtual temperature is the temperature that dry air would have if its pressure and density matched those of a given sample of moist air. In the definitions of CAPE and CIN, the use of virtual temperature reduces uncertainties due to neglecting the effects of moisture on the equation of state (Doswell and Rasmussen, 1994).

As an example, Fig. 1 shows monthly mean CAPE, CIN, LFC, and EL data in July 2017 from the ERA5 reanalysis. The data have been calculated directly from the ERA5 temperature and specific humidity vertical profiles using the meteorological data preprocessing code of MPTRAC (Hoffmann et al., 2022a, see electronic supplement).
CAPE is largest in the tropics, near the Intertropical Convergence Zone (ITCZ), where high temperature and moisture strongly promote convection. Local means of CAPE of up to 3000 J kg⁻¹ relate to frequent events of intense convection. CAPE decreases by 2 to 3 orders of magnitude at higher latitudes. CAPE minima are observed over cold water ocean surfaces and in arid regions over land. In contrast, the largest values of CIN are found over the subtropics, near the downwelling regions of the Hadley circulation. CIN is stronger over land than over ocean. A broad maximum of CIN, with peak values of up to 1400 J kg⁻¹ in the monthly mean, occurs over Northern Africa and the Arabian Peninsula. The patterns found here are qualitatively consistent with other studies discussing climatologies and long-term changes of CAPE and CIN (Riemann-Campe et al., 2009; Chen et al., 2020).
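The definitions of CAPE, CIN, and virtual temperature given above can be illustrated with a minimal Python sketch. It is not the MPTRAC preprocessing code. It assumes that the virtual temperature profiles of the lifted parcel and of the environment are already available on common pressure levels, and that the LFC and EL have already been located. The function names, the trapezoidal integration, and the value R_d = 287.05 J kg⁻¹ K⁻¹ are illustrative assumptions.

```python
import math

R_D = 287.05     # specific gas constant for dry air [J kg^-1 K^-1] (assumed value)
EPSILON = 0.608  # coefficient in T_v = T (1 + epsilon q)


def virtual_temperature(t, q):
    """Virtual temperature [K] from temperature t [K] and specific humidity q [kg/kg]."""
    return t * (1.0 + EPSILON * q)


def buoyancy_integral(p, tvp, tve):
    """Trapezoidal estimate of the integral of R_d (T_vp - T_ve) d ln p.

    p, tvp, and tve are sequences on the same levels, ordered from the
    bottom (highest pressure) to the top (lowest pressure) of the layer.
    """
    total = 0.0
    for i in range(len(p) - 1):
        f_lower = tvp[i] - tve[i]
        f_upper = tvp[i + 1] - tve[i + 1]
        total += 0.5 * (f_lower + f_upper) * (math.log(p[i]) - math.log(p[i + 1]))
    return R_D * total


def cape(p, tvp, tve):
    """CAPE [J/kg]; the levels must span the LFC (first) to the EL (last)."""
    return buoyancy_integral(p, tvp, tve)


def cin(p, tvp, tve):
    """CIN [J/kg]; the levels must span the surface (first) to the LFC (last)."""
    return -buoyancy_integral(p, tvp, tve)
```

As a plausibility check, a parcel that is uniformly 2 K warmer than its environment between 800 and 200 hPa yields CAPE = 2 R_d ln 4, i.e., about 796 J kg⁻¹ with the constant assumed here.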

The ECP as implemented in MPTRAC requires CAPE and EL data derived from the meteorological input data. In the first step, the gridded CAPE and EL data are interpolated to the horizontal positions of the air parcels. If the interpolated CAPE value of an air parcel is larger than a threshold CAPE₀, i.e., a user-defined control parameter of the ECP, and the air parcel is located below the EL, it is assumed that up- and downdrafts within the vertical column are strong enough to trigger a convective event. In the second step, the air parcels involved in a convective event are randomly redistributed in the vertical column stretching from the surface to the EL. The random redistribution of the air parcels is weighted by air density in order to yield a well-mixed vertical column of air over the grid boxes of the meteorological input data. Mass conservation is achieved because the number of air parcels and their mass are not changed in a convective mixing event. In the following time step of the model, the trajectories are continued from the new positions the air parcels obtained during the convective mixing event. In MPTRAC, the ECP can be applied as frequently as each time step of the model, or it can be applied more sparsely at user-defined time intervals (e.g., every ∼3 h) to reflect typical convective timescales.
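The two steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MPTRAC implementation. It assumes that CAPE, the EL pressure, and the surface pressure have already been interpolated to each air parcel, and it realizes the density weighting by drawing the new position uniformly in pressure. For a hydrostatic atmosphere, equal pressure intervals contain equal air mass, so this yields a well-mixed column. The dictionary keys and the function name are hypothetical.

```python
import random


def ecp_mix(parcels, cape0=0.0, rng=None):
    """Apply one convective mixing step in place; return the number of moved parcels.

    Each parcel is a dict with the (hypothetical) keys:
      p     parcel pressure [hPa]
      ps    surface pressure at the parcel position [hPa]
      p_el  pressure of the equilibrium level at the parcel position [hPa]
      cape  CAPE at the parcel position [J/kg]
    """
    rng = rng or random.Random()
    moved = 0
    for parcel in parcels:
        # "Below the EL" means a higher pressure than at the EL.
        below_el = parcel["p"] > parcel["p_el"]
        if parcel["cape"] > cape0 and below_el:
            # Uniform in pressure = weighted by air density in height.
            parcel["p"] = rng.uniform(parcel["p_el"], parcel["ps"])
            moved += 1
    return moved
```

The number of parcels is unchanged by the step, which mirrors the mass conservation argument in the text. Calling the function only every few model time steps corresponds to applying the ECP at user-defined time intervals.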
The globally applied threshold CAPE₀ can be set to zero, implying that convection will take place everywhere below the EL where CAPE exists. Strictly speaking, this parameter choice is referred to as the "extreme convection" approach. It provides an upper limit to the effects of unresolved convection in the meteorological input data. In contrast, switching off the ECP completely will provide a lower limit for the effects of convection on the Lagrangian transport simulations, as only explicitly resolved convective updrafts of the meteorological input data will be taken into account. Intermediate states can be simulated by selecting specific values of the threshold CAPE₀. As there is no fixed classification, we here refer to CAPE values of less than ∼1000 J kg⁻¹ to represent weak to moderate instability, ∼1000 to ∼3000 J kg⁻¹ to represent moderate to strong instability, and CAPE values greater than ∼3000 J kg⁻¹ to indicate cases of extreme instability.
To provide guidance on choosing the threshold CAPE₀ for the ECP, Fig. 2a shows occurrence frequencies of convective events exceeding a given threshold in different latitude bands derived from global ERA5 data on 1 July 2017, 00:00 UTC.
Similar to Fig. 1, it is found that convective events are predominant in the tropics, followed by middle latitudes, whereas strong CAPE events are much less frequent at high latitudes. Despite the large variability, Fig. 2b shows that the mean height of the EL tends to scale logarithmically with CAPE, increasing from mean heights of about 2 km below 10 J kg⁻¹ to about 14 km for CAPE values larger than 1000 J kg⁻¹. This correlation is noteworthy, as it might potentially be used to estimate the height of the EL from CAPE data, in case this information is missing in the meteorological input data. For example, for the ECMWF reanalyses the EL data from the CAPE calculations are not available from the MARS archive, which is why we applied the MPTRAC meteorological data pre-processing code to obtain this information. In Sect. 3.5, we will discuss sensitivity tests showing how different choices of CAPE₀ impact tracer transport simulations.
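To illustrate how such a logarithmic scaling could be used to estimate the EL height when only CAPE is available, the following Python sketch interpolates linearly in log₁₀(CAPE) between the two mean values quoted above (about 2 km at 10 J kg⁻¹ and about 14 km at 1000 J kg⁻¹). This is a crude, hypothetical fit through two quoted numbers, not a relation derived or validated in this study. It ignores the large variability around the mean, and the function name and clamping outside the range are assumptions.

```python
import math


def estimate_el_height_km(cape, cape_lo=10.0, z_lo=2.0, cape_hi=1000.0, z_hi=14.0):
    """Rough mean EL height [km] from CAPE [J/kg], linear in log10(CAPE).

    The anchor points (cape_lo, z_lo) and (cape_hi, z_hi) are the approximate
    mean values quoted in the text; the estimate is clamped outside this range.
    """
    if cape <= cape_lo:
        return z_lo
    if cape >= cape_hi:
        return z_hi
    fraction = math.log10(cape / cape_lo) / math.log10(cape_hi / cape_lo)
    return z_lo + fraction * (z_hi - z_lo)
```

With these anchor points, the estimate rises by 6 km per decade of CAPE. Any practical use would require fitting the relation to the actual CAPE and EL data.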
In this section, we presented examples of CAPE and CIN data derived from the ERA5 reanalysis, but we also conducted comparisons with the corresponding ERA-Interim data. These comparisons generally revealed good agreement of the convective variables between the reanalyses, which is promising, as similar CAPE and EL input data from ERA5 and ERA-Interim are a prerequisite to yield similar results in ECP transport simulations. If the CAPE and EL data derived from ERA5 and ERA-Interim are similar, the parametrized convective updrafts of the ECP are not expected to largely differ between the reanalyses.
The differences between using ERA5 and ERA-Interim to drive ECP transport simulations are further discussed in Sect. 3.4.

Results

Overall, the features of the zonal PDFs found here are expected and stress the important role of global circulation patterns such as the tropical Hadley cell and mid-latitude storm tracks in affecting the formation and occurrence of convection (Oort and Yienger, 1996; Diaz and Bradley, 2004).

Comparing the statistics of the non-ECP trajectories, we found that ERA5 (Fig. 4a) shows stronger and more frequent updrafts than ERA-Interim (Fig. 4b). This is consistent with earlier work (Hoffmann et al., 2019), demonstrating that ERA5 better resolves convective features due to improved spatiotemporal resolution of the ECMWF forecasting system. For the ERA5 reanalysis, we conducted two additional non-ECP calculations, one in which we downsampled the ERA5 data from full temporal resolution to 6-hourly time intervals and one in which we reduced the spatial resolution with a downsampling factor of 3 × 3, to achieve a temporal or spatial resolution of the ERA5 data which is roughly comparable to ERA-Interim. The analysis of the downsampled ERA5 data suggests that spatial resolution (Fig. 4c) is more relevant than temporal resolution (Fig. 4d) in maintaining the explicitly resolved updrafts, as spatially downsampled ERA5 data show significantly fewer peak updrafts than temporally downsampled data. With spatial downsampling being applied, the ERA5 updraft statistics become rather similar to the lower resolution ERA-Interim data.
Comparing the statistics of non-ECP (Fig. 4a,b) and ECP (Fig. 5a,b) calculations for ERA5 and ERA-Interim, it becomes obvious that the ECP substantially increases the number of convective updrafts in the tropics and at middle latitudes. The occurrence frequency of vertical updrafts in the range of 20 to 60 K per 6 h increases by up to 3 orders of magnitude for ERA5 and up to 5 orders of magnitude for ERA-Interim from the non-ECP to the ECP cases (Fig. 6). Despite the fact that the statistics are calculated from different meteorological input data, the ECP statistics of ERA5 (Fig. 5a) and ERA-Interim (Fig. 5b) are quite similar. This is promising, as we would expect the ECP to yield similar effects, independent of the different input data. Also note that the ECP and non-ECP patterns found here are qualitatively similar to the updraft statistics of Konopka

Figure 5c shows how the updraft statistics change when the threshold is set to CAPE₀ = 1000 J kg⁻¹. This test stresses the important role of the strong convective events in the parametrized convection, as filtering the weak to moderate events with the increased threshold has only a minor impact on the updraft statistics.

In the literature discussing the ECP, there is some ambiguity about whether vertical mixing in the convective columns is restricted to being directed "upward" or not. Restricting the mixing to the upward direction is likely motivated by the fact that convective updrafts occur on smaller, unresolved horizontal scales compared to the compensating effects of larger scale downdrafts and subsidence. We implemented an option in the MPTRAC model to enforce upward mixing, i.e., at each convective step, the vertical displacement due to the mixing can only be positive. In this case, upward mixing is still being weighted by density to fulfill the well-mixed criterion. Figure 5d shows that upward mixing leads to more frequent and even stronger updrafts than the regular ECP method. Figure 6 shows that peak potential temperature changes increase by another 10 K per 6 h. However, note that the upward mixing approach requires tuning and a well-informed choice of the time interval at which parametrized convection is being applied. When the upward mixing is applied at each time step of the model, most air parcels will eventually be lifted close to the equilibrium level. As the ECP with upward mixing will overestimate upward transport without tuning the convective event frequency, this approach was not further assessed in this study.
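The tendency of repeated upward mixing to push air parcels towards the equilibrium level can be illustrated with a short Python sketch. It is a toy model, not the MPTRAC option itself. It assumes that the density-weighted upward displacement can be represented by drawing the new pressure uniformly between the current parcel pressure and the EL pressure, and the function names are hypothetical.

```python
import random


def mix_upward(p, p_el, rng):
    """Draw a new pressure uniformly between the EL (p_el) and the parcel (p)."""
    if p <= p_el:
        return p  # parcel at or above the EL is not affected
    return rng.uniform(p_el, p)


def repeated_upward_mixing(p_start, p_el, n_steps, seed=0):
    """Pressure history of one parcel that is mixed upward at every step."""
    rng = random.Random(seed)
    history = [p_start]
    for _ in range(n_steps):
        history.append(mix_upward(history[-1], p_el, rng))
    return history
```

Under this assumption, the expected pressure distance between the parcel and the EL halves with every application, so a parcel mixed at each model time step approaches the EL within a few steps. This is why the application interval becomes a critical tuning parameter for the upward mixing variant.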

Comparison of e90 artificial tracer ECP and non-ECP simulations
The statistical assessment of the convective updrafts presented in Sect. 3.2 suggests a large potential impact of the ECP on Lagrangian transport simulations for the troposphere for both ERA5 and ERA-Interim. In the following sections, we discuss transport simulations of the artificial tracer e90 to quantify the impact of the ECP on the tracer transport. The artificial tracer e90 is a passive tracer in the upper troposphere and lower stratosphere, which is of particular interest for studies related to the chemical tropopause and stratosphere-troposphere exchange (Prather et al., 2011; Abalos et al., 2017) as well as model validation (Eyring et al., 2013; Orbe et al., 2018). The tracer e90 is emitted uniformly at the surface with a volume mixing ratio of 150 ppbv and has a constant e-folding lifetime of 90 days throughout the atmosphere. With this lifetime, e90 becomes well-mixed quickly in the troposphere. However, the lifetime is much shorter than typical timescales of stratospheric transport.

The tracer e90 exhibits sharp gradients across the tropopause. The 90 ppbv contour surface of e90 is considered as a proxy of the chemical tropopause (Prather et al., 2011). By definition, the artificial tracer e90 has similar characteristics to carbon monoxide, being a 'real' chemical tracer of atmospheric transport in the troposphere. We initialized the e90 tracer simulations with MPTRAC by globally distributing air parcels in the pressure range from the surface up to 0.2 hPa (about 60 km of altitude). In the horizontal, the density of the air parcels was weighted with the cosine of latitude. In the vertical, a uniform random distribution over height was applied. With this approach, near-homogeneous global coverage of the air parcels is achieved. Each air parcel is assigned a volume mixing ratio of the tracer e90, representing the concentration of e90 in an infinitesimally small neighborhood. The e90 concentration in a larger region, e.g., for a zonal mean, is calculated by averaging the volume mixing ratios of the air parcels located in that region. The mean volume mixing ratio might be undefined if no air parcels are located in a given volume. However, in our analysis we found that the air parcels were usually well distributed and no data gaps occurred. A total number of 10⁶ air parcels was considered for the simulations.
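The initialization and averaging described above can be sketched as follows. This is a simplified illustration rather than the MPTRAC code. It assumes a height coordinate with a model top of 60 km, obtains the cosine-of-latitude weighting by drawing the sine of latitude uniformly, and uses hypothetical function names and dictionary keys.

```python
import math
import random


def init_parcels(n, z_top_km=60.0, seed=0):
    """Create n parcels, uniform over the sphere and uniform over height."""
    rng = random.Random(seed)
    parcels = []
    for _ in range(n):
        lon = rng.uniform(-180.0, 180.0)
        # Uniform in sin(lat) gives a parcel density proportional to cos(lat).
        lat = math.degrees(math.asin(rng.uniform(-1.0, 1.0)))
        z = rng.uniform(0.0, z_top_km)
        parcels.append({"lon": lon, "lat": lat, "z": z, "vmr": 0.0})
    return parcels


def box_mean(parcels, lat0, lat1, z0, z1):
    """Mean volume mixing ratio of all parcels in a latitude-height box.

    Returns None if the box contains no parcels (undefined mean).
    """
    values = [a["vmr"] for a in parcels
              if lat0 <= a["lat"] < lat1 and z0 <= a["z"] < z1]
    return sum(values) / len(values) if values else None
```

With this sampling, about half of the parcels fall between 30° S and 30° N, in line with the area fraction of that latitude band. Returning None for empty boxes makes potential data gaps explicit instead of hiding them behind a default value.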
The initial e90 volume mixing ratio of all air parcels was set to zero. During the course of the simulation, the boundary condition module of MPTRAC was used to set the e90 volume mixing ratio in a near-surface layer to 150 ppbv. The boundary condition for e90 was prescribed at each time step of the model. Note that while for an Eulerian model the term "near-surface layer" in the definition of the e90 artificial tracer might be taken as the lowermost vertical level of the model, for a Lagrangian model the depth of the layer needs to be specified. Here, we selected the lowermost 150 hPa with respect to the surface pressure to define the near-surface layer, thereby also following the orography. A sensitivity test on the depth of the near-surface layer is presented in Sect. 3.7. Only above the layer, the volume mixing ratios of the air parcels decay exponentially according to the

For the non-ECP simulations, it is found that e90 concentrations gradually decrease with height from the surface towards the tropopause (Fig. 7). Local maxima of e90 in the middle and upper troposphere are found in the tropics and local minima are found in the subtropics. The 90 ppbv contour of e90 resembles the shape of the dynamical tropopause but underestimates its height by 1–2 km. In general, the zonal mean distributions found here are rather similar to results presented in other studies, for example the climatology of e90 concentrations from a Whole Atmosphere Community Climate Model (WACCM) run of Abalos et al. (2017) or the e90 tracer simulations with the Chemical Lagrangian Model of the Stratosphere (CLaMS) of Konopka et al. (2019, 2022). This indicates that the MPTRAC model yields a reasonable representation of tracer transport in the free troposphere and stratosphere in the present simulation setup. In contrast to the non-ECP simulations, the ECP simulations with MPTRAC led to significantly larger e90 concentrations in the free troposphere (Fig. 8).
For instance, the middle and upper troposphere e90 maxima in the tropics (30° S to 30° N) were increased from 110–120 ppbv to 140–150 ppbv when using the ECP. The 90 ppbv contour of e90 now even more closely resembles the dynamical tropopause, with height differences well below ±1 km. The comparison of the non-ECP and ECP simulation results indicates that unresolved, parametrized convection has a strong impact on tracer transport in the free troposphere, in particular at tropical latitudes, which are governed by frequent and intense convective activity.
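The treatment of the e90 tracer along a trajectory, with a prescribed value in the near-surface layer and exponential decay above it, can be sketched as a per-parcel update. This is a minimal illustration under the assumptions stated in the text (150 ppbv in the lowermost 150 hPa, e-folding lifetime of 90 days). The function name, the argument list, and the exact exponential update over one time step are assumptions and do not reproduce the MPTRAC modules.

```python
import math

E90_SURFACE_PPBV = 150.0     # prescribed mixing ratio in the near-surface layer
E90_LIFETIME_S = 90 * 86400  # e-folding lifetime of 90 days [s]


def e90_update(vmr, p, ps, dt, layer_depth=150.0):
    """Return the e90 volume mixing ratio [ppbv] of a parcel after one time step.

    vmr          current mixing ratio [ppbv]
    p, ps        parcel pressure and local surface pressure [hPa]
    dt           time step [s]
    layer_depth  depth of the near-surface layer [hPa]
    """
    if p >= ps - layer_depth:
        # Inside the near-surface layer: apply the boundary condition.
        return E90_SURFACE_PPBV
    # Above the layer: exponential loss with a fixed e-folding lifetime.
    return vmr * math.exp(-dt / E90_LIFETIME_S)
```

Under these assumptions, a parcel leaving the near-surface layer with 150 ppbv reaches the 90 ppbv value used as tropopause proxy after 90 d × ln(150/90), i.e., about 46 days. This illustrates why faster vertical transport by parametrized convection raises e90 concentrations in the free troposphere.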

Comparison of e90 artificial tracer simulations driven by ERA5 and ERA-Interim data
To assess the influence of the meteorological input data on the Lagrangian transport simulations, we conducted the e90 transport simulations with ERA-Interim instead of ERA5 data. The differences of the e90 monthly mean zonal means of ERA5 minus

For the ECP simulations (Fig. 10), there is generally better agreement between the ERA5 and ERA-Interim simulations than for the non-ECP simulations. This may be attributed to the fact that in the ECP simulations the e90 distributions are largely governed by parametrized updrafts, which exhibit statistically similar distributions between ERA5 and ERA-Interim (Sect. 3.2). The largest e90 differences between the ECP simulations are in the range of ±15 ppbv and are found at the tropopause.
Above the tropical tropopause, e90 from ERA5 is lower than ERA-Interim, indicating slower transport in the tropical pipe in ERA5 than in ERA-Interim. This is consistent with recent studies on the Brewer-Dobson circulation finding that tropical upwelling in ERA5 is up to 40 % weaker than in ERA-Interim, which is mainly due to significantly weaker gravity wave forcing at the equatorward upper flank of the subtropical jet (Diallo et al., 2021; Ploeger et al., 2021). In contrast, ERA5 yields larger e90 concentrations than ERA-Interim at subtropical and middle latitudes, suggesting stronger isentropic mixing between the tropical upper troposphere and the extratropical lowermost stratosphere in ERA5.

In this section, we discuss a sensitivity test on the threshold CAPE₀ used to trigger convective events in the ECP simulations. Figure 11a shows the sensitivity of the 90 ppbv contour of e90 on the choice of CAPE₀. The test was conducted using ERA5 data. We focus the discussion on July 2017, noting that other months show similar results. We tested CAPE threshold values in the range of 100 to 5000 J kg⁻¹. Except for minor differences, the simulation results for thresholds of 100 to 500 J kg⁻¹ are quite similar to the extreme case without any restrictions on CAPE. The 90 ppbv contour of e90 is located near the dynamical tropopause. For CAPE thresholds of 1000 and 2000 J kg⁻¹, restricting the ECP to moderate to strong convective instability, increasing differences in the 90 ppbv contour become visible in the extratropics. The CAPE threshold of 5000 J kg⁻¹ filters all events except for few local cases of extreme instability. For this threshold, the e90 contour matches the non-ECP simulation. Figure 12 shows global maps of the occurrence frequencies of the ECP convective events for different thresholds CAPE₀ for July 2017 ERA5 data. In general, the largest occurrence frequencies (up to 100 %) are found over the tropics, and the frequencies gradually decrease towards middle and high latitudes. The occurrence frequencies decrease notably with increasing CAPE₀, where CAPE₀ = 0 (Fig. 12a) includes all events where CAPE exists, whereas CAPE₀ = 1000 J kg⁻¹ (Fig. 12d)

the Northern Indian Ocean, and the tropical Western Pacific as seen in satellite records (Liu et al., 2007; Spang et al., 2012).

Figure 10. Same as Fig. 9, but for ECP simulations.

Improvement of ECP simulations by considering convective inhibition
While CAPE is a key factor, convective activity is also characterized by various other variables. Here, we suggest a possible improvement of the ECP simulations by considering the convective inhibition (CIN) in addition to CAPE when triggering convective events. CIN can be used to detect cases where layers of warm air yield stability, preventing cooler air parcels from rising in the atmosphere. CIN indicates the amount of energy needed to force air parcels to push through and rise above a stable layer. CIN is typically stronger over land than over ocean and shows the largest means and variability over the subtropics, in particular over Northern Africa and the Arabian Peninsula in boreal summer (Fig. 1). Considering CIN in the ECP potentially has large local effects on the transport simulations in these regions. To take into account the CIN in ECP simulations with MPTRAC, we implemented a control parameter CIN₀, which suppresses convective events if CIN > CIN₀. Figure 11b shows the results of a sensitivity test of the threshold CIN₀ on the e90 tracer transport simulations. We tested

distributions in the free troposphere are mostly governed by the strong convective updrafts, which are generally not filtered and removed by a CIN threshold. The CIN threshold is therefore expected to only locally affect the e90 distributions. Figure 13 shows maps of occurrence frequency differences of the ECP convective events for different thresholds CIN₀ minus unfiltered data (Fig. 12a)

suggesting that transport into the upper troposphere is underestimated compared to the ECP case. This test indicates that significant updrafts are present in the ERA5 reanalysis. However, the updrafts do not extend down to the surface and are therefore not captured in the non-ECP simulations if the selected surface layer is too thin.
In contrast, for the ECP simulations, the sensitivity test did not reveal any significant variations in the e90 concentrations with respect to the surface layer (not shown). In the ECP simulations, the e90 concentrations in the free troposphere are largely governed by the parametrized rather than the explicitly resolved updrafts of the reanalysis data. Following other studies (Gerbig et al., 2003), the ECP was implemented here to influence all parcels in the convective columns down to the surface. In principle, the lower boundary of the ECP could be changed to other levels, for example, the surface pressure could be replaced by the level of free convection. This might be considered physically more realistic, but it would cause other difficulties as the MPTRAC model does not feature any advanced parametrizations for turbulence and mixing in the planetary boundary layer. Changing the lower level of the convective columns in the ECP scheme might therefore cause similar issues as seen in the non-ECP tests, preventing air parcels from being captured by convective updrafts if the chosen surface layer is too thin. For this reason, we follow the original ECP approach and apply convective mixing down to the surface.

Conclusions
In this study, we assessed the impact of the ECP on Lagrangian transport simulations for the free troposphere and the lower stratosphere. The ECP is conceptually simple and computationally fast. It requires only two input variables from the meteorological input data, CAPE and EL. If CAPE exceeds a given threshold CAPE₀, a parametrized convective event is triggered, i.e., air parcels are vertically mixed in the convective column from the surface to the EL. If there is already an explicitly resolved convective updraft present in the meteorological input data at the same location, this would do no harm, as the

nor ECMWF are responsible for any use that may be made of the Copernicus information or data in this publication. We acknowledge the Jülich Supercomputing Centre for providing computing time and storage resources on the supercomputer JUWELS. We acknowledge Gebhard Günther and Olaf Stein for provisioning the ECMWF ERA5 and ERA-Interim data in Jülich. We thank our colleagues at the Institute of Energy and Climate Research and at the Jülich Supercomputing Centre for providing helpful feedback and suggestions on this study.