Convective sources of trajectories traversing the tropical tropopause layer

. Transit properties across the tropical tropopause layer are studied using extensive forward and backward Lagrangian diabatic trajectories between cloud tops and the ref-erence surface 380 K. After dividing the tropical domain into 11 subregions according to the distribution of land and convection, we estimate the contribution of each region to the upward mass ﬂux across the 380 K surface and to the vertical distribution of convective sources and transit times over the period 2005–2008. The good agreement between forward and backward statistics is the basis of the results presented here. It is found that about 85 % of the tropical parcels at 380 K originate from convective sources throughout the year. From November to April, the sources are dominated by the warm pool which accounts for up to 70 % of the upward ﬂux. During boreal summer, the Asian monsoon region is the largest contributor with similar contributions from the maritime and continental parts of the region; however, the vertical distributions and transit times associated with these two subregions are very different. Convective sources are gener-ally higher over the continental part of the Asian monsoon region, with shorter transit times. We estimate the monthly averaged upward mass ﬂux on the 380 K surface and show that the contribution from convective outﬂow accounts for


Introduction
The tropical tropopause layer (TTL) is a key region in the atmosphere that controls the transport of tropospheric air into the stratosphere (Highwood and Hoskins, 1998;Fueglistaler et al., 2009). Situated above the level of main convective outflow (Corti et al., 2006;Fueglistaler et al., 2009), the TTL is penetrated by deep convection which becomes increasingly rare with altitude (Liu and Zipser, 2005;Fu et al., 2007). Outside of convective towers, the vertical motion is weak. The TTL region encompasses the level of zero radiative heating (LZRH) which marks the transition from negative to positive radiative heating values, thus creating a barrier for the largescale transport of air parcels into the stratosphere (Folkins et al., 1999). Quantifying the transport paths across the TTL and their spatial and temporal variability is important for a better understanding of the chemical composition of the air entering the stratosphere and the relation with source regions (Holton et al., 1995;Fueglistaler et al., 2004;Fu et al., 2006;James et al., 2008;Aschmann et al., 2009;Tzella and Legras, 2011;Bergman et al., 2012Bergman et al., , 2013Bergman et al., , 2015Chen et al., 2012;Heath and Fuelberg, 2014;Orbe et al., 2015;Vogel et al., 2015).
In this work, we focus on how parcels detrained from the convective clouds are transported across the TTL and reach the tropopause, defined as the 380 K surface, upon the com- Although the distribution of tropical convection is well known, the location and intensity of convective sources in the TTL are still debated. The questions addressed in this work are whether the intensity is linked only to the altitude of cloud tops (Gettelman et al., 2002(Gettelman et al., , 2009Devasthale and Fueglistaler, 2010), what the role of cloud heating is in favoring the crossing of the LZRH (Corti et al., 2006;Tzella and Legras, 2011), what the role of horizontal transport and inmixing from extratropical latitudes is (Ploeger et al., 2012), and what the special role of continental convection is during the Asian monsoon in particular above the Tibetan Plateau Devasthale and Fueglistaler, 2010;Bergman et al., 2013;Heath and Fuelberg, 2014;Vogel et al., 2015).
We address the whole range of these questions by performing for the first time a comprehensive set of forward and backward trajectories in the TTL between convective sources and the tropopause. We do not account for the transport from the boundary layer to convective tops, which is assumed to be fast and local.
In Sect. 2, we describe the data and the methods used in this work to retrieve trajectories connecting the clouds and the 380 K potential temperature surface. Section 3 discusses the distribution of convective sources and transit times in the TTL. Section 4 discusses the sensitivity of our results to uncertainties in the data and methods. Section 5 takes a further step by calculating the mass flux across the 380 K surface and the contributions of convective sources. Section 6 offers a summary and outlook.

Lagrangian trajectories and convective sources
Lagrangian trajectories of air parcels are calculated within the TTL between the time of detrainment from convective sources and the crossing of the 380 K potential temperature surface, taken as the lower boundary of the stratospheric overworld (Holton et al., 1995).

Determination of the altitude of deep convective clouds
A prerequisite of this study is a characterization of cloud tops that is global in both space and time. We use the CLAUS data set (Hodges et al., 2000), which provides global 3-hourly maps of brightness temperature at 30 km resolution, combined with ERA-Interim data (Dee et al., 2011) to determine the pressure of the top of the convective clouds. Since we are only interested by air parcels which are able to reach the LZRH and above, we consider only cold pixels with brightness temperatures less than 230 K. In the deep tropics between 15 • S and 15 • N, this temperature corresponds to a pressure of about 240 hPa, which is below the main detrainment level near 200 hPa and well below the all-sky LZRH, which is usually located above 155 hPa. As the mean detrainment levels and the LZRH are even higher over continents and in particular over Asia during monsoon season, this is a conservative choice. This method is, however, limited by the inability to distinguish overlaying cirrus from convective tops and is known to underestimate the altitude of the deep convective clouds by about 1 km (Sherwood et al., 2004;Minnis et al., 2008). We correct the altitude provided by CLAUS by an upward shift of 1 km. This approximation is consistent with more recent comparisons of cloud top height determined from active sounders (Kwon et al., 2010;Hamann et al., 2014). The sensitivity to this correction is described in Sect. 4.1. Another limitation is that the brightness temperature can be colder than the environment and is sometimes colder than the cold point tropopause. This is possible under fast adiabatic cooling within active convective towers (Adler and Mack, 1986;Luo et al., 2008) and is found to occur for less than 2 % of the cold pixels for each month in the 20 • S-40 • N band. In such cases, we follow Sherwood et al. (2004) and consider that the parcels rise adiabatically from an altitude of about 40 hPa below the cold point tropopause.

Three-dimensional Lagrangian trajectories
We compute forward and backward diabatic threedimensional trajectories in the TTL using TRACZILLA, which is a modified version of FLEXPART (Stohl et al., 2005;Pisso and Legras, 2008). The horizontal part of the motion is calculated using the 3-hourly wind fields of ERA-Interim (Dee et al., 2011), combining analyses at 00:00, 06:00, 12:00 and 18:00 UTC, 3 h forecasts at 03:00 and 15:00 UTC and 9 h forecasts at 09:00 and 21:00 UTC. The vertical (cross-isentropic) displacement is calculated using the 3-hourly average all-sky radiative heating rates of ERA-Interim (including the radiative effect of the clouds but excluding the latent heating) which are archived at 01:30,04:30,07:30,10:30,13:30,16:30,19:30 and 22:30 UTC. The forward calculations are meant to estimate the impact and efficiency of convective sources while the backward calculations estimate the contribution of sources to the air composition at the 380 K surface. The two calculations provide, however, very consistent results as shown below. The calculations are performed over the whole period 2005-2008 and are discussed below in terms of monthly statistics.

Backward simulation
In the backward calculation, parcels are launched from the 380 K potential temperature surface, every 2 days, between 40 • S and 40 • N, on a regular grid of 0.5 • in latitude and longitude. We retain the first encounter of a parcel with the top of a cold cloud within the previous 3 months as in Tzella and Legras (2011), except our criterion is based on pressure rather than temperature. A parcel encounters a cloud when its pressure is larger than the pressure of the corrected top as described above. The comparison is performed 3-hourly along the trajectory of the parcel with the CLAUS pixel containing the parcel at that time. It is clear that we may miss some cloud encounters in this way because the parcels can sometimes travel by 300 km or more over 3 h, that is 10 CLAUS pixel sizes, under strong wind conditions. The sensitivity to such effect is tested in Sect. 4.2

Forward simulation
In the forward calculation, parcels are launched from 3hourly CLAUS maps, with one parcel at the center of each cold pixel (brightness temperature ≤ 230 K) at the corrected altitude of the cloud top, for locations between 20 • S and 40 • N. Trajectories are integrated for 3 months and we retain the first crossing of the 380 K surface between 40 • S and 40 • N when it occurs. Parcels encountering other clouds along their path, i.e., those which are found within cloudy pixels with cloud tops higher than their altitude, are discarded. On the average, between 16 % (January) and 24 % (August) of the launched parcels are eliminated for this reason. Such parcels are mostly on a descending path and keeping them instead of discarding them only marginally affects our results for parcels crossing the 380 K surface.

Source distribution
We focus our study on different geographical regions as defined by the color boxes on Fig. 1. The boundaries are chosen to highlight the regions where convection is intense in the tropics and subtropics and to separate land and oceanic contributions. The South Asian Pacific (SAP) region corresponds to the warm pool. The Asian monsoon region is divided into the continental Asian mainland (AML), the North Asian Pacific Ocean (NAPO), which includes the Bay of Bengal, the Sea of China and the Philippine Sea, and the Tibetan Plateau (Tibet), defined as the region in Asia above 3500 m. America is divided into South America (SAm) and Central America (CAm). There is a single region for Africa (Af).

Annual cycle
The black curve in Fig. 2a shows that on average 86 % of the backward parcels reach a cloud top within 3 months. The time axis for this curve is the launch time. The parcels which do not reach a cloud are to a vast majority initialized in the subtropics and ascend backward in the deep Brewer-Dobson circulation of the extratropics. The proportion of parcels reaching a cloud varies very little over the mean annual cycle, with a maximum in April (88.7 %) and a minimum in July (84.7 %). The other curves in the same panel show the contributions of each region among the trajectories reaching a cloud within the ensemble of all regions in Fig. 1. In order to facilitate the comparison with forward calculations, the time axis for these curves is that of the intersection of each parcel with a cloud, grouped by months.
The main feature between November and April is the dominance of the SAP region, which alone accounts for a maximum of 68.4 % of all sources in January. The next boreal winter contributors are Af and SAm with a maximum contribution of 19.4 % for Africa in April.
From June to September, NAPO is the leading source with a maximum share of 35.3 % in July followed by AML, which peaks at 19.8 % in July. Together these two regions represent from 45 to 55 % of all sources from June to September. The following contributors are SAP, CAm, North Central Pacific (NCP) and Af. Tibet is a tiny overall contributor (with a maximum share of 2.5 % in July).
The forward estimate of the source distribution (see Fig. 2b) is calculated from the trajectories launched at cloud top level which have reached the 380 K surface. The quantity shown is the monthly ratio of the number of parcels from a given region to the total number originating from the ensemble of regions in Fig. 1. The time associated with each trajectory is that of its launch, grouped into monthly bins. There is a striking agreement between the forward and backward distribution of sources in Fig. 2a and b. In spite of some slight quantitative changes in the proportions, the general pattern and the ordering of sources is almost identical throughout the whole year. The largest change is for Tibet, which displays a forward contribution of 7.6 % in July and August (3 times its backward contribution) but still remains a minor overall contributor.
The similarity is not totally unexpected as forward and backward calculations are solving dual equations for the Green function of the advection-diffusion equation (Holzer and Hall, 2000;Legras et al., 2005). Here diffusion has no role because we consider averages over regions and durations much larger than the diffusion scale and diffusion time defined in Legras et al. (2005). It is, however, surprising that sampling is made at a higher resolution than the ERA-Interim winds but the transit time between the cloud top and the 380 K surface is long enough (as shown below) to make the trajectories numerically irreversible due to the chaotic aspects of transport. Our results show that the sampling is good enough to reestablish the reversibility in the statistical sense as predicted by the Green function formalism for a continuous sampling (Holzer and Hall, 2000). Figure 2c shows the proportion of forward trajectories released in a given region from high convective tops that reach the 380 K surface within 3 months. As most of the other trajectories have returned back in the lower troposphere, this proportion conveys the efficiency of each region at converting convective air into stratospheric air. Tibet displays a singular behavior by reaching an efficiency of 84.5 % in July, which indicates that this proportion of air detrained at the top of clouds reaches the 380 K surface. During boreal summer, after Tibet, AML conveys up to 53.5 % of parcels to the stratosphere, while NAPO is less efficient at about 32.5 % but still the largest contributor due to its size and the frequency of high convective clouds within its domain.
The other regions exhibit efficiencies lower than 30 % with SAP lying just above SAm, CAm and Af. However, when the Asian land convection ramps down, the efficiency of SAP combines with the intensity of convection in this region to let it dominate the transfers over half of the year.
Atmos. Chem. Phys., 16, 3383-3398, 2016 www.atmos-chem-phys.net/16/3383/2016/ The high efficiency above Tibet is consistent with previous studies that found a confinement of Tibetan air within the monsoon anticyclone (Bergman et al., 2013;Heath and Fuelberg, 2014) which persists above Asia during boreal summer, trapping tracer compounds that recirculate inside (Park et al., 2007. The contribution to the inside of the summer Asian monsoon anticyclone (AMA) has been further estimated by calculating sources over the restricted portion of the 380 K surface confined between 20 and 40 • N, 25 and 125 • E, and with potential vorticity smaller than 4 × 10 6 m 2 s −1 kg −1 K. It is found that 87 % of the AMA parcels originate from Asia in the forward calculation (see Fig. 2b) among which 19.5 % from the Tibetan Plateau. In the backward calculations (see Fig. 2a) the proportions are redistributed among Asian continental convection between AML and Tibet (47.5 and 19.5 % in forward and 54.3 and 8.8 % in backward, respectively, the sum varying only from 67 to 63.1 %). The NAPO contribution is unchanged.

Vertical distribution of sources
We investigate now the vertical distribution of sources within each region and its relation to the LZRH. During boreal winter (see Fig. 3a and b) forward and backward calculations predict that the distribution of cloud top sources in the dominating SAP region peaks at 355 K. This is located above the all-sky LZRH at 352.6 K during that season. A proportion of 89.5 % of the sources is then located above the LZRH in the forward calculation and 76.3 % in the backward calculation. The other contributing regions (Af, SAm, NAPO and NCP) also all exhibit a modal peak at 355 K in the forward distribution. The backward distribution exhibits a shift towards small potential temperature which also affects the mean and the median of the distribution (see Table 1).
As in Tzella and Legras (2011), a more complex pattern emerges during boreal summer with several competing regions (see Fig. 3c and d). Both forward and backward calculations produce similar distributions of sources with the main differences being a stronger contribution of Asian land regions (AML and Tibet) in the forward distribution. SAP, CAm and NCP regions are grouped in the forward calculation with a modal peak at 353 K which is below the boreal winter peak at 355 K in SAP. A similar pattern is found in the backward calculation. The AML and Tibetan modal peaks are above 360 K, reaching 367 K for Tibet. The African modal peak is also near 360 K with a fairly flat distribution, and NAPO modal peak is intermediate near the common boreal winter peak at 355 K. The backward shift in the sources towards small potential temperature is smaller than during boreal winter in SAP. Most of the sources are again located above the LZRH, up to 90 % for the forward trajectories from NAPO and CAm (see Table 1). The only exception is Tibet, for which the LZRH, much higher than in other regions at 367.3 K, is located above the modal peak and just at the median in forward calculations. Nevertheless, Tibet was found as the region with highest efficiency during boreal summer, which might be explained by the very high level of the LZRH and sources in this region. The separation of NAPO and Asian land sources explains the double peak pattern shown in Fig. 8 of Tzella and Legras (2011).

Transit time
The differences between forward and backward calculations are mostly seen in the transit time distribution shown in Fig. 4. The peak of the distribution is always shifted to smaller values in the forward calculations and the tail is also decaying much faster. As a result, the ratio backward/forward for the median and the mean is of the order of 1.5 in SAP during boreal winter and in AML and NAPO during boreal summer. It is lower but always larger than 1.1 in the other regions with the exception of Tibet with a near 2 factor (see Table 2). In the forward calculation, the mean transit time over contributing oceanic regions (SAP, NCP and CAm) is of the order of 30 days during boreal summer, which is larger than the boreal winter value of 23 days in SAP. On the contrary, Asian land regions during boreal summer exhibit shorter transit times (21 days for AML and 15 days for Tibet) than their boreal winter counterparts (Af and SAm) near 28 days. Transit times from Af show very little change between boreal winter and boreal summer. The backward distribution is biased by the fact that sampled backward trajectories can easily miss a cloud by passing a pixel away and then wander away in another region. The ef- Table 1. Characteristic numbers of the vertical distribution of sources for the contributing regions during boreal winter (DJF) and boreal summer (JJA). All the quantities but the last column are potential temperatures in units of K. The modal peak is based on the discretized mean histogram shown in Fig. 3. The mean, median and standard deviation are calculated from a cubic spline interpolation over the 340-380 K interval. For each region and each season the upper line refers to backward calculations and the lower line to forward calculations as indicated in the modal peak column. Regions with low contribution have been masked.  fect is the largest for small regions like Tibet or regions ventilated by intense large-scale circulation such as Asia during boreal summer. The backward trajectories can also, at least in principle, get trapped into unstable trajectories oscillating about the LZRH from which trajectories diverge in forward time and to which they therefore converge in backward time, although this has seldom been observed. As it appears, estimates of transit times are more sensitive to sampling effects than the source distribution.

Sensitivity studies
In this section, we study the sensitivity of the results presented in Sect. 3 to changes in data and the design of our calculations.

Sensitivity to the cloud top offset
As the estimate of cloud top and the +1 km correction are subject to uncertainty, we have redone the analysis without the +1 km correction. Since the LZRH is the same, the direct effect is to reduce the proportion of forward trajectories  reaching the 380 K surface (see Table 3). The ratio is about 45 % for both boreal summer and boreal winter except for continental Asia during boreal summer, when it is smaller. Tibet is the most sensitive region with a ratio of 22 %. Figure 5a and c show the change in the vertical distribution of sources for forward calculations. Besides the overall reduction, it is visible that the modal peaks are unmoved except for NAPO and Tibet during boreal summer when they move, respectively, from 358 to 355 K and from 367 to 362 K. The vertical distribution of sources is made narrower by reducing the tail of the distribution towards the upper end of the interval as the highest clouds are shifted down.
The proportion of backward trajectories reaching a cloud within 3 months is now 85 % during DJF and 82.7 % during JJA, which is less than but close to the value when the offset is applied (87 and 85.2 %, respectively) and hence with much less variations than the forward efficiency. The modal peaks are slightly shifted to lower values (with larger shift for NAPO and Tibet) and the narrowing is well pronounced with almost no sources within the range [365, 370 K] except over continental Asia during boreal summer.
The lower end cutoff value of sources (at about 345 K for maritime convection and 350 K for AML) is preserved in both forward and backward calculations except for Tibet during boreal summer. This indicates that outside Tibet the cutoff is determined by the transport properties across the LZRH which are unchanged by offsetting the cloud top or changing the threshold brightness temperature, providing it is warm enough. Over Tibet, where the modal source peak is below the LZRH and where a 1 km displacement is a larger jump in θ than at other location, the lower cutoff is shifted downward and the number of sources is, proportionally, more reduced than in the other regions.
In accordance with these limited changes in the sources, the transit time distribution for all regions but Tibet is weakly affected for both forward and backward calculation (see Fig. 6). The change is localized in the small time contribution in agreement with the narrowing of the source distribution, which reduces the proportion of short transit paths. Tibet is an exception with a shift by about a factor of 2 of the forward and backward transit times towards larger values. Tibet differs from the other regions by having a very high LZRH and a distribution of sources laying mainly under this level. This effect is amplified when the cloud top correction is canceled with almost no contribution left above the LZRH, inducing a significant shift in the transit time distribution.

Sensitivity to increase of the size of cloud pixels
In this section we address the sensitivity to the density of cloud observation and the effect of missing cloud encounters in backward calculations. We modify the encounter criterion by enlarging the pixel size by a factor of 3 in both latitude and longitude, retaining the smallest top pressure among the nine CLAUS pixels surrounding the parcel at a given time. This modification enlarges high clouds and has largest effect in regions where convective systems are small and sparse. Figure 7 (first and second rows) compares the distributions of backward sources in boreal winter and boreal summer 2005 with and without enlarging the pixel size. The total number of trajectories meeting a cloud is again quite insensitive, increasing by 3 % during boreal winter and 1.5 % during boreal summer. The distribution of sources is, however, modified by shifting the distribution to higher potential temperatures and widening the profile. Hence, the effect is qualitatively opposite to the lowering of the top of clouds done in Sect. 4.1.

Sensitivity to the daily cycle of the heating rates
Cloud radiative forcing and the resulting heating rates are a priori sensitive to the daily cycle of convective activity in the tropics. We test here the sensitivity to the daily cycle of heating rates by replacing the 3-hourly sampling by a time moving average. This average at time t is performed as a discretization of where τ is 30 days. Figure 7 (third row) shows that the distributions of sources are only weakly affected even at levels below the LZRH. This result is in agreement with Bergman et al. (2012), who did a similar test with MERRA reanalysis. However, the maps (not shown) of the smoothed heating rates still contain a large amount of spatial variability. This suggests that the horizontal motion that samples this variability is more important than daily fluctuations of the heating rates to cross the LZRH.

Sensitivity to the reanalysis
One main source of uncertainty is the error in the reanalysis wind and heating rates. It has been shown that the heating rates differ quite significantly among reanalyses (Wright and Fueglistaler, 2013) and that the horizontal wind may contain large errors in tropical regions poorly covered by radiosoundings (Podglajen et al., 2014). It is therefore important to assess how our results are sensitive to a change of the reanalysis. As an extensive comparison among all available reanalyses at the time of this writing would have consumed a lot of resources, we limit the comparison to two reanalyses, JRA-55 (Kobayashi et al., 2015) and MERRA (Rienecker et al., 2011). JRA-55 has higher horizontal resolution than the ERA-Interim (spherical T319 truncature instead of T255) and the same number of levels. Winds and heating rates are available every 6 h at model resolution. MERRA has about the same horizontal resolution as ERA-Interim but, because the heating rates are only available in this format, we use , when the CLAUS pixel size is enlarged by a factor of 3 as described in the text (middle two panels) and when the 3-hourly sampling of daily cycle of heating rates is replaced by a time moving average as described in the text (lower two panels).
winds and heating rates on a 1.25 • horizontal grid and a reduced set of vertical pressure levels every 3 h. Backward calculations have been performed for 2005 using the same setup as for ERA-Interim. Figure 8 shows that in the three cases, SAP dominates during boreal winter and NAPO is the largest contributor during boreal summer. There are, however, significant differences. The relative contributions of NAPO is largest in JRA-55 and the distributions are narrower in MERRA. The main difference is in the vertical location. JRA-55 sources are slightly shifted upward by 2 to 3 K with respect to ERA-Interim. MERRA sources are even more shifted by up to 12 K for oceanic sources (SAP, NAPO, CAm) during boreal summer. The shift is smaller for SAP in boreal winter (+5 K) and for AML and Tibet in boreal summer (+4 K). The total proportion of backward trajectories meeting a cloud remains, however, close to that of the ERA-Interim (80.7 % for JRA-55 and 74.4 % against 85.4 % for ERA-Interim in 2005).
In order to interpret these results, Fig. 9 compares the mean profiles of all sky heating rates among the reanalyses for each region in January and July. In addition to the three reanalyses, we show also the curve for MERRA2 (Molod  , 2015). It is clear that all the curves are always close within non-convective regions (AML, Cam, NCP, SEP and Tibet during boreal winter; SAm and SEP during boreal summer) where heating rates are calculated from clear sky radiative transfer. In these regions, however, ERA-Interim often displays larger heating rates than the two others above 370 K without affecting the LZRH. Over convective regions, where additional cooling or heating is provided by clouds, there is a clear separation between JRA-55/ERA-Interim and MERRA/MERRA2. ERA-Interim still displays larger heating rates than JRA-55 and this shifts down its LZRH by a few K. In the Asian region during boreal summer and in SAP during boreal winter, ERA-Interim cools less than JRA-55 below 340 K but this does not affect the LZRH. This is consistent with the shift observed in the source distribution.
MERRA/MERRA2 exhibit a very special pattern over convective regions with reduced heating near 355 K with respect to the two other reanalyses and a strong heating near 345 K, resulting in a characteristic "S" pattern. As a result, the LZRH is pushed upward and multiple LZRH occur over NAPO and CAm during boreal summer. In all cases, MERRA2 is very close to MERRA except over Tibet during boreal summer, where MERRA does not differ very much from ERA-Interim and JRA-55 while MERRA2 does quite unexpectedly.

Mass flux across the 380 K surface and regional distribution
In this section, we take a further step by determining the mass flux across the 380 K surface and the contribution of each convective region.

Method and validation
The instantaneous diabatic mass flux M across the 380 K surface, over a specific domain of the sphere, can be estimated from all sky radiative heating rates as is, respectively, positive and negative. It is known that upward and backward fluxes are ill-defined when the time interval goes to 0 (Hall and Holzer, 2003). This is no longer the case after time smoothing which removes the noise, but the flux then may depend on the applied time smoothing interval.
Another, more traditional, method is based on the residual mean meridional circulation v * , w * , with as defined by Andrews et al. (1987) in log pressure coordinates z = H log(p 0 /p), where the overbar indicates a zonal average. The kinematic mass flux is then given by (Appenzeller et al., 1996) where p 380 K is the pressure at the 380 K surface, a = 6371 km and H = 7 km. Monthly upward and downward Gridded summation is applied in the same way as for M diab . Figure 10 compares the monthly average upward fluxes calculated from Eq. (2) and Eq. (5). Here the diabatic flux is calculated from 3-hourly data and then averaged for each month over the period 2005-2008. The upward flux is calculated as indicated above for each month and then averaged over the 4 years. The kinematic mass flux is calculated in the same way from monthly averages of the residual circulation and pressure at 380 K. The contribution of the pressure variation term in Eq. (5) is then 2 orders of magnitude smaller than that of the residual velocity and can be neglected.
The two estimates display a similar modulation with a minimum during boreal summer and agree with other estimates from ERA-Interim (Abalos et al., 2012). There is, however, a shift between M ↑ diab and M ↑ kine which increases with the size of the latitude band. The mean difference is close to 25 % in the three latitude bands. It is notable that the upward flux and the total flux are identical in the 20 • S-20 • N band because monthly mean diabatic heating rates on the 380 K surface within this domain are positive throughout the year. It is also notable that there is little change in the upward flux as the domain is expanded to 30 • S-30 • N and 40 • S-40 • N because most of the upward motion occurs within the 20 • S-20 • N band.
The discrepancy between diabatic and kinematic fluxes can be reduced by correcting the diabatic mass flux to satisfy global mass conservation. There is no physical mean of ensuring such mass conservation when calculating heating rates (Shine, 1989) from radiative transfer. Actually, the mean total mass flux across the 380 K provided by Eq. (2) and ERA-Interim data is 7.9 × 10 9 kg s −1 over 2005-2008. Applying a uniform compensating correction to the heating rates over the sphere defines a new mass flux M ↑ diab,corr which reduces the discrepancy with M ↑ kine for all latitudes and all times. However, as this uniform correction is entirely ad hoc in the absence of any information about the spatial and temporal distribution of the errors, we refrain from applying it in the subsequent analysis.
The definition used for the diabatic flux is consistent with what follows. However, a more appropriate definition to compare with the kinematic fluxes would be to take a zonal average before applying the sign criterion defining the upward flux. The resulting upward flux (not shown) is indeed larger than M ↑ diab but the difference in the 40 • S-40 • N band is of the order of 1 to 2 %. This shift is negligible and cannot explain the shift between kinematic and diabatic fluxes.
Notice that the total mass flux across an isentropic surface does not need to vanish instantaneously unlike across an isobaric surface under the hydrostatic approximation. However, the small part of the pressure variation term in the monthly flux indicates that the mass balance must be satisfied over monthly averages in the same proportion, i.e., within about 1 %. This is in agreement with the stratospheric overworld mass variations shown by Appenzeller et al. (1996).
Having compared the diabatic mass flux from heating rates to the kinematic flux, we now check that the diabatic mass flux can also be retrieved from the Lagrangian trajectories.
Here the mass flux is calculated from the displacement of backward Lagrangian trajectories launched at 380 K. The heating rate is estimated as θ/ t, where θ is the variation of the potential temperature along the trajectory during an interval t after the launch. Here the interval t is al-Atmos. Chem. Phys., 16, 3383-3398 ways a multiple of 24 h to ensure that the averages are taken over an integer number of daily cycles. The density σ is calculated from the ERA-Interim at the location and time of the launch. Actually, due to combined horizontal and vertical motion, and the inclination of potential temperature surfaces with respect to isobars, pressure can temporarily decrease for a descending parcel with increasing potential temperature. As a general rule, pressure undergoes much larger fluctuations than potential temperature along a Lagrangian trajectory. The mass flux over a domain can then be calculated as a sum over all parcels belonging to this domain: where δs i is the surface of the 0.5 × 0.5 • element associated with the parcel. The monthly upward flux M ↑ back (φ) is calculated by averaging over each month the individual parcel flux at each location over the grid, selecting all grid points where the monthly flux is positive and then averaging in longitude.

Regional distribution of the upward mass flux
Based on these premises we calculate how the monthly upward flux is distributed among regions in the following way. First all the grid points on the 380 K surface within the domain 40 • S-40 • N where the monthly average flux is positive are selected for each month. The contributions of parcels originating from each source region are then quantified. Then the mass flux of each region is calculated as the sum of mass fluxes at the selected grid points from which backward trajectories link to this region as the source. The sum of all these regional contributions constitutes the convective upward flux at the 380 K surface M ↑ conv . In this procedure, a trajectory is labeled with the time of its launch on the 380 K surface (unlike the calculations leading to Fig. 2 where the trajectories are labeled according to their arrival or departure from clouds).  Figure 11 shows that M ↑ conv basically explains the seasonal variations of M ↑ back . The difference between these two estimates has a mean of 4.8×10 9 kg s −1 and a standard deviation of 9×10 8 kg s −1 in the 40 • S-40 • N band. As a result, the ratio M ↑ conv /M ↑ back varies between 78.7 % in boreal winter and 80.5 % in boreal summer (80 % on the average). The nonconvective contribution is mostly accounted for by parcels which are mixed into the TTL from the extratropics (Ploeger et al., 2012).
The mean annual cycle is shown in Fig. 12. For each region, it exhibits a maximum and a minimum which lag by about 1 month with respect to the corresponding source curve Fig. 2a, where the time axis is that of the intersection with convection. This is particularly clear for SAP with a maximum in February and a minimum in September that determines those of the total convective flux. The delay is consistent with the distribution of transit time among the main sources (see Table 2). It is only for AML that the lag is not clear, but this is also a region with short transit times. The larger boreal winter to boreal summer modulation of M ↑ conv than the total source curve in Fig. 2a suggests that this modulation is mostly due to variations of transport properties within the TTL rather than to a modulation of the properties of convective sources. Figure 12 shows the same hierarchy among source regions as Fig. 2a with enhanced domination of the SAP contribution over the year which accounts for 39 % of the total M ↑ conv flux while NAPO accounts for 18 % (see Table 4). If one adds NCP, CAm which is mostly oceanic, and the small contribution from the Atlantic, Indian Ocean and South East Pacific, we see that the contribution from oceanic regions (which include some large islands) is 74.5 %, a proportion in line with the fractional surface area of the oceans in the region 20 • S-20 • N.
We stress that the flux M ↑ conv is the mass flux crossing the 380 K surface that originates from the region of convective outflow within the TTL but is not necessarily purely made of air processed by convection. This air mass contains convective air detrained from the clouds in the vicinity but is also mixed with environmental air which may have been transported from distant and earlier convective sources and partly originates from in-mixing or extratropical lower stratosphere transported into the tropics (Ploeger et al., 2012). Therefore M ↑ conv must be taken as an upper bound of the flux of convectively processed air.

Summary and outlook
We have shown that a consistent vertical distribution of convective sources of stratospheric air over the tropical regions is obtained from backward and forward diabatic trajectories in the TTL.
The seasonal cycle of sources is binary with a domination of the single SAP from November to April and a more complicated pattern dominated by the regions of the Asian monsoon from June to September.
Atmos. Chem. Phys., 16, 3383-3398, 2016 www.atmos-chem-phys.net/16/3383/2016/ The distribution of sources among regions is qualitatively robust to uncertainties in the method and the data but the quantitative distribution is somewhat sensitive to the representation of cloud tops and the reanalysis used to drive the trajectories. Generally, increasing the weight of highest clouds shifts the distribution of sources towards higher altitude and wider vertical dispersion, but it does not change the proportion of backward trajectories meeting a cloud.
There is a pronounced seasonal cycle of the monthly average upward mass flux across the 380 K surface, with a maximum in February and a minimum in September, which is shifted by about month to the seasonal cycle of sources, due to the mean time of transit of parcels across the TTL.
The forward transit times average 1 month with a significant standard deviation of about 15 days. Transit times are shorter from convection over continental Asia during boreal summer (3 weeks from AML and 2 weeks for Tibet). The backward transit times are longer. The discrepancy between forward and backward transit times is most pronounced for transport from convection over Tibet.
One of the main motivations of this study was to study how air parcels detrained from clouds find their way across the LZRH. Our results, however, show that the sources are mostly (80 %) located slightly above the LZRH, but not by much. This implies that only the small percentage of convective events penetrating high enough in the TTL is relevant as stratospheric source. The proximity of sources to the LZRH provides evidence that the vertical flow separation at this level is an important factor in determining the distribution of the sources. This will be further demonstrated in a companion paper.
The high contribution of the tallest clouds raise a concern about the lack of representation of small-scale convective features in the CLAUS data set and in the ERA-Interim heating rates. In particular, CLAUS captures very well the large anvils that form at the top of convective systems but misses the short and elusive overshooting events over these anvils. The velocities and radiative heating rates from reanalyses also miss the cross-isentropic mixing provided by such events or represent them with very crude parameterizations.
The comparison between three modern reanalyses show that the heating rates can differ substantially in the TTL (see Wright and Fueglistaler, 2013) with significant consequences on its distributions of sources. While JRA-55 remains fairly close to the ERA-Interim, MERRA shifts the sources by up to 12 K in the vertical location in potential temperature space. The discrepancy is even larger with the calculations of Bergman et al. (2012Bergman et al. ( , 2015 who uses heating rates based on the observed distribution of clouds (Yang et al., 2010). The main differences are during boreal summer season. Bergman et al. (2012) find a proportion of backward trajectories reaching clouds falling to 15 % in boreal summer from 60 % in boreal winter. We find instead a maximum a maximum in April (88.7 %) and a minimum in July (84.7 %) with ERA-Interim. The second important difference is in the fact that continen-tal sources in Asia prevail over oceanic sources during boreal summer in Bergman et al. (2012Bergman et al. ( , 2015. Yang et al. (2010) perform radiative calculations using clouds retrieved from the CALIPSO based on their own processing of the L1 data combined with monthly averaged IS-CCP data for opaque clouds. More recently, Johansson et al. (2015) have studied the radiative effect of clouds in the TTL in the Asian monsoon region using a combined flux and heating rate product of the CLOUDSAT and CALIPSO missions (2B-FLXHR-LIDAR). Both studies conclude that the average cloud radiative heating is positive in the TTL. This agrees, qualitatively, with the calculations of ERA-Interim but contradicts MERRA, which finds a net cooling (see Supplement Figs. S1 and S2 for ERA-Interim, Figs. S3 and S4 for MERRA). However, Johansson et al. (2015) find that the heating is reinforced over active convective regions of the Asian monsoon while Yang et al. (2010) find cooling in these regions above 15 km, in particular over the Bay of Bengal. Such differences might explain the discrepancies between our finding and those of Bergman et al. (2012). The net cloud radiative heating in the TTL is the result of the opposite tendencies of convection, which is mainly cooling, and thick and thin cirrus clouds which are mainly warming.
It is actually very difficult to compare the calculated heating rates with observations. However, the integrated cloud radiative effect (CRE) of the three reanalyses considered here has been evaluated against CERES observations by Li and Mao (2015) who found that "spatial correlation of CREs and TOA upward radiation fluxes in ERA-Interim is the best among the three reanalyses" in spite of some discrepancies in the global mean CRE. Li and Mao (2015) also note that all three reanalyses have difficulties reproducing boreal summer CREs over East Asia, a conclusion also supported by Wang et al. (2014) based on a study of radiation budgets in AMIP-5 models. Therefore we are led to conclude that the heating rates and the radiative effect of clouds in the Asian monsoon region are still a puzzle that requires further investigations.
Our study corroborates the special role of the Tibetan Plateau in providing air to the AMA. We find that 87 % of the AMA air originates from continental Asia, which agrees very well with the finding of Heath and Fuelberg (2014) (90 %), but our results differ from these authors by giving a much stronger weight to the Asian continental regions outside Tibet (AML). We find, however, that this proportion is sensitive to the representation of cloud tops and that it varies a lot within the literature: Fu et al. (2006) and Wright et al. (2011), find like Heath and Fuelberg (2014), a prevalent role of the Tibetan Plateau, while Aschmann et al. (2009) and Devasthale and Fueglistaler (2010) meet our conclusions. Vogel et al. (2015) finds that during mid-summer 2012 the AMA is fed mostly from continental sources over North India and the Tibetan Plateau and stress the role of the south boundary of the AMA as a transport barrier. Nevertheless, Tibet is characterized by a very high efficiency at carrying air from the top of the clouds to the 380 K surface and short transit times. This www.atmos-chem-phys.net/16/3383/2016/ Atmos. Chem. Phys., 16, 3383-3398, 2016 provides a hint that Tibet is a sensitive area for the increase of air pollution as boundary layer compounds processed by convection can be carried efficiently and rapidly to the stratosphere. It remains that Tibet is an overall small contributor to the global transport into the stratosphere because most of the air entering the stratosphere during boreal summer is not processed inside the AMA but is transported around, separating the location of convection from the entry point into the stratosphere (Bannister et al., 2004;James et al., 2008;Vogel et al., 2015). Our results can be compared with those of Chen et al. (2012) and Orbe et al. (2015). Chen et al. (2012) use kinematic trajectories from the boundary layer focusing on the boreal summer season while Orbe et al. (2015) analyze impulse tracers transported by a general circulation model and provides a whole year analysis. Although both use a different set of regions within the tropics, our results regarding the regional distribution of sources and their seasonal variations are basically consistent with these two studies. There are discrepancies, however, in the transit timescales. Chen et al. (2012) show a distribution of transit times from the boundary layer to the stratosphere which has a strong modal peak over Asia at about 3 days while, contrarily, Orbe et al. (2015) show a distribution which peaks at about 2 months. Our results lay somewhere between, but such a discrepancy highlights an issue that needs to be resolved. Transit times are an important factor in understanding and modeling the behavior of many short-lived chemical species in the TTL and the lower stratosphere.
In spite of displaying the highest cloud tops (Liu and Zipser, 2005), the continental convection above Af and SAm is a weaker provider than the oceanic convection of SAP and NAPO. However, our conclusions do not account for the effect of small-scale overshoots which are more commonly observed over these regions than over the oceans (Corti et al., 2008;Liu et al., 2010) due to the larger available convective energy over land.
The Supplement related to this article is available online at doi:10.5194/acp-16-3383-2016-supplement.