Controls on the water vapor isotopic composition near the surface of tropical oceans and role of boundary layer mixing processes

. Understanding what controls the water vapor isotopic composition of the sub-cloud layer (SCL) over tropical oceans ( δD 0 ) is a ﬁrst step towards understanding the water vapor isotopic composition everywhere in the troposphere. We propose an analytical model to predict δD 0 mo-tivated by the hypothesis that the altitude from which the free tropospheric air originates ( z orig ) is an important factor: when the air mixing into the SCL is lower in altitude, it is generally moister, and thus it depletes the SCL more efﬁciently. We extend previous simple box models of the SCL by prescribing the shape of δD vertical proﬁles as a function of humidity proﬁles and by accounting for rain evaporation and horizontal advection effects. The model relies on the assumption that δD proﬁles are steeper than mixing lines, and that the SCL is at steady state, restricting its applications to timescales longer than daily. In the model, δD is as a function of z orig , humidity and temperature proﬁles, surface conditions, a parameter describing the steepness of the δD vertical gradient, and a few parameters describing rain evaporation and horizontal advection effects. We show that δD 0 does not depend on the intensity of entrainment, in contrast to several previous studies that had hoped that δD 0 measurements could help estimate this quantity.

Abstract. Understanding what controls the water vapor isotopic composition of the sub-cloud layer (SCL) over tropical oceans (δD 0 ) is a first step towards understanding the water vapor isotopic composition everywhere in the troposphere. We propose an analytical model to predict δD 0 motivated by the hypothesis that the altitude from which the free tropospheric air originates (z orig ) is an important factor: when the air mixing into the SCL is lower in altitude, it is generally moister, and thus it depletes the SCL more efficiently. We extend previous simple box models of the SCL by prescribing the shape of δD vertical profiles as a function of humidity profiles and by accounting for rain evaporation and horizontal advection effects. The model relies on the assumption that δD profiles are steeper than mixing lines, and that the SCL is at steady state, restricting its applications to timescales longer than daily. In the model, δD 0 is expressed as a function of z orig , humidity and temperature profiles, surface conditions, a parameter describing the steepness of the δD vertical gradient, and a few parameters describing rain evaporation and horizontal advection effects. We show that δD 0 does not depend on the intensity of entrainment, in contrast to several previous studies that had hoped that δD 0 measurements could help estimate this quantity.
Based on an isotope-enabled general circulation model simulation, we show that δD 0 variations are mainly controlled by mid-tropospheric depletion and rain evaporation in ascending regions and by sea surface temperature and z orig in subsiding regions. In turn, could δD 0 measurements help estimate z orig and thus discriminate between different mixing processes? For such isotope-based estimates of z orig to be useful, we would need a precision of a few hundred meters in deep convective regions and smaller than 20 m in stratocumulus regions. To reach this target, we would need daily measurements of δD in the mid-troposphere and accurate measurements of δD 0 (accuracy down to 0.1 ‰ in the case of stratocumulus clouds, which is currently difficult to obtain). We would also need information on the horizontal distribution of δD to account for horizontal advection effects, and full δD profiles to quantify the uncertainty associated with the assumed shape for δD profiles. Finally, rain evaporation is an issue in all regimes, even in stratocumulus clouds. Innovative techniques would need to be developed to quantify this effect from observations. 1 Introduction 1.1 What controls the water vapor isotopic composition?
The water vapor isotopic composition (e.g., δD = (R/R SMOW − 1) × 1000 expressed in per mill, where R is the D/H ratio and SMOW is the Standard Mean Ocean Water reference), has been shown to be sensitive to a wide range of atmospheric processes , such as continental recycling (Salati et al., 1979;Risi et al., 2013); unsaturated downdrafts (Risi et al., , 2010a; rain evaporation (Worden et al., 2007;Field et al., 2010); the degree of organization of convection (Lawrence et al., 2004;Tremoy et al., 2014); the convective depth (Lacour et al., 2017b); the proportion of precipitation that Published by Copernicus Publications on behalf of the European Geosciences Union.
occurs as convective or large-scale precipitation (Lee et al., 2009;Kurita, 2013;Aggarwal et al., 2016); vertical mixing in the lower troposphere Galewsky, 2018a, b), mid-troposphere (Risi et al., 2012b) or upper-troposphere (Galewsky and Samuels-Crow, 2014); convective detrainment (Moyer et al., 1996;Webster and Heymsfield, 2003); and ice microphysics (Bolot et al., 2013). It is therefore very challenging to quantitatively understand what controls the isotopic composition of water vapor. A first step towards this goal is to understand what controls the water vapor isotopic composition in the sub-cloud layer (SCL) of tropical (30 • S-30 • N) oceans. Indeed, this water vapor is an important source moistening air masses traveling to land regions (Gimeno et al., 2010;Ent and Savenije, 2013) and towards higher latitudes (Ciais et al., 1995;Delaygue et al., 2000). It is also ultimately the only source of water vapor in the tropical free troposphere since water vapor in the free troposphere ultimately originates from convective detrainment (Sherwood, 1996), and convection ultimately feeds from the SCL air . Therefore, the water vapor isotopic composition in the SCL of tropical oceans serves as initial conditions to understand the isotopic composition in land waters and in the tropospheric water vapor everywhere on Earth. We focus here on the SCL because, by definition, there is no complication by cloud condensation processes.
The goal of this paper is thus to propose a simple analytical equation that allows us to understand and quantify the factors controlling the δD in the water vapor in the SCL of tropical oceans. So far, the most famous analytical equation for this purpose has been the closure equation developed by Merlivat and Jouzel (1979) (MJ79). This closure equation can be derived by assuming that all the water vapor in the SCL air originates from surface evaporation. The water balance of the SCL can be closed by assuming a mass export at the SCL top (e.g., by convective mass fluxes) and a totally dry entrainment into the SCL to compensate for this mass export. The MJ79 equation has proven very useful to capture the sensitivity of δD and second-order parameter d-excess to sea surface conditions (Merlivat and Jouzel, 1979;Ciais et al., 1995;Risi et al., 2010d). However, the δD calculated from this equation suffers from a high bias in tropical regions (Jouzel and Koster, 1996). This bias can be explained by the neglect of vertical mixing between the SCL and air entrained from the free troposphere (FT). The MJ79 equation can better reproduce surface water vapor observation when extended to take into account this mixing (Benetti et al., 2015, hereafter B15). This extension requires us to know the specific humidity (q) and water vapor δD of the entrained air. To get these values, they assume that the air entrained into the boundary layer comes from a constant altitude. However, this does not reflect the complexity of entrainment and mixing processes in marine boundary layers. Figure 1 summarizes our knowledge about these entrainment and mixing processes. In stratocumulus regions, clouds are thin and the inversion is just above the lifting condensation level (LCL). Air is entrained from the FT by cloud-top entrainment driven by radiative cooling or wind shear instabilities (Mellado, 2017), possibly amplified by evaporative cooling of droplets (Lozar and Mellado, 2015). Both direct numerical simulations (Mellado, 2017) and observations of tracers (Faloona et al., 2005) and cloud holes  show that air is entrained from a thin layer above the inversion, thinner than 80 m and as small as 5 m. The boundary layer itself is animated by updrafts, downdrafts, and associated turbulent shells that bring air from the cloud layer downward (Brient et al., 2019;Davini et al., 2017).

Entrainment and mixing mechanisms
In trade-wind cumulus regions, the cloudy layer is a bit deeper. Observational studies and large-eddy simulations have pointed out the important role of thin subsiding shells around cumulus clouds, driven by cloud-top radiative cooling, mixing, and evaporative cooling of droplets (Jonas, 1990;Rodts et al., 2003;Heus and Jonker, 2008;Heus et al., 2009;Park et al., 2016). This brings air from the cloudy layer to the SCL. Subsiding shells may also cover overshooting plumes of the cumulus clouds, entraining FT air into the cloud layer (Heus and Jonker, 2008).
In deep convective regions, unsaturated downdrafts driven by rain evaporation (Zipser, 1977) are known to contribute significantly to the energy budget of the SCL (Emanuel et al., 1994). Large-eddy simulations show that subsiding shells, similar to those documented in shallow convection, also exist around deep convective clouds (Glenn and Krueger, 2014). In the clear-sky environment between clouds, turbulent entrainment into the SCL may also play a significant role (Thayer-Calder and Randall, 2015). Therefore, whatever the cloud regime, air entering the SCL from above may originate from either the cloud layer or the free troposphere, depending on the mixing mechanism. Therefore, in this paper in contrast with B15, we let the altitude from which the air originates, z orig , be variable. We do not call it "entrained" air because entrainment sometimes refers to mixing processes through an interface (e.g., De Rooy et al., 2013;Davini et al., 2017), whereas air in the SCL may also enter through deep, coherent, and penetrative structures such as unsaturated downdrafts. We do not call it FT air either since it may originate from the cloudy layer.

Goal of the article
To acknowledge the diversity and complexity of mixing mechanisms, we extend the B15 framework in several ways. First, we assume that we know the shape of δD profiles as a function of q. Second, we write the specific humidity of the air originating from above the SCL as a function of z orig . Third, we account for rain evaporation and horizontal effects.  While B15 focused on observations during a field campaign, we also apply the extended equation to global outputs of an isotope-enabled general circulation model, with the aim to quantify the different factors controlling the δD variability in the tropics. The variable z orig will emerge as an important factor. Therefore, we discuss the possibility that δD measurements at the near surface and through the lower FT could help estimate z orig and thus the mixing processes between the SCL and the air above.
Note that we focus on δD only. Results for δ 18 O are similar. We do not aim at capturing the second-order parameter d-excess because our model requires some knowledge about free-tropospheric vertical profiles of isotopic composition. While δD is known to decrease with altitude (Ehhalt, 1974;Ehhalt et al., 2005;Sodemann et al., 2017), vertical profiles of d-excess are more diverse and less well understood (Sodemann et al., 2017). In addition, there is more need for an extension of MJ79 for δD than for d-excess since the effect of convective mixing is larger on δD than on d-excess (Risi et al., 2010d;Benetti et al., 2014).
2 Theoretical framework

Box model and budget equations
Building on Benetti et al. (2014) and B15, we consider a simple box representing the SCL (Fig. 2). We assume that the air comes from above (M) and from the incoming large-scale horizontal advection (F adv ) and is exported through the SCL top (N , e.g., turbulent mixing or convective mass flux) and by outgoing large-scale horizontal advection (F adv,out ). We assume that the SCL is at steady state. For example, its depth is constant. Since the SCL properties may exhibit a diurnal cycle (Duynkerke et al., 2004), this hypothesis restricts the application of this model to timescales longer than daily. The air mass budget of the SCL is thus These fluxes also transport water vapor and isotopes. In addition, surface evaporation E and rain evaporation F evap import water vapor and isotopes (Fig. 2). Hereafter, to simplify equations, we use the isotopic ratio R instead of δD.
The SCL is usually well mixed (Betts and Ridgway, 1989;Stevens, 2006;De Roode et al., 2016). We thus assume that the humidity and isotopic properties are constant vertically and horizontally in the SCL. They are noted (q 0 , R 0 ). The humidity and isotopic properties of the mass flux export N are thus also (q 0 , R 0 ). The properties of the flux M are noted (q orig , R orig ). The properties of the incoming air by horizontal advection are noted (q adv , R adv ). For simplicity here we neglect the effect of horizontal gradients in humidity (i.e., q adv = q 0 ), assuming that the main effect of horizontal advection on δD 0 arises from horizontal gradients in δD. Appendix C explains how R adv can be calculated. At steady state, the water budget of the SCL is written This model is consistent with SCL water budgets that have already been derived in previous studies (Bretherton et al., 1995), except that we consider steady state. This equation can be solved for q 0 : The SCL humidity q 0 is thus sensitive to M, justifying that it can be used to estimate the mixing intensity or the "entrainment velocity" w e = M/ρ 0 (ρ being the air volumic mass) (Bretherton et al., 1995). At steady state, the water isotope budget of the SCL is written where R E is the isotopic composition of the surface evaporation. It is assumed to follow the Craig and Gordon (1965) equation: where R oce is the isotopic ratio in the surface ocean water, α eq is the equilibrium fractionation calculated at the sea surface temperature (SST) (Majoube, 1971), α K is the kinetic fractionation coefficient (MJ79), and h 0 is the relative humidity normalized at the SST (h 0 = q 0 /q s (SST, P 0 ), where q s is the saturation-specific humidity at SST and P 0 is the surface pressure). We write the isotopic composition of the rain evaporation, R evap , as where α evap is an effective fractionation coefficient. For example, if droplets are formed near the cloud base, some of them precipitate and evaporate totally into the SCL (e.g., in non-precipitating shallow cumulus clouds), then α evap = α(T cloud base ). In contrast, if droplets are formed in deep convective updrafts after total condensation of the SCL vapor, and then a very small fraction of the rain is evaporated into a very dry SCL, then α evap = 1/α(T SCL )/α K (Stewart, 1975). We note η = F evap /E the ratio of water vapor coming from rain evaporation to that of surface evaporation, and φ = F adv · q adv /E the ratio of water vapor coming from horizontal advection to that coming from surface evaporation. We note β = R adv /R 0 the ratio of isotopic ratios of horizontal advection to that of the SCL.
Note that in all our equations, we assume that temperature and humidity profiles and all basic surface meteorological variables are known. We do not attempt to express either h 0 as a function of q 0 as in B15 or the q profile as a function of q 0 . Our ultimate goal is to assess the added value of δD assuming that meteorological measurements are already routinely performed.
By combining all these equations, we get where r orig = q orig /q 0 is the proportion of the water vapor in the SCL that originates from above.
An intriguing aspect of this equation is that the sensitivity to M disappears. In contrast to q 0 , R 0 is not sensitive to M. Therefore, it appears illusory to promise that water vapor isotopic measurements could help constrain the entrainment velocity that many studies have striven to estimate (Nicholls and Turton, 1986;Khalsa, 1993;Wang and Albrecht, 1994;Bretherton et al., 1995;Faloona et al., 2005;Gerber et al., 2005Gerber et al., , 2013. The lack of sensitivity of R 0 to M is explained physically by the fact that for a given q 0 and q orig , if M increases, then E + F evap increases in the same proportion to maintain the water balance. Therefore, the relative proportion of the water vapor originating from surface and rain evaporation to that coming from above, to which R 0 is sensitive, remains constant. Rather, since q and R vary with altitude, R 0 is sensitive to the altitude from which the air originates.

Closure if δD profile follows a Rayleigh distillation line
Equation (6) requires knowing q orig and R orig . B15 take these values from general circulation model (GCM) outputs at 700 hPa. In contrast, here we acknowledge the diversity and complexity of mixing mechanisms by keeping the possibility to take q orig and R orig at a variable altitude z orig . If the goal is to predict R 0 from z orig , we can apply Eq. (6) if we know the q and δD vertical profiles. Conversely, if the goal is to predict z orig from R 0 , we can numerically solve Eq. (6) if we know the q and δD vertical profiles. No analytical solution exists in the general case, but a numerical solution can be searched for the z orig based on Eq. (6). However, the existence and unicity of the solution is not warranted for all kinds of profiles (e.g., Appendix A).
In practice, full isotopic profiles are costly to measure. In addition, our goal is to develop an analytical model. Therefore, in the following we simplify the problem by assuming that the vertical profile of R follows a known relationship as a function of q. Measured vertical profiles of δD are usually bounded by two curves when plotted in a (q, δD) diagram (Sodemann et al., 2017): Rayleigh distillation curve and mixing line.
First, we explore the case of a Rayleigh distillation curve (Dansgaard, 1964), as in Galewsky and Rabanus (2016): where α eff is an effective fractionation coefficient. Typically, q decreases with altitude, so R also decreases with altitude. However, in observations and models, vertical profiles of R can be very diverse Sodemann et al., 2017). The water vapor may be more (Worden et al., 2007) or less (Sodemann et al., 2017) depleted than predicted by a Rayleigh curve using a realistic fractionation factor that depends on local temperature. Therefore, here we let α eff be a free parameter larger than 1. Rather than assuming a true Rayleigh curve, we simply assume that R and q are logarithmically related. Effects of horizontal advection and rain evaporation on tropospheric profiles are encapsulated into α eff . Injecting Eq. (7) into Eq. (6), we get As a consistency check, in the limit case where the air coming from above is totally dry (r orig = 0), Eq. (9) becomes the MJ79 equation: Equation (8) tells us that whenever α eff > 1, R 0 decreases as r orig increases ( Fig. 3 red), i.e., as q orig is moister. Therefore, R 0 decreases as z orig is lower in altitude. This result may be counterintuitive, but can be physically interpreted as follows. If z orig is high, mixing brings air with very depleted water vapor, but since the air is dry, the depleting effect is small. In contrast, if z orig is low, mixing brings air with water vapor that is not very depleted, but since the air is moist, the depleting effect is large (Fig. 4a).
Figure 3 (red) shows that the range of possible δD values is restricted to −70 ‰ to −85 ‰. This explains why in quiescent conditions near the sea level in tropical ocean locations, the water vapor δD varies little (Benetti et al. (2014), Françoise Vimeux, personal communication, 2018. In the limit case where r orig → 1 (i.e., the air comes from the SCL top), R 0 → R oce α eq · 1 h 0 +α K ·(1−h 0 )·α eff (L'Hôpital's rule was used to calculate this limit). This lower bound is not so depleted compared to the more depleted water vapor observed in regions of deep convection (e.g., Lawrence et al., 2002Lawrence et al., , 2004Kurita, 2013). This is because when r orig → 1, the water vapor coming from above has a composition very close to that of the SCL, so the depleting effect is limited. In addition, surface evaporation strongly damps the depleting effect of mixing. Only rain evaporation or liquid-vapor exchanges (Lawrence et al., 2004;Worden et al., 2007) can further decrease R 0 (Appendix B).
Figure 3 (green) shows that the sensitivity to α eff is relatively small but cannot be neglected. Therefore, predicting δD 0 requires having some knowledge about the steepness of the isotopic profiles in the FT. Rain evaporation and horizontal advection can have either an enriching or depleting effect, but do not qualitatively change the results (Fig. 3 purple and blue). Now we consider the case of a mixing line. Detailed calculations in Appendix A show that the sensitivity to r orig is lost. An infinity of FT end members can lead to the same δD 0 when mixed with the surface evaporation, as illustrated in Fig. 4b and analytically demonstrated in Appendix A. Our main results (more depleted δD 0 as r orig increases, restricted range of δD 0 variations, relationship with z orig ) hold only for δD profiles that are steeper than a mixing line. This is the case for profiles that are intermediate between a Rayleigh Figure 3. δD 0 as a function of r orig according to Eq. (9), with α eff = α eq as an example (red). For this illustrative purpose, we assume SST = 30 • C, h 0 = 0.8, δD oce = 0 ‰, and φ = η = 0. The sensitivity to the effective fractionation factor α eff (green) is shown. If rain evaporation is 25 % of surface evaporation (η = 0.25), the solid pink and blue curves show the sensitivity to the effective fractionation factor α evap . If the incoming water vapor by horizontal advection is 25 % of surface evaporation (φ = 0.25), the dashed pink and blue curves show the sensitivity to the isotopic gradient quantified by β. and a mixing line, as is usually the case in nature (Sodemann et al., 2017) or in a general circulation model (Appendix D1).

LMDZ simulations
We use an isotope-enabled general circulation model (GCM) as a laboratory to test our hypotheses and investigate what controls the isotopic composition. We use the LMDZ5A version of LMDZ (Laboratoire de Météorologie Dynamique Zoom), which is the atmospheric component of the IPSL-CM5A coupled model (Dufresne et al., 2012) that took part in CMIP5 (Coupled Model Intercomparison Project; Taylor et al., 2012). This version is very close to LMDZ4 (Hourdin et al., 2006). Water isotopes are implemented the same way as in the predecessor LMDZ4 (Risi et al., 2010c). We use . For this illustrative purpose, we assume SST = 30 • C, h 0 = 0.8, δD oce = 0 ‰, and φ = η = 0. In (a), the red curve shows the Rayleigh profile starting from the SCL and the purple curve shows the mixing line connecting the air coming from above to the surface evaporation, in the case r orig = q orig /q 0 = 0.7. The green curve shows the Rayleigh profile starting from the SCL and the blue curve shows the mixing line connecting the air coming from above to the surface evaporation, in the case r orig = q orig /q 0 = 0.5. One can see that when r orig is lower, the mixing line is more curved, leading to more enriched values. In (b), the purple curve joins the SCL air and the air at all altitudes above the SCL. One can see that different values for r orig can lead to the same value of δD in the SCL. 4 years (2009-2012) of a simulation of the AMIP (Atmospheric Model Intercomparison Project) (Gates, 1992) that was initialized in 1977. The winds are nudged towards ERA-40 reanalyses (Uppala et al., 2005) to ensure a more realistic simulation. Such a simulation has already been described and extensively validated for isotopic variables in both precipitation and water vapor (Risi et al., 2010c(Risi et al., , 2012a. The ocean surface water δD oce is assumed constant and set to 4 ‰. The resolution is 2.5 • in latitude by 3.75 • in longitude, with 39 vertical levels. Over the ocean, the first layer extends up to 64 m, and a typical SCL extending up to 600 m is resolved by six layers. Around 2500 m, a typical altitude for the inversion for trade-wind cumulus clouds, the resolution is about 500 m. For our calculations, we only use tropical grid boxes (30 • S-30 • N) over tropical oceans (> 80 % ocean fraction in the grid box). In addition, to avoid numerical problems when estimating the effect of horizontal advection and rain evap-oration, only grid boxes and days where E > 0.5 mm d −1 are considered. This represents 99.7 % of all tropical oceanic grid boxes.
Specific diagnostics for horizontal advection and rain evaporation are detailed in Appendix B and C.

STRASSE observations
We also apply our theoretical framework to observations during the STRASSE (Sub-Tropical Atlantic Surface Salinity Experiment) cruise that took place in the northern subtropical ocean in August and September 2012 (Benetti et al., 2014). This campaign accumulates several advantages that are important for our analysis: (1) continuous δD 0 measurements in the surface water vapor (17 m) at a high temporal frequency during 1 month (Benetti et al., 2014(Benetti et al., , 2017b, (2) associated surface meteorological measurements, including SST and h 0 , (3) 22 radio soundings relatively well distributed over the campaign period and providing vertical profiles of altitude, temperature, relative humidity and pressure, (4) ocean surface water δD oce measurements (Benetti et al., 2017a), (5) a variety of conditions ranging from quiescent weather to convective conditions, (6) on many vertical profiles, a well defined temperature inversion allows to calculate the inversion altitude.
We use δD 0 measurements on a 15 min time step. The measurements in ocean water were interpolated on the same time steps using a Gaussian filter with a width of 3 d. The radio-soundings are used together with all water vapor isotopic measurements that are within 30 min of the radiosounding launch. Only profiles during the ascending phase of the balloon are considered because the descent phase is often located far away from the initial launch point (McGrath et al., 2006;Seidel et al., 2011).

Estimating the altitude from which the air originates
Here we explain how z orig is estimated based on LMDZ outputs. First, we assume that the q and δD at 500 hPa (q f , δD f ) belong to a Rayleigh distillation line starting from the surface with effective fractionation α eff : In a real field campaign, this assumption means that we do not need to measure the full vertical profile of δD, but only δD f at a given free-tropospheric altitude (e.g., 500 hPa).
We checked that results are similar when defining the end member at 400 hPa rather than 500 hPa. However, the end member should be defined above 500 hPa to ensure that it is well above boundary layer processes. If the end member is defined below 500 hPa (e.g., 600 hPa), there are a few cases where q increases with altitude (q f > q 0 ) due to horizontal advection or convective detrainment from nearby moister re-Atmos. Chem. Phys., 19, 12235-12260, 2019 www.atmos-chem-phys.net/19/12235/2019/ Figure 5. Schematics illustrating the typical structure of tropical marine boundary layers. The sub-cloud layer (SCL) extends from the surface to the lifting condensation level (LCL) and the cloud layer extends from the LCL to the inversion (z i ). EIS stands for the estimated inversion strength. Left: shape of the vertical profile in q (black) and q s (green). Right: shape of the vertical profile in potential temperature θ , inspired by Wood and Bretherton (2006). The LCL, z orig , z orig,r orig =0.6 , and z i altitudes defined in Sect. 3.4 are indicated.
Third, the altitude z orig is estimated from r orig . Using the q vertical profile, we find z orig so that q(z orig ) = r orig · q 0 (Fig. 5, red).
When estimating z orig from observations, we follow the same methodology except that in absence of measurements for q f and δD f we assume a constant α eff = 1.07 based on LMDZ simulation and that α eq , α K , δD oce , h 0 , and δD 0 come from surface observations. Note that r orig and z orig are not direct diagnostics from the simulation, but rather a posteriori estimates to match the simulated δD 0 . Therefore, if assumptions underlying Eq. (9) are violated, then the estimate of r orig , and subsequently z orig , will be biased. The estimate of r orig encapsulates the effect of mixing processes, but also all other processes that have been neglected in our theoretical framework, such as temporal variations in SCL depth, q 0 , or δD 0 or vertical variations in q 0 or δD 0 within the SCL. Figure 5 illustrates the structure of a typical tropical marine boundary layer covered by stratocumulus or cumulus clouds (Betts and Ridgway, 1989;Wood, 2012;Wood and Bretherton, 2004;Neggers et al., 2006;Stevens, 2006). The cloud base corresponds to the lifting condensation level (LCL). Below is the well mixed SCL. Above is the cloud layer, topped by a temperature inversion. Above the inversion is the FT.

Boundary layer structure diagnostics
The LCL is calculated as the altitude at which the specific humidity near the surface equals the specific humidity at saturation of a parcel that is lifted following a dry adiabat (Fig. 5).
The temperature inversion is an abrupt increase in temperature that caps the boundary layer. Therefore, a method to automatically estimate its altitude z i is to detect a maximum in the vertical gradient of potential temperature (Stull, 1988;Oke, 1988;Sorbjan, 1989;Garratt, 1994;Siebert et al., 2000). This method is sensitive to the resolution of vertical profiles (Siebert et al., 2000;Seidel et al., 2010). Therefore, we adapted this method in order to yield z i values that best agree with what we would estimate from visual inspection of individual temperature profiles. In LMDZ, we calculate z i as the first level at which the vertical potential temperature gradient exceeds 3 times the moist-adiabatic lapse rate. In observations, we calculate z i as the first level at which the vertical potential temperature gradient exceeds 5 times the moist-adiabatic lapse rate because radio-soundings are noisier than simulated profiles.
Finally, we calculate z orig (r orig = 0.6), which is the z orig altitude if r orig is set to 0.6. This usually coincides with the altitude of strong humidity decrease near the inversion (Fig. 5).

Averages and composites
All calculations are performed on daily values for LMDZ and on 15 min values for observations. For LMDZ, when analyzing spatial and seasonal variability, seasonal averages are calculated at each grid box over tropical oceans by averaging all days of all years that belong to each season. Seasons are defined as boreal winter (December-January-February), spring (March-April-May), summer (June-July-August), and fall (September-October-November). For illustration purpose, all maps are plotted for boreal winter. Standard deviations are also calculated among all days of all years for each season.
The type of clouds and mixing processes depends strongly on the large-scale velocity at 500 hPa (ω 500 , map shown in Fig. 6a), with shallow clouds in subsiding regions and deeper clouds in ascending regions (Fig. 1). Therefore, it is convenient to plot variables as composites as a function of ω 500 (Bony et al., 2004). To make such plots, we divide the ω 500 range from −30 to 50 hPa d −1 into intervals of 5 hPa d −1 . In each given interval, we average all seasonal-mean values at all locations over tropical oceans for which seasonal-mean ω 500 belongs to this interval (e.g., Fig. 8a will be an example). Note that such composites are carried out on seasonalmean ω 500 because cloud processes and their associated diabatic heating are tied to the large-scale circulation through energetic constraints (Yanai et al., 1973;Emanuel et al., 1994) that are best valid at longer timescales, otherwise, the energy storage term may become significant (e.g., Masunaga and Sumi, 2017). This is why ω 500 is generally averaged over a month or longer (e.g., Bony et al., 1997;Williams et al., 2003;Bony et al., 2004;Wyant et al., 2006;Bony et al., 2013). In addition, we primarily focus on understanding the seasonal and spatial distribution of δD 0 .
The cloud cover strongly correlates with the inversion strength, which can be quantified by the estimated inversion strength (EIS; Wood and Bretherton, 2006) (map shown in Fig. 6b) as a measure of inversion strength. We thus also plot variables as composites as a function of EIS. To make such plots, we divide the EIS range from −1 to 9 K into intervals of 0.5 K. In each given interval, we average all seasonalmean values at all locations over tropical oceans for which seasonal-mean EIS belongs to this interval (e.g., Fig. 8b will be an example). Using seasonal-mean values is consistent with Wood and Bretherton (2006) and with the better link at longer timescales between cloud processes and the largescale dynamical regime.
The relative contributions of each of these components to the δD variability are quantified by performing a linear regression of each of the components as a function of δD 0 . If the correlation coefficient is significant for a given factor, then the slope quantifies the contribution of this factor to the variability of δD 0 . The sum of all contributions may not always be 1 due to nonlinearity. Such a method has already been applied in previous studies (e.g., Risi et al., 2010b;Oueslati et al., 2016). The contributions to the seasonal spatial variability of δD 0 can be quantified by performing the regression among all locations and seasons. The contributions to the daily variability of δD 0 can be quantified by performing the regression among all days of a given season at a given location.

Decomposition method for r orig
To understand what controls r orig , a similar method as for the decomposition of δD 0 can be applied. We can write r orig as r orig = h z orig · q s T z orig + δT z orig , P z orig q 0 , where T (z orig ) + δT (z orig ) = T (z orig ) is the temperature at altitude z orig , T is the tropical-ocean-mean temperature profiles, h(z orig ) and P (z orig ) are the relative humidity and pressure at z orig , and δT (z orig ) is the temperature perturbation compared to T . Therefore, the variability of r orig is decomposed into the effect of four factors: q 0 , z orig , h(z orig ), and δT (z orig ). In practice, r orig and z orig are calculated following Sect. 3.3, and then Eq. (11) is applied.
4 Results from LMDZ

Decomposition of δD 0 variability
The spatial variations in δD 0 simulated by LMDZ (Fig. 7a) are characterized by depleted values near midlatitudes and in dry subsiding regions (e.g., off the coast of Peru and over other regions of oceanic upwelling) and regions of atmospheric deep convection (e.g., Maritime Continent). Consistently, δD 0 values exhibit a maximum for weakly ascending or subsiding regions: δD 0 decreases with increasing vertical velocity of both signs (Fig. 8a black); δD 0 decreases as EIS increases reflecting more stable, subsiding conditions ( Fig. 8b black). This pattern is consistent with previous studies (e.g., Good et al., 2015). For the first time, we propose a theoretical framework to interpret this pattern, decomposing it into six contributions: r orig , α eff , SST, h 0 , rain evaporation, and horizontal advection effects (Sect. 3.6). We check that the reconstructed δD 0 from the sum of its four contributions is very similar to the simulated δD 0 (Figs. 7b, 8 dashed black). In ascending regions, the main contribution explaining the more depleted δD 0 in deep convective regions is that of α eff (Figs. 7d, 8a red). α eff is higher in more ascending regions (Fig. D1d). This means that the main factor depleting δD 0 in deep convective regions is the fact that the mid-troposphere is more depleted. This leads to a steeper gradient (higher α eff ), and thus a more efficient depletion by vertical mixing. This is consistent with deep convection depleting the water vapor most efficiently in the mid-troposphere . The second main contribution is that associated with r orig (Figs. 7c, 8a green). r orig is larger in deep convective regions (as explained in Sect. 4.2).
In subsidence regions, SST is the main factor controlling δD 0 (Figs. 7e, 8a pink): as subsidence is stronger, or as EIS increases, SST is colder, leading to larger α eq and thus more depleted δD 0 . Another important factor is h 0 (Figs. 7f, 8a purple): as subsidence is stronger, h 0 is drier, leading to more depleted δD 0 . The contribution of r orig is also a significant contribution to the depletion of δD 0 in the cold upwelling regions, for example off Peru or Namibia (Fig. 7c). The shallower boundary layer there is associated with higher r orig .
From a quantitative point of view, we can decompose the δD 0 seasonal spatial variations into these different effects (Sect. 3.6). In regions of large-scale ascent, α eff is the main factor explaining the δD 0 seasonal spatial variations (33 %), followed by rain evaporation (20 %) and r orig (19 %; Table 2). In regions of large-scale descent, SST is the main factor explaining the seasonal spatial variations (54 %), followed by r orig (29 %), h 0 (13 %), and α eff (10 %) ( Table 2). Note that the contribution of r orig would be similar if we neglect rain evaporation and horizontal advection effects ( Table 2).
The decomposition method can also be applied to decompose the δD 0 variability at the daily timescale at each loca-tion and for each season (Table 3). On average, in ascending regions, r orig is the main factor (52 %), followed by rain evaporation (48 %) and α eff (35 %). In subsiding regions, the effect of SST is muted due to its slow variability, and r orig (82 %) becomes the main factor.
Overall, the results highlight the importance of r orig as one of the main factors controlling the spatiotemporal variability of δD 0 .

Decomposition of r orig variability
Given the importance of r orig in controlling the δD 0 variations, we now decompose r orig into its four contributions: q 0 , z orig , h orig , and δT orig (Sect. 3.6). Spatially, r orig is maximum in regions of strong large-scale ascent (Fig. 10a) such as the Maritime Continent (Fig. 9a) and in very stable regions ( Fig. 10b) such as upwelling regions (Fig. 10a). We check that the reconstructed r orig from the sum of its four contributions is very similar to the simulated r orig (Figs. 9b, 10 dashed black).
In regions of strong large-scale ascent, r orig is larger mainly because h orig is larger (Figs. 9e, 10a pink). This is because the moister the FT, the higher the contribution of vapor coming from above to the vapor of the SCL, and thus the higher r orig and the more depleted δD 0 . This mechanism through which a moister FT leads to a more depleted δD 0 is consistent with that argued in B15. z orig damps this effect: when convection is stronger and the FT moister, convec-  Table 2. Decomposition of the spatial and seasonal variation in δD 0 into its six contributions: effect of r orig , α eff , SST, h 0 , rain evaporation, and horizontal advection. For each contribution, we show the correlation coefficient of the linear regression of the contribution as a function of δD 0 . The analysis is performed separately for ascending and subsiding regimes. All seasons and locations over tropical oceans (30 • N-30 • S, ocean fraction > 80 %, surface evaporation > 0.5 mm d −1 ) are considered. The threshold for the correlation coefficient to be statistically significant at 99 % is 0.15 or lower in all cases. We write correlation coefficient and slope values between brackets when they are not significant at 99 %.

Regime
Ascending Subsiding tion is also deeper, so the air originates from higher altitudes where the air is drier.
In very stable regions, r orig is larger because q 0 is larger (Figs. 9c, 10b green), consistent with the drier conditions in these regions of large-scale descent. Note that this effect can be seen only in the most stable regions, but when considering all subsiding regions, the contribution is small (Table 2). r orig is also larger because z orig is lower in altitude (Figs. 9d, 10b red). As EIS increases, the boundary layers are shallower, the air comes from lower in altitude, r orig is higher, and thus Atmos. Chem. Phys., 19, 12235-12260, 2019 www.atmos-chem-phys.net/19/12235/2019/  δD 0 is more depleted. This mechanism was not considered in B15 but our decomposition shows that it is a key mechanism driving r orig and thus δD 0 variations in stable regions. Quantitatively, in ascending regions, the main factor controlling the seasonal spatial variations in r orig is h orig (182 %), dampened by z orig (−67 %) (Table 4). In descending regions, the main factor is also h orig (96 %), followed by z orig (41 %) Table 4. Decomposition of the spatial-seasonal variation in r orig into its four contributions: effect of q 0 , z orig , h orig , and δT orig variations. For each contribution, we show the correlation coefficient of the linear regression of the contribution as a function of r orig . The analysis is performed separately for ascending and subsiding regimes. All seasons and locations over tropical oceans (30 • N-30 • S, ocean fraction > 80 %) are considered. The threshold for the correlation coefficient to be statistically significant is 0.15 or lower in all cases. We write correlation coefficient and slope values between brackets when they are not significant at 99 %.

Regime
Ascending  (Table 4). At the daily scale, the same two factors dominate the variability of r orig : h orig and z orig contribute to 78 % and 39 % of r orig variations on average over ascending regions and to 118 % and 39 % on average over descending regions (Table 5).

Estimating altitude z orig
Estimated altitude z orig is at a minimum in dry subsiding regions, especially in upwelling regions (Figs. 11a,and 12), corresponding to regions with the strongest inversion (Fig. 11). This contributes to the depleted δD 0 in these regions.
As explained in Sect. 3.3, our estimate of z orig may be artificially biased due to the neglect of some processes in our theoretical framework. Ideally, to check whether z orig really physically represents the altitude from which the air originates, additional model experiments where water vapor from different levels are tagged (Risi et al., 2010b) would be needed. While we leave this for future work, we check whether z orig estimates are consistent with what we expect based on what we know about mixing processes in the marine boundary layers. We expect that in stratocumulus regions, air originates from a very shallow (a few tens of meters) layer above the inversion, whereas the mixing processes may be more diverse, and possibly deeper in the FT, as the boundary layer deepens (Fig. 1).
To check whether estimated z orig is consistent with this picture, we compare z orig to z orig (r orig = 0.6) (z orig that we would estimate is r orig was set constant to 0.6) and z i (Sect. 3.4), which are measures of the altitude of the humidity drop and temperature inversion, respectively. As expected from Fig. 1, they are minimum in dry upwelling regions, intermediate in trade-wind regions, and maximum values in convective regions 12 green,blue). Therefore, the low z orig in upwelling regions reflects the low z i . Consistently, in subsiding regions, z orig correlates well with z orig,r orig =0.6 (correlation coefficient of 0.52, statistically significant beyond 99 %). If we focus on very stable regions only (EIS > 7 K), z orig correlates well with both z orig (r orig = 0.6) and z i (correlation coefficient of 0.58 and 0.52, respectively, statistically significant beyond 99 %). The altitude z orig is a few meters above the inversion in stratocumulus regions, and up to 1 km above the inversion in cumulus and deep convective regions (Fig. 12), consistent with our expectations from Fig. 1. This lends support to the fact that at least in subsiding regions, our isotope-based z orig estimate effectively reflects the origin of air coming from above.
In ascending regions, in contrast, z orig does not correlate significantly with z orig (r orig = 0.6) or z i . This may indicate either that our z orig estimate is biased by neglected processes such as rain evaporation or that in deep convective regions the origin of FT air into the SCL is very diverse due to the variety of mixing processes (Fig. 1).

Results from observations
To check whether our results obtained with LMDZ are realistic, we apply our methods to the measurements gathered during the STRASSE campaign. For simplicity and in absence of all necessary measurements, here we neglect the effects of rain evaporation and horizontal advection.
Throughout the cruise, δD 0 shows a large variability, ranging from around −75 ‰ in quiescent conditions to −120 ‰ during the two convective conditions (Benetti et al., 2014) (Fig. 13a red). Variability in r orig is the major factor contributing to this variability (58 %) (Fig. 13a green, Table 6). This crucial importance of mixing processes is consistent with B15.
During the two convective events, the estimated r orig saturates at 1 (Fig. 13b). This proves that r orig estimated in these conditions is biased high because it encapsulates the effect of neglected processes, i.e., depletion by rain evaporation. Equation (9) is not valid in this case. In addition, at the scale of a few hours, the steady-state assumptions may Atmos. Chem. Phys., 19, 12235-12260, 2019 www.atmos-chem-phys.net/19/12235/2019/ be violated. Rain evaporation may strongly deplete the SCL before surface evaporation has the time to play its dampening role, hence the possibility of reaching very low δD 0 that cannot be predicted even when considering rain evaporation (Appendix B). During the rest of the cruise, the main factors controlling the r orig variability are z orig (90 %) and h orig (70 %). The importance of FT humidity in controlling r orig was already highlighted in B15. However, in their paper, the variability in z orig was neglected, whereas it appears here as the main factor.
Through September, the cruise goes from a shallow boundary layer in early September to deeper boundary layers with higher inversions, before reaching the convective conditions (Fig. 13c). Consistently with this deepening boundary layer, the air originates from increasingly higher altitudes. Remarkably, there are 6 d when z orig coincides with z i with a root-mean-square error of 31 ‰ and correlation coefficient of 0.996 (Fig. 13c). This indicates that the air comes exactly from the inversion layer. When recalling that z orig and z i are estimated from completely independent observations, the coincidence is remarkable and lends support to the fact that on these days, our z orig estimate is physical. However, there remain 9 d when z orig is much higher than z i . This may reflect more penetrative downdrafts as we approach deeper convective regimes. But it may also be an artifact of our neglect of horizontal advection. For example, on these days which are characterized by lower h 0 , neglecting the advection of enriched water vapor from nearby regions with higher h 0 could be misinterpreted as lower r orig and thus higher z orig .
6 Discussion: what can we learn from water isotopes on mixing processes?
We have shown in the previous section that one of the main factors controlling δD 0 at the seasonal spatial and daily scales is the proportion of the water vapor in the SCL that originates from above (r orig ) and that one of the main factors controlling r orig is the altitude from which the air originates (z orig ). In turn, could we use water vapor isotopic measurements to constrain z orig ? This would open the door to discriminating between different mixing processes at play (Fig. 1). Since mixing processes are crucial to determine the sensitivity of cloud fraction to SST (Sherwood et al., 2014;Bretherton, 2015;Vial et al., 2016), such a prospect would allow us to improve our knowledge of cloud feedbacks, and hence of climate sensitivity. With this in mind, we assess the errors associated with z orig estimates from δD 0 measurements, and discuss whether they are small enough for z orig estimates to be useful. In stratocumulus clouds where the air is believed to originate from the first few tens of meters above cloud top (Faloona et al., 2005;Mellado, 2017), z orig estimates are not useful if the errors are larger than a few tens of meters, e.g., 20 m. In cumulus clouds where mixing processes are more diverse and possi- Figure 11. (a) Map of winter-mean z orig estimated from δD 0 simulated by LMDZ. (b) Same as (a) but z orig that we would estimate if r orig were constant set to 0.6, z orig (r orig = 0.6). (c) Same as (a) but for z i simulated from LMDZ. Only days when EIS > 2 K are considered, otherwise z i is difficult to estimate. (d) Same as (a) but for LCL simulated by LMDZ. Figure 12. Composites as a function of EIS of seasonal mean of z orig (black), z orig (r orig = 0.6) (green), z i (blue), and LCL (red). The composite profiles of cloud cover are also shown, showing deep clouds when EIS is close to 0 and the shallowest clouds when EIS is largest. bly deeper (Fig. 1), z orig estimates may be useful if errors are of the order of 80 m.
Let us assume that we have a field campaign where we measure δD 0 , surface meteorological variables, temperature and humidity profiles (e.g., radio soundings), and a few δD profiles (e.g., by aircraft). This is what we can expect for example from the future EUREC4A (Elucidating the role of clouds-circulation coupling in climate) campaign to study trade-wind cumulus clouds (Bony et al., 2017). Below we quantify the effects of five sources of uncertainty on z orig estimates.

Measurement errors
The first source of uncertainty is measurement errors. We recalculate z orig assuming an error of 0.4 ‰ on δD 0 (typical of what we can measure with in-situ laser instruments; Aemisegger et al., 2012;Benetti et al., 2014) and 1 ‰ on δD f (larger errors due to lower humidity and the increased complexity of measurements in altitude). The averaged errors on Atmos. Chem. Phys., 19, 12235-12260, 2019 www.atmos-chem-phys.net/19/12235/2019/ z orig and their standard deviations are plotted as a function of EIS in Fig. 14a. Whereas errors on δD f lead to errors on z orig of the order of 20 m (Fig. 14a, green), errors on δD 0 lead to errors on z orig of the order of 80 m (Fig. 14a, red). Yet in stratocumulus, no one expects the air to originate from a higher altitude than 80 m above the inversion. Therefore, δD 0 measurements would need to be more accurate than usual to be useful in stratocumulus regions, i.e., 0.1 ‰ to yield a 20 m precision on z orig . In trade-wind cumulus regions, the precision of 0.4 ‰ is enough for z orig to be useful.

Neglecting rain evaporation
The second source of uncertainty is associated with neglecting rain evaporation. This effect can be quantified in a model, but it is very difficult to quantify in nature because it is complicated and uncertain to measure η (Rosenfeld and Mintz, 1988), and it is even more complicated to measure or predict α evap . Rain evaporation can have a depleting or enriching effect depending on microphysical details that are too complex to be addressed here (Graf et al., 2019). Neglecting rain evaporation leads to an error of the order of 500 m in regions of low EIS and 250 m in regions of strong EIS (Fig. 14b, brown). In regions of stratocumulus regions, rain evaporation is a significant source of error in spite of the relatively small amount of precipitation available to evaporate. This is because total evaporation of the rain efficiently enriches the SCL and easily modifies δD 0 by more than the 0.1 ‰ targeted precision explained above. However, it is possible that LMDZ overestimates this source of error in trade-wind cumulus and stratocumulus regions. LMDZ is one of the GCMs producing the strongest rain in stratocumulus regions (Zhang et al., 2013), and GCMs are known to trigger convection too often in trade-wind cumulus regions (Nuijens et al., 2015a, b).

Neglecting horizontal advection
The third source of uncertainty is associated with horizontal advection. In nature, φ can be estimated from meteorological analyses and β can be estimated from near-surface isotopic measurements at several locations (e.g., sounding arrays during typical field campaigns). In absence of these additional measurements, neglecting this effect leads to an error of the order of 800 m (Fig. 14b, purple). This limits the usefulness of z orig estimates for all cloud regimes.

Daily variability in the steepness of δD profiles
The fourth source of uncertainty arises from the daily variability in α eff (Appendix D2). Estimating α eff requires us to measure δD f at 500 hPa. Satellite measurements are available but are affected by random errors that are too large for our application (Worden et al., 2011(Worden et al., , 2012Lacour et al., 2015). Precise in situ measurements of water vapor δD in altitude are costly and difficult (Sodemann et al., 2017).
Let us assume that we have only one δD f value that represents the seasonal average at a given location. To estimate the resulting error on z orig , we reestimate z orig every day and at each location using α eff + σ α eff and α eff − σ α eff . The error on z orig is calculated as z orig (α eff − σ α eff ) − z orig (α eff + σ α eff ) /2. The averaged error and its standard deviation is plotted as a function of EIS in Fig. 14c (black). It is of the order of 400 m and rarely below 200 m. If we attempt to estimate α eff as the fractionation coefficient as a function of local temperature, errors would be even more dissuasive (Fig. 14c, blue).
Therefore, estimating z orig from daily δD 0 measurements cannot be useful unless we measure δD f on a daily basis as well. Practically, we could imagine measuring FT properties (δD f ) at the top of a mountain while we measure δD 0 at the sea level (e.g., on islands such as Hawaii or Réunion: Galewsky et al., 2007;Bailey et al., 2013;Guilpart et al., 2017).

Rayleigh assumption for the shape of δD profiles
Finally, as a fifth source of uncertainty comes the assumption that the δD profile follows a Rayleigh distillation line (Sect. 2.2). However, in both LMDZ (Appendix D1) and nature (Sodemann et al., 2017), δD profiles are usually intermediate between Rayleigh and mixing lines. The precision of our z orig estimate is at a maximum in the Rayleigh distillation case.
When trying to find a numerical solution for z orig directly from Eq. (6), a solution can be found only in 0.1 % of cases. This is because simulated δD profiles are often close to a mixing line in the lower troposphere (Appendix D1). Whatever z orig in the lower troposphere, the δD 0 calculated from Eq. (6) is nearly constant because the δD profile is close to a mixing line (Appendix A, Fig. 4b). Whatever z orig in the middle troposphere, the δD 0 calculated from Eq. (6) is also nearly constant because r orig there is very small. So whatever z orig , the δD calculated from Eq. (6) is nearly constant, and the numerical solution fails.
However, it is possible that δD profiles simulated by LMDZ are closer to mixing lines than real profiles since GCMs are known to overestimate vertical mixing through the troposphere (Risi et al., 2012b) and to mix the lower free troposphere too frequently by deep convection in trade-wind regions (Nuijens et al., 2015a, b). Therefore, the shape of δD profiles simulated by LMDZ is not a sufficient reason to reject the Rayleigh assumption. The uncertainty associated with this assumption is very difficult to quantify in LMDZ. More measurements of full δD profiles are very welcome to help quantify it.
To summarize, δD 0 measurements could potentially be useful to estimate z orig with a useful precision, but only if we measure daily δD f in the mid-troposphere, if the shape of δD profiles can be better documented, if we measure δD 0 at different places to quantify the effect of horizontal advection, Figure 14. Errors when estimating z orig from δD 0 observations, as a function of EIS, as predicted by LMDZ. (a) Error we would make if δD 0 is measured with a 1 ‰ error (red), and error we would make if δD f is measured with a 1 ‰ error (green). (b) Error we would make if we neglect horizontal advection effects (purple) and rain evaporation effects (brown). (c) Error if one uses α eq as a function of local temperature to estimate α eff (blue), error if one uses the seasonal-mean profile instead of the daily profile to estimate α eff (black). The standard deviations among all daily errors estimated in each bin of EIS are also shown. and if we can invent innovative techniques to better quantify the effect of rain evaporation. In addition, in stratocumulus clouds, we need to measure δD 0 with an accuracy of 0.1 ‰.

Conclusions
We propose an analytical model to predict the water vapor isotopic composition δD 0 of the sub-cloud layer (SCL) over tropical oceans. This model relies on the hypothesis that the altitude from which the air originates, z orig , is an important factor. We build on B15, who extended the Merlivat and Jouzel (1979) closure equation to make the explicit link between δD 0 and mixing processes. We further extend their equation: we assume a shape for the δD vertical profiles as a function of q, and we account for horizontal advection and rain evaporation effects.
The resulting equation highlights the fact that δD 0 is not sensitive to the intensity of mixing processes. Therefore, it is unlikely that water vapor isotopic measurements could help estimate the entrainment velocity that many studies have striven to estimate (Bretherton et al., 1995). In contrast, δD 0 is sensitive to the altitude from which the air originates. Based on a simulation with LMDZ and observations during the STRASSE cruise, we show that z orig is an important factor explaining the seasonal spatial and daily variations in δD 0 , especially in subsidence regions. In turn, could δD 0 measurements, combined with vertical profiles of humidity, temperature, and δD, help estimate z orig and thus discriminate between different mixing processes? For such isotopebased estimates of z orig to be useful, we would need a precision of a few hundreds of meters in deep convective regions and smaller than 20 m in stratocumulus regions. To reach this target, we would need daily measurements of δD in the midtroposphere and very accurate measurements of δD 0 , which are currently difficult to obtain. We would also need information on the horizontal distribution of δD to account for horizontal advection effects, and full δD profiles to quantify the uncertainty associated with the assumed shape for δD profiles. Finally, rain evaporation is an issue in all regimes, even for stratocumulus clouds. Innovative techniques would need to be developed to quantify this effect from observations. This study is preliminary in many respects. First, it would be safe to check using water tagging experiments in LMDZ that z orig estimates really represent the altitude from which the air originates and are not biased by our simplifying assumptions. Second, the coarse vertical resolution of LMDZ and the simplicity of mixing parameterizations (e.g., cloud top entrainment is not represented) are a limitation of this study. Ideally, the relationship between δD 0 , z orig , and the type of mixing processes should be investigated in isotopeenabled large-eddy simulations (LESs) (Blossey et al., 2010;Moore et al., 2014). Artificial tracers and structure detection methods (Park et al., 2016;Brient et al., 2019), combined with conditional sampling methods (Couvreux et al., 2010), could help detect the different kinds of mixing structures, estimate their contributions to vertical transport, and describe their isotopic signature. This would allow us to confirm, or disprove, many of the hypotheses and conclusions in this paper. Finally, if the sensitivity of δD 0 to the type of mixing processes is confirmed, paired isotopic simulations of single-column model (SCM) versions of general circulation models (GCMs) and LES, forced by the same forcing, could be very useful to help evaluate and improve the representation of mixing and entrainment processes in GCMs, as is routinely the case for non-isotopic variables (Randall et al., 2003;Hourdin et al., 2013;Zhang et al., 2013). For simplicity, we neglect here horizontal advection and rain evaporation effects, but results would be similar otherwise. If we assume that R orig is uniquely related to q orig through a mixing line between the SCL air and a dry end member (q f , R f ), and Reorganizing Eq. (A1), we get a = r−p 1−p with p = q f /q 0 . Since q f ≤ q orig ≤ q 0 , p ≤ r orig . Injecting Eq. (A2) into Eq. (6), we get As a consistency check, in the limit case where the end member is totally dry (p = 0), we find the MJ79 equation, i.e., Eq. (10).
It is intriguing to realize that r orig has disappeared from Eq. (A3). This can be understood physically: if the vertical profile follows a mixing line, it does not matter from which altitude the air comes: ultimately, what matters is how much dry air has been mixed directly or indirectly into the SCL (Fig. 4b). Therefore, if R orig follows a mixing line, we lose the sensitivity to z orig .

Appendix B: Diagnostics for rain evaporation in LMDZ
Rain evaporation can be accounted for in Eq. (8) if we can quantify η, the ratio of water vapor originating from rain evaporation to that originating from surface evaporation, and α evap , the ratio of isotopic ratio in the rain evaporation flux to R 0 .

B1 Equations
In LMDZ, two parameterization schemes can produce rain evaporation: the convective scheme and the large-scale condensation scheme. Their respective precipitation evaporation tendencies, (dq/dt) evap,conv and (dq/dt) evap,lsc , are given in kg water · kg air −1 · s −1 and are used to calculate F evap in kg water · m −2 · s −1 : where k LCL is the last layer below the LCL, P k is the depth of layer k in pressure coordinate, and g is gravity.
The isotopic equivalent of this flux, F evap,iso , is used to calculate R evap = F evap,iso /F evap .
Only grid boxes and days where F evap > 0.05 mm d −1 are considered to calculate α evap . This represents 94.0 % of all tropical oceanic grid boxes.

B2 Results
Consistent with the larger amount of precipitation available for evaporation, η is at a maximum in regions of deep convection, reaching 30 % around the Maritime Continent (Fig. C1a). It is minimal over the dry descending regions, reaching 5 % off the coasts of Mauritania, Peru, and Namibia. The rain evaporation is more depleted than the SCL in regions of strong deep convection, by as much as 70 ‰ around the Maritime Continent (Fig. C1b). When the fraction of raindrops that evaporate is small, as is the case in such moist regions, isotopic fractionation favors evaporation of the lighter isotopologues. In these regions, rain evaporation has a depleting effect on the SCL, consistent with Worden et al. (2007). In contrast, in other regions, rain evaporation has an enriching effect on the SCL, up to 70 ‰ in dry regions. This is because in dry regions, rain evaporates almost totally, so that the evaporation flux has almost the same composition as the initial rain, which is more enriched than the water vapor.

Appendix C: Diagnostics for horizontal advection in LMDZ
We can account for horizontal advection in Eq. (8) if we can quantify parameters φ = F adv ·q adv E , the ratio of water vapor coming from horizontal advection to that coming from surface evaporation, and β = R adv /R 0 , the ratio of isotopic ratios of horizontal advection to those of the SCL.

C1 Equations
Let us assume that the box representing the SCL has a zonal extent y and a meridional extent x and is composed of k LCL layers of vertical extent z k . The quantity F adv · q adv represents the mass flux of water entering the grid box by horizontal advection per surface area, expressed in kg water · s −1 · m −2 . Assuming an upstream advection scheme, it can be expressed as where u k and v k are the zonal and meridional wind components at layer k, ρ k is the volumic mass of air at layer k, and q uk and q vk are the humidities of the incoming air from zonal and meridional advection at layer k. When u k > 0, q uk is the humidity in the grid box to the west. When u k < 0, q uk is the humidity in the grid box to the east. When v k > 0, q vk is the humidity in the grid box to the south. When v k < 0, q vk is the humidity in the grid box to the north.
Applying the hydrostatic equation at each layer ( P k = ρ k · g · z k , where g is gravity and P k is the vertical extent of the layer k in pressure coordinate), we get F adv · q adv = k LCL k=1 P k g · |u k | · q uk x + |v k | · q vk y .
The quantity F adv represents the incoming air mass flux by horizontal advection, and q adv represents the humidity of the incoming air. We can thus write them as The same budget as in Eq. (C1) can be written for water isotopes: where R adv represents the isotopic ratio of the incoming water vapor: |u k | x · q uk + |v k | y · q vk · P k g .
Note that the upstream advection scheme assumed here overestimates the effect of advection compared to the Van Leer (1977) advection scheme used in LMDZ. We thus estimate an upper bound for the advection effect here.
In practice, rather than calculating β = R adv /R 0 , we calculate β = R adv /R SCL , where R SCL is the isotopic ratio on average through the SCL: This prevents the advected water vapor to be systematically more depleted when the mixed-layer hypothesis is not exactly verified.

C2 Results
Parameter φ is at a maximum where winds are maximum, such as near the extra-tropics or in the North Atlantic (Fig. C1a). Horizontal advection has an enriching effect in deep convective regions (probably because water vapor comes from nearby drier regions that have been less depleted by deep convection) and a depleting effect near the coasts (probably because of winds bringing vapor from the nearby land that is depleted by the continental effect) (Fig. C1b).
Note that in this formulation, parameters φ and β are resolution-dependent. For example, in a finer resolution, φ would be larger and β would be closer to 1, but F adv · q adv · R adv and thus the contribution of horizontal advection in Eq. (8) would remain the same.
Atmos. Chem. Phys., 19, 12235-12260, 2019 www.atmos-chem-phys.net/19/12235/2019/ Figure C1. (a) Ratio η of water in the SCL coming from rain evaporation to that coming from surface evaporation. (b) Effective fractionation coefficient α evap between the SCL water vapor and the rain evaporation. (c) Ratio φ of water in the SCL coming from horizontal advection to that coming from surface evaporation. (d) Effective fractionation coefficient β between the SCL water vapor and the water vapor coming from horizontal advection.

Appendix D: LMDZ free-tropospheric profiles
The goal of this appendix is to document the spatiotemporal variability in the shape (Sect. D1) and steepness (Sect. D2) of simulated free-tropospheric δD profiles. Note that a detailed interpretation of these profiles is beyond the scope of this paper. This paper aims at understanding δD 0 , which is the first step towards understanding full tropospheric profiles. In turn, understanding full tropospheric profiles in future studies will help refine our model for δD 0 .

D1 Shape of tropospheric profiles
First, we test whether the δD vertical profiles simulated by LMDZ follow a Rayleigh or mixing line as a function of q. For the Rayleigh curve, α eff is estimated as explained in Sect. 3.3. For the mixing line (Appendix A), the end member (q f , R f ) is also taken at 500 hPa. The tropical-mean vertical δD profiles simulated by LMDZ are bounded by Rayleigh and mixing lines (Fig. D1a). To better document the spatial variability in the shape of δD profiles, we plot parameter f = δD LMDZ −δD Rayleigh δD mix −δD Rayleigh , describing how close the simulated δD (δD LMDZ ) is to the Rayleigh (δD Rayleigh ) and mixing (δD mix ) lines. We have f = 0 in the case of a Raleigh line, f = 1 in the case of a mixing line, and f > 1 if δD is more enriched than a mixing line. In the lower troposphere, δD LMDZ is close to a mixing line (and sometimes even more enriched) in deep convective regions (Indian Ocean, South Pacific Convergence Zone, Atlantic ITCZ), probably because deep convection efficiently mixes the lower troposphere. Elsewhere, δD LMDZ is intermediate between the two lines (Fig. D1e). In the middle troposphere, δD LMDZ is relatively closer to Rayleigh everywhere (Fig. D1b).
The daily variability of f is large everywhere and at all levels (Fig. D1c, e), with standard deviation of 0.23 and 0.44 on tropical average at 1000 and 4000 m, respectively. A large daily variability in the shape of profiles is also observed in nature (Sodemann et al., 2017).

D2 Steepness of tropospheric profiles
The steepness of the δD gradient from the surface to the middle troposphere is described by the parameter α eff . It is at a maximum in regions of deep convection, for example around the Maritime Continent (Fig. D2a). This is consistent with the maximum depletion simulated in deep convective regions in the mid-troposphere simulated by models , leading to steeper δD profiles. The pattern of α eff may also reflect horizontal advection effects, where strong isotopic gradients align with winds (e.g., from the eastern to the western Pacific; Dee et al., 2018).
Values of α eff are of the same order of magnitude as real fractionation factors, but the spatial variations do not reflect those predicted if using a fractionation coefficient α eq as a function of temperature T (Fig. D2b).
The daily standard deviation of α eff (σ α eff ) for a given season ranges from 5 ‰ in the central Atlantic to 40‰ near the Maritime Continent (Fig. D2c). On average over all seasons and locations, daily α eff − 1 at a given location varies within ±25 % of its seasonal-mean mean value.
Atmos. Chem. Phys., 19, 12235-12260, 2019 www.atmos-chem-phys.net/19/12235/2019/ Figure D1. (a) Vertical profiles of water vapor δD simulated by LMDZ (black), calculated if assuming a Rayleigh-type curve with α eff estimated from (q f , δD f ) at 500 hPa (green), calculated if assuming a Rayleigh-type curve with α eq (T ) (dashed green), and calculated if assuming a mixing curve between the first layer and (q f , δD f ) at 500 hPa (blue), on average over all tropical oceanic locations and days.  Author contributions. CR thought about the equations, ran the LMDZ simulations, performed the analysis, and wrote the paper. JG initiated the discussion on the subject and discussed regularly about the results. GR provided the STRASSE radiosoundings. FB provided insight and references about cloud processes. JG, GR, and FB all gave comments on the paper.
Competing interests. The authors declare that they have no conflict of interest.