Comparison of mean age of air in five reanalyses using the BASCOE transport model

transport model Simon Chabrillat1, Corinne Vigouroux1, Yves Christophe1, Andreas Engel2, Quentin Errera1, Daniele Minganti1, Beatriz M. Monge-Sanz3, Arjo Segers4, and Emmanuel Mahieu5 1Royal Belgian Institute for Space Aeronomy, BIRA-IASB, Brussels, 1180 Belgium 2Institute for Atmospheric and Environmental Science, Goethe University Frankfurt, Frankfurt, Germany 3European Centre for Medium-Range Weather Forecasts, Shinfield Park, Reading, RG29AX, UK 4TNO, Department of Climate, Air and Sustainability, P.O. box 80015, 3508 TA Utrecht, the Netherlands 5Institute of Astrophysics and Geophysics, University of Liège, 4000 Liège, Belgium Correspondence to: Simon Chabrillat (Simon.Chabrillat@aeronomie.be)

• L31 P4: "idealized tracer which increases linearly at the surface": throughout the surface or just in the tropics?
The choice of the surface as source region introduced confusing inconsistencies in the discussion paper (see the general comment by the second referee and also the next comment here). Hence we decided to re-run our calculations and re-plot all figures (except for figure 8, see below) using as source region the tropical tropopause region (defined as the 100 hPa isobar between latitudes 10°S and 10°N), and computing the AoA at each gridpoint as the time elapsed since the mixing ratio of the ideal tracer reached the same value in that source region. The figures did not change significantly from the discussion paper, indicating that this is a methodological issue which does not have any impact on our findings. The last paragraph of section 2.1 has been re-written to fully explain the updated procedure for computing AoA. • L31-32 P6: "the AoA at the equatorial tropopause has been subtracted from the fields...": did you use the climatological or time-dependent tropopause altitude?
See previous question: the revised manuscript shows AoA computes AoA directly from the tropical tropopause region and has dropped all a posteriori corrections by subtraction of AoA values at the equatorial tropopause. We have removed the sentences describing this procedure from the revised manuscript.
• L9 P8: GCCM: this has not been introduced before, do you mean CCM?
Yes. We have replaced all occurrences of "GCCM" by "CCM".
Replaced "sign of observational trend not significant" by clearer ...sign of trends not significant in the observations.
• This focus on ERA-I is due to the exclusive use of ERA-I in previous studies modeling the latitudinal structure of AoA for the post-2000 period (see first paragraph of section 4.3, P12 of ACPD manuscript). But this context had not been introduced yet for the discussion of figure 8 (i.e. L12 P10 and P16 L22-23). Since this question is specifically investigated through figures 11 and 12, we have simply removed the premature sentence from the discussion of figure 8.
• L23 P12: standard error for which confidence level?
This important information has been added in section 4.1 which describes our methodology for multi-linear regressions: The uncertainties arising from the fit are calculated for the 95% confidence interval and corrected for auto-correlation in the residuals (Eqs. 3,4 and 6 in Santer et al., 2000). and in the section 4.3 (discussing linear trends): It is expressed in years per decade (yr dec -1 ) and is deemed significant at a given grid point if its absolute value is larger than its uncertainty (as defined in section 4.1).
Thanks for pointing this out. In the revised manuscript we now also compare our results with those by Ploeger et al. (2015a), both for the discussion of the latter period (figure 11): Our results also agree well with those obtained by a diabatic model driven by ERA-I over the same period (Ploeger et al., 2015a). and for the discussion of the total period (first paragraph discussing figure 12): Our ERA-I results for the overall period partly contradict those obtained by diabatic models which use not only the wind fields from ERA-I but also its heating rates (Diallo et al., 2012; Ploeger et al., 2015a). Looking at slightly shorter periods of two decades (1989-2010 for the former and 1990-2013 for the latter), these papers reported negative AoA trends for both hemispheres below 28km altitude.
• L10 P13: "using only wind fields": do you mean not using heating rates? Perhaps this should be explicit.
Indeed we meant that our results did not use the heating rates. During our revision we found that the whole sentence was confusing and removed it from the manuscript. The additional use of ERA-I heating rates by the diabatic models (Diallo et al., 2012;Ploeger et al., 2015a) is now explicitly stated as soon as they are cited (see previous comment).
• L26-27 P14: "While this may be a coincidence...": but having more wave drag would imply a faster BDC, so I do not see the point of this sentence.
Agreed. This sentence has been removed from the revised manuscript.
The following sentence has been added to the discussion: Similar disagreements have also been reported between the trends of the annual mean tropical upwelling in three reanalyses over the period 1979-2012, with vertical residual velocities (w*) increasing in MERRA and JRA-55 and decreasing in ERA-I (Abalos et al., 2015, Fig. 11).
• L18 P15: Another difference with CLaMS is that it includes a mixing parameterization.
The revised version states also this difference, citing Konopka et al. (JGR, doi:10.1029/2003JD003792, 2004. This sentence has been removed from the caption of Fig.~7. Konopka Response to Reviewer #2 for discussion paper

Chabrillat et al., ACPD, 2018
We thank the reviewer for his/her insightful comments. It appears that the version of the manuscript which was reviewed by this referee is the version first submitted to ACPD (on 4 April 2018) rather than the version finally published in ACPD (on 7 May 2018). Fortunately all comments apply equally to both versions. In our replies below the bold type is used to highlight text in the revised manuscript.

Replies to general comments
• The principal conclusion is that the simulations of AoA obtained when BASCOE is constrained by different reanalyzed datasets differ substantially from one another. This is not at all unexpected given the differences among the reanalysis models.
The reanalysis systems are based on different models but they assimilate very similar satellite datasets. Many users of reanalyses are neither modellers of stratospheric dynamics nor aware of the lack of observational information to constrain the BDC in the reanalyses. From feedback obtained at the 5th International Conference on Reanalysis (ICR5, Rome, November 2017), such users do not expect to see a spread between the reanalyses which is as large as between unconstrained GCCMs (Fig. 4). On Fig. 8 they easily understand that the uncertainties in the observational timeseries are large (due to sparse and irregular sampling) but they do not expect to see that the spread between the reanalyses is as large as these observational uncertainties.
A third highlight of this paper is the intercomparison of AoA trends between the reanalyses. Several reanalysis intercomparisons of diagnostics related to stratospheric dynamics have already been published and showed significant differences with respect to their trends (e.g. Abalos et al., 2015;Miyazaki et al., 2016). Yet for the AoA diagnostic, most recent studies rely on ERA-I with much interest in the latitudinal structure of its trends. We found that over the post-2002 period ERA-I is the only reanalysis to deliver opposite trends of AoA in the two hemispheres (Fig. 12,middle column). This is also an unexpected result.
• The paper is well organized and clearly written, with some exceptions, the main one being that the procedure for computing AoA is not well explained. In particular, it is not clear whether AoA is calculated with respect to a reference level at the tropical tropopause or in the troposphere, and this introduces some ambiguity in the interpretation of the results.
We agree with the referee that the handling of the reference level was problematic in the submitted manuscript. All our calculations used the surface both as the source region and to compute the time lag defining the mean Age of Air. But in order to better highlight the different transit times from the equatorial tropopause, Fig.1 and 3 were corrected a posteriori by subtraction of the time-averaged AoA at 100hPa, 10°S-10°N. All other figures used the surface as reference, hence including the transit time from the surface to the tropical tropopause. This distinction was not clearly made and led to inconsistent figures, as shown by several specific comments made by both reviewers.
Hence we decided to re-run our calculations and re-plot all figures (except for figure 8, see below) using as source region the tropical tropopause region (still defined as the 100 hPa isobar between latitudes 10°S and 10°N), and computing the AoA at each grid point as the time elapsed since the mixing ratio of the ideal tracer reached the same value in that source region. The last paragraph of section 2.1 has been re-written to fully explain the updated procedure for computing AoA.
The figures did not change much from the discussion paper, indicating that this is a methodological issue which does not have a large impact on our findings. Besides figures 3 and 9 which are discussed below for specific comments, there is one other case where the figure changed sufficiently to warrant a minor update in the text: on figure 12 the positive AoA trends for ERA-I in 2002ERA-I in -2015 have become significant at all latitudes (in the discussion paper they were significant only in the polar latitudes). On figure 12 the signs and patterns of AoA trends did not change for any other reanalysis or period but the range of these trends increased by up to 50% (see min/max values above the plots); this led us to extend the scale of the color bar, from [-0.6,0.6] to [-0.9,0.9].
For figure 8 (and figure 8 only) we have kept the original calculations where the tracer was set to increase linearly throughout the surface, because this figure includes a comparison with observational values of AoA which used the surface as reference. We have moved to the discussion of figure 8 the description of this surface boundary condition and its propagation through the troposphere, because it is now irrelevant for all other figures. We have also added in this figure a plot showing tropical AoA computed both from the surface and from the tropical tropopause to show that the difference does not vary significantly with the simulated year (see next comment). • (4, 20)  This comparison between the two evaluations also allows a discussion on the impact of the omission of deep conviction in the model. The discussion of Fig. 8 in the revised manuscript includes the following paragraph:

Replies to specific comments
The differences between the two calculations represent the transit times from the surface to the tropical tropopause, are nearly independent of the simulated year and range between 3 months (with ERA-I or JRA-55) and 6 months (with MERRA). These values are close to the longest transit times reported in a recent intercomparison of global models (Krol et al., 2018), indicating a rather slow transport from the surface to the tropical tropopause which we attribute to the omission of deep convective transport in our model. While the surface-based model AoA (solid lines in Fig. 8) may be slightly overestimated, these biases have no significant inter-annual variations and do not hinder the intercomparison between reanalyses.
Text corrected.
• (6, 8) " Figure 1 compares the results": I do not believe you have stated how AoA is calculated. [...] Section 2.1 now describes explicitly the revised procedure to calculate AoA from the tropical tropopause region: The age of air is defined as the spectrum of transit times from a source region to a given location, with the tropical tropopause usually defining the source region for studies of the stratosphere. In the case of ideal tracers which increase linearly in the source region and have no photochemical production or losses, the mean of this spectrum (denoted here AoA) is simply the time elapsed since the mixing ratio of this ideal tracer reached the same value in the source region (see e.g. Waugh and Hall, 2002). We follow here this classical approach, using for most simulations the 100 hPa isobar between latitudes 10°S and 10°N as source region.
Section 3.2 describes explicitly the original procedure which has been kept only for figure 8: A color bar has been added to figures 2 and 3.  Waugh and Hall, 2002, Sec. 3.1) then the choice of a base point in the troposphere confuses the issue, especially given the use of artificial diffusive transport in the lower troposphere.
The reviewer was rightly confused and his interpretation is correct. We have followed this advice, choosing the tropical tropopause as reference point in the revised manuscript (see above). The relative differences between ERA-I and the four other reanalyses vanish at the reference point and the difference is not plotted at grid points where ERA-I AoA is smaller than 5 days.
• (8, 5) "The intercomparison at 50 hPa": You should state explicitly in the text that in these comparisons AoA is "normalized" to be zero at the tropical tropopause (this is only stated in the caption of Figure 4). Otherwise, the reader will wonder, as I did, why the AoA shown in Figures 2-3 are different from the AoA in Figure 4. By the way, a problem with the "normalization" of AoA to zero at 100 hPa is that it gives the impression that AoA above that level is determined only by the stratospheric circulation, when in fact the AoA also contains the effect of transport in the troposphere.
thanks to the direct calculation of AoA using the tropical tropopause as reference point, no "normalization" is performed any more for the figures 1 and 4 of the revised manuscript.
• (8, 12) "overall, the spread . . . is larger than the 1-sigma. . .": One wonders how this result would change if AoA were computed with respect to a reference point at 100 hPa.
In the revised manuscript the AoA are computed with respect to a reference point at 100 hPa. The differences in Figure 4 between the submitted and revised manuscripts are nearly indistinguishable. Hence the spread between the five simulations at 50 hPa is still larger than the 1-s observational uncertainties in the tropics, and still nearly as large in the extratropics. We have not modified this sentence in the revised manuscript.
• (8, 26) "The spread between the four reanalyses reaches a maximum of 0.2 years at 30 hPa": Are you referring here to the gradient comparison, Figure 4d? How is this "gradient" calculated? The figure legend refers to "MLNH-Tropics" and shows values in months, not per unit distance, so this is really a difference between the Tropics andmidlatitudes o f the NH. How are Tropics and NH midlatitudes defined?
The words "(mean age) gradient profiles" or "latitudinal gradients (of mean age)" were meant with the same meaning as Neu et al. (2010) and Chipperfield et al. (2014) i.e. as the difference between AoA in NH midlatitudes and AoA in the Tropics. The vertical profiles on figure 4d simply show the differences between the corresponding profiles on figures 4c and 4b which are mean values for latitude bands 35°N-45°N and 10°S-10°N respectively (as stated in the figure of caption 4).
In the revised manuscript we have added the definition of the latitude bands in the discussion of figures 4b and 4c and we have added the following sentences in the discussion of figure 4d: These "latitudinal gradients of AoA" were used in several CCM intercomparisons (Neu et al., 2010;Chipperfield et al., 2014). Figure 4d shows this diagnostic for the five reanalyses, i.e. the differences between the AoA profiles on Fig. 4c and Fig. 4b. We have replaced the words "latitudinal gradients" by "AoA differences" in the remainder of this discussion and in the caption of the figure.
• (8, 30) "MERRA-2 yields an outlying vertical profile of AoA at northern midlatitudes": True with respect to the other reanalyses except for MERRA (Fig. 4c), and in fact, MERRA and MERRA-2 midlatitude profiles of AoA agree best with the observations. You keep referring to MERRA-2 as an "outlier", which carries negative connotations,but in fact being an outlier in this comparison is a good thing if one considers the data to be the "truth"..
Thanks for pointing this out. We have corrected the discussion according to your comment: MERRA and MERRA-2 yield larger AoA at northern midlatitudes than the three other reanalyses. In the case of MERRA-2 this results in a profile of AoA differences which are significantly larger than the profiles obtained with the four other reanalyses but agrees much better with the profile derived from the observations. Hence MERRA-2 apparently underestimates the tropical upwelling in the lowermost stratosphere (100-60~hPa), agrees better with the observations at 50~hPa than any other reanalysis, and joins the results of the four other reanalyses at higher levels.
• (9, 14) "MERRA-2 starts with much older values": This behavior does appear to be anomalous. Any idea what might be causing it?
This issue is discussed in detail in sectin 5 (see paragraph starting with "MERRA-2"). We have inserted the following sentence in section 3.2: The possible causes for this apparently anomalous behavior of MERRA-2 are discussed in section 5.
• (9, 18) "The Pinatubo eruption does not appear to have any impact of the simulated AoA at 50 hPa": Insofar as one might expect that the largest impact of Pinatubo would be in the Tropics, it might be worthwhile to examine the AoA time series averaged over, say, 30N-30S.
This was done and no impact was found for the tropical latitude band, as shown by the corresponding plot: Note that any impact in the 30°S-30°N latitude band would have been seen in figure 7 which shows the 72°S-72°N band. The revised manuscript mentions the absence of volcanic impact in the tropical latitude band as well.
• (10,8) "observational trend is not significant": One would not expect any trend calculated from the smoothed, model time series shown on the right pane of Figure 8 to be significant either. By the way, you keep referring to "trends" in connection with the model results, but you have not calculated any trends...
Not yet. As outlined in the introduction, modelled trends are evaluated extensively in section 4. The sentence now refers to this later finding: ERA-I delivers a weakly positive trend over the period 1989-2015 and we will assess in section 4.3 that this trend in the model results is significant.
And this finding has been added in the discussion of figure 12 (section 4.3): The same plot (i.e. Fig. 12, top right) also shows that the positive trend which had been inferred visually for the northern mid-latitudes of the middle stratosphere (Fig. 8, left) is significant.
• ...Note also that Garcia et al. (2011) have argued that, even using model output for an ideal AoA tracer, trends over periods as long as 30 years are often not significant when the ideal tracer is sampled like the available observations of stratospheric tracers....
Great caution should indeed be exercised in comparisons of trends between model output and observational datasets which are sparsely and irregularly sampled. This is what we meant in the original manuscript with "..., but Engel et al. (2009Engel et al. ( , 2017 showed that the sign of this observational trend is not significant". Referring to Garcia et al. (2011) allows us to reinforce and clarify this call to caution. The revised manuscript states: While the overall trend simulated with ERA-I is apparently in agreement with the balloon observations, this comparison should be considered with great caution because the sign of the AoA trend is not significant in the observations (Engel et al., 2009(Engel et al., , 2017 and modelled trends over periods as long as 30 years are often not significant when the ideal tracer is sampled like the available observations of stratospheric tracers (Garcia et al., 2011).
• ... Furthermore, trends derived from observation are also confounded by the fact that no real atmospheric tracer has a constant, linear growth rate.
The non-linearity of the growth rate of CO 2 and SF 6 has of course been taken into account by Engel et al. (2009Engel et al. ( , 2017 for their derivation of AoA and also for their error analysis. The procedure is described in the supplementary information of Engel et al. (2009) and in the first paragraph of section 4 in Engel et al. (2017).
• (11,16) "in the polar regions and midlatitudes": Why limit this to extratropical behavior only? It would be interesting to show the seasonal amplitude in the Tropics as well, say a ±30° average.
Thanks for the suggestion. The seasonal amplitudes with ERA-I and MERRA-2 have a different vertical structure in the Tropics. We have added this plot to figure 9 in the revised manuscript.
The left plot below is extracted from fig. 9 in the ACPD manuscript while the right plot shows results with the new AoA calculation (i.e. using the tropical tropopause as reference): This shows that the new AoA calculation delivers amplitudes of AoA seasonal variations which are much closer for periods 1989-2001 and 2002-2015. This is the case also for the other latitude bands, so for the revised manuscript we have removed from figure 9 the results for the early period 1989-2001 and we have dropped the corresponding part of the discussion.
• (12, 9) "could not be expected from inspection of the native dynamic variables": I do not understand what you are trying to say here. Please elaborate. And note also that the discrepancies you mention ("up to 50% dependencies on the considered time period") are not even illustrated, so it is very difficult to even guess what the intent of your statement is.
The dependencies of seasonal amplitudes on the considered time period reached 50% in fig. 9a and 9c of the ACPD manuscript but, as explained in the previous comment, they have disappeared from the revised manuscript thanks to the corrected AoA calculation. We removed the whole paragraph (i.e. last paragraph of section 4.2) from the revised manuscript because it is redundant with the second paragraph of section 5.
• (12, 15) "unexpectedly increasing": Given the very short period covered by the SF 6 observations, it is not clear that one should "expect" any particular sign for the trends. Determination of AoA trends from observations of stratospheric tracers is fraught with many uncertainties; even in models where an ideal, linearly increasing artificial tracer is used, one has to rely on zonal-mean results over long periods to obtain trends that are clearly statistically significant. Arguably, examination of AoA trends determined from observations of stratospheric tracers is not the best tool for documenting changes in the BD circulation. See Garcia et al. (2011).
We acknowledge that it is important to provide proper context about this topic. Section 4.3 now starts with a new paragraph explaining some caveats of interpreting AoA trends as changes in BDC. The next paragraph has been expanded and corrected to justify the interest of the section: It is delicate to infer changes in the BDC on the basis of AoA trends over periods shorter than several decades. Even in models where an ideal, linearly increasing artificial tracer is used, one has to rely on zonal-mean results over long periods to obtain trends that are clearly statistically significant (Garcia et al., 2011 (Stiller et al., 2012). This structure of trends shows AoA decreasing in the Southern Hemisphere but unexpectedly increasing in the Northern Hemisphere which was used to explain a recent increase of stratospheric HCl in the Northern Hemisphere (Mahieu et al., 2014) and interpreted as the consequence of a southward shift of the subtropical transport barriers  .
We have removed the word "unexpected" from this sentence.
• (13, 23) "the reversal is found for all five reanalyses": This is true, but the reversals are in the opposite sense in ERAi and CFS vs. JRA-55, MERRA and MERRA2, so it is hard to know what to make of this.
Yes but we still want to highlight this striking feature in case some reader does find what to make of this. We have changed this sentence to: This reversal is found for all five reanalyses and in all regions of the stratosphere but it is difficult to interpret because it goes in opposite directions in ERA-I and CFSR versus JRA-55, MERRA and MERRA-2.
• (13, 25) "unexpectedly large disagreements": I am not sure why you think the disagreements are "unexpected". While presumably all reanalyses use more or less the same observational data, the manner in which the data are assimilated and the physical parameterizations included in the different models (in particular, those for convection and mesoscale gravity waves) are different. Note that at (14,8) you suggest that "the disagreements found here may lie in the differences between the underlying models"; I agree that this is the most plausible working hypothesis.
See above the reply to the first general comment. The word "unexpectedly" had already been removed before publication in ACPD.

Grammar, typos, etc.
All errors have been corrected. Garcia In view of these large disagreements, we urge great caution for studies aiming to assess AoA trends derived only from reanalysis winds. We briefly discuss some possible causes for the dependency of AoA on the input reanalysis and highlight the need for complementary intercomparisons using diabatic transport models.

15
The mean age of air (hereafter AoA) is an evaluation of the time necessary for variations of long-lived (e.g., greenhouse or ozone-depleting) species to propagate from the troposphere to various regions in the stratosphere. This classical diagnostic provides insights on the strength and structure of the Brewer-Dobson Circulation (BDC), the polar vortex, and irreversible mixing in the mid-latitudes (Waugh and Hall, 2002). Due to increased greenhouse gas forcing, Chemistry-Climate Model (CCM) simulations of the 1990-2090 period predict an acceleration of the BDC and a decrease of AoA at all latitudes in the 20 lower part of the stratosphere (Austin and Li, 2006;Butchart, 2014). The observational detection of trends in the BDC strength turns out to be quite difficult. They can be indirectly derived from multidecadal records of stratospheric temperatures but these derivations are indirect and do not yet allow a clear confirmation of the acceleration predicted by CCM, mainly due to an insufficient quality :::::::::: insufficiently :::::::::: constrained ::::::: accuracy : of the temperature observations (Fu et al., 2015;Ossó et al., 2015).
Observation-based AoA is derived from concentration measurements of very long-lived tracers which increase (nearly) leading to breakthrough studies about observed AoA and its time variations during this comparatively short period (Stiller et al., 2008;Haenel et al., 2015). The magnitude, distribution and detectability of the AoA trends observed over the past years and decades are currently a topic of intense research (e.g., Engel et al., 2009;Stiller et al., 2012;Mahieu et al., 2014;Engel et al., 2017).
Reanalysis systems combine a global weather forecast model, observations, and an assimilation scheme to provide the best estimates (analyses) of past atmospheric states including surface pressure, temperature, and wind over a long (usually multi-decadal) period. While they are derived from assimilation systems used operationally to deliver weather fore-5 casts, they aim to achieve more consistent variations over long timescales, e.g., avoiding spurious discontinuities and trends ::::::::::::::::::::::::::::::::::::::::::::::: (Trenberth and Olson, 1988;Bengtsson and Shukla, 1988) . Hence the same model version and assimilation scheme are used for the whole period and special care is given to the time-varying biases between the assimilated observations (see, e.g., Simmons et al., 2014). The resulting reanalysis datasets provide a multivariate, spatially complete, and coherent record of the global atmospheric circulation. The absolute value of AoA and its evolution over the past decades can be derived from the surface pressure and wind fields available in such reanalyses, using either an offline transport model (see, e.g., Chipperfield, 2006) or a chemistry-climate 20 model nudged to the input reanalysis (Kunz et al., 2011;Kovács et al., 2017) to model the transport of inert tracers propagating from the troposphere to the stratosphere. This approach helped to identify shortcomings in the Brewer-Dobson circulation described by early reanalyses (Meijer et al., 2004;Pawson et al., 2007) and to assess the improvements in the next generation of reanalyses, e.g., from ERA-40 to ERA-Interim (Monge-Sanz et al., 2007;Dee et al., 2011;Monge-Sanz et al., 2012).
Few AoA comparisons have been performed between reanalyses originating from different reanalysis centers. This is mainly 25 due to technical difficulties which ::: that : are not limited to file formatting issues. While all modern systems use hybrid σ − p vertical coordinates (Simmons and Burridge, 1981), each reanalysis comes with a wind field computed on a different grid with different horizontal and vertical resolutions. Some reanalysis forecast models use spectral dynamical cores (Krishnamurti et al., 2006) while others use finite-volume dynamics (Lin, 2004, see next section for details). A common offline transport model may have difficulties dealing with such different grids because it is usually tailored for a specific family of reanalyses, e.g., using 30 an advection algorithm similar to the dynamical core of the driving reanalysis system or climate model (Strahan and Polansky, 2006).
Section 2 describes the input reanalyses and our modeling tools to explains ::::: explain : how these difficulties were circumvented.

35
3 The main purpose of this paper is to provide a comparison of the AoA obtained from five modern reanalyses included in the S-RIP project in order to assess their level of agreement or to identify outliers. Its focus is not on detailed comparisons with observations (which are deferred to a follow-on study) but rather on a consistent intercomparison between the reanalyses through the use of a common transport model.
In order to allow quick propagation of this boundary condition to the free troposphere, eddy vertical diffusion is modeled in the lower half of the troposphere with a vertical diffusion coefficient K zz decreasing from an arbitrary value of 10 m 2 s −1 at the surface to zero at the pressure level halfway between the surface and the tropopause.There is no other representation of 15 convection in the model nor any explicit mechanism for horizontal diffusion.
Each reanalysis is available on two vertical grids: the native grid of the underlying atmospheric model (product on "model levels") and an output grid of constant pressures (product interpolated to "pressure levels"). Our simulations are run on the native model levels in order to account for the different vertical resolution of each reanalysis system and also to avoid any interference from the interpolation methods used to deliver the products on constant pressure levels. All reanalysis systems

5
The forecast models use two different frameworks to discretize their primitive variables on the horizontal plane: MERRA and MERRA-2 solve for mass fluxes on a regular latitude-longitude grid (Lin, 2004) while ERA-I, JRA-55 and CFSR use spectral dynamical cores, i.e., they solve for vorticity and divergence expressed on a spherical harmonics basis (e.g., Krishnamurti et al., 2006). Users of the reanalyses often download velocity fields which are derived from the primitive variables and evaluated on varying regular grids: these may be reduced Gaussian grids (ERA-I and JRA-55), regular Gaussian grids (CFSR) or regular 5 latitude-longitude grids (MERRA and MERRA-2). This pre-processing is described in detail in the next subsection.
We use in all cases the analyses valid at 00 h, 06 h, 12 h and 18 h, i.e., datasets with a 6-h time resolution. The assimilation procedure for MERRA and MERRA-2 uses an iterative predictor-corrector approach, generating two separate sets of reanalysis products designated "ANA" for analysis state and "ASM" for assimilated state (Rienecker et al., 2011). The latter products use a 6h "corrector" forecast centered on the analysis time and an incremental analysis update to apply the previously calculated 10 assimilation increment gradually rather than abruptly at the analysis time (Bloom et al., 1996). Thanks to this procedure, the ASM products have smaller wind imbalances than the ANA products (Fujiwara et al., 2017) hence they are preferable for tracer transport simulations. We used the ASM products in MERRA-2 but could not do so with MERRA where the ASM products are only available on constant pressure levels. Since we aim to evaluate each reanalysis on its native vertical grid, we had to fall back on the ANA product in the case of MERRA.

Pre-processing of the reanalyses
Our offline transport model ::: The :::::::: BASCOE ::::::::: Transport :::::: Model :::::::: (hereafter ::::::::: BASCOE :::: TM) : is used as a tool to perform a fair comparison of advective transport in each reanalysis data-set, using their native vertical grids but a common, low-resolution latitude-longitude grid. It requires on input the surface pressure and horizontal velocity on a so-called Arakawa C-grid, i.e., the zonal wind u must be staggered in longitude and the meridional wind v must be staggered in latitude. As indicated by its 20 name, the FFSL algorithm evaluates internally the corresponding mass fluxes and derives the vertical winds (w) from mass conservation. Hence the reanalysis datasets must be carefully pre-processed from spectral or high-resolution gridded fields to the low-resolution C-grid. We have paid special attention to this pre-processing of the reanalyses to make sure that the different types of wind fields are expressed in a consistent manner for our transport algorithm.
Our pre-processing for the five reanalysis systems is based on this algorithm, with a preliminary derivation of the spherical harmonics coefficients of vorticity, divergence and surface pressure for the reanalyses other than ERA-I. In all cases these spectral coefficients are truncated at the wavelength number :::::::::: wavenumber : 47 to avoid aliasing on the 2°x2.5°target grid (Kr-  Fig. 1). Some observational context is provided with in-situ observations of SF 6 and CO 2 (Hall et al., 1999). As done in recent intercomparisons of climate models (Neu et al., 2010;Chipperfield et al., 2014) , the AoA value at the equatorial tropopause has been subtracted from the fields in order to exclude the transit time from the surface to the tropopause.

30
For the sake of convenience the results of each simulation will be designated by its driving reanalysis but the reader is reminded that all results presented here are obtained indirectly through an offline and kinematic transport model. The outcome of the intercomparison could have been different if the AoA had been computed directly in each reanalysis system.

Mean distribution in 2002-2007
The is significantly stronger in the southern mid-latitudes and polar regions than in the northern hemisphere ::::::: Northern :::::::::: Hemisphere, and old air masses reach much lower altitudes above the Antarctic than above the Arctic (e.g., the 5-year isoline starts at 50 hPa above the South Pole and ends at 20 hPa above the North Pole). This is qualitatively confirmed by AoA derived from MIPAS observations of SF 6 for the same period (Kovács et al., 2017, Fig. 7d).
The intercomparison at 50 hPa (Fig. 4a) shows again the important disagreements between the five model simulations.
ERA-I delivers a weakly positive trend over the period 1989-2015, apparently in agreement with the balloon observations, but Engel et al. (2009Engel et al. ( , 2017 showed that the sign of this observational trend is not significant. More importantly, the AoA from the We now perform a quantitative investigation of the temporal variations in order to derive the amplitudes of periodic variations and the linear trends of AoA at all latitudes and pressure levels, including their uncertainties.

Methodology
Vigouroux et al. (2015) used a multiple linear regression model to study the trends of ozone total columns and vertical distribution at several ground-based stations. Here we apply the same tool to A(t), the monthly zonal means of AoA as a function 10 of time, latitude and pressure (after interpolation to a constant log-pressure grid with 2km increments). The multiple linear regression model is expressed as: where t is time, A 0 is the baseline value, A 1 is the annual trend of AoA and (t) represents the residuals. The term S(t)
We now investigate the differences in the QBO among all reanalyses. Kawatani et al. (2016) have compared the monthlymean zonal wind in the equatorial stratosphere between ::::: among reanalyses and found that their degree of disagreement depends 30 on latitude, longitude, height, and the phase of the QBO. They also noted a tendency for the agreement to be best near the longitude of Singapore, suggesting that the Singapore observations act as a strong constraint on all the reanalyses.
Here we perform an intercomparison of the amplitude of the QBO signal (in years) in each reanalysis. We approximate it again as the difference between the maximum and minimum values reached by the term Q(t) in the linear regression model. Our  Fig. 11 and 10, respectively) .
In the polar regions, H2015 showed large and positive trends while they are insignificant according to our model (Fig. 11).
Our transport model, using only wind fields and surface pressure from ERA-I, confirms this finding.
Many intercomparisons of reanalyses have focused on the instantaneous values or long-term evolution of direct output fields such as temperature or zonal winds (Simmons et al., 2014;Lawrence et al., 2015;Long et al., 2017;Kozubek et al., 2017). 25 These intercomparisons do not find large discrepancies, especially after the introduction of new satellite instruments around year 2000. The large disagreements obtained here may look surprising unless one considers :: be ::::::::: explained :: by : the lack of wind observations available for assimilation in the tropics, high latitudes and stratosphere (Baker et al., 2014). This deficiency of wind information may explain ::::::: explains : the divergences between trajectories obtained with different reanalyses in the lower stratosphere, e.g., in the equatorial region during some phases of the QBO (Podglajen et al., 2014) or above the Antarctic 30 during the vortex break-up season (Hoffmann et al., 2017). Such divergent trajectories could have a significant cumulative impact on the mean Age of Air because it is a time-integrated diagnostic spanning several years.
Since the wind fields are weakly constrained, the causes for the disagreements found here may lie in the differences between the underlying models which were summarized recently in the context of S-RIP (Fujiwara et al., 2017). Let us first look at vertical resolution, which has an important impact on the modeling of lower stratospheric dynamics (Richter et al., 2014). In the lower stratosphere, the vertical resolution of CFSR is finest while the resolution of and ERA-I and JRA-55 is the coarsest, with the resolution of MERRA and MERRA-2 in between (Fujiwara et al., 2017). This has no clear impact on AoA since CFSR 5 and JRA-55 deliver the youngest AoA while the MERRA and MERRA-2 deliver the oldest, with ERA-I results in between.
Hence one cannot establish a simple link between vertical resolution and AoA in this intercomparison.
While this has an important impact on temperatures in the upper stratosphere and lower mesosphere, it does not seem to have an impact on the AoA time series in the middle stratosphere (Fig. 8) and cannot explain the large values obtained during the 1990's ::::: 1990s. With respect to forward model forcings, MERRA-2 is the only reanalysis which includes a large source of non-orographic gravity wave drag in the tropics (Molod et al., 2015) and realistic aerosol optical depths. This last feature most probably :::: likely : explains the sensitivity of the MERRA-2 AoA at 50 hPa to the Pinatubo eruption, which cannot be seen with any other reanalysis (Fig. 6).

5
Yet the impact of the Pinatubo eruption on MERRA-2 AoA at 50 hPa cannot be seen in the northern midlatitudes, and in the southern midlatitudes it is not larger than the amplitude of seasonal variations. In section 3.2 we could not find any influence of volcanic aerosols at the global scale (Fig. 7), contrarily :::::: contrary : to recent results obtained by Diallo et al. (2017) using the Chemical Lagrangian Model of the Stratosphere (CLaMS) driven by ERA-I and JRA-55. While CLaMS is a Lagrangian transport model and BASCOE CTM ::::::: including :: a :::::: mixing ::::::::::::: parametrization :::::::::::::::::::::: (Konopka et al., 2004) and :::::::: BASCOE :::: TM : a Eulerian 10 transport model, we believe :::::: suggest that these conflicting results are better explained by the different approaches with respect to vertical transport: BASCOE CTM ::: TM : is a kinematic model (see section 2.1) while CLamS is a diabatic transport model, hence also driven by the heating rates from the reanalysis forecast models (Ploeger et al., 2010(Ploeger et al., , 2015b. Wright and Fueglistaler (2013) have shown that the heat budgets differ significantly in the Tropical Tropopause :::::: tropical ::::::::: tropopause layer among the reanalyses, with substantial implications for representations of transport and mixing in this region. Future work will also involve the disentangling of the contributions to AoA of :: the : residual circulation, mixing on resolved scales and mixing on unresolved scales (i.e., : diffusion) as recently performed with ERA-I (Ploeger et al., 2015a;Dietmüller et al., 2017) and quantitative comparisons with observational data-sets, using both MIPAS observations of SF 6 (Stiller et al., 2012;Haenel et al., 2015) and balloon observations of SF 6 and CO 2 (Ray et al., 2014). Comparisons with long-term records of 25 other long-lived tracers will provide further insight at multidecadal scales. A recent study by Douglass et al. (2017) explained that the relationship between AoA and the fractional release of such tracers is a stronger test of the realism of simulated transport than the simple comparisons of mean age distributions. This approach seems very promising not only in the context of S-RIP but also for observation-based evaluations of stratospheric transport in global circulation-chemistry models.
6 Summary and conclusions 30 We have developed a pre-processor to feed a Eulerian and kinematic transport model with any of the available global reanalysis datasets. This has allowed us to compute the mean Age of Air (AoA) in the stratosphere and its evolution from 1985 to 2015, according to five modern reanalyses: ERA-Interim, JRA-55, MERRA, MERRA-2 and CFSR. Our results compare well with those published previously using other transport models driven by ERA-Interim and MERRA-2.
The five reanalyses deliver very different and diverse results. In the middle and upper stratosphere, MERRA yields the oldest AoA (~5-6 years at mid-latitudes) and JRA-55 the youngest one (~3.5 years). MERRA-2 provides a different distribution of latitudinal and vertical AoA gradients than any other reanalysis, with near-zero vertical gradients in the middle stratosphere 5 which do not seem ::: are ::: not : supported by observations. CFSR and ERA-I give the most similar AoA distributions, with the latter providing stronger gradients vertically in the middle stratosphere and latitudinally in the southern hemisphere ::::::: Southern :::::::::: Hemisphere. The relative differences between ERA-I and the four other reanalyses are largest in the lower tropical stratosphere.
Tropical ascent rates have been compared through the difference between AoA in the northern mid-latitudes and in the tropics, showing good agreements between all reanalyses except for MERRA-2 and an overestimation of the upwelling in the tropical 10 lower stratosphere.
The time variations of AoA were studied first through a qualitative analysis of raw time series in the mid-latitudes, then through a fit with a multiple linear regression model. While the linear trends vary considerably depending on the considered period (2002-2012, 2002-2015 or 1985-2015), the general hierarchy of "older" (MERRA, MERRA-2) and "younger" ( (Engel et al., 2017). The spread between the five simulations is as large as the observational uncertainties, highlighting again the importance of the disagreements between the five reanalyses. The AoA using ERA-I does 20 not show any overall trend after 2000, unless one arbitrarily ends the period in 2010.
The main conclusion of this study is the significant diversity in the distribution of mean AoA which we obtain with our transport model, depending on the input reanalysis. This casts doubt on our ability to model accurately the time necessary for variations of greenhouse or ozone-depleting species to propagate from the troposphere to the stratosphere. We have also found 20 large disagreements between the five reanalyses with respect to the long-term trends of age of air. This suggests that with our type of offline transport model, the wind fields in modern reanalyses are not sufficiently constrained by observations to evaluate the actual changes of stratospheric circulation. Yet this conclusion should not be hastily extended to other types of transport models which also use the reanalyses of temperature and heating rates.