Interactive comment on “Three-dimensional variations of atmospheric CO2: aircraft measurements and multi-transport model simulations”

The investigators use 4 atmospheric transport models (3 online and 1 offline) driven by 2 different datasets of surface CO2 flux (Flux 1 and Flux 2). The outputs from the different numerical experiments are then compared to aircraft measurements taken in the mid to upper troposphere as part of the CONTRAIL project. I find the paper informative, and it contributes to a further understanding of the behavior of the various atmospheric transport models we use to interpret atmospheric CO2 in terms of carbon sources and sinks. I recommend acceptance with major modifications.

Apart from the model-observation comparison of the CO2 distribution for different areas and seasons, the authors conclude that the contrast between the northern hemispheric sink and the tropical source needs to be larger than that found in previous studies using extensive aircraft observations. I recommend publication only after some major revisions of the manuscript, taking into account the comments below.
We are grateful for the time you took to review our manuscript. We also appreciate your many fruitful comments and suggestions. Our replies to your comments are given below. Please note that all page and line numbers in our replies refer to the amended manuscript.

Main comments:
The main finding of a contrast between tropical source and NH sink that is substantially larger than what was found by Stephens et al. 2007 (in the following referred to as S07) needs to be better substantiated. The authors speculate on the source-sink distribution based on simulations with four different transport models and two different flux distributions and on the comparison with CONTRAIL data. One of the flux estimates used is a bottom-up estimate not generally consistent with the atmospheric constraint, and the other is based on inverse estimates valid for a different time period. One way to improve this would be to use actual inverse estimates based on the different models, following an approach similar to the one used by S07. This would make the supposed differences from the S07 study clearer: are the transport models so different from those used in S07, or are the CONTRAIL data inconsistent with the regular profile data used in S07?
In the present study, we cannot say that our models are largely different from those used in S07. We are also unable to judge whether the CONTRAIL data are inconsistent with the profiles of S07, because their measurement periods do not overlap, as the reviewer has pointed out. Our primary aim is to examine model-model differences in the three-dimensional CO2 variations and, using the CONTRAIL data, to assess how significant those differences are relative to the observed variations; the flux interpretations came as a by-product of the comparisons. S07 used the inversion fluxes derived from their models; therefore they did not validate model transport but validated the inversions by comparing the vertical profiles. Transport difference is the key issue in this study; therefore we used common fluxes instead of inverted fluxes as in S07.
Nevertheless, as the reviewer pointed out, our manuscript tended to highlight the north-tropics contrast of the surface CO2 flux. Therefore, we changed sentences to highlight the model transport differences and what we can learn from the result.
The changed sentences are as follows.
We changed the last part of the Abstract as follows.
"The models consistently underestimated the north-tropics mean gradient of CO2 both in the free-troposphere and marine boundary layer during boreal summer. This result suggests that the north-tropics contrast of annual mean net non-fossil CO2 flux should be greater than 2.7 Pg C yr⁻¹ for 2007."
=> [Page 2, lines 5-8] "In summer season, differences in latitudinal gradients by the fluxes are comparable to or greater than model-model differences even in the free troposphere. This result suggests that active summer vertical transport sufficiently ventilates flux signals up to the free troposphere and the models could use those for inferring surface CO2 fluxes."
We modified sentences in the Introduction and added new sentences there.
[From Page 2 line 16 to Page 3 Line 4] "The TransCom3 models showed large differences of the so-called rectifier effect (Denning et al., 1996). … However, our understanding of global-scale CO2 distributions in the FT remained limited"
In Section 3.4, we added some words as follows.
"which suggests that active summer vertical transport ventilates some significant flux signals up to FT" => [Page 16, "which suggests that active summer vertical transport ventilates some significant flux signals up to FT and those could be captured by the models."
In Section 3.4, we deleted the sentences in the last paragraph as below.
"The annual mean of the observed north-tropics gradients at 5-6 km and in MBL are, … we might be unable to reproduce the observed north-tropics mean gradient of CO2 for 2007."
We also moved the sentence "the possibility of a stronger … is well simulated by the models" to after "are during boreal summer". [Page 17,
In the Conclusions, we replaced "new" with "some" [Page 17, line 22] and changed the last sentences as follows.
"From comparison of latitudinal profiles in FT and MBL, … contrast of net non-fossil CO2 flux for 2007 as greater than 2.7 Pg C yr⁻¹" => [Page 18 lines 2-7] "From comparison of latitudinal gradients in FT and MBL, we found that the differences by the fluxes are comparable to or greater than model-model differences in summer. It suggests that active summer vertical transport ventilates some significant flux signals up to FT and those could be captured by the models. On the other hand, the model-model difference is much greater than the differences by the fluxes, suggesting that the transport model uncertainty is predominant during boreal winter."
Using fluxes from a different time period is not state of the art, as interannual variations are likely to happen (see e.g. results from different inversions on the webpage carboscope.eu). Also the flux distribution might change on interannual time scales. Just saying that both periods were similarly affected by ENSO is not sufficient.
We agree with you that interannual variations of CO2 flux are significant. We have seen inversion results ranging from extremely high interannual variability (IAV) in fluxes (Patra et al., GBC, 2005a,b) to low IAV (Rödenbeck et al., ACP, 2003)
We thank the reviewer for the helpful suggestion to add a table describing the models and their characteristics. According to the suggestion, we added the following table and also added "(Table 2)" after "the simulations" in the first sentence of Section

[Page 5, line 13]
Table 2. List of the transport models and their fundamental characteristics. * SF6 gradient is defined as the difference between the annual mean concentrations of the two hemispheres. $ Vertical radon gradient is defined as the difference of the global July-August-September (JAS) mean concentrations between 300 and 850 hPa. & Vertical CO2 gradient is defined as the difference of JAS mean concentrations between 850 hPa and 500 hPa in the northern hemisphere.

Model
We think that the horizontal wind data are not a key issue for the three-dimensional structures of tracers and that vertical transport, most of which is parameterized by numerical schemes, is more important for the model differences in the free troposphere.
According to the comment, we added the sentence below at the end of the first paragraph of Section 2.2.
[Page 5, lines 18-24] "Three of the four transport models are nudged with the same JRA-25/JCDAS horizontal winds. However, the choice of meteorological reanalysis product has less influence on the quality of model simulations as seen in the TransCom continuous experiment (Law et al. 2008; Patra et al. 2008), which also demonstrated sizeable differences between models driven by same reanalysis. Three-dimensional tracer distributions are largely influenced by sub-grid scale parameterized vertical transport of turbulent mixing and cumulus convection."
The separation of transport uncertainty and flux uncertainty is a difficult task, but is key to making inferences about the flux distributions. The way this is handled in the manuscript is not very convincing. Using fluxes that are not consistent with surface observations (i.e. not inverted fluxes), the authors make statements about the impact of different fluxes vs. the impact of different models on the modeled CO2 distribution. With two flux distributions that are sometimes different in certain regions, and sometimes not, this can lead to wrong conclusions. Inferring the impact of transport uncertainty from the differences between the different transport models is also not appropriate, given the small number of models and their lack of independence. Also regarding these issues, the paper would benefit from using inverse estimates based on the different transport models for the specific period.
We agree with you that separating the model transport and surface flux errors that contribute to the model-observation mismatch is a difficult task. However, it has to be done, and this is our first attempt towards that. We hope the reviewers will agree that some of the model-observation mismatches shown in the work clearly do not arise from flux error (e.g., the large model-model difference of the latitudinal mean profile for JFM) and some do arise from flux error (e.g., the offset in profiles over IND at all altitudes). Although we used a small number of models, their vertical transport schemes differ largely from each other as described above, and consequently the models showed large differences in the CO2 concentrations. Therefore, we consider that the models are almost independent of each other. As the reviewer pointed out, the two flux distributions might sometimes be different in certain regions and sometimes not. However, it is too difficult to investigate all the possibilities of the flux uncertainty. Nevertheless, we consider the two fluxes we used to be quite different from each other, because one is completely neutral for terrestrial areas and the other is not. Moreover, in the manuscript, we always refer to the flux difference for the specific area when we discuss contributions of the flux difference to the concentration difference. We think this is a fair approach. Both fluxes (CASA and TransCom) are in fact the result of model parameter or regional flux optimization with a transport model using climatological CO2 observation data, and it is not surprising that some of the participating models demonstrated a reasonably good match with observations using the fluxes before accepting those for the intercomparison.

P12809 L24: same for "top-down"
As suggested, we deleted "top-down/".
No, it was not an artifact of the plotting program; it was due to wrongly output data at 850 hPa in MJ98-CDTM. We modified Figure 2.
According to the suggestion, we changed the figures to those for JAS and added ones at 850 hPa. By this change, the global totals were also changed. In the previous manuscript, we wanted to compare the radon results with the previous studies (Mahowald et al. 1997, Jacob et al. 1997, Dentener et al. 1999), which showed results for JJA.
That is why we showed radon for JJA.
In the revised manuscript, we changed sentences and added new ones to describe 850 hPa radon as follows.
"the simulated radon concentrations at 300 hPa and 500 hPa for June-July-August
"Those radon distribution patterns generally resemble those described in previous reports (Mahowald et al., 1997; Jacob et al., 1997; Dentener et al., 1999). Compared to those studies, the amounts of radon concentration in the upper troposphere (Fig. 3a)
"Compared to earlier studies (Mahowald et al., 1997; Jacob et al., 1997; Dentener et al., 1999), the June-July-August radon concentration at 300 hPa in the upper troposphere simulated by ACTM, NICAM-TM, and NIES are somewhat on the larger side and that by MJ98-CDTM is on the smaller side (not shown)."

P12815 "averaged correlation coefficients of each vertical profile between the observation and the model mean" this is somewhat unclear. How was this correlation calculated?
We originally derived the average correlation coefficients by simply averaging 144 model-observation correlations (4 models × 4 seasons × 9 regions). However, we have realized that a simple average of correlation coefficients over a number of samples does not represent an "average correlation". Therefore, we changed the way the average correlation coefficient is derived. We apologize for our misunderstanding. In the revised manuscript, correlation coefficients are transformed into Fisher's z prior to averaging, and the average coefficient is derived by back-transforming the averaged z. Therefore, the coefficient values are different from those in the previous manuscript. In addition, we checked their significance. We also changed the wording "averaged correlation" to "average correlation".
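The averaging procedure described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used for the manuscript; the function name `average_correlation` and the sample values are hypothetical.

```python
import math

def average_correlation(rs):
    """Average correlation coefficients via Fisher's z transform.

    Each r is mapped to z = atanh(r), the z values are averaged,
    and the mean is mapped back with tanh. Every r must satisfy
    |r| < 1, because atanh is undefined at +/-1.
    """
    zs = [math.atanh(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))

# Hypothetical sample correlations (illustrative values only).
sample = [0.90, 0.50, 0.80, 0.95]
naive_mean = sum(sample) / len(sample)    # simple arithmetic mean of r
fisher_mean = average_correlation(sample)  # mean taken in z space
print(naive_mean, fisher_mean)
```

Because atanh stretches values near ±1, the back-transformed mean gives more weight to strong correlations than the arithmetic mean does, which is why the revised values differ from the earlier ones.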
We changed sentences and added new ones as follows.
[Page 10, lines 13-18] "average correlation coefficients are 0.83 and 0.85 (significant at 95 % confidence level), respectively, for the results obtained using Flux1 and Flux2. Hereafter, we use average correlation coefficients to check the compatibility between the models and the
[in the caption of Figure 5] => "The error bar indicates the variation of the instantaneous data, derived by averaging the standard deviations of the instantaneous data within each grid at each level."
P12816 L22, also P12817 L2: The fact that changing fluxes has an impact on the gradient (which is not surprising) does not prove flux uncertainty to be a significant cause of too small gradients. It could still be dominated by transport uncertainty.
Especially over the IND region, the authors should consider potential transport pathways that could lead to a mismatch. Given that there is significant convective activity, aircraft profiles will also not be unbiased, as they will avoid strong convective cells.
For the northern areas, the evidence is that ΔCO2 simulated by MJ98-CDTM, which has the weakest vertical mixing as shown in Fig. 4, differs more from the observed one in the FT. As the reviewer pointed out, those sentences are not so clear.
Therefore, we made them clearer and discussed the issue more carefully as follows.
[ 2011). Furthermore, despite the large range of cumulus convection schemes in the models, all the models consistently overestimated ΔCO2."
Also, we changed the subsequent "This fact" to "Those facts".
Actually, it is difficult to investigate the so-called clear sky bias in the aircraft data.
However, the figure shows that the simulated ΔCO2 is underestimated not only at high or low levels, but at all levels including the bottom and top of the convective cells. That is why we do not consider the clear sky bias here.
P12817 L8: "12.5gC/m^2" is not a unit for fluxes. Also, only knowing the difference but not the magnitude of either flux1 or flux2 makes this hard to interpret for the reader.
We appreciate your pointing out the mistake. We changed the sentences as follows.
We deleted ", which has a 12.4 g C m⁻² stronger net sink than Flux2 in IND for JAS, " and added the sentence below.
[Page 12, lines 7-8] => "In fact, the time-integrated amount of Flux1 in IND for JAS (-3.12 g C m⁻²) is much smaller than that of Flux2 (9.32 g C m⁻²)."
P12818 L12: Again, only flux errors are discussed. Can it be excluded that there is a problem with transport in the models?
According to the comment, we removed the whole second paragraph of Section 3.2.3 and the sentence "Another notable … dry atmosphere condition." in the Conclusions. In revising the manuscript, we realized that it is difficult to know the causes of the model-observation mismatches and that those sentences went too far.
P12818 L24: The representation of fossil fuel emissions in the vicinity of many of the profiles should be assessed with more care. Has a selection been made on the data for wind direction? Is this only a problem for SSA, not for any of the other regions?
The reviewer's comment is completely correct. Representation errors of the fossil fuel emission might exist in other places, but it is very difficult to check all of them, and we could not find such errors so clearly in other places. The bias over Jakarta is clearly noticeable and has been confirmed by the sensitivity test with EDGAR4. We did not select data by wind direction. In order to investigate such errors further, we should use a local high-resolution model instead of a global model as used in this study. Therefore, this is left for a future study.
According to the comment, we changed the last part of Section 3.2.3 as follows.
[Page 13, line 11] "Probably, " => "One possible cause is that"
[Page 13, lines 15-18] newly added => "The representation error of the fossil fuel emission might exist in other places, but we could not find such kind of errors so clearly in the global models. We here show the error over Jakarta as a typical case."
P12821 L25: How was the correlation determined? A correlation coefficient of 0.7 suggests 50% explained variance. The double minimum feature in the FT is not really well captured by the models, and probably does not contribute much to the variance.
The correlation is derived from the correlation coefficients of the 4 models in the same way as for the vertical profiles. We confirmed that the correlation is significant at the 95 % confidence level, and we consider that the double minimum feature is reasonably simulated by the models except MJ98-CDTM. The minima are in April and September-October.
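For reference, the explained-variance figure quoted in the comment follows from the standard relation between a correlation coefficient r and the fraction of variance explained by a linear fit, r². This is a worked check only and is not taken from the manuscript:

```latex
r = 0.7 \quad\Rightarrow\quad r^{2} = 0.49 \approx 50\,\%
```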
P12822 L2: Why is "CO2 variation ... intruding towards the south ..." a reason for models to underestimate the seasonal amplitude? Which is the process that is not properly represented in the models?
The process that is not properly represented in the models is the seasonal amplitude in the northern hemisphere. The northern seasonal variations are transported into the southern hemisphere via the tropical upper troposphere; therefore, the seasonal amplitude over the tropics is also underestimated. According to the comment, we changed the sentences as follows.
[Page 16, lines 3-5] => "This is probably because the seasonal CO2 variation in the northern hemisphere, whose amplitude is underestimated by the models, intruded towards the south via the tropical upper troposphere."
P12822 L5-8: It should be mentioned that the observed latitudinal gradients are larger than the simulated gradients at both altitudes.
In fact, the observed latitudinal gradients are not necessarily larger than the simulated gradients at both altitudes. Please see the values in Table 7. The gradient of MJ98-CDTM is smaller than that of the observation at both levels. In Figure 9, the observed ΔCO2 is larger than the modeled ones in the NH, but this tendency would change if we chose an offset site other than Minamitorishima.
P12822 L15: add "transport" before "model uncertainty"
We added "transport" following the comment. [Page 16, line 17]
P12822 L5-16: When assessing impact from flux vs. impact from transport, the difference in fluxes (Flux2-Flux1) is important and needs to be mentioned. For this, Table 1 should be augmented to include the JFM and JAS flux budgets in addition to the annual totals.
We thank the reviewer for the helpful suggestion. As suggested, we added JFM and JAS flux budgets in
=> "Despite comparable north-tropics difference of carbon budgets between Flux1 and Flux2 to that for JFM (ca. 4 Pg C yr⁻¹, see Table 1), the simulated latitudinal gradients are considerably changed by the fluxes."
P12822 L23: There seems to be an inconsistency: Table 6 shows observed gradients of 4.7 ppm in the MBL, not 5.2 ppm.
We appreciate your finding the mistake. We changed "5.2" to "4.7". [Page 16, line 28]
P12822 L26: Only under the assumption of perfect transport.
We agree with the reviewer that the sentence is not appropriate here. Therefore, we deleted "Actually, the north-tropics CO2 gradient should reflect the flux contrast between the two areas. Therefore,".
P12823 L1: To say that there is no strong impact by uncertainty in the modeled vertical transport is I think going too far.The latitudinal gradients are indeed underestimated at both altitudes, but to a quite different degree, especially near 20-40 N.
According to the comments, we deleted "It is noteworthy that this discussion is not affected strongly by model uncertainty for vertical transport because the model underestimation of the north-tropics mean gradient is consistent both in FT and MBL".
Our models have large differences in vertical transport, as shown in the simulations of SF6, radon and CO2. Despite the large differences, all the models underestimated the north-tropics gradient at both altitudes. Therefore, we consider that flux uncertainty is the cause of the mismatch. As you pointed out, the degree of underestimation of the latitudinal gradient does differ between the two altitudes. However, we consider that this comes from the fact that the CONTRAIL data are more likely to be affected by terrestrial fluxes, which is discussed in the manuscript.
P12823 L1: The speculation about the location and magnitude of the fluxes should be tested by adjusting the fluxes in the simulation, or better by using inverted fluxes based on the different transport models.
As described above, we believe that using the common flux is more appropriate for our study than using different inverted fluxes.
In order to make the manuscript simpler and more understandable, we modified the figure and table captions and replaced "seasonal mean variation" with "monthly mean variation" in the manuscript.
Because the comments by the reviewers were helpful in revising the manuscript, we would like to add the following sentence to the acknowledgement.
=> "We also thank the anonymous reviewers for their valuable comments on this manuscript."
The manuscript describes comparisons between transport model simulations and observations of CO2 from commercial airliners made within the CONTRAIL program. Four different transport models are combined with two different flux estimates for CO2.

Further, it remains unclear which flux year was used: was it an average of 1999-2001, or a specific year?
Yes, we used an average flux of 1999-2001. According to the comment, we added the following sentence. [Page 5, lines 1-3] "This inversion flux is derived as an average of 1999-2001, when no strong El Niño or La Niña was experienced. Therefore, Flux2 can be considered as a near climatological inversion flux."
available inversion-optimized fluxes such as Carbontracker, Jena or LSCE. TransCom flux is based on an ensemble of transport models, which is intended to alleviate particular model biases, and CASA flux has a diurnal cycle tested in a previous study (TransCom continuous). According to the comment, we changed the sentence in the Introduction as follows. "we used two datasets of surface CO2 flux to evaluate the relative contributions of flux uncertainty" => [Page 3, lines 28-29] "we used two datasets of surface CO2 flux to evaluate the relative contributions of one possible flux uncertainty"
Technical comments: [Page 3, line 3]
P12807 L26: please replace the Xueref-Remy 2010 ACPD reference by the ACP version that appeared recently
We replaced Xueref-Remy 2010, ACPD by the ACP version accordingly. [Page 3, line 9] "2010" => "2011" and the reference was replaced by "Xueref-Remy, I., Bousquet, P., Carouge, C., Rivier, L., and Ciais, P.: Variability and budget of CO2 in Europe: analysis of the CAATER airborne campaigns - Part 2: Comparison of CO2 vertical variability and fluxes between observations and a modeling framework, Atmos. Chem. Phys., 11, 5673-5684, 2011."
P12808 L9: What do the authors mean by "ecophysical"?
As the reviewer pointed out, it is unclear. Therefore, we changed "ecophysical" to "geographical". [Page 3, line 20]
P12808 L14: I would suggest to replace "Because the multi-model framework" by "As a multi-model framework" to make the sentence more clear
According to the suggestion, we replaced "Because the multi-model framework" by "As a multi-model framework". [Page 3, line 25]
P12809 L9: "bottom-up" should be introduced (or dropped)
Fig. 2 (b): Orange line (CDTM) shows strange increase at the beginning of the time series shown. Is this an artifact of the plotting software?

Fig 3+4: Why are CO2 differences shown for JAS, while in Fig. 3 radon is shown for JJA? It would be helpful to have the same periods for Rn and CO2. Also it would be interesting to also have radon at 850 mbar for comparison.

radon concentrations at 850 hPa, 500 hPa and 300 hPa for July-August-September (JAS)."
"radon concentrations at 300 hPa (a) and 500 hPa (b) for June-July-August (JJA)" => [in the caption of Figure 3] "radon concentrations at 850 hPa (a), 500 hPa (b) and 300 hPa (c) for July-August-September (JAS)"
"global averages of the radon mole fractions are 5.02, 1.84, 4.09, and 4.84×10⁻²¹," => [Page 9, lines 8-9] "global averages of the radon mole fractions are 4.78, 1.64, 3.96, and 4.64×10⁻²¹,"
[Page 9, lines 14-15] "the smaller range of 3.23-4.36×10⁻²" => "the smaller range of 3.26-4.27×10⁻²"
[Page 9, lines 17-20] newly added => "At 850 hPa, NICAM-TM and ACTM (uses Arakawa-Schubert type cumulus convection schemes) show relatively low concentration compared to those of MJ98-CDTM and NIES (uses Kuo type scheme). These differences suggest the transport model properties are diverse and suitable for transport model inter-comparison experiment."
by ACTM, NICAM-TM, and NIES are somewhat on the larger side and those by MJ98-CDTM are on the smaller side." => [Page 9, lines 21-24]
Fig 5, caption: "mean standard deviation" should be explained. Is it the standard error of the mean delta-CO2 shown at each vertical bin? If those horizontal lines were taken seriously as error bars, most vertical gradients as well as model-measurement differences would obviously be insignificant.
The error bars do not represent the error of the mean profile, but represent the magnitude of the variation of the instantaneous data (not averaged at all) at each level. The values of the error bars were calculated by averaging the standard deviations of the instantaneous data. According to the comment, we changed the caption of Figure 5 as follows.
[Page 11, lines 9-20] => "Most simulated vertical gradients from PBL to FT are smaller than the observed ones for JAS, except EAS. One probable cause is a deficiency of the model vertical mixing. Actually, Stephens et al. (2007) reported that the TransCom3 models have overly strong vertical mixing from PBL to FT during boreal summer. In this comparison, however, weakening vertical mixing might not improve the results because ΔCO2 simulated by MJ98-CDTM, which has the weakest vertical mixing as shown in Fig. 4, is more largely different from the observed one in the FT. Therefore, although the possibility of transport processes other than vertical mixing causing the model-observation discrepancies cannot be ruled out, we consider that flux uncertainty is significant to the simulated PBL-FT gradients. It is because the PBL-FT gradients were changed greatly by selection of the surface flux for JAS. To investigate transport uncertainties further, we should compare the simulated radon results with vertical radon observations (if available) but this is left for the future work."
For the IND region, we added sentences discussing the transport processes and the transport uncertainties as follows.
[Page 11, lines 26-31] => "During boreal summer, a strong anticyclone circulation confines surface flux signals over the Indian continent preventing from mixing with surrounding air masses in the upper troposphere. Therefore the observed CO2 concentrations up to the upper troposphere predominantly represent the surface flux on the Indian continent (Patra et al.
replaced Patra et al. ACPD (2011) by Patra et al. ACP (2011).
Table 1 as follows.