Intercomparison of Middle Atmospheric Meteorological Analyses for the Northern Hemisphere Winter 2009-2010

. Detailed meteorological analyses based on observations extending through the middle atmosphere (~15 to 100 km altitude) can provide key information to whole atmosphere modelling systems regarding the physical mechanisms linking day-to-day changes in ionospheric electron density to meteorological variability near the Earth’s surface. However, the extent to which independent middle atmosphere analyses differ in their representation of wave-induced coupling to the ionosphere is unclear. To begin to address this issue, we present the first intercomparison 20 among four such analyses, JAGUAR-DAS, MERRA-2, NAVGEM-HA, and WACCMX+DART, focusing on the Northern Hemisphere (NH) 2009-2010 winter, which includes a major sudden stratospheric warming (SSW). This intercomparison examines the altitude, latitude, and time dependences of zonal mean zonal winds and temperatures among these four analyses over the 1 December 2009 – 31 March 2010 period, as well as latitude and altitude dependences of monthly mean amplitudes of the diurnal and semidiurnal migrating solar tides, the eastward 25 propagating diurnal zonal wave number 3 nonmigrating tide, and traveling planetary waves associated with the quasi-5 day and quasi-2-day Rossby modes. Our results show generally good agreement among the four analyses up to the stratopause (~50 km altitude). Large discrepancies begin to emerge in the mesosphere and lower thermosphere owing to (1) differences in the types of satellite data assimilated by each system and (2) differences in the details of the global atmospheric models used by each analysis system. The results of this intercomparison provide initial estimates of 30 uncertainty in analyses commonly used to constrain middle atmospheric meteorological variability in whole atmosphere model simulations . how variability near the Earth’s surface affects the state of the coupled thermosphere/ionosphere to 500 km on time scales from hours to months. In addition to the well-established solar and geomagnetic drivers of the T/I system, this meteorological variability can impact the performance of space-based geolocation and global communication systems, and this impact is particularly noticeable during times of reduced solar activity. Specifically, these space-based systems are affected by rapid changes in the ionospheric electron content, 55 which is determined by a complex interplay between variations in the thermospheric density, chemical composition, and circulation, particularly in the dynamo region of the thermosphere from 100 to 200 km that includes the ionospheric E and lower F regions. studying the dynamics and variability of the middle atmosphere. Examining the level of agreement among these new high-altitude systems is a first step towards understanding how whole atmosphere model simulations may be affected when constrained by different sets of meteorological input. This paper presents the first intercomparison of four analyses extending into the middle atmosphere: the high-altitude version of the Navy Global Environmental Model (NAVGEM-HA; the Whole Atmosphere Community Climate Model with thermosphere-ionosphere eXtension using the Data Assimilation Research Testbed the Japanese Atmospheric General circulation model Upper with Data Assimilation and MERRA-2. Each of these systems assimilates middle atmosphere data to varying degrees, with top output levels ranging from 80 km to ~500 km altitude. The objective of this study is to quantify the similarities and differences between these four 155 analyses. The results are useful for assessment of uncertainty in constrained or “nudged” whole atmosphere simulations arising from differences in meteorological inputs. These results can also be used to highlight where further improvements in middle atmospheric data assimilation and modelling are needed in order to improve our understanding of how meteorological variability impacts day-to-day variability in ionospheric conditions, especially during quiet Sun conditions. of subtropics. The in WACCMX+DART is smaller than the other analyses, showing a single peak of K near the Equator between 100 and 120 km altitude. corresponding WACCMX+DART o o S-40 o S) in WACCMX+DART. from larger SW2 amplitudes high latitudes o S-90 o and 60 o N-90 o N) not found in either NAVGEM-HA WACCMX+DART. For diurnal wave 3 both NAVGEM-HA and JAGUAR-DAS peak DE3 values of between o S no of a DE3

applications, such as those described above, that seek to quantify the response of the T/I system to meteorological variability in the middle atmosphere. For example, Sassi et al. (2018) demonstrated that whole atmosphere model simulations constrained with high-altitude meteorological analyses extending up to ~90 km altitude represented dayto-day variability in the lower thermosphere more realistically than simulations constrained with analyses that only extended up to ~60 km altitude, especially around the time of a major SSW. Constraining whole atmosphere models 130 by using meteorological analyses with widely varying representations of the middle atmosphere state above ~60 km altitude makes it difficult to conclusively identify and predict the physical drivers that are responsible for linking lower atmospheric meteorology to ionospheric variability.
To address the emerging need for accurate global atmospheric analyses throughout the entire middle atmosphere, high-135 altitude data assimilation and modelling systems (e.g., Pedatella et al., 2014b;McCormack et al., 2017;Koshin et al., 2020) have been developed recently to provide observations-based constraints of middle atmospheric meteorological variability for whole-atmosphere models McDonald et al., 2018;Pedatella et al., 2019). These  (Schoeberl et al., 2006) and TIMED (Thermosphere-Ionosphere-Mesosphere Energetics and Dynamics; Yee et al., 1999). Typical top levels for these new systems extend to 90 km altitude or higher, so each of these systems provides valuable resources for studying the dynamics and variability of the middle atmosphere.
Examining the level of agreement among these new high-altitude systems is a first step towards understanding how whole atmosphere model simulations may be affected when constrained by different sets of meteorological input. This paper presents the first intercomparison of four analyses extending into the middle atmosphere: the high-altitude version of the Navy Global Environmental Model (NAVGEM-HA; Eckermann et al., 2018), the Whole Atmosphere Each of these systems assimilates middle atmosphere data to varying degrees, with top output levels ranging from 80 km to ~500 km altitude. The objective of this study is to quantify the similarities and differences between these four 155 analyses. The results are useful for assessment of uncertainty in constrained or "nudged" whole atmosphere simulations arising from differences in meteorological inputs. These results can also be used to highlight where further improvements in middle atmospheric data assimilation and modelling are needed in order to improve our understanding of how meteorological variability impacts day-to-day variability in ionospheric conditions, especially during quiet Sun conditions.

160
The initial plan for this intercomparison was conceived as a follow-on study of Harvey et al. (2021) by the SPARC (Stratosphere-troposphere Processes and their Role in Climate) Data Assimilation Working Group (https://www.sparcclimate.org/activities/data-assimilation/) to examine high altitude meteorological analyses extending throughout the middle atmosphere. Due to the large computational resources needed to generate these types of meteorological 165 analyses, a detailed multi-year intercomparison is not currently within the scope of the present study. Instead, we focus on a detailed examination of the four analyses over the 1 December 2009 -31 March 2010 period, which includes a major SSW. This work is particularly interested in mesospheric wind and temperature disturbances that occur in late January (Goncharenko et al., 2013;Jones et al., 2018;McCormack et al., 2017), two weeks before the onset of easterlies in the stratosphere on 9 February (Butler et al., 2017). This Northern Hemisphere (NH) wintertime period is useful since it provides a prime example of a dramatic shift in middle atmospheric circulation that has been studied extensively through both observations and modelling studies.
The paper is organized as follows: The four meteorological analyses used in this intercomparison (NAVGEM-HA, WACCMX+DART, JAGUAR-DAS, and MERRA-2) are described in section 2. Section 3 describes the numerical 175 methods used to analyze space-time variations in the data related to specific PW and tidal features. Section 4 presents an intercomparison of the zonal mean zonal wind and zonal mean temperature data, while section 5 presents an intercomparison of the PW and tidal signatures. The results of this study are summarized, and implications for future research are discussed, in Section 6.

Data and Methods
This section provides an overview of each of the four high-altitude meteorological systems used in the present 185 intercomparison of the NH winter period extending from 1 December 2009 to 31 March 2010. Each of these systems combines a data assimilation (DA) component with an atmospheric model component that together produce global synoptic analyses of key atmospheric quantities. In the discussion below, we describe the main features of the DA and modelling systems relevant for capturing specific PW and tidal components; previous observational and modelling studies (see Section 1) have shown these PWs and tides can impact day-to-day variability in the T/I system. These 190 include the migrating diurnal and semidiurnal solar tides (referred to here as DW1 and SW2, respectively), the nonmigrating diurnal eastward zonal wavenumber 3 tidal component (DE3), the quasi-2-day wave (Q2DW), and the quasi-5-day wave (Q5DW).
For this intercomparison, we examine global gridded data sets of temperature, zonal wind and geopotential height from 195 four different systems extending throughout the middle atmosphere and in some cases (JAGUAR-DAS and WACCMX+DART) into the thermosphere. The main sources of middle atmosphere observations for these systems are retrieved vertical temperature profiles from the Aura Microwave Limb Sounder (MLS; Schwartz et al., 2008) between ~16 km and 90 km altitude and extending from 82 o S to 82 o N latitude, and from the TIMED Sounding of the Atmosphere using Broadband Emission of Radiation (SABER; Remsberg et al., 2008) instrument between ~16 km and 105 km 200 altitude with latitude coverage that alternated between its south viewing mode (83 o S-52 o N) and north-viewing mode (83 o N-52 o S) on 11 January 2010. Further details on each high-altitude analysis system can be found in the discussion below and references therein. All data used in this study is publicly available as described in the Acknowledgments section.
205 Table 1 gives overall references for each system, lists the horizontal, vertical, and temporal resolutions, gives the vertical range for the systems, and provides references for the orographic and non-orographic gravity wave parameterizations implemented in each system.

Analysis System
Reference ( List of analysis datasets used in this paper, overall references describing each system, the horizontal, vertical, and temporal characteristics of the analysis output, the model top, and references for gravity wave specifications. In the 5 th column, ORO refers to the parametrization for orographic gravity waves while NON refers to that of non-orographic gravity waves.

NAVGEM-HA
NAVGEM-HA is a research version of the U.S. Navy's operational NWP system developed for middle atmosphere 225 applications. It processes over 6 million atmospheric observations within its standard 6-hour assimilation window, consisting of surface station reports, radiosondes, and numerous operational meteorological satellites (McCormack et al., 2017;Eckermann et al., 2018). In addition to MLS and SABER temperature retrievals, NAVGEM-HA also assimilates vertical profiles of ozone and water vapor from MLS, and microwave radiances from the upper atmospheric sounder (UAS) channels of the Special Sensor Microwave Imager/Sounder (SSMI/S), as illustrated in Fig. 3a of model error covariances are equally weighted. Further details of the DA solver, including incorporation of middle atmosphere observation error and methods of bias correction between middle atmosphere satellite data sets, are provided in Kuhl et al. (2013) and Eckermann et al. (2018).

250
This intercomparison examines NAVGEM-HA zonal wind, temperature, and geopotential height fields produced with the T119L74 version of the system, where T119 refers to the triangular wavenumber truncation of the spectral forecast model and corresponds to a horizontal grid spacing of 1 o in latitude and longitude, and L74 refers to 74 vertical model levels extending from the surface to the top pressure of 6 x 10 -5 hPa. The NAVGEM-HA vertical coordinate is hybrid 255 s-p that is terrain following near the surface and transitions to isobaric above the 88 hPa level (approximately 17 km altitude). The spacing of the model's vertical levels is ~2 km in the stratosphere, ~3 km in the mesosphere, and >4 km in the lower thermosphere. Strong horizontal diffusion is applied to the top two model levels (above ~ 100 km altitude) in order to prevent numerical instabilities resulting from, e.g., spurious wave reflection. The resulting analyses near the model top are heavily influenced by this imposed diffusion. Therefore, in this study we limit our focus to altitudes 260 below 95 km geometric altitude, where previous validation studies (e.g., McCormack et al., 2017;Dhadly et al., 2018;Stober et al., 2020) have shown NAVGEM-HA to produce reliable results. The NAVGEM-HA system produces analyses every 6 hours, and these fields are supplemented by 3-hourly forecast fields produced by the system as part of the 4DVAR framework, providing an effective 3-hourly sampling rate for the extraction of tidal signatures in the horizontal wind and temperature fields.

MERRA-2
MERRA-2 temperature, geopotential height, and zonal winds are used in this study (Gelaro et al., 2017). The 3-hourly fields on the native model grid ("3d_asm_Nv";GMAO, 2015) (Gelaro et al., 2017). The MERRA-2 model component contains a stratospheric Quasi-Biennial Oscillation (QBO; Molod et al., 2015) and the MERRA-2 analysis QBO winds match well with the available radiosonde observations Kawatani et al. 2016). While MERRA-2 has an equatorial semiannual oscillation (SAO), Kawatani et al. (2020) has shown that reanalyses can differ in their representation of the SAO near the stratopause. introduction of incremental analysis update filtering to suppress generation of spurious waves; (2) a modified treatment 300 of horizontal diffusion in the JAGUAR forecast model; (3) assimilation of SABER temperature retrievals from 40-0.00014 hPa (~22 km to 110 km altitude) and the SSMI/S UAS microwave radiance measurements, described in section 2.1, from 10-0.01hPa (~30 km to 80 km). These improvements will be described in an upcoming study by Koshin et al. (2021, in review). Model error covariances were estimated from 50-member ensembles. The output from JAGUAR-DAS is 6-hourly and has horizontal grid spacing of 2.8125 o in latitude and longitude. The data assimilation capability is implemented in WACCMX using the Data Assimilation Research Testbed (DART, Anderson et al., 2009) ensemble adjustment Kalman filter (Pedatella et al., 2014b(Pedatella et al., , 2018. WACCMX+DART radio occultation refractivity in the troposphere-stratosphere, as well as Aura MLS and TIMED SABER temperature observations up to ~100 km altitude. To prevent spurious correlations, the observations are localized using a Gaspari-Cohn (Gaspari and Cohn, 1999) function with a half-width of 0.2 radians in the horizontal and 0.15 in ln(po/p) in the vertical, where p is pressure and po is surface pressure. For the present study, WACCMX+DART simulations were performed using 40 ensemble members and a six-hour data assimilation cycle. Second order divergence damping was 335 applied in order to stabilize the model as well as prevent large decreases in the O/N2 ratio and electron density in the thermosphere and ionosphere (Pedatella et al., 2018). The second order divergence damping results in tidal amplitudes that are 50-100% too small. Pedatella et al. (2020) demonstrated that the tidal amplitudes can be improved by using hourly data assimilation cycling; however, the present study makes use of existing simulations that utilized a six-hour data assimilation cycle. The WACCMX+DART six-hourly analysis fields of zonal wind, temperature, and geopotential 340 height are combined with short-term (1-5 hour) forecasts, yielding hourly output for analysis in the present study.

Space-time analysis
To quantify the various PW and tidal components in the high-altitude analyses, we employ the two-dimensional fast We also apply a continuous wavelet transform based on the S-transform method (Stockwell et al., 1996) to characterize evaluated against a spectrum of a first-order autoregressive time series with the same variance as the input temperature time series, as described in Sassi et al. (2012). The 90% and 95% confidence values are constructed based on eq. (18)
For this initial intercomparison, we examine all available output from these meteorological analyses over the altitude region from 20-120 km, with particular emphasis on the MLT region between ~50 km and 90 km altitude. Unless otherwise noted, all results are based on geometric altitude Z computed using gridded geopotential height H output by 380 each system corrected for both altitude and latitude variations in gravitational acceleration following Lewis (2007): where H is the geopotential height in meters, is latitude in degrees, g45 is the surface gravitational acceleration at 45° latitude (9.80665 m s -2 ), Re ( ) is a latitude-dependent value of Earth's radius that corrects for the combined effect of 385 gravitational and centrifugal forces, and the latitude-dependent gravitational acceleration g ( ) on the surface of an ellipsoid of revolution is given by the expression using Somagliana's constant ks = 1.931853 x 10 -3 , the Earth's eccentricity factor e = 0.081819, and the gravitational acceleration at the Equator ge =9.7803253359 m s -2 .

Zonal mean results
To begin, we examine how each of the four high altitude meteorological analyses represent the latitude and altitude dependencies of zonal mean temperature and zonal mean zonal wind averaged over the DJF period 2009-2010. The zonal mean temperature distribution from 20 km to 120 km altitude plotted in Figure 2 reflects a balance between net 395 radiative heating (driven primarily by stratospheric O3 heating and mesospheric CO2 cooling) and dynamically induced heating resulting from a thermally indirect (or residual) meridional circulation. This circulation is mainly produced by the cumulative effects of breaking PWs in the stratosphere and breaking gravity waves in the mesosphere. Similarly, the zonal mean zonal wind distributions plotted in Figure 3 from all four analysis systems also reflect this balance between radiative and dynamical drivers of the middle atmospheric circulation. Consequently, the zonal mean temperature and zonal wind distributions produced by each analysis system can depend not only on the number and quality of middle atmospheric observations being directly assimilated, but also by the physical parameterizations employed by the atmospheric model components to represent key processes (e.g., radiative heating and cooling, parameterization of sub-grid scale gravity wave drag). By characterizing similarities and differences in the zonal mean state among the four systems, we can begin to understand the relative roles that observations and model physics may Between 20 km and 50 km altitude, the zonal mean temperature distributions among all four data sets are broadly similar, exhibiting temperatures below 210 K in the equatorial lower stratosphere near 20 km altitude, consistent with adiabatic cooling in the upward branch of the Brewer-Dobson circulation, as well as in the NH winter polar night region below ~30 km altitude. Each system produces temperature maxima of ~280 K near 50 km altitude at the South However, there are important quantitative differences among the DJF zonal mean temperature distributions, most notably in the tropics from 80 km to 100 km altitude, where WACCMX+DART produces temperatures that are >20 K

425
warmer than corresponding temperatures produced by the NAVGEM-HA and JAGUAR-DAS systems. A warm bias at the tropical mesopause has been documented previously in free-running WACCM model simulations (e.g., Smith, 2012;Marsh et al., 2013;Harvey et al., 2019) but the cause is not yet fully understood. We also note that the summer polar temperature at 80 km altitude is ~20 K colder in WACCMX+DART compared to the other three data sets.

430
Also plotted in Figure 2 as heavy white contours are the corresponding temporal standard deviations of the zonal mean temperature during DJF from each analysis (see also supplemental Figure S1). All four analyses exhibit standard deviations exceeding 10 K at high northern latitudes, reflecting the relatively large amount of dynamical variability in the NH winter polar stratosphere associated with the SSW that occurred on 9 February. Large standard deviations are also noted at the summer polar mesopause, with NAVGEM-HA and JAGUAR-DAS values exceeding 10 K and

490
The results in Figs. 2 and 3 show that the largest differences occur above 80 km, where effects of gravity wave drag play an important role in determining the climatological zonal mean distributions of temperature and zonal wind in the middle atmosphere. Specific features such as the latitude and altitude dependences of the mesospheric summer easterly jet and the cold summer polar mesopause are known to be sensitive to the effects of gravity wave breaking and subsequent deposition of heat and momentum into the background (zonal mean) state (e.g., Fritts and Alexander, 2003).

495
Some of the largest differences among the standard deviations in both zonal mean temperature and zonal mean zonal wind plotted in Figs. 2 and 3 occur in the vicinity of these features, suggesting that differences in the treatment of gravity wave drag may be an important factor for explaining the large differences among the analyses above 80 km.
Indeed, Pedatella et al. (2014a) showed that gravity wave drag differences among models is related to differences in the background winds. The cause of the temperature and zonal wind differences presented here requires further 500 investigation that is beyond the scope of this initial intercomparison study.
To further examine the differences in zonal mean temperature and zonal wind distributions among the four analyses, the day-to-day variability throughout the month is relatively small. Near the Equator, the temperatures at 50 km differ by 8-10 K, with MERRA-2 and NAVGEM-HA tending to be warmer and WACCMX+DART tending to be cooler.
However, there is a very large spread (~80-100 m s -1 ) among the January mean zonal winds at 50 km within the tropics, with NAVGEM-HA exhibiting weak westerly winds at the Equator and WACCMX+DART exhibiting strong easterly winds. These differences in equatorial zonal mean zonal wind at 50 km among the four analyses are much larger than 520 the day-to-day variability indicated by the corresponding standard deviation values, suggesting a systematic bias could be present among these data sets. At NH extratropical latitudes, all four analyses produce similar mean values, and the spread among the mean results is much smaller than the standard deviations. The large standard deviations in the extratropical NH (winter) at 50 km reflect the high degree of day-to-day variability due to strong PW forcing in late January that resulted in a major SSW on 9 February.

525
In contrast to the results at 50 km, at 80 km altitude ( the NH polar mesosphere. The result is the well-documented "sudden mesospheric cooling" that accompanies most SSW events (e.g., Matsuno, 1971;Labitzke, 1972;Siskind et al., 2010;Eswaraiah et al., 2017). It has been suggested that the abrupt changes in NH (winter) polar gravity wave breaking can have consequences for SH (summer) polar 545 mesopause temperatures through changes in the pole-to-pole meridional residual circulation produced by subsequent modulation of the gravity wave drag in both the winter and summer mesosphere (e.g., Karlsson and Becker, 2016;Laskar et al., 2019;Zülicke et al., 2018). The combined effects of these SSW-related changes in mesospheric gravity wave drag produce an anomalous residual circulation with weaker upwelling in the summer polar mesopause region, and thus warmer temperatures in this region due to a reduction in adiabatic cooling. Alternatively, several case studies based on high-altitude meteorological analyses suggest that changes in mesospheric Q2DW activity may play a role in interhemispheric coupling (e.g., Siskind and McCormack, 2014;France et al., 2018;Lieberman et al., 2021). An additional mechanism was discussed in Smith et al. (2020), where changes in summer polar mesopause temperatures are a response to changes in the residual meridional circulation, with no direct role for wave activity in the summer hemisphere.

565
The relationship between winter mesospheric cooling and summer polar mesopause warming for the 2009-2010 NH winter period is examined in Figure 5, which plots the time behaviour of daily averaged zonal mean temperatures at a direct interhemispheric coupling (IHC) mechanism as described above, although we note that previous studies found the temperature response in the summer mesopause region to be relatively small, ~2-5 K (e.g., Karlsson et al., 2009a;deWit et al., 2015). Further examination of output from these analyses for other SSW cases in conjunction with modeling studies is needed to fully explore possible links between summer polar mesopause warmings and middle atmospheric variability in NH winter.  Fig. 6. Further investigation of this would require detailed momentum budget studies using specific output data (e.g., wind tendencies due to parameterized wave drag) that are not available for the present study. Making this data part of standard meteorological output fields would facilitate future investigations into the specific role that gravity wave drag plays in explaining these differences among the mesospheric zonal wind analyses.

615
To further explore the global response of middle atmospheric zonal mean zonal winds and temperatures to the occurrence of the SSW and mesospheric cooling in the NH winter of 2009-2010, we next examine the latitude/time distributions of zonal mean temperature and zonal mean zonal wind for three altitudes (50 km, 70 km and 90 km) in Figures 7 and 8, respectively, from the four analyses. Overall, we find good qualitative and quantitative agreement 620 among the zonal mean temperatures at 50 km (Fig. 7, bottom row). We note that NAVGEM-HA and MERRA-2, which assimilate MLS stratospheric O3 profiles, exhibit slightly lower peak temperatures at the South Pole compared to JAGUAR-DAS and WACCMX+DART, which do not assimilate stratospheric O3 observations. It would be of interest for future work to examine how differences in the assimilation of radiatively active chemical constituents such as O3 and H2O  winds over the Equator at 90 km altitude (Fig. 8, top row) in the JAGUAR-DAS results and the strong westerly flow in the WACCMX+DART results near 40 o S, which was also noted in the discussion of DJF average results (Fig. 3, bottom right panel). These zonal wind differences in the upper stratosphere and mesosphere are likely attributed to differences in the treatment of gravity wave drag in each system, though specific origins require further investigation, as noted above. Users of these high-altitude meteorological analyses should be aware that these differences in the zonal mean zonal winds imply that the choice of meteorological inputs may impact the results of nudged whole atmosphere simulations.
Next we explore global temperature variations during the two weeks preceding the major SSW event. Figure 9 shows latitude-altitude plots of the correlation coefficient between daily mean temperature variations at 80 o N and 30 km and (>35 m s -1 ) occur not at the higher altitudes, but at 50 km altitude during February and March 2010 (Fig. 11, bottom panel). The results in Fig. 11 indicate that these high-altitude analyses do not yet produce a consistent representation of the semi-annual oscillation (SAO) in zonal mean zonal winds in the equatorial middle atmosphere (Kawatani et al., 685 2020). The SAO is a basic climatological feature of the middle atmospheric circulation that impacts the propagation of gravity waves and tides into the mesosphere and lower thermosphere (e.g., Garcia et al., 1997). Consequently, this is an issue that will need to be addressed as these high-altitude data assimilation systems evolve.

725
In addition to zonal mean quantities, these four middle atmosphere meteorological analyses also provide valuable information on zonal variations in temperature and winds related to planetary scale waves and tides, which earlier studies based on MLS (e.g., Forbes and Wu, 2006) and SABER (e.g., Garcia et al., 2005;Zhang et al., 2006) temperature observations found to be prevalent throughout the MLT. Since each of the four analyses examined here assimilate either MLS data, SABER data, or a combination of the two, this section examines how these features are of mesospheric winds on 27 January, two weeks before the major SSW, as shown in Figure 6. We note that the quasistationary and traveling PW amplitudes are larger in WACCMX+DART relative to the NAVGEM-HA, MERRA-2, and JAGUAR-DAS results. Abrupt shifts in quasi-stationary planetary wave 1 in the Northern high latitude winter mesosphere related to SSWs have been documented in numerous studies (e.g., Smith, 2003;Manney et al., 2008;Siskind et al., 2010;Chandran et al., 2013;Koushik et al., 2020), and are linked to highly episodic sources of 740 barotropic/baroclinic instability at NH middle and high latitudes within the upper stratosphere and mesosphere (Sassi and Liu, 2014). Future studies comparing the relative roles of resolved vs. parameterized gravity wave forcing of the mesospheric circulation, as well as the representation of baroclinic/barotropic instabilities, within the four analyses could lend insight into the origins of the differences in Fig. 12, and would help to improve our understanding of this phenomenon as it relates to changes in the state of the T/I system in connection to SSWs.

745
In the remainder of this section, we present results from space-time analysis of the four analyses related to the Q5DW, Q2DW, DW1, SW2, and DE3 features. Recognizing that many other planetary wave and tidal features (e.g., Forbes et al., 2008;Sassi et al., 2012) are also important for producing T/I variability related to meteorological forcing from the middle atmosphere , the present study is not meant to be an all-inclusive assessment of every 750 feature, but rather is meant to provide an initial extension of the intercomparison study by Harvey et al. (2021) to include the mesosphere and lower thermosphere.
We begin with an examination of the Q5DW, which consists of a westward propagating zonal wavenumber 1 disturbance related to the first hemispherically symmetric normal (Rossby) mode. As shown in Harvey et al. (2021),

755
the middle atmospheric Q5DW can manifest in two forms. First, as a hemispherically symmetric feature related to latent heat release in the tropical upper troposphere (Salby, 1981;Miyoshi and Hirooka, 2003)  baroclinic/barotropic instability, leading to what is commonly referred to as the 6.5-day wave in the mesosphere and lower thermosphere (Talaat et al., 2001;Lieberman et al., 2003;Forbes and Zhang, 2017). Given the complex dynamical interactions that give rise to the Q5DW, capturing this feature is a good test for middle atmospheric meteorological analyses. Figure 13 plots altitude and latitude dependences of the Q5DW amplitude in temperature 770 during January 2010 extracted from the four analyses using the 2DFFT method described in section 2, using a bandpass for westward zonal wavenumber 1 and 0.16 -0.24 cpd (periods of 4.25-6 days). In all four analyses, the dominant Q5DW pattern is the high-latitude winter feature with peak amplitudes of 2-3 K between 60 o N and 80 o N latitude.
These amplitudes are consistent with the 5-day Rossby normal mode variation of 2.5-3.5 K derived from SABER temperature observations in the study by Garcia et al. (2005) for the March-May 2002 period; they are also consistent 775 with quasi-6-day wave amplitudes at high northern latitudes in January reported by Forbes and Zhang (2017) using 14 years of SABER temperatures. The main difference in the Q5DW amplitudes among the data sets is its vertical extent.
Both NAVGEM-HA and WACCMX+DART exhibit Q5DW amplitudes of 1-2 K at high Northern latitudes above 80 km, whereas the corresponding Q5DW amplitudes in JAGUAR-DAS are limited to below 80 km altitude. The three analyses extending above 80 km altitude also indicate weak (0.5-1 K) Q5DW amplitudes in the SH (summer) 780 extratropics that may be related to convective latent heat release.
Similar to the Q5DW, the Q2DW is a well-documented feature of upper stratospheric and mesospheric dynamics (e.g., Coy, 1979;Harris, 1994;Limpasuvan & Wu, 2003;Garcia et al., 2005;Pancheva, 2006;Lilienthal and Jacobi, 2015;Kumar et al., 2018). The Q2DW consists primarily of a westward propagating zonal wave number 3, although westward wave number 2 and 4 components are also present in both satellite observations and high-altitude meteorological analyses (e.g., McCormack et al., 2009;Tunbridge et al., 2011;Gu et al., 2013;McCormack et al., 2014). In the mesosphere, the Q2DW originates primarily from regions of baroclinic instability in the easterly mesospheric summer jet (Plumb, 1983;Pfister, 1985) that form in part by the effects of gravity wave drag (e.g., Ern et al., 2013;Sato et al., 2018). In the tropical upper stratosphere, the Q2DW can originate from regions of barotropic instability (Burks & 790 Leovy, 1986) related to inertial instability resulting from unusually strong PW activity in the winter hemisphere (e.g., Orsolini et al., 1997;McCormack et al., 2009;Lieberman et al., 2021). Both observational and modeling studies have indicated that the Q2DW, often through interaction with tides, is a significant source of day-to-day variability in the dynamics and composition of the thermosphere and ionosphere (e.g., Chang et al., 2011;Yue et al., 2012;Chang et al., 2014). It is, therefore, important that meteorological analyses used to constrain whole atmosphere simulations 795 accurately capture the Q2DW. Figure 14 plots altitude and latitude dependences of the January monthly mean Q2DW amplitude in temperature extracted from the four analyses using a bandpass for zonal wavenumber 3 and westward frequencies between 0.45 and 0.6 cpd (periods of 1.6-2.2 days). Below 80 km altitude, all four analyses show largest Q2DW amplitudes in the 800 SH along the equatorward flank of the summer easterly jet (see Fig. 3), coinciding with the region where the standard deviations in zonal mean zonal wind are largest in SH summer (e.g., Fig. 3 and Fig. S2). This spatial structure is broadly consistent with results from earlier studies based on MLS (e.g., Limpasuvan and Wu, 2003) and SABER (e.g., Gu et 75 km, in JAGUAR-DAS the peak DW1 amplitude of ~9K is located between 95 km and 100 km altitude, and in WACCMX+DART the peak DW1 amplitude of ~10K occurs near 110 km altitude. The range of altitudes for maximum DW1 amplitudes seen in these four analyses agrees with SABER observations (Zhang et al., 2006). For MERRA-2 and possibly NAVGEM-HA, the analysis system upper boundaries are low enough that artificially damping of DW1 may occur. In JAGUAR-DAS, DW1 is dissipated above ~100 km due to the model diffusion exponentially increasing with 840 height to mimic molecular diffusion. The differences in DW1 structure at/above 100 km between JAGUAR-DAS and WACCMX+DART are likely due to the large differences in background zonal mean zonal wind (Fig. 3). Deleted: 5

Moved down [1]:
There is now substantial evidence that circulation changes throughout the stratosphere and mesosphere 855 related to major SSWs can modulate the solar migrating tides (Pedatella and Forbes, 2010;Lima et al., 2012;Pedatella and Liu., 2013). Typically, the amplitude of the diurnal zonal wave number 1 (DW1) migrating solar tide is seen to decrease in the days leading up to a major SSW, followed by a pronounced increase in the amplitude 860 of the semidiurnal zonal wave number 2 (SW2) migrating tide for several days or weeks following the onset of the SSW (e.g., Pedatella and Liu, 2013;Limpasuvan et al., 2016;McCormack et al., 2017). The origins of the tidal modulation by SSWs are still under investigation, but possible causes may include transport-induced 865 changes in the distribution of ozone heating in the equatorial upper stratosphere (Goncharenko et al., 2012) and variations in zonal mean zonal winds that affect the upward propagation of the tides (McLandress, 2002;Sassi et al., 2013). In addition to migrating tides, nonmigrating tides are known to also impact T/I variability. One prominent nonmigrating feature is the eastward propagating diurnal zonal wave number 3 (DE3), that has been shown to play a role in establishing pronounced zonal variations in ionospheric total electron content (e.g., Immel et al., 2006;Hagan et al., 2007;McDonald et al., 2018). Variations in DE3 amplitude in relation to SSWs have been noted (Maute et al., 2014),

895
with non-linear wave-wave interactions within the mesosphere playing an important role in DE3 growth (Lieberman et al., 2015;Sassi et al., 2021). Figure  For purposes of constraining whole atmospheric model experiments, perhaps more important than the monthly mean amplitudes of the tides is the day-to-day tidal variability in the mesosphere and lower thermosphere captured by each of the three high-altitude analyses: NAVGEM-HA, JAGUAR-DAS, and WACCMX+DART. There is now substantial evidence that circulation changes throughout the stratosphere and mesosphere related to SSWs can modulate the solar 910 migrating tides (Pedatella and Forbes, 2010;Lima et al., 2012;Pedatella and Liu., 2013). Typically, the amplitude of DW1 is seen to decrease in the days leading up to a SSW, followed by a pronounced increase in the amplitude of the SW2 for several days or weeks following the onset of the SSW (e.g., Pedatella and Liu, 2013;Limpasuvan et al., 2016;McCormack et al., 2017). The origins of the tidal modulation by SSWs are still under investigation, but possible causes may include transport-induced changes in the distribution of ozone heating in the equatorial upper stratosphere described in Section 2. We note that the S-transform method by itself does not distinguish between eastward and westward propagating features. However, based on examination of individual 2DFFT spectra (not shown), we find that 945 the dominant spectral features associated with diurnal wave 1, semidiurnal wave 2, and diurnal wave 3 in the temperature fields at this level correspond to DW1, SW2, and DE3, respectively. To avoid edge effects commonly associated with wavelet methods, results for the first and last three days in the time period are not plotted in Figs. 18,   19, and 20. in NAVGEM-HA and JAGUAR-DAS. We note that the DW1 results from all 3 analyses plotted in Fig. 19 exceed the 95% confidence level at most latitudes.
For SW2 (Fig. 19), the peak WACCMX+DART amplitudes at 90 km are also generally less than peak values in the NAVGEM-HA or JAGUAR-DAS results. However, we note that in early February, all three high altitude data sets Deleted: Zonal averages of these amplitude spectra at 1.0 and 2.0 cpd are computed, and the resulting time varying temperature amplitudes for each wavenumber-frequency combination (zonal wave number 1 and 1 cpd, zonal wave 2 and 2 cpd, zonal wave 3 latitude distributions of the mean amplitude and ±1 standard deviation in the temperature amplitudes for the period 27 1020 January -9 February derived from the S-transform analysis. For diurnal wave 1 (Fig. 21a), all three analyses show largest amplitudes near the Equator; NAVGEM-HA and JAGUAR-DAS peak values are both ~7K, while the WACCMX+DART peak value is ~4K. For semidiurnal wave 2 (Fig. 21b) Overall, the results in Figure 21 suggest that while there is general qualitative agreement in the latitude structure of DW1, SW2, and DE3 among the three meteorological analyses extending to 90 km altitude, there are important quantitative differences. These differences are likely related to the details of each assimilation system regarding the type of observations being assimilated, the type of atmospheric model employed, and differences in the temporal and 1035 spatial resolutions of each system. For example, is it possible that the 6-hourly output of JAGUAR-DAS, which is at the Nyquist frequency for SW2, may result in some aliasing of other signals; this could potentially explain some of the high-latitude SW2 amplitudes seen in JAGUAR-DAS ( Fig. 21b) but not in either NAVGEM-HA or WACCMX+DART. We emphasize that the results in Fig. 21 are for a single altitude region (90 km). The comparisons would likely be quite different at higher altitudes where, for example, there is evidence of larger DW1 amplitudes during January 2010 in WACCMX+DART than in either NAVGEM-HA or JAGUAR-DAS. Further intercomparison of results among these (and possibly other) high-altitude meteorological analyses are needed to expand upon the initial results presented here. Nevertheless, the differences noted here in Figs. 18-21 indicate that the choice of high-altitude meteorological data set to constrain day-to-day meteorological variations in whole atmosphere models related to diurnal and semi-diurnal tides (either migrating or non-migrating) may impact the results, particularly in the equatorial regions. Thus, we advise users of these analyses to compare results to observations and/or other analyses to increase confidence. Further investigations where these types of differences are incorporated into constrained or "nudged" whole atmosphere model simulations as a source of uncertainty may be helpful to better quantify the impact of meteorological activity on day-to-day variations in the T/I system.

Summary and Discussion
Based on the results of this intercomparison among four analysis systems that assimilate middle atmospheric satellite observations, we find that there is overall good agreement in the latitude, altitude, and time behaviour of the zonal mean temperature and zonal winds up to approximately 50 km altitude during the December 2009 -March 2010 period.
This finding is consistent with the results presented in Harvey et al. (2021), which examined 10 reanalysis data sets, observations (from MLS). Also consistent with Harvey et al. (2021), we find that significant differences among the four analyses begin to emerge above 50 km altitude at low latitudes. The present intercomparison among the NAVGEM-HA, JAGUAR-DAS, and WACCMX+DART analyses shows how large inter-analysis differences can 1095 extend above 80 km altitude. As summarized in Fig. 10, the largest zonal mean temperature differences among the analyses, ranging from 10-15 K, are found near 90 km. However, we find that the largest zonal mean zonal wind differences are found not at the highest altitudes, but near 50 km altitude at the Equator (Fig. 11). This latter result highlights the fact that these middle atmosphere analyses do not currently produce a consistent description of key climatological features such as the SAO in zonal mean zonal wind near the stratopause (Kawatani et al., 2020). This is related to additional second order divergence damping that was included in the version of WACCMX+DART used for the present study, and has subsequently been removed, leading to increased tidal amplitudes in WACCMX+DART . As Fig. 21 shows, there can be as much as a factor of 2 difference in the 1115 temperature variance associated with equatorial DW1 among the analyses at 90 km altitude over the January-March 2010 period. Further study is needed to examine possible causes of the disagreement among the analyses, focusing both on the different types of middle atmospheric observations being assimilated (e.g., temperature profiles only vs. temperatures and constituents), the assimilation methods being used (e.g., 4DVAR vs. ensemble based, retrieval vs. radiance assimilation), and the details of the model physics (e.g., gravity wave drag, radiative heating 1120 parameterizations) being employed by each system.
It is important to note that this initial intercomparison is not meant to be the final word on the characteristics of these analyses, but rather a starting point. Given the extensive effort and computational resources involved in producing these data sets, a more thorough comparison over many years is beyond the scope of the present study. We also note 1125 that the systems producing these analyses are constantly evolving in order to improve both research and operational capabilities for specifying middle atmosphere conditions. Ultimately, more extensive intercomparisons that examine both seasonal and interannual variability of key middle atmospheric features (e.g., upward propagating waves and tides, SSWs and mesospheric coolings) over many years using the most recent version of the data available will be needed in the future. The aim of this study is to provide some initial insight on where efforts to improve these systems could be most useful. One area for improvement highlighted in this study is in the representation of the equatorial SAO in the upper stratosphere and lower mesosphere. This effort would be facilitated in the future by ensuring that these highaltitude meteorological analysis systems routinely save fields quantifying the parameterized sub-grid scale gravity wave drag.
To further pursue improvements in these middle atmospheric meteorological systems, a follow-on validation study is planned where independent (i.e., not assimilated) satellite and ground-based middle atmosphere observations are used to evaluate each of these data sets. Some examples of independent ground-based observations for validation of middle atmospheric analyses include mesospheric horizontal wind profiles derived from meteor radars (e.g., Stober et al., 2020) and temperature profiles from lidar (e.g., Marlton et al., 2020). Some examples of independent satellite-based observations that have been used for validation include wind observations from the TIMED Doppler Interferometer