Characterizations of Europe’s integrated water vapor and assessments of atmospheric reanalyses using more than two decades of ground-based GPS

. Ground-based Global Positioning System (GPS) has been extensively used to retrieve Integrated Water Vapor (IWV) and has been adopted as a unique tool for the assessments of atmospheric reanalyses. In this study, we investigated the multi-temporal-scale variabilities and trends of IWV over Europe by using IWV time series from 108 GPS stations for more than two decades (1994-2018). We then adopted the GPS IWV as a reference to assess six commonly-used 15 atmospheric reanalyses, namely CFSR, ERA5, ERA-Interim, JRA55, MERRA2, and NCEP2. The GPS results show that the diurnal cycles peak within 16:00-24:00 local time with peak-to-peak amplitudes accounting for 2%-18% of the daily mean. The diurnal 1-hourly anomalies can be much more intensive with a range of −100% to 200%. The annual cycles peak in July and August with maximum values of 17-32 kg m -2 . The interannual variations of IWV over Europe are found to be mainly linked to the North Atlantic Oscillation (NAO) and the East Atlantic (EA) patterns. The IWV continues to increase over 20 Europe during the last two decades at 0-0.4 kg m -2 decade -1 in the north and 0.4-1 kg m -2 decade -1 in the south. Regarding the assessments of the reanalyses, the intercomparisons with respect to GPS reveal a general superiority of the newly-released ERA5 IWV product. For instance, ERA5 only has a slight wet bias with a median value of 1%, whereas the median bias for MERRA2 is 4%. ERA5, MERRA2, and NCEP2 are the best, second best, and worst performers respectively in modelling the variability of daily IWV time series, with standard deviations of daily IWV differences against GPS by 0.5-1.6, 0.7-2.3, 25 and 1.2-3.0 kg m -2 , respectively. Moreover, the daily GPS IWV time series is best correlated with the ERA5 IWV with a median Pearson correlation coefficient of 0.996, whereas the second strongest and weakest median correlations are observed in MERRA2 and NCEP2 with values of 0.991 and 0.971, respectively. Furthermore, the correlations between the IWV trends from the reanalyses and GPS are strongest for ERA5 (0.82), a bit weaker for MERRA2 (0.72), and weakest for NCEP2 (0.52). 30 studies. In addition, nearly three decades of continuous GPS measurements with increasingly densified

their IWV products cover the period of the ground-based GPS data available since 1994. Despite ERAI having been decommissioned and superseded by ERA5 in August 2019, we still include it for the purpose of evaluating the progress of its successor ERA5. We therefore restricted the time range of all data records to a common period from 1994 to 2018.
The EA pattern is known to be the second leading mode of the climate variability over the Atlantic/European sector 140 (Barnston and Livezey, 1987) and it can modulate the structure and climate impact of the NAO (Moore and Renfrew, 2012).
For information on the other indices, readers are referred to the relevant references.

IWV retrievals
The Zenith Total Delay (ZTD) estimates derived from GPS data processing can be converted into IWV, the total amount of water vapor present in a vertical atmospheric column from the Earth's surface to the top of the atmosphere in unit 145 kg m -2 , using the following equation (Bevis et al., 1992): where denotes the gas constant for water vapor, 2 ′ and 3 are atmospheric refractivity constants (Bevis et al., 1994), and denotes the weighted mean temperature. The ZHD was modelled as follows (Saastamoinen, 1972;Davis et al., 1985): where denotes the pressure at the GPS station with a latitude of and height of .The and are computed by using the ERA5 pressure level product (see Yuan et al., 2021 and references therein). The ERA5 is selected as it can provide 1-hourly product without temporal interpolation.
IWVs at the GPS stations are also calculated from the six reanalyses. In this calculation, the pressure level products of the reanalyses are used rather than their surface level IWV products, though it requires a heavier workload. This is because 155 the GPS station and its nearby reanalysis surface grids are usually related to different heights. Vertical IWV adjustment is therefore usually required for the intercomparison between the IWV estimates from GPS and reanalyses Compared to the reanalyses' surface level products, their pressure level products allow for a better characterization of the vertical distribution of water vapor and tends to minimize the errors of the vertical adjustment (Parracho et al., 2018).
The scheme of the calculation is illustrated in Fig. 2. Assume a GPS station G with a height of located between two 160 adjacent reanalysis pressure levels ( and + ). We first determined its eight surrounding reanalysis grid nodes at these pressure levels ( and +1 , =1, 2, 3, and 4) and four auxiliary points ( ). The auxiliary points are located at the GPS https://doi.org/10.5194/acp-2021-797 Preprint. Discussion started: 26 October 2021 c Author(s) 2021. CC BY 4.0 License. station height whereas their horizontal locations are identical to the associated reanalysis nodes. We then calculate the related meteorological variables at each auxiliary point. Exponential and linear interpolations are employed for the vertical corrections of pressure and temperature, respectively. If the auxiliary point is lower than the lowest pressure level (e.g., 1000 165 hPa), an extrapolation of these variables is conducted. For details of the vertical inter-/extrapolation, readers are referred to Schüler (2001) and Wang et al. (2005). Next, we calculate the IWV at each auxiliary point by using a vertical integration: where and are specific humidity and gravitational acceleration profiles from to the reanalysis top level, respectively.
Finally, we estimate the IWV at the GPS station using a horizontal interpolation with an inverse distance weighting 170 algorithm (Jade and Vijayan, 2008).
Representativeness differences arise in the comparison of the IWV derived by the ground-based GPS and reanalyses (Bock and Parracho, 2019). This is because local variations of IWV measured by the GPS might fail to be resolved by the reanalyses due to their coarse horizontal resolution. The representativeness differences can be evaluated statistically as where IWV and IWV are the IWV values of the horizontal auxiliary points A and A , respectively (Fig. 2). Bock and Parracho (2019) found that the max was correlated with the standard deviation of the IWV differences between GPS and ERAI ( Δ ), and thus concluded that the representativeness differences between the IWV from GPS and ERAI contributed to their discrepancies. Here, we extend their work by estimating and comparing the representativeness statistics of the six 180 reanalyses with various spatial resolutions (Table 1).

Pre-processing
The 1-, 3-, and 6-hourly IWV time series are screened by using a robust outlier detection method as follows (Yuan et al., 2021). For each data point, the data within a 30-day window centered at this point are extracted. The 25th percentile (Q1), 75th percentile (Q3), and interquartile range (IQR) of the subsequent time series are then calculated. Finally, the data 185 point is identified as an outlier if it is outside the range of [Q1 − 1.5 × IQR, Q3 + 1.5 × IQR]. The IQR threshold and the 30day sliding window are adopted as they allow for a good robustness and a proper accommodation for the natural variability of IWV. On average, about 1% data were filtered out by the data screening.
The GPS IWV time series may have breaks due to changes in GPS data processing strategies and station-related changes like hardware changes or changes in the electromagnetic environment (Van Malderen et al., 2020 and references 190 therein). Changes in GPS data processing strategy are avoided in this study by using the homogeneously reprocessed GPS ZTD product. Moreover, the GPS IWV time series are homogenized as described in Yuan et al. (2021). The difference in time series between GPS and ERA5 IWV are then calculated, so that most of the natural IWV variabilities are removed and the small breaks are easily distinguishable. The associated GPS height time series and station log files are also inspected to https://doi.org/10.5194/acp-2021-797 Preprint. Discussion started: 26 October 2021 c Author(s) 2021. CC BY 4.0 License.
find out suspicious epochs of device changes. The significances of the breaks are tested at 95% confidence level with the 195 specific optimal noise model of each IWV difference time series. The IWV time series from reanalyses may also have breaks (e.g., Ning et al., 2016, Schröder et al, 2016, and thus we compared them with the homogenized GPS IWV time series.
Nevertheless, no obvious breaks were found in the reanalyses time series.
In this study, the performances of reanalyses IWV products in daily time series, 1-hourly diurnal cycle and variations, 1-monthly annual cycle and interannual variations, and trends are evaluated. As the temporal resolutions of the datasets are 200 different, temporal interpolation and aggregation are conducted. For the investigation of diurnal cycle and variations, the 3and 6-hourly IWVs are linearly interpolated into 1-hourly time series. To obtain daily mean IWVs, the 1-, 3-, and 6-hourly IWV time series are aggregated if they have at least 12, 4, and 2 data points in a day, respectively. The daily mean IWVs are further aggregated into monthly mean IWVs if there are no less than 15 data points in a month.

Metrics for consistency evaluation 205
To evaluate the performances of the reanalyses in reproducing IWV, Kling-Gupta Efficiency (KGE) is employed as a metric. The KGE is a composite index introduced by Gupta et al. (2009) and modified by Kling et al. (2012). It takes bias, variability, and correlation into account in the equation below: and , , and are the mean value, standard deviation, and coefficient of variation of the reanalysis IWV time series, respectively. The , , and are the corresponding parameters for GPS. The and indicate the consistencies in the mean and variability, respectively. The indicates the Pearson correlation coefficient between the GPS and reanalyses time 215 series. With perfect consistencies in mean, variability, and correlation, the , , and are identical to 1, respectively. From Eq. (5), it follows that a larger KGE score indicates a better consistency. Ideally, the KGE metric reaches its maximum of 1.
3 Assessments of daily time series 3.1 Representativeness differences Figure 3 compares the standard deviations of daily IWV difference ( Δ ) and the representativeness statistic ( max ) for 220 the six reanalyses, respectively. The first point to note is that the Δ and max are strongly correlated for all the reanalyses, with values from 0.74 for NCEP2 to 0.91 for ERA5. The result is in line with Bock and Parracho (2019), who compared ERAI to a global GPS network. Figure 3 indicates that the representativeness differences contribute to the discrepancies https://doi.org/10.5194/acp-2021-797 Preprint. Discussion started: 26 October 2021 c Author(s) 2021. CC BY 4.0 License.
between the reanalyses and GPS. Moreover, ERA5 is characterized by the lowest Δ and max , with values of 0.5-1.6 and 0.2-2.1 kg m -2 , respectively. In contrast, NCEP2 has the largest and max , with values of 1.2-3.0 and 1.9-5.2 kg m -2 , 225 respectively. The difference could be due to the fact that the spatiotemporal resolution of ERA5 is much higher than NCEP2 as indicated in Table 1. This result indicates that ERA5, with improved spatiotemporal resolution and data assimilation, is capable of reducing the discrepancy and representativeness difference with respect to GPS. Figure 4 shows the geographical distributions of the ratio of mean , the ratio of coefficient of variation , the 230 correlation , and the synthetic KGE metric for the six reanalyses by taking the daily GPS IWV time series as references.

Assessments using KGE
Moreover, the statistics of these scores are displayed in Fig. 5 with box-and-whisker plots. From the first column of Fig. 4 and Fig 5a we can see that the scores are slightly larger than 1 for all the reanalyses except JRA55, indicating a general wet bias. For example, MERRA2 has the largest median with a value of 1.04, indicating a general wet bias of 4%. In contrast, JRA55 scores a median of 0.99 ( Fig. 5a), indicating a slight dry bias of 1%. These results are in agreement with 235 Schröder et al. (2018), who reported that CFSR, ERAI, and MERRA2 are too moist over Europe compared to the ensemble mean of various satellite and reanalysis IWV records, whereas JRA55 has negligible bias there. Analyzing the reasons of the biases is of potential interest; however, it would go beyond the scope of this paper.
Regarding the consistency in variability, most of the scores are less than 1, with the associated median values ranging between 0.96 and 0.98 (Fig. 5b). The results indicate a good reproduction of daily IWV variability by the reanalyses, albeit 240 with a slight underestimation. The correlations are also pretty good with median values larger than 0.97 (Fig. 5c). However, the scores in variability and correlation are lower for the coastal stations, as shown in the second and third columns of Fig. 4, respectively. In particular, the and of NCEP2 are less than 0.96 for some coastal stations located at Western and Southern Europe. This could be explained by the representativeness differences that impact the IWV variability. For a coastal GPS station with an on-site measurement of IWV, a part of its associated four reanalysis grid nodes may be located over the sea 245 whereas others over land. As a consequence, the representativeness difference for such a coastal site is more severe than inland stations surrounded with flat terrain due to land-sea thermal contrast (Drobinski et al., 2018).

Assessments of diurnal variations
Despite the atmospheric water vapor being quite unstable, it is characterized by a diurnal cycle (Dai et al., 2002). In this section, the diurnal variations of IWV are investigated in two aspects: diurnal cycle and diurnal anomaly. The diurnal anomaly is calculated as follows (Diedrich et al., 2016;Steinke et al., 2019): 255 where IWV is the 1-hourly IWV time series in one day and IWV is the associated daily mean.
Each station's diurnal cycle is computed by averaging its diurnal anomalies for each season separately and for the entire year. We analyzed the amplitude and phases of each station's diurnal cycle. Note that, the amplitude is here defined as the difference between the maximum and minimum, i.e., the peak-to-peak amplitude. Moreover, the time of each station's 260 diurnal variation is converted into the Local Time (LT).

Diurnal cycle
Starting with the annual-averaged peak-to-peak amplitudes of the GPS IWV's diurnal cycle in Fig. 6e, it can be first noted that the amplitudes tend to increase equatorward, with values ranging from 2% to 18%. This finding is in line with the amplitude of the IWV itself, although we consider relative amplitudes here. However, lower latitude stations also experience 265 higher temperature contrasts over a day than higher latitude stations (Sharifnezhadazizi et al., 2019), and the IWV diurnal variability is mainly governed by the temperature diurnal variability (see Wang et al., 2016). Moreover, Fig. 7a  Concerning the phase of the IWV diurnal cycle, the GPS measurements show that most of the IWV cycles reach their minima between 5:00-12:00 LT and then peak around 16:00-24:00 LT (Fig. 6f-o). The minimum and maximum values of the yearly-averaged diurnal cycles occur around 8:00 and 19:00 LT, respectively. The result could be explained by the fact that solar insolation during the day enhances evapotranspiration and accumulates water vapor until sunset.
All the reanalyses successfully reproduce the IWV's diurnal cycle as shown in Fig. 8a-c. However, MERRA2 only 280 scores a median KGE value of 0.62, lower than the scores of CFSR, ERAI, and JRA55 (0.71-0.75) which have a coarser original temporal resolution (6-hourly versus 3-hourly). The poor performance of MERRA2 is mostly due to its overestimation of the variability of diurnal cycle (Fig. 8a). A similar overestimation of diurnal amplitude was reported by https://doi.org/10.5194/acp-2021-797 Preprint. Discussion started: 26 October 2021 c Author(s) 2021. CC BY 4.0 License. Wang et al. (2017) in an inter-comparison of GPS and MERRA (the ancestor of MERRA2) in the southern Tibetan Plateau.
In that work, the authors suggested that it could be due to an exaggerated land evaporation in MERRA (Mueller and 285 Seneviratne, 2014), which might also exist in MERRA2 over Europe. In contrast, ERA5 reproduces the best variability (median =0.98). However, it only scores a median KGE of 0.72, which is at the same level of CFSR, ERAI, and JRA55. This is mainly due to its weaker correlation (Fig. 8b). Therefore, though ERA5 and MERRA2 have higher temporal resolutions than the others, future work is certainly required to improve their capabilities in reproducing the phase and amplitude of diurnal cycles of IWV, respectively. 290

Diurnal anomalies
Diurnal anomaly represents high-frequency variations of IWV, associated with weather phenomena like heavy rainfall. Figure 9 compares the diurnal anomalies derived from reanalyses with respect to GPS. To be consistent with the temporal resolution of GPS and ERA5, the diurnal anomalies of the other reanalyses are linearly interpolated into 1-h time series. It is observed that the anomalies roughly range from −100% to 200% of the associated daily mean IWV, indicating that the 295 diurnal variations of IWV are much stronger than the diurnal cycles. It can also be seen that the range of GPS diurnal anomalies is larger in winter than in summer, suggesting a more intensive relative variability compared to the associated daily mean IWV. It could be due to the fact that the daily mean IWV is much lower in winter than in summer.
Figure 8d-f shows the KGE scores of the reanalyses in reproducing diurnal anomaly of IWV. It can be seen that the median KGE scores of ERA5 and MERRA2 are 0.90 and 0.77, respectively. Their scores are much better than the other four 300 reanalyses with median scores between 0.39 and 0.58. This is not surprising as ERA5 and MERRA2 have the highest initial temporal resolution (Table 1). Therefore, ERA5 is recommended in studies of high-frequency IWV variations.

Annual cycle
We analyzed the annual and inter-annual variations of IWV using monthly mean time series. The annual GPS IWV 305 cycle, as a function of latitude, is shown in Fig. 10a. The annual IWV cycles reach their maxima mostly in July and August,  can be seen that ERA5 has the least differences. Indeed, the quantitative evaluation shows that ERA5 obtains the highest median KGE score having a value of 0.97 (Fig. 5h). Moreover, the median KGE scores of the other reanalyses are also rather good (>0.94). The results indicate that all the reanalyses reproduce the annual cycle of IWV quite well, with ERA5 slightly outperforming the other reanalyses.

Interannual variations and teleconnections 315
The interannual variations of IWV are investigated by using its monthly relative anomalies. The relative anomaly of a month is calculated as the ratio between the IWV change of that month relative to the multiyear mean value of that month.
The monthly relative anomaly time series of GPS IWV are presented in Fig. 11a with decreasing latitudes. It could be noted that the majority of robust IWV anomalies occurs in winter (DJF), with amplitudes as large as 50%. Those winter IWV anomalies are found to be linked to NAO. with a pattern that the NAO index and IWV anomalies are strongly positively 320 correlated in Northern Europe and slightly negatively correlated in Southern Europe (Fig. 11d). This finding points to a wetter (drier) weather condition in Northern Europe and a drier (wetter) condition in Southern Europe when NAO is in its positive (negative) phase. The spatial pattern is in line with Trigo et al. (2002) and can be explained by the fact that during positive NAO phase, the path of moisture transport turns towards a more northeast orientation over the North Atlantic than usual, and thus results in enhanced (reduced) moisture transport over Northern (Southern) Europe (Hurrell. 1995). The 325 modulation of NAO on the IWV over Europe is also consistent with its influence on temperature and precipitation. Previous studies have demonstrated that a positive NAO phase tends to be accompanied by reinforced westerly winds over the North Atlantic, which is conducive to above-average temperature and precipitation over Northern Europe as well as below-average precipitation over Southern Europe (e.g., Hurrell 1995). During negative NAO phases, opposite patterns of temperature and precipitation anomalies are expected. 330 In addition, the IWV anomalies in winter are also significantly correlated to the EA pattern (Fig. 11e). The positive correlation between EA and IWV over Europe indicates a wetter (drier) condition with positive (negative) EA phase. This teleconnection could be attributed to the influence of EA on atmospheric circulation. The winter positive (negative) EA phase indicates a dominating zonal (meridional) circulation (Mikhailova and Yurovsky et al., 2016). Hence, EA can modulate heat and moisture transport. A positive EA index in winter indicates a higher likelihood of higher-than-usual 335 temperature over Europe, enhanced precipitation in Northern Europe, and precipitation deficiency in Southern Europe. By contrast, a negative EA index is likely to result in opposite influences on temperature and precipitation.
As the winter atmospheric circulation over the Atlantic/European sector can be mostly represented by NAO and EA, the joint effects of NAO and EA on the winter IWV anomalies are investigated with four cases, namely (i) both NAO and EA are positive, (ii) NAO is positive whereas EA is negative, (iii) NAO is negative whereas EA is positive, and (iv), both 340 NAO and EA are negative:  (Fig. 11a). The results could be explained by the positive NAO and EA indices anomalies (Fig. 11b-c), which indicate a general milder and wetter condition over Europe. For instance, the two winters are the wettest winters for the British Isles since 1850 with rainfall 345 totals over 500 mm (McCarthy et al., 2016).  December months, the NAO indices are strongly negative, indicating colder and drier-than-usual climate conditions in Northern Europe (Taws et al., 2011). However, the EA phases were slightly positive in December 2009 (Case iii here above) and negative in December 2010. Therefore, in this latter case, that negative EA strengthens the cold and dry effects due to the negative NAO, as also shown by Moore and Renfrew 2012. Therefore, the combination of negative NAO and EA indicates a more severe IWV deficiency in Northern Europe. 365 In addition to wintertime, the NAO and EA could also contribute to extreme IWV anomalies in other seasons. For example, Northern Europe experienced a significant negative IWV anomaly in March 2013, which could be associated with the strongly negative NAO and slightly negative EA as described in Case iv.
IWV anomalies might also be linked to other teleconnection indices (e.g., Thomas et al., 2021). For example, Fig. 11f illustrates that the winter IWV anomaly over Europe is well correlated with AO in a similar spatial pattern 370 as for the NAO (Fig. 11d). This is not surprising as the AO and NAO indices are highly correlated and they are even used interchangeably in some literature (e.g., Cohen et al., 2014), albeit their definitions are different (Ambaum et al., 2001). In addition, the correlations between the winter IWV anomalies and other indices, such as the SCA, WeMO, Niño 3.4, NOI, and QBO, were also examined and displayed in Fig. 11g-k. The results show that all these indices, except for the QBO, are correlated with the winter IWV anomalies at least in some parts of Europe. However, most of their correlation coefficients 375 are within ±0.4, hence weaker than the correlations for NAO/AO and EA. Our correlation analysis concludes that NAO/AO and EA are the dominant patterns among the teleconnection indices for the modulation of winter IWV over Europe.
However, this does not mean that the other patterns might not play a role for specific extreme events. For example, the excess of IWV over Europe in the winter of 2015/2016 could also be linked to the concurrent super El Niño (ENSO) event (Stockdale et al., 2017). Moreover, although we successfully explained some of the IWV winter extremes in Europe with the 380 four different phase combinations of NAO and EA, we must mention that the strengths and durations of the NAO and EA https://doi.org/10.5194/acp-2021-797 Preprint. Discussion started: 26 October 2021 c Author(s) 2021. CC BY 4.0 License. phase change from year to year, and that the atmospheric circulations over the Atlantic/European sector cannot be completely represented by the NAO and EA pattern. Therefore, future in-debt spatiotemporal analysis is necessary to clearly demonstrate the links between the IWV anomaly and the teleconnection indices.
The performances of the reanalyses in reproducing the interannual variations of IWV is evaluated by taking the GPS 385 IWV anomaly as reference. The six reanalyses achieved KGE scores larger than 0.9 at most of the stations as shown in Fig.   8i. Their median KGE scores are between 0.93 and 0.97, proving that all of them are able to reproduce well the interannual IWV variations over Europe. However, the median scores of the six reanalyses are between 0.96 and 0.98, indicating that they slightly underestimated the variability of 1-monthly IWV anomaly. JRA55 outperforms the other reanalyses (median KGE=0.97) as it reproduces the variability of IWV anomaly the best (median =0.98). ERA5 only ranks second (median 390 KGE=0.96) though it reached the highest correlation (median r=0.99). This is mostly due to its relatively lower (median =0.96). Therefore, the performance of ERA5 in reproducing the variability of monthly IWV anomaly over Europe could be further improved.
6 Assessments of trends Figure 12a shows the trends of the monthly anomaly GPS IWV time series. The geographical pattern of the IWV trends 395 is evident, with weaker trends in Northern Europe (generally 0-0.4 kg m -2 decade -1 ) and larger trends in Southern Europe and the Mediterranean Sea (generally 0.4-1 kg m -2 decade -1 ). The largest trend is found at station TUBI in Gebze, Turkey (29.5°E, 40.8°N), with a value of 1.5 kg m -2 decade -1 . The relative trends are characterized by a similar geographical pattern, with values from −0.5 to 8.2 % decade -1 . The geographical pattern and magnitudes of the GPS IWV trends are in line with earlier studies (Alshawaf et al., 2018;Yuan et al., 2021). The relative IWV trends derived from the reanalyses rather than 400 their absolute trends were compared to GPS for two reasons. One reason is that the relative trend is not affected by the IWV bias. The other reason is that the relative IWV trend is related more directly to the temperature variation, with the wellknown Clausius-Clapeyron relationship that states that water vapor holding capacity increases by 7% with a rise of 1 K in temperature (Trenberth et al. 2003). Figure 13 compares the relative IWV trends from the six reanalyses against GPS. The KGE score of ERA5 is 0.76, higher than all the other reanalyses' scores. It is no wonder because the ERA5 IWV time series 405 were used for the homogenization of the GPS time series (Yuan et al., 2021). The ERAI, JRA55, and MERRA2 also achieved rather good KGE scores, with values of 0.73, 0.72, and 0.72, respectively. The results indicate that these reanalyses are also good datasets for IWV trend reproduction. However, the KGE scores of CFSR and NCEP2 are much lower, with values of 0.48 and 0.20, respectively. This is mostly due to their deficiencies in correlation (0.55 and 0.52, respectively).
In this study, Integrated Water Vapor (IWV) time series for the period 1994-2018 were retrieved from continuous GPS observations of 108 ground-based GPS stations in Europe, with an average period of 21 years for those time series. The temporal features of Europe's IWV, such as its diurnal cycle and variation, annual cycle and interannual variation, and trend were then investigated. Moreover, the performances of six frequently used global atmospheric reanalyses in Europe were 415 assessed for the first time, namely CFSR, ERA5, ERA-Interim (ERAI), JRA55, MERRA2, and NCEP2. The main findings are summarized here below.
(i) The agreement between the daily GPS IWV time series and the six reanalyses are found to be best for ERA5 and worst for NCEP2, with standard deviations of IWV differences of 0.5-1.6 and 1.2-3.0 kg m -2 , respectively. The standard deviations of IWV differences are well correlated with representativeness statistics of the six reanalyses, indicating that the 420 representativeness differences contribute to the discrepancies between the reanalyses and GPS.
(ii) The peak-to-peak diurnal amplitudes account for 2%-18% of the associated daily mean IWV. The amplitudes are generally negatively correlated with station latitudes and positively correlated with station heights. Most of the diurnal cycles reach their minima between 5:00-12:00 Local Time (LT) and then peak around 16:00-24:00 LT. The diurnal IWV variations are much stronger than their cycles, with anomalies roughly ranging from −100% to 200% of the associated daily mean 425 IWV. Owing to higher temporal resolutions, ERA5 and MERRA2 outperformed the other four reanalyses in reproducing 1hourly diurnal anomalies of IWV with median KGE scores of 0.90 and 0.77, respectively. However, ERA5 slightly underperformed ERAI in modelling diurnal IWV cycles, and MERRA2 tended to overestimate the diurnal amplitudes.   Geophysical Research Letters, 21, 1999, 1994.  Figure S1 in the supplementary material. The coordinates of the stations and their time lengths are provided in 720 Table S1.        1979-present 1979-present 1979-Aug. 20191958-present 1980-present 1979 Temporal res.