This study examines the consistency and
representativeness differences of daily integrated water vapour (IWV) data
from ERA-Interim reanalysis and GPS observations at 120 global sites over a
16-year period (1995–2010). Various comparison statistics are analysed as a
function of geographic, topographic, and climatic features. A small (

Quantifying the global atmospheric moisture distribution and its variability across timescales remains a challenge to the climate community. Atmospheric reanalyses offer a comprehensive representation of the various components of the hydrological cycle, among which precipitation and evaporation are the dominant terms at the larger space scales and timescales. However, both quantities result from model integrations and are not strongly constrained by observations (Trenberth et al., 2011). The difference of precipitation minus evaporation corresponds to the net vertically integrated atmospheric moisture convergence, a quantity which can also be computed from analysed three-dimensional moisture and wind fields which benefit directly from the assimilation of observations (Trenberth and Fasullo, 2013). However, due to the high spatiotemporal variability of atmospheric moisture, the quality of moisture fields in the reanalyses remains limited, especially in data-sparse areas (Trenberth et al., 2005; Meynadier et al., 2010).

Ground-based Global Positioning System (GPS) integrated water vapour (IWV) observations have been used for some time as an independent validation source for global atmospheric reanalyses over limited regions and periods (Hagemann et al., 2003; Bock et al., 2005, 2016; Heise et al., 2009; Bock and Nuret, 2009) and moist atmospheric process studies (Bastin et al., 2007; Bock et al., 2008; Koulali Idrissi et al., 2012; Means, 2013; Adler et al., 2016; Khodayar et al., 2018). More recently, the value of continuous long time series of GPS IWV data has been investigated for the purpose of studying global and regional climate variability and validating climate models (Nilsson and Elgered, 2008; Vey et al., 2009; Roman et al., 2012; Ning et al., 2013; Chen and Liu, 2016; Wang et al., 2016; Parracho, 2017; Bastin et al., 2019). These studies reported various levels of agreement between GPS and atmospheric models/reanalyses making it difficult to draw general conclusions on the consistency between products. Indeed, the results depend on the model horizontal and vertical resolution, the method employed for the correction of vertical displacement between the model grid points and stations, and the considered geographical area and period of time. The influence of the model horizontal resolution suggests that representativeness differences exist between the model gridded data and station point observations. Such a situation is commonly faced in data assimilation when the station observations capture small-scale variability that is not resolved by the numerical model (Lorenc, 1986; Janjić and Cohn, 2006; Waller et al., 2014). In this context, it is traditional to include the representativeness error into the observation error in addition to the instrument error. For observations of highly variable fields such as humidity, representativeness errors can be considerably larger than instrument error and are state dependent and correlated in time (Janjić and Cohn, 2006). 
A proper treatment of representativeness errors, especially for humidity observations, is thus expected to improve the assimilation scheme (Waller et al., 2014). To our knowledge, representativeness errors of IWV observations, either from ground-based GPS or satellites, have not been discussed in this context. Representativeness errors also arise when measurements from different instruments are compared. This situation has been discussed for IWV measurements, e.g. by Liou et al. (2001) and Buehler et al. (2012). In this case, the representativeness errors reflect the fact that the measurements are not perfectly co-located in space and time and have different sampling/measurement characteristics (e.g. point measurement vs. average over area/volume, instantaneous vs. time average; Buehler et al., 2012).

In the present study, we seek to analyse differences in daily IWV between ground-based GPS observations and the ECMWF global reanalysis, ERA-Interim (Dee et al., 2011), and to identify the proportion due to representativeness errors. In this context, we consider the GPS IWV observations as the reference and attribute the source of the representativeness errors to the coarse spatial resolution of the reanalysis. This choice is arbitrary and the results could be interpreted the other way round. Since GPS observations and model fields do not represent exactly the same quantity, representativeness errors can also be understood as representativeness differences more generally. Representativeness differences set a limit on the best achievable agreement between global reanalyses and station observations.

Among the motivations of this work, one is to explain the large systematic
differences (biases) between GPS and atmospheric models often observed in
coastal and mountainous regions (Hagemann et al., 2003; Bock et al., 2005;
Parracho et al., 2018). In coastal areas, model grid cells can contain a
fraction of IWV over sea not consistent with the GPS observations over land.
In mountains, the model IWV can be strongly biased compared to GPS
observations made in valleys or uphill. Biases amount typically to

The primary goal of this study is thus to analyse the global consistency of
daily IWV data from the ERA-Interim reanalysis and GPS station observations,
and explain the contribution of representativeness errors/differences. For
this purpose, we use simple statistics (mean differences and standard
deviations, such as those found in most past studies) to quantify the differences
between both datasets. We investigate the dependence of these statistics
upon latitude, altitude, and time, as well as mean atmospheric moisture
content and its spatial and temporal variability. A representativeness error
statistic is introduced which quantifies the spatial variability in the
ERA-Interim data at the surrounding grid points and explains to a good
degree the observed differences between the reanalysis and the observations.
All the statistics are computed over a period of 16 years because we want to
characterize the systematic ERA-Interim minus GPS differences and not their
changes over time (e.g. due to inhomogeneity and/or changes in the quality
in either of the datasets). The changes over time are small in magnitude
(Parracho et al., 2018) and have negligible impact on the average statistics
computed here. After establishing the contribution of representativeness
errors, we address the following specific questions:

By which means is it possible to mitigate the representativeness errors?

Does horizontal interpolation of model values increase or reduce the representativeness error?

Can we separate outlying results (e.g. sites with extreme biases and dispersion) due to enhanced representativeness errors from those due to enhanced GPS instrument errors? To tackle this question, the seasonal variation of the comparison statistics and of the atmospheric environment (mean IWV and variability) is also analysed.

How efficient is the representativeness error statistic in detecting these outlying sites?

The results from this study are important for various applications where IWV data from reanalyses and GPS observations are used jointly. For example, recent attempts have been made to use the ERA-Interim reanalysis as a reference for detecting breaks in the GPS time series (Vey et al., 2009; Ning et al., 2016; Van Malderen, 2017). Outlying sites should be inspected more carefully to determine whether the discrepancy is caused by GPS instrument errors or by reanalysis representativeness errors. This study may also contribute to a better treatment of ground-based Global Navigation Satellite System (GNSS) observation error in data assimilation (in this case, interpreting the representativeness error as an observation error), e.g. by establishing a parametric model of observation error depending on the spatiotemporal variability of IWV around the GNSS site computed from the model fields.

The paper is organized as follows. Section 2 describes how the IWV data from the two datasets are prepared. Special effort is made to use a procedure that maximizes the consistency between the datasets. Section 3 presents the results of IWV difference statistics and analyses their dependence upon a variety of parameters. General tendencies are derived that describe the consistency between the reanalysis and GPS globally. Section 4 introduces a range check which detects 15 outlying sites for which the IWV differences are especially large. The geographic, topographic, and seasonal characteristics of these sites are analysed and site-specific representativeness errors are highlighted. Section 5 discusses the possibility of detecting outlying sites a priori and concludes the paper.

In this study, we use the tropospheric delay estimates from the first
reprocessing of the International GNSS
Service (IGS), referred to as IGS repro1 (Byun and Bar-Sever, 2009;
IGSMAIL-6298, 2010). It includes results for 456 stations over the period from
January 1995 to December 2010. Because we are interested in characterizing
the systematic differences between GPS and atmospheric reanalyses, a subset
of 120 stations which have the longest time series (16 years) is extracted.
The zenith tropospheric delay (ZTD) estimates, which are available with a
time sampling of 5 min, are first screened for outliers as described in
Parracho et al. (2018) and averaged in hourly bins centred on the round
hours (00:00, 01:00 UTC, etc.). Next, the hourly ZTDs are converted to
IWV using 6-hourly surface pressure,

Map showing the 120 GPS stations used in this study. A
dynamic map including geographical and technical information for all the GPS
sites can be found on

ERA-Interim is a modern reanalysis produced by ECMWF using the Integrated
Forecasting System (IFS) forecast model and the 4D-Var assimilation system
in 12-hourly analysis cycles (Dee et al., 2011). The number of observations
has increased from 10

Daily and monthly time-matched IWV values from GPS and ERA-Interim are
compared for each and every station, and overall statistics are computed
using the full time series (16 years). The overall statistics reveal the
systematic or persistent biases and discrepancies between the two datasets.
The goal is to identify the main causes of differences among the
representativeness differences, errors in the GPS data, and deficiencies in
the reanalysis (e.g. in data-sparse regions). The identification of
representativeness differences is made by inspection of a number of
statistics and their dependence upon characteristics of the GPS station's
environment: moist or dry climate (measured by the mean IWV), strength of
temporal variability (measured by the standard deviation of IWV and of its
first derivative), and spatiotemporal variability of IWV in the vicinity of
the station. The latter is computed from the ERA-Interim IWV values at the
four grid points surrounding the GPS stations. The maximum absolute
deviation of the four IWV values, denoted

All the statistics are defined by equations in Appendix A. The values computed for each station are given in Table S2 in the Supplement. They may be useful to readers who want to make their own statistical analysis of our results and/or detect outlying sites based on different thresholds than those we used in Sect. 4.
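As an illustration only, the comparison statistics described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the definitions of Appendix A: the function names are hypothetical, the population form of the standard deviation is assumed, the relative statistics are assumed to be normalized by the mean GPS IWV, and the "maximum absolute difference" of the four grid-point values is assumed to be the largest pairwise difference (maximum minus minimum).

```python
from statistics import mean, pstdev


def difference_stats(era, gps):
    """Mean and standard deviation of time-matched ERA-Interim minus GPS IWV.

    era, gps: equal-length sequences of daily (or monthly) IWV in kg m-2.
    The population standard deviation is an assumption of this sketch.
    """
    diffs = [e - g for e, g in zip(era, gps)]
    return mean(diffs), pstdev(diffs)


def relative_stats(mean_diff, std_diff, gps):
    """Differences as a fraction of the mean GPS IWV (assumed normalization).

    Multiply the returned values by 100 to express them in percent.
    """
    ref = mean(gps)
    return mean_diff / ref, std_diff / ref


def representativeness_stat(grid_series):
    """Time mean of the spread of IWV at the four surrounding grid points.

    grid_series: one 4-tuple of IWV values per time step.
    The spread is taken as max minus min, i.e. the largest pairwise
    absolute difference (an assumption of this sketch).
    """
    spreads = [max(values) - min(values) for values in grid_series]
    return mean(spreads)


# Made-up values, for illustration only (kg m-2).
era = [11.0, 15.0]
gps = [10.0, 12.0]
bias, sigma = difference_stats(era, gps)
rel_bias, rel_sigma = relative_stats(bias, sigma, gps)
spread = representativeness_stat([(10.0, 11.0, 12.0, 13.0), (20.0, 20.0, 21.0, 24.0)])
```

A large value of the spread statistic relative to the mean IWV would flag a site where the four grid points sample markedly different moisture conditions, which is the situation the text associates with enhanced representativeness differences.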

The mean and standard deviation of IWV differences (ERA-Interim minus GPS)
for all 120 stations over the 16-year period are shown in Figs. 2 to 5.
Figure 2 shows the results as a function of station latitude. The general
tendency is depicted by the fitted polynomials (the outlying stations,
defined as those beyond the dotted red lines, will be discussed in Sect. 4). The
different plots show a clear dependence of the results on latitude. The mean
difference (Fig. 2a, c) is positive at northern and southern extratropical
latitudes (30–80

Similar to Fig. 2 but plotted as a function of GPS station altitude. The dashed black lines show linear fits.

Figure 3 shows the mean and standard deviation of IWV differences as a
function of altitude of the GPS stations. The mean differences (Fig. 3a, c)
show no dependence on altitude, meaning that the method of computation of
GPS IWV (from ERA-Interim

Figure 4 shows the standard deviation of IWV differences,

Standard deviation of daily IWV difference (ERAI minus
GNSS) for 120 global stations, as a function of

Figure 5 shows that time averaging is a means of reducing the
representativeness differences, as smaller-scale local features captured by
the GPS point observations get smoothed out. The mean differences (Fig. 5a,
c) are not impacted by the averaging, as expected. The standard deviation of
differences (Fig. 5b, d), on the other hand, decreases for the monthly
averages, both in absolute and relative units, at all sites. The median
standard deviation of the daily IWV differences (ERA-Interim minus GPS) is
1.2 kg m

Mean vs. standard deviation of IWV difference (ERAI minus
GNSS) for

Since representativeness differences impose a strong limitation on the
agreement between GPS and reanalysis, one may wonder if the horizontal
interpolation from the four surrounding ERA-Interim grid points does not
further enhance the differences by mixing information from the different
grid points. We investigated this question by computing the statistics for
each of the four surrounding grid points. Figure 6 shows the results in
comparison to the results obtained with the bilinearly interpolated IWV
values. The comparison of the mean values (Fig. 6a and b) emphasizes large
variations in the biases at some stations which will be further discussed in
Sect. 4. The slight shift of the ensemble of results below the

Scatter plots of

In the previous section, we have seen that the general agreement between GPS
and ERA-Interim is limited by representativeness differences which are
enhanced in regions of strong temporal variability (Fig. 2b, d), at higher
altitude (where mainly the relative standard deviation of differences is
impacted; Fig. 3d), and at sites where the mean spatial variability at the four surrounding ERA-Interim grid points is large (Fig. 4c). The standard
deviation of differences,

Figure 7 shows the values of the four comparison statistics for the 15
outlying cases for the bilinearly interpolated ERA-Interim values and also
from the values at the four surrounding grid points (ordered by increasing
distance to the GPS station). The results are grouped by region as outlying
sites appear to form several clusters located in specific areas of the globe
(see Fig. 1). In addition to the four statistics (Fig. 7a to d), we
included the altitudes of the GPS stations,

AREQ, SANT, and CFAG are all located in the Andes Cordillera, with
AREQ (

Further insight into the nature of the discrepancies is given by inspection
of the seasonal variation of the comparison statistics (Fig. 8) and of the
atmospheric environment (Fig. 9). Figure 8 shows that at all three sites,
the biases and standard deviations vary over the year, in relation with the
variation of the mean IWV (

Seasonal variation of

Seasonal variation of daily IWV data:

The next two sites, KIT3 (

The next five sites belong to two geographical regions: IISC in India, and
DHLG, BLYT, LONG, and COSO in California, USA, which are all characterized
by small discrepancies with only one statistic exceeding the range limits
(

The four outlying Californian sites can be separated into two groups: DHLG,
BLYT, and LONG, located south of the Sierra Nevada mountain range, in a
region of moderate topography, and COSO located in the Basin and Range
Province, a narrow valley at the southern exit of the Sierra Nevada. The
higher altitude (1485 m) and more complex topographic environment of COSO
enhance the representativeness differences. Interestingly, all four sites
show a step-like variation of the mean IWV and variability (Fig. 9a, b, c)
peaking in July–August–September associated with the North American monsoon
(Adams and Comrie, 1997; Means, 2013). This feature contrasts with
the Indian monsoon observed at IISC where variability was enhanced during
the transition seasons and not during the monsoon. At DHLG and BLYT, the
biases actually reverse signs in July–August (Fig. 8a, b) and the standard
deviation peaks at

The next site, MKEA (

The last group of sites is located in eastern Antarctica (Fig. 1).
Unfortunately, four of the five Antarctic sites used in this study suffer
from large discrepancies. Three of them have two statistics (

In this study, we first analysed the general tendency of IWV difference
between ERA-Interim reanalysis and global GPS observations. We found that
the mean difference, interpreted as the bias of the reanalysis with respect
to the observations, exhibits a latitudinal variation of

In a second part, we analysed in more detail the possible reasons for the
poor comparison results obtained at 15 outlying sites. It is shown that
at most of the sites, representativeness errors are the most plausible cause
for discrepancies which are enhanced because of local topographic and
climatic features. The problematic topographic features include steep
orography such as that found for sites in the Andes Cordillera (AREQ, CFAG, and
SANT), on the island of Hawaii (MKEA), and close to the Himalaya chain (KIT3
and POL2), as well as coastal sites in Antarctica (MCM4, SYOG, MAW1, and
DAV1). The climatic features include large seasonal changes in the total
IWV, such as those associated with the Indian monsoon (IISC, KIT3, POL2) or the
North American monsoon (DHLG, BLYT, LONG, and COSO), and/or in the IWV
synoptic variability (observed at most sites during either the transition
seasons, winter or summer, depending on the geographic location). When
these 15 stations are eliminated from the dataset, the comparison statistics
become

These results lead to a more general question of whether it is possible to
eliminate problematic stations a priori, i.e. before the comparison statistics are
computed. Inspection of the elevation of the four surrounding grid points
with respect to the elevation of the GPS station and with respect to each
other provides some indication of possible representativeness errors. Some
correlation between IWV biases and altitudes at the individual grid points
was found in extreme cases (Fig. 7). A simple a priori check based on the
comparison of grid point altitudes to station altitudes would eliminate some
of the problematic cases. We compared the statistics with and without
selection of sites where the elevation of the grid points differs by more
than 500 m from the GPS station. When the selection is applied to the
nearest grid point only, 15 stations are eliminated, including four of the
outlying sites discussed in Sect. 4. This test is not very efficient. When
applied to all four surrounding grid points, 34 stations are eliminated,
including 11 of the outlying sites (only CFAG, MCM4, BLYT, and IISC remain
then in the dataset). On average, the statistics of the mean differences
(

Aside from the large representativeness errors found at a small number of sites, one should recognize that ERA-Interim and GPS IWV data are generally in good agreement globally, except perhaps in Antarctica where the comparison failed at four sites out of five. One of the remaining issues not addressed in this study is the temporal consistency of both data sources. For that purpose, other statistics are more relevant, such as trend estimates (Schröder et al., 2016; Parracho et al., 2018). The methodology described in this paper can also be applied to assess the consistency and representativeness of other data sources (e.g. climate models, satellite IWV data) and other observation types (e.g. surface humidity, temperature).
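The a priori elevation check discussed in the preceding section (rejecting sites where grid-point elevations differ by more than 500 m from the GPS station) could be sketched as follows. This is a hedged illustration, not the code used for the study: the function name and arguments are hypothetical, and it is assumed that, when all four grid points are tested, a site is rejected as soon as any one of them exceeds the threshold.

```python
def passes_elevation_check(station_alt_m, grid_alts_m, max_diff_m=500.0,
                           nearest_only=False):
    """Return True if the site is kept by the a priori elevation check.

    station_alt_m: altitude of the GPS station (m).
    grid_alts_m:   altitudes of the four surrounding grid points (m),
                   ordered by increasing distance to the station.
    max_diff_m:    rejection threshold (500 m in the text).
    nearest_only:  if True, test only the nearest grid point.

    Assumption of this sketch: with nearest_only=False the site is
    rejected when ANY of the grid points exceeds the threshold.
    """
    candidates = grid_alts_m[:1] if nearest_only else grid_alts_m
    return all(abs(z - station_alt_m) <= max_diff_m for z in candidates)


# Hypothetical station at 1485 m with made-up grid-point altitudes.
grid = [1300.0, 1700.0, 2100.0, 900.0]
keep_nearest = passes_elevation_check(1485.0, grid, nearest_only=True)
keep_all = passes_elevation_check(1485.0, grid)
```

Testing all four grid points is the stricter variant, consistent with the text reporting more eliminated stations in that case than when only the nearest grid point is tested.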

GPS IWV data have the following DOI: global GPS IWV data at 120 stations of the permanent IGS network;

Throughout this study, the GPS IWV data at a given station are denoted by

GPS and ERA-Interim IWV data are analysed using the following statistics,
where the mean and standard deviation are computed over the number of days
(months) of the time-matched daily (monthly) data:

The mean and standard deviation of IWV are

The relative standard deviation of IWV is

The standard deviation and relative standard deviation of the IWV time derivative are

The ERA-Interim representativeness error statistic is based on the maximum
absolute difference in IWV from the four surrounding grid points,

The absolute and relative mean “representativeness error statistic” is

The ERA-Interim minus GPS differences are analysed using the following
statistics:

The mean and standard deviation of IWV differences are

The relative mean and standard deviation of IWV differences are

The units of the values computed using Eqs. (A1), (A2), (A6), (A8), and (A9) are given in kg m

The units of the values computed using Eqs. (A3), (A10), and (A11) are percentages when multiplied by 100.

The units of the values computed using Eq. (A4) are given in kg m

The supplement related to this article is available online at:

OB prepared the GPS and ERA-Interim data, performed the comparisons, and wrote the paper. ACP contributed to the data analysis and discussion of results.

The authors declare that they have no conflict of interest.

This article is part of the special issue “Advanced Global Navigation Satellite Systems tropospheric products for monitoring severe weather events and climate (GNSS4SWEC) (AMT/ACP/ANGEO inter-journal SI)”. It is not associated with a conference.

This work is a contribution to the European
COST Action ES1206 GNSS4SWEC (GNSS for Severe Weather and Climate
monitoring;

This research has been supported by the Centre National de la Recherche Scientifique (grant no. LEFE/INSU projet VEGA).

This paper was edited by Roeland Van Malderen and reviewed by two anonymous referees.