On the Suitability of Current Atmospheric 1 Reanalyses for Regional Warming Studies over China

Reanalyses are widely used because they add value to routine observations by generating physically or dynamically consistent and spatiotemporally complete atmospheric fields. Existing studies include extensive discussions of the temporal suitability of reanalyses in studies of global change. This study adds to this existing work by investigating the suitability of reanalyses in studies of regional climate change, in which land–atmosphere interactions play a comparatively important role. In this study, surface air temperatures (Ta) from 12 current reanalysis products are investigated; in particular, the spatial patterns of trends in Ta are examined using homogenized measurements of Ta made at ∼ 2200 meteorological stations in China from 1979 to 2010. The results show that ∼ 80 % of the mean differences in Ta between the reanalyses and the in situ observations can be attributed to the differences in elevation between the stations and the model grids. Thus, the Ta climatologies display good skill, and these findings rebut previous reports of biases in Ta. However, the biases in theTa trends in the reanalyses diverge spatially (standard deviation= 0.15–0.30 C decade−1 using 1× 1 grid cells). The simulated biases in the trends in Ta correlate well with those of precipitation frequency, surface incident solar radiation (Rs) and atmospheric downward longwave radiation (Ld) among the reanalyses (r =−0.83, 0.80 and 0.77; p < 0.1) when the spatial patterns of these variables are considered. The biases in the trends in Ta over southern China (on the order of −0.07 C decade−1) are caused by biases in the trends in Rs, Ld and precipitation frequency on the order of 0.10, −0.08 and −0.06 C decade−1, respectively. The biases in the trends in Ta over northern China (on the order of −0.12 C decade−1) result jointly from those in Ld and precipitation frequency. Therefore, improving the simulation of precipitation frequency and Rs helps to maximize the signal component corresponding to regional climate. In addition, the analysis of Ta observations helps represent regional warming in ERA-Interim and JRA55. Incorporating vegetation dynamics in reanalyses and the use of accurate aerosol information, as in the Modern-Era Retrospective Analysis for Research and Applications, version 2 (MERRA-2), would lead to improvements in the modelling of regional warming. The use of the ensemble technique adopted in the twentieth-century atmospheric model ensemble ERA-20CM significantly narrows the uncertainties associated with regional warming in reanalyses (standard deviation= 0.15 C decade−1).


Introduction
Observations and models are two fundamental approaches used in the understanding of climate change.Observations provide a direct link to the climate system via instruments, whereas models provide an indirect link and include information derived from measurements, prior knowledge and theory.
A large number of meteorological observations have been accumulated.These measurements, which are derived from a variety of sources, such as surface stations, ships, buoys, radiosondes, airplanes and satellites, record quantities that include near-surface and upper-air temperature, humidity, wind and pressure.They constitute a major source of atmospheric information through the depth of the troposphere but suffer from incomplete spatiotemporal coverage and observation errors, including systematic, random and representation errors.Recent satellite-based observations have much better coverage; however, they suffer from other notable lim-Published by Copernicus Publications on behalf of the European Geosciences Union.
itations, including temporal inhomogeneities (e.g.satellite drift) and retrieval errors (Bengtsson et al., 2007).These spatiotemporally varying gaps restrict the effective application of observations alone in climate research.
To fill in the gaps in observations, models are needed.Such models can be very simple; examples of simple models include linear interpolation or geostatistical approaches that are based on the spatial and temporal autocorrelation of the observations.However, these models lack the necessary dynamical or physical mechanisms.Given the steady progress of numerical weather prediction (NWP) models in characterizing the global atmospheric circulation in the early 1980s (Bauer et al., 2015), the first generation of reanalyses was produced by combining observations and dynamic models to provide the first global atmospheric datasets for use in scientific research (Bengtsson et al., 1982a, b).
After realizing the great value of this kind of reanalysis in atmospheric research, a step forward was taken with the suggestion made by Bengtsson and Shukla (1988) and Trenberth and Olson (1988) that most meteorological observations should be optimally assimilated under a fixed dynamical system over a period of time long enough to be useful for climate studies.In this way, available observations are ingested by advanced data assimilation techniques to provide a continuous initial state for an NWP model to produce the next short-term forecast.This procedure thus generates physically consistent and spatiotemporally complete threedimensional atmospheric fields that are updated in light of observations.
Taking this suggestion as a guide, and given the improvements that have been made since the mid-1990s in the integrity of the observations, the models and the assimilation methods used, successive generations of atmospheric reanalyses established by several institutes have improved in quality.These reanalyses include the first two generations of global reanalyses produced by the National Centers for Environmental Prediction, NCEP-R1 (Kalnay et al., 1996) and NCEP-R2 (Kanamitsu et al., 2002); the reanalyses produced by the European Centre for Medium-Range Weather Forecasts (ECMWF), ERA-15 (Gibson et al., 1997), ERA-40 (Uppala et al., 2005) and ERA-Interim (Dee et al., 2011b); the Japanese Meteorological Agency, JRA-25 (Onogi et al., 2007) and JRA-55 (Kobayashi et al., 2015); and the National Aeronautics and Space Administration, the Modern-Era Retrospective Analysis for Research and Applications (MERRA) (Rienecker et al., 2011) and its updated version, MERRA-2 (Reichle et al., 2017).
These reanalyses produce global gridded datasets that cover multiple timescales and include a large variety of atmospheric, oceanic and land surface parameters, many of which are not easily or routinely observed but are dynamically constrained by large numbers of observations from multiple sources assimilated using fixed NWP models.During the data assimilation, prior information on uncertainties in the observations and models are used to perform quality checks, to derive bias adjustments and to assign proportional weights.Therefore, such reanalyses add value to the instrumental record through their inclusion of bias adjustments, their broadened spatiotemporal coverage and their increased dynamical integrity or consistency.
The errors displayed by reanalysis products arise from three sources: observation error, model error and assimilation error (Thorne and Vose, 2010;Parker, 2016;Lahoz and Schneider, 2014;Dee et al., 2014;Zhou et al., 2017).Specifically, observation error incorporates systematic and random errors in instruments and their replacements, errors in data reprocessing and representation error, which arises due to the spatiotemporal incompleteness of observations (Dee and Uppala, 2009;Desroziers et al., 2005).Model error refers mainly to the inadequate representation of physical processes in NWP models (Peña and Toth, 2014;Bengtsson et al., 2007), such as the lack of time-varying surface conditions such as vegetation growth (Zhou and Wang, 2016b;Trigo et al., 2015), and incomplete cloud-precipitation-radiation parameterizations (Fujiwara et al., 2017;Dolinar et al., 2016).Assimilation error describes errors that arise in the mapping of the model space to the observation space and errors in the topologies of cost functions (Dee, 2005;Dee and Da Silva, 1998;Lahoz and Schneider, 2014;Parker, 2016).
These reanalyses mentioned above consist of the true climate signal and the nonlinear interactions among the observation error, the model error and the assimilation error that arise during the assimilation process.These time-varying errors can introduce spurious trends without being eliminated by data assimilation systems.Many spurious variations in climate signals were also identified in the early-generation reanalyses (Bengtsson et al., 2004;Andersson et al., 2005;Chen et al., 2008;Zhou andWang, 2016a, 2017a;Zhou et al., 2017;Schoeberl et al., 2012;Xu and Powell, 2011;Hines et al., 2000;Cornes and Jones, 2013).Therefore, reanalyses produced using the existing reanalysis strategy may not accurately capture climate trends (Trenberth et al., 2008), even though they may contain relatively accurate estimates of synoptic or interannual variations in the Earth's atmosphere.
An emerging requirement for climate applications of reanalysis data is the accurate representation of decadal variability, further increasing the confidence in the estimation of climate trends.This kind of climate reanalysis is required to be free, to a great extent, from other spurious non-climatic signals introduced by changing observations, model imperfections and assimilation error; that is, they must maintain temporal consistency.Therefore, the extent to which climate trends can be assessed using reanalyses attracts much attention and sparks heated debates (Thorne and Vose, 2010;Dee et al., 2011aDee et al., , 2014;;Bengtsson et al., 2007).
Given the great progress that has been made in climate forecasting models (which provide more accurate representations of climate change and variability) and coupled data assimilation, many efforts have been made by several institutes to build consistent climate reanalyses using the strategy of assimilating a relatively small number of high-quality long-term observational datasets.The climate reanalyses of this new generation extend back to the late nineteenth century and include the Climate Forecast System Reanalysis (CFSR), which is produced by the National Centers for Environmental Prediction (Saha et al., 2010); NOAA 20CRv2c, which is produced by the University of Colorado's Cooperative Institute for Research in Environmental Sciences (CIRES) in cooperation with the National Oceanic and Atmospheric Agency (NOAA) (Compo et al., 2011); and ERA-20C (Poli et al., 2016), ERA-20CM (Hersbach et al., 2015) and CERA-20C (Laloyaux et al., 2016), which are produced by the ECMWF.Compo et al. (2013) suggested that the NOAA 20CRv2c reanalysis can reproduce the trend in global mean surface air temperatures.In addition, the uncertainties estimated from multiple ensembles are provided to increase the confidence of the climate trends (Thorne and Vose, 2010;Dee et al., 2014).
From NWP-like reanalyses to climate reanalyses, existing studies focus mainly on comparing the differences in temporal variability between the reanalyses and observations using some statistical metrics, e.g. the mean values, standard deviations, interannual correlations, probability density functions and trends of surface air temperature over regions worldwide.These evaluations provide insight into the temporal evolution of the Earth's atmosphere.However, they lack the performance evaluations used in reanalyses in representing the spatial patterns of these statistics associated with the role of the coupled land-atmosphere and dynamical processes of the climate system.Moreover, the assessment of these spatial patterns provides a direct means of examining the most prominent advantage of reanalyses over geostatistical interpolation; thus, the spatial patterns require comprehensive investigation.
This study employs high-density station-based datasets of quantities including surface air temperatures (T a ), the surface incident solar radiation (R s ), the surface downward longwave radiation (L d ) and precipitation measured at ∼ 2200 meteorological stations within China from 1979 to 2010.It provides a quantitative examination of the simulated patterns of variations in T a in both the NWP-like and climate reanalyses and considers the climatology, the interannual variability, the mutual relationships among relevant quantities, the longterm trends and their controlling factors.The results indicate the strengths and weaknesses of the current reanalyses when applied in regional climate change studies and provide possible ways to improve these reanalyses in the near future.
2 Data and methods

Observational datasets
The latest comprehensive daily dataset (which contains averages at 00:00, 06:00, 12:00 and 18:00 UTC) of quantities that include T a , precipitation, sunshine duration, relative humidity, water vapour pressure, surface pressure and the cloud fraction from approximately 2400 meteorological stations in China from 1961 to 2014, of which only approximately 194 participate in global exchanges, is obtained from the China Meteorological Administration (CMA; http: //data.cma.cn/data).Approximately 2200 stations with complete and homogeneous data are selected for use in this study (Wang and Feng, 2013;Wang, 2008;Wang et al., 2007).The high density of meteorological stations in China promotes the representation of regional patterns in surface warming by reanalyses and the assessment of the skill of simulations.
R s values based on the revised Ångström-Prescott equation (Wang et al., 2015;Yang et al., 2006;Wang, 2014) are used in this study.The derived R s values consider the effects of Rayleigh scattering, water vapour absorption, and ozone absorption (Wang et al., 2015;Yang et al., 2006) and can accurately reflect the effects of aerosols and clouds on R s over China (Wang et al., 2012;Tang et al., 2011).Several intensive studies have reported that the derived R s values can accurately depict the interannual, decadal and long-term variations in R s (Wang et al., 2012(Wang et al., , 2015;;Wang, 2014).
L d is typically estimated by first determining the clear-sky radiation and atmospheric emissivity (Brunt, 1932;Choi et al., 2008;Bilbao and De Miguel, 2007) and then correcting for the cloud fraction (Wang and Liang, 2009;Wang and Dickinson, 2013).The derived L d values can directly reflect the greenhouse effect of atmospheric water vapour and clouds.Additionally, precipitation frequency is defined as days in a year with daily precipitation of at least 0.1 mm in this study, which has been shown to provide a good indication of the effects of precipitation on the interannual variability and trends in T a (Zhou et al., 2017).Taken together, the derived R s and L d values are able to physically quantify the effects of solar radiation and the greenhouse effect on surface warming.fluxes and thus modulates the variations in T a (Zhou et al., 2017;Zhou and Wang, 2017a).

Reanalysis products
All of the major global atmospheric reanalysis products are included in this study (Table 1).The reanalyses are summarized below in terms of three aspects, i.e. the observations assimilated and the forecast model and assimilation method used.The NWP-like reanalyses assimilate many conventional and satellite datasets from multiple sources (Ta-ble 1) to characterize the basic upper-air atmospheric fields; the spatiotemporal errors of these datasets vary with time.
In particular, the ERA-Interim and JRA-55 reanalyses incorporate many observations of T a , and the MERRA2 reanalysis includes aerosol optical depth estimates from satellite retrievals and model simulations based on emission inventories, whereas most of the other reanalyses use climatological aerosols (Table 1).To derive consistent long-term climate signals, the new strategy adopted by climate reanalyses involves the assimilation of a small number of relatively effective observed variables, e.g.surface pressure (Table 1).Except for its lack of the assimilation of surface pressure, ERA-20CM employs the same forecast model and external forcings as ERA-20C (Table 1); thus, the inclusion of ERA-20CM in this study provides a useful benchmark series against which to ascertain the skill that is added by assimilating various observations and to cognize the advantage of ensemble simulations.The reanalyses adopt different sea surface temperatures (SSTs) and sea ice concentrations for different time periods, which may lead to temporal discontinuities in the climate signals derived from the reanalyses (Table 1).To address this issue, the boundary conditions in CFSR are derived from its coupled ocean-sea ice models instead of observations (Table 1).CFSR, NOAA 20CRv2c and NOAA 20CRv2 use monthly greenhouse gases (GHGs) with annual means near those used in CMIP5.Conversely, in ERA-Interim, the GHGs increase more slowly than in CMIP5 after 2000.Finally, NCEP-R1 and NCEP-R2 adopt constant global mean concentrations of the GHGs (Table 1).
The forecast model is a fundamental component of a reanalysis that provides the background fields to the assimilation system.Different reanalyses produced by a single institute generally use similar physical parameterizations; however, updated versions of these parameterizations and higher spatial resolutions are used in the newer generations of these realizations (Table 1).Note that the CFSR is classified into climate reanalysis in this study, mainly because it adopts a climate forecast system (Table 1).The assimilation methods adopted by the current reanalyses incorporate variational methods (3D-Var and 4D-Var) and the ensemble Kalman filter (EnKF) approach (Table 1).
The 2 m T a in NCEP-1, NCEP-2, MERRA, MERRA-2, ERA-20C, ERA-20CM, CERA-20C, NOAA 20CRv2c, NOAA 20CRv2 and CFSR are model-derived fields that are functions of the surface skin temperature, the temperature at the lowest model level, the vertical stability and the surface roughness, which are constrained primarily by observations of upper-air variables and the surface pressure (Kanamitsu et al., 2002;Rienecker et al., 2011;Reichle et al., 2017;Poli et al., 2016;Hersbach et al., 2015;Laloyaux et al., 2016;Compo et al., 2011;Saha et al., 2010).However, the T a in ERA-Interim and JRA-55 are post-processing products by a relatively simple analysis scheme between the lowest model level and the surface and are analysed using ground-based observations of T a , with the help of Monin-Obukhov similarity profiles consistent with the model's parameterization of the surface layer (Dee et al., 2011b;Kobayashi et al., 2015).Additionally, radiation calculations are diagnostically determined from the prognostic cloud condensate microphysics parameterization, and the cloud macrophysics parameterization assumes a maximum-random cloud overlapping scheme (Saha et al., 2010;Dolinar et al., 2016).

Method used to homogenize the observed time series
Problems related to the observational infrastructure (e.g.instrument ageing and changes in observing practices) and station relocations can also lead to false temporal heterogeneity in time series.Therefore, it is necessary to diminish the impact of data inhomogeneities on the trends in the observed variables during the study period of 1979-2010.We use the RHtestsV4 software package (Wang and Feng, 2013) to detect and homogenize the breakpoints in the monthly time series.The package includes two algorithms.Specifically, the PMFred algorithm is based on the penalized maximal F test (PMF) without a reference series (Wang, 2008), and the PMTred algorithm is based on the penalized maximal t test (PMT) with a reference series (Wang et al., 2007).
In this study, we first use the PMFred algorithm to identify potential reference series at the 95 % significance level.We then reconstruct homogenous series for each inhomogeneous series using the following steps.
1. Horizontal and vertical distances from the inhomogeneous station of less than 110 km and 500 m, respectively, are specified.
2. Correlation coefficients between the first-order difference in the homogeneous series and that in the inhomogeneous one exceeding 0.9 are required.
3. The first 10 homogeneous series are averaged using inverse distance weighting to produce a reference series for the inhomogeneous station.Finally, we apply the PMTred algorithm to test all of the inhomogeneous series using the nearby reference series.Several intensive studies have been conducted that indicate the PMTred algorithm displays good performance in detecting change points in inhomogeneous series (Venema et al., 2012;Wang et al., 2007).
If a breakpoint is found to be statistically significant, the quantile-matching (QM) adjustment in RHtestsV4 is recommended for making adjustments to the time series (Wang et al., 2010;Wang and Feng, 2013); in such cases, the longest available segment from 1979 to 2010 is used as the base segment.The QM adjustment aims to match the empirical distributions from all of the detrended segments with that of the specific base segment (Wang et al., 2010).In addition, we replicate the procedures above for the sparsely distributed stations over western China and the Tibetan Plateau.
As such, the significant breakpoints are detected and adjusted at a confidence level of 95 % at 1092 of the 2193 (49.8 %) stations for the T a time series, 1079 of the 2193 (49.2 %) stations for the R s time series, 64 of the 2193 (2.9 %) stations for precipitation frequency time series, 971 of the 2193 (44.2 %) stations for the L d time series, 944 of the 2193 (43.0 %) stations for the water vapour pressure time series and 956 of the 2193 (43.6 %) stations for the cloud fraction time series.

Trend calculations, partial linear regression and total least squares
The bias, root-mean-squared error (RMSE) and standard deviation are used to assess the absolute value of T a .The trends in T a and the relevant variables are calculated using the ordinary least-squares method (OLS) and the two-tailed Student's t test.To determine whether the reanalyses contain biases in these trends, the two-tailed Student's t test is also applied to the differences in the time series between the reanalyses and the homogeneous observations.
The partial least-squares approach is used to investigate the net relationship between the detrended T a values and the relevant variables (R s , L d and precipitation frequency) after statistically excluding the confounding effects among the relevant variables (Zhou et al., 2017).To evaluate the potential collinearity of independent variables in the regression model, the variance inflation factor (VIF) is calculated.The VIFs for R s , precipitation frequency and L d are less than 4. Specifically, the VIF for China of 2.19 is much less than the threshold of 10, above which the collinearity of regression models is bound to adversely affect the regression results (Ryan, 2008).
The Pearson correlation coefficient (r) is used to reveal the spatial relationship between T a and the relevant variables.To further investigate the relationship between the spatial distributions of the biases in the trends in T a and the relevant parameters among the 12 reanalysis products, the weighted total least squares (WTLS) is adopted, in which the spatial standard deviations and correlations of pairs of variables on 1 • × 1 • grid cells are included (Reed, 1989;York et al., 2004;Golub and Van Loan, 1980;Hyk and Stojek, 2013;Tellinghuisen, 2010).
The variables x i and y i are the median trends in x and y (e.g.T a and R s ) for the ith reanalysis product; σ x i , σ y i and r i are the spatial standard deviations and correlations of the trends in x and y for the ith reanalysis product; β i is the least-squares-adjusted value; W i is the weight of the residual error; and b is the slope estimated using iterative methods with a relative tolerance of 10 −16 .The Monte Carlo method with 10 000 experiments is applied to estimate the 90 % confidence intervals of the slope b.In the Monte Carlo method, the grid index for the 1 • × 1 • grid cells over China, which ranges from 1 to 691, is gener-ated as a random number.On this basis, we can sample the spatial pattern in the biases in the trends in T a , R s , L d and precipitation frequency.We then calculate the median trends and their spatial standard deviations and correlations for each experiment used in the WTLS.

Results
3.1 Dependency of surface air temperature differences on elevation differences  2).A homogenizing adjustment of 0.03 • C from the raw T a observations is insufficient to account for the underestimation of T a by the reanalyses (Fig. 1 and Table 2).Similar biases in T a within various regions worldwide have been widely reported by previous studies (Mao et al., 2010;Pitman and Perkins, 2009;Reuten et al., 2011;Wang and Zeng, 2012;Zhou et al., 2017;Zhou and Wang, 2016b).However, we found that the spatial patterns in the differences in T a are well correlated with the elevation differences between models and stations, as reflected by correlation coefficients (r) of 0.85 to 0.94 (Figs. 2 and S1 in the Supplement).These results are in accordance with the reports from NCEP-R1, NCEP-R2 and ERA-40 (You et al., 2010;Ma et al., 2008;Zhao et al., 2008).The elevation differences ( Height; Figs. 2 and S1) between the stations and the model grids consist of the filtering error in the elevations used in the spectral models ( f) and differences in the site-to-grid elevations ( s) due to the complexity of the orographic topography.We further quantify the relative contributions of these factors to the T a differences.The elevation differences can explain approximately 80 % of the T a differences; approximately 74 % is produced by the site-to-grid elevation differences, and approximately 6 % is produced by the filtering error in the elevations used in the spectral models (Fig. 2).
The regression coefficient of the differences in T a is approximately 6 • C 1 km −1 , which is similar to the lapse rate at the surface (Fig. 2).Lapse rate values that exceed 6 • C 1 km −1 can be seen over the Tibetan Plateau (shown as red dots in Fig. 2).This result is very consistent with the reported lapse rates over China (Li et al., 2015;Fang and Yoda, 1988).In addition, the rate of decrease in the model filtering error is approximately 4 • C 1 km −1 among the 12 reanalyses   (Fig. 2).These results have important implications for the skill of the simulated T a climatologies of the 12 reanalyses over China.
3.2 Comparison of regional-scale surface air temperature series Figure 3 shows Taylor diagrams of annual T a anomalies from the observations and reanalyses over China and its seven subregions.We find that the correlations between the annual T a anomalies in the 12 reanalysis products and the observations are reasonably strong, as reflected by a median r of 0.95 (Fig. 3), despite the relatively weak correlations over the Tibetan Plateau associated with NCEP-R2 (r = 0.24) and CFSR (r = 0.53).The simulated time series of T a anomalies over eastern China are depicted most accurately by the reanalyses (Fig. 3c-g).
Overall, the NWP-like reanalyses (denoted by numbers 3-7) display better skill than the climate reanalyses (denoted by numbers 8-14) in this regard (Fig. 3).ERA-Interim and JRA-55 display the best performance in the simulated time series of T a anomalies over China (r = 1.00,RMSE = 0.05 • C) and the seven regions (r = 0.98, RMSE = 0.1 • C) (Fig. 3), perhaps due to their analysis of surface air temperature observations (Table 1).
Comparing the T a values from MERRA2 and MERRA shows that MERRA2 displays improved performance over northern China, as reflected by an increase in the correlation coefficient of 0.1 and a reduction in the RMSE of 0.1 • C (Fig. 3).This result may occur because MERRA2 includes time-varying aerosol loadings (Balsamo et al., 2015;Reichle et al., 2011).However, the incorporation of this information does not improve the results over southeastern China (Fig. 3h).CERA-20C displays better performance than ERA-20C and ERA-20CM, perhaps related to the inclusion of coupled climate forecast models and data assimilation, as well as the assimilation of surface pressure data in CERA-20C (Fig. 3 and Table 1).NOAA 20CRv2c and NOAA 20CRv2 display moderate performance in this regard (r = 0.8, RMSE = 0.3 • C) (Fig. 3), and the former reanalysis displays no improvement in performance, despite its use of new boundary conditions (Compo et al., 2011).

Key factors regulating regional temperature change
This section discusses key factors that control regional temperature change from the perspective of energy balance and its partitioning.The R s heats the surface, and the portion of this radiation that becomes the sensible heat flux heats the air near the surface (Zhou and Wang, 2016b, c;Wang and Dickinson, 2013).Part of the energy absorbed by the surface is released back to space as outgoing longwave radiation; some of this radiation is reflected by clouds and is influenced by atmospheric water vapour, further warming the near-surface air (Wang and Dickinson, 2013).This process is known as the greenhouse effect on T a and is quantified by L d .Existing studies have suggested that precipitation frequency better represents the interannual variability in soil moisture in China than the precipitation amount (Wu et al., 2012;Piao et al., 2009;Zhou et al., 2017;Zhou and Wang, 2017a); in turn, soil moisture affects vegetation growth and drives changes in surface characteristics (e.g.surface albedo and roughness).These changes alter the partitioning of available energy and thus regulate changes in T a .
Figure 4 illustrates the partial relationships among the annual anomalies in T a and R s , the precipitation frequency and L d .The results show that T a is consistently positively correlated with R s (except over the Tibetan Plateau) and L d ; however, it is consistently negatively correlated with precipitation frequency in the observations and the 12 reanalysis products (Fig. 4).Based on the observations, the interannual variations in T a are determined in part by precipitation frequency and L d in northeastern China and the northern part of northwestern China (Fig. 4).All of the reanalyses roughly capture these factors over these regions, although they display differences in the relative magnitudes (Fig. 4).Specifically, ERA-20CM, NOAA 20CRv2c, NOAA 20CRv2 and CFSR exhibit comparable relationships of T a with precipitation frequency and L d ; however, MERRA, MERRA2, NCEP-R2, ERA-20C and CERA-20C overestimate the relationship between T a and precipitation frequency, and ERA-Interim, JRA-55 and NCEP-R1 overestimate the relationship of T a with L d over these regions (Fig. 4).
Over the North China Plain and middle China, the interannual variations in T a are partly determined by R s , precipitation frequency and L d (Fig. 4).The reanalyses roughly capture the effects of these three factors on T a , although they display diverse combinations (Fig. 4).Among these combinations, JRA-55, MERRA2, ERA-20CM and ERA-Interim are comparable to the observations over these regions (Fig. 4).Over southeastern China, the interannual variations in T a are primarily regulated by L d , precipitation frequency and R s (Fig. 4).The reanalyses exhibit slightly overestimated rela-tionships of T a with R s and underestimated relationships with precipitation frequency (Fig. 4).
Over the Tibetan Plateau, the interannual variations in T a are regulated by R s and precipitation frequency (Fig. 4).Most of the reanalyses roughly capture the combinations of these factors but exhibit certain differences in the relative effects of R s and precipitation frequency on T a (Fig. 4).MERRA, MERRA2, NOAA 20CRv2c and NOAA 20CRv2 overestimate the relationships of T a with R s over the Tibetan Plateau (Fig. 4).
Overall, the spatial patterns of the simulated partial correlation of T a with R s in the reanalysis products are significantly correlated with those seen in the observations; r = 0.13-0.35(p < 0.05) for the NWP-like reanalyses, and larger values of r = 0.24-0.41(p < 0.05) are obtained for the climate reanalyses.Moreover, the spatial patterns in the sensitivity of T a to R s exhibit significant correlations (r = 0.12-0.17,p < 0.05) for most of the climate reanalyses (Table 1).Precipitation frequency displays the largest spatial correlations (r = 0.16-0.43,p < 0.05) of the sensitivity of T a with these three relevant parameters in the reanalyses (Table 3).Significant spatial correlations reflecting the relationship (including the partial correlation and sensitivity) of T a with L d are also found (Table 1).
3.4 Regional warming trend biases and their causes

All of China
From 1979 to 2010 over China, T a exhibits strong warming trends of 0.37 • C decade −1 (p < 0.05) in the observations and 0.22-0.48• C decade −1 (p < 0.05) in the 12 reanalyses  in the Supplement, Table 2).ERA-Interim and JRA-55 display spatial correlations with the observations (r = 0.47 and 0.54, p < 0.05) that are due at least partly to the inclusion of some T a observations, whereas NCEP-R2 and ERA-20C display the worst performance (Fig. S3 in the Supplement, Tables 1 and 3).Furthermore, approximately 87 % of the observed trends in T a over China can be explained by the greenhouse effect (i.e.65 % can be explained by the trend in L d ), precipitation frequency (29 %) and R s (−7 %, due to the trend in radiative forcing of −1.1 W m −2 decade −1 ) (Figs. S3-4 in the Supplement).The influence of the greenhouse effect on the observed trends in T a consists mainly of the trends in the atmospheric water vapour (42 %) and the cloud fraction (3 %) (Fig. S5 in the Supplement).Among the reanalyses, over 90 % of the trend in T a can be explained by the greenhouse effect, precipitation frequency and R s (Figs.S4-6 in the Supplement).Specifically, ERA-Interim, JRA-55, MERRA and MERRA2 display the best ability to capture the contributions of the greenhouse effect (48 to 76 %), precipitation frequency (22 to 34 %) and R s (−4 to 13 %) to the trend in T a over China (Figs.S4 and S6).The remaining NWP-like reanalyses (i.e.NCEP-R1 and NCEP-R2) substantially overestimate the contribution of R s to the trend in T a , whereas the climate reanalyses overestimate the contribution from L d (Figs.S4 and S6 in the Supplement).
We further quantify the contributions to the biases in the trend in T a made by those in R s , L d and precipitation  (d-o) the 12 reanalysis products over China with respect to the homogenized observations.The squares denote the original homogeneous time series, and the dots denote the adjusted homogeneous time series.The probability distribution functions of all of the biases in the trends are shown as coloured histograms, and the black stairs are integrated from the trend biases with a significance level of 0.05 (based on two-tailed Student's t tests).The cyan and green stars in (k-n) represent estimates of the biases in the trends outside the ensemble ranges whose locations are denoted by the black dots shown in (k-n).
frequency among the 12 reanalyses over China (Figs. 6-7).Over China, the overestimated R s trends (by 0.00-3.93W m −2 decade −1 ; Figs.S8 and S13 in the Supplement) increase the trends in T a (by 0.02-0.16• C decade −1 ; Fig. 7) in the 12 reanalyses; the underestimated L d trends (by −0.25 to −1.61 W m −2 decade −1 for the NWP-like reanalyses; Figs.S10 and S15) decrease the trends in T a (by −0.05 to −0.25 • C decade −1 for the NWP-like reanalyses; Fig. 7) and the biases in the trends in precipitation frequency (by approximately −1.5 days decade −1 for the NWP-like reanalyses and approximately 2.6 days decade −1 for the climate reanalyses; Figs.S9 and S14 in the Supplement) decrease the trends in T a (by 0.01 to 0.05 • C decade −1 for the NWP-like reanalyses and −0.01 to −0.06 • C decade −1 for the climate reanalyses; Fig. 7).Together, these effects produce an underestimate in the trends in T a on the order of 0.10 • C decade −1 in the reanalyses (Fig. 7 and Table 2).

Seven subregions
Averaged trends over large areas may mask regional differences that reflect diverse regional warming biases and their causes (Figs.5-7).The mean-adjusted spatial patterns of the biases in the trends in T a appear to be consistent among the 12 reanalyses (Fig. S7 in the Supplement) and mimic the spatial patterns in the overestimated R s trends over the www.atmos-chem-phys.net/18/8113/2018/Atmos.Chem.Phys., 18, 8113-8136, 2018 North China Plain, southern China and northeastern China (Fig. S8 in the Supplement), given the spatial correlations among these variables in most of the reanalyses (r = 0.11-0.42,p < 0.05) (Figs. 6 and S7-S8 in the Supplement, Table 3).However, the reanalyses still underestimate the trends in T a over most of the regions.The key reason for this underestimation is the increase in precipitation frequency over northwestern China, the Loess Plateau and middle China seen in the NWP-like reanalyses and that seen over broader regions in the climate reanalyses (Figs.5-6 and S9 in the Supplement).This relationship is reflected by their negative spatial correlation, which has a maximum value of −0.62 (p < 0.05) for MERRA (Table 3).Moreover, the decrease in L d , which occurs due to the decreases in the atmospheric water vapour and cloud fraction that occur in the NWP-like reanalyses (Figs.S10-S12 in the Supplement), substantially cancels the warming effect of the overestimation of R s on T a over eastern China (Figs. 5 and S7 in the Supplement).
The opposite changes occur over southeastern China in the climate reanalyses (Figs. 5 and S10 in the Supplement).The effect of the changes in L d is reflected by its spatial correlations of up to 0.50 (p < 0.05) (Table 3).
The corresponding contributions to the biases in the T a trend are calculated from those in R s , L d and precipitation frequency over the seven subregions of China (Figs. 6-7).Over northern China, biases in the trend in T a re-   sult primarily from those in precipitation frequency and L d (Figs.6-7).Over northeastern China, the observations exhibit an amplified warming of 0.41 • C decade −1 (p < 0.05; Fig. 4 and Table 2).This warming is significantly underestimated by NCEP-R1, JRA-55, NOAA 20CRv2 and NOAA 20CRv2c (on the order of −0.15 • C decade −1 ) and is overestimated by MERRA and CFSR (on the order of 0.2 • C decade −1 ) (Figs. 6-7).These biases in the trends in T a in the reanalysis are jointly explained by the warming (0.04-0.48 • C decade −1 ) induced by the underestimated trends in precipitation frequency and the cooling (−0.04 to −0.42 • C decade −1 ) induced by the underestimated trends in L d (Fig. 7).Over northwestern China, the biases in the trend in precipitation frequency and L d mainly explain the overestimated warming in NCEP-R2 (by 0.22 • C decade −1 ) (Fig. 7).The substantially underestimated trend in L d induced by the decrease in the atmospheric water vapour and cloud fraction (Figs.S9-S12 and S16-17 in the Supplement) lead to an underestimate of the warming in MERRA (by −0.22 • C decade −1 ) (Fig. 7).
Most of the reanalyses display weakened warming over the Tibetan Plateau and the Loess Plateau (Fig. 5 and S3, Table 2).In particular, NCEP-R1 and NCEP-R2 fail to reproduce the warming over the Tibetan Plateau, and MERRA fails to reproduce the warming over the Loess Plateau (Fig. 5 and S3 in the Supplement, Table 2).The significant cooling biases in the trends in T a (by −0.02 to −0.31 • C decade −1 ) over the Tibetan Plateau and the Loess Plateau result from the underestimated trends in L d and the overestimated trends in precipitation frequency seen in most of the reanalyses (Figs.5-7 and S9-12 in the Supplement).These cooling biases are further induced by the underestimated trends in R s .
Over southern China, the biases in the trend in T a are regulated by the biases in the trends in R s , L d and precipitation frequency (Figs.6-7).Over southeastern China, the significantly overestimated trends in T a (by 0.04, 0.02 and 0.17 Over middle China, the significantly overestimated trends in T a (by 0.04, 0.06, 0.11, 0.03, 0.11 and 0.14 • C decade −1 , respectively) are induced by the overestimated trends in R s (by 2.09, 1.50, 2.59, 1.20 and 4.81 W m −2 decade −1 , respectively) seen in ERA-Interim, JRA-55, ERA-20C, ERA-20CM, .The overestimated trends in precipitation frequency may lead to cooling in the trends in T a in the reanalyses, especially for MERRA (which reflects an induced bias in the trend of −0.15 • C decade −1 ) over middle China .
Due to the underestimated trends in the atmospheric water vapour and the cloud fraction (Figs.S11-S12 in the Supplement), the underestimation of L d produces a cooling effect on the trend in T a (by −0.05 to −0.32 • C decade −1 ) in the reanalyses over the North China Plain (Figs. 6-7 and S10 in the Supplement).However, due to the lack of inclusion of plausible trends in aerosol loading, the substantial increases in R s over the North China Plain (Fig. S8 in the Supplement) have strong warming effects on the trends in T a (by 0.01 to 0.21 • C decade −1 ) in the reanalyses .The biases in the trends in precipitation frequency (of approximately −2.5 days decade −1 for the NWPlike reanalyses and approximately 1.5 days decade −1 for some of the climate reanalyses) contribute some part of the biases in the trends in T a (approximately 0.05 • C decade −1 for the NWP-like reanalyses and −0.03 • C decade −1 for the climate reanalyses).
Overall, the biases in the trends in T a in the reanalyses can be substantially explained by those in L d , precipitation frequency and R s , but this effect varies regionally .Over northern China, the biases in the trend in T a (which are on the order of −0.12 • C decade −1 ) result primarily from a combination of those in L d (which are on the order of −0.10 • C decade −1 ) and precipitation frequency (which are on the order of 0.05 • C decade −1 ), with relatively small contributions from R s (which are on the order of −0.03 • C decade −1 ).Over southern China, the biases in the trend in T a (which are on the order of −0.07 • C decade −1 ) are caused by those in R s (which are on the order of 0.10 • C decade −1 ), L d (which are on the order of −0.08 • C decade −1 ) and precipitation frequency (which are on the order of −0.06 • C decade −1 ) (Fig. S18 in the Supplement).

Spatial linkages of biases in the warming trends in the 12 reanalyses
We next integrate the relationships of the spatial patterns in the biases in the trends in T a with those in R s , L d and precipitation frequency over China in the 12 reanalyses (Fig. 8).
The results show that the biases in the trends in T a show significant correlations with R s (r = 0.80, slope=0.06,p = 0.09), precipitation frequency (r = −0.83,slope = −0.04,p = 0.02) and L d (r = 0.77, slope = 0.10, p = 0.10) in the 12 reanalyses if information on these patterns is included.
When the spatial patterns of the biases in the trends in these variables are not considered, the biases in the trends in T a show relatively small correlations with R s (r = 0.32, slope = 0.02, p > 0.1), precipitation frequency (r = −0.51,slope = −0.02,p = 0.09) and L d (r = 0.14, slope = 0.02, p > 0.1) in the reanalyses (Fig. 8).Similar results are obtained for the atmospheric water vapour (r = 0.71, p = 0.1) and the cloud fraction (r = −0.74,p = 0.09) if their spatial patterns are considered (Figs.S19 in the Supplement), and this relationship involving the cloud fraction is very similar to that associated with R s (Figs. 8 and S19).Within the subregions of China, the biases in the trends in T a show significant correlations with R s (r = 068 to 0.90, p < 0.1), precipitation frequency (r = −0.55 to −0.94, p < 0.1) and L d (r = 0.53 to 0.93, p < 0.1) when the spatial patterns in the reanalyses are included (Fig. S20 in the Supplement).These results provide a novel perspective that can be used to investigate the spatial relationships between biases in the trends in T a and relevant quantities in reanalyses.

Discussion
In this section, we first examine the possible impacts of data homogenization on the trends in T a .The trends in T a derived from the original dataset are almost as high as those from the homogenized dataset, especially over the North China Plain and northwestern China (Fig. 5 and Table 2).Homogenization primarily adjusts breakpoints in time series (Wang, 2008), which occur mainly due to station relocation and changes in instruments (Cao et al., 2016;Li et al., 2017;Wang, 2014), and it helps to objectively depict trends in T a , thus permitting the assessment of the modelled trends in T a and its spatial patterns that are present in the reanalyses.We found that the elevation differences between the models and the stations influence the biases in the trends in T a but cannot explain the spatial patterns in the biases in the trends in T a (average r = 0.11) (Fig. S21 in the Supplement).Comparison of the models that use the same grid (NOAA 20CRv2c vs. NOAA 20CRv2, MERRA vs. MERRA2, NCEP-R1 vs. NCEP-R2 and ERA-20C vs. ERA-20CM) shows that the one is correlated with elevation differences, but the other is not, which implies that this statistical correlation does not have physical significance.Nevertheless, the spatial patterns in the normalized trends in T a (excluding the impacts of the absolute value of temperature on the trends) are very close to those of the trends (Fig. S22 in the Supplement), implying that the differences in the absolute value of temperature have a neglected effect on trend biases in T a in reanalysis products.77, (s=0.10, p=0.10) r=0.32, (s=0.02, p=0.33) r=0.14, (s=0.02, p=0.69) r=0.80, (s=0.06, p=0.09)  × 1 • grid cells that cover China.The median values (coloured dots with error bars of spatial standard deviations) of the biases in the trends in T a (unit: • C decade −1 ) in the 12 reanalyses are regressed onto those of (a) the surface incident solar radiation (R s , unit: W m −2 decade −1 ), (b) precipitation frequency (unit: days decade −1 ) and (c) the surface downward longwave radiation (L d , unit: W m −2 decade −1 ) using the ordinary least-squares method (OLS, denoted by the dashed grey lines) and the weighted total least-squares method (WTLS, denoted by the solid black lines).The 5-95 % confidence intervals of the regressed slopes obtained using WTLS are shown as shading.The regressed correlations and slopes are shown as grey and black text, respectively.
In the reanalyses, vegetation is only included as climatological information, but the vegetation displays a growth trend during the study period of 1979-2010 within China (Fig. S23 in the Supplement).This discrepancy positively enlarges the biases in the trends in T a due to the vegetation cooling effect (Zeng et al., 2017;Trigo et al., 2015).This effect is reflected by the negative spatial correlation (r = −0.26,p = 0.00) between the inverted trend in the normalized difference vegetation index and the biases in the trend in T a (Fig. S23 in the Supplement).The growth of vegetation reduces T a by regulating surface roughness, surface conductivity, soil moisture and albedo to partition greater amounts of available energy into latent heat fluxes, which leads to the formation of more precipitation (Shen et al., 2015;Spracklen et al., 2013).Thus, the inclusion of vegetation growth will improve the simulation of trends and especially the spatial pattern of T a in the reanalyses through the incorporation of more complete physical parameterizations (Li et al., 2005;Dee and Todling, 2000;Trigo et al., 2015).
Due to their inclusion of surface air temperature observations, ERA-Interim and JRA-55 display high skill in reproducing the observed patterns; they have near-zero means (0.01 and 0.01 • C decade −1 ) and the smallest standard deviations (0.16 and 0.15 • C decade −1 ) of the trend biases among the 12 reanalysis products.However, pattern differences of 37.8 % (standard deviation of trend bias/China-averaged trend) are still evident (Figs. 5 and 8).Although it does not incorporate surface air temperature observations, ERA-20CM presents a pattern (with a mean of −0.04 • C decade −1 and a standard deviation of 0.15 • C decade −1 ; Figs. 5 and 8) that is comparable to those of ERA-Interim and JRA-55 and better than that of ERA-20C (mean of −0.08 • C decade −1 and standard deviation of 0.20 • C decade −1 ; Figs. 5 and  8), which uses the same forecast model as ERA-20CM.These results imply that ensemble forecasting could be used to meet important goals.The ensemble simulation technique used in ERA-20CM also displays advantages in that it yields an improved simulated pattern of biases in the trends in R s (SD = 1.84 W m −2 decade −1 , 171 %), precipitation frequency (SD = 2.78 days decade −1 , 122 %) and L d (SD = 1.25 W m −2 decade −1 , 82 %) (Fig. 8).
We consider the degree to which the ensemble assimilation technique can improve the spatial patterns of the biases in the trends in T a in the reanalyses.We find that this technique can detect the biases in the trends in T a over approximately 12 % (8 %) of the grid cells in CERA-20C, which incorporates 10 ensemble members (NOAA 20CR2vc and NOAA 20CR2v employ 56 ensemble members) (Fig. 5l-n).However, the biases in the trends in T a over these grid cells are not significant at a significance level of 0.05, according to Student's t test, implying that the ensemble assimilation technique cannot explain the spatial pattern of the biases in the trends in T a identified in this study (in Fig. 5l-n).
To provide a preliminary discussion of the improvements in climate forecast models in reflecting patterns in climate trends, we compare the spatial patterns of the biases in the trends in R s , precipitation frequency and L d because observations of these variables are not included in the reanalyses.We find that the climate forecast models, i.e.ERA-20C, ERA-20CM, CERA-20C, NOAA 20CRv2c and NOAA 20CRv2, display better performance in reproducing the pattern of trends in R s (mean biases of 1.36 vs. 2.18 W m −2 decade −1 ; SD of 2.04 vs. 2.71 W m −2 decade −1 ), precipitation frequency (mean biases of 1.32 vs. −1.44 % decade −1 ; SD of 3.57 vs. 6.14 % decade −1 ) and L d (mean biases of 0.12 vs. −0.85W m −2 decade −1 ; SD of 1.33 vs. 1.50 W m −2 decade −1 ) than the NWP-like models, i.e.ERA-Interim, NCEP-R1, MERRA, JRA-55, NCEP-R2 and MERRA2 (Fig. 8).In addition, because the SST boundary condition evolves freely in CFSR, the patterns of biases in the trends in R s , precipitation frequency and L d in CFSR differ substantially from those in the other reanalyses.
We also consider whether the spatial pattern of biases in the trend in T a is altered by the atmospheric circulation patterns simulated by the ERA-20CM ensemble.In ERA-20CM, the atmospheric circulation patterns are influenced by SSTs and sea ice and then partly mediate the influence of global forcings on the trends in T a .In ERA-20CM, the probability distribution function of the biases in the trends in T a from outside the ensemble ranges incorporates that from Student's t test at a significance level of 0.05 (Fig. 5k).This result has important implications in that (1) the climate variability in the ensembles under the different model realizations of SSTs and sea ice cover does not change the pattern of the biases in the trends in T a (Fig. 5k); moreover, (2) Student's t test exhibits a suitable ability to detect the significance of the biases in the trends in T a (Fig. 5k) when considering the effects of interannual variability on the trend.
Overall, producing global or regional reanalyses that adequately reflect regional climate is challenging using the current strategy, and further improvements are required.The results and discussion above indicate some potential but challenging approaches that can be used to maximize the signal component corresponding to the regional climate in final reanalyses and robustly narrow the uncertainties in trends.

MERRA2's pioneering incorporation of time-varying
aerosol loadings provides a way of improving the representation of regional temperature changes over regions such as the North China Plain where the impacts of aerosols on surface temperatures are significant.Thus, we encourage research groups to include accurate aerosol information and improve the skill of simulation of the energy budget and partitioning, especially of regional surface incident solar radiation, in other reanalyses.
2. To improve regional climate modelling, forecast output should be produced using a physical ensemble like that employed in ERA-20CM to quantify the uncertainties associated with the relevant parameterizations in the reanalyses, due to the impossibility of optimizing all of the biases.Meanwhile, careful ensemble design would likely yield useful information for use in improving models, assimilation methods and the bias correction of observations by exploring the interdependency among sources of errors.Such designs would undoubtedly have additional benefits for further development, leading to the next generation of reanalyses.
3. To improve coupled land-atmospheric interactions, the true dynamics of land cover and use should be incorporated.Moreover, the physical parameterizations should be improved, including the responses of surface roughness, surface conductivity and albedo to regional climate.These changes would represent an improvement over the use of constant types and fractions of vegetation, as in ERA-Interim (Zhou and Wang, 2016b).
4. Given the implications of the spurious performance of the freely evolving boundary conditions in CFSR, homogeneous and accurate records of SST and sea ice should be produced.
Next-generation reanalyses, including both global and regional reanalyses, will assimilate and analyse in situ observations, satellite radiance and other remote observations.In addition to short-term accuracy and long-term trends, they will also focus on spatial patterns by incorporating or improving accurate representations of land surface conditions and processes within the coupled weather and climate Earth systems.Thus, these reanalyses will advance the simulation of landatmosphere interactions to yield high skill in studies of regional warming and the detection and attribution of regional climate change using various datasets, which frequently include global and regional reanalyses (Zhou et al., 2018;Zhou and Wang, 2016d;Herring et al., 2018;Trenberth et al., 2015;Stott, 2016;Dai et al., 2017;Zhou and Wang, 2017b).Additionally, the uncertainties associated with regional warming could be ascertained using physics ensembles with various equiprobable realizations of boundary conditions.

Conclusions
The reanalyses display differences in T a when compared to the observations with a range of −10 to 10 • C over China.
Approximately 74 and 6 % of these differences can be explained by site-to-grid elevation differences and the filtering error in the elevations used in the spectral models.These results imply fairly good skill in the simulation of the climatology of T a in the 12 reanalyses over China.Moreover, the 12 reanalyses roughly capture the interannual variability in Atmos.Chem.Phys., 18, 8113-8136, 2018 www.atmos-chem-phys.net/18/8113/2018/T a (median r =0.95).In the reanalyses, T a displays a consistently positive correlation with R s and L d and is negatively correlated with precipitation frequency, as seen in observations, despite the evident spatial patterns in their magnitudes over China.T a exhibits a strong warming trend of 0.37 • C decade −1 (p < 0.05) in the observations and 0.22-0.48• C decade −1 (p < 0.05) in the 12 reanalyses over China.In the observations, approximately 87 % of the observed trend in T a over China can be explained by the greenhouse effect (i.e.65 % can be explained by the trend in L d ), precipitation frequency (29 %) and R s (−7 %, due to the trend in radiative forcing of −1.1 W m −2 decade −1 ).
However, the biases in the trends in T a seen in the reanalyses relative to the observations display an evident spatial pattern (mean = −0.16∼ 0.11 • C decade −1 , SD = 0.15-0.30• C decade −1 ).The spatial patterns of the biases in the trends in the values of T a in the reanalyses are significantly correlated with those in R s (maximum r = 0.42, p < 0.05), precipitation frequency (maximum r = −0.62,p < 0.05) and L d (maximum r = 0.50, p < 0.05).Over northern China, the biases in the trends in T a (which are on the order of −0.12 • C decade −1 ) result primarily from a combination of those in L d (which are on the order of −0.10 • C decade −1 ) and precipitation frequency (which are on the order of 0.05 • C decade −1 ), with relatively small contributions from R s (which are on the order of −0.03 • C decade −1 ).Over southern China, the biases in the trends in T a (which are on the order of −0.07 • C decade −1 ) are regulated by the biases in the trends in R s (which are on the order of 0.10 • C decade −1 ), L d (which are on the order of −0.08 • C decade −1 ) and precipitation frequency (which are on the order of −0.06 • C decade −1 ).
If information on spatial patterns is included, the simulated biases in the trends in T a correlate well with those of precipitation frequency, R s and L d in the reanalyses (r = −0.83,0.80 and 0.77; p < 0.1); similar results are obtained for the atmospheric water vapour and the cloud fraction (r = 0.71 and −0.74, p < 0.1).These results provide a novel perspective that can be used to investigate the spatial relationships between the biases in the trends in T a and the relevant parameters among the 12 reanalyses.Therefore, improving simulations of precipitation frequency and R s helps to maximize the signal component corresponding to the regional climate.In addition, the analysis of T a observations helps to improve the performance of regional warming in ERA-Interim and JRA-55.Incorporating vegetation dynamics in reanalyses and the use of accurate aerosol information, as in MERRA-2, would advance the modelling of regional warming.The ensemble technique adopted in ERA-20CM, a twentieth-century atmospheric model ensemble that does not assimilate observations, significantly narrows the uncertainties of regional warming in the reanalyses (standard deviation = 0.15 • C decade −1 ).

Figure 1
Figure1illustrates the differences in T a from the NWP-like reanalyses and climate reanalyses relative to the homogenized station-based observations over China during the period of 1979-2010.When the T a values measured at the stations are compared directly with those in the corresponding model grid cells, the results indicate that the reanalysis products underestimate T a over most of the regions in China (by −0.28 to −2.56 • C).These discrepancies are especially pronounced over the Tibetan Plateau and middle China, where the underestimation ranges from −2.75 to −7.00 • C and from −1.19 to −2.91 • C, respectively (Fig.1and Table2).A homogenizing adjustment of 0.03 • C from the raw T a observations is insufficient to account for the underestimation of T a by the reanalyses (Fig.1and Table2).Similar biases in T a within various regions worldwide have been widely reported by previous studies(Mao et al., 2010;Pitman and Perkins, 2009;Reuten et al., 2011;Wang and Zeng, 2012;Zhou et al., 2017;Zhou and Wang, 2016b).However, we found that the spatial patterns in the differences in T a are well correlated with the elevation differences between models and stations, as reflected by correlation coefficients (r) of 0.85 to 0.94 (Figs. 2 and S1 in the Supplement).These results are in accordance with the reports from NCEP-R1, NCEP-R2 and ERA-40(You et al., 2010;Ma et al., 2008;Zhao et al., 2008).The elevation differences ( Height; Figs.2 and S1) between the stations and the model grids consist of the filtering error in the elevations used in the spectral models ( f) and differences in the site-to-grid elevations ( s) due to the complexity of the orographic topography.We further quantify the relative contributions of these factors to the T a differences.The elevation differences can explain approximately 80 % of the T a differences; approximately 74 % is produced by the site-to-grid elevation differences, and approximately 6 % is produced by the filtering error in the elevations used in the spectral models (Fig.2).The regression coefficient of the differences in T a is approximately 6 • C 1 km −1 , which is similar to the lapse rate at the surface (Fig.2).Lapse rate values that exceed 6 • C 1 km −1 can be seen over the Tibetan Plateau (shown as red dots in Fig.2).This result is very consistent with the reported lapse rates over China(Li et al., 2015;Fang and Yoda, 1988).In addition, the rate of decrease in the model filtering error is approximately 4 • C 1 km −1 among the 12 reanalyses

Figure 2 .
Figure2.The impact of inconsistencies between station and model elevations on the simulated multiyear-averaged differences in surface air temperatures (T a , unit: • C) during the study period of 1979-2010 over China.The elevation difference ( Height) between the stations and the models consists of the filtering error in the elevations used in the spectral models ( f ) and the difference in site-to-grid elevations ( s) due to the complexity of orographic topography.f is derived from the model elevations minus the "true" elevations in the corresponding model grid cells from GTOPO30.The GTOPO30 orography is widely used in reanalyses, e.g. by ECMWF.The colour bar denotes the station elevations (unit: m).The relationship of the T a differences is regressed on Height (shown at the bottom of each subfigure) or f and s (shown at the top of each subfigure); the corresponding explained variances are shown.

Figure 3 .
Figure 3. Taylor diagrams for annual time series of the observed and reanalysed surface air temperature anomalies (T a , unit: • C) from 1979 to 2010 in (a) China and (b-h) the seven subregions.The correlation coefficient, standard deviation and root-mean-squared error (RMSE) are calculated against the observed homogenized T a anomalies.

Figure 4 .
Figure 4. Composite map of partial correlation coefficients of the detrended surface air temperature (T a , unit: • C) against surface incident solar radiation (R s ), precipitation frequency (PF) and surface downward longwave radiation (L d ) during the period of 1979-2010 from observations and the 12 reanalysis products.The marker "+" denotes the negative partial correlations of T a with R s over the Tibetan Plateau in NCEP-R2, ERA-20C and ERA-20CM.

Figure 5 .
Figure 5. (a, b) The observed trends in surface air temperature (T a , unit: • C decade −1 ) and the simulated biases in the trends in T a (unit: • C decade −1 ) during the period of 1979-2010 from (c) raw observations and(d-o) the 12 reanalysis products over China with respect to the homogenized observations.The squares denote the original homogeneous time series, and the dots denote the adjusted homogeneous time series.The probability distribution functions of all of the biases in the trends are shown as coloured histograms, and the black stairs are integrated from the trend biases with a significance level of 0.05 (based on two-tailed Student's t tests).The cyan and green stars in (k-n) represent estimates of the biases in the trends outside the ensemble ranges whose locations are denoted by the black dots shown in (k-n).

Figure 6 .
Figure6.Composite map of the contributions (unit: • C decade −1 ) of the biases in the trends in three relevant parameters, surface incident solar radiation (R s , in red), surface downward longwave radiation (L d , in green) and precipitation frequency (in blue) to the biases in the trends in surface air temperature (T a ) during the study period of 1979-2010, as estimated using the 12 reanalysis products over China.

Figure 7 .
Figure 7. Contributions (unit: • C decade −1 ) of the biases in the trends in surface air temperatures (T a ) from three relevant parameters, surface incident solar radiation (R s , in brown), surface downward longwave radiation (L d , in light blue) and precipitation frequency (PF, in deep blue), during the study period of 1979-2010 from the 12 reanalysis products over China and its seven subregions.

Figure 8 .
Figure 8. Spatial associations of the simulated biases in the trend in surface air temperature (T a ) versus three relevant parameters among the 12 reanalysis products (solid lines indicate the NWP-like reanalyses, and dashed lines indicate the climate reanalyses).The probability density functions (unit: %) of these biases in the trends are estimated from approximately 700 1 •× 1 • grid cells that cover China.The median values (coloured dots with error bars of spatial standard deviations) of the biases in the trends in T a (unit: • C decade −1 ) in the 12 reanalyses are regressed onto those of (a) the surface incident solar radiation (R s , unit: W m −2 decade −1 ), (b) precipitation frequency (unit: days decade −1 ) and (c) the surface downward longwave radiation (L d , unit: W m −2 decade −1 ) using the ordinary least-squares method (OLS, denoted by the dashed grey lines) and the weighted total least-squares method (WTLS, denoted by the solid black lines).The 5-95 % confidence intervals of the regressed slopes obtained using WTLS are shown as shading.The regressed correlations and slopes are shown as grey and black text, respectively.

Table 1 .
Precipitation frequency can regulate the partitioning of available energy into latent and sensible heat Summary information on the 12 reanalysis products, including institution, model resolution, assimilation system, surface observations associated with surface air temperatures, sea ice and sea surface temperatures (SSTs), and greenhouse gas (GHG) boundary conditions.The number in the parentheses in the model name column is the year of the version of the forecast model used.More details on each product can be found in the associated reference.

Table 1 .
Continued from right column.

Table 2 .
Differences (unit: • C) relative to the homogenized observations and trends (unit: • C decade −1 ) in surface air temperatures (T a ) from 1979 to 2010 over China and its seven subregions.The bold font indicates results that are significant according to two-tailed Student's t tests with a significance level of 0.05.

Table 3 .
Spatial pattern correlation (unit: 1) of three groups: partial relationships, trends and simulated biases in the trends in surface air temperature (T a ) against surface incident solar radiation (R s ), precipitation frequency (PF) and surface downward longwave radiation (L d ).The bold and italic bold fonts indicate results that are significant according to two-tailed Student's t tests with significance levels of 0.05 and 0.1, respectively.
Data availability.The latest meteorological observation datasets were obtained from the China Meteorological Administration (CMA, http://www.cma.gov.cn).Current reanalysis products were publicly downloaded from the European Centre for Medium-Range Weather Forecasts (ECMWF) providing ERA-Interim, ERA-20C, ERA-20CM and CERA-20C data (http://www.ecmwf.int/); the Global Modeling and Assimilation Office (GMAO) at the NASA Goddard Space Flight Center providing MERRA and MERRA2 data (https://disc.gsfc.nasa.gov/); the NOAA Earth System Research Laboratory (ESRL) providing NCEP-R1, NCEP-R2, CFSR NOAA 20CRv2 and NOAA 20CRv2c data (http://www.esrl.noaa.gov/); and the Climate Prediction Division of the Global Environment and Marine Department at the Japan Meteorological Agency providing JRA-55 data (http://jra.kishou.go.jp/).Elevation data of GTOPO30 from the United States Geological Survey Earth Resources Observation and Science Data Center (http://edc.usgs.gov/products/elevation/gtopo30/gtopo30.html) and vegetation index of GIMMS from the Global Inventory Monitoring and Modeling System project (https://ecocast.arc.nasa.gov/data/pub/gimms/)were available for the public.