Some issues in uncertainty quantification and parameter tuning : a case study of convective parameterization scheme in the WRF regional climate model

The current tuning process of parameters in global climate models is often performed subjectively or treated as an optimization procedure to minimize model biases based on observations. While the latter approach may provide more plausible values for a set of tunable parameters to approximate the observed climate, the system could be forced to an unrealistic physical state or improper balance of budgets through compensating errors over different regions of the globe. In this study, the Weather Research and Forecasting (WRF) model was used to provide a more flexible framework to investigate a number of issues related uncertainty quantification (UQ) and parameter tuning. The WRF model was constrained by reanalysis of data over the Southern Great Plains (SGP), where abundant observational data from various sources was available for calibration of the input parameters and validation of the model results. Focusing on five key input parameters in the new Kain-Fritsch (KF) convective parameterization scheme used in WRF as an example, the purpose of this study was to explore the utility of high-resolution observations for improving simulations of regional patterns and evaluate the transferability of UQ and parameter tuning across physical processes, spatial scales, and climatic regimes, which have important implications to UQ and parameter tuning in global and regional models. A stochastic importance sampling algorithm, Multiple Very Fast Simulated Annealing (MVFSA) was employed to efficiently sample the input parameters in the KF scheme based on a skill score so that the algorithm progressively moved toward regions of the parameter space that minimize model errors. The results based on the WRF simulations with 25-km grid spacing over the SGP showed that the precipitation bias in the model could be significantly reduced when five optimal parameters identified by the MVFSA algorithm were used. The model performance was found to be sensitive to downdraftand entrainment-related parameters and consumption time of Convective Available Potential Energy (CAPE). Simulated convective precipitation decreased as the ratio of downdraft to updraft flux increased. Larger CAPE consumption time resulted in less convective but more stratiform precipitation. The simulation using optimal parameters obtained by constraining only precipitation generated positive impact on the other output variables, such as temperature and wind. By using the optimal parameters obtained at 25-km simulation, both the magnitude and spatial pattern of simulated precipitation were improved at 12-km spatial resolution. The optimal parameters identified from the SGP region also improved the simulation of precipitation when the model domain was moved to another region with a different climate regime (i.e. the North America monsoon region). These results suggest that benefits of optimal parameters determined through vigorous mathematical procedures such as the MVFSA process are transferable across processes, spatial scales, and climatic regimes to some extent. This motivates future studies to further assess the strategies for UQ and parameter optimization at both global and regional scales.


Introduction
Sound strategies and decisions making in climate change mitigation and adaptation require not only robust projections of the mean or most likely scenario but also the occurrence of Published by Copernicus Publications on behalf of the European Geosciences Union.
B. Yang et al.: Some issues in uncertainty quantification and parameter tuning low probability but high-impact events (IPCC, 2007).Uncertainty quantification (UQ) is the science of quantitative characterization and reduction of uncertainties in applications.It determines how likely certain outcomes are if some aspects of the system are not exactly known.UQ of predicted future climate is usually based on the ability of models to produce the current climate (Allen et al., 2000;Tebaldi et al., 2005).The full probability density functions (PDFs) of occurrence for both present climate and future prediction are needed to predict the probability of extreme weather or climate events.
Different approaches have been applied to generate ensemble simulations and construct PDFs for variables of climate model output.These approaches include perturbing the initial conditions, perturbing the input parameters of the model, ensemble simulations with multiple parameterization schemes, or ensemble simulations with multiple models, and so on (Allen et al., 2000;Giorgi and Mearns, 2002;Stainforth et al., 2005;Lopez et al., 2006).Covey et al. (2011) found that the variability of globally averaged upwelling longwave radiation and surface temperature induced by perturbation of initial condition is much smaller than that induced by perturbation of model input parameters.Hawkins and Sutton (2009) estimated the contributions to the total climate change prediction uncertainty from internal variability, model uncertainty, and scenario uncertainty and found that their relative contributions depend on the prediction lead times.Furthermore, for the decadal time scales and regional spatial scales (∼2000 km), model uncertainty is of greater importance than internal variability.Quantifying and reducing the uncertainty of tunable input parameters in climate models can improve our understanding of the physical process in climate systems as well as reduce the uncertainty for projecting future climate change.
Parameterizations in climate models typically contain many input parameters that are determined based on the physical processes being parameterized or estimated based on tuning to obtain qualitative agreement between the simulations and observations from limited local measurements or global observations.Larger number or ranges of input parameters usually result in higher uncertainties in climate simulations because of nonlinear interactions and compensating errors of parameters (Gilmore et al., 2004;Molders, 2005;Min et al., 2007;Murphy et al., 2007).Perturbed-Parameter Ensembles (PPE) with the same climate model but different combinations of several key input parameters, within reasonable ranges, have been employed to assess future climate uncertainty (Murphy et al., 2004;Jackson et al., 2003Jackson et al., , 2008;;Collins et al., 2011).
To approximate the posterior probability distribution of input parameters in physical parameterizations, many sampling strategies have been proposed, such as grid search method, Metropolis/Gibbs algorithm (Metropolis et al., 1953;Kirkpatrick et al., 1983;Sen and Stoffa, 1996), Monte Carlo or Quasi Monte Carlo (QMC), (Moskowitz and Caflisch, 1996), Latin Hypercube selection (Stein, 1987), Multiple Very Fast Simulated Annealing (MVFSA) (Ingber, 1989;Jackson et al., 2004), among others (Tierney and Mira, 1999;Haario et al., 2001).Grid search is a straightforward method to test the sensitivity of parameters by subdividing each parameter space into equally spaced intervals and evaluating uncertainty arising from those combinations.However, this method may require huge computational resources.For example, around 10 5 simulations are needed if five parameters with 10 intervals for each parameter are to be explored.Thus, high-efficiency sampling methods are needed for applications related to climate modeling.MVFSA is a stochastic importance sampling algorithm that can progressively move toward regions of the parameter space that minimize model errors and more efficiently provide useful information for optimizing or generating accurate measures of the posterior distribution (Villagran et al., 2008).Jackson et al. (2008) applied MVFSA to optimize six parameters related to the cloud process in a Global Climate Model (GCM) because cloud processes play a critical role in the hydrological cycle and uncertainty of climate response to doubling of CO 2 forcing (Colman, 2003;Webb et al., 2006;Medeiros and Stevens, 2011).Constrained by different sets of observations, their work provided a six-member ensemble of optimized model configurations with a narrower range of future temperature change projection.
Currently, UQ and parameter tuning in climate study are typically applied in GCMs, with more focus on global climate sensitivity and large-scale climatic features.Equal weighting of the state fidelity globally could compromise parameter tuning in GCMs because the processes being tuned may only be relevant for particular regimes.Furthermore, global tuning may produce parameter settings that approximate the observed global climate, but at the expense of yielding unphysical states or improper balance of budgets at the local or regional scales.Even if the calibration produces realistic regional means, important spatial variability may not be reproduced if observed spatial patterns from high-resolution measurements are not utilized in the global tuning.Hacker et al. (2011) evaluated the impacts of initial condition and model parameterization uncertainties on a WRF-based ensemble prediction system and found that different combinations of parameterization schemes associated with perturbed parameters could generate the most skillful ensemble prediction.
This study applies UQ and parameter tuning to a Regional Climate Model (RCM), which offers more flexibility in terms of model configuration and is computationally more economical, allowing some of the above issues to be explored in more details.More specifically, we explore the utility of high-resolution observations for improving simulations of regional patterns.We further investigate three important questions.First, can calibration of specific physical parameterizations lead to improvements in aspects not directly influenced by the parameterizations?Second, can model calibration performed at a coarser scale improve simulations at a finer scale?Lastly, can optimal parameters obtained by calibration in one climate regime lead to improvements in other climate regimes?These questions aim at evaluating the transferability of UQ and parameter tuning across physical processes, spatial scales, and climatic regimes, which have important implications to UQ and parameter tuning in global and regional models.
With the rapid growth of computing resources in the past decades, some climate models can now be applied at a cloud-resolving scale (Khairoutdinov et al., 2001;Tao et al., 2009).However, because of simulation length and the need for ensemble modeling, climate models being used in projecting climate change still use grid spacing of 25 km or larger where cumulus processes have to be parameterized.Since convective process contributes disproportionately to the magnitude and intensity of precipitation, and the diabatic heating from convective process is an important driver of global and regional circulation, it is important to better understand and constrain the convective parameterizations used in climate and weather forecasting models (Warner and Hsu, 2000;Liu et al., 2001).Many different Convective Parameterization Schemes (CPS) have been developed over the past decades (Janjic, 1994;Emanuel and Zivkovic-Rothman, 1999;Gregory et al., 2000;Grell and Devenyi, 2002).Among them, the Kain-Fritsch (KF) scheme (Kain and Fritsch, 1993;Bechtold et al., 2001), including more recent updates (Kain, 2004), is commonly used in regional models including the Weather Research and Forecasting (WRF) model (Skamarock et al., 2001).
This study applies UQ and model calibration to the WRF regional model to address the questions discussed above.Simulations were performed with WRF constrained by reanalysis data over the Southern Great Plains (SGP), where abundant observational data from various sources are available for calibration of the input parameters and validation of the model results.The MVFSA importance sampling algorithm was applied to quantify the uncertainty ranges and identify the optimal values of five key input parameters in the new KF CPS used in the WRF model.Because of its importance and sensitivity to model physics, precipitation is used as the constrained variable in the optimization process.The impact of precipitation-based optimization on a few other variables, such as temperature and wind, was analyzed.Furthermore, parameter transferability across spatial scales and climate regimes was investigated using sensitivity experiments.
This paper is organized as follows.Parameter selection in the new KF CPS, the MVFSA sampling algorithm, observational data, and the WRF model configuration are described in Sect. 2 and the optimization results, sensitivities of model performance, precipitation and other output variables to parameters in the KF scheme, and dependence of optimization on model configurations are presented in Sect.3. The conclusion is discussed in the last section.
2 Parameters, approach and experiment design

The new KF CPS and five key parameters
CPSs are appropriate for use in RCMs with a moderate grid spacing of 10-100 km.This spacing is large enough so that a cloud ensemble within the grid can be treated as a statistical entity but small enough to keep the uniform characteristics of the cloud environment.The new KF CPS, which is commonly used in many mesoscale models including WRF, was developed based on a mass flux parameterization (Kain, 2004).Using a Lagrangian parcel method (Simpson and Wiggert, 1969;Kreitzberg and Perkey, 1976), the new KF CPS operates by searching for the Updraft Source Layer (USL), which has a potential for inducing shallow or deep convection, starting from the surface upward to within the lowest 300 hPa of the atmosphere.When the USL is identified, updraft flux is initialized with a velocity based on atmospheric instability and grid-scale vertical motion at USL (Kain and Fritsch, 1990).Air mass is exchanged between the updraft and the environment through entrainment and detrainment at each layer.The rate of entrainment flux is related to the cloud radius that varies from 1000 to 2000 m depending on the large-scale vertical velocities.The intensity of updraft flux decreases with altitude as the thermal contrast between the cloud and the environment is reduced by mixing.Convective downdrafts, which play an essential role in determining the heating profile and humidity features in the lower troposphere (Johnson, 1976;Cheng, 1989), are driven by the evaporation of condensate generated within the updrafts.The strength of the downdraft mass flux is related to the relative humidity of environmental air (Knupp and Cotton, 1985;Ferrier et al., 1996;Shepherd et al., 2001).The fluxes of updraft, entrainment/detrainment, downdraft, as well as of grid-scale compensating subsidence are parameterized and used to calculate the convective temperature, water vapor and cloud water tendencies that are used to advance the respective largescale fields.
Five key parameters related to the downdraft flux rate and starting height, environmental entrainment flux rate, turbulent kinetic energy (TKE) in the sub-cloud layer, and the consumption time of Convective Available Potential Energy (CAPE) in the new KF CPS in the WRF are thought to be important in the KF CPS, but the range of their possible values is quite wide (J.Kain, personal communications, 2011).
The intensities of both downdraft and entrainment fluxes are proportional to the updraft mass flux at the top of USL in the KF CPS.In this study, two parameters P d and P e are defined as additional scale factors to modulate the rates of downdraft and entrainment fluxes from 1/2 to 2 times of their original values, respectively.
In Eqs. ( 1) and (2), M USL u and M USL d are the updraft and downdraft mass fluxes at the top of USL, respectively.RH is the mean relative humidity of environment air from the starting layer of downdraft to cloud base.R is the cloud radius, δp is the pressure thickness of a model layer and δM e is the maximum possible entrainment rate of this layer.More details can be found in Kain and Fritsch (1990).
Downdraft is assumed to start from 150 hPa above USL in the standard KF CPS.The starting height of downdraft P h controls the downdraft structures and also affects the atmospheric properties in the sub-cloud layer.We set the range of P h as 50-350 hPa to allow a larger degree of freedom in the downdraft structures from tall and narrow to short and wide.
Shallow or deep convection are based on different closure assumptions.For shallow convection, the intensity of updraft mass flux at USL is assumed to be a function of TKE in the sub-cloud layer.For deep convection, the KF scheme incrementally rearranges the updraft, downdraft and other mass flux until the CAPE is reduced by at least 90 % within a specified time, called CAPE consumption time.The CAPE consumption time is related to the vertical shear defined as the difference between horizontal wind at the cloud base and 500 hPa level (Bechtold et al., 2001).The TKE and average CAPE consumption time are referred to as P t and P c , with values of 5 m 2 s −2 and 2700 s in the standard KF CPS.We allowed a range from 3 to 12 m 2 s −2 for P t and from 900 to 7200 s for P c .The default value in the standard KF scheme and range of value for each parameter are shown in Table 1.

MVFSA optimization approach
Very Fast Simulated Annealing (VFSA) is a stochastic importance sampling algorithm with high converging efficiency toward the optimal results (Ingber, 1989;Jackson et al., 2004).For most optimization applications, multiple extreme values (i.e.local minimum/maximum) may exist and the selected parameter values may be trapped by some local minimums within the parameter space in one VFSA procedure.Repeating the VFSA multiple times with different initial starting parameter set (i.e.MVFSA) can help prevent such local trapping and identify the global minimum (Jackson et al., 2008;Villagran et al., 2008).The steps in the MVFSA algorithm, which is adapted from Jackson et al. (2004Jackson et al. ( , 2008)), are the following; 1.Take random points in the parameter spaces and run a simulation at each step.At the first step, an initial starting parameter set (m 0 ) is randomly selected to run the first WRF simulation.
2. Quantify the differences between simulation and observation in terms of a scalar skill score or "cost," referred to as E(m), where m is the parameter set.If Gaussian errors exist in the model results, E(m) is usually defined as N refers to different sets of observations/variables. d obs refers to observations and g(m) refers to simulations with a specific parameter set m. C −1 is the inverse of the data covariance matrix, which could include a weight coefficient for different variables.In this study, only one set of observation (precipitation) is used with equal weight at each grid point in the observation constraint in Eq. ( 3), so E(m) is simplified as: where i, j are the horizontal grid points in the model domain, and k represents the number of time steps.In Eq. ( 4), the model biases are assumed to be spatially or temporally uncorrelated (i.e. the data covariance matrix C −1 in Eq. ( 3) only contains nonzero elements along the diagonal).The frequency of precipitation rate tends to have an exponential distribution rather than a Gaussian distribution, which indicates that the score function of the model based on Eqs. ( 3) and ( 4) is dominated by the upper range in the observation.Given that our case study has strong convection over a limited region during a short time period, the use of Eq. ( 4) is appropriate in this study (see Sect. 2.3).
3. Reselect the parameter values based on the skill score so that the algorithm progressively moves toward regions of the parameter space that minimize modeling errors.
Starting from the second round of the procedure, the parameters will be perturbed to a new set of m new as follows: where m min i and m max i represent the possible minimum and maximum values of each parameter, and y i is drawn from a Cauchy distribution which is dependent on an annealing coefficient T : Within Eqs. ( 5)-( 8), subscript i, k are the parameter number and iteration number, respectively.sgn is the sign operator and RND represents a random number from a uniform distribution between 0 and 1.At iteration k, the annealing coefficient T is lowered according to If the results with a new set of parameters show an improvement over the old one, in effect, E = E(m new )− E(m 0 ) < 0, then the new set of m is accepted as the basis for the next iteration, that is, m 0 = m new .If not, the new set of parameters can still possibly be accepted with a probability With a lower T , the VFSA algorithm moves progressively toward regions of the parameter space that minimize model errors since the width of the Cauchy distribution will be incrementally focused on the current accepted parameter set, facilitating the VFSA algorithm to converge more efficiently.In this study, we lower T every two steps with an initial value of T 0 as 10.
4. To get global optimal values, we repeat the VFSA procedure three times with different starting parameter set (i.e. three chains).We conducted 50 experiments in each chain.Only 148 simulations are valid because instability occurred in two of the simulations.The three chains nearly converge to the same region within the parameter spaces (not shown), indicating that three chains are probably enough for this case study.
Figure 1 shows the best values averaged for three iterations based on three independent MVFSA chains.As seen in Fig. 1, the averaged best values monotonically decrease as the number of model integrations increases and finally reach convergence after 28 integrations.
In climate model calibration, we are interested in not only the magnitudes of model bias (e.g. standard deviation) but also the similarity of spatial pattern (e.g.spatial correlation coefficient) between observed and modeled large-scale fields (Taylor, 2001).We define where SC[d obs ,g(m)] refers to the spatial correlation coefficient between the observation and simulation, and n represents the time series.Both E(m) and C(m) are normalized so they can be considered together as EC(m), EC(m) = E(m) − C(m).Doing so accounts for both the magnitude of bias and similarity of spatial pattern.For brevity, E(m), C(m) and EC(m) are denoted as E, C, and EC, respectively hereafter.The University of Washington (UW) 1/8 gridded meteorological data set includes daily precipitation, maximum and minimum 2-m temperature and 10-m wind speed (Maurer et al., 2002).Only the daily precipitation data are used in the observation constraint in Eq. ( 4).The maximum and minimum temperatures at 2-m height and wind speed at 10m height are used to evaluate the WRF simulation performances that used the optimal parameters derived by constraining the precipitation alone.

Model configuration
The Advanced Research Weather Research and Forecasting model Version 3.2.1 (WRF Version 3.2.1,Skamarock et  2), with horizontal grid spacing of 25 km and 36 sigma levels from the surface to 100 hPa.Wind, temperature, water vapor, pressure, and underlying surface variables used to generate initial and boundary conditions are derived from the North American Regional Reanalysis (NARR) data with 32-km horizontal resolution and 3-h time intervals.
To obtain a reasonable simulation result for precipitation over the SGP region before starting the optimization process, we compared two different radiation schemes, RRTMG (Rapid Radiative Transfer Model for GCMs, Barker et al., 2003;Pincus et al., 2003) vs. CAM (Community Atmosphere Model 3.0, Collins et al., 2004), and two different microphysics schemes, WSM6 (WRF Single-Moment 6-class, Hong and Lim, 2006) vs. Morrison 2-Moment (Morrison et al., 2005).Figure 3 shows the observed and simulated monthly mean precipitations for June 2007 with different radiation (RRTMG vs. CAM) and microphysical parameterization schemes (WSM6 vs. Morrison) while the standard KF CPS was used in both simulations.The results show that more than 70 % of the rainfall is contributed by convective precipitation, indicating the importance of the CPS in simulating precipitation for the region in the summer.We find that the simulated precipitation is more sensitive to different radiation schemes than different microphysical schemes in this study.While the CAM radiation scheme tends to underestimate the amount of precipitation, the RRTMG seems to produce a more realistic magnitude and spatial pattern of precipitation.However the RRTMG scheme produces larger areas of precipitation than observed, especially over the northeast corner of the domain.Simulation result with the Morrison scheme is slightly better than with WSM6.Finally, RRTMG radiation and Morrison microphysics schemes, as well as the Mellor-Yamada-Janjic (MYJ, Janjic, 2002) PBL scheme and the Noah Land Surface Model (LSM) (Chen and Dudhia, 2001) were used in all simulations in this study.
We selected 1 May to 30 June 2007 for our simulations to focus on a wet month (June) with mostly convective-type precipitation.To isolate the influence of the convective parameterization, all model simulations, including those identifying the best configuration, were initialized every three days to minimize errors in the large-scale circulation that can also affect precipitation.Each simulation was initialized two days after the previous simulation.Discarding the first day as model spin-up, the results of the last two days of each simulation were concatenated to form a continuous time series for analysis.Unlike the atmospheric state, which was initialized every three days using the NARR data, the land surface state (soil moisture and temperature) was initialized based on simulation of the previous three days to produce better spun-up land surface conditions for realistic land-atmosphere interactions.As described in Sects.3.4 and 3.5, the same experimental design was used to conduct simulations with different horizontal resolutions and over different regions.

Model response to five parameters
The top panel of Fig. 4 shows the response of model performance (quantified as E as introduced in Sect.2.2) to five input parameters based on the 148 simulations through the MVFSA procedure.E is equal to 137 in the simulation with default parameters in the KF CPS. Figure 4 shows that E varies from 74 to 225, with lower E than 137 in the majority of experiments.We found that model response is more sensitive to the changes of P d (downdraft flux rate related coefficient), P e (entrainment rate related coefficient), and P c (CAPE consumption time) than to the other two parameters.For example, the model bias E significantly decreases with the increase of P d or decrease of P e .The optimal values for P d , P e , and P c that minimize E are around 0.9, −0.9, and 4600 s, respectively.The optimal value for P h and P t are around 280 hPa and 9 m 2 s −2 , both larger than the default values in the standard KF scheme for the starting height of downdraft above USL and the maximum TKE in the sub-cloud layer in this study.The responses of E to variations in P h and P t are not as evident as those of the other three parameters.
Among the 148 valid simulations derived from the MVFSA procedure, there were 114 simulations with lower E (better performance) than the standard KF scheme with default parameters.These 114 simulations are defined as "good" experiments.The middle panel of Fig. 4 shows the frequency distributions of the "good" experiments as a function of each parameter value.We found that around 51 % of the "good" experiments were produced by P d from 0.6 to 1.0, indicating that the ratio of downdraft to updraft mass fluxes shown in Eq. ( 1) is too small in the standard KF CPS.Approximately 60.5 % of the "good" experiments were produced by P e from −1.0 to −0.4, indicating that the ratio of maximum possible entrainment rate to updraft mass fluxes shown in Eq. ( 2) is too large in the standard KF CPS.As P h , P t , and P c are within the range from 230 to 320 hPa, 9 to 11 m 2 s −2 , and 3000 to 6000 s, respectively, there are better chances to obtain relatively lower E (better performance).
The marginal posterior probability distributions (PPD) for the five parameters derived from kernel density estimation are also shown in the bottom panel of Fig. 4. In statistics, kernel density estimation, a non-parametric way of estimating the PDF of a random variable, is a fundamental data smoothing problem where inferences about the population are made, based on a finite data.Different from the upper two panels of Fig. 4, the PPD was calculated using the proposed sample instead of the admitted samples to avoid the heavily biased admitted samples towards the mode.Similar to the middle panel of Fig. 4, large probabilities are located at around 0.8, −0.7, 320, 9.5 and 3200, respectively for the five parameters of P d , P e , P h , P t and P c .
Figure 5 shows the observed and simulated monthly mean precipitation for June 2007 with default and optimal parameters (see Table 2) in the simulations.Overall, the model with default parameters captures the spatial pattern but overpredicts the amount of precipitation, especially over the northeastern part of the domain.The simulation with Ebased optimal parameters has significantly reduced the wet bias of the model, as E decreases from 137 to 74.
Skill scores C describing the spatial pattern of precipitation (see Eq. 11) were calculated for all of the 148 experiments.The variations of E and C with perturbed parameters are closely correlated, with a correlation coefficient of 0.79, implying that the spatial pattern of the precipitation would likely be improved if the magnitude of the model's bias was  reduced through the MVFSA process.Among the five input parameters, entrainment related parameter P e has the most significant impact on C (not shown).
EC is calculated to represent the model performance in both magnitude and spatial pattern of precipitation.The bottom panel of Fig. 5 shows the simulations with optimal parameters based on E and EC, respectively.The E values for simulations with optimal E and EC are 74 and 79, respectively.The C values are 0.34 and 0.36, respectively, indicating that the spatial pattern in the simulation with optimal EC is more similar to the observation than that of the default or with optimal E.
Figure 6 shows the observed and simulated frequencies of daily precipitation as a function of rain rate.Compared to the observation, the WRF with the standard KF CPS evidently overestimates the frequency of precipitation across all rain rates and the model wet bias becomes larger for heavy rain.By applying the optimal parameters based on E (not shown) or EC, the model markedly reduced the overestimated occurrence frequency for rainy events larger than 3 mm day −1 .The improvement is more evident for the heavy precipitation with rain rate larger than 20 mm day −1 .

Sensitivity of precipitation and correlation with other variables
Figure 7 shows the responses of convective, explicit and total precipitation to each of the five parameters.As mentioned previously, total precipitation is contributed largely by the convective precipitation in this case study.The amount of explicit precipitation is around 0.2 to 1.5 mm day −1 , while convective precipitation varies between 3.8 and 9 mm day −1 .Because of the competition for moisture and physical interaction between the grid and sub-grid scale processes, the explicit precipitation is also affected by the CPS in the model  The result is derived from daily precipitation at all grids within the model domain as shown in Fig. 2 for June 2007.(Kain, 2004), although the convective precipitation is more sensitive to the parameters.
From the middle panel of Fig. 7 we found that downdraft related parameter P d and CAPE consumption time P c have larger impact on the convective precipitation.With a larger ratio of downdraft to updraft flux (larger P d ), more condensed water would be evaporated associated with a stronger downdraft process, resulting in less precipitation.The larger CAPE consumption time (larger P c ) slows down the develop-ment and decreases the intensity of convection, thus reducing the convective precipitation.Stronger entrainment rate usually produces less convective precipitation because it dilutes the moist convective core, which tends to suppress the updraft (Kain and Fritsch, 1990;Zhang and McFarlane, 1995).The impact of TKE on convective precipitation is relatively small.
The change of explicit precipitation is often anti-correlated with the convective precipitation.When the convective precipitation is suppressed with the perturbed parameters, more moisture will be available in the atmosphere, favoring the formation of explicit precipitation calculated based on the microphysics scheme in the model.The top panel of Fig. 7 shows that the explicit precipitation is more sensitive to the parameters related to entrainment and CAPE consumption time than the other three parameters.Since total precipitation is mainly contributed by the convective precipitation, the responses of total precipitation to the five parameters are consistent with that of convective precipitation.
Figures 8 and 9 demonstrate how the changes of two parameters, P d and P e , physically affect the convective process and other subsequent meteorological variables such as air temperature and humidity, cloud, and surface heat flux.In Fig. 8 we see clear response of the low-level cloud, water vapor, temperature and surface energy flux to the downdraftrelated parameter P d .While the downdraft flux became stronger with the increase of P d , it enhanced the evaporation of condensate, increasing the humidity and decreasing the temperature in the lower troposphere (900-800 hPa), which favors the formation of a low cloud.Consequently, increased The ratio of entrainment to updraft flux (P e ) also showed a remarkable impact on the convection process and weather system (see Fig. 9).With a larger entrainment rate, efficient mixing can suppress the development of updraft and increase the environmental air humidity at the middle (800-600 hPa) atmosphere, so that deep convection is weakened and the cloud top height decreases (i.e.outgoing longwave radiation increases).In the lower atmosphere, the weaker condensate or evaporation that results from weaker updraft can increase temperature and produce fewer clouds.Consequently, the downward surface solar radiation and skin temperature significantly increase.Since the skin temperature and low-level air temperature increase consistently, a clear trend of sensible heat flux (SH) was not seen with the change of entrainment rate.LH increases primarily due to the increased downward solar radiation at the surface.
The impact of the downdraft starting height P h on the convection process is similar to that of the downdraft rate (not shown).Downdraft flux initiating at a higher level can produce a tall and narrow downdraft, which has effects similar to a larger downdraft rate.
The relative sensitivities of the response of the meteorological variables to the five CPS parameters are shown in Fig. 10.The sensitivity ranking is calculated based on the correlation coefficients between output variables (y-axis) and input CPS parameters (x-axis) from 148 simulations, representing the variability of output variables against the perturbed input parameters (e.g. the slope of the fitted curve shown in Figs.7-9). Figure 10 shows that P d and P e have more impact on the output variables than the other three input parameters, while most of the output variables are least sensitive to P t , the maximum TKE in the sub-cloud layer.The impact of CAPE consumption time (P c ) on precipitation is significant as discussed in Sects.3.1 and 3.2, because P c efficiently controls the development of the convection.As shown in Fig. 10, cloud water content, PBL specific humidity, outgoing longwave radiation (OLR) and downward longwave radiation are very sensitive to P c .
A total of 148 simulations with perturbed parameter sets were completed in this study, providing an opportunity to investigate not only the response of various model variables to the CPS parameters but also the correlation and interaction among different model variables.As summarized in Table 3, strong positive correlations can be found between monthly mean convective precipitation and soil moisture, skin temperature and downward solar radiation flux, LH and air temperature, as well as LH and downward solar radiation flux.We found significant negative correlations between lower/mid-level air humidity and soil moisture, lowerlevel air humidity and convective precipitation, OLR and soil moisture, SH and air temperature, as well as LH and lowlayer cloud water content.

Impact of optimization on temperature and wind speed
Because only observed precipitation is used to constrain the MVFSA algorithm, the question arises as to how other simulated variables vary with the five CPS parameters when the model converges to the optimal results for precipitation.Table 4 shows the correlation coefficients of model skill scores between precipitation and 2-m temperature and 10-m wind speed.The correlation coefficient is 0.31 between E(T mean ) and E (Prec) and 0.76 between E(T mean ) and C (Prec) , indicating that the bias of model temperature is more correlated with spatial pattern than the bias of magnitude of simulated precipitation.The correlation coefficient between E (Wind) and E (Prec) is 0.86 and between E (Wind) and C (Prec) is 0.87, implying a consistent performance in simulating wind speed and precipitation (i.e.simulations with better precipitation are also more likely to have better wind speed).
Figure 11 shows the differences of model biases for temperatures and wind speed between the simulations with default and optimized parameters.Here, the value on each grid point is calculated as (|Optimal-Observation|−|Default-Observation|), so negative value represents a positive impact by using the optimized parameters.It can be seen that, except for the maximum temperature, all variables have reduced absolute biases with the optimized parameters than with the default parameters, especially over regions with strong precipitation, even though the optimal parameters are obtained only based on precipitation.The improvements for temperatures are more significant when using optimal parameters based on EC than based on E (not shown), which suggests that including precipitation pattern in the skill score metrics may be important in the optimization process.

Dependence of optimized parameters on model grid spacing
It is well known that the performance of CPS may vary with model resolution as current convective parameterizations generally exhibit scale dependence (Arakawa et al., 2011).Retuning of model parameters for high-resolution applications can be very time consuming and computationally intensive.In this study, the MVFSA procedure was performed based on WRF simulations at 25-km grid spacing.To Table 3. Correlations among different model output variables in 148 WRF simulations (25-km, SGP) with perturbed parameters in the KF scheme.The correlation coefficients are calculated based on the domain average as shown in Fig. 3. TS: skin temperature; SM: soil moisture; QC: cloud liquid water content at layers from 900 to 800 hPa; Q(P ): air specific humidity for 1000-900 hPa; T (P ): air temperature for 1000-900 hPa; Q(L): air specific humidity for 900-800 hPa; T (L): air temperature for 900-800 hPa; Q(M): air humidity for 800-600 hPa; T (M): air temperature for 800-600 hPa; SWD: short-wave radiation at surface; LWD: downward long-wave radiation at surface; OLR: outward long-wave radiation at top of the atmosphere; SH: sensible heat flux at surface; LH: latent heat flux at surface; EP: explicit precipitation; CP: convective precipitation.assess the transferability of model calibration across spatial scales, we completed two simulations with a higher resolution (12-km) with default and optimal parameters obtained from the 25-km simulations.Identical model configurations and domain size were used between the 25 km and 12 km resolution simulations.
Figure 12 shows the spatial distributions of observed and simulated precipitation with default and optimal parameters, respectively.We found that with default CPS parameters in the standard KF, the model can reasonably capture the spatial pattern of precipitation but significantly overestimates the maximum precipitation, especially over Oklahoma, the Kansas-Missouri border, and the Texas-Louisiana border.By using the optimal parameters obtained from the 25-km simulations, both the magnitude and spatial pattern of precipitation are improved at 12-km spatial resolution, with E decreasing from 148 to 89 and C increasing from 0.3 to  3) to the five CPS parameters (see Table 1) based on the 148 simulations (25 km) over SGP.The sensitivity ranking is calculated based on the correlation coefficients between output variables (y-axis) and input CPS parameters (x-axis) from 148 simulations.0.37.These results suggest that quantitative optimization may yield more robust model parameters that can improve precipitation simulation across a range of spatial scales.

Dependence of optimized parameters on climate regime
In the previous sections, optimization was performed for a regional model applied to a specific region (i.e. the SGP).However, the physical process and mechanism of convection and precipitation may differ in different climatic regimes (Knupp and Cotton, 1985;Grant, 2001;Kain et al., 2001).For example, Liang et al. (2004) showed that simulations of summer rainfall in the U.S. could be very sensitive to the CPS used because relative influence of large-scale tropospheric forcing and boundary layer forcing in triggering convection may vary in different CPSs.A critical question is how parameters optimized based on application in one regimes transfer to a different climate regime.We completed two additional simulations over the North America Monsoon (NAM) region (23-40 • N, 121-100 • W) using 25-km grid spacing on both simulations with default and optimal parameters, respectively.The NAM represents a distinctly different climate regime compared to the SGP in the central US (Berbery, 2001;Englehart and Douglas, 2006).For example, convection in the semi-arid NAM region is associated with strong surface heating, with a dominant late afternoon precipitation maxima related to the buildup of CAPE during the day.In the central US, on the other hand, precipitation maxima shows a distinct nocturnal maxima associated with increased nighttime moisture brought in by the Great Plain Low-Level Jet. Figure 13 shows the spatial distributions of observed and simulated precipitation with default and optimal parameters over the NAM region for July 1991.The model with default CPS parameters overestimates the maximum precipitation over coastal areas in northern Mexico.Precipitation over eastern New Mexico and the southern Colorado-Kansas border is also largely overestimated.As optimal parameters are applied, the precipitation over those regions is obviously improved, with E decreasing from 110 to 65 and C increasing from 0.26 to 0.31.
Similar to Fig. 6, Fig. 14 shows the observed and simulated frequencies of daily precipitation as a function of rain rate over the NAM region for July 1991.Compared with the observation, the WRF with default CPS parameters in the standard KF evidently overestimates the frequency of precipitation across all rain rates.By applying the optimal parameters based on EC over SGP, the model markedly reduces the overestimated occurrence frequency for all rainy events except for light rain smaller than 3 mm day −1 over the NAM region.The improvement is particularly evident for the moderate and heavy precipitation rain rates of more than 12 mm day −1 .These results suggest the optimal parameters determined based on one regime are transferable and lead to obvious improvements in model performance in a different regime.

Summary and discussion
Currently, Uncertainty Quantification (UQ) and parameter tuning in climate study are mostly applied in Global Climate Models (GCM).This may compromise the tuning by equal weighting of the state fidelity globally, even though the processes being tuned may only be relevant for particular regimes.The tuning process of parameters is often performed subjectively, although some studies have also applied an optimization procedure to minimize the difference between model fields and observations.While the latter approach may provide more plausible values for a set of tunable parameters to approximate the observed global climate or large-scale features, it is possible that the latter may be achieved by forcing the system to an unrealistic physical state or improper balance of budgets through compensating errors over different regions in the globe.In this study, regional climate model, the Weather Research and Forecasting (WRF) model, was used to provide a more flexible framework to investigate a number of issues related UQ and parameter tuning.The WRF model was constrained by reanalysis data over the Southern Great Plains (SGP), where abundant observational data from various sources were available for calibration of input parameters and validation of model results.Focusing on five key input parameters in the new Kain-Fritsch (KF) convective parameterization scheme (CPS) used in the WRF model as an example, our goal was to explore the utility of high-resolution observations for improving simulations of regional patterns and evaluate the transferability of UQ and parameter tuning across physical processes, spatial scales, and climatic regimes, which have important implications to UQ and parameter tuning in global models.The five parameters identified in the KF scheme are related to downdraft flux rate and starting height, environment flux rate, turbulent kinetic energy (TKE) in the sub-cloud layer, and the consumption time of Convective Available Potential Energy (CAPE), respectively.A stochastic sampling algorithm, Multiple Very Fast Simulated Annealing (MVFSA), was employed to efficiently sample the input parameters in the KF scheme based on a skill score so that the algorithm progressively moves toward regions of the parameter space that minimize model errors.
The WRF simulation period was from 1 May to 30 June 2007, and was reinitialized every three days, with 25-km grid spacing over the SGP.The results show the model bias for precipitation can be significantly reduced by using five optimal parameters identified by the MVFSA algorithm, especially for heavy precipitation with rain rates over 20 mm day −1 .The model response to precipitation and other model variables was more sensitive to the changes of downdraft-and entrainment-related parameters and consumption time of CAPE than to the other two parameters.Utilizing high-resolution observations, the simulated spatial pattern of precipitation was improved when the magnitude of model biases was reduced through the MVFSA process.The simulated convective precipitation decreases as the ratio of downdraft to updraft flux increases.Larger CAPE consumption time results in less convective but more stratiform precipitation.
The simulation using optimal parameters obtained by constraining precipitation alone generated positive impacts on other output variables, such as temperature and wind.By using the optimal parameters obtained at 25-km simulation, both the magnitude and spatial pattern of precipitation are also improved at 12-km spatial resolution.When moving the model domain to the North American Monsoon region, the optimal parameters identified from the SGP region also improved the simulation of precipitation, especially those with moderate and heavy precipitation with rain rates of more than 12 mm day −1 .These results suggest that benefits of optimal parameters determined through vigorous mathematical procedures such as the MVFSA process are transferable across processes, spatial scales, and climatic regimes to some extent.While our findings are preliminary, they motivate future studies to further assess the strategies for UQ and parameter optimization at both global and regional scales.
A number of limitations should be taken into account in evaluating the results of this study and in planning future studies.The primary limitation is that we assessed the model performance and tunable parameters based on differences in observed and modeled daily precipitation.Although most of the total rainfall was contributed by convective precipitation generated from the CPS in our case, the tuning process may still produce parameter settings that approximate the total observed rainfall, although the balance of different physical processes to achieve the total precipitation amount is not directly constrained.It is possible that the optimal parameters may only work well with the particular cloud microphysical scheme selected for this study.Furthermore, it may be more appropriate and beneficial to calibrate model parameters by constraining the behavior of physical processes (i.e. the turbulence, shallow and deep convection process in this study) rather than precipitation, which is a product of many interacting processes with large numbers of sources and sinks.
Second, the two regions (SGP and NAM) selected in this study are both convection-dominated climate regimes and precipitation are overestimated using the default model parameters in both regions.It is not clear whether optimization performed for one region is also transferable to another region if model biases with the default parameters are of opposite sign in the two regions.The issue of transferability of the benefits of optimization across different climate regimes and different spatial resolutions is being investigated further along with optimization of other physical parameterization schemes, which will be reported in a follow on paper.Third, how to define the skill metrics for evaluating model performance can be improved.In future studies, we would construct an auto-tuning procedure to minimize the bias in not only precipitation but also process-level variables, such as eddy diffusivities, PBL height, shallow convective mass fluxes, radiative heating rates, and so forth.In addition, future studies should also explore the use of spatial correlation coefficient, in addition to mean bias, in the skill score metrics for the optimization process, as this study already showed that spatial correlation provides useful information for model evaluation.In addition, uncertainties in the observations are not considered in this study, which may impact the shape of the posterior PDF of the input parameters and the model outputs including extreme events (Jackson et al., 2003).Fourth, different optimization approaches may affect the results and conclusions, but this issue has not been investigated in this study.We are currently comparing the MVFSA method and another sampling algorithm, the Annealing Evolutionary Stochastic Approximation Monte Carlo (AESAMC) (Liang, 2010), to investigate the convergence efficiency and the impact on the results.
Finally, the simulations conducted in this study were initialized every three days by reanalysis data.This weather forecast mode of simulation minimizes potential discrepancy between observed and simulated large scale circulation so model biases can be more directly related to the convective parameterization and its parameters.In future studies, we will compare model response and performance based on optimization process in free running simulations (i.e.climate simulation mode) strictly constrained (driven) by large-scale observations/reanalysis. Establishing the transferability of optimized parameters between weather and climate simulations would provide indirect evidence further supporting the seamless prediction strategy (Hurrell et al., 2009) and the transpose method of evaluating and diagnosing climate model biases through hindcast weather forecast simulations (e.g.Boyle et al., 2005).

Fig. 1 .
Fig. 1.The best values obtained using MVFSA method as a function of the number of model evaluations.

Fig. 3 .
Fig. 3. Spatial distributions of observed and simulated (25 km) monthly mean precipitation over SGP for June 2007, with different radiation (RRTMG vs. CAM) and microphysics schemes (WSM6 vs. Morrison).Solid box highlighted in top panel shows the sub-region for later analysis.

Fig. 4 .
Fig. 4. (Top) The response of model performance (quantified as E as introduced in Sect.2.2) to five input parameters based on the 148 simulations (25 km) over SGP through the MVFSA procedure.Red curves represent an average of results at each bin.Default number of each parameter is marked as red crosses.(Middle) The frequency distributions of "good" experiments as a function of each parameter."Good"experiments are defined as those with lower E (better performance) than that using the standard KF scheme with default parameters.(Bottom) The marginal probability density functions (PDF) for the five input parameters derived by kernel density estimation.

Fig. 5 .
Fig. 5. Spatial distributions of observed and simulated (25 km) monthly mean precipitations over SGP for June 2007, with default and optimized (based on E or EC) parameters in the KF scheme.

Fig. 6 .
Fig. 6.The observed and simulated (25 km) frequency distributions of daily precipitation over SGP as a function of rain rates, with default and optimized (based on EC) parameters in the KF scheme.The result is derived from daily precipitation at all grids within the model domain as shown in Fig. 2 for June 2007.

Fig. 7 .
Fig. 7.The response of simulated explicit (top), convective (middle) and total (bottom) precipitation averaged over the sub-domain shown in Fig. 3 to the five parameters in the KF scheme.The meaning of red curves is same as in Fig. 4.

Fig. 8 .
Fig. 8.The response of 14 model output variables (see Table3) to the downdraft mass flux related parameter P d based on the 148 simulations (25 km) over SGP.

B.Fig. 9 .
Fig. 9. Same as Fig. 8 except for the entrainment rate related parameter P e .

Fig. 10 .
Fig. 10.Relative sensitivities of the response of the 16 meteorological variables (see Table3) to the five CPS parameters (see Table1) based on the 148 simulations (25 km) over SGP.The sensitivity ranking is calculated based on the correlation coefficients between output variables (y-axis) and input CPS parameters (x-axis) from 148 simulations.

Fig. 11 .
Fig.11.The spatial distributions of the differences of model biases for temperatures and wind speed between the simulations (25-km) with the default and optimized parameters.Here, the value on each grid point is calculated as (|Optimal-Observation| − |Default-Observation|), so negative value represents a positive impact by using optimized parameters.

Fig. 12 .
Fig. 12.The spatial distributions of observed and WRF simulated (with 12-km spatial resolution) monthly mean precipitations over SGP for June 2007, with default and optimal parameters based on 25-km simulation.

Fig. 13 .
Fig. 13.The spatial distributions of observed and simulated (25-km) monthly mean precipitation with default and optimal parameters obtained at the SGP, respectively, over the North America Monsoon (NAM) region for July of 1991.

Fig. 14 .
Fig. 14.The observed and simulated (25-km) frequency distributions of daily precipitation over NAM for July 1991 as a function of rain rates, with default and optimized (based on EC in SGP) parameters in the KF scheme.

Table 1 .
The short name, default, minimum and maximum values, and the descriptions of the five parameters in the KF convective parameterization scheme in WRF 3.2.1.

Table 2 .
The values of five identified parameters in the KF scheme, skill scores E and C, used or obtained in the simulations with default or optimized (based on E or EC, respectively) parameters.

Table 4 .
Correlations of model performance between the precipitation and the mean/maximum/minimum 2-m temperature and 10-m wind speed.The correlation coefficients are calculated on the basis of skill scores for the precipitation (based on E and C, respectively) and for the temperature and wind speed (based on E) of the 148 simulations (25-km) over SGP.