Despite its important role in human health and in numerous
biological processes, the diffuse component of the erythemal
ultraviolet irradiance (UVER) is scarcely measured at standard
radiometric stations and therefore needs to be estimated. This
study proposes and compares 10 empirical models to estimate the
UVER diffuse fraction. These models are inspired by mathematical
expressions originally used to estimate the total diffuse fraction, but,
in this study, they are applied to the UVER case and tested against
experimental measurements. In addition to adapting to the UVER range
the various independent variables involved in these models, the
total ozone column has been added in order to account for its strong
impact on the attenuation of ultraviolet radiation. The proposed
models are fitted to experimental measurements and validated against
an independent subset. The best-performing model (RAU3) is based on
a model proposed by Ruiz-Arias et al. (2010) and shows values of

Low doses of ultraviolet radiation are beneficial for human health, particularly for the synthesis of vitamin D3, critical in maintaining blood calcium levels (Webb et al., 1988; Glerup et al., 2000; Holick, 2004). However, excessive exposure has adverse consequences such as favoring the development of skin cancer, immune suppression, and eye disorders (Diffey, 2004; Heisler, 2010). The effectiveness of ultraviolet radiation in producing erythema on human skin is usually quantified by the erythemal action spectrum (McKinlay and Diffey, 1987). The ultraviolet radiation weighted by this action spectrum is named erythemal ultraviolet radiation (UVER). Additionally, ultraviolet radiation may have a negative impact on ecosystems such as corals and phytoplankton communities and affect plant growth (Lesser and Farrell, 2004; Zepp et al., 2008; Häder et al., 2011, 2015). It is also the main factor for degradation of paints and plastics exposed to outdoor conditions (Johnson and McIntyre, 1996; Verbeek et al., 2011).

Recent studies have shown that, in addition to stratospheric ozone
variability, changes in ultraviolet radiation in the last two decades
have been influenced by variations in aerosols, clouds, and surface
reflectivity (Arola et al., 2003; Herman, 2010). Significant positive
trends in ultraviolet radiation have been detected in different
European countries and attributed to a decrease in cloud cover
(Krzyscin et al., 2011; den Outer et al., 2005; Smedley et al., 2012;
Zerefos et al., 2012). A significant positive trend of
2.1

In the context of climate change, new variations in ultraviolet irradiance at the Earth's surface are expected over the next decades as a result of the predicted changes in clouds and aerosols (McKenzie et al., 2007; Bais et al., 2011; Williamson et al., 2014). These variations in clouds and aerosols may affect not only the amount of ultraviolet irradiance but also its diffuse–direct partitioning, because scattering is more effective at shorter wavelengths.

In contrast to the direct component, diffuse ultraviolet irradiance is difficult to block (Utrillas et al., 2010; Kudish et al., 2011). For instance, diffuse UVER irradiance under a standard beach umbrella can reach 34 % of global UVER irradiance (Utrillas et al., 2007) and up to 60 % in tree shade (Parisi, 2000). This percentage increases notably with high aerosol loads and in the presence of clouds, especially broken clouds (Alados et al., 2000; Calbó et al., 2005; Esteve et al., 2010). However, very few studies focus on ultraviolet diffuse irradiance, mainly due to the scarcity of experimental measurements. While global ultraviolet irradiance is commonly registered worldwide, its diffuse component is seldom measured. Therefore, modeling is a good alternative to partly compensate for this scarcity.

There are two main approaches to estimate solar radiation: using physically based or empirical models. In general, the diffuse component of the radiation field is the quantity most difficult to estimate, due to the high complexity of the processes involved. Thus, physically based models, such as libRadtran (Mayer and Kylling, 2005), SBDART (Ricchiazzi et al., 1998) and TUV (Madronich and Flocke, 1997), require a very detailed and accurate description of the composition of the atmosphere, aerosols and clouds to reliably estimate the diffuse radiation. However, this detailed information is often unavailable, and therefore an empirical approach is needed. Hence, in this paper the empirical approach was preferred because of its simplicity and modest requirements in terms of ancillary data. The empirical approach has been widely used by the scientific community to estimate the diffuse component in the total solar spectrum (Orgill and Hollands, 1977; Iqbal, 1983; Reindl et al., 1990; Gonzalez and Calbo, 1999; De Miguel et al., 2001; Boland et al., 2008; Ridley et al., 2010; Ruiz-Arias, 2010; Engerer, 2015).

In the particular case of the UV range, the complexity in modeling the diffuse component increases due to the higher effectiveness of Rayleigh scattering. As far as we are aware, only a few studies have applied empirical models to estimate the diffuse solar irradiance in the ultraviolet range (Grant and Gao, 2003; Nuñez et al., 2012; Silva, 2015). Moreover, the applicability of these studies is limited, since they rely on spectral measurements (Silva, 2015) or require information which is usually unavailable, such as cloud fraction and aerosol properties (Grant and Gao, 2003; Nuñez et al., 2012). In this context, comprehensive studies focused on the proposal of reliable models based on commonly available data are needed.

In order to contribute to addressing this need, this study aims to propose empirical expressions for modeling hourly UVER diffuse fraction under different sky conditions and to compare their performance against experimental measurements. The proposed expressions will be inspired by the empirical formulae commonly used to estimate the diffuse fraction for total solar irradiance. Several radiometric and geometrical variables will be evaluated in order to assess their contribution to the UVER diffuse fraction. Additionally, the total ozone column will be included in the models proposed in this study, due to its essential role in the attenuation of ultraviolet radiation. Finally, the performance of the proposed expressions will be validated against experimental measurements.

Data presented here were collected at the radiometric station
installed on the roof of the Physics building at the University Campus
in Badajoz, Spain. This station is operated by the AIRE research
group of the Physics Department of the University of Extremadura. This
experimental site is located in southwestern Spain (38.9

The period analyzed in this study comprises years 2011 and 2012, which ensures that a large variety of seasonal processes and meteorological conditions are sampled. The large variety of sun-geometry and meteorological situations that occur during a year guarantees the representativeness of the dataset for the proposal and assessment of empirical models for our location. However, it must be mentioned that snow and altitude are additional factors that have not been considered in this study. They are not represented in the dataset and the proposed models have not been tested for the processes they involve. These factors can significantly affect total UVER and the direct/diffuse ratio and should therefore be included for high-altitude and snow-covered locations.

The UVER irradiance data used in this study were recorded by two Kipp &
Zonen UVS-E-T radiometers with serial numbers #000409 and #080017. The
UVS-E-T radiometer measures erythemal ultraviolet irradiance between 280 and
400

The dataset consists of simultaneous measurements of horizontal global and diffuse UVER irradiance. While the UVS-E-T radiometer #000409 was installed on a table to measure global UVER irradiance, the UVS-E-T radiometer #080017 was installed on a Kipp & Zonen Solys 2 sun tracker to measure diffuse UVER irradiance. This device prevents direct solar irradiance from reaching the sensor by means of a small ball which continuously projects its shadow on the sensor. Since the portion of the sky obstructed by the shadow ball is negligible, no correction is required for these measurements (Ineichen et al., 1984).

Global and diffuse UVER measurements were recorded every minute by a Campbell Scientific CR-1000 data logger. Based on these data and the time of each measurement, a 1 min dataset consisting of the UVER diffuse fraction, UVER transmissivity, relative optical mass and cosine of the solar zenith angle was built. Subsequently, these quantities were averaged hourly. Hourly data are used in this study, as in the majority of previous studies (Reindl et al., 1990; González and Calbó, 1999; Boland et al., 2001, 2008; Ridley et al., 2010; Ruiz-Arias et al., 2010). According to Ruiz-Arias (2010), the hourly interval has much lower random errors than shorter intervals and offers an appropriate compromise between data availability and the inherent temporal variability of solar radiation. This temporal frequency is also the one used by many applications, such as house energy rating scheme software (Boland et al., 2001). As a consequence, most statistical models are based on hourly solar radiation data (Ruiz-Arias, 2010; Gueymard and Ruiz-Arias, 2016).
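The hourly averaging step described above can be sketched as follows. This is only an illustration under stated assumptions: the diffuse fraction is taken as the ratio of diffuse to global irradiance (the usual definition), and the function name, the record layout, and the sample values are hypothetical rather than taken from the study.

```python
from collections import defaultdict
from datetime import datetime


def hourly_mean_diffuse_fraction(records):
    """Average 1 min diffuse fractions into hourly values.

    records: iterable of (timestamp, global_uver, diffuse_uver) tuples
    (hypothetical layout). Samples with non-positive global irradiance
    are skipped because the ratio is undefined for them.
    """
    buckets = defaultdict(list)
    for timestamp, global_uver, diffuse_uver in records:
        if global_uver <= 0:
            continue
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(diffuse_uver / global_uver)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# Made-up 1 min samples (irradiance in arbitrary but consistent units).
samples = [
    (datetime(2011, 6, 1, 12, 0), 0.20, 0.10),
    (datetime(2011, 6, 1, 12, 1), 0.20, 0.12),
    (datetime(2011, 6, 1, 13, 0), 0.25, 0.10),
    (datetime(2011, 6, 1, 13, 1), 0.00, 0.00),  # skipped: no global signal
]
hourly = hourly_mean_diffuse_fraction(samples)
```

The sketch computes the ratio minute by minute and then averages, which follows the order of operations stated in the text; averaging the irradiances first and then taking the ratio would generally give a different value.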

Additionally, daily total ozone column (TOC) values as provided by the
NASA Ozone Monitoring Instrument (OMI) through their website
(

The diffuse component of the solar radiation is usually quantified by
the diffuse fraction (

Although there are very few models for estimating the ultraviolet diffuse fraction (Grant and Gao, 2003; Nuñez et al., 2012; Silva, 2015), several expressions proposed for modeling the diffuse fraction integrated along the complete solar wavelength interval (termed as total diffuse fraction) can be found in the literature (see, for example, compilations reported by Engerer, 2015, and Gueymard and Ruiz-Arias, 2016). These models attempt to describe the absorption and scattering of solar radiation when crossing the atmosphere. Since the mechanisms of absorption and scattering of ultraviolet solar radiation are qualitatively similar to those affecting other solar wavelengths, the models described in this study will be largely based on published models describing the total diffuse fraction. Towards this goal, a complete compilation of models for estimating total diffuse fraction was performed, the mathematical function and the variables involved were analyzed, and the most suitable models were adapted to the ultraviolet region.

Regarding the independent variables to use, it must be noted that most
empirical models for total diffuse fraction are primarily based on the
total transmissivity (

Figure 1 shows the relationship between UVER diffuse fraction
(

UVER diffuse fraction (

Additionally, in the particular case of the ultraviolet wavelengths,
the stratospheric ozone plays a very important role for modulating the
radiation that arrives at the Earth's surface. Therefore, in
principle, the ozone amount must be included in the models. In order
to test its impact on the UVER diffuse fraction (

The approaches analyzed in this study correspond to models originally
proposed for the total diffuse fraction

As mentioned above, total ozone column (TOC) is an essential attenuation factor for UVER radiation, and therefore it has been added to the models originally proposed for the total diffuse fraction. This new variable has been included by adding a TOC term to each model's mathematical formula. It is worth mentioning that a multiplicative approach, consisting of the product of the model's original formula and a power function of TOC, has also been analyzed (not shown). However, the results were essentially the same as those achieved by simply adding a term, and therefore the additive approach was preferred because of its greater simplicity and parsimony.
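As a schematic illustration only, the two ways of introducing TOC can be written as follows. Here $k$ denotes the UVER diffuse fraction, $f$ stands for any of the original total-diffuse-fraction expressions and $\mathbf{x}$ for its independent variables; $g$, $a$ and $b$ are placeholder symbols introduced for this sketch and are not the notation of the study.

```latex
\begin{align}
  \text{additive:}       \quad & k = f(\mathbf{x}) + g(\mathrm{TOC}), \\
  \text{multiplicative:} \quad & k = f(\mathbf{x}) \cdot a\,\mathrm{TOC}^{\,b}.
\end{align}
```

When $f$ is linear in its coefficients and $g$ is linear in TOC, the additive form remains linear in all coefficients, which is consistent with the preference for simplicity stated above.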

The majority of empirical models for estimating the total diffuse
fraction represent

In contrast, in the ultraviolet range, no piecewise behavior is
detected in the relationship between

The diffuse fraction shows further variability due to short-term
changes in clouds or atmospheric turbidity. Gonzalez and Calbo (1999)
proposed three variables (

Similarly to the proposal of Gonzalez and Calbo (1999) for total
diffuse fraction, variables

Boland et al. (2001) proposed a logistic function to estimate the
total diffuse fraction as a function of the total transmissivity. The
logistic functions are S-shaped sigmoid curves where the increase is
approximately exponential at the initial stage and, then, the growth
slows as saturation begins. This behavior, but with decay, can be
useful to describe the dependence of total diffuse fraction (

The original expression proposed by Boland et al. (2001) was later
expanded by Ridley et al. (2010) to include four additional variables:
(1) the solar zenith angle; (2) the apparent solar time

Similarly, the UVER daily clearness index (

It has to be noted that, in this study, the variable

Kuo et al. (2014) developed several correlation models aimed to
estimate the hourly solar diffuse fraction in Taiwan. They compared
four newly proposed models with 14 models previously available
in the literature. As a result of the comparison, they proposed a new
model consisting of a multiple linear combination of the same
independent variables included in Ridley et al.'s model. In this
study, following Kuo et al.'s suggestion, a model named KUU was built
for the UVER case, as follows:

Similarly to Ridley et al. (2010), Ruiz-Arias et al. (2010) proposed
a model for the total diffuse fraction (

This study aims to fit the models to experimental data and subsequently compare their performance using an independent dataset. Towards that aim, the hourly dataset was randomly divided into two subsets: (1) the fitting subset, containing 75 % of the data (3979 cases), for fitting the coefficients of the models, and (2) the validation subset, containing the remaining 25 % of the data (1262 cases), for model validation and comparison. In principle, linear fitting is preferred since it requires no starting values for the fitting coefficients. Therefore, linear least squares fitting was applied whenever possible, that is, to models which are linear (REU, GCU1, GCU2, GCU3 and KUU) or linearizable, i.e. those that can be reduced to a linear form with a change of variables (BOU and RIU). For the remaining cases (RAU1, RAU2, and RAU3) it was necessary to apply nonlinear fitting.
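A minimal sketch of this procedure is given below: a random split into fitting and validation subsets, a closed-form linear least squares fit, and the fit of a linearizable model through a change of variables. The generic logistic form `y = 1 / (1 + exp(a + b*x))`, the single predictor, the function names, and the synthetic data are all assumptions made for illustration; they are not the actual BOU or RIU expressions or the study's data.

```python
import math
import random


def split_dataset(rows, fit_share=0.75, seed=0):
    """Randomly split rows into a fitting subset and a validation subset."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_fit = round(fit_share * len(shuffled))
    return shuffled[:n_fit], shuffled[n_fit:]


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (closed form, no starting values)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_logistic_linearized(x, y):
    """Fit y = 1 / (1 + exp(a + b*x)) via z = ln(1/y - 1), so that z = a + b*x.

    Requires 0 < y < 1 for every sample.
    """
    z = [math.log(1.0 / yi - 1.0) for yi in y]
    return fit_line(x, z)


# Synthetic, noise-free data generated from an assumed logistic curve.
true_a, true_b = -2.0, 5.0
xs = [0.05 * i for i in range(1, 20)]
rows = [(x, 1.0 / (1.0 + math.exp(true_a + true_b * x))) for x in xs]

fit_rows, val_rows = split_dataset(rows)
a_hat, b_hat = fit_logistic_linearized(
    [r[0] for r in fit_rows], [r[1] for r in fit_rows]
)
```

With noise-free data the linearized fit recovers the generating coefficients. With real measurements the change of variables also transforms the error structure, so the linearized coefficients are not necessarily identical to those of a direct nonlinear fit.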

The performance of the models proposed to estimate the UVER diffuse
fraction was compared using both statistical and graphical tools. The
coefficient of determination (

Additionally, Taylor diagrams (Taylor, 2001) and the relative
differences were used for model comparison. The Taylor diagram
provides a concise graphical summary of different aspects of the
performance of a model such as the centered root-mean-square error,
the correlation, and the standard deviation. On the other hand, the
relative residuals between modeled,

Main results of the fitting of each empirical model to the fitting
subset are summarized in Table 1. Ordinary least squares fitting (also
known as linear least squares) was used for models REU, GCU1, GCU2, GCU3, BOU,
RIU and KUU, and nonlinear fitting for models RAU1, RAU2, and RAU3.
As mentioned in Sect. 3, some models involve
parameters accounting for the short-term fluctuation. In particular,
models based on Gonzalez and Calbo (1999), that is, GCU1, GCU2, and
GCU3, include parameters

Table 1. Coefficient of determination and relative root-mean-squared error corresponding to the fitting against experimental measurements and the validation of each model.
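For reference, the two statistics named in this caption could be computed as sketched below. The exact definitions used in the study are not reproduced in this text, so two common conventions are assumed here: the coefficient of determination as one minus the ratio of residual to total sum of squares, and the relative RMSE as the RMSE divided by the mean of the measurements, in percent. Function names and sample values are hypothetical.

```python
import math


def r_squared(measured, modeled):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_obs = sum(measured) / len(measured)
    ss_res = sum((obs - mod) ** 2 for obs, mod in zip(measured, modeled))
    ss_tot = sum((obs - mean_obs) ** 2 for obs in measured)
    return 1.0 - ss_res / ss_tot


def relative_rmse(measured, modeled):
    """RMSE as a percentage of the mean measured value (assumed convention)."""
    n = len(measured)
    rmse = math.sqrt(sum((mod - obs) ** 2 for obs, mod in zip(measured, modeled)) / n)
    return 100.0 * rmse / (sum(measured) / n)


# Made-up diffuse fractions: a model with a constant offset of 0.1.
measured = [0.5, 0.6, 0.7, 0.8]
modeled = [0.6, 0.7, 0.8, 0.9]
r2 = r_squared(measured, modeled)
rrmse = relative_rmse(measured, modeled)
```

Other conventions exist, for example the squared correlation coefficient for the first statistic, so values computed this way are comparable with the table only if the definitions match.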

Most of the models performed notably well, with

Subsequently, the various models with their fitted coefficients were
applied to the validation subset. The resulting

The Taylor diagram (Fig. 2) confirms the generally good performance achieved by the proposed models, but also identifies two separate groups: on the one hand, models BOU and RIU and, on the other hand, models REU, GCU2, KUU, and RAU3, the last of which performs moderately better. It is worth noting that the worst-performing models, BOU and RIU, are based on the same logistic function proposed by Boland et al. (2008). It can therefore be concluded that this functional form is not as appropriate for the UVER case as those used by the remaining models. Moreover, the performance does not improve even when more variables are included, as in model RIU.

Figure 2. Taylor diagram showing the performance of the models proposed to estimate the diffuse fraction, as compared to experimental measurements. This diagram summarizes different aspects of the performance of a model such as the centered root-mean-square error (green lines), the correlation, and the standard deviation with respect to the reference data set (black dot).
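The three quantities summarized in a Taylor diagram can be computed as in the sketch below, which also checks the geometric relation that allows them to share one plot (Taylor, 2001): the squared centered RMS error equals the sum of the squared standard deviations minus twice their product times the correlation. The function name and the sample values are hypothetical.

```python
import math


def taylor_statistics(reference, model):
    """Return (std_reference, std_model, correlation, centered_rmse)."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_mod = sum(model) / n
    dev_ref = [r - mean_ref for r in reference]
    dev_mod = [m - mean_mod for m in model]
    std_ref = math.sqrt(sum(d * d for d in dev_ref) / n)
    std_mod = math.sqrt(sum(d * d for d in dev_mod) / n)
    corr = sum(a * b for a, b in zip(dev_ref, dev_mod)) / (n * std_ref * std_mod)
    # Centered RMS error: RMS difference after removing each series' mean.
    crmse = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_ref, dev_mod)) / n)
    return std_ref, std_mod, corr, crmse


# Made-up measured and modeled diffuse fractions.
reference = [0.60, 0.70, 0.80, 0.90]
model = [0.62, 0.69, 0.83, 0.88]
std_ref, std_mod, corr, crmse = taylor_statistics(reference, model)

# Law-of-cosines relation underlying the diagram.
identity_rhs = std_mod ** 2 + std_ref ** 2 - 2.0 * std_mod * std_ref * corr
```

Because the centered error removes the means, a Taylor diagram carries no information about overall bias; that is one reason to examine the residuals separately.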

Mean relative residuals of each UVER diffuse fraction model
vs.

Table 2. Functional form of the models and empirical fitting coefficients with their corresponding standard error for Badajoz, Spain.

Models REU and KUU completely overlap, indicating that no improvement
is achieved when variables AST,

In addition to the regression statistics mentioned above, the relative residuals between measured and modeled values were calculated, and their variation with respect to solar zenith angle, UVER transmissivity, and UVER diffuse fraction bands was analyzed. In order to clearly show the relationship with a particular variable, the relative residuals were averaged by intervals in that variable.
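The interval averaging of relative residuals can be sketched as follows. The sign convention (modeled minus measured, relative to measured, in percent), the half-open intervals, the function name, and the sample values are assumptions made for illustration.

```python
def binned_mean_relative_residuals(variable, measured, modeled, edges):
    """Mean relative residual (%) in each interval [edges[i], edges[i+1]).

    Returns one value per interval, or None for intervals with no data.
    """
    n_bins = len(edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for value, obs, mod in zip(variable, measured, modeled):
        for i in range(n_bins):
            if edges[i] <= value < edges[i + 1]:
                sums[i] += 100.0 * (mod - obs) / obs
                counts[i] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


# Made-up example: residuals grouped by solar zenith angle (degrees).
sza = [10, 15, 35, 65]
measured = [0.5, 0.5, 0.8, 1.0]
modeled = [0.55, 0.45, 0.84, 0.90]
edges = [0, 20, 40, 60, 80]
mean_residuals = binned_mean_relative_residuals(sza, measured, modeled, edges)
```

In this example the first interval averages to roughly zero because residuals of +10 % and -10 % cancel, which shows that a small mean residual in an interval does not by itself imply small individual errors.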

Figure 3 confirms the worse performance achieved by models BOU and
RIU. The relative residuals for these two models are the largest among
the models proposed in this study. These large residuals occur for low
solar zenith angle, high

In contrast, models REU, GCU2, KUU, and RAU3 show much smoother
patterns, with absolute relative residuals smaller than 5 % for
almost the entire range of

Table 2 shows the fitting coefficients for each proposed model. It is important to note that the particular values of the coefficients are specific for our local conditions. Therefore, in order to apply the models to other locations, the coefficients should be calculated by fitting to local measurements.

This study aims to accurately estimate hourly UVER diffuse fraction at the Earth's surface using empirical models. Towards this goal, 10 mathematical expressions are proposed and their performance is compared to experimental measurements. All the empirical models analyzed are based on mathematical expressions originally suggested by Reindl et al. (1990), Gonzalez and Calbo (1999), Boland et al. (2008), Ridley et al. (2010), Kuo et al. (2014), and Ruiz-Arias et al. (2010) for modeling the total diffuse fraction but, in this study, they are applied to the UVER case. Among a complete compilation of formulae used for estimating total diffuse fraction, those models that rely on variables commonly available at standard radiometric stations are selected. This criterion is applied in order to favor the general applicability of the results of the study. Additionally, a term including the total ozone column is added to account for the important role played by stratospheric ozone in modulating the ultraviolet radiation that arrives at the Earth's surface. As a result, the models REU, GCU1, GCU2, GCU3, BOU, RIU, KUU, RAU1, RAU2, and RAU3 are built, fitted against experimental data, and finally validated.

The fitting to experimental measurements revealed a generally good performance of all models except for models BOU and RIU, which perform somewhat worse. The proposed mathematical expressions and variables therefore succeed in describing the variation in the UVER diffuse fraction. Results indicate that multiple linear combinations and the sigmoid function suggested by Ruiz-Arias et al. (2010) are more suitable for the UVER case than the logistic function proposed by Boland et al. (2008). In the case of total integrated radiation, logistic models proved to be useful since they reliably describe the abrupt change shown by the relationship between the total diffuse fraction and the total transmissivity. However, for the UVER measurements that relationship is much smoother, and therefore the logistic models BOU and RIU provide no improvement with respect to the simpler linear models REU, GCU2, and KUU. Conversely, the more complex sigmoid function proposed by Ruiz-Arias et al. (2010) achieves the best fitting statistics.

The fitting results are confirmed by the validation against an
independent subset of measurements. The best-performing model is RAU3
followed by GCU2, REU, and KUU, and finally by RIU and BOU, which
perform notably worse, with

Regarding the residuals, RAU3 is again the best model, with almost all
absolute values smaller than 3 % and no dependency with

This study positively contributes to estimating UVER diffuse irradiance
and UVER diffuse fraction in locations where only UVER global
irradiance measurements are available. Additionally, the models
proposed here can be used to expand time series of UVER diffuse
radiation to periods when global but not diffuse UVER irradiance was
being measured. It should be mentioned that factors affecting the UVER
diffuse fraction such as the altitude or surface albedo have not been
tested in this study. Moreover, since only solar zenith angles below
70

The data analyzed in this study are available from the authors upon request (guadalupesh@unex.es).

The authors declare that they have no conflict of interest.

This study was partially supported by the research projects CGL2014-56255-C2-1-R, funded by the Ministerio de Economía y Competitividad from Spain, and by Ayuda a Grupos GR15137, funded by Junta de Extremadura and Fondo Social Europeo (FEDER). Guadalupe Sanchez Hernandez thanks the Ministerio de Economía y Competitividad for the predoctoral FPI grant BES-2012-054975. Thanks to the referees for their comments and suggestions, which notably improved this paper.

Edited by: Stelios Kazadzis

Reviewed by: two anonymous referees