The representation of winds in the lower troposphere in ECMWF forecasts and reanalyses during the EUREC4A field campaign

The characterization of systematic forecast errors in lower-tropospheric winds over the ocean is a primary need for reforming models. Winds are among the drivers of convection; thus an accurate representation of winds is essential for better convective parameterizations. We focus on the temporal variability and vertical distribution of lower-tropospheric wind biases in operational medium-range weather forecasts and ERA5 reanalyses produced with the Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF). Thanks to several sensitivity experiments and an unprecedented wealth of measurements from the 2020 EUREC4A field campaign, we show that the wind bias varies greatly from day to day, resulting in RMSEs up to 2.5 m s−1, with a mean wind speed bias up to -1 m s−1 near and above the trade-inversion in the forecasts and up to -0.5 m s−1 in reanalyses. The modeled zonal and meridional winds exhibit a too strong diurnal cycle, leading to a weak wind speed bias everywhere up to 5 km during daytime, turning into a too strong wind speed bias below 2 km at nighttime. The biases are fairly insensitive to the assimilation of sondes and likely related to remote convection and large scale pressure gradients. Convective momentum transport acts to distribute biases throughout the lowest 1.5 km, whereas at higher levels, other unresolved or dynamical tendencies play a role in setting the bias. Below 1 km, modelled friction due to unresolved physical processes appears too strong, but is (partially) compensated by

Some aspects of the systematic error in surface winds from weather models have been described in the literature, for instance the insufficient mesoscale variability in the extratropics (Gille, 2005), the lack of small-scale features relevant for sea surface temperature (SST) gradient effects (Chelton et al., 2004; Risien and Chelton, 2008) and the generally excessive zonal winds (Chaudhuri et al., 2013; Belmonte Rivas and Stoffelen, 2019; Sandu et al., 2020).
In this study we focus on the representation of the vertical profile of winds during EUREC4A in operational forecasts and the ERA5 reanalyses produced with the ECMWF IFS. Our objectives are to:

Observations
The measurements used in this study were taken during the EUREC4A campaign. Within EUREC4A, a region of intensive measurements was defined in support of studies of cloud-circulation interactions, cloud physics, and factors influencing the mesoscale patterning of clouds. This region largely corresponds with the domain of our study (Figure 1) and it is situated near the western end of the 'Tradewind Alley', an extended corridor across the Atlantic (see Figure 1 in ) with its downwind terminus defined by the Barbados Cloud Observatory (BCO). The instruments used in this study are dropsondes, radiosondes and a ship-borne wind lidar system.

JOANNE
We use EUREC4A dropsonde measurements from the Joint dropsonde Observations of the Atmosphere in tropical North atlaNtic meso-scale Environments (JOANNE) dataset (George et al., 2021). The primary strategy of the EUREC4A dropsonde launches was to sample atmospheric profiles along a 222 km diameter circle centred at 13.3°N, 57.7°W. Following Stevens

The radial velocities are corrected for ship motions using a simplified correction methodology based on the internal GPS system of an accompanying short-range WLS7 WindCube, which uses a combination of an Xsens MTi-G attitude and heading reference sensor (AHRS) and a Trimble SPS361 satellite compass. The simple motion correction applied to the line-of-sight (LOS) velocities takes into account the translational ship motions and the yaw information, as explained in Wolken-Möhlmann et al. (2014) and Gottschall et al. (2018). The pitch and roll information is not used since, according to previous studies (Wolken-Möhlmann et al., 2014), the effect of these tilt motions is less relevant for relatively stable platforms. After corrections, the wind vector is retrieved and the data are averaged to hourly values.
The left panel of Figure 1 shows, in green, the location of the RV Meteor carrying the WindCube at 10-minute intervals. In the time span of 1 hour the Meteor research vessel only covers a limited area of the studied domain. For this reason, hourly mean profiles from lidar measurements are not considered to be representative of the entire domain (see section 3). Figure 2 shows that wind profiles from lidar measurements are available continuously from the 25th of January to the 15th of February.

Modelling datasets
We use the observations to validate several modelling datasets produced with the ECMWF IFS, covering the period from the 18th of January to the 15th of February 2020.
These datasets comprise the operational deterministic high-resolution (9 km) forecasts and analysis (or initial conditions from which the forecasts start), the ERA5 reanalysis (Hersbach et al., 2020), and several experiments at coarser resolution. For each of these datasets, model output was extracted at the nearest four neighbours of 61 points placed concentrically around the centre of the EUREC4A-circle. Each group of four points was then used to interpolate the model values to the locations of the 61 arbitrary points using an inverse distance weighting method. The location of these arbitrary points is shown on the right panel of Figure 1 with black crosses. The second most external ring of points coincides with the EUREC4A-circle.
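The interpolation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `idw_interpolate`, the weighting exponent of 2 and the use of planar coordinates are assumptions, since the text only states that an inverse distance weighting of the four nearest grid points is used.

```python
import numpy as np

def idw_interpolate(neighbour_xy, neighbour_values, target_xy, power=2.0):
    """Inverse distance weighting of a few grid-point values to one target point.

    neighbour_xy: (n, 2) coordinates of the neighbouring model grid points.
    neighbour_values: (n,) model values at those points.
    target_xy: (2,) coordinates of the target point.
    power: weighting exponent (assumed; not given in the text).
    """
    neighbour_xy = np.asarray(neighbour_xy, dtype=float)
    neighbour_values = np.asarray(neighbour_values, dtype=float)
    dist = np.linalg.norm(neighbour_xy - np.asarray(target_xy, dtype=float), axis=1)
    # If the target coincides with a grid point, return that value directly
    # to avoid a division by zero.
    if np.any(dist == 0.0):
        return float(neighbour_values[np.argmin(dist)])
    weights = 1.0 / dist**power
    return float(np.sum(weights * neighbour_values) / np.sum(weights))

# Hypothetical example: four grid points on a unit square, target in the middle.
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
value_at_centre = idw_interpolate(corners, [1.0, 2.0, 3.0, 4.0], (0.5, 0.5))
```

With the target equidistant from all four neighbours the weights are equal, so the result reduces to the plain average of the four values.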

Forecast
For the operational ECMWF deterministic ten-day forecasts and analysis, which have a resolution of 9 km, the extracted model grid points for the EUREC4A-circle are marked in orange in Figure 1 (right). For clarity, we avoid showing the rest of the extracted model grid points. We extract hourly output for day two of the forecasts (a lead time of 24 to 48 hours) and 3-hourly output for day four forecasts (a lead time of 72 to 96 hours).
The wind errors in the short-range (day two) forecasts are very similar to the errors in the medium-range (day four) forecasts.
This corroborates the findings of Sandu et al. (2020), who showed that over this trade-wind region the errors in wind profiles develop in the first 12 hours of the forecast and do not grow significantly until day five. Hereafter we therefore focus on the forecasts with a lead time of two days, which for simplicity we refer to as the forecast.

ERA5
ERA5, the newest reanalysis produced by ECMWF for the Copernicus Climate Change Service, is widely used for model evaluation and is often used as a proxy for observations. Similar to the operational analysis, ERA5 is produced with the ECMWF IFS by optimally combining short-range forecasts and observations through data assimilation (as is done to create the analysis, or initial condition of the forecasts). While operational analyses are not consistent in time because of regular upgrades to the forecasting system, reanalyses are produced with a unique version of the forecasting system. This leads to a consistent time series which allows one to monitor environmental changes. For example, ERA5 is produced with the IFS model cycle 41r2, at a resolution of approximately 32 km, and covers the period 1950 to present (Hersbach et al., 2020).
Here we exploit EUREC4A observations to evaluate the quality of the wind profiles also in the ERA5 reanalysis. The extraction points for ERA5 corresponding to the EUREC4A-circle are shown in blue in Figure 1 (right panel). In the sections below we focus on the wind profiles from ERA5, rather than from the operational analysis, because the differences in wind profiles over the EUREC4A region between the operational analysis (which is available 6-hourly) and ERA5 are marginal (not shown).

A question often asked, and which also motivates looking at the performance of ERA5, is whether a reanalysis is close to reality because observations are assimilated. To answer this question, several sensitivity experiments were performed at 40 km resolution and outputs saved every 3 hours for the day two forecasts and every 6 hours for the analyses.
First, a control analysis (CTRL_an) and corresponding ten-day control forecasts (CTRL_fc) initialized from it were performed at this resolution. Second, so-called data denial experiments were performed in which measurements made during EUREC4A are not assimilated when creating the initial conditions of the forecasts. These experiments consist of: (a) an analysis experiment in which the EUREC4A dropsondes are not assimilated, and corresponding ten-day forecasts (Exp1_an, Exp1_fc); (b) an analysis experiment where neither EUREC4A dropsondes nor radiosondes are assimilated, and corresponding ten-day forecasts (Exp2_an, Exp2_fc). Lastly, we performed an analysis experiment, and corresponding ten-day forecasts, where shallow convective momentum transport is switched off (Exp3_an, Exp3_fc). These sensitivity experiments allow us to explore the origin of the IFS wind bias. For all experiments the analysis covers the period 15 January to 15 February 2020, and the ten-day forecasts were initialized daily at 00 UTC from the respective analysis.
For the above-mentioned sensitivity experiments we also extracted outputs at the four nearest model grid points to the 61 arbitrary points shown in Figure 1 and performed an inverse distance weighting to the location of our arbitrary points. We found that the spatial resolution of the different model experiments has little impact on the results of this study, where we compare means obtained from several locations in a 350 km x 350 km domain.
Mean wind profiles are derived using the datasets described above. All profiles are interpolated to a grid of 50 m vertical resolution between 0.15 km and 5 km. The wind vectors are decomposed into zonal (u) and meridional (v) components and analysed at different hours of the day using hourly and 3-hourly composites. While the modelling datasets directly provide vectorial wind components, the observations measure scalar quantities such as wind speed and wind direction. In this study we retrieve the corresponding meridional and zonal components for each radiosonde and dropsonde and for each of the 10-minute lidar winds.
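The two processing steps above (decomposition into u and v, and interpolation to a 50 m grid) can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the wind direction is assumed to follow the meteorological convention (direction the wind blows from, in degrees clockwise from north), and linear interpolation is assumed because the text does not name the interpolation method.

```python
import numpy as np

def wind_components(speed, direction_deg):
    """Convert wind speed and direction to zonal (u) and meridional (v) components.

    Assumes the meteorological convention: direction_deg is where the wind
    comes from, measured clockwise from north.
    """
    direction = np.deg2rad(direction_deg)
    u = -speed * np.sin(direction)
    v = -speed * np.cos(direction)
    return u, v

# Regular grid from 0.15 km to 5 km with 50 m spacing, as stated in the text.
TARGET_HEIGHTS = np.arange(150.0, 5000.0 + 1.0, 50.0)

def to_regular_grid(height_m, values, target=TARGET_HEIGHTS):
    """Linearly interpolate one profile to the regular grid (assumed method).

    height_m must be increasing; levels outside the profile are set to NaN.
    """
    return np.interp(target, height_m, values, left=np.nan, right=np.nan)

# Hypothetical example: a 10 m/s easterly wind (blowing from 90 degrees).
u, v = wind_components(10.0, 90.0)
```

In this convention an easterly trade wind has a negative zonal component, and a northerly wind has a negative meridional component.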
As described in section 2, all dropsondes launched during one flight circle are averaged to a timestamp of the nearest full hour. Radiosondes are averaged over the domain within three-hour intervals. Lidar profiles are averaged within one-hour intervals. While model outputs uniformly sample the entire domain at each time step, observations only sample one location at a time. To partially account for these differences in the data sets and the spatial variability of the trades, when we derive the forecast and (re)analysis errors with respect to the different observational datasets, we sample the model output to match the sampling we have chosen above for the respective data set. For example, when we compare to the radiosondes, we average the model profiles extracted for the 61 points and over 3-hourly intervals. When we compare to the dropsondes, we average the model points extracted along the EUREC4A-circle at the hour during which the circle was flown. In the case of the wind lidar, the observed hourly mean profiles represent a small area of the domain and for this reason we use the closest point from the forecast/(re)analysis dataset to compute the model errors with respect to the lidar. When the model is simultaneously compared to multiple observational datasets (e.g. top row in Figure 4 and left panel in Figure 7), we show the model mean over the selected domain, obtained from all the 61 points and with the temporal resolution available for the model output.
Figure 3 helps quantify the spatial variability of winds in the study area and motivates our choice of the spatial matching of the observations and the model output. It shows that there is a NW to SE gradient in wind, whereby the south-east region of the domain experiences winds about 0.5 m s−1 stronger than the average of the domain. The lidar samples this region more frequently than the north-west area where weaker winds prevail.
Thus, we expect the wind-lidar winds to generally be stronger.
As mentioned above, the spatial variability of winds in the domain is taken into account by calculating mean dropsonde profiles for each flight-circle. Similarly for the radiosondes, we assume that the launch locations over three hours are sufficiently dispersed to provide a good representation of the entire domain.

Temporal Variability of the Wind Bias
January and February 2020 are characterised by low-level north-easterly winds (Figure 4, top row). The (zonal) wind speed typically peaks near cloud base and decreases aloft, establishing a so-called backward sheared wind profile. This structure is typical for the trades, as documented in earlier field studies (Riehl et al., 1951; Brümmer et al., 1974) and more recently using the BCO climatology alongside ERA-Interim (Brueck et al., 2015). A recent study using north-Atlantic wide Large Eddy Simulations with ICON (hindcasts performed for the pre-EUREC4A NARVAL campaign period) suggests that the local maximum in zonal wind near cloud base results from efficient turbulent diffusion in the sub-cloud layer, but little if any cumulus friction at cloud base (Helfer et al., 2021). There is also an apparent layer with counter-gradient momentum transport near cloud tops, which suggests that clouds tend to enhance the vertical wind gradient, rather than reduce it (Larson et al., 2019). Using JOANNE, Nuijens et al. (2021, under preparation) show that winds were relatively weak with strong backward shear during the final two weeks of January 2020, transitioning to a period with stronger winds and weaker shear during the first week of February 2020, and that the campaign ended with several days with strong winds and strong backward shear.

In the next subsections we analyse the ability of the IFS to model the mean winds and their variability during EUREC4A.
First we look at the overall mean profiles during EUREC4A (section 4.1). Next, we analyse the day-to-day synoptic variability (section 4.2). Lastly, we analyse the diurnal cycle of winds on sub-daily time scales (section 4.3).

Monthly Mean
The top row of Figure 4 shows the mean wind profiles from different data sets for the period and domain analysed in this study. The lidar measured stronger winds in the sub-cloud layer, as it probed the atmosphere in a region where winds were generally strong, as discussed in section 3.

We define the model bias and the root mean square error (RMSE) as:

$$\mathrm{bias} = \overline{\Theta_{\mathrm{mod}} - \Theta_{\mathrm{obs}}}, \qquad \mathrm{RMSE} = \sqrt{\overline{\left(\Theta_{\mathrm{mod}} - \Theta_{\mathrm{obs}}\right)^{2}}},$$

where the overbar represents the arithmetic mean and Θ is any modeled (mod) or observed (obs) wind variable.

As one would expect, ERA5 shows a better agreement with the observations than the forecast, both for the bias and the RMSE (middle and bottom row). The first two rows suggest that both ERA5 and the forecast perform well in reproducing the radiosonde and lidar monthly mean profiles, especially in the lower 2 km. Nevertheless, the bottom row of Figure 4 shows that the RMSE between forecast/ERA5 and radiosondes is as large as 1 m s−1 at 250 m and 2.5 m s−1 between 3 km and 4 km, for all components. This suggests that the positive and negative errors are fairly normally distributed and cancel out when calculating the model bias and the mean profiles.
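The difference between a small mean bias and a large RMSE can be illustrated with a short sketch. It assumes the conventional definitions (bias as the mean of model minus observation, RMSE as the square root of the mean squared difference); the function name `bias_and_rmse` and the sample numbers are hypothetical and not taken from the study.

```python
import numpy as np

def bias_and_rmse(model, obs):
    """Mean bias and RMSE per height level.

    model, obs: arrays of shape (n_times, n_levels) on matching times and levels.
    Differences are taken as model minus observation (assumed sign convention),
    so a negative bias means the modelled value is too low.
    """
    diff = np.asarray(model, dtype=float) - np.asarray(obs, dtype=float)
    bias = np.nanmean(diff, axis=0)
    rmse = np.sqrt(np.nanmean(diff**2, axis=0))
    return bias, rmse

# Hypothetical wind speeds (m/s) at one level for two times: the model is
# 2 m/s too weak at the first time and 2 m/s too strong at the second.
model = np.array([[6.0], [10.0]])
obs = np.array([[8.0], [8.0]])
bias, rmse = bias_and_rmse(model, obs)
```

Here the two errors cancel in the mean, giving a bias of zero, while the RMSE of 2 m/s retains the size of the individual errors.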
In line with previous studies (Belmonte Rivas and Stoffelen, 2019), we find that on average ERA5 has slightly too strong winds near the surface, with a too strong easterly component and a mean bias close to zero in the northerly component.

Nevertheless, here the forecast shows mean meridional winds almost 0.5 m s−1 too weak near the surface. A new finding here is that errors near the surface are relatively small compared to errors at higher levels. This suggests that the surface bias may be tied to processes further aloft. Before we explore this in more detail in section 5, we take a closer look at errors at individual times, which, as suggested by the relatively large RMSE, should be pronounced (section 4.2). We will also show that the bias near the surface and higher up critically depends on the time of the day (section 4.3).

Day-to-day Synoptic Variability
Although the trade-winds are generally steadier than midlatitude flows, they still exhibit significant variability on a daily and even hourly basis. Figure 5 shows wind components (zonal, meridional and wind speed) at hourly resolution from ERA5,

We are interested in whether the IFS error is larger on days associated with specific wind conditions. The 3-hourly bias of ERA5 compared to the radiosondes (Figure 6) does not suggest a strong correlation between the reanalysis bias and the wind speed or direction. The sign and magnitude of the 3-hourly biases are relatively similar in the first and second half of the EUREC4A period, with positive and negative values of up to 2 m s−1 in both the zonal and the meridional wind components.

A similar pattern to the one in Figure 6, only with larger values, is obtained for the 3-hourly forecast bias with respect to radiosondes (not shown).
A prolonged and significant overestimation of the zonal jet (around 2 km) can be noticed on Jan-22 and Jan-23. This could be related to the misrepresentation of a large scale phenomenon acting on multi-diurnal time scales.

To analyse the diurnality of the winds we create composites for all the data sources as described in section 3. Hourly composites are obtained from dropsondes and lidar profiles, while 3-hourly composites are obtained from radiosondes. For ERA5 and the forecast we use hourly composites unless they are compared to radiosondes, in which case 3-hourly composites are used.
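A composite of this kind can be sketched as an average of all profiles that fall in the same hour-of-day bin. The sketch below is illustrative only: the function name `diurnal_composite` and the sample values are hypothetical, and whether hours are counted in local time or UTC is left to the caller, since the binning itself does not depend on it.

```python
import numpy as np

def diurnal_composite(hours, profiles, bin_width=1):
    """Average profiles by hour of day.

    hours: (n_times,) hour stamps; values of 24 or more wrap around to the
           same time of day.
    profiles: (n_times, n_levels) wind profiles on a common vertical grid.
    bin_width: 1 for hourly composites, 3 for 3-hourly composites.
    Returns an array of shape (24 // bin_width, n_levels); bins without
    data are NaN.
    """
    hours = np.asarray(hours)
    profiles = np.asarray(profiles, dtype=float)
    n_bins = 24 // bin_width
    bin_index = (hours % 24) // bin_width
    composite = np.full((n_bins,) + profiles.shape[1:], np.nan)
    for b in range(n_bins):
        selected = profiles[bin_index == b]
        if selected.size:
            composite[b] = np.nanmean(selected, axis=0)
    return composite

# Hypothetical single-level example with 3-hourly bins; hour 27 is 03:00 on
# the following day and falls in the same bin as hour 3.
hours = np.array([0, 0, 3, 27])
profiles = np.array([[1.0], [3.0], [5.0], [7.0]])
composite = diurnal_composite(hours, profiles, bin_width=3)
```

Using the same `bin_width` for model and observations keeps the two composites comparable, which mirrors the choice of 3-hourly model composites when comparing to radiosondes.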
The mean diurnal cycle of the layer between 0.15 km and 0.75 km is depicted in Figure 7  we explore further in section 5 when we test the sensitivity to the assimilation of the radiosondes and dropsondes.
Although ERA5 itself is not bias free, we can use it to explore the vertical structure of the diurnality in winds. In the right panel of Figure 7 the wind diurnal cycle from ERA5 is represented between 150 m and 5 km. The contours show that the diurnality is present at all heights of the vertical domain, not only in the sub-cloud layer, where turbulence plays a major role.
In particular, above 2 km, the zonal and total wind speed variations (right panel of Figure 7)

Figure 7 (caption fragment): (solid blue). The right column refers to multiple levels from surface to 5 km with values from ERA5 only.

Ueyama and Deser, 2008) and has been linked to semi-diurnal atmospheric thermal tides generated by the absorption of solar radiation by ozone in the stratosphere and water vapor in the troposphere. These tides travel downward and affect sea level pressure, whose tidal amplitudes appear mostly semi-diurnal.
The diurnal meridional wind variations are not fully understood, but Ueyama and Deser (2008) showed that over the tropical Pacific such variations agree very well with pressure-derived wind diurnality, suggesting that the pressure gradient force plays a dominant role in setting the diurnality, with a smaller role for boundary layer stability. As will be shown later in section 6.1, Barbados experiences only half of the dynamical forcing at daytime compared to nighttime in both the zonal and meridional components (Figure 11, middle compared to bottom panel). The diurnality in pressure gradients may be linked to a diurnality in the Hadley cell, where the diurnal cycle of deep convection in the ITCZ and over the South American continent, which respectively peak in the early morning and in the afternoon, may play a role.

daytime), the meridional wind component contributes most to the weak wind speed bias in the forecasts, not only below 2 km, but also above 2 km, where it is accompanied by a too weak easterly wind (Figures 8 and 9). This suggests that at least one contributor to a weak wind speed bias during daytime, seen here throughout the lower troposphere, may be a bias in the thermal wind of the Hadley cell (vertical wind shear), as influenced by deep convection away from the trades.
Another important aspect to notice is that the nighttime bias is similar for both the forecast and ERA5 while the daytime bias is only visible in the forecast; this can be traced back to what is seen in Figure 7, where both the forecast and reanalysis overestimate the amplitude of the diurnal cycle but only ERA5 captures the phase of the cycle.
From Figure 9 we can also infer that the dropsondes and radiosondes agree fairly well, apart from the zonal wind in the cloud layer. Here, at about 1.5 km, the radiosondes show zonal winds ∼1 m s−1 stronger than the dropsondes. These differences may be due to differences between the descending and ascending radiosondes. Descending radiosondes tend to show stronger winds above 1.5 km. Excluding the 186 descending radiosondes produces a better agreement with the dropsondes above 2 km (not shown). However, around 1 km the descending radiosondes match the dropsondes considerably better than the ascending radiosondes. We also notice that the number of operating dropsondes reduces at lower altitudes. The differences between the radiosondes and dropsondes and the lidar winds are, as discussed in section 3, largely due to the spatial variability of winds.
Lighter colours in the middle row of Figure 8 compared to the top row show that ERA5 performs much better than the forecast at all hours of the day. Nevertheless, the pattern in the rightmost panel (wind speed) suggests that the reanalysis only reduces the magnitude of the bias, without eliminating the fundamental causes of an overestimated diurnal wind cycle below 1 km. In the next section we explore to what extent the assimilation of radiosondes and dropsondes explains the better performance of ERA5.

Evidently, all analysis and forecast experiments remain close to the corresponding control experiment (blue lines): the differences are small everywhere, and almost non-existent below 2 km. The sign, shape and magnitude of the profiles in Figure 10 confirm the results described in previous sections (e.g. see Figure 4) and support the idea that the mean wind bias does not increase with coarser model runs (40 km spatial resolution and 3-hourly model output). This also suggests that assimilating the local soundings does not alleviate the existing biases.
The fact that the analysed wind profile error does not change much in any of the denial experiments does not mean that those observations play no role in constraining the wind profiles. Typically, when one observing system is withdrawn from the data assimilation system, other observing systems take over in constraining the analysis (Sandu et al., 2020).

suggest that small frictional effects are present between 1 and 2 km, and an acceleration of northerly winds is seen above 2 km. There might be an important role of convectively driven circulations that change winds even at levels above the sub-cloud layer, which are not present in the IFS. The tendency produced by the cumulus parameterization (in blue, Figure 11) is very small for the meridional component.
In the next section we carry out one experiment removing the momentum transport by convection, to investigate at what heights this may contribute to the bias.

To investigate the role of shallow convective momentum transport (CMT) we compare Exp3_an and Exp3_fc, in which shallow convective momentum transport is turned off, with the same control experiment as in section 5 (CTRL_an, CTRL_fc) (see Figure 12).
CMT acts to mix winds between the surface and the cloud layer. If the wind speed increases with height, as is typically true for the sub-cloud layer, this would result in an increase of wind speed near the surface and a decrease in wind speed in the cloud layer (the so-called "cumulus friction effect"). Figure 12 confirms that CMT by shallow convection acts to weaken the dominant easterly winds in the cloud layer, while strengthening winds near the surface. Without shallow CMT, in Exp3_an and Exp3_fc, a too strong easterly and total wind jet develops around 1 km during both day and night, while the mean bias in the zonal wind reduces to near zero at night and the meridional wind near the surface misses a frictional effect. However, at levels above 2 km, the weak wind speed bias, evident in both the zonal and meridional components particularly at daytime, remains.

There is little difference between the black lines (Exp3_an and Exp3_fc) and the blue lines (CTRL_an and CTRL_fc) above 2 km in Figure 12. At these height levels, convective tendencies in the IFS are small or negligible (Figure 11).
As suggested in Sandu et al. (2020), CMT may not underlie previously established errors in near-surface winds, but may rather act to communicate erroneous winds at upper levels to the surface. Nevertheless, it appears plausible that the IFS has biases in other components of the residual "frictional" force (e.g. the turbulent diffusion or the frictional force produced by the convection scheme). Using ICON-LEM hindcast runs over the North Atlantic for twelve days corresponding to the NARVAL1 (winter) and NARVAL2 (summer) flight campaigns, Dixit et al. (2021) and Helfer et al. (2021) show that resolved convective circulations play an important role in driving winds southwards throughout a deep layer up to 5 km. They also reveal the presence of convectively driven horizontal circulations and associated counter-gradient momentum flux near cloud tops. The IFS hardly has any convective tendencies above 1 km in the meridional component, and none above 1.5 km in the zonal component, and we therefore hypothesize that model physics does play a role in the wind biases.

New moist physics
In this section we test our hypothesis on the role of model physics in the wind biases. To do so, we use a model experiment

Figure 13. Mean model bias for ERA5 (dashed blue), the operational forecast (solid blue), and a forecast with the new model cycle 47r3 (red circles). The bias is shown separately for all hours of the day (top), for daytime between 10 and 16 LT (middle row), and for nighttime between 22 and 4 LT (bottom row). The bias is calculated with respect to radiosondes. From left to right the columns refer to the bias in the zonal wind (u), meridional wind (v), and wind speed.