Study on the seasonal variation of Aeolus detection performance over China using ERA5 and radiosonde data

Aeolus wind products have been available to the public since 12 May 2020. In this paper, Aeolus wind observations, L-band radiosonde (L-band RS) data and the European Centre for Medium-Range Weather Forecasts (ECMWF) fifth-generation atmospheric reanalysis (ERA5) are used to analyse the seasonality of Aeolus detection performance over China. Based on the Rayleigh-clear and Mie-cloudy data, the quality of the valid Aeolus detection data is verified, and the results show that the Aeolus data are in good agreement with the L-band RS data and the ERA5 data. The relative errors of the Aeolus data in four regions of China (Chifeng, Baoshan, Shapingba and Qingyuan) were calculated for different months (July to December 2019 and May to October 2020). The relative error of the Rayleigh-clear data is significantly higher in summer than in winter: the mean relative error parameter in July is 174% higher than that in December. In addition, the distributions of wind direction and high-altitude clouds in different months (July and December) are analysed. The results show that the angle between the horizontal wind direction of the atmosphere and the horizontal line of sight (HLOS) falls in the high-error interval (70°–110°) more often in summer, and this proportion is 8.14% higher in July than in December. Moreover, the cloud top height in summer is about 3–5 km higher than in winter, which may reduce the signal-to-noise ratio (SNR) of Aeolus. The results show that the detection performance of Aeolus is affected by seasonal factors, which may be caused by seasonal changes in wind direction and cloud distribution.


Introduction
Global wind field data are indispensable meteorological parameters in weather forecasting (Ishii et al., 2017). To this end, the European Space Agency (ESA) proposed the Atmospheric Dynamics Mission Aeolus (ADM-Aeolus) in 1999. Aeolus is equipped with a 355 nm direct-detection wind lidar, which uses a single-view detection method to scan the global three-dimensional wind field from space. In addition, it adopts a dual-channel design and uses different frequency discriminators to receive the Mie channel and Rayleigh channel signals respectively (Stoffelen et al., 2005; Reitebuch, 2012). Aeolus was successfully launched on 22 August 2018, becoming the world's first spaceborne wind lidar in orbit. It then sent back the first batch of wind profile data, proving that a satellite-borne direct-detection wind lidar can provide global wind profiles (Reitebuch et al., 2019). On 12 May 2020, ESA opened Aeolus' wind measurement products to the public (https://earth.esa.int/eogateway/news/aeolus-data-now-publicly-available, last access: 13 March 2021), including Level 1B and 2B products. Among them, the Level 2B products, which are used in this paper, provide HLOS wind data after actual atmospheric correction and bias correction (European Space Agency et al., 2008).
To calibrate the Doppler lidar carried on Aeolus accurately, researchers have carried out a series of verification and comparison studies on Aeolus' wind products. Dedicated verification work began immediately after the launch of Aeolus. The main verification methods included radiosondes (Baars et al., 2020) and airborne lidar (Witschas et al., 2020; Lux et al., 2020), in which Aeolus' airborne prototype, the ALADIN (Atmospheric Laser Doppler Instrument) Airborne Demonstrator (A2D), was also used. In the following two years, researchers around the world completed regional verification of the Aeolus detection data using a variety of detection means, including satellites (Shin et al., 2020), ground-based lidar (Hauchecorne et al., 2020), ground-based wind profiler radar (Belova et al., 2021; Guo et al., 2021), and radiosondes (Liu et al., 2021). In addition, global verification of Aeolus data using NWP models has been carried out (Martin et al., 2021; Rennie and Isaksen, 2019). These verifications not only deepened our understanding of Aeolus data quality, but also revealed factors that affect it, such as solar background radiation, satellite flight direction and seasonal changes (Martin et al., 2021). ECMWF also revised the data processing algorithm promptly, for example with the temperature gradient correction algorithm for M1, the primary mirror of the Aeolus telescope, to address the seasonal changes (Rennie and Isaksen, 2019). So far, very few studies on the seasonal fluctuation of Aeolus detection performance over China have been reported, especially studies using actual detection data.

https://doi.org/10.5194/acp-2021-298 Preprint. Discussion started: 21 April 2021 © Author(s) 2021. CC BY 4.0 License.
After the implementation of the new M1 bias correction scheme, the effect of changes in the thermal performance of the system on Aeolus' seasonal fluctuation is theoretically excluded. However, the actual atmospheric conditions in different seasons still tend to affect the detection performance of the lidar.

In this paper, the variation of Aeolus detection performance in different seasons is analysed using Aeolus L2B wind products in four regions of China over 12 months (July to December 2019 and May to October 2020), compared with the L-band RS detection data and the ERA5 reanalysis data. Two conjectures about the seasonal differences in Aeolus detection capability are introduced and verified.

Data and methods
To balance the consistency of meteorological conditions against the abundance of detection data, the Aeolus L2B data are compared with the ERA5 data and the L-band RS detection data within a ±2.5° (latitude and longitude) geographical range around each L-band RS station.

Aeolus L2B wind products
Aeolus flies in a sun-synchronous dawn-dusk orbit at a height of about 320 km, and its repeating ground track generally passes over the study area around 10:00 UTC and 22:00 UTC. In this paper, the validity of Aeolus data is judged by the validity flags (0 = invalid, 1 = valid) and the estimated errors (which must meet threshold requirements), both of which are available in the Level 2B data. Following ECMWF's recommendations (Rennie and Isaksen, 2020) and the characteristics of the data used in this paper (Fig. 2), the estimated-error thresholds are 8 m/s (Rayleigh-clear) and 4 m/s (Mie-cloudy). In this study, the Mie-clear and Rayleigh-cloudy data are discarded because too few valid data points remain.
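The screening described above can be sketched as follows. This is a minimal illustration rather than the processing chain used in the paper: the function name, the dictionary keys and the array arguments are hypothetical, and treating an estimated error exactly equal to the threshold as acceptable is an assumption.

```python
import numpy as np

# Estimated-error thresholds quoted in the text (m/s).
# The dictionary keys are illustrative labels, not L2B product field names.
ERROR_THRESHOLDS = {"rayleigh_clear": 8.0, "mie_cloudy": 4.0}


def select_valid(hlos_wind, validity_flag, estimated_error, channel):
    """Keep winds flagged valid (flag == 1) whose estimated error
    does not exceed the threshold for the given channel."""
    hlos_wind = np.asarray(hlos_wind, dtype=float)
    validity_flag = np.asarray(validity_flag)
    estimated_error = np.asarray(estimated_error, dtype=float)
    # Assumption: an error equal to the threshold still counts as acceptable.
    keep = (validity_flag == 1) & (estimated_error <= ERROR_THRESHOLDS[channel])
    return hlos_wind[keep]
```

For example, `select_valid(winds, flags, errors, "mie_cloudy")` would return only the Mie-cloudy winds that pass both criteria.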

L-band Radiosonde wind data
The L-band radiosonde (L-band RS) is widely used to obtain the true state of the atmosphere, and its detection altitude reaches 30 km (Guo et al., 2016). In this paper, we use the valid data (absolute wind speed less than 100 m/s) from the four L-band RS stations. Matching in geographic location and time needs attention when using L-band RS data. For geographic location, most of the valid L-band RS data used in this study have a balloon drift of less than 0.5° (longitude or latitude); only a few data points in winter have a maximum balloon drift of about 1.6°. As for time, the nominal detection times of the L-band RS network in China are 00:00 UTC and 12:00 UTC. The balloon is generally launched about 45 minutes before the nominal time, which is 1 to 2 hours away from the transit time of Aeolus. To reduce the impact of these time and location differences on this study, ERA5 data are also used as reference data.

ERA5 data
Reanalysis data sets are often used as a reference in meteorological data analysis (Hersbach et al., 2020). ERA5 data provided by ECMWF are used in this paper as reference data. It should be noted that the observation data set currently assimilated in ERA5 does not contain Aeolus observations (https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation#ERA5:datadocumentation-TheIFSanddataassimilation/, last access: 13 March 2021), so the Level 2B data of Aeolus and the ERA5 reanalysis data do not influence each other. In addition to the zonal wind component u and the meridional wind component v, the cloud coverage provided in ERA5 is also used in this paper (Sect. 3.2.2). Because ERA5 has a high resolution in time and geographic location and can be matched well with the Aeolus data, the difference between the ERA5 and L-band RS data is used to represent, to a certain extent, the wind difference between Aeolus and L-band RS that is caused by time and location differences.

Data matching
Since the L2B data are the final result of Aeolus's single-component wind measurements, it is necessary to project the L-band RS and ERA5 winds onto the Aeolus HLOS direction. The Aeolus azimuth varies with location and range and is given in the L2B data. The L-band RS and ERA5 data are projected onto the HLOS direction as

$$V_{\mathrm{L/ERA5}}^{\mathrm{HLOS}} = V_{\mathrm{L/ERA5}} \cos\left(\theta_{\mathrm{Aeolus}} - \theta_{\mathrm{L/ERA5}}\right) \tag{1}$$

where $V_{\mathrm{L/ERA5}}$ represents the total horizontal wind speed provided by the L-band RS data or ERA5 data, $\theta_{\mathrm{L/ERA5}}$ represents the wind direction of the total horizontal wind vector, and $\theta_{\mathrm{Aeolus}}$ is the azimuth of Aeolus.
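As an illustration of this projection, the following sketch computes an HLOS component from wind speed, wind direction and the Aeolus azimuth. It assumes the meteorological convention in which the wind direction is the direction the wind blows from, measured clockwise from north; the function names are hypothetical, and the sign convention should be checked against the L2B product documentation before use.

```python
import numpy as np


def speed_direction_from_uv(u, v):
    """Convert zonal (u) and meridional (v) components to wind speed and
    meteorological wind direction (degrees, direction the wind blows FROM)."""
    speed = np.hypot(u, v)
    direction_deg = np.rad2deg(np.arctan2(-u, -v)) % 360.0
    return speed, direction_deg


def hlos_projection(speed, direction_deg, azimuth_deg):
    """Project a horizontal wind onto the HLOS direction.

    Assumption: HLOS = speed * cos(direction - azimuth), which is
    equivalent to -u*sin(azimuth) - v*cos(azimuth) under the
    meteorological direction convention used above.
    """
    return speed * np.cos(np.deg2rad(direction_deg - azimuth_deg))
```

For ERA5, the u and v components would first be converted with `speed_direction_from_uv`; L-band RS data that already provide speed and direction can be passed to `hlos_projection` directly.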
For time matching, the L-band RS and ERA5 data closest in time to the Aeolus transit over the target area (±2.5° around the L-band RS station) are used. For geographic matching, the L-band RS stations are fixed. The Aeolus data are taken from the detection data within a ±2.5° (latitude and longitude) rectangular area centered on the L-band RS station (the red rectangular area in Fig. 1). For the ERA5 data, the grid point closest in latitude and longitude to each Aeolus data point is selected.

In the vertical direction, because the three data sets have different vertical resolutions, both the L-band RS data and the ERA5 data are matched with the Aeolus data through linear interpolation. For each valid Aeolus data point, the L-band RS (ERA5) data points just above and just below it are found. We denote them as $V(h_{\mathrm{up}})$ and $V(h_{\mathrm{down}})$ and then apply a linear interpolation:

$$V_{\mathrm{match}} = V(h_{\mathrm{down}}) + \frac{V(h_{\mathrm{up}}) - V(h_{\mathrm{down}})}{h_{\mathrm{up}} - h_{\mathrm{down}}}\left(h_{\mathrm{Aeolus}} - h_{\mathrm{down}}\right) \tag{2}$$

where $V_{\mathrm{match}}$ is the L-band RS (ERA5) wind matched with the Aeolus data in altitude, and $h_{\mathrm{Aeolus}}$ is the altitude of the Aeolus data point.
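The geographic selection and the vertical interpolation described above can be sketched with NumPy as follows. The function and argument names are hypothetical, and returning NaN for Aeolus altitudes outside the reference profile (rather than extrapolating) is an assumption, not something the text specifies.

```python
import numpy as np


def in_station_box(lat, lon, station_lat, station_lon, half_width=2.5):
    """Boolean mask of points inside the +/- half_width degree rectangle
    centered on the radiosonde station."""
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    return (np.abs(lat - station_lat) <= half_width) & (
        np.abs(lon - station_lon) <= half_width
    )


def interpolate_to_aeolus(ref_altitude, ref_wind, aeolus_altitude):
    """Linearly interpolate a reference wind profile (L-band RS or ERA5)
    to the altitudes of the Aeolus data points."""
    ref_altitude = np.asarray(ref_altitude, dtype=float)
    ref_wind = np.asarray(ref_wind, dtype=float)
    order = np.argsort(ref_altitude)  # np.interp needs increasing abscissae
    # Assumption: no extrapolation; altitudes outside the profile give NaN.
    return np.interp(
        aeolus_altitude,
        ref_altitude[order],
        ref_wind[order],
        left=np.nan,
        right=np.nan,
    )
```

Between two neighbouring reference levels, `np.interp` evaluates the same straight-line formula as the interpolation between the points just above and just below each Aeolus data point.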

Calculation of relative error
Generally, the wind speed error of a wind measurement system is calculated as

$$V_{\mathrm{diff}} = V_{\mathrm{det}} - V_{\mathrm{true}} \tag{3}$$

where $V_{\mathrm{det}}$ represents the detected value and $V_{\mathrm{true}}$ represents the true (calibration) value. This equation is used to calculate the difference between different wind field data. For a data set with a sample size of $N$, the statistical mean error (MD) and standard deviation (SD) are

$$\mathrm{MD} = \frac{1}{N}\sum_{i=1}^{N} V_{\mathrm{diff},i} \tag{4}$$

$$\mathrm{SD} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(V_{\mathrm{diff},i} - \mathrm{MD}\right)^{2}} \tag{5}$$

However, because the detection error of a Doppler wind lidar tends to increase with the detected wind speed (Frehlich, 2001), the relative error reflects the detection performance of the instrument better than the absolute error. The relative error is calculated as shown in Eq. (6):

$$E_{r} = \left|\frac{V_{\mathrm{det}} - V_{\mathrm{true}}}{V_{\mathrm{true}}}\right| \tag{6}$$

where $V_{\mathrm{true}}$ is taken from whichever of the two data sets has the smaller error. In this paper, the ERA5 data are assumed to have the smallest error and the Aeolus data the largest.

Similarly, for a data set with a sample size of $N$, the statistical average relative error is

$$\overline{E_{r}} = \frac{1}{N}\sum_{i=1}^{N} E_{r,i} \tag{7}$$

The average of the comparisons between Aeolus and the two other data sets is used to reduce the possibility of large deviations in the relative error:

$$\overline{E_{r}}(\mathrm{Aeolus}) = \frac{1}{2}\left[\overline{E_{r}}(\mathrm{Aeolus\&L}) + \overline{E_{r}}(\mathrm{Aeolus\&ERA5})\right] \tag{8}$$

$\overline{E_{r}}(\mathrm{Aeolus})$ is used to approximate the relative error of the Aeolus data in this paper.
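The error statistics introduced in this section can be sketched as below. The function names are hypothetical; using the sample standard deviation (divisor N − 1) and taking the absolute value in the relative error are assumptions made for this illustration.

```python
import numpy as np


def error_statistics(detected, reference):
    """Mean error (MD) and standard deviation (SD) of detected - reference."""
    diff = np.asarray(detected, dtype=float) - np.asarray(reference, dtype=float)
    md = diff.mean()
    # Assumption: sample standard deviation (ddof=1, i.e. divisor N - 1).
    sd = diff.std(ddof=1)
    return md, sd


def relative_error(detected, reference):
    """Pointwise relative error |detected - reference| / |reference|.

    The reference is the data set assumed to have the smaller error
    (ERA5 first, then L-band RS, then Aeolus, following the text).
    Reference winds near zero produce very large values.
    """
    detected = np.asarray(detected, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return np.abs(detected - reference) / np.abs(reference)
```

Because the denominator can be close to zero, the relative errors returned here still need outlier screening before they are averaged.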
3 Results and discussion

Quality of data
Table 1 shows the comparison results of the three data sources after data matching. The consistency of the L-band RS and Aeolus data is the lowest among the three groups for both the Mie-cloudy group and the Rayleigh-clear group. Moreover, the R and SD values of this group (Aeolus vs L) are also slightly worse than those of the other two groups, which is in line with the expectations of this paper. However, there is no significant difference between the comparison of L-band RS with ERA5 and the comparison of Aeolus with ERA5. This not only shows that the Aeolus and ERA5 data are in good agreement, but also suggests that the error caused by the space-time matching of the L-band RS data may be larger than expected. On the whole, the correlation coefficients in the three sets of comparison results are all higher than 0.92, reflecting the reliability of the data used in this paper.
Since this study also involves wind field data in different regions, the impact of geographic location and climate factors on data quality also needs to be shown. Figure 4 shows the comparison of Aeolus data and L-band RS data in the four regions used in this paper. The blue data points represent the Rayleigh-clear group, and the red data points represent the Mie-cloudy group. The colors of the linear fitting lines and related parameters are consistent with the corresponding data points. Table 2 summarizes the comparison results. The consistency of Aeolus and L-band RS data in the Qingyuan area is worse than that in the other three regions. This may be because Qingyuan is close to the tropics, where atmospheric convection is active. In addition, in terms of the correlation coefficient R, the correlation between the Aeolus data and the L-band RS data is higher for the Rayleigh-clear group, but its SD value is also higher than that of the Mie-cloudy group, which means that the data points are more scattered. As the latitude decreases, the data quality tends to decline, but the data quality of Baoshan is similar to that of Chifeng, which means that this trend is not pronounced.

Seasonal variations in relative errors
After the data quality has been confirmed, the three wind field data sets are substituted into Eqs. (3)–(8) introduced in Section 2 to obtain the errors. A representative statistical distribution of the errors is shown in Fig. 5a; it follows a Gaussian distribution.
However, the value of the relative error is affected by the denominator in Eq. (6). When the denominator is close to 0, the relative error becomes a large outlier. Such outliers are sporadic and random, as shown in Fig. 5b. Although most of the relative errors are distributed in the interval [0, 3], there are still sporadic outliers greater than 3. These outliers would affect subsequent calculations and must be filtered. A threshold screening method is used: any relative error greater than 3 (that is, 300%) is considered invalid. This threshold is derived from the statistical distribution of the relative errors in all four regions.

Further, we use Eq. (7) to calculate the monthly average relative error in each region, and then use Eq. (8) to calculate the monthly average of the Aeolus relative error parameter. Finally, we obtain the changes of the three sets of relative errors (Aeolus&L, L&ERA5, Aeolus&ERA5) from July 2019 to October 2020 (data are missing from January to April 2020), as shown in Fig. 6. For the Rayleigh-clear data, the relative error of the Aeolus data is significantly larger in summer. Although the mean relative error of the L&ERA5 comparison, $\overline{E_{r}}(\mathrm{L\&ERA5})$, which represents the error caused by imperfect space-time matching, also increases in summer, the increases of $\overline{E_{r}}(\mathrm{Aeolus\&L})$ and $\overline{E_{r}}(\mathrm{Aeolus\&ERA5})$ in the summer months are much larger than that of $\overline{E_{r}}(\mathrm{L\&ERA5})$. Therefore, we believe that this increase in relative error is caused by the variation in Aeolus detection performance. The same seasonal trend of the Rayleigh-clear data in the four regions (Fig. 7a) also supports this idea, as the mean relative error parameter in July is over 174% higher than that in December. In addition, we found that as the latitude decreases, the month in which the relative error peaks is delayed.
In the two years (2019–2020), the relative error peaked in July in Chifeng, in August in Baoshan and Shapingba, and in September in Qingyuan. Meanwhile, the relative errors in the summer of 2020 increased to varying degrees compared with 2019. This may be related to differences in summer weather conditions between the two years or to the decrease in the laser power of Aeolus; the cause is not yet clear.
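The threshold screening and averaging steps described above can be sketched as follows. This is a minimal illustration with hypothetical function names; it assumes that the screening is applied to the pointwise relative errors before averaging and that a relative error exactly equal to 3 is kept.

```python
import numpy as np


def mean_relative_error(rel_err, threshold=3.0):
    """Average relative error after discarding values above the threshold
    (3, i.e. 300%, in the text) and non-finite values."""
    rel_err = np.asarray(rel_err, dtype=float)
    valid = rel_err[np.isfinite(rel_err) & (rel_err <= threshold)]
    return valid.mean() if valid.size else np.nan


def aeolus_relative_error(rel_err_vs_rs, rel_err_vs_era5, threshold=3.0):
    """Average of the Aeolus vs L-band RS and Aeolus vs ERA5 mean relative
    errors, used as the Aeolus relative error parameter."""
    return 0.5 * (
        mean_relative_error(rel_err_vs_rs, threshold)
        + mean_relative_error(rel_err_vs_era5, threshold)
    )
```

Calling these functions once per month and region would give monthly series of the kind compared in Fig. 6 and Fig. 7.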
In addition, we calculated the monthly average relative error parameter in the Mie-cloudy group, as shown in Fig. 7b. The mean relative error parameter in July is only 39% higher than that in December. Its seasonal fluctuations are relatively random across regions, but the summer relative error is slightly larger overall.

ECMWF proposed that varying temperature gradients across the instrument's mirror M1 caused seasonal fluctuations in the quality of Aeolus detection data (Rennie and Isaksen, 2020). After the corrections for the mirror effects are applied, the seasonal variation caused by the thermal structure of the system itself should in theory become very small. In addition, the different behaviour of the Mie-cloudy group and the Rayleigh-clear group does not seem to support the explanation that the Aeolus system itself causes the seasonal changes in detection performance.
Considering the difference in detection range between the Rayleigh-clear group and the Mie-cloudy group (Mie-cloudy mainly detects the aerosol layer), we believe that this seasonal difference may be caused by seasonal changes in the real atmospheric environment. Therefore, we put forward and test two conjectures based on the working principle of Aeolus.

Seasonal variations in atmospheric wind direction
Since Aeolus detects only a single line-of-sight wind component, the detected wind is a projection of the real wind vector. When the angle between the line of sight and the real wind vector approaches 90°, the real wind vector contributes almost nothing to the detected wind, and the detection error is the largest. The closer the angle is to 0° or 180°, the smaller the detection error.

The wind direction of the atmospheric wind field has an obvious seasonal trend. In China, the northwest monsoon prevails in winter and the southeast monsoon in summer. To verify whether the seasonal variation in wind direction causes the seasonal variation in Aeolus detection performance, we analyse the statistical distribution of the angle between the real horizontal wind direction and the Aeolus HLOS direction. Based on the previous data matching work, we calculate the angle α between the real horizontal wind direction (provided by ERA5 data) and the Aeolus HLOS direction (provided by L2B data) for each valid Aeolus data point. Figure 8 is obtained from the Rayleigh-clear group data. When the angle α is in the range of 70°–110°, the relative error of the Aeolus data increases significantly. The proportion of data points whose angle is in the range of 70°–110° is 8.14% higher in July than in December. Most of the data points in December are concentrated near 0° and 180°. Theoretically, this significantly increases the average relative error of Aeolus in July, which may be one of the reasons for the increase in the relative error of Aeolus in summer. For the Mie-cloudy group (Fig. 9), the proportion of data points whose angle is in the range of 70°–110° is 5.86% higher in July than in December, which means that the difference in angle distribution is of the same order of magnitude for the Rayleigh-clear and Mie-cloudy data.
However, the seasonal variation of the relative error in the Mie-cloudy group is much smaller than that in the Rayleigh-clear group, so this conjecture cannot explain the seasonal performance of the Mie-cloudy group.
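The angle statistics used in this section can be sketched as follows. The function names are hypothetical, and folding the angle into the range 0° to 180° and treating the interval bounds as inclusive are assumptions made for this illustration.

```python
import numpy as np


def wind_hlos_angle(wind_direction_deg, hlos_azimuth_deg):
    """Angle (0 to 180 degrees) between the horizontal wind direction
    and the Aeolus HLOS direction."""
    diff = np.abs(
        np.asarray(wind_direction_deg, dtype=float)
        - np.asarray(hlos_azimuth_deg, dtype=float)
    ) % 360.0
    # Fold angles larger than 180 degrees back into [0, 180].
    return np.where(diff > 180.0, 360.0 - diff, diff)


def fraction_in_interval(angle_deg, low=70.0, high=110.0):
    """Fraction of data points whose angle lies in the high-error interval.
    Assumption: both bounds are inclusive."""
    angle_deg = np.asarray(angle_deg, dtype=float)
    return np.mean((angle_deg >= low) & (angle_deg <= high))
```

Evaluating `fraction_in_interval` separately on the July and December data points and taking the difference gives a comparison of the kind reported for Fig. 8 and Fig. 9.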

Seasonal variations in upper cloud cover
In actual operation, a spaceborne wind lidar is susceptible to the influence of clouds and aerosols. When the laser passes through a cloud-aerosol layer, it is strongly attenuated, which decreases the energy of the laser beam and the signal-to-noise ratio (SNR). At the same time, for the Rayleigh channel, the cloud-aerosol layer also produces strong Mie scattering, which contaminates the Rayleigh channel signal and increases its detection error.
Because the Rayleigh channel has comprehensive coverage in altitude, we take the valid data of the Rayleigh channel (both Rayleigh-clear and Rayleigh-cloudy), count the number of samples by month and altitude, and from these counts define a parameter that represents the backscatter ratio (Eq. 9). This parameter is then normalized. The larger the normalized value, the stronger the Mie scattering in the altitude layer. We calculate its value for the 12 months in the different regions with a height resolution of 1000 m to obtain Fig. 10.
Obvious high-altitude Mie scattering layers exist in all four areas in summer; in autumn and winter, this Mie scattering layer moves to lower altitudes and its Mie scattering intensity weakens. In the Qingyuan shown in ...

The ERA5 cloud coverage information matched with the Aeolus data points is shown in Fig. 11. The value plotted in Fig. 11 is the parameter representing the backscatter ratio defined by Eq. (9). The mean cloud coverage and this parameter have a similar trend in the vertical direction, but they differ near the ground because the Mie scattering there is mainly caused by aerosols rather than clouds. Taken together, the high-altitude Mie scattering layer in Fig. 10 in summer is caused by the presence of clouds in this area. In other words, the cloud top height increases significantly in summer within the Aeolus detection height range. Combining the results of the two figures (Fig. 10 and Fig. 11), the cloud top height in July is about 3–5 km higher than that in December for the four regions, which is consistent with the seasonal variation of cloud top height in East Asia (Zhao et al., 2020).
When the satellite-borne wind lidar operates, a high-altitude cloud layer inevitably attenuates the laser beam and reduces the energy of the beam passing through the cloud layer. As a result, the SNR of the echo signal from the area below the high cloud decreases and the detection error increases. At the same time, the Rayleigh signal from the cloudy area is contaminated by the Mie scattering signal, which affects the calculation of its Doppler shift. Although the Aeolus data processing algorithm uses a strict backscatter ratio threshold to remove the Rayleigh channel data elements that may contain Mie scattering from the Rayleigh-clear group, the signal interference still affects the final inversion result.
Finally, we try to use this conjecture to explain why the seasonal variation of the relative error in the Mie-cloudy group is not obvious (Fig. 7b). Two data points in Fig. 7b stand out: July and August 2020 in Baoshan. Both are summer months, but the average relative error in August is significantly higher than that in July. The vertical distributions of the valid data points and of all data points of the Mie-cloudy group in the two months are shown in Fig. 12. The data points in July are mainly distributed in the high-altitude cloud area. This may be because dense clouds in the upper air in July make it difficult for the Mie channel to detect the area below the clouds. The Mie scattering signal from the high-altitude cloud layer, which is close to the satellite, passes through a small optical thickness and has a high SNR, which improves the Aeolus data quality. In addition, the Mie channel discriminator is not sensitive to the Rayleigh scattering signal, so there is no need to consider Rayleigh signal interference. In August, the data points are evenly distributed in the vertical direction, and low-altitude data still account for a considerable proportion. Before the laser beam reaches low altitude, it is attenuated by high-altitude clouds. The low-altitude Mie-scattered echo signal then propagates back through the upper atmosphere and is attenuated again. It is therefore very weak when it reaches the Aeolus receiving telescope, which hampers subsequent signal processing and reduces the quality of the Mie-cloudy data. The parameter used to represent the SNR, shown in Fig. 12, is significantly higher in August than in July at altitudes below 8 km, which supports the above explanation.
That is to say, in summer, a considerable part of the data in the Mie-cloudy group comes from high altitude, but the low-altitude and high-altitude Mie scattering signals contribute to the overall relative error in opposite ways. This may be the reason why the seasonal variation of the Aeolus relative error in the Mie-cloudy group is not obvious.

Conclusion
In this study, the seasonal variation of Aeolus detection performance over China is analyzed using Aeolus detection data, L-band RS detection data and ERA5 data from July 2019 to December 2019 and from May 2020 to October 2020. Firstly, the difference between the Aeolus data and the L-band RS data is discussed, and the selection threshold for the Aeolus estimated error is specified as 8 m/s for the Rayleigh-clear data and 4 m/s for the Mie-cloudy data. After the valid data are filtered, a comparative analysis of the three wind field data sets is carried out. The R value between the Aeolus Rayleigh-clear (Mie-cloudy) data and the ERA5 data is 0.95 (0.97), and that with the L-band RS data is 0.92 (0.94). This shows that the Aeolus detection data are in good agreement with the ERA5 and L-band RS data, and that the quality of the data used in this paper is reliable. These three data sets are then used to calculate the monthly mean relative error. The relative errors of the Rayleigh-clear data increase significantly in summer in the four regions, as the mean relative error parameter in July is 174% higher than that in December. In the Mie-cloudy group, this seasonal trend is not obvious and the behaviour is more random. Based on the working principle of Aeolus, we propose two conjectures to explain the seasonal variation of the relative error of the Aeolus data. One is the variation in the angle between the actual horizontal wind direction and the Aeolus HLOS direction, which may affect how well the Aeolus single-component data reflect the true wind vector. The other is the seasonal variation in the altitude of high-altitude clouds and cloud tops, which may affect the SNR of the echo signals in the different channels.

For the first conjecture, we calculate the distribution of the angle between the actual horizontal wind direction and the Aeolus HLOS direction in the different regions in July and December.
It is found that clearly more data points fall in the high-error interval of 70°–110° in July than in December, which indicates that this conjecture is reasonable. However, the first conjecture has difficulty explaining the behaviour of the Mie-cloudy data. For the second conjecture, we define a parameter to represent the backscatter ratio and calculate its distribution at different altitudes in different months.

It is found that there is a strong high-altitude Mie scattering layer in summer. Combined with the cloud coverage information, this shows that the Mie scattering layer is caused by high-altitude clouds in summer. The high-altitude clouds reduce the signal-to-noise ratio of the echo signal received by Aeolus and also interfere with the signal analysis and processing of the Rayleigh channel. This conjecture is also reasonable and can explain the seasonal variation in the relative error of the Aeolus Mie-cloudy data.

In this study, the analysis of the Aeolus data quality and its seasonal changes in the four regions of China (Chifeng, Baoshan, Shapingba and Qingyuan) will help users to better understand and use the Aeolus detection data over China. In addition, as Aeolus is the first spaceborne wind lidar, the analysis of factors affecting its detection performance provides a reference for the follow-up development of spaceborne wind lidar.