Determination and climatology of diurnal cycle of atmospheric mixing layer height over Beijing 2013-2018: Lidar measurements and implication for air pollution

The atmospheric mixing layer height (MLH) determines the volume available for the dispersion of pollutants and thus contributes to the assessment of the pollutant concentration near the surface. The study evaluates the capability of lidar to describe the evolution of atmospheric mixing layer and then presents a long term observed climatology of MLH diurnal cycle. A system for automatic detection of the mixing layer height based on two wavelet methods (MLH and MLH’) applied to lidar observations was operated from January 2013 to December 2018 in the Beijing urban area. The two dataset results are 15 compared with radiosonde as case studies and statistical form. MLH shows good performance to calculate the convective layer height at daytime and the residual layer height at night. While MLH’ has the potential to describe the stable layer height as radiosonde at night, the performance is limited due to the high range gate of lidar. A nearly six year climatology for diurnal cycle of MLH is calculated for convective and stable conditions using the dataset of MLH from lidar. The MLH characteristics of seasonal change in Beijing indicate that it is low in winter and autumn, and high in spring and summer. A significant 20 phenomenon is found that from 2013 to 2018, the diurnal cycle of MLH increase year by year. It may partly benefit from the improvement of air quality. As to converting the column optical depth to the surface pollution, MLH from lidar shows better accuracy than that from radiosonde. Additionally, the accuracy with lidar MLH shows a diurnal cycle, with the peak at time of 1400 LST. The study provides a significant dataset of MLH based on measurement and could be an effective reference to atmospheric models for surface air pollution calculation and analysis. 25


Introduction
The height of mixing layer (MLH) is a crucial parameter for air quality forecast, pollutants dispersion and identification of pollutant emissions and sources (Haeffelin et al., 2012;Seibert et al., 2000;Liu and Liang, 2010). Pollutant emitted into the atmospheric boundary layer is gradually dispersed vertically through the action of turbulence (Gan et al., 2011;Monks et al., 2009;Guo et al, 2016), and finally becomes completely mixed over this layer if sufficient time is given (Emeis et al., 2008). Sawyer and Li, 2013;Cimini et al., 2013;Tang et al, 2016;Su et al, 2019), of which good correlations are found in the case of convective weather conditions with differences of 100-300 m, while non-convective weather conditions lead to much larger difference in the MLH estimations. In these cases, the discrepancy tends to be even larger if the approaches are supposed to measured different structure of ML such as CBL, SBL or RL (Collaud et al., 2014). The meteorological radiosondes can only 65 acquire the MLH in the morning (08:00 LT) and at night (20:00 LT), when the ML tends to be stable and the convective condition is lacking. Ceilometer is used to measure the diurnal variation of MLH, however, the accuracy suffers from its low vertical resolution and poor signal noise ratio (De Haij et al., 2006;Tang et al., 2016). Su et al. (2019) indicates that lidar shows good consistent with RS in the daytime stable condition with the improved algorithm, but the nocturnal MLH with very low SBL height is not evaluated.

70
In the existing studies, numerical simulations, ground-based remote sensing, or meteorological radiosonde are used to obtain the characterization of MLH during short time periods in Beijing, mainly focusing on heavy pollution event (Yang et al., 2005;Zhang et al., 2006;Quan et al., 2013;Hu et al., 2014;, which underscores the scarceness of continuous high-resolution observations for a long time period. Depending on the measured atmospheric parameters and observational uncertainties, different measurement approaches may reveal different aspects of PBL structure (Seibert et al., 2000;Seidel et 75 al., 2010;Beyrich and Leps, 2012). Thus, it is of great significance to apply consistent algorithm to consistent types of atmospheric structure parameter when comparing MLH from different times.
The main objective of this study, therefore, is aimed to present a long term observed climatology of the MLH diurnal cycle based on lidar observations. For that, the capability of lidar to describe the diurnal evolution of mixing layer height is evaluated first. The data and methods used are described in Section 2. Sect. 3 is the result and discussion, which consist of the comparison 80 of lidar-derived MLH with radiosonde measurements, the climatology of MLH in Beijing and implication for surface pollution retrieval. Then, it is concluded in Sect. 4.

Site and lidar measurements
In the study, the observation site (116.37° E, 40.00° N) in Beijing city is located on the building roof (59 m a.s.l.) of the 85 Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences. The micro pulse lidar (CE370, CIMEL, France) was used at wavelength of 532nm with the vertical resolution of 15 m. The laser used is frequency-doubled Nd:YAG with pulse repetition frequency 4.7 kHz and energy 8-20 μJ. The signal received from lidar is processed by subtraction of atmospheric background, correction of overlap, correction of range (range corrected signal, RCS), and then logarithm calculation (expressed in In( 532 ′ )) (Yan et al., 2014;Campbell t al. 2003；Su et al., 2019. MLH estimation from lidar systems 90 is based on the detection of the sharp decrease in aerosol backscatter at the top of the mixing layer (Seibert et al., 2000) .

MLH derived from Lidar
Wavelet transforms are commonly used in many studies for MLH determination from lidar observations (Cohn and Angevine, 2000;Davis et al., 2000;Brooks, 2003;De Haij et al., 2006;Su et al., 2019). When it is the maximum value of attenuated backscattering profile convolved with Haar function, the corresponding height is MLH. The equation of wavelet is defined as 95 follows: Where b is the transformation of the equation, where the equation is cantered, and a is the expansion of the equation. The equation of wavelet covariance transformation Wf(a,b), namely, the convolution of attenuated backscattering profile f(z) with wavelet function is defined as follows: Where f(z) represents the backscatter signal of lidar, zb denotes the lower limit of the height of the profile, and zt represents the upper limit of the height. A valid MLH is detected corresponding to the value b when Wf (a, b) reaches the biggest local maximum with a coherent scale of a (Brooks, 2003;De Haij et al., 2006;Emeis et al., 2008). In this study, the expansion a is selected as 420 m, 435 m, 450 m, respectively, and final Wf (a, b) is calculated from the averaged corresponding values.

105
Another layer, named MLH' is detected simultaneously by the first local maximum Wf (a, b) from zb, which is assumed to be smaller than or equal to MHL (De Haij et al., 2006). Since the absolute maximum in the vertical gradient of the lidar profiles is characterized by the rapid degrease in pollutants concentration, the MLH can be associated with the CBL height during day and to the RL during night (Collaud et al., 2014). However, the detection of MLH' is used to try to detect the height of SBL in the nocturnal time.

110
The scheme of cloud screen follows that MLH is eliminated if a cloud flag is marked when the cloud base is found within 6 km from the surface. Due to the limitation of algorithm and insufficient lidar overlap, the MLH calculation starts from 255 m.
All MLH results are one hour averaged and processed under good quality control, which involves with elimination of incomplete data, false value and peak value, and so on.

115
Radiosonde is usually considered as the ground truth for the detection of CBL height at daytime and SBL height at nocturnal The bulk Richardson number (Rib) is a dimensionless parameter combining the thermal energy and the vertical wind shear, and is widely used in MLH climatology (Seidel et al., 2012;Collaud et al., 2014). Ri is defined as the ratio of turbulence associated with buoyancy to that induced by mechanical shear, which is expressed as where z is the height (z > , subscript 's' denote the surface), θ characters virtual potential temperature, U and V indicates the two horizontal wind velocity components, g presents the Earth gravitational constant. The MLH corresponds to the first elevation z with Rib greater than a critical threshold taken as 0.25 (Stull, 1988;Seidel et al. 2012;Guo et al., 2016Guo et al., , 2019. In most cases, the exact threshold value has only a small impact on the PBL height due to the large slope of Rib in this interval

Air pollution model
The data of MLH is usually combined within the atmospheric model to obtain the surface air pollutant concentration. For example, PM2.5 remote sensing(PMRS)model, derived by , have the ability to calculate the mass concentration of PM2.5 above ground. The PMRS method is designed to employ currently available remote sensing parameters,

135
including aerosol optical depth (AOD), fine mode fraction (FMF), planetary boundary layer height (PBLH) and atmospheric relative humidity (RH), to derive PM2.5 from instantaneous remote sensing measurements under different pollution levels Li et al., 2016;Yan et al., 2017, detail information see S2). In this study, AOD and FMF is retrieved by a sunphotometer located with the lidar, while RH is measured by a meteorology station in the same station. The MLH obtained both from lidar and radiosonde is used in the model to calculate the surface PM2.5.

MLH operational measurement: case study
A selection of typical atmospheric conditions included in the data set of lidar measurement are plotted in Fig. 1 -Fig. 4. The heights of the mixing layer (MLH and MLH') are obtained from different criteria using the wavelet covariance transform method. As shown in Fig.1, the development of a convective mixing layer could clearly be observed, with a sharp decrease in 145 aerosol backscatter between the mixing layer and the free atmosphere. In the fair weather, there should be an obvious diurnal variation in the height of the boundary layer, which gradually rises at 8 am, reaches a maximum at 2 -3 pm, and gradually decreases at night. This is related to the diurnal change of the surface radiation, which is the driving factors of boundary layer.
However, in fact, the MLH from lidar is sometime not consist with thermodynamics PBL height, which could be obtained by MLH_RS (MLH from radiosonde) indicated by the dramatic variance of relative humidity (RH) and temperature (T). When 150 the PBL reaches the top and the thermodynamics PBL starts to decrease, the aerosol layer height keep for still or sometime even grow for some distance, rather than going down immediately, just as Fig.1a shown. MLH is always higher than MLH_RS, which indicates the stable layer height at 2000 LST. It should be noted that above the MLH_RS there still exist abundance of aerosols. The aerosol layer height should be the top of residual layer height, which corresponds to MLH at nocturnal time, determined by the greatest gradient of aerosol concentration. is much higher than MLH_RS (0.243 km), but equal to MLH. It is related to that the stable layer height obtained from 160 radiosonde in the case is out of the range of lidar detection (0.255 km), in which, that is, MLH' from lidar is disabled to determine the stable layer height.
In the summer time (JJA) when the radiosonde is additionally launched at 1400 LST to detect the convective boundary layer, MLH is consistent with MLH_RS, while MLH' is frequently under MLH_RS, as shown in Fig.2. The detailed information represented in Fig.2b shows that MLH is equal to MLH_RS, which reaches up to 2.95km, while MLH' is only 1.24km. Under

165
clear sky conditions, problems in estimating the MLH from lidar data can arise from the weak vertical gradients in the aerosol content as indicated by RCS. Fig.2c and 2d indicate that MLH approaches to MLH_RS, but a little lower than MLH_RS. The difference is associated with that aerosol within the mixing layer needs some time to adjust the thermal structure, and exist a delay to reach the thermodynamics MLH (Stull, 1988;Collaud et al., 2014).
As fig.3 shown that MLH' corresponds to MLH_RS for most time. But when stable layer height was around 0.25 km, both

170
MLH' missed the height of SBL, but point to the height of residual layer (Fig.3b. and Fig.4).When stable layer height is higher than 0.25 km, MLH' tend to approach to MLH_RS (Fig.3d.). However, in the afternoon MLH was used to be close to MLH_RS than MLH' (Fig.3c.). In some synoptic condition, the PBL stays stable in the whole day,and the height of SBL is missed by MLH' (Fig.4).
The greatest advantage of PBL detection by RS is its very good precision. However, radiosonde, crossing the PBL along a 175 slanted path within a few minutes, provides only a "snapshot"-like profile (Seibert et al., 2000). Its temporal resolution (two or three measurements per day) is not able to provide the MLH diurnal cycle. Meanwhile, radiosonde detects the daytime and nighttime layers in which ground-emitted atmospheric pollutants are trapped, but not the residual layer height corresponding to trapped atmospheric constituents discharged some hours before (Collaud et al., 2014).
As for lidar measurement, MLH tends to obtain CBL height in the afternoon and RL height at night and early morning. MLH' 180 sometime agree well with SBL height in nocturnal time and early morning, and it has the potential to describe the SBL.
However, MLH' is disabled in the case that SBL height is out of lidar measurement gate. And, in the afternoon, MLH' is frequently lower than CBL height. Therefore, for the operational lidar measurement of PBL height, MLH is more suitable and has the capability to mark the diurnal circle of mixing layer height. Hence, MLH obtained by the lidar can be used as an effective parameter for the vertical distribution of aerosols, and provides an important reference to obtain near-ground pollutant 185 concentrations for atmospheric model.

190
The frequency of MLH from radiosonde lower than 0.25 km is nearly 35%，where it is no data for MLH from lidar due to the limited detection range. Specifically, the rate of MLH from lidar smaller than 0.5 km is nearly 18% and 12%,repectively,at 0800 LST and 2000 LST, while the corresponding frequency of radiosonde is beyond 75% and 66%. The frequency of larger MLH value at the time of 2000 LST is bigger than that of 0800 LST, both from lidar and radiosonde. It is reasonable that the residual layer have not yet collapse entirely at 2000 LST, while the CBL have not develop well in the early morning. As for 195 the MLH', its distribution trend is more similar to MLH_RS than MLH (See Fig.S3), and that the correlation between MLH' and MLH_RS is higher than that between MLH and MLH_RS, in spite that it is still not good (See Fig.S4). It indicates once again that MLH' have the potential to determine the SBL height as radiosonde does. As to the seasonal variation of both lidar and RS measurement at 0800 LST, the frequency of larger MLH value in summer is minimal, indicating summer MLH is lower than other season. As for radiosonde, MLH lower than 0.25 km most distributes in winter, with the rate of around 15%

200
for both 0800 and 2000 LST, and the frequency decreases rapidly when MLH gets larger than 0.25 km.
The comparison of MLH and MLH_RS at time of 1400 LST in summer is presented in Fig.6 cooling, gravity waves. It is almost impossible to separate the various contributions to the observed PBL structure (Seibert et al., 2000). Additionally, it is difficult to identify a clear upper boundary of the mixing layer because the measured parameter (MLH) is actually not a fixed point but rather a transition layer between two atmospheric states (Stull, 1988;Garratt, 1992; Collaud et al., 2014).

Diurnal cycle
A data set containing nearly six years measurement in Beijing is used for assessment of the overall performance of the Wavelet MLH algorithm (MLH and MLH') with respect to the diurnal availability, as shown in Fig. 7. It can be seen that the expected shape resembling the growth of a convective mixing layer is observed. The diurnal cycles derived from MLH match well with RS results in the afternoon, but larger than MLH_RS in the early morning and evening. Contrarily, MLH' tend to approach to 220 MLH_RS in the early morning and evening, but keep away from MLH_RS in the afternoon.
Owing to solar heating of the surface, when convective layer begin to rise due to upward convection in the early morning and nocturnal residual layer tends to collapse, MLH from lidar presents the minimal value. After that, MLH grows continuously and reaches its maximum height around 1500 (Stull et al., 1984;Tang et al, 2016;Su et al, 2019). The difference between MLH and MLH_RS at 1400 is less around 0.1 km. It is reasonable considering that RS data is acquired at 1400 only from 225 summer time, when MLH is usually larger around the year. However, MLH is nearly 0.46 km larger than MLH' throughout the day, and with bigger standard deviation. The nocturnal SBL top is defined as the transition between the stable surface layer and the residual layer (Stull, 1988;Collaud et al, 2014).Although MLH' can catches the high SBL height, due to lidar usable range gate, the point derived from MLH' is residual layer height rather than the nocturnal SBL height. As a contrast, MLH retrieve the consistent RL height during the night following the CBL diurnal maximal. And, the SBL height provided by 230 radiosonde at 0800 and 2000 LST can be considered as complementary to the lidar approaches.

Seasonal variation
The seasonal mean diurnal cycle of the MLH from lidar are shown in Figure 8. Previous studies have suggested that the seasonal variation in the MLH may be associated with radiation flux (Stull, 1988； Kamp andMcKendry, 2010;Munoz and Undurraga, 2010), which was consistent with our results. The larger MLH in spring 240 https://doi.org/10.5194/acp-2020-175 Preprint. Discussion started: 16 March 2020 c Author(s) 2020. CC BY 4.0 License.
than that in summer can be explained by the higher total radiation flux than summer. The relatively low values in autumn and winter are likely to related to the low radiation flux. Tang et al. (2016) revealed the similar seasonal variation in Beijing, which is that the MLH is low in autumn and winter, and high in spring and summer. For the lowest MLH in the early morning of summer, a possible explanation may be that the strong convective mixing, due to intense strong surface heating, leads to the greatest aerosol gradient forming at the top of convective boundary layer, rather than that of residual layer in the early morning.

Implication for surface pollution retrieval
The dataset of MLH from lidar is able to indicate the height of the aerosol layer throughout the whole day. Thus, the knowledge on MLH can be employed to convert column-mean optical depths into near-surface air quality information (Sifakis et al., 1998;Dandou et al., 2002;Schafer et al., 2008). Using MLH data and other aerosol related parameters, the mass concentration of 260 PM2.5 above ground is calculated based the PMRS model. Due to the difference in the source of the MLH, lidar and radiosonde, the comparison of derived PM2.5_lidar and PM2.5_RS with the in-situ observational PM2.5 data at 0800 LST is presented in Fig.   10a and 10b. MLH from lidar shows reasonably good performance for the retrieval of PM2.5 in the morning, with the correlation coefficients of 0.741 and RMSE 46.69 ug/m 3 . However, the calculated PM2.5 from MLH_RS obviously overestimate the surface pollution, with lower correlation coefficients and larger standard deviation. The large overestimation should be due to 265 the underrating of aerosol layer height. In the morning, MLH_RS is the height of the SBL, above which there still exist some amount of aerosol. Therefore, MLH from lidar, as the good indicator of the aerosol layer height, is more suitable for estimating the surface air pollution from the column-mean optical depths.
As presented in Fig.10c, the calculated PM2.5_lidar data of daytime period (0800-1700 LST) shows better accuracy than that of only the early morning. Actually, the accuracy also shows a diurnal cycle, with the peak of correlation coefficients (0.927) 270 https://doi.org/10.5194/acp-2020-175 Preprint. Discussion started: 16 March 2020 c Author(s) 2020. CC BY 4.0 License.
at 1400 LST (Fig.10d). The high accuracy could lead from aerosol mixing well at noon time, while the relative poor accuracy is related with the complicate layers in the early morning and at dusk. Based on the observational data, PM2.5 tends to peak in the morning and evening. In contrast, afternoon usually witnessed lower mass concentration due to rapid vertical diffusion of aerosols (Guo et al., 2016).Thus, MLH from lidar can offer the significant contribution to retrieve the diurnal circle of the surface air pollution. Additionally, the existing diurnal cycle of MLH data sets from ground-based lidar can facilitate effective 275 analysis of current space-borne lidar missions and to support new space lidar mission around the topic of atmospheric environment at different measurement time.

Conclusion
To acquire the high-resolution observations of MLH diurnal variation, a study using lidar was performed from January 2013 Nearly six year climatology for MLH diurnal cycle is calculated for convective and stable conditions. It is true that the height of the aerosol layer obtained by different approaches may be different. We focus on the temporal change of aerosol layer height with a consistent method using the dataset of MLH. The MLH characteristics of seasonal change in Beijing indicate that it is low in winter and autumn, and high in summer and spring. A significant phenomenon is found that from 2013 to 2016, the 290 diurnal cycle of MLH increased year by year, which may partly be contributed from the improvement of air quality. As to converting the column aerosol optical depth to the surface pollution, MLH from lidar shows better accuracy than that from radiosonde. Additionally, the accuracy with lidar MLH shows a diurnal cycle, with the peak at time of 14 LST.
Actually, interpreting data from aerosol lidar is often not straightforward, because the detected aerosol layers are not always the result of ongoing vertical mixing, but may originate from advective transport or past accumulation processes (Russell et 295 al., 1974;Coulter, 1979;Baxter, 1991;Batchvarova et al., 1999). Each detection method has good performances only for defined ML structures and under specific meteorological conditions. Therefore, only the combination of several methods and instruments allows one to follow the complete diurnal cycle of the complex ML structure.