Comparison of mixed layer heights from airborne high spectral resolution lidar, ground-based measurements, and the WRF-Chem model during CalNex and CARES

The California Research at the Nexus of Air Quality and Climate Change (CalNex) and Carbonaceous Aerosol and Radiative Effects Study (CARES) field campaigns during May and June 2010 provided a data set appropriate for studying the structure of the atmospheric boundary layer (BL). The NASA Langley Research Center (LaRC) airborne high spectral resolution lidar (HSRL) was deployed to California onboard the NASA LaRC B-200 aircraft to aid in characterizing aerosol properties during these two field campaigns. Measurements of aerosol extinction (532 nm), backscatter (532 and 1064 nm), and depolarization (532 and 1064 nm) profiles during 31 flights, many in coordination with other research aircraft and ground sites, constitute a diverse data set for use in characterizing the spatial and temporal distribution of aerosols, as well as the depth and variability of the daytime mixed layer (ML). The paper describes the modified Haar wavelet covariance transform method used to derive the ML heights from HSRL backscatter profiles. HSRL ML heights are validated using ML heights derived from radiosonde profiles at two sites during CARES. Comparisons between ML heights from HSRL and a Vaisala ceilometer operated during CalNex were used to evaluate the representativeness of a fixed measurement over a larger region. In the Los Angeles basin, ML heights derived from HSRL measurements and from the ceilometer show very good agreement (mean bias difference of 10 m and correlation coefficient of 0.89) up to 30 km away from the ceilometer site, but are essentially uncorrelated at larger distances, indicating that the spatial variability of the ML height is significant over these distances and not necessarily well captured by limited ground stations.
The HSRL ML heights are also used to evaluate how well the Weather Research and Forecasting Chemistry (WRF-Chem) community model simulates the temporal and spatial variability of ML heights. When compared to aerosol ML heights from HSRL, thermodynamic ML heights from WRF-Chem were underpredicted in the CalNex and CARES regions, as shown by bias differences of −157 m and −29 m, respectively. Better agreement over the Central Valley than in mountainous regions suggests that some variability in the ML height is not well captured at the 4 km grid resolution of the model. A small but significant number of cases have poor agreement when WRF-Chem consistently overestimates the ML height in the late afternoon. Additional comparisons with WRF-Chem aerosol mixed layer heights show no significant improvement over thermodynamic ML heights, confirming that any differences between measurement and model are not due to the methodology of ML height determination. Published by Copernicus Publications on behalf of the European Geosciences Union. 5548 A. J. Scarino et al.: ML Heights from HSRL and WRF-Chem


Introduction
Measurements of atmospheric boundary layer (BL) height are of key importance as a prognostic variable in regional and global weather forecasting and climate models (Atlas and Korb, 1981) and for assessing these models. The National Research Council (2009) points to inadequacies in current national mesoscale observational capabilities necessary for addressing priorities like forest wildfire smoke dispersion, air quality forecasting, short-range forecasting of high-impact weather, and regional climate modeling. In particular, vertically resolved mesoscale observations are lacking and the report specifically recommends that determining the height of the atmospheric BL should be one of the highest priorities for addressing these inadequacies. There is also interest in BL height research for incorporation into weather and air quality forecasting models and for climate studies. The science plan of the Department of Energy's Atmospheric System Research program (Department of Energy, 2010) highlights the importance of measuring BL heights and, by studying them with respect to aerosol and cloud interactions, topographic features, and tropospheric dynamics, contributing to the development and evaluation of forecasting models.
Since the mid-1960s, scientists have been researching different methods to determine the height of the atmospheric BL within the troposphere (Hosler and Lemmons, 1972; Stull, 1988; Heffter, 1980). The convective boundary layer (CBL) is characterized by roughly uniform vertical profiles of moisture and potential temperature within that layer (Stull, 1988), and so many researchers use potential temperature to indicate BL height, measured, for example, by radiosonde. Atlas and Korb (1981) present the use of aerosol profile measurements made by lidar for determining BL heights, since aerosol gradients can also indicate BL heights where aerosol concentration is sufficient.
In the current study, measurements acquired by the NASA Langley Research Center (LaRC) airborne high spectral resolution lidar (HSRL) during recent science campaigns are used to examine the spatial and temporal variability of BL height and to validate BL heights from the WRF-Chem transport model. The term mixed layer (ML) height is appropriate for the measurement made by lidar (Hayden et al., 1997; Seibert et al., 2000; Stull, 1988; Tucker et al., 2009). Tucker et al. (2009) define the ML as the volume of atmosphere in which aerosol chemical species emitted within the BL are mixed and dispersed. Since all measurements discussed here were collected during the daytime, this terminology is applicable to the airborne HSRL observations. This term will be used for lidar, ceilometer, and radiosonde measurements. Where it is necessary to denote the specific methodology from which the ML height is derived, we will refer to the "aerosol ML height" when discussing the lidar/ceilometer-derived height and the "thermodynamic ML height" when referring to heights derived from potential temperature. These terms also apply to the WRF-Chem model when we discuss modeled backscatter and thermodynamic profiles.
Our study follows a heritage of other studies that have examined how well various models perform when compared with BL heights derived from a radiosonde or a lidar. These studies include evaluations of mesoscale models, NCEP Mesoscale Eta (Angevine and Mitchell, 2001) and MM5 (Bidokhti et al., 2008), with wind profilers in Illinois and Tennessee and a ground-based lidar in Zanjan, Iran, respectively. Both studies showed good correlations between the model and measurements. BL heights from global circulation models have also been evaluated by two satellite-based lidars. Measurements from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) on the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite were used to validate the Goddard Earth Observing System-version 5 (GEOS-5) Modern-Era Retrospective analysis for Research and Applications (MERRA) (Jordan et al., 2010), and the Geoscience Laser Altimeter System (GLAS) was used to evaluate the European Centre for Medium-Range Weather Forecasts (ECMWF) model (Palm et al., 2005). ECMWF was found to underpredict the observed GLAS BL heights, whereas GEOS-5 both overpredicted and underpredicted the BL heights from CALIPSO due to land and water interactions.
The airborne HSRL provides an opportunity to quantify spatial and temporal variations in ML heights that are difficult to obtain by other methods that rely on fixed sites. ML heights derived from HSRL are particularly useful because they capture the spatial and temporal evolution of the mixed layer, which cannot be obtained from fixed-site measurements, and can therefore be used to evaluate meteorological models such as WRF-Chem more rigorously.
The current study focuses on data sets from two field campaigns in California that occurred from May to June 2010. The California Research at the Nexus of Air Quality and Climate Change (CalNex) campaign during May and June 2010 focused on air quality in the Los Angeles basin, and the Carbonaceous Aerosol and Radiative Effects Study (CARES) took place in the Sacramento, CA, region during June 2010. The NASA LaRC airborne HSRL participated in CalNex in May 2010 and in CARES in June 2010. The extensive suite of measurements and modeling during these campaigns presents the opportunity for assessing the WRF-Chem model in this region and provides insight into the horizontal variability of ML height and the representativeness of localized ML height measurements.
The CalNex Science White Paper (National Oceanic and Atmospheric Administration, 2008) lists several science questions that relate to the transport and meteorology of the basin. Having extensive data on the BL height is useful in answering these science questions since it defines the vertical extent of mixing (e.g., dilution) of trace gases and aerosols in the boundary layer. The Los Angeles basin is bordered by the San Gabriel Mountains, the San Bernardino Mountains, and   1 summarizes the specifics of the eight flights.
One of the objectives for the CARES campaign was to study the regional-scale transport and mixing of the Sacramento urban plume (Zaveri et al., 2012). Fast et al. (2012) discuss the dominant meteorological conditions over central California during CARES encompassing Sacramento, San Francisco, and the Sierra Nevada. The CARES campaign used two primary aircraft, the DOE G-1 and the NASA King Air B-200. Since the CARES schedule overlapped with CalNex, the NOAA WP-3 and Twin Otter aircraft participated in some of the central California flights. There were also two ground sites utilized during CARES: the T0 site, located in Sacramento (38.65° N, 121.35° W; 30 m m.s.l.), and the T1 site, located in Cool, CA (38.87° N, 121.02° W; 454 m m.s.l.). Comprehensive suites of instruments at these ground sites measured meteorological parameters, trace gases, optical properties of aerosols, aerosol composition, aerosol size distributions, and solar radiation. Radiosondes were launched from these ground sites four times per day on the days with science flights to capture the evolution of the atmosphere. In total there were 23 science flights by the NASA B-200 during CARES. Figure 1b shows ground tracks of the B-200 flights during CARES, along with three regions discussed in Sect. 3.3 and the locations of the ground sites.
This paper presents the methodology used to derive ML heights from airborne HSRL measurements of aerosol backscatter and describes how these ML heights are used to evaluate modeled ML heights from the 2010 CalNex and CARES field campaigns. Portions of the flights during these field campaigns occurred over complex terrain in California. Section 2 describes the data products from the airborne HSRL instrument and the WRF-Chem model, and provides an overview of the methods used to calculate and compare the ML heights. Section 3 discusses these analyses in the context of the CalNex and CARES campaigns and summarizes the HSRL ML height values in comparison with the ML heights from radiosondes, a ceilometer, and the WRF-Chem model. Lastly, Sect. 4 summarizes the comparison between the measured and modeled ML heights and discusses how these results may guide future model developments.

Determination of mixed layer heights from HSRL
The primary data set for this paper is the ML heights derived from HSRL. The airborne HSRL has acquired extensive data sets of aerosol extinction at 532 nm, backscatter at 532 nm and 1064 nm, depolarization at 532 nm and 1064 nm, and aerosol optical depth (AOD) at 532 nm (Hair et al., 2008; Rogers et al., 2009). The instrument has flown aboard the NASA LaRC King Air B-200 and UC-12 aircraft on 349 science flights collecting 1142.4 h of data during 21 field campaigns in North America since 2006. Aerosol ML heights are derived for HSRL daytime measurements using an automated technique that utilizes a Haar wavelet transform with multiple wavelet dilations (Brooks, 2003) to identify the sharp gradients in aerosol backscatter profiles located at the top of the ML, modified to identify the lowest, not the strongest, significant aerosol gradient, as detailed in the following paragraphs. We have used a modified version of Brooks' technique routinely across fifteen flight campaigns (212 flights, 729 h) and have found that it identifies the ML height accurately 85 to 90 % of the time under a variety of meteorological and terrain conditions. Manual adjustment of the ML heights is performed when necessary, as discussed later in this section.
In this study, aerosol backscatter profiles (532 nm) derived from the HSRL measurements are the input data for the wavelet algorithm. These profiles are computed every 0.5 s using a 10 s running average of the HSRL 532 nm backscatter data (Hair et al., 2008). The aerosol backscatter values are averaged over ∼1000 m horizontal and 30 m vertical resolution (Rogers et al., 2009). Clouds were removed from the analyses because they can produce especially strong signal gradients that can be misinterpreted by the wavelet algorithm as the ML top. For cloud removal, a simpler algorithm is used which follows Davis et al. (2000); a convolution of the measured signal with the Haar wavelet is used to identify cloudy profiles in the lidar data. We use a flight-by-flight adaptive threshold on the convolution that separates the cloud gradients from the weaker aerosol gradients (Burton et al., 2010).
To identify ML heights, the averaged cloud-free backscatter profiles are then used in the wavelet transform algorithm based on the method described by Brooks (2003). Brooks' (2003) technique is an improvement over previous studies using wavelets (Davis et al., 2000; Cohn and Angevine, 2000), which were effective where the vertical backscatter gradient is small everywhere except at the aerosol layer top, but which can produce a bias in the ML height estimates when a gradient is present above or within the ML (Brooks, 2003). Brooks' (2003) algorithm computes a wavelet transform at multiple dilations (i.e., spatial extents) to determine the lower (H1) and upper (H2) limits of the transition zone, as well as the altitude of the maximum of the wavelet transform (H3). Brooks (2003) indicated that H1, the lower limit of the transition zone, represents the top of the well-mixed aerosol layer. However, the altitude of the maximum wavelet transform, H3, is used more widely to identify the ML height and is more closely related to the methods used by the other techniques. Therefore, in this paper we follow the same convention and use H3 as the ML height; for completeness, the transition zone limits H1 and H2 are also included in our product. Brooks (2003) demonstrated his procedure using airborne backscatter lidar data acquired over relatively shallow marine boundary layers. In more varied meteorological conditions and over terrain, the mixed layer height can be more difficult to identify due to multiple sharp gradients in the profile that can correspond to elevated aerosol layers (lofted layers or residual layers), if these have stronger edge gradients than the boundary layer. In order to address this problem, an additional layer of logic is added for the HSRL algorithm that searches for local maxima in the convolution greater than an empirically determined threshold value, and chooses the one at the lowest altitude, rather than the overall maximum. This eliminates errors due to strong edges associated with elevated aerosol layers. Where feasible, the search for the correct edge gradient is also limited using the results for the transition zone from the previous minute's data. This restriction eliminates many of the false ML height detections; however, it gives less stable results in cloudy cases because the clouds interrupt the 1 min window. Because of the large changes in ML height between land and water, the results computed over water are not used as a limit on the results over land, and vice versa.
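The selection of the lowest significant maximum, rather than the overall maximum, can be illustrated with a short sketch. The Python code below is a minimal illustration under stated assumptions, not the operational HSRL algorithm: it computes a Haar wavelet covariance transform at a single dilation on a uniform height grid and returns the lowest local maximum above a threshold. The function names, the synthetic profile, and the threshold of 0.3 are hypothetical; the text states only that the threshold is determined empirically.

```python
def haar_covariance(z, beta, dilation):
    """Haar wavelet covariance transform of beta(z) at one dilation.

    Assumes a uniform height grid. W is positive where backscatter
    decreases with height, as at the top of an aerosol layer.
    """
    dz = z[1] - z[0]
    half = max(1, int(round(dilation / (2.0 * dz))))  # bins per half-window
    w = [0.0] * len(z)
    for i in range(half, len(z) - half):
        below = sum(beta[i - half:i])   # +1 lobe of the Haar function
        above = sum(beta[i:i + half])   # -1 lobe of the Haar function
        w[i] = (below - above) * dz / dilation
    return w


def lowest_significant_maximum(z, w, threshold):
    """Height of the lowest local maximum of w that reaches the threshold."""
    for i in range(1, len(w) - 1):
        if w[i] >= threshold and w[i] >= w[i - 1] and w[i] > w[i + 1]:
            return z[i]
    return None


def synthetic_profile():
    """Hypothetical profile: mixed layer to 1200 m, lofted layer at 2400-2700 m."""
    z = [30.0 * k for k in range(134)]  # 30 m vertical bins
    beta = []
    for h in z:
        if h < 1200.0:
            beta.append(2.0)            # well-mixed layer
        elif 2400.0 <= h < 2700.0:
            beta.append(3.0)            # lofted layer with a stronger edge
        elif h < 2400.0:
            beta.append(0.2)            # clean air between the layers
        else:
            beta.append(0.1)            # free troposphere
    return z, beta


z, beta = synthetic_profile()
w = haar_covariance(z, beta, dilation=900.0)   # 900 m: the over-land dilation
ml_height = lowest_significant_maximum(z, w, threshold=0.3)
strongest_edge = z[w.index(max(w))]
print(ml_height, strongest_edge)
```

In this synthetic case the overall maximum of the transform sits at the top of the lofted layer, while the lowest significant maximum sits at the top of the mixed layer, which is the behavior the added logic is meant to provide.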
In the HSRL algorithm, we increased the dilation value used for searching for the maximum gradients. We found that larger dilation values of 900 m over land and 360 m over water generally capture the gradient at the boundary layer top well, without false positives due to noise. The top and bottom of the transition zone are then found using Brooks' (2003) algorithm, which uses smaller dilation values to home in on these boundaries.
Even with the modifications, complicated aerosol structures within and/or above the ML, or clouds at the top of the ML, can potentially prevent the algorithm from producing satisfactory results. Every curtain of backscatter profiles is visually inspected. If the algorithm chooses an edge gradient that does not appear to be associated with the mixed layer top, a manually selected height can be logged. The heights produced by the automated algorithm are also considered in this manual determination; thus the set of manual heights is not independent of the set of heights determined from the automated method. The H3 altitudes determined from the automated algorithm and the ML heights determined from the manual inspection are combined to produce a set of "best estimate" ML heights, equal to the automated estimate where they agree within 300 m, and equal to the manual estimate where they do not agree. For the CalNex and CARES campaigns, the ML heights were determined manually 10-15 % of the time. Indeed, in many of the cases where the automated algorithm failed to give satisfactory results, it is also difficult to accurately locate the ML height even by visual inspection. In cases where the ML is not marked by a strong aerosol gradient, the algorithm will not produce a good estimate of ML height.
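The rule for combining automated and manual heights can be written compactly. The following Python sketch illustrates the 300 m agreement rule stated above; the function name is hypothetical, and the handling of a missing automated or manual value is an assumption that the text does not specify.

```python
def best_estimate(auto_h3, manual, tolerance_m=300.0):
    """Combine automated (H3) and manual ML heights into a best estimate.

    Uses the automated height where the two agree within tolerance_m and
    the manual height otherwise. Returning whichever value exists when one
    is missing (None) is an assumption made for this sketch.
    """
    if manual is None:
        return auto_h3
    if auto_h3 is None:
        return manual
    if abs(auto_h3 - manual) <= tolerance_m:
        return auto_h3
    return manual


# Automated and manual heights agree within 300 m: keep the automated value.
print(best_estimate(1200.0, 1350.0))
# Automated height locked onto a lofted layer: fall back to the manual value.
print(best_estimate(2700.0, 1200.0))
```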

Thermodynamic mixed layer height calculation from radiosonde
Since these ML heights from HSRL have not been previously published, we demonstrate their validity in Sect. 3.1 by comparison with thermodynamic ML heights from radiosonde measurements. Thermodynamic ML heights are derived from radiosonde profiles by using a technique described by Heffter (1980) based on locating a critical inversion in the potential temperature using the lapse rate and the inversion strength. Various criteria for minimum lapse rate and temperature difference for this algorithm have been proposed for different regions by, for example, Heffter (1980), Delle Monache et al. (2004), and Hayden et al. (1997). In this study, we use the Hayden et al. (1997) values, since these are appropriate for complex terrain similar to the conditions found in the radiosonde launch locations during the CARES campaign. These are 0.002 K m−1 for the minimum inversion lapse rate and 1 K for the temperature difference.
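A critical-inversion search of this kind can be sketched as follows. This Python example is a minimal illustration that uses the two criteria quoted above (0.002 K m−1 and 1 K); the function name and the test profile are hypothetical. The choice to report the lowest level within the inversion at which potential temperature exceeds its value at the inversion base by the temperature-difference criterion is an assumption made here, not a detail given in the text.

```python
def critical_inversion_height(z, theta, min_lapse=0.002, min_dtheta=1.0):
    """Lowest critical inversion in a potential temperature profile.

    z and theta are ordered from the surface upward (m, K). An inversion
    is a run of consecutive layers with d(theta)/dz >= min_lapse; it is
    critical if theta rises by at least min_dtheta across the run.
    """
    n = len(z)
    i = 0
    while i < n - 1:
        lapse = (theta[i + 1] - theta[i]) / (z[i + 1] - z[i])
        if lapse < min_lapse:
            i += 1
            continue
        base = i
        top = i + 1
        # Extend the inversion upward while the lapse-rate criterion holds.
        while top < n - 1 and (
            (theta[top + 1] - theta[top]) / (z[top + 1] - z[top]) >= min_lapse
        ):
            top += 1
        if theta[top] - theta[base] >= min_dtheta:
            # Assumed convention: first level min_dtheta warmer than the base.
            for k in range(base + 1, top + 1):
                if theta[k] - theta[base] >= min_dtheta:
                    return z[k]
        i = top  # inversion too weak; keep searching above it
    return None


# Hypothetical sounding: well mixed to 1000 m, stable (0.005 K/m) above.
z = [100.0 * k for k in range(21)]
theta = [300.0] * 11 + [300.0 + 0.5 * k for k in range(1, 11)]
print(critical_inversion_height(z, theta))
```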

Mixed layer height calculation from ceilometer
In Sect. 3.2, we compare HSRL ML heights with ML heights from a ceilometer. As with HSRL, the ceilometer ML heights are based on aerosol backscatter gradients. ML heights calculated from ceilometers are limited to mostly cloud-free conditions because the laser pulse is attenuated when it hits dense clouds (Haman et al., 2012). The Vaisala ML height algorithm (version 3.5) was used to analyze the ceilometer backscatter coefficient to identify aerosol structures and determine the height of the ML using the negative gradient method (Münkel et al., 2007). Additional information on the proprietary Vaisala ceilometer algorithm used in this analysis can be found in Haman et al. (2012).

WRF-Chem configuration
As described in Fast et al. (2012), the WRF-Chem model (Grell et al., 2005; Fast et al., 2006) was used to provide operational support for the CARES campaign by providing high-resolution forecasts of wind and tracer dispersion resulting from carbon monoxide emitted from urban locations. The domain configuration used in this study is identical to Fast et al. (2009), with a horizontal grid spacing of 4 km that encompasses all of California and the surrounding region. The specific parameterizations used in this study are listed in Table 3.
As in Fast et al. (2012), the Mellor-Yamada-Janjić (MYJ) scheme (Janjić, 1990, 2002) is used to represent BL mixing. Several other ML parameterizations that employ either turbulence kinetic energy or non-local closure approaches are available in WRF; however, it is beyond the scope of this paper to compare the performance of multiple ML parameterizations using the lidar measurements (see LeMone et al., 2013, for evaluation of ML schemes). Some of the meteorological parameterizations also differ from those in Fast et al. (2012) because of future plans to study aerosol direct and indirect effects that require coupling aerosols to certain radiation and cloud schemes.
A continuous simulation from 1 May to 30 June 2010 was performed. Initial and boundary conditions for the meteorology were obtained from the analyses of the National Centers for Environmental Prediction's North American Mesoscale (NAM) model, while initial and boundary conditions for trace gases and aerosols were obtained from the global MOZART model (Emmons et al., 2010). The large-scale synoptic conditions were also constrained by the NAM model analyses from ∼4 km above sea level to the model top at ∼12 km. The Model for Simulating Aerosol Interactions and Chemistry (MOSAIC) (Zaveri et al., 2008) and SAPRC-99 photochemical mechanism (Carter, 2000) were used to simulate regional-scale evolution of aerosols and trace gases, respectively. Secondary organic aerosol formation was represented by a simplified two-species volatility basis set approach as described by Shrivastava et al. (2011). Anthropogenic emissions were obtained from the California Air Resources Board and were developed for the 2008 NASA Arctic Research of the Composition of the Troposphere from Aircraft and Satellites (ARCTAS) mission over California (Jacob et al., 2010).
The performance of the near-surface simulated meteorology was very similar to that described in Fast et al. (2012) and the differences in the meteorological parameterizations did not lead to substantial changes in the overall model performance (not shown). For example, the model reproduced the observed variability in the synoptic conditions and near-surface diurnal variation in thermally driven flows associated with complex terrain and land-sea contrasts. Details of the performance of simulated aerosol mass, composition, and size distribution compared with the extensive CARES and CalNex field campaign observations will be presented in another study.
WRF-Chem ML heights are determined by two methods from the instantaneous model output at hourly intervals. The first method, which is widely used by the atmospheric community, uses a critical potential temperature gradient to identify the top of the CBL. We estimate the ML height using a Heffter (1980) technique with a critical gradient of 0.001 K m−1 as in Delle Monache et al. (2004); however, this method can identify multiple layers for weak vertical potential temperature gradients. In this case, sharp vertical gradients in specific humidity are used to define the simulated ML height. The specific humidity is usually significantly higher in the CBL than in the free troposphere and sharp vertical gradients occur in the first layer above the ground, which can be identified by the critical potential temperature gradient. Software tools from the Aerosol Modeling Testbed (Fast et al., 2011) are used to extract the potential temperature and humidity profiles and interpolate those profiles in time along the aircraft flight path. The second method is identical to the HSRL wavelet algorithm described in Sect. 2.1, except that simulated profiles of backscatter are used. As described in Fast et al. (2006) and Barnard et al. (2010), simulated aerosol properties and Mie theory are used to compute backscatter and extinction at four wavelengths: 300, 400, 600, and 1000 nm. To compare simulated backscatter and aerosol profiles with the HSRL measurements during CARES and CalNex, the Aerosol Modeling Testbed software tools are also used to interpolate the simulated quantities in space and time to match the aircraft flight paths. The Ångström relationship is then used to interpolate the simulated optical properties to 532 nm for direct comparisons with lidar measurements. As will be shown later, both methods are used to compare with the measured HSRL aerosol ML heights. While the first method is more commonly used by meteorological studies, the second method is more closely equivalent to the lidar measurements.
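The wavelength interpolation step can be illustrated with a short sketch. The Python code below applies the Ångström power-law relationship to estimate a value at 532 nm from values at two model wavelengths. Using the 400 and 600 nm pair that brackets 532 nm is an assumption made for this example, as are the function names and the sample values; the text does not state which wavelengths enter the interpolation.

```python
import math


def angstrom_exponent(value1, value2, lam1, lam2):
    """Angstrom exponent from an optical property at two wavelengths."""
    return -math.log(value2 / value1) / math.log(lam2 / lam1)


def interpolate_angstrom(lam, value1, value2, lam1=400.0, lam2=600.0):
    """Power-law interpolation of an optical property to wavelength lam.

    Assumes value(lam) = value1 * (lam / lam1) ** (-alpha) between the
    two wavelengths, with alpha the Angstrom exponent of the pair.
    """
    alpha = angstrom_exponent(value1, value2, lam1, lam2)
    return value1 * (lam / lam1) ** (-alpha)


# Hypothetical simulated backscatter values at 400 and 600 nm (arbitrary units).
beta_400 = 3.0
beta_600 = 1.8
beta_532 = interpolate_angstrom(532.0, beta_400, beta_600)
print(beta_532)
```

By construction the interpolated value reproduces the inputs at 400 and 600 nm and lies between them at 532 nm.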

ML height comparisons
Comparisons of ML heights are conducted in different ways depending on whether the HSRL heights are compared with model output or with measurements from the ground sites. The resolution of the ML heights from the HSRL aerosol backscatter profile is ∼1000 m horizontal and 30 m vertical. For the HSRL ML height comparisons to the radiosonde and ceilometer, a separation distance and a temporal difference are used to determine coincident pairs for the comparison. In the comparison with the radiosonde profiles, a separation distance of 15 km and temporal difference of 30 min are used, as discussed in Sect. 3.1. The ceilometer comparison uses separation distances from 0 to 50 km and a temporal window of 15 min, as discussed in Sect. 3.2. WRF-Chem has a horizontal grid spacing of 4 km with hourly output and is extracted along the HSRL flight track using the Aerosol Modeling Testbed software tools. This provides a direct comparison of matched times and locations between HSRL and WRF-Chem.
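The coincidence screening for the ground-site comparisons can be sketched as follows. This Python example is an illustration under stated assumptions rather than the processing code used in the study: it computes great-circle distances with the haversine formula and, for one ground-site observation, returns the lidar point of closest approach inside the distance and time limits. The function names, the tuple layout, and the sample points are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_coincident(lidar_points, site_lat, site_lon, site_time_min,
                       max_km, max_dt_min):
    """Closest lidar point within the distance and time limits, or None.

    lidar_points holds (time_min, lat, lon, ml_height_m) tuples.
    Returns (distance_km, ml_height_m) for the closest qualifying point.
    """
    best = None
    for t, lat, lon, height in lidar_points:
        if abs(t - site_time_min) > max_dt_min:
            continue
        dist = haversine_km(lat, lon, site_lat, site_lon)
        if dist <= max_km and (best is None or dist < best[0]):
            best = (dist, height)
    return best


# Hypothetical lidar points near the T0 site (38.65 N, 121.35 W).
points = [
    (600.0, 38.65, -121.25, 1100.0),  # close in space and time
    (605.0, 39.15, -121.35, 1500.0),  # too far away
    (700.0, 38.65, -121.35, 900.0),   # overhead, but too late
]
# Radiosonde criteria from the text: 15 km and 30 min.
print(closest_coincident(points, 38.65, -121.35, 610.0, 15.0, 30.0))
```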

Radiosonde and HSRL mixed layer heights
The CARES campaign in the Sacramento region provides the opportunity for verifying the applicability of HSRL-derived aerosol ML heights by validating them with thermodynamic ML heights derived from radiosondes. The T0 and T1 sites were located in the Central Valley and at the foothills of the Sierra Nevada, respectively, so we conducted our analysis of these sites separately. Since radiosonde launches did not exactly correspond with the aircraft overpass times, we limited our comparison data to a separation distance of 15 km and a temporal difference of 30 min between the aircraft and the radiosondes. Figure 2 summarizes the comparisons of ML

Comparison with ceilometer and investigation of spatial variability
Evaluations of ML heights above ground level (a.g.l.) from the ceilometer at the Pasadena, CA, ground site and HSRL were performed to compare the two instruments' measurements against each other and to better understand the extent to which the ceilometer measurements of ML height were indicative of the ML height throughout the CalNex study area. HSRL data were screened to find the points of closest spatial approach between the B-200 aircraft and the Pasadena ground site, within ±7.5 min of the time of a ceilometer measurement and within various distances. Note that the ceilometer typically made a measurement every 15 min. Examination of the ceilometer ML heights during CalNex (not shown) indicates that the ML only changed by approximately 100 m at most during a 15 min span, so a 15 min window for comparisons between HSRL and ceilometer heights is appropriate. Figure 3a shows comparisons limited to 30 km spatial separation. Eighteen overpasses satisfy this strict coincidence criterion and the agreement is very good, with an R of 0.89, an RMS difference of 108 m, and a bias difference (HSRL minus ceilometer) of −9.7 m within that separation distance. Generally, there is at most one ceilometer in a given geographical area, so it is important to investigate whether the ML height derived from a ceilometer is regionally representative. The ceilometer during CalNex was only active during the field campaign date range, as is the case with most ceilometers, since an active ceilometer network does not exist in the United States (Demoz et al., 2013). HSRL made measurements over the entire region, many of them temporally coincident with the ceilometer but separated by various distances. The variability of the ML height over the larger region was examined by looking at HSRL-derived ML heights as the B-200 flew within ±7.5 min of each ceilometer measurement, though not limited by separation distance. It was found that the ML height can vary by large amounts, up to as much as 2 km, in locations surrounding the ceilometer. This is likely due to variations of ML heights with the complex terrain, differing meteorological conditions over the ocean and mountains, and the transport of pollutants within the study region (Duong et al., 2011). Repeating the comparison between HSRL and ceilometer ML heights, we see that the agreement breaks down quickly at separations beyond 30 km. Figure 3b illustrates this point with a comparison of points that are within ±7.5 min but separated by 30-50 km spatially. We find that the measurements are essentially uncorrelated, since the comparison produces an R of 0.08, an RMS difference of 554 m, and a bias difference (HSRL minus ceilometer) of 234 m for points between 30 and 50 km. We emphasize that both the ceilometer and HSRL ML heights are calculated above ground level, which maximizes the correlation, as expected. However, at distances beyond 30 km from the Pasadena ground site the B-200 flew over ocean and high-altitude terrain, and large differences in altitude and in surface characteristics can dramatically affect ML growth. Therefore, in Fig. 3c, the comparison points are further limited to those where the ground altitude at the HSRL location was within ±200 m of the altitude of the Pasadena ground site. With this further limitation, good agreement can be found at separation distances up to approximately 100 km. Up to 50 km, R is 0.83 for 23 comparison points; from 50 to 100 km, R is 0.69 for 9 points; and from 100 to 150 km, R is 0.36 for 10 points. This supports the assumption that at least part of the regional variability of ML height is due to differences in terrain.
Figure 4 further illustrates that a point measurement of ML height (i.e., from a ceilometer) may not be indicative of ML behavior even in areas very close to the ceilometer (and at similar altitude) and that the spatial variability changes on different days. Figure 4a shows a case where the ceilometer ML height is a good analog for a large area of the LA basin, with differences between HSRL and ceilometer ML heights of less than 100 m over the entire region except over water and the coast. Yet, the next day, illustrated in Fig. 4b, there is significant disagreement of more than 1000 m between the ceilometer and ML heights measured just tens of kilometers away. We believe this abrupt change in variability is probably due to a change in the wind direction and the transport of aerosols through the basin.
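The three statistics quoted in this comparison (correlation coefficient R, RMS difference, and bias difference) can be computed as in the following Python sketch. The function name and the sample values are hypothetical, and computing the RMS difference from the raw paired differences without removing the bias is an assumption, since the text does not define it.

```python
import math


def comparison_stats(hsrl, other):
    """Bias (hsrl minus other), RMS difference, and Pearson R for paired data."""
    n = len(hsrl)
    diffs = [a - b for a, b in zip(hsrl, other)]
    bias = sum(diffs) / n
    # Assumed definition: RMS of the raw differences, bias not removed.
    rms = math.sqrt(sum(d * d for d in diffs) / n)
    mean_h = sum(hsrl) / n
    mean_o = sum(other) / n
    s_ho = sum((a - mean_h) * (b - mean_o) for a, b in zip(hsrl, other))
    s_hh = sum((a - mean_h) ** 2 for a in hsrl)
    s_oo = sum((b - mean_o) ** 2 for b in other)
    r = s_ho / math.sqrt(s_hh * s_oo)
    return bias, rms, r


# Hypothetical coincident ML heights in meters a.g.l.
hsrl_heights = [1000.0, 1200.0, 1400.0, 900.0]
ceilometer_heights = [1100.0, 1150.0, 1500.0, 950.0]
bias, rms, r = comparison_stats(hsrl_heights, ceilometer_heights)
print(bias, rms, r)
```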

Evaluation of WRF-Chem mixed layer heights
Mixed layer heights from HSRL provide an opportunity to assess model simulations from WRF-Chem over a large geographical domain. Figures 5 and 6 show scatter plots of HSRL aerosol ML heights and WRF-Chem simulated thermodynamic heights for the CalNex and CARES campaigns, respectively. ML heights are presented in meters above ground level (a.g.l.) for both data sets. The temporal resolution of the comparisons is 10 s, corresponding to approximately 1000 m at a typical B-200 flight speed. The number of points in these comparisons corresponds to the HSRL resolution of the backscatter profiles (1 point ≈ 1 km of airborne HSRL data). Figure 5 shows ML height comparisons for all flights during CalNex. The region from which the data used in this plot were obtained is bounded by the 35° N line (see Fig. 1a) to include only the flights in the Los Angeles (LA) basin. A bisector regression produces an R of 0.58 with a bisector slope of 0.87 and intercept of −8.6 m. A bias difference value of −157 m indicates the WRF-Chem ML heights were underpredicted. Statistical results from this comparison are found in Table 4.
Figure 6 shows a similar comparison of WRF-Chem thermodynamic ML height and HSRL aerosol ML height for all flights during CARES. The bisector regression produces an R of 0.63 with a bisector slope of 1.35 and an intercept of −340 m, with a small bias difference of −29 m. WRF-Chem again underpredicts the ML height when the ML height is low, but tends to overpredict when the ML height is large. Statistical results from this comparison are found in Table 4.
A potential factor affecting the accuracy of the WRF-Chem simulation is the complexity of the terrain. Complex terrain and bodies of water introduce larger uncertainties in the simulated interaction of surface fluxes, boundary layer mixing, and ambient winds, and there are local variations in the ML depth that the model may not be able to resolve using a grid spacing of 4 km. To investigate this further, the CARES domain was split into three regions for separate analysis of the ML height values from both HSRL and WRF-Chem: (A) the San Francisco Bay region (including the southern Coast Range), (B) the Central Valley (including Sacramento and the T0 ground site), and (C) the Sierra Nevada (including the T1 ground site). Figure 1b shows how the regions are divided and the locations of the ground sites. Statistical results from the comparisons in the three regions separately are found in Table 4, along with the results for all regions together. Of the three regions, the worst agreement is found in region A, the San Francisco Bay and southern Coast Range area. The regression for region A has better agreement in the slope but a relatively low correlation coefficient. Examination of specific cases shows that the comparisons over the bay itself are of similar quality to adjacent land regions, but flight segments in the southern Coast Range often have very poor agreement. In these cases, both the HSRL aerosol ML height and the WRF-Chem thermodynamic ML height estimates are placed at very high altitudes (a.g.l.). We can understand this by looking at HSRL backscatter curtains. The mountain range reaches above the local boundary layer and there is very little aerosol above the mountains. Therefore, the retrieved ML height often indicates the top of the residual layer or weak structures in the free troposphere, and these tenuous features have much greater variability. The thermodynamic ML heights from WRF-Chem also reach very high altitudes (a.g.l.) in this situation, but show poor agreement with the aerosol ML heights observed by HSRL.
In the Sierra Nevada region, the correlation is better, but the slope and intercept reveal relatively poor agreement. Examination of specific cases in the Sierra Nevada region (not shown) suggests that the ML height in these mountainous regions is as well characterized in a regional sense as elsewhere (such as the Central Valley), but the greater variability of the ML height in the mountains is not well captured by the simulation. This suggests that the 4 km grid spacing is too coarse to resolve local variations of the ML height.
The Central Valley region also has a large slope, but in this case it reflects poor agreement on a limited number of flights where WRF-Chem distributes the aerosol over a much taller column than is observed by HSRL. One of these cases is illustrated in Fig. 7, which shows the HSRL aerosol backscatter and modeled WRF-Chem aerosol backscatter curtains from the 14 June 2010 afternoon flight. The first half of the flight was a raster pattern in region B (Central Valley), and the second part of the flight was mostly in region A (San Francisco Bay and southern Coast Range). In the Central Valley region on this particular day, the WRF-Chem simulation distributes aerosol up to 1.5 or even 2 km, whereas HSRL places the ML top much lower, below 1 km, in the valley.
To make a more direct one-to-one comparison and to separate potential issues with ML height determination from errors in the simulation of aerosol, another experiment was performed in which ML heights derived from WRF-Chem simulated backscatter were compared to the HSRL ML heights for the CARES flights (results in Table 4). Figure 7 demonstrates that although the WRF-Chem simulations of aerosol backscatter and aerosol ML height are generally in good agreement with the HSRL measurements, the simulations sometimes have difficulty in accurately forecasting the vertical extent of aerosols in the ML as well as the magnitude of aerosol backscatter both in the ML and in the free troposphere. In addition to the treatment of atmospheric chemistry, particularly secondary organic aerosol, emissions of primary aerosols and aerosol precursors over California and boundary conditions also affect predictions of aerosols; this will be described in a subsequent study. These observations are late in the day, and the ML heights in the same region earlier in the day, before the growth of the boundary layer, show better agreement.
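Aerosol ML heights of the kind compared here rest on locating the sharp decrease of backscatter with height at the top of the mixed layer. The sketch below shows a basic, single-dilation Haar wavelet covariance transform applied to one backscatter profile, whether measured or simulated. It is not the modified method used for the HSRL retrievals, and it is not known whether the same algorithm was applied to the WRF-Chem backscatter; the function names, the dilation value, and the uniform height grid are assumptions made for illustration.

```python
import numpy as np

def haar_wct(profile, z, dilation, center):
    """Haar wavelet covariance transform of a profile at one center height.

    The Haar function is +1 just below `center` and -1 just above it, so the
    transform is large and positive where backscatter drops with height.
    Assumes a uniformly spaced height grid z (an illustrative simplification).
    """
    h = np.zeros_like(z, dtype=float)
    h[(z >= center - dilation / 2.0) & (z < center)] = 1.0
    h[(z > center) & (z <= center + dilation / 2.0)] = -1.0
    dz = z[1] - z[0]
    return np.sum(profile * h) * dz / dilation

def ml_height_from_backscatter(profile, z, dilation=300.0):
    """Height of the maximum of the transform, taken as the ML top.

    The default dilation (in the units of z) is a hypothetical value.
    """
    profile = np.asarray(profile, dtype=float)
    z = np.asarray(z, dtype=float)
    half = dilation / 2.0
    # Only evaluate centers where the full wavelet fits inside the profile.
    centers = z[(z >= z[0] + half) & (z <= z[-1] - half)]
    w = np.array([haar_wct(profile, z, dilation, b) for b in centers])
    return centers[np.argmax(w)]
```

A single global maximum is adequate for an idealized profile, but it can lock onto a residual layer or an elevated aerosol layer, which is one reason operational retrievals add further constraints.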
These cases notwithstanding, WRF-Chem and HSRL in general show good agreement in the characterization of the diurnal growth of the boundary layer. Figure 8a and b show statistics of the ML heights as a function of time of day for the Los Angeles area during CalNex and for central California during CARES, respectively. During both field campaigns, the hourly median values from the two methods agree to within a few hundred meters throughout most of the day. For CalNex, the largest difference between hourly median values, approximately 700 m, was found in the late afternoon between 16:00 and 17:00 LST. During CARES, the largest difference was approximately 200 m and was found in the morning between 09:00 and 10:00 LST. Upon examination of backscatter curtain plots and flight plans, the large differences in both campaigns are associated with locations over higher terrain and not with diurnal ML height growth.
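The hourly statistics behind this kind of diurnal comparison can be sketched as follows. The example bins ML heights by local hour, computes the 5th, 25th, 50th, 75th, and 95th percentiles in each bin, and finds the hour with the largest difference between the two hourly medians. The function names are hypothetical, times are assumed to be decimal hours in LST, and both inputs are assumed to share at least one hour bin.

```python
import numpy as np

def hourly_percentiles(hour_lst, ml_height, percentiles=(5, 25, 50, 75, 95)):
    """Return {hour: array of percentiles} for ML heights binned by local hour."""
    hour_lst = np.asarray(hour_lst, dtype=float)
    ml_height = np.asarray(ml_height, dtype=float)
    hours = np.floor(hour_lst).astype(int)   # e.g. 16.4 -> the 16:00-17:00 bin
    stats = {}
    for h in np.unique(hours):
        vals = ml_height[hours == h]
        vals = vals[np.isfinite(vals)]       # drop missing retrievals
        if vals.size:
            stats[int(h)] = np.percentile(vals, percentiles)
    return stats

def largest_median_gap(stats_a, stats_b, median_index=2):
    """Hour and size of the largest absolute difference between hourly medians.

    median_index points at the 50th percentile in the default percentile list.
    """
    common = sorted(set(stats_a) & set(stats_b))
    gaps = {h: abs(stats_a[h][median_index] - stats_b[h][median_index])
            for h in common}
    hour = max(gaps, key=gaps.get)
    return hour, gaps[hour]
```

Comparing medians rather than means keeps a handful of poorly defined ML heights over high terrain from dominating the hourly summary, although, as noted above, such cases can still shift the median when they make up most of the samples in a bin.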

Summary
HSRL aerosol backscatter profiles were used to derive aerosol ML heights and assess simulations of the temporal and spatial variability of thermodynamic ML heights in both the CalNex and CARES study regions. The ML height assessments are critical for evaluating the performance of research forecasting models like WRF-Chem when they are used for air quality assessments.
The use of aerosol ML heights from HSRL was shown to be justified by comparisons of the aerosol ML heights derived from HSRL and the thermodynamic ML heights from the radiosonde potential temperature profiles, which showed reasonable agreement. Although the ML heights are derived differently, the two measurements compared well, with an R of 0.94, a slope of 0.94, and a bias difference of 35 m.
The HSRL aerosol ML heights during the CalNex campaign were compared with the corresponding aerosol ML heights computed from a ground-based ceilometer located in Pasadena, CA. Overall these heights agreed well (R is 0.89) when HSRL is within 30 km of the ceilometer, or up to 50 km when considering only regions with terrain height similar (within ± 200 m) to that of the ceilometer location (R is 0.83). Spatial variability in the ML heights leads to significantly poorer comparisons for greater spatial separation, suggesting that ML height measurements at a single location are not representative beyond 30 km away, or 50 km for similar terrain height. Furthermore, the spatial variability of ML heights changes considerably from day to day, implying that the regional representativeness of the ceilometer heights is also highly variable.
HSRL ML heights were used to assess WRF-Chem ML heights. Overall, the WRF-Chem ML heights were underpredicted in the study regions for CalNex and CARES. To evaluate the impact of the complex terrain in the CARES study region, the domain was divided into three regions to see how well the model simulations of the ML performed as a function of location. While there are differences, it is not clear that WRF-Chem performs significantly better or worse in one region or another, but the investigation revealed some patterns in the comparisons that are instructive. There is generally good agreement over the flat terrain in the Central Valley, but on certain days WRF-Chem does not correctly represent the diurnal growth of the mixed layer and distributes aerosol over a much taller ML than measurements indicate, up to twice the measured ML height. In contrast, the complex terrain in the San Francisco Bay and Sierra Nevada regions introduces larger uncertainties in the simulated interaction of surface fluxes, boundary layer mixing, and ambient winds, or there are local variations in the ML depth that the model cannot resolve using a grid spacing of 4 km. In both the Sierra Nevada and San Francisco regions, WRF-Chem underpredicts the ML heights, more so in the San Francisco area, where the bias difference is −263 m. In the Central Valley region, WRF-Chem overpredicts the ML heights, as indicated by the bias difference of 121 m. The disagreement between measurements and models is exacerbated over the southern Coast Range when the mountains interrupt the mixed layer and ML heights over the high terrain are not well defined. Disagreement in these specific cases is probably not very significant in terms of WRF-Chem performance.
To further separate potential differences in ML height methodologies from errors in aerosol prediction by WRF-Chem, an additional comparison was made by computing WRF-Chem ML heights from aerosol gradients instead of thermodynamic gradients. This results in no significant improvement when compared to the HSRL aerosol ML heights. This finding supports the use of the ML height computed from aerosol backscatter gradients as a proxy for the BL. It also suggests that other factors in the modeling and/or HSRL ML height retrieval techniques were responsible for differences between the HSRL and WRF-Chem ML heights. The differences could be due to errors in the timing of convective BL growth in the model, which could be too fast or too slow. There could also be errors in surface fluxes, such as those caused by soil moisture errors, that lead to the ML height being too shallow or too deep.
The results presented here demonstrate that the aerosol ML heights derived from HSRL aerosol backscatter profiles are closely comparable to those derived from radiosonde temperature profiles, and that these HSRL ML heights can be used to evaluate ML heights from other sensors (e.g., ceilometers) and models. As in earlier studies (Baker et al., 2013; Fast et al., 2012), the HSRL aerosol ML heights also provide additional information for validating and improving model ML heights by providing the means to distinguish biases due to BL parameterizations from those due to other factors such as interaction with synoptic meteorology.

Figure 1. (A) Summary of flight tracks during CalNex and the location of the ground site in Pasadena, CA. (B) Summary of flight tracks during CARES along with the three study regions (A: San Francisco, B: Central Valley and C: Sierra Nevada) discussed in Sect. 3.3 and locations of the two ground sites. In both images, the background behind the flight tracks is elevation (GLOBE digital elevation model), where shades of green represent the lowest elevation and brown is the highest elevation.

Figure 2. Scatter and bisector regression plot of HSRL aerosol ML heights and radiosonde thermodynamic ML heights within 15 km and 30 min of the ground sites. The statistics in the upper left corner are for the comparisons when both the T0 and T1 sites are combined.

Figure 3. Scatter and bisector regression plots of HSRL and ceilometer aerosol ML heights as a function of distance of closest approach of the aircraft to the ceilometer. The circles compare the ceilometer measurement to the HSRL data taken at the point of closest approach to the Pasadena ground site. The comparison points in the plot on the far right (Fig. 3c) are further limited to points where the ground altitude at the HSRL location was within ± 200 m of the altitude of the Pasadena ground site.

Figure 4. Google Earth images displaying the absolute differences between the HSRL and ceilometer aerosol ML heights in meters for portions of two flights during CalNex (ΔML height = HSRL − ceilometer). (A) Data on 19 May were taken from approximately 14:00 to 14:10 LST; (B) data on 20 May were taken from approximately 12:45 to 13:15 LST.

Figure 5. Scatter and bisector regression plot of WRF-Chem thermodynamic ML heights and HSRL aerosol ML heights across all flights during CalNex. Number of occurrences in each histogram bin is shown in color. The bias difference and RMS difference are calculated as WRF-Chem minus HSRL. The number of points in these comparisons corresponds to the HSRL resolution of the backscatter profiles for the 8 CalNex flights (1 point ≈ 1 km of airborne HSRL data).

Figure 6. Scatter and bisector regression plot of WRF-Chem thermodynamic ML heights and HSRL aerosol ML heights across all flights during CARES. The PBL heights for WRF-Chem are derived from potential temperature. Number of occurrences in each histogram bin is shown in color. The bias difference and RMS difference are calculated as WRF-Chem minus HSRL. The number of points in these comparisons corresponds to the HSRL resolution of the backscatter profiles for the 23 CARES flights (1 point ≈ 1 km of airborne HSRL data).

Figure 7. (Top) aerosol backscatter measured from HSRL with aerosol ML heights derived from aerosol backscatter and (bottom) simulated aerosol backscatter from WRF-Chem with thermodynamic ML heights (in black) and aerosol ML heights (in white); data are from the second research flight on 14 June 2010.

Figure 8. Diurnal variation for HSRL aerosol ML heights and WRF-Chem thermodynamic ML heights over the entire (A) CalNex and (B) CARES campaigns. Filled boxes denote the 25th and 75th percentiles and vertical lines denote the 5th and 95th percentiles. Lines connecting the white dots denote the median value for each hour. The blue and red boxes are gridded by time and offset for clarity.

Table 1. Summary of HSRL flights during CalNex. Ceilometer data are available on all flight days.
Table 2 summarizes the 23 flights.

Table 2. Summary of HSRL flights during CARES.

Table 3. Selected WRF-Chem configuration options used for this study.

Table 4. Statistics of the HSRL aerosol ML and WRF-Chem thermodynamic ML height comparisons for CalNex and CARES. The CARES* statistics are from the HSRL and WRF-Chem aerosol ML height comparisons. The bias difference and RMS difference are calculated as WRF-Chem minus HSRL. The number of points in these comparisons corresponds to the HSRL resolution of the backscatter profiles (1 point ≈ 1 km of airborne HSRL data).