Retrieval of total column and surface NO2 from Pandora zenith-sky measurements

Pandora spectrometers can retrieve nitrogen dioxide (NO2) vertical column densities (VCDs) via two viewing geometries: direct Sun and zenith sky. The direct-Sun NO2 VCD measurements have high quality (0.1 DU accuracy in clear-sky conditions) and do not rely on any radiative transfer model to calculate air mass factors (AMFs); however, they are not available when the Sun is obscured by clouds. To perform NO2 measurements in cloudy conditions, a simple but robust NO2 retrieval algorithm is developed for Pandora zenith-sky measurements. This algorithm derives empirical zenith-sky NO2 AMFs from coincident high-quality direct-Sun NO2 observations. Moreover, the retrieved Pandora zenith-sky NO2 VCD data are converted to surface NO2 concentrations with a scaling algorithm that uses chemical-transport-model predictions and satellite measurements as inputs. NO2 VCDs and surface concentrations are retrieved from Pandora zenith-sky measurements made in Toronto, Canada, from 2015 to 2017. The retrieved Pandora zenith-sky NO2 data (VCD and surface concentration) show good agreement with both satellite and in situ measurements. The diurnal and seasonal variations of the derived Pandora zenith-sky surface NO2 data also agree well with in situ measurements (diurnal difference within ±2 ppbv). Overall, this work shows that the new Pandora zenith-sky NO2 products have the potential to be used in various applications, such as future satellite validation in moderately cloudy scenes and air quality monitoring.


Introduction
Nitrogen dioxide (NO2) is an important air pollutant and plays a critical role in tropospheric photochemistry (e.g., ECCC, 2016; EPA, 2014). It is primarily emitted from combustion processes such as fossil fuel combustion and biomass burning, as well as from lightning. NO2 is a nitrate aerosol precursor, and it also contributes to acid deposition and eutrophication (ECCC, 2016). Exposure to NO2 can lead to adverse health effects, such as irritation of the lungs, a decrease in lung function, and an increase in susceptibility to allergens for people with asthma (EEA, 2017; WHO, 2017).
Total vertical column NO2 can be measured by many ground-based UV-visible remote sensing instruments using direct-sun, zenith-sky, or off-axis spectroscopy techniques (Cede et al., 2006; Drosoglou et al., 2017; Herman et al., 2009; Lee et al., 1994; Noxon, 1975; Piters et al., 2012; Roscoe et al., 2010; Vaughan et al., 1997). These measurements are of high quality and good precision, and have been widely used for atmospheric chemistry studies (e.g., Adams et al., 2012; Hendrick et al., 2014) and satellite validations (e.g., Celarier et al., 2008; Drosoglou et al., 2018; Irie et al., 2008; Wenig et al., 2008). Among all these different viewing geometries, direct-sun measurements are of high accuracy, and are not dependent on radiative transfer models (RTMs) to calculate air mass factors (AMFs) (Herman et al., 2009) or on knowledge of other atmospheric constituents. Zenith-sky observations have been widely used for stratospheric ozone and NO2 observations, particularly under cloudy conditions when direct-sun measurements are unreliable. Off-axis measurements have good sensitivity in the boundary layer and could provide tropospheric trace gas profiles and surface concentrations (Frieß et al., 2011; Hendrick et al., 2014; Kramer et al., 2008; Wagner et al., 2011), but they are more sensitive to cloud cover than zenith-sky measurements.
The Pandora sun spectrometer is a new instrument developed to measure vertical column densities (total columns) of trace gases in the atmosphere using sun and sky radiation in the UV-visible part of the spectrum (Herman et al., 2009). One of its primary data products is NO2 total vertical column density (VCD) from the direct-sun viewing mode, where VCD represents the vertically integrated number of molecules per unit area and is reported in units of molec cm⁻² or Dobson Units (1 DU = 2.6870 × 10¹⁶ molec cm⁻²). The Pandora direct-sun NO2 VCD products have been validated through many field campaigns (Flynn et al., 2014; Lamsal et al., 2017; Martins et al., 2016; Piters et al., 2012; Reed et al., 2015), ground-based comparisons (Herman et al., 2009; Wang et al., 2010), and satellite validations (Ialongo et al., 2016; Lamsal et al., 2014).
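The unit relation quoted above can be written as a small helper. This is only an illustrative sketch; the function and constant names are hypothetical and do not come from any Pandora software.

```python
# Conversion factor quoted in the text: 1 DU = 2.6870e16 molec cm^-2.
MOLEC_CM2_PER_DU = 2.6870e16


def du_to_molec_cm2(vcd_du: float) -> float:
    """Convert a vertical column density from Dobson Units to molec cm^-2."""
    return vcd_du * MOLEC_CM2_PER_DU


def molec_cm2_to_du(vcd_molec_cm2: float) -> float:
    """Convert a vertical column density from molec cm^-2 to Dobson Units."""
    return vcd_molec_cm2 / MOLEC_CM2_PER_DU
```

For instance, the nominal 0.1 DU accuracy of the direct-sun product corresponds to about 2.7 × 10¹⁵ molec cm⁻².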
Since their introduction in 2006, Pandora spectrometers have been deployed at more than 50 sites globally. The Pandora no. 103 instrument used in this study has been deployed in Toronto, Canada since 2013 to perform direct-sun measurements (Zhao et al., 2016). Since 2015, the observation schedule of Pandora no. 103 has been modified to perform alternating direct-sun and zenith-sky measurements. Knepp et al. (2017) assessed Pandora's capability to derive stratospheric NO2 using zenith-sky viewing geometry (in twilight periods), but their study was limited to slant column densities (SCDs). At this time, there are no standard Pandora zenith-sky NO2 VCD data products available. As one goal of this work, we have focused on developing a new NO2 retrieval algorithm for zenith-sky measurements to expand Pandora NO2 measurements into cloudy scenes.
In addition to retrieval of zenith-sky total column NO2, another goal of this work is to derive surface NO2 concentration from total column measurements. Surface NO2 has been a focus of scientific studies due to its strong correlation with air quality (AQ) and health issues (ECCC, 2016), with NO2 as one of the three components (along with ozone and PM2.5) used to compute the Air Quality Health Index (AQHI; Stieb et al., 2008) in Canada's AQ public awareness programs. Efforts to link total column NO2 with its surface concentrations have been made by many researchers (Flynn et al., 2014; Knepp et al., 2015; Kollonige et al., 2017; Lamsal et al., 2008, 2014; McLinden et al., 2014). For example, Knepp et al. (2015) proposed a method to estimate NO2 surface mixing ratios from Pandora direct-sun total column NO2 via application of a planetary boundary-layer (PBL) height correction factor. Kollonige et al. (2017) adapted this method and compared Pandora direct-sun surface NO2 and OMI surface NO2. They concluded that the two main sources of error for the conversion of the total column NO2 to surface NO2 are (1) poor weather conditions (e.g., cloud cover and precipitation) and (2) PBL height estimation, both of which affect the NO2 column-surface relationship and instrument sensitivities to boundary layer NO2. Thus, in this work we present a simple but robust algorithm for deriving surface NO2 concentration from Pandora zenith-sky measurements, which has several advantages such as the ability (1) to extend Pandora NO2 measurements to cloudy conditions and (2) to provide more accurate surface NO2 concentration estimates that are less sensitive to PBL height. This work also provides reliable total column NO2 measurements in cloudy conditions and could be used in satellite validations in partially cloudy scenes.
Atmos. Chem. Phys. Discuss., https://doi.org/10.5194/acp-2018-1336 Manuscript under review for journal Atmos. Chem. Phys. Discussion started: 5 February 2019 © Author(s) 2019. CC BY 4.0 License.
This paper is organized as follows. Section 2 describes the measured and modelled NO2 data used in this study. In Section 3, the empirical AMFs for Pandora zenith-sky NO2 measurements are derived using high-quality Pandora direct-sun total column NO2 data. These empirical AMFs and the Network for the Detection of Atmospheric Composition Change (NDACC) AMFs (Hendrick et al., 2011; Sarkissian et al., 1995; Van Roozendael et al., 1998; Van Roozendael and Hendrick, 2009; Vaughan et al., 1997) are both applied to Pandora zenith-sky total column NO2 retrievals to help evaluate the performance of the empirical AMFs. Also, the retrieved Pandora zenith-sky total column NO2 data are evaluated by comparison with satellite measurements. In Section 4, the zenith-sky total column NO2 data are converted to surface concentration by using a scaling algorithm. The zenith-sky surface NO2 concentration data are assessed by comparison with in situ measurements. Lastly, in Section 5, several aspects of this zenith-sky surface NO2 dataset are discussed, including diurnal and seasonal variation and the PBL effect, followed by conclusions in Section 6.

Pandora direct-sun total column NO2
The Pandora instrument records spectra between 280 and 530 nm with a resolution of 0.6 nm (Herman et al., 2009; Tzortziou et al., 2012). It uses a temperature-stabilized Czerny-Turner spectrometer, with a 50 µm entrance slit, a 1200 groove mm⁻¹ grating, and a 2048 × 64 back-thinned Hamamatsu charge-coupled device (CCD) detector. The spectra are analysed using a total optical absorption spectroscopy (TOAS) technique (Cede, 2017), in which absorption cross sections for multiple atmospheric absorbers, such as ozone, NO2, and sulphur dioxide (SO2), are fitted to the spectra.
The Pandora direct-sun total column NO2 data are produced using Pandora's standard NO2 algorithm implemented in the BlickP software (Cede, 2017). The measured direct-sun spectra from 400 to 440 nm are used in the TOAS analysis. A synthetic reference spectrum is produced by averaging multiple measured spectra and corrected for the estimated total optical depth included in it. Cross sections of NO2 at an effective temperature of 254.5 K (Vandaele et al., 1998), ozone at an effective temperature of 225 K (Brion et al., 1993, 1998; Daumont et al., 1992), and a fourth-order polynomial are all fitted.
The resulting NO2 SCDs are then converted to total column VCDs by using direct-sun geometry AMFs. Herman et al. (2009) show that Pandora direct-sun total column NO2 has a clear-sky precision of 0.01 DU (in slant column) and a nominal accuracy of 0.1 DU (in vertical column). Additional information on Pandora calibrations, operation, and retrieval algorithms can be found in Herman et al. (2009) and Cede (2017).
The Pandora no. 103 instrument has been deployed in Toronto since September 2013 to perform direct-sun observations (Zhao et al., 2016). The instrument is installed on the roof of the Environment and Climate Change Canada (ECCC) Downsview building (43.7810° N, 79.4680° W) in Toronto. The building is located in a suburban area with multiple roads nearby. Since 2015, the instrument has employed an alternating direct-sun and zenith-sky observation schedule, which consists of direct-sun measurements every 90 seconds and zenith-sky measurements every 30 minutes during the sunlit period. About two-and-a-half years (February 2015 to September 2017) of continuous alternating measurements are used in this study.
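The slant-to-vertical conversion for the direct-sun geometry can be illustrated with a minimal sketch. It assumes the simple plane-parallel approximation AMF ≈ 1/cos(SZA), which is only reasonable away from the horizon; the BlickP software may use a more refined geometric treatment, and the function names here are hypothetical.

```python
import math


def direct_sun_amf(sza_deg: float) -> float:
    """Plane-parallel geometric AMF for direct-sun viewing (an assumed simplification)."""
    if not 0.0 <= sza_deg < 90.0:
        raise ValueError("SZA must lie in [0, 90) degrees for this approximation")
    return 1.0 / math.cos(math.radians(sza_deg))


def scd_to_vcd(scd_du: float, sza_deg: float) -> float:
    """Convert a direct-sun slant column (DU) to a vertical column (DU)."""
    return scd_du / direct_sun_amf(sza_deg)
```

Under this approximation, a slant column measured at SZA = 60° is divided by an AMF of 2 to obtain the vertical column.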

Pandora zenith-sky total column NO2
Retrieval of trace gases from Pandora's zenith-sky measurements is not included in the standard BlickP processing software (Cede, 2017). The Pandora zenith-sky spectra for this study are processed using the differential optical absorption spectroscopy (DOAS) technique (Noxon, 1975; Platt, 1994; Platt and Stutz, 2008; Solomon et al., 1987) with the QDOAS software (Danckaert et al., 2015). A single reference spectrum is used, which was obtained from a zenith-sky measurement at local noon on a day that had low total column NO2. Following the NDACC recommendations (Van Roozendael and Hendrick, 2012), NO2 differential slant column densities (dSCDs) are retrieved in the 425-490 nm window (to retrieve O4 simultaneously). Cross sections of NO2 (Vandaele et al., 1998), ozone (Bogumil et al., 2003), H2O (Rothman et al., 2005), O4 (Hermans et al., 2003), and Ring (Chance and Spurr, 1997) are all fitted; a fifth-order polynomial and a first-order linear offset are also included in the DOAS analysis.
The output of QDOAS is NO2 dSCDs, which can be converted to total column NO2 via the Langley plot method with the use of the NDACC NO2 AMF look-up table (LUT) (Van Roozendael and Hendrick, 2012). The NDACC AMF LUT is used here only as a reference since it was primarily developed for retrieval of stratospheric NO2. Empirical zenith-sky NO2 AMFs have therefore been developed in this work and are used to convert NO2 dSCDs to total columns. Details about these two different AMFs are given in Section 3.1.
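The Langley plot step can be illustrated with a short sketch. It assumes that the vertical column stays constant over the fitting period, so that dSCD = VCD × AMF − RCD is linear in the AMF, where RCD denotes the slant column contained in the reference spectrum. The function name and the ordinary least-squares fit are illustrative choices, not the QDOAS or NDACC implementation.

```python
def langley_fit(amf, dscd):
    """Fit dSCD = VCD * AMF - RCD by ordinary least squares.

    Returns (vcd, rcd). Assumes a constant VCD over the fitted period.
    """
    n = len(amf)
    if n < 2 or n != len(dscd):
        raise ValueError("need at least two paired (AMF, dSCD) values")
    mean_x = sum(amf) / n
    mean_y = sum(dscd) / n
    sxx = sum((x - mean_x) ** 2 for x in amf)
    if sxx == 0.0:
        raise ValueError("AMF values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amf, dscd))
    slope = sxy / sxx                    # slope of the Langley plot = VCD
    intercept = mean_y - slope * mean_x  # intercept = -RCD
    return slope, -intercept
```

In practice the constant-column assumption is often violated in polluted urban air, which is one reason a Langley fit is better suited to stratospheric retrievals.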

OMI SPv3 data
The Ozone Monitoring Instrument (OMI) is a Dutch-Finnish nadir-viewing UV-visible spectrometer aboard the National Aeronautics and Space Administration (NASA)'s Earth Observing System (EOS) Aura satellite that was launched in July 2004. The OMI instrument measures the solar radiation backscattered by the Earth's atmosphere and surface between 270 and 500 nm with a resolution of 0.5 nm (Levelt et al., 2006, 2018). OMI has a 780 × 576 CCD detector that measures at 60 across-track positions simultaneously and thus does not require across-track scanning. Due to this approach, the spatial resolution of the CCD pixels varies significantly along the across-track direction: pixels near the track centre have a ground footprint of 13 km × 24 km (along-track × across-track), whereas those close to the track edge (e.g., view zenith angle = 56°) have a ground footprint of roughly 23 km × 126 km (de Graaf et al., 2016). Note that from 2012 onwards the smallest pixels (across-track positions) can no longer be used and are excluded from the analysis (known as the "row anomaly"; Levelt et al., 2018). This means the "smallest" pixels available for an OMI comparison are larger than 13 km × 24 km.
The OMI NO2 data used in this work are the NASA standard product (SP) (Bucsela et al., 2013; Wenig et al., 2008) version 3.0 Level 2 (SPv3.0). The NO2 SCDs are derived using the DOAS technique in the 405-465 nm window (Marchenko et al., 2015). The AMFs used in SPv3.0 are calculated by using 1° × 1.25° (latitude × longitude) resolution a priori NO2 and temperature profiles from the Global Modeling Initiative (GMI) chemistry-transport model with yearly varying emissions.

In situ measurements
The National Air Pollution Surveillance (NAPS) network was established in 1969 to monitor and assess the quality of ambient (outdoor) air in the populated regions of Canada (https://www.canada.ca/en/environment-climatechange/services/air-pollution/monitoring-networks-data/national-air-pollution-program.html, accessed 23 November 2018). NAPS provides accurate long-term air quality data (ozone, NO2, SO2, carbon monoxide (CO), fine particulate matter, etc.) of a uniform standard across Canada (e.g., Dabek-Zlotorzynska et al., 2011; Reid and Aherne, 2016).
The in situ NO2 data used in this study were collected at the NAPS Toronto North station (located 100 m away from the Pandora instrument). The site is 186 m above sea level, and the height of the air intake is 4 m above the ground.
Thus in areas where direct NOx (nitrogen oxides) emission sources are limited and other nitrogen compounds are present, NO2 may be overestimated (e.g., in rural areas). For the current site, however, this positive bias has been found to be only about 5 %, except for very low NO2 concentrations (< 5 ppbv) (Yushan Su, Ontario Ministry of the Environment, Conservation and Parks, personal communication, October 2018).

Numerical models
Predicted NO2 fields from three atmospheric chemistry models are used in the algorithm described in Section 4.1 to derive surface NO2 concentration from Pandora zenith-sky total column NO2 data. Following McLinden et al. (2014), this work uses the Global Environmental Multi-scale Modelling Air quality and CHemistry (GEM-MACH) regional chemical transport model (CTM) and the GEOS-Chem global CTM to simulate total columns and vertical profiles of tropospheric NO2 and surface NO2 concentration. The stratospheric NO2 partial columns are estimated using OMI satellite data and the Pratmo box-model.

GEM-MACH
GEM-MACH is ECCC's regional air quality forecast model. It is run operationally two times per day to predict hourly surface pollutant concentrations over North America for the next 48 hours (Moran et al., 2009;Pavlovic et al., 2016;Pendlebury et al., 2018). The model consists of an online tropospheric chemistry module (Akingunola et al., 2018;Pavlovic et al., 2016)

GEOS-Chem
The GEOS-Chem chemical transport model (Bey et al., 2001) has been used extensively in the retrieval of tropospheric columns, and has been shown to be capable of reasonably simulating the vertical distributions of NO2 (Lamsal et al., 2008; Martin et al., 2002; McLinden et al., 2014). The model has a detailed representation of tropospheric chemistry, including aerosols and their precursors (Park et al., 2004). In the simulation used in this study, a global lightning NOx source of 6 Tg N yr⁻¹ (Martin et al., 2002) was imposed. The model was run on a 1/2° × 2/3° (latitude × longitude) grid in nested mode over North America, and was driven by assimilated meteorology from the Goddard Earth Observing System (GEOS-5). The modelled NO2 profiles were used to calculate monthly mean NO2 partial columns in the free troposphere (1.5 to 12 km), as the GEM-MACH model does not include free-tropospheric NO2 sources (lightning, in-flight aircraft emissions).
The model has detailed stratospheric chemistry that includes long-lived species (nitrous oxide (N2O), methane (CH4), and water vapor (H2O)) and chemical families (NOy, Cly, and Bry) that are based on a combination of three-dimensional model output and tracer correlations (Adams et al., 2017). Heterogeneous chemistry of background stratospheric sulfate aerosols is also included. The model is constrained with climatological profiles of ozone and temperature. Stratospheric NO2 has a strong diurnal variation; therefore, diurnal corrections must be applied when OMI stratospheric NO2 measurements (made around local noon) are interpolated to Pandora measurement times. Ratios of modelled stratospheric NO2 columns are calculated between the Pandora measurement time and the OMI overpass time. These ratios are multiplied by the OMI-measured stratospheric NO2 to produce stratospheric NO2 columns corresponding to the time of the Pandora measurements.
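The diurnal correction described above amounts to a simple ratio scaling, sketched below. The function and argument names are hypothetical; the only assumption is that the box model supplies stratospheric columns at both the OMI overpass time and the Pandora measurement time.

```python
def scale_stratospheric_column(v_omi: float,
                               model_at_overpass: float,
                               model_at_pandora_time: float) -> float:
    """Shift an OMI stratospheric NO2 column to the Pandora measurement time.

    The modelled ratio (Pandora time / OMI overpass time) carries the diurnal
    variation; the OMI measurement sets the absolute level.
    """
    if model_at_overpass <= 0.0:
        raise ValueError("modelled column at overpass time must be positive")
    diurnal_ratio = model_at_pandora_time / model_at_overpass
    return v_omi * diurnal_ratio
```

With this form, any multiplicative bias in the box-model column cancels out, because only the ratio of two model values is used.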
Details about the use of the Pratmo box-model and the calculation of stratospheric NO2 partial columns are provided in Section 4.1.

Zenith-sky air mass factor
The NDACC UV-visible network uses zenith-sky AMFs in its total column NO2 retrievals. To improve the overall homogeneity of the UV-visible NO2 column measurements, NDACC recommended using the NO2 AMF LUT (Van Roozendael and Hendrick, 2012). This LUT is based on climatological NO2 profiles that are composed of (1) 20-60 km NO2 profiles developed by Lambert et al. (1999, 2000) and (2) 12-20 km NO2 profiles derived from SAOZ balloon observations (Van Roozendael and Hendrick, 2012). The NO2 concentration is set to zero below 12 km altitude. The NO2 AMFs have been calculated using the UVSPEC/DISORT RTM (Hendrick et al., 2006; Wagner et al., 2007). The parameters used in building the LUT are wavelength, ground albedo, altitude of the station, and solar zenith angle (SZA). Aerosol extinction, ozone, and temperature profiles come from an aerosol model (Shettle, 1989), the U.S. Standard Atmosphere, and the TOMS V8 Climatology, respectively.
The NDACC LUT is designed for stratospheric NO2 retrievals. Note that the absence of tropospheric NO2 in the NDACC LUT construction will lead to an underestimation of the total column NO2 in urban areas. For example, from 2015 to 2017, tropospheric NO2 accounted for 73 ± 11 % (1σ) of the total column amounts in Toronto (OMI SPv3.0 data). To account for this significant tropospheric NO2 in urban areas, new empirical AMFs were developed in this study, and the NDACC AMF LUT is used for comparison purposes only.
Empirical AMFs are calculated for Pandora zenith-sky NO2 measurements in such a way that they can be used to retrieve zenith-sky total column NO2 values that match the high-quality Pandora direct-sun total column NO2 values. Inferring total columns from zenith-sky observations through comparisons with accurate direct-sun observations is a common approach for Brewer and Dobson zenith-sky total ozone measurements (Kerr et al., 1988). For example, in the Brewer instrument zenith-sky ozone algorithm, weighted zenith-sky light intensities measured at four wavelengths (F) are expressed as a function of the slant path (µ) and total column ozone (Kerr et al., 1981). The nine semi-empirical coefficients used to derive total column ozone from measured F in the equation are estimated from a set of direct-sun and zenith-sky observations made nearly simultaneously (Fioletov et al., 2011). Since Pandora zenith-sky spectra can be analyzed to produce NO2 dSCDs, deriving empirical zenith-sky AMFs for Pandora zenith-sky measurements is more straightforward than finding the link between zenith-sky intensity and total column values.
The relation between VCD and dSCD can be expressed as:
$$\mathrm{VCD} = \frac{\mathrm{dSCD} + \mathrm{RCD}}{\mathrm{AMF}}, \qquad (1)$$
where RCD is the reference column density, i.e. the slant column amount of the trace gas in the reference spectrum (Section 2.1.2). If we assume that the coincident direct-sun (DS) and zenith-sky (ZS) measurements sampled the same air mass, then the empirical zenith-sky AMFs can be calculated by assuming VCD_DS = VCD_ZS, which gives
$$\mathrm{VCD_{DS}} = \frac{\mathrm{dSCD_{ZS}} + \mathrm{RCD_{ZS}}}{\mathrm{AMF_{ZS}}}. \qquad (2)$$
Next, we can use nearly coincident VCD_DS and dSCD_ZS in a multi-non-linear regression to retrieve AMF_ZS and RCD_ZS together. To ensure the quality of the retrieved AMF_ZS, only high-quality direct-sun total column NO2 data with SZA < 75° are used. Details about the empirical zenith-sky AMF calculation are given in Appendix A.
Figure 1 shows a comparison of the empirical zenith-sky AMFs and NDACC AMFs (calculated for the Toronto measurements). Total column NO2 can then be retrieved using Eqn. (1) and these two sets of AMFs, where the one based on empirical AMFs is referred to as VCD_Emp and the one based on NDACC AMFs is referred to as VCD_NDACC. The RCD value used in the retrievals is 0.39 ± 0.01 DU, which is retrieved along with AMF_Emp (Appendix A). Figure 2 shows the comparisons of the NO2 columns measured by zenith-sky and direct-sun methods. The regression analyses were performed by using the following coincidence criteria: (1) nearest Pandora direct-sun measurement that was within ± 5 min of the Pandora zenith-sky measurement, (2) SZA < 75°, and (3) Pandora direct-sun total column NO2 data have assured high quality (BlickP L2 data quality flag for nitrogen dioxide = 0). In general, VCD_Emp and VCD_NDACC performed as expected. Compared with VCD_DS, VCD_NDACC shows a -25 % bias, while VCD_Emp shows only a -4 % bias (indicated by the red lines on each panel and their slopes). In addition, VCD_Emp shows less SZA dependence than VCD_NDACC (see the increased bias for measurements made in larger SZA conditions in Figure 2b).
These results confirm that, for urban sites, the tropospheric NO2 profile should be included when calculating empirical zenith-sky AMFs. In the rest of the paper, only the zenith-sky NO2 retrieved using empirical AMFs will be discussed. Note that the Pandora zenith-sky total column NO2 data discussed in Section 3 are a "clear-sky subset" of Pandora zenith-sky measurements. The assessment of Pandora zenith-sky NO2 measurements in cloudy conditions is provided in Section 4.
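One way to picture the joint retrieval of the empirical AMFs and the RCD is the simplified sketch below. It is not the Appendix A procedure: it assumes the relation dSCD_ZS = AMF_ZS × VCD_DS − RCD_ZS, one constant AMF per SZA bin, and a single RCD shared by all bins, which makes the problem linear and solvable in closed form. The bin width and all names are hypothetical.

```python
from collections import defaultdict


def fit_empirical_amf(vcd_ds, dscd_zs, sza_deg, bin_width_deg=5.0):
    """Least-squares fit of dSCD = AMF[bin] * VCD_DS - RCD.

    Returns (amf_by_bin, rcd), where amf_by_bin maps the lower SZA edge of
    each bin to its AMF. Simplified, binned stand-in for the paper's regression.
    """
    # Per-bin sums: sum(V), sum(V^2), sum(V * dSCD), and the sample count.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0])
    total_dscd = 0.0
    for v, d, sza in zip(vcd_ds, dscd_zs, sza_deg):
        s = sums[int(sza // bin_width_deg)]
        s[0] += v
        s[1] += v * v
        s[2] += v * d
        s[3] += 1
        total_dscd += d
    if any(s[1] == 0.0 for s in sums.values()):
        raise ValueError("each SZA bin needs at least one non-zero VCD")
    n_total = sum(s[3] for s in sums.values())
    # Eliminating the per-bin AMFs from the normal equations gives RCD directly.
    numerator = total_dscd - sum(s[0] * s[2] / s[1] for s in sums.values())
    denominator = sum(s[0] ** 2 / s[1] for s in sums.values()) - n_total
    if denominator == 0.0:
        raise ValueError("VCD must vary within at least one SZA bin")
    rcd = numerator / denominator
    amf = {k * bin_width_deg: (s[2] + rcd * s[0]) / s[1] for k, s in sums.items()}
    return amf, rcd
```

The fit is only well constrained when the direct-sun columns vary within a bin; if every coincident VCD in a bin were identical, the AMF and RCD could not be separated.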

Comparison with satellite measurements
To illustrate the NO2 variability over Toronto, Figure 3 shows the time series (2015-2017) from Pandora direct-sun, zenith-sky, and OMI SPv3.0 total column NO2. In general, the NO2 datasets from the ground-based Pandora instrument and the satellite follow the same pattern. However, the satellite data are likely to miss the peak NO2 values in the morning since OMI only passes over Toronto once per day around 1:30 p.m. (local time).
We also performed regression analyses by using the following coincidence criteria: (1) nearest (in time) measurement that was within ± 30 min of OMI overpass time, (2) closest OMI ground pixel (having a distance from the ground pixel centre to the location of the Pandora instrument of less than 20 km), and (3) a cloud criterion (as determined by the OMCLDO2 algorithm; Celarier et al., 2016). In this comparison, only high-quality OMI data are used (VcdQualityFlags = 0) (Celarier et al., 2016). Figures 4a and 4b show the scatter plots of OMI vs. Pandora direct-sun and OMI vs. Pandora zenith-sky total column NO2, respectively. Figures 4c and 4d show similar comparisons but only use OMI NO2 measured by "small pixels" (i.e., having a viewing zenith angle of less than 35°). The better correlation and lower bias for zenith-sky versus direct-sun can probably be explained by the fact that the air mass sampled by zenith-sky measurements is closer to what the satellite sees than the air mass sampled by direct-sun measurements. The comparison results indicate that, at the Toronto site, OMI underestimates the total column by about 30 %. This underestimation is qualitatively consistent with the fact that the Pandora location is near the northern edge of peak Toronto NO2, and the relatively large OMI pixels are also generally sampling areas of less NO2 in the vicinity. The use of the relatively coarse (1°) GMI model for profile shapes (Section 2.1.3) will also lead to a low bias considering that the peak NOx emissions span roughly 0.5° × 0.5°.
Similar results have been found elsewhere. Ialongo et al. (2016) reported a similar negative bias using OMI SPv3.0 and Pandora direct-sun total column NO2 in Helsinki (-32 % bias and R = 0.51), and they suggested this was due to the difference between the OMI pixel and the relatively small Pandora field-of-view. In Reed et al. (2015), Pandora measurements at 11 sites were evaluated; the authors found that the best correlation between OMI SPv3.0 and Pandora direct-sun total column NO2 data is for rural sites. They concluded this could be due to smaller atmospheric variability in the rural region. Other studies such as Goldberg et al. (2017) found an even worse OMI-Pandora comparison between these two data products, with a striking negative bias at high values and poor correlation (R = 0.3). The authors attributed the poor agreement to the coarse resolution of OMI and its AMFs computed with GMI a priori NO2 profiles. In general, our comparison results show that (1) the Pandora direct-sun total column NO2 data measured in Toronto have a reasonable agreement with OMI, and (2) the Pandora zenith-sky total column NO2 data show results similar to those for the direct-sun total column when compared with OMI SPv3.0.
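The temporal and spatial coincidence criteria used for the OMI comparison (nearest measurement within ± 30 min, pixel centre within 20 km) can be sketched as follows. The helper names are hypothetical, and the great-circle distance on a spherical Earth is an assumed choice of distance metric.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_within(times, target, window):
    """Index of the time closest to `target` within +/- `window`, or None."""
    best_index, best_dt = None, None
    for i, t in enumerate(times):
        dt = abs(t - target)
        if dt <= window and (best_dt is None or dt < best_dt):
            best_index, best_dt = i, dt
    return best_index
```

A Pandora measurement would then be paired with an OMI pixel only if `nearest_within(..., timedelta(minutes=30))` returns an index and `haversine_km` between the pixel centre and the instrument is below 20 km.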

Surface NO2 concentration retrieval
The performance of the clear-sky Pandora zenith-sky total column NO2 data has been assessed by using OMI and Pandora direct-sun data as described in Section 3.2. However, the validation of cloudy-scene Pandora zenith-sky total column data is not simple, since near-simultaneous good-quality direct-sun or satellite measurements are not available in most cloudy conditions. This cloudy-scene validation can be done by comparison with in situ NO2 measurements that are not affected by weather. In general, the comparison between total columns and surface concentrations can be done by two approaches: (1) convert Pandora zenith-sky total columns to surface concentrations; and (2) convert in situ surface concentrations to total column values. For example, Spinei et al. (2018) calculated "ground-up" VCDs from in situ surface concentrations by using additional measurements of PBL height or assuming trace gas profiles. In this work, the first approach is employed since the surface NO2 data products from Pandora remote-sensing measurements have direct applications in areas such as air quality monitoring.

Column-to-surface conversion algorithm
A simple but robust scaling method is adapted to derive surface NO2 concentration from Pandora zenith-sky total column NO2 measurements. Following Lamsal et al. (2008) and McLinden et al. (2014), the surface NO2 concentration is estimated using the modelled profile and surface concentration,
$$C_{\mathrm{pan}} = \left(V_{\mathrm{pan}} - V_{\mathrm{strat}} - V_{\mathrm{ftrop}}\right) \times \left(\frac{C}{V_{\mathrm{PBL}}}\right)_{\text{G-M}}, \qquad (3)$$
where C_pan is the surface NO2 volume mixing ratio (VMR) to be estimated, C is the surface NO2 VMR from GEM-MACH (or G-M), V_pan is the total column NO2 measured by Pandora, V_strat is the stratospheric NO2 partial column, V_ftrop is the NO2 partial column in the free troposphere, and V_PBL is the NO2 partial column in the PBL. This equation assumes the chemical transport models can effectively capture the spatial and temporal behaviour of the concentration-to-partial-column ratio.
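The scaling can be sketched in a few lines. The sketch assumes the form used by Lamsal et al. (2008) and McLinden et al. (2014) with the quantities defined above, i.e. C_pan = (V_pan − V_strat − V_ftrop) × (C / V_PBL)_G-M; the function and argument names are hypothetical.

```python
def surface_no2_ppbv(v_pan_du: float,
                     v_strat_du: float,
                     v_ftrop_du: float,
                     c_model_ppbv: float,
                     v_pbl_model_du: float) -> float:
    """Estimate surface NO2 (ppbv) from a Pandora total column (DU).

    The measured column minus the stratospheric and free-tropospheric parts
    gives a PBL partial column, which is scaled by the modelled
    concentration-to-PBL-column ratio (ppbv per DU).
    """
    if v_pbl_model_du <= 0.0:
        raise ValueError("modelled PBL partial column must be positive")
    conversion_ratio = c_model_ppbv / v_pbl_model_du
    return (v_pan_du - v_strat_du - v_ftrop_du) * conversion_ratio
```

As a purely illustrative calculation with made-up inputs, a total column of 0.50 DU, a stratospheric column of 0.10 DU, a free-tropospheric column of 0.05 DU, and a conversion ratio of 28 ppbv DU⁻¹ would give about 9.8 ppbv.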
In this work, V_PBL (0-1.5 km) is integrated from the GEM-MACH NO2 profile and V_ftrop (1.5-12 km) is integrated from the GEOS-Chem NO2 profile. Both GEM-MACH and GEOS-Chem have an hourly temporal resolution. Thus, the integrated V_PBL and V_ftrop can account for NO2 diurnal variation. However, V_strat is from OMI monthly mean stratospheric NO2, which does not have diurnal variation. Thus, the Pratmo box-model is used to calculate stratospheric NO2 diurnal ratios. The OMI stratospheric NO2 columns are interpolated to morning and evening hours by multiplying by the box-model diurnal ratios.
Details about the calculation of V_strat as well as references are provided in Appendix B.
The (C/V_PBL)_G-M ratio in Eqn. 3 is provided by GEM-MACH, and has hourly temporal resolution. This modelled (C/V_PBL)_G-M ratio is referred to here as a conversion ratio. Besides the hourly modelled conversion ratio, a simple monthly look-up table is built using an average of the one-and-a-half years of GEM-MACH model outputs (April 2016 to December 2017) that were available. The look-up table (referred to here as the Pandora surface-concentration look-up table, or PSC-LUT) is composed of monthly conversion ratios with hourly resolution as shown in Figure 5. For example, assuming that a Pandora NO2 total column measurement is made on a day in December at 15:00 LST, then the corresponding conversion ratio from the PSC-LUT is 28 ppbv DU⁻¹ (see the black arrow). Our results in Figure 5 show that the conversion ratio changes throughout the day as well as with season: 0.1 DU (partial column NO2 in the PBL) corresponds to 5-8 ppbv of surface NO2 in the morning (8:00 LST), 2-3 ppbv around local noon (13:00 LST), and 2-4 ppbv in the evening (18:00 LST). In general, the variation of conversion ratios demonstrates that the surface NO2 concentration is controlled not only by PBL height, but also by both boundary-layer dynamics and photochemistry. The surface NO2 derived using the hourly modelled (C/V_PBL)_G-M ratio is referred to here as C_pan-model, while the surface NO2 derived using the monthly mean PSC-LUT is referred to here as C_pan-LUT. In general, C_pan-model is a data product that depends on daily model outputs, but C_pan-LUT only needs the pre-calculated PSC-LUT and is thus less dependent on the model. Details of these two different surface NO2 data products are discussed in the next section.
Figure 6 shows the evaluation of modelled and Pandora zenith-sky surface NO2 concentrations, both using in situ NO2 measurements as the reference.
The Pandora data have been filtered for heavy clouds (details are given in Section 4.3). The GEM-MACH modelled surface concentrations in Toronto reproduce the in situ measurements very well, with the comparison showing high correlation (R = 0.78) and moderate positive bias (37 %, Figure 6a). The Pandora zenith-sky surface NO2 data, C_pan-model, show almost the same correlation (R = 0.77), with only -7 % bias (Figure 6b). The better performance of C_pan-model is expected since the conversion method for Pandora zenith-sky measurements relies on the GEM-MACH modelled NO2 profile (see Eqn. 3); in other words, the Pandora zenith-sky surface NO2 has at least one more piece of information (i.e., NO2 total column) than the GEM-MACH surface NO2 concentrations. C_pan-LUT shows a similar correlation coefficient (R = 0.73) and has improved bias (-3 %, Figure 6c). This result (slightly lower correlation) is also reasonable and acceptable since C_pan-LUT is derived with the monthly PSC-LUT, which has less accurate information than the hourly modelled data. Besides the improved bias, the Pandora zenith-sky surface NO2 concentrations, C_pan-model and C_pan-LUT (Figures 6e and 6f), also have better frequency distributions than GEM-MACH (Figure 6d). Figure 6d shows that the NO2 surface concentration peaks (ambient background concentrations) from the model and in situ data are misaligned. This indicates that the GEM-MACH NO2 background surface concentrations have a 1 ppbv low bias at this site. In contrast, the zenith-sky surface NO2 at peak frequency matches the in situ data (Figures 6e and 6f), indicating that the low bias of the background surface NO2 value has been corrected with the additional information from Pandora zenith-sky total column measurements. In addition, in high NO2 concentration conditions (> 20 ppbv), the zenith-sky surface NO2 also shows better agreement with the in situ NO2 than do the modelled data.
The mean of the top 10 % of the in situ data is 26 ± 1 ppbv (uncertainty of the mean), whereas the corresponding values for GEM-MACH, C_pan-model, and C_pan-LUT are 39 ± 1 ppbv, 26 ± 1 ppbv, and 27 ± 1 ppbv, respectively.
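The look-up-table conversion described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the table layout, the function name `surface_no2_from_lut`, and the assumption that the PBL partial column has already been separated from the total column are all hypothetical. Only the December 15:00 LST ratio of 28 ppbv per DU is taken from the text.

```python
# Hypothetical PSC-LUT: conversion ratio (ppbv per DU) keyed by (month, hour in LST).
# Only the (12, 15) entry comes from the text; a full table would hold one ratio
# for every month and every hour for which model output exists.
PSC_LUT = {(12, 15): 28.0}


def surface_no2_from_lut(pbl_column_du, month, hour_lst, lut=PSC_LUT):
    """Scale a PBL partial column (DU) to a surface mixing ratio (ppbv)."""
    key = (month, hour_lst)
    if key not in lut:
        raise KeyError(f"no conversion ratio for month={month}, hour={hour_lst}")
    return lut[key] * pbl_column_du


# 0.1 DU in the PBL on a December day at 15:00 LST
december_estimate = surface_no2_from_lut(0.1, month=12, hour_lst=15)
```

With the 28 ppbv per DU ratio, a PBL partial column of 0.1 DU maps to roughly 2.8 ppbv, which falls inside the 2-4 ppbv range quoted for the evening hours.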

Comparison with measurements and model
The total column-to-surface concentration conversion algorithm has also been applied to the Pandora direct-sun total column NO2 (see Figure 7). Figure 7b shows that the direct-sun surface NO2 data have a similar agreement with the in situ data (-8 % bias and R = 0.80) as the zenith-sky surface NO2. In high NO2 concentration conditions, direct-sun data have a similarly good agreement with the in situ measurements. For this direct-sun based dataset, the mean of the top 10 % of the in situ data is 27 ± 1 ppbv, whereas the corresponding values for GEM-MACH, C_pan-model, and C_pan-LUT are 40 ± 1 ppbv, 27 ± 1 ppbv, and 27 ± 1 ppbv, respectively. Thus, in general, both Pandora zenith-sky and direct-sun surface NO2 datasets can be used reliably to obtain surface concentrations.

Measurements in different sky conditions
Although zenith-sky observations are less sensitive to cloud conditions than direct-sun observations, we still need to be cautious about the derived zenith-sky surface NO2 in heavy cloud conditions. Due to enhanced scattering, heavy clouds could lead to a significant overestimation of surface NO2 derived from zenith-sky measurements. A cloud filtering method measurements. Under moderately cloudy conditions, when Pandora direct-sun observations cannot provide high-quality data, Pandora zenith-sky observations can still yield good measurements that compare well with in situ data (for example, April 26-29). Under heavy cloud conditions, however, which are identified by enhanced O4 (Appendix C), the Pandora zenith-sky-derived surface NO2 is higher than the in situ measurements (for example, April 4 and 6, see the green squares). This feature is due to the enhanced multiple scattering in heavy cloud conditions, which leads to enhanced NO2 absorption in the measured spectra. Sensitivity tests (Appendix C) show that only 10 % of all zenith-sky measurements are strongly affected by this enhanced absorption, indicating that the zenith-sky NO2 algorithm is applicable to most measurements made in thin and moderate cloud conditions (Toronto has about 44 % of daylight hours with clear-sky conditions per year). The relative strength of the direct sun, measured by a collocated Total Sky Imager (model TSI-880), is plotted on top of each panel in Figure 8 as an additional indicator of sky conditions. It is obtained by integrating the blocking-strip luminance. In general, when the relative strength of the direct sun is high (> 60), good quality direct-sun and zenith-sky NO2 data can both be produced. However, when the sun strength is moderate (30 to 60), only zenith-sky NO2 data are reliable. When the sun strength is low (< 30), zenith-sky NO2 has increased bias and needs to be filtered out.
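The three sun-strength regimes described above can be summarised in a short sketch. The thresholds of 60 and 30 are taken from the text; the function name, the handling of values exactly at a threshold, and the use of sun strength alone are assumptions made for illustration (the retrieval itself identifies heavy clouds through enhanced O4, see Appendix C).

```python
def classify_sky(sun_strength, high=60.0, low=30.0):
    """Return which Pandora NO2 products are expected to be usable.

    sun_strength is the relative direct-sun strength from the sky imager.
    Boundary values (exactly 30 or 60) are assigned to the moderate class,
    which is an assumption, not something stated in the text.
    """
    if sun_strength > high:
        return "direct-sun and zenith-sky"
    if sun_strength >= low:
        return "zenith-sky only"
    return "filter out"


labels = [classify_sky(s) for s in (75.0, 45.0, 10.0)]
```

Keeping the thresholds as keyword arguments makes it easy to test how sensitive the data yield is to the chosen cut-offs.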

Discussion
This study evaluated the performance of Pandora zenith-sky measurements with Pandora direct-sun measurements, satellite measurements, and in situ measurements. In general, the quality of zenith-sky data is affected by three main factors: (1) the quality of the empirical zenith-sky AMFs; (2) cloud conditions (heavy clouds or moderate/thin clouds); and (3) the quality of the modelled NO2 profile (this factor only applies to Pandora surface NO2 data). The quality of the empirical zenith-sky AMFs and the cloud effect have been addressed in Appendices A and C, respectively. The third factor is discussed in Sections 5.1 and 5.2.

Diurnal and seasonal variation
From the Pandora zenith-sky and direct-sun measurements, and modelled NO2 profiles, surface NO2 concentrations were obtained that agree well with in situ measurements collected at the same location. The Pandora surface NO2 data were also analyzed in more detail with a focus on temporal variations. Figure 9 shows the averaged surface NO2 diurnal variations of four different datasets. The in situ instrument produces continuous measurements 24 hours per day, whereas Pandora only has measurements when sunlight is available. The diurnal variation of surface NO2 concentration is controlled by dynamics (e.g., vertical mixing, wind direction), photochemistry, and local emissions. Thus, the diurnal variations are calculated using only the hours when in situ, direct-sun, and zenith-sky data are all available. Figure 9 shows that all four datasets/curves captured the enhanced morning surface NO2 and the decreasing trend afterwards. However, the model has a positive offset (6-9 ppbv) in the morning (due in part to the use of older emissions inventories: Moran et al., 2018) and a negative offset (1-3 ppbv) in the evening relative to the in situ data. Compared to the modelled data, the Pandora direct-sun and zenith-sky data show improvements in the morning, but almost no changes for the evening.
This feature is investigated and found to be correlated with the GEM-MACH modelled PBL height (details in Section 5.2).
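The coincidence requirement used for the diurnal averages (only hours when all datasets have data) can be sketched as below. The data layout, the function name `coincident_diurnal_means`, and all numerical values are made up for illustration; the snippet only shows the filtering logic, not the authors' processing chain.

```python
from collections import defaultdict


def coincident_diurnal_means(datasets):
    """datasets: {name: {(date, hour): value}} -> {name: {hour: mean}}.

    Only time stamps present in every dataset contribute to the hourly means.
    """
    common = set.intersection(*(set(series) for series in datasets.values()))
    result = {}
    for name, series in datasets.items():
        by_hour = defaultdict(list)
        for stamp in common:
            by_hour[stamp[1]].append(series[stamp])
        result[name] = {hour: sum(v) / len(v) for hour, v in sorted(by_hour.items())}
    return result


# Invented values (ppbv); the 23:00 in situ point has no Pandora counterpart.
example = {
    "in_situ": {("2016-04-26", 8): 20.0, ("2016-04-27", 8): 10.0, ("2016-04-26", 23): 5.0},
    "zenith_sky": {("2016-04-26", 8): 18.0, ("2016-04-27", 8): 12.0},
}
means = coincident_diurnal_means(example)
```

The night-time in situ value is dropped because Pandora has no measurement at that hour, so both curves are averaged over exactly the same set of time stamps.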
The diurnal variation is also examined by grouping the data by season. Figure 10 shows that the surface NO2 concentrations in winter (December, January, and February) are higher than the corresponding values in summer (June, July, and August).
This difference is mainly due to shorter sunlit periods and less solar radiation in winter (leading to, e.g., an increased lifetime of NO2 and a decreased PBL height). The model has better agreement with the in situ data in summer than in the colder seasons. The best performance of the model is found around local noon, and this feature does not depend on season. Figure 10 also shows that the quality of Pandora zenith-sky and direct-sun surface NO2 estimates is affected by the quality of the GEM-MACH modelled data. For example, Figure 10c shows that in autumn (September, October, and November), GEM-MACH has the largest offset in the morning. This error is thus propagated to the Pandora direct-sun surface data, and leads to a larger offset in the morning than in any other season. On the other hand, when GEM-MACH shows a better agreement with in situ measurements (e.g., in spring and summer), Pandora zenith-sky and direct-sun estimates also show better agreement with in situ observations. In general, both Pandora direct-sun and zenith-sky surface NO2 data show good agreement with in situ measurements in all seasons; the hourly mean values of Pandora surface NO2 are all well within the 1σ envelope of the in situ measurements.

Planetary boundary-layer effect
The larger morning offset in modelled surface NO2 may indicate that the GEM-MACH modelled PBL heights are biased in the morning when the boundary layer is shallow. Figure 11 (left column) shows the modelled PBL height plotted as a function of the difference between modelled and in situ surface NO2. Figure 11a shows that, in general, the difference between modelled and in situ NO2 decreases with an increase of PBL height. When the modelled PBL height is less than 100 m, the mean difference is 18 ± 12 ppbv (1σ), while when the modelled PBL height is 1 km, the mean difference is only 2.9 ± 6.4 ppbv.
Even though the modelled surface concentrations are significantly impacted by the PBL, the modelled conversion ratio (from column to surface concentrations) seems unaffected, since the surface NO2 concentrations derived from Pandora zenith-sky data (C_pan-model) show much less dependence on the PBL height (Figure 11b). When the modelled PBL height is less than 100 m, the mean difference is 0.9 ± 8.9 ppbv. When the modelled PBL height is 1 km, the mean difference is slightly improved to 0.1 ± 4.4 ppbv. Figures 11c to 11h show similar plots to Figures 11a and 11b, but the dataset has been divided into three time-bins (before 9:00, 11:00 to 13:59, and after 15:00). Figures 11c, 11e, and 11g confirm that whenever the modelled PBL height is low, the relative difference between the model and in situ data is high. However, in general, most of these shallow PBL height conditions occur in the morning, and thus the modelled surface NO2 has a larger bias compared to in situ data in the morning. Figures 11d, 11f, and 11h show that Pandora zenith-sky surface NO2 data have similar performance for all three time-bins, which indicates that the data have less PBL height dependency than the modelled data. In other words, the model is able to capture the ratio between the boundary layer partial column and surface NO2, although the PBL height may not be correct in the model. When this ratio is applied to both Pandora direct-sun and zenith-sky data, the estimated surface concentrations agree better with the in situ measurements.

Atmos. Chem. Phys. Discuss., https://doi.org/10.5194/acp-2018-1336. Manuscript under review for journal Atmos. Chem. Phys. Discussion started: 5 February 2019. © Author(s) 2019. CC BY 4.0 License.

Conclusions
The Pandora spectrometer was originally designed to retrieve total columns of trace gases such as ozone and NO2 from direct-sun spectral measurements in the UV-visible spectrum. In this work, a new zenith-sky total column NO2 retrieval algorithm has been developed. The algorithm is based on empirical AMFs derived from nearly simultaneous direct-sun and zenith-sky measurements. It is demonstrated that this algorithm can retrieve total columns in thin and moderate cloud conditions when direct-sun measurements are not available: only the 10 % of measurements that are affected by heavy cloud have to be filtered out due to large systematic biases (68 %). The new Pandora zenith-sky total column NO2 data show only -4 % bias compared to the standard Pandora direct-sun data product. In addition, OMI NO2 SPv3.0 data demonstrate similar biases (-30 % and -29 %, respectively) when compared to direct-sun and zenith-sky Pandora total column NO2 data.
Surface NO2 concentrations were calculated from Pandora direct-sun and zenith-sky total column NO2 using column-to-surface ratios derived from the GEM-MACH regional chemical transport model. The biases between the Pandora-based direct-sun and zenith-sky NO2 surface concentration estimates and in situ measurements are only -8 % and -7 % (with correlation coefficients of 0.80 and 0.77), respectively, while the bias between the modelled concentrations and in situ measurements is up to 37 %. The Pandora-based surface NO2 concentrations also show good diurnal and seasonal variation when compared to the in situ data. High surface NO2 concentrations in the morning (from 6:00 to 9:00, local standard time) are present in all measured and modelled datasets, while, on average, the model overestimates surface NO2 in the morning by 8.6 ppbv (at 7:00 LST). It appears that the bias in modelled surface NO2 is related at least in part to an incorrectly diagnosed PBL height.
In contrast, the difference between Pandora-based and in situ NO2 does not show any significant dependence on the PBL height. Thus, to enable fast and practical production of Pandora surface NO2 data, the use of the pre-calculated conversion ratio table (PSC-LUT) is recommended.
The new retrieval algorithm for Pandora zenith-sky NO2 measurements can provide high-quality NO2 data (both total column and surface concentration) not only in clear-sky conditions, but also in thin and moderate cloud conditions, when direct-sun observations are not available. Long-term Pandora zenith-sky NO2 data could be used in future satellite validation for moderately cloudy scenes. Moreover, a column-to-surface conversion look-up table was produced for the Pandora instruments deployed in Toronto; therefore, quick and practical Pandora-based surface NO2 concentration data can be obtained for air quality monitoring purposes. The variation of conversion ratios in the PSC-LUT demonstrates that the surface NO2 concentration is controlled not only by the PBL height, but also by boundary-layer dynamics and photochemistry. This conversion approach can also be used to derive surface concentrations from satellite VCD measurements and thus can be particularly useful for the new generation of geostationary satellite instruments for air quality monitoring, such as Tropospheric Emissions: Monitoring of Pollution (TEMPO, Zoogman et al., 2014).
Author contributions. XZ analyzed the data and prepared the manuscript, with significant conceptual input from DG, VF,

Acknowledgements. Xiaoyi Zhao was supported by the NSERC Visiting Fellowships in Canadian Government Laboratories program. We thank Ihab Abboud and Reno Sit for their technical support of Pandora measurements. We thank NAPS for providing surface NO2 data. We acknowledge the NASA Earth Science Division for providing OMI NO2 SPv3.0 data. We also thank Thomas Danckaert, Caroline Fayt, Michel Van Roozendael, and others from IASB-BIRA for providing the QDOAS software, the NDACC UV-visible working group for providing the NDACC UV-visible NO2 AMF LUT, and Yushan Su from the Ontario Ministry of the Environment, Conservation and Parks for providing NAPS Toronto North station in situ NO2 information.

A. Empirical zenith-sky AMF
Before calculating the empirical zenith-sky AMF, the VCD_DS and dSCD_ZS have both been strictly filtered to ensure that any measurements used in this calculation have the highest quality. For VCD_DS, data are filtered following Cede (2017) with several factors being considered, such as wavelength shift and residual in spectra fitting, direct-sun AMF, and estimated uncertainties for the vertical column. For dSCD_ZS, data are filtered using similar criteria as for VCD_DS, with adjustments for zenith-sky observations. The VCD_DS and dSCD_ZS data are merged and divided into several SZA bins. Each bin covers 5°. A multi-non-linear regression is performed by using the following equation:
where VCD_n is not a single direct-sun VCD data point, but is an m × 1 matrix (m is the total number of measurements in SZA bin number n); VCD_n represents all direct-sun VCDs in a 5° SZA bin, and each element of the m × 1 matrix is a single VCD in that SZA bin. Similarly, dSCD_n is also an m × 1 matrix, with each element representing a single coincident zenith-sky dSCD in SZA bin number n. I_n is an m × 1 indicator function, where the elements of I_n are set to 1. The RCD and b_1 to b_n are the parameters to be retrieved. In short, the design of this regression is based on Eqn. 2 (Section 3.1). The idea is to retrieve zenith-sky AMFs in several SZA bins, and, at the same time, all these regressions in different SZA bins are constrained to share a common predictor (RCD). The regression model can be solved by using an iterative procedure (Seber and Wild, 2003) to yield the estimated coefficients, b_1 to b_n and RCD. Each b_n is the reciprocal of the zenith-sky AMF in SZA bin n. This regression model has been evaluated by using different sizes for the SZA bins.
A 5° SZA bin is selected because the SZA bin must be small enough to capture the SZA dependence of the zenith-sky AMFs, and, at the same time, it must also be large enough to ensure a sufficient number of measurements in each SZA bin (to perform reliable regressions). In order to deal with the diurnal variation of NO2 concentration and changes in profile shape (e.g., due to changing boundary layer heights), the dataset has been divided into morning and evening sets, and discrete AMFs are retrieved for a.m. and p.m. separately (see the blue and red squares with error bars in Figure 1).
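For orientation, the binned regression can be written out explicitly. The form below is a sketch inferred from the definitions given above (b_n as the reciprocal of the zenith-sky AMF in bin n, a single RCD shared by all bins, and I_n as a vector of ones). It assumes that Eqn. 2 has the usual form VCD = (dSCD + RCD)/AMF, and it uses N for the number of SZA bins, a symbol not used in the text; the layout may differ from the authors' own equation.

```latex
\mathrm{VCD}_n = b_n \left( \mathrm{dSCD}_n + \mathrm{RCD}\, I_n \right),
\qquad b_n = \frac{1}{\mathrm{AMF}_n},
\qquad n = 1, \dots, N
```

Because RCD multiplies every b_n, the model is non-linear in its parameters even though each bin on its own is a straight-line fit, which is consistent with the use of an iterative solver.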
Next, these discrete AMF values are used to fit an empirical zenith-sky NO2 AMF function, which has the expression:
The fitted empirical zenith-sky AMFs are shown in Figure 1 as blue and red lines (data regression period from February 2015 to September 2017). Several sensitivity tests have been performed to assess the quality of the empirical zenith-sky AMFs, including fitting the AMFs with and without a diurnal difference, fitting the AMFs with different empirical functions (e.g., exponential and simple geometry approximation), and fitting the AMFs by season. All these different choices of fitting functions or methods introduce less than 5 % difference in the retrieved empirical AMFs. Thus, to keep the empirical AMFs simple and robust, we chose to fit with a diurnal difference (Eqn. 5).

B. Stratospheric NO2 column
Several stratospheric NO2 column values were tested and used in the surface NO2 concentration algorithm (Eqn. 3). Figure A1a shows the OMI monthly mean (referred to as OMI) and Pratmo box-model stratospheric column NO2 (Adams et al., 2016; McLinden et al., 2000) (referred to as box). Since the satellite only samples Toronto once per day, the OMI stratospheric NO2 lacks diurnal variation. To account for the diurnal variation, diurnal ratios of NO2 VCD have been calculated and applied to OMI monthly mean data. The stratospheric NO2 columns are calculated using
$V_{\mathrm{OMI}}(t) = V_{\mathrm{OMI}}(t_0) \times V_{\mathrm{box}}(t) / V_{\mathrm{box}}(t_0)$,
where V_OMI(t_0) is the OMI measured stratospheric column, t_0 is the OMI overpass time, V_box(t_0) is the modelled stratospheric column at the OMI overpass time, V_box(t) is the modelled stratospheric column at time t, and V_OMI(t) is the interpolated stratospheric column at time t. The interpolated OMI stratospheric columns are referred to as OMI-box. The grey dots on To justify why this diurnal variation has to be included, Figure A1c shows the total column NO2 time series. The diurnal stratospheric NO2 variation is about 0.1 DU in the summer (see grey dots in Figure A1b), when the Pandora measured monthly mean total column is about 0.5 DU (Figure A1c). Thus, neglecting this diurnal variation will lead to diurnal biases in the derived surface NO2 data (e.g., in the morning, this will lead to the overestimation of the stratospheric NO2 and thus the underestimation of surface NO2).
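The diurnal scaling described above, in which the OMI stratospheric column at overpass time is multiplied by the box-model ratio between time t and the overpass time, can be sketched as follows. The function name and the numerical values are invented for illustration, and the snippet assumes a purely multiplicative scaling, which is how the text describes the method.

```python
def omi_box_column(v_omi_t0, v_box_t0, v_box_t):
    """Scale the OMI stratospheric column at overpass time t0 to time t.

    All columns are in the same unit (e.g., DU). The box-model values are
    used only through their ratio, so their absolute bias cancels.
    """
    if v_box_t0 <= 0:
        raise ValueError("box-model column at overpass time must be positive")
    return v_omi_t0 * (v_box_t / v_box_t0)


# Made-up values in DU, for illustration only: a morning time with a smaller
# box-model column than at the early-afternoon overpass.
v_morning = omi_box_column(v_omi_t0=0.10, v_box_t0=0.08, v_box_t=0.06)
```

In this invented case the morning stratospheric column is scaled down to about 0.075 DU, so using the unscaled overpass value would overestimate the stratospheric part in the morning, in line with the bias discussed above.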

C. Cloud effect and heavy cloud filtration
Direct-sun measurements need an unobscured sun. Even thin clouds could decrease the quality of retrieved NO2 total columns, especially low altitude clouds. Unlike direct-sun measurements, zenith-sky observations are made with scattered sunlight and have limited sensitivity to cloud cover. For example, Hendrick et al. (2011) calculated that, for NDACC UV-visible zenith-sky ozone measurements, clouds only contribute 3.3 % to the total random error. This is because a trace gas that is mostly distributed in the stratosphere has the mean scattering layer located at a higher altitude than the cloud layer. However, this assumption may not be valid for NO2. Depending on the properties of the clouds and the NO2 profile, the clouds could have non-negligible impacts on zenith-sky NO2 observations. A typical method of removing zenith-sky measurements affected by heavy clouds is to eliminate measurements with large enhancements of O4 and/or H2O (Van Roozendael and Hendrick, 2012). In the Pandora zenith-sky NO2 retrieval, we use the O4 dSCDs. Since the measured O4 dSCDs have an SZA dependency, all measured O4 dSCDs are plotted against SZA and a second-order quantile regression (Koenker and Hallock, 2001) is applied to select the top few percentiles of the measured O4 dSCDs.