Spatiotemporal variability of NO 2 and PM 2 . 5 over Eastern China: observational and model analyses with a novel statistical method

. Eastern China (27–41 ◦ N, 110–123 ◦ E) is heavily polluted by nitrogen dioxide (NO 2 ), particulate matter with aerodynamic diameter below 2.5 µm (PM 2 . 5 ), and other air pollutants. These pollutants vary on a variety of temporal and spatial scales, with many temporal scales that are nonperi-odic the day-to-day variability of PM 2 . 5 better than that of NO 2 . These results shed light on model improvement. The EOF–EEMD package is freely available for noncommercial uses.

Over Eastern China, NO 2 and PM 2.5 concentrations vary diurnally and from one day to another. NO 2 is short lived (hours), and its diurnal cycle is affected by rush hour traffic emissions (Chen et al., 2015;Hu et al., 2014), other emission sources, planetary boundary layer (PBL) mixing , and chemistry (Lin et al., 2012). Although previous studies in the US, Germany, and Japan have suggested a weekly cycle of NO 2 due to variations in industrial and traffic emissions, such an emission-driven weekly cycle is not visible over developing countries such as China and India (Beirle et al., 2003;Boersma et al., 2009;Cui et al., 2016;Hu et al., 2014;Kaynak et al., 2009). Instead, groundbased observations show that the day-to-day variation in NO 2 over China is associated with changes in meteorological parameters such as wind speed, relative humidity (RH), surface pressure, and temperature Zhang et al., 2015).
For PM 2.5 over China, both the diurnal and the day-today variations are complicated by its relatively long lifetime, its various components from different sources, and meteorology. Liu et al. (2016) suggested three types of PM 2.5 diurnal cycle within a year, with the peak concentration occurring at distinctive hours in different seasons. In the summertime (April to August), the diurnal cycle may follow human activities (Gong et al., 2007;Liu et al., 2016), which is different from the diurnal cycles in the biomass burning season or in winter. Other studies suggested weak diurnal cycles of PM 2.5 in urban or suburban areas (Chen et al., 2015;. Moreover, some studies pointed to the lack of a weekly cycle of PM 2.5 , while others suggested contrasting weekly cycles (for Beijing, Chen et al., 2015;Hu et al., 2014). In winter, the frequent and irregular weather systems prohibit a clear weekly cycle (Gong et al., 2007).
This study analyzes the spatiotemporal variability of NO 2 (with the shortest lifetime of hours and the greatest variability among the pollutants measured by the official monitoring network) and PM 2.5 (the dominant air pollutant for prema-ture mortality; Forouzanfar et al., 2015) over Eastern China in fall-winter 2013. Given the complex and nonstationary nature of pollutant variability over Eastern China, here we compile an EOF-EEMD analysis visualization package to simultaneously distinguish and visualize the spatial and temporal variability of pollutants. In sequence, the package consists of an empirical orthogonal function (EOF) analysis (Lorenz, 1956) to separate spatial and temporal patterns, an ensemble empirical mode decomposition (EEMD) analysis (Wu et al., 2009) to separate different temporal modes, a Hilbert transform (HT), a marginal spectrum analysis (MSA), and a visualization step to present all physically meaningful spatial and temporal modes in a two-dimensional plot. In particular, EEMD (Huang, 2005;Huang et al., 1998Huang et al., , 1999Huang and Attoh-Okine, 2005;Wu et al., 2009) is an effective tool to extract signals from noisy nonlinear and nonstationary processes (Wu et al., 2009). EEMD and its variants (e.g., multidimensional ensemble empirical mode, MEEMD) have been widely used in climate studies (Feng et al., 2014;Huang et al., 2012a, b;Vecchio and Carbone, 2010;Wu et al., 2011Wu et al., , 2016. The EOF-EEMD package thus allows for quantitative manifestation of the spatial, (regular) diurnal, and (irregular) day-to-day variations of pollutants and meteorological drivers.
We further use the EOF-EEMD package to evaluate how well chemical transport models (CTMs) can reproduce the observed pollution variability. Although popularly used in air pollution diagnosis, forecast and projection, and remote sensing (Geng et al., 2015;Lin et al., 2015), models are subject to errors in emissions, chemistry, transport, PBL mixing, and other processes (Lin et al., 2008(Lin et al., , 2012Zhang et al., 2016b). This study evaluates two representative models, GEOS-Chem and WRF/CMAQ, with a note that such an evaluation can be applied to other models.
The rest of the paper is organized as follows. Section 2 introduces in situ measurements of NO 2 , PM 2.5 , meteorological parameters, model simulations, and the EOF-EEMD analysis visualization package. Section 3 analyzes the observed spatiotemporal variations of NO 2 and PM 2.5 , including their relationships with meteorological parameters. Section 4 evaluates the modeled spatiotemporal variations of NO 2 and PM 2.5 . Section 5 concludes the present study with further discussion on the applicability of the EOF-EEMD package.
2 Data and methods

Spatial and temporal domain
We focus on pollution over Eastern China (25-41 • N, 110-123 • E). Guided by an EOF analysis, we contrast pollution over the southern (SEC, south of 35 • N) and northern (NEC) parts to address the regional differences in day-to-day pollution variability. Such latitudinal separation coincides with Distribution of 42 cities with NO 2 and PM 2.5 observations. Both dots denote stations (a) and cities (b) with both NO 2 and PM 2.5 data. The blue dots indicates the same stations (cities) for both NO 2 and PM 2.5 , while the green dots are only used for NO 2 and purple dots are only used for PM 2.5 . The orange line separates northern Eastern China (NEC) and southern Eastern China (SEC), and the red line labels the location of the Huai River.
the Huai River climate transitional zone (Ye and Li, 2017). The orange lines in Fig. 1 separate the two regions.
Our study period is from 25 October to 25 December 2013, with a total of 1488 h in 62 days. Most air pollution data are missing in January and February 2014 because of instrumental failure or data retrieval failure, and data before 25 October are not available.

NO 2 and PM 2.5 observations
We retrieve hourly measurements of NO 2 and PM 2.5 from 193 air quality monitoring stations of the MEP. Most stations are located in urban areas, and only six stations are suburban. As almost every station has missing values in more than one day, we exclude stations that have missing values at ≥ 30 % of the 1488 h or during a consecutive 72 h period. We thus select 163 stations for NO 2 and 159 stations for PM 2.5 in the same 42 cities. The dots in Fig. 1a and b depict the stations and cities, respectively. The blue dots show stations and cities with both valid NO 2 and PM 2.5 , the green dots with NO 2 only, and the purple dots with PM 2.5 only. The slight difference between NO 2 and PM 2.5 stations does not affect our analysis of the regional pattern of pollutants.

Correction of raw NO 2 measurements
At the monitoring sites, NO 2 is measured via molybdenumcatalyzed conversion to nitric oxide (NO) and a subsequent chemiluminescence measurement. The measurement tech-nique suffers from interference by more oxidized nitrogen species, since the heated molybdenum surface exhibits low chemical selectivity (Boersma et al., 2009;Lamsal et al., 2008; Here we follow Lamsal et al. (2008) to correct for the interference by introducing a correction factor (CF) based on GEOS-Chem-simulated nitrogen species (NO 2 , HNO 3 , PAN, and all alkyl nitrates AN ).
We multiply CF with the raw NO 2 data to obtain "corrected" NO 2 concentrations. Our sensitivity test suggests that assuming PAN and HNO 3 to be fully converted to NO 2 (i.e., assuming the coefficients to be unity for both PAN and HNO 3 in Eq. 1) does not affect our spatiotemporal analysis of NO 2 .
Hereafter the NO 2 corrected by Eq. (1) is discussed, unless stated otherwise. Figure 2 compares the regional mean hourly time series of raw and corrected NO 2 . The correction reduces NO 2 concentrations by about 2-30 µg m −3 over the whole period and is higher at times when nitrogen is more oxidized. It slightly reduces the relative contribution of day-to-day variability to the total variance of NO 2 under the EOF-EEMD analysis (not shown) because excluding the more oxidized species shortens the lifetime of NO 2 . Figure 2. Regional mean hourly time series of raw and "corrected" NO 2 from the observations. The gray shading indicates 1 standard deviation across all stations.

Filling in missing values for EOF-EEMD analysis
Prior to an EOF-EEMD analysis, we fill in missing values in hourly pollution observations. If data are missing for more than a consecutive 12 h period, we fill in the missing value in each hour with data on that hour averaged over all days; as such, the diurnal cycle is maintained. In other cases, linear interpolation from adjacent valid data is applied. Our interpolation does not introduce significant artificial information for spatiotemporal analysis, as validated by a sensitivity test with GEOS-Chem model data. Specifically, the EOF-EEMD results based on the original GEOS-Chem data (i.e., no missing values) are similar to the results based on model data sampled at times of valid observations with missing values filled in with the same technique as for the observation data.

Conversion from station-to city-based datasets
Since different cities have different numbers of stations, we calculate city mean observations by averaging across all stations of each city. Compared to a station-based analysis, the city-based EOF-EEMD results reduce the spatial noise, leading to more distinctive temporal patterns. All analyses hereafter are based on city mean data. The longitude and latitude of each city center are used to identify the respective model grid cell.

Meteorological observations
We use 3-hourly measurements of 2 m air temperature, 2 m relative humidity, and 10 m wind speed from meteorological stations recorded at the National Oceanic and Atmo-spheric Administration National Centers for Environment Information (NOAA NCEI). We do not use surface pressure additionally because it is highly correlated with air temperature and relative humidity on the day-to-day scale. The locations of these stations do not always coincide with air pollution stations. Thus, we select 36 meteorological stations within 10 km of air pollution stations (red hollow dots in Fig. 1). Despite the difference (in number and location) between pollution and meteorological stations, an analysis of the regional temporal patterns of pollutants and meteorology is still informative (see Sect. 3.2).
To fill in missing values, we apply an interpolation process that accounts for diurnal variability using information for an adjacent day. For example, if the temperature on 26 October at 12:00 is missing, we calculate the temperature difference between 09:00 and 12:00 on the 25th as well as the difference between 15:00 and 12:00 on the 25th. We then use these differences to adjust the temperatures at 09:00 and 15:00 on the 26th, and finally use the mean of the two adjusted temperatures as the temperature on the 26th at 12:00.
For consistency with the hourly pollution data, we linearly interpolate the 3-hourly meteorological measurements to each hour. This interpolation does not distort the EOF-EEMD analysis, as confirmed by comparing the statistical analysis on 1-hourly GEOS-FP meteorological parameters versus an analysis on 3-hourly GEOS-FP data. Note that the GEOS-FP meteorology is used to drive GEOS-Chem.

GEOS-Chem
We use the nested GEOS-Chem CTM version 9-02 (L.  to simulate NO 2 , PM 2.5 , and other pollutants over China in October-December 2013. The model resolution is a 0.3125 • long. × 0.25 • lat. grid with 47 vertical layers, and the lowest 10 layers are of ∼ 130 m thickness each. The model is driven by the GEOS-FP assimilated meteorology from the National Aeronautics and Space Administration (NASA) Global Modeling and Assimilation Office, with the full O x -NO x -VOC-CO-HO x gaseous chemistry (Mao et al., 2013) and online aerosol calculations. Vertical mixing in the PBL adopts a nonlocal scheme (Holtslag and Boville, 1993;. Model convection is simulated with the relaxed Arakawa-Schubert scheme (Rienecker et al., 2008).
Chinese anthropogenic emissions of NO x and other pollutants adopt the monthly MEIC inventory with a base year of 2010 (http://www.meicmodel.org, last access: 1 December 2015) (Geng et al., 2017). We further use the monthly DOMINO v2 NO 2 data to scale monthly anthropogenic NO x emissions from 2010 to the simulation year . The emission scaling improves the simulation of NO 2 (Cui et al., 2016). Other model setups are referred to Lin et al. (2015) and Yan et al. (2016).
GEOS-Chem modeled PM 2.5 includes secondary inorganic aerosols (sulfate, nitrate, and ammonium), black carbon, primary organic carbon, natural dust, and sea salt. Secondary organic aerosols are not included in this study, considering the severe underestimate in China due to missing precursor emissions and formation pathways (Fu et al., 2012;. Anthropogenic dust is also not included.
The nested model simulation is from 15 October to 25 December in 2013, allowing for a 10-day spin-up period. Its lateral boundary conditions of chemicals are updated every 3 h by results from a corresponding global simulation on a 2.5 • long. × 2 • lat. grid. Modeled NO 2 and PM 2.5 in the first layer are sampled at city centers and times with valid observations, unless stated otherwise.
Chinese anthropogenic emissions are from MEIC (http: //www.meicmodel.org, last access: 1 December 2015). Emissions in 2013 are extrapolated from the base year (2012) based on country-level statistics (Zheng et al., 2015). Anthropogenic emissions in other Asian countries and biomass burning emissions are taken from the MIX emission inventory prepared for the Model Inter-Comparison Study Asia Phase III (MICS-ASIA III).
The simulation is from 15 October to 25 December 2013, allowing for a 10-day spin-up period. Initial conditions and boundary conditions are from GEOS-Chem (Zheng et al., 2015). Modeled NO 2 and PM 2.5 in the first layer are sampled at city centers and times with valid observations, unless stated otherwise.

EOF-EEMD analysis visualization package
As shown in Fig. 3, our EOF-EEMD analysis visualization package consists, in order, of an EOF analysis (Lorenz, 1956), an EEMD analysis (Wu et al., 2009), a Hilbert transform (HT) with marginal spectrum analysis (MSA), and a visualization step to quantitatively depict the spatial-temporal scales of measurement or model data.
The basic purpose of our package is to quickly and simultaneously identify and visualize various spatial and temporal scales of interest in the observation or model datasets. As shown by Feng et al. (2014) and Wu et al. (2016), combining EOF with EEMD to decompose the datasets leads to a faster calculation than MEEMD by 1 or 2 orders of magnitude because here the EEMD is applied to the temporal components (i.e., PCs) out of an EOF analysis rather than to all dimensions. Also, our EOF-EEMD package conducts additional HT-MSA and provides visualization of all spatial and temporal scales of interest.
-EOF analysis to decompose a two-dimensional dataset (time series at multiple locations) into spatial and temporal components.
Suppose there are n locations, each having a time series of length p. The associated dataset Z is an n×p matrix. An EOF analysis of Z gives Here is a diagonal q × q matrix containing the first q singular values of Z, and it represents the contribution of each pattern to the total variance of Z. The diagonal values of are in a descending order, and thus the first several modes are the dominant ones. U is an n × q matrix representing the spatial component, and each column of U represents a spatial mode. W is a p ×q matrix representing the temporal component, and each column of W represents a principal component (PC) for temporal variation associated with the corresponding spatial mode.
-EEMD analysis of each PC time series to obtain its "intrinsic mode functions" (IMFs) of descending frequencies.
Each PC is mixed with multiple scales, which requires further decomposition in the time domain. Unlike fast Fourier transform (FFT) or wavelet transform (WT), EEMD does not need a priori bases, and it can be appropriately applied to delineate nonlinear and nonstationary time series, as in our pollution study.
EEMD consists of an ensemble of empirical mode decomposition (EMD) performed on each PC time series (denoted as x(t) in Eq. 3). Each EMD linearly decomposes x(t) into individual IMFs c j (of ascending timescales and descending frequencies) and a residual r n : EMD is based on finding the local maxima and minima of the time series. A detailed decomposition process can be found in Huang et al. (1998Huang et al. ( , 1999. EMD is much less susceptible to missing values and data interpolation than approaches that are based on an analysis of the whole time series (e.g., FFT and WT).
EMD may be sensitive to noise in the real data to encounter a "mode mixing" problem (Wu et al., 2009). EEMD solves this problem by performing an ensemble of hundreds of EMDs, each with certain white noise added to x (t). Hence, the noise in the real data is incorporated as part of the white noise, and the ensemble further minimizes the effects of noise. The white noise is assumed to follow the standard Gaussian distribution (Wu et al., 2009). Figure 4 shows an example of the EEMD analysis.
-Hilbert transform and marginal spectrum analysis of each IMF to reveal its representative frequency range.
There are no discrete periods or frequencies in the pollution and meteorological time series. Correspondingly, an IMF also has a continuous frequency range (rather than a constant frequency) that can be determined by HT-MSA. The HT reveals the IMF energy-frequencytime distribution (Huang et al., 1999). The MSA further shows the IMF distribution of variance (energy) with respect to different frequencies. The spectral peak represents the largest contribution to total variance.
A spurious oscillation may occur near the edges of certain IMF time series, resulting in an inaccurate calculation of variance under HT-MSA. We apply a box-car filter (Gubbins, 2004) to select the internal 60 % of an IMF time series (from 20 % to 80 % of the 1488 h) to perform HT-MSA. Figure 4b shows an example of the visualized result of HT-MSA, in which the horizontal axis is the number of occurrences within the whole period (frequency, in h −1 , multiplied by the time length, 1488 h) and the vertical axis is the energy contribution. IMF2-IMF5 are visualized and analyzed in this study. The higher-frequency IMF1 is noisy as the energy is distributed over a wide range of occurrence numbers. IMF6-IMF10 represent the longest temporal scales that contribute little to the total variance of the decomposed PC. Thus IMF1 and IMF5-IMF10 are not further analyzed.
Based on HT-MSA, we determine a representative frequency range (RFR) such that the range encompasses the peak frequency and that the frequencies within the range contribute 50 % of the total variance of an IMF. The frequencies below and above the RFR bounds each contribute 25 % of the total variance of the IMF. Before calculating the RFR, we smooth the marginal spectrum by connecting all local maxima of the spectrum with a cubic spline.
-Visualization of the spatial and temporal scales in a twodimensional plot.
Finally, we simultaneously visualize the spatial and temporal scales as well as their contributions to the total variance of Z in a two-dimensional plot for easy observational diagnosis and model evaluation. In this plot, an IMF is represented by a vertical "error bar" and a horizontal bar. The length of the error bar stands for the representative period range (RPR, the inverse of RFR), and a shorter length means a more stationary variation mode (i.e., towards a fixed frequency or period). The length of the horizontal bar stands for the contribution to the total variance. For clearer presentation, the plot does not include IMFs that do not pass the white noise examination, that lay outside the range of scales considered here (hours to days), or that contribute little to the total variance of the original data (e.g., less than 1 %).
3 Observational analyses of NO 2 , PM 2.5 , and meteorological variables

General characteristics
The colored dots in Fig. 5a and b show the observed spatial distributions of city mean NO 2 and PM 2.5 averaged over the time period. Both NO 2 and PM 2.5 are largest over Beijing-Tianjin-Hebei (BTH) in the north and the Yangtze River Delta (YRD) in the east. NO 2 concentrations exceed 60 µg m −3 at many sites. The range of PM 2.5 is larger, from below 10 µg m −3 in some northern and coastal cities to about 200 µg m −3 in several cities of BTH. Figure 6a and b show the diurnal variations of NO 2 and PM 2.5 over Eastern China, NEC, and SEC averaged over all days. Similarly, over the three regions, NO 2 peaks around 19:00 due to evening rush hour emissions, reduced PBL mixing, and a lengthened lifetime. NO 2 reaches a minimum at 14:00 because of the shortest lifetime and strongest PBL mixing. The diurnal range (maximum minus minimum) is about 30 µg m −3 . The PM 2.5 level also reaches a minimum in the early afternoon. It has a much smaller diurnal range at 10 µg m −3 . The vertical error bars in Fig. 6a and b depict the standard deviation for the day-to-day variation of NO 2 and PM 2.5 at any given hour. At a given hour, the PM 2.5 level is much more variable across the days than NO 2 . In particular, the day-to-day standard deviation for PM 2.5 at a given hour is as large as the diurnal range of PM 2.5 . Figure 6c and d further show the time series of daily mean NO 2 and PM 2.5 . All data are de-trended (trends are at 0.01 µg m −3 h −1 for NO 2 and 0.05 µg m −3 h −1 for PM 2.5 ). Although local maxima and minima (peaks and troughs of the time series) occur every several days, there is no single period or amplitude for the variation of each species. For NO 2 over Eastern China (black line in Fig. 6c), the local maxima vary from 60 to 100 µg m −3 , and the local minima vary from 20 to 40 µg m −3 . For PM 2.5 over Eastern China (black line in Fig. 6d), the local maxima vary from 100 to 300 µg m −3 , and the local minima vary from 20 to 120 µg m −3 . Furthermore, comparing the green and blue lines reveals that pollutants over NEC and SEC synchronize on some days but are out of phase on others; this feature is quantitatively analyzed in Sect. 3.2. These day-to-day variation patterns are associated with meteorological conditions and pollutant lifetimes. Figure 7 shows day-to-day anomalies of observed pollutant concentrations and meteorological parameters over NEC and SEC. All data are de-trended. Over NEC, wind speed is clearly anticorrelated with pollutant levels. The correlation coefficient reaches −0.73 between NO 2 and wind speed and −0.60 between PM 2.5 and wind speed. Over this region, stronger winds are often associated with lower RH and lower temperature, characteristic of a cold air passage that brings cleaner, colder, and drier air from the north to NEC and transports the NEC pollution out of the region. Correspondingly, RH is strongly positively correlated with NO 2 (R = 0.62) and PM 2.5 (R = 0.69). The meteorology-associated day-today variability is more apparent after mid-November, when the variations of the two pollutants are more synchronous.
Over SEC (Fig. 7), the relationship between pollutant levels and meteorological parameters is more complex. The correlation between daily mean PM 2.5 and wind speed is relatively weak (R = −0.44 compared to −0.60 over NEC), and its correlation with RH is even weaker (R = 0.29). This indicates that the northerly air does not reduce PM 2.5 levels over SEC as effectively as over NEC, as PM 2.5 from NEC may be transported to SEC. By comparison, NO 2 is still highly anticorrelated with wind speed (R = −0.77) over SEC, likely a result of the short lifetime of NO 2 . Compared to PM 2.5 whose lifetime is sufficiently long (several days) for transport from NEC to SEC , NO 2 has a much shorter lifetime (below 1 day; Lin et al., 2012) and cannot undergo effective long-distance transport. However, almost all pollution measurement sites are urban, and weaker (stronger) winds allow for rapid accumulation (removal) of urban NO 2 pollution.

EOF-EEMD analyses of pollutants and meteorological parameters
Although informative, the time series analyses of regional mean pollution in Sect. 3.1 do not provide adequate quantitative information on the spatiotemporal variability and embedded scales. In fact, the separate discussion on NEC and SEC in Sect. 3.1 is largely inspired by the following EOF-EEMD analysis that suggests distinctive features between these two subregions. In this section, we use the EOF-EEMD package to distinguish and visualize the quantitative contributions of individual spatial and temporal modes to variations in the pollutant and meteorological data. The columns in Fig. 8 show the EOF-EEMD results for the observed temperature, RH, wind speed, NO 2 , and PM 2.5 . The first two rows show the first two spatial patterns (EOF1 and EOF2) from the EOF analysis. The third row visualizes the EEMD-HT-MSA results for PC1 and PC2, the temporal counterparts of EOF1 and EOF2. For all variables, the first two PCs contribute more than 50 % of the total variance of the original data. The following PCs (PC3, PC4. . . ) contain small variances and are not discussed here.

EOF-EEMD analyses of pollutants
The fourth column in Fig. 8 for NO 2 shows a primary pattern (EOF1 and PC1) with synchronous variation over the entirety of Eastern China. This pattern contributes 42 % of the total variance of NO 2 . The two dominant IMFs of PC1 have time periods at 24 and 12 h, respectively, and together they contribute 30.4 % of the total variance of NO 2 . Thus, PC1 mainly reflects the diurnal variation of NO 2 . PC1 also contains some day-to-day variability in IMFs, which contribute about 10 % of the total variance of NO 2 . The second pattern (EOF2) of NO 2 reveals opposite temporal variations between NEC and SEC. This temporal contrast is mainly reflected in the dayto-day variability, with RPRs around 2-5 days contributing 10.9 % of the total variance in NO 2 . The day-to-day components of PC1 and PC2 correspond to the finding in Sect. 3.1 that NO 2 over NEC and SEC is synchronous on some days but out of phase on others.
We further investigate the physical meanings of PC1 and PC2 for NO 2 . The red solid and red dashed lines in Fig. 6a and c show the diurnal and day-to-day variations of PC1 and PC2 in comparison to regional mean NO 2 levels over Eastern China (black line), NEC (green line), and SEC (blue line). Table 1 shows the associated correlation coefficients. PC1 is synchronous with Eastern China mean NO 2 for both diurnal and day-to-day variations (R reaches 1.0), confirming this regionally synchronous pattern. The day-to-day variation of PC2 is correlated with NEC NO 2 (R = 0.66) but anticorrelated with SEC NO 2 (R = −0.45, Table 1), again confirming this NEC-SEC contrasting pattern.
The last column in Fig. 8 shows the EOF-EEMD result for PM 2.5 . As for NO 2 , EOF1 and PC1 of PM 2.5 reflect a  temporally synchronous pattern over Eastern China, which contributes 44 % of the total variation of PM 2.5 . Again, PC1 is synchronous with Eastern China mean PM 2.5 (red versus black lines in Fig. 6b, d) in terms of both diurnal and day-today variations, with correlation coefficients approaching 1.0 (Table 2). However, the IMFs of PC1 representing diurnal variation are relatively weak, consistent with the noisy diurnal cycle of PM 2.5 discussed in Sect. 3.1. The dominant IMF of PC1 shows a period of around 7 days. PC2 of PM 2.5 reflects the day-to-day contrast between NEC and SEC ( Fig. 6d and Table 2) with RPRs of 2-5 days, similar to PC2 of NO 2 . Figure 8. EOF-EEMD-HT-MSA results for the observed temperature, RH, wind speed, NO 2 , and PM 2.5 . The first two rows depict EOF1 and EOF2, and the third row shows the EEMD-HT-MSA result for PC1 and PC2. In each panel of the third row, the length of the vertical "error bar" shows the RPR of an IMF, while the length of the horizontal bar represents the percentage contribution of the IMF to the total variance of the original data (as such, the horizontal lengths for different IMFs across different PCs can be compared). The blue (red) color indicates diurnal (day-to-day) variation. Table 1. Correlation between PCs and regional mean values in terms of diurnal and day-to-day variability for NO 2 .

EOF-EEMD analyses of meteorological parameters
For comparison, the first three columns in Fig. 8 show the EOF-EEMD results for the observed temperature, RH, and wind speed. The EOF-EEMD result for wind speed (the third column in Fig. 8) is closest to that for NO 2 , with a regionally synchronous pattern (EOF1 and PC1), an NEC-SEC con- Table 2. Correlation between PCs and regional mean values in terms of diurnal and day-to-day variability for PM 2.5 . trasting pattern (EOF2 and PC2), and a dominant IMF with a period of 24 h. The day-to-day wind speed variability is also reflected in the IMFs of PC1 and PC2 with RPRs of 2-5 days, consistent with that for NO 2 . The EOF-EEMD result for wind speed is also fairly comparable with that for PM 2.5 , although the latter shows a dominant IMF (in PC1) with a period of 7 days. These results are consistent with Sect. 3.1 but with a more quantitative analysis on the spatiotemporal scales.
The EOF-EEMD analysis for temperature (the first column in Fig. 8) shows that PC1 contributes 88 % of the total variance, and it is dominated by the IMF with a period of 24 h. The contribution of PC2 is negligible (4 %). For RH (the second column in Fig. 8), PC2 plays a minor role, and there are IMFs of PC1 with periods near 3 and 12 days, contributing to the correlation between RH and PM 2.5 . These results indicate a complex association in the day-to-day variability between temperature-RH and pollutants broadly consistent with the discussion in Sect. 3.1.

General evaluation
The color contours in Fig. 5a-d show the horizontal distributions of NO 2 and PM 2.5 simulated by GEOS-Chem and CMAQ. The model results here are averaged from all days over the time period rather than sampled from days with valid observations. Both models capture the general spatial patterns of observed NO 2 and PM 2.5 , with the heaviest pollution over the north and east. Figure 9 evaluates the regional mean diurnal and day-today variations of modeled pollutant levels over NEC and SEC. Here model data are sampled from days and locations with valid observations. All trends are negligible and have been removed, consistent with the observational analysis. GEOS-Chem underestimates the observations by about 17 µg m −3 (21 µg m −3 over NEC and 13 µg m −3 over SEC) for NO 2 and by 35 µg m −3 over Eastern China (31 µg m −3 over NEC and 41 µg m −3 over SEC) for PM 2.5 averaged over the whole period. The model bias is relatively consistent across individual hours. GEOS-Chem captures the observed diurnal variability for both pollutants as well as the day-today variability of PM 2.5 , although it greatly underestimates the day-to-day variability of NO 2 . More model evaluation statistics are shown in Table 3. Figure 9 also shows that WRF/CMAQ overestimates the nighttime observations by about 30 µg m −3 for NO 2 and 60 µg m −3 for PM 2.5 averaged over Eastern China, although it reproduces the daytime pollutant levels. This means an overestimate of the diurnal range, as is also revealed by the EOF-EEMD analysis in Sect. 4.2. CMAQ captures the dayto-day variability of daily mean NO 2 and PM 2.5 much better than GEOS-Chem (R = 0.63-0.84 versus 0.25-0.37 over NEC and SEC for NO 2 ; and 0.87-0.88 versus 0.55-0.75 for PM 2.5 ). Note that the correlations shown here mainly reflect the model capabilities to capture Eastern China synchronous day-to-day variation, and they do not imply the model performance in simulating the NEC-SEC contrast, which is revealed in Sect. 4.2. More model evaluation statistics are shown in Table 3.

Model evaluation based on the EOF-EEMD analysis
Figures 10 and 11 evaluate the EOF-EEMD results for modeled NO 2 and PM 2.5 , respectively. Prior to the EOF-EEMD analysis, modeled NO 2 and PM 2.5 were sampled at times and locations with valid observations and then underwent the same interpolation procedure to fill in the missing values. In these figures, the last three rows visualize the EOF-EEMD-HT-MSA results in different ways (manifested in different lengths of the horizontal bar for each IMF). In the third row, the variance of each IMF is normalized to the total variance of the original data (NO 2 or PM 2.5 ). In the fourth row, the variance of each IMF is normalized to the variance of its respective PC in order to better visualize the signals from PC2 (which has a much smaller variance than PC1); as such, only the IMFs from the same PC are intercomparable. The fifth row visualizes the absolute variance of each IMF without any normalization.
The first two rows in Fig. 10 show EOF1 and EOF2 of NO 2 . Both GEOS-Chem and CMAQ exhibit a synchronous pattern (EOF1) and an NEC-SEC contrasting pattern (EOF2), consistent with the observation. However, the CMAQ-simulated NEC-SEC contrast in EOF2 is much weaker than the observed. Table 1 shows that for modeled NO 2 , PC1 is highly correlated with Eastern China mean NO 2 for diurnal (R = 1.0 for GEOS-Chem and CMAQ) and day-to-day (R = 0.56-0.97) variability and that PC2 is correlated with NEC NO 2 (R = 0.74-0.81) and anticorrelated with SEC NO 2 (R = −0.47 to −0.32) in terms of day-to-day variability, in line with the observational analysis.
The last three rows in Fig. 10 show that both models underestimate the contribution of day-to-day variability to the total variance of NO 2 (with a shorter length of horizontal bar). For PC1, CMAQ captures the RPR (position of "error bar") and variance (length of horizontal bar) of the observed IMFs fairly well. By comparison, GEOS-Chem underestimates the day-to-day variance (too-small horizontal length) and does not capture its RPR. These results are consistent with the analysis in Sect. 4.1 (Fig. 9) showing that CMAQ is correlated with the observed Eastern China synchronous NO 2 time series much better than GEOS-Chem. For PC2, which reflects the NEC-SEC contrasting pattern, GEOS-Chem outperforms CMAQ in capturing the RPR and variance of the observed day-to-day IMFs (red colored in fourth row). This model characteristic is not seen from the time series discussion in Sect. 4.1. Figure 11 shows that both GEOS-Chem and CMAQ capture the synchronous pattern (EOF1) and the NEC-SEC contrasting pattern (EOF2) of PM 2.5 . For PC1, GEOS-Chem captures the variance of each IMF but not its RPR (especially for the day-to-day IMFs). CMAQ simulates too-strong diurnal IMFs, consistent with its overestimated diurnal cycle discussed in Sect. 4.1. CMAQ outperforms GEOS-Chem in capturing the RPR of day-to-day IMFs of PC1, in line with its better correlation with the observations (Fig. 9). For PC2, GEOS-Chem captures the variance and RPR of the observed day-to-day IMFs better than CMAQ.

Discussion on model deficiencies
WRF/CMAQ overestimates the diurnal variation of NO 2 and PM 2.5 . The causes are multifaceted. The ACM2 PBL mixing scheme in WRF v3.5.1 and CMAQ v5.0.1 (used here) assumes the same value for the eddy diffusivity of momentum (K m ) and heat (K h ), which implies a Prandtl number (P r = K m /K h ) of unity and too-weak mixing under stable atmospheric conditions (i.e., at night). This deficiency has been alleviated in WRF v3.7 and CMAQ v5.1. Also, there is inconsistency between CMAQ and WRF in the Monin-Obukhov length in the surface layer module. This error has been corrected in CMAQ v5.1. For more model update details, please refer to the online document (https: //www.airqualitymodeling.org/index.php/CMAQ_version_ 5.1_(November_2015_release)_Technical_Documentation, last access: 3 September 2018).
GEOS-Chem (the first model layer) underestimates surface NO 2 by about 17 µg m −3 and PM 2.5 by 35 µg m −3 averaged over Eastern China. The underestimate of PM 2.5 is in part because this simulation of GEOS-Chem does not in-   (Fu et al., 2012). Also, the model does not include anthropogenic dust. Furthermore, although the observation stations are close to the ground, the first layer of GEOS-Chem is too thick (130 m) to fully capture the vertical gradient of pollution concentrations. Figure 12 shows Eastern China mean vertical profiles of NO 2 in the two models. The center of the first layer of CMAQ (40 m) is closer to the ground, and the center of its second layer is located at a height similar to the center of the first layer of GEOS-Chem. CMAQ shows a strong vertical gradient of NO 2 from its first to second layer. Had we used the CMAQ-simulated ratio of the first over the second layer to extrapolate GEOS-Chem first-layer NO 2 to 40 m, this would significantly increase the model's "ground-level" NO 2 (by 24 % over NEC and 17 % over SEC) and PM 2.5 (by 45 % and 17 %). However, the extrapolation does not improve the day-to-day correlation to the observations, indicating the important roles played by other factors. See Table 3 for more evaluation statistics. GEOS-Chem (the first model layer) also underestimates the Eastern China synchronous day-to-day variation of NO 2 . When averaged over the 10 lowest layers (below 850 hPa), GEOS-Chem NO 2 captures the day-to-day variability of observed surface NO 2 . This suggests that the model deficiency in day-to-day variability may be specific to the first layer. Moreover, the first layer of GEOS-Chem captures the day-to-day variation of observed NO 2 in the afternoon Figure 12. Eastern China mean NO 2 vertical profiles simulated by GEOS-Chem and CMAQ averaged over 25 October-25 December 2013. The black and red dots denote the center of each vertical layer in the two models. The evening is from 20:00 to 23:00 LT, while the afternoon is from 12:00 to 15:00 LT.
(12:00-15:00 LT, R = 0.9 over NEC and 0.8 over SEC), but the model performance is rather poor in the evening (20:00-23:00 LT, R = 0.1 over NEC and SEC), suggesting nighttime-specific model inadequacies. A further analysis of nighttime ozone and the NO : NO 2 ratio suggests that GEOS-Chem greatly underestimates the observed nighttime ozone by 49.2 % on average over NEC and 54.6 % over SEC, particularly on days when its NO : NO 2 ratio is much greater than the CMAQ-modeled ratio. The mean NO : NO 2 ratio in GEOS-Chem is 1.8 over NEC and 1.4 over SEC, greater than the ratio in CMAQ (1.0 over NEC and 0.4 over SEC) by a factor of 2-3. Overall, it appears that the nighttime chemistry is poorly represented in the first layer of GEOS-Chem, the causes of which warrant further investigations.
The magnitude of emission differences between the two models plays an insignificant role in the differences between their simulated NO 2 or PM 2.5 concentrations. The Chinese anthropogenic emissions in 2010 used in GEOS-Chem (except for NO x ) are close to the emissions in 2013 used in CMAQ (within 10 % for both gases and primary aerosols, mostly within 5 %; see Zheng et al., 2018). NO x emissions in GEOS-Chem are scaled to 2013 using satellite NO 2 data, which further eliminates the differences from those used in CMAQ. The difference in the spatial distribution of emissions is also small (Geng et al., 2017;Zheng et al., 2018).
We further use CMAQ simulations to investigate whether the inclusion of SOA affects our analysis of the spatiotemporal patterns of PM 2.5 . Supplement Fig. S1 compares the time series of CMAQ-simulated PM 2.5 with versus without including SOA. Although SOA contributes about 8-9 µg m −3 of PM 2.5 averaged over the days, inclusion of SOA does not affect the temporal variability. The EOF-EEMD results in Supplement Fig. S2 further confirm that the spatiotemporal scales are very consistent whether or not SOA is included.

Conclusions and discussion
This study uses a newly compiled EOF-EEMD analysis visualization package to evaluate the spatiotemporal variations of hourly NO 2 and PM 2.5 data over Eastern China during fall-winter 2013. The observed NO 2 data exhibit an Eastern China synchronous pattern (EOF1) and a north-south contrasting pattern (EOF2). EOF1 of NO 2 consists of a dominant signal for diurnal variation and a weaker signal for day-today variation. EOF2 of NO 2 is dominated by the day-to-day variation. Although the diurnal cycle is relatively consistent across the days, the day-to-day variation exhibits an RPR at 2-5 days with no constant amplitude, a feature intended to be properly accounted for in the EOF-EEMD analysis. The day-to-day variation is largely driven by cold air passage, as revealed from analyses of observed wind speed, temperature, and RH. In particular, wind speed is most closely related to NO 2 based on an EOF-EEMD analysis and a complementary correlation calculation (R = −0.77 to −0.73 over NEC and SEC).
An EOF-EEMD analysis of the observed PM 2.5 also reveals an Eastern China synchronous (EOF1) and a north-south contrasting (EOF2) pattern. However, the diurnal variation of PM 2.5 is much noisier than that of NO 2 . The day-today variation dominates for PM 2.5 , and it is highly associated with wind speed, especially over NEC (R = −0.60).
Further evaluation of GEOS-Chem and WRF/CMAQ simulations shows that both models simulate the observed EOF1 and EOF2 patterns well. Both models capture the day-to-day variability of PM 2.5 better than that of NO 2 . CMAQ outperforms GEOS-Chem in Eastern China synchronous day-today IMFs, especially for NO 2 , whereas GEOS-Chem better captures the north-south contrasting day-to-day IMFs. CMAQ overestimates the diurnal variability of NO 2 and PM 2.5 such that the IMFs from the EOF-EEMD analysis are overly dominated by the diurnal signal (especially for NO 2 ). This is likely due to its underestimate of PBL mixing, for which deficiencies have been alleviated by the latest model updates. GEOS-Chem underestimates the concentrations of both pollutants due in part to missing secondary organic aerosols and anthropogenic dust (affecting PM 2.5 ) and a first layer too thick (130 m) to capture the vertical gradient near the ground. GEOS-Chem captures the diurnal variations of NO 2 and PM 2.5 . It underestimates the day-to-day variability of nighttime NO 2 likely due to chemical inaccuracies in the first layer.
This study suggests that the EOF-EEMD package is a useful tool providing a simultaneous and quantitative view of the spatial and temporal (both stationary and nonstationary) scales embedded in a dataset. The package can be applied to other chemical, meteorological, or climatic variables and will be freely accessible to the public.
Author contributions. ML, YW, and JL designed research, constructed the EOF-EEMD package, and performed the research. ZW provided the EEMD code in MATLAB, and JC and ZF contributed to revision of EEMD. YS provided pollution measurement data. JS and LZ provided GEOS-Chem code, ML, LC, and YY conducted GEOS-Chem simulations, and BZ, QZ, and YZ provided CMAQ