Evaluation of NU-WRF model performance on air quality simulation under various model resolutions – an investigation within the framework of MICS-Asia Phase III

Horizontal grid resolution has a profound effect on model performance in meteorology and air quality simulations. As a contribution to MICS-Asia Phase III, one of whose goals was to identify and reduce model uncertainty in air quality prediction, this study examined the impact of grid resolution on meteorology and air quality simulation over East Asia, focusing on the North China Plain (NCP) region. The NASA Unified Weather Research and Forecasting (NU-WRF) model was applied at horizontal resolutions of 45, 15, and 5 km. The results revealed that, in comparison with ground observations, no single resolution yielded the best model performance for all variables across all stations. From a regional average perspective (i.e., across all monitoring sites), air temperature modeling was not sensitive to the grid resolution, whereas wind and precipitation simulations were. NU-WRF with the 5 km grid simulated the wind speed best, while the 45 km grid yielded the most realistic precipitation as compared to the site observations. For air quality simulations, finer resolution generally led to better comparisons with observations for O3, CO, NOx, and PM2.5. However, the improvement of model performance on air quality was not linear with the resolution increase. The accuracy of modeled surface O3 with the 15 km grid was greatly improved over that with the 45 km grid. A further increase in grid resolution to 5 km, however, yielded diminishing improvement in O3 prediction. In addition, the 5 km resolution grid showed a large advantage in capturing the frequency of high-pollution occurrences. This was important for the assessment of noncompliance with ambient air quality standards, which was key to air quality planning and management. Balancing modeling accuracy and resource limitations, a 15 km grid resolution was suggested for future MICS-Asia air quality modeling activity if the research region remained unchanged. This investigation also found a large overestimate of ground-level O3 and an underestimate of surface NOx and CO, likely due to missing emissions of NOx and CO.

Published by Copernicus Publications on behalf of the European Geosciences Union.


Introduction
Air pollution is a threat to human health and climate and is detrimental to ecosystems (Anenberg et al., 2010; https://www.who.int/airpollution/ambient/en/, last access: 24 February 2020). Lelieveld et al. (2015) estimated that over 3 million premature deaths worldwide could be attributed to outdoor air pollution in 2010, based on their analysis of data and the results from a high-resolution global air quality model. Since the turn of the 21st century, East Asia has undergone remarkable changes in air quality as observed by satellite and ground stations (Jin et al., 2016; Krotkov et al., 2016). In the past decade, haze (fine particle) pollution has become a household name in China, and many severe haze events have been reported and their formation mechanisms and associations with global- and mesoscale meteorology have been analyzed (Zhao et al., 2013; Huang et al., 2014; Gao et al., 2016; Cai et al., 2017; Zou et al., 2017). Meanwhile, ground-level ozone has been a major air quality concern in China (Wang et al., 2017; Lu et al., 2018), Japan (Akimoto et al., 2015), and South Korea (Seo et al., 2014). In combination with observations from various platforms, chemical transport models (CTMs) remain an important tool to understand mechanisms, to investigate spatial-temporal distributions, and to design feasible control strategies for air pollution. However, CTM uncertainties persist (e.g., Carmichael et al., 2008), and the interpretation of any model results requires caution and careful analysis.
Intermodel comparison provides a valuable way to understand model uncertainties and sheds light on model improvements. With these as two of its major goals, the Model Inter-Comparison Study for Asia (MICS-Asia) was initiated in 1998. Since then MICS-Asia has gone through three phases with emphasis on various aspects of air pollution. Phase I focused on long-range transport and deposition of sulfur over East Asia (Carmichael et al., 2002). Phase II expanded the analysis to more pollutants, including nitrogen compounds, particulate matter, and ozone, in addition to sulfur (Carmichael et al., 2008). In Phase III, MICS-Asia concentrated on three topics, the first of which aimed at identifying strengths and weaknesses of current air quality models to provide insights into reducing uncertainties. A total of 14 CTMs (13 regional and 1 global) participated in the coordinated model experiment, which simulated air quality over Asia throughout the year 2010. Due to the constraints of computing resources among participating modeling groups, a 45 km horizontal resolution was adopted by every team to run the year-long experiment.
This relatively coarse spatial resolution raises the question of how representatively the model can resolve key issues relevant to air quality and its planning and regulation, e.g., heterogeneous emissions, inhomogeneous land cover, and meteorology. For example, Valari and Menut (2008) explored the question using the CHIMERE chemistry transport model at various horizontal resolutions over Paris. They found that the ozone simulation was especially sensitive to the resolution of emissions. However, the benefit of increasing emission resolution to improve ozone forecast skill was not monotonic, and at a certain point the forecast accuracy decreased upon further resolution increase. Using the Weather Research and Forecasting-Chemistry (WRF-Chem) model with various horizontal resolutions (3-24 km) over Mexico City, Tie et al. (2010) concluded that a 1 to 6 ratio of grid resolution to city size appeared to be a threshold for improving ozone forecasting skill over megacity areas: the forecast would be improved significantly when model resolution was below this threshold value. Contrary to Valari and Menut (2008), Tie et al. (2010) suggested that the meteorology changes associated with the grid size choice played a more prominent role in contributing to the improvement of ozone forecast skill. More recently, Neal et al. (2017) employed a high-resolution (12 km) air quality model with high-resolution emissions within the Met Office's Unified Model for air quality forecasting (AQUM) over Great Britain. They found that, compared with a regional composition-climate model (RCCM, 50 km horizontal resolution), AQUM significantly improved the forecast accuracy of primary pollutants (e.g., NO2 and SO2) but less obviously that of secondary pollutants like ozone. A drawback of their conclusion was that the chemical mechanisms and photolysis rates utilized in AQUM and RCCM were different, which made it difficult to attribute the changes in forecast skill to resolution alone.
Lee et al. (2018) examined the importance of aerosol-cloud-radiation interactions to precipitation and the impact of model resolution on key meteorological processes that affected precipitation using the Advanced Research WRF model. They found that a coarse model resolution would lower updrafts, alter cloud properties (e.g., mass, condensation, evaporation, and deposition), and reduce cloud sensitivity to ambient aerosol changes. They further concluded that the uncertainty associated with resolution was much larger than that related to cloud microphysics parameterization. The resultant change in meteorological conditions would trigger an air quality response as well.
Despite this progress, studies of the impact of model resolution on local air quality over Asia are rare. Taking advantage of the MICS-Asia platform, we examined the issue over the MICS-Asia domain using the NASA Unified WRF (NU-WRF) model (Tao et al., 2016; Peters-Lidard et al., 2015), focusing on the North China Plain (NCP), which was plagued by frequent heavy air pollution episodes. The investigation would not only assist in gaining insights into how model horizontal resolution affects simulated meteorology and air quality but also help the MICS-Asia community characterize the uncertainties resulting from model resolution. The latter would be especially valuable since most MICS-Asia Phase III model simulations were conducted at a specific horizontal resolution (i.e., 45 km for most participants).

NU-WRF model and experiment design
NU-WRF is an integrated regional Earth-system modeling system developed from the advanced research version of WRF-Chem (Grell et al., 2005), which represents atmospheric chemistry, aerosol, cloud, precipitation, and land processes at convection-permitting spatial scales (typically 1-6 km). NU-WRF couples the community WRF-Chem with NASA's Land Information System (LIS), a software framework including a suite of land surface models (LSMs) that are driven by satellite and ground observations and reanalysis data (Kumar et al., 2006; Peters-Lidard et al., 2007). It also couples the Goddard Chemistry Aerosol Radiation and Transport (GOCART) bulk aerosol scheme (Chin et al., 2002, 2007) with the Goddard radiation (Chou and Suarez, 1999) and microphysics schemes, which allows for fully coupled aerosol-cloud-radiation interaction simulations. In addition, NU-WRF links to the Goddard Satellite Data Simulator Unit (G-SDSU), which converts simulated atmospheric profiles, e.g., clouds, precipitation, and aerosols, into radiance or backscatter signals that can be directly compared with satellite level-1 measurements at a relevant spatial and temporal scale (Matsui et al., 2009, 2013). In this study, NU-WRF was employed to carry out the model simulations at various horizontal resolutions using the same set of physical and chemical configurations. A nested domain setup was configured for this investigation as shown in Fig. 1. The 45 km resolution mother domain (d01) covered the MICS-Asia Phase III study region. The nested 15 km (d02) and 5 km (d03) domains covered East Asia and the NCP, respectively. A one-way nesting approach was applied so that the values of the mother domains were independent of those of the respective nested domains. This analysis focused on the NCP and its adjacent areas, with a total area of over 1.1 million square kilometers.
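To give a rough sense of what the three resolutions imply for grid size, the following sketch computes the refinement ratio between successive nests and the approximate number of grid cells needed to cover the 1.1 million square kilometer analysis area quoted above. It assumes square cells and ignores the actual domain shapes and map projection, so the counts are illustrative only and are not the real NU-WRF domain dimensions; the function and variable names are hypothetical.

```python
def cells_needed(area_km2, dx_km):
    """Approximate number of square cells of side dx_km that cover area_km2."""
    return area_km2 / (dx_km ** 2)


# Resolutions of d01, d02, and d03 and the analysis area, as quoted in the text
resolutions_km = [45, 15, 5]
area_km2 = 1.1e6

# Refinement ratio between each parent grid and its nest (45 -> 15 -> 5 km)
nest_ratios = [coarse // fine
               for coarse, fine in zip(resolutions_km, resolutions_km[1:])]

# Illustrative cell counts for the analysis area at each resolution
cell_counts = {dx: round(cells_needed(area_km2, dx)) for dx in resolutions_km}

print(nest_ratios)
print(cell_counts)
```

Each threefold refinement multiplies the number of horizontal cells by nine, which is one reason the balance between accuracy and computing resources matters when choosing a resolution.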
The key NU-WRF configurations included the updated Goddard cumulus ensemble microphysics scheme, the new Goddard long- and shortwave radiation schemes (Chou and Suarez, 1999), the Monin-Obukhov surface layer scheme, the unified Noah land surface model (Ek et al., 2003) with LIS initialization (Peters-Lidard et al., 2015), the Yonsei University planetary boundary layer scheme (YSU; Hong et al., 2006), the new Grell cumulus scheme developed from the ensemble cumulus scheme (Grell and Devenyi, 2002) that allowed subsidence spreading, the second-generation regional acid deposition model (RADM2; Stockwell et al., 1990; Gross and Stockwell, 2003) for trace gases, and GOCART for aerosols. In this investigation, the option of fully coupled GOCART-Goddard microphysics and radiation schemes was implemented to account for the aerosol-cloud-radiation interactions.
Anthropogenic emissions were from the mosaic Asian anthropogenic emission inventory (MIX; Li et al., 2017) that was developed for MICS-Asia Phase III. The MIX inventory was at the 0.25° by 0.25° resolution and was projected to the study domain under the 45, 15, and 5 km horizontal resolutions. Fire emissions were from the 0.5° by 0.5° Global Fire Emissions Database version 3 (GFEDv3; van der Werf et al., 2010; Mu et al., 2011) and were also projected to the targeted region. Biogenic emissions were computed online using the Model of Emissions of Gases and Aerosols from Nature version 2 (MEGAN2; Guenther et al., 2006). Dust and sea salt emissions were also calculated online using the dynamic GOCART dust emissions scheme (Kim et al., 2017).

Comparisons with observations
The NU-WRF results of different horizontal resolutions have been compared with ground observations using the following statistical measures.
Correlation coefficient:

$$ r = \frac{\sum_{i=1}^{N} (m_i - \bar{m})(o_i - \bar{o})}{\sqrt{\sum_{i=1}^{N} (m_i - \bar{m})^2}\,\sqrt{\sum_{i=1}^{N} (o_i - \bar{o})^2}} $$

Mean bias:

$$ \mathrm{MB} = \frac{1}{N} \sum_{i=1}^{N} (m_i - o_i) $$

Normalized mean bias:

$$ \mathrm{NMB} = \frac{\sum_{i=1}^{N} (m_i - o_i)}{\sum_{i=1}^{N} o_i} \times 100\,\% $$

Root mean square error:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (m_i - o_i)^2} $$

Normalized standard deviation:

$$ \mathrm{NSD} = \frac{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (m_i - \bar{m})^2}}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} (o_i - \bar{o})^2}} $$

Here, $m_i$ and $o_i$ denote the modeled and observed values at time-space pair $i$, and $N$ is the number of pairs; $\bar{m}$ and $\bar{o}$ represent the average modeled and observed values, respectively. The term r describes the strength and direction of a linear relationship between two variables; a perfect correlation has a value of 1. NMB and MB depict the mean deviation of modeled results from the respective observations. RMSE measures the overall magnitude of the model-observation differences, and NSD is the ratio of the modeled to the observed standard deviation. A perfect model simulation yields r and NSD of 1 and MB, NMB, and RMSE of 0.

In total there were 77 sites reporting daily average values of wind speed (Wind), air temperature (Temp), and relative humidity (RH), as well as daily total precipitation (Precip). Figure 2a shows the Taylor diagram summarizing r, NMB, and NSD of the comparison of regional mean (average of observations from 77 sites) daily meteorological variables. Along the azimuthal angle is r. NSD is proportional to the radial distance from the origin. NMB (sign and range) is represented by the geometric shapes. The statistical measures under 45, 15, and 5 km resolutions are represented by the colors blue, green, and red, respectively. The closer to the point Obs on the Taylor diagram and the smaller the NMB, the better the model performance is. It can be seen that the model horizontal resolution had little impact on surface air temperature simulation. Regardless of the resolution selected, the modeled temperature correlated very well with the corresponding observations, with r values all approaching 0.99. NU-WRF also reproduced the observed temperature variations well, with NSD ranging between 1.05 and 1.10. Meanwhile, NMB was within ±1 % for all tested resolutions. RMSEs were 1.13, 2.26, and 2.02 K for the 45, 15, and 5 km grids, respectively. The insensitivity of surface air temperature to the choice of model resolution was also reported by Gao et al. (2017), who used WRF to explore the issue for summer seasons at the 36, 12, and 4 km resolutions.
On the other hand, the horizontal resolution had a remarkable effect on surface wind speed, as shown in Fig. 2a. At the 5 km resolution, NU-WRF yielded an r value of 0.75, NMB of approximately 54 %, and NSD of 1.78. NU-WRF simulated a larger variation in wind than the observations showed. For comparison, the values of r, NMB, and NSD for 15 and 45 km were 0.54, 95 %, and 2.14 and 0.71, 103 %, and 2.01, respectively. The respective RMSEs of the 45, 15, and 5 km grids were 2.87, 2.82, and 1.67 m s−1. It was apparent that the 5 km resolution gave the overall best wind speed simulation compared to the observations, though NU-WRF overestimated the surface wind speed in all cases. The wind speed overestimate, especially under low wind conditions, was a common problem in all MICS-Asia participating models and other weather forecast models. This overestimate stemmed from many factors, including but not limited to terrain data uncertainty, poor representation of urban surface effects, and horizontal and vertical grid resolutions. Yu (2014) in her doctoral dissertation pointed out that surface wind simulation would be improved by using more accurate land-use data. This is expected since surface wind is largely dependent on land surface characteristics, such as albedo and roughness. High-resolution grids tend to have more accurate land-use representation because they resolve the inhomogeneous nature of land type.
NU-WRF simulations at all three resolutions yielded similar reproductions of the observed variations in relative humidity (RH), with NSD ranging between 0.87 and 0.88. The modeled RH was less variable than the observed one. While the modeled RH at the 45 km resolution (r = 0.84) correlated better with the observations than that at the finer resolutions (approximately 0.67 for both 15 and 5 km resolutions), the NMB at this resolution was the largest in magnitude (−17 %) among the three cases. The NMBs for the 15 and 5 km cases were −10 % and −12 %, respectively. Overall, NU-WRF underestimated the surface RH. The respective RMSEs for 45, 15, and 5 km resolutions were 13.2 %, 12.6 %, and 13.3 %. The simulation with the 15 km grid appeared to yield the overall best RH among the three cases.
It was interesting to find that NU-WRF simulated the precipitation best, as directly compared to the rain gauge data, when using the 45 km grid. At this resolution, NU-WRF gave r of 0.81, NMB of 1.7 %, RMSE of 3.2 mm d−1, and NSD of 1.41. For comparison, the values of r, NMB, RMSE, and NSD for 15 and 5 km were 0.53, 76 %, 5.7 mm d−1, and 1.71 and 0.52, 80 %, 5.8 mm d−1, and 1.72, respectively. Finer resolutions indeed yielded worse results in precipitation modeling as compared to the site data. This may be because precipitation was a very heterogeneous phenomenon: finer model grids had larger chances of missing a precipitation event or hitting an event that was nonexistent, leading to a greater overall bias and a poorer correlation. In contrast, Gao et al. (2017) compared their WRF modeled results to gridded precipitation based on daily rain gauge data that were gridded to the 0.125° resolution using the synergraphic mapping algorithm with topographic adjustment to the monthly precipitation climatology (Maurer et al., 2002). They reported that the modeled precipitation at the 4 km resolution was much improved over that at the coarser 36 or 12 km resolutions.
The time series of daily mean wind speed, air temperature, and RH, as well as daily total precipitation averaged over the monitoring sites are illustrated in Fig. S1 in the Supplement. They echoed the above findings based on the Taylor diagram. It appeared that NU-WRF consistently overestimated surface wind speed throughout the year, with large overestimates occurring in fall and winter, while it severely underestimated RH in summer. Uncertainty in the representation of land surface characteristics at least partially explained these biases (Yu, 2014; Gao et al., 2018). High-resolution grids tended to reduce the uncertainty in land surface representation, which would be helpful for improving model performance in meteorology simulation. A more detailed exploration of the model-observation mismatch would be insightful but was beyond the scope of this research.
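The statistical measures used throughout this comparison (r, MB, NMB, RMSE, and NSD) can be sketched in a few lines of Python. This is a minimal illustration that assumes the standard definitions of these measures, with NSD taken as the ratio of the modeled to the observed standard deviation; it is not the authors' evaluation code, and the function name is hypothetical.

```python
import math


def evaluation_stats(model, obs):
    """Return r, MB, NMB (%), RMSE, and NSD for paired model/observation values."""
    if len(model) != len(obs) or len(obs) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    n = len(obs)
    m_bar = sum(model) / n
    o_bar = sum(obs) / n
    # Sums of squared deviations and cross-deviations about the means
    ss_m = sum((m - m_bar) ** 2 for m in model)
    ss_o = sum((o - o_bar) ** 2 for o in obs)
    cov = sum((m - m_bar) * (o - o_bar) for m, o in zip(model, obs))
    return {
        "r": cov / math.sqrt(ss_m * ss_o),
        "MB": m_bar - o_bar,
        "NMB": 100.0 * (sum(model) - sum(obs)) / sum(obs),
        "RMSE": math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n),
        "NSD": math.sqrt(ss_m / ss_o),
    }


# Toy example: a model that doubles every observation is perfectly
# correlated (r = 1) but biased high (NMB = 100 %) and too variable (NSD = 2).
stats = evaluation_stats([2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 4.0])
print(stats)
```

The toy case shows why the Taylor diagram reports r, NSD, and NMB together: a high correlation alone, as found for wind speed, does not rule out a large bias or an exaggerated variability.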

Air quality
The differences seen in the aforementioned meteorology would cause varied performance of air quality simulations at various model horizontal resolutions. In this study, the NU-WRF simulated surface air quality was compared to the corresponding observations. The 2010 ground-level air quality data were obtained from the Chinese Ecosystem Research Network (CERN; http://www.cern.ac.cn, last access: 25 February 2020) operated by the Institute of Atmospheric Physics of the Chinese Academy of Sciences. There were 25 monitoring sites distributed within a 500 km by 500 km area centered on Beijing, China (open diamonds in Fig. 1b). The site locations and characteristics are listed in Table 1. Of the 25 sites, 22 were in either an urban or a suburban setting, with the rest being in a rural setting. Each site reported hourly concentrations of at least one of the following six pollutants: ozone (O3), nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), and particulate matter with aerodynamic diameters less than 2.5 and 10 µm (PM2.5 and PM10).
(a) Regional average
First, the regional mean (averaged across 25 sites) daily surface concentrations from both observations and simulations, paired in space and time, were calculated. The r, NMB, and NSD were then computed and illustrated in a Taylor diagram (Fig. 2b, c).
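The space-time pairing behind the regional mean can be sketched as follows. This is one plausible reading of "paired in space and time", assumed here for illustration: on each day, the modeled value at a site enters the regional average only if that site also reported an observation. The site names, data layout, and function name are hypothetical and do not come from the paper.

```python
def regional_daily_means(model_by_site, obs_by_site):
    """Average model and observations across sites for each day.

    Only site-days with an available observation (not None) are used, so the
    modeled and observed regional means are built from the same samples.
    """
    n_days = len(next(iter(obs_by_site.values())))
    model_mean, obs_mean = [], []
    for day in range(n_days):
        pairs = [
            (model_by_site[site][day], obs_by_site[site][day])
            for site in obs_by_site
            if obs_by_site[site][day] is not None
        ]
        if not pairs:
            # No site reported on this day: leave a gap in both series
            model_mean.append(None)
            obs_mean.append(None)
            continue
        model_mean.append(sum(m for m, _ in pairs) / len(pairs))
        obs_mean.append(sum(o for _, o in pairs) / len(pairs))
    return model_mean, obs_mean


# Toy example with two hypothetical sites and two days; site A has no
# observation on the second day, so only site B contributes to that day.
model = {"A": [10.0, 20.0], "B": [30.0, 40.0]}
obs = {"A": [8.0, None], "B": [12.0, 20.0]}
model_mean, obs_mean = regional_daily_means(model, obs)
print(model_mean, obs_mean)
```

Pairing before averaging keeps a site with sparse data from biasing the modeled regional mean relative to the observed one.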
The six pollutants can be put into two groups: one most relevant to ozone photochemistry, including O3, NOx, and CO, and the other closely tied to aerosols, including SO2, PM2.5, and PM10. It was readily seen that the r values of O3, NOx, and CO were not very sensitive to the choice of model horizontal resolution. For O3, the r values for the 45, 15, and 5 km grids were all around 0.85. The respective r values were 0.84, 0.81, and 0.80 for NOx and 0.80, 0.75, and 0.73 for CO. In general, however, NU-WRF reproduced the observed variations in O3, NOx, and CO better with a fine resolution than with a coarse one. An NSD of 1.23 for O3 at the 5 km resolution was the closest to 1 among the three resolutions (1.24 for 15 km and 2.01 for 45 km). NSDs were 0.40, 0.36, and 0.46 for NOx and 0.24, 0.27, and 0.31 for CO, under the 45, 15, and 5 km resolutions, respectively, suggesting that simulations with the finest resolution tended to reproduce the observed variations better than the ones with coarser resolutions for these three trace gases. Meanwhile, NU-WRF yielded the smallest bias when employing the fine-resolution grid. NMBs for O3 decreased from 115 % to 92 % when the grid resolution increased from 45 to 5 km. NMBs were −38 %, −30 %, and −18 % for NOx and −61 %, −55 %, and −51 % for CO, under the 45, 15, and 5 km resolutions, respectively. It was apparent that NU-WRF overestimated surface O3 but underestimated NOx and CO, consistent with the findings in the companion MICS-Asia Phase III studies that based their results on ensemble model simulations (Kong et al., 2020). The majority of the air quality monitoring sites used in this study were in an urban setting, which typically falls in a VOC-limited regime. This meant that the underestimate of NOx would reduce the titration that consumed surface O3, leading to its overestimate. We further analyzed the model bias for daytime (08:00-18:00 LT) vs. nighttime. It was found that the nighttime biases for surface O3 and NOx were approximately 2-4 times higher than those of the daytime, consistent with the finding that insufficient NOx titration caused an overestimate of modeled surface O3.
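The daytime vs. nighttime comparison described above can be sketched as a simple split of hourly model-minus-observation differences by local hour. This is a minimal illustration rather than the authors' procedure: the treatment of 18:00 as the first nighttime hour is an assumption, and the function name and sample values are hypothetical.

```python
def day_night_mean_bias(hours, model, obs, day_start=8, day_end=18):
    """Split hourly model-minus-observation differences into day and night.

    hours: local hour of day (0-23) for each pair; None marks a missing value.
    Treating day_end as exclusive is an assumption of this sketch.
    """
    day, night = [], []
    for hour, m, o in zip(hours, model, obs):
        if m is None or o is None:
            continue  # keep only complete model-observation pairs
        target = day if day_start <= hour < day_end else night
        target.append(m - o)

    def mean(values):
        return sum(values) / len(values) if values else None

    return {"day": mean(day), "night": mean(night)}


# Toy hourly O3 values in ppbv (hypothetical): the nighttime bias is larger
bias = day_night_mean_bias(
    hours=[2, 9, 12, 22],
    model=[50.0, 60.0, 70.0, 40.0],
    obs=[10.0, 50.0, 65.0, 20.0],
)
print(bias)
```

A nighttime bias several times the daytime one, as in this toy case, is the pattern the text attributes to insufficient NOx titration when no photochemical O3 is being formed.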
NU-WRF simulated smaller variations in the three aerosol-related pollutants than observed under all applied horizontal resolutions. The NSDs ranged from 0.56 (for SO2 at 15 km resolution) to 0.96 (for PM2.5 at 45 km resolution). Though it reproduced the observed SO2 variations the best (NSD = 0.68) with the 5 km resolution, NU-WRF yielded the best NSD for PM2.5 (0.96) and PM10 (0.92) when the 45 km resolution was employed. Similar to the three trace gases relevant to surface O3 formation, the choice of model resolution had a limited effect on the r statistics. The r values varied from 0.70 (45 km resolution) to 0.76 (both 15 and 5 km resolution) for surface SO2 and from 0.68 (45 km resolution) to 0.63 (5 km resolution) for PM2.5. The r values for PM10 were all around 0.58 under the selected resolutions. The impact of model resolution on NMBs was mixed: while the smallest NMBs for SO2 (20 %) and PM10 (−19 %) were achieved using the 45 km resolution, the smallest NMB for PM2.5 (1.5 %) was observed at the 15 km resolution. The model underestimate of PM10 was consistent with the findings of the companion investigation using the multimodel ensemble analysis (Chen et al., 2019). Figure S2 shows the time series of daily mean air quality averaged over the monitoring sites for the year 2010. The constant underestimate of CO throughout the year, severe underestimate of NOx in fall and winter, and large underestimate of SO2 in summer all indicated that the emissions inventory may be incomplete, agreeing with the reports by Kong et al. (2020) and Li et al. (2019). In the future, improvement of the emissions inventory accuracy and a more realistic temporal emissions distribution may help improve NU-WRF performance in simulating O3 photochemistry.

(b) Individual sites
The daily average concentrations of each pollutant were calculated and paired in space and time at each air quality monitoring site. Then the statistics at each individual site were computed. Figure 3 illustrates the comparisons of MB, RMSE, and correlation coefficient r of surface O3 from different horizontal resolutions at each site. No single resolution yielded the best correlation across all sites. For example, the simulation with the 45 km horizontal resolution gave the best correlation over sites BD, CFD, CZ, HJ, SJZ, SQL, TG, TJ, TS, XH, XL, YF, YJ, and ZJK. At the other end of the spectrum, BJT, DT, and LTH achieved the best correlation when the 5 km grid was applied. QHD saw the best correlation with the 15 km resolution. In any case, however, the variations in r values from different horizontal resolutions at each site were small (less than 0.04). On the other hand, NU-WRF yielded the worst MB and RMSE when employing the 45 km resolution grid, while MB and RMSE were similar across simulations with 15 and 5 km resolutions. Typically, at sites with urban or suburban settings, MB (RMSE) based on the 45 km grid was approximately 15 %-30 % (20 %-40 %) higher than that of the 15 or 5 km grids. It appeared that NU-WRF tended to perform better in ground-level O3 simulation when the horizontal resolution was increased from 45 to 15 km, but a further increase in resolution had a diminished impact on improving surface O3 modeling. This was consistent with the finding by Valari and Menut (2008), who concluded that the benefit of finer-horizontal-resolution grids for improving surface O3 forecast skill would diminish at a certain point.
Figure 4 shows the same comparisons of MB, RMSE, and r for PM2.5. Only 10 sites reported PM2.5 measurements over the year 2010. In general, the NU-WRF simulation with the 45 km grid correlated better with the respective observations than the other two resolutions. The only exception was site BD, which saw the best correlation for the 5 km resolution. MB and RMSE results were mixed, with no single resolution giving superior results across all sites. Over two rural sites (LS and XL), the simulations with the 15 or 5 km grids yielded remarkably smaller MB but correlated less with the corresponding observations than the one with the 45 km grid. Over eight urban and suburban sites, BD, SQL, and TG experienced the smallest MB when employing the 5 km resolution grid, while TG, TJ, and XH saw the least bias at the 45 km resolution. The smallest MB at BJT and LTH occurred using the 15 km grid.
At the individual site level, the impact of grid resolution on surface NOx and CO (figures not shown) modeling was similar to that at the regional average. Finer-resolution simulation generally reduced MB and RMSE. The results of the 45 km grid always had the largest bias. The underestimates of NOx at least partially explained the overestimate of surface O3 at each site due to a less efficient NO titration of O3. This suggested that higher-resolution modeling with a more accurate spatial representation of NOx emissions would help improve model performance on surface O3 simulations.
The signals for SO2 and PM10 (figures not shown) simulations were mixed as well. For example, the largest bias for SO2 simulation over sites BD, CZ, GA, HS, LS, QA, QHD, XH, XL, YF, and YJ occurred when applying the 45 km grid, while the maximum bias over BJT, DT, HJ, LF, LTH, SJZ, SQL, TG, TJ, TS, ZJK, and ZZ happened at the 5 km resolution. Sites CD and CFD saw the largest bias at the 15 km resolution. Unlike PM10, which was almost always underestimated at each site regardless of grid resolution, SO2 was overestimated at 18 out of 25 sites and underestimated at the remaining 7 sites.
An effort was made to identify the potential reasons for the model-observation discrepancy. First, as discussed previously, the spatial distribution of emissions was one key to determining air quality forecast accuracy. Figure S3 shows the typical time evolutions of surface O3 and NOx over rural (XL) and urban (QHD) sites. It can readily be seen that NOx was underestimated at the urban site but overestimated at the rural site. The coarser the grid resolution was, the more severe the underestimates and overestimates were. This indicated that the 45 km resolution tended to smooth out emissions, making urban sites (or emission centers) less polluted but rural sites more polluted. This in turn led to an overestimate of surface O3 over the urban sites, mainly due to the reduced NOx titration effect, especially at night when there was no photochemical O3 formation. The statistics showed that the bias of the modeled daytime (08:00-18:00 LT) average surface O3 was 30 %-90 % smaller than that of the daily average at the urban sites, no matter which grid resolution was applied. This suggested that in the future high-resolution emissions, especially a proper representation of emission gradients, would be helpful in improving air quality prediction. The effect of emission gradients associated with the grid resolution will be further discussed in the inter-resolution comparisons section.
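The smoothing effect of a coarse grid on emissions can be illustrated with a toy block average. The sketch below is not the emission processing used with NU-WRF; it simply averages non-overlapping blocks of a fine-grid field, with hypothetical emission values in arbitrary units and a hypothetical function name, to show how an urban peak is diluted while the surrounding rural cells are raised.

```python
def coarsen(field, factor):
    """Average non-overlapping factor x factor blocks of a 2-D list."""
    rows, cols = len(field), len(field[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for i in range(0, rows, factor):
        coarse_row = []
        for j in range(0, cols, factor):
            block = [
                field[a][b]
                for a in range(i, i + factor)
                for b in range(j, j + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical fine-grid emissions: one urban cell (10) surrounded by
# rural cells (1). A factor of 3 mimics going from a 5 km to a 15 km grid.
fine = [
    [1.0, 1.0, 1.0],
    [1.0, 10.0, 1.0],
    [1.0, 1.0, 1.0],
]
coarse = coarsen(fine, 3)
print(coarse)
```

After averaging, every location inside the block sees the same value of 2: the urban cell loses most of its peak while each rural cell doubles, even though the block total is unchanged. This mirrors the urban underestimate and rural overestimate of NOx described above.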
Next, the driving meteorology, especially wind, was important for accurately forecasting air quality over coastal areas that bore sharp thermal contrasts. The QHD site is located approximately 5 km from the ocean and is subject to sea breeze effects. A detailed analysis of meteorology and air quality over QHD was conducted. The results indicated that the choice of grid resolution had large impacts on model simulations at this coastal site. The selection of the 5 km grid reduced the biases of both surface temperature and wind speed. The bias of temperature changed from 1.22 K (45 km) to −0.42 K (15 km) and further to −0.31 K when the 5 km grid was applied. The biases of surface wind speed for the 45, 15, and 5 km grids were 3.72, 4.19, and 1.95 m s−1, respectively. The improvement of meteorology forecasts helped reduce the biases of air quality modeling. The biases of O3/NOx for the 45, 15, and 5 km resolution grids were 29.94/−22.46, 24.09/−20.29, and 23.97/−17.95 ppbv, respectively. The improvement using the 15 km grid over the 45 km grid was remarkable, but that using the 5 km grid over the 15 km grid was marginal. The result emphasized the importance of high-resolution modeling to improvements in air quality forecast skill, especially in coastal and complex-terrain areas (e.g., QHD and XL).

(c) Extreme values
High concentrations of air pollutants are of more concern because of their adverse health effects on both human beings and ecosystems. High pollutant concentrations also pose a greater risk for noncompliance to the ambient air quality standards. Therefore, evaluations of impacts of grid resolution on extreme concentrations of air pollutants are desirable. Figure 5 displays the probability density function distributions of six pollutants based on hourly surface concentrations across the monitoring sites. This analysis was focused on high pollutant concentrations with the cutoff values for CO, O 3 , NO x , SO 2 , PM 2.5 , and PM 10 being 1.1 ppmv, 60 ppbv, 25 ppbv, 5.5 ppbv, 15 µg m −3 , and 30 µg m −3 , respectively. It appeared that NU-WRF, regardless of the grid resolutions, failed to simulate surface CO with concentrations of more than 4 ppmv, likely due to the underestimate of CO emissions (Kong et al., 2020). The grid resolution appeared to have limited impact on surface PM 10 simulations when PM 10 concentrations were more than 200 µg m −3 . On the other hand, the grid resolution had a large impact on NU-WRF's capability to simulate high surface concentrations of O 3 , NO x , SO 2 , and PM 2.5 . For surface O 3 with concentrations of more than 100 ppbv, the NU-WRF results with the 5 km grid appeared to better agree with the probability distribution of observations. For surface NO x with concentrations of more than 70 ppbv, the NU-WRF results with the 5 km resolution grid better mimicked the observed distribution. Modeling with the 5 km grid also yielded the best results of distributions, in comparison to the respective observations, of SO 2 with concentrations of more than 45 ppbv and of PM 2.5 with concentrations greater than 120 µg m −3 . 
Table 2 lists the occurrences of violations of China's national ambient air quality standards (NAAQSs) for the six pollutants from both observations and simulations. Columns Class 1 and Class 2 list the standards for rural and urban-suburban sites, respectively, and column Frequency indicates the time integration of each NAAQS. It was apparent that NU-WRF failed to report CO violations at any grid resolution. No CO NAAQS violation was simulated, but the observations showed that surface CO exceeded the national standard more than 1000 times. NU-WRF underestimated the NAAQS exceedances of NOx and SO2. A higher-resolution grid appeared to be able to catch more violations, although the modeled results at the 5 km resolution only captured 33 % and 10 % of the observed exceedances of NOx and SO2, respectively. NU-WRF overestimated the NAAQS exceedances of surface O3 and PM2.5. The fine grid resolution (i.e., 5 km) appeared to largely reduce the overestimation of surface O3 exceedances as compared to the 45 km grid but only marginally compared to the 15 km grid. Compared to the number of observed occurrences of surface O3 standard violation (3684), the simulated exceedances were 5.7, 1.8, and 1.7 times higher when employing the 45, 15, and 5 km resolution grid, respectively. The observations showed 1343 occurrences of surface PM2.5 exceedances, while the modeled exceedances were 377, 267, and 231 more for the 45, 15, and 5 km grids, respectively. As for surface PM10, the modeled exceedances were approximately 27 %, 43 %, and 41 % less than the observed ones for the 45, 15, and 5 km grids, respectively.
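As a minimal sketch of the exceedance bookkeeping, the following counts values above a limit and converts the PM2.5 exceedance counts quoted above into relative excesses over the observed count. The function names and the strict "greater than" comparison are assumptions; the limit and sample values in the small counting example are hypothetical and are not NAAQS values.

```python
def count_exceedances(values, limit):
    """Number of samples strictly above the limit (strictness is an assumption)."""
    return sum(1 for v in values if v > limit)


def relative_excess_pct(modeled_count, observed_count):
    """Modeled exceedances relative to observed ones, in percent."""
    return 100.0 * (modeled_count - observed_count) / observed_count


# Hypothetical series and limit, only to show the counting step.
sample = [40.0, 80.0, 75.0, 120.0]
n_exceed = count_exceedances(sample, limit=75.0)

# Counts quoted in the text: 1343 observed PM2.5 exceedances, and
# 377, 267, and 231 additional modeled exceedances per grid.
observed_pm25 = 1343
extra_modeled = {"45 km": 377, "15 km": 267, "5 km": 231}
excess_pct = {
    grid: relative_excess_pct(observed_pm25 + extra, observed_pm25)
    for grid, extra in extra_modeled.items()
}
for grid, pct in excess_pct.items():
    print(grid, round(pct, 1))
```

The 45 and 5 km results round to about 28 % and 17 %, matching the percentages quoted for PM2.5 in the summary.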

Inter-resolution comparisons
It is informative to compare the NU-WRF results of different horizontal resolutions. This, in addition to the discussion in Sect. 3.1.2b, can help us understand the reasons why model resolution matters.

Emissions
There were two types of emissions applied in this study. One was the prescribed emissions of anthropogenic and wildfire sources and the other was emissions computed online using real-time meteorology (or dynamic emissions) including emissions from biogenic sources, dust sources, and sea spray. Amounts of and temporal variations in dynamic emissions depended on surrounding environmental conditions. For example, air temperature and solar radiation regulate biogenic emissions (Guenther et al., 2006). Surface wind speed plays a major role in both dust (Ginoux et al., 2001;Chin et al., 2002) and sea salt emissions (Gong, 2003).
For the prescribed emissions, the differences in domain total mass of each grid were small (less than 5 %). However, the emission gradient around sources of a fine-resolution grid appeared to be sharper than that of a coarse-resolution grid. This meant that a coarse grid tended to distribute the prescribed emissions more evenly into the domain, while a fine grid tended to produce more extreme concentrations of primary pollutants (emitted directly from a source) such as NO x and SO 2 , as shown in Table 2.
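The smoothing effect can be illustrated with a toy block-averaging example: a single point source on a fine grid is aggregated to coarser cells, which conserves the total mass but flattens the peak. This is a sketch under simplifying assumptions (uniform cell areas, simple block averaging, arbitrary units) and not the emission preprocessing actually used with NU-WRF.

```python
def coarsen(field, factor):
    """Block-average a 2-D list of fluxes by an integer factor along both axes."""
    n_rows, n_cols = len(field), len(field[0])
    coarse = []
    for i in range(0, n_rows, factor):
        row = []
        for j in range(0, n_cols, factor):
            block = [field[a][b]
                     for a in range(i, i + factor)
                     for b in range(j, j + factor)]
            row.append(sum(block) / len(block))
        coarse.append(row)
    return coarse


def total_mass(field, cell_area):
    """Flux times cell area, summed over the grid."""
    return sum(sum(row) for row in field) * cell_area


# 9 x 9 fine grid (think 5 km cells) with one point source in the center.
fine = [[0.0] * 9 for _ in range(9)]
fine[4][4] = 81.0

mid = coarsen(fine, 3)     # 3 x 3 grid (think 15 km cells)
coarse = coarsen(fine, 9)  # 1 x 1 grid (think one 45 km cell)

peaks = [max(max(row) for row in g) for g in (fine, mid, coarse)]
totals = [total_mass(fine, 1), total_mass(mid, 9), total_mass(coarse, 81)]
print(peaks, totals)
```

The total is identical on all three grids, while the peak flux drops by a factor of 9 at each coarsening step, which is the sense in which a coarse grid dilutes urban emission centers.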
Online-calculated emissions, on the other hand, displayed large differences in both gradient and total mass. Similar to the case of prescribed emissions, a fine-resolution grid tended to give a sharper gradient of dynamic emissions than a coarse-resolution grid did, as highlighted in Fig. 6 (first row), which illustrated the biogenic isoprene emissions (mol km−2 h−1) on a typical summer day. It was apparent that many more details were simulated using a fine-resolution grid, so the flow of the Yellow River can even be seen on the 5 km resolution map that was otherwise invisible on the coarser-resolution maps. Meanwhile, the total masses of dynamic emissions showed large differences across different-resolution grids as listed in Table 3. On an annual basis, the domain total isoprene emissions were 740 562 t when estimated using the 45 km grid, which is approximately 85 % and 86 % of those estimated with the 15 and 5 km grids, respectively. The total dust emissions of the 45 km grid were 2431 t, which is only 54 % and 62 % of those based on the respective 15 and 5 km grids. The percentage contrasts for sea salt emissions were even larger, with emissions of the 15 and 5 km grids being 1.3 and 1.6 times more, respectively, than those of the 45 km grid.
Figure 6. Simulated emissions and July average meteorology from the three grids: first row is isoprene emissions (mol km−2 h−1) from biogenic sources on a typical summer day; second row is surface wind vector with the shading representing wind speed (m s−1); third row is surface air temperature (K); fourth row is PBLH (m); fifth row is SWDOWN (W m−2); sixth row is CWP (g m−2).
It should be noted that, although they differed greatly between the 45 and 15 km grids, the dynamic emissions of the 5 km grid were much closer to those of the 15 km grid, partially explaining why the impact of model resolution on surface air quality was less remarkable when increasing the resolution from 15 to 5 km than from 45 to 15 km. The spatial (gradient) and mass variations in emissions of different-resolution grids result in differences in air quality simulations.
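The isoprene percentages can be turned into approximate totals with simple arithmetic. The sketch below uses only the 45 km total and the rounded percentages quoted in the text, so the implied 15 and 5 km totals are rough estimates rather than the values in Table 3; the variable names are illustrative.

```python
# Domain total annual isoprene emissions on the 45 km grid (t), from the text.
total_45km_t = 740_562

# The 45 km total is stated to be about 85 % and 86 % of the 15 and 5 km totals.
share_of_15km = 0.85
share_of_5km = 0.86

# Implied totals on the finer grids (approximate, because the shares are rounded).
implied_15km_t = total_45km_t / share_of_15km
implied_5km_t = total_45km_t / share_of_5km

# Relative increase of the 5 km total over the 45 km total, in percent.
increase_5_vs_45_pct = 100.0 * (1.0 / share_of_5km - 1.0)

print(round(implied_15km_t), round(implied_5km_t), round(increase_5_vs_45_pct, 1))
```

A 45 km total equal to 86 % of the 5 km total corresponds to a 5 km total roughly 16 % larger than the 45 km one, so the two ways of expressing the difference are consistent.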

Meteorology
It has been reported that simulated meteorology varies in response to selections of model grid resolutions (e.g., Lee et al., 2018). Meteorology plays an important role in regulating regional air quality: it affects the emissions amount originating from biogenic, dust, and sea sources; it impacts atmospheric chemical and photochemical transformation; and it directs air flows and the associated transport of trace gases and aerosols. In this investigation, a few meteorological parameters key to air pollutant generation and accumulation were analyzed, including surface wind, air temperature, downward shortwave flux at surface (SWDOWN), planetary boundary layer height (PBLH), and cloud water (liquid + ice) path (CWP). We focused on months that were prone to deteriorated PM 2.5 (January) and O 3 (July) air quality as shown in Fig. 6 and Table 3. NU-WRF simulated a similar direction of surface wind in July 2010 over the eastern portion of the domain (second row of Fig. 6). In general, average wind speed was higher over the Bohai Sea and Yellow Sea than over the surrounding land areas, with the dominating wind direction being south and southeast. Based on the results from the 15 and 5 km grids, the peak average wind speeds of over 4 m s −1 were found in Bohai Bay blowing toward Tianjin and Beijing. However, such a peak was absent from the 45 km grid simulation. In the west portion of the domain, the wind direction generally changed from southeast in the south to southwest in the north. Compared to the more organized wind directions of the 45 km grid, wind directions of the 15 and 5 km grids were more chaotic. Averaged over the domain, the January mean wind speed of the 45 km grid was 2.92 m s −1 , which was 7 % and 16 % higher than those of the 15 and 5 km grids, respectively. The highest July mean wind speed was again simulated with the 45 km grid; it was 10 % and 12 % higher than the corresponding wind speed of the 15 and 5 km grids, respectively.
Overall, NU-WRF simulated very similar magnitudes and spatial patterns of surface air temperature in July (third row of Fig. 6), regardless of the selection of grid resolution. Large portions of the NCP experienced July average air temperatures of more than 300 K. The minimum average temperature of approximately 290 K was found in the central northern part of the domain, which was part of the Mongolian Plateau with elevations being over 1500 m a.s.l. (meters above sea level). The domain average January and July surface air temperatures were around 268 and 300 K, respectively, for simulations of all three grids.
As expected, the modeling results from all three grids (fourth row of Fig. 6) showed that July average PBLH over sea was much smaller than that over land. The largest average PBLH (more than 1000 m) was found in the northwestern corner of the domain with a dominant land cover type of grassland mosaicked with open shrubland that appeared to be drier than the other land cover types in the domain. The high sensible heating associated with dry soil tended to produce the deep PBL (planetary boundary layer; Tao et al., 2013). The largest domain average PBLHs in January and July were found in the simulations of the 15 and 45 km grids, respectively. In January, the differences of the domain average PBLHs from different grid resolutions were small and within 2 % of each other. In July, however, such differences can be over 9 %.
Regardless of the grid resolutions, NU-WRF simulated a generally southeast-northwest gradient of SWDOWN in July with the highest flux (over 300 W m −2 ) occurring in the northwestern domain (fifth row of Fig. 6). The differences between the maximum and minimum domain average SWDOWN of the three grids were 5.6 % and 3.3 % in January and July, respectively.
CWP represented the vertical integration of cloud water (including both liquid and ice phases) contents and can be regarded as a proxy of cloud amount and coverage. Opposite to the SWDOWN case, NU-WRF modeled a generally northwest-southeast gradient of CWP in July with the highest values found in the southeastern domain (sixth row of Fig. 6). This is understandable since cloud reflects and scatters the incoming solar radiation and thus affects SWDOWN. Large cloud existence tended to reduce the solar flux reaching the Earth's surface underneath. The CWP differences among the model results of different grid resolutions appeared to be larger than SWDOWN differences. In July, the domain average CWPs of the 15 and 5 km grids were 37 % and 33 % larger than that of the 45 km grid, respectively. The gaps were even larger in January, during which the domain average CWPs from the 15 and 5 km grids were approximately 1.6 times larger than that from the 45 km grid.
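For readers unfamiliar with the quantity, a column cloud water path can be sketched as a pressure-weighted sum of the liquid and ice mixing ratios under the hydrostatic approximation. The function name, argument layout, and the single-layer example are assumptions for illustration; the NU-WRF diagnostic itself may be computed differently (for example, from air density and layer thickness).

```python
G = 9.80665  # standard gravity (m s-2)


def cloud_water_path(q_liquid, q_ice, p_interfaces):
    """Column cloud water path in g m-2 (hydrostatic approximation).

    q_liquid, q_ice: layer mixing ratios (kg kg-1), one value per layer.
    p_interfaces: pressures at layer interfaces (Pa), one more than layers.
    """
    if len(q_liquid) != len(q_ice) or len(p_interfaces) != len(q_liquid) + 1:
        raise ValueError("inconsistent layer and interface counts")
    cwp_kg_m2 = 0.0
    for k, (ql, qi) in enumerate(zip(q_liquid, q_ice)):
        dp = abs(p_interfaces[k] - p_interfaces[k + 1])  # layer thickness (Pa)
        cwp_kg_m2 += (ql + qi) * dp / G
    return cwp_kg_m2 * 1000.0  # kg m-2 to g m-2


# Hypothetical single layer, 100 hPa thick, holding 0.1 g kg-1 of liquid water.
cwp = cloud_water_path([1.0e-4], [0.0], [90000.0, 80000.0])
print(round(cwp, 1))
```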

Air quality
In response to the aforementioned emissions and meteorological variations resulting from the selection of model grid resolutions, changes in regional air quality ensued as illustrated in Fig. 7 and Table 3. Figure 7 shows the July average concentrations of ground-level O 3 and its precursors of NO x and CO, as well as the January mean concentrations of surface SO 2 , PM 2.5 , and PM 10 , during which month the respective pollutants tended to reach high concentrations.
O 3 is a secondary pollutant that is formed in the atmosphere through complex photochemical processes upon its precursors such as NO x and volatile organic compounds (VOCs). Figure 7 (row 1) shows that the spatial distributions of surface O 3 are similar to each other but the concentrations of the 15 and 5 km grids are smaller than those from the 45 km grid. The domain average surface O 3 concentration in July was approximately 87 ppbv based on the results from the 45 km grid, which was 26 % and 25 % higher than those of the 15 and 5 km grids, respectively. In January, however, the highest domain average concentration occurred when the 5 km grid was used, which was 5.3 % higher than that of the 45 km grid.
For the primary pollutants, i.e., NOx, CO, and SO2 (rows 2-4 of Fig. 7, respectively), which were emitted directly by their sources, the spatial distributions of their concentrations mimicked closely their emission distributions. High concentrations centered around emission sources with a reducing gradient outward. The domain average concentrations of these three pollutants of the 45 km grid results were always the largest in both January and July. The average surface NOx concentrations from the simulations of the 15 and 5 km grids were around 24 % lower than their counterpart of the 45 km grid in January. In July, the differences were reduced to 7.9 % and 11.8 % for the 15 and 5 km grids, respectively. On the other hand, the larger percentage differences, as compared to the results of the 45 km grid, occurred in July rather than in January for both CO and SO2. For example, the surface CO concentrations of the 5 km grid were 12.3 % and 30.6 % lower than those based on the 45 km grid in January and July, respectively. The ground-level SO2 concentrations from the 5 km grid were 20.5 % and 38.9 % lower than those from the 45 km grid in January and July, respectively.
It was interesting to note that, among the three cases, the domain average July surface O3 and NOx concentrations were both the highest from the 45 km grid, contrary to the results discussed in Sect. 3.1.2a, where the highest O3 concentration occurred in the simulation using the 45 km grid and the highest NOx concentration happened with the 5 km grid. This seemingly contradictory result was internally consistent. Section 3.1.2a actually depicted the average surface concentrations in an urban environment (23 of 25 monitoring sites were in an urban-suburban setting), where surface O3 formation was typically VOC controlled such that NO tended to consume O3 through titrations. As discussed in Sect. 3.2.1, a 5 km grid gave a much sharper emission gradient with anthropogenic emissions concentrating in urban-suburban areas. This led to higher NOx concentrations around urban-suburban areas in the simulation with the 5 km grid, which effectively resulted in lower O3 concentrations there through the NO titration effect. The domain average discussed in this section, however, was the average covering the vast rural area that was generally NOx-limited such that surface O3 formation was controlled by the availability of NOx, with more NOx resulting in more O3 through photochemical processes. In this case, the 45 km grid tended to distribute NOx emissions more evenly across the region, effectively decreasing the surface NOx concentration in urban areas but increasing it over rural areas. The larger average July wind speed simulated by the 45 km grid (Fig. 6 and Table 3) further smoothed out the NOx distribution in the NCP. This in turn increased the domain average surface O3 concentration via photochemistry based on the 45 km resolution results. In addition, vertical lifting played an important role in explaining the maximum regional O3 in July simulated by the 45 km grid as compared to the results by the other two grid resolutions. As displayed in Fig. S4, a fine-resolution model (e.g., 5 km) tended to produce a stronger updraft than a coarse-resolution model (e.g., 45 km), consistent with the findings by . The strong uplift would bring more surface pollutants such as NOx into the upper atmosphere, thus further reducing the NOx availability at ground and limiting the surface ozone production but increasing its formation in the upper atmosphere.
Figure 7. Simulated January (SO2, PM2.5, and PM10) and July (O3, NOx, and CO) surface average air quality from the three grids: first row is O3 (ppbv); second row is NOx (ppbv); third row is CO (ppmv); fourth row is SO2 (ppbv); fifth row is PM2.5 (µg m−3); sixth row is PM10 (µg m−3).
Vertical distributions of O3 also tend to have a sizable impact on the next day's surface O3 levels (e.g., Kuang et al., 2011; Caputi et al., 2019). Figure 8 illustrates the domain average profiles of vertical wind, NOx, O3 (panels a-c), and the average diurnal distribution of surface O3 (panel d) over July. Here we limited our discussion to the results from the 15 and 5 km grids since the 45 km grid artificially allowed more NOx emissions to spread to rural areas to produce much more O3, as shown in the previous paragraph. Lee et al. (2018) claimed that a coarse-resolution model appeared to lower updraft as compared with a fine-resolution model. This study agreed with their finding, as illustrated in Fig. 8a. The domain average July vertical wind in the simulation with the 5 km grid ranged from 0.25 to 0.45 cm s−1 (upward) between 800 and 400 hPa, stronger than the corresponding one of the 15 km grid. The reason was complex and the aerosol-cloud-interaction-induced freezing-evaporation-related invigoration mechanism played a role . The stronger upward wind tended to lift more gaseous pollutants up to the free troposphere as shown in Fig. 8b (NOx) and Fig. 8c (O3). The pollutants there would have visible impacts on the following-day surface air quality, especially on O3 levels at night and in the morning when sunlight breaks through the nocturnal planetary boundary layer, as evidenced in Fig. 8d. At night with no photochemical formation, surface O3 concentration was largely controlled by upper-level O3 mixing down, NO titration, and O3 dry deposition. With virtually the same average surface NO concentrations from the 15 and 5 km grids, the upper-level O3 mixing down appeared to control the relative magnitudes of surface O3 concentrations simulated using the 15 and 5 km grids. This partially explained why, at night and in the early morning, the ground-level O3 concentrations were higher in the 5 km grid than the 15 km grid.
During the daytime when the photochemical formation of O 3 takes control, the regional average surface O 3 concentrations are largely determined by the availability of O 3 precursors (i.e., NO x and VOC) and ambient environmental conditions. In this case, more spreading NO x emissions of the 15 km grid appeared to generate more surface O 3 than the 5 km grid did. PM 2.5 and PM 10 were mixed pollutants that not only were emitted by various sources but also were generated in the atmosphere through physical and chemical processes. Figure 7 shows that high surface concentrations of PM 2.5 (more than 120 µg m −3 , row 5) and of PM 10 (more than 170 µg m −3 , row 6) were still found around the source areas based on the modeling results of the 15 and 5 km grids. However, high PM 2.5 and PM 10 concentrations spread out to larger areas based on the results from the 45 km grid as compared to the ones from the finer grid resolutions. Similar to the primary pollutants, the largest domain average surface concentrations occurred when a 45 km grid was used for the NU-WRF simulation. The domain average PM 2.5 concentrations of the 15 and 5 km grids in January were 15.7 % and 14 % lower than those from the 45 km grid, respectively. The surface PM 2.5 concentration differences among the results of different grid resolutions grew larger in July, reaching 48 % when comparing the result from the 5 km grid to that from the 45 km grid. The domain average surface PM 10 concentrations showed a similar pattern to that of PM 2.5 , with the results from the 5 km grid being 12.2 % (January) and 44.2 % (July) smaller than that from the 45 km grid.
It is worth noting that the magnitudes and spatial distributions of ground-level pollutants were close to each other when comparing the results of the 15 and 5 km grids. This again indicates that the benefit of finer grid resolution diminishes beyond a certain point. In future MICS-Asia efforts, a 15 km grid appears to offer the best balance between model performance and computational resources.

Summary
Contributing to MICS-Asia Phase III, whose goals included identifying and reducing air quality modeling uncertainty over the region, this investigation examined the impact of model grid resolutions on the performances of meteorology and air quality simulation. To achieve this, NU-WRF was employed to simulate 2010 air quality over the NCP region with three grid resolutions of 45, 15, and 5 km. The modeling results were compared to the observations of surface meteorology archived by CMA and of ground-level air quality collected via CERN. The intermodel comparison among the simulation results of the three grids was also conducted to understand the reasons why model resolution mattered.
The analysis showed that there was no single resolution which would yield the best reproduction of meteorology and air quality across all monitoring sites. From a regional average perspective (i.e., across all monitoring sites in this study), the choice of grid resolution appeared to have a minimal influence on air temperature modeling but affected wind, RH, and precipitation simulation profoundly. A 5 km grid appeared to give the best wind simulation as compared to the observations, quantified by bias, RMSE, standard deviation, and correlation. Compared to the one from the 45 km grid, the simulated wind speed from the 5 km grid reduced the positive bias by 46.8 %. While a 15 km grid yielded the best overall performance of RH modeling, the result of the 45 km grid gave the most realistic reproduction of precipitation. The statement on precipitation should be taken with caution since it was based on comparison with the site observations. Bearing in mind the very heterogeneous nature of precipitation, the penalty of the model hitting or missing a rain event was severe. Thus, the coarse grid, covering more area within a grid cell, would reduce the chance of falsely hitting or missing a precipitation event. However, a comparison of modeled precipitation to gridded observations that were reconstructed using the synergraphic mapping algorithm with topographic adjustment to the monthly precipitation climatology showed the opposite result, where the fine-resolution modeling showed better reproduction of precipitation than the coarse-resolution simulation did (Gao et al., 2017).
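The four evaluation measures named above (bias, RMSE, standard deviation, and correlation) can be sketched as follows. The exact definitions used in the study are not restated in this section, so the population (divide by n) normalization and the function name are assumptions, and the paired values are synthetic.

```python
import math


def evaluation_stats(model, obs):
    """Bias, RMSE, model standard deviation, and Pearson correlation.

    Uses population (divide by n) moments, which is an assumption here.
    """
    n = len(obs)
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    bias = mean_m - mean_o
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / n)
    var_m = sum((m - mean_m) ** 2 for m in model) / n
    var_o = sum((o - mean_o) ** 2 for o in obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs)) / n
    corr = cov / math.sqrt(var_m * var_o)
    return {"bias": bias, "rmse": rmse, "std_model": math.sqrt(var_m), "corr": corr}


# Synthetic wind speeds (m s-1): the hypothetical model is uniformly 1 m s-1 too high.
obs_wind = [1.0, 2.0, 3.0, 4.0]
model_wind = [2.0, 3.0, 4.0, 5.0]
stats = evaluation_stats(model_wind, obs_wind)
print(stats)
```

A uniform offset like this one yields a perfect correlation together with a nonzero bias and RMSE, which is why the measures are reported together rather than individually.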
The simulated meteorology differences due to the selection of grid resolution would consequently lead to differences in air quality simulation. Air pollutant concentrations were basically determined by their emissions and underlying meteorology that directed their formation (e.g., O 3 and aerosols), transport, and removal processes. For the prescribed emissions originating from anthropogenic and wildfire sources, the grid resolution had limited influence on emissions amount -less than 5 % difference with each other under the different-resolution grids -but a large impact on emission spatial distribution, with sharper emission gradients around sources from a fine-resolution grid than from a coarse-resolution one. For the dynamic emissions driven by meteorology, not only was an emission gradient around a source larger from a higher-resolution grid but also the total emissions amount varied greatly. For example, the domain total annual biogenic isoprene emissions from a 5 km grid was about 16 % larger than those from a 45 km grid due to the underlying differences in land cover and meteorology.
Though the impact of grid resolution on air quality varied from location to location, a finer grid yielded better results for daily mean surface O3, NOx, CO, and PM2.5 simulations from a regional average perspective. For example, after refining the grid resolution from 45 to 15 km, the positive bias of daily mean surface O3 and PM2.5 decreased by 15 % and 75 %, respectively. Fine-resolution modeling was especially beneficial to high pollutant concentration forecasting. This was important to air quality management. Taking China's NAAQSs as cutoff values for each pollutant, the frequencies of noncompliant occurrences of O3, NOx, SO2, and PM2.5 in the 5 km grid simulation were much closer to the observations than those of the 45 km modeling. For example, the simulation with the 5 km grid produced 168 % and 17 % more exceedances of the NAAQSs of O3 and PM2.5, respectively, whereas the respective exceedances were 573 % and 28 % more with modeling using the 45 km grid, as compared to the observed exceedances. It was also worth noting that the benefit of increasing grid resolution to better surface O3 and PM2.5 simulations started to diminish when the horizontal resolution reached 15 km, agreeing with the finding by Valari and Menut (2008). There was a caveat, though. The anthropogenic MIX and fire GFEDv3 emissions inventories bore 0.25° by 0.25° and 0.5° by 0.5° resolutions, respectively. These inventories were too coarse to resolve emission features at the scale of the 5 km grid. Should a 5 km resolution emissions inventory be available and used, the benefit of high-resolution modeling would likely be more prominent.
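To put the inventory resolutions in perspective, the sketch below converts a latitude-longitude spacing into approximate distances. The spherical Earth with a 6371 km mean radius and the 40° N reference latitude are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def degree_spacing_km(spacing_deg, latitude_deg):
    """Approximate (north-south, east-west) extent in km of a lat-lon cell."""
    north_south = math.radians(spacing_deg) * EARTH_RADIUS_KM
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Assumed reference latitude of 40 N for the inventories' stated spacings.
mix_ns, mix_ew = degree_spacing_km(0.25, 40.0)
fire_ns, fire_ew = degree_spacing_km(0.5, 40.0)

print(round(mix_ns, 1), round(mix_ew, 1))
print(round(fire_ns, 1), round(fire_ew, 1))
```

Under these assumptions a 0.25° cell spans roughly 28 km by 21 km, several times the 5 km model spacing, so sub-cell emission gradients on the 5 km grid come from interpolation rather than from the inventory itself.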
It should be pointed out that NU-WRF significantly overestimated surface O 3 concentration but underestimated ground-level CO and NO x concentrations regardless of grid resolutions. This was true not only for the regional averages but also at the majority of the monitoring sites. Missing emissions were believed to be largely responsible for this result (Kong et al., 2020). Underestimate of surface NO x tended to increase ground-level O 3 due to the reduced titration effect, especially at night over urban areas that were typically NO x abundant.
In conclusion, grid resolution had a profound effect on NU-WRF performance on meteorology and air quality over East Asia. A fine-resolution grid did not always generate the best modeling results, and the proper selection of horizontal resolution hinged on investigation topics for a given set of physics and chemistry choices in a model. With regard to MICS-Asia Phase III, whose major goal was to examine regional air quality, in general, the finer the grid resolution was, the better the simulation results would be. This was especially true over the coastal areas and complex terrains where a sharp local energy gradient existed. A fine-resolution grid was also extremely helpful in reproducing pollutants at higher concentrations that were most relevant to air quality planning and management. However, the benefit of high resolution was not linear with the decrease in grid size. At a certain point, the improved modeling accuracy due to an increase in grid resolution was so marginal that it could not justify the computational cost associated with the fine-grid simulation. Based on the balance of modeling accuracy and efficiency, a 15 km horizontal grid appeared to be an appropriate choice to optimize model performance and resource usage if the study domain were to remain unchanged for future MICS-Asia activities. The study suggested that high-resolution emissions, especially the proper representation of emission gradients, would be helpful in improving air quality prediction. Moreover, the profile measurements of both meteorology and air quality, in conjunction with the ground monitoring networks, would be greatly helpful in identifying model deficiencies and thus improving model forecast skills.
Data availability. All data collected and generated for this research are archived and stored on NASA Center for Climate Simulation (NCCS) servers. Due to the sheer extent of data, it is impractical to upload data to a public domain repository. However, the authors will be happy to share data on an individual-request basis.
Author contributions. ZT and MC designed the experiments. ZT, MG, TK, DK, and HB carried out the experiments, working on various modeling components. YW and ZL collected, organized, and archived the ground air quality measurement data. All authors contributed to model result analysis and interpretation. ZT prepared the paper with contributions from all coauthors.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "Regional assessment of air pollution and climate change over East and Southeast Asia: results from MICS-Asia Phase III". It is not associated with a conference.
program. The authors thank MICS-Asia for its organized platform of discussion and data sharing. This work was not possible without the supercomputing and mass storage support of the NCCS.
Financial support. This research has partially been supported by NASA (grant no. NNH14ZDA001N-ACMAP).
Review statement. This paper was edited by Hailong Wang and reviewed by two anonymous referees.