Model bias in simulating major chemical components of PM2.5 in China

High concentrations of PM2.5 (particulate matter with an aerodynamic diameter less than 2.5 μm) in China have caused severe visibility degradation. Accurate simulations of PM2.5 and its chemical components are essential for evaluating the effectiveness of pollution control strategies and the health and climate impacts of air pollution. In this study, we compared GEOS-Chem model simulations with comprehensive data sets for organic aerosol (OA), sulfate, nitrate, and ammonium in China. Model results are evaluated spatially and temporally against observations. The new OA scheme with a simplified secondary organic aerosol (SOA) parameterization significantly improves the OA simulations in polluted urban areas. The model underestimates sulfate and overestimates nitrate at most of the sites throughout the year. The underestimation of sulfate is more pronounced in winter, while the overestimation of nitrate is extremely large in summer. Our model is unable to capture some of the main features in the diurnal patterns of the PM2.5 chemical components, suggesting underrepresented processes. Potential model adjustments that may lead to a better representation of boundary layer height, precursor emissions, hydroxyl radical, heterogeneous formation of sulfate and nitrate, and the wet deposition of nitric acid and nitrate are tested in the sensitivity analysis. The results suggest that uncertainties in chemistry perhaps dominate the model bias. Proper implementation of heterogeneous sulfate formation and good estimates of the concentrations of sulfur dioxide and hydroxyl radical are essential for improving the sulfate simulation. Updating the heterogeneous uptake coefficient of nitrogen dioxide significantly reduces the modeled concentrations of nitrate, and accurate sulfate simulation is important for modeling nitrate. However, the large overestimation of nitrate concentrations remains in summer for all tested cases.
The uncertainty of the production of nitrate cannot explain the model overestimation, suggesting a problem related to the removal. A better understanding of the atmospheric nitrogen budget is needed for future model studies. Moreover, the results suggest that the remaining underestimation of OA in the model is associated with the underrepresented production of SOA.

https://doi.org/10.5194/acp-2020-76 Preprint. Discussion started: 4 May 2020. © Author(s) 2020. CC BY 4.0 License.

Introduction
In developing countries like China and India, the concentrations of PM2.5 (particulate matter with an aerodynamic diameter less than 2.5 μm) often exceed air-quality standards, leading to visibility reduction and negative health effects (Chan and Yao, 2008; Lelieveld et al., 2015). Chemical transport models (CTMs) are valuable tools to evaluate the PM2.5 pollution and its health and climate impacts. Many studies have shown reasonable simulations of surface PM2.5 concentrations in China by the CTMs. For example, the Weather Research and Forecasting/Community Multi-scale Air Quality (WRF/CMAQ) model has reproduced the monthly-averaged concentrations of PM2.5 at the air-quality sites in 60 Chinese cities. The MICS-Asia Phase III studies further show normalized mean biases (NMBs) of less than 50% for daily or monthly mean PM2.5 concentrations in various CTMs (Gao et al., 2018; Chen et al., 2019). However, the model performance on PM2.5 is component-dependent and may contain compensating errors, which bias the evaluation of the effectiveness of emission control strategies. Recent model evaluations agree that CTMs generally underestimate the concentrations of organic aerosol (OA) (Fu et al., 2012; Han et al., 2016) and sulfate but overestimate the concentrations of nitrate (Wang et al., 2013; Chen et al., 2019). During severe haze periods, the models often significantly underestimate the PM2.5 concentrations.
Uncertainties in meteorological fields, emission inventories, and physical and chemical processes all contribute to the model biases in the PM2.5 simulations. For example, models are known to reproduce temperature (T) and relative humidity (RH) well but have difficulty capturing the near-surface wind fields (Guo et al., 2016a; Gao et al., 2018). Boundary layer structures greatly affect the PM2.5 concentrations (Z. Su et al., 2018). Evaluations of the boundary layer (e.g., boundary layer height (BLH)) in the CTMs are however limited (Chen et al., 2016).
For typical primary components and secondary precursors of PM2.5, the uncertainties of their emissions in Asia range from tens to several hundreds of percent. The bottom-up and top-down estimates of the emissions of sulfur dioxide (SO2), nitrogen oxides (NOx), ammonia (NH3), volatile organic compounds (VOCs), and organic carbon (OC) show significant differences in magnitude and seasonal variability (Koukouli et al., 2018; Qu et al., 2019; Cao et al., 2018; Fu et al., 2012).
For sulfate, the model underestimation has been attributed largely to heterogeneous production. The proposed heterogeneous formation mechanisms include the SO2 oxidation by nitrogen dioxide (NO2) directly (Cheng et al., 2016; Wang et al., 2016) or indirectly, by O2 via transition-metal-ion (TMI) catalysis or radical chain reactions (Hung and Hoffmann, 2015; Hung et al., 2018), and by hydrogen peroxide (Ye et al., 2018). Among them, TMI-catalyzed oxidation of SO2 perhaps dominates the sulfate formation during the haze periods, constrained by the observations of sulfate ...
... listed in Table S1 of the Supporting Information (SI), including 77 surface online measurements from 2006 to 2016 in China.
The dataset covers the regions of North China Plain (NCP), Yangtze River Delta (YRD), Pearl River Delta (PRD), and Northwest China (NW). The measurements were made by Aerodyne high-resolution time-of-flight aerosol mass spectrometer (HR-ToF-AMS), quadrupole aerosol mass spectrometer (Q-AMS), and aerosol chemical speciation monitor (ACSM) and are mostly for submicron particles. We also compared our model simulations to long-term ACSM measurements of submicron particle composition at the site of the Institute of Atmospheric Physics, Beijing (IAP, 39°58′28″ N, 116°22′16″ E) from July 2011 to May 2013 (Sun et al., 2015). The long-term data have a time resolution of 15 minutes and were averaged to hourly values for comparison with the model results. All data were corrected for collection efficiency as stated in the original publications. Our recent measurements show that the submicron-to-fine ratios for sulfate, nitrate, ammonium, and OA are quite similar (i.e., 0.8) for the summertime and wintertime measurement periods in Beijing, except for the severe winter-haze episodes under high RH (i.e., about 0.5) (Fig. S1 in SI) (Zheng et al., 2020). For simplicity, we divided the observation data herein by 0.8 for the four species when comparing to the model results.
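The two preprocessing steps described above (hourly averaging of the 15-minute data and division by the submicron-to-fine ratio of 0.8) can be sketched as follows. This is a minimal illustration, not the authors' processing code; the function names and the input layout of (timestamp, concentration) pairs are assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Submicron-to-fine mass ratio quoted in the text for non-haze conditions.
SUBMICRON_TO_FINE = 0.8


def hourly_means(samples):
    """Average (timestamp, value) pairs into bins keyed by the start of each hour."""
    bins = defaultdict(list)
    for stamp, value in samples:
        hour = stamp.replace(minute=0, second=0, microsecond=0)
        bins[hour].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(bins.items())}


def submicron_to_fine(concentration, ratio=SUBMICRON_TO_FINE):
    """Scale a submicron concentration up to an estimated PM2.5 concentration."""
    return concentration / ratio
```

With four 15-minute values of 8 μg m⁻³ in one hour, the hourly mean is 8 μg m⁻³ and the scaled value is about 10 μg m⁻³. The single ratio of 0.8 would overcorrect too little during the high-RH winter-haze episodes, for which the text reports a ratio of about 0.5.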
The meteorological parameters (e.g., T, RH, wind speed, and wind direction) and the concentrations of gaseous pollutants including ozone (O3), carbon monoxide (CO), SO2, and NO2 were measured at the Peking University Urban-Atmosphere Environment Monitoring Station (PKUERS, 39°59′21″ N, 116°18′25″ E) from July 2011 to May 2013. Both the IAP and PKUERS sites are in the same GEOS-Chem model grid. The monthly mean NH3 concentrations are taken from the 2007-2010 observations at the IAP site (Pan et al., 2012). The BLH in Beijing (39°48′00″ N, 116°28′12″ E) was derived from the radiosonde observations at 8 AM, 2 PM (only in the summer), and 8 PM during July 2011 to May 2013 by using bulk Richardson algorithms (Guo et al., 2016b; Guo et al., 2019). All the hours refer to Beijing time (UTC+8). The radiosonde-derived BLH is greater in spring and summer and lower in autumn and winter, which is consistent with the findings from the satellite observations and the ground-based ceilometer measurements (W. Tang et al., 2016).
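For readers unfamiliar with the bulk Richardson approach, the sketch below shows one common form of it. It is an illustration under stated assumptions, not the algorithm of the cited studies: the critical value of 0.25, the use of the lowest sounding level as the reference, the linear interpolation, and the function name are all assumptions made here.

```python
G = 9.81  # gravitational acceleration, m s^-2


def bulk_richardson_blh(z, theta_v, u, v, ri_crit=0.25):
    """Estimate boundary layer height (m) from one sounding.

    z: heights above ground (m), increasing; theta_v: virtual potential
    temperature (K); u, v: wind components (m s^-1). Returns the height at
    which the bulk Richardson number first reaches ri_crit, or None if it
    never does within the profile.
    """
    z0, th0 = z[0], theta_v[0]
    prev_z, prev_ri = z0, 0.0
    for k in range(1, len(z)):
        wind_sq = max(u[k] ** 2 + v[k] ** 2, 1e-6)  # avoid division by zero
        ri = (G / th0) * (theta_v[k] - th0) * (z[k] - z0) / wind_sq
        if ri >= ri_crit:
            # Interpolate linearly between the two bracketing levels.
            frac = (ri_crit - prev_ri) / (ri - prev_ri)
            return prev_z + frac * (z[k] - prev_z)
        prev_z, prev_ri = z[k], ri
    return None
```

Because the result depends on the chosen critical value and on the vertical resolution of the sounding, BLH estimates from different implementations are not strictly interchangeable.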
Moreover, the observed concentrations of OH· and HO2·, gaseous nitrous acid (HONO) and nitric acid (HNO3), and isoprene in Beijing are taken from the literature, including the studies in south Beijing (Wangdu, 38°39′36″ N, 115°12′00″ E) from 7 June to 8 July 2014 and in north Beijing (Huairou, 40°24′36″ N, 116°40′48″ E) from 6 January to 5 March 2016 (Tan et al., 2017; Tan et al., 2018; Liu et al., 2019), and additional isoprene measurements at the PKUERS site during the summer of 2011. The observed concentrations of NO3· and aromatic compounds are taken from the measurements at the PKUERS site in September 2016 and in summer and winter of 2011-2012, respectively.

Model Description
The atmospheric chemical transport model GEOS-Chem 12.0.0 (http://geos-chem.org) was run at nested grids with 0.5°×0.625° horizontal resolution over Asia and adjacent areas (11°S-55°N, 60°-150°E) and 47 vertical levels between the surface and ~0.01 hPa. Boundary conditions were provided by the global simulations at 2°×2.5° horizontal resolution. Both global and nested simulations were spun up for one month. MERRA2 reanalysis meteorological data from the NASA Global Modeling and Assimilation Office (GMAO) were used to drive the model. Model simulations were run for the measurement period of July 2011 to May 2013 to compare with long-term data sets. When comparing with the campaign-average data, the model simulations for the year of 2012 were used. For other comparisons, the model simulations were run for the measurement periods.
The GEOS-Chem model simulates the ozone-NOx-hydrocarbon-aerosol chemistry (Park et al., 2003; Park et al., 2004; Liao et al., 2007). Aerosol thermodynamic equilibrium is calculated by ISORROPIA-II (Fountoukis and Nenes, 2007; Pye et al., 2009). The simulation of OA includes primary organic aerosol (POA) and secondary organic aerosol (SOA). The model assumes that 50% of POA emitted from combustion sources is hydrophobic and that hydrophobic POA converts to hydrophilic POA with an e-folding time of 1.15 days. A ratio of 1.6 is applied to account for the non-carbon mass in POA (Turpin et al., 2000). SOA is simulated by the Simple SOA scheme (Hodzic and Jimenez, 2011; Kim et al., 2015). SOA precursor surrogates are estimated from the emissions of biogenic volatile organic compounds (i.e., isoprene and terpenes) and CO from the combustion of biomass, biofuel, and fossil fuel. The Simple SOA scheme assumes that the irreversible conversion from precursors to particle-phase SOA takes a fixed timescale of 1 day and that 50% of biogenic SOA precursors are emitted as particle-phase SOA. The SOA yields of isoprene and terpenes are set to 3% and 10%, respectively. The SOA yield of biomass burning emissions is set to 1.3% of CO, and the yield for fossil-fuel combustion is set to 6.9%. These yields are derived from the observed ratios between SOA and CO in aged air masses from the studies in the United States (US) (Hayes et al., 2015) and are able to reproduce the OA mass without detailed SOA chemistry in the southeast US (Kim et al., 2015).
Because of the lack of related measurements in China, we did not change these yields herein.
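The bookkeeping of the Simple SOA scheme described above can be illustrated with a short sketch. It is not GEOS-Chem code: the function and dictionary names are hypothetical, the yields are applied on a mass basis, and the fixed 1-day timescale is treated as a first-order e-folding time, all of which are assumptions made here. The text gives no separate yield for biofuel CO, so it is left out.

```python
import math

# Yields quoted in the text (SOA precursor formed per unit of emitted species).
BIOGENIC_YIELDS = {"isoprene": 0.03, "terpenes": 0.10}
COMBUSTION_CO_YIELDS = {"co_biomass_burning": 0.013, "co_fossil_fuel": 0.069}
BIOGENIC_DIRECT_FRACTION = 0.5    # share emitted directly as particle-phase SOA
CONVERSION_TIMESCALE_DAYS = 1.0   # assumed to act as an e-folding time


def simple_soa(emissions, days):
    """Particle-phase SOA formed from a single emission pulse after `days`."""
    biogenic = sum(y * emissions.get(k, 0.0) for k, y in BIOGENIC_YIELDS.items())
    combustion = sum(y * emissions.get(k, 0.0) for k, y in COMBUSTION_CO_YIELDS.items())
    direct = BIOGENIC_DIRECT_FRACTION * biogenic
    pending = (biogenic - direct) + combustion  # still in precursor form at t = 0
    converted = pending * (1.0 - math.exp(-days / CONVERSION_TIMESCALE_DAYS))
    return direct + converted
```

Because the anthropogenic term scales with CO alone, any bias in modeled CO feeds directly into the anthropogenic SOA, and the scheme is insensitive to the modeled oxidant levels.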
Wet deposition of soluble aerosols and gases includes convective updraft, rainout, and washout as described by Liu et al. (2001). SOA is treated as highly soluble with a fixed Henry's law coefficient of 10⁵ M atm⁻¹ and a scavenging efficiency of 80% for simplicity (Chung and Seinfeld, 2002). The Henry's law coefficients may vary by orders of magnitude depending on the SOA types (Hodzic et al., 2014). Hodzic et al. (2016) show similar vertical profiles of modeled SOA mass when using the fixed 10⁵ M atm⁻¹ and the volatility-dependent Henry's law coefficients. Dry deposition is calculated by a standard resistance-in-series model for the aerodynamic, boundary-layer, and canopy-surface resistance (Wesely, 1989).
... This inventory shows stronger peak emissions in the summer than other inventories such as the Regional Emission in Asia (REAS2), PKU-NH3, and the Emission Database for Global Atmospheric Research (EDGAR), which agrees better with the top-down estimates. The non-agricultural NH3 emissions in China are taken from the study by Huang et al. (2012), which is based on the year of 2006 and represents the low-end estimates because the emissions increased rapidly after 2006 (Kang et al., 2016; Meng et al., 2017). The MIX Asian emission inventories are used for the anthropogenic emissions in the rest of Asia, which combine the South Korea inventory (CAPSS) (Lee et al., 2011), the Indian inventory (ANL-India) (Lu and Streets, 2012), and the REAS2 inventory (Kurokawa et al., 2013). Our simulations used sector-specific MEIC diurnal patterns for the anthropogenic emissions of CO, NOx, SO2, BC, OC, and VOCs from power, industry, residential, transportation, and agriculture sectors (Fig. S2 in SI) and the MEIC agriculture diurnal patterns for all anthropogenic emissions of NH3 in China.
NOx emissions from soils and lightning are included in the model (Murray et al., 2012). The biogenic emissions are calculated from the Model of Emissions of Gases and Aerosols from Nature (MEGAN v2.1) (Guenther et al., 2012). The emissions from biomass burning are provided by the Global Fire Emission Database (GFED4) (Giglio et al., 2013).
Heterogeneous uptake of SO2 into aerosol liquid water is not included in the standard simulations but is included in the sensitivity runs in Sect. 4.3. The parameterizations of the SO2 uptake coefficient (γSO2) include γSO2 depending on RH or on aerosol liquid water content (ALWC). The heterogeneous uptake of N2O5 and NO2 is an important contributor to nitrate in northern China (Wen et al., 2018). The uptake coefficients γN2O5 and γNO2 on the aerosol surface vary by several orders of magnitude, depending on temperature, particle-phase state, composition, ALWC, pH, and so on (Bertram and Thornton, 2009). ... We did not include any heterogeneous production of SOA because of the lack of good parameterizations.
... underestimated in winter (Fig. 1a) when the SO2 emissions are plausibly underestimated (Koukouli et al., 2018). The model-observation agreement for nitrate is the best in winter (Fig. 1b).
Tables S2, S3, and S4 in SI list the statistical values for the model-observation comparisons in different regions, urban or non-urban sites, and various seasons, respectively. The model biases for sulfate, nitrate, and OA are consistently positive or negative among regions (Table S2), suggesting that the model biases are general problems in China. The underestimation of sulfate is over 40% (NMB) in most regions except YRD, and the overestimation of nitrate is over 80% (NMB) in most regions except NW. The OA simulations show much lower NMB (−10%) and RMSE values in YRD and PRD than in NCP and NW. For ammonium, the model significantly overestimates its concentrations in YRD and underestimates its concentrations in NW.
The former may be explained by the excessive formation of ammonium through thermodynamic equilibrium under conditions of abundant NH3 emissions and overestimated nitrate concentrations (Wang et al., 2013) in YRD in the model. The latter is likely a result of combined factors including emissions, meteorology, and thermodynamic equilibrium.
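The evaluation statistics used throughout (NMB, RMSE, and R) can be computed as in the sketch below. The formulas are the conventional definitions, NMB = Σ(M − O)/ΣO, RMSE = sqrt(mean((M − O)²)), and the Pearson correlation coefficient; this section does not spell out the exact expressions used for Tables S2-S4, so treating them as identical is an assumption, and the function names are illustrative.

```python
import math


def nmb(model, obs):
    """Normalized mean bias: sum of (model - obs) divided by the sum of obs."""
    return sum(m - o for m, o in zip(model, obs)) / sum(obs)


def rmse(model, obs):
    """Root-mean-square error between paired model and observed values."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def pearson_r(model, obs):
    """Pearson correlation coefficient between paired model and observed values."""
    n = len(obs)
    mean_m, mean_o = sum(model) / n, sum(obs) / n
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_m * var_o)
```

A model that is exactly half of every observation gives NMB = −0.5 but R = 1, which is why the text reports both: R measures how well the variability is tracked, while NMB and RMSE measure the magnitude of the error.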
Moreover, the mean observed concentrations of sulfate, nitrate, ammonium, and OA at urban sites are 20-90% greater than those at non-urban sites (Fig. 1 and Table S3). The model also shows greater simulated concentrations of OA at urban sites, ... (Table S4). Similar to other models, our model failed to reproduce the high sulfate concentrations during the winter-haze periods. By contrast, the nitrate concentrations are largely overestimated in spring, summer, and autumn (NMB = 0.79-1.28). The model bias is much smaller in winter (NMB = 0.41), when higher concentrations of nitrate are present. Wang et al. (2013) also showed the summertime overestimation for East Asia. Heald et al. (2012) showed the summer-, autumn-, and winter-time overestimation for the eastern US. To sum up, the large overestimation of nitrate happens in most seasons and regions and is more severe at non-urban sites. For ammonium, the model underestimates its concentrations in winter and spring but overestimates its concentrations in summer and autumn. The uncertainties of both the HNO3 and NH3 simulations may affect the modeled ammonium concentrations (Wen et al., 2018). The underestimation of OA is another year-round problem, and the worst case happens in autumn. The R value however is much lower in summer (i.e., 0.28 compared with 0.5 in other seasons), showing the complexity of the OA simulations.
The model simulations are further compared to the long-term hourly observations in Beijing. Figure S3 in SI shows the simulation-to-observation ratios for the secondary inorganic aerosol (SIA) species and OA. The mean and median values of the simulation-to-observation ratios of the mass concentrations of non-refractory PM2.5 (NR-PM2.5) are generally within the measurement uncertainty of 30%.
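As a minimal sketch of this comparison, the snippet below computes the mean and median of hourly simulation-to-observation ratios and checks the median against a 30% tolerance. The function name, the input layout, and the choice to skip non-positive observations are assumptions made for the example.

```python
from statistics import mean, median


def ratio_summary(simulated, observed, tolerance=0.30):
    """Summarize hourly simulation-to-observation ratios.

    Hours with non-positive observations are skipped so the ratio is defined.
    """
    ratios = [s / o for s, o in zip(simulated, observed) if o > 0]
    med = median(ratios)
    return {
        "mean": mean(ratios),
        "median": med,
        "within_tolerance": abs(med - 1.0) <= tolerance,
    }
```

A median ratio close to 1 for the total mass does not imply that each component is well simulated, since opposite biases in individual species can cancel in the sum.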
Compensation between the underestimation of sulfate and OA and the overestimation of nitrate leads to the good performance on NR-PM2.5. The seasonal variations of the model biases for the SIA species and OA in Beijing are consistent with the findings in the nation-wide comparisons (Table S4), except that the greatest underestimation of OA occurs in spring instead of autumn. Figure S4 in SI shows the simulation-to-observation ratios when excluding the periods of NR-PM2.5 mass concentrations over 150 μg m⁻³. The model biases and their seasonal variations are similar to those in Fig. S3.
Figure 2 shows the diurnal patterns of the observed and the simulated concentrations of sulfate, nitrate, ammonium, and OA for four seasons in Beijing. Considerable differences exist. For instance, the observed sulfate shows a daytime concentration build-up in spring and summer, suggesting photochemical production (Sun et al., 2015). The wintertime diurnal pattern shows a steady but later enhancement in the afternoon. The simulated profiles however show insignificant daytime elevations of concentration, suggesting insufficient production, overestimated boundary-layer dilution, or overestimated removal during the day (Fig. 2a). By contrast, the observed nitrate and ammonium concentrations show flatter diurnal patterns than sulfate (Fig. 2b-c). Nighttime production of nitrate by the heterogeneous uptake of N2O5 and NO2 is a major pathway of nitrate production in northern China ... (Alexander et al., 2020).
For OA, the model is unable to reproduce the midday and evening peaks for all seasons (Fig. 2d). Previous positive matrix factorization (PMF) analysis of the OA mass spectra suggests that cooking emissions contribute to the midday peaks of the OA concentrations and that the evening peaks are driven by mixed primary emissions including cooking, traffic, and coal combustion (W. Sun et al., 2015). Cooking emissions are not included explicitly in the model, and the emissions of POA and SOA precursors from traffic and coal combustion are uncertain (Peng et al., 2019). We compared the modeled POA and SOA with PMF-derived POA and oxygenated OA (OOA). The model reproduces the monthly mean concentrations of PMF-derived POA (Fig. 3a), suggesting that the MEIC POA inventory generally represents the particle-phase SVOC emissions under ambient conditions. The model underestimation of OA is mainly from SOA, as indicated by the underestimation of the monthly mean concentrations of PMF-derived OOA (i.e., 50-70% of the observed OA mass) (Fig. 3b). Figures S6 and S7 in SI show the model performance of the Simple SOA scheme and the traditional scheme (the so-called Semivolatile POA scheme in GEOS-Chem) in simulating OA. The Semivolatile POA scheme significantly underestimates both POA and SOA. This scheme treats 1.27 times the POA inventory as the SVOC emissions, among which only 1.5% of the carbon remains as POA (Pye and Seinfeld, 2010). There is also a lack of constraints on the SOA production from IVOCs and SVOCs.

Potential contributors to the model-observation discrepancies
We focus here on measurements in Beijing to discuss the potential contributors to the model bias. Table 1 lists the statistics of T, RH, wind speed, wind direction, and BLH between the MERRA2 outputs and the observations in Beijing. The MERRA2 reanalysis reproduces T (NMB < 2%) and RH (NMB < 15% except for winter) but is unable to reproduce the wind speed and directions. Large RMSE for surface wind directions is a common problem in meteorological reanalysis products as well as the WRF simulations. The overestimation of wind speed (1-2 times) is slightly greater than the bias reported in other studies and may cause some underestimation of PM2.5 (J. Wang et al., 2014). MERRA2 slightly overestimates the 2 PM BLH compared with the radiosonde measurements in summer (NMB = 0.34). For 8 AM and 8 PM, MERRA2 underestimates the radiosonde-derived BLH in autumn and winter. Bei et al. (2017) indicated that the uncertainty in temperature and wind field simulations leads to the frequent underestimation of the nighttime BLH in January 2014 in Beijing by the ensemble WRF meteorology. Such underestimation of BLH may lead to overestimated nighttime concentrations of PM2.5 in autumn and winter. The large RMSE values for the BLH comparisons at 8 AM and 8 PM suggest that the nighttime simulation of PM2.5 may have greater meteorological uncertainty than the daytime simulation (Fig. S8 and Table 1).
The emission inventories of SIA and SOA precursors are important model inputs (Huang et al., 2014). The uncertainty of SO2 emissions affects surface sulfate concentrations. Our model underestimates SO2 concentrations in winter and overestimates its concentrations in summer in Beijing (Fig. 4a). Consistently, top-down estimates suggest lower SO2 emissions in summer and higher in winter in China compared with the MEIC inventory (Koukouli et al., 2018). Improving SO2 emissions may reduce the model bias for sulfate.
Our model largely underestimates NO2 concentrations year round (Fig. 4b). The bottom-up NOx inventory has an uncertainty of about 50%. Top-down estimates suggest lower NO2 emissions in Beijing and its surrounding area than the MEIC inventory (Qu et al., 2017). Moreover, laboratory and field measurements show that the NO2 uptake coefficient (γNO2) on the aerosol surface ranges from 10⁻⁸ to 10⁻⁴ (Spataro and Ianniello, 2014 and references therein; M. Li et al., 2019 and references therein). The default GEOS-Chem model uses a relatively high γNO2 of 10⁻⁴, which may cause the underestimated NO2 concentrations as well as the overestimated concentrations of HNO3, HONO, and nitrate and needs further evaluation (Alexander et al., 2020). For NH3, the model underestimates its monthly mean concentrations in Beijing (Fig. 4c). The non-agriculture NH3 emissions are based on the year of 2006 and can be greater in 2012 because of the rapid economic growth (Kang et al., 2016; Meng et al., 2017). Several studies show that the non-agriculture emissions are the dominant NH3 sources during haze periods in Beijing (Pan et al., 2016; Sun et al., 2017). The underestimation of NH3 affects the ammonium simulations when the thermodynamic equilibrium is limited by gaseous NH3. For SOA precursors, Fig. 4d shows that the model underestimates surface CO concentrations in Beijing, which may contribute to the model underestimation of anthropogenic SOA. The modeled summertime isoprene concentrations in Beijing are lower than the observations by 20-50%, affecting the simulations of biogenic SOA (Table S5 in SI). The model also underestimates the aromatic concentrations, similar to previous studies. However, such underestimation has little influence on SOA herein because aromatic SOA is modeled by the parameterization on CO in the Simple SOA scheme as part of anthropogenic SOA. Oxidants are essential to chemical conversion.
Figure 5a-b shows the modeled and the observed concentrations of OH· and HO2· radicals in Beijing. The peak concentrations of OH· and HO2· radicals are underestimated by a factor of 1.5-2 and 2-4, respectively, explained by the missing source of daytime HONO (Fig. S9). ... Nevertheless, the overestimated O3 has little influence on the SOA simulation by the Simple SOA scheme and has minor impacts on SIA because of the dominant contribution from the photochemical and heterogeneous pathways. Moreover, NO3· affects the formation of nitrate and SOA (Ng et al., 2017). Measurements of NO3· in Beijing show nighttime peak concentrations of less than 6 pptv in summer and below the detection limit of 2.7 pptv in winter. The modeled concentrations are three times greater than the peak concentrations in summer (Fig. 5d), suggesting a possible overestimation of nighttime oxidation.
In addition, the heterogeneous production of sulfate and SOA is not included in the standard model, leading to underestimations. The model uses relatively high values of γN2O5 and ignores the formation of nitryl chloride from the N2O5 uptake, both leading to the overestimation of nitrate (McDuffie et al., 2018; Davis et al., 2008; Jaegle et al., 2018). Another bias is the high default value of γNO2 as described previously. Biases may also relate to the atmospheric removal of the SIA species. For example, the GEOS-Chem model underestimates the wet deposition of nitrate in China by 15-23%, especially in urban areas in summer, which may affect both nitrate and ammonium (Xu et al., 2018; Jaegle et al., 2018; Luo et al., 2019). The surface resistance of HNO3 is overestimated in the model (Shah et al., 2018), although the test with doubling the deposition velocity of HNO3 suggests a minor impact of this factor on nitrate simulations. The photolysis rate of particle-phase nitrate affects the loss of nitrate (Romer et al., 2018; Kasibhatla et al., 2018). In Beijing, particulate nitrate may have lower photolysis rates because of the high mass concentrations and thick coating of PM2.5 (Ye et al., 2017).
The tested factors for the sensitivity runs are listed in Table 2. Case 0 represents the standard model simulations. In Case 1, the nighttime BLH was multiplied by 3.6, based on the lowest median value of the MERRA2-to-observation ratios at 8 AM and 8 PM (Fig. S8), when the original BLH was lower than 500 m (i.e., the median of the observed BLH). In Case 2, the SO2 emissions in China were multiplied by 0.8 in summer and 1.5 in winter, based on the minimum and maximum values, respectively, of the ratios between the top-down estimates provided by Koukouli et al. (2018) and the MEIC inventory (Fig. S10). In Case 3, the non-agriculture NH3 emissions in China were scaled up by 1.4 as suggested by Kang et al. (2016).
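The adjustments in Cases 1 to 3 are simple conditional scalings, which the sketch below makes explicit. It is illustrative only: the function names are hypothetical, the caller is assumed to supply the nighttime flag (the text does not define the nighttime window here), and a factor of 1.0 for spring and autumn SO2 emissions is an assumption, since the text gives factors only for summer and winter.

```python
NIGHT_BLH_FACTOR = 3.6        # Case 1 multiplier from the text
BLH_THRESHOLD_M = 500.0       # Case 1 applies only below this height
SO2_SEASONAL_SCALE = {"summer": 0.8, "winter": 1.5}  # Case 2
NON_AGRI_NH3_SCALE = 1.4      # Case 3


def adjust_blh(blh_m, is_night):
    """Case 1: raise shallow nighttime boundary layers, leave others unchanged."""
    if is_night and blh_m < BLH_THRESHOLD_M:
        return blh_m * NIGHT_BLH_FACTOR
    return blh_m


def scale_so2(emission, season):
    """Case 2: seasonal SO2 scaling; seasons without a stated factor use 1.0 (assumed)."""
    return emission * SO2_SEASONAL_SCALE.get(season, 1.0)


def scale_non_agri_nh3(emission):
    """Case 3: uniform scaling of non-agriculture NH3 emissions."""
    return emission * NON_AGRI_NH3_SCALE
```

Note that the threshold makes the Case 1 adjustment discontinuous: a 499 m layer becomes roughly 1800 m, while a 501 m layer is left unchanged.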
In Case 4, the reaction rate coefficients for the reactions that directly involve OH· oxidation and affect the formation and loss of PM2.5, such as the gaseous formation of sulfuric acid and HNO3 and the oxidation of HNO3, were multiplied by 1.5 in summer and 2 in winter to offset the influence of underestimated OH· concentrations. The multipliers of 1.5 and 2 were derived on the basis of the largest ratio of simulated to observed hourly mean OH· concentrations between 9 AM and 3 PM. In terms of the heterogeneous formation of sulfate, we added two types of parameterizations for γSO2 in Cases 5 and 6. One derives the uptake coefficient of SO2 from RH (γSO2-RH), and the other calculates the coefficient as a function of ALWC (γSO2-ALWC). The former is on the order of 10⁻⁵, and the latter is in the range of 10⁻⁶ to 10⁻⁴. For comparison, the uptake coefficients are 5×10⁻⁵ in G. ... and 10⁻⁹ to 10⁻³ in Shao et al. (2019). In Case 7, we reduced the value of γN2O5 from the parameterization of Evans and Jacob (2005) ... (Table S6 in SI) for the two case periods. We did not test any parameter related to OA because of the lack of sufficient ambient and laboratory constraints.
Figure 6 shows the simulation-to-observation ratios of hourly mean mass concentrations of NR-PM2.5, sulfate, nitrate, and ammonium for Cases 0 to 9. The nocturnal BLH, the non-agriculture NH3 emissions, the OH· levels, and the wet deposition of nitrate have minor impacts on the model performance of these components. The updated SO2 emissions in Case 2 can significantly improve the model simulation of sulfate in Beijing, although further improvements are needed in winter. Similar to previous findings, the heterogeneous uptake of SO2 in Cases 5 and 6 increases the simulated sulfate concentrations and leads to better model-observation comparisons in winter. However, both cases lead to the overestimation of sulfate concentrations in summer.
The variances of the simulation-to-observation ratios for both cases are also greater than those of the standard simulation in Case 0, indicating the limitation of those parameterizations. A mechanistic approach, rather than one based on indirect indicators like RH and ALWC, may be necessary to improve the seasonality of the sulfate simulation.

Relative importance of various factors to the model bias
The reduced γN2O5 in Case 7 leads to a minor reduction of simulated nitrate concentrations, suggesting that the uncertainty of heterogeneous uptake of N2O5 is not the main cause of the overestimation of nitrate. The simulations with more reasonable γNO2 in Case 8 are able to reproduce the observed nitrate concentrations in winter, indicating that biased NO2 uptake is an important contributor to the overestimation of nitrate. However, the updated γNO2 alone is insufficient to correct the nitrate concentrations in summer, suggesting additional factors that contribute to the summertime overestimation. Given that the model overestimates the summertime concentrations of both HNO3 (Fig. S5) and nitrate, the bias is perhaps related to their insufficient removal. The updated wet deposition of nitrate can reduce the summertime monthly mean concentrations by about 20% (Fig. S11), but this is still minor relative to the large overestimation. Greater dry deposition of HNO3 and faster photolysis of particulate nitrate as well as the joint influence of multiple factors (as discussed later) are possible ways to solve the remaining overestimation.
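To give a sense of why the choice of uptake coefficient matters so much, the sketch below evaluates a first-order heterogeneous loss rate in the free-molecular limit, k = γ c̄ S / 4, where c̄ is the mean molecular speed and S the aerosol surface area density. This simplified expression is an assumption made for illustration: it neglects the gas-phase diffusion term that chemical transport models typically include, it is not taken from this paper, and the function names are hypothetical.

```python
import math

R_GAS = 8.314462618  # universal gas constant, J mol^-1 K^-1


def mean_molecular_speed(temperature_k, molar_mass_kg):
    """Mean thermal speed of a gas molecule, m s^-1."""
    return math.sqrt(8.0 * R_GAS * temperature_k / (math.pi * molar_mass_kg))


def uptake_rate(gamma, temperature_k, molar_mass_kg, surface_area_m2_per_m3):
    """First-order loss rate (s^-1) for uptake on aerosol, free-molecular limit."""
    speed = mean_molecular_speed(temperature_k, molar_mass_kg)
    return 0.25 * gamma * speed * surface_area_m2_per_m3


# Example: NO2 (0.046 kg mol^-1) at 298 K on 1e-3 m^2 m^-3 of aerosol surface.
k_high = uptake_rate(1e-4, 298.0, 0.046, 1e-3)  # upper end of the reported range
k_low = uptake_rate(1e-6, 298.0, 0.046, 1e-3)   # two orders of magnitude lower
```

In this limit the loss rate is linear in γ, so moving γNO2 across its reported range of 10⁻⁸ to 10⁻⁴ changes the heterogeneous NO2 sink by four orders of magnitude.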
Sulfate and nitrate simulations interact with each other through thermodynamic equilibrium, especially in winter when NH3 emissions are lower than in summer. As shown in Fig. 6, adding the heterogeneous formation of sulfate reduces the simulation-to-observation ratios of nitrate in winter (i.e., the median ratio from 2.6 to 1.8-2.3 in Cases 5-6) and the simulated weekly mean concentrations of nitrate by 16-36% (Fig. S12a). On the other hand, the reduced γNO2 leads to the reduction of the simulation-to-observation ratios of sulfate (i.e., about 0.1 reduction of the median ratios) and of the weekly mean simulated sulfate concentrations by 12-20% (Fig. S12b). The reduced γNO2 decreases the HONO concentrations by 98% and hence the OH· levels by 26-74% in Beijing, which leads to lower concentrations of sulfate.
Figure 7 shows the R and absolute NMB (|NMB|) values of the sulfate and nitrate simulations for Case 0, Cases 5, 6, and 8 (i.e., updated heterogeneous formation), and Cases 10 to 50. In winter, the parameterization of heterogeneous sulfate formation on RH in Case 5 improves R but leads to greater |NMB|, while the parameterization on ALWC in Case 6 leads to near-zero |NMB| but little change in R. By contrast, in summer, Case 5 leads to worse values of both R and |NMB|, and Case 6 only affects R. The results suggest that the parameterization on ALWC seems to be better in terms of overall model performance than the parameterization on RH. The decreased R in summer in Case 6 is perhaps because the biased inorganic aerosol concentrations and the underrepresented organic contribution in the ALWC calculations lead to large uncertainty in the estimated ALWC and sulfate concentrations (Pye et al., 2009). For nitrate, the change of γNO2 in Case 8 leads to large improvements of either R or NMB in both seasons.
The combination of the heterogeneous factors with other factors in Cases 10-50 shows varying changes in model performance. For example, the combination of factors related to heterogeneous formation of sulfate, SO2 emissions, OH· levels, and γNO2 shows worse R or |NMB| compared to Case 6 in winter (Fig. 7a) but improved model performance in summer (Fig. 7b). Such interaction suggests that the parameterization of heterogeneous sulfate formation is sensitive to the precursor concentrations and the oxidation conditions. Therefore, accurate SO2 emissions and well-reproduced oxidant conditions are necessary for improving the sulfate simulation. For nitrate, the combinations of the γNO2 factor with other factors can worsen R and |NMB| in winter. In particular, the combination of the improved γNO2 with the implementation of heterogeneous sulfate formation and the updated SO2 emissions leads to the greatest reduction of R and increase of |NMB| among cases (Fig. 7c), which is explained by the limitation of NH3 relative to high sulfate concentrations. This impact is perhaps smaller in summer because of the greater NH3 emissions. Accurate sulfate simulation is therefore important for the improvement of the simulation of wintertime nitrate in Beijing. The combination of various factors with the improved γNO2 leads to a consistent reduction of |NMB| in summer (Fig. 7d). Case 50 represents the combination of all factors (including γSO2-ALWC, not γSO2-RH). It shows an R value of 0.8/0.9 (winter/summer) and an |NMB| value of 0.05/0.3 for sulfate, and an R value of 0.8/0.7 and an |NMB| value of 0.3/2.1 for nitrate. By contrast, the standard simulation in Case 0 shows an R value of 0.9/0.9 and an |NMB| value of 0.6/0.3 for sulfate, and an R value of 0.9/0.7 and an |NMB| value of 2.0/4.7 for nitrate.
https://doi.org/10.5194/acp-2020-76 Preprint. Discussion started: 4 May 2020 © Author(s) 2020. CC BY 4.0 License.
For sulfate, the |NMB| is largely improved in winter by the combination of all factors. In summer, the influences of the individual factors seem to cancel out, leading to an insignificant change in |NMB|. For nitrate, the combination of all factors can greatly improve the |NMB| in both seasons, although the overestimation of nitrate is still very large in summer.
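The comparison between Case 0 and Case 50 can be made concrete with a short calculation on the R and |NMB| values quoted in the text. The sketch below is illustrative only: the dictionary layout and helper name are hypothetical, and the conversion from NMB to an overall simulation-to-observation ratio assumes the conventional definition NMB = sum(sim - obs) / sum(obs) together with a positive bias for nitrate, as stated in the text.

```python
# (winter, summer) values quoted in the text for the standard simulation
# (Case 0) and the combination of all factors (Case 50).
case0 = {
    "sulfate": {"R": (0.9, 0.9), "NMB": (0.6, 0.3)},
    "nitrate": {"R": (0.9, 0.7), "NMB": (2.0, 4.7)},
}
case50 = {
    "sulfate": {"R": (0.8, 0.9), "NMB": (0.05, 0.3)},
    "nitrate": {"R": (0.8, 0.7), "NMB": (0.3, 2.1)},
}


def relative_reduction(before, after):
    """Fractional reduction of |NMB| going from Case 0 to Case 50."""
    return (before - after) / before


summary = {}
for species in case0:
    for i, season in enumerate(("winter", "summer")):
        before = case0[species]["NMB"][i]
        after = case50[species]["NMB"][i]
        summary[(species, season)] = relative_reduction(before, after)
        print(f"{species:8s} {season}: |NMB| {before} -> {after} "
              f"({100 * summary[(species, season)]:.0f}% reduction)")

# Assumption: nitrate NMB is positive (overestimation), so the ratio of
# total simulated to total observed nitrate is 1 + NMB.
nitrate_summer_ratio = 1 + case50["nitrate"]["NMB"][1]
print(f"Summer nitrate, Case 50: simulated total is about "
      f"{nitrate_summer_ratio:.1f} times the observed total")
```

Under these assumptions, the summertime nitrate |NMB| is roughly halved in Case 50, yet the simulated total remains about three times the observed total, which is consistent with the statement that the summertime overestimation is still very large.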

Conclusions
We evaluated the GEOS-Chem model simulations with a nationwide dataset in China and a long-term hourly dataset in Beijing for sulfate, nitrate, ammonium, and OA. The underestimation of sulfate and the overestimation of nitrate concentrations for most of the sites are consistent with previous findings. The Simple SOA scheme significantly improves the OA simulations in China, suggesting that the missing SOA formation from anthropogenic precursors is perhaps the main reason for the underestimation of OA in previous studies. The model-observation agreement shows significant seasonality. Sulfate is mostly underestimated in winter, and nitrate is significantly overestimated except in winter. Our model is unable to reproduce the diurnal patterns of nitrate and ammonium. Sensitivity analysis for factors related to meteorology, emission, chemistry, and atmospheric removal with laboratory constraints shows that uncertainties in chemistry perhaps dominate the model bias. Among the various individual factors, updated heterogeneous parameterizations of SO2 and NO2 efficiently reduce the model-observation gaps of sulfate and nitrate, respectively. The impacts of various factors on model improvements cancel out in some cases. Overall, the combination of all factors significantly improves the simulation for sulfate and nitrate. Because of the joint influence among factors, accurate SO2 emissions as well as well-reproduced oxidant conditions and heterogeneous formation are essential for accurate sulfate simulation. Good sulfate simulation improves the nitrate simulation in urban areas with high anthropogenic emissions. Mechanistic approaches other than parameterization on RH and ALWC are needed to improve the seasonality of the sulfate simulation. The summertime overestimation of nitrate remains the biggest problem in the model, which requires a better understanding of the atmospheric nitrogen budget.
Simultaneous measurements of major reactive nitrogen species including NOx, N2O5, NO3·, HONO, HNO3, NH3, and particle-phase nitrogen in the field campaigns can provide critical data sets for future model investigations. For OA, the remaining underestimation is plausibly associated with the insufficient SOA production in the model, which merits further explicit investigations.
Data availability. Data presented in this manuscript are available upon request to the corresponding author.
(Table 2) are also shown (marked as numbers). The solid red, open red, solid blue circles represent the cases that combine Cases 5, 6, and 8 with other changes, respectively, whereas the solid gray circles represent the rest of the cases (Table S6).