Seasonal distribution and drivers of surface fine particulate matter and organic aerosol over the Indo-Gangetic Plain

The Indo-Gangetic Plain (IGP) is home to 9 % of the global population and is responsible for a large fraction of agricultural crop production in Pakistan, India, and Bangladesh. Levels of fine particulate matter (mean diameter < 2.5 μm, PM2.5) across the IGP often exceed human health recommendations, making cities across the IGP among the most polluted in the world. Seasonal changes in the physical environment over the IGP are dominated by the large-scale south Asian monsoon system that dictates the timing of agricultural planting and harvesting. We use the WRF-Chem model to study the seasonal anthropogenic, pyrogenic, and biogenic influences on fine particulate matter and its constituent organic aerosol (OA) over the IGP that straddles Pakistan, India, and Bangladesh during 2017–2018. We find that surface air quality during pre-monsoon (March–May) and monsoon (June–September) seasons is better than during post-monsoon (October–December) and winter (January–February) seasons, but all seasonal mean values of PM2.5 still exceed the recommended levels, so that air pollution is a year-round problem. Anthropogenic emissions influence the magnitude and distribution of PM2.5 and OA throughout the year, especially over urban sites, while pyrogenic emissions result in localised contributions over the central and upper parts of IGP in all non-monsoonal seasons, with the highest impact during post-monsoon seasons that correspond to the post-harvest season in the agricultural calendar. Biogenic emissions play an important role in the magnitude and distribution of PM2.5 and OA during the monsoon season, and they show a substantial contribution to secondary OA (SOA), particularly over the lower IGP. We find that the OA contribution to PM2.5 is significant in all four seasons (17 %–30 %), with primary OA generally representing the larger fractional contribution. We find that the volatility distribution of SOA is driven mainly by the mean total OA loading and the washout of aerosols and gas-phase aerosol precursors, which result in SOA being less volatile during the pre-monsoon and monsoon seasons than during the post-monsoon and winter seasons.

The unique geography of the IGP and broader-scale meteorological drivers, coupled with the regional diversity of seasonal pollutant emission sources, make this region one of the most challenging places to study the controls of its air pollution and the consequent impact on human health. Here, we use the WRF-Chem regional atmospheric chemistry and transport model to describe the seasonal patterns of surface organic aerosol and PM2.5 and to help disentangle the role of anthropogenic, pyrogenic, and biogenic emissions in their surface patterns across the IGP.
The importance of the IGP lies in the fertility of its soils formed from alluvium that is deposited across the Indus and Ganges basins by the Indus and Ganges rivers. These rivers originate in the Himalaya mountains and the Tibetan Plateau. The Indus and Ganges basins also benefit from precipitation from the seasonal monsoon. The monsoon timing also defines the main seasons over the IGP (India Meteorological Department, 2020): the pre-monsoon season runs from March to May, the monsoon season is from June to September, the post-monsoon season is from October to December, and winter occurs in January and February. The Indian states across the IGP (e.g. Punjab, Haryana, and Uttar Pradesh) represent the vast majority of nationwide wheat and rice production. Rice and wheat are planted in May and November and harvested in October–November and April–May, respectively, following the rice–wheat cropping cycle. The IGP is also an important producer of sugarcane, cultivated mainly in the Indus valley in Pakistan and in the Indian state of Uttar Pradesh. The two main seasons for planting are in September–October and February–March, followed by harvesting during the winter and pre-monsoon months, respectively. Crop residues left from harvesting, e.g. husk, bran, and straw, are generally burned in open fires. Traditionally, these residues were ploughed back into the soil to maintain fertility and stability, but the sheer scale of current production precludes these practices in time for a second growing season (Chauhan et al., 2012; Ahmed et al., 2015). Open burning of these residues across the IGP, particularly during the post-monsoon season, is a large source of gaseous and particulate pollution that has implications for regional air quality and human health (Vadrevu et al., 2011; Jethva et al., 2019; Sembhi et al., 2020). Residential biofuel combustion also plays an important role in air quality (Conibear et al., 2020; Agarwala and Chandel, 2020).
The high population density and intense human activity over the IGP result in anthropogenic emissions being a major source of regional surface air pollution (Begum et al., 2013; Guttikunda and Jawahar, 2014; Shahid et al., 2015; Venkataraman et al., 2018). Residential energy consumption represents a major contribution to anthropogenic emissions, with a large fraction of the rural and urban population using solid fuel for cooking (Conibear et al., 2018). Emissions from land transportation, particularly in cities, also represent a significant contribution to anthropogenic emissions (Begum et al., 2013; Mallik and Lal, 2014). Intense agriculture over the IGP is associated with large emissions of ammonia, an aerosol precursor, from urea fertiliser application, as well as from post-harvest burning as described above (Kuttippurath et al., 2020; Wang et al., 2020). Vegetation cover over the IGP consists mainly of croplands (Stibig et al., 2007; Gumma et al., 2019), which have lower isoprene emissions than trees (Hardacre et al., 2013). Consequently, biogenic emissions over the IGP are lower than over other parts of south Asia (Guenther et al., 2006; Stavrakou et al., 2014).
Regional dispersion of air pollution over the IGP is dominated on a seasonal timescale by the monsoon system, influenced by the high mountain ranges of the Hindu Kush and Himalayas that lie to the northwest to northeast of the IGP. Agricultural planting and harvesting (and associated burning) are determined by the timing of the monsoon, when the majority of the annual rainfall falls. Consequently, observed variations of PM2.5 reflect large-scale variations in meteorology and the seasonal variations in anthropogenic, biogenic, and pyrogenic emissions (Jethva et al., 2005; Lelieveld et al., 2018; Schnell et al., 2018).
A growing number of regional models have been used to study the relationship between emissions, meteorology, and PM2.5 over India (Kumar et al., 2015b; Bran and Srivastava, 2017; Kulkarni et al., 2020; Ojha et al., 2020) and to estimate the health impacts of outdoor exposure to PM2.5 (Ghude et al., 2016; Conibear et al., 2018; David et al., 2019). Many studies have focused on post-monsoon biomass burning episodes and on air pollution during the winter season over the upper-central Indian part of the IGP (Guttikunda and Gurjar, 2012; Ram et al., 2012; Pant et al., 2015; Kumar et al., 2015a; Jethva et al., 2018; Singh et al., 2018; Krishna et al., 2019; Mhawish et al., 2020). However, the IGP also includes parts of Pakistan and Bangladesh that remain poorly studied, even though they are connected via atmospheric transport. With only a few exceptions, these studies have focused on total PM2.5, although there is evidence that single aerosol components play a major role in PM2.5 composition over the IGP (Gani et al., 2019, and Singh et al., 2018, and references therein). Measurements have shown that organic aerosol (OA), originating from anthropogenic, pyrogenic, and biogenic emissions, constitutes a significant fraction (20 %–35 %) of PM2.5 across the IGP, especially during post-monsoon and winter seasons (Ram et al., 2008; Alam et al., 2014; Rajput et al., 2014; Behera and Sharma, 2015; Sharma et al., 2016). OA exists as a complex mixture, comprising thousands of individual organic compounds, and it is made up of primary OA (POA), emitted directly to the atmosphere, and of secondary OA (SOA), formed by the condensation of organic vapours as they become progressively less volatile through oxidation (Seinfeld and Pandis, 2016; Donahue et al., 2006). Changes in OA volatility are key for the formation of SOA, which is particularly sensitive to temperature, ambient concentration of OA, and nitrogen oxide levels (Shrivastava et al., 2017).
We take advantage of the volatility basis set (VBS) model, described below, which helps to describe succinctly the evolving volatility of OA through oxidative chemistry in the atmosphere (Donahue et al., 2006, 2012; Chuang and Donahue, 2016). This method has been used successfully in a range of modelling studies (Lane et al., 2008b; Bergström et al., 2012; Ahmadov et al., 2012; Zhang et al., 2013; Zhao et al., 2016).
We use the WRF-Chem regional atmospheric chemistry model to characterise the seasonal and spatial distributions and composition of PM2.5 and OA in light of synoptic meteorology and emission drivers over three subregions of the IGP, including relevant parts of Pakistan and Bangladesh. We use a 1-D VBS model to describe the evolution of OA and its influence on PM2.5, described in Sect. 2. In Sect. 2, we also describe the in situ and satellite measurements we use to evaluate our model. In Sect. 3, we describe the seasonal meteorology over the IGP, the seasonal distributions and composition of PM2.5 and OA, and the seasonal distribution of SOA volatility. In Sect. 3, we also use a perturbative approach to understand the sensitivity of PM2.5 constituent distributions to changes in anthropogenic, pyrogenic, and biogenic emissions and to seasonal changes in the atmospheric environment. We conclude our study in Sect. 4.

Data and methods
Here, we describe the WRF-Chem model that we use to understand the influence of anthropogenic, pyrogenic, and biogenic emissions on the atmospheric distribution of PM2.5 and OA over the IGP.

Weather Research and Forecasting model coupled with Chemistry
We use v.3.9.1.1 of the Weather Research and Forecasting (WRF) model coupled with Chemistry (WRF-Chem) (Grell et al., 2005) to describe the emissions and atmospheric chemistry and transport associated with gas- and aerosol-phase compounds over the IGP during 2017 and 2018. WRF uses the Advanced Research WRF (ARW) dynamical solver to solve the fully compressible, non-hydrostatic Euler equations that describe atmospheric flow. These calculations are coupled with atmospheric chemistry, so that our PM2.5 and OA calculations are consistent with the meteorology.
Our study domain is defined as 17–40° N and 64–97° E, encompassing the IGP at a horizontal spatial resolution of 20 km and using 33 vertical levels that span from the surface to 50 hPa (∼ 19 km). For the description of terrain data for the domain (land-use and soil categories), we use MODIS IGBP 21-category data at 30 arcsec resolution (∼ 1 km) (Friedl et al., 2010). To define our initial conditions and lateral boundary conditions, and for nudging (Newtonian relaxation), we use meteorological reanalyses from NCEP FNL Operational Model Global Tropospheric Analyses Data (National Centers for Environmental Prediction, National Weather Service, NOAA, U.S. Department of Commerce, 2015) at a spatial resolution of 0.25° × 0.25° and at a temporal resolution of 6 h. We use the nudging approach at all levels to prevent our calculations from deviating too far from observed meteorology. Table B1 provides more details about the meteorological processes we use in our calculations. Chemical initial conditions and lateral boundary conditions for each month are provided by 6-hourly CAM-CHEM global model data (Buchholz et al., 2019). We spin up each simulation for a week before studying the model output to minimise the influence of the initial conditions.
Figure 2. Seasonal mean daily emissions over the IGP (g m⁻² d⁻¹) of (a, d, g, j) anthropogenic, (b, e, h, k) biomass burning, and (c, f, i, l) biogenic (isoprene) emissions. Anthropogenic emissions are from EDGAR-HTAP and fire emissions are from FINN. Biogenic emissions are calculated online in WRF-Chem using MEGAN. To determine total anthropogenic and pyrogenic emissions, we sum across all emitted species, respectively, while for biogenic emissions, we consider only isoprene.
To describe gas-phase chemistry, we use the Model for OZone And Related chemical Tracers, version 4 (MOZART-4) chemical mechanism (Emmons et al., 2010), including the extended treatment of volatile organic compound (VOC) chemistry (Knote et al., 2014). Photolysis rates are calculated by the Fast Tropospheric Ultraviolet-Visible (FTUV) module.
The log₁₀ C* = −4 volatility corresponds to an inert compound and serves computationally as a loss of particle-phase organics to avoid unrealistic volatile mixtures due to continuous ageing of gas-phase semi-volatile organic compounds (SVOCs). Lumped anthropogenic, pyrogenic, and biogenic gas-phase aerosol precursors undergo continuous gas-phase oxidation and partition between the gas and aerosol phase using pseudo-ideal partitioning theory (Pankow, 1994). Partitioning between the gas and aerosol phase depends on total organic aerosol load and temperature. SOA yields are also dependent on NOx levels, so SOA yields are calculated differently for low- and high-NOx conditions, through a branching ratio (Lane et al., 2008a). We also include the SOA formation from glyoxal (Knote et al., 2014). Loss of SVOCs is from washout via convective- and grid-scale precipitation. Our chosen implementation of VBS only accounts for SVOCs and assumes that POA is inert, so that it contributes only to the aerosol mass. We do not include direct emissions of SVOCs or intermediate VOCs (IVOCs). This is a limitation of our current implementation given evidence that SVOC and IVOC vapours create a considerable amount of regional SOA and that POA emissions are semivolatile and undergo oxidation and should also be considered in describing SOA production (Robinson et al., 2007). To describe POA using the VBS approach, we would require information about the volatility distribution of POA, but conventional inventories typically consider POA to be non-volatile. The 1-D version of the VBS model is unable to describe some aspects of SOA formation, including fragmentation and the increase in OA oxidation state, which are better described by the 2-D version of the model that tracks the oxygen-to-carbon ratio (O : C) in addition to organic mass (Donahue et al., 2012). Previous studies have shown that the 2-D VBS model improves model–measurement agreement in SOA (e.g. Zhao et al., 2016) but has a significant associated computational burden when used in 3-D chemistry transport models. Further details of this VBS implementation in WRF-Chem are described in Knote et al. (2015) and references therein.
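The dependence of gas–aerosol partitioning on total OA loading described above can be illustrated with a minimal Python sketch of absorptive equilibrium partitioning over a set of volatility bins. It is not the WRF-Chem implementation: the function name, the bin values, and the loadings are hypothetical, POA is treated as an inert non-volatile mass (as assumed in the text), and the temperature dependence of C* is omitted. For each bin i, the particle-phase fraction is taken as ξ_i = 1 / (1 + C*_i / C_OA), and the total loading C_OA is found by fixed-point iteration.

```python
import numpy as np

def partition_vbs(total_conc, c_star, poa=0.0, tol=1e-9, max_iter=1000):
    """Equilibrium gas-particle partitioning for a 1-D volatility basis set.

    total_conc : total (gas + particle) organic mass in each bin, ug m-3
    c_star     : effective saturation concentration of each bin, ug m-3
    poa        : inert, non-volatile primary OA mass, ug m-3 (assumption)
    Returns (particle-phase fraction per bin, total OA mass in ug m-3).
    """
    total_conc = np.asarray(total_conc, dtype=float)
    c_star = np.asarray(c_star, dtype=float)
    c_oa = poa + total_conc.sum()  # first guess: everything is condensed
    xi = np.ones_like(total_conc)
    for _ in range(max_iter):
        if c_oa <= 0.0:
            # No absorbing mass left: nothing partitions to the particle phase.
            return np.zeros_like(total_conc), 0.0
        xi = 1.0 / (1.0 + c_star / c_oa)
        new_c_oa = poa + float((total_conc * xi).sum())
        converged = abs(new_c_oa - c_oa) < tol
        c_oa = new_c_oa
        if converged:
            break
    return xi, c_oa

# Illustrative bins at C* = 1, 10, 100, 1000 ug m-3 (hypothetical values).
fractions, loading = partition_vbs([1.0, 1.0, 1.0, 1.0],
                                   [1.0, 10.0, 100.0, 1000.0], poa=10.0)
```

In this sketch a larger pre-existing OA mass shifts every bin towards the particle phase, which is the loading dependence that the text refers to.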
We use monthly anthropogenic emissions from the Emission Database for Global Atmospheric Research with Task Force on Hemispheric Transport of Air Pollution (EDGAR-HTAP v2.2) for the year 2010 (Janssens-Maenhout et al., 2015) as provided by the WRF-Chem community, which provides the total anthropogenic emissions and includes a non-methane volatile organic compound (NMVOC) speciation according to the gas and aerosol chemistry scheme we use here (MOZART-MOSAIC). Using an anthropogenic emission inventory for 2010 to describe atmospheric chemistry during 2017–2018 will inevitably introduce some biases in our model PM2.5 estimates, particularly because our study domain includes regions with rapidly growing emissions. From 2010 to 2017, India has seen reductions in black carbon (BC), organic carbon (OC), CO, and NMVOC emissions from the residential sector, owing to policies that have enabled a switch to cleaner residential fuels and energy sources. However, India's growing economy has led to a rapid increase of NOx and SO2 emissions from the industrial sector (∼ +12 %, ∼ +10 %) and energy sector (∼ +20 %, ∼ +26 %) and an increase in NOx and NMVOC from on-road transportation (∼ +50 %, ∼ +27 %). An increase in intensive agricultural practices over the Indian IGP has increased ammonia emissions (NH3; ∼ +15 %) (McDuffie et al., 2020). Errors in PM precursor gaseous emissions will impact our ability to describe air pollution for our study year, especially for individual components of secondary inorganic aerosols (nitrate, sulfate, and ammonium) and SOA. It remains difficult to disentangle the impact of using outdated emission estimates from other sources of model error, e.g. meteorology, chemistry, land-use change, and model resolution. For pyrogenic emissions, hourly biomass burning emissions are taken from the Fire Inventory from NCAR (FINNv1.5) for 2017–2018 (Wiedinmyer et al., 2011). Pyrogenic emissions are apportioned between the FINN and EDGAR-HTAP inventories.
The FINNv1.5 inventory includes global estimates of trace gas and particle emissions from the open burning of biomass, which includes wildfire, agricultural fires, and prescribed burning (Wiedinmyer et al., 2011). EDGAR-HTAPv2.2 is focused on anthropogenic emissions but excludes large-scale biomass burning (e.g. forest fires, peat fires), agricultural waste, or field burning. Within its residential sector, emissions include small-scale combustion, including heating, lighting, cooking, and solid waste disposal or incineration (Janssens-Maenhout et al., 2015). Biogenic emissions are calculated online using the Model of Emissions of Gases and Aerosol from Nature (MEGAN; Guenther et al., 2006). Figure 2 shows the seasonal distributions of total anthropogenic, pyrogenic, and biogenic (predominantly isoprene) emissions over the IGP. Total anthropogenic emissions have been calculated by summing the mass contribution from all the chemical species (gas and particle) specified in the inventory once preprocessed onto the model domain using the WRF-Chem tools for the community (ACOM-NCAR, 2020). We converted gas emissions to mass units using the appropriate molar mass for each species. The same approach has been used to calculate fire emissions, while isoprene emissions are calculated online by MEGAN in the WRF-Chem model and then converted to mass units. Anthropogenic emissions generally dominate in all seasons (Fig. 2a, d, g, j), with daily values ranging from 10¹ to 10² g m⁻² d⁻¹. The two largest localised regions of anthropogenic emissions are Delhi and Kolkata with emissions > 100 g m⁻² d⁻¹, followed by smaller Indian cities, e.g. Patna, Varanasi, Kanpur, and Lucknow (Fig. 1). Just south of the border of Uttar Pradesh, the Madhya Pradesh district of Singrauli hosts several large power plants.
The Pakistani and Bangladeshi parts of the IGP generally have the lowest anthropogenic emissions, with the exception of Karachi in south Pakistan, the north Pakistani Punjab (the most populated part of Pakistan where Lahore and Faisalabad are located), and Dhaka in Bangladesh. Karachi and Dhaka have lower emissions per capita than Indian cities of comparable size.
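The conversion of gas emissions to mass units using molar masses, mentioned above, can be sketched in Python as follows. This is an illustration rather than the preprocessing actually used: it assumes that gas-phase emission rates are supplied in mol km⁻² h⁻¹ and particle emission rates in µg m⁻² s⁻¹, and the function names and the short species list are hypothetical.

```python
# Approximate molar masses in g mol-1 for a few example species.
MOLAR_MASS_G_PER_MOL = {"CO": 28.01, "NO2": 46.01, "SO2": 64.07, "NH3": 17.03}

M2_PER_KM2 = 1.0e6
HOURS_PER_DAY = 24.0
SECONDS_PER_DAY = 86400.0

def gas_to_g_m2_day(rate_mol_km2_h, species):
    """Convert a gas emission rate from mol km-2 h-1 (assumed) to g m-2 d-1."""
    grams_per_km2_h = rate_mol_km2_h * MOLAR_MASS_G_PER_MOL[species]
    return grams_per_km2_h / M2_PER_KM2 * HOURS_PER_DAY

def aerosol_to_g_m2_day(rate_ug_m2_s):
    """Convert a particle emission rate from ug m-2 s-1 (assumed) to g m-2 d-1."""
    return rate_ug_m2_s * 1.0e-6 * SECONDS_PER_DAY

def total_emission_g_m2_day(gas_rates, aerosol_rates):
    """Sum the mass contribution of all gas and particle species in one cell."""
    gas_total = sum(gas_to_g_m2_day(r, s) for s, r in gas_rates.items())
    aerosol_total = sum(aerosol_to_g_m2_day(r) for r in aerosol_rates.values())
    return gas_total + aerosol_total
```

Summing per-species masses in this way gives one total per grid cell and source type, which is the quantity mapped in Fig. 2 for anthropogenic and fire emissions.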
Fires have a strong seasonal cycle, peaking during pre-monsoon and post-monsoon seasons (Fig. 2b, h), with emissions ∼ 10⁻¹ g m⁻² d⁻¹, mainly due to agricultural stubble burning. Fire emission rates during the post-monsoon harvesting season are 3 times higher than during the pre-monsoon season (∼ 0.9 and ∼ 0.3 g m⁻² d⁻¹, respectively). Post-monsoon fires are almost exclusively located in the Indian Punjab, with the largest values at the border with the state of Haryana. Pre-monsoon fires are located around the border of Pakistani and Indian Punjab and upper Haryana. There are also some isolated fires in the eastern part of the IGP. During winter (Fig. 2k), low fire activity is present in the Indus valley in Pakistan and mainly over Uttar Pradesh from the post-harvesting of sugarcane crops.
Biogenic emissions peak during pre-monsoon and monsoon seasons (Fig. 2c, f), with values of 2 × 10⁻³ and 1.5 × 10⁻² g m⁻² d⁻¹, respectively. The largest values are over Sindh in Pakistan, West Bengal, and Bangladesh. Land cover over the IGP is dominated by croplands, but the state of Sindh includes coastal mangrove plantations, inland riverine forests, irrigated plantations, and rangelands (Ministry of Environment Government of Pakistan, 2009). Moreover, West Bengal and Bangladesh emissions are mostly confined close to the coast, where forest land is present (Reddy et al., 2016). During these two seasons, there are also isoprene emissions over Uttar Pradesh from forests in Pilibhit and Kheri and from northeast Pakistan.
For computational expediency, we have chosen a representative period of 1 month for each distinct season over the IGP. We define, based on the seasonal definition of the India Meteorological Department (India Meteorological Department, 2020), the pre-monsoon period as 18 April to 16 May 2017, the monsoon season as 3 to 31 July 2017, the post-monsoon season as 18 October to 16 November 2017, and finally winter as 8 January to 5 February 2018. The 2017–2018 year is close to the climatological mean state, so our results are typical of this region rather than being influenced by significant circulation changes due to, for example, El Niño–Southern Oscillation climate variations (Null, 2020).
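The representative periods listed above can be written down as a small configuration, sketched here in Python. The dictionary and helper names are hypothetical, and placing the 1-week spin-up immediately before each analysis period is an assumption made for illustration; the text states only that each simulation is spun up for a week.

```python
from datetime import date, timedelta

# Representative analysis period (first day, last day) for each season.
SEASON_PERIODS = {
    "pre-monsoon": (date(2017, 4, 18), date(2017, 5, 16)),
    "monsoon": (date(2017, 7, 3), date(2017, 7, 31)),
    "post-monsoon": (date(2017, 10, 18), date(2017, 11, 16)),
    "winter": (date(2018, 1, 8), date(2018, 2, 5)),
}

SPIN_UP = timedelta(days=7)  # assumed to precede the analysis period

def period_length_days(season):
    """Number of days in the analysis period, counting both end dates."""
    start, end = SEASON_PERIODS[season]
    return (end - start).days + 1

def run_window(season):
    """Simulation start (including spin-up) and end date for a season."""
    start, end = SEASON_PERIODS[season]
    return start - SPIN_UP, end
```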
For the purposes of reporting our results, we divide the IGP into three subregions: the upper IGP that includes the Pakistani states of Sindh and Punjab and the Indian Punjab; the middle IGP that includes the Indian states of Haryana, Delhi NCT, and Uttar Pradesh; and the lower IGP that includes the Indian states of Bihar and West Bengal and Bangladesh, excluding the states of Chittagong and Sylhet (Fig. 1).

Determining the sensitivity of PM2.5 and OA to changes in precursor emissions
We use a perturbative approach to determine the importance of different source sectors on PM2.5 and OA, which takes into account the non-linear chemical environment. The alternative of setting a particular emission source to zero would result in a significant non-linear response that is unique to the source, consequently precluding any meaningful comparison of the importance of a particular source to PM2.5 and OA. First, we run a base run for each season. We then, for each season, systematically perturb one emission source by +5 % over the study domain for the central week of each season, keeping the other sources the same as the base run. Finally, we calculate the sensitivity $S_{ij}$ of species concentration to the changes in a given source of emissions as
$$S_{ij} = \frac{\Delta C_{ij}}{\Delta E},$$
where $\Delta C_{ij}$ represents the concentration change of our target species (PM2.5 and OA in this study) at grid point $ij$ in response to an emission change $\Delta E$ summed over the IGP for a particular source. We perturb anthropogenic and fire emission rates directly. Biogenic emissions are calculated online by scaling normalised emission rates by factors that describe changes in, for example, temperature, photosynthetically active radiation, and leaf area index (LAI) (Guenther et al., 2006). We modify the WRF-Chem code to increment only isoprene emissions because our calculations suggest they account for almost all of biogenic emissions over the IGP, in agreement with other studies (Singh et al., 2011; Surl et al., 2018). $\Delta C_{ij}$ is calculated by summing over time the difference in concentrations at each grid cell $ij$ between the perturbed run $p$, $C^{p}_{ij,t}$, and the base run $b$, $C^{b}_{ij,t}$:
$$\Delta C_{ij} = \sum_{t} \left( C^{p}_{ij,t} - C^{b}_{ij,t} \right).$$
The change in concentration in each grid cell is therefore scaled by the same $\Delta E$, allowing local and non-local emission influences to be considered equally and avoiding singularities in grid cells where there is no net emission change. We use this scaling because it allows us to compare the sensitivity of atmospheric concentrations to different source types.
$\Delta E$ is calculated as the difference of total emissions within the IGP domain between the perturbed model run and the base model run for a given source type:
$$\Delta E = E^{p}_{\mathrm{tot}} - E^{b}_{\mathrm{tot}}.$$
Total emissions across the IGP for the perturbed run $E^{p}_{\mathrm{tot}}$ and for the base run $E^{b}_{\mathrm{tot}}$ are calculated by summing emissions from all species for the length of the simulation and for all grid cells across the IGP:
$$E_{\mathrm{tot}} = \sum_{ij} \sum_{t} \sum_{s} E_{ij,t,s}.$$
In more detail, emissions at each grid point $ij$ for species $s$ between two consecutive model outputs at $t$ and $t + 1$ are calculated (for both the perturbed and base runs) by
$$E_{ij,t,s} = \epsilon_{ij,t,s} \, \Delta t \, A_{ij},$$
where $\epsilon_{ij,t,s}$ denotes the emission rate of species $s$ at location $ij$ and output time $t$, $A_{ij}$ denotes the area of grid point $ij$, which in our calculations is constant at 400 km², and $\Delta t$ corresponds to an interval of model output, which in our calculation is 3 h. To take into account the different spatial variability of emissions from different sources (Fig. 2), we scale $\Delta E$ with the total number of grid cells within the IGP for which the emission difference is > 0.001 g m⁻² d⁻¹, corresponding approximately to cumulative emissions > 2.8 Mg for each grid cell in 1 week. This threshold corresponds to a lower limit for a significant emission rate across the area considered (Fig. 2). We also neglect values of $S_{ij}$ for which the change in the pollutant concentration $\Delta C_{ij}$ is < 5 % of the mean pollutant seasonal concentration over the IGP (4 and 1 µg m⁻³ for PM2.5 and total OA, respectively). Using this additional threshold allows us to isolate significant changes in concentrations due to direct changes in emissions and remove smaller values due to model non-linearity. We report the sensitivity parameter $S_{ij}$ with units of micrograms per cubic metre per gigagram (µg m⁻³ Gg⁻¹). In a policymaking context, our sensitivity parameter provides information about how to control atmospheric concentrations by changing different emission sources in order to obtain the highest air quality benefits from certain emission reductions.
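The sensitivity calculation described in this section, i.e. the time-summed concentration difference in each grid cell divided by the difference in total emitted mass between the perturbed and base runs, can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions rather than the code used for the study: array shapes are (time, y, x), emission rates are already summed over species in g m⁻² d⁻¹, the function names are hypothetical, and the additional scaling by the number of significantly emitting grid cells is left out because its exact form is not given in the text.

```python
import numpy as np

CELL_AREA_M2 = 400.0e6            # 20 km x 20 km grid cell
OUTPUT_STEP_DAYS = 3.0 / 24.0     # 3-hourly model output

def total_emission_gg(rate_g_m2_d):
    """Total emitted mass in Gg from rates of shape (time, y, x) in g m-2 d-1."""
    mass_g = np.asarray(rate_g_m2_d, dtype=float) * OUTPUT_STEP_DAYS * CELL_AREA_M2
    return float(mass_g.sum()) / 1.0e9

def sensitivity(conc_pert, conc_base, emis_pert, emis_base, min_delta_c):
    """Per-cell sensitivity (ug m-3 Gg-1); cells below min_delta_c become NaN."""
    delta_c = (np.asarray(conc_pert) - np.asarray(conc_base)).sum(axis=0)
    delta_e = total_emission_gg(emis_pert) - total_emission_gg(emis_base)
    return np.where(delta_c >= min_delta_c, delta_c / delta_e, np.nan)

def weekly_mass_per_cell_mg(rate_g_m2_d):
    """Mass emitted by one grid cell in 7 days at a constant rate, in Mg."""
    return rate_g_m2_d * CELL_AREA_M2 * 7.0 / 1.0e6

# The 0.001 g m-2 d-1 emission threshold is about 2.8 Mg per cell per week.
threshold_mg = weekly_mass_per_cell_mg(0.001)
```

Dividing every grid cell by the same emission change is what allows sensitivities to different source types to be compared on a common scale.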

Data used for model evaluation
We use in situ measurements of PM2.5, PM10, CO, NO2, O3, and SO2 from the Indian Central Pollution Control Board (CPCB, 2020) and PM2.5 data collected at the US embassies in Pakistan and Bangladesh (U.S. Department of State, 2020). We accessed these data from the OpenAQ platform (OpenAQ, 2020). Appendix B provides an overview of the in situ data, our data cleaning approach, and evaluation metrics. Given the lack of continuous measurements of OA and its components POA and SOA over the IGP, we compare our model OA with measurements available from the literature. We also evaluate the model using satellite observations of aerosol optical depth (AOD) from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS) instrument aboard the Terra and Aqua satellites, which have a local equatorial overpass time of 10:30 and 13:30, respectively. AODs are retrieved at 550 nm, corresponding to particle sizes of 0.1–2 µm and comparable to the PM2.5 size range. In particular, we use the MODIS Collection 6.1 Level 2 combined Dark Target and Deep Blue AOD product, available at a 10 km spatial resolution (Levy et al., 2013).
Here we summarise the main results of our evaluation (detailed results are available in Appendix B). We report the normalised mean bias (NMB) and the Pearson correlation coefficient r, which we use to describe how well the model reproduces the observations. The model tends to overestimate surface PM2.5 concentrations (0.004 < NMB < 0.4), especially during monsoon season (NMB = 0.4), but it has skill in reproducing observed seasonal variations (r > 0.62), with the exception of the monsoon season (r = 0.09). Poorer model performance during the monsoon period may be due to a number of compounding factors. In particular, it is challenging to reproduce observed atmospheric water vapour and precipitation over the Bay of Bengal, western coasts of India, and the Himalayan foothills during summer months. Uncertainties in the representation of topography; insufficient mixing in the boundary layer; errors in moisture transport, simulation of surface moisture availability, and soil temperature; and an excessive water vapour flux from the ocean all contribute to model error (Kumar et al., 2012a). Previous studies have shown that monsoonal rainfall is not well described by regional models such as MM5 or WRF (Rakesh et al., 2009; Ratnam and Kumar, 2005). When we compare our WRF model simulation with MERRA-2 reanalysed meteorology (Gelaro et al., 2017), we find that precipitation rates have a negative model bias of 80 % over the IGP, similar to the bias that Conibear et al. (2018) obtained with a similar model set-up.
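The two evaluation metrics named above can be sketched in Python as follows, assuming the commonly used definition NMB = Σ(M − O) / ΣO for paired model (M) and observed (O) values; the definitions actually applied are those given in Appendix B, and the function names and sample values here are hypothetical.

```python
import numpy as np

def normalised_mean_bias(model, obs):
    """NMB = sum(model - obs) / sum(obs) for paired values (assumed definition)."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float((model - obs).sum() / obs.sum())

def pearson_r(model, obs):
    """Pearson correlation coefficient between paired model and observed values."""
    model = np.asarray(model, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.corrcoef(model, obs)[0, 1])

# Hypothetical paired values: the model is uniformly 10 % too high.
observed = [10.0, 20.0, 30.0]
modelled = [11.0, 22.0, 33.0]
nmb = normalised_mean_bias(modelled, observed)
r = pearson_r(modelled, observed)
```

In this example the bias is positive while the correlation is perfect, which shows why the two metrics are reported together: NMB measures the overall offset and r measures how well the variations are tracked.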
For PM10, the model tends to underestimate the observations in all seasons (NMB up to −0.25), except in the pre-monsoon season (NMB = 0.15), and has poorer skill in reproducing observed PM10 variations compared to PM2.5 (r ≤ 0.69), especially during winter and the pre-monsoon season. We generally find poorer model agreement with gas-phase pollutants, including a positive model bias and comparatively poor correlations with observations of NO2, SO2, and O3 (Table B3). We attribute this to multiple sources of error. Given the coarse spatial and temporal resolution of our model (20 km × 20 km spatial, 3 h temporal), we expect our model to be affected by non-negligible representation error because the CPCB network sites are often located near roadsides or in dense urban areas, where conditions are difficult for the model to reproduce. This source of error preferentially affects reactive trace gases that react on timescales comparable with transport across individual model grid cells. Previous studies have reported similar model limitations (Fountoukis et al., 2013; Paolella et al., 2018; Kuik et al., 2016; Tan et al., 2015; Sirithian and Thepanondh, 2016; Balasubramanian et al., 2020). Data for Pakistan are not available for our modelling study period (2017–2018), so we instead use data from 2019 for the monsoon and post-monsoon seasons and data from 2020 for the winter and pre-monsoon seasons, which represents an additional source of error. Previous studies show that regional modelling over south Asia tends to overestimate satellite column observations of NO2 by 10 %–50 % over the Indo-Gangetic Plain, with the bias peaking as high as 90 % during winter months (Kumar et al., 2012b) and up to +131 % when compared to ground-based observations over densely populated urban regions (Karambelas et al., 2018).
These differences have been attributed mainly to errors in NO x emission inventories over densely populated areas, uncertainties in seasonal variations of emissions, absence of diurnal and vertical profiles of anthropogenic emissions (Kumar et al., 2012b; Karambelas et al., 2018), and underestimation of precipitation rate that will reduce the loss of soluble trace gases (Kumar et al., 2012a). Similarly, previous regional model studies of the IGP region have tended to over-predict concentrations of SO 2 , with NMB > 3.5 (Conibear et al., 2018; Kota et al., 2018). We attribute our positive model bias of SO 2 to using an outdated emission inventory that does not take into account the beginning of a shift from coal- to gas-based power plants (Sharma and Khare, 2017). Urbanisation has been shown to affect the diurnal spatial distribution of surface ozone (Li et al., 2014, and references therein) and also the magnitude and location of anthropogenic emissions of NO x and VOCs that subsequently affect surface ozone photochemistry (Zhang et al., 2004; Ghude et al., 2013). Finally, some fraction of the overestimation of surface ozone is linked to our use of the MOZART chemical mechanism that has been previously reported to have a positive model bias over south Asia compared to other mechanisms. Collectively, these model limitations associated with describing reactive trace gases will impact our ability to model particulate matter, especially secondary components over urban areas across the IGP. For OA, the model reproduces the order-of-magnitude seasonal trends (Table B4), but additional measurements are needed to robustly assess model performance. Table B5 shows that WRF-Chem AOD agrees with spatial distributions of MODIS AODs, with r typically > 0.5 with the exception of the monsoon season (r = 0.35). Poor model skill during the monsoon season may reflect difficulties in retrieving AOD during extensive seasonal cloud coverage.
In addition, the model has specific difficulties in reproducing atmospheric aerosol abundances during monsoon season, as highlighted earlier in this section, that could affect the simulated total AOD column. The model tends to overestimate MODIS AOD during the pre-monsoon (NMB = 0.33, 0.26 for Terra and Aqua satellites) and slightly underestimate AOD in the other seasons (NMB ranges from −0.06 to −0.19).

Results
First, we summarise the seasonal meteorology over the IGP, which influences the physical and chemical environments that determine PM 2.5 and OA. We then report seasonal distributions of surface PM 2.5 and the corresponding constituent aerosol composition. Finally, we investigate the seasonal influence of POA and SOA on PM 2.5 and the volatility of the surface SOA across the IGP. In describing the seasonal distribution of PM 2.5 , OA, POA, and SOA we highlight the influence of anthropogenic, pyrogenic, and biogenic emissions and synoptic meteorology in shaping these patterns.
For the purpose of describing PM 2.5 and OA, we begin our narrative with the post-monsoon season and finish with the monsoon season, reflecting the central importance of the monsoon system on atmospheric chemistry over the IGP. However, in the corresponding figures, we retain the chronological order of events in a calendar year. Figure A1 shows model seasonal mean values for planetary boundary layer height (PBLH, m), surface relative humidity (RH, %), surface temperature (°C), mean daily rainfall (mm d −1 ), and 10 m wind (m s −1 ) over the IGP. Given that PBLH and RH show a diurnal cycle with high variance, we report night-time and daytime values for these variables.

Seasonal meteorological drivers
During the pre-monsoon season, mean surface temperatures are higher than 30 °C. Mean PBLH ranges from 1000 up to 4500 m in the daytime, with the highest values over Pakistan and central IGP, and is almost an order of magnitude smaller during the night-time (120 up to 400 m). Seasonal mean winds are typically 3 m s −1 , southward from the northern mountain chain of Hindu Kush and the Himalayas, and stronger northward from the coast, allowing pollutants to be transported mainly inland. Air is much more humid over the lowest part of the IGP (> 60 %). Rainfall follows a pattern similar to that of RH and is limited to Bangladesh, with values below ∼3 mm d −1 .
During the monsoon season, the dominant feature is the monsoon itself. This manifests most obviously in increased rainfall, which increases the washout of hydrophilic pollutants, mainly in the central and lower part of the IGP, with mean daily rainfall values of 3-7 mm d −1 and localised regions of rainfall in excess of 15 mm d −1 and wind speeds in excess of 6 m s −1 north-northeastward. Values of RH are > 50 % almost everywhere over the IGP, and relatively low values for the PBLH allow for a well-mixed chemical environment, with smaller day-night variation compared to the pre-monsoon season (1000-3000 m in the daytime and 500-1200 m in the night-time). Mean temperatures are similar to those during the pre-monsoon, with the most prominent increase over northern Pakistan (> 35 °C).
The post-monsoon season is characterised by cooler temperatures than the previous two seasons, with mean values of ∼ 23 °C, much lower values for PBLH (below 2000 m during the day and ∼ 200 m during the night), and weaker wind speeds (< 1 m s −1 ) with no predominant direction, a combination of factors that results in pollution stagnation. With the exception of Bangladesh and the Indian states that are adjacent to the Bay of Bengal, rainfall is almost absent from the IGP. Nevertheless, air continues to be humid, with the distribution and values of RH similar to the monsoon season, with values of up to 80 % over the central and lower IGP, environmental conditions that favour water significantly contributing to PM mass without washout from rain.
During winter, mean temperature further drops to ∼ 15 °C with cooler temperatures over regions adjacent to the northern mountain chains. PBLH values are at their daily annual minimum ( 1000 m), and night-time values are similar to those of the post-monsoon season ( 200 m). Wind speeds are typically < 12 m s −1 , with a net west-east gradient from the upper IGP to the lower IGP, which transports pollutants towards Bihar, West Bengal, and Bangladesh, and with a north-south gradient over the Indus basin that transports pollution from northern Pakistan to the coast. Daily rainfall is below 3 mm d −1 everywhere across the IGP, but as for post-monsoon, RH remains high over the central and lower IGP (> 40 % during daytime, 70 % during night-time).
Figure 3 shows seasonal variations of surface PM 2.5 across the upper, middle, and lower IGP. Generally, we find the highest values of surface PM 2.5 , up to 350 µg m −3 , during post-monsoon and winter seasons that are associated with lower PBLH, allowing large anthropogenic emissions to accumulate in the boundary layer without ventilation from strong winds. From this section we begin our narrative from the post-monsoon season and finish with the monsoon season but retain the figure panels in chronological order for a particular calendar year. Our seasonal distributions of PM 2.5 are similar to recent studies (Shahid et al., 2015; Ojha et al., 2020; Mhawish et al., 2020), although we report higher PM 2.5 concentrations, especially over the lower IGP. Compared to these studies, our model also takes into account water content in PM 2.5 mass in addition to dry PM 2.5 mass through aqueous-phase chemistry. Our results show that water content in PM 2.5 is substantial, especially over the lower IGP, where water makes up to 42 % of total PM 2.5 mass (see later in this section). This helps to explain our comparatively high PM 2.5 estimates.

Seasonal distributions of surface PM 2.5
During the post-monsoon season (Fig. 3g-i), the mean values of surface PM 2.5 in the upper, middle, and lower IGP are 137, 176, and 185 µg m −3 , respectively. On a local scale, Kolkata and its surroundings in the lower IGP experience the worst air quality, with mean PM 2.5 values in excess of 300 µg m −3 , closely followed by Delhi NCT, the border region between Indian and Pakistani Punjab, and Singrauli at the southern border of middle IGP (∼ 300 µg m −3 ). The best air quality is found in the Pakistani state of Sindh, with PM 2.5 concentrations below 75 µg m −3 . Biomass burning in the Indian Punjab plays a key role in shaping the distribution of PM 2.5 during this season. Figure 4h shows that fire emissions have the largest impact on PM 2.5 concentrations across the Indian and Pakistani Punjab region, Haryana, and Delhi NCT (sensitivities of up to > 10 3 µg m −3 Gg −1 ). The impact of post-monsoon biomass burning emissions extends to the central part of the middle IGP over Uttar Pradesh, where sensitivity of PM 2.5 to pyrogenic emissions (up to 6×10 2 µg m −3 Gg −1 ) is higher than anthropogenic emissions (up to 4 × 10 2 µg m −3 Gg −1 ).
The sensitivity of PM 2.5 to changes in biogenic emissions ( Fig. 4i) only has non-negligible values (< 2 × 10 2 µg m −3 Gg −1 ) over part of West Bengal in the lower IGP.
During the winter season (Fig. 3j-l), wind patterns transport pollutants from the upper IGP to the lower IGP, resulting in a west-east gradient in seasonal mean PM 2.5 concentrations. The mean PM 2.5 value in the lower IGP is 191 µg m −3 , the highest mean seasonal value for the IGP. The highest PM 2.5 concentrations are reached in Kolkata (> 300 µg m −3 ) and in the Bihar state, with a local peak in Patna (> 220 µg m −3 ) known as the "Bihar pollution pool". In the middle IGP, mean PM 2.5 concentrations are 18 µg m −3 lower than post-monsoon levels, with east Delhi and Singrauli remaining the largest hotspots of the region (> 220 µg m −3 ). The upper IGP experiences the lowest seasonal PM 2.5 concentration (86 µg m −3 ), lower than half the value in the lower IGP, with concentrations decreasing from the Punjab to the Sindh coast. Anthropogenic emissions dominate the distribution of PM 2.5 during winter over the lower IGP (sensitivity up to 4 × 10 2 µg m −3 Gg −1 , Fig. 4j), with the highest sensitivities over the cities of Kolkata and Singrauli. The influence of biomass burning is significant over the Indus basin, stretching as far as Uttar Pradesh (sensitivity up to 10 3 µg m −3 Gg −1 ; Fig. 4k), while biogenic emissions do not show a significant influence during this season (Fig. 4l).
During the pre-monsoon season (Fig. 3a-c), air quality begins to improve due to higher PBLHs and stronger winds (Fig. A1) that help to disperse pollutants. Mean PM 2.5 concentrations are similar over the upper and middle IGP, with values lower than 90 µg m −3 . Higher concentrations remain in the lower IGP (128 µg m −3 ) due to the accumulation of pollutants from the winds blowing from the Bay of Bengal to the slopes of the Himalayas over North Bangladesh. High aerosol loading over the lower IGP during the pre-monsoon season is also influenced by biomass burning from Northeast India and Myanmar-Laos, which are partially included in our model domain. PM 2.5 values over the upper part of the middle IGP (Fig. 3b) show some influence from biomass burning (Fig. 4b). We find that anthropogenic emissions are most important over the lower IGP and localised regions in the central IGP (Fig. 4a). PM 2.5 concentrations in Delhi NCT are jointly influenced by biomass burning and anthropogenic sources. Biogenic sources only have a significant impact over localised regions in the lower and middle IGP (Fig. 4c).
Generally, the onset of the monsoon results in better air quality across the IGP due to higher rainfall rates, which increase wet deposition of aerosols, and higher PBLHs that improve the physical dispersal of surface emissions. Mean values of PM 2.5 are ≤ 100 µg m −3 across the IGP. The largest values of PM 2.5 are over the lower IGP (up to 170 µg m −3 ). We find that PM 2.5 is sensitive to biogenic emissions over localised regions across the IGP, where PM 2.5 can be more sensitive to changes in biogenic emissions than changes in anthropogenic emissions (∼ 200-500 and < 200 µg m −3 , respectively). Fires play only a small role in PM 2.5 during this season.
Figure 5 shows the modelled composition of PM 2.5 across the IGP. Generally, we find more variability between seasons than across different parts of the IGP, except for the water contribution to PM 2.5 mass. The results we report for the chemical composition and seasonal trends of PM 2.5 are broadly consistent with chemical characterisation studies over the region (Chowdhury et al., 2007; Bhowmik et al., 2020). As discussed in Sect. 2.3, model limitations in reproducing precursor trace gases will affect our ability to model secondary components of particulate matter. When comparing the model with recent field observations of PM 1 over Delhi during post-monsoon and winter (Gani et al., 2019; Gunthe et al., 2021; Patel et al., 2021), corresponding to two of our study seasons, we find that the model generally underestimates PM 1 (57-161 µg m −3 observed, 17-22 µg m −3 simulated), although we acknowledge that the model configuration we use is not ideal to model submicron PM due to our use of four sectional size bins. The model overestimates the contribution of PM 1 from nitrate (6 %-11 % observed, 11 %-13 % simulated) but underestimates the contributions from sulfate (7 %-9 % observed, 2 % simulated) and organics (54 %-68 % observed, 16 %-18 % simulated).

Surface PM 2.5 composition
Inorganic species (secondary inorganic aerosol of sulfate, nitrate and ammonium and other inorganic aerosol) dominate the chemical composition by mass of PM 2.5 , representing between 30 %-80 % of total PM 2.5 for each season across the IGP. The mean seasonal mass of total inorganics across the IGP is 54-70 µg m −3 during the pre-monsoon season, 27-35 µg m −3 during the monsoon season, 79-111 µg m −3 during the post-monsoon season, and 51-114 µg m −3 during winter. The largest inorganic aerosol values are found during the post-monsoon and winter seasons due to nitrate from fossil fuel combustion and from residential and energy use. We find a similar but relatively muted seasonal variation for black carbon with mass values between 2-11 µg m −3 . Sea salt transported from the coasts during the monsoon season adds 3-5 µg m −3 (3 %-9 %) to PM 2.5 across the IGP.
The water contribution to PM 2.5 is substantial over the lower IGP during pre-monsoon, monsoon, and post-monsoon seasons, with a mass contribution of 32-44 µg m −3 (25 %-42 %), while during winter it accounts for 6 µg m −3 (3.5 %). For the middle IGP, water is a non-negligible fraction of PM 2.5 mainly during monsoon (20 µg m −3 , 24 %) and winter (12 µg m −3 , 8 %) seasons, while for the upper IGP the highest values of water mass are found during only the monsoon season (4 µg m −3 , 8 %). The seasonal variation of water content reflects RH distributions, which above values of 60 %-70 % allows PM hydrophilic components (e.g. nitrate, sulfate, sea salt) to uptake water via deliquescence.
The sum of primary and secondary OA contributes by mass between 17 % and 31 % of PM 2.5 across the IGP, with contributions from POA and SOA varying with season. During the pre-monsoon season, OA contributes 11-21 µg m −3 to PM 2.5 , representing 17 %-22 % of the total mass. A similar mass contribution is found during the monsoon season (18-21 µg m −3 ) but with higher percentage contribution to PM 2.5 (20 %-31 %). The percentage mass contribution of OA to PM 2.5 is similar during the post-monsoon (28 %-31 %, 43-52 µg m −3 ) and winter (22 %-31 %, 26-60 µg m −3 ), with higher mass contribution during post-monsoon for the middle and lower IGP and during the winter season for the lower IGP. Our results for modelled PM 2.5 composition confirm the significance of OA contribution to fine particulate matter, and we analyse in more detail OA and its components in the next sections.
During the post-monsoon season (Fig. 6g-i), the largest OA concentrations are over the upper IGP at the border of Pakistani and Indian Punjab (> 80 µg m −3 ), where POA values can exceed 50 µg m −3 , although the largest regional mean is found over the lower IGP (52 µg m −3 ) due to urban anthropogenic emissions in and around Kolkata and Patna, where values are > 70 µg m −3 . Over the middle IGP, the mean OA value is similar to the lower IGP (52 µg m −3 ) but shows a more homogeneous distribution, with the highest OA values found at the borders between upper and lower IGP. Regional mean POA values range from 23-29 µg m −3 (Fig. A2g-i), similar to SOA values (20-24 µg m −3 ; Fig. A3g-i). POA levels are much higher than SOA over the Punjab states in India and Pakistan and in the Indian lower IGP (40-70 and 30-40 µg m −3 for POA and SOA, respectively). Over the middle IGP, SOA is generally higher than POA (29 and 24 µg m −3 for SOA and POA, respectively), with highest concentrations of SOA found in the lower Uttar Pradesh (up to 40 µg m −3 ).
Over Bangladesh and the Pakistani state of Sindh, POA and SOA have comparable values (< 35 µg m −3 ).

Seasonal distribution of surface OA
Similarly to PM 2.5 , we find that during the post-monsoon season, the OA distribution across the IGP is most sensitive to changes in biomass burning emissions (Fig. 7g-i), with higher values over the Punjab to Delhi NCT and part of Uttar Pradesh (up to 10 3 µg m −3 where fires are located over Indian Punjab). The sensitivity of OA to changes in biomass burning is localised, with POA most influenced by fires over Punjab and Haryana (Fig. A4h) and the corresponding impact on SOA extending over Pakistani Punjab and towards the middle IGP (Fig. A5h). Similarly, biogenic emissions play only a localised role in OA and SOA concentrations where biogenic emissions are still significant during this season (Figs. 7i and A5i). OA is most sensitive to anthropogenic emissions over the Indian part of the lower IGP and in the Pakistani Punjab (values between 50-150 µg m −3 ). We find that OA over the Delhi NCT megacity is not sensitive to these changes, unlike the other cities mentioned previously, so that Delhi is not one of the main hotspots of OA across the IGP during this season (Fig. 6h), unlike for PM 2.5 (Fig. 3h). We find that the sensitivity of POA and SOA to changes in anthropogenic emissions is comparable across major cities of the Punjab states (Figs. A4g, A5g).
We find that the largest seasonal mean values of OA are during winter over the lower IGP (60 µg m −3 , Fig. 6j-l), with contributing localised peaks over Kolkata and Patna (> 80 µg m −3 ) and at the border between Pakistan and India (ranging 40-70 µg m −3 ). Seasonal mean values of POA and SOA also peak during winter over the lower IGP (34 and 26 µg m −3 , respectively). During winter, the OA distribution is shaped by anthropogenic and pyrogenic emissions (Fig. 7j-l). POA concentrations are shown to be sensitive to anthropogenic emissions in a similar way as they are for the post-monsoon season (Fig. A4g, j). SOA is also mostly determined by anthropogenic emissions over the lower IGP (Fig. A5j). POA and SOA are also sensitive to pyrogenic emissions, but during this season it is limited to fires over the Indus basin in Pakistan and central IGP (Figs. A4k, A5k). We find that biogenic emissions do not significantly influence OA during winter.
Figure 7. Seasonal sensitivity of total OA to changes in (a, d, g, j) anthropogenic, (b, e, h, k) pyrogenic, and (c, f, i, l) biogenic emissions (µg m −3 Gg −1 ). The sensitivity calculation is described in the main text. Regions marked in white show where sensitivity corresponds to OA concentrations below the set threshold of 1 µg m −3 .
During pre-monsoon and monsoon seasons, the OA distributions (Fig. 6a-f) have similar mean values over the middle and lower IGP (20-21 µg m −3 ) and lower mean values over the upper IGP (11 and 18 µg m −3 , respectively). The highest POA concentrations are found at the border between India and Pakistan and over the lower IGP ( 30 and 40 µg m −3 , respectively). In both seasons, mean SOA concentrations are below 15 µg m −3 across the whole IGP. During pre-monsoon and monsoon seasons, OA concentrations are sensitive to anthropogenic emissions across the IGP, with similar spatial distributions (Fig. 7a, d). Pyrogenic emissions influence the OA distribution during the pre-monsoon season over the central IGP (Fig. 7b), but OA is less sensitive to these emissions compared with the post-monsoon season (Fig. 7h). During the monsoon season, the influence of fires on OA is negligible across the IGP. The influence of biogenic emissions on OA, determined exclusively in our model via SOA, is limited to the lower IGP during the pre-monsoon season. During the monsoon season, these emissions have a widespread impact on OA (Fig. 7f), with seasonal mean peak sensitivity of up to 2.3 × 10 2 µg m −3 Gg −1 . PM 2.5 and OA are more sensitive to changes in biogenic emissions than changes in anthropogenic emissions during the monsoon period because of the role that anthropogenic emissions play in controlling the production of biogenic SOA. Previous studies have shown that anthropogenic emissions can enhance biogenic SOA production, with NO x concentrations playing a strong role in enhancing SOA formation from isoprene and terpenes (Spracklen et al., 2011; Shilling et al., 2013; Shrivastava et al., 2019; Xu et al., 2020). A disadvantage of using a single-variable perturbative method is that we can only consider the impacts of one controlling factor in the production of OA. Research that considers the interactions between controlling factors is outside the scope of this study.
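The sensitivities quoted in this section have units of concentration per unit emitted mass (µg m −3 Gg −1 ). As a minimal sketch of how a single-variable perturbation can yield such a quantity, the Python snippet below takes the finite difference between a baseline and a perturbed run. The finite-difference form, the function name `emission_sensitivity`, and its arguments are assumptions made for illustration; the definition actually used in this study is given in the main text and may differ. The mask value of 1 µg m −3 follows the threshold stated in the caption of Fig. 7.

```python
def emission_sensitivity(c_base, c_pert, e_base, e_pert, c_min=1.0):
    """Finite-difference sensitivity of a concentration to one emission source.

    c_base, c_pert: seasonal mean concentration (ug m-3) in the baseline and
                    perturbed simulations for one grid cell.
    e_base, e_pert: emitted mass (Gg) of the perturbed source in each run.
    c_min:          concentrations below this threshold are masked (None).

    Illustrative assumption only: not taken from the paper's method section.
    """
    delta_e = e_base - e_pert
    if delta_e == 0:
        raise ValueError("the perturbed run must change the emissions")
    if c_base < c_min:
        return None  # masked, shown in white in the sensitivity maps
    return (c_base - c_pert) / delta_e


# Hypothetical numbers: a 0.01 Gg emission cut lowers OA from 50 to 45 ug m-3.
print(emission_sensitivity(50.0, 45.0, 0.10, 0.09))
```

Because only one source is perturbed at a time, a sensitivity of this kind cannot separate interactions such as the NO x dependence of biogenic SOA discussed above.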

Seasonal distribution of SOA volatility
We use aerosol volatility to describe how SOA is partitioned between the gas and particle phase to understand when it contributes to PM 2.5 mass loading. Figure 8 shows the seasonal mean volatility distributions for SOA across the IGP simulated using the 1-D VBS model in WRF-Chem (Knote et al., 2015). Seasonal and regional variations reflect changes in the physical and chemical environment in which the SOA is formed. Broadly, we find a gradual increase in the volatility of SOA from the pre-monsoon season to the winter season, mainly reflecting the increase in the mean OA loading (Fig. 6). Higher OA loading leads to a shift in the gas-particle partitioning towards more volatile bins, reflected in the seasonal variation in the population of the inert bin (denoted here as log 10 C * = −4, as described above). The contribution of this inert bin is negligible during winter and peaks during the monsoon season, with intermediate values during the pre-monsoon and post-monsoon transition seasons.
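The dependence of partitioning on OA loading can be illustrated with the equilibrium absorptive-partitioning relation commonly used in volatility basis set frameworks, in which the particle-phase fraction of a bin with saturation concentration C * is 1 / (1 + C * / C OA ). The Python sketch below is illustrative only: the function name, the choice of bins, and the two OA loadings (roughly the monsoon and winter lower-IGP means reported above) are assumptions, and the WRF-Chem implementation solves for C OA together with the gas-phase totals rather than prescribing it.

```python
def particle_fraction(log10_c_star, c_oa):
    """Equilibrium particle-phase fraction of one volatility bin.

    log10_c_star: log10 of the saturation concentration C* (ug m-3).
    c_oa:         absorbing organic aerosol mass concentration (ug m-3).
    """
    return 1.0 / (1.0 + 10.0 ** log10_c_star / c_oa)


# Illustrative bins and loadings; not the model's actual configuration.
for c_oa in (20.0, 60.0):
    fractions = {b: round(particle_fraction(b, c_oa), 3) for b in (0, 1, 2, 3)}
    print(c_oa, fractions)
```

For a fixed bin, a larger C OA gives a larger particle-phase fraction, so material in higher-volatility bins contributes more to aerosol mass when the total OA loading is high.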
During the post-monsoon season, the particle-phase organic mass is present in high-volatility bins up to log 10 C * = 2. The largest particle-phase mass loading (10 µg m −3 ) is found over the middle IGP. The upper and lower IGP show a similar volatility distribution as the middle IGP but with lower mass loadings, with the lower IGP having the lowest mass loadings. The smallest values over the lower IGP reflect the persistence of rainfall over this region that leads to continued removal of water-soluble gas-phase and aerosol-phase organics.
Surface-level atmospheric organic mass becomes even more volatile during the winter season, with particle-phase organic matter present in all volatility bins. The largest mass loading for SOA is found over the lower IGP (> 10 µg m −3 ) and decreases westwards towards the upper IGP, reflecting the E-W gradient of the total OA loading (Fig. 6j-l).
SOA during the pre-monsoon (Fig. 8a-c) and monsoon ( Fig. 8d-f) seasons is characterised by a volatility ≤ log 10 C * = 1 and by aerosol masses lower than 5 µg m −3 for each volatility bin in both seasons. The higher volatility bins (log 10 C * = 2 and log 10 C * = 3) are occupied exclusively by gas-phase organic compounds. We attribute this to water-soluble SVOCs being washed out by monsoonal rainfall. The washout of SVOCs results in gas-aerosol repartitioning to establish thermodynamic equilibrium, associated with particle-phase organics partitioning to the gas phase. Aerosols are also removed via wet and dry deposition, but we find most of the mass of SVOCs and SOA is lost via the gas phase (Knote et al., 2015). This also helps to explain the low levels of OA during the pre-monsoon and monsoon seasons (Fig. 6). The OA volatility distribution is similar across the IGP, reflecting an approximately uniform physical environment during the two seasons (Figs. A1 and 7).

Concluding remarks
We used the WRF-Chem regional atmospheric chemistry model to understand the influence of anthropogenic, pyrogenic, and biogenic emissions and meteorology on seasonal variations of the magnitude, distribution, and composition of PM 2.5 and organic aerosol across the Indo-Gangetic Plain (IGP) during 2017-2018.
We find that the model reasonably reproduces concentrations of PM 2.5 in all seasons (NMB < 0.2, r > 0.6), except for the monsoon season (NMB = 0.4, r = 0.09), a reflection that modelling monsoonal meteorology remains challenging. However, uncertainty in our estimates remains on the individual PM 2.5 secondary components, given the limitation we found in the modelling to reproduce surface concentrations of precursor gases when compared with observations. Availability of additional monitoring stations outside urban areas that are more representative of the spatial scales associated with model grid cells would help to evaluate model error, as well as the use of finer resolution and up-to-date inventories for precursor gases over the rapidly changing region of the IGP.
We find that the IGP experiences the highest seasonal mean levels of PM 2.5 during the post-monsoon (October-December, 166 µg m −3 ) and winter (January-February, 145 µg m −3 ) seasons with a heterogeneous distribution, in agreement with previous studies. The magnitude and distribution of anthropogenic emissions across the IGP are approximately constant throughout the year. During the post-monsoon season, agricultural burning emissions of post-harvest residues influence PM 2.5 mostly over the upper and middle IGP, particularly affecting the Indian and Pakistani Punjab region. These additional emissions are exacerbated by high-pressure weather systems that reduce ventilation of surface air pollution to the free troposphere. During the winter season, ongoing anthropogenic emissions, wind patterns, and a seasonally shallow boundary layer result in a gradient in air quality from the upper to lower IGP, with the highest PM 2.5 values (in excess of 250 µg m −3 ) over Kolkata and the state of Bihar. During the pre-monsoon (March-May) and monsoon (June-September) seasons, wet scavenging of hydrophilic gas-phase aerosol precursors and aerosols, together with more vigorous vertical mixing, reduces levels of PM 2.5 (95 and 79 µg m −3 , respectively). Generally, we find that PM 2.5 composition has a stronger seasonal variation than a geographical variation within each season. Total inorganic species dominate PM 2.5 composition (30 %-80 %), with water uptake contributing substantially to the PM 2.5 mass especially over the lower IGP (up to 40 %).
We find that OA represents a significant contribution to PM 2.5 throughout the year. On an annual mean basis, OA represents 17 %-30 % of PM 2.5 , with higher contributions during post-monsoon and winter seasons. Typically, POA contributes more to the OA loading than SOA in all seasons across the IGP. Anthropogenic and pyrogenic sources impact POA and SOA, with similar patterns of PM 2.5 across the IGP during all seasons. Biogenic sources have a significant impact on SOA distribution across the IGP during the monsoon season but are limited to the lower IGP during the pre- and post-monsoon seasons. We find that the volatility distribution of SOA is driven mainly by the mean total OA loading and the washout of aerosols and gas-phase aerosol precursors that result in SOA being less volatile during the pre-monsoon and monsoon season than during the post-monsoon and winter seasons.
Mitigating levels of PM 2.5 over the IGP will require a range of regional and state-level policies that address the influences of intra- and inter-state anthropogenic, pyrogenic, and biogenic emissions. The relative influence of these emissions on PM 2.5 and the broader photochemical environment will likely change in the context of a warmer climate; e.g. biogenic emissions will increase as they are temperature-dependent. It is therefore imperative that future studies also consider subregional and city spatial scales, at which individual sectors will be more important and the areas with the highest population density, which will suffer from poor air quality, are taken into account.
C. Mogno et al.: Fine particulate matter over the IGP

Appendix A: Meteorological drivers and POA and SOA distribution
Figure A1 shows the mean seasonal WRF-Chem meteorological drivers of pre-monsoon, monsoon, and post-monsoon 2017 and winter 2018 seasons. Figures A2 to A5 show POA and SOA distribution over the IGP and their sensitivity to emissions drivers.

B1 Ground-based measurement evaluation
We use ground-based measurements from the Central Pollution Control Board and the U.S. Embassies, which are available for our 2017-2018 study period and accessed through the OpenAQ Platform (OpenAQ, 2020). Data from Pakistan are only available from 2019, so we use 2019 data for the monsoon and post-monsoon seasons and data from 2020 for the winter and pre-monsoon seasons.
We apply a cleaning procedure of data for each pollutant. The cleaning procedure followed five sequential steps: (1) exclude invalid, negative, and zero values; (2) exclude hourly data with z score ≥ 3 with respect to the daily mean; (3) exclude days with fewer than 12 hourly measurements per day; (4) exclude stations with fewer than 15 d of measurements per simulated season; and (5) exclude all stations but one if there are multiple stations in the same model grid cell (for statistical independence in the comparison). From this cleaning procedure, we get 31 independent stations (Table B2, Fig. B1) with a total of 332 seasonal measurements: 63 for CO, 54 for SO 2 , 61 for NO 2 , 50 for O 3 , 84 for PM 2.5 , and 20 for PM 10 . For particulate matter, we compare the dry mass of PM 2.5 and PM 10 .
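The five cleaning steps can be sketched in a few lines of standard-library Python. This is an illustrative reading of the procedure, not the code used in the study: the data layout (a dictionary of hourly values per day and per station), the function names, the use of the absolute z score with the population standard deviation of each day, and the rule for which station is kept in a shared grid cell (the first by sorted ID) are all assumptions.

```python
import math
from statistics import mean, pstdev


def clean_station_days(hourly, min_hours=12, z_max=3.0):
    """Apply steps 1-3 to one station: hourly maps day -> list of hourly values."""
    kept = {}
    for day, values in hourly.items():
        # Step 1: drop invalid (None or NaN), negative, and zero values.
        valid = [v for v in values
                 if v is not None and not math.isnan(v) and v > 0]
        if not valid:
            continue
        # Step 2: drop hourly values with |z| >= z_max relative to the daily mean.
        mu, sigma = mean(valid), pstdev(valid)
        if sigma > 0:
            valid = [v for v in valid if abs(v - mu) / sigma < z_max]
        # Step 3: keep the day only if enough hourly values remain.
        if len(valid) >= min_hours:
            kept[day] = valid
    return kept


def select_stations(stations, grid_cell_of, min_days=15):
    """Apply steps 4-5 for one season.

    stations:     station_id -> {day: [hourly values]}
    grid_cell_of: station_id -> model grid cell index, e.g. (i, j)
    """
    selected, seen_cells = {}, set()
    for station_id in sorted(stations):
        days = clean_station_days(stations[station_id])
        if len(days) < min_days:      # Step 4: too few valid days this season
            continue
        cell = grid_cell_of[station_id]
        if cell in seen_cells:        # Step 5: one station per model grid cell
            continue
        seen_cells.add(cell)
        selected[station_id] = days
    return selected
```

Applying the steps in this order means that the outlier filter in step 2 can push a day below the 12 h threshold of step 3, which in turn can push a station below the 15 d threshold of step 4.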
To compare the model against these measurements, we sample the model at the time and location of each measurement. In practice, we identify the model value closest to the measurement. We report seasonal mean statistics. We evaluate the model using five metrics: the mean bias (MB), root mean square error (RMSE), normalised mean bias (NMB), normalised mean absolute error (NMAE), and sample Pearson correlation coefficient (r). These metrics are widely used for air quality model evaluation (Zhang et al., 2006;Kumar et al., 2012b;Brasseur and Jacob, 2017;Conibear et al., 2018). Table B3 summarises the seasonal mean evaluation of the model with the metrics described.
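For reference, the five metrics can be computed from paired model and observed values as in the Python sketch below. The formulas are the definitions commonly used in air quality model evaluation (for example, NMB as the summed model-minus-observation difference divided by the summed observations); the appendix does not spell them out, so treat the exact forms and the function name `evaluation_metrics` as assumptions.

```python
import math


def evaluation_metrics(model, obs):
    """Return MB, RMSE, NMB, NMAE and Pearson r for paired model/observed values."""
    if len(model) != len(obs) or len(obs) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    n = len(obs)
    diffs = [m - o for m, o in zip(model, obs)]
    obs_sum = sum(obs)

    mb = sum(diffs) / n                              # mean bias
    rmse = math.sqrt(sum(d * d for d in diffs) / n)  # root mean square error
    nmb = sum(diffs) / obs_sum                       # normalised mean bias
    nmae = sum(abs(d) for d in diffs) / obs_sum      # normalised mean absolute error

    # Sample Pearson correlation coefficient.
    m_mean, o_mean = sum(model) / n, obs_sum / n
    cov = sum((m - m_mean) * (o - o_mean) for m, o in zip(model, obs))
    var_m = sum((m - m_mean) ** 2 for m in model)
    var_o = sum((o - o_mean) ** 2 for o in obs)
    r = cov / math.sqrt(var_m * var_o)

    return {"MB": mb, "RMSE": rmse, "NMB": nmb, "NMAE": nmae, "r": r}


# Hypothetical values: a model that is uniformly 10 % too high.
print(evaluation_metrics([110.0, 220.0, 330.0], [100.0, 200.0, 300.0]))
```

In this hypothetical case NMB and NMAE are both 0.1 while r is 1, which shows why bias and correlation are reported together: a model can track variability perfectly and still be biased.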

B2 Organic aerosols
In the absence of continuous monitoring data of OA, we compare our model OC values with values found in the literature. Table B4 shows the comparison of modelled OC with measurement studies. Location of measurement sites is shown in Fig. B1. OA is converted from organic aerosol mass to organic carbon mass assuming OA / OC ratios of 1.4 for POA and 2.0 for SOA, following Knote et al. (2015).
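The conversion amounts to dividing each OA component by its OA / OC ratio and summing the results. A minimal Python sketch, with a hypothetical function name and illustrative input values:

```python
# OA / OC mass ratios stated in the text (following Knote et al., 2015).
POA_OA_TO_OC = 1.4
SOA_OA_TO_OC = 2.0


def organic_carbon(poa, soa):
    """Convert POA and SOA mass (ug m-3) to total organic carbon mass (ug m-3)."""
    return poa / POA_OA_TO_OC + soa / SOA_OA_TO_OC


# Illustrative values: 28 ug m-3 of POA and 20 ug m-3 of SOA
# correspond to about 30 ug m-3 of organic carbon.
print(organic_carbon(28.0, 20.0))
```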

B3 Total AOD column
We compare our modelled prediction against satellite AOD retrievals for 2017-2018 at 550 nm with a 10 km horizontal resolution obtained from both Terra (MOD04_L2) and Aqua (MYD04_L2) MODIS instruments. We use the best-quality AOD retrievals merged from the Dark Target and the Deep Blue algorithms (Levy et al., 2013). We re-grid the 10 km Terra and Aqua MODIS AOD data to the coarse WRF-Chem 20 km × 20 km model grid.
We calculate the 550 nm AOD using WRF-Chem values at 300 and 1000 nm by interpolation using the Ångström power law. We sample the model at the local overpass time of Terra (10:30) and Aqua (13:30), where there is at least one best-quality AOD retrieval. We then calculate mean model and MODIS AOD values over time to generate seasonal statistics. Table B5 reports the main statistical metrics for AOD evaluation, together with the range of observed and modelled AOD.
Figure B1. Location of ground-based observation for PM and gases (purple) and OA (orange). The ID number for each station corresponds to the ID numbers in Tables B2 and B4. The inset map shows Delhi NCT in detail.
Table B1. Chosen parametrisations for meteorological processes in WRF-Chem.
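The Ångström interpolation used for the model AOD assumes that AOD varies with wavelength as a power law between the two model wavelengths. The Python sketch below shows the standard two-wavelength form; the function names and the input AOD values are hypothetical, and only the 300, 550, and 1000 nm wavelengths come from the text.

```python
import math


def angstrom_exponent(tau_1, tau_2, lam_1, lam_2):
    """Angstrom exponent from AOD at two wavelengths (same units for both)."""
    return -math.log(tau_1 / tau_2) / math.log(lam_1 / lam_2)


def aod_at(lam, tau_1, tau_2, lam_1=300.0, lam_2=1000.0):
    """Interpolate AOD to wavelength lam (nm) with the Angstrom power law."""
    alpha = angstrom_exponent(tau_1, tau_2, lam_1, lam_2)
    return tau_1 * (lam / lam_1) ** (-alpha)


# Hypothetical model AODs of 1.0 at 300 nm and 0.3 at 1000 nm.
print(aod_at(550.0, tau_1=1.0, tau_2=0.3))
```

With these inputs the Ångström exponent is 1, so the interpolated AOD scales inversely with wavelength between the two anchor points.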
Author contributions. CM and PIP conceived the study and methodology. CM set up the model with support from CK. CM performed the simulations, led the model evaluation and data analysis, and interpreted the results together with PIP, CK, and TJW. FY provided the AOD observations to be compared with the model AOD. CM and PIP wrote the paper, with input from all co-authors.
Competing interests. The authors declare that they have no conflict of interest.
Disclaimer. Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.