High-resolution (0.05 × 0.05) NOx emissions in the Yangtze River Delta inferred from OMI

Emission datasets of nitrogen oxides (NOx) at high horizontal resolutions (e.g., 0.05× 0.05) are crucial for understanding human influences at fine scales, air quality studies, and pollution control. Yet high-resolution emission data are often missing or contain large uncertainties especially for the developing regions. Taking advantage of longterm satellite measurements of nitrogen dioxide (NO2), here we develop a computationally efficient method of estimating NOx emissions in major urban areas at the 0.05×0.05 resolution. The top-down inversion method accounts for the nonlinear effects of horizontal transport, chemical loss, and deposition. We construct a two-dimensional Peking University High-resolution Lifetime-Emission-Transport model (PHLET), its adjoint model (PHLET-A), and a satellite conversion matrix approach to relate emissions, lifetimes, simulated NO2, and satellite NO2 data. The inversion method is applied to the summer months of 2012–2015 in the Yangtze River Delta (YRD; 29–34 N, 118–123 E) area, a major polluted region of China, using the NO2 vertical column density data from the Peking University Ozone Monitoring Instrument NO2 product (POMINO). A systematic analysis of inversion errors is performed, including using an independent test based on GEOS-Chem simulations. Across the YRD area, the summer average emissions obtained in this work range from 0 to 15.3 kg km−2 h−1, and the lifetimes (due to chemical loss and deposition) range from 0.6 to 3.3 h. Our emission dataset reveals fine-scale spatial information related to nighttime light, population density, road network, maritime shipping, and land use (from a Google Earth photo). We further compare our emissions with multiple inventories. Many of the fine-scale emission structures are not well represented or not included in the widely used Multi-scale Emissions Inventory of China (MEIC).


Introduction
Nitrogen oxides (NO x = NO + NO 2 ) are a main precursor of particulate matter, ozone, and other atmospheric pollutants. NO x strongly influence the atmospheric oxidative capacity, affect the climate, and are toxic to many organisms. NO x are emitted from natural and anthropogenic sources (Lin, 2012). Over 15 the past decade, China has experienced rapid growth in the Gross Domestic Product (GDP, by 8.3% a -1 on average from 2008 to 2017), fossil fuel consumption (by 5.5% a -1 from 2007 to 2015), and urbanization (National Bureau of Statistics of China, http://data.stats.gov.cn/). These socioeconomic changes have been accompanied by a rapid change in NO x emissions in the urban and surrounding areas. With the large and continuously increasing urban population and motor vehicles, NO x pollution is particularly severe in 20 large cities such as Beijing and Shanghai (Barnes and Rudziński, 2013;Lin et al., 2016). Many coastal cities like Shanghai have also experienced enormous growth in the shipping business. Therefore, pollution along the coastal line has become a serious problem associated with the growth of global economic trade; and emissions from seaborne transport play an increasingly important role in the global air pollution (Fu et al., 2017). Understanding the urban pollution and its environmental impacts requires accurate 25 quantitative knowledge of NO x emissions at a very high horizontal resolution (e.g., 0.05°×0.05°), which is typically lacking especially for the developing countries.
Gridded bottom-up emission inventories typically use spatial proxies (like population and GDP) to allocate provincial-level emission values, which are derived from activity statistics and emission factor data, to individual locations (Zhao et al., 2011;Janssens-Maenhout et al., 2015;Zhao et al., 2015). Such 5 a gridding method may lead to large uncertainties at high resolutions , because the mismatch between proxies and emissions becomes more significant and emitting facilities are harder to allocate accurately as the resolution increases . For a small area, emission factors and activity data of the major sources can be collected by on-site surveys to allow construction of a highresolution inventory (Zhao et al., 2015;Granier et al., 2019), such as Zhao et al. (2015) for Nanjing. 10 However, on-site surveys are extremely time consuming and resource demanding, and therefore difficult to be applied to a large domain in a timely manner.
Top-down inversion using satellite retrieval products of tropospheric vertical column densities (VCDs) of nitrogen dioxide (NO 2 ) is a widely used independent estimate of NO x emissions (Martin et al., 2003;Stavrakou et al., 2008;Lin et al., 2010;Mijling and van der A, 2012;Gu et al., 2014;Beirle et al., 2015;15 Miyazaki et al., 2016;Ding et al., 2017b). Top-down inversion typically provides the total emission data, although emissions from individual sources can be further derived by integrating a priori data (often from bottom-up inventories) about source-specific information such as diurnal and seasonal variabilities (e.g., Lin et al., 2010;Lin, 2012) and spatial variabilities (Timmermans et al., 2016).
The traditional top-down methods based on local mass balance (LMB) or its variants assume a weak 20 effect of horizontal transport (Martin et al., 2003;Lamsal et al., 2011;Lin, 2012;Gu et al., 2014;Boersma et al., 2015). These algorithms work relatively well at low resolutions (> 50 km) given the relatively short lifetime of NO x (hours to 1 day), but may introduce large uncertainties when applied to higher resolutions for example, emissions in the rural-urban fringe zone cannot be identified accurately. The Adjoint Model and Kaman Filter methods better account for horizontal transport, although their applicability is 25 limited by expensive computational costs. These more sophisticated methods have often been applied to relatively short time periods (e.g., Gu et al., 2016 for one month), small spatial domains (e.g., Tang et al., 2013 in Texas), and/or at coarse horizontal resolutions (e.g., Miyazaki et al., 2012 at T41 grid, i.e.;~2.8°, and Stavrakou et al., 2008 at 5°×5°). Top-down estimates can be further combined with bottom-up inventories and spatial proxies to increase the spatial resolution, such as from 0.25°×0.25° in the DECSO 5 derived emissions to 0.01°×0.01° for 2014 during the MarcoPolo Project (Hooyberghs et al., 2016;Timmermans et al., 2016) and similar inventories over Qatar and South Africa (Maiheu and Veldeman, 2013). The LMB, Adjoint Model and Kaman Filter approaches normally use 3-dimentional chemical transport models (CTMs) to relate emissions to VCDs. CTM-based studies typically provide an estimate of the overall model error, although Lin et al. (2012) and Stavrakou et al. (2013) present errors in the 10 individual model processes (e.g., key chemical reactions and meteorological parameters). A computationally low-cost method for space-based high-resolution (0.05°×0.05°) NO x emission estimate will be helpful for understanding the urban pollution and its trends and variability.
This study presents a computationally low-cost space-based top-down approach to construct highresolution NO x emission inventories for urban and surrounding areas. The approach is applied to the 15 Yangtze River Delta (YRD) area (118°E-123°E, 29°N-34°N, which includes Shanghai, Nanjing, Hangzhou and 15 other cities) on a 0.05°×0.05° grid, using the POMINO NO 2 VCD data retrieved from the Ozone Monitoring Instrument (OMI). We derive the average NO x emissions for the summer months (June, July, and August) of 2012-2015. We construct a model called PHLET (2-dimensional Peking University High-resolution Lifetime-Emission-Transport) and its adjoint model (PHLET-A) to facilitate 20 the emission estimate. The lifetimes of NO x are estimated as well, in order to account for the nonlinear NO x chemistry.
Section 2 presents the data and method for top-down inversion of high-resolution NO x emissions.
Inversion uncertainties are analyzed explicitly. Section 3 presents spatial distributions of NO 2 VCDs, the derived local net sources (which are used subsequently to derive NO x emissions and lifetimes), NO x 25 lifetimes, and their uncertainties. Section 4 analyzes the top-down emission data estimated here, including comparisons with spatial proxies (population density, night light brightness, power plant locations, road network, marine shipping routes, and a Google Earth photo for land use indication), the Multi-scale Emissions Inventory of China (MEIC) (Zheng et al., 2014;, the DECSO top-down emissions (Mijling et al., 2013;Ding et al., 2017a), and the MarcoPolo emissions (Hooyberghs et al., 2016;Timmermans et al., 2016). Section 5 tests our inversion method by applying it to the NO 2 VCDs 5 simulated by the GEOS-Chem CTM. Section 6 concludes the study.

A general framework to retrieve NO x emissions at a high resolution
The high-resolution NO x emission retrieval framework consists of multiple steps, as illustrated in the flowchart (Fig. 1). First (Sect. 2.2), the POMINO NO 2 VCD data over summer 2012-2015 are averaged 10 on a 0.05°×0.05° grid, using a special oversampling technique that preserves the finest spatial information possible. A Satellite Conversion Matrix (SCM), which will be applied to PHLET simulated NO 2 VCDs at the second step, is also calculated based on the OMI pixel parameters (i.e. corner co-ordinates).
Second (Sect. 2.3), the PHLET model is constructed to simulate the local net source (i.e., emissionloss) and horizontal transport of NO 2 VCDs on the 0.05°×0.05° grid. The SCM is then applied to PHLET 15 simulated VCDs to mimic how each satellite pixel averages the spatial distribution of NO 2 , in order to ensure the spatial sampling consistency between PHLET and POMINO. This process is needed because satellite pixels represent the NO 2 spatial distribution at a coarser (than PHLET) resolution with irregular shapes of individual pixels. Third (Sect. 2.4), the PHLET-A adjoint model is constructed to, together with PHLET and POMINO 20 VCDs, derive the local net source at each 0.05°×0.05° grid cell. We construct a cost function to quantify the difference between the distribution of POMINO VCDs and that simulated by PHLET at the 0.05°×0.05° grid. The inversion process to derive the local net sources is equivalent to minimization of the cost function. Finally (Sect. 2.5), the emission and lifetime of NO x at each 0.05°×0.05° grid cell is derived from the local net source term, by fitting a formula for the nonlinear relationship between lifetimes and VCDs. The formula is assumed to be fixed, i.e., the relationship is applicable to all grid cells within the small study domain.
Furthermore, a rigorous error analysis for the framework and models is conducted (Sect. 2.6). This 5 analysis is complemented by an test based on the GEOS-Chem simulated distribution of NO 2 VCDs (Sect.

5).
Our inversion method explicitly accounts for horizontal transport and the nonlinear relationship between lifetimes and NO 2 VCDs. With a few reasonable assumptions, the method is computationally efficient, suitable for speedily conducting high-resolution emission estimates in multiple areas and across a long 10 time period (2012-2015 in this study). Both PHLET and PHLET-A are numerically solved based on FEniCS, a popular open source solver (Farrell et al., 2012;Funke and Farrell, 2013;Alnaes et al., 2015).
With one computational core (Intel ○ R Xeon ○ R Gold 6130 CPU @ 2.10GHz), derivation of NO x emissions over the YRD here takes less than one hour after necessary input data are prepared. Applying the framework to multiple areas would take a similar amount of time by using one computational core for 15 each area.

Tropospheric NO 2 VCDs retrieved from OMI
OMI is a UV/VIS nadir solar backscatter spectrometer on board the Aura satellite (Levelt et al., 2006). OMI provides daily global coverage. Each complete swath of OMI consists of 60 ground pixels, the sizes of which increase from 13 km × 24 km at nadir to about 40 km × 150 km at the swath edge in 20 accordance to the view zenith angle (VZA) from 0° to 57° (de Graaf et al., 2016).
We use Level-2 tropospheric NO 2 VCD data from POMINO (Peking University Ozone Monitoring Instrument NO 2 product) Lin et al., 2015a). As described in detail in Lin et al. (2014;2015), POMINO is an OMI-based regional NO 2 product that includes a number of important features.
Briefly, POMINO adopts the tropospheric slant column density (SCD) data from DONIMO v2 and conducts an improved calculation of tropospheric air mass factors (AMFs) and VCDs (i.e., VCD = SCD / AMF) . Key features of the POMINO algorithm include explicit representation of aerosol scattering and absorption (by combining aerosol data from daily nested GEOS-Chem (at 0.3125° long. × 0.25° lat.) simulations and monthly MODIS/Aqua aerosol optical depth (AOD) data), explicit 5 representation of the angular dependence of surface reflection, high-resolution NO 2 profiles from GEOS-Chem (at 0.3125° long. × 0.25° lat.), consistent retrievals of clouds (a prerequisite for the NO 2 retrieval) and NO 2 , and use of a parallelized, LIDORT-driven AMFv6 package. POMINO NO 2 VCDs are consistent with ground-based MAX-DOAS data (Liu et al., 2019).
To better relate NO x emissions to NO 2 VCDs at the 0.05°×0.05° resolution, we only employ the NO 2 data 10 in summer (June, July, and August), in which season the lifetimes of NO 2 are the shortest (a few hours).
We combine data over 2012-2015 to increase the sample size. The change in NO 2 VCD from June to August is relatively small, reducing the effect of intra-seasonal variability when deriving NO x emissions from summer mean NO 2 VCDs. We screen out the 30 outer pixels with VZA larger than 30° (cross-track width larger than 36 km) that greatly smear the spatial gradient of NO 2 , pixels with cloud radiance fraction 15 exceeding 50%, and pixels with AOD larger than 3 (i.e., when the aerosol data used in the NO 2 retrieval are unreliable and the NO 2 retrieval is subject to an excessive error) Lin et al., 2015b;Liu et al., 2018a). We also exclude data with raw anomaly problems (http://projects.knmi.nl/omi/research/product/rowanomaly-background.php). After data screening, we obtain valid data from 22,007 pixels. We then convert the pixel-specific Level-2 data to the 0.05°×0.05° 20 grid.
To convert from the satellite pixels to the 0.05°×0.05° grid cells, we use an oversampling method that employ satellite data on multiple days to enhance the horizontal resolution . For each 0.05°×0.05° grid cell, we average all pixels covering the grid cell from all valid days, using area-based weighting. The oversampling approach takes advantage of the fact that the exact location of the OMI 25 footprint slightly changes from one day to another, so does the exact location of the footprint of an OMI pixel at a given VZA. Thus, sampling from multiple days increases the horizontal resolution of data. Our oversampling approach is different from previous studies, which filled a grid cell with data from pixels within a certain distance (e.g., 30 km) and would result in spatial smoothing (Fioletov et al., 2011;Krotkov et al., 2016;Sun et al., 2018).
For the purpose of emission estimating, we assume that the error of VCD at a satellite pixel ( ) contains 5 an absolute error of half of the mean VCD over the domain (i.e., 1.9×10 -15 molecules cm -2 ) and a relative error of 30% (Lin et al., 2010;Lin et al., 2015a). We further add in quadrature an additional error ( ) when a satellite pixel is projected to the grid cells at a finer resolution; this error is important in the urban-rural fringe zone. For a given grid cell, is set to be 50% of the standard deviation of VCDs at its eight surrounding grid cells. Sampling over multiple days reduces the 10 random error by a factor of s = (√(1 − ) + ⁄ ), where c represents the fraction of systematic error (assumed to be 50%) and the number of days with valid data (Eskes et al., 2003;Miyazaki et al., 2012).
Thus, the total error for the temporally averaged VCD at a given grid cell is = √( 2 + 2 ) • .

The PHLET model simulation
We construct the PHLET model on the 0.05°×0.05° grid to interpret the relationship between local net 15 source (i.e., emissionloss), horizontal transport, and VCDs of NO 2 in a 2-D gridded space (Eq. 1) in the sense of long-time average. PHLET simulates the horizontal transport of NO x through a time averaged advection process and an "effective" diffusion process, which represents the residual from the temporally averaged advection. The vertical distribution is simplified as in Sect. 2.3.2. The loss process of NO x is represented by the lifetime. 20

Governing equation of PHLET
PHLET is an equilibrium model for the local net source, VCDs and horizontal transport of NO 2 at each grid cell. Equation 1 presents the governing equation in PHLET: ( , ) = • ( , ) − ( , ) • ( , ) + • ( ( , ) • ( , )) = 0 (1) ( , ) represents the tropospheric NO 2 VCD (in molecules cm -2 ) due to sources over the YRD. The value of ( , ) gives the distribution of NO 2 VCDs at equilibrium ( ( , ) = 0 for every x and y). The discrete form of PHLET is set on the 0.05°×0.05° grid. The simulated VCDs will be applied with the SCM and compared to the gridded OMI data (after the contribution of horizontally homogeneous regional 5 background is subtracted from the OMI data, see Sect. 2.3.2).
We assume a steady state of NO 2 in PHLET, although NO 2 observed by the satellite instrument may be in a transient state. We assign an error of 15% to simulated ( , ) to account for the possible range of NO 2 variability at the overpass time of the instrument. Also, combining data from multiple years to derive an averaged NO 2 distribution for simulation (rather than conducting the simulations for individual years 10 and months) leads to an additional uncertainty, which is set to be 10% based on a comparison between the emissions estimated from multiple years together (here) and the average of emissions estimated from individual years (in a sensitivity test).
( , ) represents the local net source term (in molecules cm -2 s -1 , equivalent to 2.63 × 10 -12 kg km -2 h -1 ), which combines the effects of ground emissions (anthropogenic + soil + biomass burning; see discussion 15 in Sect. 2.3.2), deposition, and chemistry of NO x . At equilibrium, the domain average of modeled ( , ) reaches zero, because there are no horizontal fluxes into or out of the domain boundaries. ( , ) can be separated into an emission term and a loss term: where ( , ) denotes the gridded emissions of NO x , and ( , ) the lifetimes associated with deposition 20 and chemical loss. represents the ratio of NO 2 over NO x concentration. The daytime NO x chemical system reaches equilibrium rapidly and varies little Valin et al., 2013). We set to be 0.76 with an uncertainty of 10% (Seinfeld et al., 2006;. • ( ( , ) • ( , )) represents the diffusion term, where ( , ) denotes the "effective" diffusion coefficient tensor (in m 2 s -1 ). The diffusion term accounts for transport by the residual winds deviating 5 from the temporally averaged wind vector ( , ). Appendix A shows how to determine the diffusion coefficient tensor.

Vertical shape and regional background of NO2
PHLET assumes a horizontally homogeneous vertical shape of NO 2 concentrations, and that NO 2 is concentrated near the surface . The assumption is implicitly used in many previous 10 studies for polluted areas Liu et al., 2016;Liu et al., 2017). The corresponding uncertainty in the modeled NO 2 VCDs is set to 15% Lin et al., 2014).
Lightning emissions, biomass burning emissions, aircraft emissions, transport from neighboring regions, and convection can lead to NO 2 at higher altitudes over the YRD area. However, the amount of NO 2 aloft is much smaller than near-ground NO 2 due to large ground sources (Lin, 2012). Thus, we regard NO 2 15 aloft as the regional background, and do not include it in Eq. 1. Also, for near-ground NO 2 over the YRD area, the contribution of downward vertical transport is negligible compared to the contribution of ground sources. Aircraft emissions contribute little to the total ground source, because 78% of aircraft emissions occur at the high altitudes (9-12 km) (Ma and Xiuji, 2000). Therefore, PHLET only accounts for nearground NO 2 from ground soil, biomass burning and anthropogenic sources (energy, industry, 20 transportation, and residential).
To ensure the consistency between PHLET and OMI NO 2 data, we assume the background value to be half of the minimum OMI NO 2 VCD among all grid cells (i.e., 0.54×10 15 molecules cm -2 ), and then subtract the background value from the gridded OMI data when comparing with PHLET simulations. The corresponding uncertainty in the modeled NO 2 VCDs is set as 5%.

Initial conditions, lateral boundary conditions, and wind data input
To run PHLET, the NO 2 VCDs at the domain edges, as the lateral boundary conditions (LBCs), are set as the corresponding OMI NO 2 VCDs. For initial conditions, the VCD and the local net source at each grid 5 cell inside the domain boundaries are set as zero. The horizontal distributions of modeled NO 2 VCDs and local net sources at equilibrium do not depend on the initial conditions.
For horizontal transport, we use 3-hourly wind fields from the ECMWF ERA5 dataset (https://confluence.ecmwf.int//display/CKB/ERA5+data+documentation; last access: 2018/7/2). The resolution of raw ERA5 data is 0.28125° on the reduced Gaussian grid, which is regridded to 0.05°×0.05° 10 by using the online program offered by ECMWF (see Fig. A). We adopt the mean wind field of the lowest 14 vertical levels (out of 157 levels in total); these 14 levels represent the altitudes from surface to about 500 m Hersbach and Dee, 2016). Over the study period, the prevailing wind is northwesterly, and the wind speed is small over land (Fig. A). For both zonal and meridional wind speeds, the uncertainty in the average wind speed is set to be 10%, which is similar to the temporal standard 15 deviation of the wind speed and may partly account for the fact that lower-resolution wind data are used. We assess the model errors introduced by the uncertainties in the wind field and effective diffusion coefficients by Monte Carlo simulations in which the wind speeds are changed according to their uncertainties. The resulting relative uncertainty in the modeled NO 2 VCDs is about 20%.

Application of SCM 20
Re-mapping of PHLET simulated NO 2 VCDs in accordance to satellite pixels is important. Given the size of OMI pixels, the OMI NO 2 data smooth to some extent the actual horizontal distribution of NO 2 . To ensure consistent spatial sampling between PHLET and OMI data, for each day we project the PHLET modelled NO 2 VCD data (in the original 0.05°×0.05° grid) to the satellite pixels to mimic how OMI "sees" the ground, remove the pixels with invalid OMI data, and then project the model data back to the 25 0.05°×0.05° grid. The last two procedures are the same as done for OMI data. The whole process of grid conversion is done through the SCM approach (Appendix B). Although PHLET simulates summer average NO 2 VCDs (rather than daily values), we repeat the grid conversion process for as many days as there are valid OMI data.

Summary of model errors 5
The model error is set to be the sum in quadrature of errors contributed by the above mentioned steady state assumption (15%), the time averaging over multiple years and months (10%), the assumption of horizontally constant vertical shape of NO 2 (15%), the NO 2 /NO x ratio (10%), the treatment of background NO 2 concentration (5%), and the error in the wind data and the calculation of effective diffusion coefficients (20%). 10

PHLET-A: The adjoint model of PHLET
We construct the PHLET-A adjoint model to obtain an optimized horizontal distribution of the local net source term (L in Eq. 1) under the given OMI NO 2 VCDs, wind field, and other parameters. PHLET-A accounts for the complex nonlinear effects of 2-dimensional transport and loss processes.
We first define a scalar cost function (Eq. 3) to quantify the difference between OMI NO 2 VCDs and 15 PHLET simulated (and SCM applied) NO 2 VCDs.
Because PHLET does not require a priori knowledge about the local net source, the cost function does not include the a priori term either. The vector denotes gridded NO 2 VCDs. denotes the observational error covariance matrix consisting of a satellite data error covariance matrix ( ) and a 20 PHLET model error covariance matrix ( ): For simplicity and following previous studies (Keiya and Itsushi, 2006;Cao et al., 2018), both and are assumed to be diagonal, with the diagonal elements set to be 2 and 2 , respectively. Grid cells nearby may share the same pixels, although the area-based weights would be different. This means that nearby grid cells may not be fully independent, leading to a weakness of the diagonal assumption here.
The associated uncertainty is partly accounted for by an error term based on the variability of NO 2 VCDs 5 (i.e., 50% of the standard deviation across the surrounding grid cells; see Sect. 2.2).
We then derive PHLET-A and its initial and lateral boundary conditions by applying Lagrange identity and integrating by parts (Marchuk, 1994;Sandu et al., 2005;Martien et al., 2006;Hakami et al., 2007): As shown in Eq. 5, PHLET-A represents the sensitivity of cost function ( ) to local net source ( ) where stand for the adjoint variable (Marchuk, 1994;Sandu et al., 2005). T stands for the time when the domain-wide NO 2 VCDs come to equilibrium, i.e., the start time of the adjoint simulation. By discrete adjoint sensitivity analysis, the gradient of to is obtained: 15 where the indices i, j and k denote zonal, meridional, and time, respectively. The gradient is then used in an iterative optimization (shown by the blue arrows in Fig. 1) to minimize the cost function , i.e., to minimize the weighted difference between model simulated and OMI NO 2 .
The numerical solution to obtain an optimized that minimizes is as follows. Given a starting point 20 of , we derive a search direction by the Broyden-Fletcher-Goldfarb-Shanno (BFGS) method (Li and Fukushima, 2001;Bousserez et al., 2015). Then, by practicing backtracking line search based on the Armijo-Goldstein condition (Armijo, 1966), we obtain a revised for the next iteration. The numerical calculation is done through FEniCS. It takes 50 iterations of PHLET and PHLET-A runs before the convergence is reached, according to the rate of reduction in . The value of is reduced from an initial value of 6585.2 to a stabilized value of 73.6 (Fig. 2). 5 The uncertainty of the optimized is given by the Hessian of the cost function, which is approximated by the BFGS method (Brasseur and Jacob, 2017):

Deriving emission and loss from the local net source term
The optimized local net source term combines the contributions of emission and loss (chemical loss + 10 deposition). We further separate emission from loss by assuming a fixed formula within our small study domain for the nonlinear relationship between lifetimes and VCDs of NO 2 .
In the summertime daytime, the dominant sink of NO x is reactions with the radicals to produce nitric acid and organic nitrogen species. The NO x chemistry quickly reaches a steady state under high solar radiation and air temperature in the early afternoon (Murphy et al., 2006;Valin et al., 2013) when OMI passes over 15 the YRD. The chemical lifetime of NO x depends on the concentrations of NO x and non-methane volatile organic compounds (NMVOC), radiation, temperature, and other factors. Within our small study domain, we assume the net effect of all factors except NO x concentrations to be spatially homogeneous. As such, the chemical lifetime of NO x at steady state is a sole function of NO x concentration (and thus NO 2 VCD, given the constant NO 2 /NO x ratio). Appendix C shows in detail how to deduce the chemical lifetime of 20 NO x from NO 2 VCD, to account for the effect of dry deposition, to separate emission and lifetime from the local net source term, and to quantify the errors involved.

Uncertainty estimate for top-down emissions
For a particular grid cell, the derived emission is affected by the error involved in the estimate of (embedded in satellite data and model simulations) and the error in the separation of emission and lifetime from . The satellite data error is analyzed in Sect. 2.2. The model related error is analyzed in Sect.
2.3. The error of , , is connected with and through the adjoint simulation and is given by 5 Hessian of the cost function J (Sect. 2.4).
The error involved in the separation of emission and lifetime, , is contributed by the assumption on the NO 2 /NO x ratio r (Sect. 2.3.1), the simplified treatment of deposition and chemical processes of NO x (Appendix C), and the assumed relationship between lifetimes and VCDs (Appendix C). is estimated by data fitting of L with different fitting parameters (Appendix C). 10 Thus, the error in emission, , is equal to the sum in quadrature of and , i.e., = √ 2 + 2 . The error in the lifetime is derived from the errors in NO x loss (estimated in Appendix C) and NO 2 VCDs, according to the common manner of error synthesis.
3. High-resolution spatial distributions of NO 2 VCDs, local net sources, and lifetimes over the YRD 15 Figure 3a shows the number of days with valid OMI data in summer 2012-2015 over the YRD area on the 0.05°×0.05° grid. The number of days varies from about 12 to 101 (48 on average). There is a "band" pattern in the spatial distribution, due to the difference in the number of satellite orbits covering each grid cell (not shown). This band pattern is not obvious in the distribution of OMI NO 2 VCDs (Fig. 3b), suggesting that the temporally averaged VCD values are less sensitive to the number of days (12 or more) 20 used for temporal averaging. There are fewer valid data in severely polluted locations. The effect of sampling size on the uncertainty in OMI NO 2 is accounted for in our study (Sect. 2.2). Figure 3b shows the gridded horizontal distribution of OMI NO 2 VCDs. The background value (0.54×10 15 molecules cm -2 ) has not been removed. NO 2 VCDs are high over the major urban centers along the Yangtze River and the coastal line, especially Shanghai, Nanjing (Capital of Jiangsu Province), Hangzhou (Capital of Zhejiang Province), and the Ningbo-Zhoushan area (with intensive maritime shipping activities). The maximum VCD value exceeds 16×10 15 molecules cm -2 in north Shanghai. NO 2 VCDs are 5 larger than 1×10 15 molecules cm -2 at all grid cells, reflecting the influence of local anthropogenic sources and/or pollution transported from nearby cities (Cui et al., 2016). NO 2 VCDs are lower than 5×10 15 molecules cm -2 along the boundaries of our study domain.
Across the grid cells, the absolute errors in OMI NO 2 VCDs are about 1.6-4.9×10 15 molecules cm -2 (Fig.   4a), and the relative errors are about 30%-157% (Fig. 4b). In general, the grid cells with larger NO 2 VCDs 10 have larger absolute errors but smaller relative errors. Over the eastern sea and the southwestern corner of the domain, NO 2 VCDs are relatively small (Fig. 3b), thus their absolute errors are small (Fig. 4a), but their relative errors are very large (Fig. 4b). Figure 4c shows the spatial distribution of the local net source L (emissionloss). A positive (negative) value of L indicates that the emission is larger (smaller) than the loss. The values of L are the greatest 15 (11.7 kg km -2 h -1 ) over the major urban areas with high NO 2 VCDs, and are low (< -1.0 kg km -2 h -1 ) in many areas with low NO 2 loadings. The values of L are the lowest ((-6.9)-(-2.0) kg km -2 h -1 ) at places in the urban-rural fringe zones with NO 2 hotspots nearby; this feature reflects that NO 2 is transported from the urban centers and destroyed in the fringe zones. The absolute errors of L vary from 0.6 to 4.5 kg km -2 h -1 . The spatial correlation between the absolute errors of L (Fig. 4c) and those of the NO 2 VCDs (Fig.  20 4a) is about 0.5. The absolute errors of L are notable in the urban-rural fringe zones where L is small but NO 2 VCD is high, because the deviation of L at these areas is very sensitive to errors in the assumed transport and loss process. NO 2 VCDs of about 1.6×10 15 molecules cm -2 , increasing to 0.8 h at grid cells with the lowest VCDs (around 1.0×10 15 molecules cm -2 ), and exceeding 3 h at many polluted grid cells, i.e., the urban centers.

The nonlinear dependence of lifetimes on VCDs is expected from our inversion method (Appendix C).
Appendix C further shows the chemical lifetimes to be 0.6-3.8 h and the deposition lifetimes to be constantly at 30.4 h across all grid cells.  (Fu et al., 2012). High emission values also occur at places along the Yangtze River and the coastal line. Along the Yangtze River, the highest emission value occurs in Nanjing City. Along the coastal line, there is an emission hotspot in the Ningbo-Zhoushan area. The general spatial distribution of NO x emissions is consistent with that of OMI NO 2 VCDs (correlation = 0.69), reflecting the short lifetimes of NO x and thus a modest effect of horizontal transport. Nonetheless, emissions are much more concentrated at a few sparse locations than NO 2 VCDs are, and many locations near the emission hotspots have very low emissions but relatively large NO 2 5 VCDs, suggesting that the effect of horizontal transport cannot be ignored at such a high resolution. Figure 5b shows the absolute errors of NO x emissions at individual grid cells. The emission errors vary from 0.7 to 4.5 kg km -2 h -1 across all grid cells. The largest uncertainty occurs in north Shanghai, corresponding to the highest VCD (Fig. 3b) and emission (Fig. 5a) values. The spatial pattern of absolute emission errors (Fig. 5b) is closer to the pattern of NO 2 VCDs (Fig. 3b, correlation = 0.51) than to the 10 pattern of emissions (Fig. 5a, correlation = 0.33). There are more hotspots in the distribution of emission errors than in the distributions of VCDs and emissions, because the emission errors can be high at locations with low VCDs and emissions. The spatial pattern of emission errors is consistent with that of L errors (Fig. 4c, correlation = 1.0), indicating that the errors in deriving emissions from the local net sources are rather homogenous. Figure 5c further shows that the relative errors of emissions are high (> 15 100%) over low-emission locations but much lower over emission hotspots.
The scatter plot in Fig. 5d shows the relationship between absolute emission errors and emissions at individual grid cells. The relationship is highly nonlinear, and there is large data spread where the emissions are low. The data spread tends to be smaller when emissions exceed 5 kg km -2 h -1 . The emission errors tend to decrease as emissions increase until about 5 kg km -2 h -1 , after which the emission errors 20 tend to increase with the increasing emissions. The data points in Fig. 5d are colored to indicate the different ranges of VCDs, and they show that grid cells with higher NO 2 VCDs have larger emission errors and smaller data spread.

Comparison between our top-down emissions and spatial proxies
This section compares our NO x emission dataset (Fig. 6b) with several spatial proxies widely used in bottom-up inventories, including nighttime light brightness (Fig. 6c), population density (Fig. 6d), road network (Fig. 6e), ship route density (Fig. 6f), power plant locations (Fig. 6g), and a satellite photo from Google Earth based on Landsat measurement that indicates the extent of land use (Fig. 6i). These proxies 5 broadly represent the intensity of human activities and are highly related to NO x emissions . brightness is represented digitally from 0 to 63 bits. The nighttime light reflects the intensity of household activity, commercial activity, and resource consumption (Elvidge et al., 2013). When regridded to 0.05°×0.05°, the spatial correlation between nighttime light brightness and NO x emissions is about 0.61 over land.    developed places. The majority of lands over the YRD have been developed. Although the lands over the southwest are less influenced by humans than the areas like Shanghai are, many places of the southwest have been developed as cities, towns and roads. This explains the spotted emission sources (Fig. 6b) retrieved from the satellite NO 2 VCDs.
Our emission data and the DECSO inventory are top-down estimates and include the contributions of soil and biomass-burning sources. Thus, we estimate soil and biomass burning emissions from independent 15 sources, and then subtract these emissions from our and DECSO emission datasets. Soil emissions are calculated by the nested GEOS-Chem (Fig. 7c), with the uncertainties assumed to be within 50% (J. Yienger and Ii Levy, 1995;Wang et al., 1998). Biomass burning emissions (Fig. 7b) are taken from the Global Fire Emissions Database (GFED4; www.globalfiredata.org/data.html; last access: 2019/7/10) (Giglio et al., 2013), with the uncertainties estimated to be within 10% over the YRD (Giglio et al., 2009;20 Giglio et al., 2013). Summed over the study domain, the soil sources contribute about 0.5% of our emissions while biomass burning contribute about 5.1%. Figure 7a shows the resulting "anthropogenic" portion of our emissions.
Compared with MEIC (Fig. 7d) and DECSO (Fig. 7e), our high-resolution anthropogenic emission dataset ( Fig. 7a) provides much more detailed spatial information. Our dataset identifies the emission hotspots 25 and their contrast with nearby low-emission areas (e.g., in the urban-rural fringe zones) better than MEIC and DECSO do. The contribution of mobile sources along the road network is clearer in our dataset. Our emission data contain sources over the nearby sea (i.e., from shipping), along the coastal line, and in the southwest of the domain, which are not included in MEIC. Compared with DECSO, our dataset suggests higher emissions on the northern parts of the sea, which may be due to our underestimate of NO x lifetimes 5 (Sect. 3) and/or errors in the DECSO estimate.
MarcoPolo emissions (regridded to 0.05°×0.05°, Fig. 7f) show more detailed spatial information than our emission dataset (Fig. 7a) does. This is because our top-down estimate is limited by the intrinsic resolution of NO 2 VCDs, i.e., our oversampling approach does not fully compensate for the large sizes of OMI pixels. Therefore, the large spatial gradient of NO x emissions is smoothed to some extent in our dataset. 10 On the other hand, the domain that MarcoPolo covers (118.135°E, 29.635°N − 122.125°E, 32.625°N) is much smaller than ours, and emissions of 9 cities (including Zhou Shan, Ningbo, Nantong, Hangzhou, Huai'an, Yancheng, Yangzhou, Taizhou, Shaoxing) and marine shipping emissions are not included in MarcoPolo.
At 0.05°×0.05°, the spatial correlations between our NO x emissions and spatial proxies are 0.61 for 15 nighttime light brightness and 0.50 for population density (Sect. 4.2). These values can be compared to the respective results for MarcoPolo on the 0.05°×0.05° grid (0.35 and 0.55, respectively). When regridded to 0.25°×0.25°, the correlations between our emissions and these spatial proxies become higher: 0.70 for nighttime light brightness and 0.69 for population density. The weaker correlation at a higher resolution reflects that as the spatial resolution gets finer, the chance that NO x emissions are collocated 20 with population or nighttime light becomes smaller , because of the influences of  For most cities, our emissions are consistent with at least one of the other three inventories, often the DESCO top-down inventory, after accounting for errors in our emission estimate. In Yancheng, Huai'an 10 and Ningbo, our emission values are higher than the averages of ours, DECSO and MEIC by 67.5%, 57.5% and 34.6%, respectively. Ningbo (around 29.8°N, 121.5°E ) is a coastal city with many isles and marine ports, as identifiable on the nighttime light map (Fig. 6c). The marine ports in the Ningbo-Zhoushan area contribute about 10% of the total shipping emissions in China (Endresen et al., 2003;Fu et al., 2017). Our dataset and DECSO account for emissions from marine shipping and ports, whereas MEIC does not. 15

Test of our top-down emission derivation method by using GEOS-Chem simulated NO 2 data
This section further presents an test to estimate the reliability of our emission derivation method, by examining to what extent the method can re-produce the emissions used in a nested GEOS-Chem simulation. Specifically, we use the nested GEOS-Chem v9-02 Liu et al., 2018b;Ni et al., 2018) to simulate the NO 2 VCDs in the early afternoon (around the overpass time of OMI) in summer 20 2014 on the 0.3125° longitude ×0.25° latitude grid. The simulated NO 2 data are shown in Fig. 9a and the emission inputs are shown in Fig. 9b. Next, we convert the GEOS-Chem NO 2 VCDs into the 0.05°×0.05° grid, and parameterize PHLET with the wind field adopted by GEOS-Chem, following the procedures in Sect. 2.3. Then we use PHLET, PHLET-A, and the lifetime-emission separation method to estimate the NO x emissions. Finally, we compare the derived emissions (re-mapped to the 0.3125°×0.25° grid) to those used in GEOS-Chem.

Concluding remarks
This study presents a satellite-based top-down method to estimating NO x emissions over urban and surrounding areas at a high horizontal resolution. As a demonstration, the method is applied to the YRD area at the 0.05°×0.05° resolution in summer 2012-2015, based on the POMINO NO 2 product. We 15 construct a simplified, computationally efficient 2-D lifetime-emission-transport model (PHLET) and its adjoint model (PHLET-A) to, together with other procedures, facilitate the emission estimate. The reliability of our inversion method is supported by 1) a rigorous step-by-step derivation of models, assumptions, and parameters used, 2) a comprehensive uncertainty analysis, and 3) an test with GEOS-Chem simulated NO 2 data. Our emission dataset in the YRD area on the 0.05°×0.05° grid shows fine-20 scale spatial information that is tied to nighttime light, population density, road network, maritime shipping, and land use indicated from a Google Earth photo. Our dataset reveals many fine-scale spatial characteristics not well represented or not included in lower-resolution inventories such as MEIC and DECSO. Although this study derives the averaged emissions over summer 2012-2015, calculations of emissions at higher temporal resolutions (e.g., every 2 years) is possible to better capture the interannual variability and trends.
Our inversion method is useful for understanding how human activities have altered the atmospheric environment at fine resolutions. Many crucial human activities, such as urbanization, are conducted at very fine spatial scales. How the resulting emissions affect air quality, public health, and geo-health are 5 still poorly understood due to lack of high-resolution emission data. This problem is particularly severe in the developing countries, because of their rapid paces of urbanization and great inadequacies in emission-related information such as economic statistics and emission factors. This poses a grand challenge for emission control and environmental management. Thus, our inversion method and resulting emission data offer useful independent high-resolution information to monitor the fine-scale emission 10 sources, to improve the bottom-up inventory, to model the urban pollution chemistry and the effect of urbanization, and to conduct spatially targeted emission control.
Our inversion method also has a few shortcomings. The derived emissions do not separate the individual contributions of anthropogenic sectors (i.e., power plants, industry, transportation, and residential). The spatial resolution of the estimated emissions is limited by that of satellite VCD data, although a special 15 oversampling technique has been used to help achieve the highest spatial resolution possible for emissions.
The PHLET model is assumed to be 2-dimensional by simplifying the vertical distribution of NO 2 and not accounting for the spatial variability in the vertical shape, similar to previous studies. The adjoint model assumes the observational error covariance matrix to be diagonal, without fully considering the effect of correlations between individual grid cells. Also, we assume a spatially uniform relationship 20 between NO 2 VCDs and NO 2 lifetimes, which may lead to an underestimate in the lifetimes at low-NO 2 locations over the eastern sea.
Our emission inversion method and models have a few important features enabling their global applications. PHLET and PHLET-A are written in the Python language, which can be readily used with low financial costs. The PHLET model offers computationally efficient simulations of the NO x chemistry, 25 deposition, and transport. At a low computational cost, our inversion method is able to account for the nonlinear relationship between NO x concentration, chemical loss, deposition, and transport. With the advent of TROPOMI and other satellite sensors with unprecedented spatial resolutions, our inversion approach can be applied to these measurements for continuous inference of emissions at finer and finer resolutions. 5

Data availability
Observational data are obtained from individual sources (see links in the text and acknowledgments).
Model results are available upon request. Model codes are available on a collaborative basis.

Author contributions
JL conceived the research. HK, RZ and JL designed the research. HK and RZ performed the data 10 processing, model development, and simulations. ML, HW, LC, RN, JW and YY contributed to data processing, model simulations, and data analyses. QZ provided MEIC data. HK and JL analyzed the results and wrote the paper with input from all authors.
• ( ( , ) • ( , )) = ( and are the diffusion coefficients in the zonal and meridional directions, respectively. We derive the diffusion coefficients based on a random walk assumption (Schirmacher, 2015): ′ is the deviation of wind speed in the zonal or meridional direction. 0 is 3 hours, the sampling 5 interval of ERA5 wind data. Figure A shows the time averaged wind vector and the distribution of and . The relative uncertainty in wind speed is assumed to be 10%, close to the temporal standard deviation of wind speed. The uncertainties of and are set to be 20%, about twice of the relative uncertainty in wind speed. The calculated ranges from 30397 2 −2 over land to 203783 2 −2 over sea. The ranges from 25811 2 −2 over land to 297053 2 −2 over sea. These diffusion 10 coefficients tend to be slightly underestimated, because the variabilities of wind speed at higher frequencies (than 3-hourly) are not accounted for. This means that PHLET may underestimate the horizontal transport slightly.

Appendix B. Satellite Conversion Matrix to account for the smoothing effect of satellite pixels
The SCM is essentially a tool to preform quick conversion between grids, regular or not. In the YRD area, 15 there are 100 × 100 = 10000 grid cells on the 0.05°×0.05° grid. We use the SCM (A matrix: [10000, 10000]) to convert from its original grid (X vector: [10000, 1]) to the final grid (Y vector: [10000, 1]), i.e.,
The 10000 elements in one specific row of A represent the weights of the 10000 elements of X to an element in Y. Apparently, A is a sparse matrix. The following description shows how A is constructed. 20 First, the VCDs specific to satellite pixels are reconstructed from the model grid cells. Each model grid cell (MGC) is divided into 10 × 10 = 100 finer grid cells (FGCs), each having the same area. Suppose the number of MGCs fully or partially covered by a given pixel p is N c , and the number of FGCs in a given MGC i covered by p is g i p , then the total number of FGCs covered by p is: Thus, the average VCD for the pixel p can be reconstructed as follows: The next step represents how the oversampling approach is applied to satellite-smoothed VCD data. 10 Suppose the number of satellite pixels fully or partially covering an MGC j is Np, then the total number of FGCs being part of the intersection of the Np pixels and MGC j is: Finally, the average VCD for the MGC j converted from the Np pixels is: The pink portion of Fig. B denotes the projection from pixel p to MGC j. Thus, the element of SCM converting from MGC i to MGC j can be derived as follows:

Appendix C. Deriving NO 2 lifetime from VCD
We assume a steady state of radicals (HO x ), where the production rate of HO x is equal to the loss rate through three types of termination reactions: between the hydroxyl radical (OH) and NO 2 , between NO and peroxyl radicals to form organic nitrates, and between peroxyl radicals (Murphy et al., 2006;Valin et al., 2011): 5 P(HO x ) = k 1 2 + αk 2eff Here P(HO x ) is the production rate, and the right-hand side of Eq. (C1) is the loss rate. C NO 2 and C OH denote the concentrations of NO 2 and OH, respectively. Since the conversion between the peroxyl radicals (HO 2 + RO 2 ) and OH is in steady state, the term k 1 k 2eff expresses the "effective" total concentration of peroxyl radicals in terms of the concentrations of NMVOC, OH and NO. Assuming P(HO x ), 10 and all reaction constants to be constant (Valin et al., 2011), and given that can be simplified as Eq. (C2): Here a, b, c are the coefficients. Because the chemical lifetimes of NO 2 is determined by ( = 1 k 1 ) , we can deduce the relationship between 2 and : 15 NO 2 is lost primarily through reaction with OH and secondarily through dry deposit ( 2 ), thus its lifetime ( ) is also determined by these two loss processes. Therefore, In the areas of low emissions, the emission term can be neglected in Eq. where k = 1 . We determine the coefficients a ′ , b ′ , c ′ , and k in Eq. C5 by conducing nonlinear fitting of OMI NO 2 VCD data and L values in the low emission areas (see below). This procedure establishes the 5 nonlinear relationship between and VCD, which is then applied to the entire study domain.
The low-emission areas have small values of VCD and large negative values of L. Figure C shows a scatter plot for the derived local net source L and VCD at each individual grid cell of the study domain.
The data scatter reflects the combined effect of emission, loss, and horizontal transport. We then fit the quantiles of L where the VCD is relatively low (< 5 × 10 15 molecules cm -2 , shown as blue points in Fig.  10 C) into Eq. C5 through a nonlinear quantile fitter based on Tensorflow (Abadi et al., 2016). Using the quantile fitting also means that the low-emission grid cells do not need to be explicitly identified prior to the fitting. The quantile fitting gives L as a function of VCD (when emissions are neglected), through which the relationship between lifetimes and VCDs is derived. We conduct the fitting by 50 times, each by linearly changing the assumed percentile threshold of L from 0.1% to 5%, to determine the fitted 15 median value (red line in Fig. C) and uncertainty (gray shaded areas, 95% CI). The uncertainty is caused by the assumption on the NO 2 /NO x ratio r, the simplification of the relationship between lifetimes and VCDs, and possible misjudgment of low-emission areas.
The orange line in Fig. C presents the relationship between NO 2 VCDs and chemical lifetimes ( ) derived based on the mean value of the fitting. The value of varies from 0.6 to 3.8 h with an average of 1.2 h. 20 The lifetimes decline rapidly with increasing VCDs from 0 to 2 × 10 15 molecules cm -2 , and then grows gradually with increasing VCDs. This result is consistent with Valin et al. (2011). By comparison, the value of is 30.4 h and is spatially homogeneous under the assumption here. The total lifetime ( ) varies from 0.6 to 3.3 h (Fig. C, blue line) across the study domain.        VCDs (x-axis) and derived local net sources (y-axis) across individual grid cells. Grid cells with NO 2 VCDs below (above) 5 × 10 15 molecules cm -2 are coloured in blue (orange). The red line and shading denote the median and uncertainty (1 σ CI) of the quantile fitting, respectively, to estimate the nonlinear relationship between NO 2 VCD and lifetime, based on data in the 5 low-emission areas. (b) The derived relationship between NO 2 VCD and lifetime across the range of NO 2 VCDs in the YRD area.