Satellite constraint for emissions of nitrogen oxides from anthropogenic, lightning and soil sources over East China on a high-resolution grid



Introduction
Nitrogen oxides (NO x ≡ NO + NO 2) are important constituents of the troposphere that affect the formation of ozone and aerosols, with significant consequences for air quality, climate forcing and acid deposition. They are emitted from anthropogenic combustion sources as well as from natural sources including lightning, soil and biomass burning. Understanding the individual contributions of anthropogenic and natural emissions is critical both for evaluating the effects of NO x on the global environment and for forming appropriate emission control strategies in polluted areas like East China.
Several inversion studies have attempted to separate anthropogenic from other sources of NO x, especially soil sources, based on measurements from the Global Ozone Monitoring Experiment (GOME) instrument (Jaeglé et al., 2005; Müller and Stavrakou, 2005; Wang et al., 2007; Stavrakou et al., 2008), the Scanning Imaging Absorption Spectrometer for Atmospheric CHartographY (SCIAMACHY) instrument (Müller and Stavrakou, 2005; Stavrakou et al., 2008), and the Ozone Monitoring Instrument (OMI) (Zhao and Wang, 2009). Jaeglé et al. (2005) and Wang et al. (2007) proposed two different methods to separate anthropogenic, soil and biomass burning emissions month by month, with no attempt to constrain lightning emissions. Jaeglé et al. (2005) assumed the a posteriori non-lightning emissions to be solely anthropogenic if the a priori anthropogenic emissions exceed 90 % of the a priori total emissions or if they exceed the a posteriori non-lightning emissions. Otherwise, differences between the a posteriori and a priori emissions were attributed to soil or biomass burning sources. A similar criterion was adopted by Zhao and Wang (2009) to differentiate anthropogenic and soil emissions. Wang et al. (2007) distinguished anthropogenic and soil sources using prescribed values for errors in the a priori anthropogenic emission data. Specifically, if the a posteriori non-lightning non-biomass burning emissions exceed the a priori anthropogenic emissions plus errors (assumed to be 40-60 %), the differences are attributed to soil sources. Müller and Stavrakou (2005) and Stavrakou et al. (2008) used an adjoint modeling approach for source attribution. Wang et al. (2007) suggested that soil emissions over East China amounted to 0.85 TgN per year for 1997-2000, differing significantly from other inverse estimates (Jaeglé et al., 2005; Müller and Stavrakou, 2005; Stavrakou et al., 2008; Zhao and Wang, 2009; L. Jaeglé, personal communication, 2011; C. Zhao and Y. Wang, personal communication, 2011). Soil emissions derived from the inverse modeling also differ from the bottom-up estimates (Yienger and Levy, 1995; Yan et al., 2003, 2005; Hudman et al., 2012; Steinkamp and Lawrence, 2011); in particular, they are 50-300 % larger than the Yienger and Levy (1995) estimate. According to Jaeglé et al. (2005) and Wang et al. (2007), soil emissions may be as large as 40-50 % of anthropogenic emissions in summer for East Asia in 2000 and for East China in 1997-2000, respectively, with significant implications for the global biogeochemical cycling of nitrogen.
The magnitude of lightning emissions is difficult to estimate (Boersma et al., 2005; Schumann and Huntrieser, 2007), especially on the regional scale, where lightning occurrences vary significantly from one year to another (Schumann and Huntrieser, 2007). Most inverse estimates did not attempt to constrain lightning emissions (Jaeglé et al., 2005; Wang et al., 2007; Zhao and Wang, 2009; Lin et al., 2010a, b; Lin and McElroy, 2010). The inverse modeling by Stavrakou et al. (2008) suggested lightning emissions to be 50-80 % larger than their a priori values (3 TgN yr$^{-1}$ globally), with the largest difference over the tropics; the study did not specify the magnitude of lightning emissions over China.
This study presents a new method to inversely derive emissions of NO x for 2006 over East China (101.25° E-126.25° E, 19°-46° N; see Fig. 1) from anthropogenic, lightning and soil sources individually, based on satellite retrievals of NO 2 columns and simulations of the global chemical transport model (CTM) GEOS-Chem. Emissions from biomass burning are not constrained since they are unimportant over East China (Wang et al., 2007; Lin et al., 2010a). A regression-based multi-step inversion approach is used to derive emissions for all months, exploiting information on the seasonal variations of individual sources simulated by the CTM. The satellite data are taken from the DOMINO product version 2 (DOMINO-2) retrieved by the Royal Netherlands Meteorological Institute (KNMI) from the Ozone Monitoring Instrument (OMI) (Boersma et al., 2007, 2011). The nested GEOS-Chem model for East Asia (Chen et al., 2009) is used to calculate vertical column densities (VCDs) of NO 2 in response to various emission sources for the inversion purpose. The top-down emissions are derived at a relatively fine resolution of 0.25° long × 0.25° lat, allowing for a more detailed analysis of the spatial distribution of emissions than in previous studies (Jaeglé et al., 2005; Müller and Stavrakou, 2005; Wang et al., 2007; Stavrakou et al., 2008; Zhao and Wang, 2009).
The paper is organized as follows. Section 2 presents the satellite product. Section 3 describes the CTM and compares simulated VCDs with retrieved values. Section 4 describes the inversion process in detail and analyzes the resulting top-down emissions for anthropogenic, lightning and soil sources. It also evaluates the effects of key assumptions made during the inversion process. Section 5 presents the a posteriori emissions in comparison with previous inverse and bottom-up estimates. Section 6 concludes the present analysis.

VCDs of NO 2 retrieved from OMI
The KNMI DOMINO-2 product offers a level-2 dataset for VCDs of NO 2 derived in three main steps: the calculation of slant column densities (SCDs), tropospheric SCDs, and tropospheric VCDs (Boersma et al., 2007, 2011). The derivation relies on air mass factors (AMFs) to convert the tropospheric SCDs to VCDs. The AMFs are interpolated from a look-up table (LUT), and are subject to errors in the predetermined information for clouds, aerosols, surface albedo, the a priori vertical profile of NO 2, surface pressure, and surface height. The reader is referred to Boersma et al. (2007, 2011) for the detailed derivation of the product.
Errors in retrieved VCDs arise mainly from the calculation of SCDs and their tropospheric portion over cleaner regions, and mainly from the calculation of AMFs over polluted regions (Boersma et al., 2007). Compared to version 1, DOMINO-2 incorporates a variety of improvements to the LUT, surface albedo, and the a priori vertical profile of NO 2 (Boersma et al., 2011). It also includes a cross-track stripe correction and a high-resolution dataset for surface height. As a result, systematic biases found in version 1 are reduced significantly in DOMINO-2 (Boersma et al., 2011). The overall error for retrieved VCDs in DOMINO-2 is estimated to be about 30 % (a relative error) plus $0.7 \times 10^{15}$ molec. cm$^{-2}$ (an absolute error), likely with a larger magnitude in winter than in summer (Boersma et al., 2007, 2011; Lin and McElroy, 2010, 2011; Lin et al., 2010a). In this study, the relative error is assumed to vary nonlinearly from 30 % in summer to 50 % in winter based on the following formula: $0.3 + 0.2 \times (1 - \sin(i\pi/10))$, where $i$ = 0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0 for months from January to December. This information is employed in the emission inversion, and the assumed seasonality is evaluated in Sect. 4.6.
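The seasonal formula for the relative retrieval error can be evaluated directly. The following is a minimal sketch; the names `MONTH_INDEX` and `relative_retrieval_error` are illustrative and not part of DOMINO-2 or GEOS-Chem.

```python
import math

# Index i for January..December, exactly as listed in the text.
MONTH_INDEX = [0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0]


def relative_retrieval_error(month):
    """Assumed relative retrieval error (as a fraction) for month 1..12."""
    i = MONTH_INDEX[month - 1]
    return 0.3 + 0.2 * (1.0 - math.sin(i / 10.0 * math.pi))


if __name__ == "__main__":
    for month in range(1, 13):
        print(month, round(relative_retrieval_error(month), 3))
```

The error is 50 % in January and December (i = 0) and falls to 30 % in June and July (i = 5), and the sine makes the transition between the two nonlinear.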
In this study, the daily level-2 data from DOMINO-2 are gridded to 0.25° long × 0.25° lat and then averaged to obtain monthly mean VCDs for subsequent emission inversion. The level-2 dataset includes measurements at 60 viewing angles corresponding to 60 ground pixels, and the pixel sizes increase nonlinearly from 13 × 24 km$^2$ at nadir to 25 × ∼140 km$^2$ at the edges of the viewing swath. This study excludes pixels with cloud radiance fraction exceeding 50 % (Boersma et al., 2007). In addition, it only uses data from the 30 pixels around the swath center (with a cross-track length less than 30 km), allowing for a better analysis of the spatial distribution of VCDs within short distances. This reduces the swath width in use to about 800 km, so that global coverage is achieved roughly every three days. Note that the pixel sizes here are much smaller than those of the GOME (320 × 40 km$^2$) and SCIAMACHY (60 × 30 km$^2$) instruments used in previous inverse estimates (Jaeglé et al., 2005; Müller and Stavrakou, 2005; Wang et al., 2007; Stavrakou et al., 2008).
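The pixel selection and gridding described above can be sketched as follows. This is an illustration under stated assumptions, not the processing code used in the study: pixels are assigned to gridboxes by their centre coordinates (the actual gridding may weight by pixel footprint), cross-track rows are numbered 0-59 so that the central 30 are rows 15-44, and the function names and tuple layout are hypothetical.

```python
import math


def keep_pixel(cloud_radiance_fraction, row, n_rows=60, n_keep=30):
    """Keep pixels with cloud radiance fraction <= 50 % in the central rows."""
    first = (n_rows - n_keep) // 2  # assumed 0-based row numbering
    return cloud_radiance_fraction <= 0.5 and first <= row < first + n_keep


def grid_index(lon, lat, lon0=101.25, lat0=19.0, res=0.25):
    """(row, col) of the 0.25 degree gridbox containing the pixel centre."""
    return (math.floor((lat - lat0) / res), math.floor((lon - lon0) / res))


def grid_daily(pixels):
    """Average the VCDs of retained pixels per gridbox.

    pixels: iterable of (lon, lat, vcd, cloud_radiance_fraction, row).
    """
    sums = {}
    for lon, lat, vcd, crf, row in pixels:
        if not keep_pixel(crf, row):
            continue
        cell = grid_index(lon, lat)
        total, count = sums.get(cell, (0.0, 0))
        sums[cell] = (total + vcd, count + 1)
    return {cell: total / count for cell, (total, count) in sums.items()}
```

Monthly means would then be obtained by averaging the daily gridded values for each gridbox.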

Descriptions of model simulations
This study uses the nested model of GEOS-Chem (version 08-03-02; http://wiki.seas.harvard.edu/geos-chem/index.php/MainPage) for East Asia run at a horizontal resolution of 0.667° long × 0.5° lat with 47 layers vertically (Chen et al., 2009). The model is run with the full O x -NO x -CO-VOC-HO x chemistry. It is driven by the assimilated meteorological fields of GEOS-5 taken from the National Aeronautics and Space Administration (NASA) Global Modeling and Assimilation Office (GMAO). Vertical mixing in the planetary boundary layer follows the non-local scheme (Lin and McElroy, 2010), accounting for the varying magnitude of mixing from stable to unstable states of the boundary layer. Convection is parameterized based on a modified version of the Relaxed Arakawa-Schubert scheme by Moorthi and Suarez (1992) (Rienecker et al., 2008). The lateral boundary conditions are updated every 3 h using results from associated global simulations at 5° long × 4° lat horizontally.
Annual anthropogenic emissions of NO x, carbon monoxide (CO) and non-methane volatile organic compounds (VOC) are taken from the INTEX-B dataset for 2006 provided by Zhang et al. (2009), including sources from power plants, industry, transportation and the residential sector. Emissions from the residential sector are further assumed to vary month to month, accounting for heating-related emissions that depend on ambient air temperature (Streets et al., 2003). They, however, contribute only 6 % of anthropogenic sources of NO x on an annual basis (Zhang et al., 2009). Emissions from power plants, industry and transportation are held constant across the seasons since the seasonality is relatively small and is not included in the INTEX-B dataset. The impact of this simplification is found to be small (see Sect. 5). The diurnal variations of individual sources follow Lin et al. (2010a) and Lin and McElroy (2010, 2011).
Natural sources of NO x include lightning, soil and biomass burning. Emissions from biomass burning are taken from the year-to-year varying monthly dataset of GFED2 (van der Werf et al., 2006); their magnitudes are negligible for NO x over China (Wang et al., 2007; Lin et al., 2010a) as the relatively low combustion temperature does not allow for significant production of NO x. In 2006, the emission budget for East China is only about 0.013 TgN.
Production of NO x from lightning is determined by the flash rate multiplied by the yield of nitric oxide (NO) from each flash. In GEOS-Chem, the NO yield is latitude dependent: over the Asian continent, the yield is set to 500 moles per flash north of 35° N, reducing to 260 moles per flash south of 35° N (Martin et al., 2006; Hudman et al., 2007), based on the previous observational constraint that the NO yield in the tropics is lower than in the midlatitudes (Huntrieser et al., 2006). The total (intra-cloud and cloud-to-ground) flash rate is determined by convective cloud top height to the 4.9th power over lands free of snow and ice, as formulated by Price et al. (1997). The total amount of lightning-induced emissions is distributed vertically with a backward "C-shape" profile (Ott et al., 2010). Horizontally, as the flash rate depends on cloud properties that are highly parameterized, it is subject to large uncertainties, particularly for individual locations. Alternate lightning parameterizations based on cloud mass flux or convective precipitation (Allen and Pickering, 2002) were found not to improve the simulation of lightning distribution (Hudman et al., 2007). To improve the simulation, a horizontal adjustment is applied for each model gridbox based on the OTD/LIS satellite measurements of lightning flashes (Sauvage et al., 2007; Murray et al., 2009, 2010, 2012). For each month, the mean flash rate over 2004-2008 is set to the monthly climatology derived from the satellite measurements from 1995 to 2005, while the interannual variability is determined by the year-to-year varying cloud heights taken from the GEOS-5 meteorological fields (note that measurements north of 35° N are derived from the OTD instrument, available in 1995-2000). The constraint on the monthly climatology is meaningful since no significant trend of lightning activities is found based on the satellite measurements (Sauvage et al., 2007; Murray et al., 2009, 2010, 2012). Over East China, convection and precipitation amount are both driven by the seasonal transition of the East Asian Monsoon and thus are highly correlated month to month. Figure 3a shows that GEOS-5 captures the observed seasonal variation of precipitation and near-surface (2 m) air temperature, likely indicating that the seasonality of convection is simulated reasonably well, at least on a regional mean basis. It is thus expected that the seasonality of lightning activities is reasonably reproduced by GEOS-Chem, although large uncertainties remain in the exact magnitude and timing of lightning emissions.
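The lightning source described above (flash rate times a latitude-dependent NO yield, with the flash rate scaling as cloud top height to the 4.9th power) can be illustrated with a short sketch. The function names are hypothetical, the treatment of a gridbox lying exactly at 35° N is an assumption, and the proportionality constant of the flash-rate formula is deliberately left out because it is not given in the text.

```python
def no_yield_mol_per_flash(lat_deg):
    """NO yield over the Asian continent: 500 mol/flash north of 35 N, else 260."""
    # Assigning exactly 35 N to the northern value is an assumption of this sketch.
    return 500.0 if lat_deg >= 35.0 else 260.0


def relative_flash_rate(cloud_top_height):
    """Flash rate up to a constant factor: cloud top height to the 4.9th power."""
    return cloud_top_height ** 4.9


def lightning_no_moles(n_flashes, lat_deg):
    """Moles of NO produced by a given number of flashes at a given latitude."""
    return n_flashes * no_yield_mol_per_flash(lat_deg)
```

Because of the steep power law, a 10 % error in cloud top height changes the flash rate by roughly 60 % (1.1 to the 4.9th power is about 1.6), which illustrates why the parameterized flash rate is highly uncertain at individual locations.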
Soil emissions are based on the Yienger and Levy (1995) scheme with the canopy reduction factors described by Wang et al. (1998).They include sources due to microbiological processes producing NO x naturally as well as those associated with use of chemical fertilizers and manure.The net emissions vary with vegetation type (Olson, 1992), temperature and precipitation.The N-pulsing is determined by the amount of precipitation over lands containing dry soils prior to the precipitation (Yienger and Levy, 1995).It is noted that GEOS-5 simulates very well the seasonal variability of precipitation and air temperature (Fig. 3a).Fertilizer derived emissions are limited to agricultural lands and are distributed evenly over the growing season (May to August north of 28 • N and all year long in the tropics) (Yienger and Levy, 1995;Wang et al., 1998).They are assumed to be 2.5 % of the total amount of fertilizer use taken from the country based statistics of the Food and Agriculture Organization of the United Nations (FAO); for China, the fertilizer data represent the year 1990 (Wang et al., 1998).As normally assumed, fertilizer associated emissions are considered to be part of natural sources in this study for comparison with anthropogenic emissions relating to combustion.
Averaged over East China, the a priori anthropogenic emissions of NO x are relatively constant across the seasons, while lightning and soil sources reach maximum values in summer and are not important in winter (Fig. 2).On the annual basis, the a priori anthropogenic emissions of NO x are about 5.8 TgN yr −1 over East China; and lightning and soil emissions are only about 3-6 % of anthropogenic emissions (Table 1).In July, lightning and soil emissions increase to as large as 10-13 % of anthropogenic emissions (Table 1).Locally, anthropogenic emissions exhibit a seasonal pattern that is negatively or weakly correlated with the seasonality of lightning and soil emissions (Fig. 3b).The spatial correlation between anthropogenic and lightning or soil emissions is lower than 0.36 in all months, with a value larger in summer and much lower in winter.
A total of five 1-yr simulations for 2006 were conducted to quantify VCDs of NO 2 from anthropogenic, lightning, soil and biomass burning sources, as shown in Table 2. For consistency with satellite retrievals, model VCDs for each day are obtained by regridding modeled NO 2 at each vertical layer to 0.25° long × 0.25° lat, sampling from gridboxes with valid satellite retrievals, and applying the averaging kernel from DOMINO-2. The daily data are then averaged to obtain monthly mean values for each gridbox.
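Applying an averaging kernel to a model profile amounts to a kernel-weighted sum of the layer partial columns (Eskes and Boersma, 2003). A minimal sketch follows; it assumes that the model profile and the kernel are already on the same vertical layers and that the kernel supplied is the tropospheric one, and the function names are illustrative only.

```python
def kernel_weighted_vcd(partial_columns, averaging_kernel):
    """Model VCD as seen by the retrieval: sum over layers of kernel * partial column."""
    if len(partial_columns) != len(averaging_kernel):
        raise ValueError("profile and kernel must have the same number of layers")
    return sum(a * x for a, x in zip(averaging_kernel, partial_columns))


def monthly_mean(daily_values):
    """Mean over days with a valid value; None marks days without a valid retrieval."""
    valid = [v for v in daily_values if v is not None]
    return sum(valid) / len(valid) if valid else None
```

Layers where the kernel exceeds 1 (typically the upper troposphere) are weighted more heavily than near-surface layers, where the kernel is usually below 1.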
Potential sources of model errors include emissions of NO x, emissions of other pollutants affecting the chemistry of NO x, the chemical mechanism for NO x, the scheme for mixing in the boundary layer, and the meteorological fields. The total model error from factors other than emissions of NO x is estimated to be about 30-40 % (Martin et al., 2003; Wang et al., 2007; Lin and McElroy, 2011); in this study, a value of 40 % is used for the inversion. GEOS-Chem captures fairly well the spatial distributions of retrieved VCDs in different seasons (Fig. 4). The R 2 for spatial correlation between modeled and retrieved VCDs reaches 0.64 for January and 0.53 for July (Table 3). The smaller correlation in July is in part because the native horizontal resolution of the CTM (0.667° long × 0.5° lat) is not fine enough to capture the large spatial variation of VCDs within short distances resulting from the short lifetime of NO x (Martin et al., 2003; Lin et al., 2010a). Spatial smoothing of 5 gridboxes by 5 gridboxes (i.e. 1.25° long by 1.25° lat) results in a significant enhancement of the model-retrieval correlation in July, with the R 2 increasing to 0.67. The improvement of R 2 due to the smoothing is moderate in January, compared to July, since the lifetime of NO x is longer and the spatial variability of VCDs within short distances is smaller and better simulated by GEOS-Chem.
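The 5 × 5 gridbox smoothing and the spatial R 2 used in this comparison can be sketched in a few lines. The handling of domain edges (a truncated window) and of missing gridboxes (`None`, skipped) is an assumption made here for illustration; the function names are hypothetical.

```python
def box_smooth(grid, half_width=2):
    """Moving average over a (2*half_width+1)^2 window; None values are skipped."""
    nrow, ncol = len(grid), len(grid[0])
    out = [[None] * ncol for _ in range(nrow)]
    for r in range(nrow):
        for c in range(ncol):
            vals = [grid[rr][cc]
                    for rr in range(max(0, r - half_width), min(nrow, r + half_width + 1))
                    for cc in range(max(0, c - half_width), min(ncol, c + half_width + 1))
                    if grid[rr][cc] is not None]
            if vals:
                out[r][c] = sum(vals) / len(vals)
    return out


def r_squared(x, y):
    """Square of the Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)
```

In use, both the modeled and the retrieved fields would be smoothed with `box_smooth`, flattened over gridboxes valid in both, and passed to `r_squared`.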
GEOS-Chem underestimates the magnitude of retrieved VCDs particularly over polluted regions in wintertime (Fig. 4).The range of simulated VCDs is also narrower than the retrieved range: spatially, the modeled maximum VCD is lower than the retrieved maximum with the minimum being higher (Table 3).Averaged over East China, model VCDs are about 20 % lower than retrieved values in July and about 36 % lower in January (Table 3).

Method
As discussed in Sect. 3.1, anthropogenic emissions of NO x in East China exhibit weak seasonality, while natural emissions reach maximum values in summer and minimum values in winter (Fig. 2). In addition, the lifetime of NO x is shortest in summer and longest in winter as a result of varying photochemical activity (Martin et al., 2003; Lin et al., 2010a; Lin and McElroy, 2010). This results in minimum values in summer for VCDs of NO 2 of anthropogenic origin and maximum values for NO 2 from natural sources, as simulated by GEOS-Chem (Fig. 5). Averaged over East China, natural sources contribute about 30 % of the total abundance of NO 2 in July and August, in contrast to their negligible contributions in winter months. This characteristic is exploited here to estimate anthropogenic and natural emissions separately.
The inversion here involves a multi-step process based on a weighted multivariate linear regression analysis facilitated by several supplementary procedures. It is done gridbox by gridbox to derive the respective emissions. The regression is described in Sect. 4.1.1. The complete inversion process is described in Sect. 4.1.2 together with the supplementary procedures.

A weighted multivariate linear regression analysis for each gridbox
Neglecting horizontal transport and assuming a linear relationship between the total VCD of NO 2 and VCDs from individual sources, the retrieved VCD of NO 2 for a given gridbox (of 0.25° long × 0.25° lat) in a given month can be approximated as the sum of modeled VCDs from individual emission sources, multiplied by certain scaling factors, and a random error term:

$$\Omega_r = K_a \Omega_{m,a} + K_l \Omega_{m,l} + K_s \Omega_{m,s} + K_b \Omega_{m,b} + \varepsilon, \tag{1}$$

where $\Omega_m = \Omega_{m,a} + \Omega_{m,l} + \Omega_{m,s} + \Omega_{m,b}$. Here $\Omega_r$ denotes retrieved VCD of NO 2, and $\Omega_m$ denotes modeled VCD. The error term $\varepsilon$ is assumed to follow a normal distribution with zero mean and standard deviation ($\sigma$) equal to the sum in quadrature of errors from $\Omega_r$ and $\Omega_m$. The subscripts "a", "l", "s", and "b" indicate anthropogenic, lightning, soil, and biomass burning sources of NO x, respectively. The VCD that can be predicted from the inversion process ($\Omega_p$) is:

$$\Omega_p = k_a \Omega_{m,a} + k_l \Omega_{m,l} + k_s \Omega_{m,s} + k_b \Omega_{m,b}, \tag{2}$$

where each $k$ is an estimate of the corresponding $K$ and is to be determined by the inverse modeling. The top-down emission ($E_t$) is calculated as the sum of the a priori emissions from individual sources ($E_a$, $E_l$, $E_s$ and $E_b$) multiplied by corresponding scaling factors:

$$E_t = k_a E_a + k_l E_l + k_s E_s + k_b E_b. \tag{3}$$

Here a linear relationship is assumed for each source between emissions of NO x and VCDs of NO 2. $\Omega_r$, $\Omega_m$ and $\sigma$ are known variables, and they vary from one month to the next. By minimizing the sum of $[(\Omega_r - \Omega_p)/\sigma]^2$ over all months, Eqs. (1)-(2) serve as a weighted multivariate linear regression model to determine the scaling factors. Here the scaling factors are assumed implicitly to be season independent. Over China, the contribution of biomass burning is very small for NO x, thus we do not attempt to constrain the associated emissions: the respective scaling factor is set to unity. Emissions from lightning and soil vary with season in similar patterns (Fig. 2), thus it is difficult to distinguish their contributions accurately based on the regression approach. Therefore the scaling factors for the two sources are assumed to be the same in conducting the regression analysis. Under these assumptions, Eqs. (1)-(3) reduce to:

$$\Omega_r = K_a \Omega_{m,a} + K_l (\Omega_{m,l} + \Omega_{m,s}) + \Omega_{m,b} + \varepsilon, \tag{4}$$

$$\Omega_p = k_a \Omega_{m,a} + k_l (\Omega_{m,l} + \Omega_{m,s}) + \Omega_{m,b}, \tag{5}$$

$$E_t = k_a E_a + k_l (E_l + E_s) + E_b. \tag{6}$$

The regression model here provides a basic statistical tool to calculate $k_a$ and $k_l$. It is necessary to constrain the ranges of $k_a$ and $k_l$ to facilitate the regression analysis; without such constraints the derived scaling factors may be unrealistic for certain gridboxes (too high, too low or even negative). $k_a$ is set to range from 0.33 to 3, reflecting that uncertainties in anthropogenic emissions are normally moderate for individual gridboxes. Uncertainties in natural emissions are expected to be larger than those in anthropogenic emissions, therefore $k_l$ is set to vary between 0.2 and 5. A larger allowable range for $k_l$ (e.g. 0.1-10) results in more extreme values of $k_l$ in several sparse locations with very low emissions, and has negligible impacts on the top-down emission budgets for East China.
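The bounded, weighted two-factor regression described above can be sketched in plain Python. This is an illustrative implementation under stated assumptions rather than the code used in the study: the retrieved VCD minus the (unscaled) biomass burning VCD is regressed on the anthropogenic VCD and on the combined lightning plus soil VCD, each month is weighted by 1/σ, and the bounds are enforced by searching the edges of the allowed box whenever the unconstrained solution falls outside it. The function and argument names are hypothetical, and the statistical significance test is not included.

```python
def fit_scaling_factors(omega_r, omega_ma, omega_nat, omega_mb, sigma,
                        ka_bounds=(0.33, 3.0), kl_bounds=(0.2, 5.0)):
    """Return (k_a, k_l) minimizing sum(((omega_r - omega_p) / sigma)**2) over months.

    omega_nat is the sum of the modeled lightning and soil VCDs.
    """
    # Divide every term by sigma so the problem becomes ordinary least squares.
    y = [(r - b) / s for r, b, s in zip(omega_r, omega_mb, sigma)]
    xa = [a / s for a, s in zip(omega_ma, sigma)]
    xn = [n / s for n, s in zip(omega_nat, sigma)]

    saa = sum(a * a for a in xa)
    snn = sum(n * n for n in xn)
    san = sum(a * n for a, n in zip(xa, xn))
    say = sum(a * v for a, v in zip(xa, y))
    sny = sum(n * v for n, v in zip(xn, y))

    def cost(ka, kl):
        return sum((v - ka * a - kl * n) ** 2 for v, a, n in zip(y, xa, xn))

    def clip(value, bounds):
        return max(bounds[0], min(bounds[1], value))

    # Unconstrained solution of the 2x2 normal equations.
    det = saa * snn - san * san
    if det > 0.0:
        ka = (say * snn - sny * san) / det
        kl = (sny * saa - say * san) / det
        if ka_bounds[0] <= ka <= ka_bounds[1] and kl_bounds[0] <= kl <= kl_bounds[1]:
            return ka, kl

    # Otherwise the minimum of this convex problem lies on an edge of the box:
    # fix one factor at a bound, solve for the other in 1-D and clip it.
    candidates = []
    for ka in ka_bounds:
        kl = clip((sny - ka * san) / snn, kl_bounds) if snn > 0.0 else kl_bounds[0]
        candidates.append((ka, kl))
    for kl in kl_bounds:
        ka = clip((say - kl * san) / saa, ka_bounds) if saa > 0.0 else ka_bounds[0]
        candidates.append((ka, kl))
    return min(candidates, key=lambda pair: cost(*pair))
```

The separation works because the two regressors have opposite seasonal cycles: the anthropogenic VCD peaks in winter and the natural VCD in summer, so twelve monthly values constrain both factors.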

A multi-step inversion process beyond the regression analysis
To better estimate $k_a$ and $k_l$, the gridboxes are allocated to different groups for ancillary procedures supplementing the regression analysis. The ratio $\Omega_r/\Omega_m$ differs between winter and summer due in part to the season-dependent contribution of natural sources to the abundance of NO 2 and the relative magnitudes of errors in the a priori natural versus anthropogenic emissions. Thus the seasonally varying ratio is used to categorize the gridboxes, followed by a multi-step procedure to calculate $k_a$ and $k_l$ for all gridboxes (Fig. 6). Gridboxes are assigned to group 1 if the ratio $\Omega_r/\Omega_m$ for a selected winter month is smaller than the ratio for a selected summer month; otherwise they are assigned to group 2. For gridboxes in group 1, the regression analysis is performed to calculate $k_a$ and $k_l$ (step 1 in Fig. 6). Results are discarded if the regression is not statistically significant at a significance level of 0.1 under the Chi-square test. Spatial interpolation (step 2 in Fig. 6) is then conducted iteratively to derive $k_l$ for all gridboxes: for a gridbox with undetermined $k_l$, the value of $k_l$ is calculated as the geometric mean of the previously derived values in the surrounding 24 gridboxes. For gridboxes in group 2, the regression analysis usually results in unrealistically low values for $k_l$ and thus is used to estimate $k_a$ only (step 3 in Fig. 6); the value of $k_l$ has been determined at previous steps. Again, the result is discarded if the regression is not statistically significant.
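The group assignment and the iterative spatial interpolation can be sketched as follows. Several details are assumptions made for illustration: the surrounding 24 gridboxes are taken to be the 5 × 5 window around the target excluding the target itself, values filled within one pass only become available to neighbours in the next pass, and the iteration stops when no further gridbox can be filled. The function names are hypothetical.

```python
import math


def assign_group(ratio_winter, ratio_summer):
    """Group 1 if the winter retrieved/modeled ratio is below the summer ratio, else 2."""
    return 1 if ratio_winter < ratio_summer else 2


def fill_by_geometric_mean(k, max_passes=100):
    """Fill None entries of a 2-D list with the geometric mean of known neighbours."""
    nrow, ncol = len(k), len(k[0])
    k = [row[:] for row in k]  # work on a copy
    for _ in range(max_passes):
        updates = {}
        for r in range(nrow):
            for c in range(ncol):
                if k[r][c] is not None:
                    continue
                logs = [math.log(k[rr][cc])
                        for rr in range(max(0, r - 2), min(nrow, r + 3))
                        for cc in range(max(0, c - 2), min(ncol, c + 3))
                        if (rr, cc) != (r, c) and k[rr][cc] is not None]
                if logs:
                    updates[(r, c)] = math.exp(sum(logs) / len(logs))
        if not updates:
            break
        for (r, c), value in updates.items():
            k[r][c] = value
    return k
```

A geometric rather than arithmetic mean is natural here because scaling factors are multiplicative: factors of 0.5 and 2 average to 1 rather than 1.25.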
The regression approach may not be appropriate for certain gridboxes with unrealistically large $\Omega_r/\Omega_m$ ratios in the winter month. This is because the scaling factor $k_a$ likely has a significant seasonal dependence as a result of the seasonality in anthropogenic emissions not fully accounted for in the a priori dataset. These gridboxes are therefore reassigned to a third group, where $k_a$ is derived for each month as the ratio $[\Omega_r - k_l(\Omega_{m,l} + \Omega_{m,s}) - \Omega_{m,b}]/\Omega_{m,a}$ (step 4 in Fig. 6). Tentatively, a gridbox is allocated to group 3 if the ratio $\Omega_r/\Omega_m$ exceeds 3, or if the ratio exceeds 2 with $\Omega_r$ being higher than $6 \times 10^{15}$ molec. cm$^{-2}$, in the winter month.
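The tentative criterion for group 3 is a simple threshold test, sketched below. The second function is an assumption made for illustration: it supposes that the month-specific anthropogenic factor is chosen so that the predicted VCD matches the retrieved VCD once the natural factor is fixed and the biomass burning term is left unscaled. All names are hypothetical.

```python
def is_group3(ratio_winter, omega_r_winter):
    """Winter-month test; omega_r_winter is in units of 1e15 molec cm^-2."""
    return ratio_winter > 3.0 or (ratio_winter > 2.0 and omega_r_winter > 6.0)


def monthly_ka(omega_r, omega_ma, omega_nat, omega_mb, kl):
    """Month-specific k_a, ASSUMING predicted VCD equals retrieved VCD.

    omega_nat is the sum of the modeled lightning and soil VCDs.
    """
    return (omega_r - kl * omega_nat - omega_mb) / omega_ma
```

For example, a gridbox with a winter ratio of 2.5 is reassigned only if its retrieved VCD exceeds the 6 × 10^15 molec. cm^-2 threshold, which restricts the special treatment to polluted locations unless the mismatch is extreme.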
As a final step, spatial interpolation is conducted to obtain $k_a$ for gridboxes of groups 1 and 2 where the regression is not statistically significant (step 5 in Fig. 6).
In group allocation and the subsequent inversion process, each of three winter months (December, January and February) is paired with each of two summer months (July and August) to generate a suite of six cases for winter-summer contrast. (June is not selected since the contribution of natural sources to the total abundance of NO 2 is not as significant as that in July or August.) Results from the six cases are combined to obtain final values of $k_a$ and $k_l$. Specifically, results at a given step available from any or all of the six cases are geometrically averaged to obtain final values of $k_a$ and $k_l$, if and only if they have not been derived at earlier steps.
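The six winter-summer cases and the rule for combining them can be sketched as follows. The data layout (a dictionary mapping step number to the list of per-case results for one gridbox) and the function name are hypothetical, and the reading that the earliest step with any available result takes precedence is an interpretation of the rule stated above.

```python
import math
from itertools import product

WINTER_MONTHS = ("Dec", "Jan", "Feb")
SUMMER_MONTHS = ("Jul", "Aug")
CASES = list(product(WINTER_MONTHS, SUMMER_MONTHS))  # six winter-summer pairs


def combine_cases(results_by_step):
    """Return (step, geometric mean) from the earliest step that has any result.

    results_by_step: dict mapping step number to a list of per-case values,
    with None where a case produced no result at that step.
    """
    for step in sorted(results_by_step):
        values = [v for v in results_by_step[step] if v is not None]
        if values:
            mean_log = sum(math.log(v) for v in values) / len(values)
            return step, math.exp(mean_log)
    return None, None
```

For instance, if only two of the six cases yield a statistically significant regression at step 1, the final factor is the geometric mean of those two values, and later-step results for that gridbox are ignored.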

Scaling factors estimated for anthropogenic, lightning and soil sources
The scaling factors are determined at step 1 for most gridboxes (Figs. 7 and 8).At this step, the values of k a are around unity in most areas, are consistently larger than 1 over the southwestern provinces, and vary between 1 and 2 in many of the remaining areas (Fig. 7).By comparison, k l varies more significantly from one location to the next (Fig. 8).It reaches maximum values in southern Hebei Province and along the northern coasts of the Bohai Sea, which are attributable in part to an underestimate in fertilizer-associated soil emissions in the a priori dataset (see Sect.The special treatment at step 4 is taken mainly for gridboxes in and around Shanxi Province and in parts of Ningxia and Inner-Mongolia (Fig. 7).These places are main areas in China for coal mining and coal-fired electricity generation.The large values of k a in these places, particularly in winter, suggest that the a priori dataset from INTEX-B likely underestimates anthropogenic sources related to use of coal, consistent with the findings of Wang et al. (2010).
The final values of k a and k l are determined at step 5 (see Fig. 7 for January) and step 2 (Fig. 8), respectively, for all gridboxes.They are used to calculate the top-down emissions for respective sources.

VCDs of NO 2 predicted from the inversion process
Compared to simulation results, predicted VCDs are much closer to retrieved values (Fig. 4).The spatial correlation between predicted and retrieved VCDs is much higher than the correlation between simulated and retrieved VCDs, with the R 2 increasing from 64 % to 83 % in January and from 53 % to 86 % in July (Table 3).Averaged over East China, predicted VCDs are within 15 % of retrieved values across the seasons.

Top-down emissions
Figure 9 compares the top-down emissions of NO x for anthropogenic, lightning and soil sources with respective a priori emissions for July.Both top-down and a priori datasets suggest significant anthropogenic sources over the coastal provinces from Shanghai to Beijing and over the Pearl River Delta, resulting in large concentrations of NO 2 (Fig. 10).The top-down anthropogenic emissions are much higher than the a priori emissions in cities and at locations with extensive use of coal, especially in the northern provinces.
In July, emissions from lightning are large in the north as a result of strong convection events associated with the summer monsoon (Fig. 9). Soil emissions are greatest in major agricultural areas in Hebei, Henan, Shandong and parts of neighboring provinces (Fig. 9). The top-down estimates for lightning and soil emissions are much larger than the a priori values at many places in the respective source regions. In particular, the significant increases in soil emissions over southern Hebei and along the northern coasts of the Bohai Sea are not likely to be an artifact of assumptions embedded in the inversion process, as they persist during various sensitivity tests (Cases 1-19 in Table 1). The increases are consistent with the new bottom-up estimates by Steinkamp and Lawrence (2011) and Hudman et al. (2012), and are due in part to the intensive fertilizer-derived emissions that are underestimated in GEOS-Chem (Yienger and Levy, 1995; Wang et al., 1998; Steinkamp and Lawrence, 2011). More detailed comparisons with Steinkamp and Lawrence (2011) and Hudman et al. (2012) are conducted in Sect. 5.1.2.
The contribution of anthropogenic sources to the total emissions of NO x in July is generally consistent between the a priori and top-down datasets (Fig. 9). The anthropogenic contribution exceeds 80 % over large areas of East China, but is lower than 60 % over most of the northwest and Inner-Mongolia. It differs from the anthropogenic contribution to the VCD of NO 2 (Fig. 10). This is in part because lightning emissions occur at higher altitudes, where the lifetime of NO x is longer than near the ground, an effect compensated by a reduced fraction of NO 2 in NO x. Another factor is the use of the averaging kernel in deriving model VCDs. The averaging kernel is larger for NO 2 of lightning origin at higher altitudes and lower for NO 2 derived from the ground (Eskes and Boersma, 2003). Thus a given amount of lightning emissions contributes more to the total abundance of NO 2 than the same amount of emissions from the ground. The effect is evident particularly over the southwest. The inverse modeling study by Lin et al. (2010a) showed that an assumed 100 % increase (0.57 TgN yr$^{-1}$) in top-down lightning emissions resulted in a 15 % reduction (0.86 TgN yr$^{-1}$) in top-down anthropogenic emissions for July 2008 over East China; that is, any NO 2 molecule originating from lightning is 1.5 times as likely to be observed by OMI as a NO 2 molecule of anthropogenic origin, consistent with the analysis here. In addition, data for simulated VCDs are sampled at the time of day of satellite overpass from days with valid retrieval data, while data for monthly emissions are averaged over all times of day on all days of the month. The different sampling methods and the day-to-day and diurnal variations in natural emissions may introduce some differences between anthropogenic contributions to total emissions and to total VCDs. Another possible cause is the neglect of horizontal transport, likely introducing uncertainties in source attribution for relatively clean regions, e.g. at many places of Inner-Mongolia.
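The factor of 1.5 quoted from Lin et al. (2010a) is the ratio of the two compensating emission changes. Written out, with the symbols $\Delta E_{\mathrm{anth}}$ and $\Delta E_{\mathrm{light}}$ introduced here for illustration only:

```latex
\[
\frac{\Delta E_{\mathrm{anth}}}{\Delta E_{\mathrm{light}}}
  = \frac{0.86\ \mathrm{TgN\,yr^{-1}}}{0.57\ \mathrm{TgN\,yr^{-1}}}
  \approx 1.5
\]
```

In other words, matching the same retrieved column requires removing about 1.5 units of anthropogenic emission for every unit of lightning emission added.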
Table 1 compares the a priori and top-down emission budgets over East China for individual sources. Annually, the inversion results in budgets of 8.016 TgN, 0.228 TgN and 0.424 TgN for anthropogenic, lightning and soil emissions, respectively. These values are about 39 %, 31 % and 31 % larger than the corresponding a priori estimates. The top-down datasets also suggest that both lightning and soil emissions are less than 6 % of anthropogenic emissions. For July, the top-down budgets for lightning and soil emissions are 0.0623 TgN and 0.0810 TgN, respectively, about 10 % and 13 % of anthropogenic emissions estimated at 0.646 TgN.
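The percentages quoted in this paragraph follow directly from the budgets, as the short check below shows. The dictionary and function names are illustrative; the numbers are those given in the text.

```python
# Top-down budgets in TgN, as quoted in the text.
ANNUAL = {"anthropogenic": 8.016, "lightning": 0.228, "soil": 0.424}
JULY = {"anthropogenic": 0.646, "lightning": 0.0623, "soil": 0.0810}
# Fractional increase of the annual top-down budget over the a priori.
INCREASE = {"anthropogenic": 0.39, "lightning": 0.31, "soil": 0.31}


def share_of_anthropogenic(budgets, source):
    """Budget of a natural source as a fraction of the anthropogenic budget."""
    return budgets[source] / budgets["anthropogenic"]


def implied_a_priori(source):
    """A priori annual budget implied by the top-down value and its increase."""
    return ANNUAL[source] / (1.0 + INCREASE[source])


if __name__ == "__main__":
    for source in ("lightning", "soil"):
        print(source,
              round(100 * share_of_anthropogenic(ANNUAL, source), 1),
              round(100 * share_of_anthropogenic(JULY, source), 1))
    print(round(implied_a_priori("anthropogenic"), 1))
```

The implied a priori anthropogenic budget of about 5.8 TgN agrees with the value stated in Sect. 3, which is a useful consistency check on the quoted 39 % increase.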

Improved GEOS-Chem simulations using the top-down emissions
The inversion approach presented here assumes a linear relationship for a given gridbox between the total VCD of NO 2 and VCDs from individual sources and between emissions of NO x and VCDs of NO 2. It also neglects horizontal transport, which may introduce uncertainties in deriving emissions for individual gridboxes. To evaluate the reliability of the inversion results, sensitivity simulations of GEOS-Chem were conducted for January and July 2006 by using the top-down emissions to drive the CTM. As shown in Fig. 4, VCDs resulting from the sensitivity simulations reproduce the spatial distribution of retrieved VCDs in both months. The R 2 for spatial correlation reaches a high level of 81 % for January and 66 % for July (Table 3). The smaller correlation in July is due to the short lifetime of NO x such that the CTM (at 0.667° long × 0.5° lat) is not able to simulate the large spatial variation of NO 2 within short distances, as discussed in Sect. 3.2. After horizontal smoothing (5 gridboxes by 5 gridboxes), the R 2 increases to 88 % for January and 81 % for July (Table 3). Averaged over East China, the magnitude of model VCD is about 10 % higher than the predicted value in January and about 5 % lower in July (Table 3), indicating a slight nonlinear relationship between emissions and VCDs through the impacts on other species (HO x, ozone, etc.) and consequently on the lifetime of NO x and its partitioning into NO and NO 2.

Fig. 10. VCDs of NO 2 ($10^{15}$ molec. cm$^{-2}$) in July 2006 over East China resulting from anthropogenic, lightning and soil emissions of NO x and the anthropogenic contributions (in percentage) to the total VCDs modeled by GEOS-Chem and predicted by the inversion. Areas outside the territory of China or without valid retrievals are shown in grey.

Sensitivity of emission inversion to embedded assumptions
This section evaluates the effects on the top-down emissions of several important assumptions made during the inversion process, particularly assumptions on the seasonality of various emission sources. The results are summarized in Table 1.
This study only includes the 30 smaller pixels out of the 60 pixels in each OMI scan in order to better analyze the spatial distribution of nitrogen within short distances. The top-down emission budgets for individual sources are similar to results from a sensitivity calculation employing OMI data from all pixels (Case 2 in Table 1); the differences are larger for individual locations (not shown).
The inversion approach allocates individual gridboxes to three groups prior to the regression. The effect of group allocation was evaluated by two tests, one by re-allocating gridboxes in group 2 to group 1 and the other by re-allocating gridboxes in both group 2 and group 3 to group 1. The tests suggested that the effect of group allocation is less than 15 % for top-down lightning/soil emission budgets for East China (Cases 3 and 4 in Table 1) with much larger impacts for individual locations (not shown).
The regression accounts for the seasonal dependence of retrieval errors. A sensitivity test assuming a time invariant relative error of 30 % resulted in decreases by less than 5 % in top-down lightning/soil emissions (Case 5 in Table 1). Therefore the inversion approach is not sensitive to the seasonality of retrieval errors assumed here.
The regression here relies on assumptions on the seasonal variations of individual emission sources. Anthropogenic emissions from power plants, industry and transportation differ slightly among seasons (Zhang et al., 2009), but are assumed here to be season independent for simulations of GEOS-Chem. The assumption was evaluated by a sensitivity analysis taking into account the seasonality estimated by Zhang et al. (2009). Specifically, modeled anthropogenic VCDs ( m,a ) for individual months were scaled prior to the inversion by the seasonality of total anthropogenic emissions derived from Table 9 of Zhang et al. (2009). This resulted in increases by less than 15 % in top-down lightning/soil emissions (Case 6 in Table 1). The effect is smaller than 5 % for top-down anthropogenic emissions.
Another test was conducted to evaluate the effect of errors in the Yienger and Levy (1995) soil emissions used as our a priori estimate. Specifically, the updated emission data by Hudman et al. (2012) (see Sect. 5.1.2 for specifications) were used to adjust modeled VCDs from soil sources ( m,s ) prior to the inversion, by scaling the VCDs for individual gridboxes with the ratios of Hudman et al. (2012) over Yienger and Levy (1995) soil emissions. As such, the annual top-down lightning emissions were reduced by 11 % and soil emissions enhanced by 19 % (Case 7 in Table 1). This is because of differences in seasonality (timing and magnitude) between soil emissions estimated by Hudman et al. (2012) and by Yienger and Levy (1995). The impacts of emission seasonality are analyzed further below.
Jaeglé et al. (2005), Wang et al. (2007) and Zhao and Wang (2009) assumed lightning emissions to be simulated well by the CTM with no attempt to constrain them inversely. Under the same assumption, a sensitivity analysis was conducted by keeping lightning emissions unchanged during the inversion process here. This resulted in a 52 % increase in the top-down soil emission budget on the annual basis and a 41 % increase for July (Case 8 in Table 1). If soil emissions are instead held unchanged during the inversion process, the top-down lightning emissions increase by less than 6 % (Case 9 in Table 1).
The inversion relies on modeled seasonal variations of lightning and soil emissions for separation from anthropogenic emissions. Due to similarity in seasonality, lightning and soil sources cannot be separated unambiguously for a given gridbox. A total of nine additional tests were performed to further analyze the sensitivity of inversion results to assumptions on the seasonality of lightning and soil emissions, including the timing and magnitude (Cases 10-18 in Table 1). To test the timing of the seasonality, m,l and m,s were shifted arbitrarily forward or backward by one month, separately or in combination, prior to the inversion process (Cases 10-15 in Table 1). As a result, top-down lightning and soil emissions are affected by up to 14 % on the regional mean basis. Another three tests evaluate the magnitude of the seasonality, by lowering m,l and m,s , separately and in combination, by 20 % in spring (March, April, May) and fall (September, October, November) and 40 % in summer (June, July, August). These changes have significant impacts on top-down lightning and soil emissions: by reducing m,l and m,s simultaneously, top-down lightning and soil emissions were enhanced by about 33-34 % in July.
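The timing and magnitude perturbations described above can be illustrated with simple array operations on a 12-month series. This is only a minimal sketch: the function names and the sample values are hypothetical, the cyclic wrap-around from December to January is an assumption, and so is the convention that a positive shift moves values to later months.

```python
import numpy as np

def shift_months(vcd, n):
    """Shift a 12-month series cyclically by n months (n > 0 moves values later)."""
    return np.roll(np.asarray(vcd, dtype=float), n)

def damp_seasonality(vcd):
    """Lower values by 20 % in spring and fall and by 40 % in summer."""
    factors = np.ones(12)
    factors[[2, 3, 4, 8, 9, 10]] = 0.8  # March-May and September-November
    factors[[5, 6, 7]] = 0.6            # June-August
    return np.asarray(vcd, dtype=float) * factors

# Hypothetical modeled lightning VCDs for January..December (arbitrary units).
vcd_l = np.array([0.1, 0.1, 0.2, 0.4, 0.7, 1.0, 1.2, 1.1, 0.6, 0.3, 0.1, 0.1])

later = shift_months(vcd_l, 1)     # the July peak now falls in August
earlier = shift_months(vcd_l, -1)  # the July peak now falls in June
damped = damp_seasonality(vcd_l)   # winter values are left unchanged
```

Each perturbed series would then replace the original one as input to the regression, with the other source left unchanged or perturbed in the same way.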
Convection lifts pollutants in the boundary layer to the upper troposphere, affecting the vertical distributions of various species. The magnitude of convection is highly parameterized in current climate models and CTMs and thus contains large uncertainties (Tost et al., 2010). The importance of model convection for simulated VCDs of NO 2 originates from the altitude dependences of the lifetime of NO x, the fraction of NO 2 in NO x, and the averaging kernel that is applied to the vertical distribution of NO 2 (when comparing with satellite retrievals). Its net effect can be estimated roughly by comparing NO 2 originating from a given amount of lightning emissions (i.e. located mostly at high altitudes) and NO 2 from the same amount of anthropogenic emissions (i.e. located mostly in the boundary layer). As suggested by Lin et al. (2010a) and discussed in detail in Sect. 4.4, any NO 2 molecule originating from lightning is 1.5 times as likely to be observed by OMI as a NO 2 molecule of anthropogenic origin for July 2008 over East China. Thus, doubling the magnitude of model convection and consequent changes in vertical distribution of NO 2 will result in a net increase by 50 % in modeled convection-associated VCD of NO 2 (when the averaging kernel is taken into account). To evaluate the impact of potential errors in model convection, a sensitivity test enhanced by 50 % the magnitude of convection of anthropogenic NO 2 in July and August 2006, by tentatively increasing by 25 % modeled concentrations of anthropogenic NO 2 above 500 hPa (Case 19 in Table 1). This effectively increased modeled anthropogenic VCDs of NO 2 over East China by about 7.5 % in both months. It consequently resulted in a slight reduction in top-down anthropogenic emissions with reductions by about 4-5 % in top-down lightning and soil emissions (Case 19 in Table 1).
Errors in top-down emissions attributable to the inversion procedure are calculated as the sum in quadrature of percentage deviations of all inversion results (Cases 2-19) relative to the base estimate (Case 1), added in quadrature with errors resulting from the nonlinearity between emissions of NO x and VCDs of NO 2 that are not accounted for in the sensitivity tests. The nonlinearity derived errors are taken to be about 10 % for East China as a whole, based on discussions in Sect. 4.5. Thus, errors in top-down emissions for East China attributed to the inversion procedure are estimated to be about 12 %, 59 % and 69 % for anthropogenic, lightning and soil sources, respectively (Table 1).
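The quadrature combination described above can be sketched as follows. The per-case deviations below are hypothetical placeholders (the actual values come from Table 1, which is not reproduced here); only the 10 % nonlinearity error is taken from the text, and the function name is illustrative.

```python
import math

def quadrature_sum(values):
    """Return the square root of the sum of squares of the inputs."""
    return math.sqrt(sum(v * v for v in values))

# Hypothetical percentage deviations of sensitivity cases from the base case.
case_deviations = [3.0, 4.0, 2.0, 1.0]
nonlinearity_error = 10.0  # percent, as stated in the text

# Combine the sensitivity cases first, then add the nonlinearity term.
sensitivity_error = quadrature_sum(case_deviations)
procedure_error = quadrature_sum([sensitivity_error, nonlinearity_error])
print(f"{procedure_error:.1f} %")
```

Because squares simply add, appending the nonlinearity term to the list of case deviations gives the same result as combining it in a second step.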

Total errors in the top-down emission budgets over East China
The inverse estimate here is subject to errors in retrievals, errors in model simulations, and errors in the inversion procedures as estimated from the sensitivity analyses. The total error in top-down emission budget over East China is taken to be the sum in quadrature of the three errors, amounting to about 52 %, 77 % and 85 % for anthropogenic, lightning and soil sources, respectively (Table 1). We do not attempt to estimate the total errors for individual gridboxes as the model and retrieval errors available here only represent bulk estimates on the regional mean basis.

A posteriori emissions
The a posteriori emissions are estimated as the average of the a priori and top-down emissions weighted by the inverse-square of their respective errors (Martin et al., 2003). Errors in the a priori emissions are taken to be 60 % for anthropogenic sources (Wang et al., 2007; Zhao and Wang, 2009) increasing to 100 % for lightning and soil sources accounting for the large range of current estimates (Boersma et al., 2005; Jaeglé et al., 2005; Schumann and Huntrieser, 2007; Wang et al., 2007; Zhao and Wang, 2009; Lin et al., 2010a). Errors in the respective top-down emissions are taken to be 52 %, 77 % and 85 %, as derived in Sect. 4.7. Note that the error estimates here are conducted for total emissions in East China. Errors at individual locations may be much larger for both a priori and top-down datasets; the derivation however requires more detailed information that is not available currently. Therefore the error estimates for regional emission budgets are applied to individual locations, as a simplified procedure, in deriving the a posteriori emissions.
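The weighting described above can be checked with a few lines of arithmetic. The sketch below assumes that the weights are the inverse squares of the relative errors quoted in the text, and that the a posteriori error follows the standard formula for an inverse-variance weighted mean; both are interpretations rather than statements from the study. The a priori anthropogenic budget is back-calculated from the statement that the top-down value is about 39 % larger, so it is approximate, and the function name is illustrative.

```python
import math

def a_posteriori(prior, prior_err, top_down, top_down_err):
    """Combine two estimates with weights 1/err**2 (err = relative error).

    Returns the weighted mean and its relative error.
    """
    w_prior = 1.0 / prior_err ** 2
    w_td = 1.0 / top_down_err ** 2
    mean = (w_prior * prior + w_td * top_down) / (w_prior + w_td)
    rel_err = 1.0 / math.sqrt(w_prior + w_td)
    return mean, rel_err

# Annual anthropogenic budget over East China (TgN).
top_down = 8.016
prior = top_down / 1.39  # approximate, back-calculated from "about 39 % larger"
mean, rel_err = a_posteriori(prior, 0.60, top_down, 0.52)
print(f"{mean:.2f} TgN, +/-{rel_err:.0%}")
```

Under these assumptions the result is close to the a posteriori anthropogenic budget of 7.060 TgN (±39 %) reported in the text; the small difference comes from the rounded percentage used to recover the a priori value.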

Comparison with previous estimates
That anthropogenic emissions inferred from space are larger than bottom-up inventories for China is consistent with results from many previous studies (Jaeglé et al., 2005; Wang et al., 2007; Zhang et al., 2007; Lin and McElroy, 2011). Our a posteriori budget for anthropogenic emissions over East China is similar to the value of 0.565 TgN for July 2007 estimated by Zhao and Wang (2009). This study further pinpoints, at a higher resolution, cities and areas with extensive use of coal to be the main regions where bottom-up inventories likely underestimate anthropogenic emissions. Evaluation of lightning emissions is more difficult due to the large uncertainty in current research (Boersma et al., 2005; Schumann and Huntrieser, 2007) and the significant interannual variability of lightning occurrences on the regional scale (Schumann and Huntrieser, 2007). Our a posteriori emissions are within the range of previous estimates (Boersma et al., 2005; Schumann and Huntrieser, 2007; Stavrakou et al., 2008).
Soil emissions of NO x over China are of great interest given the extensive use of fertilizers. A detailed analysis is conducted as follows for our a posteriori estimate of soil emissions.
Stehfest and Bouwman (2006). Hudman et al. (2012) combine satellite measurements for the growing season of vegetation and attribute 75 % of fertilizer-derived emissions to the first month of the growing season. They also consider deposited nitrogen species as an additional fertilizer-like source of NO x. Steinkamp and Lawrence (2011) update the leaf area index (LAI) data for calculating the canopy reduction factor, while Hudman et al. (2012) assume no canopy reduction. Our a posteriori soil emissions at 0.382 TgN (±65 %) annually are within 25 % of the value of 0.504 TgN calculated by Hudman et al. (2012), if canopy reduction is accounted for in their estimate (R. Hudman, personal communication, 2011). Our emissions are also within the range of 0.18-0.97 TgN suggested by Steinkamp and Lawrence (2011) (see their Table 7). In addition, the horizontal distribution of soil emissions is in broad agreement between our a posteriori dataset and the two new bottom-up estimates.
The inversion starts by allocating the gridboxes to three groups based on analyses of the ratio of retrieved over modeled VCDs in winter and summer. A multivariate regression analysis is then used to derive emissions from individual sources for all months, taking advantage of the seasonal patterns of different sources determined by the CTM. Ancillary procedures are taken to supplement the regression analysis for gridboxes in different groups. Assumptions made during the inversion process contribute to errors in the top-down emission budgets for East China by ∼12 % for anthropogenic sources, ∼59 % for lightning sources and ∼69 % for soil sources. Sensitivity simulations of GEOS-Chem driven by the top-down emission data reproduce the spatial distribution of VCDs retrieved from OMI, with the R^2 for spatial correlation reaching 0.88 for January and 0.81 for July after (5 gridboxes by 5 gridboxes) horizontal smoothing.

Comparison with previous satellite-derived soil emission estimates
The inversion results in an annual budget of 7.060 TgN (±39 %) for the a posteriori anthropogenic emissions of NO x over East China, about 23 % larger than the INTEX-B dataset (Zhang et al., 2009) used as our a priori emissions. On the 0.25° long × 0.25° lat grid, it is evident that the excess is greater over cities and areas with extensive use of coal, particularly in the north in winter.
The a posteriori budgets are 0.208 TgN (±61 %) and 0.382 TgN (±65 %) for lightning and soil emissions, respectively, for 2006 over East China. Both values are about 18 % higher than the respective a priori estimates, but are each less than 6 % of the a posteriori anthropogenic emissions. Even for July, the a posteriori lightning and soil emissions are only about 10 % and 13 % of anthropogenic emissions, respectively. Our results for soil emissions are consistent with recent bottom-up estimates by Steinkamp and Lawrence (2011) and Hudman et al. (2012) and previous inverse estimates by Jaeglé et al. (2005), Stavrakou et al. (2008) and Zhao and Wang (2009). They are however about half of the inverse estimate by Wang et al. (2007) who suggested soil emissions to be more than 40 % of anthropogenic emissions in summer of 1997-2000.
In conclusion, anthropogenic emissions are found to be the dominant source of NO x over East China for 2006, even in summer when natural sources reach maximum values. The contribution of anthropogenic emissions most likely has increased in more recent years due to their rapid growth (Lin and McElroy, 2011). In the future, the anthropogenic contribution may continue to increase along with the rapid economic and industrial development, if emission controls are not implemented successfully. The importance of nitrogen control has been recognized by the Chinese government, resulting in control strategies targeting the power sector. However, the success of nitrogen control also depends on changes in emissions from other sectors, particularly the industrial sector for which the current inventories may be subject to much larger uncertainties than for the power sector (Zhao et al., 2011). Further research is required to evaluate the effectiveness of nitrogen control and resulting impacts on the contributions of anthropogenic versus natural sources to atmospheric nitrogen burdens.

Fig. 1. Regional specifications with provincial boundaries for East China. Also presented are its subregions, provinces and province-level municipalities discussed in the text: the Yangtze River Delta (YRD), the Pearl River Delta (PRD), the Sichuan Basin, the Bohai Sea, Beijing (BJ), Shanghai, Hebei, Henan, Shandong, Shanxi, Ningxia, Inner-Mongolia, and Sichuan. The background shows the annual mean VCDs of NO 2 from the DOMINO-2 product.

Fig. 3. (a) Seasonal variations of monthly mean near-surface (2 m) air temperature and precipitation observed from 284 meteorological stations over East China and modeled by GEOS-5. The observation data are taken from the global hourly dataset (DS3505) archived in the National Oceanic and Atmospheric Administration (NOAA) National Climatic Data Center (NCDC) (http://www7.ncdc.noaa.gov/CDO/cdo; see Lin et al., 2010b). Data from GEOS-5 are sampled at the meteorological stations for consistency with the observations. (b) Spatial distribution of the month-to-month correlation between a priori anthropogenic and natural (lightning + soil) emissions.
4.4 for details). The value of k l also has spike values at spotty locations in other parts of East China where natural emissions are normally very low. These spikes are likely artificial results of the inversion algorithm, errors in GEOS-Chem, and/or errors in the satellite product; they however have negligible impacts on emission budgets over East China.

Fig. 4. VCDs of NO 2 (10^15 molec. cm^-2) for January, April, July, October and annual average for 2006 retrieved from OMI, simulated by GEOS-Chem with a priori emissions, predicted by the inversion approach, and simulated by GEOS-Chem with top-down emissions. Areas outside the territory of China or without valid retrievals are shown in grey.

Fig. 6. Description of the regression-based step-by-step inversion process after gridbox grouping. Values of k l are determined at steps 1 and 2 for all gridboxes and k a at steps 1, 3, 4, 5. See Sect. 4.1.2 for detailed analysis.

Fig. 9. The a priori, top-down and a posteriori estimates of anthropogenic, lightning and soil emissions of NO x (10^15 molec. cm^-2 h^-1) and the anthropogenic contributions (in percentage) to total emissions for July 2006 over East China. Areas outside the territory of China are shown in grey.

Table 1. Emission budgets over East China (TgN) derived from various inversion calculations.

Table 2. Descriptions of VCDs of NO 2 from individual sources derived from model simulations.
* Emissions from all sources are always included for pollutants other than NO x.

Comparison between simulated and retrieved VCDs of NO 2
Figure 4 compares retrieved VCDs with simulated values (from all sources; Case 1 in Table 2) for January, April, July, October and annual average for 2006. Retrieved VCDs are large in regions with more advanced economic and industrial development and/or dense population, including the coastal and neighboring provinces from Beijing to Shanghai, the Pearl River Delta and the Sichuan Basin. Spike values are evident over major cities. In addition, retrieved VCDs vary significantly across the months, reaching maximum values in January and minimum values in July.

Table 3. Statistics for retrieved and modeled VCDs of NO 2 in January and July 2006. Reported statistics: slope c, R^2 c and intercept c.
a Predicted from the regression-based inversion.
b Sensitivity simulation using the top-down emissions.
c With respect to retrieved VCDs.
d After 5 gridboxes by 5 gridboxes horizontal smoothing.
A regression-based multi-step inversion approach is proposed to estimate emissions of NO x from anthropogenic, lightning and soil sources for 2006 over East China on a 0.25° long × 0.25° lat grid. It exploits information on VCDs of tropospheric NO 2 retrieved from the OMI instrument by KNMI (the DOMINO product version 2). The nested GEOS-Chem model for East Asia is used to interpret the impacts of individual sources on VCDs of NO 2 to facilitate the inversion analysis. The inversion is conducted gridbox by gridbox to derive the respective emissions. For any given gridbox, anthropogenic and natural sources are separated based on their different seasonality; and lightning and soil emissions are considered together due to their similarity in seasonality. Differences in spatial patterns between lightning and soil emissions are used implicitly for source separation to some extent.
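The idea of separating anthropogenic from natural sources by their seasonality can be illustrated with a single weighted least-squares fit for one gridbox. This is a deliberately simplified sketch and not the multi-step procedure of the study: it omits the gridbox grouping and the ancillary steps, the function and variable names are hypothetical, the monthly series are synthetic, and the use of a 30 % relative retrieval error as the weight is an assumption.

```python
import numpy as np

def fit_scaling_factors(retrieved, model_anthro, model_natural, rel_error):
    """Weighted least squares for retrieved ~ k_a * anthro + k_n * natural."""
    retrieved = np.asarray(retrieved, dtype=float)
    design = np.column_stack([model_anthro, model_natural])
    sigma = np.asarray(rel_error, dtype=float) * retrieved  # absolute errors
    # Dividing each row by its error turns the fit into ordinary least squares.
    coeffs, *_ = np.linalg.lstsq(design / sigma[:, None], retrieved / sigma, rcond=None)
    return coeffs  # array([k_a, k_n])

# Synthetic 12-month VCD series for one gridbox (arbitrary units, hypothetical).
months = np.arange(12)
anthro = 5.0 + 3.0 * np.cos(2 * np.pi * months / 12)   # winter maximum
natural = 0.6 - 0.5 * np.cos(2 * np.pi * months / 12)  # summer maximum
retrieved = 1.4 * anthro + 1.3 * natural               # built with known factors

k_a, k_n = fit_scaling_factors(retrieved, anthro, natural, np.full(12, 0.3))
```

Because the two modeled series peak in opposite seasons, the fit recovers the factors used to build the synthetic data. Under the linearity assumption stated in the text, such factors would then scale the a priori emissions of the corresponding sources; two sources with similar seasonality, such as lightning and soil, could not be separated this way.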