An Inversion of NOx and NMVOC Emissions using Satellite Observations during the KORUS-AQ Campaign and Implications for Surface Ozone over East Asia

1Harvard-Smithsonian Center for Astrophysics, Cambridge, MA, USA 10 2School of Environmental Science and Engineering, Southern University of Science and Technology, Shenzhen, China 3Department of Chemistry, University of California, Irvine, Irvine, CA, USA 4Institute of Arctic & Alpine Research, University of Colorado, Boulder, CO, USA 5National Center for Atmospheric Research, Boulder, CO, USA 15 6Department of Advanced Technology Fusion, Konkuk University, Seoul, South Korea 7Department of Earth System Science, Tsinghua University, Beijing, China


Introduction
The study of ozone (O3) formation within the troposphere in East Asia is of global importance. This significant pollutant is not confined to the source, as it spreads hemispherically 60 through the air, affecting background concentrations as far away as U.S. A study by Lin et al. [2017] provided modeling evidence of enhancements of springtime surface ozone levels (+0.5 ppbv yr -1 ) in the western U.S. in 1980-2014 solely due to the tripling of Asian anthropogenic emissions over the period. As more studies have informed the impact of ozone pollution on both human health and crop yields, Chinese governmental regulatory agencies have begun to take action 65 on cutting the amount of NOx (NO+NO2) emissions since 2011-2012 [Gu et al., 2013;Reuter et al., 2014;Krotkov et al., 2016;de Foy et al., 2016;Souri et al., 2017]; however no effective policy on volatile organic compound (VOC) emissions, emitted from various sources such as solvent use, mobile, and chemical industries [Liu et al., 2008a,b], had been put into the effect prior to 2016 [Stavrakou et al., 2017;Souri et al., 2017;Shen et al., 2019;, with an exception to 70 Pearl River Delta (PRD) [Zhong et al. 2013]. In addition to China, a number of governments including those of Malaysia and Taiwan have put a great deal of effort into shifting their energy pattern from consuming fossil fuels to renewable sources [Trappey el al., 2012;Chua and Oh, 2011]. On the other hand, using satellite observations, Irie et al. [2016] and Souri et al. [2017] revealed a systematic hiatus in the reduction of NOx over South Korea and Japan potentially due 75 to increases in the number of diesel vehicles and new thermal power plants built to compensate for the collapse of the Fukushima nuclear power plant in 2011. Therefore, it is interesting to quantify to what extent these policies have impacted ozone pollution.
Unraveling the origin of ozone is complicated by a number of factors encompassing the nonlinearity of ozone formation to its sources, primarily from NOx and VOCs. Therefore, to be 80 able to quantify the impact of recent emission changes, we have developed a top-down estimate of emission inventories using well-characterized observations. There are a myriad of studies focusing on optimizing the bottom-up anthropogenic and biogenic emissions using satellites observations, which provide high spatial coverage, in conjunction with chemical transport models for VOCs [e.g., Palmer et al., 2003;Shim et al., 2005;Curci et al., 2010;Stavrakou et al., 2009Stavrakou et al., , 2011, and 85 NOx Chai et al., 2009;Miyazaki et al., 2017;Souri et al., 2016aSouri et al., , 2017Souri et al., , 2018.
Most inverse modeling studies do not consider both NO2 and formaldehyde (HCHO) satellitebased observations to perform a joint-inversion. It has been shown that VOC and NOx emissions https://doi.org/10.5194/acp-2020-220 Preprint. Discussion started: 31 March 2020 c Author(s) 2020. CC BY 4.0 License. can affect the production/loss of each other [Marais et al., 2012;Wolfe et al. 2016;Valin et al., 2016;Souri et al., 2020]. Consequently, a joint method that incorporates both species while 90 minimizing the uncertainties in their emissions is better suited to address this problem. Dealing with this tangled relationship between VOC-NO2 and NOx-HCHO requires an iteratively nonlinear inversion framework able to incrementally consider the relationships derived from a chemical transport model. Here we will provide an optimal estimate of NOx and VOC emissions during the KORUS-AQ campaign using the Smithsonian Astrophysical Observatory (SAO) Ozone 95 Mapping and Profile Suite Nadir Mapper (OMPS-NM) HCHO and the National Aeronautics and Space Administration (NASA) Ozone Monitoring Instrument (OMI) NO2 retrievals whose accuracy and precisions are characterized against rich observations collected during the campaign.
Having a top-down constraint on both emissions permits a more precise quantification of the impact of the recent emission changes on different chemical pathways pertaining to ozone 100 formation and loss.

OMPS HCHO
OMPS-NM onboard the Suomi National Polar-orbiting Partnership (Suomi NPP) is a UV-105 backscattered radiation spectrometer launched in October 2011 [Flynn et al., 2014]. Its revisit time is the same as other NASA A-Train satellites, including Aura at approximately 13:30 local time at the equator in ascending mode. OMPS-NM covers 300-380 nm with a resolution of 1 nm fullwidth half maximum (FWHM). The sensor has a 340×740 pixel charge-coupled device (CCD) array measuring the UV spectra at a spatial resolution of 50×50 km 2 at nadir. The HCHO retrieval 110 has been fully described in González Abad et al. [2015;. Briefly, OMPS HCHO slant columns are fit using direct radiance fitting [Chance, 1998] in the spectral range 327.7-356.5 nm.
The spectral fit requires a reference spectrum as function of the cross-track position as it attempts to determine the number of molecules with respect to a reference (i.e., a differential spectrum fitting). To account for this, we use earthshine radiances over a relatively clear area in the remote 115 Pacific Ocean within -30 o to +30 o latitudes. An upgrade to this reference correction is the use of daily HCHO profiles over the mean climatological ones from simulations done by the GEOS-Chem chemical transport model. The scattering weights describing the sensitivity of the light path through a simulated atmosphere are calculated using VLIDORT. The shape factors used for https://doi.org/10.5194/acp-2020-220 Preprint. Discussion started: 31 March 2020 c Author(s) 2020. CC BY 4.0 License. calculating air mass factors (AMFs) are derived from a regional chemical transport model 120 (discussed later) that is used for carrying out the inversion in the present study. We remove unqualified pixels based on cloud fraction < 40%, solar zenith angle < 65 o , and a main quality flag provided in the data. We oversample the HCHO columns for the period of May-June 2016 using a Cressman spatial interpolator with a 1 o radius of influence.

OMI Tropospheric NO2 125
We use NASA OMI tropospheric NO2 (version 3.1) level 2 data whose retrieval is made in the violet/blue (402-465 nm) due to strong absorption of the molecule in this wavelength range [Levelt et al., 2018]. The sensor has a nadir spatial resolution of 13´24 km 2 which can extend to 40´160 km 2 at the edge of scanlines. A more comprehensive description of the retrieval and the uncertainty associated with the data can be found in Krotkov et al. [2017] and Choi et al. [2019]. 130 We remove bad pixels based on cloud fraction < 20%, solar zenith angle < 65 o , without the row anomaly, vertical column density (VCD) quality flag = 0, and Terrain Reflectivity < 30%. Similar to the OMPS HCHO, we recalculate AMFs by using shape factors from the chemical transport model used in this study. We oversample the OMI granules using the Cressman interpolator with a 0.25 o radius of influence. 135

Model simulation
To be able to simulate the atmospheric composition, and to perform an analytical inverse modeling, we set up a 27-km grid resolution regional chemical transport model using the Community Multiscale Air Quality Modeling System (CMAQ) model [Byun and Schere, 2006] that consists of 328×323 grids covering China, Japan, South Korea, Taiwan and some portions of 140 Russia, India and South Asia (Figure 1). The time period covered by the simulation is from April to June 2016. We use the month of April for spin-up. The anthropogenic emissions are based on the monthly MIX-Asia 2010 inventory [Li et al., 2015] in the CB05 mechanism. The anthropogenic emissions are mainly grouped into three different sectors, namely mobile, point, and residential (area) sources. We apply a diurnal scale to the mobile sectors used in the national 145 emission inventory (NEI)-2011 emission platform to represent the first-order approximation of traffic patterns. We include biomass burning emissions from the Fire Inventory from NCAR (FINN) v1.6 inventory [Wiedinmyer et al., 2011], and consider the plume rise parametrization used in the GEOS-Chem model (i.e., 60% of emissions are distributed uniformly in the planetary boundary layer (PBL)). We use the offline Model of Emissions of Gases and Aerosols from Nature 150 (MEGAN) v2.1 model [Guenther et al., 2006] following the high resolution inputs described in Souri et al. [2017]. The diurnally lateral chemical conditions are simulated by GEOS-Chem v10 [Bey et al., 2001] using the full chemistry mechanism (NOx-Ox-HC-Aer-Br) spun up for a year.
With regard to weather modeling, we use the Weather Research and Forecasting model (WRF) v3.9.1 [Skamarock et al., 2008] at the same resolution to that of the CMAQ (~27km), but with a 155 wider grid (342×337), and 28 vertical pressure sigma levels. The lateral boundary conditions and the grid nudging inputs are from the global Final (FNL) 0.25 o resolution model. The major configurations for the WRF-CMAQ model are summarized in Table 1 and Table 2.

Inverse modeling
We attempt to improve our high-dimensional imperfect numerical representation of 160 atmospheric compounds using the well-characterized NO2 and HCHO columns from satellites. We use an analytical inversion using the WRF-CMAQ model to constrain the relevant bottom-up emission estimation [Souri et al., 2016;Souri et al., 2017;Souri et al., 2018]. The inversion seeks to solve the following cost function under the assumptions that i) both observation and emission error covariances follow Gaussian probability density functions with a zero bias, ii) the observation 165 and emission error covariances are independent and iii) the relationship between observations and emissions is not grossly non-linear: where x is the inversion estimate (a posteriori) given two sources of data: a priori (xa) and observation (y). So and Se are the error covariance matrices of observation (instrument) and emission. F is the forward model (here WRF-CMAQ) to project the emissions onto columns. The 170 first term of Eq.1 attempts to reduce the distance between observations and the simulated columns.
The second term incorporates some prior understanding and expectation of the true state of the emissions, that is, it does not allow the a posteriori to deviate largely from the a priori, even though the observations could be far from our estimation. The weight of each term is dictated by its covariance matrix. If Se is large compared to So, the a posteriori will be independent of the prior 175 knowledge and, conversely, if So dominates, the final solution will consist mostly of the a priori.
Following the Gauss-Newton method described in Rodger [2000], we derive iteratively (i.e., i is the index of iteration) the posterior emissions by: where G is the Kalman gain, = 2 ; + < ; 2 ; and ; (= ( )) is the Jacobian matrix calculated explicitly from the model (discussed later). The 180 covariance matrix of the a posteriori is calculated by: where @ is the Jacobian from the ith iteration. Here we iterate Eq.2 three times. The averaging kernels (A) are given by: The inversion system is complicated by the commonly overlooked fact that observations are biased. For instance, Souri et al. [2018] found that airborne remote sensing observations were 185 high relative to surface Pandora measurements. The overestimation of the VCDs was problematic, since it could have been propagated in the inversion, inducing a bias in the top-down estimation.
The authors partly mitigated it by constraining the MODIS albedo which was assumed to be responsible for the bias. Attempts to reduce the bias resulting from coarse profiles from a global model in calculating gas shape profiles were made by recalculating the shape factors using those 190 from higher spatial resolution regional models in other studies [e.g., Souri et al., 2017;Laughner et al., 2018]. For this study, we use abundant observations from the KORUS-AQ campaign and follow the intercomparison platform proposed by Zhu et al. [2016; using aircraft observations collected during the campaign to be able to mitigate the biases in HCHO columns.
Based on the corrected global model as a benchmark, we scale up all OMPS HCHO columns by 195 20%. To mitigate the potential biases in OMI NO2, we followed exclusively the values reported over the KORUS-AQ period in Choi et al. [2019]. We increase the NO2 concentration uniformly by 33.9% (see table A3 in the paper).
We calculate the covariance matrix of observations using the column uncertainty variable provided in the satellite datasets and consider them as random errors. Therefore, these values are 200 significantly lowered down by oversampling the data over the course of two months. In addition to that, we take into account a fixed error for all pixels due to variability that exists in the applied bias correction. This error is based on the RMSE obtained from the mentioned studies used for removing biases.
To increase the degree of freedom for the optimization, we combine all sector emissions 205 including anthropogenic, biomass burning and biogenic emissions for NOx and VOCs. Therefore, we use the following formula to estimate the covariance of the a priori: where f denotes the fraction of the emission sector with respect to the total emissions, and s is the standard deviation of each sector category which is calculated from the average of each sector to a relative error listed in Table 3. 210 For the same purpose (enhancing the amount of information gained from satellite observation) and to increase computational speed, we reduce the dimension of the state vectors (emissions) by aggregating them. However, grouping emissions into certain zones could also introduce another type of uncertainty, known as the aggregation error. We choose optimally aggregated zones by running the inversion multiple times, each with a certain selection of state 215 vectors [Turner and Jacob, 2015]. As in our previous study in Souri et al. [2018], we use the Gaussian Model Mixture (GMM) method to cluster emissions into certain zones that share roughly similar features and investigate which combinations will lead to a minimum of the sum of aggregation and smoothing errors.
In order to create the K matrix, one must estimate the impact of changes in emissions for 220 each of the aggregated zones to the concentrations of a target compound which is calculated using CMAQ-Direct Decoupled Method (DDM) [Dunker et al., 1989;Cohan et al., 2005]. For instance, the first row and column of K denoting the response of the first grid cell to a zonal emission can be obtained by: NOD is the sensitivity result in units of molecule cm -2 for the first grid indicating how 225 concentration of NO2 column will change at the first row and column of the domain by changing one unit of emission of total NOx emissions. We do not consider the interconnection between the zonal emissions and concentrations due to computational burdens. The same concept will be applied to HCHO and VOC emissions. The advantage of using CMAQ-DDM to estimate the sensitivity lies in the fact that it calculates the local gradient which better represents the non-linear 230 relationship existing between the emissions and the columns [Souri et al., 2017;Souri et al., 2018], which in turn, reduces the number of iterations.

Validation of the model in terms of meteorology
It is essential to first evaluate some key meteorological variables, because large errors in the weather can complicate the inversion [e.g., Liu et al. 2017]. In order to validate the performance 235 of the WRF model in terms of a number of meteorological variables including surface temperature, relative humidity, and winds, we use more than 1100 surface measurements from integrated surface database (ISD) stations (https://www.ncdc.noaa.gov/isd) over the domain in May-June 2016. Table 4 lists the comparison of the model and the observations for the mentioned variables.
Our model demonstrates a very low bias (0.6 o C) with regard to surface temperature. We find a 240 reasonable correspondence in terms of relative humidity indicating a fair water vapor budget in the model. The largest discrepancy between the model and observations in terms of temperature and humidity occurs in those grid cells that are in the proximity of the boundary conditions (not shown). Concerning the wind components, the deviation of the model from the observations is smaller than results obtained in a relatively flat area like Houston in Souri et al. [2016]. 245

Comparison of the model and the satellite observations
Prior to updating the emissions, we find it necessary to shed light on the spatial distribution of tropospheric NO2 and HCHO total columns from both observations and model, and their potential differences relative to their key precursors emissions.

NO2 250
The first row in Figure 2 illustrates tropospheric NO2 columns from the regional model, OMI (using adjusted AMF and bias corrected), and the logarithmic ratio of both quantities in May-June 2016 at ~1330 LST over Asia. The second row depicts daily-mean values of dominant sources of NOx, namely as, biogenic, anthropogenic, and biomass burning emissions (that are subject to change after the inversion). A high degree of correlation between the anthropogenic NOx emissions 255 and NO2 columns implies the predominant production of NO2 from the anthropogenic sources [Logan, 1983]. We find a reasonable two-dimensional Pearson correlation (r=0.73) between the modeled and the observed columns. Generally, the WRF-CMAQ largely underestimated (56%) tropospheric NO2 columns with respect to those of OMI over the entire domain. Segregating intuitively the domain into high emission areas (NOx > 10 ton/day) and low ones (NOx < 10 260 ton/day) allows for a better understanding of the discrepancy between the model and the observations. In the high NOx areas, the model tends to overestimate tropospheric NO2 columns by 73%, whereas for the low NOx regions, the model shows a substantial underestimation by 68%. The underestimation of the model in the low NOx regions is related to a number of factors such as i) the widely-reported underestimation of soil (biogenic) NOx emissions due to the lack of precise knowledge of fertilizers use, soil biota, or canopy interactions [Jaeglé, et al., 2005;Hudman et al., 2010;Souri et al., 2016], ii) the underestimation of the upper-troposphere NO2 due to non-280 surface emissions (aviation/lightning) or errors in the vertical mixing or moist convection [e.g., Souri et al., 2018], and iii) a possible overprediction of the lifetime of organic nitrates diminishing background NO2 levels [Canty et al., 2015]. Addressing the second issue requires a very high resolution model with explicit resolving microphysics and large eddy simulations, and the last problem requires more experimental studies to improve organic nitrates chemistry [Romer Present 285 et al., 2020]. In this study, we attempt to mitigate the discrepancy between the model and the satellite observations solely by adjusting the relevant emissions. Accordingly, future approaches improving models the physical/chemical processes can offset top-down emissions estimates inevitably.

HCHO 290
A comparison between HCHO columns from the model and OMPS along with the major sources of VOCs in May-June 2016 is depicted in Figure 3. A reasonable correlation (r=0.78) between the model and OMPS suggests a good confidence in the location of emissions. However, the magnitude of HCHO columns between the two datasets strongly disagrees, especially over the tropics where biogenic emissions are large. A myriad of studies have reported a largely positive 295 bias (by a factor of 2-3) associated with isoprene emissions estimated by MEGAN using satellite measurements [e.g., Millet et al., 2008;Stavrakou et al., 2009;Marais et al., 2012;Bauwens et al., 2016]. To compound, Stavrakou et al. [2011] found a large overestimation in methanol emissions from the same model that can further preclude the accurate estimation of the yield of HCHO. This is especially the case for the tropics. As a response to the overestimation of the biogenic VOCs by shown here, it is necessary to adjust the emissions to better match the simulated columns with the satellites observations given their errors, and by doing so, there is a chance for a better simulation of the formation of tropospheric ozone.

Updated Emissions
In this section we report the results from the inverse modeling and the associated 320 uncertainty associated with the top-down estimation; moreover, we wish to assess how much information is gained from utilizing satellite observations via the calculation of averaging kernels.
Finally, observations are used to verify, to some extent, the accuracy of our top-down emission estimations.

NOx 325
The first row in Figure 4 shows the a priori, the a posteriori, and their ratios in terms of the total NOx emissions in May-June 2016. We observe that the ratios are highly correlated with those of CMAQ/OMI shown in Figure 2, suggesting that the inversion attempts to reduce the distance between the model and the observations. Major reductions occur over China. We attribute them to strict emissions policies Reuter et al., 2015;de Foy et al., 2016;Krotkov et al., 330 2016;Souri et al., 2017]. The enhancements in NOx emissions are commonly found in rural areas, especially over grasslands located in the western/central China and Mongolia. The changes in NOx emissions over South Korea and Japan are positive [Irie et al., 2016;Souri et al., 2017]. This is especially the case for Japan for which we observe a larger enhancement in total NOx emissions (12%) essentially due to new thermal power-plants. The second row in Figure 4 depicts the relative 335 errors in the a priori, the a posteriori, and AKs. Relative errors in the a priori are mostly confined to values close to 50% in polluted areas. They increase further, up to 100%, in areas experiencing relatively large contributions from biomass burning or biogenic (soil) emissions. Encouragingly, OMI tropospheric NO2 columns in conjunction with the solid mathematical inversion method [Rodger, 2000] greatly reduce the uncertainties associated with the emissions in polluted areas; we 340 observe AKs close to 1 over major cities or industrial areas. We see the lowest values in AKs over rural areas due to weaker signal/noise ratios from the sensor. Therefore, it is desirable but very difficult to improve the model using the sensor in terms of NOx chemistry/emissions in remote areas, evident in the low values of AKs. Table 5  An interesting observation lies in the discrepancy between the ratio of OMI/CMAQ ( Figure   2) to that of the a posteriori to the a priori over the North China Plain, suggesting that using a bulk ratio  cannot fully account for possible chemical feedback. The ratio of 355 OMI/CMAQ is consistently lower than changes in the emission. Two reasons contribute to this https://doi.org/10.5194/acp-2020-220 Preprint. Discussion started: 31 March 2020 c Author(s) 2020. CC BY 4.0 License. effect: i) as NOx emissions decrease in NOx-saturated areas (i.e., the dominant sink of radicals is through NO2+OH), OH levels essentially increase resulting in a shorter lifetime in NO2; therefore to reduce NO2 concentrations, a substantial reduction in NOx (suggested by OMI/CMAQ) is unnecessary coinciding with results from the inverse modeling, ii) the CMAQ-DDM ( Figure S1) 360 suggests that NO2 columns decrease due to increasing VOC emissions over the region; accordingly, the cross-relationship between NO2 concentrations and VOC emissions which has been implicitly taken into consideration by iteratively optimizing the cost function partly adds to the discrepancy. It is because of the chemical feedback that recent studies have attempted to enhance the capability of inverse modeling by iteratively adjusting relevant emissions [e.g., To assess the resulting changes in the tropospheric NO2 columns after the inversion, and to validate our results, we compare the simulated values using the a priori and the a posteriori with OMI in Figure 5. We observe 64% reduction in the tropospheric NO2 columns on average over NCP despite only 32% reduction in the total NOx emissions over the region, a result of the chemical 370 feedback. The two-dimensional Pearson correlation between the simulation using the a posteriori and OMI increases from 73% (using the a priori) to 83%. Both datasets now are in a better agreement as far as the magnitude goes. However, we do not see a significant change in the background values in the new simulation compared to those of OMI. This is primarily because of the consideration of higher covariances over low-emitting areas that weighs up the inversion 375 towards the prior values.
To further validate the results, we compare the NO2 data from the NCAR's four-channel chemiluminescence instrument onboard the DC-8 aircraft during the campaign (not shown). These for specific plants. Nevertheless, a non-trivial oversight in models could be an insufficient representation of both HOx chemistry and dry deposition in forest canopies [Millet et al., 2008]; as a result, the net amount of HCHO in the atmosphere over forest areas is higher than what should be if removal through either a chemical loss or a faster dry deposition is considered.
Owning to the fact that we assume anthropogenic VOC emissions to be less uncertain 400 relative to other sectors, the errors in the a priori are smaller in populated areas. We observe that OMPS HCHO columns are able to significantly reduce the uncertainty associated with the total VOC emissions over areas showing a strong HCHO signal (>10 16 molec.cm -2 ). Over clean areas, it is the other way around; we see less confidence in our top-down estimate (AK<0.4) in areas such as Tibet and Mongolia. 405 We then compare the simulated HCHO column using two different emission inventories with those of OMPS in Figure 7. We observe a substantial improvement both in the spatial structure and the magnitude of simulated HCHO columns using the a posteriori with respect to OMPS. The two-dimensional Pearson correlation increases from 0.78 to 0.91 after applying the adjustments to the emissions. In response to the increases in the total VOC emissions over the 410 North China Plain, we observe ~11% enhancements in the simulated HCHO total columns. The updated emissions lead to a reduction in HCHO total columns as large as 70% in the tropics. Validation of the model in terms of VOCs is not a straightforward task because the chemical mechanism used for our model has lumped several VOC species such as terminal/internal olefin or paraffin, only a handful of which were measured during the campaign. Besides, the MIX-415 Asia inventory estimates the anthropogenic emissions for a selected number of VOCs in the CB05 mechanism. Here, we focus only on six compounds including isoprene, HCHO, ethene, ethane, acetaldehyde, and methanol whose emissions are adjusted (with the same factor) based on satellite measurements. The comparison of the simulated values with the DC-8 measurements showed a noticeable mitigation in the discrepancy between two datasets at lower boundaries (<900 hPa) in 420 terms of isoprene ( Figure S2), ethane ( Figure S3), ethene ( Figure S4), and acetaldehyde ( Figure   S5). Surprisingly, we observe a large underestimation of methanol over the Korean Peninsula by a factor of ten ( Figure S6). Same tendency was observed in other regions in Wells et al. [2014] (see Figure 8 in the paper). Our inversion obviously fails at mitigating the bias as there is not much direct information from the satellite observations on this compound. Wells et al. [2014] and Hu et 425 al. [2011] demonstrated that methanol can be a secondary source of HCHO up to 10-20% in midlatitudes in warm seasons. We tend to underestimate HCHO concentrations (by 15%) in the lower atmosphere (<900 hPa) after using the a posteriori over the Korean Peninsula ( Figure S7).

Implications for surface ozone
The results we have generated can be further exploited to elucidate changes in the ozone 430 production rates P(O3) owing to have constrained both NOx and VOC emissions. We calculate P(O3) by subtracting the ozone loss driven by HOx (HO+HO2), reaction with several VOCs (i.e., alkenes and isoprene), the formation of HNO3, and O3 photolysis followed by the reaction of O( 1 D) with water vapor, from the ozone formation via removal of NO through HO2 or RO2: Since P(O3) is a non-linear function of NOx and VOC emissions, it is advantageous to look at the 435 ratio of chemical loss of NOx to that of ROx (RO2+HO2), a robust indicator to pinpointing underlying drivers for ROx cycle. LROx is defined through the sum of primarily radical-radical reactions: LNOx mainly occurs via the NO2+OH reaction: Typically, a value of LNOx/LROx~2.7 defines the transition line between VOC-sensitive and 440 NOx-sensitive regimes [Schroeder et al., 2017;Souri et al., 2020]. The resultant changes in the LNOx/LROx ratios shed some light on ozone sensitivity with respect to its major precursors, but P(O3) is also dependent on the absolute values of emissions, the degree of reactivity of VOCs, and the abundance of radicals. Here we use the integrated 455 reaction rates (IRR) to determine the most influential reactions pertaining to ozone loss and production at the surface. We focus on 1200 to 1800 China standard time (CST) hours. Figure 9 shows the differences in the major pathways for the loss and the formation of ozone at the surface within the time window. The differences are computed based on the subtraction of the simulation with the a posteriori from that with the a priori. In Figure 9 we see a strong degree of correlation 460 between the changes in magnitude of P(O3) through HO2+NO reaction with those of NOx emissions (Figure 4), whereas the changes in magnitude of P(O3) via RO2+NO reaction primarily are par with those of VOC emissions ( Figure 6). We observe P(O3) increases through HO2+NO and RO2+NO reactions in Japan, South Korea, Myanmar, and Philippines because of increases in NOx emissions in NOx-sensitive regions. The simultaneous decrease in NOx and VOC in PRD and 465 Taiwan causes the production of ozone via the same pathways to reduce.
Normally, in VOC-rich environments, reduction in VOC emissions boosts OH concentrations ( Figure S8). Consequently, we observe an enhancement in NO2+OH reaction in the tropics and subtopics. A substantial reduction in the chemical loss of ozone through NO2+OH over NCP and YRD arises from a considerable decrease of NOx emissions and an increase in OH (due 470 to chemical feedback of NOx). Because of increases in HOx concentrations over NCP ( Figure S8-S9), we also observe an enhancement in ozone loss through reacting with HOx. Changes in the ozone photolysis (O 1 D+H2O) are majorly dictated by photolysis and water vapor mixing ratios, both of which are roughly constant in both simulations; accordingly the difference of the reaction rate is mainly reflecting those in ozone (shown later). Interestingly, we observe a large reduction 475 in the loss of ozone through reaction with VOCs at lower latitudes. This is essentially because of the reduction in ISOP+O3, a VOC that prevails in those latitudes. Despite a much slower reaction rate for ISOP+O3 compared to ISOP+OH and ISOP+hv [Karl et al. 2004], this specific chemical pathway can be important as a way to oxidize isoprene and forming HOx in forests [Paulson and Orlando, 1996]. 480 Figure 10 sums the differences of all mentioned chemical pathways involved in formation/loss of surface ozone at 1200-1600 CST. Because of a complex non-linear relationship between P(O3) and its precursors, we observe a variability in both the sign and amplitude of P(O3).
On average, changes in O3 production dominate over changes in O3 sinks except in Malaysia which underwent a significant reduction in isoprene emissions, thus slowing down the ISOP+O3 485 reaction. Following the patterns of NOx-limited and VOC-limited in Figure 8, it is possible to conclude that the P(O3) differences are mainly driven by those of NOx depending at which chemical condition the changes in emissions have occurred.
Much of the above analysis is based on ozone production rates, however, various parameters encompassing dry deposition, vertical diffusion, and advection can also affect ozone 490 concentrations. Therefore we further compute the difference between the simulated maximum daily 8-h average (MDA8) surface ozone levels before and after the inversion depicted in Figure   11. We see a striking correlation between P(O3) (right panel in Figure 10) and MDA8 surface ozone indicating that the selected chemical pathways in this study can explain ozone changes.
Nonetheless, the transport obviously plays a vital role in the spatial variability associated with the 495 differences of surface ozone [e.g., Souri et al., 2016b]. Figure 11 suggests a significant enhancement of ozone over NCP (~4.56 ppbv, +5.6%) and YRD (5.2 ppbv, +6.8%) due to simultaneous decreases/increases in NOx/VOCs which is in agreement with . On the other hand, reductions in NOx mitigate ozone pollution in PRD (-5.4%), Malaysia (-5.6%) and Taiwan (-11.6%). Table 6 lists the simulated MDA8 surface ozone levels for several regions before 500 and after updating the emissions. Increases in MDA8 ozone over NCP and YRD overshadow decreases in southern China resulting in 1.1% enhancement for China. This provides strong evidence that regulations on cutting VOC emissions should not be ignored. The largest reduction/increase of MDA8 ozone is found over Taiwan/YRD.

Summary
In this paper we have focused on providing a top-down constraint on both volatile organic compound (VOC) and nitrogen oxides ( [Zhang et al., 2012;Trappey el al., 2012;Chua and Oh, 2011]. The analytical inversion also paves the way for estimating the averaging kernels (AKs), thereby informing the amount of information acquired from satellites on the emissions estimation. We observe AKs>0.8 over major polluted areas indicating that OMI is 525 able to improve the emission estimates over medium to high-emitting regions. Conversely, AKs are found to be small over pristine areas suggesting that little information can be gained from the satellite over rural areas given retrieval errors. In line with the studies of Irie et al. [2016] andSouri et al. [2017], we observe a growth in the total NOx emissions in Japan (12%) and South Korea (+9%) which are partially explained by construction of new thermal power plants in Japan, and an 530 upward trend in the number of diesel vehicles in South Korea. MEGAN v2.1 estimates too much isoprene emissions in the tropics and subtropics, a picture that emerges from the latitudinal dependence of the posterior VOC emissions to the prior ones. It is readily apparent from the top-down constrained VOC emissions that the prevailing anthropogenic VOC emissions in NCP is underestimated by 25%, a direction that is in agreement 535 with studies by Souri et al. [2017] and Shen et al. [2019]. We find out that OMPS HCHO columns can greatly reduce the uncertainty associated with the total VOC emissions (AKs>0.8) over regions having a moderate-strong signal (>10 16 molec.cm -2 ).
A large spatial variability associated with both NOx and VOC results in great oscillation in chemical conditions regimes (i.e., NOx-sensitive or VOC-sensitive). Due to considerable 540 reduction/increase in NOx/VOC emissions in NCP and YRD, we observe a large increase (53%) in the ratio of the chemical loss of NOx (LNOx) to the chemical loss of ROx (RO2+HO2) shifting the regions towards NOx-sensitive. As a result, a substantial reduction in afternoon NO2+OH reaction rate (a major loss of O3), and an increase in afternoon NO+HO2 and RO2+NO (a major production pathway for O3) are observed, leading to enhancements of the simulated maximum 545 daily 8-hr average (MDA8) surface ozone concentrations by ~5 ppbv. Therefore, additional regulations on VOC emissions should be implemented to battle ozone pollution in those areas. On the other hand, being predominantly in NOx-sensitive regimes favors regions including Taiwan, Malaysia and PRD to benefit from reductions in NOx, resulting in noticeable decreases in simulated MDA8 surface ozone levels. 550 It has taken many years to develop satellite-based gas retrievals, and weather and chemical transport models accurate enough to enable observationally-based estimates of emissions with reasonable confidence and quantified uncertainty, and produce credible top-down emission inventories over certain areas. However it is essential to improve certain aspects to be able to narrow the range of uncertainty associated with the estimation: i) getting the bias of the satellite 555 gas retrievals about right for some areas (which requires a rigorous construction of representivity factor when it comes to comparing two datasets) would be insufficient because the bias can vary substantially over time and space depending on underlying surface properties and the atmospheric state, ii) there is a need for a proper quantification of the errors in the prior emissions based on the very raw information used in the emission inventory to discourage some arbitrarinesses, iii) the 560 model parameter errors including those from PBL, radiation, and winds should be propagated to the final output [e.g., Rodger 2000], iv) due to intertwisted chemical feedback between various chemical compounds, inverse modeling needs to properly incorporate all available information (beyond HCHO and NO2) considering the cross-relationship either explicitly or implicitly. run on the Smithsonian Institution High Performance Cluster (SI/HPC).

Authors' contributions 570
A  1-The errors in the a priori are estimated from equation 6. 2-The errors in the a posteriori are calculated by equation 4.