Impacts of Different Plant Functional Types on Ambient Ozone Predictions in the Seoul Metropolitan Areas (SMAs), Korea


Abstract. Plant functional type (PFT) distributions affect the results of biogenic emission modeling as well as O3 and particulate matter (PM) simulations using chemistry-transport models (CTMs). This paper analyzes the variations of both surface biogenic volatile organic compound (BVOC) emissions and O3 concentrations due to changes in the PFT distributions in the Seoul Metropolitan Areas, Korea. The Fifth-Generation NCAR/Pennsylvania State Mesoscale Model (MM5)/the Model of Emissions of Gases and Aerosols from Nature (MEGAN)/the Sparse Matrix Operator Kernel Emissions (SMOKE)/the Community Multiscale Air Quality (CMAQ) model simulations were implemented over the Seoul Metropolitan Areas in Korea to predict surface O3 concentrations for the period of 1 May to 30 June 2008. Starting from a performance check of CTM predictions, we consecutively assessed the effects of PFT area deviations on the MEGAN BVOC and CTM O3 predictions, and we further examined them through geospatial and statistical analyses. The three PFT data sets considered were (1) the Korean PFT, developed with a Korea-specific vegetation database; (2) the CDP PFT, adopted from the community data portal (CDP) of the US National Center for Atmospheric Research (NCAR); and (3) the MODIS PFT, reclassified from the NASA Terra and Aqua combined land cover products. Although the CMAQ performance check reveals that all three PFT data sets are applicable choices for regulatory modeling practice, primary data (i.e., PFT and leaf area index (LAI)) were noticeably missing in many geographic locations. Based on the assessed effect of such missing data on CMAQ O3 predictions, we found that these missing data can cause spatially increased bias in CMAQ O3. Thus, this issue must be resolved in the near future to obtain more accurate biogenic emission and chemistry-transport modeling results.
Comparisons of MEGAN biogenic emission results with the three different PFT data showed that broadleaf trees (BTs) are the most significant contributor to the total BVOCs, followed by needleleaf trees (NTs), shrubs (SBs), and herbaceous plants (HBs). In addition, isoprene from BTs and terpene from NTs were recognized as significant primary and secondary BVOC species in terms of BVOC emissions distributions and O3-forming potentials in the study domain. A geographically weighted regression analysis with locally compensated ridge (LCR-GWR) with the different PFT data (δO3 vs. δPFTs) suggests that addition of BT, SB, and NT areas can contribute to O3 increase, whereas addition of an HB area contributes to O3 decrease in the domain.
Assessment results of the simulated spatial and temporal changes of O3 distributions with the different PFT scenarios reveal that hourly and local impacts from the different PFT distributions on occasional inter-deviations of O3 are quite noticeable, reaching up to 13 ppb. The simulated maximum 1 h O3 inter-deviations between different PFT scenarios have an asymmetric diurnal distribution pattern (low in the early morning, rising during the day, peaking at 05:00 p.m., and decreasing during the night) in the study domain. Exponentially diverging hourly BVOC emissions and O3 concentrations with increasing ambient temperature suggest that the use of different PFT distribution data requires much caution when modeling (or forecasting) O3 air quality in complicated urban atmospheric conditions in terms of whether uncertainties in O3 prediction results are expected to be mild or severe.

Introduction
Biogenic volatile organic compounds (BVOCs) emitted from vegetated areas play an important role in the chemistry of the lower troposphere and atmospheric boundary layer via a series of oxidation reactions with OH and NO3 radicals and O3 (Finlayson-Pitts and Pitts, 2000; Atkinson and Arey, 2003). It is known that BVOC emissions can enhance O3 formation in areas with high NOx concentrations because BVOC oxidation increases the concentrations of hydroperoxy and organic peroxy radicals (HO2 and RO2) that can convert NO into NO2 without depleting O3. Conversely, BVOC emissions can reduce O3 concentrations in areas with low levels of NOx because the reaction of O3 and BVOC reduces hydroxyl radicals and leads to decreased O3 formation (Finlayson-Pitts and Pitts, 2000; Hogrefe et al., 2011). For example, a regional air quality modeling study reported that biogenic emissions are associated with at least 20 % of surface O3 concentrations in most areas of the continental United States (Tao et al., 2003). A modeling study with MOZART-4 (Model for Ozone and Related Chemical Species, version 4) showed that biogenic isoprene emissions cause −5 ppb to 10 ppb changes in surface O3 concentrations over the Amazon region, Indonesia, and parts of South Africa during the spring season (Pfister et al., 2008).
The flux of biogenic emission is a strong function of vegetation area, biomass density, and other environmental factors (Guenther et al., 1995, 2006). In biogenic emission modeling, vegetation area has been commonly considered to be one of the most important driving variables because it reflects the phenological emission capacity of the area of interest. In order to more efficiently prepare vegetation area information as an input to biogenic emission modeling, plant functional type (PFT) data have been used (Guenther et al., 2006; Pfister et al., 2008; Arneth et al., 2011). Conceptually, PFTs are classes of vegetation species that share similar responses to environmental factors, similar functioning at the organismic level, and/or similar effects on ecosystems (Smith et al., 1997; Sun et al., 2008). In a biogenic emissions modeling approach, the PFT data sets specify the type and composition of vegetation classes in a grid cell to determine the capacity of biogenic emissions. Thus, careful consideration of PFT distributions is needed for the estimation of biogenic emissions and subsequent predictive work such as O3 prediction with a chemistry-transport model (CTM).
Several recent modeling studies have reported sensitivities of biogenic emissions and O3 concentrations to PFT data. For example, Guenther et al. (2006) reported about −13 to 24 % changes in global annual isoprene emissions from standard Model of Emissions of Gases and Aerosols from Nature (MEGAN) modeling using 11 different PFT data sets. Pfister et al. (2008) reported a factor of two or more differences in monthly isoprene emissions on global to regional scales through the examination of the MEGAN sensitivity to three different sets of satellite-derived leaf area index (LAI) and PFT data sets. Pfister et al. (2008) also reported a difference of up to 5 ppb in monthly mean O3 concentrations from the global-scale simulations of MOZART/MEGAN with three different sets of satellite-derived LAI and PFT input data.
Although the previously summarized modeling studies produced valuable findings, none of them represents a local scale or a more detailed situation. For example, Pfister et al. (2008) adopted a coarse-grid resolution (i.e., 2.8° × 2.8°) in their study and used only satellite-based PFT distributions to carry out their global modeling study. In addition, they also showed that the sensitivity or uncertainty (e.g., a factor of 2 or more differences in isoprene and up to 5 ppb of surface O3 inter-deviation) is related not only to changes in LAI data but also to changes in PFT data.
The purpose of our study is to investigate the impacts of different PFT distribution data on O3 concentrations. Carrying out biogenic emission model and CTM simulations, we investigated changes in surface biogenic emissions and ambient O3 concentrations with different PFT distribution data sets. Our approach has several distinctive features. First, we included a region-specific high-resolution PFT distribution data set: the Korean PFT database (KORPFT). The KORPFT was derived from three data sets: (1) Korean land cover classification maps; (2) tree stock maps; and (3) Korean vegetation survey data. Secondly, we adopted a 3 km by 3 km fine grid system in order to more closely investigate spatial patterns of biogenic emissions and O3 behavior in accordance with finer PFT distribution patterns. Thirdly, we changed only the PFT data sets to isolate the impacts of the different PFTs on atmospheric chemistry (or O3 concentrations).
The main idea behind our approach is that because PFT distributions affect the magnitudes of biogenic emission capacity in the domain of interest, the use of different PFT data can affect biogenic emission estimations and consequently O3 prediction results. The proposed idea and method were tested over the Seoul Metropolitan Areas (SMAs) in Korea, in which heavily developed urban areas and densely vegetated areas coexist.
Section 2 of this paper describes the PFT distribution data sets developed for or used in this study, together with the simulation framework used for biogenic emission estimation and O3 predictions and some statistical measures used for the evaluation of the modeling results. Section 3 presents and discusses evaluation and comparison results for biogenic emissions and O3. Finally, Sect. 4 presents some conclusions of this study.

Korean PFT database
A highly resolved PFT distribution data set (resolution of 150 × 150 m2) for the Korean Peninsula (KORPFT) was established in this study by compiling various local sources of vegetation data, such as a Korean land cover map (produced in 2007 and renewed in 2009) and vegetation survey data (collected for the consecutive years 1994-2004) from the Korea Ministry of Environment, and tree stock maps (produced and renewed for the consecutive years 1996-2005) from the Korea Forest Service.
A four-PFT scheme in which multiple vegetation species were classified into four types, broadleaf (BT), needleleaf (NT), shrub (SB), and herbaceous (HB), was applied. The fraction of individual PFT groups ($\mathrm{PFTF}_k$) was calculated by applying the canopy density-weighted focal average as follows:

$$\mathrm{PFTF}_k = \frac{\sum_{i=1}^{n} \left( fm_{\mathrm{PFT}_{k,i}} \times \delta_{\mathrm{PFT}_{k,i}} \right)}{\mathrm{SA}} \times 100 \qquad (1)$$

where SA is a selected grid region (m2) with n grid cells; $\mathrm{PFTF}_k$ denotes the percentage area of an individual PFT in the SA (%); k denotes the individual PFT type (e.g., BT, NT, SB, and HB); n denotes the number of unit grid cells in the SA; $fm_{\mathrm{PFT}_{k,i}}$ is the spatial moving averaged value (m2) of the ith unit grid cell for a PFT in the SA calculated by the focal average method; and $\delta_{\mathrm{PFT}_{k,i}}$ is the vegetation canopy density factor for an individual PFT at the ith grid cell ($0 \le \delta_{\mathrm{PFT}_{k,i}} \le 1$).
The focal average of individual PFTs at every focal grid cell (the ith grid cell) can be defined as follows:

$$fm_{\mathrm{PFT}_{k,i}} = \frac{1}{N_i} \sum_{j \sim i} a_{j\,\mathrm{wrt}\,\mathrm{PFT}_k} \qquad (2)$$

where $N_i$ denotes the number of all neighbors of the ith unit grid cell, the set $\{j : j \sim i\}$ contains all the neighboring locations j of unit i, and $a_{j\,\mathrm{wrt}\,\mathrm{PFT}_k}$ is the area (m2) of neighboring grid j of a focal grid cell with respect to an individual PFT group, summed over all neighboring grids.
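The two-step calculation described above (a focal average over neighboring cells, followed by a canopy density-weighted sum normalized by the region area) can be illustrated with a minimal Python sketch. The function names, the 3 × 3 neighborhood, and the choice to exclude the focal cell from its own neighbor set are assumptions made for illustration; the text does not specify the window size used for KORPFT.

```python
import numpy as np


def focal_mean(area, radius=1):
    """Average PFT area over the neighbors of each cell (focal cell excluded).

    `area` holds the PFT-covered area (m2) of each unit grid cell.
    The square window of half-width `radius` is an assumed neighborhood.
    """
    area = np.asarray(area, dtype=float)
    rows, cols = area.shape
    out = np.zeros_like(area)
    for i in range(rows):
        for j in range(cols):
            r0, r1 = max(i - radius, 0), min(i + radius + 1, rows)
            c0, c1 = max(j - radius, 0), min(j + radius + 1, cols)
            window = area[r0:r1, c0:c1]
            n_neighbors = window.size - 1
            if n_neighbors == 0:
                # A single-cell grid has no neighbors; fall back to the cell itself.
                out[i, j] = area[i, j]
            else:
                out[i, j] = (window.sum() - area[i, j]) / n_neighbors
    return out


def pft_fraction(area, canopy_density, region_area):
    """Percentage PFT area in a region: 100 * sum(fm * delta) / SA."""
    fm = focal_mean(area)
    return 100.0 * np.sum(fm * np.asarray(canopy_density, dtype=float)) / region_area


# Illustrative 3 x 3 region of fully covered 150 m x 150 m cells.
cells = np.full((3, 3), 150.0 * 150.0)
print(pft_fraction(cells, np.full((3, 3), 0.5), cells.sum()))
```

With every cell fully covered and a uniform canopy density factor of 0.5, the sketch yields a PFT fraction of 50 %, which shows how the density factor scales the area-based fraction.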

CDP and MODIS PFT
Two other PFT distribution data sets used in this study are based on satellite observations. The former PFT distribution data set (resolution of 1 × 1 km2) was developed for the global isoprene emission study (Guenther et al., 2006) and derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) land cover product in 2001. As this PFT data set is downloadable from the community data portal (http://cdp.ucar.edu) maintained by the US National Center for Atmospheric Research (NCAR), we refer to this data set as CDP for brevity in this study. The latter PFT distribution data set (resolution of 0.5 × 0.5 km2) was reclassified for this study from the MODIS land cover type 5 products of the Terra and Aqua satellite sensors in 2008.
Hereafter, MODIS will only refer to this reclassified 2008 data set. The MODIS land cover type 5 data, with a PFT scheme including eight vegetation and four non-vegetation classes (Bonan et al., 2002; Friedl et al., 2002; Strahler et al., 1999), were processed to generate the MEGAN PFT.
The MODIS vegetation classes were converted into MEGAN PFT classes by straightforward mapping of the eight MODIS vegetation classes into the four MEGAN PFTs (i.e., BT, NT, SB, and HB). Specifically, the PFT distribution was calculated by adopting land cover class descriptions provided by the International Geosphere-Biosphere Programme (IGBP) (Friedl et al., 2002; Strahler et al., 1999). As an example, geographically distributed pixels (pixel size = 500 m × 500 m) for MODIS broadleaf evergreen and broadleaf deciduous trees were reclassified as BT. Based on the IGBP description of the vegetation-covered fraction for the broadleaf evergreen and broadleaf deciduous trees (Strahler et al., 1999) (i.e., "Lands dominated by woody vegetation with a percentage cover > 60 %"), we assumed that the BT-covered fractions for every reclassified pixel range from 60 to 100 % (mean BT-covered fraction = 80 %). Therefore, we simply assigned a mean BT-covered fraction of 80 % to every BT pixel when we developed a gridded PFTF_BT database (500 m × 500 m spatial resolution) for MEGAN.
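The BT example above can be written as a short sketch. It assumes the land cover pixels are available as an array of class labels; the label strings and the function name are hypothetical placeholders rather than the actual MODIS class codes, and only the BT mapping with its 80 % mean cover fraction is taken from the text.

```python
import numpy as np

# Hypothetical labels standing in for the two MODIS broadleaf tree classes.
BT_CLASSES = ["broadleaf_evergreen_tree", "broadleaf_deciduous_tree"]

# Mean BT-covered fraction (%) assigned to every BT pixel, i.e., the midpoint
# of the assumed 60-100 % cover range.
MEAN_BT_COVER = 80.0


def bt_fraction_grid(class_labels):
    """Return the PFTF_BT value (%) for each 500 m pixel of a label array."""
    labels = np.asarray(class_labels)
    is_bt = np.isin(labels, BT_CLASSES)
    return np.where(is_bt, MEAN_BT_COVER, 0.0)


pixels = [
    ["broadleaf_evergreen_tree", "grass"],
    ["urban", "broadleaf_deciduous_tree"],
]
pftf_bt = bt_fraction_grid(pixels)
print(pftf_bt)         # per-pixel BT fraction (%)
print(pftf_bt.mean())  # mean BT fraction (%) of the 2 x 2 pixel block
```

Averaging the per-pixel values over a coarser model cell, as in the last line, is one simple way such a 500 m product could be aggregated; the aggregation method actually used is not described in the text.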

Framework of the simulation experiment
A modeling framework, the Fifth-Generation NCAR/Pennsylvania State Mesoscale Model (MM5)-the Sparse Matrix Operator Kernel Emissions (SMOKE)-the Model of Emissions of Gases and Aerosols from Nature (MEGAN)-the Community Multiscale Air Quality (CMAQ), was established for meteorological modeling, anthropogenic and biogenic emissions processing and modeling, and chemistry-transport model simulations for May-June 2008. In this modeling study, three-step nested gridding was conducted. Horizontal resolutions of 27 × 27 km2 for the largest domain, 9 × 9 km2 for the second largest domain, and 3 × 3 km2 for the smallest domain were used (refer to Fig. 1a). In the framework, the model settings for the three-step simulations were the same, except for the biogenic emissions input. The three sets of hourly, gridded, and speciated biogenic emissions (based on KORPFT, CDP, and MODIS data sets) were merged with anthropogenic emissions (described in Sect. 2.2.2).
Because we are focusing on the impact of using different PFT data sets, the specific simulation scenarios are hereafter referred to as KORPFT, CDP, and MODIS (Table 1).
Figure 1. The modeling domain. The target domain is a 3 km × 3 km grid system domain covering the Seoul, Gyeonggi, and Incheon Metropolitan Areas. Note that "27 km", "9 km", and "3 km" refer to the model grid resolutions.

Meteorological and chemical transport modeling
For meteorological inputs to the CMAQ model, MM5 modeling was carried out for the period of May-June 2008. Each of the three domains (i.e., 27 km, 9 km, and 3 km) consisted of 20 vertical layers resolving the atmosphere between the surface and 100 hPa in sigma coordinates. Applying a two-way nesting technique, the meteorological outputs from coarse-grid to fine-grid domains (i.e., 27 to 9 to 3 km) were derived. NCEP/DOE AMIP-II (National Centers for Environmental Prediction/Department of Energy Atmospheric Model Intercomparison Project II) Reanalysis data (Reanalysis-2) (Kanamitsu et al., 2002) were used for the initial and boundary conditions (IC/BC). The Grell scheme (Grell et al., 1994), based on the rate of destabilization or quasi-equilibrium, was employed for cumulus parameterization. The Medium-Range Forecast (MRF) Planetary Boundary Layer (PBL) scheme was applied to obtain high resolution in the PBL (Hong and Pan, 1996). For an explicit moisture scheme, a mixed-phase option was used (Reisner et al., 1998). In order to reduce meteorological uncertainties, four-dimensional data assimilation (FDDA) was employed with 10 m Advanced Scatterometer (ASCAT) wind data. The MM5 outputs were then processed with the Meteorology-Chemistry Interface Processor utility (Byun and Ching, 1999) to derive the meteorological input variables for CMAQ simulations.
For the O3 air quality simulation using CMAQv4.6 (Byun and Ching, 1999; Byun and Schere, 2006), we used the State Air Pollution Research Center mechanism, version 1999 (SAPRC-99) chemical mechanism for gas-phase chemistry (Carter, 2000a), the AERO3 module for aerosol formation (Binkowski and Roselle, 2003), the piecewise parabolic method (PPM) for advection (Colella and Woodward, 1984), the multi-scale method for horizontal diffusion (CMAQ v4.6 Operational Guidance Document, 2006), the Asymmetric Convective Method (ACM) for cloud (Pleim and Chang, 1992), and an updated version of ACM (ACM2) for vertical diffusion (CMAQ v4.6 Operational Guidance Document, 2006). Based on these emissions and meteorological inputs, one-way nested model simulations were performed at the three domains (Fig. 1a) with a four-day spin-up period. Here, we focus on the fine and detailed domain (3 × 3 km2) in which developed urban areas and densely vegetated areas coexist. The BCs for the 3 km CMAQ modeling domain were obtained from the simulation outputs of the coarse domains (27 km down to 9 km) using the CMAQ boundary condition (BCON) processor to generate hourly concentrations along the outer lateral edges of the 3 km domain. The ICs for the 3 km domain were also obtained from the CTM results at the coarse domains.

Anthropogenic emissions modeling
For processing anthropogenic emissions, SMOKE-Asia (Woo et al., 2012) was applied to generate CMAQ-ready anthropogenic emissions for our study domain. SMOKE-Asia adopts the Sparse Matrix Operator Kernel Emissions (SMOKE) processing system of the US Environmental Protection Agency (EPA) as the base frame, but with some improved and upgraded contents. For example, SMOKE-Asia includes spatial and temporal surrogate databases, such as 38 source classification codes (SCCs), 2752 administrative division codes, and some regionalized temporal profiles. Details of SMOKE-Asia are described in Woo et al. (2012). A merged version of the INTEX 2006 (Zhang et al., 2009) and TRACE-P 2000 inventories (Streets et al., 2003) was used as a base inventory and was allocated into our study domain using spatial surrogates in SMOKE-Asia. Through further processing with temporal and chemical speciation (SAPRC-99 chemical mechanism) profiles, we generated hourly gridded CMAQ-ready emissions data for this study.
The anthropogenic volatile organic compound (VOC) and NOx emissions processed by SMOKE-Asia for the study period are shown in Table 2, and their spatial distributions are displayed in Fig. 2.

Biogenic emissions modeling
We used MEGANv2.04 (Guenther et al., 2006) to produce BVOC emission inputs to CMAQ. The BVOC flux is a function of emission factor, vegetation area, and various environmental factors:

$$\mathrm{ER} = \mathrm{EF} \times A \times \gamma \times \varphi \qquad (3)$$

where ER is the net emission rate (µg of compound h−1); EF is an emission factor that represents the net in-canopy emission rate expected at standard conditions (µg m−2 h−1 at 303 K); A is the vegetation cover area represented by the PFT (A = grid area × PFTF); γ includes environmental activity factors that account for the emission changes due to activity deviations from standard conditions, such as changes in leaf area and age (γ_F), stress due to soil moisture content (γ_S), and environmental effects (i.e., temperature and solar radiation) within the canopy (γ_W); and φ is a factor that defines chemical production and loss within plant canopies (Guenther et al., 2006).
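As a worked illustration of these definitions, the sketch below assumes the emission rate is the simple product ER = EF × A × γ × φ with A = grid area × PFTF, and evaluates it for one 3 km × 3 km grid cell. The emission factor value, the PFT fraction, and the function name are illustrative assumptions, not values from MEGAN or from this study.

```python
def emission_rate(ef, grid_area_m2, pft_fraction, gamma=1.0, phi=1.0):
    """Net emission rate (ug h-1) for one PFT in one grid cell.

    ef           : emission factor at standard conditions (ug m-2 h-1)
    grid_area_m2 : grid cell area (m2)
    pft_fraction : PFT-covered fraction of the cell, expressed as 0-1
    gamma        : combined environmental activity factor (dimensionless)
    phi          : in-canopy production and loss factor (dimensionless)
    """
    vegetation_area = grid_area_m2 * pft_fraction  # A = grid area x PFTF
    return ef * vegetation_area * gamma * phi


# Illustrative numbers only: a 3 km x 3 km cell that is half covered by one PFT,
# with a placeholder emission factor of 1000 ug m-2 h-1 at standard conditions.
er = emission_rate(ef=1000.0, grid_area_m2=3000.0 * 3000.0, pft_fraction=0.5)
print(er / 1.0e9, "kg h-1")
```

Because the emission rate is linear in A, a change in the PFT fraction of a grid cell translates proportionally into a change in the emission rate when all other factors are held fixed, which is the mechanism through which the three PFT data sets produce different BVOC inputs.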
In this study, detailed soil-moisture and plant-canopy information was not considered, so γ_S and φ were set to 1. For EF, we applied PFT-specific emission factors tabulated in the MEGAN module. For γ_W, MM5-derived solar radiation and temperature data sets were used. For the leaf area and age (γ_F) input calculation, we utilized monthly averaged LAI of vegetation-covered surfaces (LAIv) data using 2008 MODIS LAI (eight-day coverage and 1 km resolution) and each source of PFT distribution data. MEGAN uses an approach that divides the surface of each grid cell into different PFTs and non-vegetated surfaces (Guenther et al., 2006). MEGAN uses LAIv rather than LAI values to simulate the seasonal variations in leaf biomass and age distribution (Guenther and Sakulyanontvittaya, 2011; Guenther et al., 2006). MEGAN assumes that plant leaves cover only that part of the grid cell containing vegetation (Guenther et al., 2006). Thus, the LAIv calculations can only be performed with LAI at the grid cells with PFT values. LAIv is different from LAI in that LAIv is estimated by dividing the grid-average LAI by the vegetation-covered fraction. The upper limit of LAIv is set to 6 to eliminate the very high values that can be estimated for grids with very little vegetation (Guenther et al., 2006). Guenther and Sakulyanontvittaya (2011) suggested two main reasons for using LAIv in MEGAN. First, LAIv is the actual LAI of the canopy and is, thus, a more appropriate input for a canopy environment model. Second, lower and upper bounds can be placed on LAIv because vegetation-covered areas rarely have a maximum LAI of less than 0.1 or more than 10. In this study, we adopted three different sources of PFT data (i.e., KORPFT, CDP, and MODIS) and a single source of LAI data (i.e., MODIS LAI at 1 km × 1 km resolution). Raw MODIS LAI values range from 0 to 10. Following the definition of LAIv in the MEGAN literature (e.g., Guenther and Sakulyanontvittaya, 2011; Guenther et al., 2006), the MODIS LAI values were divided by the PFT-covered fraction at each grid cell (3 km × 3 km size) and converted to LAIv values (e.g., 3 km × 3 km zonal averaged LAI values/3 km × 3 km zonal averaged PFT fractional values). The LAIv calculations were performed for three different vegetation data sets (i.e., MODIS LAI-KORPFT, MODIS LAI-CDP, and MODIS LAI-MODIS PFT) independently. It is important to note that the LAIv values were not computed unless a grid cell includes both PFT and LAI values concurrently.
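The LAIv conversion described above can be sketched as follows: the grid-average LAI is divided by the vegetation-covered fraction, capped at the upper limit of 6, and left undefined where either input is missing. Representing missing data as NaN and treating a zero PFT fraction as "no PFT value" are assumptions of this sketch, as is the function name.

```python
import numpy as np


def lai_v(lai, veg_fraction, upper_limit=6.0):
    """LAI of vegetation-covered surfaces for each grid cell.

    lai          : grid-average LAI (NaN where LAI is missing)
    veg_fraction : PFT-covered fraction, 0-1 (NaN or 0 where PFT is missing)
    Cells lacking either input are returned as NaN (LAIv not computed).
    """
    lai = np.asarray(lai, dtype=float)
    frac = np.asarray(veg_fraction, dtype=float)
    out = np.full(lai.shape, np.nan)
    valid = np.isfinite(lai) & np.isfinite(frac) & (frac > 0)
    out[valid] = np.minimum(lai[valid] / frac[valid], upper_limit)
    return out


# Four illustrative cells: normal, sparse vegetation (capped), no PFT, no LAI.
print(lai_v([2.0, 3.0, 1.0, np.nan], [0.5, 0.1, 0.0, 0.4]))
```

The second cell shows why the cap matters: a modest grid-average LAI of 3 over a 10 % vegetated cell would otherwise give an LAIv of 30. The third and fourth cells show how a gap in either the PFT or the LAI data set propagates into a gap in LAIv.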
The estimated biogenic VOC and NO emissions from the MEGAN model with the three different PFT scenarios for the study period are shown in Table 2. Overall, the biogenic VOC emissions (BVOC) contribute 44.5 % of the total VOC emissions (anthropogenic + biogenic). The differences of BVOC emissions between the three PFT scenarios ranged from 1.8 Gg (MODIS-KORPFT) to 4.2 Gg (MODIS-CDP) for the study period (May-June 2008). Biogenic NOx (specifically NO) showed marginal contributions (∼ 0.4 %) to the total NOx emissions and differences (e.g., 0.02 Gg between CDP and KORPFT) between the three different PFT data sets. Investigating the distribution of the VOC/NOx ratio across the domain revealed that consistently low values of the VOC/NOx ratio were distributed across the domain for the case of anthropogenic emission only (i.e., VOC/NOx < 3), whereas noticeably increased VOC/NOx ratios were distributed over some suburban and border areas for the case of combined anthropogenic and biogenic emissions (Fig. 2).
Figure 2. Geospatial distributions of estimated anthropogenic emissions for NOx and VOCs and their ratios for the study period (May-June 2008). The VOC/NOx ratio for the "anthro-only" case includes only the anthropogenic VOCs and NOx emissions; the "anthro + bio" case includes both the anthropogenic and biogenic VOCs and NOx emissions.

Statistical measures for quantitative evaluation
We investigated the impacts of different sources of PFT distributions on CTM O3 predictions by examining the deviation of each data set (i.e., PFT, BVOC emissions, and O3) from the norm (here, their mean values). The deviation of an individual data set at each grid cell (δGV) was calculated as follows:

$$\delta \mathrm{GV}_{X|(i,j)} = \mathrm{GV}_{X|(i,j)} - \overline{\mathrm{GV}}_{(i,j)} \qquad (4)$$

where $\mathrm{GV}_{X|(i,j)}$ is the grid cell value of an individual source of PFT X at a given grid cell location (i, j), and $\overline{\mathrm{GV}}_{(i,j)}$ is the mean value for every source of PFT data at a given grid cell location (i, j). We consider the mean value as the best guess of the true values of each variable (PFT areas, biogenic emissions, or O3 concentrations) because the mean value for each set of variable data from the given scenarios describes the central tendency of those variable data sets. The effect of using different PFT data sets on CTM performances was assessed based on some statistical measures: mean bias (MB), normalized mean bias (NMB), and normalized mean error (NME):

$$\mathrm{MB} = \frac{1}{N} \sum_{n=1}^{N} (P_n - O_n) \qquad (5)$$

$$\mathrm{NMB} = \frac{\sum_{n=1}^{N} (P_n - O_n)}{\sum_{n=1}^{N} O_n} \times 100\,\% \qquad (6)$$

$$\mathrm{NME} = \frac{\sum_{n=1}^{N} |P_n - O_n|}{\sum_{n=1}^{N} O_n} \times 100\,\% \qquad (7)$$

where P is the model prediction, O is the observation, and N is the number of prediction-observation pairs. We used hourly O3, NOx, and isoprene data gathered from the ambient air quality monitoring network (Fig. 1b), which consists of 148 ambient monitoring stations (AMS) and 8 photochemical air monitoring stations (PAMS) of the National Institute of Environmental Research, to assess the CMAQ O3 predictions over the Seoul-Gyeonggi Metropolitan Area for the study period. The US EPA suggested informal performance standards for regulatory modeling practices of ±5 to ±15 % for NMB and ±30 to ±35 % for NME (Russell and Dennis, 2000).
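The deviation and performance measures can be sketched in a few lines of Python. The sketch assumes that the deviation is the simple difference between a scenario's grid value and the across-scenario mean, and that MB, NMB, and NME follow their commonly used definitions; the function names and the sample numbers are illustrative only.

```python
import numpy as np


def deviation_from_mean(fields):
    """Per-scenario deviation from the across-scenario mean at each grid cell.

    `fields` maps a scenario name to an array of grid cell values.
    """
    arrays = {name: np.asarray(values, dtype=float) for name, values in fields.items()}
    mean = np.mean(list(arrays.values()), axis=0)
    return {name: values - mean for name, values in arrays.items()}


def mean_bias(pred, obs):
    """Mean bias, in the units of the inputs."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return np.mean(pred - obs)


def normalized_mean_bias(pred, obs):
    """Normalized mean bias (%)."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return 100.0 * np.sum(pred - obs) / np.sum(obs)


def normalized_mean_error(pred, obs):
    """Normalized mean error (%)."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return 100.0 * np.sum(np.abs(pred - obs)) / np.sum(obs)


# Illustrative values for a single grid cell under the three scenarios.
dev = deviation_from_mean({"KORPFT": [10.0], "CDP": [4.0], "MODIS": [7.0]})
print({name: float(d[0]) for name, d in dev.items()})

# Illustrative predicted and observed hourly concentrations (ppb).
p, o = [30.0, 45.0], [40.0, 40.0]
print(mean_bias(p, o), normalized_mean_bias(p, o), normalized_mean_error(p, o))
```

In the second example the two errors partly cancel in MB and NMB but not in NME, which is why both a bias measure and an error measure are reported when judging model performance.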
Beginning with a performance check of CTM predictions, we consecutively investigated the effects of PFT area deviations on the results of MEGAN BVOC and CTM O 3 predictions.

Performance evaluation of CMAQ simulations
The CMAQ-simulated O3, NOx, and isoprene concentrations were compared with the observed concentrations at 148 monitoring sites in the domain. Although the number of comparisons is limited, we assessed whether the CMAQ simulations with different PFT scenarios can provide reasonable O3 concentrations in our study domain. Note that the main objective of the performance evaluation is not to determine which one of the three different PFT databases should be preferred.
Figure 3 shows a comparison between the observed and modeled O3, NOx, and isoprene concentrations for the period of 1 May-30 June 2008. The CMAQ predictions follow the observations reasonably well, with a tendency to underpredict O3 and isoprene and a tendency to overpredict NOx. It should be noted that we displayed the averaged value of the CMAQ simulations with the different PFT scenarios in Fig. 3 because the time variations of the individual concentrations were not substantially different in the graph.
In general, the underpredictions in O3 concentration could primarily be caused by the combined effect of multiple sources of uncertainty, such as those in O3 precursor (e.g., NOx and VOC) emissions, meteorological fields, and so forth. For example, an overestimation of NOx emissions and an underestimation of VOC emissions may have contributed to the overall underprediction of O3 concentration under VOC-limited conditions. The overprediction of surface wind speeds and underprediction of ambient temperature for the simulation period may also have contributed to the O3 underprediction. The overprediction of NOx concentrations is primarily due to the overestimation of anthropogenic NOx emissions, and the underprediction of isoprene concentrations is due to the combined effects of the overestimation of NOx emissions, the underestimation of VOC emissions, and the underprediction of ambient temperature.
Table 3 shows a summary of the statistical measures indicating the CMAQ performance for O3, NOx, and isoprene for each PFT scenario at 148 monitoring sites in the domain. Although the effects of using different PFT distributions on CMAQ performance were not readily recognized across the 148 monitoring sites, the mean values from the CMAQ with KORPFT are between those from the two other scenarios. This is because the O3 and isoprene concentrations from CMAQ with MODIS-based PFT are higher than those with KORPFT, and the concentrations from CDP-based CMAQ are lower than those with KORPFT (Table 3).
For O3, the overall evaluation statistics fall within the US EPA performance standards (i.e., ±5 to ±15 % for NMB and ±30 to ±35 % for NME), although CMAQ underpredicts when compared to the hourly O3 observations. Because urban sites account for about 82 % of the total data sets (121 urban sites out of 148 sites) in our comparison, the underpredictions at the urban sites contribute significantly to the overall CMAQ performance, resulting in underpredictions for the 1 h average O3 concentrations. CMAQ shows the best performance at suburban sites, followed by background, urban, and roadside sites. For example, the respective values of MB, NMB, and NME for 1 h average O3 are about −0.86 ppb, −2.14 %, and 21.64 % at the suburban sites, about −2.76 ppb, −5.48 %, and 22.86 % at the background sites, about −8.10 ppb, −23.49 %, and 34.60 % at the urban sites, and about −5.37 ppb, −23.17 %, and 41.02 % at the roadside sites.
Comparing the CMAQ O 3 for the three different PFT scenarios (i.e., KORPFT, CDP, and MODIS), the MODIS case provides slightly better agreement compared to the observations than the other cases.The mean error and correlation coefficient are almost the same for three simulations.The resultant maximum O 3 differences between different PFT scenarios were only 0.4 % across the monitoring sites.
For NO x , an important O 3 precursor, CMAQ overpredicts compared to the average value of 1 h observations for all sites, although it underpredicts at the roadside, suburban, and background sites, because of the significant contribution of the overprediction at the urban sites to the overall performance.This inaccuracy in NO x predictions may occur due to the effect of uncertain estimations of anthropogenic NO x emissions, such as overestimations at the urban sites and underestimations at the other sites.The respective values of MB, NMB, and NME for 1 h average NO x are about 14.47 ppb, 40.55 %, and 71.39 % for all sites, about 19.68 ppb, 66.22 %, and 85.34 % at the urban sites, about −8.20 ppb, −9.78 %, and 59.01 % at the roadside sites, about −4.16 ppb, −31.43 %, and 63.23 % at the suburban sites, and about −12.85 ppb, −75.16 %, and 85.12 % at the background sites.
In contrast, for isoprene, another important O 3 precursor, CMAQ underpredicts compared to the average value of 1 h observations at PAMS distributed over urban (four sites), suburban (three sites), and background (one site) areas in the domain. The respective values of MB, NMB, and NME for 1 h average isoprene are about −0.05 ppb, −21.07 %, and 63.15 % throughout all the sites. Among the three CMAQ isoprene results, CMAQ provided values closer to the observations with MODIS (MB = −0.02 ppb and NMB = −7.78 %) than with the others (MB = −0.05 ppb and NMB = −22.62 % with KORPFT; MB = −0.08 ppb and NMB = −32.82 % with CDP). For correlation, CMAQ provided slightly higher r values with KORPFT (r = 0.622) than with the others (r = 0.598 with MODIS and r = 0.591 with CDP). These results suggest that all three PFT data sets are acceptable options for regulatory modeling practices when evaluated in terms of hourly O 3 concentrations following the US EPA performance standards.
A comparison of the spatial patterns of the PFT area deviations allowed us to understand qualitative discrepancies among the sources of PFT data. Figure 4 shows the spatial distributions of the δPFT areas for each PFT scenario for the total vegetation and PFT classes (i.e., BT, NT, SB, and HB). The δPFT area of each PFT scenario in the study domain was produced based on Eq. (4). The resulting maps show several noticeable features in the δPFT area distributions:
1. KORPFT delivered larger BT covers over the Seoul and Incheon Metropolitan Areas and other regions (Fig. 4b1).
2. KORPFT delivered larger NT covers across the domain. There are hot spots over Gangwon-do and the border areas of Gaeseong and Hwanghaebuk-do (Fig. 4c1).
3. CDP lacked PFT distribution information for some islands and coastal city areas off the Incheon Metropolitan Area (the brightest yellow color area in Fig. 4a2).
4. CDP delivered comparatively larger NT- and HB-type vegetation covers that were concentrated in the Seoul and Incheon Metropolitan Areas and were widespread across the domain (Fig. 4a2, c2, d2, and e2), along with larger BT covers over the Gaeseong and Hwanghaebuk-do areas (Fig. 4b2).
5. MODIS delivered larger BT vegetation covers across the domain except for the Seoul and Incheon Metropolitan Areas and Gaeseong and some of its neighborhood areas. There are hot spots over Gangwon-do, Choongcheongbuk-do, and Choongcheongnam-do (Fig. 4b3).
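The δ fields discussed in this section can be sketched as follows. Eq. (4) is not reproduced in this section, so the sketch assumes, in line with the later description of δO 3, that each δ value is the deviation of one scenario from the average of the three scenarios in the same grid cell. The function name and the sample BT areas are hypothetical.

```python
import numpy as np

def scenario_deviation(fields):
    """fields: dict mapping scenario name -> 2-D array on a common grid.
    Returns a dict of deviations from the cell-wise scenario mean."""
    names = list(fields)
    stack = np.stack([np.asarray(fields[n], dtype=float) for n in names])
    mean = stack.mean(axis=0)  # a NaN in any scenario propagates to the mean
    return {n: stack[i] - mean for i, n in enumerate(names)}

# Hypothetical BT areas (km2) on a 1 x 2 grid, for illustration only.
bt_area = {
    "KORPFT": [[3.0, 1.0]],
    "CDP": [[2.0, 1.0]],
    "MODIS": [[4.0, 4.0]],
}
delta_bt = scenario_deviation(bt_area)
```

By construction the three deviations sum to zero in every cell, so a positive δ for one scenario is always balanced by negative δ values elsewhere.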
From the subsequent investigation of the δBVOC emission distributions, we found that the patterns from biogenic emission distributions closely resembled those from the PFT distributions in the study domain, except for the missing biogenic emission zones detected over the Seoul and Incheon Metropolitan Areas and some of their neighborhood areas (Fig. 5). A brief summary of the features that were similar to those from the δPFT distributions follows:
1. KORPFT estimated comparatively higher isoprene emissions from some border areas (between the Gyeonggi-do and Gaeseong areas and between the Gyeonggi-do and Choongcheongbuk-do areas) as well as some island areas (Fig. 5b1) due to the influence of BT cover over these areas.
2. KORPFT estimated larger and widespread terpene emissions across the domain, except for Seoul and some border areas (Fig. 5c1), due to the influence of NT cover over these areas.
3. CDP omitted biogenic emissions from some island and coastal city areas off the Incheon Metropolitan Area (Fig. 5a2, b2, c2, and d2) due to omitted PFT areas.
4. CDP estimated comparatively higher and widespread terpene and NO emissions, except for Seoul and some border areas, across the domain (Fig. 5c2 and d2) due to the influence of NT and HB cover, and higher isoprene emissions from the Gaeseong and Hwanghaebuk-do areas (Fig. 5b2) due to the influence of BT cover.
5. MODIS estimated larger isoprene emissions across the domain except for the Seoul and Incheon Metropolitan Areas and Gaeseong and some of its neighborhood areas (Fig. 5b3) due to the influence of BT cover over these areas. There are hot spots over Gangwon-do, Choongcheongbuk-do, and Choongcheongnam-do.
We observed zones of missing biogenic emissions in many locations. These zones (the red zones in Fig. 5a-d) occurred because primary input data were missing. In this study, there were two causes of missing data: PFT and LAI data omissions. Either a PFT or an LAI data omission (Fig. S1 in the Supplement) can inhibit the computation of LAIv and, in turn, the computation of biogenic emissions. The cause of the LAIv computation inhibition (i.e., missing LAIv data) differed according to the source of PFT data. For the KORPFT, most (approximately 96 %) of the missing LAIv data was due to the omitted MODIS LAI data (Fig. S1b in the Supplement). For the CDP and MODIS, approximately 60 % of the missing LAIv data was due to the omitted MODIS LAI data, and approximately 40 % was due to the omitted PFT data (Fig. S1a2 and a3 in the Supplement). In other words, the MODIS LAI data omission caused approximately 96 % of the zones with missing biogenic emissions for the KORPFT scenario and approximately 60 % for the CDP and MODIS scenarios, whereas the PFT data omission caused approximately 4 % of these zones for the KORPFT scenario and approximately 40 % for the CDP and MODIS scenarios. This lack of biogenic emissions can affect the surface-level O 3 simulations of CTM. This issue will be discussed in Sect. 3.3.
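The dependence of LAIv on both inputs can be sketched as below. The sketch assumes that LAIv is the grid-cell LAI divided by the vegetated fraction of the cell, and that a cell missing both inputs is attributed to the PFT omission; neither the attribution rule nor the function name comes from this study, and the sample values are hypothetical.

```python
import numpy as np

def laiv_with_attribution(veg_frac, lai):
    """Compute LAIv and the shares (%) of missing LAIv cells caused by
    missing LAI and by missing PFT (vegetated fraction) data."""
    veg_frac = np.asarray(veg_frac, dtype=float)
    lai = np.asarray(lai, dtype=float)
    pft_missing = np.isnan(veg_frac)
    lai_missing = np.isnan(lai)
    laiv_missing = pft_missing | lai_missing

    laiv = np.full(lai.shape, np.nan)
    vegetated = ~laiv_missing & (veg_frac > 0)
    laiv[vegetated] = lai[vegetated] / veg_frac[vegetated]
    laiv[~laiv_missing & (veg_frac == 0)] = 0.0  # no vegetation, no leaf area

    n_missing = laiv_missing.sum()
    if n_missing == 0:
        return laiv, 0.0, 0.0
    # Assumption: cells missing both inputs are counted under PFT.
    share_pft = 100.0 * pft_missing.sum() / n_missing
    share_lai = 100.0 * (lai_missing & ~pft_missing).sum() / n_missing
    return laiv, share_lai, share_pft

# Hypothetical grid cells, for illustration only.
veg = [0.5, np.nan, 0.8, 0.4, np.nan]
lai = [2.0, 3.0, np.nan, np.nan, np.nan]
laiv, share_lai, share_pft = laiv_with_attribution(veg, lai)
```

Any cell left as NaN in `laiv` would produce no biogenic emissions downstream, which corresponds to the missing emission zones described above.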

Effect of missing data on CMAQ O 3 simulations
It is worthwhile to investigate the effect of the missing data mentioned in Sect. 3.2 on biogenic emissions and O 3 predictions in the modeling domain.
Because the MEGAN model assumes that plant leaves cover only that part of the grid cell containing vegetation (Guenther et al., 2006), we supplemented the LAIv values into the grid cells that lacked LAIv but contained PFT values. We made up for each LAIv missing data location by [...] O 3 increment values for each PFT scenario occurred at an LAIv-supplemented grid cell (KORPFT and CDP scenarios) or in the neighborhood of the LAIv-supplemented grid cells (MODIS scenario). The simulated increment in hourly O 3 concentration was up to 5, 4.5, and 4 ppb for the KORPFT, CDP, and MODIS scenarios, respectively. This result suggests that missing input data for MEGAN biogenic emission modeling can cause 1 h O 3 predictions to differ substantially in space.
Although PFT area loss might be inevitable at this point in time, it must be resolved in the near future to obtain more accurate biogenic emission and chemistry transport modeling results.

PFT class-dependent BVOC emissions and their O 3 -forming potentials
The links between BVOC emissions and their O 3 -forming potentials (OFPs) were further investigated by focusing on the compositional differences and the OFPs of the three BVOC emissions groups from the different PFT scenarios (Fig. 7). From these investigations, we found that isoprene was the most important BVOC compound, showing the largest contributions to the total BVOC emission amounts and potential O 3 formation.
In the analyses, we reconstructed the 22 SAPRC-99 VOC species of MEGAN into 10 BVOC compound groups for a clearer presentation of the BVOC emission distributions. The 10 reconstructed compound groups were ALKs (sum of alkanes, i.e., ALK1-ALK5), AROs (sum of aromatics, i.e., ARO1 and ARO2), CCHO (CCHO), ETHENE (ethene), HCHO (HCHO), ISOPRENE (isoprene), MEOH (MEOH), OLEs (sum of olefins, i.e., OLE1 and OLE2), Other Ox-Orgs (sum of ACET, BALD, CCO_OH, HCOOH, MEK, RCHO, and RCO_OH), and TERPENE (terpene). To examine the OFP of each PFT scenario, we employed the maximum incremental reactivity (MIR) (Carter, 2000b). The MIR is a useful quantitative measure of the impact of a VOC on O 3 (i.e., grams of O 3 generated per gram of VOC added) under high NO x conditions, under which O 3 is most sensitive to VOCs and which represent near-source or urban areas (Carter, 2000b). The MIR was derived from several box model scenarios representing various urban areas with NO x inputs adjusted to yield maximum sensitivities of ozone to changes in VOC levels (Carter, 2000b). The MIR has been adopted in the state of California for the purpose of implementing reactivity-based regulations (CARB, 1993) and is often used as a general reactivity scale to study the impact of VOC on O 3 formation (Xie et al., 2008; Zheng et al., 2009; Carter and Seinfeld, 2012).
Among the 10 BVOC compound groups, isoprene is shown to have the largest contribution to the total BVOC emissions (about 52 %) followed by MEOH, terpene, and other VOCs (Fig. 5a).Among the different PFT scenarios applied to the MEGAN biogenic emission modeling, MODIS derived the highest isoprene emissions because of the highest BT areas, followed by KORPFT and CDP (Fig. 7b).A distinct result with the KORPFT scenario is the highest level of terpene emissions, shown in Fig. 7b, due to the larger NT areas of KORPFT (Fig. 4c1) compared to the other PFT scenarios.
By simply multiplying BVOC emissions with MIR values (Fig. 7c), we calculated domain-wide total OFPs of BVOC emissions. The calculated distributions of OFPs for each PFT scenario (Fig. 7d) were mainly affected by the spatial distributions of isoprene emissions. For example, most locations with higher OFPs overlapped with the BT and the isoprene emission hot spots in the study domain (Figs. 4b, 5b, and 7d). From the ozone-forming potential computation, we derived the following proportion for the maximum O 3 -forming potentials in the study domain: MODIS : KORPFT : CDP = 1 : 0.82 : 0.78. Assuming that, under high NO x levels, our study domain experiences an optimal balance between VOC and NO x to generate O 3, this rough estimation implies that a maximum difference of about 22 % in the O 3 concentrations can occur between the different PFT scenarios. For example, when the MODIS scenario estimates 60 ppb of domain-average O 3 concentration, the probable O 3 concentrations from the KORPFT and CDP scenarios would be 49.2 and 46.8 ppb, respectively. There are two points to be noted here because the MIR values used for OFP calculations are from the chamber experiments (Carter, 2000b). Firstly, the proportional expression (i.e., MODIS : KORPFT : CDP = 1 : 0.82 : 0.78) we obtained is only a rough one, and it is not applicable to the relevant atmospheric conditions for O 3 formation over some areas. From Fig. 7d, we can easily recognize that the estimated proportional expression is not applicable to some areas, e.g., areas with missing BVOC reactivity zones due to the lack of BVOC emissions and areas with small OFP gaps between the different PFT scenarios. Secondly, the MIR-scale-based OFP calculation may not duplicate the relevant atmospheric conditions and chemistry of VOC associated with O 3 formation for the period and domain of this study. Thus, we can only expect that the areas with higher OFP levels have a greater possibility of experiencing higher levels of O 3 formation in the presence of higher NO x conditions. This MIR-OFP application issue will be discussed further in Sect. 3.6.
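The OFP arithmetic described above can be sketched as a sum of emission-times-MIR products, followed by the proportional scaling used in the 60 ppb example. The emission amounts and MIR values below are hypothetical placeholders chosen for readability, not the values of Carter (2000b) or of this study; only the ratio 1 : 0.82 : 0.78 and the 60 ppb reference are taken from the text.

```python
def ozone_forming_potential(emissions, mir):
    """Sum of emission x MIR over compound groups (g O3 if emissions are in g VOC)."""
    return sum(emissions[group] * mir[group] for group in emissions)

# Hypothetical placeholder inputs, for illustration only.
mir_placeholder = {"ISOPRENE": 10.0, "TERPENE": 4.0, "MEOH": 0.5}
emis_placeholder = {"ISOPRENE": 50.0, "TERPENE": 10.0, "MEOH": 20.0}
ofp = ozone_forming_potential(emis_placeholder, mir_placeholder)

# Share of the total OFP contributed by each compound group.
ofp_share = {
    group: 100.0 * emis_placeholder[group] * mir_placeholder[group] / ofp
    for group in emis_placeholder
}

# Proportion of maximum OFPs reported in the text, applied to a 60 ppb
# domain-average O3 concentration for the MODIS scenario.
ofp_ratio = {"MODIS": 1.0, "KORPFT": 0.82, "CDP": 0.78}
scaled_o3 = {name: 60.0 * ratio for name, ratio in ofp_ratio.items()}
```

The scaling step reproduces the 49.2 and 46.8 ppb figures only under the stated assumption that O 3 responds proportionally to OFP, which the two caveats above caution against over much of the domain.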
Despite the mentioned limitations of MIR-OFP, this approach helps to evaluate the relative importance of biogenic VOC compounds in the production of ground-level ozone in the current study domain. In our OFP calculation, isoprene and terpene emissions were the first and second largest contributors to the O 3 concentrations (about 79 % contribution by isoprene and 9 % contribution by terpene). The result suggests that the primary and secondary BVOC species of concern are isoprene and terpene because of their high reactivities and large emissions across the domain.

Spatial relationship between δPFT and δO 3
The results in Sects. 3.2-3.4 suggest some qualitative and quantitative evidence of a causal link between the δPFT areas, δBVOC emission estimations, and δO 3 predictions. To analyze this link more quantitatively, we performed a spatial regression analysis with each gridded data set in a vegetated region by applying the geographically weighted regression technique with a locally compensated ridge (LCR-GWR) (Brunsdon et al., 1996; Fotheringham et al., 2002). The LCR-GWR is a useful method to address spatially varying relationships between the dependent and independent variables and to reduce the problem of collinearity among the explanatory (independent) variables (Brunsdon et al., 2012; Lu et al., 2013; Gollini et al., 2013). The regression models that include δBT, δNT, δSB, and δHB as independent variables were fitted by LCR-GWR techniques (refer to Appendix A). As the counterpart, the parameters of the ordinary least squares (OLS) model were also estimated. The main objective of this spatial regression analysis was to show that changes in O 3 concentrations in the modeling grid cells are influenced by the changes in PFT distributions in the modeling grid cells. Because of the short photochemical lifetimes of BVOC (e.g., isoprene ∼ 2 h) (Atkinson and Arey, 2003), many studies used the assumption that the emitted BVOCs from local biogenic emission sources (i.e., PFT area distributions) immediately affect the levels of local surface O 3 concentrations rather than move over a long distance. However, this assumption may not be appropriate for our spatial regression analysis. Because of the relatively high wind speed (mean wind speed at 10 m altitude = 3.7 m s −1 ) over our finer modeling grid system (3 km × 3 km), isoprene emissions generally may not fully react in the grid in which they are emitted. The GWR technique was applied to appropriately address the relationships between those non-quantified O 3 variations with spatial dependency and distributed spatial effects of biogenic sources. Specifications of OLS and LCR-GWR are briefly described in Appendix A, and the regression modeling was conducted in the following three steps:
1. Fit global OLS models to the data sets of a dependent variable (δO 3 ) and four explanatory variables (i.e., δBT, δNT, δSB, and δHB) for the three PFT scenarios.
2. Fit LCR-GWR models to the same data sets.
3. Assess the models from Steps 1 and 2.
In the LCR-GWR simulation, the GWR model (Eq. A3) was calibrated using dual optimization of both the bandwidth and ridge terms. An optimal bandwidth was selected by leave-one-out cross-validation (Bowman, 1984). Using a Gaussian kernel with the optimal bandwidth, the weighting matrix of GWR was specified. At the locations where collinearity among the independent variables affects the corresponding local coefficient estimates (local condition number > 30), GWR applied a local ridge regression. The functions used in the analysis with OLS and LCR-GWR are included in the GWmodel R package (Lu et al., 2013; Gollini et al., 2013).
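The calibration described above can be illustrated with a simplified NumPy sketch. This is not the GWmodel implementation used in the study: it assumes a fixed ridge value that is switched on wherever the local condition number exceeds 30, whereas the study optimized the ridge term together with the bandwidth. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def gaussian_weights(coords, target, bandwidth):
    """Gaussian kernel weights of all grid cells relative to one target cell."""
    dist = np.linalg.norm(coords - target, axis=1)
    return np.exp(-0.5 * (dist / bandwidth) ** 2)

def local_fit(X, y, weights, ridge=0.0, cn_threshold=30.0):
    """Weighted least squares at one location; the ridge term is applied
    only when the local condition number exceeds cn_threshold."""
    design = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(weights)
    Xw = design * sw[:, None]
    yw = y * sw
    # Condition number of the weighted design matrix with unit-length columns.
    cond = np.linalg.cond(Xw / np.linalg.norm(Xw, axis=0))
    penalty = np.eye(design.shape[1]) * (ridge if cond > cn_threshold else 0.0)
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    beta = np.linalg.solve(Xw.T @ Xw + penalty, Xw.T @ yw)
    return beta, cond

def loo_cv_score(coords, X, y, bandwidth, ridge=0.0):
    """Leave-one-out cross-validation score (sum of squared prediction errors)."""
    score = 0.0
    for i in range(len(y)):
        w = gaussian_weights(coords, coords[i], bandwidth)
        w[i] = 0.0  # exclude the cell being predicted
        beta, _ = local_fit(X, y, w, ridge)
        score += (y[i] - (beta[0] + X[i] @ beta[1:])) ** 2
    return score

def select_bandwidth(coords, X, y, candidates, ridge=0.0):
    """Pick the candidate bandwidth with the lowest cross-validation score."""
    scores = [loo_cv_score(coords, X, y, b, ridge) for b in candidates]
    return candidates[int(np.argmin(scores))]

# Synthetic 5 x 5 grid with a coefficient that varies from west to east.
rng = np.random.default_rng(0)
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
coords = np.column_stack([gx.ravel(), gy.ravel()])
d_bt = rng.normal(size=(25, 1))                    # stand-in for δBT
d_o3 = (0.5 + 0.2 * coords[:, 0]) * d_bt[:, 0]     # stand-in for δO3
best_bw = select_bandwidth(coords, d_bt, d_o3, [1.0, 2.0, 10.0])
beta_center, cond_center = local_fit(
    d_bt, d_o3, gaussian_weights(coords, coords[12], best_bw)
)
```

Repeating `local_fit` for every grid cell yields one coefficient vector per location, which is what allows the spatially varying coefficient maps to be drawn; a global OLS fit corresponds to the limit of a very large bandwidth.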
The estimated coefficients of δO 3 with respect to δBT, δNT, δSB, and δHB, as well as regression model diagnostic information for OLS and LCR-GWR, are summarized in Table 4.
The OLS provided a global relationship of δO 3 with δBT, δNT, δSB, and δHB. The OLS for KORPFT demonstrates that gaining the areas of the two major biogenic isoprene sources (i.e., +0.016 · δBT and +0.001 · δSB) and the one major biogenic terpene source (i.e., +0.002 · δNT) contributes to O 3 increase, whereas gaining the area of the major soil NO x source (i.e., −0.001 · δHB) contributes to O 3 decrease across the domain.
The OLS for CDP suggests that gaining areas of BT contributes to O 3 increase, whereas gaining NT, SB, and HB areas contributes to O 3 decrease across the domain.
The OLS for MODIS indicates positive contributions from gaining BT, SB, NT, and HB areas to O 3 increase in the domain.
However, these OLS parameter estimates of δO 3 with respect to δBT, δNT, δSB, and δHB may have poor applicability in some locations because these models ignore the local spatial effects. The summarized LCR-GWR results reveal the improvements of the model fits and calibrated relationships between model parameters (i.e., δO 3 and δPFTs).
LCR-GWR models provide better fits than the OLS models for the PFT scenarios, where the bias-corrected Akaike information criterion (AICc) values have been reduced by 7666 for KORPFT, 9317 for MODIS, and 9678 for CDP. As expected, the parameter estimates with LCR-GWR show spatially varying impacts of δBT, δNT, δSB, and δHB on δO 3 in the study domain (Fig. S2 in the Supplement). The relationship of δO 3 with δBT, δNT, δSB, and δHB revealed by the summarized coefficient estimates of LCR-GWR for the three PFT scenarios parallels that of the OLS for KORPFT: gaining BT, SB, and NT areas contributes to O 3 increase, whereas gaining HB area contributes to O 3 decrease in the domain.
To assess the prediction accuracy of the OLS and LCR-GWR models, we compared the δO 3 values predicted by each regression model in turn with those derived from CMAQ outputs in Sect. 3.3. We performed local assessments at the missing data zones (the black squared zones in Fig. S2 in the Supplement). Because the parameter values for the LCR-GWR model were not estimated in the missing data zones, we used the parameter values nearest to the given missing data locations for the LCR-GWR model prediction. Results of the local assessment of model prediction accuracy are shown in Fig. 8 and Table 5. The prediction accuracy for the LCR-GWR model was greater than that for the OLS model, which suggests that stationary relationships from OLS do not fit the local situation. Among the three LCR-GWR fits, the most accurate δO 3 predictions were found with KORPFT (r 2 = 0.57, SEE = 0.0081, and F = 356.4), followed by MODIS (r 2 = 0.4, SEE = 0.0087, and F = 112.6) and CDP (r 2 = 0.33, SEE = 0.0086, and F = 79.8).

Impact of PFT distribution differences on hourly O 3 predictions
From the investigation of the daytime O 3 episodes, we found noticeable deviation patterns of hourly CMAQ O 3 predictions with different PFT scenarios in our modeling domain.
Table 4. The relationship between δO 3 deviations and the corresponding δPFT area across the domain derived from the OLS and LCR-GWR models. The third to seventh columns show the estimated coefficients of the explanatory variables and the intercept of the regression models. The standard errors of the coefficient estimates by OLS and the minimum and maximum values of the coefficient estimates by LCR-GWR are enclosed by square brackets. The 8th column shows the adjusted coefficient of determination representing the explanatory power of the regression model. The 9th to the 11th columns show test statistics for the goodness of the regression model fits. A higher F statistic, a lower RSS, and a lower AICc value indicate a better fit of the regression model, and a lower p value indicates greater significance of the regression model. The 12th column shows the discrepancy of AICc between OLS and LCR-GWR fits. A higher reduction (at least over 3) indicates greater benefits of moving from an OLS to an LCR-GWR. The 13th column shows the number of the fitted data. It should be noted that the data used for this regression analysis are from each of the individual modeling grid locations where both PFT and LAI values are not missing. The significance levels of the parameters and the model fits are (***) for p value < 0.001, (**) for 0.001-0.01, (*) for 0.01-0.05, (•) for 0.05-0.1, and ( ) for > 0.1. RSS: residual sum of squares. N.D. stands for "not determined".
[...] KORPFT, CDP, and MODIS) from the average of these three CMAQ O 3 distributions (see Eq. 4). The δO 3 distributions show that MODIS usually develops higher positive deviations (Fig. 9a11-a15 and 9b11-b15) while CDP usually develops higher negative O 3 deviations (Fig. 9a6-a10 and 9b6-b10) across the domain. The KORPFT develops both negative and positive O 3 deviations (Fig. 9a1-a5 and 9b1-b5) across the domain. Interestingly, the patterns of these negative or positive O 3 deviations develop in concert with enhancements in ambient O 3 concentration. The 29 May 2008 episode shows relatively high concentrations of the simulated O 3 over some island and coastal city areas off the Incheon Metropolitan Area. At the time, the Incheon coastal and neighborhood areas were located downwind of the Incheon industrial complex and the Gyeonggi-do and Seoul Metropolitan Areas, where anthropogenic NO x emissions are very strong (emission ratio for VOC / NO x < 3; see Fig. 2d), and suffered a temperature inversion that limited the vertical mixing of pollutants (see the grey-hatched zones in Fig. 9a). While the released BVOC was trapped within the inversion layer, the easterly winds brought anthropogenic NO x -rich air from the Incheon industrial complex and some areas further inland (e.g., Gyeonggi-do and Seoul) to the inversion area. With the temperature inversion, mixing between the local BVOC emissions and the intruded anthropogenic NO x and VOC produced high concentrations of O 3 through photochemical reaction. Eventually, the Incheon coastal and neighborhood areas suffered consistently high O 3 development. The most distinctive feature of the δO 3 pattern on 29 May 2008 is the level of deviations of O 3 (maximum difference is up to 7 ppb: 3 ppb of δO 3 for MODIS and −4 ppb of δO 3 for CDP at 03:00 p.m.) that developed and consistently remained in the temperature inversion zone (i.e., the Incheon coastal and neighborhood areas). The location of this deviation is coincident with the location of the PFT deviation in Fig. 4 and the BVOC deviation in Fig. 5. The high negative deviation of CDP O 3 is associated with the influence of the missing PFT areas of CDP (see Figs. 4a2, b2, c2, d2, and e2) on the CMAQ O 3 predictions over these areas. The high positive deviation of O 3 with the MODIS scenario is associated with the impact of the larger PFT areas (e.g., δBT ∼ 1.5 km 2 and δNT ∼ 2.1 km 2 ) of MODIS on the CMAQ predictions over the area (see Fig. 4b3 and c3, Fig. 5b3 and c3, and Fig. 7b).
The 30 June 2008 episode shows more widespread concentrations of the simulated O 3 throughout the domain. At the time, most of the high O 3 concentration regimes (e.g., see the O 3 contour in Fig. 9b3-b4) were located downwind from higher anthropogenic NO x emission source areas (Fig. 2a). The anthropogenic NO x emissions transported from urban center areas affected several border areas (e.g., the border areas between Seoul and Gyeonggi-do, between Gyeonggi-do and Gaeseong, between Gyeonggi-do and Gangwon-do, and between Gyeonggi-do and Choongcheongnam-do and Choongcheongbuk-do, etc.) and some suburban areas (e.g., the southern part of Gyeonggi-do and the northern part of Choongcheongnam-do) where abundant hydroperoxy and organic peroxy radicals are generated through the oxidation of large amounts of locally emitted BVOC, resulting in high O 3 concentrations upon photolysis. Although the Incheon coastal area and the neighborhood areas underwent a temperature inversion similar to the 29 May 2008 episode, these areas were not consistently affected by anthropogenic NO x transported from inland areas due to a wind direction change (i.e., easterly to westerly). One of the most noticeable features of the δO 3 pattern for 30 June 2008 consists of the strong deviations of O 3 that develop over Gangwon-do (difference is up to 10 ppb: −4 ppb of δO 3 for KORPFT and 6 ppb for MODIS) and Choongcheongbuk-do (differences of 9-10 ppb: −4 ppb of δO 3 for KORPFT, −3 ppb for CDP, and 6 ppb for MODIS) at 05:00 p.m.
Another noticeable feature consists of the consistent deviations, which are not small (difference up to 5 ppb: ∼ −2 ppb of δO 3 for CDP and ∼ 3 ppb for MODIS), that develop over the border areas between East Seoul and Gyeonggi-do. Figures 4 and 7 show that these higher O 3 deviations are associated with higher deviations in biogenic isoprene emissions that occur due to the higher difference of BT area among the different PFT scenarios (e.g., the positive δBT area of MODIS (∼ 3.5 km 2 ) and the negative δBT areas of other PFTs (∼ −3 km 2 )).
Interestingly, the 9-10 ppb of difference between CMAQ O 3 with the different scenarios, detected in the 30 June 2008 episode, is about 11 % (i.e., [10 ppb of CMAQ O 3 difference]/[90 ppb of CMAQ O 3 ] × 100). This corresponds roughly to half of the maximum difference (i.e., 22 %) estimated based on application of the MIR-OFP approach in Sect. 3.4. As mentioned earlier in Sect. 3.4, this gap could be the result of differences between the assumed CMAQ atmospheric conditions for surface level O 3 prediction and the chamber simulation conditions for MIR estimation: the O 3 simulations were conducted for the period of May-June 2008 under much narrower distributions of NO x and temperature in the atmosphere (domain-averaged NO x range: 0-24 ppb; domain-averaged temperature range: 6-26 °C) compared to those of the chamber experiment for MIR developments (chamber environment NO x range: 150-1000 ppb; chamber experiment temperature: 22-43 °C) (refer to Table B-1 in Carter, 2000b). This indicates that our CMAQ atmospheric conditions for photochemical O 3 formation were not fully developed to derive the maximum O 3 reactivities of BVOC emissions from different PFT scenarios. Thus, it can be said that the MIR-OFP approach is crucially dependent on pollution episodes (Carter and Seinfeld, 2012). The earlier discussions regarding MIR-OFP and the results in this section could have important implications for designing and implementing biogenic emission estimation and air quality management strategies in this region (the Seoul, Gyeonggi, and Incheon Metropolitan Areas). For example, as described in the literature (Zheng et al., 2009; Carter and Seinfeld, 2012), the distributions of MIRs-OFPs for BVOC compounds can be affected more dynamically by meteorological factors (e.g., wind direction, temperature, light intensity, etc.) and the locations and magnitudes of source emissions in this region. This discussion points out that developing region-specific reactivity scales for BVOC can be an important tool to better characterize the impact of ozone precursor emissions on regional ozone formation mechanisms and can support the O 3 air quality management strategies in this region. In this sense, an accurate BVOC inventory based on accurate PFT distribution information would be an essential prerequisite to yield appropriate MIR-OFP information in this region.

Diurnal variation in the maximum difference between the simulated O 3
To investigate the impact of the different PFT scenarios on CMAQ O 3 prediction results more accurately, we conducted a number of iterative adjustments for various input-variable data sets (e.g., meteorological variables, boundary conditions, anthropogenic NO x, and isoprene emissions) in CMAQ simulations for June 2008. One of the most important tasks was to fix CMAQ-predicted NO x concentrations at the level of observed NO x concentrations. We applied observation data for pollutants and meteorological variables to support these tasks. In this work, we did not consider the spatial variation of the adjustment factors because we assumed that meteorological and chemical variables in the modeling domain are systematically biased. Thus, we applied a common set of temporally varying adjustment factors to every individual modeling grid. Specific adjustment methods are described in Appendix B, and the adjustments were conducted in the following four steps:
1. Using averaged temperature and wind speed data from 11 meteorological monitoring stations (MET in Fig. 1b) maintained by the Korea Meteorological Administration, we corrected meteorological inputs for MEGAN biogenic emission and CMAQ modeling (i.e., temperature and wind speed). Every missing value in observation data was filled by data interpolation.
2. Boundary conditions (BCON) of the simulation domain 3 were adjusted by the hourly varying ratios of observations and corresponding CMAQ concentrations for isoprene, NO x, and O 3. Here, the CMAQ concentrations represent the averaged values of the CMAQ simulations with the different PFT scenarios. This adjustment was conducted once. Using the adjusted BCON, we performed CMAQ simulations. Every missing value in observation data was filled by interpolation.
3. Using the output from Step 2, we adjusted anthropogenic emission species (isoprene and NO x ) by applying the hourly varying ratios of observations and corresponding CMAQ concentrations for isoprene and NO x. After emission adjustment, we conducted CMAQ simulation. These emission adjustments and CMAQ simulations were conducted iteratively until the specified criterion was satisfied. Although the optimal criterion was NMB = 0 %, we set a relatively moderate criterion because of the limitation of model computing time and storage.
4. When the prespecified convergence criterion in Step 3 was satisfied, the anthropogenic isoprene emissions were adjusted using a single ratio of the mean observed isoprene concentration to the corresponding CMAQ mean isoprene concentration to bring the NO x / isoprene ratio derived from CMAQ predictions closer to that from observations.
The result of Step 3 is shown in Fig. S3 in the Supplement. After five iterative simulations applying hourly varying adjustment factors for emissions, the specified criterion was satisfied, with the NMB of CMAQ NO x at approximately −7.4 %. Furthermore, the corresponding MB (mean bias in Eq. 5) of O 3 for each PFT scenario and the mean CMAQ NO x / isoprene ratio are shown in Fig. S4a in the Supplement.
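The iterative emission adjustment of Step 3 can be sketched as a loop that rescales emissions by hourly observation-to-model ratios until the NMB falls within a tolerance. The sketch makes several assumptions that are not from the study: `toy_model` is a hypothetical linear stand-in for a CMAQ run, the 10 % tolerance is a placeholder because the actual criterion is not stated here, and all names and values are illustrative.

```python
import numpy as np

def nmb(model, obs):
    """Normalized mean bias in percent."""
    return 100.0 * (model - obs).sum() / obs.sum()

def hourly_ratio(obs, model, hours):
    """Ratio of mean observed to mean modeled concentration per hour of day."""
    factors = np.ones(24)
    for h in range(24):
        sel = hours == h
        if sel.any() and model[sel].mean() > 0:
            factors[h] = obs[sel].mean() / model[sel].mean()
    return factors

def adjust_emissions(emis, obs, hours, run_model, tol_pct=10.0, max_iter=10):
    """Rescale emissions with hourly factors until |NMB| <= tol_pct."""
    for n_iter in range(max_iter + 1):
        conc = run_model(emis)
        bias = nmb(conc, obs)
        if abs(bias) <= tol_pct or n_iter == max_iter:
            return emis, bias, n_iter
        # One common set of hourly factors is applied to every record.
        emis = emis * hourly_ratio(obs, conc, hours)[hours]

def toy_model(emis):
    """Hypothetical stand-in for a CMAQ run: background plus linear response."""
    return 10.0 + 0.5 * emis

# Two synthetic days of hourly data, for illustration only.
hours = np.arange(48) % 24
true_emis = 20.0 + 10.0 * np.sin(2.0 * np.pi * hours / 24.0)
obs_nox = toy_model(true_emis)
first_guess = 2.0 * true_emis  # deliberately overestimated emissions
adj_emis, final_nmb, n_runs = adjust_emissions(first_guess, obs_nox, hours, toy_model)
```

Because the stand-in model includes a background term, a single rescaling does not remove the bias, which mirrors why several adjustment and simulation cycles were needed in the study.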
The result of Step 4 is shown in Fig. S4 in the Supplement. Through four iterative simulations with averaged adjustment factors for anthropogenic isoprene emissions, we explored the variation of the MB of CMAQ O 3 and the CMAQ NO x / isoprene ratio (Fig. S4b-d and f in the Supplement). Among four different simulations (Fig. S4b-d and f), we chose the Fig. S4f simulation case for the remainder of the investigation of the impact of the different PFT scenarios on CMAQ O 3 predictions because the CMAQ NO x / isoprene ratio from the Fig. S4f case (= 101.29) is closer to the observed NO x / isoprene ratio (= 101.28) compared with other cases.
After the CMAQ-predicted NO x values and NO x / isoprene ratio were adjusted by following the procedure described above, we investigated the impacts of the different PFT scenarios on CMAQ O 3 prediction results. As a result, diurnal distributions of simulated hourly maximum O 3 differences between different PFT scenarios in subregions are shown in Fig. 10. This figure enables the characterization of the periodic impact patterns in everyday CMAQ predictions with different PFT scenarios in the study domain.
In the subregions, the simulated maximum O 3 differences between different PFT scenarios have an asymmetric diurnal distribution pattern (skewed to the left: lower tail is longer than the upper tail). The maximum O 3 difference values in the range (of the whiskers) are low in the early morning (06:00 a.m. to 09:00 a.m.), rise gradually (reaching up to 2 ppb in the Choongcheong regions) during the day with peaks at 05:00 p.m., and then gradually decrease. Although the extreme values (outside the range of the whiskers) of the simulated maximum O 3 differences similarly show an asymmetric diurnal distribution pattern, these distributions have relatively steeper variations, with peaks at 05:00 p.m. at approximately 13 ppb in the Seoul, Incheon, and Gyeonggi regions.
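The diurnal summary described above can be sketched as a two-step computation: take the largest minus the smallest O 3 value across the PFT scenarios for each hour and grid cell, then group the results by hour of day. The array layout, function names, and sample values are hypothetical and serve only as an illustration of the calculation.

```python
import numpy as np

def max_scenario_difference(o3):
    """o3 has shape (n_scenarios, n_hours, n_cells); returns the spread
    (maximum minus minimum across scenarios) for each hour and cell."""
    o3 = np.asarray(o3, dtype=float)
    return o3.max(axis=0) - o3.min(axis=0)

def diurnal_summary(diff, hours):
    """Quartiles and maximum of the scenario spread for each hour of day."""
    hours = np.asarray(hours)
    summary = {}
    for h in range(24):
        values = diff[hours == h].ravel()
        if values.size:
            q1, median, q3 = np.percentile(values, [25, 50, 75])
            summary[h] = {"q1": q1, "median": median, "q3": q3, "max": values.max()}
    return summary

# Hypothetical O3 (ppb): 3 scenarios x 2 hours x 2 grid cells.
o3_sim = [
    [[40.0, 50.0], [60.0, 70.0]],
    [[42.0, 49.0], [55.0, 70.0]],
    [[41.0, 53.0], [62.0, 71.0]],
]
hour_of_day = [6, 17]
spread = max_scenario_difference(o3_sim)
by_hour = diurnal_summary(spread, hour_of_day)
```

The quartiles correspond to the box of a box plot, while the per-hour maximum captures the extreme values that lie outside the whiskers.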
The resultant high differences in the CMAQ 1 h O 3 predictions with the different PFT scenarios could have important implications for air quality decisions and human health studies. For example, the Korea O 3 alert system provides a warning at 120 ppb and an alarm at 300 ppb for 1 h O 3, and the Korea NAAQS (national ambient air quality standard) values for 1 h O 3 and 8 h O 3 are 100 ppb and 60 ppb, respectively. In the short term, the highly biased forecasting (e.g., bias up to 13 ppb) of high-O 3 episodes may result in the incorrect issuance of O 3 alerts. Moreover, such inaccurate O 3 forecasting may result in the suggestion of incorrect regulatory design values to O 3 air quality decision supporting authorities. Ji et al. (2011) reported that emergency hospitalizations for total respiratory disease increased by about 3 % per 10 ppb 24 h O 3 among the elderly. This result suggests a critical point at which chemistry transport modeling with highly biased PFT distribution scenarios would predict highly biased O 3 concentrations and subsequently provide misleading information for studying the relationship between O 3 air quality and human health outcomes.
Another point of concern is the impact of uncertain meteorological variables, especially temperature. Figure 11 shows the deviation tendencies of the simulated BVOC emissions and O 3 concentrations as a function of temperature for the three PFT scenarios. KORPFT shows a comparatively gentle declining tendency for both δBVOC and δO 3 , whereas the other two PFT scenarios show steeper increasing (MODIS) and decreasing (CDP) tendencies for both δBVOC and δO 3 as temperature increases. These diverging tendencies can be expected to grow (i.e., a larger positive bias for MODIS and a larger negative bias for CDP) as the weather warms with the seasonal change from early summer to midsummer. Assuming that ambient temperature increases due to climate forcing in the future (IPCC, 2007), the use of different PFT distribution data requires considerable caution, because the resulting uncertainties in O 3 predictions may range from mild to severe.
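One way to quantify how quickly such deviations grow with temperature is a log-linear fit. The sketch below assumes a deviation of the form δ(T) = a · exp(b · T), which is an illustrative functional form chosen here and not a model stated by the authors; the function name and inputs are likewise hypothetical.

```python
import numpy as np

def fit_exponential(temp_c, delta):
    """Fit delta = a * exp(b * T) by linear regression on log|delta|.

    temp_c: temperatures (deg C); delta: deviations of a single sign
    (e.g., all positive for one scenario, all negative for another).
    Returns (a, b); a carries the sign of the deviations.
    """
    temp_c = np.asarray(temp_c, dtype=float)
    delta = np.asarray(delta, dtype=float)
    if not (np.all(delta > 0) or np.all(delta < 0)):
        raise ValueError("delta must be nonzero and of a single sign")
    sign = 1.0 if delta[0] > 0 else -1.0
    # log|delta| = log|a| + b * T, so a degree-1 polynomial fit suffices
    slope, intercept = np.polyfit(temp_c, np.log(np.abs(delta)), 1)
    return sign * np.exp(intercept), slope
```

A larger fitted `b` indicates a scenario whose bias grows faster with warming; comparing `b` across scenarios gives a single-number summary of the divergence.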

Conclusions
The CMAQ performance check suggests that all three PFT data sets are acceptable options for regulatory modeling practice when hourly O 3 concentrations are evaluated against the US EPA performance standards.
The investigation of the δPFT areas and δBVOC emission distributions for each PFT scenario (KORPFT, CDP, and MODIS) clearly showed similar patterns in the PFT and BVOC emission distributions. All three PFT scenarios showed that broadleaf trees (BTs) were the most significant contributor to the total BVOC emissions, followed by needleleaf trees (NTs), shrubs (SB), and herbaceous plants (HBs). Furthermore, isoprene from BTs and terpene from NTs were identified as the primary and secondary BVOC species of interest, respectively, in terms of potential O 3 level increases in the study domain.
The observed lack of primary data (i.e., PFT and LAI) in many geographic locations had a noticeable effect on the CMAQ 1 h O 3 predictions. This lack of data can cause spatially erroneous CMAQ O 3 prediction results. Thus, we suggest that the issue of missing primary data be recognized as a current limitation of MEGAN biogenic emission modeling and chemistry transport model O 3 simulations, and it must be resolved in the near future.
An LCR-GWR analysis with different PFT data (δO 3 vs. δPFTs) suggests that the addition of BT, SB, and NT areas can contribute to an O 3 increase, whereas the addition of HB area contributes to an O 3 decrease in the domain.
An assessment of the prediction accuracy of the LCR-GWR model fits for each PFT scenario showed that KORPFT provides the best explanation of the relationship between PFT, BVOC emissions, and surface-level O 3 changes, followed by MODIS and CDP.
The temporally and spatially averaged effects of the different PFT distributions on the CMAQ O 3 simulation results can be regarded as marginal because the typical difference among CMAQ O 3 simulations with different PFT scenarios is less than 0.4 ppb. However, the hourly and local impacts are quite noticeable, with occasional O 3 differences of up to 13 ppb. The simulated maximum 1 h O 3 differences between the PFT scenarios show an asymmetric diurnal distribution pattern (low in the early morning, rising during the daytime, peaking at 05:00 p.m., and decreasing during the nighttime) in the study domain.
Exponentially diverging hourly BVOC emissions and O 3 concentrations were found as a function of temperature change in our modeling domain. Thus, we conclude that the PFT distributions could be a large source of uncertainty in hourly O 3 air quality modeling (or forecasting) that supports air quality decision-making and human health studies. The higher the ambient temperature applied to the air quality simulation, the larger the likely bias related to the PFT distributions.
The Supplement related to this article is available online at doi:10.5194/acp-14-7461-2014-supplement.

Figure 3. Evaluation of CMAQ results for hourly O 3 concentrations. Note that "CTM Mean" represents the average of the CMAQ predicted O 3 concentrations from the three different BVOC scenarios, and "Obs" represents the average of measured O 3 concentrations at monitoring stations across the domain.

Figure 4. Spatial distribution of the PFT area deviations for each PFT scenario. The mean spatial distributions for PFT total and the PFT classes (BT, NT, SB, and HB) were derived by averaging the three different distribution data sources.

Figure 6. Longitudinal distribution of the incremented ozone concentrations due to the effect of additional BVOC emissions for each PFT scenario via LAIv input supplementation for MEGAN modeling. A filled red circle denotes an O 3 increment in an LAIv-supplemented grid cell. A filled blue circle indicates an O 3 increment in a normal grid cell.

Figure 7. Composition and distribution of the estimated BVOC emissions, along with the spatial distribution of the estimated net ozone-forming potentials of biogenic VOC emissions. The total BVOC in (a) was derived by averaging the total BVOC emissions of the three sources of data. The 10 BVOC compound groups in (b) are arranged in descending order of their O 3 reactivity, i.e., the MIR shown in (c). The distributed net ozone-forming potentials in (d) were calculated by aggregating the O 3 -forming potentials (OFPs) of the 10 BVOC compound groups, each obtained by multiplying the emissions by the corresponding MIR value at each grid cell. The red zones denote the zones missing BVOC reactivity because biogenic emissions were lacking during the OFP calculations.
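The OFP aggregation described in this caption (emissions multiplied by MIR, summed over compound groups at each grid cell) can be sketched as below. The function name, the array layout, and the rule used to flag cells with no biogenic emissions are assumptions for illustration; real MIR values would come from a published reactivity scale and are not reproduced here.

```python
import numpy as np

def net_ofp(emissions, mir):
    """Net ozone-forming potential per grid cell.

    emissions: array (n_groups, ny, nx) of BVOC emissions per compound group
    mir:       array (n_groups,) of MIR values, one per compound group
    Returns (ofp, missing): ofp has shape (ny, nx); missing is a boolean
    mask of cells where no group has a positive emission (assumed rule).
    """
    emissions = np.asarray(emissions, dtype=float)
    mir = np.asarray(mir, dtype=float)
    if emissions.shape[0] != mir.shape[0]:
        raise ValueError("one MIR value is required per compound group")
    # Sum over groups of MIR_i * E_i at every grid cell
    ofp = np.tensordot(mir, emissions, axes=(0, 0))
    missing = ~np.any(emissions > 0, axis=0)
    return ofp, missing
```

The `missing` mask plays the role of the flagged zones in panel (d): cells where the OFP is zero only because no emission input was available.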
Figure 9a and b show the time change of the spatial distributions of δO 3 and mean O 3 concentrations in the surface layer for the daytime of 29 May 2008 and 30 June 2008, respectively. The δO 3 distributions were produced by subtracting the hourly CMAQ O 3 distributions for each scenario (i.e.,

Figure 9. Spatial distributions of CMAQ O 3 hourly deviations for each BVOC emission scenario and CMAQ mean O 3 hourly concentrations. The CMAQ mean O 3 hourly concentrations are the averaged values of CMAQ O 3 concentrations for the three PFT scenarios. These CMAQ O 3 concentrations are illustrated by the contours. The grey-hatched lines represent the temperature inversion zone.

Figure 10. Diurnal distributions of the maximum O 3 differences between different PFT scenarios in subregions derived from CMAQ simulations. The CMAQ simulation was conducted after adjusting the anthropogenic isoprene emission input, with CMAQ NO x fixed at the level of the observed NO x concentrations. The isoprene emission input was adjusted to fix the CMAQ-predicted NO x / isoprene ratio at the level of the observed NO x / isoprene ratio (see Fig. S4 in the Supplement). For each hour of the day, the boxes represent the range between the 25th and 75th percentiles of all maximum difference values for that hour across the simulation period (1-30 June 2008) in the subregions. The lines inside the boxes indicate the median values. The whiskers extend to the highest value within 1.5 × IQR (the interquartile range) above the upper quartile and to the lowest value within 1.5 × IQR below the lower quartile, connecting these points to the box.

Figure 11. Divergence of the predicted hourly BVOC emissions and O 3 concentrations with temperature change in the modeling domain. Every displayed value of δBVOC and δO 3 in this figure was derived from the hourly domain-averaged values of the MEGAN BVOC emissions and CMAQ O 3 concentrations for the three PFT scenarios (i.e., KORPFT, CDP, and MODIS). The temperature on the horizontal axis represents averaged values of the MM5-predicted ambient temperature.

Table 1. List of model simulation scenarios and the PFT data sets used for biogenic emissions modeling.

Table 2. Anthropogenic and biogenic nonmethane VOC (NMVOC) and NO x emissions and deviations between different PFT-input cases (unit: Gg period −1 ).

Table 3. Performance statistics for CMAQ O 3 with the three different BVOC emission scenarios against the measured O 3 concentrations.