Global top-down smoke-aerosol emissions estimation using satellite ﬁre radiative power measurements

. Fire emissions estimates have long been based on bottom-up approaches that are not only complex, but also fraught with compounding uncertainties. We present the development of a global gridded (1 ◦ × 1 ◦ ) emission coefﬁcients ( C e ) product for smoke total particulate matter (TPM) based on a top-down approach using coincident measurements of ﬁre radiative power (FRP) and aerosol optical thickness (AOT) from the Moderate-resolution Imaging Spectro-radiometer (MODIS) sensors aboard the Terra and Aqua satellites. This new Fire Energetics and Emissions Research version 1.0 (FEER.v1) C e product has now been released to the community and can be obtained from http://feer.gsfc. nasa.gov/, along with the corresponding 1-to-1 mapping of their quality assurance (QA) ﬂags that will enable the C e values to be ﬁltered by quality for use in various applications. The regional averages of C e values for different ecosystem types were found to be in the ranges of 16–21 g MJ − 1 for savanna and grasslands, 15–32 g MJ −

Despite the considerable advancement achieved in satellite remote sensing and atmospheric modeling during the last couple of decades, there still remains a large uncertainty in the overall atmospheric impacts of aerosols and certain shortlived trace gases, particularly those originating from biomass burning such as BC and carbon monoxide (CO) (e.g., Urbanski et al., 2011;Yurganov et al., 2011;Ichoku et al. 2012;Bond et al., 2013). A major part of the uncertainty stems from the fact that their emission from fires are still very poorly constrained mainly due to the rather sporadic and transient character of biomass burning, which makes it difficult to characterize experimentally (e.g., Forster et al., 2007. This can be contrasted, for instance, with emissions from industries and fossil-fuel burning, which can be quantified in a fairly straightforward manner, as the sources are generally stable and relatively easy to characterize. For instance, the global total fossil-fuel CO 2 emissions are accurate to within 10 % at a 95 % confidence interval (e.g., Andres et al., 2012), whereas the uncertainty associated with biomass burning CO 2 emissions is still not quantifiable because of lack of sufficient information (e.g., Andreae and Merlet, 2001), although one could infer that it would be as much as 100 % considering the propagation of uncertainties from the various parameters that influence the emissions estimates (van der Werf et al., 2010). Similar uncertainty ratios exist for other types of particulate and gaseous emissions from various source types (biogenic, industrial, volcanic) as compared to biomass burning (e.g., Diehl et al., 2012;Bond et al., 2013;Carslaw et al., 2013).
Many of the currently available biomass burning emissions inventories and other related products, including those derived from satellite data, are based on bottom-up approaches, whereby estimates of burned biomass are derived from satellite-retrieved fire pixel counts, burned areas, and/or fire radiative power (FRP), and are then multiplied by emis-sion factors (EFs) of different smoke constituents derived from laboratory or field experiments to obtain the smoke emissions of these constituents (e.g., Chin et al., 2002, Ito andHoelzemann et al., 2004, Liousse et al., 2004Michel et al., 2005;van der Werf et al., 2006van der Werf et al., , 2010Generoso et al., 2007). Examples of such inventories that are currently being used by the community include GFED (van der Werf et al., 2006(van der Werf et al., , 2010, GFAS (Kaiser et al., 2012), FLAMBE (Reid et al., 2009), FINN (Weidinmyer et al., 2011), and GBBEP-Geo (Zhang et al., 2012). Recent research findings suggest that such bottom-up approaches lead to severe underestimations particularly of smoke aerosols unless bias correction is applied through modeling (e.g., Liousse et al., 2010;Kaiser et al., 2012). Top-down approaches are starting to be investigated for deriving biomass-burning emissions, sometimes in conjunction with model assimilation (e.g., Sofiev et al., 2009;Kaiser et al., 2012;Darmenov and da Silva, 2013). Although biomass burning emits several dozens of particulate and gaseous species (e.g., Andreae and Merlet, 2001;Akagi et al., 2011), this study is specifically focused on smoke aerosol or total particulate matter (TPM) emissions.
This paper presents the development of the first gridded global top-down biomass burning aerosol emission coefficients product that is based strictly on locally collocated satellite measurements of both fire radiative power (FRP) and aerosol optical thickness (AOT). The original idea and an initial algorithm were developed in Ichoku and Kaufman (2005) in which FRP and AOT retrieved from the Moderate-resolution Imaging Spectro-radiometer (MODIS) sensor aboard the NASA Terra and Aqua satellites were utilized together with wind vectors from the National Center for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) meteorological reanalysis data to generate smoke-aerosol emission coefficients (C e in kg MJ −1 ) for several biomass burning regions. Such topdown emission coefficients are found to be useful, as simply multiplying C e by the satellite-retrieved FRP of a fire gives the corresponding instantaneous TPM emission rate for that fire. Likewise, in the case of consistent and frequent fire observations such as from a geostationary platform, multiplying C e by the time-integrated FRP (or fire radiative energy, FRE) gives the TPM emission for that time interval. This C e × FRP (or C e × FRE) emissions estimation approach and variants of it have been subsequently developed and implemented successfully in various regional studies (e.g., Jordan et al., 2008;Henderson et al., 2008Henderson et al., , 2010Sofiev et al., 2009;Vermote et al., 2009). However, the original Ichoku and Kaufman (2005) algorithm has been substantially enhanced and used to generate a gridded global C e product using an updated algorithm and newer versions of FRP and AOT data from MODIS as well as wind vectors from the Modern Era Retrospective-Analysis for Research and Applications (MERRA) data sets (Rienecker et al., 2011) provided by the NASA Goddard Global Modeling and Assimilation Office (GMAO). The newly generated gridded C e data products are available at the NASA Fire Energetics and Emissions Research (FEER) web site (http://feer.gsfc.nasa.gov/) together with MODIS FRP data and links to other relevant satellite FRP data.
Section 2 provides the background and theoretical basis of the approach. Section 3 describes the characteristics of the various input data (FRP, AOT, winds) used to calculate C e . Section 4 gives the full details of the updated methodology for deriving C e and the associated uncertainty analyses. Section 5 presents the use of the gridded C e product to estimate smoke particulate emissions over different regions and comparisons with similar emission products, namely, the Global Fire Emissions Database version 3.1 (GFED.v3: van der Werf et al., 2006van der Werf et al., , 2010, the Global Fire Assimilation System version 1.0 (GFAS.v1: Kaiser et al., 2012), and the Quick Fire Emission Dataset version 2.4 (QFED.v2: van Donkelaar et al., 2011;Darmenov and da Silva, 2013). Finally, Sect. 6 provides a summary and relevant concluding statements.

Background and theoretical considerations
Traditionally, the amount of a given aerosol or trace-gas species emitted from open biomass burning is derived by multiplying that species' emission factor (in grams of species per kilogram of dry matter burned) by the mass of biomass burned. The basic equation is of the form (e.g., Andreae and Merlet, 2001): where M x is the mass of emitted smoke species x, EF x is the emission factor for the emitted species x, and M biomass is the mass of the dry biomass burned. A similar relationship to Eq. (1) was established by Ichoku and Kaufman (2005) in which EF x is replaced with C x e , which is designated as the emission coefficient (for any given species x), and M biomass is replaced with either FRE or its release rate R fre (i.e., FRP). Thus, where R x is the rate of emission of species x (expressed in kg s −1 ) since R fre is the FRE release rate expressed in MJ s −1 , or MW. C x e is therefore expressed in kg MJ −1 . The validity of the relationship in Eq. (2) has been verified in a laboratory experiment, where satellite measurements of fire energetics and smoke were replicated by burning small biomass fuel samples in a burn chamber equipped with a giant smoke stack upon which the relevant instruments were set up, and the retrieved FRP and AOT were used to derive C e for smoke aerosols (Ichoku et al., 2008b).
Equations (1) and (2) are functionally very similar, and relating the two would suggest that there is a linear relationship between M biomass and FRE. Indeed, a series of field experiments showed that FRE is proportional to M biomass in a linear fashion, such that M biomass = 0.368(± 0.015)· FRE, in which the numeric coefficient (0.368 kg MJ −1 ) is designated as the biomass consumption factor (F c ) . The Wooster et al. (2005) study indicated that the same relationship is expected to hold for satellite observations when total biomass consumed M biomass is substituted with the rate of biomass consumption and FRE with R fre . That relationship has also been verified in laboratory experiments (Freeborn et al., 2008;Ichoku et al., 2008b), and has been applied in the estimation of M biomass over Africa using FRE derived by integrating R fre measurements from the Spinning Enhanced Visible and Infrared Imager (SEVIRI) aboard the Meteosat second generation (MSG) series of European geostationary meteorological satellites (Roberts et al. , 2011. Similarly, as derived in Ichoku et al. (2008b), the mass-based emission factor, EF x , in Eq. (1) is related to the FRE-based emission coefficient C x e for any given fire-emitted species x as where F c is the biomass consumption factor (F c ) defined in Wooster et al. (2005). This ability to relate the satellite-measured fire radiant heat release rate R fre and the top-down derived emission coefficient C e to physical quantities of combusted biomass M biomass and its associated bottom-up smoke emission factor EF x , respectively, is a major motivation buttressing the study described in this paper. Currently, only a few generalized values of EF x are available for certain ecosystem types, which is highly limiting given that EF x is likely to vary by location in the same manner as fuel characteristics, even within the same ecosystem type. Therefore, by using satellite-measured R fre and smoke aerosols to derive C e globally as a gridded product based on the developed top-down approach, it is not only possible to compare these results with those based on bottom-up approaches, but it can even lead to the development of a gridded EF x product that would offer a much finer spatial coverage and resolution than do the current products.

Data
The main data products used in generating the gridded C e are satellite measurements of FRP and AOT, as well as assimilated wind fields from MERRA. Both the FRP and AOT products used in this work are derived from the MODIS sensors aboard the (1) Terra satellite launched in 1999 with local equator crossing times of 10:30 a.m. and 10:30 p.m., and (2) Aqua satellite launched in 2002 with local equator crossing times of 1:30 p.m. and 1:30 a.m. The analysis in this paper is based on data covering the period between 2003 and 2010, inclusive. The specific attributes of these products, such as their spatial and temporal resolutions, versions, and uncertainties are discussed in the following sub-sections. It should be noted that MODIS data versions are essentially referred to as data "collections", a terminology that will be used throughout this paper.

Fire radiative power
Active fire observation products from MODIS on Terra (MOD14) and Aqua (MYD14) are provided at a nominal spatial resolution of 1 km at nadir (Justice et al., 2002;Giglio et al., 2003). FRP (or R fre ) is one of the main parameters provided within these products for every fire pixel detected. The original formulation for derivation of R fre was developed in Kaufman et al. (1998, p. 32 226, Eq. (1)) as where R fre is the pixel fire radiative power (in W m −2 ), T 4 is the fire pixel brightness temperature (in K) at the 4 µm channel (3.96 µm for MODIS), and T 4b is the 4 µm brightness temperature of the background surrounding the fire pixel. Equation (3) was used to derive FRP values from MODIS up to the collection 4 data set released in 2004. Those collection 4 data were used for the Ichoku and Kaufman (2005) study. Starting from collection 5, the right hand side of Eq. (3) was multiplied by the area of each pixel to account for the variation of ground pixel size with MODIS scan angle, resulting in units of MW for R fre (Giglio, 2010). The collection 5 FRP data set, which is the latest data version available at the time of this study, has been used for the calculations reported here. The potential effects that this change in FRP values has on computed C e is analyzed in Sect. 4.6. However, it is noteworthy that FRP retrievals from MODIS have not yet been validated, even though the uncertainty associated with the detection of fire locations has been characterized using fire detections at 30 m nominal spatial resolution from the Enhanced Thematic Mapper Plus (ETM+) sensor aboard the Landsat-7 satellite and the Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) aboard Terra (e.g., Morisette et al., 2005a, b;Schroeder et al., 2008a, b).

Aerosol optical thickness
The AOT (τ aλ ) data used for this study were also retrieved from MODIS on Terra (MOD04_L2) and Aqua (MYD04_L2) at 10 km spatial resolution at nadir. MODIS measures AOT at 470, 550, 660, and 2100AOT at 470, 550, 660, and nm wavelengths (λ) over land, and at 470, 550, 660, 870, 1200AOT at 470, 550, 660, and , 1600, and 2100 nm wavelengths over ocean (e.g., Remer et al., 2005Remer et al., , 2008Levy et al., 2010). However, only the AOT data retrieved over land are used in this study, since smoke from biomass burning can only be emitted over land where fires normally occur, although this makes it difficult to get a sufficient amount of retrievals for fires occurring very close to coastlines. Specifically, we use AOT measurements at 550 nm wavelength, as this falls within the mid-visible or green region of the electromagnetic spectrum, which is the most commonly used wavelength region in aerosol radiation studies. Unlike the FRP data, MODIS AOT data have been extensively characterized and validated using ground-based sun-photometer measurements from the global Aerosol Robotic Network (AERONET, e.g., Holben et al., 1998Holben et al., , 2001. However, as with the fire product, the collection 5 MODIS Level 2 Aerosol Product (http://modis-atmos.gsfc. nasa.gov/C005_Changes/C005_Aerosol_5.2.pdf) was used in this study instead of the collection 4 that was used in Ichoku and Kaufman (2005). Detailed analyses of the effects of this change in AOT data version and other factors on the computed C e results are presented in Sect. 4.6.

Wind vectors
The wind vectors used for this study were extracted from MERRA's inst3_3d_asm_Cp product provided at a spatial resolution of 1.25 • × 1.25 • and a temporal resolution of 3 h. The documentation for that product is available at http://disc.sci.gsfc.nasa.gov/mdisc/data-holdings/merra/ inst3_3d_asm_Cp.shtml. Wind data at pressure levels of 925, 850, and 700 mb, roughly corresponding to heights above mean sea level (a.s.l.) of 750 m, 1.5 km, and 3 km, respectively, were extracted and used for spatial aerosol data analysis to derive smoke TPM emission rates. However, after the analyses, the wind data at 850 mb (roughly 1.5 km a.s.l.) were used to generate the final product as described in Sect. 4 and in Ichoku and Kaufman (2005), because most fires observable from satellite at~1 km spatial resolution in different parts of the world typically inject plumes to heights of about 1.5±1.0 km above ground level (a.g.l.), with the exception of very large fires (e.g., Lavoué et al., 2000;Freitas et al., 2006;Kahn et al., 2007;Val Martin et al., 2012;Yang et al., 2013).

Other data
Several other data types, products, and parameters were used in this study. The global average aerosol mass extinction efficiency value of β e = 4.6 m 2 g −1 that was used in Ichoku and Kaufman (2005), based on the work of Reid et al. (2005), has also been used in the current work. Coincident digital elevation model (DEM) data and ecosystem data were obtained for each data point during processing for reference. DEM data at 30 arcsec resolution (GTOPO30: https://lta.cr.usgs. gov/GTOPO30) are provided by the US Geological Surveys (USGS). We used the DEM data sets to determine the landsurface elevation, land-sea mask, slope, and aspect. Ecosystem data used in this work are from the 1 arcmin resolution global ecosystem map of 2004 derived from MODIS (http: //modis-atmos.gsfc.nasa.gov/ECOSYSTEM/) using the International Geosphere/Biosphere Program (IGBP) classification scheme. Digitized smoke plume data from the Multiangle Imaging Spetro-Radiometer (MISR) INteractive eXplorer (MINX) tool (Nelson et al., 2008(Nelson et al., , 2013 were used to evaluate the relationship between the wind direction from MERRA and the actual plume direction as observed on the MODIS imagery.

Methodology
The basic methodology for deriving the smoke-aerosol emission coefficients C e from satellite measurements of R fre and τ aλ was developed in Ichoku and Kaufman (2005). However, although the basic structure and processing sequence of the original algorithm have been maintained, several adjustments and updates were required, in terms of both the algorithm and input data, in order to generate the gridded products reported in this paper.

Algorithm logic for C e calculation
The logic progression within the algorithm to calculate C e is generally similar to that described in Ichoku and Kaufman (2005) in that the first stage of the algorithm is completed on a 10 km resolution aerosol-pixel level, followed by a second stage where these units are aggregated within larger areas (1 • × 1 • regular grid in the present case), and then ending with the actual calculation of C e . The core of the algorithm is outlined in this section and visualized in Fig. 1. The specific details of the three analysis stages, including the main adjustments and updates applied in the current study, are described in Sects. 4.2, 4.3, and 4.4.
The first stage of the algorithm is designed to generate values of R fre and R sa (the smoke-aerosol emission rate) for each aerosol pixel with detected fire(s). Fitting the MODIS 1 km resolution active fire data into the corresponding 10 km resolution aerosol-pixel data is very straightforward because both data sets originate from the same instrument on the same platform and from the same original data product. Therefore, the R fre for a given aerosol pixel is derived as where FRP is the fire radiative power measurement of individual active fire pixels, and N f is the total number of active fire detections within a given aerosol pixel. Derivation of R sa is less straightforward and involves calculations utilizing AOT and wind vectors in a 3 × 3 aerosol matrix centered on the fire-affected aerosol pixel, as depicted in Fig. 2. Since the plume can easily influence neighboring pixels, the 3 × 3 matrix is split into four quadrants, one of which is deemed to be the "downwind quadrant" based on the wind direction, and the four pixels therein are assumed to contain parts of the plume, whereas the remaining five pixels are the background. The zonal (u) and meridional (v) components of the wind vector at the 850 mb atmospheric pressure level are used to calculate the wind speed (WS = √ u 2 + v 2 ) and azimuth (θ = cos −1 v WS ). The azimuth is compared with the MODIS along-track direction to determine the downwind quadrant in which the plume is located (see Fig. 2). The AOT that is attributable to the fire(s) within the central aerosol pixel can be determined as where the superscripts f, t, and b, respectively, designate the fire-emitted, total, and background AOT at 550 nm wavelength. The background AOT value, τ b a550 , is calculated as the mean of the valid background AOT values (shown in blue in Fig. 2), weighted by aerosol-pixel area. The fire-emitted AOT, τ f a550 , is found by subtracting this mean τ b a550 value from τ t a550 of each aerosol pixel in the downwind quadrant (plume direction). Individual τ f a550 values that are negative are set to zero. Subsequently, an area-weighted average τ f a550 value is calculated from the downwind aerosol pixels to represent the unit plume being analyzed. Thus, where A is the aerosol-pixel area, N at is the number of valid aerosol retrievals in the downwind quadrant, and N af is the number of valid aerosol retrievals in the downwind quadrant whose τ t a550 appropriately exceeds τ b a550 . This fire-emitted AOT (τ f a550 ) is converted to smoke-aerosol column mass density (M d in g m −2 ) as where β e (expressed in m 2 g −1 ) is the smoke aerosol specific extinction or mass extinction efficiency derived from Reid et al. (2005). Using the total area of the four downwind pixels, A T , the mass of smoke-aerosol emission is then calculated by Determining the smoke-aerosol emission rate R sa requires knowledge of how much time, T , it must have taken to emit M d . For a given plume, T is assumed to be the time it would take for the wind to clear all smoke aerosol from the downwind quadrant within the 3 × 3 aerosol-pixel matrix, and is estimated as where L represents the length of the plume within the 3 × 3 aerosol-pixel matrix. In the case where there are multiple active fire detections within one aerosol pixel, the plume distances are averaged to yield one value for L. Finally, the rate of smoke-aerosol emission is estimated as where R sa is expressed in kg s −1 . Thus, paired values of R fre and R sa are calculated for each fire-containing aerosol pixel and collected into a "pixel-level" product, which is used in the second stage to generate similar measurements at larger scales. The second stage of the algorithm involves aggregating the pixel-level calculations of R fre and R sa to determine corresponding values for larger areas or regions at each MODIS overpass event. Since it is the aim of this study to render the FEER.v1 C e product in a 1 • × 1 • -grid configuration, corresponding pixel-level R fre and R sa values within each 1 • × 1 • grid cell are aggregated as and, Details of the implementation of this stage, including its unique sensitivity analyses and data filtering processes, are described in Sect. 4.3. The output is the "grid-level" data set to be used for the third and final stage during which C e is actually derived. The procedure to derive C e is to generate a scatter plot for each 1 • × 1 • grid cell using the R fre and R sa data calculated in the second stage for a specified time domain (in our .5 • E. The latter demonstrates the effect of removing outliers in such scatter plots. The outlier is identified in red and the blue line is the linear least-squares regression fit through the remaining points, which in this case results in a C e value of 0.0747 and an r 2 value of 0.82. This is a great increase over the case without the outlier removal process, whose regression line is shown in gray and has much lower values of both C e (0.0128) and level of confidence (r 2 = 0.16).
case 2003-2010), with R sa on the dependent axis. A minimum of six points is allowed for a scatter plot, as we consider six to be a minimally reasonable number of data points for linear regression fitting. A zero-intercept regression line (of the form y =mx, where y = R sa and x = R fre ) is fitted to the scatter plot for each grid cell (Fig. 3). The gradient m is the coefficient of emission C e and the goodness of fit is evaluated on the basis of the coefficient of determination (r 2 ), as will be described in Sect. 4.4. Hence, for grid cells with good fits, R fre only needs to be multiplied by C e to derive the smoke emission rate R sa , even in near-real time, bearing in mind the possibility of large biases due to the inherent differences in individual fire characteristics even within the same fire regime (e.g., Schroeder et al., 2014).

First stage: pixel-level data analysis
The calculations of R fre and R sa begin at the pixel-level. Several changes have been made in the current study relative to the original method described in Ichoku and Kaufman (2005). In that paper, τ t a550 was defined as the maximum AOT measured in the 3 × 3 matrix, whereas τ b a550 was defined as the minimum AOT measured in the eight pixels immediately surrounding the center pixel, regardless of the actual direction of the plume. That methodology should produce good results when the plume is prominent and the background is uniform and clear. Otherwise, such as when the plume is thin or highly dispersed, or when plumes from a different fire enter any of the aerosol pixels within the 3 × 3 matrix, the result can be unpredictable. To characterize such situations, 240 digitized MISR plumes from fires that occurred in Siberia in May 2003 were analyzed. The outlines of the 3 × 3 matrix of MODIS aerosol pixels centered on each fire were delineated over a MISR true-color imagery, and the cor-responding MODIS AOT values were recorded along with wind directions from both MISR (observed) and MERRA (modeled). For each of the 240 surveyed plumes, visual classifications were made with the help of the MISR fine (275 m) spatial-resolution imagery to identify which 3 × 3 matrices of MODIS aerosol pixels contained: plumes, clouds, haze, or fires. Of the cases analyzed, 64 % had detected fires in the surrounding pixels and 61 % had background smoke or haze. These proportions can be quite different in other regions because fire density and smoke dispersion characteristics vary by region, biome, and season. In particular, regions that typically have relatively smaller fire sizes, such as the African savannas, are likely to be more impacted by this phenomenon. This significant percentage of extraneous aerosol contamination can have an adverse impact on the determination of τ t a550 and τ b a550 , and consequently also on the accuracy of τ f a550 (see Eq. 7). For instance, in the Ichoku and Kaufman (2005) method: (i) if a larger plume exists in a neighboring pixel its AOT can be erroneously taken as the τ t a550 for the current plume, and (ii) unless any external smoke entering into the 3 × 3 matrix affects every one of the aerosol pixels similarly, the minimum AOT eventually used as τ b a550 may be much lower than the average background AOT. Furthermore, in case (ii) τ t a550 may be likely boosted to a higher value, potentially resulting in the overestimation of τ f a550 when τ b a550 is subtracted from τ t a550 . To mitigate such situations in the current study, the plume direction is first identified based on the relative wind direction, as shown in Fig. 2, whereupon a combination of the four downwind pixels are used to calculate more realistic values of τ t a550 . Likewise, instead of using the overall minimum AOT, the average AOT of the five upwind pixels are used to more realistically determine τ b a550 . It is assumed that where smaller smoke plumes from fires in the neighboring or external pixels are present, the distribution of their effects among the four downwind and/or five upwind aerosol pixels will tend to dilute the smoke contamination on τ t a550 and τ b a550 values, thereby minimizing its impact on the resulting τ f a550 value. Cases that have more serious contamination also have a good chance of being eliminated both by the threshold requirements outlined in Sect. 4.3 and the outlier removal process described in Sect. 4.4. However, another issue that can potentially affect the result, though it is not directly related to this τ f a550 algorithm, is the fact that nearsource thick smoke plumes are occasionally misclassified as clouds and therefore omitted in the collection 5 MODIS aerosol product (e.g., Livingston et al., 2014;Schroeder et al., 2014). For instance, if any of the four downwind pixels in a 3 × 3 matrix has no AOT value, this could lead to underestimation of τ f a550 and consequently also of M sa and R sa for the specific plume unit being analyzed. Such situations are expected to improve with future enhancements in aerosol retrieval algorithms particularly close to smoke sources, but the filtering and outlier removal processes implemented in this study are helpful in the meantime.
One important condition in classifying downwind and upwind sections is that the wind direction needs to be correct. The level of accuracy, however, is variable since the actual requirement is that only the correct downwind quadrant is identified. The MINX data set for Siberia in May 2003 also makes the evaluation of this condition possible, and showed that the success rate of using MERRA to correctly identify the downwind quadrant was on the order of 80 %. This is an acceptable rate, especially considering the fact that most of the failed cases will likely be filtered out in the second stage of the C e algorithm (Sect. 4.3) due to a probable decrease in τ t a550 and increase in τ b a550 such that τ f a550 will be too low. It should also be noted that there was no increase in accuracy when data were matched to the closest plume injection level as recorded in the MINX database than when only the 850 mb pressure level data were used. This reaffirms the validity of the use of wind data at 850 mb for generating the FEER.v1 C e product, although since this is based only on data from Siberia, it may not be ideal for other parts of the world where smoke plumes are typically injected either much lower or much higher than the 850 mb pressure level. Examples include the African Sahel region where fires are typically small in sizes and inject plumes lower than 1 km (e.g., Yang et al., 2013) and the Canadian boreal forest region where very large fires can inject plumes higher than 2 km (e.g., Lavoué et al., 2000;Val Martin et al., 2012).
Another measure taken to minimize uncertainty in the pixel-level analysis relative to the original algorithm in Ichoku and Kaufman (2005) is the use of the wind vectors in the derivation of the distance (L) the plume travels within the analysis unit (the 3 × 3 aerosol-pixel matrix). The original method equates L to the square root of the central aerosolpixel area without considering the actual relative positions of the individual 1 km resolution fire pixels within the 10 km  (Table 2) used in data screening and selection for generating C e scatter plots. resolution aerosol pixel. The new algorithm takes into account the relative positions of these fire pixels within the central aerosol pixel in estimating the distance traveled by each smoke plume from its source (center of the fire pixel) to the edge of the 3 × 3 aerosol-pixel matrix (see Fig. 2). L is extended to the edge of the 3 × 3 pixel matrix, instead of only to the edge of the central aerosol pixel, to prevent any ambiguity in L from introducing large errors in R sa calculations (Eqs. 10 and 11), particularly when the fire is very close to the downwind edge of the central aerosol pixel. Therefore, provided the smoke plume follows MERRA's wind direction at 850 mb, it is believed that the derived values for L and consequently T , R sa , and C e will be much more accurate. Although conceptually this algorithm change is an important improvement, however, for relatively small fires and low wind-speed situations, plumes may not reach the edge of the 3 × 3 aerosol-pixel matrix, resulting in overestimation of L and consequently T (Eq. 10). Figure 4 shows the probability density functions (PDF) of the plume time, T , for the four data-filter settings (00000, 10000, 11000, 11 300: lowest to highest quality) used in this study, as explained later in Sect. 4.3. Overall, the maximum probability for T lies in the range of 45-55 min, with a gradual decrease beyond the peak. For such large time periods of R sa estimation, the fire that emitted τ f a550 may have changed significantly relative to the FRP value recorded at the time of observation. This is the weakness of using the instantaneous and simultaneous observations of a fire and corresponding plume based on a relatively low spatial-resolution aerosol product. However, if the spatial resolution of the input aerosol product improves (as is currently being developed by the MODIS aerosol team), this issue will be alleviated.
Lastly, in the original algorithm by Ichoku and Kaufman (2005), single values of R sa and R fre were calculated for large regions or areas (in this case 1 • × 1 • grid cells) involving multiple fire/plume units only after the upstream variables had been aggregated into these regions. That approach has been modified in the current implementation to minimize its vulnerability to errors that may be inherent in the aggregation processes preceding the calculations. In the current algorithm, the pixel-level analysis is continued up until the calculation of R sa and R fre for each fire/plume unit. This allows for flexibility in the use and aggregation of these products at different scales and corresponding uncertainty estimation. In the current work, the values of R sa and R fre generated at the pixel-level are aggregated into the 1 • resolution grid cells for creating scatter plots.

Second stage: gridded data analysis
The creation of a gridded product at 1 • resolution arises from the need for derivation of a gridded smoke emission coefficient C e that would be available for use in generating emissions wherever fires occur around the world for various types of applications and modeling. Because the pixel-level smokeaerosol emission rates parameter (R sa ) simply reports values for all aerosol pixels containing fire regardless of the quality of the aerosol retrievals, the development of this gridded product necessitates a methodology for removing invalid or erroneous data, which is accomplished through the use of thresholds applied to selected parameters. These are described in Table 1, along with the purpose for using each one of them.
To determine appropriate thresholds for these parameters, several 1 • × 1 • grid cells were selected around the globe from a variety of biomass burning regions to conduct sensitivity analyses using data from the full time period of 2003-2010 (Fig. 5). The selections were made by examining random grid cells spread out throughout the entire globe and manually ensuring that the final selection maintained diversity in location, fire type, biome, number of data points, and expected goodness-of-fit of linear regression. Data contained within these sample grid cells were used to perform a dynamic, detailed analysis of the calculations described in Sect. 4.1 (and illustrated in Fig. 1) to quickly generate different emission coefficients. For each site, these algorithmic calculations to aggregate pixel-level values of R fre and R sa into the grid cell and to calculate C e were applied inside an Excel workbook, where provisions were made for a user to control the threshold parameters listed in Table 1. Each threshold parameter was varied and studied in different combinations as their effects on the final results were visualized. The calculations were followed through all the way to the scatter plots of R sa and R fre , and a linear least-squares regression line passing through the origin was fitted, resulting in values of C e . In this way, the corresponding change in the look of the scatter plot and in the value for C e due to varying threshold settings was observed in real time. Thus, the results were dynamic in nature and allowed for proficient sensitivity analysis at each of the sites.
A five-digit code was developed to represent the different combinations of the threshold settings, as designated in the header row of Table 2. Each digit within the five-digit code represents one set of parameters that are changed, and the digit number represents different settings for those parameters. Thus, the 00000 setting represents the case when no filtering is applied to the data set at all, except the standard requirement that there be valid retrievals of R fre (F_power) and R sa . A basic set of parameters were selected as a common improvement in all the selected sites, identified as a "1" for the first digit in the settings code (i.e., 10000 is a setting with only these basic settings turned on). These basic parameters are (Tables 1 and 2) the scan angle of the aerosol pixel, the wind speed, the number of available surrounding aerosol retrievals, the lowest AOT quality assurance flags specified for selecting the upwind and downwind aerosol pixels, and the number of valid AOT retrievals of the upwind and downwind pixels. The second digit of value "1" (i.e., 11000) represents the elimination of cloud contamination by setting A_cloud_fraction_mean to 0. This setting produced the largest single noticeable improvement across the board, not only in reduced point scatter, but also in improved regression line fits. The third digit setting corresponds to the next set of thresholds used to impose restrictions on extreme minima in the main parameters contributing to the calculation of C e , namely τ f a550 (A_AOT550_fire) and R fre . Over the course of examining sufficient threshold values to use for these parameters, two values for each parameter were selected for further testing with all the sites collectively, creating four possible combinations: "1" (τ f a550 > 0.01 and R fre > 15 MW), "2" (τ f a550 > 0.01 and R fre > 20 MW), "3" (τ f a550 > 0.02 and R fre > 15 MW), and "4" (τ f a550 > 0.02 and R fre > 20 MW). This was motivated by the realization that extremely low τ f a550 and R fre values within a 10 km × 10 km aerosol pixel would be too close to the noise level to be good Table 1. List of parameters that are used for data filtering in the gridded product development step described in Sect. 4.3 (parameter-name prefixes "A", "F" and "M" indicate whether a parameter belongs to the MODIS Aerosol, MODIS Fire, or MERRA Meteorological data sets, respectively.)

Parameter
Description Purpose

A_scan_angle
The scan angle of the aerosol pixel. Eliminate the effect of pixel overlap, which adds too much complexity in determining total and upwind AOT values.

M_wind_speed
The wind speed from MERRA. Eliminate slow air mass that would escalate T and make R sa very small. A_retrievals_nearby The number of available aerosol retrievals immediately surrounding the center pixel.
Ensure the pixel is not along the edge of the MODIS scene (granule), and that no nearby feature prevents aerosol retrieval. A_AOT550_fire Fire-emitted AOT at 550 nm (i.e., total -background AOT at 550 nm).
Eliminate cases where the plume signal is weak relative to the background. A_QA_AOT_total The smallest QA used in selecting total AOT from the downwind pixels.
Allow flexibility to specify desired range of AOT quality flags. A_QA_AOT_bkgd The smallest QA used in selecting background AOT from the upwind pixels.
Reduce uncertainty in background (upwind) AOT measurements. A_AOT550_retr_total The number of valid downwind aerosol retrievals. Ensure that enough valid pixels are available for accurate total AOT determination. A_AOT550_retr_bkgd The number of valid upwind aerosol retrievals.
Ensure that enough valid pixels are available for accurate background AOT determination. A_cloud_frac_mean Mean cloud fraction of the 3 × 3 aerosol-pixel matrix for unit plume analysis.
Reduce the chances of cloud being falsely identified as smoke and vice versa, or cloud obscuration of fire. F_pcounts The number of MODIS fire pixels inside the center aerosol pixel.
Optimize number of fire pixels within aerosol pixel for accurate FRP total. F_pcounts_nearby The number of MODIS fire pixels in surrounding 8 aerosol pixels.
Eliminate uneven contamination of AOT by emissions from nearby fires. F_pcounts_DW3 The number of fire pixels in three downwind pixels (excluding center).
Eliminate contamination of target plume by those from nearby fires. F_power The cumulative FRP value of all fires within the center aerosol pixel.
Limit small fires and underestimated FRP values that can cause large errors.

Rsa
The rate of smoke emission.
Limit invalid values or cases with insignificant amounts of smoke production for useful analysis. However, between the two values of the τ f a550 threshold tested, 0.02 was adopted as more realistic for further analysis because it is closer to the absolute component (i.e., 0.05) of the expected AOT retrieval error ± (0.05 + 15 %) from MODIS over land (e.g., Levy et al., 2010). Also, by observing the effect of different choices of R fre thresholds on the sites collectively, it became visually apparent that using R fre > 15 MW was the better solution (compared to 20 MW). The fourth digit setting is used for controlling the number of MODIS fire pixels within the center aerosol pixel (F_pcounts), with "1" and "2" designating one and two-or-more fire pixels, respectively. It was noted that setting F_pcounts ≥ 2 seems to produce similar effects on C e scatter plots as setting the minimum FRP value because both tend to eliminate small fires that potentially have underestimated FRP values. The fifth digit corresponds to thresholds imposed on fire pixel counts like the fourth digit except that it refers to surrounding aerosol pixels in the 3 × 3 aerosol-pixel matrix other than the central one. Two parameters are used: setting "1" counts all the fire pixels within all eight aerosol pixels immediately surrounding the central one (F_pcounts_nearby), and setting "2" counts all the fires within the downwind pixels excluding the central one (F_pcounts_DW3). This last setting was studied as a possible method to ensure that there is no background aerosol contamination from spurious plume sources that are not well dispersed within the 3 × 3 aerosol-pixel matrix. Table 3 shows a summary of the overall sensitivity of each parameter to the various threshold settings in Table 2. The analysis was based on global MODIS-Aqua retrievals for the first day of each month in 2010, for which the total number of retrievals over this data set without any filtering was 43 211, whereas the number of valid retrievals (after applying the 00000 filter to ensure that valid values exist for both R sa and R fre ) was 28 494. Thus, roughly 34 % of the recorded data is invalid. The values in the table are the percentages of the data remaining after applying each of the thresholds. It is evident that the amount of available data severely decreases as more and more restrictions are applied. Therefore, a much more detailed analysis was required in order to determine the best choice in settings to use in the final product. After a careful evaluation of the different filters, considering their effects on the point scatter on plots of R sa against R fre and the associated correlations of the linear regression fitting vis-à-vis Table 2. Value ranges of the threshold parameters in Table 1 and the combinations of their threshold settings used to derive the different five-digit filter configurations (00000, 10000, 11000, etc.) that were applied in screening out potentially erroneous or corrupted data during the grid-level data analysis described in Sect. 4.3.

Parameter
Range 00000 10000 11000 11100 11200 11300 11400 11310 11320 11321 11322 the percentage of available valid data, 11300 was selected for generating the final C e product. Table 3 reports that only about 10 % of the available valid data is used to generate C e with the 11300 setting, but the confidence in the resulting C e values is increased by a satisfactory amount while retaining enough data for product development. This systematic data filtration process in conjunction with the algorithmic improvements in τ f a550 calculations described in Sect. 4.2 have resulted in about a 67 % drop globally in τ f a550 (which directly affects R sa and C e ), as will be seen in Table 6. This is a significant improvement over the Ichoku and Kaufman (2005) method, whose C e values were found to be overestimated (Sofiev et al., 2009;Kaiser et al., 2012), as discussed in Sect. 4.6. However, these results are still susceptible to uncertainty and bias, as fires located in the neighboring aerosol pixels are not specifically accounted for. Although a provision was made to filter out such cases, this was not implemented because of high data reduction without a significant improvement in the result. This step will be reevaluated for possible implementation in future versions of the FEER C e algorithm.

Third stage: generation of smoke emission coefficients
Scatter plots of R sa against R fre were generated for each 1 • × 1 • grid cell, as illustrated in Fig. 3, using all available MODIS data for the period of 2003-2010 after filtering as described in Sect. 4.3. Scatter plots with fewer than six data points were discarded. A linear least-squares regression line passing through the origin was fitted to each scatter plot, and the slope and coefficient of determination (r 2 ) were calculated. The slope is the C e value for that grid cell. However, the general equation of r 2 for a regular linear least-squares regression analysis cannot be used for this zero-intercept fit-ting approach. Instead, going back to the derivation of r 2 and making the correct adjustments, the appropriate equation described in Eisenhauer (2003) was used for our situation.
Although the process of using thresholds to remove inaccurate data as described in Sect. 4.3 has been successful at retaining the relatively higher quality R sa and R fre data series for derivation of reliable C e , in some cases there remain examples where a few erroneous data points that are not successfully detected and filtered out can constitute outliers and cause large errors in this process (e.g., Fig. 3b). Such outliers potentially originate from undetected errors in the data source, such as when the existence of clouds is undetected by the cloud detection algorithm. In the Fig. 3b example, when contrasted with Fig. 3a, only one outlier out of a total of 18 data points cause r 2 to be as low as 0.16, and C e to be lower than the expected value by a factor of six. Although the effect of removing outliers is usually not as drastic as this example, the importance of applying a filter in order to remove outliers from these scatter plots before generating the final C e product is evident.
The process of identifying a robust outlier removal algorithm proved to be non-trivial. Regression analysis assumes linearity, independence, homoscedasticity, and normality. Residual plots produced from data similar to those of Fig. 3 show violations of at least one of these requirements, the most persistent being the non-normality of R sa versus R fre scatter plots due to the persistent positive skewness of the residuals. This characteristic seems to render most if not all mainstream outlier algorithms unusable for the current study. Wisnowski (2001) describes a few highly respected multiple outlier detection algorithms, some of which were tested and found to produce many false alarms with our R sa vs R fre scatter plots. It became increasingly apparent that a custom outlier algorithm would have to be developed specifically for these data sets. A detailed empirical study was Table 3. Percentages of all available data that meet the threshold requirements in Table 2. These numbers were derived using the global coverage of MODIS-Aqua retrievals for the first day of each month in 2010. The number of retrievals over this data set totaled 43 211, whereas the number of valid retrievals (where "F_power" and "Rsa" are both greater than zero, see setting 00000 in Table 2) totaled 28 494. The last row (" % of Valid") shows the overall percentages based on the 00000 setting, which gives an estimate using only valid data. Values are color-coded in different shades of red < 15 %, 15 % < orange < 50 %, 50 % < yellow < 70 %, green > 70 %. undertaken to fully understand the variety of point distributions that can occur in our data sets and their potential impacts on C e and r 2 resulting from the linear regression fitting in order to develop a robust outlier removal algorithm that would be optimal for our data set. The central idea behind the resulting outlier algorithm is to compare the fraction of mean squared error (MSE) measurements between the scatter plot with all points and without potential outliers against an empirically developed function in order to properly identify outliers. This outlier algorithm was then applied to 110 test scatter plots, each of which was manually assigned to one of 15 identified scatter-point distribution categories, in order to rate its performance. Overall, outliers were correctly identified and removed in 75 % of the 110 cases tested, although three of the 15 types of scatter-point distributions showed a high failure rate. However, the fact that 75 % of available linear regression lines with outlier contamination can be rectified using this algorithm is still a vast improvement over the conventional outlier removal algorithms that were tested. When this outlier algorithm is applied to the full data set from both Terra and Aqua, the outlier detection rate is very consistent at around 30 %, regardless of the filter setting (as described in Sect. 4.3) that is used. If these outliers are correctly identified, then combined with the earlier conclusion that 75 % of contaminated grid cells are identified by the algorithm, it is deduced that roughly 40 % of all grid cells contain outliers. Figure 6 offers an informative display of how the application of this outlier algorithm impacts the final C e product. After outlier removal, the distribution of C e values shifts noticeably towards higher values. This would be the  Tables 1, 2, and 3) before and after outlier removal. C e appears to have shifted towards higher values overall. expected behavior for successful outlier detection since outliers below the regression line (and close to the independent axis) have a very significant influence on the linear leastsquares fit as compared to outliers above the line and close to the dependent axis.
Initially, the scatter plots and associated linear regression fitting and calculations were done separately for Terra and Aqua data. Although a majority of the plots showed agreement between Terra and Aqua, we decided to combine them for deriving the final C e product. This combination offered two advantages: (1) it increased the number of data points on scatter plots with an insufficient amount of data due to the filtering performed above (Sect. 4.3) such that C e values could be determined, and (2) it avoided the necessity to develop methods of conducting weighted averaging between two independent C e values for each grid cell. The resulting C e product is shown in Fig. 7 along with its corresponding r 2 map.

Gap filling and quality assurance
This polished C e product presented in Sect. 4.4 and Fig. 7 offers the advantage of including only the highest confidence data, since it is based on the stringent 11300 filter and outlierremoval processes. However, the tight constraints imposed by these processes have the effect of limiting the data suitable for the final product generation, such that many parts of the world that are known to be affected by fire do not have C e values generated, despite the efforts to increase coverage by combining Terra and Aqua data into one input stream. The concern of having incomplete coverage is that if a significant fire event were to occur in an important region, it may not be possible to make even a rough estimation of the smoke emission rates. Therefore, it is evident that some sort of filled product is needed.
The possibility of a gap-filled product whereby missing C e values would be determined by interpolation using surrounding existing values for similar land cover types was initially pursued. However, this procedure could not be applied at first because the gaps are quite extensive in certain areas, with unreasonably great distances between the grid cells that need to be filled and those containing valid data from which their values can be interpolated. Thus, gaps were first filled in, as much as possible, using C e values based on successively lower filter settings starting from the 11300 setting (see Tables 2 and 3) such that those with higher quality but less data are utilized before moving to those with lower quality and more data. To account for the differences in quality introduced by this procedure, a quality assurance (QA) product is provided in conjunction with the filled C e product, to serve as an indication of its reliability as well as to give users flexibility in the application of this product.
The compilation process begins with the 11300-filterbased C e product, which is the highest confidence product, and progressively fills in missing data with products of lesser confidence: first 11000, then 10000 and finally 00000. The outlier removal algorithm has been applied to all except the 00000 product. A QA flag of 0 is assigned to the lowest confidence product (i.e., 00000) and steps up to 3 for the highest confidence product (i.e., 11300). C e values of the 11300 product with r 2 ≥ 0.7 are assigned the highest QA value of 4. In order to account for cases of low data availability during this filling process, grid cells that are already filled may be replaced with values from the lesser confidence product under certain conditions. The decision to replace such existing values is determined based on the number of data points used to determine C e for the previously filled value, N f , and for the new value, N n , and based on their respective r 2 values, such that the conditions, must all be met, where N limit represents the minimum number of data points needed to confidently fit a linear regression line, set to 30, which is the conventional minimum sample size for statistical significance. It is pertinent to recall that any scatter plot with less than the bare minimum of six data points is discarded. If Eq. (14) is satisfied for a given grid cell, then the C e value in the current grid cell of the new (less filtered) product is substituted for the existing value in the filled product. Likewise, the QA of the filled data is replaced with that of the new data. Finally, as many of the gaps remaining in the filled product as possible are filled using the C e values in nearby grid cells with identical land cover types. Land cover type may vary significantly within a grid cell at the spatial resolution of 1 • × 1 • used in this product, which can cause issues especially since the dominant land cover type within a given grid cell may very well not be the one that burns most often. Thus, the MODIS ecosystem classification map for 2004 at 1 arcmin resolution was used to develop a custom land cover product at 1 • × 1 • resolution that reports the dominant fire-prone land cover type, which is used in the following analysis. Grid cells that are potentially vegetated (not completely classified as water, barren, or snow/ice) are identified as candidates for gap-filling, and carefully analyzed. First, a 15 × 15 grid cell box is drawn around each candidate grid cell, and all grid cells with valid C e within this box and whose fire-prone land cover type is identical to that of the center grid cell are selected. The QA values of these selected grid cells are observed, and the highest minimum QA value (QA min ) is set such that there will be at least eight total qualified grid cells with QA ≥ QA min . If this condition cannot be met, then no gap-filling procedure is completed in that case. This QA requirement is a method of balancing quantity with quality of data to get the most certainty in the results. Then, starting with a 3 × 3 box centered around the grid cell whose C e is being filled, the box is expanded only as necessary (up to a maximum of 15 × 15 window) until it contains at least eight grid cells that have valid C e values, the same fireprone land cover type, and QA values greater than or equal to QA min . If there are a sufficient number of grid cells that meet these criteria, the C e values of these grid cells are averaged and used to fill in the missing value. The gap-filled grid cell is assigned a QA value of zero, irrespective of those of the source grid cells.
The final global 1 • × 1 • gridded C e product (Fig. 8a) has much better spatial coverage than the original (Fig. 7). The land areas that are not covered seem to comprise only desert and snow/ice regions, except for the farthest reaches of eastern Russia where the last gap-filling procedure did not have a large enough extent to fill that area. Nevertheless, this product provides sufficient coverage for nearly 100 % of all vegetation fires that might occur around the globe. Furthermore, the corresponding QA and r 2 products (Fig. 8b, c) provide the user with parameters for determining how accurate the C e in a given area might be. Table 4 shows the relative QA and r 2 distribution of all 13919 grid cells with C e values. With the exception of the grid cells whose QA equals zero, the majority of grid cells within each QA category are either in the 0.7-1 or 0.5-0.7 r 2 range. Therefore, a user can apply the QA (and r 2 ) values as a filter to select only the C e values that meet or exceed the minimum quality requirement for Table 4. The distribution of FEER.v1 C e product QA flags among different coefficients of determination (r 2 ) ranges. The "N/A" category is for the 1 • × 1 • grid cells that have been filled in using the gap-filling process due to lack of sufficient data (and therefore lack an r 2 value). The r 2 range containing the most grid cells is shown in boldface type for each QA flag.  specific applications. On the other hand, if a major fire occurs in a grid cell with a QA value of zero, an emissions estimate can still be derived, as long as the user recognizes that it is in fact a low-confidence estimate. This C e product is being released as the Fire Energetics and Emissions Research version 1 (FEER.v1) product.

Relative evaluation of the FEER.v1 C e product
Although there is no equivalent "ground truth" data to validate the new FEER.v1 gridded C e product, the latter still requires a certain level of evaluation to determine where it stands in the spectrum of existing comparable products/parameters. This was done by comparing the new FEER.v1 C e data to regional values of C e that were reported for 19 different regions in Table II of Ichoku and Kaufman (2005), hereafter referred to as "IK05". Since the FEER C e product is gridded at 1 • × 1 • , it became necessary to generate a comparative set of average C e values that fit the 19 regions for comparison against the IK05 values. Simply averaging the C e grid cells within each region is unrealistic due to the fact that the spatial distribution of fires within each region is non-uniform and the certainty of the C e varies. Therefore, a weighted average of C e based on the number of fires within each grid cell and also on the QA was used to generate the mean and standard deviation of the C e values within each region. Table 5 shows that the C e average values from the FEER.v1 product are distinctly lower than those of IK05 by a factor of 2-4.5, with the exception of East Kazakhstan, where they are practically equal. It is pertinent to mention that Ichoku and Kaufman (2005) estimated that C e values were probably overestimated by a factor of 2, and Sofiev et al. (2009) by applying a more rigorous plume dispersion modeling found C e values that were lower than those of IK05 by a factor of 2 to 3. Kaiser et al (2012) also found values that were lower than those of IK05. The fact that those subsequent studies, including the current study, produced lower values than those of IK05, confirms that IK05 values were indeed probably overestimated and suggests that those from the current study are more realistic. The change from IK05 to the current study can be categorized into two types, namely, algorithms and input data versions/sources. It is necessary to characterize these two types of change independently in order to determine their relative contributions (as will be reported in Table 6).
To account for the effects of using new input data versions/sources, an updated version of the IK05 product (hereafter referred to as IKu) was generated by ingesting the new data being used in the current study into an algorithm that matches that of IK05 as closely as possible. Recall that the IK05 C e values were based on the MODIS collection 4 FRP and AOT products, with wind data from the NCEP reanalysis data set (GDAS1). By contrast, the IKu C e values are based on the MODIS collection 5 FRP and AOT products, with wind data from the MERRA reanalysis data set. Differences in C e from IK05 to IKu should only be due to changes in data versions and sources, whereas the effects of the algorithmic alterations described in Sect. 4.2 can be isolated by comparing IKu to FEER.v1.
Using the relationships defined in Sect. 4.1, it is evident that In other words, C e is directly proportional to the fire-emitted AOT (τ f a550 ), aerosol-pixel area and wind speed, but inversely proportional to the plume length and FRP. Three of the five variables on the right-hand side of Eq. (15) (τ f a550 , WS and FRP) have updated data sources in FEER.v1, and three (τ f a550 , A and L) have updated derivations. However, both A and L, which are dependent on each other, can be adjusted together here to emulate the IK05 algorithm such that A would be equal to the area of only one aerosol pixel, and L would be halved (using IK05 definitions, L IK05 L FEER = √ A √ 4A = 0.5). These adjustments are made for the following analysis. If the ratio of a variable in FEER.v1 to the same in IK05 is represented by R, then Equation (16) quantifies the change in C e due to both input source and algorithmic alterations from IK05 to FEER.v1. Changes in only data sources from IK05 to IKu are captured in the relationship: because the calculation of L does not involve the use of data from different sources. The relationship that quantifies the algorithmic changes from IKu to FEER.v1 is given by www.atmos-chem-phys.net/14/6643/2014/ Atmos. Chem. Phys., 14, 6643-6667, 2014 Table 5. Estimates of regional FRE-based smoke-aerosol emission coefficients (C e ) from MODIS are shown here for different regions using both the original method as reported in Ichoku and Kaufman (2005) and version 1 of the new FEER C e product (FEER.v1). because the way in which FRP and wind speed are calculated remains the same between the two algorithms. The relationships shown in Eqs. (16), (17) and (18) can be utilized to test whether the differences between the IK05, IKu and FEER.v1 product data sets can fully explain the change in C e between IK05 and FEER.v1 shown in Table 5 as well as to identify the main factor responsible for the changechange in algorithm or the input data version/source. The only available data from the original IK05 data set cover relatively short time periods (Terra: 25 June 2002to 4 October 2002, and Aqua: 25 June 2002to 31 December 2002. The fact that these ranges do not cover a full year means that any seasonal differences that may exist will be lost and will therefore cause the resulting data to be biased low or high. Nevertheless, these 2002 data sets were used to estimate R C e , by first pairing corresponding individual data points in the IK05, IKu and FEER.v1 data sets. The ratios between IK05 and IKu, between IKu and FEER.v1, and between IK05 and FEER.v1 were calculated for each data point for AOT, wind speed, FRP, and plume length. Subsequently, the ratio of C e was calculated for each data point pair according to Eq. (16) for the transition from IK05 to FEER.v1, Eq. (17) for the transition from IK05 to IKu, and Eq. (18) for the transition from IKu to FEER.v1. To appropriately represent these matched data points and ratios in a uniform fashion within the spatial domains outlined in Table 5, they were binned into a global grid at a spatial resolution of 0.5 × 0.5 • and then filtered according to the appropriate settings reported in Table 2 using the QA values from the FEER.v1 C e product in Fig. 8. Finally, the median of those ratios within each grid cell was retrieved, and the mean of these values (weighted identically as was done in Table 5 for C e ) within each region were reported, as displayed in Table 6.
In Table 6, column 1 (highlighted yellow) shows observed changes in C e from IK05 to FEER.v1. The subsequent columns outline the process of deriving the predicted  Ichoku and Kaufman (2005) method (IK05) to FEER.v1 is shown here for the regions listed in Table 5. The values in the yellow highlighted column on the left-hand side are the observed changes in C e from IK05 to FEER.v1 from Table 5. The subsequent columns outline the sample size and mean parameter changes during the process of deriving the predicted changes in C e from IK05 to FEER.v1 according to Eqs. (16), (17), and (18), the results of which are highlighted in orange and yellow. The "IK ⇒ FEER.v1" on the first header row indicates that both the IKu to FEER.v1 and IK05 to FEER.v1 transitions are included in the columns beneath. Both Terra and Aqua data were used in these calculations.  (18), the results of which are shown in the last column (highlighted yellow). Both Terra and Aqua data were used in these calculations. The two main process changes have been separated out: columns 2-6 (labeled "IK05 ⇒ IKu") clearly showing the effect of altering only the data inputs, and columns 7-14 (labeled "IK ⇒ FEER.v1") showing both the effect of altering only the algorithm (IKu ⇒ FEER.v1) and the combination of algorithm alteration and data updates (IK05 ⇒ FEER.v1). From the resulting maps showing the global variation in ratios of τ f a550 , WS, FRP, and L, it was apparent that the change in each variable is uniform throughout the globe. On average, the change in C e due to differing data sources is about a 40 % decrease (column 6), mostly from the change in FRP from collection 4 to 5 (i.e., without and with multiplication by fire-pixel area, respectively), whereas the algorithm alterations cause about a 60 % decrease in C e (column 13), resulting in an overall decrease in C e of about 80 % globally (column 14). Even though these combined effects of data-source and algorithm changes are slightly overcompensating compared to the observed differences listed in column 1, it can be stated that the observed reduction in C e values between the IK05 and FEER.v1 is indeed realistic. The changes in wind speed, plume distance and FRP due to algorithmic changes are small relative to the large change in AOT. Therefore, most of the change in C e is attributable to the change in fire-emitted AOT (τ f a550 ). Figure 9 shows the global distribution of τ f a550 changes due to data version/source change (i.e., from IK05 to IKu, Fig. 9a) and due to algorithm change (i.e., from IKu to FEER.v1, Fig. 9b). Interestingly, when the new collection 5 AOT data are used in lieu of collection 4, τ f a550 actually increases in most cases around the globe, confirming that the lower C e values from IK05 to FEER.v1 due to AOT is very strongly attributable to the change in the τ f a550 algorithm. In fact, the ratios of τ f a550 in the data-source part are very near unity (column 5, Fig. 9a), whereas in the algorithm alteration part it is around 0.3 (columns 11 and 12, Fig. 9b). It is pertinent to recall that the τ f a550 algorithmic changes mainly involve (1) using wind direction to determine which AOT values to classify as plume or background, and (2) taking the average of the upwind AOT values (instead of just the minimum value) as the background in an effort to account for contamination from external aerosols. Although these modifications have resulted in a severe change in the derived τ f a550 , this change translates to increased confidence in C e .

Emissions calculation results
The new FEER.v1 C e product is used to demonstrate the topdown derivation of emission rates and totals from satellite measurements of FRP. The resulting emissions are compared against other emission inventories to gain a general understanding for how model simulations will change when using this new FEER.v1 inventory.

Emissions estimates (rates and totals)
The FEER.v1 C e product is used to derive smoke-aerosol emissions by simple multiplication, as represented in Eq. (2) and the associated discussion. When C e (kg MJ −1 ) is multiplied directly by FRP (in MW or MJ s −1 ), instantaneous emission rates (in kg s −1 ) are derived, whereas when multiplied by FRE (in MJ) representing a finite (e.g., daily, monthly, or yearly) time period, the result is emission totals (in kg) for that time period. Generating a global FRE product for use in this analysis is not straightforward due to the fact that semi-continuous measurements of unsaturated FRP around the entire globe is not currently available, though it is expected that this situation will improve within the next decade or so, given the anticipated launches of different geostationary and polar-orbiting satellite missions by some of the major space agencies. However, to closely compare emissions based on the new FEER.v1 C e product with other emissions products, this study uses FRP data from the 0.5 • × 0.5 • gridded monthly data set derived from MODIS observations aboard the Terra and Aqua satellites as part of the GFAS.v1 product (http://gmes-atmosphere.eu/fire, Kaiser et al., 2012).
The GFAS.v1 values of monthly average FRP in W m −2 were simply multiplied by the number of days in each calendar month to get FRE in J m −2 , as was done in the GFAS.v1 algorithm (Kaiser et al., 2012). Such derivation of monthly average FRE based on only four or less MODIS fire observations a day (from Terra and Aqua satellites) result in high uncertainty, as it cannot capture the fire diurnal cycle. However, that is currently the only feasible way to obtain FRE globally. Higher-frequency (sub-hourly) observations from a few available geostationary satellite sensors that measure FRP have different characteristics and produce an average of 17-38 % underestimation relative to MODIS coincident FRP observations Xu et al., 2010). Moreover, a combination of these geostationary FRP data still does not provide global coverage, as some large biomassburning regions, including Siberia, Central Asia, and India, are left uncovered (Zhang et al., 2012). Since the GFAS.v1based FRE data are global, publicly available, and being used in the European Union's Monitoring Atmospheric Composition and Climate (MACC) project (http://gmes-atmosphere. eu/fire), they were considered appropriate for use in deriving emissions using the FEER.v1 C e product to enable comparison with existing emissions inventories, as described below. Therefore, these GFAS.v1 monthly FRE values at 0.5 • × 0.5 • resolution were multiplied by the FEER.v1 C e product at 1 • × 1 • resolution to obtain the monthly emissions of smoke aerosols around the globe at 0.5 • × 0.5 • resolution. Then, the monthly emissions for all months of a calendar year were summed up to get yearly emissions estimates, such as the 2010 example shown in Fig. 10.

Comparison with other emissions inventories
The FEER.v1 monthly emissions were compared with some of the existing emissions products -GFED.v3, GFAS.v1, and QFED.v2 -as a way of evaluating the FEER.v1 emissions within the context of these existing global emission inventories that are currently used by the research and operational communities. It should be noted that there are a few dissimilarities between these products. First, like FEER.v1, QFED.v2 is based on the top-down approach using satellite measurements of both aerosols and FRP, whereas both GFED.v3 and GFAS.v1 are based on the bottom-up approach using literature-extracted emission factors (EF) to multiply burned biomass estimates from satellite observations of FRP (GFAS.v1) or fire pixel counts and burned areas (GFED.v3). Secondly, the emissions values used for comparison from both GFED.v3 and GFAS.v1 represent the smoke TPM emissions, whereas for QFED.v2, whose product exists as the component species of smoke aerosols, the closest equivalent product is particulate matter < 2.5 µm aerodynamic diameter (PM 2.5 ). The ratio of PM 2.5 to TPM (by ratioing their corresponding emission factors) is estimated to range between 65 % and ∼ 100 % depending on ecosystem type (e.g., Andreae and Merlet, 2001;Akagi et al., 2011). Thus, ideally, smoke PM 2.5 emissions (from QFED.v2) for a given area and time period should be expected to be lower than the corresponding TPM emissions (from FEER.v1, GFED.v3, and GFAS.v1). These different data sets were aggregated regionally according to the regional biomass burning partitions provided in Kaiser et al. (2012) as delineated in Fig. 11, and plotted as time series annual total smoke TPM emissions (Fig. 12).
All the emissions products portray similar temporal patterns, with lows and highs occurring in the same years, for both the global and regional plots (Fig. 12). This may be due at least in part to the fact that all products are influenced by MODIS fire-pixel counts, either directly or indirectly. GFAS.v1 emissions are generally equal to those of GFED.v3, because the former was scaled to match the latter (Kaiser et al., 2012). Globally, GFED.v3 and GFAS.v1 each Figure 11. Regional partitions as defined in Kaiser et al. (2012) that are used in this paper to compare FEER.v1 emissions with GFED.v3, GFAS.v1, and QFED.v2 emission inventories. The background MODIS true-color image shows fire locations (red dots) detected by MODIS from both Terra and Aqua for all of 2010, to illustrate the global spatial distribution of annual fire occurrence.
constitutes only about 55 % of the FEER.v1 annual TPM emissions. Since the GFAS.v1 FRE data set was also used for FEER.v1, it follows that the large difference between their emission products stem from the relative magnitudes of the C e used to generate them. Furthermore, given that it was already established that the TPM emissions in GFAS.v1 (and by inference also in GFED.v3) need to be boosted by a factor of 2-4 to match realistic global distributions of aerosols, it follows that FEER.v1 C e results are probably closer to realistic values. However, although the QFED.v2 emissions represent only PM 2.5 , which should be lower than TPM, paradoxically, it is slightly higher than FEER.v1 global TPM emissions.
The relationship between the FEER.v1 emissions and those of GFED.v3, GFAS.v1, and QFED.v2 portrays significant regional differences, as indicated by the regional plots in Fig. 12. In North America (NAme), incidentally, FEER.v1 emissions seem to agree closely with those of GFED.v3 and GFAS.v1, whereas QFED.v2 (though only PM 2.5 ) shows double the values of the former three TPM emissions inventories. Not surprisingly, out of all the regions, NAme has the largest distribution of the lowest QA and r 2 values for the FEER.v1 C e values, as shown in Fig. 8. We suspect that FEER.v1 C e values are severely underestimated in this NAme region probably because, among other possible reasons, the gap-filled areas are quite extensive with very low QA values that may have tended toward underestimation. On the other hand, QFED.v2 appears to have been overestimated in northern and southern Asia (NAsi and SAsi), perhaps due to contamination from the persistent regional pollution, since QFED.v2 is based on regional aerosol observations in contrast to FEER.v1, which is based on near-source AOT measurements. Similarly, GFED.v3 is probably overestimated in tropical Asia (TAsi) only in 2002 and 2006, although the investigation of possible reasons for these two anomalous years is beyond the scope of this paper.

Summary and conclusions
This study has presented a first attempt at providing C ean index that is similar to EF -for every 1 • × 1 • cell containing burnable vegetation globally. Whereas EF is used to multiply burned biomass estimates to calculate emissions, C e is the equivalent parameter used to multiply time-integrated satellite measurements of FRP to estimate emissions. Thus the FEER.v1 global gridded C e product developed in this study for TPM emissions estimation has several important attributes, of which the most significant are that it (1) is the first global gridded product in the family of "emission factors", whereas existing products specify one value per ecosystem type; (2) requires only direct satellite measurements of FRP or its time-integrated FRE to generate emission rates or totals, respectively, whereas regular EF values still require estimation of burned biomass through an intricate process fraught with high uncertainty; and (3) is the only variable in the family of "emission factors" that does not require pre-determination of the ecosystem type of an actively burning fire to evaluate its emission rate in near-real time, which is essential for operational activities, such as the monitoring and forecasting of smoke emission impacts on air quality.
Although the FEER.v1 global gridded C e product was based on the original approach proposed by Ichoku and Kaufman (2005), this study implemented significant improvements in all stages of the product development. The latest available versions (collection 5) of both the aerosol and fire products from MODIS were used, along with MERRA meteorological data from the GEOS-5 global assimilation model. The identification of near-source plume and background pixels from the MODIS AOT data set was based on actual wind directions from MERRA. Rigorous methods were used to determine the valid ranges of all parameters utilized in the algorithm, in order to limit the effects of spurious errors and uncertainties from measurements and assumptions. These updates in data versions and algorithm resulted in an overall decrease in regional average C e values by a factor of 2-4.5 relative to those of Ichoku and Kaufman (2005). This decrease seems reasonable, as observed by recent studies that evaluated those C e values based on model analyses (e.g., Sofiev et al., 2009;Kaiser et al., 2012). Nevertheless, the FEER.v1 global gridded C e product may still contain several limitations and uncertainties, which may be due to various factors, such as (1) uncertainties in the satellite retrievals of AOT and FRP, (2) omission of smaller fires or even larger fires that are mostly smoldering with significant smoke emission but limited radiant energy signal below the MODIS detection limit, (3) possibility of erroneously including external aerosols to specific plumes being analyzed or having large variations in the aerosol background surrounding the plume, (4) smoke underestimation due to the erroneous masking out of near-source thick smoke plumes as cloud during the aerosol retrieval process, (5) lack of knowledge of plume injection heights, (6) use of wind vectors at 850 mb globally, (7) uncertainty in the MERRA wind vectors used in the calculations of smoke emission rate and trajectory, (8) assumption of a single value of smoke-aerosol mass extinction efficiency globally, and (9) uncertainties due to the gap-filling process of the FEER.v1 global gridded C e product. Therefore, there is need to find ways of validating this product. Fortunately, the fact that this global C e product is anchored on a geographically fixed grid system makes validation much more feasible than is the case for existing EF values whose geographical attributes may have been lost, thereby making them difficult to replicate or to trace to a specific geographic domain. Thus, for the FEER.v1 global gridded C e product, deliberate effort could be made to conduct field experiments within any 1 • × 1 • grid cell for use in validating its C e value.
Pending the validation of this FEER.v1 global gridded C e product in a representative sample of locations, perhaps through the use of observations in conjunction with regional modeling, QA flags (ranging from 0 to 4 in increasing order of quality) have been provided with the product to guide the user in using this product for different applications. These QA flags were based on several qualitative and quantitative considerations including the r 2 from the zero-intercept linear least-squares regression fitting of smoke-aerosol emission rates against FRP. A corresponding gridded map of r 2 is also provided for reference. Thus, a user desiring to derive only high-quality emissions can use the QA as a filter to select only the C e values with the highest quality required, while the corresponding r 2 value can give a general idea as to whether this QA is based on quantitative or qualitative considerations. On the other hand, if a fire occurs in a grid cell for which emissions estimates are needed to determine the general smoke trajectory without the need for precise quantitative estimates of concentrations, even C e values having a QA value of zero can be used to accomplish the desired task.
The FEER.v1 global gridded C e product was used to generate global and regional total emissions of TPM, which were compared against existing emissions inventories, including GFED.v3, GFAS.v1, and QFED.v2. The FEER.v1 annual total TPM emissions are low in North America, where they had comparable magnitudes as those of GFED.v3 and GFAS.v1, each of which was about half of the PM 2.5 emissions from QFED.v2. Pending validation, with the exception of the North America, FEER.v1 and QFED.v2 seem comparable in most regions relative to GFED.v3 and GFAS.v1 emissions, which are considered low. Since GFED.v3 and GFAS.v1 products are based on bottom up approaches (with regards to the determination of the emission factors used), whereas FEER.v1 and QFED.v2 are based on top-down approaches (in relation to the emission coefficients used), it is reasonable to assume that top-down approaches based on satellite measurements would yield smoke distributions that have a closer resemblance to satellite observations of aerosols. Therefore, it is recommended that increased effort be made toward further enhancement of top-down approaches, not only for aerosol emissions, but also for gaseous emissions. It is hoped that this approach will become more and more accurate and beneficial with continued improvement in the satellite retrievals of these aerosols and gases.
The FEER.v1 C e product, which is currently based on 2003-2010 MODIS observations, will certainly require future updates as new, improved, and more representative data inputs become available from other relevant sources. Furthermore, even in places where the current C e values are reasonably accurate, over time, changing land-cover conditions and fire regimes will invariably necessitate the revision of these C e values, which may indicate diurnal, seasonal, annual or longer-term temporal variability. Future studies will reveal the approaches required to ensure optimal accuracy in time and space.
The current study has been focused on the development of a global gridded C e product for smoke TPM because it is based on the total columnar AOT parameter as retrieved from satellite observations. Although it is recognized that modeling activities often require smoke-aerosol speciation into its various components such as organic carbon (OC), BC, or PM 2.5 , it was beyond the scope of this study to derive emission coefficients for these smoke constituent species, as it would have involved several assumptions (with associated compounding uncertainties) to estimate any one of them from satellite AOT retrievals. However, the user of the FEER.v1 TPM C e product may optionally estimate corresponding C e values for any of the other smoke-aerosol constituents by multiplying with their emission ratios relative to TPM. Such emission ratios can be obtained from the literature or derived from the constituent emission factors, which are also available in the literature depending on ecosystem type (e.g., Andreae and Merlet, 2001;Akagi et al., 2011). Indeed, the FEER.v1 global gridded TPM C e product developed in this paper represents a versatile foundational product that can lead to several important advances in fire emissions research and applications.
Jun Wang, Johannes Kaiser, Cathy Liousse, and several other colleagues for valuable discussions. Finally, we are very grateful to the Editor, Philip Stier, who handled this manuscript review process and the anonymous reviewers who devoted their precious time and effort to carefully review the discussion version of this paper and to provide excellent comments and insights that have resulted in a much-improved final version.
Edited by: P. Stier