Towards a satellite formaldehyde – in situ hybrid estimate for organic aerosol abundance

Organic aerosol (OA) is one of the main components of the global particulate burden and intimately links natural and anthropogenic emissions with air quality and climate. It is challenging to accurately represent OA in global models. Direct quantification of global OA abundance is not possible with current remote sensing technology; however, it may be possible to exploit correlations of OA with remotely observable quantities to infer OA spatiotemporal distributions. In particular, formaldehyde (HCHO) and OA share common sources via both primary emissions and secondary production from oxidation of volatile organic compounds (VOCs). Here, we examine OA–HCHO correlations using data from summertime airborne campaigns investigating biogenic (NASA SEAC4RS and DC3), biomass burning (NASA SEAC4RS), and anthropogenic conditions (NOAA CalNex and NASA KORUS-AQ). In situ OA correlates well with HCHO (r = 0.59–0.97), and the slope and intercept of this relationship depend on the chemical regime. For biogenic and anthropogenic regions, the OA–HCHO slopes are higher in low NOx conditions, because HCHO yields are lower and aerosol yields are likely higher. The OA–HCHO slope of wildfires is over 9 times higher than that for biogenic and anthropogenic sources. The OA–HCHO slope is higher for highly polluted anthropogenic sources (e.g., KORUS-AQ) than for less polluted anthropogenic sources (e.g., CalNex). Near-surface OA over the continental US is estimated by combining the observed in situ relationships with HCHO column retrievals from NASA's Ozone Monitoring Instrument (OMI). HCHO vertical profiles used in the OA estimates are taken either from the climatological a priori profiles in the OMI HCHO retrieval or from output for the specific period from a newer version of GEOS-Chem.
Our OA estimates compare well with US EPA IMPROVE data obtained over summer months (e.g., slope = 0.60–0.62, r = 0.56 for August 2013), with correlation performance comparable to intensively validated GEOS-Chem (e.g., slope = 0.57, r = 0.56) with IMPROVE OA and superior to the satellite-derived total aerosol extinction (r = 0.41) with IMPROVE OA. This indicates that OA estimates are not very sensitive to these HCHO vertical profiles and that a priori profiles from OMI HCHO retrieval have a similar performance to that of the newer model version in estimating OA. Improving the detection limit of satellite HCHO and expanding in situ airborne HCHO and OA coverage in future missions will improve the quality and spatiotemporal coverage of our OA estimates, potentially enabling constraints on global OA distribution.

Published by Copernicus Publications on behalf of the European Geosciences Union. J. Liao et al.: Satellite–in situ hybrid estimate for organic aerosol abundance.


Introduction
Aerosols are the largest source of uncertainty in climate radiative forcing (IPCC, 2013; Carslaw et al., 2013) and decrease atmospheric visibility and impact human health (Pope, 2002). Organic aerosols (OAs) comprise a large portion (∼ 50 %) of submicron aerosols (Jimenez et al., 2009; Murphy et al., 2006; Shrivastava et al., 2017), and this fraction will grow with continued decline in SO2 emissions (Attwood et al., 2014; Marais et al., 2017; Ridley et al., 2018). In addition, OAs serve as cloud condensation nuclei (CCN) and affect cloud formation and climate radiative forcing. OA components also have adverse health effects (e.g., Walgraeve et al., 2010) and contribute significantly to regional severe haze events (e.g., Hayes et al., 2013). Finally, because the response of temperature to changes in climate forcing is non-linear (Taylor and Penner, 1994) and the forcing by aerosols has strong regional character (Kiehl and Briegleb, 1993), it is necessary to separate out different climate forcing components to accurately forecast the climate response to changes in forcing.
Despite their importance, it has been challenging to accurately represent OAs in global models. Chemical transport models (CTMs) often underpredict OA (e.g., more than a factor of 2 lower OA near the ground) compared to observations, and model-to-model variability can exceed a factor of 100 in the free troposphere (Tsigaridis et al., 2014; Heald et al., 2008, 2011). Fully explicit mechanisms have attempted to capture the full OA chemical formation mechanisms (e.g., Lee-Taylor et al., 2015), but it is too computationally expensive to apply these mechanisms to OA formation in global CTMs at a useful resolution. For computational efficiency, 3-D models such as GEOS-Chem include direct emissions of primary OA (POA) and represent secondary OA (SOA) formation either by lumping SOA products according to similar hydrocarbon classes or based on the volatility of the oxidation products (Pye et al., 2010). Marais et al. (2016) applied an aqueous-phase mechanism for SOA formation from isoprene in GEOS-Chem to reasonably simulate isoprene SOA in the southeastern (SE) US. Schroder et al. (2018) showed GEOS-Chem has a very large underprediction of SOA in the northeastern US dominated by anthropogenic emissions. Accurate emission inventories are also needed to correctly represent volatile organic compounds (VOCs) and NOx (NOx = NO + NO2) inputs, and these often have biases compared to observational constraints (Kaiser et al., 2018; Travis et al., 2016; Anderson et al., 2014; McDonald et al., 2018).
A quantitative measure of OA from space would be helpful for verifying emissions and aerosol processes in models. However, direct measurements of OA from space are currently unavailable. Aerosol optical depth (AOD) measured by satellite sensors provides a coarse but global picture of total aerosol distributions. The Multi-angle Imaging Spectro-Radiometer (MISR) provides aerosol property information such as size, shape, and absorbing properties, which allows retrieving the AOD of a subset of aerosols (Kahn and Gaitley, 2015). Classification algorithms have been developed to speciate different aerosol types (e.g., OA) based on AOD, extinction Ångström exponent, UV aerosol index, and trace gas columns from satellite instruments (Penning de Vries et al., 2015). Here, we aim to provide a quantitative estimation of OA mass concentrations from satellite measurements.
Formaldehyde (HCHO) is one of the few VOCs that can be directly observed from space. Sources emitting POA (e.g., biomass burning; BB) often simultaneously release VOCs. HCHO and SOA are also both produced from emitted VOCs. VOCs, as well as intermediate- and semi-volatile organic compounds (I/SVOCs), are oxidized by hydroxyl radicals (OH) to form peroxy radicals (RO2), which then react with NO, RO2, or hydroperoxy radicals (HO2) or isomerize. These oxidation processes produce HCHO and oxidized organic compounds with low volatility that condense to form SOA (Robinson et al., 2007; Ziemann and Atkinson, 2012). The yield of HCHO and SOA from hydrocarbon oxidation thus depends on the VOC precursors, oxidants (OH, O3, and NO3), RO2 reaction pathway (e.g., NO levels), and pre-existing aerosol abundance and properties (Pye et al., 2010; Marais et al., 2016, 2017; Xu et al., 2016). Moreover, although the lifetime of HCHO (1-3 h) is shorter than OA (1 week), HCHO continues to form from slower-reacting VOCs, as well as from the oxidation of later-generation products. Observations across megacities around the world show that OA formation in polluted/urban areas happens over about 1 day (e.g., DeCarlo et al., 2010; Hodzic and Jimenez, 2011; Hayes et al., 2013, 2015), and HCHO is also significantly formed over this timescale (Nault et al., 2018). In addition, Veefkind et al. (2011) found that satellite AOD correlated with HCHO over the summertime SE US, BB regions, and southeast Asian industrialized regions. This also suggests that OAs share common emission sources and photochemical processes with HCHO and are a major contributor to AOD in the regions above. Marais et al. (2016) further used the relationship between aircraft OA and satellite HCHO to evaluate the GEOS-Chem representation of SOA mass yields from biogenic isoprene in the SE US.
We present an OA surface mass concentration estimate (OA estimate) derived from a combination of satellite HCHO column observations and in situ OA-HCHO relationships. Because the detection limit of satellite HCHO column observations limits the quality of the OA estimate, we focus our analyses on summertime when HCHO levels are high. The OA estimate is evaluated against OA measurements at ground sites. A 3-D model GEOS-Chem OA simulation is shown for comparison. Figure 1 shows flight tracks with altitudes < 1 km of the field campaigns used in the current study. The Studies of Emissions, Atmospheric Composition, Clouds and Climate Coupling by Regional Surveys (SEAC4RS) mission (Toon et al., 2016; SEAC4RS Science Team, 2013) covered the continental US with a focus on the SE US in August-September 2013. The Deep Convective Clouds and Chemistry Experiment (DC3) (Barth et al., 2015; DC3 Science Team, 2012) surveyed the central and SE US in May-June 2012, targeting isolated deep convective thunderstorms and mesoscale convective systems. The California Research at the Nexus of Air Quality and Climate Change (CalNex) (Ryerson et al., 2013; CalNex Science Team, 2010) investigated the California region in May-June 2010, targeting the Los Angeles (LA) Basin and Central Valley. The Korea-United States Air Quality Study (KORUS-AQ) studied South Korean air quality, sampling many large urban areas in South Korea and continental Asian outflow over the West Sea, in May-June 2016 (KORUS-AQ Science Team, 2016). KORUS-AQ only includes data with longitude < 133° E to exclude the transit from the US because it targeted South Korea and the nearby region. These field campaigns were selected as they had recent high-quality in situ HCHO and OA data measured with state-of-the-art instruments and studied summertime regional tropospheric chemical composition.

In situ airborne observations
In situ airborne HCHO observations were acquired by multiple instruments. The DC3 NASA DC-8 payloads featured two HCHO measurements: the NASA In Situ Airborne Formaldehyde (ISAF) (Cazorla et al., 2015) and the Difference Frequency Generation Absorption Spectrometer (DFGAS) (Weibring et al., 2006). The SEAC4RS NASA DC-8 payloads also featured two HCHO measurements: the NASA ISAF and the Compact Atmospheric Multispecies Spectrometer (CAMS). HCHO measurements from ISAF were found to be in good agreement with CAMS, with a correlation coefficient of 0.99 and a slope of 1.10. HCHO measurements from ISAF also agreed well with DFGAS, with a correlation coefficient of 0.98 and a slope of 1.07. Because ISAF has higher data density, we used ISAF HCHO data for DC3 and SEAC4RS. During KORUS-AQ, CAMS was the only HCHO instrument aboard the DC-8. In CalNex, a proton transfer reaction mass spectrometer (PTR-MS) (Warneke et al., 2011) was used to measure HCHO aboard the NOAA P3 aircraft. In situ airborne OA from SEAC4RS, DC3, and KORUS-AQ was measured by the University of Colorado high-resolution time-of-flight aerosol mass spectrometer (HR-ToF-AMS; DeCarlo et al., 2006; Dunlea et al., 2009; Canagaratna et al., 2007; Jimenez et al., 2016) and in situ airborne OA from CalNex was measured by the NOAA compact time-of-flight aerosol mass spectrometer (Drewnick et al., 2005; Canagaratna et al., 2007; Bahreini et al., 2012). The OA measurements are from 1 min merge data and were converted from µg sm−3 (at 273 K and 1013 mbar) to µg m−3 under local T and P for each data point, to be consistent with HCHO concentrations in µg m−3 or molec cm−3 at local T and P.
Although NO modulates the RO2 lifetime, and thus the production of HCHO and SOA, NO cannot be directly observed via remote sensing. Instead, NO2 can be directly observed from space by satellites, and because NO2 typically represents ∼ 80 % (e.g., SEAC4RS and KORUS-AQ) of the boundary layer NOx concentrations during the daytime, NO2 can be used as a surrogate for daytime NO concentrations and oxidative conditions around the globe. In situ airborne NO2 was measured by the NOAA chemiluminescence NOyO3 instrument (Ryerson et al., 2001) during SEAC4RS, DC3, and CalNex and by the University of California, Berkeley laser-induced fluorescence NO2 instrument (Day et al., 2002) during KORUS-AQ. SEAC4RS isoprene measurements were from the proton-transfer-reaction mass spectrometer (PTR-MS) (Wisthaler et al., 2002).
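The conversion from standard to ambient volume described above follows from the ideal gas law. A minimal sketch is given here, assuming the standard conditions quoted in the text (273 K, 1013 mbar); the function name `std_to_ambient` and its argument names are illustrative and not taken from the campaign data-processing code.

```python
# Standard conditions quoted in the text for the AMS "µg sm^-3" unit.
STD_T_K = 273.0       # K
STD_P_MBAR = 1013.0   # mbar


def std_to_ambient(conc_std, temp_k, pres_mbar):
    """Convert a mass concentration from µg sm^-3 to µg m^-3 at local T and P.

    A parcel that occupies 1 m^3 at standard conditions occupies
    (temp_k / STD_T_K) * (STD_P_MBAR / pres_mbar) m^3 at ambient conditions,
    so the ambient concentration scales by the inverse of that volume ratio.
    """
    if temp_k <= 0 or pres_mbar <= 0:
        raise ValueError("temperature and pressure must be positive")
    return conc_std * (pres_mbar / STD_P_MBAR) * (STD_T_K / temp_k)


if __name__ == "__main__":
    # Hypothetical data point: 10 µg sm^-3 sampled at 298 K and 900 mbar.
    print(std_to_ambient(10.0, 298.0, 900.0))
```

Applying the same function to every 1 min data point with its own measured temperature and pressure puts OA on the same footing as HCHO reported at local T and P.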

Ground-based OA measurements
Ground-based OA measurements over the US were from the EPA Interagency Monitoring of Protected Visual Environments (IMPROVE) (Malm et al., 1994; Solomon et al., 2014; Hand et al., 2013, 2014; Malm et al., 2017) and Southeastern Aerosol Research and Characterization (SEARCH) (Edgerton et al., 2006) networks. In the IMPROVE network, aerosols were collected on quartz fiber filters and analyzed in the lab by thermal optical reflectance for organic and elemental carbon. The data were reported every 3 days from 1988 to 2014. Monthly averages were used for comparison in this study. IMPROVE OA data over the SE US (east of 70° W) in summertime were multiplied by a factor of 1.37 to correct for partial evaporation during filter transport, following the recommendation of a comparison study with SEARCH organic carbon (OC) measurements (Hand et al., 2013). Although IMPROVE OA corrected for evaporation has potential uncertainties with the constant scaling factor, the IMPROVE measurements have high spatial coverage. SEARCH network (Edgerton et al., 2006; Hidy et al., 2014) OC was determined by the difference between total carbon (TC) detected by a tapered element oscillating microbalance (TEOM) and black carbon (BC) measured by an in situ thermal-optical instrument. This allowed real-time measurement of OC and prevented evaporation during filter transport. Although the SEARCH network only has five sites available, we used observations from this network due to their high accuracy. The IMPROVE and SEARCH network OC measurements were converted to OA by multiplying by a factor of 2.1 based on ground and aircraft observations (Pye et al., 2017).
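The two scaling steps described above (OC to OA, and the summertime SE US evaporation correction for IMPROVE filters) are simple multiplications. The sketch below illustrates them; the function name `improve_oa` and the boolean flag are hypothetical, and deciding which sites and months count as "SE US in summertime" is left to the caller.

```python
OC_TO_OA = 2.1                # OA/OC mass ratio quoted in the text
SE_US_EVAP_CORRECTION = 1.37  # IMPROVE summertime SE US evaporation correction


def improve_oa(oc_ug_m3, in_se_us_summer):
    """Convert an IMPROVE OC value (µg m^-3) to OA (µg m^-3).

    The evaporation correction is applied only when the caller flags the
    sample as a summertime SE US sample.
    """
    oa = oc_ug_m3 * OC_TO_OA
    if in_se_us_summer:
        oa *= SE_US_EVAP_CORRECTION
    return oa


if __name__ == "__main__":
    # Hypothetical monthly mean OC of 1.5 µg m^-3 at a summertime SE US site.
    print(improve_oa(1.5, True))
```

SEARCH OC needs only the 2.1 factor, since the text notes that its real-time measurement avoids evaporation during filter transport; calling `improve_oa(oc, False)` covers that case.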

Satellite measurements
Satellite HCHO column observations were derived from NASA's Ozone Monitoring Instrument (OMI), a UV-visible nadir solar backscatter spectrometer on the Aura satellite (Levelt et al., 2006). Aura passes over the Equator at 13:30 LT, daily. Here, we used the OMI HCHO version 2.0 (collection 3) gridded (0.25° × 0.25°) retrieval data (Gonzalez Abad et al., 2015) from the Smithsonian Astrophysical Observatory (SAO). Satellite data for HCHO columns were subjected to data quality filters: (1) solar zenith angle lower than 70°, (2) cloud fraction less than 40 %, and (3) main quality flag and the xtrackquality flag both equal to zero (Harvard-Smithsonian Center for Astrophysics OMI HCHO data product description, 2017). The monthly average HCHO columns were also weighted by the column uncertainties of the pixels. The HCHO retrieval used a priori profiles without aerosol information from the GEOS-Chem model (Gonzalez Abad et al., 2015). Satellite NO2 column observations were also derived from NASA's OMI level 3 data (Lamsal et al., 2014; Krotkov, 2013). Satellite NO2 observations were used to calculate the NOx-related chemical-factor-dependent OA estimate (see Table 2). Satellite AOD observations were acquired from the Moderate Resolution Imaging Spectroradiometer (MODIS) aboard the Aqua satellite, using overpasses at about 13:30 LT. Here, we used Collection 06 (Levy and Hsu, 2015) monthly average data, retrieved using the dark target (DT) and deep blue (DB) algorithms.
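The quality filtering and uncertainty-weighted monthly averaging described above can be sketched as follows. The text does not state the exact weighting formula, so inverse-variance weights (1/σ²) are an assumption here; the function and argument names are illustrative and do not correspond to fields in the OMI product files.

```python
import numpy as np


def filter_pixels(sza_deg, cloud_frac, main_flag, xtrack_flag):
    """Boolean mask for pixels that pass the three quality filters in the text."""
    return (
        (np.asarray(sza_deg) < 70.0)          # (1) solar zenith angle < 70 degrees
        & (np.asarray(cloud_frac) < 0.40)     # (2) cloud fraction < 40 %
        & (np.asarray(main_flag) == 0)        # (3) main quality flag == 0 ...
        & (np.asarray(xtrack_flag) == 0)      #     ... and cross-track flag == 0
    )


def weighted_monthly_mean(columns, uncertainties, keep):
    """Uncertainty-weighted mean of the retained HCHO columns (molec cm^-2).

    Assumption: weights are 1 / sigma^2. Returns NaN if no pixel is retained.
    """
    columns = np.asarray(columns, dtype=float)[keep]
    sigma = np.asarray(uncertainties, dtype=float)[keep]
    if columns.size == 0:
        return float("nan")
    weights = 1.0 / sigma**2
    return float(np.sum(weights * columns) / np.sum(weights))


if __name__ == "__main__":
    # Three hypothetical pixels falling in one grid cell during one month.
    keep = filter_pixels([30.0, 40.0, 80.0], [0.1, 0.2, 0.1], [0, 0, 0], [0, 0, 0])
    print(weighted_monthly_mean([1e16, 2e16, 3e16], [1e15, 2e15, 1e15], keep))
```

In this example the third pixel is rejected by the solar zenith angle filter, and the first pixel dominates the mean because its uncertainty is smaller.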

GEOS-Chem
We used GEOS-Chem (v9-02) at 2° × 2.5° with 47 vertical layers to simulate HCHO and OA globally, the same as that in Marais et al. (2016). GEOS-Chem was driven with meteorological fields from the NASA Global Modeling and Assimilation Office (GMAO). The OA simulation included POA from fires and anthropogenic activity and SOA from the volatility-based reversible partitioning scheme (VBS) of Pye et al. (2010) for anthropogenic, fire, and monoterpene sources, and an irreversible aqueous-phase reactive uptake mechanism for isoprene. The aqueous-phase mechanism was coupled to gas-phase isoprene chemistry and has been extensively validated using surface and aircraft observations of isoprene SOA components in the SE US. This model version used the fourth-generation Global Fire Emissions Database (GFED4) (Giglio et al., 2013) as a BB emission inventory. The model was driven with Goddard Earth Observing System - Forward Processing (GEOS-FP) meteorology for 2013 and sampled along the SEAC4RS (2013) and KORUS-AQ (2016) flight tracks. The model was also run with a 10 % decrease in biomass burning, biogenic, or anthropogenic emissions as a sensitivity test to evaluate the contributions of different sources to the OA and HCHO budget. Model monthly mean surface layer OA and total column formaldehyde were obtained around the OMI overpass time (12:00-15:00 LT) for 2008-2013 using Modern-Era Retrospective analysis for Research and Applications (MERRA) (Gelaro et al., 2017) meteorology, as GEOS-FP was only available from 2012. This was compared to the OA estimate derived from satellite HCHO.
Global isoprene emissions from the Model of Emissions of Gases and Aerosols from Nature version 2.1 (MEGAN) (Guenther et al., 2006) and satellite NO2 column data were used to calculate an isoprene- and NO2-dependent OA estimate (see Table 2). Global isoprene emissions from MEGAN were implemented in GEOS-Chem and driven with MERRA (MEGAN-MERRA).

Estimation of surface organic aerosol mass concentrations
An estimate for surface OA mass concentration was calculated based on a simple linear transformation:

ε(i) = α(i) η(i) Ω_HCHO(i) + β(i)

Here, ε(i) is the OA estimate for grid cell i (µg m−3), Ω_HCHO(i) is the OMI HCHO column density (molec cm−2) in each 0.25° × 0.25° grid cell (similar resolution to OMI HCHO nadir pixel data), η(i) is the ratio of midday surface layer (∼ 60 m) HCHO concentrations (molec cm−3) to column concentrations (molec cm−2) from GEOS-Chem, and α(i) and β(i) are the slope and intercept of a linear regression between OA and HCHO from low-altitude (< 1 km) airborne in situ measurements. The in situ to column conversion factor η(i) was similar to that used by Zhu et al. (2017) to convert HCHO columns into surface concentrations. η(i) was derived either from the HCHO a priori profiles used in SAO OMI air mass factor (AMF) calculations (GEOS-Chem v9-01-03 climatology) or from GEOS-Chem v9-02, which included an updated isoprene scheme for OA and is the successor to the model version (v9-01-03) that supplies the a priori profiles used in SAO satellite HCHO retrievals. HCHO a priori profiles were used to be consistent with satellite HCHO retrievals and also to show that the OA estimate can be derived without running a global model separately. The newer version of GEOS-Chem was used to test the sensitivity of OA estimates to the updated version of η. The newer version of GEOS-Chem also allows sampling along the flight tracks of a recent field campaign (SEAC4RS) and examining the factors impacting η with both modeled and measured HCHO profiles. Detailed information about the impact of HCHO profiles on η is provided in Sect. 5.
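The linear transformation defined by these quantities, OA estimate = slope × (η × column) + intercept, can be sketched per grid cell as below. The function name and the example numbers are hypothetical; the units follow the definitions in the text, with the slope taken in pg molec−1 so that slope × (molec cm−3) gives pg cm−3, which is numerically equal to µg m−3.

```python
def oa_estimate(hcho_column, eta, slope, intercept):
    """Surface OA estimate (µg m^-3) for one grid cell.

    hcho_column : OMI HCHO column density (molec cm^-2)
    eta         : surface-to-column HCHO ratio (cm^-1)
    slope       : in situ OA-HCHO regression slope (pg molec^-1)
    intercept   : in situ OA-HCHO regression intercept (µg m^-3)
    """
    surface_hcho = hcho_column * eta       # molec cm^-3
    # 1 pg cm^-3 = 1e-12 g / 1e-6 m^3 = 1 µg m^-3, so no extra factor is needed.
    return slope * surface_hcho + intercept


if __name__ == "__main__":
    # Hypothetical values chosen only to illustrate the arithmetic.
    print(oa_estimate(hcho_column=1.0e16, eta=1.0e-6, slope=1.0e-10, intercept=0.5))
```

Because the function uses only multiplication and addition, it also works unchanged on NumPy arrays holding gridded columns, η fields, and region-specific slopes and intercepts.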

Aerosol extinction from satellite measurements
Currently, remote sensing techniques observe aerosols by quantifying AOD. The MISR satellite instrument can estimate a subset of AOD, using constraints on size range, shape, and absorbing properties, but it cannot distinguish OA from other submicron aerosol compounds such as sulfate and nitrate and also requires AOD to be above 0.1. Because MISR estimates a subset of AOD, it is discussed here to verify that we are not neglecting a satellite dataset that has already captured OA AOD. Moreover, OAs account for a large and relatively constant fraction of submicron aerosols in the SE US (Wagner et al., 2015) and are one of the major submicron aerosol components over the US (Jimenez et al., 2009). Therefore, AOD was converted to extinction to represent OA for comparison by multiplying AOD by a surface-to-column scaling factor:

A_ext = AOD × (surface OA / column OA) × 10^6 m Mm−1

where A_ext is the calculated aerosol extinction (Mm−1) and the scaling factor is the ratio of surface layer OA concentrations (µg m−3, at ambient T and P) to column OA concentrations (µg m−2) from GEOS-Chem multiplied by 10^6 m Mm−1. The shape of the average vertical profile of OA (OA fraction: 0.54-0.7) was close to that of total aerosol mass over the SE US, where a large fraction of the enhanced non-BB aerosol concentrations in summertime over the US are located. Data with BB plume interferences were excluded in the following analysis. The potential contribution of dust and nitrate could alter the shape of the vertical profiles and introduce uncertainties when using OA vertical profiles for other parts of the US. However, the outliers in the aerosol extinction compared to ground OA measurements (see Sect. 6.3) were not located outside of the SE US. Similar vertical profile shapes of OA and submicron particles were also observed in a campaign outside the US over South Korea (Nault et al., 2018). Although OA accounted for ∼ 40 % of the total submicron particles, the shapes of the OA and total submicron particle vertical profiles were nearly identical.
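The AOD-to-extinction conversion described above scales the dimensionless AOD by the modeled surface-to-column OA ratio. The sketch below shows the arithmetic and the unit bookkeeping; the function name and the example inputs are hypothetical and are not GEOS-Chem output.

```python
M_PER_MM = 1.0e6  # metres per megametre, converts m^-1 to Mm^-1


def aod_to_extinction(aod, surface_oa, column_oa):
    """Approximate surface aerosol extinction (Mm^-1) from column AOD.

    aod        : aerosol optical depth (dimensionless)
    surface_oa : modeled surface layer OA concentration (µg m^-3)
    column_oa  : modeled column OA (µg m^-2)
    """
    if column_oa <= 0:
        raise ValueError("column_oa must be positive")
    ratio_per_m = surface_oa / column_oa   # (µg m^-3) / (µg m^-2) = m^-1
    return aod * ratio_per_m * M_PER_MM


if __name__ == "__main__":
    # Hypothetical case: AOD of 0.2, 5 µg m^-3 at the surface,
    # and a column of 10 000 µg m^-2 (equivalent to a 2 km uniform layer).
    print(aod_to_extinction(0.2, 5.0, 1.0e4))
```

The sketch relies on the point made in the text that the OA vertical profile shape is close to that of total aerosol mass, which is what allows an OA-based ratio to be applied to total AOD.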

In situ OA-HCHO relationship
Although OA and HCHO share common VOC emission sources and photochemical processes, their production rates from different emission sources and photochemical conditions vary, as do their loss rates. We identified the main factors that modulate OA-HCHO relationships in the in situ measurements and discuss them in the following sections.

Regional and source-driven variability
For all regions and/or sources investigated, near-surface in situ OA and HCHO are well correlated. A scatter plot of in situ OA vs. HCHO at low altitudes (< 1 km) from a number of field campaigns (SEAC4RS, DC3, CalNex, and KORUS-AQ) is displayed in Fig. 2. The slopes, intercepts, and correlation coefficients are summarized in Table 1. SEAC4RS, DC3, and CalNex excluded BB data when acetonitrile was > 200 pptv (Hudson et al., 2004). KORUS-AQ used a BB filter with higher acetonitrile (> 500 pptv) because the air masses with moderate acetonitrile enhancement (200-500 pptv) were actually from anthropogenic emissions. This attribution is based on high levels of acetonitrile detected downwind of Seoul and west coastal petrochemical facilities, the slope between acetonitrile and CO being similar to that of urban emissions (Warneke et al., 2006), and the concentrations of anthropogenic tracer CHCl3 being high (Warneke et al., 2006). Similar to acetonitrile, another common BB tracer, hydrogen cyanide (HCN), was also enhanced in these air masses. BB data (acetonitrile > 200 pptv) for SEAC4RS were analyzed separately and are in the inset in Fig. 2. Although all CalNex data had a tight correlation, we only included the flight data near the LA Basin to target the area strongly influenced by anthropogenic emissions. In general, the correlation coefficients between in situ OA and HCHO were strong (r = 0.59-0.97) (Table 1).
The variety in OA-HCHO regression coefficients among different campaigns reflects the regional and/or source-driven OA-HCHO variability. Considering only the non-biomass burning (non-BB) air masses sampled, OA and HCHO had the tightest correlation for CalNex, because CalNex focused on the LA area (shown in Fig. 2) and Central Valley, while SEAC4RS and DC3 covered a larger area with a potentially larger variety of sources and chemical conditions. Although SEAC4RS and DC3 both sampled the continental US, SEAC4RS had more spatial coverage and sampled more air masses at low altitudes, while DC3 was designed to sample convective outflow air masses and had more data at high altitudes. Although KORUS-AQ covered a much smaller area compared to SEAC4RS, KORUS-AQ data also had a large spread, which may be due to the complicated South Korean anthropogenic sources mixed with transported air masses (e.g., from China) and maybe biogenic sources. OA exhibits a tight correlation with HCHO for both wildfires and agricultural fires during SEAC4RS. This is because the production of HCHO and OA is much higher in BB air masses compared to background. This may also suggest that the emissions of OA and HCHO in these air masses are relatively constant. SEAC4RS data are chosen because the campaign sampled fires and had state-of-the-art, high-quality measurements. More intensive fire sampling is needed to probe the correlation between OA and HCHO across fuel types and environmental conditions.

Table 1 footnotes: a The unit of the slope is g g−1. b The unit of the slope is pg molec−1. c The unit of the intercept is µg m−3. The uncertainties are 1 standard deviation.

The different slopes of OA-HCHO among different campaigns also reflect the regional or source-driven OA-HCHO variability. Among the BB, anthropogenic, and biogenic sources, the slopes of OA vs. HCHO for BB air masses were the highest. This is consistent with high POA emission in BB conditions (Lamarque et al., 2010; Cubison et al., 2011), with low addition of mass due to SOA formation (Shrivastava et al., 2017). The slope of OA to HCHO was higher for wildfires than for agricultural fires during SEAC4RS, though data were limited (see Table 1). This is consistent with more OA emitted in wildfires than agricultural fires. The factors driving higher OA to HCHO with wildfires are not clear and may be related to burning conditions and fuels.
For the non-BB sources, the slope of OA vs. HCHO was highest for South Korea (KORUS-AQ), which is dominated by heavily polluted anthropogenic sources. During KORUS-AQ, the high OA to HCHO air masses also had high acetonitrile. By the time we sampled, most organic aerosols were secondary (Nault et al., 2018). This indicates that the formation rates of OA and HCHO from different emission sources contribute to the different slopes of OA-HCHO. This also indicates that emission sources with enhanced acetonitrile tend to form more OA relative to HCHO downwind. The slope of OA-HCHO for the LA Basin (California), dominated by relatively clean anthropogenic emissions, was much lower than that for South Korea. The potential difference in the anthropogenic emissions mix could contribute to the different OA-HCHO slopes from the US LA region and South Korean anthropogenic sources (Baker et al., 2008; Na et al., 2002, 2005). The slopes of OA vs. HCHO of SEAC4RS and DC3, dominated by biogenic emissions in the SE US, were in between those of heavily polluted (KORUS-AQ) and clean anthropogenic sources (CalNex). As SEAC4RS had the largest geographic coverage for low-altitude data over the US, the campaign average slope of OA vs. HCHO was used to represent the US region in summer. CalNex LA Basin data were used to represent large cities as case studies.
Overall, the source-dependent OA-HCHO relationships (Fig. 2) showed higher OA-HCHO slopes for BB and heavily polluted anthropogenic sources with inefficient combustion (e.g., KORUS-AQ) compared to biogenic and relatively clean anthropogenic sources. This indicated that inefficient combustion contributes to the high slopes of OA-HCHO, probably due to both enhanced primary OA and increased formation of SOA. Enhanced pre-existing aerosols such as primary aerosols can provide more surfaces to increase VOC condensation and SOA formation. VOCs co-emitted from heavily polluted anthropogenic sources can also form more SOA. It is possible to extract the factors that govern the different OA-HCHO relationships and potentially have a universal application of the slopes as a function of the factors (e.g., sources and combustion efficiencies).
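The screening and regression steps described above (low-altitude data only, BB exclusion by an acetonitrile threshold, then a linear fit of OA against HCHO) can be sketched as follows. Ordinary least squares is an assumption here, since the text does not specify the fitting method; the function name, argument names, and synthetic data are all illustrative.

```python
import numpy as np


def oa_hcho_regression(oa, hcho, acetonitrile_pptv, altitude_km,
                       bb_threshold_pptv=200.0, max_altitude_km=1.0):
    """Fit OA = slope * HCHO + intercept on low-altitude, non-BB data.

    Points with acetonitrile above bb_threshold_pptv are treated as biomass
    burning and excluded (200 pptv for the US campaigns in the text,
    500 pptv for KORUS-AQ). Returns (slope, intercept, r).
    """
    oa = np.asarray(oa, dtype=float)
    hcho = np.asarray(hcho, dtype=float)
    keep = (
        (np.asarray(altitude_km, dtype=float) < max_altitude_km)
        & (np.asarray(acetonitrile_pptv, dtype=float) <= bb_threshold_pptv)
        & np.isfinite(oa)
        & np.isfinite(hcho)
    )
    slope, intercept = np.polyfit(hcho[keep], oa[keep], 1)  # assumed OLS fit
    r = np.corrcoef(hcho[keep], oa[keep])[0, 1]
    return float(slope), float(intercept), float(r)


if __name__ == "__main__":
    # Synthetic example: four clean low-altitude points on OA = 2*HCHO + 1,
    # one BB-influenced point and one high-altitude point that get screened out.
    hcho = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    oa = [3.0, 5.0, 7.0, 9.0, 60.0, 0.5]
    acn = [100.0, 120.0, 150.0, 180.0, 900.0, 100.0]
    alt = [0.3, 0.5, 0.7, 0.9, 0.5, 3.0]
    print(oa_hcho_regression(oa, hcho, acn, alt))
```

Passing `bb_threshold_pptv=500.0` reproduces the looser KORUS-AQ screen, and inverting the acetonitrile condition would select the BB subset that the text analyzes separately.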

Dependence on NO x and VOC speciation
Biogenic and anthropogenic VOCs are oxidized by atmospheric oxidants (e.g., OH as the dominant oxidant) to form RO2. HCHO is produced from the reactions of RO2 with HO2 or NO, with RO2 + NO typically producing more HCHO than RO2 + HO2 (e.g., Wolfe et al., 2016). RO2 can react with HO2 or NO, or isomerize to form oxidized organic compounds with high molecular weight and low volatility, which condense on existing particles to form SOA. The products of RO2 + NO tend to fragment instead of functionalize and often lead to higher volatility compounds (e.g., HCHO) and thus less SOA formation compared to the products of RO2 + HO2 (Worton et al., 2013). Therefore, with the same VOC, we expect more HCHO and less OA formed at high NO conditions, and vice versa. As mentioned before, NO2 instead of NO is easily measured from space and NO2 typically is ∼ 80 % of NOx in the boundary layer during the day. Therefore, NO2 is used as a surrogate for the NO levels influencing OA and HCHO production. The yields of HCHO and SOA also depend on VOC speciation (e.g., Lee et al., 2006; Bianchi et al., 2016). Specifically, isoprene has a higher yield of HCHO than most non-alkene VOCs (Dufour et al., 2009).
A scatter plot of OA vs. HCHO for SEAC4RS low-altitude data is shown in Fig. 3a. The data are color coded by the product of in situ isoprene and NO2, attempting to capture time periods strongly influenced by oxidation products of isoprene at high NO conditions. No trends are evident when the data are instead color coded by NO2 or isoprene only. This may be because isoprene (biogenic source) and NO2 (anthropogenic sources) are generally not co-located in the US and isoprene is the dominant source of HCHO compared to anthropogenic VOCs in the US (e.g., Millet et al., 2008). This plot shows that, at high NO2 and high isoprene conditions, less OA was generally formed for each HCHO produced. The correlation coefficient of 0.45 for high NO2 and isoprene conditions during SEAC4RS is not very high but still shows significant dependence of the OA-HCHO relationship on the product of NO2 and isoprene, considering that these are ambient data and other factors (e.g., different specific sources) also play a role in determining OA-HCHO relationships. This is consistent with high NO and isoprene conditions promoting HCHO formation over SOA formation. We also looked at the dependence on peroxy acetyl nitrate (PAN), as PAN is a product of the photooxidation of VOCs, including isoprene, in the presence of NO2. The dependence on PAN was not as clear as on the product of NO2 and isoprene.
KORUS-AQ OA vs. HCHO, color coded with NO2, is plotted in Fig. 3b. The OA-HCHO ratio clearly decreased as NO2 levels increased during KORUS-AQ, suggesting that high NO conditions accelerated HCHO formation more than they did SOA production. OA-HCHO relationships do not depend on local time of day (not shown). This further confirms that NOx is an important factor that affects the OA-HCHO relationship. Compared to SEAC4RS, the KORUS-AQ OA-HCHO ratio does not depend on VOCs. This may be consistent with the dominant VOCs being anthropogenic VOCs that are co-located with NO sources. This may also suggest that the anthropogenic VOCs generally have a lower HCHO yield than isoprene does. Because OA and HCHO were tightly correlated during CalNex and DC3, we did not parse for NOx. The NOx range in the DC3 low-altitude data was smaller than in KORUS-AQ and SEAC4RS. DC3 OA-HCHO relationships only had a slight dependence on NO2 (not shown here), largely due to the limited dataset.
The NOx range in the CalNex low-altitude data was large. The OA and HCHO correlation during CalNex was very tight and the slope of OA-HCHO did not show clear dependence on NOx, which could be due to the combination of different VOC sources and NOx levels.

GEOS-Chem
In situ OA-HCHO relationships from SEAC 4 RS lowaltitude non-BB (Fig. 4a), KORUS-AQ low-altitude (Fig. 4b), and SEAC 4 RS BB (Fig. 4c) air masses were compared to GEOS-Chem model simulations (Fig. 4d-f) sampling along the corresponding flight tracks. Similar to the in situ data, GEOS-Chem model simulations also found correlations between OA and HCHO for these three regions, especially for SEAC 4 RS non-BB. GEOS-Chem was intensively validated with in situ measurements for the SE US (e.g., Marais et al., 2016;Kim et al., 2015). The ratios of the slopes between OA and HCHO for the US (SEAC 4 RS), South Korea (KORUS-AQ), and wildfire cases (SEAC 4 RS) from GEOS-Chem were 1 : 1.1 : 0.4, which was different from the in situ measurements of 1 : 1.4 : 13 (Table 1). GEOS-Chem could not capture any wildfires in the US during SEAC 4 RS, which is probably due to poor representation of the BB emission inventory for US wildfires and also the coarse grid in GEOS-Chem. GEOS-Chem also significantly underpredicted the slope of OA to HCHO for South Korea. We attribute this to a likely underprediction of anthropogenic SOA, which was dominant in South Korea, in GEOS-Chem (Schroder et al., 2018), as well as a different mix of OA and HCHO sources in the US compared to South Korea and representation of these in GEOS-Chem. Although GEOS-Chem contains isoprene chemistry with a focus on the SE US , there is still room to improve the GEOS-Chem model especially for anthropogenic and BB sources, as well as anthropogenic OA formation mechanisms. For example, in the model, biogenic sources are more important than anthropogenic sources for the OA and HCHO budgets in South Korea, which is not the case from KORUS-AQ in situ measurements. In the model, a 10 % decrease of emissions from biogenic, anthropogenic, and BB sources results in 6 %, 3 %, and 1 % decreases in OA and 2 %, 1 %, and 0 % decreases in HCHO over South Korea in May 2016. 
However, the in situ airborne field campaign KORUS-AQ found that OA and HCHO were higher near anthropogenic emission sources than in rural regions. The larger impact of biogenic sources compared to anthropogenic sources on OA and HCHO in the model may be due to both low-biased anthropogenic emission inventories and low-biased anthropogenic SOA. Improving anthropogenic emission inventories in the models can bring model results closer to observations. Improving the representation of anthropogenic SOA in GEOS-Chem, for example by implementing the SIMPLE model (Hodzic and Jimenez, 2011), can also bring the model results closer to observations. Measurements or measurement-constrained estimates with sufficient spatial and temporal coverage can help to narrow down the key factors (e.g., emission inventories or chemical schemes) that GEOS-Chem needs to better represent VOCs and OA globally. Furthermore, we also found that GEOS-Chem could not capture the observed higher slope of OA to HCHO at high altitudes (not shown), which could be due to issues such as transport, OA lifetime, and OA production.

Relating satellite HCHO column to surface HCHO concentrations
To utilize the derived in situ OA-HCHO relationship, the satellite HCHO columns need to be converted to surface HCHO concentrations. We used a vertical distribution factor η (cm−1) (Sect. 2.5), which is defined as the ratio of the surface HCHO concentration (molec cm−3) to the HCHO column (molec cm−2), to estimate surface HCHO concentrations from satellite column measurements. Zhu et al. (2017) used the same vertical distribution factor for their study. The use of this factor is justified by the fact that the derived surface HCHO retained the spatial pattern of the satellite HCHO column and agreed with local surface measurements of HCHO for a multi-year average (Zhu et al., 2017). We also investigated the main factors affecting the variation of the vertical distribution factor η. Because the factor is determined by HCHO vertical distributions, we examined three typical normalized HCHO vertical distribution profiles with the highest, median, and lowest η values for the SEAC4RS field campaign (Fig. 5). Because the sensitivity of OA estimates to η was investigated with η from different GEOS-Chem versions (Sect. 6.2), we did not compare HCHO vertical profiles from the model to the measurements from a comprehensive set of field campaigns. We chose SEAC4RS to illustrate the main factors impacting η over the US because SEAC4RS had a larger spatial coverage than DC3 and CalNex. GEOS-Chem can generally capture the vertical profiles of measured HCHO. Boundary layer mixing height and surface emission strength are the dominant factors in determining the fraction of HCHO near the surface. Higher boundary layer mixing height results in lower η for SE US profiles, where there are biogenic sources of HCHO from the surface and HCHO has distinct concentration differences below and above the boundary layer. However, there are exceptions, such as for the profiles over the ocean and the coastal regions. Although the boundary layer is shallow in these regions, a large portion of HCHO resides above the boundary layer, resulting in low η. In these cases, surface emissions of HCHO or precursors are very small, and therefore methane oxidation makes a large contribution to the total HCHO column. High concentrations of HCHO (e.g., in BB plumes) lofted by convection can also impact the vertical profile (Barth et al., 2015), which is not further investigated because OA estimates with BB influences over the US are excluded in the current study. Overall, the source intensities and boundary layer mixing height mostly determined the HCHO vertical profiles.

Figure 5. Three typical vertical profiles of the ratio of in situ HCHO concentrations (molec cm−3) to integrated HCHO column from the SEAC4RS flight track. These three profiles were located at the Kansas-Oklahoma border (red), Arkansas-Tennessee border (black), and Gulf of Mexico (blue). Solid curves were from GEOS-Chem results and the dashed ones were from ISAF measurements. HCHO columns were integrated HCHO concentrations of these vertical profiles extrapolated from 0 to 10 km, assuming the HCHO values below and above the measured HCHO vertical profiles were the same as the HCHO at the lowest and highest altitudes sampled, respectively. The boundary layer heights (BLHs) of these three profiles are plotted by the shaded areas.

3) and surface isoprene emission rate (Sect. 2.4) was above the threshold of 5 × 10^27 molec cm−2 atom C cm−2 s−1, the slope and intercept from SEAC4RS high isoprene and NO2 conditions were used. When the product of NO2 column and isoprene emission was below that threshold, the slope and intercept from SEAC4RS low isoprene and NO2 conditions were used. The threshold of "isoprene × NO2" was determined by its mean value over the SE US (32-35° N, 83-96° W). Large urban cities were categorized with high NO2 vertical columns (> 4 × 10^15 molec cm−2) (Tong et al., 2015) based on the satellite NO2 levels over LA.
Isoprene emissions instead of concentrations were used because global models use the isoprene emission inventory to simulate isoprene concentrations and the isoprene emission inventory is easier to access. Since isoprene has a short lifetime of up to a few hours (Guenther et al., 2006), the emissions have a similar spatiotemporal distribution to the concentrations.
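As an illustration of the vertical distribution factor η described earlier in this section, the following minimal Python sketch computes η from a single HCHO profile. It follows the Fig. 5 caption in holding the concentration constant below the lowest and above the highest sampled altitude and integrating from 0 to 10 km. The function names and the trapezoidal integration are assumptions made for this example, not the authors' implementation.

```python
def hcho_column(alt_km, conc, z_bottom_km=0.0, z_top_km=10.0):
    """Integrate an HCHO profile (molec cm^-3) to a column (molec cm^-2).

    alt_km must be sorted in ascending order and lie within the
    integration limits. Trapezoidal integration is an assumption.
    """
    # Pad the profile to the integration limits with constant values.
    z = [z_bottom_km] + list(alt_km) + [z_top_km]
    c = [conc[0]] + list(conc) + [conc[-1]]
    km_to_cm = 1.0e5
    column = 0.0
    for i in range(len(z) - 1):
        dz_cm = (z[i + 1] - z[i]) * km_to_cm
        column += 0.5 * (c[i] + c[i + 1]) * dz_cm
    return column


def eta(alt_km, conc):
    """Vertical distribution factor (cm^-1): lowest-level concentration / column."""
    return conc[0] / hcho_column(alt_km, conc)


# Hypothetical profiles: a well-mixed column and a surface-enhanced one.
uniform = eta([0.5, 8.0], [1.0e10, 1.0e10])
surface_enhanced = eta([0.3, 1.5, 3.0, 8.0], [5.0e10, 4.0e10, 5.0e9, 1.0e9])
```

In this sketch the surface-enhanced profile gives a larger η than the uniform one, which is the qualitative behavior described for profiles with strong surface sources.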
6 Construction of the OA estimate

6.1 Variables to construct OA estimate

As mentioned in Sect. 2.5, the OA estimate in each grid cell was calculated from the monthly average satellite HCHO column observation using the linear Eq. (1). Satellite monthly average HCHO column data were converted to surface HCHO concentrations by multiplying by the η(i) factor from either climatology a priori profiles or monthly average HCHO profiles. Surface OA was then estimated by multiplying the derived surface HCHO concentrations by the slope α(i) and adding the intercept β(i). The slope α(i) and intercept β(i) were determined from the linear regression of in situ OA and HCHO from aircraft field campaign data. The relationship between OA and HCHO varies, but previous sections demonstrated that we can quantify the surface OA-HCHO relationship by region, source, and chemical conditions (e.g., NOx and isoprene levels). To test the impact of the chosen OA-HCHO relationship on the calculated OA estimate, the OA estimate in the US was calculated using four different methods (see Table 2). The OA estimate was calculated on the monthly timescale, largely because the OA estimate is based on OMI HCHO observations, and an uncertainty-weighted average over a timescale of about 1 month (Gonzalo et al., 2015; Zhu et al., 2016) is needed to reduce the noise in daily OMI HCHO data. With improved satellite HCHO data from the Tropospheric Monitoring Instrument (TROPOMI), higher time resolution (e.g., weekly average) HCHO data could be useful to estimate OA in the future.
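The per-grid-cell calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the inverse-variance weighting is one common choice for an uncertainty-weighted average and is assumed here rather than taken from the text, and all function names and numerical values are hypothetical placeholders, not values from Table 1 or Table 2.

```python
def weighted_monthly_mean(daily_columns, daily_sigmas):
    """Uncertainty-weighted mean of daily HCHO columns.

    Inverse-variance weights (1 / sigma^2) are an assumption of this sketch.
    """
    weights = [1.0 / (s * s) for s in daily_sigmas]
    total = sum(w * c for w, c in zip(weights, daily_columns))
    return total / sum(weights)


def oa_estimate(column, eta, alpha, beta):
    """Linear estimate as described in the text.

    column : monthly mean HCHO column (molec cm^-2)
    eta    : vertical distribution factor (cm^-1)
    alpha  : OA-HCHO slope, beta : OA-HCHO intercept (in situ regression)
    """
    surface_hcho = eta * column          # surface HCHO (molec cm^-3)
    return alpha * surface_hcho + beta   # surface OA in the units of beta


# Hypothetical placeholder inputs for one grid cell.
monthly_column = weighted_monthly_mean([0.9e16, 1.2e16, 1.0e16],
                                       [0.5e16, 1.0e16, 0.5e16])
oa = oa_estimate(monthly_column, eta=1.0e-6, alpha=1.0e-10, beta=0.5)
```

Keeping the column-to-surface conversion and the regression step separate makes it straightforward to swap in η from a different model version, as done in Sect. 6.2.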

OA estimate over the US
The monthly average surface OA estimates over the US in August 2013 using the SEAC4RS lump-sum slope and intercept (see Table 2) with different η are shown in Fig. 6a and b. Because BB regions in the US are not covered by smoke continuously and it is challenging for satellite retrievals to separate thick BB plumes from clouds without information on the time and location of the burning, thick BB events (OMI UV aerosol index (UVAI) > 1.6) (Torres et al., 2007) were excluded and are shown as blank (white) grid cells in Fig. 6a and b. The same filter was also applied to aerosol extinction and GEOS-Chem OA abundance. To evaluate how representative the OA estimate is, OA estimate data were compared to the EPA IMPROVE ground sites' corrected-OA measurements over the US and the SEARCH ground sites' OA measurements in the SE US (Sect. 2.2). The locations of IMPROVE and SEARCH sites are displayed in Fig. 6e as small and large dots, respectively. The dot color represents the average OA mass concentrations for August 2013. Considering the uncertainties in satellite HCHO measurements, in using the campaign lump-sum OA-HCHO relationship to represent spatially resolved OA, in HCHO vertical profiles, and in ground IMPROVE network measurements, the correlation (correlation coefficient r = 0.56) between the OA estimate and corrected IMPROVE network measurements (Fig. 6f and g) is reasonably good and indicates that the OA estimate can generally capture the variation of OA loading over the US. First, the correlation coefficient between HCHO SAO retrievals and in situ measurements during SEAC4RS was not high (r = 0.24), but this may be partly because they were not sampled at the same time. The uncertainty in HCHO SAO data was likely less than 76 %.
Second, applying a campaign lump-sum OA-HCHO relationship to individual spatially resolved satellite HCHO data to estimate OA induced an uncertainty of 41 % according to the correlation coefficient of OA-HCHO in the field campaign. Third, η in the Fig. 6a OA estimate was from GEOS-Chem v9-02 output for the specific month of August 2013. η in the Fig. 6b OA estimate was from GEOS-Chem v9-01-03 climatology, the same as the satellite data a priori profiles. The good correlations of both OA estimates with IMPROVE OA indicate that OA estimates are not very sensitive to η from different model versions. The largest difference between the two OA estimates is in their concentrations over east Texas. There are no IMPROVE OA measurements in east Texas to evaluate which estimate performs better. Fourth, the uncertainties in IMPROVE OA measurements, such as using a constant correction factor to correct the partial evaporation across all SE US sites, and the spatially dependent OA/OC ratio (Tsigaridis et al., 2014), may also have contributed to the discrepancies between the OA estimate and EPA IMPROVE sites' OA. Therefore, higher-quality satellite HCHO data and refined OA-HCHO relationships will help improve our OA estimate products. These, combined with a spatially resolved IMPROVE OA correction factor and OA/OC ratios, will help improve the correlation coefficients between OA estimates and IMPROVE OA.
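For readers who wish to reproduce this type of comparison, the slope and correlation coefficient between paired site values can be obtained as sketched below. Ordinary least squares is assumed here for illustration; the regression method actually used for Fig. 6 is not stated in this section. The function name and the site values are hypothetical.

```python
import math


def slope_intercept_r(x, y):
    """Ordinary least-squares slope, intercept, and Pearson r of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Hypothetical paired values: ground-site OA (x) and OA estimate (y).
ground_oa = [1.0, 2.0, 3.0, 4.0]
estimated_oa = [0.6, 1.2, 1.8, 2.4]
slope, intercept, r = slope_intercept_r(ground_oa, estimated_oa)
```

A slope below 1 with a high r, as in this synthetic example, corresponds to an estimate that tracks the spatial variation of the ground measurements while being biased low.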
The linear regression between the OA estimate and IMPROVE OA measurements yielded a slope of 0.62 or 0.60, indicating that the OA estimate underestimated OA. First, the different data collection times for satellite data, in situ measurements, and ground observations could contribute to the bias. Satellite HCHO data were measured at midday, in situ airborne OA and HCHO were measured during the daytime, and IMPROVE network organic carbon was collected day and night. Because ground OA in the SE US was observed to have little diurnal variation (Xu et al., 2015; Hu et al., 2015), the different sampling times of ground and airborne OA probably do not have a significant impact on the comparison of the OA estimate and IMPROVE OA. Surface HCHO has an evident diurnal profile with the highest concentrations around midday, which could add uncertainty to the OA estimate when satellite HCHO data measured at midday are combined with in situ airborne OA-HCHO relationships measured throughout the daytime. The SEAC4RS HCHO concentrations were converted to 13:30 LT concentrations according to the average HCHO diurnal profile from the Southern Oxidant and Aerosol Study (SOAS). The OA-HCHO relationship with HCHO converted to 13:30 LT yielded a slope 5 % lower than the original OA-HCHO relationship. Second, the potential uncertainty (±30 %) in the OA/OC ratio could also contribute to the systematic difference because we used an OA/OC of 2.1, and studies (e.g., Pye et al., 2017; Canagaratna et al., 2015) showed that the OA/OC ratio can range from 1.4 to 2.8. Third, the potential underestimation of HCHO from the satellite retrieval (by −37 %) compared to SEAC4RS may be one of the most important reasons for the systematic difference (low slope) between the OA estimate and IMPROVE OA according to Eq. (1). Correcting the satellite HCHO data for this low bias (−37 %) will increase our slopes of 0.60-0.62 to be close to unity.
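The arithmetic behind the last two points can be checked with a short sketch. It assumes that the OA estimate scales proportionally with the satellite HCHO column, i.e., that the intercept term in Eq. (1) contributes little; this simplification is made here for illustration only. The function names are hypothetical.

```python
def corrected_slope(slope, relative_bias):
    """Rescale a regression slope for a multiplicative bias in satellite HCHO.

    relative_bias = -0.37 means the retrieval is 37 % low. Assumes the
    OA estimate is proportional to HCHO (intercept neglected).
    """
    return slope / (1.0 + relative_bias)


def oa_oc_range(ratio, relative_uncertainty):
    """Lower and upper OA/OC values for a symmetric relative uncertainty."""
    return ratio * (1.0 - relative_uncertainty), ratio * (1.0 + relative_uncertainty)


for s in (0.60, 0.62):
    print(round(corrected_slope(s, -0.37), 2))   # prints 0.95, then 0.98

low, high = oa_oc_range(2.1, 0.30)
print(round(low, 2), round(high, 2))             # prints 1.47 2.73
```

Under this simplification, removing a 37 % low bias raises the slopes to roughly 0.95-0.98, and a ±30 % uncertainty around an OA/OC of 2.1 spans approximately the 1.4 to 2.8 range cited above.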
SEARCH OA data were also used to compare to the OA estimate. The correlation was good for August 2013. Although the SEARCH network OA measurements have better accuracy, the number of SEARCH sites is limited (five sites). The correlation of OA estimate and SEARCH OA varied dramatically in 2008-2013 (Fig. S1 in the Supplement). GEOS-Chem OA did not correlate with SEARCH OA except for the year 2013 (Fig. S1). As the IMPROVE network has more sites and spatial coverage, we used IMPROVE network data as ground OA measurements for comparison in the remainder of the discussion.

Comparison to aerosol extinction from AOD
To further evaluate the method of using satellite HCHO to derive a surface OA estimate, satellite aerosol measurements were used to approximate surface OA extinction for comparison. Satellite measurements of AOD were converted to surface extinction (see Sect. 2.6). Studies showed that OA was a dominant component of aerosol mass and extinction during SEAC4RS (Wagner et al., 2015) and that the fractions of OA were relatively constant (interdecile 62 %-74 %). Therefore, AOD variation is expected to generally reflect the OA variation during SEAC4RS. Satellite measurements from MISR can provide more aerosol property information to apportion total AOD to the AOD of a subset of aerosols with small to medium size and round shape, which can better capture OA, when AOD is above 0.15 to 0.2 (Kahn and Gaitley, 2015; personal communication with Ralph Kahn, 2018). Because MISR cannot distinguish OA from other submicron aerosol components (e.g., sulfate and nitrate) and would cut off low AOD data, which accounted for nearly half of the data over the US, we used total AOD to derive extinction for our comparison. The AOD-derived extinction map is shown in Fig. 6c, and the scatter plot of AOD-derived extinction and EPA-corrected OA is displayed in Fig. 6h. The same high-UVAI filter was also applied to AOD-derived extinction to remove BB plumes. Generally, the derived aerosol extinction correlated with IMPROVE OA, but the correlation was not as good as that of the OA estimate with IMPROVE OA. The high surface aerosol extinctions (> 150 Mm−1) (outliers in the scatter plot) were located in the SE US and therefore were not due to a potential contribution of dust and nitrate altering the shape of vertical profiles outside of the SE US. This indicates that the OA estimate derived from HCHO may be better than AOD at representing the concentrations of OA, even for the regions where AOD is dominated by OA (Xu et al., 2015).

Comparison to GEOS-Chem OA
Surface OA over the US from a GEOS-Chem simulation for August 2013 is shown in Fig. 6d, and the scatter plot of GEOS-Chem OA with IMPROVE OA is in Fig. 6i. Although HCHO vertical profiles from GEOS-Chem were used in the OA estimate, the GEOS-Chem simulation had a coarser resolution than the OA estimate. To be comparable to the OA estimate, the scatter plot in Fig. 6i used GEOS-Chem results for the grid squares that overlapped with individual IMPROVE sites. Compared to the OA estimate, GEOS-Chem OA had a similar correlation coefficient with IMPROVE OA. Although the GEOS-Chem OA plot appeared more scattered, there were many GEOS-Chem data points close to zero when IMPROVE OA was low, making the overall correlation coefficient similar to that for the OA estimate. GEOS-Chem underpredicted IMPROVE OA more, with a slope of 0.57, compared to the OA estimate. This is consistent with underprediction of anthropogenic OA in Marais et al. (2016).

OA estimate with different OA-HCHO relationships
OAs were estimated with different OA-HCHO relationships for four cases (Table 2). LUMP-SUM used the non-BB SEAC4RS campaign lump-sum relationship, the same as shown in Fig. 6a; ISOP-NOx used the non-BB SEAC4RS NO2- and isoprene-dependent relationship; URBAN used CalNex for large urban cities and SEAC4RS lump-sum for other US regions; and COMBINE used CalNex for large urban cities and the NO2- and isoprene-dependent non-BB SEAC4RS relationship for other US regions. The OA estimates from the four cases (Table 2) were compared to IMPROVE OA and the correlation coefficients are shown in Fig. 7. In general, OA estimate results from the four cases were similar. The details about how to implement chemical-factor-dependent OA estimates for the four cases are also provided in Table 2. Including the NO2-isoprene-dependent OA-HCHO relationship (ISOP-NOx case) showed a similar (or slightly worse) correlation between the OA estimate and IMPROVE OA. OMI NO2 column observations were used to represent surface NO2 levels and surface isoprene emissions from MEGAN were used to represent surface isoprene concentrations, assuming that NO2 column observations reflect surface NO2 distributions and isoprene emissions reflect the concentrations of isoprene due to its short lifetime (∼ 1 h). The detailed implementation is provided in the notes in Table 2. As the in situ data showed a moderate NO2-isoprene-dependent OA-HCHO relationship, we attributed the lack of improvement to the locations of IMPROVE sites in rural regions, the uncertainty in IMPROVE network measurements, the uncertainty in isoprene emissions from MEGAN, or factors (e.g., source-dependent OA-HCHO) that also need to be taken into account when determining the specific OA-HCHO relationship. Satellite OMI NO2 data (at 13:30 LT) were used to represent NO2 levels, big cities were defined as NO2 > 4 × 10^15 molec cm−2, and the CalNex in situ OA-HCHO relationship was applied for big cities.
It turned out that only one IMPROVE site (San Gabriel, SAGA1) near LA was affected by high NO2, which explains the insignificant change in URBAN compared to LUMP-SUM. This is not unexpected because IMPROVE sites are in rural regions. The OA estimate at SAGA1 decreased from 1.88 µg m−3 in LUMP-SUM to 0.17 µg m−3 in URBAN, while the measured OA at IMPROVE SAGA1 was 1.52 µg m−3. This may imply that CalNex is not very consistent with SEAC4RS due to different sampling instruments, strategies, and seasons. Lowering the NO2 threshold when defining big cities did not help improve the agreement either.
Because separating large urban areas and other regions and applying a simple chemical-regime-dependent in situ OA-HCHO relationship did not improve the agreement between the OA estimate and IMPROVE OA, we used the lump-sum OA-HCHO relationship to derive the OA estimate (shown in Fig. 6). SEAC4RS and DC3 only had a few low-altitude data in the midwest and did not cover the northeast US. The measured OA-HCHO relationship in the midwest did not show significant difference from the SE US. The scatter plots (Fig. 6f and g) of OA estimates and IMPROVE OA do not show outliers for the northeast and midwest. This indicates that using the SEAC4RS lump-sum OA-HCHO relationship can reasonably capture regions outside of the SE US.
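The four cases compared in this section can be summarized as a per-grid-cell selection rule. The sketch below is an interpretation of the descriptions in the text and Table 2, not the authors' code; the function name, dictionary keys, and (slope, intercept) values are hypothetical placeholders, and the isoprene × NO2 threshold is passed in as a parameter.

```python
def select_relationship(case, no2_column, isoprene_emission, relationships,
                        product_threshold, urban_no2_threshold=4.0e15):
    """Return the (slope, intercept) pair applied to one grid cell.

    case is one of "LUMP-SUM", "ISOP-NOx", "URBAN", "COMBINE".
    """
    if case not in ("LUMP-SUM", "ISOP-NOx", "URBAN", "COMBINE"):
        raise ValueError("unknown case: " + case)
    is_urban = no2_column > urban_no2_threshold
    high_product = no2_column * isoprene_emission > product_threshold
    # URBAN and COMBINE use the CalNex relationship in large urban cities.
    if case in ("URBAN", "COMBINE") and is_urban:
        return relationships["calnex"]
    # ISOP-NOx and COMBINE use the regime-dependent SEAC4RS relationship.
    if case in ("ISOP-NOx", "COMBINE"):
        key = "seac4rs_high" if high_product else "seac4rs_low"
        return relationships[key]
    # LUMP-SUM, and URBAN outside large cities, use the campaign lump sum.
    return relationships["seac4rs_lump"]


# Hypothetical (slope, intercept) placeholders, not values from Table 1.
placeholder_relationships = {
    "calnex": (1.0, 0.0),
    "seac4rs_high": (2.0, 0.0),
    "seac4rs_low": (3.0, 0.0),
    "seac4rs_lump": (4.0, 0.0),
}
choice = select_relationship("COMBINE", 1.0e15, 1.0e13,
                             placeholder_relationships, product_threshold=5.0e27)
```

Expressing the cases as one function makes it clear that URBAN differs from LUMP-SUM only in grid cells above the NO2 threshold, which is why a single affected IMPROVE site produced an insignificant change.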
6.6 Temporal variation of the agreement between OA estimate and IMPROVE OA

Besides August 2013 (see Fig. 6), the correlations between the OA estimate and IMPROVE OA for the summer months (June-July-August 2008-2013) were also examined and shown in Fig. 7. Generally, the correlation coefficients between the OA estimate and IMPROVE OA were > 0.5 for summer months of the years investigated. The correlation coefficients were generally higher in June compared to July and August. The lower average temperature in June might be related to the higher correlation coefficients. IMPROVE network aerosol samples were transported at ambient temperature in a truck and more organic vapors likely evaporated at higher temperature. The different temperatures and distances from IMPROVE sites to the laboratory may lead to inhomogeneous evaporation among the samples and result in lower correlation coefficients. Although higher temperatures in July and August may also lead to more BB, the average aerosol index over the US was not higher in July (mean: 0.35) and August (mean: 0.36) compared to June (mean: 0.39) for these years. The underlying cause for the lowest correlation coefficients in July and August 2012 is not clear and may be related to the severe drought in 2012 (Seco et al., 2015). The correlation coefficients were also low for the linear regressions (not shown) of IMPROVE OA with both GEOS-Chem OA and AOD-derived extinction. Because the lowest correlation coefficients were consistently observed for multiple OA-related products and not just the OA estimate, we attributed this to uncertainties in the IMPROVE OA measurements or some unknown bias shared by the satellite HCHO, GEOS-Chem OA, and satellite AOD.

, satellite HCHO measurements will have higher spatial and temporal resolutions and lower detection limits. These higher-quality satellite HCHO measurements will improve the quality and spatial and temporal coverage of our OA estimate.
Because the OA estimate uses the relationship of in situ HCHO and OA measurements, the coverage of in situ aircraft field campaigns will impact the OA estimate quality. Currently, in situ airborne measurements of OA and HCHO focus on the continental US. Extending measurements to regions such as African BB, South America, and east Asia, where HCHO and OA have high concentrations, will increase the spatial coverage of the OA estimate product. Ground site measurements of OA with consistent quality control in those regions will also be important for validating the OA estimate. Improvement of satellite HCHO retrieval during the BB cases will also improve OA estimate quality. BB cases with high UV aerosol index over the US were excluded in the current OA estimate. With improvement in the satellite retrieval of HCHO, we may be able to estimate OA during BB cases over the US. Upcoming field campaigns such as the Fire Influence on Regional and Global Environments Experiment - Air Quality (FIREX-AQ) will provide opportunities to improve the OA estimate in BB cases in the US.
This OA estimate method has limitations in remote regions far away from HCHO sources. Because the lifetimes of HCHO (1-3 h) and OA (1 week) are different, the slopes and intercepts between HCHO and OA are expected to change when air masses are aged (e.g., in remote regions). HCHO is close to being in steady state, with production rates roughly equal to loss rates, while OA, with a lifetime of a week, is not in steady state. Therefore, OA can accumulate relative to HCHO as air masses age. OA vs. HCHO from the SEAC4RS and KORUS-AQ field campaigns, color coded with altitude as an indicator of air mass age, are plotted in Fig. S2a and b, respectively. A relative depletion of HCHO at high altitudes was observed due to its shorter lifetime. This also suggests that, in remote regions far away from the sources, the ratios of OA to HCHO could be much higher and the relationship between OA and HCHO derived near the sources may no longer apply. On the other hand, the lifetime of 1-3 h for HCHO does not imply that the OA estimate only works within this timescale. HCHO is formed from oxidation of transported gas-phase VOCs, including the oxidation products of the primary emitted VOCs, as well as of the slower-reacting VOCs (e.g., ethane and benzene). Most gas-to-particle oxidation processes that might produce HCHO can last up to 1-2 days (Palm et al., 2018). Figure S3 shows that the ratios of OA to HCHO did not change significantly downwind in the Rim Fire plume for about 1 day of aging, which was determined from the distance from the source and the wind speed. A lower photolysis rate of HCHO in the plume can also contribute to this. However, we do not expect the relationship of OA and HCHO to remain past one to two boundary layer ventilation cycles (Palm et al., 2018).
Although OA-HCHO relationships depend on air mass age, this does not strongly affect our study of monthly average surface OA over the continental US because our OA estimates showed reasonably good agreement with ground-site IMPROVE OA measurements. This also indicates that SOAs are statistically enhanced near the source regions. Nault et al. (2018) also showed that the production of HCHO and SOA is similar and plateaus around 0.5-1 photochemical days. So, in the near field of emissions and chemistry, the production of these two species is similar; however, outside the near field of emissions and rapid chemistry, the long lifetime of OA vs. the steady state of HCHO would start controlling the slopes and correlations.
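The qualitative argument about air mass aging can be illustrated with a toy box model. This sketch is not part of the analysis in this study: it assumes a single precursor VOC that is oxidized with a first-order lifetime and produces both HCHO and OA, each of which is then lost with its own first-order lifetime. The function name, the precursor lifetime, and the unit yields are hypothetical; only the HCHO (about 2 h) and OA (about 1 week) lifetimes are taken from the ranges quoted above.

```python
def oa_hcho_ratio(hours, tau_voc_h=12.0, tau_hcho_h=2.0, tau_oa_h=168.0,
                  yield_hcho=1.0, yield_oa=1.0, dt_h=0.01):
    """OA/HCHO ratio (arbitrary units) after `hours` of aging, hours > 0.

    Forward-Euler integration of a three-species box model with no
    fresh emissions after t = 0.
    """
    voc, hcho, oa = 1.0, 0.0, 0.0
    for _ in range(int(round(hours / dt_h))):
        oxidized = voc / tau_voc_h * dt_h            # precursor oxidized this step
        hcho_next = hcho + yield_hcho * oxidized - hcho / tau_hcho_h * dt_h
        oa_next = oa + yield_oa * oxidized - oa / tau_oa_h * dt_h
        voc -= oxidized
        hcho, oa = hcho_next, oa_next
    return oa / hcho


near_field = oa_hcho_ratio(6.0)    # a few hours downwind
aged = oa_hcho_ratio(48.0)         # about two days of aging
```

In this toy model HCHO tracks the declining precursor while OA accumulates, so the ratio grows with age, consistent with the expectation that relationships derived near sources may not hold in remote regions.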

Summary
We have developed a satellite-based estimate of the surface OA concentration ("OA estimate") based on in situ observations. This estimate is based on the empirical relationships of in situ OA and HCHO for several regions. OA and HCHO share VOC sources with different yields and lifetimes. Using surface OA and HCHO linear regression slopes and intercepts, we can relate surface HCHO to OA. To estimate the surface HCHO concentration from the satellite HCHO column, we used a vertical distribution factor η from either climatology satellite data a priori profiles or an updated model run for a specific period. This factor is largely determined by boundary layer height and surface emissions and was found to retrieve surface HCHO from column HCHO reasonably well.
The OA estimate over the continental US generally correlated well with EPA IMPROVE network OA measurements corrected for partial evaporation, with a biased low slope of 0.62 or 0.60, mostly due to underestimation of HCHO concentrations from the OMI HCHO retrieval. The good correlations are not only for the time during SEAC4RS but also for most summer months over the several years (2008-2013) investigated. Compared to aerosol extinction derived from AOD, the OA estimate had slightly higher correlation coefficients with IMPROVE OA. GEOS-Chem can predict OA with a similar correlation coefficient with IMPROVE OA compared to the OA estimate when GEOS-Chem was intensively validated with in situ measurements for the SE US. Better satellite HCHO data from TROPOMI and future TEMPO and GEMS and extending spatiotemporal coverage of in situ measurements will improve the quality and coverage of the OA estimate.
Data availability. The OA estimate products, the GEOS-Chem outputs, and satellite HCHO data in this study can be obtained by contacting the corresponding author, Jin Liao (jin.liao@nasa.gov (Krotkov, 2013). Satellite MODIS AOD data are available at https://ladsweb.nascom.nasa.gov/ (Levy and Hsu, 2015).
Author contributions. JL performed the analysis and wrote the paper. TFH directed the research topic and discussed the analysis with JL. TFH, GMW, JSC, AF, and CW provided in situ HCHO measurements. JLJ, PCJ, and BAN provided in situ OA measurements. EAM provided GEOS-Chem model results. GGA and KC provided satellite HCHO data. HTJ provided MODIS AOD data. TBR provided in situ NO 2 measurements. AW provided in situ isoprene and acetonitrile measurements. GMW, TFH, JSC, JLJ, BAN, PCJ, EAM, and GGA provided constructive comments to help improve the paper. All authors have reviewed and edited the paper.
Competing interests. The authors declare that they have no conflict of interest.
Acknowledgements. Jin Liao, Thomas F. Hanisco, Glenn M. Wolfe, and Jason St. Clair were supported by NASA grants NNH15ZDA001N and NNH10ZDA001N. Benjamin A. Nault, Pedro Campuzano-Jost, and Jose L. Jimenez were supported by NASA grants NNX15AT96G and 80NSSC18K0630. Armin Wisthaler and PTR-MS measurements during DC3, SEAC 4 RS, and KORUS-AQ were supported by the Austrian Federal Ministry for Transport, Innovation and Technology (bmvit) through the Austrian Space Applications Programme (ASAP) of the Austrian Research Promotion Agency (FFG). The PTR-MS instrument team (Philipp Eichler, Lisa Kaiser, Tomas Mikoviny, and Markus Müller) is acknowledged for their field support. We thank Eric Edgerton for providing the SEARCH network data.
Edited by: Sally E. Pusede Reviewed by: two anonymous referees