Quantitative evaluation of the uncertainty sources for the modeling of atmospheric CO2 concentration within and in the vicinity of Paris city

The top-down atmospheric inversion method that couples atmospheric CO2 observations with an atmospheric transport model has been used extensively to quantify CO2 emissions from cities. However, the potential of the method is limited by several sources of misfits between the measured and modeled CO2 that are of different origins than the targeted CO2 emissions. This study investigates the critical sources of errors that can compromise the estimates of the city-scale emissions and identifies the signal of emissions that has to be filtered when doing inversions. A set of one-year forward simulations is carried out using the WRF-Chem model at a horizontal resolution of 1 km focusing on the Paris area with different anthropogenic emission inventories, physical parameterizations and CO2 boundary conditions. The simulated CO2 concentrations are compared with in situ observations from six continuous monitoring stations located within Paris and its vicinity. Results highlight large nighttime observation-model misfits, especially in winter within the city, which are attributed to large uncertainties in the diurnal profile of anthropogenic emissions as well as to errors in the vertical mixing near the surface in the WRF-Chem model. The contribution of nighttime biogenic respiration to the CO2 concentration is a significant source of modeling errors during the growing season outside the city. When winds are from continental Europe and the CO2 concentration of incoming air masses is influenced by remote emissions and large-scale biogenic fluxes, differences in the simulated CO2 induced by the two different boundary conditions (CAMS and CarbonTracker) can be of up to 5 ppm.
Our results suggest three selection criteria for the CO2 data to be assimilated for the inversion of CO2 emissions from Paris: (i) discard data that appear as statistical outliers in the model-data misfits, which are interpreted as model deficiencies under complex meteorological conditions; (ii) use only afternoon urban measurements in winter and suburban ones in summer; (iii) test the influence of different boundary conditions in inversions. If possible, additional observations should be used to constrain the boundary inflow, or CO2 gradients between upwind and downwind stations, rather than absolute CO2 concentrations, should be used as atmospheric inversion inputs.


Introduction
Worldwide, almost two-thirds of global final energy consumption takes place in urban agglomeration areas that have a high population density and corresponding infrastructure, and cities are responsible for more than 70% of the global anthropogenic CO2 emissions (IEA, 2016; Seto et al., 2014). Due to progressing urbanization processes, the number of people living in cities is expected to increase from the current 7.7 billion in 2019 to more than 9.7 billion by 2050 (United Nations, 2019). More than ever, cities are at the front line of climate change mitigation and take the lead in energy transition and emission reduction of greenhouse gases.
Currently, a variety of efforts are underway to quantify cities' total CO2 emissions and establish emission inventories with high spatial and temporal resolution for supporting urban emission mitigation strategies. An independent monitoring of city emissions is highly desirable, and could be delivered by the top-down atmospheric inversion method using regional high-resolution transport models together with ground-based urban CO2 concentration networks and/or satellites with imagery capabilities. The so-called atmospheric inversion provides an optimized estimate of CO2 emissions aiming at the best agreement between atmospheric CO2 measurements and their simulated equivalents. It relies on the filtering of the CO2 signal associated with the urban emissions at the targeted spatial and temporal scales from other sources of misfits between measured and modeled CO2 concentrations. These other sources of misfits include uncertainties in the atmospheric transport, in atmospheric CO2 conditions that are used at the boundaries of the regional model, in the natural CO2 fluxes within the modeling domain, but also in the spatial and temporal distribution of the urban emissions at scales finer than the targeted ones. Even when controlling the emissions at a relatively high temporal and spatial resolution, city-scale inversion frameworks have generally targeted monthly to annual budgets of the emissions at the city scale, or for large areas of these cities (strong temporal and spatial correlations are assumed). The uncertainties in the assumed temporal and spatial emission variations induce a critical source of error that is poorly constrained by the inversions due to the lack of data (Bréon et al., 2015; Lauvaux et al., 2016). The spatial and temporal allocation of the emissions is generally derived from high-resolution gridded inventories based on uncertain activity data in the transportation, residential, and power sectors. Moreover, local sources of CO2 emission in the vicinity of an urban station can cause variations of atmospheric CO2 that are not captured by the inventories and transport models of kilometric scale that have been used for city inversions so far (Boon et al., 2016; Lian et al., 2019). Further, cities have green areas and are surrounded by rural areas that actively take up CO2 in the daytime during the growing season. Uncertainties and variability in those biogenic fluxes also significantly affect the results of atmospheric inversions (Hardiman et al., 2017).
https://doi.org/10.5194/acp-2020-540 Preprint. Discussion started: 3 July 2020. © Author(s) 2020. CC BY 4.0 License.
Uncertainties in modeling the atmospheric transport of CO2 are exacerbated in urban areas due to building obstacles that generate specific mixing processes and modify the wind speed and direction. In addition, sensible heat emissions at the surface of urban areas enhance vertical mixing, increase the depth of the boundary layer (Dupont et al., 1999) and can drive regional mesoscale circulations under certain conditions. To reduce transport uncertainties in inversions over urban areas, one can use dedicated urban surface schemes (e.g. Nehrkorn et al., 2013; Feng et al., 2016). More general approaches to reduce transport errors rely on the assimilation of upper-air weather data or on the optimization of the model configuration, e.g. based on comparisons against independent wind measurements (e.g. Deng et al., 2017). But some errors remain difficult to quantify, such as those from local circulations and complex meteorological conditions (Martin et al., 2019). As a consequence, an empirical selection of the data to be assimilated is usually performed, which is more or less stringent depending on each urban station and transport model. Typical selection criteria of continuous urban CO2 data consist of (i) using only measurements acquired during the afternoon when a well-developed convective mixing layer is expected; (ii) using only observations when the wind speed is above a given threshold; (iii) removing statistical outliers.
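The three typical selection criteria listed above can be sketched as a simple filter. This is only an illustration under stated assumptions, not the procedure of any of the cited studies: the record layout, the afternoon window (11-16 UTC), the 2 m/s wind threshold and the 3-sigma outlier rule are all hypothetical choices made for the example.

```python
from statistics import mean, stdev


def select_observations(records, wind_min=2.0, n_sigma=3.0, afternoon=(11, 16)):
    """Return the records that pass the three typical selection criteria.

    records: list of dicts with hypothetical keys 'hour' (UTC), 'wind' (m/s),
             'obs' and 'model' (CO2 in ppm).
    wind_min, n_sigma, afternoon: assumed thresholds, not taken from the paper.
    """
    # (i) afternoon hours only, (ii) wind speed above the threshold
    kept = [
        r for r in records
        if afternoon[0] <= r["hour"] <= afternoon[1] and r["wind"] >= wind_min
    ]
    if len(kept) < 3:
        return kept  # too few points to estimate a spread

    # (iii) remove statistical outliers of the observation-model misfit
    misfits = [r["obs"] - r["model"] for r in kept]
    mu, sigma = mean(misfits), stdev(misfits)
    if sigma == 0:
        return kept
    return [r for r, m in zip(kept, misfits) if abs(m - mu) <= n_sigma * sigma]
```

In practice the thresholds would be tuned per station and per transport model, as noted in the text.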
Uncertainties in CO2 boundary conditions arise from the fact that city-scale inversions are performed over a limited spatial domain that receives CO2 signals from outside. These boundary conditions usually cannot be measured explicitly and they can be complex for continental cities that receive CO2 advected by long-range and middle-range transport from other urban areas and biogenic fluxes. Göckede et al. (2010) found that small biases in CO2 boundary conditions could lead to large errors (~47%) in the posterior annual state-level CO2 fluxes of Oregon. Lauvaux et al. (2012) found that a 0.55 ppm bias in the CO2 boundary conditions induced a 10% bias in the posterior annual CO2 flux of Iowa and surrounding states. In order to try to eliminate the bias from boundary conditions, Bréon et al. (2015) and Staufer et al. (2016) proposed to assimilate CO2 gradients between upwind and downwind stations in inversions of CO2 fluxes of the Paris area, which reduces the number of data that can be assimilated.
A series of CO2 transport and inverse modeling studies has been conducted for Paris (Bréon et al., 2015; Staufer et al., 2016; Wu et al., 2016; Broquet et al., 2018; Xueref-Remy et al., 2018). A network of seven stations, including two stations at the center of the city and five stations in its vicinity, has been maintained since 2014. This study builds on this activity and aims to support a revision of some of the options previously used for the inversion, based on a more advanced transport modeling framework developed by Lian et al. (2018). More specifically, we analyze in detail the model-measurement mismatches so as to identify critical sources of errors that would compromise a high-resolution atmospheric inversion of urban CO2 emissions in the Paris area. A set of forward simulations of atmospheric CO2 concentration is performed at 1-km horizontal resolution using the WRF-Chem model (Grell et al., 2005) with different anthropogenic emission inventories, physical parameterizations and CO2 boundary conditions over the Paris region from December 2015 to November 2016. The main objective of this paper is to describe and quantify the uncertainties induced by anthropogenic emissions, biogenic fluxes, atmospheric transport and CO2 boundary conditions. We also address the extent to which these model-measurement mismatches might be reduced and how our proposed diagnostics could be used to provide additional constraints for the inversion of CO2 emissions at the city scale.

The WRF-Chem V3.9.1 model was used to simulate hourly atmospheric CO2 concentrations over the Paris region. Details regarding the model setup and the reference data used in the simulations are outlined briefly below and described in Lian et al. (2019). The model was configured with one-way nesting of three modeling domains (D01, D02, and D03 in Figure 1a) at horizontal grid resolutions of 25, 5 and 1 km respectively, in which the innermost one (D03) covers the Île-de-France region (IdF, which is the administrative area that includes the Paris urban area) and its surroundings. The meteorological initial and lateral boundary conditions were retrieved from the global European Centre for Medium-Range Weather Forecasts (ECMWF) Interim Re-Analysis data (ERA-Interim) with 0.75°×0.75° horizontal resolution at 6-hourly update intervals (Berrisford et al., 2011). The grid nudging option in WRF to relax the model to ERA-Interim on large scales was applied to temperature and wind fields at model levels above the planetary boundary layer (PBL) of the outer two domains. We also used the surface analysis nudging and observation nudging options to assimilate the National Centers for Environmental Prediction (NCEP) operational global upper-air (ds351.0) and surface (ds461.0) observation weather station data (https://rda.ucar.edu/datasets/ds351.0/; https://rda.ucar.edu/datasets/ds461.0/), which are described in more detail in Lian et al. (2018). The biogenic CO2 fluxes were calculated online in WRF-Chem by the diagnostic biosphere Vegetation Photosynthesis and Respiration Model (VPRM) (Mahadevan et al., 2008; Ahmadov et al., 2007, 2009).

Atmospheric physics options
An accurate physical parameterization of the atmospheric transport model is critical to numerical simulations of the meteorology and CO2 concentrations within and around urban areas. A set of numerical experiments was performed to assess the sensitivity of the simulations with the WRF-Chem model to the choice of different physics schemes. The characteristics of CO2 distributions are highly related to the PBL structure and its temporal evolution. We thus carried out sensitivity experiments with three different PBL parameterization schemes (Table 1a), including the Yonsei University scheme (YSU), the Mellor-Yamada-Janjic scheme (MYJ) (Janjić, 1990, 1994), and the Bougeault-Lacarrère scheme (BouLac) (Bougeault and Lacarrere, 1989). In addition, two different urban surface parameterizations were investigated, the single-layer urban canopy model (UCM) (Chen et al., 2011) and the multilayer urban canopy model BEP (Building Effect Parameterization) (Martilli et al., 2002) (Table 1a). The non-local YSU scheme was used with the Revised MM5 Monin-Obukhov surface layer scheme (Jiménez et al., 2012), whereas the two local MYJ and BouLac schemes were used with the Monin-Obukhov Eta Similarity surface layer scheme (Janjić, 1996).
All other physics options were identical for all sensitivity runs: WSM6 microphysics scheme (Hong and Lim, 2006), RRTM longwave radiation scheme (Mlawer et al., 1997), Dudhia shortwave radiation scheme (Dudhia, 1989), Unified Noah land-surface scheme for non-urban land cover surface energy fluxes (Chen and Dudhia, 2001). The Grell 3D ensemble cumulus convection scheme (Grell and Dévényi, 2002) was only employed for the outer domain (D01). These options correspond to those selected by Lian et al. (2018), which showed good performance for simulating near-surface winds and temperatures over the Paris region. The simulations were performed for a period of 15 months from September 2015 to November 2016 including a spin-up of three months.

Anthropogenic emission inventories
Numerical experiments were carried out to assess the modeled CO2 sensitivity to the use of different anthropogenic emission maps and to get insights on the signature of typical uncertainties in such maps (Table 1b). The two spatially and temporally explicit emission fields derived from inventories used in this study were the 2010 AirParif inventory at a spatial resolution of 1 km (AIRPARIF, 2013) [...] Figure 2d), a substantial difference is found during the early morning when AirParif shows emissions that are much smaller than those of IER, and with a clear temporal trend.
We made a one-month simulation using these two anthropogenic inventories together with their respective temporal profiles in order to analyze the impact on the simulated CO2 concentrations (Table 1b). Within the same group of simulations, we also used (i) a constant temporal profile (each pixel has a different emission, but constant in time based on the temporal average of the AirParif inventory) and (ii) a constant and spatially homogeneous emission where the emissions are distributed uniformly over the whole IdF territory. Distinct CO2 tracers are used for each of the four experiments to quantify their respective impacts on the atmospheric CO2 concentration, for a given configuration of the WRF-Chem model. The simulation was carried out for the one-month period of January 2016, when the influence of regional biogenic flux on CO2 signals is relatively small compared to that of anthropogenic flux.
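The two simplified emission fields described above can be illustrated with a short sketch. It assumes, for the example only, that the inventory is available as a nested list `emissions[t][i]` (time step t, grid cell i), that all cells have equal area, and that the homogeneous field conserves the domain-mean emission; none of these details is specified in the text.

```python
def constant_in_time(emissions):
    """Case (i): one time-averaged value per grid cell.

    emissions[t][i] is the flux of cell i at time step t (hypothetical layout).
    """
    n_steps = len(emissions)
    n_cells = len(emissions[0])
    return [
        sum(emissions[t][i] for t in range(n_steps)) / n_steps
        for i in range(n_cells)
    ]


def constant_and_homogeneous(emissions):
    """Case (ii): the time-mean emission spread uniformly over all cells.

    Assumes equal-area cells, so the domain total is conserved.
    """
    per_cell = constant_in_time(emissions)
    uniform = sum(per_cell) / len(per_cell)
    return [uniform] * len(per_cell)
```

Comparing tracers driven by these two fields with the tracer driven by the full inventory separates the effect of the temporal profile from that of the spatial distribution.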

Boundary conditions for CO2
A set of sensitivity experiments was designed to investigate the impact of different CO2 boundary conditions on the Paris CO2 concentrations (Table 1c). The initial and lateral boundary conditions for CO2 concentration fields used in the sensitivity experiments were taken from two global CO2 atmospheric inversion products at 3-hourly update intervals: CAMS and CarbonTracker. CAMS has a horizontal resolution of 3.75°×1.90° (longitude × latitude), with 39 hybrid layers in the vertical (version v16r1, https://apps.ecmwf.int/datasets/data/cams-ghg-inversions/; Chevallier, 2017a, 2017b). CarbonTracker has a horizontal resolution of 3° in longitude and 2° in latitude, with 25 vertical layers (version CT2017, http://carbontracker.noaa.gov; Peters et al., 2007). Both global datasets were interpolated onto the outermost domain of WRF-Chem (D01) (bilinearly in latitude and longitude, and linearly in pressure) so as to provide the lateral boundary conditions for CO2 simulations. Given that CarbonTracker has an averaged value over each 3-hourly interval (the times on the date axis are the centers of each averaging period), it was also linearly interpolated in time to ensure consistency with both CAMS and the interval of input data for WRF-Chem (e.g. the value at 00:00 UTC was generated by interpolating the one at 22:30 UTC of the previous day with the one at 01:30 UTC of the same day). Figure 3 shows time series of average differences in CO2 concentration between CAMS and CarbonTracker at each of the four lateral boundaries of D01, averaged over the lowest 0.7 km above ground level (AGL), for both 00 UTC and 12 UTC. These time series are the spatial mean and standard deviation (± 1σ) over each boundary (a latitudinal transect for western and eastern boundaries / a longitudinal transect for southern and northern boundaries).
In general, winds blow mostly from the west in all seasons over the domain of interest. Small differences at the western boundary are observed under the influence of prevailing westerlies with annual means of the spatial mean and standard deviation of 0.01 ± 2.8 ppm for 00 UTC and 0.4 ± 1.8 ppm for 12 UTC, which is expected as the air masses are advected from clean air (oceanic) areas. In contrast, the differences are significantly larger at the eastern boundary (-4.8 ± 7.4 ppm for 00 UTC and -1.7 ± 3.3 ppm for 12 UTC), but can vary from day to day depending on the synoptic weather condition. This feature indicates that CAMS and CarbonTracker may provide substantially different continental CO2 background signals to the inner domain when the wind blows from the east. Moreover, the magnitude and variability of the differences are overall smaller at noon compared to those at midnight. The variability of nighttime differences appears relatively larger in summer than in winter. Note that the CO2 differences between CAMS and CarbonTracker are much smaller for the upper layers above 0.7 km AGL, with annual means of the spatial mean and standard deviation of -0.4 ± 0.4 ppm for both 00 UTC and 12 UTC at the eastern boundary (Figure S1).
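The temporal interpolation of the CarbonTracker 3-hourly means described in this section amounts to a linear interpolation between the centers of two consecutive averaging periods. A minimal sketch follows; the function name and the convention of expressing times as decimal hours counted from 00 UTC of the previous day are assumptions made for the example.

```python
def interpolate_in_time(t, t0, v0, t1, v1):
    """Linearly interpolate between (t0, v0) and (t1, v1) at time t.

    Times are decimal hours on a common axis; values are CO2 in ppm.
    """
    if not t0 <= t <= t1:
        raise ValueError("t must lie between t0 and t1")
    weight = (t - t0) / (t1 - t0)
    return (1.0 - weight) * v0 + weight * v1


# Value at 00 UTC (hour 24.0 on an axis starting the previous day) from the
# 3-hourly means centered at 22:30 UTC (22.5) and 01:30 UTC (25.5).
value_00utc = interpolate_in_time(24.0, 22.5, 410.0, 25.5, 412.0)
```

Because every 3-hourly WRF-Chem input time falls midway between two averaging-period centers, the interpolated value is here simply the mean of the two neighbouring CarbonTracker values.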

The WRF-Chem simulation with boundary conditions from CarbonTracker used the same physics schemes and prior fluxes as the one with boundary conditions from CAMS (also defined as the control run), but it was only carried out for the parent domain (D01) without nesting over a full-year period (2015.09-2016.11). Given that lateral boundary conditions are fed to the nested domain from the parent (the nest is driven along its lateral boundaries by the parent domain), results from D01 should be representative enough to assess the modeled CO2 sensitivity over the IdF region to the use of different CO2 boundary conditions.

CO2 in situ observations
For the model evaluation, we use observations from six in situ continuous CO2 monitoring stations established in the IdF region.
Four stations (AND, COU, OVS, SAC) are located within peri-urban areas and two (JUS and CDS) are located within the Paris city. The SAC station has two air inlets placed at 15 m and 100 m AGL respectively. Each of the other stations is equipped with a continuous CO2 gas analyzer and inlets located on rooftops or on towers with heights varying from 20 m to 60 m AGL. The CO2 analyzers are high-precision cavity ring-down spectroscopy instruments with a calibration system using three reference gases tied to the WMO CO2 X2007 scale every 2 to 6 months (Tans et al., 2011). The six stations within IdF are complemented by two ICOS atmospheric background CO2 tall tower monitoring stations (TRN and OPE) located respectively 101 km and 235 km away from the center of Paris. In this study, these two stations are only used as potential background sites and to provide additional support

Overall model performance
In this section, we start with an evaluation of the overall performance of the control run (BEP_MYJ) in simulating atmospheric
The measurements themselves have an accuracy that is on the order of a fraction of a ppm (Xueref-Remy et al., 2018) and measurement errors are therefore negligible when analyzing such model-data differences. In the following, we analyze in further detail the measurement-model discrepancies and attempt to identify cases when they appear to be mainly driven by uncertainties in the anthropogenic emissions, in the biogenic fluxes, in the physical parameterizations of the atmospheric transport model, or in the CO2 boundary conditions at the limit of the atmospheric transport model.

Emission inventory
The main objective when measuring CO2 concentrations within or in proximity to the city is to estimate the anthropogenic emissions by means of an atmospheric inversion. It is then natural to seek, in the time series, unambiguous signatures of erroneous assumptions on the anthropogenic emissions. This is a difficult task, as significant uncertainties in the atmospheric transport also impact the modeling results, while neither the "true" emissions nor the "true" transport are known. [...] The diurnal profile used for heating emissions in the AirParif inventory (with a significant decrease along the night) can thus be questioned.
Although there is a strong indication that the nighttime profile of the AirParif CO2 emissions is erroneous and that heating emissions do not reduce strongly during the night, this error does not entirely explain the model-data misfit at CDS shown in Figure 6. This is proven by the fact that even the "constant emission" simulation does not reproduce the increasing concentration during the night.

This implies that errors in atmospheric transport are also contributing to the model-data misfit, in particular concerning the vertical mixing near the surface. Further evidence for the transport deficiency is that the underestimations of nighttime CO2 concentration are not only large at the two urban sites but also obvious at all rural stations (Figure 5).
To gain insights on vertical transport differences, we show in Figure 7 the vertical distribution of the BEP_MYJ modeled CO2 concentrations at CDS in January together with time series of observed and simulated CO2 concentrations at a sampling height of 34 m AGL. The PBL heights shown in Figure 7 are diagnosed using the 1.5-theta-increase method, which defines the height of the PBL as the level at which the potential temperature first exceeds the minimum potential temperature within the boundary layer by [...] Moreover, both the BEP_MYJ and BEP_MYJ_IER simulations slightly overestimate CO2 concentrations at CDS in the late afternoon and early evening (from 18:00 to 22:00) not only in January (Figure 6) but also over the full year (Figure 5). This is interpreted as the consequence of a shift from a situation with convective mixing to stable nocturnal conditions around sunset occurring too early in the model. It may also be linked to an increase in traffic emissions during the evening rush hour, which could also lead to the overestimated modeled concentrations in the late afternoon.
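The PBL height diagnostic mentioned above can be sketched as follows. This is not the authors' implementation: the threshold value is truncated in the text, so the default of 1.5 K used here is an assumption inferred from the name of the method, and taking the minimum potential temperature as a running minimum from the surface upward is also a choice made for the example.

```python
def pbl_height_theta_increase(heights, theta, delta=1.5):
    """Diagnose the PBL height from a potential temperature profile.

    heights: level heights in m AGL, ordered from the surface upward.
    theta:   potential temperature (K) at those levels.
    delta:   required increase above the minimum theta (assumed 1.5 K).

    Returns the height of the first level whose theta exceeds the running
    minimum of theta below it by more than delta, or None if no level does.
    """
    theta_min = theta[0]
    for z, th in zip(heights, theta):
        theta_min = min(theta_min, th)
        if th > theta_min + delta:
            return z
    return None
```

A finer estimate could interpolate between the two levels bracketing the threshold; the sketch returns the model level itself for simplicity.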

Biogenic fluxes
To analyze the influence of biogenic fluxes on the CO2 concentrations, we computed CO2 horizontal differences between two sites: (i) CDS, which is within the limits of the Paris city, where the diurnal cycle in winter is dominated by anthropogenic emissions (see [...] During the night, there is a large measurement-model discrepancy from June to September (the SAC station had unfortunately measurement gaps). During this growing season period, the observed difference between CDS and SAC is negative at night (higher concentrations at SAC than at CDS), while the simulated difference is positive, resulting from a large positive anthropogenic contribution and a smaller negative biogenic contribution. Figure S2 shows that this nighttime misfit between the modeled and observed CO2 differences has a seasonal trend that follows closely the one of the modeled gross primary production (GPP). A large fraction of GPP realized each day is respired at night by plant maintenance respiration. The seasonal trend of the nighttime misfit between CDS and SAC thus indicates that the model underestimates plant respiration at night, and thus possibly GPP in the day.
Although it is impossible to rule out other hypotheses related to the atmospheric transport and vertical mixing, this result suggests that modeling nighttime CO2 at rural stations is affected by systematic errors of respiration during the growing season, so that nighttime rural CO2 data can hardly be used in atmospheric inversions for inferring anthropogenic emissions.
Further insight on the CO2 concentration dynamics at SAC is provided by the vertical differences that are derived from the measurements at two levels, 15 m and 100 m AGL, on a tall tower at that location (Figure 9). During the afternoon, the differences are small and there is little agreement between the observations and the simulated values (Figure 9a). This systematic bias between the observed and simulated CO2 vertical gradients could be explained by an underestimation of the photosynthetic uptake. The vertical CO2 differences are much larger at night, with a fair agreement between the measurements and the simulated values in wintertime (Figure 9b). Although the nighttime time series show strong similarities, there is a significant bias between the observations and the model during the growing season, but not during the non-growing season. The seasonal phase of the vertical misfit is well correlated with the one obtained from the horizontal diagnostics, which tends to indicate the same bias in the estimated nighttime respiration.
The analyses of both Figures 8 and 9, together with similar results observed at other stations (Figure S3: e.g. the horizontal difference between CDS and COU, and the vertical difference at TRN), are consistent with the hypothesis that the respiration emission at night is underestimated by the VPRM model. Since nighttime respiration is usually well correlated with daytime respiration (Reichstein et al., 2005), this result implies that the modeled positive gradients of CO2 between urban and rural stations are probably overestimated during the growing season, so that without any optimization of respiration, an inversion would tend to generate a positive bias in the anthropogenic emissions estimates, to compensate for the negative bias in the respiration from VPRM.

Atmospheric transport
Uncertainty in simulated CO2 due to transport errors can be evaluated empirically through the spread of simulated CO2 by sensitivity experiments with different physical configurations of WRF-Chem. We have made five sensitivity simulations using the same surface fluxes and boundary conditions, but with three PBL schemes and two urban canopy schemes (see Table 1a). Figure 10a shows the horizontal distribution of the monthly median standard deviation of simulated hourly CO2 concentrations at approximately 20 m AGL using different physics schemes for two periods of the day (afternoon 11-16 UTC, nighttime 00-05 UTC), and for two months (January, July 2016). During January, the simulated CO2 concentrations within the city, both in afternoon and nighttime, are highly sensitive to the choice of the physics scheme, with median standard deviations larger than 6 ppm. In contrast, the choice of the physics scheme has less influence on simulated CO2 concentrations over suburban and rural areas in winter, with median standard deviations of 1.2 ppm in the afternoon and 2 ppm at night. During the summer period, the smallest uncertainty of simulated CO2 concentration resulting from different physics schemes is found in the afternoon with median standard deviations that are less than 1 ppm, which indicates that the various schemes provide very similar values. However, it is necessary to compare these standard deviations to the amplitudes of the anthropogenic emission signature. We thus calculated the median ratios of the simulated anthropogenic CO2 concentration (average over the five sensitivity runs) to its respective standard deviation of the total CO2 signals among the five sensitivity runs, which we define as the "signal-to-noise ratio" (Figure 10b).
Indeed, the anthropogenic signal may be understood as the "signal" for the estimate of the emission, while the spread of the five sensitivity simulations provides an indicator of the atmospheric transport uncertainty. The largest signal-to-noise ratio is found in the afternoon of summer within the urban area, indicating that the link between the anthropogenic emission and the CO2 concentration can be derived from the model with the highest confidence for these conditions. However, during the summer, the nighttime CO2 measurements over the entire suburbs are poorly suited for the inversion since the simulated CO2 concentrations are highly sensitive to the choice of physics scheme and the signal-to-noise ratios are then relatively small (< 1). Figure 10c shows the vertical distribution along a south-north transect through the JUS station in a similar way as Figure 10a. In general, the simulations with various physics options show very large variations in the modeled CO2 concentrations (up to 7.5 ppm standard deviation) close to the surface, a few tens of meters above the emissions. The differences become much smaller (less than 1 ppm) with increasing altitude. This may be due to the fact that different physics schemes lead to different vertical mixing efficiencies, which has a strong impact on the vertical structure of CO2 concentrations. Given that the measurements are acquired at a level where the vertical gradient is large and variable, it may also indicate that the measurement-model discrepancy is highly dependent on the physics parameterization in the representation of the vertical mixing process in near-surface layers. During the winter period, there is a considerable difference in the vertical concentration profiles reproduced by different physics schemes within the city, with the uncertainty extending to a higher altitude in the afternoon than in the nighttime.
Further away from the urban area, anthropogenic emissions are substantially lower, and the vertical gradient of CO2 generated by the strong city emission is smoothed out by the convection and diffusion processes. As a consequence, much less uncertainty is associated with the choice of the physics scheme in the suburbs at altitudes above ~200 m AGL. As for the signal-to-noise ratio shown in Figure 10d, the large values within the city tend to indicate that the urban CO2 data are well suited for an estimate of the emissions using the atmospheric inversion method.
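The signal-to-noise ratio defined in this section can be computed for one grid cell as in the sketch below. The input layout (`anthro[r][h]` and `total[r][h]` for run r and hour h) is hypothetical, and the use of the population standard deviation across the runs is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, median, pstdev


def signal_to_noise(anthro, total):
    """Median over hours of (ensemble-mean anthropogenic CO2) / (ensemble spread).

    anthro[r][h]: anthropogenic CO2 (ppm) of sensitivity run r at hour h.
    total[r][h]:  total CO2 (ppm) of sensitivity run r at hour h.
    Hours with zero spread are skipped to avoid division by zero.
    """
    n_hours = len(total[0])
    ratios = []
    for h in range(n_hours):
        noise = pstdev([run[h] for run in total])   # spread among the runs
        if noise > 0:
            signal = mean([run[h] for run in anthro])
            ratios.append(signal / noise)
    return median(ratios)
```

Applying the function separately to the afternoon (11-16 UTC) and nighttime (00-05 UTC) hours of a given month yields one value per grid cell and period, which is the kind of quantity mapped in Figure 10b and 10d.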
We also assessed the respective contributions of anthropogenic and biogenic fluxes to the simulated spread of CO2 concentrations using different physics schemes. This allows us to provide a characterization of the impact of uncertainties in the atmospheric transport modeling along with that of the impact of the individual fluxes. Figure 11 shows the simulated monthly mean plus/minus one standard deviation of both the anthropogenic and biogenic CO2 differences at approximately 20 m AGL between the control run (BEP_MYJ) and each of the other four sensitivity runs. The results in this figure are presented with the consideration of (i) two periods of the day (afternoon 11-16 UTC, nighttime 00-05 UTC); (ii) two months (January, July 2016); and (iii) three land use types (urban, crop and the others). Urban (7.4%) and crop (84.6%) are the two dominant land use types of the innermost model domain (D03) from the MODIS land cover database used in the WRF-Chem model, where the percentages in parentheses indicate the proportion of each land use category to the total area. The other land use types (8.0%) mainly include grass, shrub, mixed forest, deciduous forest and evergreen forest. During the winter, the simulated anthropogenic CO2 concentrations over the urban area are sensitive to the choice of the urban canopy scheme used in WRF-Chem, which is characterized by a substantial decrease in standard deviation from UCM to BEP (Figure 11a). The three simulations using the UCM scheme tend to produce higher anthropogenic CO2 concentrations together with larger standard deviations with respect to the control run using the BEP scheme. On the other hand, these two urban canopy schemes (UCM, BEP) show small spreads in the simulation of anthropogenic CO2 concentrations over the rural vegetated area for both seasons.
This indicates that the choice of an urban canopy scheme is critical for simulating atmospheric transport at urban stations, but that the transport errors incurred without such a scheme remain mainly 'local' and have little remote influence at rural sites. That is, the choice of an urban scheme impacts CO2 concentrations over the urban areas, but its impact on the larger-scale transport is not significant enough to affect the simulated concentrations over rural areas. During the summer period, our results show that the modeled nighttime CO2 concentrations are strongly sensitive to both the urban canopy and PBL schemes. This conclusion applies to both the urban and the rural areas. Interestingly, the control run simulates higher nighttime CO2 concentrations than the other four experiments.
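The statistics summarized in Figure 11 can be illustrated with a minimal sketch, assuming the hourly model output is available as arrays of shape (hours, grid cells). The function name `diff_stats`, the array layout and the synthetic data are hypothetical and only serve to show the calculation of the mean and standard deviation of the run-to-run differences for one period of the day and one land use type.

```python
import numpy as np

def diff_stats(control, sensitivity, hours_utc, land_use, period, category):
    """Mean and standard deviation of (sensitivity - control), in ppm.

    control, sensitivity : arrays of shape (n_hours, n_cells)
    hours_utc            : array of shape (n_hours,), hour of day (0-23)
    land_use             : array of shape (n_cells,), land use label per cell
    period               : (first_hour, last_hour), inclusive, in UTC
    category             : land use label to select
    """
    first, last = period
    time_mask = (hours_utc >= first) & (hours_utc <= last)
    cell_mask = land_use == category
    diff = (sensitivity - control)[time_mask][:, cell_mask]
    return float(diff.mean()), float(diff.std())

# Synthetic data (assumption): two days of hourly values on six grid cells.
rng = np.random.default_rng(0)
n_hours, n_cells = 48, 6
hours_utc = np.arange(n_hours) % 24
land_use = np.array(["urban", "urban", "crop", "crop", "crop", "other"])
control = 410.0 + rng.normal(0.0, 1.0, (n_hours, n_cells))

# Hypothetical sensitivity run: +2 ppm over urban cells, unchanged elsewhere.
sensitivity = control.copy()
sensitivity[:, land_use == "urban"] += 2.0

NIGHT = (0, 5)       # nighttime window, 00-05 UTC
AFTERNOON = (11, 16)  # afternoon window, 11-16 UTC

mean_urban, std_urban = diff_stats(control, sensitivity, hours_utc, land_use, NIGHT, "urban")
mean_crop, std_crop = diff_stats(control, sensitivity, hours_utc, land_use, NIGHT, "crop")
print(mean_urban, std_urban, mean_crop, std_crop)
```

In this synthetic case the difference is confined to the urban cells, which mimics the 'local' character of the urban canopy scheme sensitivity described above.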
Here, we quantify the uncertainty in the modeling results that is linked to the three PBL schemes and two urban canopy schemes.
Clearly, there are other potential sources of atmospheric transport uncertainties that are not accounted for in this study. The simulated CO2 differences among the ensemble of physics schemes tested here are therefore only a fraction of the full magnitude of model uncertainty. Nevertheless, this uncertainty is, in some cases, of similar magnitude to the measurement-model differences that have been shown in section 3.1.

Boundary condition
To investigate the uncertainty in CO2 boundary conditions, we examined the sensitivity of the modeled CO2 over the Paris region to the use of two different global CO2 atmospheric inversion products as initial and boundary conditions for WRF-Chem (see Table 1c). Figure 12 shows all hourly CO2 concentration differences between BEP_MYJ and BEP_MYJ_CT, which used CO2 fields from the CAMS and CarbonTracker products respectively. The comparison is based on the simulated CO2 in the 25-km grid cell of the outermost domain (D01) containing the city of Paris. For most of the year (~73%), the differences in simulated CO2 concentrations over Paris are within ±1 ppm, since under the influence of westerly winds they are mainly driven by the differences between CAMS and CarbonTracker at the western boundary of D01. Nevertheless, considerable differences (up to 5 ppm) are observed during several synoptic episodes, which illustrates the magnitude of the uncertainties linked to the boundary condition hypothesis.

These magnitudes are similar to those of the impacts of different physics schemes on simulated CO2 concentrations over suburban and rural areas as shown in section 3.2.3. Under such circumstances, additional observations are needed to constrain the boundary inflow in inversions. On the other hand, as the IdF region is exposed to a relatively well-mixed background atmosphere after long-range transport of CO2 from remote sources and sinks, one may expect the resulting CO2 concentration features to be large-scale. As a consequence, the potential modeling error induced by an erroneous boundary condition will be similar for monitoring stations located within Paris and its vicinity. This characteristic suggests that assimilating upwind-downwind gradients in CO2 concentrations in the inversion of city-scale emissions, as done in previous studies, could also be an effective way to minimize the potential biases both from the boundary conditions and from remote fluxes within the domain but outside the city (Bréon et al., 2015; Staufer et al., 2016).
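Two simple calculations underlie this discussion: the share of hours for which two boundary-condition runs agree within a tolerance, and the cancellation of a large-scale boundary bias in an upwind-downwind gradient. The sketch below uses made-up numbers, and the helper names `fraction_within` and `gradient` are hypothetical. It assumes, as argued above, that the boundary-induced error is identical at both stations.

```python
import numpy as np

def fraction_within(run_a, run_b, tol_ppm=1.0):
    """Fraction of hours for which |run_a - run_b| <= tol_ppm."""
    diff = np.asarray(run_a, dtype=float) - np.asarray(run_b, dtype=float)
    return float(np.mean(np.abs(diff) <= tol_ppm))

def gradient(downwind, upwind):
    """Downwind minus upwind CO2 concentration (ppm)."""
    return np.asarray(downwind, dtype=float) - np.asarray(upwind, dtype=float)

# (1) Agreement between two hypothetical boundary-condition runs (ppm).
run_a = np.array([400.2, 399.5, 400.9, 404.0])
run_b = np.full(4, 400.0)
share = fraction_within(run_a, run_b)  # 3 of 4 hours differ by less than 1 ppm

# (2) A boundary bias common to both stations cancels in the gradient.
background = 405.0                              # ppm, assumed true inflow
city_signal = np.array([3.0, 2.5, 4.0])         # ppm, enhancement at the downwind station
boundary_bias = np.array([0.5, -2.0, 4.5])      # ppm, error from the boundary product

upwind = background + boundary_bias
downwind = background + boundary_bias + city_signal
grad = gradient(downwind, upwind)

print(share)
print(grad)
```

The cancellation holds only to the extent that the bias is truly common to both stations. Any difference in the bias between the two sites would remain in the gradient.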

4 Conclusions and discussions
We have analyzed CO2 concentrations measured and modeled at six stations located within and in the surroundings of the city of Paris.
Our objective is to identify the main causes of the CO2 differences between the measurements and their simulated counterparts, with the overall goal of improving the quantification of anthropogenic emissions. To accomplish this, we have performed an ensemble-based sensitivity study and a full analysis of the uncertainties linked to anthropogenic inventories, biogenic fluxes, atmospheric transport and boundary conditions, either focusing on limited periods or looking at the full-year period.
A preliminary identification of the modeling errors was first conducted with the KNN algorithm to identify the largest mismatches between the observations and the model results. These large discrepancies are related either to specific measurement contaminations from local unresolved sources of CO2 emissions, or to the model's inability to properly simulate the atmospheric transport for specific meteorological conditions. It is therefore necessary to explicitly discard these outliers in any atmospheric inversion that targets the city emissions.
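As an illustration of how a nearest-neighbour criterion can isolate such outliers, the following sketch scores each model-data misfit by the distance to its k-th nearest neighbour and flags the largest scores. This is a generic formulation, not necessarily the configuration used in this study. The function name `knn_outlier_flags`, the values of `k` and `quantile`, and the synthetic misfits are all assumptions.

```python
import numpy as np

def knn_outlier_flags(misfits, k=5, quantile=0.95):
    """Flag points whose distance to their k-th nearest neighbour is unusually large.

    misfits  : array of shape (n,) or (n, n_features), e.g. observed minus modeled CO2
    k        : rank of the neighbour used as the outlier score (must be < n)
    quantile : scores above this quantile of all scores are flagged
    Returns (flags, scores).
    """
    x = np.asarray(misfits, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    # Pairwise Euclidean distances between all points.
    dist = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero distance of a point to itself,
    # so column k is the distance to the k-th nearest neighbour.
    scores = np.sort(dist, axis=1)[:, k]
    threshold = np.quantile(scores, quantile)
    return scores > threshold, scores

# Synthetic misfits (assumption): 50 ordinary values plus two large mismatches.
rng = np.random.default_rng(1)
misfits = np.concatenate([rng.normal(0.0, 1.0, 50), [15.0, -12.0]])
flags, scores = knn_outlier_flags(misfits, k=5, quantile=0.95)
print(np.flatnonzero(flags))
```

The quantile threshold always flags a fixed share of the data, so in practice the cut-off would need to be chosen by inspecting the score distribution.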
Within the city, the modeled CO2 concentrations appear highly sensitive not only to the atmospheric vertical mixing close to the surface, but also to the prescribed temporal profile of anthropogenic emissions. These sources of errors are large, particularly in winter, and show a potential for biases that is problematic when aiming at the quantification of city emissions. There is little hope of significantly decreasing these uncertainty factors in the near future, unless better constraints on transport, such as vertical profiles, become available. Such complementary measurements would be of great help in understanding the characteristics of the CO2 vertical distribution under both stable and convective boundary-layer conditions. They could also be used to verify and validate the atmospheric transport model, and to reduce transport errors through data assimilation.
In the suburbs, further away from the urban sources, the anthropogenic emissions are lower and the vertical gradient of CO2 concentration generated by the city emissions is smoothed out by atmospheric convection and diffusion. There is then less uncertainty than within the city about the efficiency of the vertical mixing. The link between the anthropogenic emission and the CO2 concentration during the afternoon in winter can then be derived from the model with more confidence. However, the contribution of the biogenic flux to the CO2 concentration is an issue during the growing season. The difficulty is mainly related to the simulation of the nocturnal CO2 concentrations, because of the large uncertainties in the atmospheric transport modeling as well as in the biogenic fluxes. Additional measurements of carbon isotopes (14C, 13C) and tracers co-emitted with CO2 (e.g., CO, NOx) could be used to separate the contributions of fossil fuel and biogenic components to the total CO2 concentrations, which would be beneficial for the optimization of sectoral CO2 fluxes.
The influence of different CO2 boundary conditions on our model domain depends on the synoptic weather situation. For the Paris region, the simulated CO2 differences between CAMS and CarbonTracker are less than one ppm during most periods of westerly winds that bring in clean oceanic air masses, but they can reach several ppm during some synoptic episodes, e.g. with northerly and easterly winds. This result advocates the practice of using additional observations to constrain the boundary inflow, or of using CO2 gradients when the wind direction is properly aligned with two (upwind-downwind) stations in the inversion of CO2 fluxes of the Paris region.
Our analyses confirm the difficulty of accurately modeling the atmospheric CO2 concentration within and in the close vicinity of urban areas. Although we have pointed out the likely causes for the measurement-model discrepancies, we have not been able to unambiguously demonstrate the respective contributions of anthropogenic emissions, biogenic fluxes and atmospheric transport errors. This is a strong cause of concern for the objective of measuring the city emissions on the basis of atmospheric concentrations.
It has been argued that, rather than measuring the absolute emission, the atmospheric inversion approach should aim at the emission changes, with the argument that atmospheric transport biases would not change, and would thus let any emission variation show up in the concentrations. However, one may argue that a significant change in the total emission would be associated with a change in its spatial distribution and/or a change in its temporal profile. These would then also impact the relationship between the emission and the concentration.
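This last point can be made concrete with a toy linear example, in which the CO2 enhancement at a station is a weighted sum of the emissions of a few zones. The weights in `footprint` and the emission values are purely hypothetical and carry no physical meaning beyond the illustration. They show that the same change in total emission produces different concentration changes depending on where the change occurs.

```python
import numpy as np

# Hypothetical sensitivities (ppm per unit emission) of one station to three emission zones.
footprint = np.array([0.8, 0.3, 0.1])

def enhancement(emissions):
    """CO2 enhancement at the station as a linear combination of zone emissions."""
    return float(footprint @ np.asarray(emissions, dtype=float))

baseline = np.array([10.0, 10.0, 10.0])    # total = 30
uniform_cut = baseline * 0.9               # total = 27, same spatial pattern
shifted_cut = np.array([7.0, 10.0, 10.0])  # total = 27, cut concentrated in zone 1

d_uniform = enhancement(uniform_cut) - enhancement(baseline)  # about -1.2 ppm
d_shifted = enhancement(shifted_cut) - enhancement(baseline)  # about -2.4 ppm
print(d_uniform, d_shifted)
```

Both scenarios reduce the total emission by 10%, yet the concentration change at the station differs by a factor of two, so a concentration trend cannot be translated into an emission trend without an assumption on the spatial pattern.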

Code/Data availability
All data sets and model results corresponding to this study are available upon request from the corresponding author.

Competing interests
The authors have no competing interests to declare.

Daily CO2 vertical differences at SAC (15 m - 100 m) from each sector for (c) the non-growing season from October to April and (d) the growing season from May to September. OBS indicates the observed CO2 concentration differences. TOT, ANT, BIO and BCK indicate the simulated total, anthropogenic, biogenic and background CO2 concentration differences respectively.
Figure 10: Analysis of the "signal-to-noise" as discussed in the text for two periods of the day (afternoon 11-16 UTC, nighttime 00-05 UTC), and two months (January, July 2016). (a) is the median of the hourly standard deviation of the simulated near-surface CO2 concentration computed among the five sensitivity runs (Table 1a); (c) is the same as (a) but for a vertical south-north slice that goes through the JUS station; (b) is the median of the hourly ratios of the anthropogenic CO2 concentration (average of the five sensitivity runs) to the respective standard deviation of the total CO2 concentrations among the five sensitivity runs; (d) is the same as (b) but for the same vertical slice as in (c).
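The two quantities defined in the caption of Figure 10 can be sketched as follows, assuming the ensemble output is stored with the sensitivity runs along the first axis and the hours along the second. The function names, the array layout and the synthetic values are hypothetical. The population standard deviation (NumPy's default, `ddof=0`) is also an assumption, since the text does not specify the estimator.

```python
import numpy as np

def ensemble_spread(total):
    """Median over hours of the across-run standard deviation (panels a and c).

    total : array of shape (n_runs, n_hours, ...), total CO2 in ppm
    """
    return np.median(np.std(total, axis=0), axis=0)

def signal_to_noise(anthropogenic, total):
    """Median over hours of (ensemble-mean anthropogenic CO2) / (ensemble std of total CO2)."""
    signal = np.mean(anthropogenic, axis=0)  # average of the sensitivity runs
    noise = np.std(total, axis=0)            # spread among the sensitivity runs
    return np.median(signal / noise, axis=0)

# Synthetic ensemble (assumption): 5 runs, 3 hours, one location.
offsets = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])  # run-to-run offsets, std = sqrt(2)
scale = np.array([1.0, 2.0, 3.0])                # spread grows from hour to hour
total = 400.0 + offsets[:, None] * scale[None, :]
anthropogenic = np.full((5, 3), 6.0)             # 6 ppm anthropogenic signal in every run

spread = ensemble_spread(total)
snr = signal_to_noise(anthropogenic, total)
print(spread, snr)
```

A ratio well above one indicates that the anthropogenic signal exceeds the spread induced by the choice of physics schemes, which is the sense in which the urban data are described as well suited for inversion.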