Toward targeted observations of the meteorological initial state for improving the PM 2.5 forecast of a heavy haze event that occurred in the Beijing-Tianjin-Hebei region

. An advanced approach of Conditional Nonlinear Optimal Perturbation (CNOP) was adopted 10 to identify the sensitive area for targeted observations of meteorological fields associated with PM 2.5 concentration forecasts of a heavy haze event that occurred in the Beijing-Tianjin-Hebei (BTH) region, China, from 30 November to 4 December 2017. The results show that a few specific regions in the southern and northwestern directions close to the BTH region represent the sensitive areas. Numerically, when predetermined artificial observing arrays (i.e., possible “targeted observations”) in the sensitive 15 areas were assimilated, the forecast errors of PM 2.5 during the accumulation and dissipation processes were aggressively reduced; in particular, these assimilations, compared with those in other areas that have been thought of as being important for the PM 2.5 forecasts in the BTH region in previous studies, exhibited a more obvious decrease in the forecast errors of PM 2.5 . Physically, the reason why these possible “targeted observations” can significantly improve the forecasting skill of PM 2.5 was interpreted 20 by comparing relevant meteorological fields before and after assimilation. Therefore, we conclude that preferentially deploying additional observations in the sensitive areas identified by the CNOP approach can greatly improve the forecasting skill of PM 2.5 , which, beyond all doubt, provides theoretical guidance for practical field observations of meteorological fields associated with PM 2.5 forecasts.


Introduction 25
Air pollution is one of the most severe environmental problems that China is facing. Among various air pollutants, fine particulate matter (PM2.5) has been considered as the most serious pollutant, frequently engulfing northern China, such as the Beijing-Tianjin-Hebei (BTH) region. Exposure to heavy PM2.5 episodes not only increases the risks of various respiratory diseases, but also induces the possibility of 2 diabetes and other metabolic dysfunction-related diseases (Guan et al., 2016;Lim and Thurston, 2019). 30 Accurate PM2.5 concentration forecasts are essential since they can remind people to reduce exposure during haze days and can assist policy-makers in making effective emission reduction measure decisions.
The atmospheric chemical transport model (CTM) is one of the most widely used and effective ways to forecast PM2.5 concentrations. However, relevant chemical and physical processes are complex, and associated parameterization schemes of turbulent processes and meteorological and emission conditions 35 cannot describe exactly the real world, causing model forecasts to have great uncertainty, especially on heavy haze days (Hu et al., 2010;Kong et al., 2021).
The uncertainties of CTM output, as mentioned above, are primarily attributed to the uncertainties of meteorological and emission inputs, in addition to those occurring in the chemical model formulation (Roman et al., 2004;Gilliam et al., 2015). Meteorological conditions including wind, temperature, and 40 relative humidity, which are crucial for the transformation, formation, diffusion and removal of pollutants in the atmosphere, have a great impact on PM2.5 forecasts of the BTH region in CTMs (Godowitch et al., 2011;Chen et al., 2020). Using an artificial neural network model combined with wavelet transformation, He et al. (2017) demonstrated that meteorological conditions explained more than 70% of the variance in daily PM2.5 concentrations over the major cities in China. Therefore, regional PM2.5 concentrations 45 rely on meteorological variations to a large extent. Thus, to improve the PM2.5 forecasting skill, it is necessary to understand the sensitivity of the CTM results to the inputted meteorological fields and to reduce meteorological uncertainty. It has been demonstrated that uncertainties in the meteorological initial field substantially influence pollution simulations, including their temporal variations and peak time concentrations (Zhang et al., 2007;Bei et al., 2017;Liu et al., 2018). Then increasing the accuracy 50 of the meteorological initial conditions is an effective way to improve the PM2.5 forecasting skill.
Data assimilation is recognized as a useful technique for improving the accuracy of initial conditions.
To obtain reliable initial meteorological conditions, sufficient and effective observations are essential.
However, conventional observations, which are distributed at a low resolution in both oceans and islands, have a limitation in improving the accuracy of initial conditions (Li et al., 2015). Assimilating additional 55 field observations has been proven to be an effective way to obtain a reliable initial field (Sydney, 1996;Mu et al., 2015). Since field observations are costly and never sufficiently dense, one can consider placing a preferentially limited number of observations in key areas to have the most positive impacts on 3 improving forecast skill. This idea is just one of the new observational strategies of "target observation", also called "adaptive observation", which has been developed over the past two decades (Snyder, 1996;60 Palmer et al., 1998;Majumdar, 2016). The "target observation" mainly serves the demand of forecasts on observations. The idea is as follows. To better predict an event at a future time 2 (i.e., verification time) in a focused area (i.e., verification area), additional observations are deployed at a future time 1 (i.e., target time; 1 2 ) in some key areas (i.e., sensitive areas) where additional observations are expected to have a large contribution in reducing the prediction errors in the verification area. These 65 additional observations are assimilated by a data assimilation system to provide a more reliable initial state, which would be supplied to the model to obtain a more accurate prediction. Targeted observations have become a hot topic in atmospheric science due to their successful applications in improving the prediction skills of extreme weather events, such as typhoons (Wu et al. 2009;Mu et al., 2009), winter storms (Kren et al., 2020), and high-impact climatic events, such as the El Niño-Southern Oscillation 70 (ENSO; Kramer and Dijkstra., 2013;Duan et al., 2018) and Indian Ocean Dipole (IOD; Feng et al., 2017;Beal et al., 2020). As we stated above, the meteorological initial fields have great impacts on the PM2.5 forecasts of the BTH region (Bei et al., 2017;Liu et al., 2018); meanwhile, our results also showed that the PM2.5 forecasts are sensitive to the uncertainties of meteorological initial conditions (see Section 3.1).
Based on these findings, we would propose the following question, can we apply the targeted observation 75 strategy to improve the meteorological condition forecasts, which then further improve the PM2.5 forecasts of BTH region? It has also been argued that sufficient satellite observations can be used to yield the meteorological initial field by using a data assimilation approach. However, assimilating more observations may not lead to higher forecast benefits. Therefore, even if there are sufficient observations, one should also consider observations in which area and how many observations should be preferentially 80 assimilated to improve the PM2.5 forecast skill to a larger degree. When the observations in the area with high sensitivity are assimilated to the initial values of the forecast, the forecasting skills will be greatly increased; conversely, if the observations in the area where the forecast is not sensitive to the initial values are assimilated, the forecasting skills will be improved slightly or even become worse (Yu et al., 2012;Janjić et al., 2018;Zhang et al., 2018). Then the present study would explore the relevant sensitive area 85 and examine the role of possible "targeted observations" on meteorological fields in improving the PM2.5 forecast skill during a heavy haze event that occurred from 30 November to 4 December 2017 in the 4 BTH region, eventually suggesting the usefulness of implementing targeted observations on meteorological fields for improving air quality forecasts.
The key for the targeted observation is the determination of sensitive areas mentioned above and 90 the design of the observation network. That is, when implementing the targeted observations, one should first make clear where to preferentially implement targeted observations and how to display these additional observations. To obtain the sensitive areas of meteorological fields for PM2.5 forecasting, an advanced optimization method, Conditional Nonlinear Optimal Perturbation (CNOP), is used (Mu et al., 2003;Mu and Zhang, 2006), which overcomes the linear limitation of the traditional singular vector 95 approach (Lorenz, 1965). The CNOP represents the initial perturbation that causes the largest error growth at a given future time over the verification area. The CNOP is therefore the most sensitive initial perturbation; therefore, it would have potential for providing the sensitive area for targeting observations.
In fact, the CNOP has been adopted to identify sensitive areas for targeting observations in both Observations System Simulation Experiments (OSSEs) and/or practical observation tasks associated 100 with typhoons, ENSO, Kuroshio, and marine environments over the coast of China (Mu et al., 2015;Da et al., 2019) and has gained great success in improving the forecasting skills of the concerned high-impact weather or climatic events.
In the present study, we would consider the importance of the meteorological initial conditions on PM2.5 forecasting and apply the targeted observation strategy of meteorological fields with the CNOP 105 approach to study the PM2.5 forecast of a heavy haze episode. As mentioned above, during the period from 30 November to 4 December 2017, a heavy air pollution event occurred in the BTH region, with hourly maximum PM 2.5 concentrations greater than 250 µg/m 3 , exceeding the standard of severe pollution (Feng et al., 2016). However, the Beijing Municipal Ecological and Environmental Monitoring Center did not provide a warning of this event in time (see the link http://www.bjmemc.com.cn/). We utilize this 110 event as an example to explore the possible "targeted observations" of meteorological fields and to investigate whether they can help improve the PM2.5 forecasting skill. Specifically, the following questions are addressed.
(a) Which area represents the sensitive area of initial meteorological fields for targeted observations associated with the PM2.5 forecast of the concerned event? 115 (b) What is the optimal observation array for targeted observations in meteorological fields (in terms 5 of locations and coverage density)?
(c) Why can the "targeted observations" in the sensitive areas lead to a larger improvement of the PM2.5 forecasting skill of the event?
The paper is organized as follows. The model, methodology and data used in the study are 120 introduced in the next section. Then, the CNOP-type errors of the meteorological field forecasting of the haze event are calculated in Sect. 3. In Sect. 4, the sensitive areas of the meteorological field for the PM2.5 forecasts are identified, and relevant OSSEs are designed to verify the validity of the targeted observation in improving the forecasting skill of PM2.5 in the haze event. In Sect. 5, the reasons why the "targeted observations" can result in a larger improvement of PM2.5 forecasts are interpreted. Finally, a summary 125 and discussion are presented in Sect. 6.

Model, Methodology and Data
In this study, we adopt the Nested Air Quality Prediction Modeling System (NAQPMS) and Weather Research and Forecasting (WRF) model to explore the role of targeted observations on meteorological fields in improving the surface air concentrations of PM2.5 forecasts by building an optimization problem 130 associated with the CNOP approach.

Models
The NAQPMS is a three-dimensional regional Eulerian chemical transport model developed by the Institute of Atmospheric Physics, Chinese Academy of Sciences (Wang et al., 1997;. It includes modules that address horizontal and vertical advection and diffusion, dry-wet deposition, gaseous phases, 135 aqueous phases, aerosols and heterogeneous chemical reactions, so that the PM2.5 simulation is with water component . The NAQMPS has been widely applied to forecast air pollutants and to study the source apportionment of pollutants (Yang et al., 2020 The NAQPMS is driven by the meteorological field generated through WRFV3.6.1 (http://www.wrf-model.org/). The WRF model used in the present study adopts the Lin microphysics 145 scheme (Lin et al. 1983), RRTMG longwave radiation (Iacono et al. 2008), Dudhia shortwave radiation schemes (Dudhia, 1989) and Yonsei University planetary boundary layer parameterization scheme (Hong et al. 2006). These parameterization schemes are also adopted in the adjoint model of the WRF, which is used to calculate the CNOP (see Sect. 2.2). To enhance the computing efficiency of the CNOP, a horizonal resolution of 30 km is used in the present study for an initial attempt. The model domain of the WRF and 150 its adjoint model are the same as that in the NAQPMS. The assimilation system we used is a 3-D variational data assimilation system of the WRF, which has been proven to be an efficient assimilation tool for PM2.5 simulations (Kumar et al., 2019;Zhang et al., 2021).

Conditional Nonlinear Optimal Perturbation (CNOP)
The CNOP represents the initial perturbation (or error) that can lead to the largest forecast error in the 155 focused area (verification area) at verification time. Suppose a nonlinear model is expressed as Eq. (1), where is the state vector with an initial value 0 and is a nonlinear partial differential operator.
The solution of Eq. (1) can be described as ( ) = ( ), in which is the nonlinear propagator. If ( ) is a reference state and an initial perturbation 0 is added to its initial state 0 , a forecast will 160 be made with x( ) ( ) ( 0 + 0 ) , where ( ) = ( 0 + 0 ) − ( 0 ) represents the evolution of the initial perturbation 0 . Then, an initial perturbation is CNOP ( 0 * ) if and only if where 0 1 0 ≤ is the constraint condition that the initial perturbation should satisfy and is a positive value that is comparable to the initial analysis error variance of the considered variables. 1 and 165 2 are coefficient matrices, which define the amplitudes of initial perturbations and its evolution ( 0 + 0 ) − ( 0 ) , with x consisting of zonal and meridional wind ( and , respectively), temperature ( ), water vapor mixing ratio ( ) and pressure ( ) components in the present study, and they play their role by calculating the total perturbation energy from surface to top (i.e., 100 hPa), as in Eq.
The optimization problem in Eq. (2) is solved by using the spectral projected gradient 2 (SPG2) method (Birgin et al., 2001) in the present study. A first guess is assigned to the initial perturbation .
The WRF model is integrated forward with the initial state 0 + 0 to obtain the forecast ( 0 + 0 ). The cost function J is calculated by using ( 0 + 0 ) and ( 0 ). The adjoint model 180 of the WRF is integrated backward to calculate the gradient of the cost function with respect to the initial perturbation 0 . The gradient represents the fastest descending direction of the cost function J in Eq.
(2). Based on the iteratively forward and backward integration governed by the SPG2 algorithm, the initial perturbation 0 is optimized and updated until the convergence condition of the algorithm is satisfied. Here, the convergence condition is ‖ ( 0 − ( 0 )) − 0 ‖ 2 ≤ 1 , where 1 is an 185 extremely small positive number, ( 0 ) projects the 0 outside the constraint to the boundary of the constraint condition and ( 0 ) represents the gradient of the cost function J with respect to 0 . Then, the resultant initial perturbation * is the CNOP. The details for the SPG2 algorithm can be seen in Birgin et al. (2001). The fifth generation ECMWF reanalysis for the global climate and weather (ERA5) (https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5) and National Centers for Environmental Prediction (NCEP) GFS historical archive forecast data (GFS, https://rda.ucar.edu/ 8 datasets/ds084.1/) are both used to produce the initial and boundary meteorological conditions for the WRF simulations. Both the ERA5 and GFS data have a 0.25° spatial resolution (approximately 25 km) 200 and 6-hour temporal resolution. 3 December. Specifically, the PM2.5 concentrations of most cities in the BTH region exceeded 250 ug/m 3 at 12:00 on 2 December; then, starting from 01:00 on 3 December, the PM2.5 dissipated rapidly within several hours. In Beijing, from 00:00 on 1 December, it took almost one day to accumulate PM2.5 from 77 ug/m 3 to 160 ug/m 3 according to the Dongsi station; then, from 01:00 on 3 December, the PM2.5 concentration decreased from 256 ug/m 3 to 19 ug/m 3 in 7 hours. 215

Simulations of the PM2.5 variability in the heavy haze event
After a 10-day spin-up of the WRF-NAQPMS, the ERA5 and GFS meteorological data are separately adopted to initialize the WRF at 00:00 BJT on 30 November 2017, and the simulations of PM2.5 9 concentrations at the Baoding and Dongsi stations are plotted in Fig. 2. Since the two simulations are generated by the same model using the same emission inventory, the PM2.5 forecast uncertainties are only 220 attributed to the uncertainties of meteorological initial fields. The simulation initialized by ERA5 can better reproduce the pollution event. During the period between 00:00 BJT on 30 November and 23:00 BJT on 1 December, the simulations initialized by ERA5 almost overlap with the observations. In the remaining time period, although the highest PM2.5 concentration simulated by ERA5 occurs approximately 12 hours earlier and more than 50 ug/m 3 lower than those in the observations, the 225 simulation can represent well the accumulation and dissipation processes of PM2.5. The simulations initialized by the GFS do not perform well in representing the episode of PM 2.5 .
They underestimate the PM2.5 concentrations during the accumulation process, and the simulated highest PM2.5 concentration (176 µg/m 3 ) occurs at approximately 21:00 on 3 December in Baoding, which is exactly in the dissipation process of the observed event. The simulation of Beijing PM2.5 also shows a 235 large deviation from the observational PM2.5 concentration, especially during the dissipation process.
To quantify the differences between simulations and observations, mean RMSEs and correlations of the 80 grids during the whole event (from 00:00 BJT on 30 November to 00:00 BJT on 4 December 2017) are calculated against the observations. As shown in Table 1, the mean RMSE of the simulations initialized by ERA5 is 60.09 ug/m 3 for the PM2.5 concentration, which is 19.87% lower than that of the 240 GFS simulations (i.e., 74.99 ug/m 3 ). The correlation between the ERA5 simulation and the observation is 0.47 and 20.51% higher than that of GFS simulations (i.e., 0.39). More specifically, we select two time 10 points to show the PM2.5 differences between simulations and observations, which are at 2:00 BJT on 2 December (hereafter defined as Accumulation Time; AT) and 14:00 BJT on 3 December (hereafter defined as Dissipation Time; DT). Almost all GFS simulations show an underestimation of the PM2.5 at 245 the AT and an overestimation at the DT. The mean deviations are -47.88 ug/m 3 at the AT and 55.02 ug/m 3 at the DT. The ERA5 simulation performs much better at the two time points, with mean deviations of -30.57 ug/m 3 and 41.58 ug/m 3 , although it also shows an underestimation at the AT and an overestimation at the DT. It is known that a bad forecast made by a numerical model is attributed to errors in both models and initial conditions. The study of targeted observations aims to improve the forecast by reducing the errors 255 in the initial conditions, which is usually implemented with perfect model assumptions (Mu et al., 2015).
A perfect model is assumed to limit forecast errors that result only from errors in the initial conditions, thus simplifying the complexity of problems. However, there are no perfect models in reality. Thus, when implementing the targeted observation tasks, we choose the model that exhibits relatively small model errors and is able to present good simulations to determine where (i.e., the sensitive area) to deploy the 260 targeted observations by calculating the CNOP. The WRF is one of the most advanced weatherforecasting models currently and exhibits small model errors (Liu et al., 2012). Therefore, we apply the WRF, together with the NAQPMS model, to explore the role of targeted observations in PM2.5 forecasts.
When we use different initial conditions to simulate PM2.5, a better simulation is taken as the "truth run", and the CNOP is calculated based on that. As shown above, the simulations initialized by ERA5 have 265 better performances in presenting the PM2.5 variability; particularly, they show the best simulation at the AT for the accumulation process of PM2.5 and at the DT for the dissipation process. Thus, the simulations initialized by ERA5, especially at AT and DT, are taken as the "truth run" to determine the sensitive area 11 for targeted observations by calculating the CNOP. Even though, the calculated sensitive area is actually an approximation of the real sensitive area. If such approximation is valid, then for any forecast, 270 preferentially assimilating additional observations in the sensitive area will help improve the PM2.5 forecasting skill much greatly. The validity of the above approximate sensitive area is often tested by prescribing a good simulation to observation (for example, the simulation initialized by ERA5) and then assimilating the simulated observations located in the sensitive area to a bad forecast (for example, the control forecast) to examine whether the assimilation forecast will be much closer to the good simulation, 275 which, actually, is a kind of OSSEs (see Masutani et al., 2010;Qin et al., 2013). In our study, to verify the validity of the sensitive area, the simulated targeted observations are assimilated to the GFS forecasts to improve their PM2.5 forecasts, where the GFS forecasts are taken as the "control run" and those after assimilating targeted observations are regarded as the "assimilation run". If the sensitive area is valid, the PM2.5 forecasts in the assimilation run will be much closer to the truth run. It can also be inferred that 280 if the real observations are available, assimilating the real targeted observations to the initial field of the meteorology of the control forecast would improve the PM2.5 forecast skill greatly against the observations. In the present study, we will adopt assimilating simulated observations to verify the validity of the sensitive area due to the unavailable observations.

CNOP-type errors of meteorological field forecasting 285
We select the AT and DT as verification times separately to determine the sensitive areas by calculating the CNOP-type errors. When the AT is taken as the verification time, we explore the forecast starting from 2:00 on 1 December, with a lead time of 24 hours, and the forecast starting from 14:00 on 1 December, with a lead time of 12 hours. When the DT is taken as the verification time, the forecasts starting from 14:00 on 2 December and 02:00 on 3 December, with lead times of 24 hours and 12 hours, 290 respectively, are investigated. Then, there are a total of 4 PM2.5 forecasts concerned here for the heavy haze event that occurred in the BTH region from 30 November to 4 December 2017, which are all initialized by ERA5.
As we described in Section 2.2, the CNOP-type initial errors which include the variables of wind, temperature, pressure and water vapor mixing ratio cause the largest forecast error of concerned 295 meteorological fields measured by the total energy at the verification time in the verification area, which may perturb the PM2.5 forecast to the greatest extent when considering the combined effect of different 12 meteorological components and thus represent the most disturbing initial error of the meteorological field.
The CNOP-type errors are calculated separately for these 4 forecasts. When it is the CNOP-type errors for the AT, their dominant anomalies, as mentioned above, occur at the low level (i.e., 850 hPa) for the forecast with a lead time of 24 hours; furthermore, the horizontal pattern mainly presents two areas that cover the large CNOP-type errors despite small position differences among the respective large-error areas of wind, temperature and water vapor components at the 850 hPa level (see Fig. 3). One area is near the southern part of the BTH region, with southerly wind, 310 positive temperature and water vapor biases, while the other area is in central Mongolia, with southerly wind, positive temperature and negative water vapor biases. However, at ground level, the horizontal patterns present different areas with large errors for the three meteorological components: the wind presents large errors in the southern and western parts of the BTH region, while the temperature and water vapor present large errors in the western part of the BTH region. For the forecast with a lead time 315 of 12 hours, the CNOP-type errors are dominant at ground level but mainly confined in Beijing city, with large northerly wind and positive temperature and water vapor biases (see Fig. 4). In addition, the wind and water vapor also present large errors in Shandong Province; at the low level (i.e., 850 hPa), the maximum errors of wind and temperature are located in the northwestern part of the BTH region, near the region of Nart, but the maximum error of water vapor is found in Shandong Province in the 320 southeastern part of the BTH region.
When the DT is the verification time, it can be seen that the CNOP-type errors mainly occur at the low level (i.e., 850 hPa and 750 hPa) for a lead time of 24 hours and large northerly wind, negative temperature and water vapor biases occur in southern Mongolia despite their specific positions having small differences, with the location of large water vapor errors further west to that of the large errors of 325 wind and temperature (see Fig. 5). For a lead time of 12 hours, the large northwesterly wind errors are 13 concentrated at the ground level, while the large positive temperature and water vapor errors occur at the low level; furthermore, there are also large temperature and water vapor errors occurring at the low and middle levels (see Fig. 6).
It is clear that the CNOP-type errors peak at different vertical levels for the 4 forecasts. Even for the 330 meteorological fields of wind, temperature, and water vapor, even at the same vertical level, the areas with large errors of different variables are somewhat different. The errors in the areas where the CNOPtype errors are concentrated could make the largest contribution to the forecast errors of the verification area at the verification time and therefore can be regarded as a sensitive area for targeted observations associated with PM2.5 forecasts. However, from the above CNOP-type errors, it is known that such areas 335 are dependent on different meteorological variables and are located at different vertical levels and regions, which therefore confuses us which meteorological variables, levels and areas should be identified to be preferentially observed and provides challenges to real field campaigns. Then, in this situation, how do we address the problems related to targeted observations for the meteorological fields associated with PM2.5 forecasting? We will address this question in the next section.

The sensitive area for targeted observations and associated validity verification on improving the PM2.5 forecasts
In this section, we propose an approach to measure the comprehensive sensitivity of initial errors occurring in different vertical levels and horizontal areas for different meteorological variables. Then, the sensitive areas for targeted observations can be identified by this comprehensive sensitivity that 360 considers the information of all meteorological variables at all pressure levels.

The sensitive areas for targeted observations associated with PM2.5 forecasts
To evaluate the comprehensive sensitivity of the CNOP-type initial errors occurring at different vertical levels and areas for different meteorological fields, a vertical integral (VI) of the CNOP-type errors, as in Eq. (4), is calculated. 365 The VI consists of all concerned meteorological variables and their vertical distributions and measures the comprehensive sensitivity of forecasting uncertainties on initial errors of different meteorological variables. In this situation, the PM2.5 forecast could be very sensitive to the combined effect of initial errors of the meteorological fields in the area of a larger VI, and preferentially reducing the 370 meteorological initial errors in these sensitive areas will lead to much larger improvements of the meteorological forecasts over the BTH region, which then significantly improves the regional PM2.5 forecasts.

Validity of "targeted observations" in improving PM2.5-forecasting skill
According to the definition of targeted observations, deploying additional observations in the 390 sensitive areas and assimilating them to the initial field will improve the forecasting skill of the meteorological field and then the PM2.5. If such improvement is significantly larger than those of assimilating the additional observations in other areas, the sensitivity of the targeted observations in the sensitive area determined by the CNOP is confirmed numerically. With the above argument, the better simulation of PM2.5 with the meteorological field forecast by ERA5 is assumed to be the "truth run", and 395 the worse simulation initialized by the GFS is the "control run" (see Sect. 3.1); thus, the differences between the PM2.5 concentrations in the control and truth runs can be regarded as forecast errors of the control run with respect to the truth run. Figure 8 shows the spatial distributions of forecast errors of PM2.5 at the AT and DT. This shows that the control run has an obvious underestimation of the PM2.5 concentrations over the whole BTH region at the AT and an overestimation at the DT. If taking the 400 absolute value of the biases, then the mean biases of the whole BTH region are 34.22 and 64.13 ug/m 3 at the AT and DT, respectively. To verify the validity of the targeted observations in the sensitive areas, we 20 take relevant meteorological fields in the truth run but confine them to the above identified sensitive areas as "additional observations" (i.e., artificial "targeted observations") and assimilate them to the initial fields of the control run by the 3D-Var assimilation system of the WRF (see Sect. 2.1), finally 405 obtaining an updated forecast of the PM2.5 concentration, which, as defined in Sect. 3.1, is called the "assimilation run". The validity of targeted observations in improving PM2.5 forecasts of the control run is quantified by two indices defined by Eqs. (5) and (6),    Tables 2 and 3. For a 24-hour lead time of the forecast at the AT, assimilating the five observation arrays can improve the PM2.5 forecast skill by reducing the forecast errors ranging from 4.29 440 ug/m 3 to 6.91 ug/m 3 , accounting for 12.54% to 20.20% of the forecast errors in control runs measured by AEV at the AT; the mean forecast errors during the whole forecast period can decrease from 19.79% to 29.20% measured by AEM (exactly from 3.58 ug/m 3 to 5.28 ug/m 3 ) ( Table 2) (Fig. 9a). When the observation arrays are deployed 12 hours before the AT, a larger improvement in forecasting skills can be found (Table 2). Of the five observation arrays, the improvements in forecasting skills at the AT measured by AEV range from 24.53% to 43.26%, and the mean improvement during the whole forecast 450 period measured by the AEM ranges from 32.84% to 50.81%, where the observation array deployed at a distance of 150 km shows the largest improvements in terms of both AEV and AEM despite the 22 observations being relatively sparse in this array. Overall, the observations deployed 12 hours before the AT in the sensitive areas identified by the CNOP-type errors measured by the VI show better performances than those deployed 24 hours before the AT. Thus, if we care about improving the PM2.5 455 forecast at the AT and the number of observation positions is fixed at 15 (only accounting for 0.17% of the grids over the domain), the observation array with an observation position distance of 150 km deployed in the sensitive areas (i.e., locations in Beijing and Tianjin cities) at 12 hours before the AT might be the optimal choice for targeted observations; in this case, the forecast error of PM2.5 could decrease by as much as 43.26% at the AT in terms of the AEV and 50.81% during the whole forecast 460 period in terms of the AEM (see also Table 2).

DT with a lead time of 24 hours. The black rectangle is the verification area.
To improve the PM2.5 forecast at the DT, five observation arrays in the corresponding sensitive areas can be similarly obtained, and of these arrays, their assimilation runs improve the PM2.5 forecast skills 470 with the AEV varying from 20.87% to 44.72% (exactly from 13.39 to 28.77 ug/m 3 ) and the AEM varying from 27.31% to 40.83% (exactly from 8.27 to 11.90 ug/m 3 ; Table 3)  However, when the lead time is reduced to 12 hours, the mean improvements are less than the forecast with a lead time of 24 hours, with the AEV varying from 20.92% to 31.01% (exactly from 11.24 to 16.66 ug/m 3 ) and AEM varying from 27.81% to 40.00% (exactly from 6.95 to 10.00 ug/m 3 , Table 3). Among the 5 observation arrays, the observations with an observation position distance of 90 km show the largest 480 improvement in both AEV and AEM, which is different from the optimal observation array of observation positions every 150 km deployed 24 hours before the DT. In contrast, the last array has the worst performance. Overall, if we care about improving the PM2.5 forecast skills at the DT, the optimal observation arrays should be deployed over the sensitive areas (i.e., locations in Mongolia) with an observation position distance of 150 km 24 hours before the DT, and assimilating the observations could 485 reduce the forecast errors by as much as 44.72% at the DT measured by AEV and 40.83% during the forecast period measured by the AEM. All these results are also summarized in Table 3. Through a series of OSSEs, the effectiveness of targeted observation is conducted by deploying a 24 fixed number of observations (15 horizonal grids through 4 pressure levels) and observations deployed 490 at different distances are evaluated to determine the optimal observation array. The results shows that when the observation number is fixed, an appropriate observing distance, not necessarily a large observing distance, is essential to obtain the largest improvement of PM2.5 forecast skills. To further examine the role of appropriate observing distance, we also conducted the following experiments that observations are deployed within a limited area with different observing distances (which corresponds to 495 different observation numbers in the limited area). Specifically, we first select a number of 120 most sensitive grids as the sensitive area in each of the four forecasts according to the VI value. Within the given size of the sensitive area, the observing arrays with the distance of 30, 60, 90, 120 and 150km are determined, with the same method as the experiments described above. The additional observations are assimilated to the control run and the improvements of PM2.5 forecast skills are shown in Figure 10. For 500 the two forecasts at the AT and the forecast at the DT with the lead time of 24 hours, the observation arrays with a distance of 30km shows the largest improvement in both AEV/AEM. It implies that in the given size of sensitive area, denser observation sites can better resolve the synoptic initial conditions within the sensitive area, which in turn enhance the forecasting skills more effectively. However, for the forecast at the DT with lead time of 12 hour, the observations with the distance of 90km show the largest 505 improvement. It implies that in this forecast, it is not necessarily much denser observation locations but an appropriate one that is much important for improving the PM2.5 forecasts. Thus, it emphasized that the observations deployed at a large distance or a high density, will not necessarily result in the largest improvement of PM2.5 forecast skills. It is suggested that, when we implement the field campaigns, the observations should be deployed carefully with an appropriate distance to get the largest benefits.

PM2.5 forecasts 515
The results in Sect. 3.2 show that assimilating targeted observations in the sensitive areas determined by the CNOP-type errors can largely improve the PM2.5 forecasting skills (hereafter CNOP-EXPs). To further illustrate the usefulness of CNOP in identifying the sensitive area for targeted observations, in this section, we compare the sensitive areas and other areas surrounding the BTH region.
Apart from the sensitive areas identified by CNOP-type errors, other areas surrounding the BTH 520 region are mainly located in the southwestern, southeastern, eastern and northern parts of the BTH region.
Previous studies demonstrated that the PM2.5 concentrations in the BTH region are continuously influenced by weather conditions (especially wind anomalies) in the southwestern and northern parts of the BTH region (Sun et al., 2019;Zhang et al., 2018). Specifically, they showed that southwesterly wind anomalies tend to transport the polluted air from the southwestern part to the BTH region and that 525 northerly wind anomalies blow away BTH pollution. It therefore seems that the PM2.5 forecasts are more sensitive to the meteorological conditions along the southwestern (i.e., Shanxi province) and northern (i.e., Inner Mongolia province) directions of the BTH region. To examine this sensitivity, we select two areas in these two directions, which are similar to the sensitive areas identified by the CNOP-type errors and surround the BTH region. Specifically, we refer to these two areas as Region-W (100.5-113.5 o E, 530 29.5-36.0 o N) and Region-N (115.5-126.0 o E, 42.5-51.0 o N), whose area sizes are approximately the same as those of the sensitive areas identified by the CNOP-type errors. In each region, we calculate the initial errors of meteorological conditions that lead to the largest forecast error at the verification time in the BTH region, which represents the most sensitive initial errors in this area to PM2.5 forecasts. The algorithms are the same as in calculating the CNOP-type errors, but the initial perturbations are restricted 535 to only Region-W and Region-N. We also use the vertical integral of the errors (VI) to determine the observation arrays and evaluate the sensitivity of PM2.5 forecasting uncertainties to the meteorological initial errors over these two regions. Specifically, the observation arrays in these two areas are constructed with the same configuration as in the area identified by CNOP-type errors. Then, five observation arrays are similarly obtained for Region-W and Region-N. Two groups of experiments are implemented 540 separately for the abovementioned 4 forecasts, i.e., the forecasts aimed at the AT with lead times of 12 and 24 hours and those aimed at DT with lead times of 12 and 24 hours. 26 The results are shown in Tables 2 and 3. For the 24-hour lead time forecast at the AT, the five observation arrays in Region-W are assimilated, and they can improve the PM2.5 forecast skill of the BTH region with an improved AEV ranging from 3.16% to 7.60% and AEM ranging from 5.12% to 11.30% 545 (see Table 2). These improvements measured by AEV and AEM are approximately one-third of those in CNOP-EXPs on average for the five observation array assimilations, with the former being 5.57% and 16.48% and the latter being 8.53% and 25.17% for AEV and AEM, respectively. In particular, although the observation array with a distance of 90 km has the best performance for the improvements in the PM2.5 forecasts in Region-W, this improvement is still lower than that of the worst one among the forecasts 550 with the five observation arrays in CNOP-EXPs. When the five observation arrays are deployed over Region-N and assimilated to forecast the PM2.5 in the control run, the AEV values at the AT are all negative for a lead time of 24 hours, which indicates a decline in the forecasting skills for the PM2.5 at the AT compared with the control run, whichever observation array is assimilated. For the mean of the forecast skill during the whole forecast period (as measured by AEM), the observation array with an adjacent 555 distance of 150 km presents a negative value of AEM when it is assimilated to forecast PM2.5, while the other four observation arrays present a positive value of AEM, but with the mean improvement being only 1.67%, far less than 25.17% in CNOP-EXPs. It is reasonable that assimilating observations in the Region-N may result in a worse forecast. Theoretically, if the observations in the area where the forecast is not sensitive to the initial values are assimilated, the forecasting skills will be improved slightly or neutral. 560 However, when implementing the realistic prediction, the imperfect procedure of data assimilation, the observation errors, model errors, the unresolved scales and processes in the model and other combined effects may induce additional errors (Janjić et al., 2018), which may cause the fact that assimilating observations in the unsensitive area results in a worse forecast. That also indicates that the Region-N is not the sensitive area for the forecast at the AT. For the 12-hour lead time PM2.5 forecast at the AT, we 565 also show that the five observation arrays in Region-W and Region-N present far fewer improvements in PM2.5 forecast skills than those in CNOP-EXPs when they are assimilated to forecast PM2.5 (see Table   2). Specifically, the improvements measured by the AEV averaged for the five observation arrays in Region-W and Region-N (i.e., 14.08% and -0.33%, respectively) are approximately one-third and one hundredth of that (i.e., 36.34%) in CNOP-EXPs, and the improvements measured by AEM (i.e., 15.92% 570 and 2.41%, respectively) are approximately one-third and one-twentieth of that (i.e., 43.62%) in CNOP-27 EXPs, respectively. From the above experiments, it is obvious that, for the 24-and 12-hour lead time forecasts at the AT, the five observation arrays deployed in Region-W (Region-N), although they often enhance the forecast skill of PM2.5 against the control run, present amplitudes of improvement in the PM2.5 forecast skill significantly smaller than those in the CNOP-EXPs. This shows that the sensitive 575 areas for targeted observations of meteorological fields associated with the PM2.5 forecast at the AT are most likely to be the ones identified by the CNOP-type errors, rather than Region-W and Region-N.
For the PM2.5 forecasts at the DT, the results also illustrate the strong sensitivity of the targeted observations in the sensitive area identified by the CNOP-type errors. Specifically, for the 24-hour lead time forecast, the observation arrays in Region-W tend to benefit the PM2.5 forecast, and the improvement 580 averaged for five observation arrays is 18.68% for the AEV and 12.68% for the AEM, which are both nearly half of those in the CNOP-EXPs; when the five observation arrays are deployed in Region-N, they all lead to worse forecasts at the DT than the control run, with the AEV varying from -0.92% to -0.15% and the AEM from -3.43% to -0.49% (Table 3). For the 12-hour lead time forecasts, the five observation arrays deployed in Region-W do not significantly improve the PM2.5 forecast, with AEV values ranging 585 from -0.45% to 4.83% and AEM values ranging from -2.86% to 2.62%; in contrast, the five observation arrays deployed in Region-N considerably improve the PM2.5 forecasts, with AEV ranging from 12.52% to 15.07% and AEM ranging from 15.00% to 16.64%, where the observation array with an adjacent distance of 30 km shows the best performance of the 5 observation arrays for improving the PM2.5 forecast skill. Despite this, the improvement is still less than that of the worst forecast in CNOP-EXPs 590 with the observation array with an adjacent distance of 150 km. Specifically, the improvements in AEV and AE M are 14.03% and 15.85%, respectively, which are both averaged for 5 observation arrays and approximately 50% lower than those in CNOP-EXPs. Therefore, the sensitive areas for targeted observation of meteorological fields associated with the PM2.5 forecast at the DT are the ones identified by the CNOP-type errors, i.e., the areas from Huhhot in Inner Mongolia to the Altai Mountains in 595 Mongolia for a lead time of 24 hours, and Zhangjiakou and Chengde cities, which lie in the northern part of the BTH region for a lead time of 12 hours.

Interpretation
In this section, we further interpret why the sensitive area identified by CNOP-type errors can result in a 28 larger improvement of PM2.5 forecast skill. It is known that dynamic and thermodynamic conditions are 600 two key factors that determine the transport and deposition of pollution. With a relatively strong wind, pollution could be transported to the downwind region in a short time, while a relatively clam wind could favor ground pollution accumulation. For the BTH region, northerly winds blow away PM2.5, while southerly winds lead to the accumulation of PM2.5 through the blocking effect of the surrounding mountains (Zhao et al., 2009). Thermodynamic conditions such as the strong temperature inversions in 605 the atmospheric boundary layer are also favorable for the accumulation of air pollutants to form air pollution events (Miao et al., 2015). Moreover, an increased temperature may accelerate the production rate of precursors and secondary pollutants, and a higher humidity notably increases the hygroscopic growth of PM2.5 (Liao et al., 2017), which contribute to variations in ground-level PM2.5.
In this paper, we showed that the "control run" either with a lead time of 12 hours or 24 hours 610 presents a severe underestimation of PM2.5 at the AT, and a large overestimation of PM2.5 at the DT for the heavy air pollution event that occurred from 30 November to 4 December 2017 (see Sect. 3.1). The assimilation runs greatly promote the skill of these PM2.5 forecasts by assimilating the targeted observations in the sensitive areas of the meteorological fields. Now, we interpret why the assimilation runs increase the PM2.5 forecast skill for dynamic and thermodynamic reasons. After we compare the 615 forecast biases of the control run with lead times of 12 and 24 hours, we find that the forecast biases of the control run under the two leading times are almost the same. For simplicity, we present the forecast with a lead time of 24 hours. Figure 11 shows the differences in the wind and temperature fields between the truth run and control run at ground level at the AT and DT with a lead time of 24 hours. The truth run presents significant southerly winds with a mean speed of 2.32 m/s over the BTH region (see Fig. 11 (a)), 620 while the control run forecasts a southerly wind with a mean speed of 0.74 m/s (see Fig. 11 (b)) and exhibits northerly wind biases, as shown in Fig. 11 (c). The weak southerly wind in the control run reduces the pollution transported from the south to the BTH region in the truth run, which results in a significant underestimation of the PM2.5 concentration of the control run at the AT. In addition to this dynamic reason, the thermodynamical conditions are also key factors influencing the PM2.5 forecasts. 625 Both the truth run and the control run are able to simulate the temperature inversion layer, which prevents vertical dispersion of pollutants and promotes the accumulation of surface PM2.5. For the forecasts at the AT, the truth run has forecasted 0.11K/100m vertical temperature inversion layers at Dongsi station 29 in Beijing City (the temperature arises 0.11K every 100m), whist the control run has forecasted 0.05K/100m. The mean lapse rate simulated by the truth run over the BTH region is 0.03K/100m and the 630 control run has forecasted a 0.002K/100m. So the truth run simulated a more stable thermodynamic condition, which is favorable for the accumulation of surface air pollutants. Meanwhile, the negative temperature bias in the near surface of the control run decreases the production rate of precursors of PM2.5 and the negative bias of relative humidity reduces the useful carrier of PM2.5, causing a decrease in PM2.5, finally favoring the underestimation of PM2.5 at the AT in the control run. 635

640
From the above, it is clear that the control run exhibits northerly wind, a less table boundary layer, low temperature and relative humidity biases at the AT relative to the truth run. However, after assimilating the artificial meteorological variables over the sensitive areas determined by the CNOP-type errors to the initial analysis field of the control run, the PM2.5 forecasts are promoted in forecasting skill.
For the forecasts with lead times of 12 and 24 hours, the interpretations of why the assimilation runs 645 increase the PM2.5 forecast skill and its related mechanisms are similar. For simplicity, we present the interpretations in detail for the forecast with a lead time of 24 hours. In Fig. 12, we plot the spatial evolution of the 24-hour forecast differences of wind and PM2.5 concentrations between the CNOP-EXP 30 and control run. From Fig. 12 , we can see that the sensitive areas for the PM2.5 forecast at the AT are mainly located in the southern and northwestern parts of the BTH region (also see Fig. 7), and 650 assimilating meteorological observations over the sensitive areas increases the southerly wind in the southern part of the BTH region at the initial field and finally enhances the southerly wind by 0.18 m/s over the BTH region at the verification time, which is helpful for transporting southern pollution to the BTH region. Between the two areas, the sensitive area near the Inner Mongolia plays a more dominant role on the PM2.5 forecast of BTH region, by inducing a larger southerly wind component. In addition, 655 the assimilation run has forecasted 0.06K/100m temperature inversion layers at Dongsi station and the mean lapse rate over the BTH region has reached to 0.004K/100m. The slightly improved thermodynamic conditions further result in the modifications of the boundary layer structure featuring a decreased PBL height. The mean boundary layer height over the BTH region has decreased from 261m in the control run to 256m in the assimilation run, which also contributed to the increased ground level PM2.5 pollution 660 and improved the PM2.5 forecast skill in the assimilation run. Moreover, assimilating the targeted observations increases the initial temperature and relative humidity in the western parts of the BTH region and decreases them in the northwestern parts of the BTH region. Then, the western warm air moves easterly, and the northwestern cool air moves southeasterly, which finally decreases the temperature by 0.05 °C and the relative humidity by 0.6% at the AT over the BTH region. Decreased 665 temperature and relative humidity are not beneficial for the formation of PM2.5. From the above analysis, it can be found that the improvements in the PM2.5 forecast skill in assimilation runs result from the increased south wind and more stable boundary layer during the accumulation process. For the forecast at the DT, the truth run presents a large northerly wind with a mean speed of 5.24 m/s, as shown in Fig. 11(d), which blows the pollution from the BTH region to the south. However, the 675 control run forecasts a southerly wind with a mean speed of 1.82 m/s (Fig. 11 (e)), which is the reverse of the truth run and might transport more pollution from the southwestern part to the BTH region than from the BTH region to the south in the truth run, finally contributing to the overestimation of the PM2.5 concentration in the control run. Meanwhile, the control run also presents a warm temperature and much higher relative humidity biases, which prevent the dissipation of PM2.5 over the BTH region and favors 680 the overestimation of PM 2.5 at the DT (see Fig. 11(f)). When the targeted observations are assimilated to the control run at 24 hours before the DT and then the assimilation run is formulated, it increases the 32 northerly wind and decreases the temperature and relative humidity in the sensitive areas at the initial time, which subsequently drives much cool and dry air in the sensitive area (i.e., the northwestern part of the BTH region; also shown in Fig. 7) to the south and accumulates over the BTH region (see Fig. 13), 685 finally decreasing the temperature and relative humidity over the BTH region at the verification time, improving the forecasts of the PM2.5 concentrations in the assimilation run at the DT. It is obvious that the improvement of both the dynamic and thermodynamic conditions is responsible for the increase in the PM2.5 forecast skill at the DT in the assimilation run. 690 Figure 13: The same as in Figure 12, but for the forecast starting from 14:00 on 2 December.

Summary and discussion
Motivated by the important role of the meteorological initial field in air quality forecasts, we make the first attempt in applying the targeted observation strategy of the meteorological fields with the CNOP 695 approach to the improvement of PM2.5 forecasts using the WRF-NAQPMS model. By considering a heavy haze episode that occurred from 30 November to 4 December 2017 in the Beijing-Tianjin-Hebei region, we explore the effect of possible "targeted observations" on PM2.5 forecasts during both the accumulation and dissipation periods of the haze event, where the targeted observations are represented by observation arrays consisting of 15 evenly and horizontally distributed grids through 4 pressure levels 700 (i.e., 950, 850, 750, 500 hPa) in the sensitive areas identified by the CNOP-type errors and that include horizontal wind, temperature, and relative humidity components.
To improve the PM2.5 forecast during the accumulation and dissipation periods of the haze event, forecasts with lead times of both 12 and 24 hours are investigated, where the AT (i.e., accumulation time, 02:00 BJT on 2 December) and DT (i.e., dissipation time, 14:00 BJT on 3 December) are selected as the 705 verification times (i.e., the forecast times), respectively. We first calculate the CNOP-type errors for these 4 forecasts separately. Then, since the CNOP-type errors concentrate on different vertical levels and in different horizontal areas for different meteorological variables, including wind, temperature and moisture components, we propose using the vertical integral of CNOP-type errors to measure the comprehensive sensitivity of initial errors and to determine the sensitive areas for targeted observations 710 of meteorological fields associated with the PM2.5 forecasts. The results show that for the verification time AT, the sensitive areas identified by CNOP-type errors mainly concentrate in Dezhou city and central Numerically, we conducted a series of OSSEs to explore whether the possible "targeted observations" in the above sensitive areas can improve the PM2.5 forecasts of the BTH region and then to infer the usefulness of these sensitive areas in implementing practical field observations. For each of the 4 forecasts, we tried different observation arrays of 15 evenly and horizontally distributed grids through 4 720 pressure levels in the sensitive areas and assimilated them to the initial fields for evaluating the 34 improvement of PM2.5 forecasting skill, finally suggesting a more useful observation array for improving the forecasts at the AT and DT. Specifically, for the forecast at the AT, the observation array with a grid space of 90 km in the sensitive area is more effective for a 24 hour lead time and a grid space of 150 km performs the best for a 12 hour lead time; however, for the forecast at the DT, the observation array of a 725 grid space of 150 km leads to a better forecasting skill at a 24 hour lead time while that with a grid space of 90 km results in a higher forecasting skill at a 12 hour lead time. To further confirm the usefulness of CNOP in identifying the sensitive area for targeted observations, we compare the improvements of PM2.5 forecasts after assimilating "targeted observations" in the sensitive areas and the additional observations in the areas along the southwestern (Region-W) and northern (Region-N) directions of the BTH region 730 suggested by previous studies. The results show that the improvements of the PM2.5 forecasting skills with the additional observations deployed in Region-W and Region-N are significantly smaller than those in the sensitive areas determined by the CNOP approach; in particular, assimilating the additional observations over Region-W and Region-N cannot ensure a positive forecast benefit. All these results indicate that preferentially implementing additional observations in the sensitive area determined by the 735 CNOP approach is more likely to significantly improve the PM2.5 forecasts.
Physically, we interpret the reason why the possible targeted observations can significantly improve the PM2.5 forecasting skill by comparing the relevant meteorological fields before and after assimilation.
Since the interpretation and its related mechanisms are similar for the forecasts with lead times of 12 and 24 hours, we present only the interpretations in detail for the forecast with a lead time of 24 hours. During 740 the accumulation process, the control run forecasts a weaker southerly wind and a less stable boundary layer at the AT, which is unfavorable for the accumulation of PM 2.5 and finally leads to a severe underestimation of PM2.5 at the AT. When the targeted observations are assimilated to the control run, the southerly wind increases in the southern part of the BTH region at the initial state and finally enhances the southerly wind over the BTH region at the verification time. The increased southerly wind transports 745 more PM2.5 from the south to the BTH region and improves the PM2.5 forecasting skills of the control run at the AT. The assimilation also induces a more stable boundary layer in the assimilation run, which contributed to the increased ground level PM2.5 pollution and improved the PM2.5 forecast skill. For the forecast at the DT, the control run exhibits large southerly wind and positive temperature and relative humidity biases, which prevents the dissipation of PM2.5 and results in an overestimation of PM2.5 at the 750 35 DT. When the targeted observations are assimilated to the control run, it increases the northerly wind and decreases the temperature and relative humidity in the sensitive areas at the initial state. The increased northerly wind drives the cool air in the sensitive area southward and finally blows more PM2.5 from the BTH region to the south, which improves the PM2.5 forecasting skills of the control run at the DT.
The present study provides numerical and physical evidence that the sensitive areas of 755 meteorological initial fields associated with the PM2.5 forecasts indeed exists and deploying "targeted observations" of meteorological fields in the sensitive areas determined by the CNOP approach can significantly improve PM2.5 forecasts. Such results formulate a theoretical basis to implement practical field campaigns associated with air quality forecasts. In the practical field campaigns, though the reanalysis data cannot be obtained in time, one can choose the forecast data from ECMWF, which are 760 widely regarded as the best and most reliable forecast data currently, as initial field to yield a better forecast. Based on this forecast, one can compute the CNOP-type error to identify the sensitive area and design the relevant filed observation networks. Such ideas have been applied on real-time typhoon forecasting and it has been verified to be able to improve greatly the typhoon forecasting skills Qin et al., 2022). It is also noted that even if sufficient observations exist, the results in 765 the present study can tell us which area of the observations should be preferentially assimilated to improve air quality forecasts.
As the first attempt to study the effect of targeted meteorological observations on improving air quality forecasts, we only utilized one event and, in the future, more events should be investigated to obtain a systematic and comprehensive conclusion about how to deploy targeted observations to improve 770 PM 2.5 forecasts. Meanwhile, in the present studies, finite meteorological variables (wind, temperature, pressure, and water vapor) are selected to represent the sensitivity of meteorological initial fields on PM2.5 forecasts. Though they are recognized as important meteorological variables on PM2.5 forecasts over the BTH region , to get a comprehensive conclusion, the sensitivities of more meteorological parameters such as boundary layer height and atmospheric stability, which may not 775 belong to an initial value problem but can be explored by the extension of CNOP method, such as CNOPparametric perturbation (CNOP-P; Mu et al, 2010) or nonlinear forcing singular vector method (Duan and Zhou, 2013). Also, a WRF with the horizontal resolution of 30 km was preliminarily tried in the present study. Beyond doubt, this resolution is relatively low for the PM2.5 forecasts. Nevertheless, the 36 sensitive areas revealed in the present study are still instructive for practical field observations of PM2.5 780 forecasts because of the verifications through a series of OSSEs and reasonable physical interpretation shown in the context. In any case, a WRF with much higher resolution should be used in the future. In addition, only two verification times were adopted for determining sensitive areas and dependence of sensitive areas on forecasting times was not explored, which will be addressed in next paper.
In addition to meteorological inputs, emissions are also a key input for air quality forecasts. Accurate 785 emission inputs are difficult enough in terms of their high uncertainties in time and 3-D space, and it is also challenging to satisfy the need for highly confident simulations of a specific event (Peng et al., 2017).
Targeted observation may be a better strategy to improve the quality of emissions, and the determination of sensitive areas of emissions is certainly important. Previous studies have adopted the singular vector decomposition and adjoint sensitivity methods to identify the sensitive area for the emissions (Daescu et 790 al., 2003;Goris and Elbern, 2013). However, it should be noted that the above two strategies are based on linear approximation of initial error evolutions and deploying the observations over the sensitive areas identified by these two strategies may not result in the largest improvement over the verification area, especially for the medium-and longer-range forecasts (Wang et al., 2011). Our current study represents the first step in studies of targeted observation strategies of meteorological variables associated with air 795 quality forecasts with the application of CNOP, and only observations of meteorological fields are explored. Then, targeted observations of emissions based on the CNOP approach are expected to be studied for air quality forecasts in the future.
Data availability. Hourly surface PM 2.5 data are obtained from China National Environmental 800 Monitoring Center (http://www.cnemc.cn/). The ERA5 reanalysis product is available at https://www.ecmwf.int/en/forecasts/datasets/reanalysis-datasets/era5. The NCEP GFS product is available at https://rda.ucar.edu/ datasets/ds084.1/. The data generated and/or analyzed during the study are stored on the computers at State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics (LASG; https://www.lasg.ac.cn) and will be available to researchers 805 upon request.