Reconstruction of PM2.5 Concentrations in East Asia on the Basis of a Wide–Deep Ensemble Machine Learning Framework and Estimation of the Potential Exposure Level from 1981 to 2020
Shuai Yin, Chong Shi, Husi Letu, Akihiko Ito, Huazhe Shang, Dabin Ji, Lei Li, Sude Bilige, Tangzhe Nie, Kunpeng Yi, Meng Guo, Zhongyi Sun, Ao Li
a Key Laboratory of Remote Sensing and Digital Earth, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
b Graduate School of Life and Agricultural Sciences, The University of Tokyo, Tokyo 113-8654, Japan
c State Key Laboratory of Severe Weather, Key Laboratory of Atmospheric Chemistry of CMA, Chinese Academy of Meteorological Sciences, Beijing 100081, China
d Information Center, Inner Mongolia Normal University, Hohhot 010028, China
e School of Water Conservancy and Electric Power, Heilongjiang University, Harbin 150006, China
f State Key Laboratory of Urban and Regional Ecology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
g School of Geographical Sciences, Northeast Normal University, Changchun 130024, China
h College of Ecology and Environment, Hainan University, Haikou 570228, China
i School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
Satellite observations are widely used to estimate the concentrations of surface air pollutants, but the temporal coverage of these datasets is relatively short. To overcome this limitation, we propose a wide–deep ensemble machine learning framework to reconstruct the fine particulate matter (particulate matter with a diameter of less than 2.5 μm (PM2.5)) dataset of East Asia (EA) over the past four decades (1981–2020). The results indicate that the framework effectively leveraged the advantages of satellite observations (higher accuracy) and model-based estimations (longer temporal coverage) of surface air pollutants. The reconstructed PM2.5 concentrations agreed well with the ground measurements, with coefficient of determination (R2) and root-mean-square error (RMSE) values of 0.99 and 1.38 μg·m−3, respectively, outperforming the satellite-based PM2.5 estimates. As more ground measurements were incorporated into the model for training, the average RMSE in Japan and the Korean Peninsula decreased to 0.83 and 1.50 μg·m−3, respectively. On the basis of the reconstructed datasets, we also investigated the exposure level to PM2.5 in EA from 1981 to 2020. Since 2000, the increase in anthropogenic emissions has substantially worsened the air quality in EA, and nearly 50% of the population resided in areas where the annual average PM2.5 concentrations exceeded 50 μg·m−3 from 2009 to 2010. Despite the implementation of various mitigation strategies by local authorities to lower the ambient PM2.5 concentrations, the overall exposure level in EA is still unlikely to meet the World Health Organization (WHO) air quality guidelines. In addition, population aging and climate change have the potential to increase PM2.5 exposure risk in the future. For policy-makers in EA, it is essential to consider the effects of these factors and develop more effective mitigation strategies that aim to lessen the health impact associated with PM2.5 exposure.
Shuai Yin, Chong Shi, Husi Letu, Akihiko Ito, Huazhe Shang, Dabin Ji, Lei Li, Sude Bilige, Tangzhe Nie, Kunpeng Yi, Meng Guo, Zhongyi Sun, Ao Li.
Reconstruction of PM2.5 Concentrations in East Asia on the Basis of a Wide–Deep Ensemble Machine Learning Framework and Estimation of the Potential Exposure Level from 1981 to 2020.
Engineering, 2025, 49(6): 238-252 DOI:10.1016/j.eng.2024.09.025
Air pollution poses a significant global environmental challenge and has adverse effects on human health and the earth–atmosphere system [1], [2], [3], [4], [5], [6]. The World Health Organization (WHO) estimated that nearly the entire global population (99%) breathes air that exceeds WHO standards, resulting in approximately seven million premature deaths every year [7]. PM2.5, which refers to particulate matter with a diameter of less than 2.5 μm, is commonly considered the most representative type of air pollutant. The issue of PM2.5 pollution, especially in rapidly developing areas, such as China, India, and Southeast Asia, has received increasing attention in recent decades [8], [9], [10], [11]. The mechanisms contributing to urban PM2.5 pollution are very complex: escalating fossil fuel consumption directly amplifies primary PM2.5 levels, and anthropogenic emissions of gaseous air pollutants significantly promote the formation of secondary aerosols [12], [13], [14], [15]. Additionally, the surface-level PM2.5 concentration is sensitive to meteorological conditions, which influence circulation patterns, pollutant transport, and other processes, such as photochemistry and wet/dry deposition [16], [17], [18], [19]. Many epidemiological studies have shown that short- or long-term exposure to PM2.5, even at low concentrations, increases the risk of cardiovascular and respiratory diseases [20], [21], [22], [23]. Therefore, the WHO significantly strengthened its air quality guidelines (AQG) in 2021. Notably, the recommended limit for annual mean PM2.5 exposure was revised from 10 to 5 μg·m−3, and the daily concentration limit was adjusted downward from 25 to 15 μg·m−3 [24].
Accurate observation and reliable estimation of surface-level PM2.5 concentrations are prerequisites for air quality management, health risk assessment, and the creation of mitigation plans [25], [26]. Despite the expansion of ground-based monitoring networks, the sparse and uneven distribution of monitoring stations, particularly in developing countries, makes accurate measurements of air pollutants challenging [27]. Earth-observing satellites, with a synoptic view and repetitive coverage, offer a powerful tool for estimating surface air pollutants. They hold particular promise in regions with limited or no ground-based monitoring stations, effectively filling critical data gaps [28], [29], [30], [31], [32]. In response, various statistical models have been constructed to investigate the complex nonlinear link between the aerosol optical depth (AOD) observed by satellites and ground-level PM2.5 concentrations. These models include the empirical model, land use regression model, geographically weighted regression model, and machine learning model [33], [34], [35], [36], [37].
However, the majority of ground- and satellite-based aerosol observations cover only the time span of the past two decades. Che et al. [38] established the Chinese Sun Hazemeter Network (also known as CSHNET) in 2004, with 24 monitoring stations located throughout China to assess the optical characteristics of aerosol particles and their spatial and temporal variations. The Moderate Resolution Imaging Spectroradiometer (MODIS) onboard the Terra and Aqua satellites acts as an essential tool for monitoring global aerosol variations and has been operational for two decades [39], [40], [41]. Given that global or regional PM2.5 estimates have come primarily from satellite data since the early 2000s, spatial distribution datasets of ambient PM2.5 before 2000 are very limited. The absence of continuous and uniform PM2.5 datasets is becoming one of the greatest barriers to long-term (i.e., over several decades) epidemiological and policy-assessing studies. In the present study, we integrated ground measurements and satellite- and model-based PM2.5 estimations to construct a PM2.5 distribution dataset for East Asia (EA) from 1981 to 2020. The developed framework is based on three machine learning approaches, and a novel wide–deep model is proposed to ensemble the prediction results. On the basis of the integrated datasets, we investigated the exposure level to PM2.5 in EA over the past four decades. With rapid economic growth and increased emissions from fossil fuel consumption, EA has been suffering from severe and persistent air pollution since the 1980s. The results of this study provide essential information and theoretical support for the quantification and characterization of the spatiotemporal variation in PM2.5 in EA. Furthermore, these findings will have significant implications for the development of more effective mitigation plans aimed at reducing the detrimental health impacts of PM2.5 exposure in the future.
2. Datasets and methodology
2.1. Datasets
2.1.1. Ground-level PM2.5 observations
The study region consists of four countries: China, Japan, the Republic of Korea, and the Democratic People’s Republic of Korea. Owing to differences in socioeconomic conditions and development phases, there are significant variations in the temporal and geographic coverage of air quality monitoring stations among these countries. For example, air pollutant monitoring networks in Japan and the Republic of Korea were established in the early 1970s, with approximately 1900 monitoring stations nationwide in Japan and 460 monitoring stations in the Republic of Korea [42], [43]. In contrast, China introduced the new National Ambient Air Quality Standard in February 2012, which set limits on PM2.5 for the first time. By 2015, the air quality monitoring networks had expanded to cover every prefecture-level city in China, with the number of national monitoring stations nearly tripling from 2012 to 2021. In the present study, ground-based measurements of PM2.5 (2015–2020) in EA were obtained from the National Environmental Monitoring Center of China, Japan’s National Institute for Environmental Studies, and the Republic of Korea’s Ministry of the Environment. To reduce the effects of observational inconsistencies, stations with fewer than 320 d of PM2.5 measurements per year or fewer than 20 d of measurements per month were excluded. Consequently, the annual mean PM2.5 concentrations recorded during the 2015–2020 period in 352 Chinese cities, 583 cities in Japan, and 16 cities in the Republic of Korea were selected for this study. The spatial heterogeneity of ground-level PM2.5 observations in China is notable. The eastern and southern regions of China, which are characterized by high population density and severe air pollution, host more than 70% of the national monitoring stations. In contrast, despite the vast expanse of western China, fewer than 10% of the monitoring stations are situated in this region (Fig. S1 in Appendix A).
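The completeness screening described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `passes_completeness` and the representation of a station-year as a list of dates with valid measurements are hypothetical, and the monthly threshold is assumed to apply to every calendar month.

```python
from collections import Counter
from datetime import date, timedelta


def passes_completeness(valid_days, min_per_year=320, min_per_month=20):
    """Return True if one station-year satisfies both completeness thresholds.

    valid_days: dates (all within one calendar year) that have a valid
    PM2.5 measurement. Assumption: the monthly threshold must hold for
    each of the twelve months.
    """
    days = set(valid_days)
    if len(days) < min_per_year:
        return False
    per_month = Counter(d.month for d in days)
    return all(per_month.get(month, 0) >= min_per_month for month in range(1, 13))


# Usage: a complete year passes; a year with a sparse February does not.
full_year = [date(2019, 1, 1) + timedelta(days=i) for i in range(365)]
sparse_feb = [d for d in full_year if not (d.month == 2 and d.day > 10)]
print(passes_completeness(full_year), passes_completeness(sparse_feb))
```

Note that the second case fails on the monthly criterion even though 347 valid days remain in the year.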
2.1.2. Satellite-based estimation of PM2.5
The Atmospheric Composition Analysis Group (ACAG) estimates ground-level PM2.5 concentrations by integrating AOD retrievals from various satellite instruments (e.g., MODIS, Multi-angle Imaging SpectroRadiometer, and Sea-viewing Wide Field-of-view Sensor) and then calibrating them to global surface-level PM2.5 concentration monitoring networks through the use of a geographically weighted regression model [44], [45]. The ACAG series of ambient PM2.5 datasets offers high spatial and temporal resolutions (0.01° × 0.01° and 0.1° × 0.1°; monthly and annually) and shows good agreement with ground observations [46], [47], [48]. Thus, these data have been widely used to quantify exposure risk and assess air control policies on a regional or global scale. We used the newly released ACAG V5.GL.03 (spatial resolution: 0.1° × 0.1°; temporal coverage: 1998 to 2021) as a base dataset to reconstruct PM2.5 in EA. The V5.GL.03 products followed the methodology of the previous versions and updated the ground-based monitoring networks used to calibrate PM2.5 estimates for the entire time series, and the temporal coverage was extended to 2021 [49].
2.1.3. Model-based reanalysis of PM2.5
The Modern-Era Retrospective analysis for Research and Applications (MERRA) is a reanalysis dataset for the period covered by satellite observations that employs the Goddard Earth Observing System data assimilation system [50], [51]. Building on the achievements of the initial MERRA analysis, the new version (MERRA-2) has made several improvements, including the incorporation of new observations and the mitigation of artifacts in trends and discontinuities associated with modifications in the meteorological observing system [52], [53], [54]. The MERRA-2 aerosol reanalysis incorporates bias-corrected AOD observations from National Aeronautics and Space Administration (NASA) satellites and ground-based measurements from the Aerosol Robotic Network (AERONET) [55]. The dataset offers information starting in 1980 with a spatial resolution of 0.5° latitude by 0.625° longitude, encompassing the ground-level concentrations of five aerosol species (dust, sea salt, black carbon, organic carbon, and sulfate) originating from natural or anthropogenic emissions. Buchard et al. [56] introduced a methodology for estimating global ground-level PM2.5 concentrations from MERRA-2 aerosol reanalysis. In this study, we collected datasets from the MERRA-2 aerosol reanalysis and estimated the surface-level PM2.5 concentration as follows:
where CDust2.5, CSS2.5, CBC, COC, and CSO4 are the concentrations of dust, sea salt, black carbon, organic carbon, and sulfate aerosols (with diameters less than 2.5 μm), respectively. In the present study, the levels of the five aerosol species and PM2.5 estimates derived from MERRA-2 datasets served as predictors for reconstructing PM2.5 levels spanning from 1981 to 2020.
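As a rough illustration of how the five MERRA-2 species can be combined into a PM2.5 estimate, the sketch below sums them with two scaling factors. The factors are assumptions, not values taken from this text: 1.4 (organic carbon to organic matter) and 1.375 (sulfate ion to ammonium sulfate) are values commonly quoted for MERRA-2, and the function name `merra2_pm25` is hypothetical. The equation used in this study may differ.

```python
def merra2_pm25(dust25, ss25, bc, oc, so4, oc_factor=1.4, so4_factor=1.375):
    """Combine surface mass concentrations (same units, e.g. ug/m3) into PM2.5.

    oc_factor and so4_factor are ASSUMED defaults; replace them with the
    coefficients of the governing equation if they differ.
    """
    return dust25 + ss25 + bc + oc_factor * oc + so4_factor * so4


# Usage with made-up concentrations (ug/m3), for illustration only.
example = merra2_pm25(dust25=5.0, ss25=1.0, bc=2.0, oc=4.0, so4=8.0)
print(round(example, 2))
```

Exposing the two factors as parameters keeps the sketch usable if a different organic-matter or sulfate conversion is preferred.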
Yin [57], [58] proposed a method to use PM2.5 estimates from MERRA-2 as a proxy to extend the temporal scope of ACAG products, and the extrapolated results were compared with ground-based observations in EA and Southeast Asia, respectively. This method resolved the PM2.5 underestimation caused by the coarse resolution of the MERRA-2 datasets, and the extrapolated PM2.5 data were much closer to ground-based observations. Following the studies of Yin [57], [58], the extrapolation of the V5.GL.03 datasets prior to 1998 is defined by the following equation:
In this equation, the PM2.5 concentration extrapolated to the target year t is obtained from three quantities: the V5.GL.03 PM2.5 estimation for the baseline (BL) period (1998–2020), the MERRA-2 PM2.5 estimation for the BL period (1998–2020), and the MERRA-2 PM2.5 estimation in the target year t.
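One plausible reading of this extrapolation is a ratio scaling, in which the baseline V5.GL.03 value of a grid cell is multiplied by the ratio of the MERRA-2 estimate in the target year to the MERRA-2 baseline value. The sketch below assumes that form; the exact expression did not survive in this text, and the function name `extrapolate_pm25` is hypothetical.

```python
def extrapolate_pm25(acag_bl, merra_bl, merra_t):
    """Scale the ACAG baseline by the relative MERRA-2 change (ASSUMED form).

    acag_bl:  V5.GL.03 PM2.5 averaged over the baseline period, one grid cell
    merra_bl: MERRA-2 PM2.5 averaged over the same baseline period
    merra_t:  MERRA-2 PM2.5 in the target year t
    """
    if merra_bl <= 0:
        raise ValueError("baseline MERRA-2 PM2.5 must be positive")
    return acag_bl * merra_t / merra_bl


# Usage: MERRA-2 in the target year is 75% of its baseline, so the
# extrapolated value is 75% of the ACAG baseline.
print(extrapolate_pm25(acag_bl=40.0, merra_bl=20.0, merra_t=15.0))
```

Under this assumption the spatial detail comes from the finer ACAG baseline, while MERRA-2 supplies only the relative year-to-year change.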
2.1.4. Meteorological and other predictors
Previous studies [59], [60], [61] have demonstrated that meteorological conditions play essential roles in the development and fluctuations of surface-level air pollutants. Meteorological data for the study region were sourced from the fifth generation of the European Centre for Medium-Range Weather Forecasts reanalysis (ERA5). ERA5 spans from 1979 onward, offering a spatial resolution of 0.25° × 0.25°, and is renowned for its comprehensive depiction of the atmospheric three-dimensional structure [62], [63], [64]. Compared with its predecessor (ERA-Interim), ERA5 incorporates an updated version of the integrated forecast system model (IFS 41r2), with improved horizontal and vertical resolutions as well as several improvements to the data assimilation scheme [65], [66]. In the present study, eight climate variables were obtained from ERA5: 1000 hPa ambient temperature (TEM1000), surface shortwave downward radiation (SWDR), planetary boundary layer height (PBLH), precipitation (PRE), 1000 hPa relative humidity (RH1000), 1000 hPa zonal wind speed (U1000), 1000 hPa meridional wind speed (V1000), and 1000 hPa wind speed (WS1000).
We also accounted for the influence of topography and vegetation cover on the concentration of ambient PM2.5. The vegetation cover was represented by the leaf area index for high (LAIhigh) and low (LAIlow) vegetation types, which was obtained from ERA5-Land. Digital elevation data across EA were acquired from the Shuttle Radar Topography Mission (SRTM).
2.2. Methodology
2.2.1. Reconstruction framework of PM2.5
Following previous studies [67], [68], [27], bilinear interpolation was applied to downscale the MERRA-2 and ERA5 datasets to 0.1° × 0.1°, which is consistent with the spatial resolution of the ACAG datasets. Fig. 1 shows that the reconstruction framework of ambient PM2.5 consists of two main components: ➀ the base learners, comprising three machine learning models, namely, random forest (RF), gradient boosting machine (GBM), and extreme gradient boosting (XGBoost), which provide preliminary PM2.5 predictions; and ➁ a wide–deep model that combines the predictor variables and preliminary PM2.5 predictions to generate the final outputs. With the advancement of big data technology, machine learning has become a versatile tool in various disciplines [69], [70], [71]. RF, GBM, and XGBoost are three of the most widely utilized machine learning models for estimating the surface concentrations of air pollutants, and numerous studies have demonstrated that these models can achieve high prediction accuracy [72], [73], [74], [75], [76]. Therefore, these three models were selected as the base learners to predict the PM2.5 concentration, with their outputs serving as the input for the wide component of the wide–deep model.
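The bilinear interpolation step can be illustrated with a minimal pure-Python sketch. It assumes a regular grid with ascending coordinate lists; the function name `bilinear` and the toy grid are hypothetical, and a production workflow would normally rely on a gridded-data library instead.

```python
from bisect import bisect_right


def bilinear(grid, lats, lons, lat, lon):
    """Interpolate grid[i][j] (defined at lats[i], lons[j]) to the point (lat, lon).

    lats and lons must be ascending. Points outside the grid are linearly
    extrapolated from the nearest cell.
    """
    # Locate the lower-left corner of the cell that contains the target point.
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)
    ty = (lat - lats[i]) / (lats[i + 1] - lats[i])
    tx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    # Interpolate along longitude on both latitude rows, then along latitude.
    lower = grid[i][j] * (1 - tx) + grid[i][j + 1] * tx
    upper = grid[i + 1][j] * (1 - tx) + grid[i + 1][j + 1] * tx
    return lower * (1 - ty) + upper * ty


# Usage: one coarse 0.5 x 0.625 degree cell, evaluated at its centre.
coarse_lats, coarse_lons = [30.0, 30.5], [110.0, 110.625]
coarse_grid = [[10.0, 20.0], [30.0, 40.0]]
print(bilinear(coarse_grid, coarse_lats, coarse_lons, 30.25, 110.3125))
```

Evaluating the function at every 0.1° node of the target grid yields the downscaled field.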
The wide–deep model comprises two components. For the deep component, two hidden layers were constructed to process the predictor variables. Following the methods of Zang et al. [77], each hidden layer includes a fully connected layer, an exponential linear unit (eLU) activation function, a batch normalization (BN) layer, and a dropout layer. The eLU activation function is used to improve the training speed and prevent issues related to vanishing or exploding gradients [78]. In addition, it can achieve higher accuracy than other activation functions, including the rectified linear unit, sigmoid, and hyperbolic tangent. The BN and dropout layers were designed to stabilize the learning process and reduce overfitting, respectively. For the wide component, we designed one hidden layer (with the same structure as the hidden layers in the deep component) to process the preliminary PM2.5 predictions from the three base learners. The wide and deep components are then combined and trained jointly, so that the parameters of both components are adjusted simultaneously. The final PM2.5 prediction from the wide–deep model is expressed by the following equation:
where WDPM2.5 represents the PM2.5 prediction from the wide–deep model; Aero represents the aerosol predictors, including the ACAG PM2.5 estimations, MERRA-2 PM2.5 estimations, and the concentrations of five aerosol species from the MERRA-2 reanalysis; Metero represents meteorology predictors, including TEM1000, SWDR, PBLH, PRE, RH1000, U1000, V1000, and WS1000; and Oth represents other predictors, including LAIhigh and LAIlow from ERA5-Land and elevation from the SRTM. RFPM2.5, GBMPM2.5, and XGBoostPM2.5 represent the preliminary PM2.5 predictions from the three base learners. D and W refer to the deep and wide components of the wide–deep model, respectively, which are jointly trained to generate the final PM2.5 predictions (WDPM2.5).
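To make the data flow through the two components concrete, the following sketch runs a single forward pass in plain Python. It is not the authors' implementation: the layer widths, the random placeholder weights, and the way the two branches are merged (concatenation followed by one linear output unit) are all assumptions, and the BN and dropout layers are omitted for brevity. The 18 inputs of the deep branch correspond to the seven aerosol, eight meteorological, and three other predictors listed above.

```python
import math
import random


def elu(x, alpha=1.0):
    """Exponential linear unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer; weights holds one row of coefficients per output unit."""
    out = [sum(w * x for w, x in zip(row, inputs)) + b
           for row, b in zip(weights, biases)]
    return [activation(v) for v in out] if activation else out


def init_layer(n_in, n_out, rng):
    """Random placeholder parameters (hypothetical; a trained model learns these)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def wide_deep_forward(predictors, base_preds, params):
    # Deep component: two hidden layers over the 18 predictor variables.
    hidden = dense(predictors, *params["deep1"], activation=elu)
    hidden = dense(hidden, *params["deep2"], activation=elu)
    # Wide component: one hidden layer over the three base-learner predictions.
    wide = dense(base_preds, *params["wide"], activation=elu)
    # ASSUMED merge: concatenate both branches, then one linear output unit.
    return dense(hidden + wide, *params["out"])[0]


rng = random.Random(0)
params = {
    "deep1": init_layer(18, 8, rng),
    "deep2": init_layer(8, 4, rng),
    "wide": init_layer(3, 4, rng),
    "out": init_layer(8, 1, rng),
}
# Usage: 18 (standardized) predictors plus RF, GBM, and XGBoost predictions.
prediction = wide_deep_forward([0.1] * 18, [30.0, 31.0, 29.5], params)
```

In joint training, the loss computed on the final output would be backpropagated through both branches at once, which is what distinguishes this design from fitting the two components separately.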
Tenfold cross-validation was used to validate the model’s performance. In each fold, 90% of the data were used for training, and 10% were used for testing. This process was repeated ten times, and the final accuracy was obtained by averaging the results from each fold. Additionally, the datasets were split into two separate groups for model training and validation. The training group included ground measurements of PM2.5 and the values of each predictor variable from 2015 to 2019. The data from 2020 were used to test the predictive ability and hindcast performance. We calculated statistical indicators, such as the coefficient of determination (R2), root-mean-square error (RMSE), mean absolute error (MAE), and mean relative error (MRE), to assess model performance.
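The validation procedure can be sketched with two small helpers: one that yields the tenfold train/test index splits and one that computes the four indicators. The names `kfold_indices` and `scores` are hypothetical. R2 is computed here as 1 − SSres/SStot and MRE as a mean percentage error relative to the observations; the paper may define either indicator slightly differently (e.g., R2 as a squared correlation).

```python
import math
import random


def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


def scores(obs, pred):
    """Return R2, RMSE, MAE, and MRE (%) for paired observations and predictions."""
    n = len(obs)
    mean_obs = sum(obs) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(o - p) for o, p in zip(obs, pred)) / n,
        "MRE": 100.0 * sum(abs(o - p) / o for o, p in zip(obs, pred)) / n,
    }


# Usage with made-up annual means (ug/m3), for illustration only.
result = scores([10.0, 20.0, 30.0, 40.0], [12.0, 18.0, 33.0, 37.0])
splits = list(kfold_indices(100))
```

Averaging `scores` over the ten test folds gives the cross-validated accuracy, whereas the 2015–2019 versus 2020 split is a single temporal hold-out.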
2.2.2. PM2.5 exposure level
The PM2.5 data predicted by the reconstruction framework were combined with demographic information to assess the exposure level of EA over the past four decades (1981–2020). The population distribution map and age structure data were obtained from the Gridded Population of the World (GPW) and the Population Division of the United Nations (UN), respectively. The third- and fourth-generation GPW (GPW v3, GPW v4) datasets map the global population distribution at five-year intervals from 1990 to 2020. The population distribution data for the remaining years of interest were generated via linear extrapolation or interpolation of the GPW v3 and GPW v4 datasets (spatial resolution: 0.042° × 0.042°). Because the GPW maps were not created using the most current census data, we used the data collected from the UN Population Division to refine the interpolation and extrapolation outcomes. In addition, population structure data for China, Japan, the Republic of Korea, and the Democratic People’s Republic of Korea were obtained from the UN Population Division to examine the year-to-year fluctuations in PM2.5 exposure across different age groups.
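The population handling can be illustrated as follows: a per-cell linear interpolation (or extrapolation) between two GPW reference years, and two common exposure summaries, the population-weighted mean concentration and the share of the population above a threshold. The function names and the clipping of extrapolated counts at zero are assumptions; the refinement against UN totals mentioned above is not included.

```python
def interpolate_population(year, year0, pop0, year1, pop1):
    """Linearly interpolate or extrapolate one grid cell's population.

    (year0, pop0) and (year1, pop1) are two reference maps; the result is
    clipped at zero (an ASSUMED safeguard for long extrapolations).
    """
    slope = (pop1 - pop0) / (year1 - year0)
    return max(pop0 + slope * (year - year0), 0.0)


def population_weighted_mean(pm25, population):
    """Population-weighted mean concentration over matching grid cells."""
    return sum(c * p for c, p in zip(pm25, population)) / sum(population)


def share_exposed_above(pm25, population, threshold):
    """Fraction of the population living in cells whose PM2.5 exceeds threshold."""
    exposed = sum(p for c, p in zip(pm25, population) if c > threshold)
    return exposed / sum(population)


# Usage with two made-up grid cells.
pm_cells, pop_cells = [10.0, 50.0], [1.0, 3.0]
print(population_weighted_mean(pm_cells, pop_cells))   # weighted toward the crowded cell
print(share_exposed_above(pm_cells, pop_cells, 35.0))
```

Because densely populated cells dominate the weighted mean, it can differ markedly from the simple geographical average.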
2.2.3. Trend analysis
Linear trends have been widely utilized in previous studies to quantify the interannual variation in air pollutants; however, linear trends are sensitive to outliers and are appropriate only for data that follow a normal distribution. In the present study, we employ a nonparametric statistical analysis, the Theil–Sen estimator, to analyze the spatiotemporal fluctuations in PM2.5 exposure in EA. The Theil–Sen estimator, initially proposed by Theil [79] and extended by Sen [80] to address ties in pairwise slopes, calculates the median of all pairwise slopes. This method can tolerate up to 29.3% outliers in the two-dimensional case and accurately determines the slope for nonnormally distributed data [81]. The Theil–Sen estimator is computed via the following equation:
$$S = \mathrm{Med}\left(\frac{x_i - x_j}{Y_i - Y_j}\right), \quad i > j$$
where S is the slope of interannual variation; Med represents the median value; and xi and xj are the annual means of the PM2.5 concentrations in year i (Yi) and year j (Yj), respectively.
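The estimator is simple enough to sketch directly as the median of all pairwise slopes. The function name `theil_sen_slope` and the sample series are illustrative only.

```python
from itertools import combinations
from statistics import median


def theil_sen_slope(years, values):
    """Median of the slopes (x_i - x_j) / (Y_i - Y_j) over all pairs of years."""
    slopes = [
        (x_i - x_j) / (y_i - y_j)
        for (y_j, x_j), (y_i, x_i) in combinations(zip(years, values), 2)
        if y_i != y_j
    ]
    return median(slopes)


# Usage: a single extreme value in the last year does not change the slope,
# because six of the ten pairwise slopes are still exactly 1.
robust = theil_sen_slope([1, 2, 3, 4, 5], [1.0, 2.0, 3.0, 4.0, 100.0])
print(robust)
```

An ordinary least-squares fit to the same series would be pulled strongly upward by the last value, which illustrates why the median-based estimator is preferred for short, noisy annual records.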
3. Results and discussion
3.1. Distribution and variation in ground-level monitoring of PM2.5
Fig. 2 shows that the observed 6-year average ground-level PM2.5 concentrations have strong spatial heterogeneity across EA. Intense PM2.5 pollution has occurred mainly in eastern and southern China, particularly in the North China Plain, Yangtze River Delta, and Sichuan Basin, where the corresponding annual averages exceed 50 μg·m−3. East and south China feature large urban agglomerations, high population density, and rapid economic growth. According to the MERRA-2 dataset, sulfate, black carbon, and organic carbon are the dominant aerosols in these regions; thus, anthropogenic emissions from fossil fuel combustion mainly contribute to local PM2.5 pollution (Fig. S2 in Appendix A). Notably, severe PM2.5 pollution has occurred in some cities in western China, with the multiannual average PM2.5 concentration in Kashi (a city in the Xinjiang Uygur Autonomous Region) reaching 107 μg·m−3, which is the highest value in EA, as indicated in Table 1. Unlike eastern and southern China, western China features an arid climate, and the land cover is dominated by basins and deserts. This region is vulnerable to dusty weather, and wind-blown dust is regarded as the primary contributor to local PM2.5 pollution (Fig. S2). The 6-year average PM2.5 concentration at ground level in Chinese urban areas is 42 μg·m−3, and the pollution level is much more severe than that in other countries in EA. In contrast, Japan is least affected by PM2.5 pollution because of relatively low anthropogenic emissions and unique geographical–climatic conditions. Additionally, there is no large disparity in PM2.5 exposure levels among the cities in Japan, and the annual average concentration is always less than 15 μg·m−3, which is comparable to that of the least polluted Qinghai–Xizang Plateau in China. The urban PM2.5 concentration in the Republic of Korea typically falls between 20 and 28 μg·m−3 annually, with peak and minimum levels recorded in Jeonbuk and Jeju, respectively. In addition to local anthropogenic emissions, air pollutants transported over long distances from various sources significantly impact the PM2.5 concentration in the Korean Peninsula.
EA is widely acknowledged as one of the most polluted regions globally, prompting local authorities to adopt various measures to address the problem of PM2.5 pollution in recent decades. The most effective and stringent plan is the air pollution prevention and control action plan (APPCAP), which was initiated by the State Council of China in September 2013 [82]. Ground-based monitoring revealed that the APPCAP significantly decreased the level of exposure to PM2.5 in China. Specifically, the annual average PM2.5 concentration in Chinese cities demonstrated a consistent downward trajectory from 2015 to 2020, with a rate of change of −3.4 (95% confidence interval (CI): −3.6 to −3.1) μg·m−3 per year. In terms of spatial heterogeneity, the alleviation of PM2.5 pollution was more remarkable in the city clusters of the North China Plain and Sichuan Basin, where the annual rates of decrease exceeded 5 μg·m−3 (Fig. 2(b)). Table 1 and Fig. 2 indicate that although the improvement in air quality in Japan and the Republic of Korea is not as notable as that in China, the decline in the PM2.5 concentration in recent years has been pervasive across EA. The mitigation of PM2.5 pollution in Japan and the Republic of Korea is attributable mainly to two factors. First, like China, the governments of these two countries have aimed to mitigate air pollution and improve air quality over the past few decades. Second, the decline in PM2.5 levels in China has significantly reduced the transport of aerosol particles from the mainland to the Korean Peninsula and Japan.
3.2. Performance of the PM2.5 reconstruction framework
Fig. 3 depicts the validation outcomes for the three PM2.5 datasets: ACAG-estimated, MERRA-2-estimated, and reconstructed PM2.5 concentrations based on the wide–deep framework. In contrast to the MERRA-2-estimated PM2.5 concentrations, the ACAG datasets have a higher spatial resolution (0.01° × 0.01° and 0.1° × 0.1°), and they are capable of characterizing the spatial characteristics of the PM2.5 distribution in EA cities. Additionally, the R2, RMSE, MAE, MRE, and regression lines revealed that ACAG-estimated PM2.5 concentrations agree much better with the observed values than the estimations from MERRA-2. As shown in Fig. 3(b), the coarse resolution of the MERRA-2 dataset (0.500° × 0.625°) leads to a marked underestimation of PM2.5 concentrations in EA, and the corresponding RMSE (9.82 μg·m−3) is approximately three times greater than that of the ACAG dataset (3.34 μg·m−3). The wide–deep framework proposed in this study outperformed the ACAG and MERRA-2 datasets. The RMSE and MAE of the wide–deep model for the annual PM2.5 concentrations were 1.38 and 0.79 μg·m−3, respectively, demonstrating strong alignment with ground-based monitoring. Additionally, to assess the predictive ability of the model, the datasets were divided into two groups: the training group (the datasets from 2015 to 2019) and the testing group (the datasets from 2020). Subsequent findings revealed that the wide–deep model accurately predicts the PM2.5 concentrations in EA. The RMSE for PM2.5 prediction in 2020 was 2.73 μg·m−3 (Fig. S3 in Appendix A), which is better than the performance of the ACAG datasets (with RMSE = 2.82 μg·m−3).
Fig. S4 in Appendix A shows the spatial attributes of the wide–deep model performance and the deviation between the reconstructed PM2.5 concentrations and ground measurements in each city of EA. According to Fig. S1, China is divided into four parts to provide more insights into the performance of the wide–deep model across different regions. Overall, the results of the wide–deep model are in good agreement with those of ground-based monitoring in most regions: 83% of the cities’ RMSEs are below 1.5 μg·m−3, and 71% of the cities’ R2 values are above 0.75. Table S1 in Appendix A reveals that the regions with the lowest MRE (1.98%) and RMSE (0.83 μg·m−3) were eastern China and Japan, respectively. Additionally, the four statistical indicators suggest that the wide–deep model exhibits robust performance in southern China and the Korean Peninsula. In contrast, the wide–deep model demonstrated comparatively poorer performance in western China (Table S1 and Fig. S4), particularly in some Xizang cities (e.g., Shigatse, Nyingchi), as evidenced by a high RMSE of 3.0 μg·m−3, an MAE of 1.88 μg·m−3, an MRE of 8.3%, and a low R2 value of 0.76. The significant deviation between the wide–deep model predictions and observed PM2.5 concentrations on the Qinghai–Xizang Plateau is partially attributed to the limited number of monitoring stations in that region.
Although the ACAG PM2.5 concentrations exhibited a stronger relationship with the observed values in the cities of China (with R2 = 0.91 and RMSE = 3.43 μg·m−3), the overall biases in Japan and the Korean Peninsula were much greater (Fig. S5 in Appendix A). In the present study, we collected more ground monitoring data for model training, and the results indicated that the accuracy of PM2.5 prediction in Japan and the Republic of Korea improved significantly (Fig. S4). The average RMSE of Japanese cities decreased from 1.92 (ACAG) to 0.83 μg·m−3 (wide–deep model); the corresponding value in the Republic of Korea decreased from 3.35 to 1.50 μg·m−3. Additionally, at both the country and city scales, the four statistical indicators revealed that the results of the wide–deep model outperformed those of the ACAG and MERRA-2 datasets (Table 2, Figs. S4–S6 in Appendix A). This implies that if more ground-based observations were incorporated for training (particularly those in western China), the reconstruction framework could improve the estimation accuracy of PM2.5 concentrations at the continental scale. More importantly, it can also capture the spatial characteristics and interannual variations in air pollutants in peninsular or archipelago countries.
3.3. Interannual PM2.5 concentration variation and trend
By employing the wide–deep model, we obtained distribution information on the annual PM2.5 concentrations in EA, and Fig. 4 illustrates that, owing to the varying stages and rates of economic development in the three EA countries, the PM2.5 concentration changes exhibited distinct trends from 1981 to 2020. The datasets from the Emissions Database for Global Atmospheric Research (EDGAR) were used to investigate the variation in the PM2.5 concentrations for each region. In China, where anthropogenic emissions were relatively limited before 2000 (Fig. S7(a) in Appendix A), the PM2.5 concentration increase was less pronounced, with annual average PM2.5 levels ranging between 25 and 30 μg·m−3 and corresponding population-weighted (PW) PM2.5 falling within the range of 27 to 35 μg·m−3. Table 3 reveals that the annual PW PM2.5 concentrations prior to 2000 met the interim Target 1 of the WHO AQG (annual concentration < 35 μg·m−3). Following the turn of the millennium, the economy boomed, and the increase in fossil fuel consumption in China substantially worsened local air quality. The EDGAR inventory revealed that nationwide nitrogen oxide (NOx) emissions almost doubled from 2000 to 2010 and that primary PM2.5 emissions also increased by 50% (Fig. S7 in Appendix A). Therefore, compared with the period from 1981 to 1999, the PM2.5 concentration increase remarkably accelerated after 2000, and in 2006, the nationwide PW PM2.5 concentration peaked at 54 μg·m−3. Simultaneously, the gap between the geographical average PM2.5 and PW average PM2.5 concentrations in China continued to widen, which implies that the deterioration of air quality in densely populated areas such as eastern and southern China was more pronounced. The nationwide PM2.5 concentrations then plateaued from 2006 to 2013.
Prior to the 2010s, the government of China launched a series of mitigation strategies to combat air pollution; however, the related measures focused on SO2 to solve the acid rain problem, and their effectiveness in reducing the risk of PM2.5 exposure was limited [83]. The rapid decrease in nationwide PM2.5 concentrations since 2013 was attributable to the implementation of the APPCAP, and the PW PM2.5 concentrations presented a consistent downward trend from 56 μg·m−3 in 2013 to 36 μg·m−3 in 2020 (Fig. 4(a)). In September 2013, China’s State Council enacted the APPCAP, which is considered the most stringent air pollution control policy in China to date. In support of this plan, specific mitigation goals for heavily polluted regions were proposed in 2017, and a series of measures focused on energy consumption control, energy structure adjustment, and clean production [84]. Fig. S7 shows that the nationwide emissions of NOx and PM2.5 continuously decreased from 2013 to 2020. This implies that the implementation of this plan represents a milestone in air quality control and significantly lowered the exposure level of PM2.5 in China.
Japan experienced a period of high economic growth before the 1980s. The emissions of NOx and PM2.5 did not present any notable increasing trend from 1981 to 2020 (Fig. S7). We found that, compared with those in China and the Korean Peninsula, the interannual variation in the PM2.5 concentrations in Japan tended to be much smaller from 1981 to 2020. The 10 year averages for the 1980s and 1990s were 12.7 and 12.4 μg·m−3, respectively. Before 2000, the annual average concentration in Japan presented a slight downward trend (Table 3 and Fig. 4(b)). The annual average PW PM2.5 concentration then increased from 11 μg·m−3 in 1999 to 15 μg·m−3 in 2003. From 2000 to 2006, the PM2.5 exposure level plateaued, and the annual PW PM2.5 concentration fluctuated between 13 and 15 μg·m−3. Both PM2.5 indicators (geographical average and PW average) of Japan notably decreased after 2006, and the two indicators reached their minimums at 9 and 10 μg·m−3, respectively, in 2019.
The emission of NOx across the Korean Peninsula exhibited an increasing trend from 1981 to 1996 (Fig. S7(c)). Although nationwide NOx emissions began to decrease after 2000, PM2.5 concentrations continued to increase. Overall, the year-to-year fluctuations in the PM2.5 exposure level across the Korean Peninsula closely resemble those in China, suggesting that aerosols originating from China likely affect the air quality over the Korean Peninsula. Specifically, the yearly mean PW PM2.5 concentration experienced an increasing trend until 2003, and the increase was much faster in 1990–1997. The annual PW PM2.5 concentration across the Korean Peninsula prior to 2000 met the interim Target 2 of the WHO AQG (annual concentration < 25 μg·m−3). In the following years, the risk of PM2.5 exposure over the Korean Peninsula remained high (annual average of PW PM2.5 > 25 μg·m−3). After 2014, the annual average PW PM2.5 concentration declined gradually from 29 to 21 μg·m−3 in 2020. The air quality across the Korean Peninsula is significantly impacted by long-range air pollutants transported from China. Thus, we infer that the decrease is partially attributed to the stringent implementation of the APPCAP in China.
The interannual variation in the PM2.5 exposure level in EA was spatially and temporally heterogeneous. As shown in Table 3, the PM2.5 concentrations in China and the Korean Peninsula significantly (p < 0.05) increased from 1981 to 1990, and the increase rates of the annual PW PM2.5 concentrations were 0.8 and 0.4 μg·m−3·a−1, respectively. A significant increase in the spatial distribution was observed in eastern and southern China. In contrast, the PM2.5 concentrations in some parts of western (around the Taklamakan Desert) and northern (around the Gobi Desert) China, which are highly affected by wind-blown dust, tended to decrease. Simultaneously, the PM2.5 exposure level in the densely populated Tokyo Bay of Japan also showed a noticeable downward trend from 1981 to 1990. Fig. 5 and Table 3 reveal that the PM2.5 concentration in the regions of EA experienced a gradual increase during the following decade (1991–2000), and the increase rate of the PW PM2.5 concentration over the Korean Peninsula was 0.7 (95% CI, 0.4–1.0) μg·m−3·a−1, which was significantly higher than that observed in China and Japan. In the first decade of the new century (2001–2010), the nationwide PM2.5 exposure level in China increased rapidly and continually due to the increase in anthropogenic emissions. Spatially, PM2.5 pollution worsened much more remarkably in the North China Plain, Yangtze Plain, and Sichuan Basin. Unlike China and the Korean Peninsula, PM2.5 pollution in Japan was significantly mitigated from 2001 to 2010, and as shown in Fig. 5(c), the downward trend in the east coastal region was much faster. With the effective control measures and policies implemented by local authorities, the air quality in EA improved markedly from 2011 to 2020, and the annual average PW PM2.5 concentrations in China, Japan, and the Korean Peninsula declined at rates of 2.8 (95% CI, 2.1–3.0), 0.2 (95% CI, 0.1–0.3), and 0.5 (95% CI, 0.3–0.8) μg·m−3·a−1, respectively.
The mitigation of PM2.5 pollution has been more dramatic in eastern and southern China, and the rate of decrease in most parts of these regions has exceeded 2.5 μg·m−3·a−1. Fig. S8 in Appendix A shows the 10 year average PM2.5 concentrations in EA.
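Decadal rates of change with confidence intervals, such as those reported in this section, can be obtained in several ways. The sketch below assumes an ordinary least-squares fit of annual concentration against year; the trend estimator actually used in the study is not stated in this section and may differ (for example, a nonparametric estimator). The default critical value 2.306 is the two-sided 95% Student's t quantile for 8 degrees of freedom, which applies to ten annual values; the function name and the sample series are hypothetical.

```python
import math


def linear_trend(years, values, t_crit=2.306):
    """Return (slope, ci_low, ci_high) of an OLS fit, in units per year.

    t_crit is the two-sided critical value of Student's t for n - 2 degrees
    of freedom; 2.306 corresponds to a 95% interval for n = 10.
    """
    n = len(years)
    if n < 3 or n != len(values):
        raise ValueError("need at least three paired observations")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(years, values)
    )
    std_err = math.sqrt(residual_ss / (n - 2) / sxx)
    half_width = t_crit * std_err
    return slope, slope - half_width, slope + half_width


# Hypothetical decade of annual PW PM2.5 values (not data from the study).
decade = list(range(2011, 2021))
series = [56.0, 54.5, 51.0, 49.5, 48.0, 45.0, 44.0, 41.5, 40.0, 38.0]
print(linear_trend(decade, series))
```

A negative slope whose interval excludes zero would correspond to a significant decline of the kind reported for 2011–2020.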
3.4. Changes in the PM2.5 exposure level
In September 2021, the WHO updated its AQG, and the latest guidelines highlighted the significant influence of air pollution on human health [24]. On the basis of the recommended limit and the four interim targets of the WHO AQG, the annual PM2.5 exposure in EA was divided into six levels according to PM2.5 concentration: Level 1 (< 5 μg·m−3), Level 2 (5–10 μg·m−3), Level 3 (10–15 μg·m−3), Level 4 (15–25 μg·m−3), Level 5 (25–35 μg·m−3), and Level 6 (> 35 μg·m−3). The findings indicated that in EA, one of the regions with the most severe air pollution globally, less than 0.05% of the population lived with an annual PM2.5 concentration meeting the standard of the WHO AQG.
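The six-level classification can be sketched as a simple lookup. The thresholds are those listed above; the treatment of values that fall exactly on an intermediate boundary (10, 15, or 25 μg·m−3) is not specified in the text, so the sketch assigns them to the higher level as an assumption. The function names and sample cells are hypothetical.

```python
# Upper bounds (ug/m3) of Levels 1-4; Level 5 spans 25-35 and Level 6 is > 35.
LOWER_LEVEL_BOUNDS = [5.0, 10.0, 15.0, 25.0]


def exposure_level(pm25):
    """Map an annual PM2.5 concentration to exposure Levels 1-6."""
    if pm25 > 35.0:
        return 6
    for level, upper in enumerate(LOWER_LEVEL_BOUNDS, start=1):
        if pm25 < upper:  # assumption: a boundary value joins the higher level
            return level
    return 5


def population_share_by_level(pm25, population):
    """Percentage of the total population living in each exposure level."""
    total = sum(population)
    if total <= 0:
        raise ValueError("total population must be positive")
    shares = {level: 0.0 for level in range(1, 7)}
    for concentration, residents in zip(pm25, population):
        shares[exposure_level(concentration)] += 100.0 * residents / total
    return shares


# Hypothetical grid cells with equal populations (not data from the study).
print(population_share_by_level([4.0, 12.0, 30.0, 50.0], [1, 1, 1, 1]))
```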
During the past four decades (1981–2020), the majority of PM2.5 exposure in China fell within the Levels 5 and 6 categories, with merely 3% of the population experiencing an annual average PM2.5 concentration below 10 μg·m−3 (Levels 1 and 2). In the early 1980s, approximately 40% of China’s population was exposed primarily to pollution at Level 4, whereas approximately 20% lived in heavily polluted areas at Level 6. Along with increasing anthropogenic emissions for economic growth in China, the population living in areas with severe PM2.5 pollution gradually increased, and in 2013, the number of people exposed to Level 6 PM2.5 reached a maximum of 1.16 billion, or 84% of the total population. In contrast, the percentages of the national population exposed to Levels 5 and 4 declined to 11% and 4%, respectively. After 2010, the Chinese government introduced several measures to address air pollution, including the most stringent APPCAP. As shown in Fig. 6(a), these efforts significantly lowered the long-term risk of PM2.5 exposure throughout the country. The population exposed to Level 6 pollution declined to only 647 million (45% of the total population) in 2020, which is approximately 44% less than the value in 2013.
Japan is less affected by PM2.5 pollution than other EA countries are, and consequently, the average exposure level of the population is much lower. As depicted in Fig. 6, PM2.5 exposure in Japan is dominated by Levels 2 and 3, and over the past four decades, 82% of the population has been exposed to these two levels. There was no significant variation in the nationwide PM2.5 exposure level before 2000. From 2000 to 2008, the population living with Level 4 pollution in Japan substantially increased, which was partly due to the influence of transboundary pollution from China. After the implementation of the APPCAP, the air quality continuously improved in Japan, and as shown in Fig. 6(b), in 2016–2020, the population exposed to Level 4 air quality declined to less than 1%.
In 1981–1995, over 70% of the citizens in the Korean Peninsula lived with Level 4 pollution, with almost no exposure to heavy PM2.5 pollution levels (Levels 5 and 6). With a surge in local anthropogenic emissions and the impact of long-range air pollutant transport since 2000, there has been a significant increase in the population exposed to Levels 5 and 6 pollution. In 2014, the percentage of people exposed to Level 5 pollution reached 70%, whereas the percentage of those exposed to Level 6 pollution reached 9%, indicating a substantial increase from previous years. Since 2015, the population breathing heavily polluted air across the Korean Peninsula has substantially declined. Notably, in the last four-year period (2017–2020), the proportion of the population exposed to Level 6 pollution has nearly dropped to zero. This improvement is likely attributable to the implementation of the APPCAP in China.
We calculated the overall percentage of the population exposed to varying PM2.5 concentrations in EA from 1981 to 2020. As depicted in Fig. S9 in Appendix A, at the beginning of the 1980s, more than 80% and 50% of the population in EA lived in areas where the PM2.5 concentration met interim Target 1 (annual concentration < 35 μg·m−3) and Target 2 (annual concentration < 25 μg·m−3) of the WHO AQG, respectively. The exposure curves shifted gradually toward higher concentrations from 1981 to 2000, with only 5% of the population subjected to a yearly PM2.5 concentration greater than 50 μg·m−3. Since 2000, the rightward shift of the curves has been more marked, and the percentages of the population living with annual PM2.5 concentrations above 50 and 75 μg·m−3 increased to 47% and 10%, respectively, in 2009–2010. Simultaneously, from 2003 to 2012, less than 30% of the population in EA lived in areas where the PM2.5 concentration met the interim Target 1 of the WHO AQG. Owing to the adoption of the stringent APPCAP in China, PM2.5 pollution was significantly mitigated in EA, and the exposure curves started to shift leftward. In 2019–2020, the proportion of the population exposed to annual PM2.5 levels greater than 75 μg·m−3 dropped to only 0.4%, whereas the proportions exposed to PM2.5 levels meeting interim Targets 1 and 2 of the WHO AQG increased to 52% and 23%, respectively. Fig. S9(b) indicates that the PM2.5 exposure level in 2019–2020 was equivalent to that in 1999–2002.
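The percentages quoted in this paragraph are points on a population exposure curve. A minimal sketch of how such points can be computed from gridded data is given below, assuming one PM2.5 value and one population count per grid cell; the function names and the sample values are hypothetical.

```python
def share_above(pm25, population, threshold):
    """Percentage of the population living above a concentration threshold."""
    total = sum(population)
    if total <= 0:
        raise ValueError("total population must be positive")
    exposed = sum(p for c, p in zip(pm25, population) if c > threshold)
    return 100.0 * exposed / total


def share_below(pm25, population, threshold):
    """Percentage of the population living strictly below a threshold,
    e.g. an interim target of the WHO AQG."""
    total = sum(population)
    if total <= 0:
        raise ValueError("total population must be positive")
    meeting = sum(p for c, p in zip(pm25, population) if c < threshold)
    return 100.0 * meeting / total


# Hypothetical grid cells (not data from the study).
cells_pm25 = [20.0, 40.0, 60.0, 80.0]
cells_pop = [10, 20, 30, 40]
for limit in (50.0, 75.0):
    print(limit, share_above(cells_pm25, cells_pop, limit))
print("Target 1:", share_below(cells_pm25, cells_pop, 35.0))
```

Evaluating `share_above` over a range of thresholds for each period traces the full curve, so a rightward shift appears as larger shares at every threshold.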
Notably, older age groups, especially those with chronic lung and heart conditions, are more susceptible to the impacts of air pollution. In EA, population aging is emerging as the prevailing demographic trend (Fig. S10 in Appendix A), driven by various factors, such as decreased fertility rates, extended life expectancy, and the transition of significant population cohorts into older age brackets. Several epidemiological studies [85], [86], [87] have revealed that population aging is becoming the primary driving factor for the variation in PM2.5-related premature mortalities in EA, particularly in China. It is estimated that in EA, the population aged 65 and over nearly quadrupled, from 58.1 million in 1981 to 227.7 million in 2020. Fig. S11 in Appendix A shows that approximately 10 million older adults (65 years or older) breathed Level 6 polluted air at the beginning of the 1980s. Subsequently, the convergence of population aging and the rapid rise in anthropogenic emissions significantly elevated the PM2.5 exposure risk for the elderly in EA. By 2014, the population of older adults residing in areas with Level 6 pollution had peaked at 129 million, which was an almost eleven-fold increase compared with the figure reported in 1981. Although the promulgation of the APPCAP substantially improved the air quality in EA, the older population subjected to Level 6 pollution did not present a marked downward trend and remained at approximately 100 million per year from 2015 to 2020. Thus, we infer that the health benefits from the mitigation strategy are likely to be neutralized by the variation in the age structure in EA. Additionally, many epidemiological studies [88], [89] have revealed that although the implementation of the APPCAP has reduced the exposure level in China, PM2.5-related premature mortality has not shown any significant downward trend and has even maintained an increasing trend in recent years.
3.5. Discussion
3.5.1. Highlights and limitations
The advancement of remote sensing technology has substantially improved the accuracy and reliability of monitoring atmospheric properties, and it is also the most powerful way to develop datasets of surface air pollutants with high spatiotemporal resolution. Recent studies [9], [10] estimating air pollutants have focused on the satellite era, whereas spatial distribution datasets of ambient air pollutants before 2000 are very scarce. For example, the Global High Air Pollutant dataset developed by Wei et al. [90], [91], [92], [93] was generated from big data via artificial intelligence and is widely used in epidemiological and policy-assessing studies. Although the dataset has full coverage, high resolution, and high quality, its temporal coverage extends only from 2000 to the present. In contrast, model-based estimation or reanalysis datasets are capable of providing the global or regional distribution of air pollutants for a much longer time span, and several studies have used these datasets to estimate the health effects related to exposure to air pollutants [86], [94], [95]. However, model-based estimation or reanalysis datasets, such as the MERRA-2 series, have greater uncertainty, and the spatial resolution is much coarser than that of satellite-based estimations [96], [97]. In the present study, we integrated multiple sources of PM2.5 observations, meteorology variables, vegetation variables, and so forth, to reconstruct a PM2.5 distribution dataset of EA from 1981 to 2020. To reduce the uncertainty and overfitting caused by using a single machine learning method, we selected three machine learning approaches (RF, GBM, and XGBoost) to predict surface PM2.5 concentrations; then, a wide–deep framework was applied to ensemble the results from these three base learners.
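The ensembling step can be illustrated with a forward pass through a generic wide-and-deep combiner, in which a linear ("wide") term and a small neural ("deep") term are summed. This is only a sketch of the general idea: the layer sizes, inputs, activation function, and all weights below are hypothetical and do not describe the architecture or the trained parameters used in the study.

```python
def relu(x):
    """Rectified linear activation."""
    return max(0.0, x)


def wide_deep_predict(base_preds, wide_w, hidden_w, hidden_b, out_w, bias=0.0):
    """Combine base-learner predictions for one grid cell.

    base_preds: predictions of the three base learners (RF, GBM, XGBoost).
    wide_w:     one linear weight per base learner (the "wide" part).
    hidden_w:   one row of weights per hidden unit (the "deep" part).
    hidden_b:   one bias per hidden unit.
    out_w:      one output weight per hidden unit.
    """
    wide = sum(w * p for w, p in zip(wide_w, base_preds))
    hidden = [
        relu(sum(w * p for w, p in zip(row, base_preds)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    deep = sum(w * h for w, h in zip(out_w, hidden))
    return wide + deep + bias


# Hypothetical base-learner outputs (ug/m3) and hand-picked weights.
predictions = [30.0, 32.0, 31.0]
estimate = wide_deep_predict(
    predictions,
    wide_w=[0.25, 0.25, 0.5],
    hidden_w=[[0.5, -0.5, 0.0], [-0.5, 0.5, 0.0]],
    hidden_b=[0.0, 0.0],
    out_w=[1.0, 0.5],
)
print(estimate)  # 31.5
```

In practice the weights would be learned from ground measurements rather than set by hand; the wide term lets the combiner fall back on a weighted average of the base learners, while the deep term can capture nonlinear disagreement between them.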
The results indicated that the reconstruction framework can achieve high prediction accuracy. The reconstructed PM2.5 concentration data were in good agreement with the ground measurements, and the corresponding R2 and RMSE values were 0.99 and 1.38 μg·m−3, respectively, which outperformed those of the satellite-based ACAG dataset. As more ground measurements were incorporated into the model for training, the framework of this study reduced the uncertainty of the PM2.5 prediction in Japan and the Korean Peninsula. Compared with those in the ACAG dataset, the average RMSE in Japan decreased from 1.92 to 0.83 μg·m−3, and the corresponding RMSE in the Republic of Korea decreased from 3.35 to 1.50 μg·m−3. One notable aspect of this study is its extensive temporal scope, which spans a significantly longer period than previous research. Unlike other studies, our analysis spans four decades (1981–2020) and offers valuable insights into the distribution of PM2.5 concentrations in EA during this time frame. The findings of this study reveal intricate details regarding fluctuations in regional PM2.5 pollution. This information is crucial for evaluating the influence of anthropogenic emissions and mitigation policies on local air quality and exposure risk. Overall, the reconstruction framework proposed in this study effectively and fully combines the advantages of satellite- and model-based prediction to produce PM2.5 distribution datasets with higher accuracy and longer temporal coverage.
A major limitation of the reconstruction framework is the large prediction uncertainty for regions where monitoring stations are sparse or not available. For example, the four statistical indicators revealed that the performance of the reconstruction framework in the western part of China, particularly the Qinghai–Xizang Plateau, is much worse than that in other parts of EA. Owing to the extensive geographical coverage of western China and the limited number of PM2.5 monitoring stations, few data are available for investigating and understanding the correlations between surface PM2.5 concentrations and relevant predictors in this region. Although three machine learning methods and a wide–deep model were used to ensemble the predictions, the uncertainties in the western part of China remained large. The incorporation of more monitoring data could substantially improve the regional prediction accuracy, which was proven by the reduced uncertainties in the reconstructed PM2.5 concentrations in Japan and the Korean Peninsula. Thus, with an increasing number of stations and monitoring networks established in EA, the model can be further optimized to improve the regional prediction accuracy.
Another limitation pertains to the varying spatial resolutions of the predictor variables, and we utilized the bilinear interpolation method to downscale the ERA5 and MERRA-2 datasets to 0.1° × 0.1°. Although discrepancies in spatial resolution among predictor variables are likely to introduce uncertainties in PM2.5 estimations, it is important to note that the resolution of meteorological reanalysis datasets is generally coarser than 0.25° [98], and interpolation is a widely accepted method for harmonizing the spatial resolution of various variables in the estimation of surface air pollutants. Advances in remote sensing and data assimilation technologies are leading to the availability of global climate datasets with significantly higher temporal and spatial resolutions. These advancements are expected to mitigate the uncertainties associated with the coarse resolutions of the current datasets.
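The resolution harmonization described above can be sketched with a plain bilinear interpolation routine. The example assumes a regular, ascending latitude–longitude grid and interpolates a single target point; the function names, the grid values, and the coordinates are hypothetical, and an operational workflow would typically rely on a dedicated regridding library.

```python
from bisect import bisect_right


def _bracket(axis, value):
    """Index of the lower neighbour on an ascending axis and the fractional
    distance of value between that neighbour and the next one."""
    if not axis[0] <= value <= axis[-1]:
        raise ValueError("point lies outside the coarse grid")
    i = min(bisect_right(axis, value) - 1, len(axis) - 2)
    return i, (value - axis[i]) / (axis[i + 1] - axis[i])


def bilinear(grid, lats, lons, lat, lon):
    """Interpolate grid[i][j], defined at (lats[i], lons[j]), to (lat, lon)."""
    i, ty = _bracket(lats, lat)
    j, tx = _bracket(lons, lon)
    south = grid[i][j] * (1 - tx) + grid[i][j + 1] * tx
    north = grid[i + 1][j] * (1 - tx) + grid[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


# Hypothetical coarse cell corners spaced 0.5 deg x 0.625 deg.
coarse_lats = [30.0, 30.5]
coarse_lons = [110.0, 110.625]
coarse_field = [[10.0, 20.0], [30.0, 40.0]]
print(bilinear(coarse_field, coarse_lats, coarse_lons, 30.25, 110.3125))  # 25.0
```

Calling `bilinear` for every node of a 0.1° × 0.1° target grid yields the downscaled field; the result is smooth but adds no information beyond the coarse input, which is the source of the uncertainty noted above.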
3.5.2. Implications for policy-makers
To combat air pollution, authorities in EA have launched several initiatives aimed at altering the energy structure and reducing anthropogenic emissions. Among these, the most effective has been the APPCAP. This plan significantly reduced PM2.5 pollution in southern and eastern China, with the annual PW PM2.5 concentration decrease rates in these regions being −4.0 (95%CI, −4.5–−3.7) and −2.7 (95%CI, −2.9–−2.3) μg·m−3·a−1 from 2014 to 2020, respectively (Table S2 in Appendix A). The implementation of the APPCAP not only mitigated local air pollution but also provided substantial environmental benefits to other EA countries. Prior to the APPCAP, Japan had already been working to reduce fossil fuel emissions, with PW PM2.5 decreasing at a rate of −0.3 (95%CI, −0.3–−0.2) μg·m−3·a−1 from 2001 to 2013. Following the implementation of the APPCAP, the rate of decline in PW PM2.5 concentrations in Japan accelerated. In contrast, the Korean Peninsula experienced an insignificant increasing trend in PW PM2.5 concentrations from 2000 to 2013. However, after the implementation of the APPCAP, nationwide PM2.5 pollution was mitigated, and PW PM2.5 levels declined at a rate of −1.2 (95% CI, −1.5–−1.0) μg·m−3·a−1. Additionally, both Table S2 and Fig. S12 in Appendix A indicate that the decrease in PM2.5 occurred across the entire region of the Korean Peninsula, not just in densely populated areas. Unlike other regions, the decline in regional PM2.5 concentrations across the Korean Peninsula was greater than the decrease in PW PM2.5 concentrations, and this reduction was particularly notable in the northern part of the Korean Peninsula, which is closer to China. This suggests that the implementation of the APPCAP reduced the aerosol load transported to Korea, thereby significantly improving the air quality in the region.
Although the implementation of the APPCAP has significantly mitigated air pollution in EA, a substantial gap remains in meeting the standard of the WHO AQG. Severe wintertime pollution episodes continue to occur in China. Furthermore, numerous epidemiological studies have indicated that older age groups are particularly vulnerable to the adverse health effects of air pollution [99], [100], [101]. Notably, the elderly population in EA is projected to increase rapidly in the coming decades. These shifting age demographics are emerging as primary factors influencing fluctuations in the PM2.5-related health burden in the region [85], [86], [87]. Therefore, to effectively mitigate future health risks associated with PM2.5, EA governments need to set ambitious reduction targets and consider the unique domestic characteristics of each country to develop practical mitigation pathways.
Meteorological conditions directly impact the photochemical reactions and dispersion of surface PM2.5, playing a crucial role in the occurrence of severe haze pollution episodes. Figs. S13 and S14 in Appendix A indicate that increasing temperature and solar radiation, decreasing humidity or rainfall, and weakening general circulation could increase the surface PM2.5 concentrations in EA (particularly in southern and eastern China). Several studies [102], [103] have shown that under different climate scenarios, air pollution in EA may further worsen, largely because of the warming climate and more intense and frequent extreme weather, such as heat waves and stagnation events. This implies that air pollution control strategies in EA need to consider the impacts of climate change, and synergistically reducing the emission of greenhouse gases and air pollutants to mitigate global warming and lower the exposure risk to air pollution is essential.
4. Conclusions
EA is widely recognized as one of the regions suffering from the worst PM2.5 pollution worldwide. In this study, we proposed a wide–deep framework that combines three machine learning methods to reconstruct the PM2.5 concentrations in EA with high accuracy. The results indicated that the reconstruction framework is an effective way to integrate model- and satellite-based PM2.5 estimations. More importantly, it is robust and reliable for historical PM2.5 prediction. Compared with the PM2.5 concentration estimations from the ACAG and MERRA-2 datasets, the reconstructed PM2.5 concentration data are in better agreement with the ground measurements, and the prediction uncertainties in Japan and the Korean Peninsula have substantially decreased because more ground monitoring data have been incorporated into the model. Unlike previous studies, this approach can calculate PM2.5 levels across the last forty years, offering crucial and significant data (particularly regarding PM2.5 concentrations prior to the advent of satellites) for studies related to epidemiology and policy evaluation.
Using the reconstructed datasets, we investigated the spatiotemporal fluctuations in PM2.5 and the associated exposure level in EA. Before 2000, the PM2.5 concentrations steadily increased, and the exposure level was dominated by moderate pollution (Level 4). After 2000, the increase in anthropogenic emissions and consumption of fossil fuel dramatically worsened PM2.5 pollution in EA, and the population exposed to yearly PM2.5 concentrations above 50 μg·m−3 increased to nearly 50% at the end of the 2000s. Although, in recent years, the implementation of mitigation strategies has substantially reduced the atmospheric PM2.5 concentrations in EA, the overall exposure level is still very high, and most parts of EA remain unlikely to meet the updated WHO AQG 2021 (annual mean PM2.5 exposure < 5 μg·m−3). Notably, population aging has substantially offset the health benefits of measures to curb air pollution, and the elderly group exposed to severe pollution has not presented a marked declining trend in the last decade. Simultaneously, climate change could affect the photochemical reactions and dispersion of air pollutants. In most of EA, increases in temperature and solar radiation, along with diminished circulation and a drier climate, have created conducive conditions for the generation and buildup of PM2.5. Therefore, policy-makers in EA should consider these factors and set more aggressive mitigation targets to reduce the exposure level in the future.
CRediT authorship contribution statement
Shuai Yin: Writing – review & editing, Writing – original draft, Methodology, Conceptualization. Chong Shi: Writing – review & editing, Writing – original draft, Methodology, Conceptualization. Husi Letu: Writing – review & editing, Supervision. Akihiko Ito: Writing – review & editing, Supervision. Huazhe Shang: Methodology, Investigation. Dabin Ji: Methodology, Investigation. Lei Li: Methodology, Investigation. Sude Bilige: Validation, Investigation. Tangzhe Nie: Validation, Investigation. Kunpeng Yi: Writing – original draft, Data curation. Meng Guo: Writing – original draft, Data curation. Zhongyi Sun: Validation, Investigation. Ao Li: Validation, Investigation.
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgments
This work is supported by the Third Xinjiang Scientific Expedition Program (2022xjkk0903) and the National Natural Science Foundation of China (42275145).
[1]
Che H, Xia X, Zhao H, Li L, Gui K, Zheng Y, et al. Aerosol optical and radiative properties and their environmental effects in China: a review. Earth Sci Rev 2024; 248:104634.
[2]
Gui K, Che H, Li L, Zheng Y, Zhang L, Zhao H, et al. The significant contribution of small-sized and spherical aerosol particles to the decreasing trend in total aerosol optical depth over land from 2003 to 2018. Engineering 2022; 16:82-92.
[3]
Lovett GM, Tear TH, Evers DC, Findlay SE, Cosby BJ, Dunscomb JK, et al. Effects of air pollution on ecosystems and biological diversity in the eastern United States. Ann N Y Acad Sci 2009; 1162(1):99-135.
[4]
Stevens CJ, Bell JNB, Brimblecombe P, Clark CM, Dise NB, Fowler D, et al. The impact of air pollution on terrestrial managed and natural vegetation. Philos Trans Royal Soc A 2020; 378(2183):20190317.
[5]
Yue X, Unger N. Fire air pollution reduces global terrestrial productivity. Nat Commun 2018; 9(1):5413.
[6]
Zvereva EL, Kozlov MV. Responses of terrestrial arthropods to air pollution: a meta-analysis. Environ Sci Pollut Res Int 2010; 17(2):297-311.
[7]
World Health Organization (WHO). New WHO Global Air Quality Guidelines aim to save millions of lives from air pollution [Internet]. Geneva: World Health Organization; 2021 Sep 22 [cited 2024 Oct 21]. Available from: https://www.who.int/news/item/22-09-2021-new-who-global-air-quality-guidelines-aim-to-save-millions-of-lives-from-air-pollution#:∼:text=The%20guidelines%20recommend%20new%20air%20quality%20levels%20to,some%20of%20which%20also%20contribute%20to%20climate%20change.
[8]
Chowdhury S, Dey S. Cause-specific premature death from ambient PM2.5 exposure in India: estimate adjusted for baseline mortality. Environ Int 2016; 91:283-290.
[9]
Maji KJ. Substantial changes in PM2.5 pollution and corresponding premature deaths across China during 2015–2019: a model prospective. Sci Total Environ 2020; 729:138838.
[10]
Xiao Q, Geng G, Liang F, Wang X, Lv Z, Lei Y, et al. Changes in spatial patterns of PM2.5 pollution in China 2000–2018: impact of clean air policies. Environ Int 2020; 141:105776.
[11]
Zhang Q, Zheng Y, Tong D, Shao M, Wang S, Zhang Y, et al. Drivers of improved PM2.5 air quality in China from 2013 to 2017. Proc Natl Acad Sci 2019; 116(49):24463-24469.
[12]
Behera SN, Sharma M. Reconstructing primary and secondary components of PM2.5 composition for an urban atmosphere. Aerosol Sci Technol 2010; 44(11):983-992.
[13]
Mancilla Y, Herckes P, Fraser MP, Mendoza A. Secondary organic aerosol contributions to PM2.5 in Monterrey, Mexico: temporal and seasonal variation. Atmos Res 2015; 153:348-359.
[14]
Wang HL, Qiao LP, Lou SR, Zhou M, Chen JM, Wang Q, et al. PM2.5 pollution episode and its contributors from 2011 to 2013 in urban Shanghai, China. Atmos Environ 2015; 123:298-305.
[15]
Xie Y, Liu Z, Wen T, Huang X, Liu J, Tang G, et al. Characteristics of chemical composition and seasonal variations of PM2.5 in Shijiazhuang, China: impact of primary emissions and secondary formation. Sci Total Environ 2019; 677:215-229.
[16]
Chen Z, Chen D, Zhao C, Kwan MP, Cai J, Zhuang Y, et al. Influence of meteorological conditions on PM2.5 concentrations across China: a review of methodology and mechanism. Environ Int 2020; 139:105558.
[17]
Tiwari S, Srivastava AK, Bisht DS, Parmita P, Srivastava MK, Attri SD. Diurnal and seasonal variations of black carbon and PM2.5 over New Delhi, India: influence of meteorology. Atmos Res 2013; 125:50-62.
[18]
Tran HN, Mölders N. Investigations on meteorological conditions for elevated PM2.5 in Fairbanks, Alaska. Atmos Res 2011; 99(1):39-49.
[19]
Xu Y, Xue W, Lei Y, Huang Q, Zhao Y, Cheng S, et al. Spatiotemporal variation in the impact of meteorological conditions on PM2.5 pollution in China from 2000 to 2017. Atmos Environ 2020; 223:117215.
[20]
Hopke PK, Croft D, Zhang W, Lin S, Masiol M, Squizzato S, et al. Changes in the acute response of respiratory diseases to PM2.5 in New York state from 2005 to 2016. Sci Total Environ 2019; 677:328-339.
[21]
Liu C, Chen R, Sera F, Vicedo-Cabrera AM, Guo Y, Tong S, et al. Ambient particulate air pollution and daily mortality in 652 cities. N Engl J Med 2019; 381(8):705-715.
[22]
McGuinn LA, Schneider A, McGarrah RW, Ward-Caviness C, Neas LM, Di Q, et al. Association of long-term PM2.5 exposure with traditional and novel lipid measures related to cardiovascular disease risk. Environ Int 2019; 122:193-200.
[23]
Polezer G, Tadano YS, Siqueira HV, Godoi AF, Yamamoto CI, de André PA, et al. Assessing the impact of PM2.5 on respiratory disease using artificial neural networks. Environ Pollut 2018; 235:394-403.
[24]
World Health Organization. WHO global air quality guidelines: particulate matter (PM2.5 and PM10), ozone, nitrogen dioxide, sulfur dioxide and carbon monoxide [Internet]. Geneva: World Health Organization; 2021 Sep 22 [cited 2024 Oct 21]. Available from: https://www.who.int/publications/i/item/9789240034228.
[25]
de Hoogh K, Héritier H, Stafoggia M, Künzli N, Kloog I. Modelling daily PM2.5 concentrations at high spatio–temporal resolution across Switzerland. Environ Pollut 2018; 233:1147-1154.
[26]
Li Z, Yim SHL, Ho KF. High temporal resolution prediction of street-level PM2.5 and NOx concentrations using machine learning approach. J Clean Prod 2020; 268:121975.
[27]
Yu W, Ye T, Zhang Y, Xu R, Lei Y, Chen Z, et al. Global estimates of daily ambient fine particulate matter concentrations and unequal spatiotemporal distribution of population exposure: a machine learning modelling study. Lancet Planet Health 2023; 7(3):e209-e218.
[28]
Cooper MJ, Martin RV, Hammer MS, Levelt PF, Veefkind P, Lamsal LN, et al. Global fine-scale changes in ambient NO2 during COVID-19 lockdowns. Nature 2022; 601(7893):380-387.
[29]
Evans J, van Donkelaar A, Martin RV, Burnett R, Rainham DG, Birkett NJ, et al. Estimates of global mortality attributable to particulate air pollution using satellite imagery. Environ Res 2013; 120:33-42.
[30]
Gui K, Che H, Wang Y, Wang H, Zhang L, Zhao H, et al. Satellite-derived PM2.5 concentration trends over Eastern China from 1998 to 2016: relationships to emissions and meteorological parameters. Environ Pollut 2019; 247:1125-1133.
[31]
Ma Z, Dey S, Christopher S, Liu R, Bi J, Balyan P, et al. A review of statistical methods used for developing large-scale and long-term PM2.5 models from satellite data. Remote Sens Environ 2022; 269:112827.
[32]
Sathe Y, Gupta P, Bawase M, Lamsal L, Patadia F, Thipse S. Surface and satellite observations of air pollution in India during COVID-19 lockdown: implication to air quality. Sustain Cities Soc 2021; 66:102688.
[33]
Eeftens M, Beelen R, De Hoogh K, Bellander T, Cesaroni G, Cirach M, et al. Development of land use regression models for PM2.5, PM2.5 absorbance, PM10 and PM coarse in 20 European study areas; results of the ESCAPE project. Environ Sci Technol 2012; 46(20):11195-11205.
[34]
Hu C, Kang P, Jaffe DA, Li C, Zhang X, Wu K, et al. Understanding the impact of meteorology on ozone in 334 cities of China. Atmos Environ 2021; 248:118221.
[35]
Kong L, Xin J, Zhang W, Wang Y. The empirical correlations between PM2.5, PM10 and AOD in the Beijing metropolitan region and the PM2.5, PM10 distributions retrieved by MODIS. Environ Pollut 2016; 216:350-360.
[36]
Shogrkhodaei SZ, Razavi-Termeh SV, Fathnia A. Spatio–temporal modeling of PM2.5 risk mapping using three machine learning algorithms. Environ Pollut 2021; 289:117859.
[37]
Xiao Q, Chang HH, Geng G, Liu Y. An ensemble machine-learning model to predict historical PM2.5 concentrations in China from satellite data. Environ Sci Technol 2018; 52(22):13260-13269.
[38]
Che H, Zhang XY, Xia X, Goloub P, Holben B, Zhao H, et al. Ground-based aerosol climatology of China: aerosol optical depths from the China Aerosol Remote Sensing Network (CARSNET) 2002–2013. Atmos Chem Phys 2015; 15(13):7619-7652.
[39]
Barnes WL, Xiong X, Salomonson VV. Status of Terra MODIS and Aqua MODIS. Adv Space Res 2003; 32(11):2099-2106.
[40]
Savtchenko A, Ouzounov D, Ahmad S, Acker J, Leptoukh G, Koziana J, et al. Terra and Aqua MODIS products available from NASA GES DAAC. Adv Space Res 2004; 34(4):710-714.
[41]
Li L, Che H, Derimian Y, Dubovik O, Luan Q, Li Q, et al. Climatology of fine and coarse mode aerosol optical thickness over East and South Asia derived from POLDER/PARASOL satellite. J Geophys Res-Atmos 2020; 125(16):2020JD032665.
[42]
Han S, Park Y, Noh N, Kim JH, Kim JJ, Kim BM, et al. Spatiotemporal variability of the PM2.5 distribution and weather anomalies during severe pollution events: observations from 462 air quality monitoring stations across South Korea. Atmos Pollut Res 2023; 14(3):101676.
[43]
Ito A, Wakamatsu S, Morikawa T, Kobayashi S. 30 years of air quality trends in Japan. Atmosphere 2021; 12(8):1072.
[44]
van Donkelaar A, Martin RV, Brauer M, Hsu NC, Kahn RA, Levy RC, et al. Global estimates of fine particulate matter using a combined geophysical–statistical method with information from satellites, models, and monitors. Environ Sci Technol 2016; 50(7):3762-3772.
[45]
van DonkelaarA, MartinRV, LiC, BurnettRT.Regional estimates of chemical composition of fine particulate matter using a combined geoscience-statistical method with information from satellites, models, and monitors.Environ Sci Technol2019; 53(5):2595-2611.
[46]
BoysBL, MartinRV, Van DonkelaarA, MacDonellRJ, HsuNC, CooperMJ, et al.Fifteen-year global time series of satellite-derived fine particulate matter.Environ Sci Technol2014; 48(19):11109-11118.
[47]
HammerMS, van DonkelaarA, LiC, LyapustinA, SayerAM, HsuNC, et al.Global estimates and long-term trends of fine particulate matter concentrations (1998–2018).Environ Sci Technol2020; 54(13):7879-7890.
[48]
Van DonkelaarA, MartinRV, BrauerM, BoysBL.Use of satellite observations for long-term exposure assessment of global concentrations of fine particulate matter.Environ Health Perspect2015; 123(2):135-143.
[49]
Van DonkelaarA, HammerMS, BindleL, BrauerM, BrookJR, GarayMJ, et al.Monthly global estimates of fine particulate matter and their uncertainty.Environ Sci Technol2021; 55(22):15287-15300.
[50]
BosilovichMG, RobertsonFR, ChenJ.Global energy and water budgets in MERRA.J Clim2011; 24(22):5721-5739.
[51]
WarganK, LabowG, FrithS, PawsonS, LiveseyN, PartykaG.Evaluation of the ozone fields in NASA’s MERRA-2 reanalysis.J Clim2017; 30(8):2961-2988.
[52]
GelaroR, McCartyW, SuárezMJ, TodlingR, MolodA, TakacsL, et al.The modern-era retrospective analysis for research and applications, version 2 (MERRA-2).J Clim2017; 30(14):5419-5454.
[53]
RandlesCA, Da SilvaAM, BuchardV, ColarcoPR, DarmenovA, GovindarajuR, et al.The MERRA-2 aerosol reanalysis, 1980 onward. Part I: system description and data assimilation evaluation.J Clim2017; 30(17):6823-6850.
[54]
RieneckerMM, SuarezMJ, GelaroR, TodlingR, BacmeisterJ, LiuE, et al.MERRA: NASA’s modern-era retrospective analysis for research and applications.J Clim2011; 24(14):3624-3648.
[55]
BuchardV, RandlesCA, Da SilvaAM, DarmenovA, ColarcoPR, GovindarajuR, et al.The MERRA-2 aerosol reanalysis, 1980 onward. Part II: evaluation and case studies.J Clim2017; 30(17):6851-6872.
[56]
BuchardV, Da SilvaAM, RandlesCA, ColarcoP, FerrareR, HairJ, et al.Evaluation of the surface PM2.5 in version 1 of the NASA MERRA aerosol reanalysis over the United States.Atmos Environ2016; 125:100-111.
[57]
YinS.Decadal changes in premature mortality associated with exposure to outdoor PM2.5 in mainland Southeast Asia and the impacts of biomass burning and anthropogenic emissions.Sci Total Environ2023; 854:158775.
[58]
YinS.Spatiotemporal variation of PM2.5-related preterm birth in China and India during 1990–2019 and implications for emission controls.J Environ Manage2023; 330:117154.
[59]
AsimakopoulosDN, FlocasHA, MaggosT, VasilakosC.The role of meteorology on different sized aerosol fractions (PM10, PM2.5, PM2.5–10).Sci Total Environ2012; 419:124-135.
[60]
CamalierL, CoxW, DolwickP.The effects of meteorology on ozone in urban areas and their use in assessing ozone trends.Atmos Environ2007; 41(33):7127-7137.
[61]
HuX, WallerLA, Al-HamdanMZ, CrossonWL, Estes JrMG, EstesSM, et al.Estimating ground-level PM2.5 concentrations in the Southeastern US using geographically weighted regression.Environ Res2013; 121:1-10.
[62]
HersbachH, BellB, BerrisfordP, HiraharaS, HorányiA, Muñoz-SabaterJ, et al.The ERA5 global reanalysis.Q J Roy Meteorol Soc2020; 146(730):1999-2049.
[63]
SoaresPM, LimaDC, NogueiraM.Global offshore wind energy resources using the new ERA-5 reanalysis.Environ Res Lett2020; 15(10):1040a2.
[64]
ZhangW, VillariniG, ScoccimarroE, NapolitanoF.Examining the precipitation associated with medicanes in the high-resolution ERA-5 reanalysis data.Int J Climatol2021; 41(S1):E126-E132.
[65]
AlbergelC, DutraE, MunierS, CalvetJC, Munoz-SabaterJ, de RosnayP, et al.ERA-5 and ERA-interim driven ISBA land surface model simulations: which one performs better?.Hydrol Earth Syst Sci2018; 22(6):3515-3532.
[66]
UrracaR, HuldT, Gracia-AmilloA, Martinez-de-PisonFJ, KasparF, Sanz-GarciaA.Evaluation of global horizontal irradiance estimates from ERA5 and COSMO-REA6 reanalyses using ground and satellite-based data.Sol Energy2018; 164:339-354.
[67]
GongC, WangY, LiaoH, WangP, JinJ, HanZ.Future co-occurrences of hot days and ozone-polluted days over China under scenarios of shared socioeconomic pathways predicted through a machine-learning approach.Earth’s Future2022; 10:e2022EF002671.
[68]
KimM, BrunnerD, KuhlmannG.Importance of satellite observations for high-resolution mapping of near-surface NO2 by machine learning.Remote Sens Environ2021; 264:112573.
[69]
JafDKI, AbdulrahmanPI, MohammedAS, KurdaR, QaidiSM, AsterisPG.Machine learning techniques and multi-scale models to evaluate the impact of silicon dioxide (SiO2) and calcium oxide (CaO) in fly ash on the compressive strength of green concrete.Constr Build Mater2023; 400:132604.
[70]
AhmedHU, MohammedAS, FarajRH, AbdallaAA, QaidiSM, SorNH, et al.Innovative modeling techniques including MEP, ANN and FQ to forecast the compressive strength of geopolymer concrete modified with nanoparticles.Neural Comput Appl2023; 35(17):12453-12479.
[71]
AhmadJ, MajdiA, Babeker ElhagA, DeifallaAF, SoomroM, IsleemHF, et al.A step towards sustainable concrete with substitution of plastic waste in concrete: overview on mechanical, durability and microstructure analysis.Crystals2022; 12(7):944.
[72]
ZhangT, HeW, ZhengH, CuiY, SongH, FuS.Satellite-based ground PM2.5 estimation using a gradient boosting decision tree.Chemosphere2021; 268:128801.
[73]
GuoB, ZhangD, PeiL, SuY, WangX, BianY, et al.Estimating PM2.5 concentrations via random forest method using satellite, auxiliary, and ground-level station dataset at multiple temporal scales across China in 2017.Sci Total Environ2021; 778:146288.
[74]
GengG, MengX, HeK, LiuY.Random forest models for PM2.5 speciation concentrations using MISR fractional AODs. Environ Res Lett, 15 (3) (2020)
[75]
ChenZ, ZhangT, ZhangR, ZhuZ, YangJ, ChenP, et al.Extreme gradient boosting model to estimate PM2.5 concentrations with missing-filled satellite data in China.Atmos Environ2019; 202:180-189.
[76]
GuiK, CheH, ZengZ, WangY, ZhaiS, WangZ, et al.Construction of a virtual PM2.5 observation network in China based on high-density surface meteorological observations using the extreme gradient boosting model.Environ Int2020; 141:15801.
[77]
ZangZ, GuoY, JiangY, ZuoC, LiD, ShiW, et al.Tree-based ensemble deep learning model for spatiotemporal surface ozone (O3) prediction and interpretation.Int J Appl Earth Obs Geoinf2021; 103:102516.
[78]
ClevertDA, UnterthinerT, HochreiterS.Fast and accurate deep network learning by exponential linear units (ELUs).2015. arXiv: 1511.07289.
[79]
TheilH.A rank invariant method of linear and polynomial regression analysis, part 3.Proc K Ned Akad Wet C1950; 53:1397-1412.
[80]
SenPK.Estimates of the regression coefficient based on Kendall’s tau.J Am Stat Assoc1968; 63(324):1379-1389.
[81]
VanemE, WalkerSE.Identifying trends in the ocean wave climate by time series analyses of significant wave height data.Ocean Eng2013; 61:148-160.
[82]
LiuC, XingC, HuQ, LiQ, LiuH, HongQ, et al.Ground-based hyperspectral stereoscopic remote sensing network: a promising strategy to learn coordinated control of O3 and PM2.5 over China.Engineering2022; 19:71-83.
[83]
LuX, ZhangS, XingJ, WangY, ChenW, DingD, et al.Progress of air pollution control in China and its challenges and opportunities in the ecological civilization era.Engineering2020; 6(12):1423-1431.
[84]
ZhangS, ChenW.China’s energy transition pathway in a carbon neutral vision.Engineering2022; 14:64-76.
[85]
YueH, HeC, HuangQ, YinD, BryanB.Stronger policy required to substantially reduce deaths from PM2.5 pollution in China.Nat Commun2020; 11(1):1462.
[86]
LiuJ, ZhengY, GengG, HongC, LiM, LiX, et al.Decadal changes in anthropogenic source contribution of PM2.5 pollution and related health impacts in China, 1990–2015.Atmos Chem Phys2020; 20(13):7783-7799.
[87]
XuJ, YaoM, WuW, QiaoX, ZhangH, WangP, et al.Estimation of ambient PM2.5-related mortality burden in China by 2030 under climate and population change scenarios: a modeling study.Environ Int2021; 156:106733.
[88]
LiuM, SaariRK, ZhouG, LiJ, HanL, LiuX.Recent trends in premature mortality and health disparities attributable to ambient PM2.5 exposure in China: 2005–2017.Environ Pollut2021; 279:116882.
[89]
LiY, LiaoQ, ZhaoX, TaoY, BaiY, PengL.Premature mortality attributable to PM2.5 pollution in China during 2008–2016: underlying causes and responses to emission reductions.Chemosphere2021; 263:127925.
[90]
WeiJ, LiZ, LyapustinA, WangJ, DubovikO, SchwartzJ, et al.First close insight into global daily gapless 1 km PM2.5 pollution, variability, and health impact.Nat Commun2023; 14(1):8349.
[91]
WeiJ, LiZ, LyapustinA, SunL, PengY, XueW, et al.Reconstructing 1-km-resolution high-quality PM2.5 data records from 2000 to 2018 in China: spatiotemporal variations and policy implications.Remote Sens Environ2000; 2021(252):112136.
[92]
WeiJ, LiZ, CribbM, HuangW, XueW, SunL, et al.Improved 1 km resolution PM2.5 estimates across China using enhanced space–time extremely randomized trees.Atmos Chem Phys2020; 20(6):3273-3289.
[93]
WeiJ, LiZ, WangJ, LiC, GuptaP, CribbM.Ground-level gaseous pollutants (NO2, SO2, and CO) in China: daily seamless mapping and spatiotemporal variations.Atmos Chem Phys2023; 23(2):1511-1532.
[94]
KoplitzSN, MickleyLJ, MarlierME, BuonocoreJJ, KimPS, LiuT, et al.Public health impacts of the severe haze in Equatorial Asia in September–October 2015: demonstration of a new framework for informing fire management strategies to reduce downwind smoke exposure.Environ Res Lett2016; 11(9):094023.
[95]
VohraK, VodonosA, SchwartzJ, MaraisEA, SulprizioMP, MickleyLJ.Global mortality from outdoor fine particle pollution generated by fossil fuel combustion: results from GEOS-Chem.Environ Res2021; 195:110754.
[96]
MaJ, XuJ, QuY.Evaluation on the surface PM2.5 concentration over China mainland from NASA’s MERRA-2.Atmos Environ2020; 237:117666.
[97]
AliMA, BilalM, WangY, NicholJE, MhawishA, QiuZ, et al.Accuracy assessment of CAMS and MERRA-2 reanalysis PM2.5 and PM10 concentrations over China.Atmos Environ2022; 288:119297.
[98]
SunY, DengK, RenK, LiuJ, DengC, JinY.Deep learning in statistical downscaling for deriving high spatial resolution gridded meteorological data: a systematic review.ISPRS J Photogramm Remote Sens2024; 208:14-38.
[99]
YinH, BrauerM, ZhangJ, CaiW, NavrudS, BurnettR, et al.Population ageing and deaths attributable to ambient PM2·5 pollution: a global analysis of economic cost.Lancet Planet Health2021; 5:e356-e367.
[100]
ElbarbaryM, OganesyanA, HondaT, KellyP, ZhangY, GuoY, et al.Ambient air pollution, lung function and COPD: cross-sectional analysis from the WHO Study of AGEing and adult health wave 1.BMJ Open Respir Res2020; 7:e000684.
[101]
FlandersWD, StricklandMJ, KleinM.A new method for partial correction of residual confounding in time-series and other observational studies.Am J Epidemiol2017; 185:941-949.
[102]
HongC, ZhangQ, ZhangY, DavisSJ, TongD, ZhengY, et al.Impacts of climate change on future air quality and human health in China.Proc Natl Acad Sci2019; 116(35):17193-17200.
[103]
ZhangP, JeongJH, YoonJH, KimH, WangSYS, LinderholmHW, et al.Abrupt shift to hotter and drier climate over inner East Asia beyond the tipping point.Science2020; 370(6520):1095-1099.