Engineering >> 2020, Volume 6, Issue 8 doi: 10.1016/j.eng.2020.05.009

A New Model Using Multiple Feature Clustering and Neural Networks for Forecasting Hourly PM2.5 Concentrations, and Its Applications in China

Institute of Artificial Intelligence and Robotics (IAIR), Key Laboratory of Traffic Safety on Track of Ministry of Education, School of Traffic and Transportation Engineering, Central South University, Changsha 410075, China

Received: 2019-06-23 Revised: 2020-02-14 Accepted: 2020-05-25 Available online: 2020-06-10

Abstract

Forecasting the concentration of particulate matter with an aerodynamic diameter no greater than 2.5 μm (PM2.5) is valuable for air pollution early warning. This study proposes an improved hybrid model for multi-step PM2.5 concentration forecasting, named multi-feature clustering decomposition (MCD)–echo state network (ESN)–particle swarm optimization (PSO). The proposed model has two components: decomposition and optimized forecasting. In the decomposition component, an MCD method consisting of rough sets attribute reduction (RSAR), k-means clustering (KC), and the empirical wavelet transform (EWT) is proposed for feature selection and data classification. Within the MCD, the RSAR algorithm selects the significant air pollutant variables, which are then clustered by the KC algorithm. The clustered PM2.5 concentration series are then decomposed into several sublayers by the EWT algorithm. In the optimized forecasting component, an ESN-based predictor is built for each decomposed sublayer to carry out the multi-step forecasting, and the PSO algorithm optimizes the initial parameters of the ESN-based predictor. Real PM2.5 concentration data from four cities located in different zones of China are used to verify the effectiveness of the proposed model. The experimental results indicate that the proposed model is suitable for high-precision multi-step forecasting of PM2.5 concentrations and outperforms the benchmark models.
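The PSO step can be illustrated with a minimal sketch, given here only as a reading aid and not as the authors' implementation. It is a basic global-best PSO written with the Python standard library. The abstract does not say which ESN parameters are tuned, so the two searched parameters (a spectral radius and a leak rate), their bounds, the swarm settings, and the quadratic `toy_validation_error` function are all assumptions made for illustration. In a real pipeline the objective would train an ESN on one decomposed sublayer and return its validation error.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random starting positions inside the box, zero starting velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and cost so far.
    best_pos = [p[:] for p in pos]
    best_cost = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            cost = objective(pos[i])
            if cost < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], cost
                if cost < g_cost:
                    g_pos, g_cost = pos[i][:], cost
    return g_pos, g_cost


def toy_validation_error(params):
    # ASSUMPTION: hypothetical stand-in for "train an ESN with these
    # parameters and return its validation error". The minimum at
    # (0.9, 0.3) is invented for this example, not taken from the paper.
    spectral_radius, leak_rate = params
    return (spectral_radius - 0.9) ** 2 + (leak_rate - 0.3) ** 2


best_params, best_error = pso_minimize(
    toy_validation_error, bounds=[(0.1, 1.5), (0.0, 1.0)])
print(best_params, best_error)
```

Because the abstract builds one ESN-based predictor per decomposed sublayer, a search of this kind would presumably be run once per sublayer, each time with that sublayer's own validation error as the objective.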
