An Interpretable Light Attention-Convolution-Gate Recurrent Unit Architecture for the Highly Accurate Modeling of Actual Chemical Dynamic Processes

Yue Li, Ning Li, Jingzheng Ren, Weifeng Shen

Engineering ›› 2024, Vol. 39 ›› Issue (8): 104-116. DOI: 10.1016/j.eng.2024.07.009

Research Article
Abstract

To equip data-driven dynamic chemical process models with strong interpretability, we develop a light attention-convolution-gate recurrent unit (LACG) architecture with three sub-modules (a basic module, a brand-new light attention module, and a residue module) that are specially designed to learn the general dynamic behavior, transient disturbances, and other input factors of chemical processes, respectively. Combined with the hyperparameter optimization framework Optuna, the effectiveness of the proposed LACG is tested through data-driven modeling experiments on distributed control system data for the discharge flowrate of an actual deethanization process. The LACG model provides significant advantages in prediction accuracy and model generalization compared with other models, including the feedforward neural network, convolutional neural network, long short-term memory (LSTM), and attention-LSTM. Moreover, compared with the simulation results of a deethanization model built using Aspen Plus Dynamics V12.1, the LACG parameters are demonstrated to be interpretable, and more details on the variable interactions can be observed from the model parameters than with the traditional interpretable attention-LSTM model. This contribution enriches interpretable machine learning knowledge and provides a reliable, highly accurate method for modeling actual chemical processes, paving a route to intelligent manufacturing.
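To make the architecture described above more concrete, the sketch below is a minimal, hypothetical PyTorch implementation of an LACG-style model together with an Optuna search loop. It is not the authors' released code: the layer sizes, the exact form of the light attention module (here a simple learned per-feature re-weighting), the convolutional residue branch, and the synthetic training data are illustrative assumptions; only the overall structure (attention feeding a GRU core, a parallel branch fused residually, and hyperparameters tuned by Optuna) follows the abstract.

import optuna
import torch
import torch.nn as nn


class LightAttention(nn.Module):
    """Assumed form of the light attention: learned per-feature weights applied at each time step."""
    def __init__(self, n_features: int):
        super().__init__()
        self.score = nn.Linear(n_features, n_features)

    def forward(self, x):                              # x: (batch, time, features)
        weights = torch.softmax(self.score(x), dim=-1)
        return x * weights                             # re-weight inputs (transient disturbances)


class LACG(nn.Module):
    """Sketch of the three sub-modules: GRU core, light attention, convolutional residue branch."""
    def __init__(self, n_features: int, hidden: int, kernel: int = 3):
        super().__init__()
        self.attention = LightAttention(n_features)
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.conv = nn.Conv1d(n_features, hidden, kernel, padding=kernel // 2)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                              # x: (batch, time, features)
        gru_out, _ = self.gru(self.attention(x))       # general dynamic behavior
        conv_out = self.conv(x.transpose(1, 2)).transpose(1, 2)  # other input factors
        fused = gru_out + conv_out                     # residual-style fusion of the branches
        return self.head(fused[:, -1, :])              # one-step-ahead prediction


def objective(trial: optuna.Trial) -> float:
    """Toy Optuna objective on random data standing in for DCS time-series windows."""
    hidden = trial.suggest_int("hidden", 16, 128)
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    model = LACG(n_features=8, hidden=hidden)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    x, y = torch.randn(64, 30, 8), torch.randn(64, 1)  # placeholder training batch
    for _ in range(20):
        optimizer.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        optimizer.step()
    return loss.item()


if __name__ == "__main__":
    study = optuna.create_study(direction="minimize")  # Optuna's default TPE sampler
    study.optimize(objective, n_trials=10)
    print(study.best_params)

The paper's interpretability analysis compares the learned LACG parameters against an Aspen Plus Dynamics simulation of the deethanization process; the sketch above only illustrates how the three branches could be wired and jointly tuned.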

Keywords

Interpretable machine learning / Light attention-convolution-gate recurrent unit architecture / Process knowledge discovery / Data-driven process model / Intelligent manufacturing

Cite this article

Yue Li, Ning Li, Jingzheng Ren, Weifeng Shen. An Interpretable Light Attention-Convolution-Gate Recurrent Unit Architecture for the Highly Accurate Modeling of Actual Chemical Dynamic Processes. Engineering, 2024, 39(8): 104‒116 https://doi.org/10.1016/j.eng.2024.07.009
