An Interpretable Light Attention-Convolution-Gate Recurrent Unit Architecture for the Highly Accurate Modeling of Actual Chemical Dynamic Processes

Yue Li, Ning Li, Jingzheng Ren, Weifeng Shen

Engineering, 2024, 39(8): 112-125. DOI: 10.1016/j.eng.2024.07.009

Research Article

Abstract

To equip data-driven dynamic chemical process models with strong interpretability, we develop a light attention-convolution-gate recurrent unit (LACG) architecture with three sub-modules (a basic module, a brand-new light attention module, and a residue module) that are specially designed to learn the general dynamic behavior, transient disturbances, and other input factors of chemical processes, respectively. Combined with a hyperparameter optimization framework, Optuna, the effectiveness of the proposed LACG is tested by distributed control system data-driven modeling experiments on the discharge flowrate of an actual deethanization process. The LACG model provides significant advantages in prediction accuracy and model generalization compared with other models, including the feedforward neural network, convolution neural network, long short-term memory (LSTM), and attention-LSTM. Moreover, compared with the simulation results of a deethanization model built using Aspen Plus Dynamics V12.1, the LACG parameters are demonstrated to be interpretable, and more details on the variable interactions can be observed from the model parameters in comparison with the traditional interpretable model attention-LSTM. This contribution enriches interpretable machine learning knowledge and provides a reliable method with high accuracy for actual chemical process modeling, paving a route to intelligent manufacturing.

Keywords

Interpretable machine learning / Light attention-convolution-gate recurrent unit architecture / Process knowledge discovery / Data-driven process model / Intelligent manufacturing

Cite this article

Yue Li, Ning Li, Jingzheng Ren, Weifeng Shen. An Interpretable Light Attention-Convolution-Gate Recurrent Unit Architecture for the Highly Accurate Modeling of Actual Chemical Dynamic Processes. Engineering, 2024, 39(8): 112-125. DOI: 10.1016/j.eng.2024.07.009


1. Introduction

Theoretical models of chemical engineering processes play an important role in process monitoring, control, diagnosis, optimization, and design [1], and exhibit relatively high extrapolation power [2]. However, the complexity of an actual chemical process significantly influences the performance of its theoretical model, and it is usually difficult to accurately describe the real-time behavior of a complex chemical process. As an alternative to additional calibration work on the process-model mismatch [3,4], data-driven or machine learning (ML) technologies provide a more direct solution to such issues.

ML emerged from neural network theory, which was proposed in the 1940s; in combination with deep learning techniques developed in the 2010s [5,6], it has become one of the most popular modeling methods. Nowadays, ML involves a wide range of technologies, including feedforward neural networks (FNNs) [7,8], convolution neural networks (CNNs) [9], recurrent neural networks (RNNs) such as long short-term memory (LSTM) and the gate recurrent unit (GRU) [10,11], graph neural networks (GNNs), and their variants [12,13]. In order to predict the melt index of polypropylene, Wu et al. [14] developed a soft sensing method based on a dilated CNN that combined data fusion and correlation analysis, achieving state-of-the-art prediction accuracy and generalization ability among ML technologies. Sun et al. [15] carried out a comprehensive comparison of seven typical data-driven models for the real-time power prediction of a turbine based on distributed control system (DCS) data from a power plant; they found that the RNN model performed best in balancing the accuracy-efficiency tradeoff. Byun et al. [16] adopted an ML-based predictive model to comprehensively investigate the technical, environmental, and economic feasibility of a promising methanol steam reforming system.

As shown by this literature review, ML has been demonstrated to be a suitable tool for modeling actual chemical processes. However, one of the most criticized drawbacks of ML is its lack of interpretability. This shortcoming makes the data-driven method unreliable under new conditions and prevents it from being trusted in practice [17]. Thus, studies on ML have paid increasing attention to interpretability and have tried to explain the "black box" in different ways. To the best of our knowledge, interpretable ML methods in chemical processes can be divided into four categories: hybrid models, models with interpretable parameters or variable relationships, models with interpretable architectures, and mechanism discovery using ML.

1.1. Hybrid models

Hybrid or so-called "grey box" models combine process mechanisms and data-driven technologies in a single modeling framework [18]. The mechanistic part reveals the underlying mechanisms and serves as an interpretable basis for the whole model, while the ML part captures the variable relationships that cannot be explained by mechanisms. Possible hybrid architectures generally include those with a parallel structure, a serial structure, and their combination [19,20]. Chen and Ierapetritou [4] proposed a hybrid model construction framework to improve the prediction accuracy of complex system models suffering from plant-model mismatches; the framework was tested with a simulated reactor model and two pharmaceutical unit operation case studies. Valencia Peroni et al. [21] presented an adaptive hybrid model consisting of a phenomenological model and a neural network to predict the behavior of a dextrose monohydrate cooling column crystallizer. A soft sensor model integrating finite impulse response and CNN was developed by Wang et al. [22] and gave the best prediction accuracy on a simulation case and a chemical industrial case compared with other baseline models. As an effective structure, physics-informed RNNs (PIRNNs) have recently come to the fore [23,24]. Wu et al. [24] presented a PIRNN modeling framework and developed a PIRNN-based predictive control scheme for batch crystallization processes. Through open-loop and closed-loop simulations, it was demonstrated that PIRNN models using less training data achieved prediction accuracy and closed-loop performance comparable to those of the purely data-driven model. To sum up, benefiting from the advantages of both mechanistic and data-driven models, hybrid models have become one of the most widespread methods for chemical process modeling.

1.2. Models with interpretable parameters or variable relationships

The interpretability of models with interpretable parameters or variable relationships is typically revealed by analyzing the relationships between input and output variables. Methods such as partial dependence plots [25] and local interpretable model-agnostic explanations [26] are useful in helping domain experts to understand the parameters that are leading to aberrations in outputs. In addition, for neural networks with a specific structure such as attention-LSTM, the variable relationships are naturally implied in the model parameters. Mu et al. [27] fused LSTM with an attention mechanism to detect the local temporal information of complex chemical process data. The interpretability of the hidden state feature was eventually enhanced, and how each temporal instance contributes to the decision function could be observed. A structure combining LSTM and a spatiotemporal attention mechanism was proposed by Xu et al. [28] and exhibited a good performance on the dynamic modeling of the Tennessee-Eastman process and fractionator datasets. The importance of different input features could be easily found from the spatial and temporal attention maps of the model.

1.3. Models with interpretable architectures

Two kinds of ML models are classified as models with interpretable architectures. One kind is the typical descriptive ML method, such as the decision tree [29] and logical analysis of data (LAD) [30]. The results of these methods are interpretable in themselves or can be summarized into domain knowledge by experts. To provide the automatic enrichment and updating of existing fault trees in industrial processes, Ragab et al. [31] proposed a methodology that combines the domain knowledge with additional knowledge extracted by LAD, enabling interpretable rule discovery for detecting faulty events. In the other kind of model, the architecture is directly organized according to the domain knowledge, which is seldom seen in the literature. Naito et al. [17] designed a two-stage autoencoder (TSAE) as an anomaly detection method for fluid handling systems. Based on the premise that the plant signals could be separated into different components, two autoencoders were combined to learn the long-term and short-term components, respectively. With the interpretable structure, the TSAE provided an explanation for its high performance on two water treatment datasets. Wu et al. [32] proposed three physics-based RNN modeling approaches, including a hybrid model, a partially connected RNN, and a weight-constrained RNN, that incorporate a priori knowledge of the process structure. The proposed models outperformed a fully connected RNN model in the model predictive control of a chemical process example.

1.4. Mechanism discovery using ML

The aim of mechanism discovery using ML is to explore the mathematical expression of the target system in order to capture the underlying mechanisms and develop the model from the data directly [33]. Narayanan et al. [34] presented a functional-hybrid model that uses ranked domain-specific functional beliefs together with symbolic regression to develop dynamic models of chemical reaction kinetic principles in different systems. The results revealed that the functional-hybrid model could accurately capture the true trends, and the learned ordinary differential equations were very close to the actual ones. Subramanian et al. [35] tested two white-box ML approaches (SINDy and SymReg) on the identification of governing equations for the dynamics of a distillation column. They suggested that different ML algorithms may need to be used in parallel to discover the dynamics laws of complex systems.

To sum up the literature review above, the following conclusions can be drawn and research gaps identified:

(1) The interpretability of hybrid models (Section 1.1) [4,18,21,22] mainly stems from their mechanistic parts.

(2) Mechanism discovery methods using ML (Section 1.4) [33-35] rely on presupposed basic rules or knowledge and may cause underfitting when dealing with complex chemical processes. Models with interpretable parameters (Section 1.2) [27,28] or interpretable architectures (Section 1.3) [17] hold more potential for revealing the interpretability of ML itself.

(3) However, the parameter expression ability of models with interpretable parameters (Section 1.2) [27,28] is constrained by the model architecture, resulting in simple input-output weights that are insufficient for a deep understanding of actual chemical processes.

(4) Models with interpretable architectures (Section 1.3) [17] are constructed on a more solid theoretical basis, but they do not provide a clear way (e.g., revealing the interpretability via model parameters) for domain experts to explain the results.

To explore more possible ways to equip data-driven models with stronger interpretability, a novel interpretable light attention-convolution-gate recurrent unit (LACG) architecture is developed in this contribution for the purpose of chemical process modeling. The new architecture combines the advantages of models with interpretable parameters [27,28] and models with interpretable architectures [17]. It consists of three sub-modules (a basic module, a light attention module, and a residue module) that are specially designed to respectively deal with the general dynamic behavior, transient disturbances, and other input factors of the chemical process system. In addition, the chemical process dynamic characteristics can be learned by the LACG parameters. Rather than presenting simple weight correlations, the proposed architecture provides more detailed insight into the process variable interactions in the time domain, deepening the understanding of the actual chemical processes.

The proposed LACG was used in the modeling of the bottom flowrate of an actual deethanization process based on preprocessed DCS data and exhibited an excellent simulation performance compared with five typical neural networks: an FNN, a CNN, a two-dimensional CNN (CNN-2D), an LSTM, and an attention-LSTM. Moreover, a deethanization theoretical model was built using Aspen Plus Dynamics V12.1 to test the interpretability of the LACG model in order to determine whether the process dynamic behavior had been exactly learned by the LACG parameters. The proposed LACG architecture enriches the existing interpretable ML knowledge. It is expected to be a valuable method in actual chemical process modeling and analysis, promoting advanced tasks such as real-time optimization (RTO) and fault detection and diagnosis (FDD).

2. Introduction of the basic neural network layers

As all complicated neural networks are composed of basic network layers, it is necessary to provide a brief introduction before discussing specific neural networks. Fig. S1 in Appendix A illustrates three of the most widely used network layers (i.e., the dense layer, convolution layer, and recurrent layer), and their formulas are listed in Eqs. (1)-(3) [9,36,37]:

$$\text{Dense layer:}\quad Y = \mathrm{AF}\left(W_\mathrm{d}X + b\right) \tag{1}$$

$$\text{Convolution layer:}\quad Y = \mathrm{AF}\left(W_\mathrm{c} * X + b\right) \tag{2}$$

$$\text{Recurrent layer:}\quad Y = \mathrm{AF}\left(W_\mathrm{r}X + W_\mathrm{s}X_\mathrm{state} + b\right) \tag{3}$$

where $X$, $Y$, and $X_\mathrm{state}$ are the input, output, and state input, respectively; the subscripts d, c, r, and s refer to the dense layer, convolution layer, recurrent input layer, and recurrent state layer, respectively; $W$ with different subscripts stands for the corresponding network weight; and $b$ refers to the bias, if available. It should be noted that $*$ is the convolution operation. $\mathrm{AF}$ represents the activation function, if available, such as the sigmoid, tanh, and rectified linear unit (ReLU) functions [9,36]. The dense layer is the basic module of the FNN, which is one of the most widespread networks in applications. This layer is simple to use and generally works well. However, it is costly in terms of storage and training due to its large number of parameters, and it tends to underperform on image and time series data compared with the convolution and recurrent layers, respectively. The convolution layer always works based on a 2D kernel in image processing. The recurrent layer is considered to be capable in time series processing. However, the gradient vanishing/exploding problem may arise and result in training problems [38]. The LSTM, designed by Hochreiter and Schmidhuber [39] and Gers et al. [40], and the GRU, developed by Cho et al. [41], aim to overcome this issue by introducing memory cells and gate units. These approaches have proved to be a great success and have become increasingly popular [42]. Both the LSTM and GRU are constructed on the same network basis illustrated in Fig. S1(c) but use more complex recurrent cells, as shown in Fig. S2 in Appendix A and Eqs. (4)-(13) [37,40,41]:

$$\text{LSTM:}\quad f_t = \sigma\left(W_f\left[h_{t-1}, X_t\right] + b_f\right) \tag{4}$$
$$i_t = \sigma\left(W_i\left[h_{t-1}, X_t\right] + b_i\right) \tag{5}$$
$$o_t = \sigma\left(W_o\left[h_{t-1}, X_t\right] + b_o\right) \tag{6}$$
$$\tilde{C}_t = \tanh\left(W_C\left[h_{t-1}, X_t\right] + b_C\right) \tag{7}$$
$$C_t = f_t \times C_{t-1} + i_t \times \tilde{C}_t \tag{8}$$
$$h_t = o_t \times \tanh\left(C_t\right) \tag{9}$$
$$\text{GRU:}\quad r_t = \sigma\left(W_r\left[h_{t-1}, X_t\right] + b_r\right) \tag{10}$$
$$z_t = \sigma\left(W_z\left[h_{t-1}, X_t\right] + b_z\right) \tag{11}$$
$$\tilde{h}_t = \tanh\left(W_h\left[r_t \times h_{t-1}, X_t\right] + b_h\right) \tag{12}$$
$$h_t = \left(1 - z_t\right) \times h_{t-1} + z_t \times \tilde{h}_t \tag{13}$$

where the subscripts $t-1$ and $t$ indicate the recurrent step number of the value; $h$ stands for the hidden state; $f$, $i$, and $o$ are the values of the forget gate, input gate, and output gate of the LSTM, respectively; $\tilde{C}$ and $C$ represent the candidate and final cell states of the LSTM, respectively; $r$ and $z$ stand for the values of the reset gate and the update gate of the GRU, respectively; $\tilde{h}$ is the candidate hidden state of the GRU; and $\sigma$ and $\tanh$ are the sigmoid and tanh activation functions, respectively. In general, the hidden state $h$ and cell state $C$ are initialized to 0.
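To make the recurrence concrete, the following minimal NumPy sketch implements a single GRU step according to Eqs. (10)-(13); the concatenated-input convention and the weight shapes are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(h_prev, x_t, W_r, W_z, W_h, b_r, b_z, b_h):
    """One GRU recurrence step following Eqs. (10)-(13).

    h_prev: hidden state h_{t-1} of shape [H]; x_t: input X_t of shape [F].
    Each weight W_* has shape [H, H + F] and acts on the concatenation
    [h_{t-1}, X_t], as in the equations above.
    """
    hx = np.concatenate([h_prev, x_t])
    r_t = sigmoid(W_r @ hx + b_r)                 # reset gate, Eq. (10)
    z_t = sigmoid(W_z @ hx + b_z)                 # update gate, Eq. (11)
    h_cand = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]) + b_h)  # Eq. (12)
    return (1.0 - z_t) * h_prev + z_t * h_cand    # new hidden state, Eq. (13)
```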

An RNN may deteriorate in training efficiency and modeling performance on long series [43] because it pays indiscriminate attention to all input features, even when the key information is contained in only a few positions. To address this issue, an attention mechanism, such as Bahdanau attention [44] or Luong attention [45], can be introduced to direct the network's focus to where the most relevant information lies. A popular self-attention structure based on the Luong attention is shown in Fig. S3 in Appendix A and Eqs. (14)-(17) [44]:

$$K = XW_\mathrm{d} \tag{14}$$
$$P = KX_\mathrm{s} \tag{15}$$
$$A = \mathrm{Softmax}\left(P\right) = \frac{e^{P_i}}{\sum_i e^{P_i}} \tag{16}$$
$$Y = \left[X \times A,\ X_\mathrm{s}\right] \tag{17}$$

where $X_\mathrm{s}$ is the last step of the input; $A$ refers to the attention vector; $K$ and $P$ are intermediate tensors; and the subscript $i$ indicates the element index. As shown in Fig. S3, in the final step of the attention module, an attention vector is obtained by a softmax function that normalizes all the input elements so that they sum to 1. The attention vector changes with different inputs; it is effectively a weight array revealing how the input features affect the output. Therefore, the attention mechanism makes the neural network interpretable to some extent.
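For a single sample, Eqs. (14)-(17) reduce to a few matrix operations; the following NumPy sketch follows Fig. S3 in treating the last time step as the query, with the array shapes chosen for illustration.

```python
import numpy as np

def softmax(p):
    e = np.exp(p - p.max())      # shift for numerical stability
    return e / e.sum()

def luong_self_attention(X, W_d):
    """Self-attention for one sample following Eqs. (14)-(17).

    X: input sequence of shape [S, F]; W_d: projection weight of shape [F, F].
    """
    X_s = X[-1]                  # last step of the input
    K = X @ W_d                  # Eq. (14)
    P = K @ X_s                  # alignment scores, Eq. (15)
    A = softmax(P)               # attention vector, Eq. (16)
    context = A @ X              # attention-weighted sum over the time steps
    return np.concatenate([context, X_s])   # Y = [X x A, X_s], Eq. (17)
```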

In order to illustrate complex neural networks in a straightforward way, a neural network block is proposed, as shown in Fig. S4 in Appendix A. In combination with the necessary parameter specifications, a block presents a specific layer of the neural network architecture. The variable dimension of the block output is also listed at the bottom. By organizing the blocks in a desirable way, a complex neural network with multiple layers can be illustrated easily. In the following sections, all the models are introduced using the neural network blocks. More specifically, the parameter "Pad" for the convolution layer determines the output size. For example, the output size is the same as the input size if "Pad" is "same"; if "Pad" is "valid," the output size for a one-dimensional (1D) convolution layer can be calculated by Eq. (18) [9]:

$$L_n = L_m - L_f + 1 \tag{18}$$

where $L_m$ and $L_n$ are the input and output lengths, respectively, and $L_f$ stands for the kernel size. The parameter "Seq" of the recurrent block indicates whether the layer returns sequence results or not. If sequence results are required, the outputs of the RNN cells under all recurrent steps are obtained. Otherwise, only the output of the final step is returned.

Other settings used in this study are as follows: ① the DCS data inputs have three main dimensions, $[B, S, F]$, where $B$, $S$, and $F$ are the sample batch size, time step, and feature number, respectively; and ② the convolution kernel is single, with a moving stride of 1.
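As a quick check of Eq. (18) and the "Pad" setting, the sketch below compares the output lengths of "same" and "valid" 1D convolutions on an input of size [1500, 60, 21]; a TensorFlow/Keras backend and a filter count of 8 are assumptions for illustration, since the paper does not name its framework.

```python
import tensorflow as tf

x = tf.random.normal([1500, 60, 21])  # [B, S, F], as defined above
same = tf.keras.layers.Conv1D(8, kernel_size=5, padding="same")(x)
valid = tf.keras.layers.Conv1D(8, kernel_size=5, padding="valid")(x)
print(same.shape)   # (1500, 60, 8): "same" keeps the input length
print(valid.shape)  # (1500, 56, 8): "valid" gives L_n = L_m - L_f + 1 = 60 - 5 + 1
```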

3. Methodology

In this section, the proposed LACG architecture is presented together with five baseline neural networks used in the subsequent modeling experiments. Moreover, a hyperparameter optimization framework, Optuna, is introduced for network optimization.

3.1. The proposed LACG for chemical process dynamic modeling

To construct a network architecture that is compatible with process principles, it is necessary to clarify the factors that determine the developing trends of the process variables. In this study, the driving forces of the output variable fluctuation are classified into three parts: the general dynamic behavior of the process itself, input transient disturbances, and other input factors. Naturally, the design principles of interpretable ML indicate that the network architecture should also be in three parts in order to learn the different driving forces accordingly. Moreover, to ensure a good modeling performance and interpretability, every part of the network should be built based on the foundational neural network layers that are subject to the characteristics of the corresponding process factors. Therefore, an LACG architecture with three specially designed sub-modules is developed, as illustrated in Fig. 1.

The three sub-modules of the LACG (i.e., the basic module, the light attention module, and the residue module) aim to deal with the general dynamic behavior, the input transient disturbances, and the other input factors of the chemical process, respectively. As the most essential process principle, the general dynamic behavior is expected to be learned by the LACG parameters. The specific architecture and organization of the developed LACG are shown in Fig. 2, followed by detailed descriptions of each sub-module.

3.1.1. The basic module

The general dynamic behavior is the basic rule of a chemical process and does not change much during normal operation. Simple neural network layers with a small number of parameters are suitable for learning this general behavior in order to free the model from overfitting and achieve a good generalization performance. Thus, two convolution layers are used as the base of the proposed LACG. In the first convolution layer, input with a dimension of $[B, S, F]$ is convoluted through a kernel of $[S, F]$. This convolution kernel updates in every gradient descent step and can finally be considered an input-output conversion function. The kernel weights are expected to indicate how the inputs in a time window affect the output, or how the output responds to a transient input. In this way, the general dynamic behavior of the chemical process is learned. Every column in dimension $F$ of the convolution kernel corresponds to an input feature, and the time window size is determined by $S$. In addition, the second convolution layer acts as a regularization layer. To sum up, this basic module is the prerequisite that enables the LACG to be interpretable. The formulas of the basic module are derived as Eqs. (19) and (20):

$$Y_{\mathrm{I},1} = \mathrm{Conv}\left(X \mid W_{\mathrm{c,I},1}\right)_{\mathrm{Pad=same}} \tag{19}$$
$$Y_{\mathrm{I},2} = \mathrm{Conv}\left(Y_{\mathrm{I},1} \mid W_{\mathrm{c,I},2}\right)_{\mathrm{Pad=valid}} \tag{20}$$

where $\mathrm{Conv}$ stands for the convolution layer, that is, Eq. (2). The Roman numeral subscript (I, II, III) indicates the specific module, followed by an Arabic numeral for numbering; for example, $Y_{\mathrm{I},1}$ refers to the first output of Module I (the basic module). The subscripts of the network layer, such as "Pad," indicate the specific layer setting, which is consistent with Fig. S4.
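A minimal sketch of the basic module follows, again assuming a Keras backend; the single-filter kernels and the second layer's kernel size are illustrative assumptions, since the actual layer sizes are hyperparameters optimized in Section 3.3.

```python
import tensorflow as tf

S, F = 60, 21  # time steps and input features, as in the case study

inputs = tf.keras.Input(shape=(S, F))
# First convolution, Eq. (19): a kernel spanning the full [S, F] window
# whose learned weights can be read as input-output response curves.
y_1 = tf.keras.layers.Conv1D(1, kernel_size=S, padding="same")(inputs)
# Second convolution, Eq. (20): acts as a regularization layer.
y_2 = tf.keras.layers.Conv1D(1, kernel_size=S, padding="valid")(y_1)
basic_module = tf.keras.Model(inputs, y_2)
```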

3.1.2. The light attention module

This brand-new module aims to deal with the transient disturbances of the main process variables. These frequent disturbances are input-dependent and affect the output to a limited extent because they are subject to the general dynamic behavior. The convolution kernel of the basic module is fixed, so it cannot capture such transient disturbances, which reduces the model accuracy. As mentioned earlier, the attention mechanism is a useful technique for introducing more flexibility into models. However, after conversion by the normalized attention vector shown in Fig. S3, the outputs may differ greatly from the module inputs, increasing the overfitting risk. Therefore, to achieve a balance between model accuracy and generalization performance, a novel attention mechanism that limits the fluctuation amplitude of the output attention vector is created: the so-called "light attention" module.

The light attention module is illustrated in Fig. 3. Through several recurrent and dense layers, the input tensor is activated by a tanh function and thereby scaled to the range of $-1$ to $1$. The intermediate tensor $\mathit{LP}$ (Fig. 3) is then combined with a light attention coefficient $\alpha$. Next, a light attention vector of weights ranging from $1-\alpha$ to $1+\alpha$ is obtained, which can balance the tradeoff between accuracy and generalization. Finally, an integrated convolution kernel for the basic module is obtained via pointwise multiplication between the light attention vector and the original kernel.

By changing the light attention coefficient α, the light attention weights can be limited to a desired range. In addition, the recurrent layers in the module can be altered to LSTM or GRU. The GRU was chosen in this research. Moreover, this new module is only coupled with the second convolution layer of the basic module to minimize the impact on interpretability. Thus, the equations of the light attention module can be derived as follows:

$$Y_{\mathrm{II},i}\,(\mathit{LV}) = \begin{cases} Y_{\mathrm{I},1}, & i = 0 \\ \mathrm{GRU}\left(Y_{\mathrm{II},i-1} \mid W_{\mathrm{g,II},i}, b_{\mathrm{g,II},i}\right)_{\mathrm{Seq=Yes}}, & i = 1\text{-}N_{\mathrm{g,II}} \end{cases} \tag{21}$$
$$Y_{\mathrm{II},N_{\mathrm{g,II}}+j}\,(\mathit{LK}) = \mathrm{Dense}\left(Y_{\mathrm{II},N_{\mathrm{g,II}}+j-1} \mid W_{\mathrm{d,II},j}, b_{\mathrm{d,II},j}\right)_{\mathrm{AF=ReLU}}, \quad j = 1\text{-}N_{\mathrm{d,II}} \tag{22}$$
$$Y_{\mathrm{II},N_{\mathrm{g,II}}+N_{\mathrm{d,II}}+1}\,(\mathit{LA}) = 1 + \alpha \times \mathrm{Dense}\left(Y_{\mathrm{II},N_{\mathrm{g,II}}+N_{\mathrm{d,II}}} \mid W_{\mathrm{d,II},N_{\mathrm{d,II}}+1}, b_{\mathrm{d,II},N_{\mathrm{d,II}}+1}\right)_{\mathrm{AF=tanh}} \tag{23}$$
$$Y_{\mathrm{I},2}' = \mathrm{Conv}\left(Y_{\mathrm{I},1} \mid W_{\mathrm{c,I},2} \times \mathit{LA}\right)_{\mathrm{Pad=valid}} \tag{24}$$

where $\mathrm{GRU}$ stands for the GRU layer with Eqs. (10)-(13) and $\mathrm{Dense}$ refers to the dense layer (i.e., Eq. (1)). $N$ is the number of network layers. The subscripts d, c, and g represent the dense, convolution, and GRU layers, respectively. In particular, $W_\mathrm{g}$ involves the three gate weights in the GRU, that is, $W_r$, $W_z$, and $W_h$, and so does $b_\mathrm{g}$. The subscripts $i$ and $j$ indicate the element index. $\mathit{LV}$ and $\mathit{LK}$ are intermediate tensors, $\mathit{LA}$ represents the light attention vector, and $\alpha$ stands for the light attention coefficient, which is set to 0.2 in this research according to training experience. $Y_{\mathrm{I},2}'$ is the updated output of the basic module using the light attention vector. The meanings of the other symbols are the same as in Eqs. (19) and (20).
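The light attention vector of Eqs. (21)-(23) can be sketched as follows (Keras backend and layer sizes are assumptions); the resulting LA lies in the band [1 − α, 1 + α] and is multiplied pointwise with the original kernel $W_{\mathrm{c,I},2}$ to form the integrated kernel of Eq. (24).

```python
import tensorflow as tf

ALPHA = 0.2  # light attention coefficient used in this work

def light_attention_vector(y_basic, gru_units=16, dense_units=32):
    """Sketch of Eqs. (21)-(23); gru_units and dense_units are assumptions.

    y_basic is the first basic-module output Y_I,1. The tanh output is
    rescaled so that the light attention weights stay in [1 - a, 1 + a].
    """
    lv = tf.keras.layers.GRU(gru_units, return_sequences=True)(y_basic)  # Eq. (21)
    lk = tf.keras.layers.Dense(dense_units, activation="relu")(lv)       # Eq. (22)
    la = 1.0 + ALPHA * tf.keras.layers.Dense(1, activation="tanh")(lk)   # Eq. (23)
    return la  # multiplied pointwise with W_c,I,2 before the second convolution
```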

3.1.3. The residue module

In addition to the aforementioned driving forces, the process outputs are affected by other disturbances, which deviate the model outputs from the actual DCS data. To free the model from such disturbances and further improve its performance, a residue module is added, which includes one convolution layer and several GRU layers. The convolution layer aims to reduce the tensor size so that the number of GRU parameters can be controlled, which is helpful for training efficiency and stability. Then, the interference information flowing through the time steps is captured by the GRU layers to eliminate the residues between the model outputs and the actual DCS data. Accordingly, the formulas of the residue module are organized as follows:

$$Y_{\mathrm{III},1} = \mathrm{Conv}\left(\left[X, Y_{\mathrm{I},1}\right] \mid W_{\mathrm{c,III},1}\right)_{\mathrm{Pad=valid}} \tag{25}$$
$$Y_{\mathrm{III},k+1} = \mathrm{GRU}\left(Y_{\mathrm{III},k} \mid W_{\mathrm{g,III},k}, b_{\mathrm{g,III},k}\right)_{\mathrm{Seq=Yes}}, \quad k = 1\text{-}N_{\mathrm{g,III}} \tag{26}$$
$$Y_{\mathrm{III},N_{\mathrm{g,III}}+2} = \mathrm{GRU}\left(Y_{\mathrm{III},N_{\mathrm{g,III}}+1} \mid W_{\mathrm{g,III},N_{\mathrm{g,III}}+1}, b_{\mathrm{g,III},N_{\mathrm{g,III}}+1}\right)_{\mathrm{Seq=No}} \tag{27}$$
$$Y_\mathrm{final} = Y_{\mathrm{I},2}' + Y_{\mathrm{III},N_{\mathrm{g,III}}+2} \tag{28}$$

where $Y_\mathrm{final}$ is the final output of the proposed LACG, combining the results of the basic module and the residue module. The subscript $k$ indicates the element index. The meanings of the other symbols are the same as in Eqs. (19)-(24).
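The residue module (Eqs. (25)-(28)) can be sketched in the same style; the concatenation of $X$ and $Y_{\mathrm{I},1}$, the "valid" convolution, and the final non-sequence GRU mirror the equations, while the filter count, kernel size, and GRU widths are assumptions.

```python
import tensorflow as tf

def residue_module(x, y_basic, conv_filters=16, gru_units=16):
    """Sketch of Eqs. (25)-(28); layer sizes are illustrative assumptions."""
    concat = tf.keras.layers.Concatenate(axis=-1)([x, y_basic])  # [X, Y_I,1]
    # Size-reducing convolution keeps the GRU parameter count small, Eq. (25).
    y = tf.keras.layers.Conv1D(conv_filters, kernel_size=3, padding="valid")(concat)
    y = tf.keras.layers.GRU(gru_units, return_sequences=True)(y)   # Eq. (26)
    y = tf.keras.layers.GRU(1, return_sequences=False)(y)          # Eq. (27)
    return y  # added to the light-attention-updated basic output, Eq. (28)
```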

These three sub-modules of the LACG are organized as shown in Fig. 2 and are trained sequentially, except for the second convolution layer of the basic module, which needs to be trained twice: once with the basic module and once with the light attention module. The combination of the designed modules, the coupling method, and the special training process achieves highly accurate and interpretable data-driven modeling.

3.2. The baseline neural networks used for comparisons in the modeling experiments

In this study, five kinds of neural networks-that is, the FNN, CNN, CNN-2D, LSTM, and attention-LSTM-were chosen as the baseline models, as they are some of the most typical and widely used types among the numerous ML technologies. The basic layers/modules introduced in Section 2 are used to construct the baselines, including: ① an FNN with a dimension-squeezing module (Fig. S5 in Appendix A); ② two typical CNNs with one and two dimensions, respectively (i.e., the LeNet [46] and AlexNet [47]), comprising several stacked convolution layers, pooling layers, and dense layers (Fig. S6 in Appendix A), where the pooling layer is a transformation to reduce the output size by either max-pooling or average-pooling [9]; ③ an LSTM with stacked LSTM layers, followed by a series of dense layers (Fig. S7 in Appendix A); and ④ an attention-LSTM using the self-attention mechanism (Fig. S8 in Appendix A). The corresponding mathematical formulas are derived as shown in Eqs. (29)-(42):

$$\text{FNN:}\quad Y_i = \begin{cases} X, & i = 0 \\ \mathrm{Dense}\left(Y_{i-1} \mid W_{\mathrm{d},i}, b_{\mathrm{d},i}\right)_{\mathrm{AF=ReLU}}, & i = 1\text{-}N_\mathrm{d} \end{cases} \tag{29}$$
$$Y_{N_\mathrm{d}+1} = \mathrm{Dense}\left(Y_{N_\mathrm{d}} \mid W_{\mathrm{d},N_\mathrm{d}+1}, b_{\mathrm{d},N_\mathrm{d}+1}\right)_{\mathrm{AF=ReLU}} \tag{30}$$
$$Y_\mathrm{final} = \mathrm{Dense}\left(Y_{N_\mathrm{d}+1}[:,:,1] \mid W_{\mathrm{d},N_\mathrm{d}+2}, b_{\mathrm{d},N_\mathrm{d}+2}\right) \tag{31}$$
$$\text{CNN:}\quad Y_i = \begin{cases} X, & i = 0 \\ \mathrm{CNN}\left(Y_{i-1} \mid W_{\mathrm{c},i}\right)_{\mathrm{AF=ReLU,\ Pad=same}}, & i = 1, 3, 5, \ldots, 2N_\mathrm{c}-1 \\ \mathrm{Pool}\left(Y_{i-1}\right), & i = 2, 4, 6, \ldots, 2N_\mathrm{c} \end{cases} \tag{32}$$
$$Y_{2N_\mathrm{c}+k+1} = \begin{cases} \mathrm{Flat}\left(Y_{2N_\mathrm{c}+k}\right), & k = 0 \\ \mathrm{Dense}\left(Y_{2N_\mathrm{c}+k} \mid W_{\mathrm{d},k}, b_{\mathrm{d},k}\right)_{\mathrm{AF=ReLU}}, & k = 1\text{-}N_\mathrm{d} \end{cases} \tag{33}$$
$$Y_\mathrm{final} = \mathrm{Dense}\left(Y_{2N_\mathrm{c}+N_\mathrm{d}+1} \mid W_{\mathrm{d},N_\mathrm{d}+1}, b_{\mathrm{d},N_\mathrm{d}+1}\right) \tag{34}$$
$$\text{LSTM:}\quad Y_i = \begin{cases} X, & i = 0 \\ \mathrm{LSTM}\left(Y_{i-1} \mid W_{\mathrm{l},i}, b_{\mathrm{l},i}\right)_{\mathrm{Seq=Yes}}, & i = 1\text{-}N_\mathrm{l} \end{cases} \tag{35}$$
$$Y_{N_\mathrm{l}+1} = \mathrm{LSTM}\left(Y_{N_\mathrm{l}} \mid W_{\mathrm{l},N_\mathrm{l}+1}, b_{\mathrm{l},N_\mathrm{l}+1}\right)_{\mathrm{Seq=No}} \tag{36}$$
$$Y_{N_\mathrm{l}+1+j} = \mathrm{Dense}\left(Y_{N_\mathrm{l}+j} \mid W_{\mathrm{d},j}, b_{\mathrm{d},j}\right)_{\mathrm{AF=ReLU}}, \quad j = 1\text{-}N_\mathrm{d} \tag{37}$$
$$Y_\mathrm{final} = \mathrm{Dense}\left(Y_{N_\mathrm{l}+N_\mathrm{d}+1} \mid W_{\mathrm{d},N_\mathrm{d}+1}, b_{\mathrm{d},N_\mathrm{d}+1}\right) \tag{38}$$
$$\text{Attention-LSTM:}\quad Y_i = \begin{cases} X, & i = 0 \\ \mathrm{LSTM}\left(Y_{i-1} \mid W_{\mathrm{l},i}, b_{\mathrm{l},i}\right)_{\mathrm{Seq=Yes}}, & i = 1\text{-}N_\mathrm{l} \end{cases} \tag{39}$$
$$Y_{N_\mathrm{l}+1} = \mathrm{SA}\left(Y_{N_\mathrm{l}} \mid W_{\mathrm{d},1}\right) \tag{40}$$
$$Y_{N_\mathrm{l}+2} = \mathrm{Dense}\left(Y_{N_\mathrm{l}+1} \mid W_{\mathrm{d},2}\right)_{\mathrm{AF=tanh}} \tag{41}$$
$$Y_\mathrm{final} = \mathrm{Dense}\left(Y_{N_\mathrm{l}+2} \mid W_{\mathrm{d},3}, b_{\mathrm{d},3}\right) \tag{42}$$

where $\mathrm{Pool}$ and $\mathrm{Flat}$ refer to the pooling and flattening operations, respectively [9], and $\mathrm{SA}$ stands for the self-attention layer in Eqs. (14)-(17). The subscript l represents the LSTM layers. The operator $[:,:,1]$ indicates the dimension-squeezing operation. The meanings of the other symbols in Eqs. (29)-(42) are the same as in Eqs. (19)-(28).

3.3. Optimization of the hyperparameters of neural networks

Hyperparameters are the parameters that cannot be trained in the gradient descent process; in other words, they are predetermined before the training of the neural networks and include, for example, the numbers of layers and neural units, the learning rate, and the number of training epochs. For a precise comparison, the hyperparameters of the six data-driven models mentioned earlier should be optimized first. Four different hyperparameter tuning and optimization strategies were studied thoroughly by Pravin et al. [48], who found that the hyperparameter optimization framework Optuna [49] outperformed traditional strategies, including manual tuning, grid search, and random search. Thus, Optuna, more specifically, the version based on the tree-structured Parzen estimator [50], is used herein to find the optimal hyperparameters. To avoid overfitting and pursue a good model generalization performance, the average of the coefficient of determination ($R^2$) values on the validation set is taken as the optimization objective, as revealed in Eq. (43), where $N_\mathrm{EP}$ is the epoch number.

$$\text{Average of } R^2 = \max\left(\frac{1}{N_\mathrm{EP}}\sum_{i=1}^{N_\mathrm{EP}} R_i^2\right) \tag{43}$$

The model hyperparameters for optimization are listed in Table 1, with their ranges in Table 2, which are determined according to training experience. It should be noted that the number of layer sizes $U$ is equal to the corresponding layer number $N$ in each model; for example, if the $N_\mathrm{c}$ and $N_\mathrm{d}$ of the CNN are 1 and 2, respectively, the numbers of $U_\mathrm{c}$ and $U_\mathrm{d}$ will be 1 and 2 as well. In particular, the $U_\mathrm{c}''$ of Module III of the LACG ranges between 10 and 45 for the purpose of parameter reduction.

During the optimization of each model, 100 trials are carried out to find the optimal hyperparameter combination. As an example, the optimization process of the proposed LACG is illustrated by the flowchart in Fig. 4 and the pseudocode in Fig. S9 in Appendix A. For all the networks, the weights $W$ are initialized by the Xavier uniform initializer [51], and the biases $b$ are initialized to zero tensors.
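A minimal sketch of one such Optuna study is given below; the hyperparameter names, ranges, and the train_and_validate routine are hypothetical placeholders (the actual search spaces are those of Tables 1 and 2), while the TPE sampler and the maximization of the epoch-averaged validation $R^2$ follow the description above.

```python
import optuna

def objective(trial):
    # Sample a hyperparameter combination; names and ranges are illustrative.
    params = {
        "n_gru_layers": trial.suggest_int("n_gru_layers", 1, 3),
        "learning_rate": trial.suggest_float("learning_rate", 1e-4, 1e-2, log=True),
    }
    # Hypothetical routine: trains the model and returns the validation R^2
    # of every epoch, so that the objective is the average of Eq. (43).
    r2_per_epoch = train_and_validate(params)
    return sum(r2_per_epoch) / len(r2_per_epoch)

study = optuna.create_study(
    direction="maximize",
    sampler=optuna.samplers.TPESampler(),  # tree-structured Parzen estimator [50]
)
study.optimize(objective, n_trials=100)    # 100 trials per model, as above
print(study.best_params)
```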

4. Case study: Product flowrate modeling of an actual deethanization process

In order to test the modeling performance of different data-driven technologies, including the proposed LACG, an actual deethanization process in Yulin, Shaanxi, China, as shown in Fig. 5, is introduced as a case study. The process aims to separate $\mathrm{C_2H_6}$ and $\mathrm{C_2H_4}$ from $\mathrm{C_3H_8}$ and $\mathrm{C_3H_6}$ in the deethanizer. As a result, the lighter $\mathrm{C_2}$ hydrocarbons are collected at the top of the column, while the heavier $\mathrm{C_3}$ hydrocarbons are discharged at the bottom. Several dynamic control strategies involving temperature, pressure, and level are implemented to maintain product quality.

The bottom flowrate of the deethanizer is chosen as the output variable in the case study, while the other 21 process variables are considered to be the input variables and are listed in Table 3. The variable data are collected from the actual DCS and range from 00:00:00 on September 25, 2020 to 08:57:00 on October 26, 2020, at an interval of 1 min. The Savitzky-Golay method (i.e., a cubic smoothing algorithm with a five-point approximation) is carried out to reduce the data noise [52], and the DCS data are standardized and then chronologically divided into a training set, a validation set, and a testing set with 36142, 4518, and 4518 groups of DCS data, respectively.
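The preprocessing pipeline described above can be sketched as follows; the array layout (time along the first axis, 22 variables along the second) and the uniform treatment of all variables are assumptions for illustration.

```python
import numpy as np
from scipy.signal import savgol_filter

def preprocess(dcs, n_train=36142, n_val=4518, n_test=4518):
    """Smooth, standardize, and chronologically split raw DCS data.

    dcs: array of shape [T, 22] (21 inputs plus the output flowrate),
    sampled at 1 min intervals.
    """
    # Savitzky-Golay smoothing: five-point window, cubic polynomial [52].
    smoothed = savgol_filter(dcs, window_length=5, polyorder=3, axis=0)
    # Standardize each variable to zero mean and unit variance.
    standardized = (smoothed - smoothed.mean(axis=0)) / smoothed.std(axis=0)
    # Chronological split into training, validation, and testing sets.
    train = standardized[:n_train]
    val = standardized[n_train:n_train + n_val]
    test = standardized[n_train + n_val:n_train + n_val + n_test]
    return train, val, test
```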

A batch size of 1500 is chosen for all datasets, and the time step of each sample is set to 60, so that the input size of the models is [1500, 60, 21]. The mean absolute error (MAE) and $R^2$ are used as the loss function and the evaluation index, respectively, as expressed by Eqs. (44) and (45) [53], where $n$ is the sample number and $y$, $\hat{y}$, and $\bar{y}$ represent the real output, the predicted output, and the mean of the real output, respectively.

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{44}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{n}\left(\bar{y} - y_i\right)^2} \tag{45}$$
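Both metrics are straightforward to compute; a minimal NumPy version of Eqs. (44) and (45) is:

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (44)."""
    return np.mean(np.abs(y_pred - y_true))

def r2(y_true, y_pred):
    """Coefficient of determination, Eq. (45)."""
    ss_res = np.sum((y_pred - y_true) ** 2)
    ss_tot = np.sum((np.mean(y_true) - y_true) ** 2)
    return 1.0 - ss_res / ss_tot
```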

5. Results and discussion

In this section, the training and modeling performances of the six data-driven models on the discharge flowrate are presented, while the parameters of the developed LACG model are extracted to illustrate its interpretability.

5.1. Modeling loss and coefficient of determination

The deep learning models for the prediction of the bottom flowrate were built using Python 3.7 and then optimized using the Optuna package [49]. The optimized results of the different models are listed in Table 4.

The MAEs and $R^2$ values of the best-trained models are shown in Fig. S10 in Appendix A. It should be noted that each model is tested once at the end, resulting in only one testing MAE or $R^2$ value, represented by green lines. As shown in Fig. S10, for the base cases of the FNN, LSTM, and attention-LSTM models, the MAEs and $R^2$ values on the testing dataset differ greatly from those on the training and validation datasets, indicating that these models are overfitting. The CNN, CNN-2D, and LACG provide the best modeling performances. For a more intuitive comparison, the validation MAEs and $R^2$ values of all models are summarized in Fig. S11 in Appendix A with logarithmic horizontal axes, and the final testing MAEs and $R^2$ values are marked as triangles on the right vertical axes using the same colors as the corresponding validation curves.

Moreover, all the MAEs and $R^2$ values of the flowrate models are listed in Table 5, together with a performance ranking based on the testing MAEs. It can be seen that the developed LACG model outperforms the other models, with the smallest MAE of 0.1370 and the highest $R^2$ of 0.9697.

As a result, the simulated discharge flowrate of the LACG tracks the actual DCS data more closely than those of the baseline models, as can be seen in Fig. 6. Parts of the discharge flowrate data are shown in two separate subfigures for clarity.

To better evaluate the process modeling performance of the proposed LACG, the testing results are restored to the values before DCS data standardization. The true MAE and mean relative error (MRE) are 722.29 kg·h⁻¹ and 1.41%, respectively, with a flowrate average of 5.13 × 10⁴ kg·h⁻¹. Generally speaking, the proposed LACG architecture is highly accurate, with a strong generalization ability. It is thereby expected to be a practical and reliable method for actual chemical process modeling.

To reveal the potential of the proposed LACG in high-throughput prediction and optimization, the computational costs of the models built are reported in Table 6. It should be noted that the hyperparameter optimization and the model training process are coupled, so a total optimization/training cost is provided. According to Table 6, the optimization/training cost of the proposed LACG is the largest, which is attributed to the specially designed three-step hyperparameter optimization process. Nonetheless, the prediction cost of the LACG is similar to those of traditional neural networks. Compared with the CNN-2D and attention-LSTM, the LACG takes about 0.02 s longer to perform the prediction. For a single sample, only 0.0261 ms is required by the new architecture. This means that the proposed LACG is capable of high-throughput prediction.

5.2. Model interpretability implied in the network parameters of the proposed LACG

As mentioned earlier, the kernel of the first convolution layer of the LACG is expected to indicate how the output flowrate responds to the input features. To verify the interpretability of the proposed LACG, comparisons were made between this convolution kernel and the theoretical variable finite impulse responses derived from a dynamic deethanization model, which was built using Aspen Plus Dynamics V12.1 and is shown in Fig. 7.

In each dynamic simulation, the chosen process variable is disturbed by a one-time-step (1 min) increase at the beginning, and then the discharge flowrate response is recorded. Four important process variables, namely, the feed flowrate, feed temperature, steam flowrate, and reflux flowrate, are chosen as the disturbance sources. The disturbance amplitudes are set according to the average fluctuation amplitudes derived from the DCS data, which are 4% for the feed flowrate, 3% for the feed temperature, 3% for the steam flowrate, and 0.8% for the reflux flowrate.

The bottom flowrate response curves are displayed in Fig. 8 along with the convolution kernel curves extracted from the developed LACG model. The original convolution kernel curves and the corresponding noise-filtered curves using moving average filtering are shown together, in blue and red, respectively. The length of the moving window is set to 3.
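The noise filtering applied to the kernel curves is a plain moving average with a three-point window; a one-line NumPy sketch (edge handling is an assumption) is:

```python
import numpy as np

def moving_average(curve, window=3):
    """Moving-average filter applied to the convolution kernel curves (Fig. 8)."""
    return np.convolve(curve, np.ones(window) / window, mode="same")
```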

It can be seen that the convolution kernel curves extracted from the developed LACG model are very similar to the theoretical responses simulated by Aspen Plus Dynamics V12.1, especially those of the feed, steam, and reflux flowrates. Because the dynamic model is built according to the basic design conditions of the deethanization process, it inevitably suffers from a mismatch with the real situation. In other words, the convolution kernel curves learned from the actual DCS data are more likely than the Aspen Plus Dynamics simulation results to reflect the real variable interactions. The convolution kernel curves of different tray temperatures are collected in Fig. S12 in Appendix A, and their similar trends also verify the reliable process knowledge discovery ability of the proposed LACG.

An overview of the 21 input variable convolution kernels of the LACG model is shown using a heat map in Fig. 9. The variables with a significant impact on the discharge flowrate are marked on the right of the figure, showing excellent agreement with expert knowledge of chemical engineering, such as the feed flowrate and sump level. The detailed convolution kernel curves for the 21 input variables are provided in Fig. S13 in Appendix A. On the whole, the proposed LACG architecture has successfully learned the process dynamic behaviors implied in the DCS data. The interpretability of the LACG architecture not only assists in obtaining a deep understanding of actual chemical processes but also ensures model reliability in further applications.

As a widely used explainable ML model, the attention-LSTM describes the variable relationships in a different way from the LACG, that is, through the attention vector. For a clear comparison, the attention vector of the attention-LSTM simulation result based on a random input sample is shown in Fig. 10(a). It reveals only the overall importance of the input at each time step to the output, without the detailed behavior of every input feature; moreover, the attention weights are all positive, so no direction of variable correlation can be observed. In contrast, far more details about the variable interactions of the chemical process at every time step are illustrated by the proposed LACG architecture. Generally speaking, the attention weights of the traditional attention-LSTM are expected to be similar to the kernel values of the LACG along the time axis, as they both reveal the weights of the inputs at different time steps. Therefore, to better compare the two distinct interpretability mechanisms, the LACG kernel weights are summed along the time axis to present the temporal influence of all the input features on the output, as shown in Fig. 10(b). In the figure, the two distributions of the weights on the time axis are quite different. Notably, in this industrial case, the performance of the traditional attention-LSTM on the testing dataset is not as good as that of the LACG, since it captures fewer underlying data patterns. Thus, it is inferred that the attention vector of the traditional attention-LSTM reveals less interpretability, thereby differing from the convolution kernel of the LACG, which is more consistent with the disturbance-response pattern over time.

6. Conclusions

A novel interpretable LACG architecture was developed in this contribution with the aim of accurately modeling actual chemical processes. Its three designed sub-modules (namely, the basic module, the light attention module, and the residue module) make the new architecture a highly reliable and robust data-driven approach with strong interpretability.

Compared with the five baseline neural networks (i.e., the FNN, CNN, CNN-2D, LSTM, and attention-LSTM), the proposed LACG provided the best performance in modeling the discharge flowrate of an actual deethanization process. The true prediction MRE of the LACG flowrate model was approximately 1.4%. Furthermore, compared with the simulation results of a dynamic deethanization model built using Aspen Plus Dynamics V12.1, the LACG convolution kernel curves of the feed flowrate, feed temperature, steam flowrate, and reflux flowrate were similar to the corresponding theoretical response curves of the discharge flowrate. These results demonstrate that the proposed LACG architecture accurately captures the actual process dynamic behaviors. The interpretability implied in the LACG parameters is helpful in understanding the complex variable interactions in the actual chemical process.

The proposed LACG architecture enriches the existing interpretable ML knowledge, providing a reliable method with high accuracy for actual chemical process analysis and modeling. This is a landmark study in that it enables industrial data to "speak," paving the way for intelligent manufacturing. Nevertheless, it should be noted that the general dynamic behaviors learned by the LACG were still affected by data noise. This issue may be alleviated by more sophisticated data preprocessing.

Acknowledgments

We acknowledge the financial support provided by the National Natural Science Foundation of China (22122802, 22278044, and 21878028), the Chongqing Science Fund for Distinguished Young Scholars (CSTB2022NSCQ-JQX0021), and the Fundamental Research Funds for the Central Universities (2022CDJXY-003).

Compliance with ethics guidelines

Yue Li, Ning Li, Jingzheng Ren, and Weifeng Shen declare that they have no conflict of interest or financial conflicts to disclose.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.eng.2024.07.009.

References

[1] J. Sansana, M.N. Joswiak, I. Castillo, Z. Wang, R. Rendall, L.H. Chiang, et al. Recent trends on hybrid modeling for Industry 4.0. Comput Chem Eng, 151 (2021), Article 107365.
[2] D. Solle, B. Hitzmann, C. Herwig, M. Pereira Remelhe, S. Ulonska, L. Wuerth, et al. Between the poles of data-driven and mechanistic modeling for process operation. Chem Ing Tech, 89 (5) (2017), pp. 542-561.
[3] N. Meneghetti, P. Facco, F. Bezzo, M. Barolo. A methodology to diagnose process/model mismatch in first-principles models. Ind Eng Chem Res, 53 (36) (2014), pp. 14002-14013.
[4] Y. Chen, M. Ierapetritou. A framework of hybrid model development with identification of plant-model mismatch. AIChE J, 66 (10) (2020), p. e16996.
[5] J. Panerati, M.A. Schnellmann, C. Patience, G. Beltrame, G.S. Patience. Experimental methods in chemical engineering: artificial neural networks—ANNs. Can J Chem Eng, 97 (9) (2019), pp. 2372-2382.
[6] M.R. Dobbelaere, P.P. Plehiers, R. Van de Vijver, C.V. Stevens, K.M. Van Geem. Machine learning in chemical engineering: strengths, weaknesses, opportunities, and threats. Engineering, 7 (9) (2021), pp. 1201-1211.
[7] K.T. Leperi, D. Yancy-Caballero, R.Q. Snurr, F. You. 110th anniversary: surrogate models based on artificial neural networks to simulate and optimize pressure swing adsorption cycles for CO2 capture. Ind Eng Chem Res, 58 (39) (2019), pp. 18241-18252.
[8] H. Fang, J. Zhou, Z. Wang, Z. Qiu, Y. Sun, Y. Lin, et al. Hybrid method integrating machine learning and particle swarm optimization for smart chemical process operations. Front Chem Sci Eng, 16 (2) (2022), pp. 274-287.
[9] S. Jiang, V.M. Zavala. Convolutional neural nets in chemical engineering: foundations, computations, and applications. AIChE J, 67 (9) (2021), p. e17282.
[10] Z. Wu, A. Tran, D. Rincon, P.D. Christofides. Machine learning-based predictive control of nonlinear processes. Part I: theory. AIChE J, 65 (11) (2019), p. e16729.
[11] C. Ning, F. You. Optimization under uncertainty in the era of big data and deep learning: when machine learning meets mathematical programming. Comput Chem Eng, 125 (2019), pp. 434-448.
[12] Y. Su, Z. Wang, S. Jin, W. Shen, J. Ren, M.R. Eden. An architecture of deep learning in QSPR modeling for the prediction of critical properties using molecular signatures. AIChE J, 65 (9) (2019), p. e16678.
[13] J. Zhang, Q. Wang, Y. Su, S. Jin, J. Ren, M. Eden, et al. An accurate and interpretable deep learning model for environmental properties prediction using hybrid molecular representations. AIChE J, 68 (6) (2022), p. e17634.
[14] H. Wu, Y. Han, J. Jin, Z. Geng. Novel deep learning based on data fusion integrating correlation analysis for soft sensor modeling. Ind Eng Chem Res, 60 (27) (2021), pp. 10001-10010.
[15] L. Sun, T. Liu, Y. Xie, D. Zhang, X. Xia. Real-time power prediction approach for turbine using deep learning techniques. Energy, 233 (2021), Article 121130.
[16] M. Byun, H. Lee, C. Choe, S. Cheon, H. Lim. Machine learning based predictive model for methanol steam reforming with technical, environmental, and economic perspectives. Chem Eng J, 426 (15) (2021), Article 131639.
[17] Naito S, Taguchi Y, Nakata K, Kato Y. Anomaly detection for multivariate time series on large-scale fluid handling plant using two-stage autoencoder. In: Xue B, Pechenizkiy M, Koh YS, editors. Proceedings of the 21st IEEE International Conference on Data Mining Workshops; 2021 Dec 7-10; online conference. Piscataway: IEEE; 2021. p. 542-51.
[18] M.L. Thompson, M.A. Kramer. Modeling chemical processes using prior knowledge and neural networks. AIChE J, 40 (8) (1994), pp. 1328-1340.
[19] M. von Stosch, R. Oliveira, J. Peres, S. Feyo de Azevedo. Hybrid semi-parametric modeling in process systems engineering: past, present and future. Comput Chem Eng, 60 (2014), pp. 86-101.
[20] S. Zendehboudi, N. Rezaei, A. Lohi. Applications of hybrid models in chemical, petroleum, and energy systems: a systematic review. Appl Energy, 228 (2018), pp. 2539-2566.
[21] C. Valencia Peroni, M. Parisi, A. Chianese. Hybrid modelling and self-learning system for dextrose crystallization process. Chem Eng Res Des, 88 (12) (2010), pp. 1653-1658.
[22] K. Wang, C. Shang, L. Liu, Y. Jiang, D. Huang, F. Yang. Dynamic soft sensor development based on convolutional neural networks. Ind Eng Chem Res, 58 (26) (2019), pp. 11521-11531.
[23] Y. Zheng, Z. Wu. Physics-informed online machine learning and predictive control of nonlinear processes with parameter uncertainty. Ind Eng Chem Res, 62 (6) (2023), pp. 2804-2818.
[24] G. Wu, W.T.G. Yion, K.L.N.Q. Dang, Z. Wu. Physics-informed machine learning for MPC: application to a batch crystallization process. Chem Eng Res Des, 192 (2023), pp. 556-569.
[25] A.P. Shaha, M.S. Singamsetti, B.K. Tripathy, G. Srivastava, M. Bilal, L. Nkenyereye. Performance prediction and interpretation of a refuse plastic fuel fired boiler. IEEE Access, 8 (2020), pp. 117467-117482.
[26] Ribeiro MT, Singh S, Guestrin C. "Why should I trust you?": explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016 Aug 13-17; San Francisco, CA, USA. New York City: Association for Computing Machinery; 2016. p. 1135-44.
[27] K. Mu, L. Luo, Q. Wang, F. Mao. Industrial process monitoring and fault diagnosis based on temporal attention augmented deep network. J Inf Process Syst, 17 (2) (2021), pp. 242-252.
[28] B. Xu, Y. Wang, L. Yuan, C. Xu. A novel second-order learning algorithm based attention-LSTM model for dynamic chemical process modeling. Appl Intell, 53 (2023), pp. 1619-1639.
[29] B.R. Bakshi, G. Stephanopoulos. Representation of process trends—IV. Induction of real-time patterns from operating data for diagnosis and supervisory control. Comput Chem Eng, 18 (4) (1994), pp. 303-332.
[30] A. Ragab, M. El-Koujok, B. Poulin, M. Amazouz, S. Yacout. Fault diagnosis in industrial chemical processes using interpretable patterns based on logical analysis of data. Expert Syst Appl, 95 (2018), pp. 368-383.
[31] A. Ragab, M. El-Koujok, H. Ghezzaz, M. Amazouz, M.S. Ouali, S. Yacout. Deep understanding in industrial processes by complementing human expertise with interpretable patterns of machine learning. Expert Syst Appl, 122 (2019), pp. 388-405.
[32] Z. Wu, D. Rincon, P.D. Christofides. Process structure-based recurrent neural network modeling for model predictive control of nonlinear processes. J Process Contr, 89 (2020), pp. 74-84.
[33] A. Chakraborty, A. Sivaram, L. Samavedham, V. Venkatasubramanian. Mechanism discovery and model identification using genetic feature extraction and statistical testing. Comput Chem Eng, 140 (2020), Article 106900.
[34] H. Narayanan, M.N. Cruz Bournazou, G. Guillén Gosálbez, A. Butté. Functional-hybrid modeling through automated adaptive symbolic regression for interpretable mathematical expressions. Chem Eng J, 430 (4) (2022), Article 133032.
[35] R. Subramanian, R.R. Moar, S. Singh. White-box machine learning approaches to identify governing equations for overall dynamics of manufacturing systems: a case study on distillation column. Mach Learn Appl, 3 (2021), Article 100014.
[36] Y. Li, T. Zhang, S. Sun. Acceleration of the NVT flash calculation for multicomponent mixtures using deep neural network models. Ind Eng Chem Res, 58 (27) (2019), pp. 12312-12322.
[37] H. Wu, J. Zhao. Self-adaptive deep learning for multimode process monitoring. Comput Chem Eng, 141 (2020), Article 107024.
[38] Y. Bengio, P. Simard, P. Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Trans Neural Netw, 5 (2) (1994), pp. 157-166.
[39] S. Hochreiter, J. Schmidhuber. Long short-term memory. Neural Comput, 9 (8) (1997), pp. 1735-1780.
[40] F.A. Gers, J. Schmidhuber, F. Cummins. Learning to forget: continual prediction with LSTM. Neural Comput, 12 (10) (2000), pp. 2451-2471.
[41] Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, Schwenk H, et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Moschitti A, Pang B, Daelemans W, editors. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing; 2014 Oct 25-29; Doha, Qatar. Stroudsburg: Association for Computational Linguistics; 2014. p. 1724-34.
[42] Chung J, Gulcehre C, Cho KH, Bengio Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. 2014. arXiv:1412.3555.
[43] Cho K, van Merriënboer B, Bahdanau D, Bengio Y. On the properties of neural machine translation: encoder-decoder approaches. 2014. arXiv:1409.1259.
[44] Bahdanau D, Cho K, Bengio Y. Neural machine translation by jointly learning to align and translate. 2014. arXiv:1409.0473.
[45] Luong T, Pham H, Manning CD. Effective approaches to attention-based neural machine translation. In: Màrquez L, Callison-Burch C, Su J, editors. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing; 2015 Sep 17-21; Lisbon, Portugal. Red Hook: Curran Associates; 2015. p. 1412-21.
[46] Y. LeCun, L. Bottou, Y. Bengio, P. Haffner. Gradient-based learning applied to document recognition. Proc IEEE, 86 (11) (1998), pp. 2278-2324.
[47] A. Krizhevsky, I. Sutskever, G.E. Hinton. ImageNet classification with deep convolutional neural networks. In: P. Bartlett, F. Pereira, C.J. Burges, L. Bottou, K.Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 26, Curran Associates, Red Hook (2013), pp. 1097-1105.
[48] P.S. Pravin, J.Z.M. Tan, K.S. Yap, Z. Wu. Hyperparameter optimization strategies for machine learning-based stochastic energy efficient scheduling in cyber-physical production systems. Digit Chem Eng, 4 (2022), Article 100047.
[49] Akiba T, Sano S, Yanase T, Ohta T, Koyama M. Optuna: a next-generation hyperparameter optimization framework. In: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2019 Aug 4-8; Anchorage, AK, USA. New York City: Association for Computing Machinery; 2019. p. 2623-31.
[50] J. Bergstra, R. Bardenet, Y. Bengio, B. Kégl. Algorithms for hyper-parameter optimization. In: J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira, K.Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 24, Curran Associates, Red Hook (2011), pp. 2546-2554.
[51] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. In: Teh YW, Titterington M, editors. Proceedings of the 13th International Conference on Artificial Intelligence and Statistics; 2010 May 13-15; Sardinia, Italy; 2010. p. 249-56.
[52] A. Savitzky, M.J.E. Golay. Smoothing and differentiation of data by simplified least squares procedures. Anal Chem, 36 (8) (1964), pp. 1627-1639.
[53] J.P. Poort, M. Ramdin, J. van Kranendonk, T.J.H. Vlugt. Solving vapor-liquid flash problems using artificial neural networks. Fluid Phase Equilib, 490 (2019), pp. 39-47.
