《1. Introduction》

1. Introduction

Laser welding, especially high-power laser welding, has been widely applied in industries such as automobile manufacturing, aerospace manufacturing, and shipbuilding [1–3]. Blowouts, humping, and undercutting are typical defects that greatly reduce the strength of the joint and limit welding efficiency. Comprehensively characterizing the high-power disc laser welding process remains a major challenge, yet doing so is crucial for detecting welding defects and realizing online monitoring of the welding status.

During laser welding, the material is rapidly heated and vaporized by the high energy density of the laser beam [4]. A keyhole forms in the molten pool beneath the laser beam due to the recoil pressure induced by vaporization, the Marangoni force, the gravity of the liquid material, and the buoyancy force [5]. The keyhole enhances the absorption of laser energy by the material through multiple reflections of the laser beam inside the keyhole [6]. Meanwhile, a metal plume induced by the high-energy-density laser beam appears in and above the keyhole [7]. The plume scatters and reflects the laser beam, and further affects the dynamics of the keyhole [8]. Spatters are generated as a result of the recoil pressure induced by the drastic vaporization [9,10]. The spatters disturb the dynamics of the molten pool and keyhole, as they carry off part of the kinetic energy of the molten pool. These studies have revealed that the keyhole, plume, and spatters are the most important phenomena during the welding process, and that their real-time features can be used to depict the welding status.

A great number of studies have applied visual sensing methods to reveal the mechanism of laser welding [11–13], as visual sensors provide high-dimensional insights into the spatters, keyhole, and plume. Photodiode sensors and spectrometers have also been utilized to monitor this industrial manufacturing process [14,15] due to their low equipment cost and simple setup. Unfortunately, in these studies, either only a single sensor was applied to observe the welding process, or the captured signals were not related to the welding status in a quantitative way. Recently, many machine learning methods, such as multiple linear regression (MLR) [16,17], support vector machines (SVMs) [18], and neural networks (NNs) [19,20], have been employed in modeling and pattern-recognition problems such as statistical parametric speech synthesis, speech emotion recognition, and product manufacturing process monitoring [21,22]. However, MLR is limited in fitting the highly nonlinear features of the welding process because of its linear form. The mapping ability of an SVM depends on its predefined kernel function and may be insufficient for representing multiple-sensor signals from laser welding. The NN method also has an intrinsic limitation: it is easily trapped in local optima and has difficulty finding the globally optimal solution. Furthermore, these methods are shallow models with one or no hidden layers and cannot be used to explore effective representations of highly correlated multiple-optical-sensor signals. Therefore, this research introduces a deep learning method based on a deep belief network (DBN) model to address this challenge.

This research aims to establish a multiple-optical-sensor system to obtain comprehensive insights into the high-power disc laser welding process. A deep learning model based on a DBN is established to find globally optimal results for monitoring the welding status with the signals captured by the multiple-optical-sensor system. The remainder of this research is organized as follows. The experimental setup is introduced in Section 2, and the feature extraction of the multiple optical signals is described in Section 3. Section 4 describes the architecture of the DBN model and the preparation of the training and verification sets, and provides a performance comparison between the established DBN model and a traditional back-propagation neural network (BPNN) model. In Section 5, the generalization ability and effectiveness of the established DBN model are verified by three additional experiments with different welding parameters. Section 6 concludes this research.

《2. Experimental setup》

2. Experimental setup

Fig. 1 depicts the experimental setup of this research. Four optical-sensor systems, including an auxiliary illumination visual sensor system, an ultraviolet/visible (UVV) band visual sensor system, a spectrometer, and a photodiode are applied to capture the signals of the welding process. The welding material in this research is 304 stainless steel. The dimensions of the workpiece are 150 mm in length, 10 mm in width, and 50 mm in thickness.

《Fig. 1》

Fig. 1. Illustration of the experimental setup.

Optical signals from the welding area are acquired by the photodiode sensor. A beam splitter pre-equipped in the laser head helps to collect and transmit these signals through an optical fiber, as shown in Fig. 1. The photodiode unit receives these signals and divides them into the reflected laser light optical signal (wavelength 1030 nm) and the visible light optical signal by means of a dichroic mirror; both kinds of signal are amplified and transmitted to the oscilloscope. The sampling rates of the two kinds of signals are set to 500 kHz in order to capture the detailed optical features of the welding process with high temporal resolution.

A spectrometer is applied to collect the spectral signal (wavelength from 186 to 1100 nm) from the welding area during laser welding. As shown in Fig. 1, the spectral signals are captured by a collimator and then transmitted to the spectrometer via an optical fiber. Previous research has shown that spectral signals with wavelengths from 400 to 900 nm contain the most important insights into the solid-state laser welding process. Therefore, the signals within this wavelength range are selected for extracting the features used in online monitoring of the welding status. The sampling rate of the spectrometer is set to 500 Hz.

Two high-speed visual imaging systems, including a UVV band visual sensor system (wavelength greater than 390 nm) and an auxiliary illumination visual sensor system, are applied to obtain the features of the keyhole, plume, and spatters. The UVV band visual sensor system consists of a UVV filter and a high-speed camera. From the captured images, the visual features of the plume and spatters can be extracted by means of digital image processing. A 40 W auxiliary light source is employed to produce laser light (wavelength 976 nm) to illuminate the welding area, and the auxiliary illumination visual sensor system, coupled with a filter that only permits laser light with a wavelength of 976 nm to pass through, captures the visual features of the keyhole. The sampling rates of the two visual imaging systems are both set to 5000 frame·s−1.

《3. Feature extraction of multiple-sensor signals》

3. Feature extraction of multiple-sensor signals

《3.1. Feature extraction from the auxiliary illumination visual sensor system》

3.1. Feature extraction from the auxiliary illumination visual sensor system

Three images captured by the auxiliary illumination visual sensor system are shown in Fig. 2. The keyhole features, including the size and position of the keyhole, are calculated and quantified using crop and binarization operations in digital image processing. Fig. 2 shows that the keyhole size and position fluctuate at different moments. The feature vector extracted from the signals of the auxiliary illumination visual sensor system is expressed in Eq. (1), where keyhole_position denotes the keyhole position and keyhole_size denotes the keyhole size.

《Fig. 2》

Fig. 2. Three sequential images captured by the auxiliary illumination visual sensor system and their feature extraction. (a) Original image; (b) region of interest (ROI); (c) binarization; (d) keyhole.
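To make the extraction concrete, the following is a minimal sketch of how the keyhole size and position could be obtained from one auxiliary illumination frame using crop and binarization operations. It assumes OpenCV and NumPy; the ROI coordinates, the threshold value, and the function name extract_keyhole_features are illustrative choices, not the exact pipeline used in this research.

```python
import cv2
import numpy as np

def extract_keyhole_features(frame, roi=(100, 100, 200, 200), thresh=200):
    """Estimate keyhole size (pixel area) and position (centroid) from one frame.

    `roi` is (x, y, width, height) around the laser interaction zone and
    `thresh` is a binarization threshold; both are illustrative values that
    would have to be calibrated for the actual camera setup.
    """
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) if frame.ndim == 3 else frame
    x, y, w, h = roi
    crop = gray[y:y + h, x:x + w]                                    # crop the ROI
    _, binary = cv2.threshold(crop, thresh, 255, cv2.THRESH_BINARY)  # binarize

    ys, xs = np.nonzero(binary)                                      # keyhole pixels
    keyhole_size = int(xs.size)                                      # area in pixels
    if keyhole_size == 0:
        return 0, (float("nan"), float("nan"))
    keyhole_position = (float(xs.mean()), float(ys.mean()))          # centroid within the ROI
    return keyhole_size, keyhole_position
```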

《3.2. Feature extraction from the UVV band visual sensor system》

3.2. Feature extraction from the UVV band visual sensor system

Fig. 3 shows two images acquired by the UVV band visual sensor system, from which the features of the plume are extracted. The volume of the plume is calculated as the number of pixels occupied by the plume, as shown in Fig. 3. The tilted degree of the plume is defined as the angle of the plume centroid with respect to the vertical axis in the image coordinate system; this feature indicates the plume direction, which can also be regarded as the direction of the keyhole opening. The features of the spatters are quantified according to their flying direction, and the numbers of spatters flying forward and backward are counted using digital image processing operations.

《Fig. 3》

Fig. 3. Images captured by the UVV band visual sensor system and their feature extraction process.

A total of four features are collected from the UVV band visual sensor system; these features form the feature vector expressed in Eq. (2), where spatter_front denotes the number of forward spatters, spatter_back denotes the number of backward spatters, plume_volume denotes the volume of the plume, and plume_degree denotes the tilted degree of the plume.
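As an illustration, the following sketch derives these four features from one UVV band frame by thresholding, connected-component labeling, and a simple size rule for separating the plume from spatters. The threshold, the area limit, the assumed grayscale input, and the function name extract_plume_spatter_features are assumptions for demonstration, not the exact procedure of this research.

```python
import cv2
import numpy as np

def extract_plume_spatter_features(uvv_frame, keyhole_xy, thresh=180, max_spatter_area=50):
    """Compute [spatter_front, spatter_back, plume_volume, plume_degree] from one
    grayscale UVV image; all numeric settings here are illustrative."""
    _, binary = cv2.threshold(uvv_frame, thresh, 255, cv2.THRESH_BINARY)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary.astype(np.uint8))

    plume_volume, plume_degree = 0, 0.0
    spatter_front = spatter_back = 0
    kx, ky = keyhole_xy                                    # keyhole position in image coordinates
    for i in range(1, n):                                  # label 0 is the background
        area = int(stats[i, cv2.CC_STAT_AREA])
        cx, cy = centroids[i]
        if area > max_spatter_area:                        # large bright region: the plume
            plume_volume = area                            # plume volume as a pixel count
            # tilt angle between the keyhole-to-centroid line and the vertical axis
            plume_degree = float(np.degrees(np.arctan2(cx - kx, ky - cy)))
        elif cx >= kx:                                     # small bright blobs: spatters
            spatter_front += 1                             # flying toward the welding direction
        else:
            spatter_back += 1
    return [spatter_front, spatter_back, plume_volume, plume_degree]
```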

《3.3. Feature extraction of two signals captured by the photodiode sensor》

3.3. Feature extraction of two signals captured by the photodiode sensor

The visible light and reflected laser light signals captured by the photodiode are analyzed using the wavelet packet decomposition (WPD) method. WPD is achieved by applying both low-pass and high-pass filters to calculate the approximation and detail coefficients. The wavelet packet function is described by Eq. (3), which is parameterized by the scale coordinate, the location coordinate, the modulation coordinate, and the sequence number, with the indices taken over the set of integers. Daubechies wavelets (db10) are used as the wavelet function.

The first two decomposed signals in the first layer of WPD can be expressed by Eq. (4).

The functions of the high-pass filter and the low-pass filter are defined by Eq. (5).

In this way, the function of WPD for n > 1 can be described by Eqs. (6) and (7).

For the photodiode signal, the corresponding WPD coefficients can be expressed by Eq. (8).

The features of the WPD coefficients related to both time and frequency are calculated in Eqs. (9)–(18), where K is the number of WPD coefficients, E denotes the mathematical expectation operator, ω is the angular frequency vector, and i is the imaginary unit.

In Eqs. (16) and (17), the Fourier transform of the WPD coefficients is used.

In this research, the visible light optical signal captured by the photodiode is decomposed into 16 frequency sub-bands according to the WPD method. The WPD coefficients of each decomposed sub-band signal are obtained, and 10 statistical features of each sub-band are calculated according to Eqs. (9)–(18). Considering all the decomposed sub-band signals, the feature vector of the visible light signal can be expressed by Eq. (19), where each element denotes one feature extracted from one sub-band.

The feature vector extracted from the reflected laser light signal is expressed by Eq. (20) in the same manner.
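The following sketch illustrates this sub-band feature extraction using the PyWavelets package: a four-level WPD with db10 yields 16 sub-bands, and ten statistics are computed per sub-band. The specific statistics listed here are plausible stand-ins rather than a reproduction of Eqs. (9)–(18), and the function name wpd_subband_features is an assumption.

```python
import numpy as np
import pywt
from scipy import stats

def wpd_subband_features(signal, wavelet="db10", level=4):
    """Decompose one photodiode segment into 2**level (= 16) sub-bands by WPD and
    compute ten time- and frequency-domain statistics per sub-band (illustrative)."""
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, mode="symmetric", maxlevel=level)
    features = []
    for node in wp.get_level(level, order="freq"):         # 16 sub-band coefficient arrays
        c = np.asarray(node.data, dtype=float)
        spectrum = np.abs(np.fft.rfft(c))                  # Fourier transform of the coefficients
        features.extend([
            c.mean(), c.std(), stats.skew(c), stats.kurtosis(c),
            np.sqrt(np.mean(c ** 2)),                      # root mean square (energy-related)
            np.max(np.abs(c)),                             # peak amplitude
            spectrum.mean(), spectrum.std(),               # frequency-domain statistics
            float(spectrum.argmax()),                      # dominant frequency bin
            float(np.sum(spectrum ** 2)),                  # spectral energy
        ])
    return np.array(features)                              # 16 sub-bands x 10 features = 160
```

Applying such a function to both the visible light and reflected laser light signals would yield the 320 photodiode-derived features mentioned in Section 4.2.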

《3.4. Feature extraction of the signal captured by the spectrometer》

3.4. Feature extraction of the signal captured by the spectrometer

The selected spectral signals with wavelengths from 400 to 900 nm captured by the spectrometer are divided into 25 sub-bands, with each sub-band covering 20 nm. The mean value of the intensity in each sub-band is calculated as the feature of that sub-band, as expressed by Eq. (21), where N denotes the index of the sub-band, the summation runs from the start spectral index to the terminal spectral index of the Nth sub-band, I denotes the spectral intensity, and spectrum_N denotes the calculated mean intensity in the Nth sub-band.

For each sample, a total of 25 features are obtained from 25 corresponding sub-bands. The feature vector extracted from the spectrometer can be expressed by Eq. (22).
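A minimal sketch of the sub-band averaging in Eq. (21) is given below, assuming the spectrometer output is available as paired wavelength and intensity arrays; the argument names are illustrative.

```python
import numpy as np

def spectral_subband_means(wavelengths_nm, intensities, start=400.0, stop=900.0, width=20.0):
    """Mean spectral intensity in 25 contiguous 20 nm sub-bands between 400 and 900 nm."""
    features = []
    for lo in np.arange(start, stop, width):                # 25 sub-band lower bounds
        mask = (wavelengths_nm >= lo) & (wavelengths_nm < lo + width)
        features.append(float(intensities[mask].mean()) if mask.any() else 0.0)
    return np.array(features)                                # the feature vector of Eq. (22)
```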

《4. Architecture and application of DBN》

4. Architecture and application of DBN

《4.1. Framework of DBN》

4.1. Framework of DBN

A DBN model consists of as many hidden layers as the target problem requires, with each hidden layer being formed by a restricted Boltzmann machine (RBM). The DBN not only possesses the advantages of conventional NNs, but also has a strong ability to fuse multiple-sensor information due to its deep architecture [23–26]. The globally optimal parameters of a DBN model are determined through a two-step training algorithm, namely pre-training and fine-tuning. Recently, DBN models have been widely employed in signal processing and machine learning applications such as voice activity detection [27], acoustic modeling [28], and face recognition [29].

A typical RBM model has two layers, as shown in Fig. 4; the bottom layer is called the visible layer, and the top layer is called the hidden layer. The RBM model can be considered a special Markov random field model. All neurons in the visible layer are fully connected to the units in the hidden layer by bidirectional weights w_pq, where p denotes the index of a neuron in the hidden layer and q denotes the index of a neuron in the visible layer.

《Fig. 4》

Fig. 4. The structure of an RBM with f hidden and m visible neurons.

The energy function of an RBM is expressed by Eq. (23), where θ denotes the collection of parameters in the RBM, w_pq is the bidirectional weight between visible neuron q and hidden neuron p, and a_q and b_p are the bias terms of the corresponding neurons in the visible and hidden layers, respectively. The probabilities of each neuron in the visible and hidden layers can be calculated via Eqs. (24) and (25), respectively.

In Eqs. (24) and (25), Z denotes the normalization factor (the partition function) expressed by Eq. (26).

Since an RBM prohibits any connections between neurons in the same layer, the conditional probability distributions of the hidden layer given the visible layer and of the visible layer given the hidden layer can be calculated as products of Bernoulli distributions, as expressed in Eqs. (27) and (28), where σ(·) is the sigmoid activation function applied to the input value of each neuron. A contrastive divergence sampling algorithm is applied to update the model parameters w, a, and b in Eqs. (27) and (28).

The output vector of the hidden units can be calculated according to the forward propagation algorithm with the real input data in the first visible layer; the output of the first hidden layer is then considered to be the input data for the second hidden layer.
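As a concrete illustration of Eqs. (23)–(28), the following is a minimal NumPy sketch of a Bernoulli RBM trained with one-step contrastive divergence (CD-1). The class name, initialization scale, and default hyperparameters are assumptions chosen for readability, not the exact implementation used in this research.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli restricted Boltzmann machine trained with CD-1 (a sketch)."""

    def __init__(self, n_visible, n_hidden, lr=0.05, momentum=0.8, seed=0):
        self.rng = np.random.default_rng(seed)
        self.w = 0.01 * self.rng.standard_normal((n_visible, n_hidden))  # bidirectional weights
        self.a = np.zeros(n_visible)                       # visible-layer biases
        self.b = np.zeros(n_hidden)                        # hidden-layer biases
        self.lr, self.momentum = lr, momentum
        self.dw = np.zeros_like(self.w)

    def hidden_prob(self, v):                              # P(h = 1 | v), cf. Eq. (27)
        return sigmoid(v @ self.w + self.b)

    def visible_prob(self, h):                             # P(v = 1 | h), cf. Eq. (28)
        return sigmoid(h @ self.w.T + self.a)

    def cd1_update(self, v0):
        """One contrastive-divergence step on a mini-batch v0 with values in [0, 1]."""
        ph0 = self.hidden_prob(v0)
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)   # sample hidden states
        v1 = self.visible_prob(h0)                               # reconstruction
        ph1 = self.hidden_prob(v1)
        grad = (v0.T @ ph0 - v1.T @ ph1) / len(v0)               # positive minus negative phase
        self.dw = self.momentum * self.dw + self.lr * grad
        self.w += self.dw
        self.a += self.lr * (v0 - v1).mean(axis=0)
        self.b += self.lr * (ph0 - ph1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))                    # reconstruction error
```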

At the top of the DBN, a classifier is employed for the purpose of classification. In this research, a softmax classifier, which can handle multi-class classification problems, is applied as the final layer attached to the DBN model, as expressed in Eq. (29), where the output denotes the probability of each classification category and the index runs over the number of categories. The softmax classifier can be considered to be made up of a number of logistic regression models.

A DBN model can be constructed by stacking a few RBMs layer by layer. In this research, a DBN model with three hidden layers is established; its structure is shown in Fig. 5. The training process of the DBN is conducted with the pre-training and fine-tuning steps.

(1) Pre-training step. The input data are directly transmitted to the neurons in the visible layer, and the output of RBM 1 is then calculated. RBM 1 is trained with all the training samples until the termination condition is fulfilled. The trained parameters of RBM 1 are then fixed, and the hidden layer of RBM 1 is treated as the visible layer for training RBM 2, using the same algorithm as for RBM 1, as shown in Fig. 5. The pre-training is unsupervised and stops once all the successive individual RBMs have been trained.

《Fig. 5》

Fig. 5. The structure of the DBN applied in this research.

(2) Fine-tuning step. The parameters in each RBM are updated and optimized by applying a back-propagation algorithm to reduce the overall error of the training samples and enhance the classification accuracy of the DBN model. All DBN layers are simultaneously fine-tuned in this process. The overall training error is generated by comparing the targets with the output of the DBN model. The supervised fine-tuning process iterates until the terminal condition of the DBN model is fulfilled.
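The two training steps can be sketched as follows, reusing the RBM class above: greedy layer-wise pre-training builds the 200-100-10 stack, and supervised fine-tuning back-propagates the softmax cross-entropy error through all layers. The epoch counts, the plain full-batch gradient-descent loop, and the function names pretrain_dbn and finetune_dbn are illustrative assumptions, not the configuration used in this research.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def pretrain_dbn(x, layer_sizes=(200, 100, 10), epochs=10, batch=50):
    """Greedy layer-wise pre-training: each RBM is trained on the hidden
    activations of the previous one (unsupervised)."""
    rbms, data = [], x
    for n_hidden in layer_sizes:
        rbm = RBM(data.shape[1], n_hidden)
        for _ in range(epochs):
            for i in range(0, len(data), batch):
                rbm.cd1_update(data[i:i + batch])
        data = rbm.hidden_prob(data)           # feed activations to the next RBM
        rbms.append(rbm)
    return rbms

def finetune_dbn(rbms, x, y_onehot, n_classes=4, lr=0.05, epochs=100):
    """Supervised fine-tuning of the whole stack plus a softmax output layer
    with plain full-batch back-propagation (a simplified sketch)."""
    rng = np.random.default_rng(0)
    weights = [(r.w.copy(), r.b.copy()) for r in rbms]
    w_out = 0.01 * rng.standard_normal((weights[-1][0].shape[1], n_classes))
    b_out = np.zeros(n_classes)
    for _ in range(epochs):
        acts = [x]                              # forward pass through sigmoid layers
        for w, b in weights:
            acts.append(1.0 / (1.0 + np.exp(-(acts[-1] @ w + b))))
        probs = softmax(acts[-1] @ w_out + b_out)           # Eq. (29)-style output
        delta = (probs - y_onehot) / len(x)                 # cross-entropy gradient
        grad_w_out, grad_b_out = acts[-1].T @ delta, delta.sum(axis=0)
        delta = (delta @ w_out.T) * acts[-1] * (1 - acts[-1])
        w_out -= lr * grad_w_out
        b_out -= lr * grad_b_out
        for k in range(len(weights) - 1, -1, -1):           # back-propagate through each layer
            w, b = weights[k]
            grad_w, grad_b = acts[k].T @ delta, delta.sum(axis=0)
            if k > 0:
                delta = (delta @ w.T) * acts[k] * (1 - acts[k])
            weights[k] = (w - lr * grad_w, b - lr * grad_b)
    return weights, (w_out, b_out)
```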

《4.2. Data preparation》

4.2. Data preparation

In this research, the spectral signal of a sample acquired by the spectrometer is divided into 25 sub-bands by wavelength. The mean value of each spectral sub-band is calculated, and 25 features are extracted in total. Both the visible light optical signal and the reflected laser light optical signal captured by the photodiode are decomposed into four levels by the WPD method; the frequency bands of interest are consequently divided into 16 sub-bands. As mentioned in Section 3.3, 10 different features are extracted from each sub-band. Therefore, 320 features in total are acquired from all sub-bands, considering both the visible light optical signal and the reflected laser light optical signal. The volume and tilted degree of the plume and the numbers of forward and backward spatters are extracted from the images captured by the UVV band visual sensor system, and the keyhole size and position features are acquired from the images of the auxiliary illumination visual sensor system.

The features calculated from each set of 1000 consecutive raw samples of the photodiode are compressed into one piece of sample data in order to synchronize them with the samples from the other sensors. For the auxiliary illumination visual sensor system and the UVV band visual sensor system, the average value of each set of 10 sequential samples is calculated as one piece of sample data. Therefore, the sampling rates of all the sensors in this research are synchronized at 500 Hz, which is the sampling rate of the spectrometer. Finally, a total of 351 features of the welding process are acquired. The sample values of each feature are normalized to 0–1 in order to ensure that each feature has the same weight despite the different scales, and thus to improve the accuracy and robustness of the DBN model. The normalization is expressed by Eq. (30), where x_norm is the normalized feature value, x is the original feature value, x_min denotes the minimum value of that feature over all samples, and x_max denotes the maximum value over all samples.
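A brief sketch of the synchronization and normalization steps is shown below; the function names and the assumption that per-frame camera features arrive as row vectors are illustrative.

```python
import numpy as np

def downsample_by_mean(samples, group):
    """Average each group of consecutive samples into one sample, e.g. group=10
    for the 5000 frame/s camera features so that they align with 500 Hz."""
    x = np.asarray(samples, dtype=float)
    n = len(x) // group * group
    return x[:n].reshape(-1, group, *x.shape[1:]).mean(axis=1)

def minmax_normalize(features):
    """Column-wise min-max normalization to [0, 1], as in Eq. (30)."""
    x = np.asarray(features, dtype=float)
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return (x - x_min) / np.where(x_max > x_min, x_max - x_min, 1.0)
```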

In this research, 21 different experiments with different welding parameters were conducted and 10 500 samples of welding process signals were collected. In this dataset, 7500 samples were applied as the training data, and 3000 samples were used as the testing data in the establishment of the DBN model.

《4.3. Definition of weld statuses》

4.3. Definition of weld statuses

The sound weld status and three typical defect statuses (blowouts, humping, and undercutting) were defined based on the international standard EN ISO 13919-1:1996 [30]. The weld status of all 10 500 samples in this research was thus manually classified into four categories. Category 1 denotes the sound weld status, Category 2 denotes the blowout status, Category 3 denotes the humping status, and Category 4 denotes the undercutting status. Examples of each category are shown in Fig. 6.

《Fig. 6》

Fig. 6. Definition of four kinds of welding status. (a) Sound weld; (b) blowout; (c) humping; (d) undercutting.

《4.4. Model verification of DBN and comparison with the BPNN model》

4.4. Model verification of DBN and comparison with the BPNN model

The 351 extracted features from one sample of the welding process are gathered together and directly transmitted to the first visible layer of the DBN. Three hidden layers are applied to reduce the dimensions of the input features; finally, the optimal representation of the original input features is acquired. The number of neurons in the first, second, and third hidden layer is 200, 100, and 10, respectively. The softmax layer has four neurons to calculate the four category classification results.
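Putting the earlier sketches together, the architecture described above could be instantiated roughly as follows; x_train (number of samples x 351) and y_train_onehot (number of samples x 4) are hypothetical, already-normalized arrays.

```python
# Illustrative use of the pretrain_dbn / finetune_dbn sketches with the
# 351-200-100-10-4 dimensions used in this section.
rbms = pretrain_dbn(x_train, layer_sizes=(200, 100, 10))
weights, softmax_layer = finetune_dbn(rbms, x_train, y_train_onehot, n_classes=4)
```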

A genetic algorithm is particularly suitable for solving constrained or unconstrained optimization problems. The genetic algorithm is implemented through a natural selection process that simulates the biological evolutionary process in the real world [20]. In this research, the learning rate, learning momentum, and training batch size of the DBN model are optimized by a genetic algorithm, which is applied to accelerate the training process and obtain the optimal hyperparameters for the DBN. The optimal values of the learning rate, learning momentum, and batch size are determined to be 0.05, 0.80, and 50, respectively.
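For illustration, a minimal real-coded genetic algorithm for this kind of hyperparameter search is sketched below. The selection, crossover, and mutation settings, the search bounds, and the placeholder fitness function (which would train a DBN with the candidate hyperparameters and return its validation accuracy) are all assumptions, not the configuration used in this research.

```python
import numpy as np

def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Search real-valued hyperparameters (e.g. learning rate, momentum, batch size)
    by maximizing `fitness(individual)`; `bounds` is a list of (low, high) pairs."""
    rng = np.random.default_rng(seed)
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    pop = lo + rng.random((pop_size, len(bounds))) * (hi - lo)      # initial population

    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        best = pop[np.argmax(scores)].copy()                        # elitism
        # tournament selection: keep the better of two random individuals
        idx = rng.integers(0, pop_size, (pop_size, 2))
        parents = pop[np.where(scores[idx[:, 0]] >= scores[idx[:, 1]], idx[:, 0], idx[:, 1])]
        # arithmetic crossover between consecutive parents
        children = parents.copy()
        for i in range(0, pop_size - 1, 2):
            if rng.random() < crossover_rate:
                alpha = rng.random()
                children[i] = alpha * parents[i] + (1 - alpha) * parents[i + 1]
                children[i + 1] = alpha * parents[i + 1] + (1 - alpha) * parents[i]
        # Gaussian mutation, clipped to the search bounds
        mutate = rng.random(children.shape) < mutation_rate
        children += mutate * rng.normal(0.0, 0.1, children.shape) * (hi - lo)
        pop = np.clip(children, lo, hi)
        pop[0] = best                                               # carry the best forward
    scores = np.array([fitness(ind) for ind in pop])
    return pop[np.argmax(scores)]

# Hypothetical call: evaluate_dbn would train a DBN with the candidate
# hyperparameters and return its validation accuracy.
# best = genetic_search(evaluate_dbn, [(0.001, 0.2), (0.5, 0.99), (10, 200)])
```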

A BPNN model with 351 nodes in the input layer, 200 nodes in the hidden layer, and four nodes in the output layer is also established for a comparison with the established DBN model. The monitoring results of the welding status with the BPNN and DBN models are tabulated in Table 1.

The established DBN model shows a higher average accuracy in welding status monitoring than the BPNN model in this research. Furthermore, the accuracy of each category is distributed more evenly in the established DBN model than in the BPNN model, as shown in Table 1. It is clear that the classification accuracy and robustness of the proposed DBN model are better than those of the BPNN model.

《Table 1》

Table 1 Comparison of the BPNN and DBN models.

《5. Results and discussion》

5. Results and discussion

Performance and generalization ability are important for deep learning models. In this research, three different experiments with different welding parameters are applied to validate the generalization ability and effectiveness of the established DBN model. The main welding parameters of these three experiments are listed in Table 2. The weld images, the captured original multiple-sensor signals, and the online monitoring results corresponding to the three experiments are shown in Figs. 7(a)–(c), respectively. In Fig. 7(a), the welding process is divided into two parts: in the left part, the welding process is stable and a good weld appearance is acquired; in the right part, blowouts occur and a poor weld appearance is observed. The variations of the visible light photodiode signal, reflected laser light photodiode signal, keyhole size, plume volume, and tilting degree of the plume are highly related to the quality of the weld. In the left part, the visible light signal, reflected laser light signal, keyhole size, and keyhole position are stable, but the volume and tilting degree of the plume show larger fluctuations.

《Table 2》

Table 2 The welding parameters of three different experiments.

In the right part, the above-mentioned signals present the opposite variation; that is, the volume and tilting degree of the plume become stable, while the visible light signal, reflected laser light signal, keyhole size, and keyhole position show higher fluctuation in comparison with the left part. Fig. 7(b) shows the humping status that occurred in this experiment. The variations in the visible light signal, keyhole size, plume volume, tilted degree of the plume, and forward spatters are highly related to the humping status. Similarly, the values of the spectrum corresponding to the humping parts (colored in red in the spectrum plot) are obviously larger than those of the other parts with a sound weld appearance. Fig. 7(c) shows the result of Experiment 3. The left part of Experiment 3 shows an undercutting status, and the right part shows a blowout status. In the undercutting period, only the signals of the backward spatters and the spectrum are stable; in the blowout part, the reflected laser light photodiode signal and the forward spatter signal become more stable, and the other signals show higher fluctuations than in the undercutting part. This analysis indicates that the relation between the welding status and these sensed signals is complex and nonlinear.

《Fig. 7》

Fig. 7. Online monitoring results and captured signals of three different welding experiments. (a) Experiment 1: sound weld and blowout; (b) Experiment 2: sound weld and humping; (c) Experiment 3: undercutting and blowout. Group (i) is the weld appearance; group (ii) shows the features extracted from the multiple-sensor signals; and group (iii) shows the monitoring results. The categories on the vertical coordinates of the monitoring results are as follows: Category 1: sound weld; Category 2: blowout; Category 3: humping; and Category 4: undercutting.

The online welding status monitoring results of the proposed DBN model are listed in Table 3. The average accuracies of the three experiments are 96.00%, 98.85%, and 96.75%, respectively. The average monitoring accuracies of Category 1, Category 2, Category 3, and Category 4 are 96.85%, 98.65%, 98.40%, and 93.80%, respectively. These results demonstrate that the established DBN model has excellent generalization ability and robustness.

《Table 3》

Table 3 Online monitoring accuracy of three different welding experiments.

《6. Conclusions》

6. Conclusions

This research provides an innovative method for the online monitoring of high-power laser welding status. A multi-optical-sensor system is established, and the captured signals are preprocessed to extract 351-dimensional features that depict the welding process. A DBN model is established to build the relationship between the welding status and these captured features. A genetic algorithm is applied to optimize the parameters of the DBN model. The following conclusions can be reached.

(1) The established multi-optical-sensor system is able to obtain detailed and comprehensive insights into the high-power laser welding process.

(2) The relation between the captured signal and the welding status is complex and nonlinear.

(3) In comparison with a traditional BPNN model, the established DBN model shows higher accuracy and robustness.

(4) Three different experiments with different welding parameters validate the generalization ability and robustness of the established DBN model.

《Acknowledgements》

Acknowledgements

This work was partly supported by the National Natural Science Foundation of China (51675104 and 61703110), the Science and Technology Planning Project of Guangzhou, China (201707010197), the Innovation Team Project, Department of Education of Guangdong Province, China (2017KCXTD010), the Guangdong Provincial Natural Science Foundation of China (2017A030310494 and 2016A030310347), and the Youth Science Foundation of Guangdong University of Technology (16ZK0010).

《Compliance with ethics guidelines》

Compliance with ethics guidelines

Yanxi Zhang, Deyong You, Xiangdong Gao, and Seiji Katayama declare that they have no conflict of interest or financial conflicts to disclose.