《1. Leakage inspection in chemical process plants》

1. Leakage inspection in chemical process plants

The condition monitoring of large-scale chemical process plants is crucial for maintenance and to prevent consequential damage and major failures. The pipelines used to transport substances are one of the most important structural parts of a chemical process plant. As these pipelines often transport hazardous or toxic liquids or gases, leakage from pipelines poses a threat to operators and is an environmental safety risk [1]. A study has shown that the risk level of poisoning accidents caused by hazardous leakages is unacceptable [2]. Furthermore, damage to pipelines affects the normal operation of plants, which reduces the availability and productivity of the plants and results in economic losses [3]. A case study of an overall consequence assessment of leakage in the oil industry by Chen et al. [4] showed that the costs and losses due to leakage include production loss, asset loss, human loss of life or safety, and environmental damage. Among these different aspects, Chen et al. [4] could only estimate the production loss for one case study: Based on their estimation, the production loss was more than 270 000 USD.

In conventional condition monitoring, manual inspection by an expert is the main inspection method for failure detection in pipelines. However, human inspection is highly dependent on the competency of the inspector and the frequency of the inspection. It is very labor intensive and expensive. Furthermore, a human inspector must be exposed to the conditions in the chemical plant in order to inspect the plant directly, which is not possible most of the time due to the hazardous conditions. Thus, remote inspection is required in order to avoid human exposure to the hazardous conditions within chemical plants [5]. Remote operation requires appropriate remote data acquisition from the plant, as well as suitable data analysis methods to subsequently accomplish remote monitoring. Therefore, in order to achieve remote, safe, fast, and accurate leakage detection and localization in large-scale chemical process plants, an intelligent and automatic leakage-detection mechanism based on data obtained from the plant is required.

In recent research, several methods have been developed for pipeline leak inspection [6]. Most of the existing methods for leakage detection are based on the physical properties of the liquid in the pipelines. The density of the liquid in the pipes, process parameters such as pressure, velocity, and temperature of the liquid inside the pipes, and the size and shape of the pipes are used as measurement metrics to generate a mathematical model for leakage [7]. These methods often have limited application to other cases. Furthermore, such methods require a profound knowledge about the process in order to provide a precise model and precise understanding of the conditions in the chemical plant, which is seldom feasible. Aside from leakage detection, localization of a leak’s position in the pipes is an important aspect of leakage inspection. Most existing methods cannot detect the position of the leakage precisely. In some of these methods, leakage localization is dependent on the geographical information of the plant, such as the size and location of the plant, the length of the pipes, and the velocity, pressure, and other physical properties of the liquid in the pipes [8–11]. Therefore, it is difficult to apply these methods to other environments and other types of liquid. Furthermore, since these methods often use different sensors at different positions on the pipes, synchronization discrepancies in sensor data will greatly affect precision [12].

Even though numerous research studies have been conducted in the field of leakage detection in pipes, the need for an intelligent method for leakage detection and localization remains—particularly one that can be applied to different types of liquids and to pipes of different types, shapes, and sizes. Furthermore, fast and accurate leakage-detection methods are required, especially when the leaking drops are very small. Small leakages are difficult to detect before they cause major damage; thus, fast detection of small leaking drops can avoid serious and hazardous failure by addressing the issue in the early stage. The current challenges in leakage detection and localization in pipelines motivate the vision-based inspection and application of machine vision techniques for the detection of leaking drops.

Machine vision techniques in combination with artificial intelligence (AI) can provide a framework for learning the conditions within a chemical plant, recognizing the different operations of the plant, and making decisions [13]. Vision-based inspection is a promising approach for the realization of fully automatic condition monitoring in manufacturing systems and online inspection [14,15]. One of the main advantages of automated visual inspection is that it can detect the malfunctioning of a specific part of the plant with high speed and accuracy [16] and improve the safety in a manufacturing environment by means of fast failure detection. However, the implementation of vision-based systems in real industrial environments and large-scale plants is still challenging. One solution may be the use of drone platforms with vision abilities [17]. However, due to pose variation and the fast motion of drones, most machine vision algorithms are not optimal for dealing with the images captured by drones [17]. Furthermore, drones are not known to be intrinsically safe equipment in environments with a high risk of ignition (known as Ex-zones) [18,19]. Another possibility is the use of several cameras fixed in different positions to capture images from different parts of a large-scale plant. In this case, each camera captures a specific part of the plant and vision-based inspection is implemented on that specific part of the plant.

The main contribution of this paper is to provide a method for automated visual leakage inspection that is independent of human competency and inspection. In order to realize this target, a testbed including a demonstrator plant and an infrared (IR) camera was used. The video data captured from the demonstrator plant was used to develop an intelligent method for leakage detection based on machine vision techniques. The results obtained in the provided testbed show that leakage can be detected, localized, and classified with high accuracy using the proposed method. Further opportunities and limitations in the implementation of such a vision-based system in a real large-scale industrial plant are discussed at the end of this paper.

The remainder of the paper is organized as follows: Section 2 discusses the requirements for an efficient leakage-detection method and the hypotheses that are considered in this contribution. Section 3 then reviews, classifies, and compares the most recent literature in the field of leakage monitoring based on the defined requirements. The basic method and steps toward visual leakage inspection using machine vision and image processing are presented in Section 4. The obtained results are investigated and evaluated in Section 5, followed by the discussion and outlook in Section 6.

《2. Requirements and hypotheses of efficient leakage detection and localization》

2. Requirements and hypotheses of efficient leakage detection and localization

Recent studies in the field of leakage monitoring cover methods that range from manual inspection by a trained expert to sophisticated sensor networks [6]. However, in order to provide a reliable and applicable leakage-detection mechanism, the proposed method should meet certain requirements. These requirements are derived from the current situation in the field of leakage detection. The first important requirement is to provide safe and remote leakage inspection (Requirement R1) in order to avoid direct human exposure and inspection within the chemical plant. IR cameras have been used as a remote inspection method in several industrial applications [20]. This requirement is even more serious in the case of toxic leaks, or in Ex-zones where the risk of fire or explosion resulting from flammable leaks is high [2]. In order to realize automatic condition monitoring, another important requirement is to provide an automatic leakage inspection mechanism (Requirement R2) that is independent of operator competency or intervention [21]. An automatic inspection mechanism is less labor intensive and can be permanently applicable. The next important requirement for a leakage-detection mechanism is the accuracy of the proposed method (Requirement R3), which can ensure reliability, especially in a hazardous situation. However, since an industrial environment includes different types of noise, the accuracy and precision of the leakage-detection method can be affected by the noise. Therefore, the proposed method should also be robust with respect to environmental noise (Requirement R4). Furthermore, as the detection of small leaking drops is particularly difficult in noisy industrial environments, it is important to be able to detect small drops (Requirement R5) in order to avoid serious damage resulting from undetectable small leaks over a long time. 
Since there might be several leakages in the plant, including simultaneous leakages from different parts of the plant, the leakage-detection mechanism should be able to detect multiple simultaneous leakages (Requirement R6). After detection of the leakages, the localization of the leakage (Requirement R7) should also be considered. Since leakage from the pipelines might be a hazardous liquid, it is important to detect the trajectory and path of the leakage (Requirement R8), in addition to the localization. The trajectory of the leakage is important, since this can provide additional information about the path of the leaking drops in order to monitor different parts of the plant that might be affected by the leakage. In addition, leakages should be detected within a reasonable time period (Requirement R9) in order to be applicable in real applications and to avoid any serious damage as quickly as possible. Finally, the inspection method should be independent of the physical properties of the liquid in the pipelines and the material of the pipelines. Therefore, it should be applicable for different types of liquid (Requirement R10) in the pipelines. Other requirements, such as fixing the leak position or quantifying how much liquid is being leaked, can be manually investigated and handled by the operator after leakage detection and localization. The amount of leakage can be quantified according to various measurement techniques in the literature [22], such as the difference between the amount of liquid in the pipelines before and after a leakage; however, that is not the focus of image-processing techniques. Such steps occur after leakage detection and localization.

The proposed leakage-detection mechanism should fulfil these requirements in order to achieve an automatic and reliable leakage-detection mechanism with minimum hazards or damage to the environment. To summarize the aims of this paper, the following hypotheses are derived. These hypotheses (H1, H2, and H3) will be evaluated regarding the requirements R1–R10 in the evaluation section:

(1) H1: Using IR cameras and image processing facilitates automatic leakage inspection.

(2) H2: Taking advantage of data analysis, machine learning, and image processing will provide a reliable inspection:

• H2.1: Image-processing techniques will provide an inspection system that is robust to environmental noise.

• H2.2: Machine learning and image analysis can provide a framework for accurate leakage detection within a reasonable time period.

• H2.3: The position and trajectory of leaking drops can be detected correctly using image-processing techniques.

(3) H3: By using several cameras at fixed positions within a large-scale industrial plant, the proposed method can be extended to a real application.

In the next section, recent literature studies in the field of leakage detection are investigated based on the defined requirements.

《3. The state of the art in leakage monitoring and localization》

3. The state of the art in leakage monitoring and localization

The literature contains several studies in the field of leakage detection from pipelines. The techniques they present can be categorized into three groups. In the first group, a process model and a physical model are used for leakage detection. Physical properties, such as the velocity of the liquid in the pipelines, the density of the liquid, the pressure of the liquid, and the temperature of the liquid, are considered, and a mathematical and physical model is derived to model the flow in the pipelines. Leakage detection occurs when the physical behavior of the liquid in the pipelines deviates from the model. In the second group of existing methods for leakage detection, physical models and sensor data are jointly used for leakage detection. Leakage is modeled with a mathematical and physical process model, and then the measured data from the sensors are classified after comparison with the derived model. In this group of techniques, the classification step is usually done by data-driven methods. In the last group, only sensor data, such as pressure and flow, and data analysis methods are used to derive a model for the process. The derived model is used to classify the conditions of the chemical plant and for anomaly detection. The advantage of data-driven approaches over physical model-driven approaches is that the latter require a profound understanding of the process and the material and a great deal of prior information about them, which are often not feasible [22]. Another challenge in leakage inspection is the detection of small leaking drops. The literature does not contain a clear definition of small leaks, and existing studies measure the size of a leakage differently. In general, a “small leak” usually refers to a situation in which the pressure changes related to the leakage are relatively small in relation to the levels of measurement noise and the whole pressure range [23]. 
However, in different studies, different metrics are used to measure the size of the leakage. Ostapkowicz [23] defines small leakage as a certain percentage of the whole flow rate in the pipes, Liu et al. [9] define small leakage as a certain ratio of leakage orifice to pipe diameter, and He et al. [24] define a certain leakage volume (i.e., the amount of liquid that exits from the orifice) as a measure of small leakage. In this paper, a small leakage is considered to be the minimum number of pixels that form a leaking drop in an image. In the following subsections, existing methods for leakage detection are reviewed in more detail.

《3.1. Physical model-driven approaches for leakage detection》

3.1. Physical model-driven approaches for leakage detection

Some methods for leakage detection are based on measuring the difference between the fluid inflow and fluid outflow and the negative pressure wave (NPW). A significant difference measured by differential pressure sensors in different positions along the pipes can indicate leakages [25]. Use of the NPW in combination with the steady-state conditions of the leakage is proposed by He et al. [24] for leakage detection; the researchers analyze the sensitive factors in leakage volume, such as orifice size and upstream/downstream pressure. However, in general, methods based on NPW are sensitive to noise, highly dependent on sensor precision [6], and unsuitable for short-distance transportation pipes [6]. Liu et al. [9] propose a dynamic monitoring module with an amplitude attenuation model of pressure waves for larger leakage detection, and a static testing module based on the pressure-loss model for smaller leakage. If the amplitude attenuation of the pressure is less than a certain value measured by the mathematical model, then leakage is detected. For small leakages, the researchers divide the area of the plant into segments and apply a pressure-loss model to each segment in order to detect the leakage. Abhulimen and Susu [10] propose another physical model-driven approach for leakage detection using the concept of Lyapunov stability and an equilibrium point for flow and pressure. In this approach, leakage is modeled as a factor in the equilibrium equation, and leakage is detected when the model deviates from the equilibrium point, which makes the flow model unstable. For leakage localization, this method uses the sonic velocity in the liquid. The time lag in the sonic signal’s travel is calculated; next, the distance between the leakage position and the sensors is calculated. Since the researchers assume that the system is stable for any sufficiently small perturbation, it is difficult to detect small drops with this method.

Another approach to model the characteristics of leakages from pipelines involves the use of acoustic emission (AE) sensors [26,27]. The main idea behind AE methods is that leakage in the pipelines causes turbulent flow, which results in the propagation of elastic waves through the pipeline materials. This method is very dependent on the material of the pipelines; therefore, it is difficult to apply it in complex pipelines made from different types of materials [6].

《3.2. Physical model-driven and data-driven approaches for leakage detection》

3.2. Physical model-driven and data-driven approaches for leakage detection

Other studies in the field of leakage detection use data-driven methods jointly with analytical and physical models. Zhang et al. [28] propose an inverse hydraulic and thermodynamic transient analysis method and an improved particle swarm optimisation (PSO) algorithm for leakage detection. First, they introduce a hydraulic and thermodynamic transient model using the flow rate and the pressure; next, the required data for leakage detection is extracted from the sensors at the origin and terminal of the pipelines. The deviation between the computed data and experimental data is used for leakage detection. Delgado-Aguiñaga et al. [29] use pressure and flow sensors placed at the pipelines’ ends and a nonlinear model to estimate the leak coefficients by means of the Water–Hammer equations and related extended Kalman filters. However, the method is unable to detect several simultaneous leakages from pipelines in different positions.

Ostapkowicz [23] uses NPW and gradient methods for leakage detection. In the pressure gradient method, the basic assumption is that the pressure changes along the pipeline are linear. However, this assumption cannot model all the dynamics of the flow [9]. Sun and Chang [7] extend the NPW method by means of signal processing and a combination of the flow and pressure signals for leakage detection. Leakage and its position can be detected when the attenuation of the integrated signal is greater than the changes in the single pressure signal. However, the accuracy of this method is highly dependent on the type and dynamics performance of the flow meters installed at both ends of the pipelines. Furthermore, this method is not appropriate for a noisy environment or for short pipelines [6].

《3.3. Purely data-driven approaches for leakage detection》

3.3. Purely data-driven approaches for leakage detection

Among the various approaches for leakage detection, some use only data-driven methods to detect leakage. Qu et al. [30] use fiber-optic sensors in parallel with pipelines to sense the vibration of the pipes. They apply a support vector machine classifier to classify normal and anomalous vibration caused by leakage in the pipelines. The location of the leakage is detected by means of distributed fiber-optic sensors. However, the method is not applicable for short-distance pipes. Da Silva et al. [31] use a fuzzy classifier to classify the operational state and process transients. The correlation between the flow rate deviations and the operational transients is used for leakage detection. Wachla et al. [32] extend this method to use a neuro-fuzzy classifier for leakage detection. In their method, the area of the pipelines is divided into subareas and the location of the leakage is decided by a set of neuro-fuzzy classifiers. To detect and localize the leakage, the residues between the measured flow and the predicted flow are considered; if the residues exceed a certain level, then leakage is detected. However, this method cannot detect small leaks, since the residues cannot represent certain changes in the flow in this case.

Among the data-driven approaches, some use image data of the leaking liquid and image-processing methods to detect leakage. They use IR cameras as exterior vision-inspection systems for monitoring the pipelines. This concept was presented for the first time by Nellis [33] as a way to monitor water canals. Nellis [33] evaluates the method and shows that it is an economical and suitable model for leakage detection. However, he does not use image processing for automatic leakage detection. Another application of IR cameras in leakage detection can be found in the work of Adefila et al. [34]. They consider gas leaks from pipelines and evaluate the sensitivity of IR cameras in capturing temperature changes in the leaking gas. However, they do not propose any image-processing method to detect the gas leak. Atef et al. [35] propose an automatic leakage-detection mechanism using image analysis in IR images for water transportation pipes. They apply a clustering method to the images in order to detect leakages. For leakage localization, they propose a segmentation method based on the region-growing method. Another method based on IR cameras and image processing is proposed by Dai et al. [36] for gas leak detection. After noise reduction with an adaptive Wiener filter, moving regions are found by applying the improved Surendra algorithm. They use edge detection methods to localize the area of the leakage. Kroll et al. [37] use a two-dimensional Gaussian distribution based on the temperature profile of a typical gas leak to model the leakage area in thermographic images. Then, the leakage is detected by cross-correlation of the image with the defined temperature profile. This method requires prior information about the temperature profile of the leak. Wang et al. [38] use a convolutional neural network (CNN) for gas leak detection using IR images and classify the images based on different leakage rates. 
To reduce the level of noise, they subtract different background images to mitigate the environmental effect. The main assumption in Refs. [35–38] is that the region of leakage is fixed over time. This assumption mainly holds for a gas leak, or when the pattern of the leakage in the images follows a region-growing pattern. It does not hold when the leakage consists of liquid drops that are not confined to a fixed area of the image, and whose pattern is more like that of a moving object in subsequent images. Araujo et al. [39] propose an image analysis technique for leakage detection using thermal and red–green–blue (RGB) cameras. The obtained images are combined into a single image and used as inputs for a CNN classifier to identify leakage. For image analysis, this method requires further parameters such as the distance between the camera and the pipes and the view angles of the cameras. In the field of leakage detection using thermographic video data, Fahimipirehgalin et al. [40] propose a method based on principal component analysis (PCA) and k-nearest-neighbor (KNN) classification to capture the features of leaking drops in the frames and classify them into normal and anomalous (including leakages) videos. However, this method cannot detect the positions of leaking drops. Table 1 [7,9,10,23–40] provides a summary of the pros and cons of different methods based on the defined requirements.

Although several studies have been published in the field of leakage detection, there is still a research gap regarding automatic and accurate leakage detection and localization mechanisms that are independent of the type of substance inside the pipes in chemical process plants, especially when the leaks are small or there are several simultaneous leaks. In this paper, we introduce an image-processing method for leakage detection based on block-PCA. PCA is a known method for dimensionality reduction in image data [41]. Furthermore, since this method can maintain the highest contrast and variance in the images, it can be useful for change detection in subsequent images as well [42]. PCA can preserve the changes in subsequent frames caused by leaking drops. It can reduce high-dimension images to lower dimensions, which will reduce the computational complexity of the image classification. In addition, dividing the images into blocks and using block-PCA make it possible to obtain detailed information about the leaking drops in each block and to localize the leakage as well. In this paper, KNN [43] is used as the classifier to classify the images (i.e., the blocks in the images) as normal (without leakage) or anomalous (including leakages). KNN is known as one of the simplest classifiers; it can handle classes that are not linearly separable, and it performs very well when the dimensionality of the input data is low. Because PCA reduces the dimensionality and size of the input data, KNN exhibits a good classification performance. In the following sections, the steps of the proposed method are discussed in more detail.
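To make the idea of block-PCA followed by KNN more concrete, the following minimal NumPy sketch may be helpful. It is an illustration under stated assumptions and not the implementation evaluated in this paper: the function names, the 8 × 8 block size, the number of retained components, k = 3, and the synthetic frames (noise plus a vertical streak) are all hypothetical choices made only for this example.

```python
import numpy as np


def split_into_blocks(frame, bh, bw):
    """Cut a 2-D frame into non-overlapping bh x bw blocks (one flattened block per row)."""
    h, w = frame.shape
    rows, cols = h // bh, w // bw
    trimmed = frame[:rows * bh, :cols * bw]
    blocks = trimmed.reshape(rows, bh, cols, bw).swapaxes(1, 2)
    return blocks.reshape(rows * cols, bh * bw)  # block index = row * cols + col


def fit_pca(X, n_components):
    """Return the mean and the leading principal axes of the rows of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, axes):
    """Project rows of X onto the principal axes."""
    return (X - mean) @ axes.T


def knn_predict(train_Z, train_y, test_Z, k=3):
    """Majority vote (labels 0/1) among the k nearest training points."""
    d = np.linalg.norm(test_Z[:, None, :] - train_Z[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return (train_y[nearest].sum(axis=1) * 2 > k).astype(int)


def synthetic_frame(rng, leak=None, size=16, noise=0.01):
    """Hypothetical data: noise plus an optional vertical streak.

    leak = (block_row, block_col, column_offset) for 8 x 8 blocks.
    """
    frame = rng.normal(0.0, noise, (size, size))
    if leak is not None:
        br, bc, j = leak
        frame[br * 8:(br + 1) * 8, bc * 8 + j] += 1.0
    return frame


rng = np.random.default_rng(0)
train_X, train_y = [], []
for i in range(24):
    leak_block = i % 4  # which of the four blocks contains the streak
    frame = synthetic_frame(rng, (leak_block // 2, leak_block % 2, i % 8))
    train_X.append(split_into_blocks(frame, 8, 8))
    train_y.extend(int(b == leak_block) for b in range(4))
train_X = np.vstack(train_X)
train_y = np.array(train_y)

mean, axes = fit_pca(train_X, n_components=8)
test_blocks = split_into_blocks(synthetic_frame(rng, (1, 1, 5)), 8, 8)
pred = knn_predict(project(train_X, mean, axes), train_y,
                   project(test_blocks, mean, axes), k=3)
print(pred)  # a 1 at position b marks block b as anomalous
```

Because each block is classified separately, the index of a block labeled as anomalous gives a coarse position of the leak within the frame, which is the property that motivates the block-wise formulation.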

《Table 1》

Table 1 Classification table of existing approaches in leakage detection based on the defined requirements.

+: the requirement is met (pros); –: the requirement is not met (cons); o: the requirement is partially considered but not clearly discussed.

《4. Automatic leakage inspection by means of machine vision techniques》

4. Automatic leakage inspection by means of machine vision techniques

In this section, image acquisition, image pre-processing, image segmentation, and feature-extraction methods are introduced first, in order to establish a proper machine vision system. Then, the extracted features are used to classify the normal and anomalous (leakage) conditions of the plant. In the following subsections, these steps are described in more detail.

《4.1. Image acquisition and testbed description》

4.1. Image acquisition and testbed description

In order to use machine vision techniques for leakage detection, a testbed was provided for this research. In this testbed, a laboratory demonstrator plant was assembled, including a thermostat with an integrated pump to circulate water through a series of pipes. The demonstrator plant also consisted of multiple Swagelok connectors, additional routes, a dead end, and sampling valves in order to generate leakages from different positions. Leakages could also be generated by loosening some of the connectors in the pipelines. The pipes were primarily made of stainless steel, but flexible rubber pipes were used in two positions. To ensure the safety of the researchers using the demonstrator, a high-temperature switch was adjusted to 50 °C. The testbed was built in close collaboration with an industrial partner in order to provide a practical representation of a real case of industrial use. Therefore, the conditions of the testbed were very similar to those of an actual industrial setting.

In this testbed, an IR camera (TIM640, Micro-Epsilon, Germany) was used to capture the video data from the demonstrator plant. This camera is capable of taking pictures with a size of either 640 × 480 pixels or 320 × 240 pixels, with a temperature resolution of 75 mK. As the typical spatial resolution of other IR cameras is only 320 × 240 pixels, the present spatial resolution was quite high for an IR camera and made it possible to capture the effect of small drops. The camera also has the functionality for raw data export as well as an interface to process live data directly. The raw data exported by the camera shows the temperature value of each pixel. Therefore, the camera can be used for online condition monitoring as well. The maximum frame rate of this camera is 25 frames per second.

From this testbed, two different datasets are provided to illustrate the capabilities of the proposed method in leakage detection using different video formats with different qualities and sizes. The Micro-Epsilon TIM640 camera provides two different video formats: Moving Picture Experts Group 4 (MP4) and radiometric video file (RAVI). Thus, the first dataset includes MP4 videos with a frame size of 320 × 240 pixels, and the second dataset includes raw data exported from the camera with a RAVI format and a size of 640 × 480 pixels. In the MP4 dataset, the value of each pixel provides greyscale color information on the temperature in the corresponding pixel, while in the RAVI dataset, the value of each pixel is exactly the temperature value of the corresponding pixels. The two formats have different frame sizes, making it possible to evaluate the method using different frame sizes as well. As shown in Fig. 1, these two datasets have a different level of noise as well as different frame sizes. Therefore, in the evaluation section, we can evaluate the proposed method for highly noisy videos (i.e., those in the MP4 format) and less noisy videos (i.e., those in the RAVI format). In order to provide these two datasets, different liquid leakages with different leakage speeds and positions were generated in the demonstrator plant. Furthermore, in order to propose a method for leakage detection that is independent of the camera position and angle, different videos were taken from the demonstrator plant from different unknown angles and distances. In this testbed, some of the videos were taken while the plant was operating in a normal operation without any leakages, while other videos were taken while the leakages were being generated from different positions of the plant. The different leakages were generated from random positions and with random speeds. 
The main motivation of such a setting was that the proposed method should be able to detect leakages from any random and unknown position with any random speed. Each dataset was divided into training and test datasets. The training and test data were selected such that there were normal and anomalous videos in both sets. Apart from this condition (i.e., having normal and anomalous videos in both training and test data), the videos were selected randomly for the training and test sets.
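One simple way to realize such a split is to shuffle the normal and the anomalous videos separately and to take a share of each group for testing. The sketch below only illustrates this idea; the function name, the test fraction of 0.3, and the video names are assumptions for the example and are not taken from the experiments.

```python
import random


def split_videos(videos, test_fraction=0.3, seed=0):
    """Randomly split (name, label) pairs so that both sets contain both classes.

    Assumes at least two videos per label ('normal' and 'anomalous').
    """
    rng = random.Random(seed)
    train, test = [], []
    for label in ("normal", "anomalous"):
        group = [v for v in videos if v[1] == label]
        rng.shuffle(group)
        # Keep at least one video of this class on each side of the split.
        n_test = min(max(1, round(len(group) * test_fraction)), len(group) - 1)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical video names, used only to exercise the function.
videos = [(f"normal_{i}", "normal") for i in range(5)]
videos += [(f"leak_{i}", "anomalous") for i in range(5)]
train, test = split_videos(videos)
print(len(train), len(test))
```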

A sample frame taken by the IR camera from the laboratory demonstrator with leakages in different positions is presented in Fig. 1(a). In order to reveal the effect of leakages in the frames and remove the background from the frames, subsequent frames are subtracted in each video. Therefore, the changes in the subsequent frames caused by leakage are observable, while the unvarying pixels in the background are removed. A sample of the subtracted frame is shown in Fig. 1(b). In order to provide a better visualization of the subtracted frames in this paper, the background of the subtracted frames is shown as white and the objects (leaking drops) are shown as grey pixels. As can be seen in this figure, each subtracted frame includes a considerable amount of noise (see Fig. 1(c) for more detail) that must be removed in order to reveal the effect of the leakage and improve the quality of the data for further data analysis [44]. The effect of noise becomes even more intense in the MP4 format (Fig. 1(d)) due to the compression of the raw data. Therefore, in both cases, suitable data preprocessing is required to reduce the effect of noise and improve the quality of the frames.

《Fig. 1》

Fig. 1. (a) Sample frame from the demonstrator plant with leakages in RAVI format; (b) sample subtracted frame in RAVI format; (c) zoomed area of subtracted frame in RAVI format; (d) zoomed area of subtracted frame in MP4 format. Leaking drops are marked with solid-line circles.

《4.2. Image pre-processing》

4.2. Image pre-processing

In the first step of video pre-processing, each obtained video is divided into frames, and subsequent frames are subtracted, as follows:

$$x_f^{\mathrm{sub}} = x_f - x_{f-1} \tag{1}$$

where $x_f$ and $x_{f-1}$ are the $f$th and $(f-1)$th original frames of a sequence with $n$ frames, respectively; $x_f^{\mathrm{sub}}$ is the subtracted frame; and the index is $f = 2, \ldots, n$. After subtracting the frames, a noise-removal mechanism is applied to the subtracted frames.

In this work, the noise-removal mechanism introduced in Ref. [40] is applied to the subtracted frames. In the first step, background noise removal is applied to the subtracted frames in which a certain threshold, ta , is defined, and the pixels with a value lower than this threshold are set to zero. Since the leaking liquid has a different temperature than its surroundings, the changes caused by leaking drops in the subtracted frames are larger than the background noise. Therefore, the effect of leaking drops will remain in the subtracted frames after threshold filtering. After removing the background noise, another noise filter step, single-pixel noise removal, is applied. The basic assumption for this noise-removal step is that a leaking drop includes some neighboring pixels and is not a single pixel. In the subtracted frames, there are some single pixels with non-zero values, while all their neighboring pixels have zero values. These single pixels usually have a high value, and thus cannot be eliminated as background noise; therefore, they will affect the results of the image processing. As a result, if all the neighbors of a pixel have zero values, the corresponding pixel should be set to zero as well.
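The frame subtraction and the first two noise filters can be sketched as follows. This is a minimal illustration rather than the implementation used in the paper: the function names are hypothetical, the absolute difference is an assumption (so that changes of either sign survive the threshold), and "neighboring pixels" is assumed to mean the eight surrounding pixels.

```python
import numpy as np

def subtract_frames(frames):
    """Difference of consecutive frames (absolute value is an assumption)."""
    frames = np.asarray(frames, dtype=float)
    return np.abs(frames[1:] - frames[:-1])

def remove_background_noise(sub_frame, t_a):
    """Set every pixel whose value is below the threshold t_a to zero."""
    out = sub_frame.copy()
    out[out < t_a] = 0.0
    return out

def remove_single_pixels(frame):
    """Zero each non-zero pixel whose 8 neighbours are all zero (assumed neighbourhood)."""
    nonzero = (frame != 0).astype(int)
    padded = np.pad(nonzero, 1, mode="constant")  # zero border
    q, r = frame.shape
    neighbours = np.zeros((q, r), dtype=int)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the pixel itself
            neighbours += padded[1 + dy:1 + dy + q, 1 + dx:1 + dx + r]
    out = frame.copy()
    out[(nonzero == 1) & (neighbours == 0)] = 0.0
    return out
```

Applying `remove_background_noise` before `remove_single_pixels` follows the order described above; isolated pixels are identified only after low-valued background pixels have been zeroed.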

In order to reveal the effect of leakages and the motion of drops in sequential frames, a temporal operation is required as another image pre-processing step. A temporal operation is usually applied to sequential frames to reveal a specific effect in these frames, such as the motion of an object [14]. The temporal operation used in this paper is performed by taking the average over k sequential subtracted and filtered frames, where k is the number of temporal frames. The resulting frame, in which the effect of the motion of a leaking drop over k sequential frames can be observed, is called the temporal mean frame. After this step, the video data can be converted into a set of temporal mean frames, and the leaking drop can be seen as a streak line in these frames. However, averaging over k sequential frames also carries the noise remaining in each of these frames into the temporal mean frame, which increases the noise in that frame. Therefore, in order to distinguish the line of the leaking drops from the remaining noise in the temporal mean frames, a vertical neighborhood filter is proposed as the last step of the image pre-processing. This pre-processing step is important because the leakage from the pipelines usually undergoes vertical motion over time. In this step, a vertical band around each pixel with a non-zero value is considered. Assuming that v is the position of the corresponding pixel on the horizontal axis, the pixels in the range of $[v - \alpha, v + \alpha]$ in the horizontal direction are considered in the width of the vertical band, where α is the number of neighborhood pixels on the right and left sides of the corresponding pixel. If there are q pixels in the vertical direction of the image, the size of the vertical band is q × (2α + 1). It is assumed that a pixel within a leaking drop should have at least $q_2$ non-zero neighbors in the vertical band; otherwise, it is considered to be a noisy pixel.

Fig. 2 shows a summary and the results of the image pre-processing in a zoomed area of a sample subtracted frame in MP4 format. It can be seen that some single pixels remain after the background filter, which are not part of a leaking drop (Fig. 2(a)), and that removing these pixels can reveal the leaking drops (Fig. 2(b)). Although taking the average over k frames can show several leaking drops in one frame, it increases the noise in the temporal mean frame as well (Fig. 2(c)). Therefore, a vertical neighborhood filter is required to distinguish the leaking drops from the remaining noise (Fig. 2(d)).
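The temporal mean and the vertical neighborhood filter can be sketched as follows. The function names are hypothetical, and two points are assumptions because the text does not fix them: the k frames are grouped without overlap, and the pixel itself is not counted among its own neighbors.

```python
import numpy as np

def temporal_mean_frames(filtered, k):
    """Average each group of k consecutive frames (non-overlapping groups assumed)."""
    filtered = np.asarray(filtered, dtype=float)
    n_groups = filtered.shape[0] // k
    trimmed = filtered[:n_groups * k]  # drop an incomplete last group
    return trimmed.reshape(n_groups, k, *filtered.shape[1:]).mean(axis=1)

def vertical_neighborhood_filter(frame, alpha, q2):
    """Zero non-zero pixels with fewer than q2 non-zero neighbours in the
    full-height band covering columns v - alpha .. v + alpha."""
    nonzero = frame != 0
    col_counts = nonzero.sum(axis=0)  # non-zero pixels per column
    n_cols = frame.shape[1]
    out = frame.copy()
    for v in range(n_cols):
        lo, hi = max(0, v - alpha), min(n_cols, v + alpha + 1)
        band_count = col_counts[lo:hi].sum()
        # every non-zero pixel in column v sees the same band; exclude itself
        too_few = nonzero[:, v] & (band_count - 1 < q2)
        out[too_few, v] = 0.0
    return out
```

Because the band spans the full image height, all non-zero pixels in one column share the same neighbor count, so the filter reduces to one count per column.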

《Fig. 2》

Fig. 2. Steps of the noise-removal mechanism in a zoomed area in a sample subtracted frame. (a) Remaining noise after background noise removal. Single pixels are marked with dashed-line circles. (b) The effect of removing single pixels. (c) Temporal mean frame. Leaking drops form a streaked line; however, the noise is increased as well. Vertical bands are shown as a dashed line around pixels. Leaking drops have more neighbors in the vertical band than noisy pixels. (d) The effect of vertical noise removal. The pixels that do not have enough non-zero neighbors in the vertical band are set to zero.

《4.3. Image segmentation and feature extraction》

4.3. Image segmentation and feature extraction

After image pre-processing, each video is available as a set of images (temporal mean frames) for image segmentation and feature extraction. Each image can be represented as a pixel matrix, as follows:

$$x_{2D} = \begin{bmatrix} x_{11} & \cdots & x_{1r} \\ \vdots & \ddots & \vdots \\ x_{q1} & \cdots & x_{qr} \end{bmatrix} \in \mathbb{R}^{q \times r} \tag{2}$$

where $x_{2D}$ represents the two-dimensional image, q is the number of pixels in the vertical direction, r is the number of pixels in the horizontal direction, $x_{qr}$ is the value of the pixel in row q and column r, and $\mathbb{R}$ is the set of real numbers. Since the number of pixels (features) is large, and since considering all of the pixels in the image analysis will increase the computational complexity, an accurate feature-extraction mechanism is required. By using feature extraction, the most relevant pixels (indicating leakages) and their effect will be conserved while unnecessary pixels will be eliminated. However, before feature extraction, appropriate segmentation is required to partition the images into meaningful regions. This segmentation helps not only with leakage detection, but also with leakage localization and trajectory inspection of leaking drops. After segmentation, the feature extraction can be applied to each segment.

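For image segmentation, the temporal mean frames are divided into blocks. Each block has a size of L × L pixels, where L is the number of pixels in the horizontal and vertical directions of the block. The blocks in a sample temporal mean frame are shown as grid lines in Fig. 3. Three leakages in three different positions can be observed in this figure (marked as 1, 2, and 3). If B1 is the number of blocks in the vertical axis and B2 is the number of blocks in the horizontal axis, after the images have been converted into blocks, each image matrix in Eq. (2) can be represented as the following matrix:

$$x_{2D} = \begin{bmatrix} x_{2D}^{\text{block}(1,1)} & \cdots & x_{2D}^{\text{block}(1,B_2)} \\ \vdots & \ddots & \vdots \\ x_{2D}^{\text{block}(B_1,1)} & \cdots & x_{2D}^{\text{block}(B_1,B_2)} \end{bmatrix} \tag{3}$$

where $x_{2D}^{\text{block}(b_1,b_2)}$ denotes the block with index $b_1$ in the vertical axis and index $b_2$ in the horizontal axis.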

《Fig. 3》

Fig. 3. Sample temporal mean frame with three different leakages. The segmentation (blocks) is shown using solid lines. Each block is represented in a specific range of the image matrix. The image matrix (on the right) can be divided into the area of the blocks; therefore, each individual block can be treated as a single image and image matrix with a size of L × L. B1 is the number of blocks in the vertical axis and B2 is the number of blocks in the horizontal axis.

Each block can be represented in a specific range of the image matrix in Eq. (3) (Fig. 3). This equation shows that the image matrix can be divided into the area of the blocks; therefore, each block can be treated as a separate image matrix with a size of L × L (except the last blocks on the right side of the temporal mean frame, which might have a smaller size). Thus, this segmentation makes it possible to consider only the blocks that include leaking drops and to extract them from the images. In order to provide a suitable set of blocks, the training data is used. Since the positions of leakages are known in the training videos, it is possible to select and extract the anomalous blocks (blocks with leakages) from the training videos. Therefore, the training data can be represented as a set of anomalous blocks and one normal block (Fig. 4). The anomalous blocks are the blocks that include leakages, which can be extracted from different anomalous videos (videos with leakages). The normal block can be selected from a normal video without any leakages. It should be noted that each extracted block from a specific training video includes the same number of temporal mean frames in the corresponding video.
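The segmentation step can be sketched as a simple tiling of the image matrix. The function name and the dictionary layout are illustrative choices, not part of the original method; edge blocks are kept at their smaller size, as the text allows.

```python
import numpy as np

def split_into_blocks(frame, L):
    """Tile a 2D frame into L x L blocks keyed by 1-based (b1, b2) indices.

    b1 counts blocks along the vertical axis, b2 along the horizontal axis.
    Blocks at the right or bottom edge may be smaller than L x L.
    """
    q, r = frame.shape
    blocks = {}
    for b1, top in enumerate(range(0, q, L), start=1):
        for b2, left in enumerate(range(0, r, L), start=1):
            blocks[(b1, b2)] = frame[top:top + L, left:left + L]
    return blocks
```

Each value is a view into the original frame, so extracting the same block index from every temporal mean frame of a video yields the frame sequence of that block without copying.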

《Fig. 4》

Fig. 4. Training data after extracting specific blocks. (a) Training data can be reduced to a set of anomalous blocks including leaking drops. These are taken from different anomalous videos and include the same number of temporal mean frames in the corresponding videos. (b) One normal block without any leakages. Each block includes the same number of frames in the corresponding video.

After converting the training data into the set of anomalous blocks and one normal block, suitable feature extraction is required to extract the most relevant information from the selected blocks. Then the set of blocks can be converted into the set of features. For this purpose, PCA can be used as the feature-extraction method. In principle, PCA is a linear coordinate transformation from a high-dimensional data set to a lower-dimensional data set. Assume that one anomalous block in a temporal mean frame of a training video can be represented as the following matrix:

$$x_{2D}^{\text{block}(b_1,b_2)} = \begin{bmatrix} x_{11} & \cdots & x_{1L} \\ \vdots & \ddots & \vdots \\ x_{L1} & \cdots & x_{LL} \end{bmatrix} \in \mathbb{R}^{L \times L} \tag{4}$$

where $x_{2D}^{\text{block}(b_1,b_2)}$ represents the two-dimensional image of the block; $b_1$ and $b_2$ are the indices of the block in the vertical and horizontal axes, respectively. In order to calculate the PCA of this block, it is first necessary to convert the two-dimensional image matrix in Eq. (4) into a one-row vector (serialization) by placing all the L rows in one row, as follows:

$$x_{1D}^{\text{block}(b_1,b_2)} = \begin{bmatrix} x_{11} & \cdots & x_{1L} & x_{21} & \cdots & x_{2L} & \cdots & x_{L1} & \cdots & x_{LL} \end{bmatrix} \in \mathbb{R}^{1 \times L^2} \tag{5}$$

where $x_{1D}^{\text{block}(b_1,b_2)}$ is the one-dimensional row vector of the image matrix. Since each video includes N temporal mean frames, each block also includes N temporal mean frames. Each temporal mean frame in a block can be considered as a data sample represented in Eq. (5); thus, there are N data samples within a block. Therefore, the data matrix of one block in one training video can be written as follows:

$$X_{\text{block}} = \begin{bmatrix} x_{1D}^{(1)} \\ x_{1D}^{(2)} \\ \vdots \\ x_{1D}^{(N)} \end{bmatrix} \in \mathbb{R}^{N \times L^2} \tag{6}$$

where $X_{\text{block}}$ represents a block and $x_{1D}^{(N)}$ represents the Nth temporal mean frame in block $X_{\text{block}}$. Each block in Fig. 4 can be represented as Eq. (6), which has L × L features (pixels).

After converting the blocks into a set of data samples and a data matrix, feature extraction can be done by using PCA on each matrix. By means of PCA, a linear coordinate transformation from system $X_{\text{block}}$ to a new system $Z_{\text{block}}$ can be achieved by the calculation of

$$Z_{\text{block}} = X_{\text{block}} P_{\text{block}} \tag{7}$$

where $P_{\text{block}}$ is the transformation matrix and its columns are the basis vectors of the new system. In PCA, the covariance matrix of the data matrix is considered in order to obtain the basis vectors to form the transformation. Assume that $\Sigma_{\text{block}}$ is the covariance matrix of data matrix $X_{\text{block}}$. The eigenvector decomposition of the covariance matrix can be utilized to obtain the transformation matrix $P_{\text{block}}$ [44]. If $p_{\text{block}\_i}$ is the ith eigenvector of covariance matrix $\Sigma_{\text{block}}$, the corresponding eigenvalue $\lambda_i$ represents the variance in the data after transforming the data matrix $X_{\text{block}}$ into the direction $p_{\text{block}\_i}$.

This implies that the eigenvectors with larger eigenvalues preserve most of the variance and covariance in the blocks after transformation. After rearranging the eigenvalues in descending order and sorting the eigenvectors accordingly, the major data variation can be captured by transforming to the first eigenvectors. In this case, the first H eigenvectors (where H is the number of selected eigenvectors) can form the transformation matrix, $P^{H}_{\text{block}}$, in which $H < L \times L$. Therefore, each block $X_{\text{block}}$ can be transformed into a new data matrix with lower dimensions, as follows:

$$Z^{H}_{\text{block}} = X_{\text{block}} P^{H}_{\text{block}} \tag{8}$$

In this transformation, all frames within a block are transformed into the new directions. The transformation defined in Eq. (8) can be applied to all training blocks in Fig. 4. Therefore, if there are $M$ blocks (where $M$ is the number of blocks) in the training videos (including blocks with leakages and one normal block), there will be $M$ transformation matrices and $M$ transformed matrices. A sample result of the transformation for a sample frame within a block is shown in Fig. 5. In this figure, the transformed matrix is visualized as a bar plot.
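The serialization of Eqs. (5) and (6) and the reduced transformation of Eq. (8) can be sketched for a single block as follows. The function name is hypothetical, and one detail is an assumption: the covariance is computed from mean-centered data (as `numpy.cov` does), while the transformation itself is applied to the uncentered data matrix, which matches Eq. (8) as written.

```python
import numpy as np

def block_pca(X_block, H):
    """PCA of one block's data matrix (N frames x L*L pixels).

    Returns the L*L x H transformation matrix, the N x H transformed data,
    and all eigenvalues in descending order.
    """
    X = np.asarray(X_block, dtype=float)
    cov = np.cov(X, rowvar=False)            # (L*L) x (L*L) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric input, ascending order
    order = np.argsort(eigvals)[::-1]        # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    P_H = eigvecs[:, :H]                     # first H eigenvectors as columns
    Z_H = X @ P_H                            # Eq. (8), no mean subtraction
    return P_H, Z_H, eigvals

# Serialization: N frames of size L x L become N rows of length L*L
rng = np.random.default_rng(0)
frames = rng.random((30, 4, 4))              # toy block: N = 30, L = 4
X_block = frames.reshape(frames.shape[0], -1)
P_H, Z_H, eigvals = block_pca(X_block, H=3)
```

Running this once per training block gives one transformation matrix and one transformed matrix per block, which is the per-block structure the text describes.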

《Fig. 5》

Fig. 5. Transformation (mapping) of a sample frame within a sample block by using PCA to lower dimension H (e.g., H = 10).

Since each block shows different shapes and intensities of the pixels, different principal components should be calculated for each block. This concept is referred to as “block-PCA” in this paper (Fig. 6). Block $j$ in the training blocks is represented as $X_{\text{block}(j)}$; the corresponding transformation matrix and transformed matrix are represented as $P^{H}_{\text{block}(j)}$ and $Z^{H}_{\text{block}(j)}$, respectively.

After the definition of a set of blocks as the training set, a set of transformation matrices, S, is defined (Fig. 6(b)). Furthermore, all frames in each block should be transformed (mapped) into the directions of the calculated principal components of the corresponding block. A set of sample transformed (mapped) frames from each block is shown in Fig. 6(c). The set S and the set Map (including transformed frames) are the basis for the rest of the analysis and the classification.

It is worth noting that for the normal block, since all pixels within this block are zero, there is no transformation matrix and transformed matrix. However, in order to maintain consistency in the text, the term “transformation matrix” is used for all blocks; wherever it is necessary to differentiate operations for the normal block in this text, it will be explicitly mentioned. After preparing the training data, the classification process is performed on the test data for leakage detection and localization in the test video.

《Fig. 6》

Fig. 6. The concept of block-PCA for the training data. (a) The set of the blocks and a sample frame of each block. (b) The set of principal components corresponding to each block (transformation matrices). (c) The set of sample transformed (mapped) frames in each block using corresponding principal components. Transformed frame (1) is the result of the transformation of one frame under $P^{H}_{\text{block}(1)}$, which is calculated from training block(1), and so forth.

《4.4. Classification, detection, localization, and interpretation》

4.4. Classification, detection, localization, and interpretation

In this work, a binary classification including normal and anomalous classes is used to classify each block in the test data. The aim of the classification is that if a block in the test video includes leakages, it should be classified as anomalous; otherwise, it should be classified as normal.

In order to detect and localize leakages in the test video, the test video is first converted into frames and the image pre-processing step and noise-removal mechanism are applied to the test frames. The resulting temporal mean frames are divided into blocks. Each frame in each block is classified separately. Finally, the corresponding block is classified into the category in which the majority of its frames belong. With this classification, anomalous blocks and normal blocks can be detected in the test video. After the detection of anomalous blocks, the leakage can be localized in these blocks. An overview of the process for leakage detection and localization in test data is described in Fig. 7. In this process, blocks are processed one by one until there is no unprocessed block in the test video. Furthermore, in each block, the mean temporal frames are processed one by one. Classification of each frame is done when the frame is transformed by all transformation matrices in the set S, and there is no unprocessed transformation matrix left in this set. Finally, the classification of the block is done when all of its frames are classified and there is no unprocessed frame left in the block. If a block is classified as anomalous, then the localization of the leakage is done in the corresponding block.

The classification of each frame within each test block is performed by using the set S and the Map set resulting from the training blocks. In this case, each frame in each test block is transformed using the PCAs in the set S. Assume that the block b in the test video is considered for classification. This block is a data matrix represented as $X^{\text{test}}_{\text{block}(b)}$ using Eq. (6). In the first step, this data matrix is transformed by $P^{H}_{\text{block}(1)}$ in set S. The result of this transformation is the H-dimensional data matrix $Z^{\text{test}}_{\text{block}(b)}$. Each data sample (transformed frame) $z^{\text{test}}_i$, $i = 1, \ldots, N$, in data matrix $Z^{\text{test}}_{\text{block}(b)}$ is compared with each data sample $z^{\text{train}}_j$, $j = 1, \ldots, N$, in the transformed training blocks. The Euclidean distance between each transformed test frame and all transformed training frames is calculated. The same procedure is done after transforming the test block using $P^{H}_{\text{block}(2)}$, and so forth. Afterward, by applying the KNN [43], each frame in the test block is classified into the category of the closest training frame. Finally, block b is classified into the category to which the majority of its frames belong. Since the KNN algorithm is a non-parametric machine learning method, the performance of PCA as a feature-extraction mechanism can be evaluated directly. This implies that anomalous blocks, which include leakages, in the test video can be classified into the category of the closest training block with similar leakage.

《Fig. 7》

Fig. 7. Steps of the proposed method for the classification of blocks in the test video using block-PCA. The orange loop shows the transformation loop of each test frame, while the green loop shows the classification of each test frame, and the blue loop shows the classification of each test block.

It is worth noting that, since there is no transformation matrix in the set S for the normal block in the training set, a zero vector of size H is used as the transformed matrix for this block. All blocks in the test video are compared with this zero vector as well, after transformation by each transformation matrix in set S. If the majority of the frames within a test block have only pixels with zero value, they will be transformed into zero after any transformation; therefore, they have the minimum distance to the zero vector. If a test block has the minimum distance to the zero vector after most of the transformations, then this block is classified as normal. Finally, the test video is classified as an anomalous video if any anomalous block is detected within this video; otherwise, it is classified as a normal video.
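The frame-level and block-level decisions can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses the single nearest neighbor instead of a general k, it represents the normal class only by the zero vector, and it combines the per-transformation outcomes by a simple majority vote. The names `classify_frame`, `classify_block`, and `train_maps` are hypothetical.

```python
import numpy as np

def classify_frame(x, S, train_maps):
    """Return True if a serialized test frame x is closer to anomalous
    training frames than to the zero vector under most transformations.

    S          : list of L*L x H transformation matrices (anomalous blocks)
    train_maps : list of N x H transformed training frames, one per matrix
    """
    votes = 0
    for P, Z_train in zip(S, train_maps):
        z = x @ P                                         # transform test frame
        d_anom = np.min(np.linalg.norm(Z_train - z, axis=1))
        d_norm = np.linalg.norm(z)                        # distance to zero vector
        if d_anom < d_norm:
            votes += 1
    return bool(votes > len(S) / 2)

def classify_block(X_test, S, train_maps):
    """A block is anomalous if more than half of its frames are anomalous."""
    labels = [classify_frame(x, S, train_maps) for x in X_test]
    return bool(sum(labels) > len(labels) / 2)
```

A test video would then be flagged as anomalous as soon as `classify_block` returns `True` for any of its blocks.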

After a block is classified as anomalous, the position of the leakage within this block is calculated as well. For this purpose, the center of mass is calculated in each anomalous block in order to localize the leakage inside this block. Since the pixels in a leaking drop are the most intense pixels in a block, the center of mass with a certain radius can highlight the area of the leakage. Assume that r' and q' are the indices of a pixel in the vertical and horizontal axes in a block, respectively, and $m_{r'q'}$ is the intensity of the pixel. Then, the center of mass within a block can be calculated as follows:

$$R_i^{b} = \frac{\sum_{r'=1}^{L} \sum_{q'=1}^{L} r' \, m_{r'q'}}{\sum_{r'=1}^{L} \sum_{q'=1}^{L} m_{r'q'}} \tag{9}$$

where $R_i^{b}$ is the row (position in the vertical axis) of the center of mass within block b in frame i of a test video, and

$$C_i^{b} = \frac{\sum_{r'=1}^{L} \sum_{q'=1}^{L} q' \, m_{r'q'}}{\sum_{r'=1}^{L} \sum_{q'=1}^{L} m_{r'q'}} \tag{10}$$

where $C_i^{b}$ is the column (position in the horizontal axis) of the center of mass within block b in frame i of a test video. If $b_1$ and $b_2$ are the indices of block b in the vertical and horizontal axes, respectively, then the global position of the leakage in frame i can be calculated as

$$R_i^{\text{global}} = (b_1 - 1) L + R_i^{b}, \qquad C_i^{\text{global}} = (b_2 - 1) L + C_i^{b} \tag{11}$$
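The localization step can be sketched as an intensity-weighted mean of pixel indices. The function names are hypothetical, and the conversion to global coordinates assumes 1-based pixel and block indices with full-size L × L blocks above and to the left of the block in question.

```python
import numpy as np

def center_of_mass(block):
    """Intensity-weighted mean row and column (1-based) of a block.

    Returns None for an all-zero block, where the center is undefined.
    """
    m = np.asarray(block, dtype=float)
    total = m.sum()
    if total == 0:
        return None
    rows = np.arange(1, m.shape[0] + 1)
    cols = np.arange(1, m.shape[1] + 1)
    row_c = (rows[:, None] * m).sum() / total
    col_c = (cols[None, :] * m).sum() / total
    return row_c, col_c

def global_position(row_c, col_c, b1, b2, L):
    """Shift block-local coordinates by the offset of block (b1, b2)."""
    return (b1 - 1) * L + row_c, (b2 - 1) * L + col_c
```

For example, a center at local position (2, 3) in block (2, 3) with L = 40 maps to the global position (42, 83).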

《5. Evaluation of the proposed method for leakage detection and localization》

5. Evaluation of the proposed method for leakage detection and localization

In order to evaluate the proposed method for leakage detection using an IR camera and machine vision techniques, the results of leakage detection and localization in the test videos are discussed in this section. For evaluation, two different datasets are provided. In the first dataset, the data is provided in the MP4 format, while the data in the second dataset is provided in the raw RAVI format. In the following subsections, the results of the proposed method for these two datasets are discussed. Finally, the defined requirements for leakage detection are evaluated based on results of the two datasets.

《5.1. Performance of the proposed leakage-detection method in video data with MP4 format》

5.1. Performance of the proposed leakage-detection method in video data with MP4 format

This dataset includes 25 videos with a length of 60 s each from the laboratory demonstrator plant while it was operating under normal conditions (nine videos) and under an anomalous condition in which liquid was leaking from the pipelines in different positions of the plant (16 videos). The whole dataset was labeled by experts after collecting each video from the demonstrator plant. The frame rate of the IR camera is 25 frames per second; thus, each video includes 1500 frames. Furthermore, the frame size is 320 × 240 pixels, of which an area of 300 × 240 = 72 000 pixels is considered for image processing. The training data includes five normal videos and nine anomalous videos, and the test data includes four normal videos and seven anomalous videos. In the first step of the proposed method, the training data is divided into frames and the subtracted frames are calculated. In the image pre-processing step, pixels with a greyscale value below the threshold $t_a = 0.5$ are set to zero in order to remove background noise. After applying single-pixel noise removal, temporal mean frames are calculated with k = 5 for all videos. Then, for the vertical neighborhood filter, α = 2 and $q_2 = 10$ are set. Next, each training video is divided into blocks. The block size in this case is set to L × L = 40 × 40.

In order to form the set of blocks in Fig. 6, the anomalous blocks from each training video and one normal block are selected. Then, the PCA of each block is calculated to form the set S. In order to obtain a suitable number for H as the number of selected principal components for the transformation matrix in each block, the corresponding eigenvalues of the principal components are calculated for each block as well. Since the corresponding eigenvalues of the principal components show the amount of variance in the data after the transformation, the selected number of principal components, H, should preserve most of the variance in the data. The eigenvalues of the first 50 principal components for four selected anomalous blocks in the training data are shown in Fig. 8. By selecting H = 10 in each block, at least 95% of the variance in each block can be preserved. Increasing H to greater than ten would increase the computational complexity and dimensionality of the transformation space without significantly changing the preserved variance of the data. Thus, the first ten principal components of each block are considered to form the transformation matrix in each block. Finally, all frames in each block are transformed based on its transformation matrix. Therefore, the size of each frame within a block decreases from 40 × 40 = 1600 to 10, while at least 95% of the variance in each block is preserved.
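The selection of H from the eigenvalue spectrum can be sketched as a cumulative explained-variance check. The function name and the toy eigenvalues are illustrative only and are not taken from the paper's data.

```python
def smallest_H(eigenvalues, target=0.95):
    """Smallest number of leading principal components whose eigenvalues
    sum to at least `target` of the total variance."""
    ev = sorted(eigenvalues, reverse=True)
    total = sum(ev)
    running = 0.0
    for h, value in enumerate(ev, start=1):
        running += value
        if running / total >= target:
            return h
    return len(ev)

# Toy spectrum: the first two components carry 80% of the variance,
# the first three carry 95%.
toy_eigenvalues = [50.0, 30.0, 15.0, 4.0, 1.0]
H_for_60_percent = smallest_H(toy_eigenvalues, target=0.6)
H_for_90_percent = smallest_H(toy_eigenvalues, target=0.9)
```

Applying the same check to each training block and taking the largest result would give a single H that meets the target in every block.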

《Fig. 8》

Fig. 8. Eigenvalues of the first 50 principal components in four anomalous blocks in the training data with the MP4 input format. The eigenvalue of each eigenvector (principal component) shows the amount of variance in the data after transformation in this direction. (a) Eigenvalues of the first sample anomalous block. The first ten principal components preserve 95% of the whole variance. (b) Eigenvalues of the second sample anomalous block. The first ten principal components preserve 96% of the whole variance. (c) Eigenvalues of the third sample anomalous block. The first ten principal components preserve 98% of the whole variance. (d) Eigenvalues of the fourth sample anomalous block. The first ten principal components preserve 99% of the whole variance.

For the classification of a test video, the same process is applied as was used for the training videos. Each test video is divided into blocks, and the frames within the blocks are transformed using the transformation matrices in the set S. After each transformation, the KNN algorithm with k = 3 is applied to classify the frames in each block of the test video. The value of k is selected by trial and error in the range of [1,5], with k = 3 showing the best result. Furthermore, the Euclidean distance between the transformed frames is considered to be the basic distance measurement for classification. Each frame is classified separately into a category of the closest frame in the training data based on the Euclidean distance. After all of the transformation, if the majority of frames (more than 50%) inside a block belong to the anomalous category, then the block is considered to be an anomalous block; otherwise, it will be categorized as a normal block. The result of the classification of blocks is shown in Fig. 9(a) for a test video. After detecting anomalous blocks, localization of the leakage can be done in each anomalous block based on the center of mass of the block (Figs. 9(b)–(e)). The position of the leakage can be marked in each frame after detection. In Figs. 9 (b)–(e), solid-line red circles are drawn around the center of mass of each anomalous block with a certain radius. In order to evaluate the accuracy of the proposed method for the test videos, a confusion matrix is provided for each video. The actual class of each block shows whether the block includes a leakage or not, while the predicted class of each block in each video shows the classification result. Table 2 summarizes the results of the accuracy of the proposed method for each video. In each test video, there are 42 blocks with a size of 40 × 40, and the accuracy of the classification of the blocks in each video is calculated, as shown in Table 2. 
In Table 2, the accuracy is calculated as accuracy = (TP + TN)/(total blocks), and the F1 score is calculated as F1 = 2TP/(2TP + FP + FN), where TP, FP, TN, and FN stand for true positive, false positive, true negative, and false negative, respectively.
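The two measures can be computed from per-block labels as in the following sketch. The function name and the example labels are hypothetical; the example does not reproduce any row of Table 2.

```python
def block_metrics(actual, predicted):
    """Accuracy and F1 score from per-block labels (True = anomalous).

    F1 is returned as None when TP, FP, and FN are all zero, for example
    in a normal video whose blocks are all correctly classified.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    accuracy = (tp + tn) / len(pairs)
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else None
    return accuracy, f1

# Ten blocks: four actually anomalous, six actually normal
actual = [True] * 4 + [False] * 6
predicted = [True, True, True, False] + [True] + [False] * 5
accuracy, f1 = block_metrics(actual, predicted)  # TP=3, FN=1, FP=1, TN=5
```

Here the accuracy is (3 + 5)/10 = 0.8 and the F1 score is 6/(6 + 1 + 1) = 0.75.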

《Fig. 9》

Fig. 9. Leakage detection and localization in the test video with the MP4 input format. (a) The results of the classification of each block in the test frame. Each block in the video is marked as an anomalous block (if it includes leakage) or a normal block (without leakage). To localize the leakages, after the detection of anomalous blocks, the center of mass in each block is highlighted. (b–e) Sample temporal mean frames with localization of the leakage. Solid-line red circles are drawn around the center of mass in each anomalous block with a radius of ten pixels.

《Table 2》

Table 2 Actual class, predicted class, and accuracy of the classification for normal and anomalous blocks in each test video with the MP4 format.

The results show that, in each video, leakages can be detected with reasonable accuracy (greater than 90%). Among the normal videos, only one video, in which four normal blocks in the video are classified as anomalous blocks, is detected as an anomalous video (Video 3). The reason for this misclassification may be the high noise in this video, especially along the pipes, which is considered to be a line of a leakage by the proposed method. Furthermore, in the anomalous videos with leakages, some blocks are expected to be detected as anomalous blocks but are instead classified as normal blocks (e.g., Video 8). These blocks are usually at the bottom of the frames, while the leaking drops start from a position at the top of the frames. In this case, by the time the leaking drop reaches the bottom of the frame, it loses its temperature difference with its surroundings; therefore, it is faded and difficult to observe at the bottom of the frame, especially when the drop is small. In this case, the drops are not observable in most of the frames in the blocks at the bottom of the video, and they will be eliminated as noise in most of the frames. Therefore, these blocks are classified as normal (Fig. 10). However, all drops are still detectable in the starting position of the leakage in all cases, even as small drops. The smallest size of the drop that can be detected in the images is 3 × 5 pixels, with a pixel intensity greater than 0.5 in the greyscale.

《Fig. 10》

Fig. 10. Fading effect on some anomalous blocks at the bottom of the frames. (a) Two blocks with leakage cannot be detected on the left side of Video 8. The drops in these two blocks are not intensive enough and are eliminated as noise in most of the frames. (b) One anomalous block at the bottom cannot be detected in Video 11 due to the fading effect. The drops in this block are not intensive enough and are eliminated as noise in most of the frames.

In the MP4 videos, the proposed algorithm can detect and localize leakages with a minimum of 310 frames. This number of frames is required to remove the noise and correctly classify each block. The analysis time for leakage detection in 310 frames is 15 s (including image pre-processing) running on 2.90 GHz central processing unit (CPU) in a 64 GB random access memory (RAM) environment with MATLAB 2018a (MathWorks, USA). Table 3 provides a summary of the results for the MP4 case.

《Table 3》

Table 3 Summary of the results of the proposed method for the classification of test videos in MP4 format.

《5.2. Performance of the proposed leakage-detection method in video data with the RAVI format》

5.2. Performance of the proposed leakage-detection method in video data with the RAVI format

The second dataset used for the evaluation of the proposed method includes 20 videos with the RAVI format, among which four videos are normal and 16 videos are anomalous, including leakages. Each video is 30 s long with a frame rate of 25 frames per second. In this dataset, the size of the frames is 640 × 480 = 307 200 pixels and the value of each pixel is exactly the temperature value in that pixel. The data is divided into training data and test data, with 12 videos (10 anomalous videos and two normal videos) being considered as training data and eight videos (six anomalous videos and two normal videos) being considered as test data. The block size in this dataset is set at L × L = 60 × 60 pixels. The same procedure used for the videos with the MP4 format is applied to this dataset as well. The training data is divided into blocks, and anomalous blocks and one normal block are selected. The corresponding PCAs are calculated for the blocks and form the set S, and the training blocks are transformed to the lower dimension. The main advantage of this format, in comparison with the MP4 format, is that there is no extra noise due to video compression. Therefore, the proposed leakage-detection mechanism is more accurate in this format. In the image pre-processing step, pixels with a value below the threshold $t_a = 0.1$ are set to zero to remove the background noise. After applying single-pixel noise removal, temporal mean frames are calculated with k = 5 for all videos. Then, for the vertical neighborhood filter, α = 2 and $q_2 = 10$ are set. Furthermore, the selected number of principal components, H, is set to ten in this format as well. By selecting H = 10, more than 95% of the variance in each block can be preserved. For leakage detection in the test video, the same approach as that described in Fig. 7 is followed. Each test video is divided into blocks and the frames within the blocks are transformed using the transformation matrices in the set S. 
After each transformation, the KNN algorithm with k = 3 is applied to classify the frames in each block of the test video. The value of k is selected by trial and error in the range of [1,5], with k = 3 showing the best result. Furthermore, the Euclidean distance between the transformed frames is chosen as the basic distance measurement for classification. Each frame is classified separately into a category of the closest frame in the training data based on the Euclidean distance. After the transformation and classification of the frames within a block, if the majority of frames (more than 50%) inside a block belong to the anomalous category under most of the transformations, then the block is considered to be an anomalous block; otherwise, it is categorized as a normal block.
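The block-wise procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the non-overlapping grouping used for the temporal mean, the cropping of partial blocks at the frame border, and the reading of "most of the transformations" as "more than half of the transformations in S" are assumptions made for illustration; the single-pixel noise removal and the vertical neighborhood filter are omitted.

```python
import numpy as np

def preprocess(video, threshold=0.1, k=5):
    """video: (T, H, W) array. Zero out values below the threshold, then
    average non-overlapping groups of k frames (assumed grouping)."""
    v = np.where(video < threshold, 0.0, video)
    n = (v.shape[0] // k) * k                      # drop frames that do not fill a group
    return v[:n].reshape(-1, k, *v.shape[1:]).mean(axis=1)

def split_blocks(frames, L=60):
    """Return (n_blocks, T, L*L): flattened L x L blocks per frame.
    Partial blocks at the border are cropped (assumption)."""
    T, H, W = frames.shape
    rows, cols = H // L, W // L
    f = frames[:, :rows * L, :cols * L].reshape(T, rows, L, cols, L)
    return f.transpose(1, 3, 0, 2, 4).reshape(rows * cols, T, L * L)

def fit_pca(X, n_components=10):
    """X: (n_frames, n_pixels). Returns (mean, components) with
    components of shape (n_pixels, n_components)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T

def knn_predict(train_Z, train_y, test_Z, k=3):
    """Label each test frame by majority vote of its k nearest training
    frames under the Euclidean distance (labels: 0 normal, 1 anomalous)."""
    d = np.linalg.norm(test_Z[:, None, :] - train_Z[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return (train_y[nearest].mean(axis=1) > 0.5).astype(int)

def classify_block(test_frames, S, train_blocks, train_labels, k=3):
    """test_frames: (T, L*L). S: list of (mean, components), one per
    selected training block. Returns 1 if the block is anomalous."""
    X_train = np.vstack(train_blocks)
    y_train = np.concatenate(
        [np.full(len(b), lab) for b, lab in zip(train_blocks, train_labels)]
    )
    anomalous_votes = 0
    for mean, comps in S:
        Z_train = (X_train - mean) @ comps         # project training frames
        Z_test = (test_frames - mean) @ comps      # project test frames
        frame_labels = knn_predict(Z_train, y_train, Z_test, k)
        if frame_labels.mean() > 0.5:              # more than 50% of frames anomalous
            anomalous_votes += 1
    return int(anomalous_votes > len(S) / 2)       # anomalous under most transformations
```

With the RAVI settings (L = 60, H = 10, k = 3), `split_blocks` applied to 480 × 640 frames yields 80 blocks per frame, each of which would be passed to `classify_block` together with the set S built from the selected training blocks.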

In order to evaluate the accuracy of the proposed method on the test videos with the RAVI format, a confusion matrix is provided for each video. The actual class of each block shows whether the block includes leakage or not, and the predicted class of each block in each video shows the result of the proposed method in classifying the blocks in the video. Table 4 summarizes the results of the accuracy for each video.
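As an illustration of how such a per-video confusion matrix and accuracy can be computed from block labels, the following Python sketch may help. The function name `block_confusion` and the 0/1 label encoding (1 = anomalous block, 0 = normal block) are assumptions for this example, and the labels used below are made up rather than taken from Table 4.

```python
import numpy as np

def block_confusion(actual, predicted):
    """Confusion-matrix counts and accuracy for the blocks of one video.
    actual, predicted: sequences of 0/1 labels, one entry per block."""
    actual = np.asarray(actual, dtype=bool)
    predicted = np.asarray(predicted, dtype=bool)
    tp = int(np.sum(actual & predicted))       # leaking blocks detected
    tn = int(np.sum(~actual & ~predicted))     # normal blocks kept normal
    fp = int(np.sum(~actual & predicted))      # normal blocks flagged as leaking
    fn = int(np.sum(actual & ~predicted))      # leaking blocks missed
    accuracy = (tp + tn) / actual.size
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn, "accuracy": accuracy}

# Hypothetical labels for a five-block toy video (not data from the paper).
result = block_confusion([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
print(result)
```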

《Table 4》

Table 4 Actual class, predicted class, and accuracy of the classification for normal and anomalous blocks in each test video with the RAVI format.

In each test video, there are 80 blocks with a size of 60 × 60. The results show that leakages can be detected in each video with reasonable accuracy. Since the noise in RAVI is less than that in MP4, the accuracy of the proposed method in the RAVI format is higher than that in the MP4 format. Owing to the lower noise, normal videos and normal blocks are not misclassified as anomalous. Furthermore, the data in the RAVI format are more robust to the fading effect; in addition, drops are more detectable when they reach the bottom side of the frames, and they are not eliminated as noise. Therefore, in this format, the trajectory of the leakage can be detected more precisely. The trajectory of the leakage is shown in six frames in Fig. 11. As shown in this figure, the angle of the camera is changed, and it is not directly in front of the demonstrator plant. This shows that the leakage detection is independent of the angle of the camera, according to the concept of segmentation and blocking. Furthermore, more than one leakage can be detected at the same time. In this experiment, the smallest size of the drop that can be detected in the images is 3 × 5 pixels with a pixel intensity greater than 0.1 °C, which is approximately 2 × 10^4 times smaller than the size of the frame (640 × 480). In RAVI videos, the proposed algorithm can detect and localize the leakages with a minimum of 120 frames. This number of frames is required to remove noise and correctly classify each frame. The analysis time for leakage detection is 9 s (including image pre-processing) running on a 2.90 GHz CPU with 64 GB of RAM with MATLAB 2018a (MathWorks, USA). Since the videos with RAVI format are less noisy, the minimum number of frames required for leakage detection and the processing time are less than those for the videos with MP4 format. Table 5 provides a summary of the results for the test videos with the RAVI format.
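The figures quoted in this paragraph can be cross-checked with a few lines of arithmetic. The Python sketch below uses only values stated in the text; the variable names are illustrative, and the observation that 60-pixel blocks leave 40 pixel columns uncovered is a consequence of the stated sizes, not something the source explains how to handle.

```python
frame_w, frame_h = 640, 480     # RAVI frame size in pixels
block = 60                      # block side length L
fps = 25                        # frames per second
min_frames = 120                # minimum frames needed for detection
drop_pixels = 3 * 5             # smallest detectable drop

n_pixels = frame_w * frame_h                                 # pixels per frame
blocks_per_frame = (frame_w // block) * (frame_h // block)   # whole blocks only
uncovered_cols = frame_w - (frame_w // block) * block        # columns outside whole blocks
size_ratio = n_pixels / drop_pixels                          # frame size / drop size
min_seconds = min_frames / fps                               # video time for min_frames

print(n_pixels, blocks_per_frame, uncovered_cols, size_ratio, min_seconds)
```

Under these numbers, 10 × 8 = 80 whole blocks fit in a frame, the frame is about 2 × 10^4 times larger than the smallest detected drop, and 120 frames correspond to 4.8 s of video at 25 frames per second.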

《Fig. 11》

Fig. 11. Trajectories of leakages in six consecutive frames in RAVI format. The red circles localize the leaking drops frame by frame and show the path of the leakage. This video includes two leakage positions: one on the left side of the plant starting from the middle and proceeding to the bottom of the frame, and the other on the right side of the plant at the bottom of the frame. (a–f) Consecutive frames (Frames 1–6).

《Table 5》

Table 5 Summary of the results of the proposed method for the classification of test videos in RAVI format.

In this paper, the proposed method is evaluated on two different datasets with different formats, different noise levels, and different frame sizes in order to demonstrate the applicability of the method in different formats, noise levels, and sizes. This evaluation shows that having different formats or different sizes does not change the main steps (presented in Fig. 7) of the proposed method. Therefore, this method can be applied to other formats as well. If the new format imposes more noise on the video, then it will reduce the accuracy of the method. Similarly, changing the image size will not change the main algorithm. However, the block sizes can be adjusted to the sizes of the images. This was considered for the two formats used in this paper. The RAVI format has larger image sizes than the MP4 format; therefore, the block sizes were larger as well.

《5.3. Evaluation of the results and hypotheses based on the defined requirements》

5.3. Evaluation of the results and hypotheses based on the defined requirements

As stated in Section 2, an applicable and reliable leakage-detection mechanism should meet certain requirements. Since the proposed method in this paper is based on visual inspection, it can provide safe and remote inspection by making it possible to install the camera within the chemical plant and perform the inspection remotely through the videos (Requirement R1). Afterward, the inspection can be performed automatically by means of image analysis without human inspection (Requirement R2). The results of the evaluation for two different dataset formats show that the proposed method for automatic visual inspection for leakage has reasonable detection accuracy, and the position of the leakages in the images can be detected correctly (Requirement R3). Furthermore, with the proposed image pre-processing step and the noise-removal mechanism, the method is robust to environmental noise; the results for the MP4 dataset show that it is also robust to additional noise, such as the noise resulting from video compression (Requirement R4). As discussed in the evaluation of the two different datasets, this method is suitable for the detection of small drops as well. The size of the detected drops in pixels is quite small in comparison with the size of the whole image (Requirement R5). Since the method is based on image segmentation, more than one leakage can be detected at the same time, and the method is not limited by the number of leaking positions; in addition, it is not necessary to install more cameras to detect more leakages (Requirement R6). Furthermore, as seen in the evaluation, the position and the trajectory of the leakage can be detected (Requirements R7 and R8). The proposed method can detect leakages within a reasonable time period, especially when the environment is less noisy and the input data is in RAVI format (Requirement R9).
In the evaluation with the industrial expert, the accuracy of the method and the time required for detection were satisfactory, and the industrial expert found the method to be a practical approach for real application in industrial plants. Finally, leakage detection based on machine vision techniques does not require information about the specific characteristics of the liquid in the pipes, such as the density of the liquid. Therefore, it can be applied to any type of liquid inside the pipes (Requirement R10), as long as the liquid has a different temperature than the environment.

Based on the achieved results and the evaluation with the industrial partner, the use of IR cameras and machine vision techniques can provide a promising framework for automatic leakage inspection (H1) in chemical process plants that is independent of direct human inspection and suitable for remote operation. In comparison with the current inspection method of our industrial partner, which is based on human inspection, the proposed method can save time, cost, and effort. Furthermore, it can provide reliable inspection (H2) in terms of robustness to noise (H2.1), accuracy, detection within a reasonable time period (H2.2), and correct position and trajectory detection (H2.3). Therefore, the hypotheses defined in Section 2 can be confirmed.

《5.4. Strengths, validity, and limitations of implementation of the proposed method in a real large-scale chemical plant》

5.4. Strengths, validity, and limitations of implementation of the proposed method in a real large-scale chemical plant

The proposed method for the automatic visual inspection of a chemical process plant was implemented and evaluated for a demonstrator plant with one camera. However, the main challenge that remains is to extend this method for use in a real large-scale chemical plant. As stated in H3, by considering several fixed cameras in different positions at various locations within a real large-scale plant, the method can be extended to a real application. In order to evaluate this hypothesis, the limitations and threats to the validity of the method in a real industrial plant are discussed as follows:

• The first main concern about the proposed application of IR cameras for leakage detection is that they are poorly suited to inspecting an outdoor plant. They can be affected by several factors such as weather conditions, sun, wind, and so forth [16]. For example, the assumption in the proposed method regarding the vertical direction of the leakage from the pipes would be affected by strong wind in an outdoor plant; similarly, unexpected noise caused by reflection from different surfaces under heavy sunlight would affect the precision of the method. Therefore, the proposed method is primarily suitable for indoor plants.

• In real application in an indoor large-scale plant, placing several fixed cameras at various locations within the plant is a way to implement the proposed visual inspection method. Each camera can observe a specific part of the plant, and the vision-based algorithm can be implemented for each specific part of the plant that is observed by a single camera. However, if the plant is huge, the possibility of installing many cameras to observe all parts of the plant should be investigated as well. Another possibility of implementing a vision-based system in real application would be drones with IR cameras. However, the motion of the drone can affect the precision of the detection algorithm; furthermore, the application of drones in an area with a high safety risk is still questionable [18,19]. Moreover, a study in Ref. [45] showed that mobile leak-detection platforms can act as a complement rather than as a substitute; therefore, identifying the applications of such a platform will be very critical for deployment on a large scale.

• IR cameras can capture a minimum temperature difference in the range of 0.05–0.10 °C [16]. In the proposed method, it is assumed that the leaking liquid has a temperature difference in comparison with its surroundings that is within the range that can be captured by IR cameras. Therefore, if this assumption does not hold in a chemical plant, the leaking liquid will not be detected by IR cameras and the proposed method will no longer be applicable. Furthermore, considering the fading problem described in Subsection 5.1, leaking drops are not observable when they lose their temperature difference. In that case, the leaking liquid is only observable at the start position of the leakage.

These limitations of the proposed method are threats to the validity and confirmation of H3. In a situation in which these limitations can be overcome in a real case of industrial use, the proposed method will be applicable for a real chemical plant.

《6. Conclusions and outlook》

6. Conclusions and outlook

In this work, an approach based on machine vision techniques was proposed for the detection and localization of leakages from the pipelines of a chemical process plant. Two different video datasets with different formats (MP4 and RAVI) were provided from an industrial demonstrator plant. These videos were taken by an IR camera while the plant was operating under normal conditions and during an anomalous condition with leakages. The camera was a typical IR camera without any special software or extra hardware. The method implemented in this research analyzes the provided video datasets with reasonable accuracy and detection time. The image pre-processing step and noise filter can substantially improve the image quality. Furthermore, correct segmentation can help in detecting and localizing leakages. Feature extraction based on PCA in each block can preserve a high percentage of the information in subtracted frames to reveal the effect of leakages while simultaneously reducing the size of the images significantly. The trajectories of the leakages can be detected by the segmentation and the block-PCA method introduced in this contribution. The results showed that leakages and their trajectories could be detected with high accuracy in the provided datasets. Furthermore, in an evaluation with an industrial partner, the accuracy of the proposed approach and the detection time were found to be satisfactory, and the industrial partner considered the method to be an applicable approach for real industrial plants. In future work, other possible methods such as optical flow for image pre-processing and deep learning for classification will be used to improve the accuracy and detection time of this method. Moreover, one of the remaining challenges for industrial application is the question of how best to deploy this approach for large-scale plants.
The approach can be realized by installing several cameras in different fixed positions to observe all parts of the plant, or a single camera can be deployed on a drone with machine vision capabilities.

Further research in this area will focus on the opportunities and challenges in deploying such a vision-based system in a real industrial environment. The detection time of this method can also be improved in order to make it more suitable for real-time application. In addition, the image pre-processing step and the filter algorithms need revision to make them applicable to new datasets with different noise characteristics. If the leakage does not occur only in the vertical direction, the image pre-processing step should be enhanced by defining suitable neighborhood patterns to detect leakage in any direction. By considering different patterns of the leakage, such as liquid spreading on the ground, the proposed method can be extended to include a wider variety of leakage patterns in vision-based leakage detection.

《Acknowledgements》

Acknowledgements

This research is part of the project Scalable Integration Concept for Data Aggregation, Analysis and Preparation of Big Data Volumes in Process Industry (SIDAP), funded by the German Federal Ministry for Economic Affairs and Energy (BMWi) (01MD15009F). The Institute of Automation and Information Systems thanks its industrial partners for supporting this research.

《Compliance with ethics guidelines》

Compliance with ethics guidelines

Mina Fahimipirehgalin, Emanuel Trunzer, Matthias Odenweller, and Birgit Vogel-Heuser declare that they have no conflict of interest or financial conflicts to disclose.