1. Introduction

Energy-storage systems such as battery modules for new energy vehicles (NEVs) are gaining extensive attention [1,2] as a means of replacing traditional gas (petrol/diesel)-operated vehicles and thereby promoting a cleaner environment. The performance parameters of lithium (Li)-ion battery modules include energy density, capacity, and specific power. To meet the power demand required for the transmission systems of NEVs, several small battery modules are used in series or parallel to form a large battery module (also known as a battery pack). A battery module consists of a number of cells connected in series and parallel. The range of an NEV depends on the performance of its battery module, and the performance of a battery module depends on the individual performance of each cell and on the configuration of the cells in series or parallel. The ideal performance of a battery module should follow the criteria of uniformity and equalization; however, these criteria have not yet been satisfactorily met.

During the mass manufacturing of cells and the assembly of cells into modules, slight variations occur due to uncertainties in the manufacturing conditions [3]; these may include differences in the performance of the electrode materials, changes in operating conditions, or geometrical variation caused by machining errors [4]. These uncertainties can cause defects in the battery modules such as surface scratches, exposed foils, and cracks. Manufacturing defects lead to variations in the performance parameters (i.e., capacity and voltage) of the cells connected in series or parallel within a module. Over time, these variations accumulate, resulting in uneven temperature distribution and incomplete charge/discharge of several cells in the module, which in turn reduces the available capacity [5–7].

Uniformity and equalization criteria, if adopted during the design and manufacturing of a battery module, can avoid problems such as overheating and thermal runaway, and thus increase the life of the battery module [8–12].

In order to solve these problems, some battery-sorting methods have been researched [13–15]. Gallardo-Lozano et al. [16] summarized the different active methods for a battery equalization system, and concluded that the switched capacitor and double-tiered switching capacitor methods are the best sorting methods. Kim et al. [17] proposed an approach based on a screening process (capacity screening and resistance screening) to improve the utility of a Li-ion series battery module. In subsequent research, they proposed a practical universal modeling of multi-cell battery strings arranged in series and parallel configurations [18]. Kim et al. [19] proposed a modularized two-stage charge equalizer with cell selection switches. The advantage of this sorting method is that it can be widely used for a large number of Li-ion cells in a hybrid electric vehicle (HEV). In addition, five sorting methods (namely, capacity and alternating current internal resistance, electrochemical impedance spectroscopy (EIS), voltage curve, dynamics parameters, and thermal behavior) are compared in Ref. [20]. It was found that low-frequency battery impedance is the most suitable method for sorting batteries by their dynamic characteristics.

Previous studies [21–36] have addressed the selection and classification of homogeneous cells. Based on experimental verification, the sorted cells have a more consistent performance in terms of voltage, temperature, and capacity in comparison with unsorted cells. However, little research has focused on conducting experiments. Therefore, the present work combines experimental and numerical methods to conduct a comprehensive investigation on the clustering of battery cells with similar performance in order to design a battery module with higher electrochemical performance. Fig. 1 illustrates the procedures that were used to perform the clustering analysis and verify the performance of the designed modules. Charging–discharging tests were conducted on 48 Li-ion cells to measure their voltage, temperature, and capacity. The k-means clustering and support vector clustering (SVC) algorithms were used to group cells with similar performance in order to produce a battery module. A comparative analysis was performed on the performance of the battery modules produced in this research and the performance of those purchased from a manufacturer.

《Fig. 1》


Fig. 1. A comprehensive procedure for the design and manufacture of a battery module.

2. Experimental setup for data measurement

This section describes the charging–discharging tests that were conducted on 48 Li-ion cells for the measurement of data (voltage, temperature, and capacity). The 48 cells were obtained by dismantling the battery pack shown in Fig. 2(a).

The disassembly process of the battery modules was performed in four steps:

Step 1: Obtain information on the battery modules such as capacity, cell numbers, and connection modes between cells.

Step 2: Identify the output terminal of the battery module once the module is unpacked. This step should be done carefully to avoid any connection between the negative and positive poles of the battery module.

Step 3: Break the series connection first. In order to ensure safety, the battery module was split up into small parts by destroying the series and parallel connections.

Step 4: Split the small parts into cells.

After dismantling the battery module, charging–discharging tests were conducted by means of a battery-testing system (Fig. 2(b)). The battery-testing system mainly includes the battery-testing device, a data-collecting system, and the Li-ion cells. The battery-testing device was purchased from Newware Ltd.; it has eight channels and can save data automatically. The steps for testing the charging–discharging process are summarized in Table 1:

Step 1: The constant current discharge was set at 1.3 A.

Step 2: This step began at the point when the voltage of the Li-ion battery was 2.75 V, and involved a resting time of 30 min.

Step 3: A constant current and constant voltage charge was set with a cut-off voltage of 4.2 V.

Step 4: This step again involved resting for 30 min.

Step 5: The number of cycles was set as 20.

Throughout the process, the voltage was not permitted to fall outside the range of 2.65–4.3 V. Each cell was charged–discharged for at least 30 cycles. The data collected from the experiment are shown in Table 2, where "zero" and "full" refer to fully discharged and fully charged states, respectively. The following section describes how clustering algorithms were used to analyze the collected experimental data.
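The test procedure described above can be written down as a small data structure, which makes the safety window easy to check programmatically. The following is only an illustrative sketch: the class, field, and function names are hypothetical and do not correspond to the software of any battery tester, and reading 2.75 V as the discharge cut-off voltage is an assumption drawn from the description of Step 2.

```python
from dataclasses import dataclass
from typing import Optional

# Safety window quoted in the test description, in volts.
V_MIN_SAFE, V_MAX_SAFE = 2.65, 4.30


@dataclass(frozen=True)
class Step:
    """One stage of the charge-discharge cycle (hypothetical structure)."""
    name: str
    current_a: Optional[float] = None   # constant current, where specified
    cutoff_v: Optional[float] = None    # voltage that ends the stage
    rest_min: int = 0                   # resting time after the stage


# Assumption: 2.75 V is treated as the discharge cut-off voltage.
PROTOCOL = [
    Step("constant-current discharge", current_a=1.3, cutoff_v=2.75, rest_min=30),
    Step("constant-current constant-voltage charge", cutoff_v=4.2, rest_min=30),
]
CYCLES = 20  # number of cycles set in Step 5


def within_safety_window(voltage_v: float) -> bool:
    """Return True if a voltage lies inside the permitted range."""
    return V_MIN_SAFE <= voltage_v <= V_MAX_SAFE


def protocol_is_safe(steps) -> bool:
    """Check that every cut-off voltage sits inside the safety window."""
    return all(within_safety_window(s.cutoff_v)
               for s in steps if s.cutoff_v is not None)
```

Calling `protocol_is_safe(PROTOCOL)` confirms that both cut-off voltages (2.75 V and 4.2 V) lie within the 2.65–4.3 V window.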

《Fig. 2》

Fig. 2. (a) Dismantling and disassembly process for battery modules; (b) battery-testing system used for conducting charging–discharging tests.

《Table 1 》

Table 1 Steps for testing the 18650 Li-ion battery.

Maximum safety voltage: 4.3 V; minimum safety voltage: 2.65 V; start experiment steps: constant current discharge.

《Table 2》

Table 2 Data obtained from charging–discharging tests on cells.

3. Clustering algorithms

Supervised learning and unsupervised learning are two categories of machine learning methods. Supervised learning is generally used for classification, while unsupervised learning is employed for clustering. Clustering algorithms are a broad set of techniques for grouping data according to different rules; many excellent descriptions can be found in Refs. [37–39]. The purpose of clustering analysis is to group data into several classes according to certain rules. These classes are not given in advance, but are determined by the characteristics of the data. The data in the same class tend to resemble each other in some sense, whereas the data in different classes tend to differ.

3.1. k-means clustering algorithm

MacQueen proposed the k-means clustering algorithm in 1967 [39]. As this algorithm is simple and easy to understand and has a relatively fast calculation speed, it is usually used as the preferred algorithm for the cluster analysis of large samples [40].

The main steps of the k-means clustering algorithm are as follows:

Step 1: k samples are randomly selected as the initial cluster centers.

Step 2: The distances between each remaining data sample and every initial cluster center are calculated, and each sample is assigned to the cluster domain of its nearest cluster center.

Step 3: After all the data are assigned, the mean of the data in every cluster is recalculated, and this mean becomes the new cluster center.

Step 4: Steps 2 and 3 are iterated until the cluster centers of two consecutive iterations are the same, indicating that the data have been classified into k clusters.

The sum of squared errors is a commonly used evaluation criterion. It is the sum of the squared Euclidean distances from the data samples in each cluster to the corresponding cluster center $m_l$, and can be expressed as follows:

$$J = \sum_{l=1}^{k} \sum_{x_i \in C_l} \lVert x_i - m_l \rVert^2 \tag{1}$$

where $X = \{x_1, x_2, \ldots, x_n\} \subset \mathbb{R}^d$ is a dataset, $\mathbb{R}^d$ is the data domain, $k$ is the number of clusters, and $C_l$ is the cluster domain whose cluster center is $m_l$. The cluster center $m_l$ can be calculated by the following:

$$m_l = \frac{1}{n_l} \sum_{x_i \in C_l} x_i \tag{2}$$

where $n_l$ is the number of data samples in cluster domain $C_l$.

The objective function in Eq. (1) represents the sum of the squared errors between all the data in the $k$ clusters and their cluster centers $m_l$. A smaller value of $J$ indicates better data concentration in the clusters; that is, a better clustering result.
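The four steps and the sum-of-squared-errors criterion can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in this work; the function names, the fixed random seed, and the rule that an empty cluster keeps its previous center are assumptions made for the example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def mean_vector(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def k_means(data, k, max_iter=100, seed=0):
    """Return (labels, centers, sse) for a list of numeric vectors."""
    rng = random.Random(seed)
    # Step 1: pick k distinct samples as the initial cluster centers.
    centers = [list(p) for p in rng.sample(data, k)]
    labels = [0] * len(data)
    for _ in range(max_iter):
        # Step 2: assign every sample to its nearest center.
        labels = [min(range(k), key=lambda l: squared_distance(x, centers[l]))
                  for x in data]
        # Step 3: recompute each center as the mean of its cluster.
        new_centers = []
        for l in range(k):
            members = [x for x, lab in zip(data, labels) if lab == l]
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(mean_vector(members) if members else centers[l])
        # Step 4: stop when the centers no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    # Sum of squared distances from each sample to its assigned center.
    sse = sum(squared_distance(x, centers[lab]) for x, lab in zip(data, labels))
    return labels, centers, sse
```

Because the initial centers are drawn at random, running the function with different `seed` values can give different partitions and different values of the objective, which is the instability discussed below.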

Although the k-means clustering algorithm is practical and simple to implement, it has some limitations. First, determining a reasonable value of k is difficult. Second, the randomness of selecting initial clustering centers may result in instability of the clustering results. Third, this algorithm is sensitive to noise. A self-organizing map based on a neural network can also be used for clustering. However, it is necessary to train the neural network, which can make this process time-consuming. Therefore, the next section introduces a better and more efficient clustering algorithm.

3.2. The SVC algorithm

In general, a support vector machine (SVM) is adopted for classification (supervised learning). SVC is a slightly different algorithm from an SVM. In fact, SVC is an unsupervised learning clustering algorithm.

The main idea of SVC is to map data space to a high-dimensional feature space using a Gaussian kernel function. Next, a sphere with a minimum radius is obtained and the sphere contains most of the mapped data [41,42]. After being mapped back to the data space, the sphere can be separated into several parts, each containing a single cluster point set.

In this paper, a robust and efficient cluster marking method is adopted, which is based on the training kernel radius function. This method has two stages. The first stage involves dividing the dataset into several mutually exclusive groups, each of which is a cluster. The second stage involves marking all data samples.

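The support vector description of the dataset is the foundation of the SVC algorithm. The data samples are mapped to a high-dimensional feature space through a nonlinear transformation, and the sphere of minimum radius that encloses the mapped data samples is identified. The above steps are equivalent to the following optimization problem:

$$\max_{\beta} W = \sum_{j} \beta_j \, \Phi(x_j) \cdot \Phi(x_j) - \sum_{i,j} \beta_i \beta_j \, \Phi(x_i) \cdot \Phi(x_j), \quad \text{s.t. } 0 \le \beta_j \le C, \; \sum_{j} \beta_j = 1 \tag{3}$$

where $\Phi$ represents the nonlinear mapping, $\beta_j$ is the Lagrange multiplier, and $C$ is a regularization constant. Only the samples that satisfy $0 < \beta_j < C$ lie on the boundary of the sphere. When $\beta_j = C$, the samples are located outside the boundary. The Gaussian kernel function is used to calculate the dot product $\Phi(x_i) \cdot \Phi(x_j)$:

$$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j) = \exp\left(-q \lVert x_i - x_j \rVert^2\right) \tag{4}$$

where $q$ is the width parameter, and Eq. (3) can be re-expressed as follows:

$$\max_{\beta} W = \sum_{j} \beta_j K(x_j, x_j) - \sum_{i,j} \beta_i \beta_j K(x_i, x_j) \tag{5}$$

At each point $x$, the trained kernel radius function $f(x)$ is defined, through the solution of the Wolfe dual form, as the squared distance of $\Phi(x)$ from the center of the sphere in the feature space:

$$f(x) = R^2(x) = \lVert \Phi(x) - a \rVert^2 \tag{6}$$

where $R(x)$ is the distance from each $\Phi(x)$ to the center of the sphere and $a$ is the center of the sphere. Considering the kernel definition, the following equation can be obtained:

$$f(x) = K(x, x) - 2 \sum_{j} \beta_j K(x_j, x) + \sum_{i,j} \beta_i \beta_j K(x_i, x_j) \tag{7}$$

A notable feature of the trained kernel radius function is that the cluster boundaries can be constructed from a set of contours that enclose the samples in data space: for a support vector $x_s$ with $\hat{R} = R(x_s)$, the level set of $f$ is separated into several disjoint sets:

$$L_f(\hat{R}^2) = \{x : f(x) \le \hat{R}^2\} = \Omega_1 \cup \Omega_2 \cup \cdots \cup \Omega_p \tag{8}$$

where $\Omega_i$ ($i = 1, \ldots, p$) is the connected set corresponding to different clusters.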
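To make the role of the multipliers and of the kernel radius function more concrete, the sketch below approximates the dual solution for a Gaussian kernel and then evaluates the squared feature-space distance of an arbitrary point from the sphere center. It is an illustration under stated assumptions, not the solver used in this work: the Frank–Wolfe iteration is a simple stand-in for a proper quadratic-programming routine, the function names are hypothetical, and the sample points are invented. The cluster-labeling stage is not shown.

```python
import math


def gaussian_kernel(x, y, q):
    """K(x, y) = exp(-q * ||x - y||^2)."""
    return math.exp(-q * sum((a - b) ** 2 for a, b in zip(x, y)))


def solve_dual(data, q, C, iters=500):
    """Approximate the multipliers beta with Frank-Wolfe iterations.

    Since K(x, x) = 1 for a Gaussian kernel and sum(beta) = 1, maximizing W
    is the same as minimizing beta^T K beta subject to 0 <= beta_j <= C.
    """
    n = len(data)
    if n * C < 1.0:
        raise ValueError("infeasible: need n * C >= 1")
    K = [[gaussian_kernel(xi, xj, q) for xj in data] for xi in data]
    beta = [1.0 / n] * n                      # feasible start when n*C >= 1
    for _ in range(iters):
        Kb = [sum(K[i][j] * beta[j] for j in range(n)) for i in range(n)]
        # Linear step: put as much weight as allowed on the smallest gradients.
        s, remaining = [0.0] * n, 1.0
        for i in sorted(range(n), key=lambda i: Kb[i]):
            s[i] = min(C, remaining)
            remaining -= s[i]
        d = [si - bi for si, bi in zip(s, beta)]
        dKb = sum(di * kbi for di, kbi in zip(d, Kb))
        dKd = sum(d[i] * K[i][j] * d[j] for i in range(n) for j in range(n))
        if dKb >= -1e-12 or dKd <= 0.0:
            break                             # no descent direction left
        step = min(1.0, -dKb / dKd)           # exact line search, clipped to 1
        beta = [bi + step * di for bi, di in zip(beta, d)]
    return beta, K


def radius_squared(x, data, beta, K, q):
    """Squared feature-space distance from the image of x to the sphere center."""
    n = len(data)
    cross = sum(beta[j] * gaussian_kernel(data[j], x, q) for j in range(n))
    const = sum(beta[i] * beta[j] * K[i][j] for i in range(n) for j in range(n))
    return 1.0 - 2.0 * cross + const          # K(x, x) = 1 for a Gaussian kernel


# Invented two-dimensional points, used only to exercise the functions.
cells = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3], [3.0, 3.0], [3.2, 2.9], [2.9, 3.2]]
beta, K = solve_dual(cells, q=0.2, C=1.2)
```

A point far from every sample, such as `[10.0, 10.0]`, yields a larger value of `radius_squared` than any of the samples themselves, which is the property that the contour-based cluster boundaries rely on.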

Although it may be difficult to determine the appropriate kernel parameters in the selection of the model, SVC has some obvious advantages over other clustering algorithms: ① It can generate arbitrary cluster boundary shapes; ② it has flexible boundary changes to handle outliers; and ③ it avoids explicit calculations in the high-dimensional feature space and is therefore effective for large datasets.

3.3. Clustering results

Supervised learning methods require training sets and test sets. Such methods identify the rules in the training set and then apply these rules to the test set. In contrast, unsupervised learning has no training set or test set; rather, it looks for rules only in a set of data. In this research, the six parameters measured in the charged and discharged states (Table 2) were used as the input vectors to conduct the clustering analysis. The output is the clustering results, which were then verified experimentally.

This section mainly focuses on the clustering analysis of the data in Table 2. In this study, we chose voltage, temperature, and capacity as the inputs. Of course, researchers can also choose other parameters, so this choice of inputs is just one option rather than a standard. In this paper, the k-means clustering and the SVC algorithms are considered. In the SVC approach, the kernel argument q and the regularization constant C are set as 0.2 and 1.2, respectively. In the k-means clustering approach, the number of clusters is set as 4. The results of the clustering analysis are shown in Table 3, where the column labeled ‘‘un-clustering” represents the comparison group that is produced by randomly selected cells out of all the cells.

《Table 3》

Table 3 Clustering analysis results.

Based on the clustering analysis results, the changes in voltage, temperature, and capacity in the charge and discharge of the new battery module were calculated. The mean difference and standard difference were calculated by the following:

where denotes the mean difference of the voltage; denote the full voltage and zero voltage, respectively; is the number of cells; and denotes the standard difference of the voltage.
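As a rough illustration of how such statistics might be computed for one parameter of a module, the sketch below assumes that the per-cell difference is the full-state value minus the zero-state value, that the mean difference is the average of these differences over the cells, and that the standard difference is their population standard deviation. These definitions, the function name, and the sample voltages are assumptions made for the example and are not taken from Table 2.

```python
from statistics import fmean, pstdev


def difference_stats(full_values, zero_values):
    """Return (mean difference, standard difference) over the cells of a module.

    Assumption: the per-cell difference is full minus zero, and the standard
    difference is the population standard deviation of those differences.
    """
    diffs = [f - z for f, z in zip(full_values, zero_values)]
    return fmean(diffs), pstdev(diffs)


# Illustrative voltages only (not measurements from Table 2).
full_v = [4.19, 4.20, 4.18, 4.20]
zero_v = [2.76, 2.75, 2.77, 2.75]
mean_dv, std_dv = difference_stats(full_v, zero_v)
```

Under these assumptions, a module built from cells with similar behavior gives a small standard difference, and the same function can be applied to the temperature and capacity columns.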

The results of the mean difference and standard difference are given in Tables 4 and 5, respectively. As can be seen from Table 4, the mean differences of the voltage, temperature, and capacity in the sorted battery module are clearly smaller than those in the unsorted battery module, indicating that the sorted cells share a similar performance. The results in Tables 4 and 5 are also represented in Figs. 3 and 4, respectively. From Fig. 3, it can be seen that the SVC algorithm performed better than the k-means clustering algorithm in the clustering analysis, especially regarding temperature difference.

《Table 4》

Table 4 Mean difference of battery module.

《Table 5》

Table 5 Standard difference of battery module.

《Fig. 3》

Fig. 3. Mean difference of battery module.

《Fig. 4》

Fig. 4. Standard difference of battery module.

4. Experimental verification

In order to verify the results of the clustering, experimental verification was performed. As temperature is the most important parameter affecting the capacity and life of a battery module, an analysis was performed on the temperatures (performance parameter) of the battery modules from four different categories (i.e., two modules purchased from the manufacturer with the same specifications, one SVC-clustered battery module, and one k-means-clustered battery module produced from the grouping of cells). The experimental setup is shown in Fig. 5. Air cooling was supplied from the bottom for the modules of each category. The temperature was observed every 5 min over the cycle as the module was charged–discharged at the same rate. Fig. 6 clearly shows that the battery module corresponding to Category 3 (the SVC-clustered battery module) presented the best performance, with a maximum observed temperature of 32 °C. By contrast, the maximum observed temperatures of the other battery modules were higher, at 40 °C for Category 1 (manufacturer), 36 °C for Category 2 (manufacturer), and 35 °C for Category 4 (k-means-clustered battery module). As the SVC-clustered battery module underwent the least heating, it is expected to have a longer life-cycle than the modules in the other categories. A plausible reason for this result is that the selection of cells with similar performance that was made when producing the module resulted in an equalized temperature distribution within the module, which consequently lowered the rise in temperature in comparison with the modules in the other categories.

《Fig. 5》

Fig. 5. Experimental setup for the verification of the produced battery modules.

《Fig. 6》

Fig. 6. Temperature variation of the battery modules at six different positions in a charging–discharging cycle from four categories: (a) Category 1 (manufacturer); (b) Category 2 (manufacturer); (c) Category 3 (SVC-clustered battery module); (d) Category 4 (k-means-clustered battery module).

5. Conclusions

To achieve uniformity and equalization of the Li-ion cells used in a battery module for NEVs, we combined experimental and numerical methods to conduct a comprehensive investigation on the clustering of battery cells with similar performance in order to design a battery module with better electrochemical performance. Charging–discharging tests were performed on 48 cells. Clustering algorithms were then employed to group the cells and produce two kinds of battery modules (an SVC-clustered battery module and a k-means-clustered battery module). The performances of the battery modules created using clustering algorithms were compared with the performances of the two modules purchased from a manufacturer. The SVC-clustered battery module exhibited the best performance, with a maximum observed temperature of 32 °C. By contrast, the maximum observed temperatures of the other battery modules were higher, at 40 °C for Category 1 (manufacturer), 36 °C for Category 2 (manufacturer), and 35 °C for Category 4 (k-means-clustered battery module). A plausible reason for this finding is that the selection of cells with similar performance during the production of the module resulted in an equalized temperature distribution within the module, which consequently lowered the rise in temperature in comparison with the modules in the other categories.

The k-means clustering algorithm performance may vary depending on the data used. However, for the SVC algorithm, if the data are given, the clustering results are only affected by the SVC parameter settings. Furthermore, since SVC avoids explicit calculations in the high-dimensional feature space, it is effective for large datasets. It can easily be applied in industrial contexts in which the electric vehicles comprise hundreds of packs.

In order to minimize battery manufacturing defects, the processing technology and assembly level can be improved; alternatively, the ability to detect defects can be improved. However, manufacturing defects do exist. Although the proposed approach may appear to be overly lengthy for incorporation before the design stage, it is worth noting that an alternative application of the proposed method could be for battery recycling. Since batteries contain chemical substances and heavy metals, their disposal can cause environmental pollution and a waste of resources. However, old batteries still have various levels of capacity that can be used in other areas. Future work can focus on conducting large-scale testing on cells in order to design a larger battery module, as well as on performing experimental verification on the performance of probabilistic methods [43,44], extreme machine learning methods [45,46], and artificial-intelligence-based methods [47–50].



This work was supported by the National Natural Science Foundation of China (51675196 and 51721092) and the program for HUST Academic Frontier Youth Team (2017QYTD04). The authors acknowledge the grant (DMETKF2018019) from the State Key Lab of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology; the Sailing Talent Program and the Guangdong University Youth Innovation Talent Project (2016KQNCX053) supported by the Department of Education of Guangdong Province; and the Shantou University Scientific Research Funded Project (NTF16002).

Compliance with ethics guidelines

Wei Li, Siqi Chen, Xiongbin Peng, Mi Xiao, Liang Gao, Akhil Garg, and Nengsheng Bao declare that they have no conflict of interest or financial conflicts to disclose.