1. Introduction
A vehicle platooning system uses communication technology to organize multiple vehicles to drive in a queue in order to improve traffic efficiency [1]. As depicted in Fig. 1, vehicles in a platoon are permitted to travel in formation and maintain a very close distance between each other, while the rear vehicles are capable of autonomous driving. Decreasing the distance between vehicles can significantly reduce air resistance, reduce fuel consumption, and increase road capacity at the same time [2], [3].
Platoon technology has been progressively developed from cruise control (CC) technology via the adaptive CC (ACC) and cooperative ACC (CACC) technologies; accordingly, its degree of driving automation, as specified in the Society of Automotive Engineers (SAE)’s standard J3016 [
4], is gradually improving. The fundamental CC function can be regarded as a level 1 (L1) driver assistance function [
4]. Typically, a CC system employs an electronic throttle to track a fixed speed preset by a driver in a closed loop using straightforward logic. When there is an obstacle in front, the driver must take over control of the vehicle, so this function is typically only used on a highway. In comparison, the upgraded ACC system can be considered a level 2 (L2) partial autonomous driving function. The ACC system has enhanced ability to perceive a vehicle in front, which allows the ego vehicle to maintain a safe distance behind a preceding vehicle. The further upgraded CACC [
5] and platoon [
6] technologies, which are at level 3 (L3; conditional driving automation) and level 4 (L4; high driving automation), are highly automated driving technologies because they free the driver from controlling the accelerator, brake, and steering wheel. In a CACC system, the front (leader) vehicle is not only passively perceived by the rear (follower) vehicle but also actively communicates with it, sending its own speed, steering wheel angle, and accelerator and brake signals. In this way, the following vehicles can make better-informed decisions. In comparison with a CACC system, a platoon system can further reduce the distance between vehicles to 0.4 m, which greatly reduces air resistance and improves road traffic efficiency [
7]. This collaborative driving technology can share perception and decision-making information, significantly enhancing the perception ability of the autonomous driving system. However, platoon technology is limited to specific scenarios of vehicle formations, so it is not expected to be developed into a level 5 (L5; full driving automation) technique.
Longitudinal spacing control and communication technologies are the foundation for platoon driving [
8], [
9], [
10], making them the primary focus of early research in this area. In recent years, however, techniques for platoon scheduling and planning that can improve platoon operation efficiency have been extensively studied. The first technique to be studied in scheduling and planning was longitudinal velocity planning, which evolved from longitudinal spacing control [
11]. Unlike single-vehicle velocity planning [
12], vehicle platoon velocity planning must also account for vehicle spacing and string stability. Next, the velocity planning of platoon grouping and separation was further expanded [
13]. The velocity planning of a vehicle platoon passing a traffic light is a similar research field but is restricted to passing through green lights [
14]. Longitudinal velocity planning was further extended to the field of lateral trajectory planning. Due to the long total length of a vehicle platoon, it is a challenge to plan the lateral trajectory of the platoon for lane changing. According to research, establishing dedicated lanes for vehicle platoons is a feasible solution [
15]. Lastly, the related research expanded from the field of microscale vehicle dynamics control to the field of macroscale freight dispatching and vehicle platoon formation scheduling [
16].
In general, the platoon control technique focuses on the relative positional relationship between the vehicles within the platoon; that is, it focuses on a small-scale vehicle platoon system. In comparison, the platoon planning technique pays more attention to the overall future state of the platoon within its environment and makes predictions and adjustments to it. The platoon scheduling technique, on the other hand, pays greater attention to the status of all platoon-related systems in a specific region and makes arrangements so that a large-scale platoon system can operate efficiently as a whole. The overall structure of a platoon scheduling and planning system is summarized in
Fig. 2, with vehicle platoons, roadside controllers, and cloud servers communicating various types of information to achieve efficient scheduling. Longitudinal control of the vehicles in the platoon is usually coordinated by the vehicles, which transmit their respective positions and speed control commands to each other. When the platoon passes through certain road structures, the platoon leader communicates with the roadside controller to obtain the recommended passing speed planned by the controller. The platoon formation function is realized by cloud services. Each individual vehicle sends its own driving task to the server to obtain the driving route in order to join the platoon. In addition, the roadside controller can communicate with the cloud server to achieve precise control of traffic equipment such as traffic lights.
Existing reviews in this field primarily concentrate on the longitudinal spacing control and communication system of the platoon, although some provide summaries that focus on safety or projects. At present, platoon technology has developed to a critical point and is about to enter large-scale commercial use. Many platoon scheduling and planning techniques have been studied in order to realize the exceptional value of platoon technology for commercial freight. However, these scheduling and planning techniques have not been adequately summed up, and it is challenging to integrate them into a system that maximizes the commercial value of platoon technology. Therefore, it is both urgent and necessary to conduct a macro-level review of scheduling and planning techniques for large-scale vehicle platoons. One of the contributions of this paper is to summarize existing platoon-related scheduling and planning techniques to form a comprehensive scheduling and planning system. This paper not only summarizes the methods of scheduling and planning algorithms but also sorts out project applications. This paper is dedicated to answering the following questions:
(1) What methods can be used for scheduling and planning a vehicle platoon?
(2) What are the practical applications of vehicle platoon scheduling and planning?
(3) What shortcomings exist in applications of vehicle platoon scheduling and planning, and how can platoon technology be further implemented?
The scope of this work includes concepts, techniques, applied projects, and future developments related to platoon macro scheduling and planning. The sections of the paper are organized as depicted in Fig. 3. Section 2 provides a summary of the related reviews and practical platoon application projects. In Section 3, the fundamental algorithmic techniques that can be used for vehicle platoon scheduling and planning are introduced, answering the first research question posed above. Section 4 presents the applications of vehicle platoon scheduling and planning. It describes in depth the related technologies of platoon formation, passing traffic lights, changing lanes, and resource allocation, and responds to the second research question. Next, Section 5 introduces the simulation platforms and datasets that can be used for vehicle platoon scheduling and planning. In Sections 6 and 7, we summarize the shortcomings and challenges of vehicle platoon scheduling and planning techniques and propose future development directions in answer to the third research question.
2. A brief history of vehicle platoon research
Vehicle platoon technology has gone through long-term development and is becoming increasingly mature [
17].
Fig. 4 summarizes the research direction of platoon technology. In the beginning, research mainly focused on the analysis and calculation of the convergence of the platoon control algorithm [
18]; next, research was extended to platoon simulations in complex scenarios [
15]. At present, many real-world platoon projects have already begun test runs [
19]. This section summarizes recent reviews related to platoons and practical projects related to platoon scheduling and planning.
In the past few years, with the improvement of communication and autonomous driving technologies, platoon technology has developed rapidly, based on the connected and automated vehicles (CAVs) technique [
11], [
20], [
21], [
22]. Several comprehensive reviews have been conducted on platoon technology, including network communications, lateral and longitudinal control, human factors, and application projects. The biggest difference between the platoon technique and single-car automatic driving is that a platoon must realize mutual communication and signal transmission between its vehicles. Several reviews have summarized communication techniques in a platoon [
9], [
23], [
24], some with a focus on the network security technique [
25], [
26], [
27]. Jia et al. [
23] investigated a platoon-based vehicle cyber-physical system, including basic knowledge of a dynamic traffic and vehicle network system, basic control problems, simulation tools, and open issues. In addition, Ros et al. [
24] investigated vehicle network simulation and modeling. Simulation software includes communication simulation software such as Network Simulator 2 (NS-2), Network Simulator 3 (NS-3) [
28], and Objective Modular Network Testbed in C++ (OMNet++), and traffic-simulation software such as Simulation of Urban Mobility (SUMO) [
29], Corridor Simulation (CORSIM) [
30], and Verkehr In Städten - SIMulationsmodell (VISSIM) [31].
It is worth noting that a platoon is a very suitable scenario for visible light communication (VLC), because the trucks in a platoon usually drive within each other’s field of view. Shaaban and Faruque [
26] investigated the techniques and application of VLC in a platoon system and discussed the network security of vehicle-mounted VLC. Sun et al. [
25] examined communication network security issues in CAV and summarized a variety of network attacks and defense methods.
In terms of software, some reviews have summarized works on multi-agent coordinated longitudinal and lateral control for maintaining a platoon formation [
8], [
9], [
10], [
11], [
18], [
32], [
33]. Guanetti et al. [
11] investigated control and planning techniques, with a focus on methods to improve energy efficiency. From the perspective of network control, a four-element analysis framework of a platoon system has been proposed, including node dynamics, information flow topology, formation geometry, and distributed controllers [
9]. A poor control strategy can cause a small disturbance in one vehicle’s position to be amplified as it propagates along the platoon, eventually causing the platoon to lose stability. “String stability” describes whether a control strategy attenuates such position disturbances rather than amplifying them, and it is a central concept in platoon control. Feng et al. [
18] focused on the string stability problem in vehicle platoon control, strictly analyzed the relationship between different string stability definitions, and compared various analysis methods. Specific longitudinal and lateral platoon maneuver control techniques such as proportional-integral-derivative (PID) control, model predictive control (MPC), sliding mode control, neural network control, and consensus control have been summarized by Badnava et al. [
34].
Other scholars have conducted reviews from the perspective of human influence and acceptance of the platoon technique [
33], [
35], [
36]. Axelsson [
36] investigated the research characteristics, analysis methods, risks, and solutions in the current literature from the perspective of safety. Bevly et al. [
33] summarized the human factor issue of connected autonomous vehicles and introduced the lateral control issue based on lane-changing and merging maneuvers. The cost analysis and business model of the platoon technique are important factors that determine whether the technique can be accepted by people and the market. For the first time, Chen et al. [
37] systematically summarized the business model of a vehicle platoon.
Reviews on practical application projects have also been published [
6], [
19]. Tsugawa et al. [
19] took the energy intelligent transport system (EnergyITS) [
38], KONVOI [
39], PATH [
40], and other projects as examples and analyzed the projects’ applied technologies, impacts, and shortcomings. Sawade and Radusch [
6] introduced practical application projects on cooperative autonomous driving according to functions such as longitudinal control, lateral control, and intersection collision avoidance.
To highlight the main contributions of different existing reviews, we compare various reviews in
Table 1 [
6], [
9], [
11], [
18], [
19], [
22], [
23], [
24], [
26], [
33], [
36], [
41]. We have noticed that most of the existing research investigates control, communication, and human factors at a relatively micro level. Since more macro-level scheduling and planning can highlight the value of platoon techniques, we attempt to summarize platoon techniques from the aspects of scheduling and planning. It can be seen from
Fig. 5 that the number of papers on platoon scheduling and planning techniques has grown steadily in the past decade. More specifically, we summarize platoon scheduling and planning applications such as the platoon map-level formation, velocity planning formation, signalized intersection scheduling, and resource allocation. In all, platoon technology has made considerable progress, which can support platoon scheduling and planning research. However, there is as yet no review that summarizes vehicle platoon scheduling and planning techniques. In the next section, we introduce the general techniques used for scheduling and planning, as a basis for specific platoon scheduling and planning.
3. General techniques for scheduling and planning
In this section, we summarize general techniques used in vehicle platoon scheduling and planning, including rule-based methods, model-based methods, optimization methods, learning-based methods, and multi-agent methods. The brief principles of these methods and their applications in a platoon will be introduced. Initially, scholars devised rules for completing simple scheduling tasks, with heuristic methods and finite state machines (FSMs) serving as the standard approaches. As an increasing number of vehicle and platoon models are built, high-accuracy models are being added to rule-based methods to allow for precise estimations of future states. With the gradual refinement of various evaluation criteria, the problem of optimizing task scheduling has been studied, with integer linear programming (ILP) and dynamic programming (DP) being among the most prevalent techniques. Nonetheless, the increasingly complex environments and sensing devices generate a large amount of observational data, making it increasingly difficult to manually set up rules one at a time to process the massive amount of data. Consequently, learning-based methods have gained in popularity. With the increase of intelligent decision-making traffic participants, it is necessary to use the multi-agent method based on game theory to coordinate and balance the interests of each agent. It is worth noting that, since each method has its own advantages and disadvantages, some scheduling planning methods will be used in combination to obtain good comprehensive scheduling performance, rather than keeping the methods completely mutually exclusive.
3.1. Rule-based methods
Rule-based methods are usually extracted from the experience of human experts. Because their logic is explicit, rule-based methods usually have good interpretability, and the algorithms are usually relatively simple and have good real-time performance. However, because it is impossible to define enough rules to cover every complex real situation, these methods often perform poorly in complex and rare situations, which limits their further development. To solve problems in vehicle platoon scheduling and planning easily and quickly, many rule-based methods such as heuristic algorithms and FSMs are adopted.
3.1.1. Heuristic methods
Heuristic methods usually use the knowledge summarized by human experts to build a heuristic function. At each step, the option with the highest heuristic value is greedily selected, which generally yields a suboptimal rather than a globally optimal solution. In contrast to strict optimization algorithms, the gap between the heuristic solution and the optimal solution is difficult to predict. However, heuristic methods usually have relatively good real-time performance and can be deployed efficiently and conveniently on real-world vehicles.
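As a minimal sketch of this greedy selection, the following Python example pairs vehicles into two-vehicle platoons. The heuristic function `pair_score` (a fixed platooning saving minus a waiting cost proportional to the difference in departure times) and all parameter values are hypothetical and are not taken from any of the cited works.

```python
from itertools import combinations


def pair_score(t_a, t_b, saving=10.0, wait_cost=1.0):
    """Hypothetical heuristic: platooning saving minus the cost of waiting."""
    return saving - wait_cost * abs(t_a - t_b)


def greedy_pairing(departures):
    """Greedily pair vehicles; departures maps vehicle id -> departure time."""
    remaining = dict(departures)
    pairs = []
    while len(remaining) >= 2:
        # Pick the pair with the highest heuristic value at this step.
        best = max(
            combinations(sorted(remaining), 2),
            key=lambda p: pair_score(remaining[p[0]], remaining[p[1]]),
        )
        if pair_score(remaining[best[0]], remaining[best[1]]) <= 0:
            break  # no remaining pair is worth forming
        pairs.append(best)
        del remaining[best[0]], remaining[best[1]]
    return pairs, sorted(remaining)


if __name__ == "__main__":
    demo = {"a": 0, "b": 3, "c": 12, "d": 14, "e": 30}
    print(greedy_pairing(demo))
```

Each step commits to the locally best pair and never revisits it, which is why such a procedure is fast but offers no guarantee on the distance to the optimal matching.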
In general, vehicle scheduling problems can be solved by designing heuristic methods based on human experience [
42], [
43], [
44], [
45], [
46], [
47], [
48]. For example, heuristic algorithms can be used to find an approximate optimal solution for vehicle platoon formation scheduling in a large-scale scenario [
42]. A heuristic algorithm can greatly save computing time on the premise that the result is close to the optimal solution. Heinovski and Dressler [
46] used heuristics to consider a formation of platoons on a highway. One of their contributions was to compare the use of distributed heuristic methods and centralized heuristic methods. Their results showed that the performance of the distributed heuristic algorithm was slightly worse than that of the centralized method, resulting in more vehicle speed adjustments. Heuristic formulas can also be used to determine the vehicle speed at each stage of an intersection, thereby reducing the time needed for vehicles to pass through the intersection [
47]. Although the heuristic method is convenient, efficient, and has good real-time performance, it is usually only used as a backup method to ensure real-time performance in large-scale situations, or used for problems with little optimization space.
3.1.2. Finite state machine
The FSM method, as the name implies, is composed of a finite number of states, each representing a different functional mode. The system can only be in one state at a time, and it switches between states in response to triggered events. An FSM mainly describes the sequence of states that an agent passes through during its life cycle and its responses to external events. For example, the vehicle merging process can be divided into four FSM processes: pace making, simultaneous pair-up, sequential pair-up, and safe-to-merge [
49]. These processes are switched according to the positional relationship of each vehicle. Wang et al. [47], in contrast, focused on specific velocity planning during the merging process. They divided the velocity guidance into four stages: the free driving stage, the first velocity regulation stage, the cruise stage, and the second velocity regulation stage. Amoozadeh et al. [50] not only designed an FSM for the platoon-merging process but also designed FSMs for the platoon split and lane-changing operations. For example, the split process includes the split request, split response, and split execution states. However, the disadvantage of the FSM lies in its finiteness. When encountering increasingly complex environments, new states must be constantly added, and it is difficult for an FSM to deal with a complex environment containing an unbounded number of different situations. A newly added state may conflict with the original states, and too many states make the transition logic very complicated. Therefore, for complex environments in platoon decision-making, the FSM currently tends to be replaced by the reinforcement learning (RL) method [
51], [
52], [
53], [
54].
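The split process described above can be sketched as a table-driven FSM. This is an illustrative sketch only: the three split states follow the description of [50] in the text, whereas the `CRUISING` state, the event names, and the transition table are assumptions introduced here.

```python
from enum import Enum, auto


class State(Enum):
    CRUISING = auto()         # assumed idle state, not named in the text
    SPLIT_REQUEST = auto()
    SPLIT_RESPONSE = auto()
    SPLIT_EXECUTION = auto()


# Hypothetical events; each (state, event) pair maps to exactly one next state.
TRANSITIONS = {
    (State.CRUISING, "want_split"): State.SPLIT_REQUEST,
    (State.SPLIT_REQUEST, "request_sent"): State.SPLIT_RESPONSE,
    (State.SPLIT_RESPONSE, "accepted"): State.SPLIT_EXECUTION,
    (State.SPLIT_RESPONSE, "rejected"): State.CRUISING,
    (State.SPLIT_EXECUTION, "gap_opened"): State.CRUISING,
}


def step(state, event):
    """Return the next state; events not defined for a state are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.CRUISING):
    """Feed a sequence of events through the FSM and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

Every new situation requires new entries in `TRANSITIONS`, which illustrates why the transition logic grows quickly as the environment becomes more complex.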
3.1.3. Artificial potential field (APF)
The APF method is a simple multi-objective path planning method whose potential field function is usually designed according to safety, fuel conservation, speed, and so on. The basic principle of the APF method is to simulate a variety of evaluation functions on the spatial scale as a potential field, thereby simulating the motion of the agent in the space by force. In this way, the motion trajectory of the agent in space is a relatively optimal trajectory that meets multiple evaluation criteria at the same time.
Semsar-Kazerooni et al. [55] were the first to use a one-dimensional (1D) Morse-like potential field function to control the longitudinal spacing of vehicles in a platoon. With this potential field function, the algorithm calculates a virtual repulsive force when a vehicle gets too close to its predecessor, requiring the vehicle actuators to slow the vehicle down and let it fall back to avoid a collision. Conversely, when the spacing is too large, high air resistance also produces high potential energy, which drives the vehicle to accelerate and close the gap. Simulations showed that this control method was more stable and safer than proportional-derivative (PD) control. Furthermore, Semsar-Kazerooni et al. [56] decomposed the potential function into repulsive and attractive potential functions, and replaced the Morse-like function with a polynomial function in the repulsive potential function for design flexibility.
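To make the spacing behaviour concrete, the sketch below uses the textbook Morse potential U(d) = D(1 - exp(-a(d - d0)))^2 as a stand-in for a Morse-like function. The exact function used in [55] is not given here, so this form, the parameter values, and the function names are assumptions.

```python
import math


def morse_potential(gap, d0=10.0, depth=1.0, a=0.3):
    """Textbook Morse potential with its minimum at the desired gap d0 (m)."""
    return depth * (1.0 - math.exp(-a * (gap - d0))) ** 2


def follower_force(gap, d0=10.0, depth=1.0, a=0.3):
    """Virtual force on the follower along its driving direction.

    The follower position x enters through gap = x_leader - x - length, so the
    force -dU/dx equals +dU/dgap: negative (brake) when the gap is too small,
    positive (accelerate) when the gap is too large.
    """
    e = math.exp(-a * (gap - d0))
    return 2.0 * depth * a * (1.0 - e) * e


if __name__ == "__main__":
    for gap in (5.0, 10.0, 15.0):
        print(gap, round(follower_force(gap), 4))
```

The force vanishes at the desired gap, and the repulsive branch grows much faster than the attractive branch, which matches the intent of prioritizing collision avoidance.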
In addition, Gao et al. [57] expanded the 1D potential field of the vehicle spacing to a two-dimensional (2D) potential field in the top view, although they still only performed longitudinal vehicle spacing control, and the lateral lane-changing control was completed by other modules. McCrone et al. [58] performed CACC vehicle lateral and longitudinal control in a 2D top-view potential field. In addition to the car potential for avoiding collisions and closing the gap, the road potential and speed potential were also considered, keeping the vehicle in the correct lane and at a coordinated velocity. Huang et al. [59] also performed lateral and longitudinal vehicle control in the top view, but their contribution was to use the potential field method to plan a complete platoon driving path instead of using the greedy algorithm [60], which moves according to the direction of the potential field at each step. They also considered the potential field of the driving direction, adding a third dimension to the potential field. However, the defect of the APF method is that the agent greedily follows the force in the field and may fall into a local optimum. Therefore, the APF method usually needs to be combined with other optimization search techniques to obtain a trajectory with good global performance [59], [61]. In general, the APF method is simple and elegant, and it considers more environmental information than other rule-based methods, so it can deal with a large number of obstacles in the environment at a low computational cost.
3.2. Model-based methods
The model-based method analyzes the dynamic model of the controlled object to implement control, such as inverse dynamic control or MPC. In the fields of platoon-related vehicles and transportation, the system model is usually relatively large and complex. Due to the development of vehicle and traffic theory, increasingly complex vehicle and traffic models have been established, based on which better algorithmic control can be carried out.
3.2.1. Vehicle system dynamics
The vehicle system dynamics model can provide theoretical support, such as convergence, for various control algorithms. In the process of vehicle macro scheduling, longitudinal dynamics is often used to estimate vehicle travel time. For example, the vehicle longitudinal system dynamics can be used to search for opportunities for vehicle pairing to form a platoon [
62]. However, due to the uncertainty of the model, Zhang et al. [
13] estimated the uncertainty of the travel time to ensure that the freight task could be completed within the time window. In addition, a vehicle platoon passing through multiple signalized intersections is a more microscale application scenario that requires a higher-precision vehicle dynamics model. Therefore, Ye et al. [
63] and Liu et al. [
64] used complex longitudinal dynamics to plan the velocity trajectory of a vehicle platoon passing through intersections.
3.2.2. Model predictive control
When the model of a controlled object is known, MPC can usually achieve precise, high-performance control [65], [66], [67], [68], [69], although its computation time is relatively long. An and Talebpour [66] used a basic MPC method to plan the lane-changing trajectory of the leading vehicle in a platoon. They continuously planned multiple lane-changing trajectories and selected the one with the lowest cost for tracking, outperforming the Swaroop controller in terms of comfort. Huang et al. [65] not only planned the lateral lane-changing trajectory but also planned the combined lateral and longitudinal trajectory based on a vehicle kinematics model in the APF to obtain the approximate optimal trajectory of the overall vehicle platoon. The MPC method can also be used for platoon macro route planning. Baskar et al. [67] applied several methods to model an automatic highway system (AHS) in order to obtain higher model accuracy, so that the MPC method could achieve better performance in obtaining global routes. MPC performance depends on the accuracy of the provided model; however, a sophisticated model makes it difficult to ensure the real-time performance of the algorithm, so accuracy and speed must be balanced in practical applications.
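The idea of predicting several candidate trajectories with a model and tracking the one with the lowest cost can be sketched for longitudinal gap keeping as follows. This is a deliberately simplified illustration, not the method of any cited work: it assumes a point-mass follower, a leader at constant speed, a small set of constant candidate accelerations instead of a numerical optimizer, and hypothetical cost weights.

```python
def rollout_cost(gap, rel_speed, accel, horizon, dt, target_gap):
    """Predict the gap over the horizon while holding `accel` constant.

    rel_speed is v_leader - v_follower (m/s); the leader is assumed to keep a
    constant speed, so follower acceleration reduces rel_speed.
    """
    cost = 0.0
    for _ in range(horizon):
        rel_speed -= accel * dt
        gap += rel_speed * dt
        # Quadratic tracking error plus a small comfort/fuel penalty.
        cost += (gap - target_gap) ** 2 + 0.1 * accel ** 2
    return cost


def mpc_step(gap, rel_speed, candidates=(-2.0, -1.0, 0.0, 1.0, 2.0),
             horizon=10, dt=0.1, target_gap=10.0):
    """Return the candidate acceleration with the lowest predicted cost."""
    return min(
        candidates,
        key=lambda a: rollout_cost(gap, rel_speed, a, horizon, dt, target_gap),
    )


if __name__ == "__main__":
    print(mpc_step(20.0, 0.0))  # gap larger than the target
    print(mpc_step(5.0, 0.0))   # gap smaller than the target
```

In a receding-horizon loop, `mpc_step` would be called at every control cycle and only the first acceleration applied; a longer horizon or a richer model improves the prediction but increases the computation per cycle, which is the accuracy-versus-speed trade-off noted above.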
3.3. Optimization methods
Most scheduling and planning problems can be formulated as optimization problems. After modeling such a problem as a standard optimization problem, it can be solved using one of many existing classic optimization solvers [
70], [
71], [
72], [
73], [
74]. For example, the vehicle platoon group problem can be modeled as an optimization problem that adjusts the desired speed of each vehicle to optimize fuel consumption [
71]. Such a multi-constrained, multi-variable nonlinear optimization problem can be solved using, for example, MATLAB’s “fmincon” function.
3.3.1. Integer linear programming
Linear programming problems are optimization problems in which the objective function and constraints are linear. The ILP problem adds the constraint that variables must be integers (on top of the constraints of linear programming), which further increases the difficulty of the problem. For example, the mixed-ILP (MILP) method can be adopted to solve the optimal route planning problem of an AHS, and an approximate method has been proposed to improve real-time performance in large-scale scenarios [
67]. Larsson et al. [
42] formulated the vehicle platoon formation problem as an ILP problem and used an existing ILP solver to solve it. However, due to the complexity of the problem, it is still necessary to use heuristic methods to solve this problem quickly in large-scale scenarios. The branch and bound method can be adopted to solve the mixed-integer programming problem, thereby setting the traffic light delay and offset to make the vehicle platoon pass the main road faster [
75].
3.3.2. Dynamic programming
The DP method solves complex problems by decomposing the original problem into relatively simple sub-problems; it can only be applied to problems with optimal substructures. The DP method can be used to solve the optimal path planning problem of vehicles entering platoons [
76]. Using the DP method allows fewer computing resources to be used, while ensuring optimality. Johansson et al. [
77] extended the DP method to plan the optimal speed trajectory of a vehicle platoon during deceleration, thereby reducing fuel consumption by about 80%. Data-driven adaptive DP (ADP) can also be adopted to carry out CC of a vehicle platoon, so as to continuously approach the optimal control through learning and shorten the travel time [
78].
3.4. Data-driven learning methods
As deep neural networks have brought powerful representation capabilities, an increasing number of learning-based methods have been applied to vehicle platoon scheduling and planning technologies in recent years. For example, Chen and Sun [
79] used unsupervised learning to classify human driving behavior data at traffic intersections, thereby obtaining several interpretable patterns; they then selected the optimal vehicle platoon separation strategy.
3.4.1. Supervised learning
Data-driven supervised learning methods have been widely used for the lateral and longitudinal control of vehicles in platoons [
80], [
81], [
82], [
83], [
84], [
85]. More recently, data-driven supervised learning methods have also begun to be used in the planning and scheduling processes of platoons. For example, based on platoon road traffic data, an urban traffic flow model can be learned to study the impact of platoons on urban traffic flow [86]. In addition, traffic light data can be used to make it easier for a platoon to pass through signalized traffic [87]. A wind resistance model learned from data can make fuel economy estimates more accurate when a platoon is formed [88].
3.4.2. Reinforcement learning
Scheduling and planning are a kind of decision-making process, and many vehicle and traffic-simulation environments have been established; thus, scheduling and planning technologies are suitable for the RL method, which eliminates the trouble of manually recording and labeling datasets. In terms of platoon path planning, Chen et al. [16] used Q-learning RL to plan a high-level route for a vehicle platoon, while reducing the travel time and fuel consumption of each vehicle. Buechel and Knoll [89] improved the deep deterministic policy gradient (DDPG) RL algorithm and applied it to a platoon’s longitudinal control. The agent determines its acceleration by observing the vehicles’ spacing and speed. Excessive acceleration will lead to penalties, to avoid poor comfort and high fuel consumption. In addition, the DDPG RL algorithm has been used to plan the longitudinal speed trajectory of a following car in a platoon such that the distance between the vehicles in the platoon remains reasonable, which not only reduces air resistance but also ensures safety [90]. Based on the fact that computing resources can be shared between leader and follower vehicles in a platoon, Fan et al. [91] developed an interesting computing task pricing strategy using RL. In this strategy, the leader vehicle agent observes its own task volume and the followers’ bids and determines the pricing strategy based on Q-learning, allowing it to complete computing tasks at a lower cost in comparison with using iterative algorithms.
One of the shortcomings of RL is that it requires a considerable amount of computing resources and time for trial-and-error training; therefore, the development of more efficient RL algorithms to save training time is a mainstream research direction. In addition, the performance of RL depends on the fidelity of the simulated training environment, so it is necessary to build a training environment that is as realistic as possible. Otherwise, the RL agent is prone to overfitting to an unrealistic simulation environment.
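As a minimal sketch of the tabular Q-learning update that underlies the route planning and pricing examples above, the following code learns a route choice in a toy environment. The environment `toy_route_env`, its rewards (negative costs), and all hyperparameters are hypothetical; they only illustrate the trial-and-error update and do not reproduce any cited setup.

```python
import random


def toy_route_env(state, action):
    """Hypothetical environment: returns (next_state, reward, done).

    State 0 is a junction: action 0 enters a highway where platooning is
    possible (state 1), action 1 enters a local road (state 2).
    """
    if state == 0:
        return (1, -2.0, False) if action == 0 else (2, -1.0, False)
    if state == 1:
        return (1, -3.0, True)   # cheap remaining trip inside a platoon
    return (2, -6.0, True)       # expensive remaining trip driving alone


def q_learning(n_states, n_actions, step_fn, episodes=2000,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)                      # explore
            else:
                a = max(range(n_actions), key=lambda i: q[s][i])  # exploit
            s_next, r, done = step_fn(s, a)
            # Bootstrapped target: reward plus discounted best next value.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    q_table = q_learning(3, 2, toy_route_env)
    print([round(v, 2) for v in q_table[0]])
```

Even this three-state problem needs many episodes before the values settle, which hints at the training cost discussed above for realistic traffic environments.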
3.5. The multi-agent system
A platoon consists of multiple vehicles, which means that most platoon scheduling and planning systems are multi-agent systems. Some studies solve the vehicle platoon problem using typical methods used for multi-agent systems, such as game theory, consensus algorithms, and multi-agent RL [
92], [
93], [
94].
3.5.1. Game theory
Game theory studies the winning probability of multiple players in a game, their respective strategies, and the final equilibrium. It can be widely used in platoon formation matching, velocity planning, and resource scheduling [
95], [
96], [
97], [
98], [
99]. In general, games can be divided into cooperative games and non-cooperative games, both of which exist in platoon scheduling applications.
The vehicle platoon matching problem can be modeled as a non-cooperative game problem [
95]. Different vehicles have different destinations and freight times, but the route and departure time can be adjusted to form a vehicle platoon to reduce air resistance. It has been demonstrated that this game has a Nash equilibrium, and the cooperative solution can save more fuel than the non-cooperative solution. Liu et al. [
96] directly modeled the problem of spacing allocation in a platoon as a cooperative game problem. Each vehicle adjusts the desired distance between itself and the preceding vehicle according to the dynamic speed and acceleration during driving, thereby ensuring the overall safety, energy saving, and coordination of the vehicle platoon. Another advantage of a vehicle platoon is that computing resources can be shared among vehicles. When a vehicle needs time-consuming calculations such as image processing, it can issue the task for other central processing unit (CPU)-idle vehicles to complete. Researchers have developed a price mechanism that allows computing tasks to be allocated more efficiently. They model the problem of computing task allocation as a multi-leader-multi-follower Stackelberg game. By fixing the follower strategy, the leader’s pricing decision is transformed into a problem that can be solved using single-agent RL [
97].
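To make the notion of a Nash equilibrium concrete, the following minimal sketch enumerates the pure-strategy equilibria of a two-truck departure-time game. The payoff values, strategy labels, and function name are illustrative assumptions and are not taken from the cited studies.

```python
from itertools import product

def pure_nash_equilibria(payoff_a, payoff_b):
    """Return all (i, j) where neither player gains by deviating unilaterally."""
    n_rows, n_cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i, j in product(range(n_rows), range(n_cols)):
        # Player A picks the row; it must be a best response to column j.
        best_for_a = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(n_rows))
        # Player B picks the column; it must be a best response to row i.
        best_for_b = all(payoff_b[i][j] >= payoff_b[i][k] for k in range(n_cols))
        if best_for_a and best_for_b:
            equilibria.append((i, j))
    return equilibria

# Hypothetical fuel savings (arbitrary units). Strategy 0 = depart early, 1 = depart late.
# The trucks save fuel only when they choose the same slot and can platoon.
# Truck A prefers the early slot, truck B prefers the late slot.
SAVINGS_A = [[3, 0], [0, 2]]
SAVINGS_B = [[2, 0], [0, 3]]

print(pure_nash_equilibria(SAVINGS_A, SAVINGS_B))  # [(0, 0), (1, 1)]
```

In this toy game both equilibria involve platooning, but each truck prefers a different one, which is the kind of coordination problem that a cooperative solution or a pricing mechanism is meant to resolve.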
3.5.2. Consensus algorithms
The consensus algorithm is an iterative and distributed algorithm, whose purpose is to obtain the values of the consensus parameters agreed upon by all agents. In each iteration, each agent obtains the state of itself and its neighbors, thereby updating its consensus parameters. At present, the consensus algorithm has been widely used in the field of vehicle platoon longitudinal and lateral control [
100], [
101], [
102]. For example, Saeednia and Menendez [
100] used the consensus algorithm to make each vehicle agent in a vehicle platoon reach a speed consensus, so that the platoon can be formed quickly and smoothly. Wu et al. [
103] extended the longitudinal and lateral consensus control of vehicles to the formation of a platoon. The order and trajectory of each vehicle entering the platoon are decided by the consensus algorithm. In addition, through stability analysis, it was found that control disturbances can be added to avoid local minima in the consensus process. When vehicle platoons pass through signalized intersections, they usually regroup to form new platoons. The consensus algorithm can also be applied to the reorganization process, and simulation experiments show that a vehicle platoon formed by the consensus algorithm can achieve a uniform speed and time interval [
104].
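The iterative update described above can be sketched as a simple discrete-time speed consensus. This is a minimal illustration, assuming an undirected line communication topology and a fixed step size; the topology, step size, and initial speeds are hypothetical and do not reproduce any of the cited controllers.

```python
def consensus_step(speeds, neighbors, step=0.3):
    """One synchronous update: each vehicle moves toward its neighbors' speeds."""
    return [
        v + step * sum(speeds[j] - v for j in neighbors[i])
        for i, v in enumerate(speeds)
    ]

def run_consensus(speeds, neighbors, step=0.3, iterations=100):
    """Repeat the consensus update a fixed number of times."""
    for _ in range(iterations):
        speeds = consensus_step(speeds, neighbors, step)
    return speeds

# Three vehicles in a line; each communicates only with the adjacent vehicles.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}
INITIAL_SPEEDS = [20.0, 25.0, 30.0]  # m/s, hypothetical values

final_speeds = run_consensus(INITIAL_SPEEDS, NEIGHBORS)
```

With symmetric links the average speed is preserved, so all three vehicles converge to 25 m/s. The step size must stay below the reciprocal of the largest neighbor count (here 0.5) for this simple update to remain stable.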
3.5.3. Multi-agent RL (MARL)
The MARL algorithm is another technique that is often used in multi-agent coordinated scheduling and planning, as there are usually numerous traffic participants in the traffic environment. Since observations and network weights between multiple agents cannot be directly shared, MARL is usually more difficult to train than single-agent RL. However, Ma et al. [
97] used RL to solve the pricing problem of edge computing tasks in a vehicle platoon. They converted the MARL problem into a single-agent RL problem by fixing the vehicle platoon follower strategy, and then used
Q-learning to solve it. Khamis and Gomaa [
105] made the traffic light agents transfer the
Q-value table parameters to each other, so that they could cooperate effectively in the RL process to solve the problem of large-scale urban traffic light control. Moreover, Sharma and Singh [
106] improved traditional
Q-learning into cooperative RL by sharing the
Q-value table, thereby reducing interference in vehicle-to-vehicle (V2V) communication resource allocation.
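As a rough illustration of how a shared Q-value table lets agents cooperate, the sketch below applies the standard tabular Q-learning update to a single table that two agents both read and write. The state and action names, learning rate, and discount factor are assumptions made for this example; the cited works use their own, more elaborate sharing schemes.

```python
from collections import defaultdict

def q_update(q_table, state, action, reward, next_state, actions,
             alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update, applied in place to q_table."""
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]

ACTIONS = ("low_power", "high_power")  # hypothetical V2V transmit choices
shared_q = defaultdict(float)          # one table shared by all agents

# Agent 1 learns that "high_power" on a busy channel is penalized.
q_update(shared_q, "busy", "high_power", -1.0, "busy", ACTIONS)

# Agent 2 benefits from that experience immediately through the shared table.
agent2_choice = max(ACTIONS, key=lambda a: shared_q[("busy", a)])
print(agent2_choice)  # low_power
```

Sharing the table avoids each agent having to rediscover the same penalty by trial and error, at the cost of assuming that the agents face sufficiently similar states and rewards.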
In all, thanks to the continuous development of the abovementioned algorithms, a vehicle platoon can perform unprecedentedly effective, far-sighted, and real-time scheduling and planning. Most of these methods can be used in combination to complement each other, so it is difficult to say which method is best. For simple environments and specific tasks, simple rule-based methods can achieve stable results; for complex and uncertain tasks, deep learning methods can achieve astounding control effects. In general, however, the popular approach has gradually shifted from the simple, rule-based approach to the more-considered neural network and machine learning-based approach. There is no doubt that the study of operations research and artificial intelligence (AI) algorithms will promote the application of platoon scheduling planning. Next, we will introduce the specific applications of these methods in vehicle platoons and the technological development of each application scenario.
4. Platoon scheduling and planning applications
The main areas of platoon scheduling and planning applications include platoon formation, platoon intersection scheduling, platoon lane changing, and platoon resource allocation. This section summarizes the research background, progress, and future development directions of these fields.
4.1. Map-level platoon formation scheduling
As shown in
Fig. 6, the process of vehicle platoon formation includes macroscopic map-level formation [
42], [
107] and microscopic speed planning formation [
108]. The first step of platoon scheduling is to find vehicles suitable for platoon formation and plan their routes on the map level [
107], [
109], [
110], [
111], as shown in
Fig. 6(a). Since the formation of a vehicle platoon can effectively reduce air resistance, some vehicles may detour or wait at the original starting point to form a vehicle platoon under the premise of meeting the delivery time constraints to reduce fuel consumption.
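The detour trade-off can be illustrated with a back-of-the-envelope comparison. The sketch below is only a worked example under stated assumptions: a constant fuel rate per kilometer, a fixed 10% saving for a follower while platooning, and no delivery-time constraint. None of these values come from the cited studies.

```python
def platoon_is_worthwhile(solo_km, detour_km, shared_km,
                          saving_ratio=0.10, fuel_per_km=0.3):
    """Compare a follower's fuel use driving solo versus detouring to platoon.

    solo_km:   length of the direct route
    detour_km: extra distance driven to meet the platoon
    shared_km: distance driven as a follower inside the platoon
    Returns (is_worthwhile, fuel_saved); fuel_saved is negative if platooning costs more.
    """
    solo_fuel = solo_km * fuel_per_km
    total_km = solo_km + detour_km
    unplatooned_km = total_km - shared_km
    platoon_fuel = (unplatooned_km + shared_km * (1.0 - saving_ratio)) * fuel_per_km
    return platoon_fuel < solo_fuel, solo_fuel - platoon_fuel

# Hypothetical trip: 100 km direct route, 80 km of which can be driven in a platoon.
worth_5km, saved_5km = platoon_is_worthwhile(100, 5, 80)
worth_10km, saved_10km = platoon_is_worthwhile(100, 10, 80)
```

Under these assumptions the detour pays off only when detour_km is smaller than saving_ratio times shared_km (here 8 km), so the 5 km detour is worthwhile and the 10 km detour is not.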
For example, a DP method has been used to solve the problem of vehicle platoon merging in a 2D map, given the location of the merging point and the merging path [
76]. The considered application scenario is that vehicles join a platoon driving along a highway around a city. The planned route must be short while enabling vehicles to meet and join the vehicle platoon within a short period of time. Using DP algorithms can ensure the optimality of the search path with relatively few computing resources. Baskar et al. [
67] suggested that vehicle platoon merging can reduce fuel consumption when the global path of each vehicle is planned, and proposed two methods based on MILP and a macro-traffic flow model. They first established a simplified and fast traffic-simulation model; however, planning a global path in a dynamic environment is a nonlinear non-convex integer optimization problem that is difficult to solve. Therefore, they approximated the problem as a MILP problem. The researchers also proposed a macroscopic traffic flow model for human drivers. Both methods achieved a good balance between path optimality and computational efficiency.
Larsson et al. [
42] considered a large-scale traffic model without a delivery deadline, using the ILP and heuristic methods to save fuel consumption, as shown in
Fig. 6(b). They found that the more vehicles there are in the network, the easier it is to form vehicle platoons and save more fuel. Sokolov et al. [
70] coordinated the route planning of numerous vehicles by means of combinatorial optimization and adjusted the departure time to form a vehicle platoon. A simulation on Polaris showed that their method increases road capacity and reduces fuel consumption [
112]. Considering the formation of platoon vehicles, Chen et al. [
16] applied RL to the route planning problem. Due to the large amount of calculation required by RL, they adopted edge computing on a road network based on a vehicular
ad-hoc network (VANET) [
113] to speed up the planning process. A summary of related work on map-level platoon formation is shown in
Table 2 [
16], [
42], [
67], [
70], [
76].
Since few vehicles with platoon or CACC functions are yet on the road, the abovementioned studies have only been tested in simulation environments; moreover, the simulated experiments are usually small in scale. In a future real environment, scheduling and planning thousands of vehicles moving in real time will run into the curse of dimensionality. Thus, future research will focus on information sharing and network routing structure issues involving numerous vehicles. At the same time, learning-based methods are being researched for smarter and faster planning.
4.2. Platoon formation velocity planning
Next, once the vehicles have been paired and brought close to one another at the macro map level, microscopic velocity planning, as shown in
Fig. 6(c) [
42], is required to gradually reduce the distance among the vehicles and complete the vehicle platoon formation operation [
71], [
100], [
114], [
115], [
116], [
117], [
118], [
119]. Faster platoon grouping enables the vehicles to travel longer distances with low air resistance, but it also causes greater changes in the speeds of the front and rear vehicles, which may result in higher fuel consumption.
Amoozadeh et al. [
50] focus on a logical protocol design based on the FSM of the vehicle platoon, including the formation of the vehicle platoon. The specific process is divided into three parts: the merging request, merging response, and merging execution. In addition, they have developed a simulation platform, Vehicular Network Open Simulator (VENTOS), to validate the logic protocol. The simulation platform consists of two parts: SUMO for traffic simulation and OMNET++ [
120] for communication simulation.
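The three-part merging process (request, response, execution) lends itself to a finite-state machine. The following is a minimal sketch, assuming hypothetical state and event names; it is not the protocol implemented in VENTOS.

```python
from enum import Enum, auto

class MergeState(Enum):
    IDLE = auto()
    REQUEST_SENT = auto()
    ACCEPTED = auto()
    MERGING = auto()
    MERGED = auto()

# Allowed transitions: (current state, event) -> next state.
# Event names are hypothetical labels for the request/response/execution steps.
TRANSITIONS = {
    (MergeState.IDLE, "send_request"): MergeState.REQUEST_SENT,
    (MergeState.REQUEST_SENT, "accept"): MergeState.ACCEPTED,
    (MergeState.REQUEST_SENT, "reject"): MergeState.IDLE,
    (MergeState.ACCEPTED, "start_maneuver"): MergeState.MERGING,
    (MergeState.MERGING, "gap_closed"): MergeState.MERGED,
}

def step(state, event):
    """Return the next state, or stay in place if the event is not valid here."""
    return TRANSITIONS.get((state, event), state)

def run(events, state=MergeState.IDLE):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state

final_state = run(["send_request", "accept", "start_maneuver", "gap_closed"])
```

Ignoring events that are invalid in the current state is one possible design choice; a real protocol would more likely add timeouts and explicit error states.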
Through mathematical analysis of the vehicle longitudinal dynamics, Zhang et al. [
13] calculate the distance threshold to form a vehicle platoon that can reduce the fuel consumption, travel time, and freight default of each cargo vehicle. They also study the scheduling of vehicle platoons in a road network, including diverging routes and convergence routes, and the scalability of the algorithm when scheduling numerous vehicles. Researchers have also conducted a coordinated optimization based on vehicle dynamics and fuel consumption models [
71]. The speed of each vehicle can be adjusted to form a truck platoon and reduce overall fuel consumption. At the same time, the delivery deadline constraint can be considered in the optimization problem. This research further expands the problem of vehicle platoon formation and designs a complete integrated system that can reduce the number of empty vehicles, find opportunities for platoon formation, and reduce fuel consumption [
62].
Saeednia and Menendez [
108] adopt methods based on decelerating the preceding vehicle and accelerating the following vehicle to construct an optimization problem for vehicle formation. In the optimization objective, they consider both the fuel consumption and time consumption of the formation process. They also propose a consensus-based distributed algorithm and compare it with the previous centralized optimization algorithm [
121]. The new distributed algorithm is easier to deploy in practice, and it is more flexible in the platoon formation process. Even when there is an acceleration limit, part of the platoon can be formed first.
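The effect of slowing the preceding vehicle and speeding up the following vehicle can be illustrated with simple constant-speed kinematics. This sketch deliberately ignores acceleration phases and fuel models, and all numerical values are hypothetical.

```python
def catch_up(initial_gap_m, target_gap_m, leader_speed, follower_speed):
    """Time (s) and follower distance (m) needed to close the gap at constant speeds."""
    closing_speed = follower_speed - leader_speed
    if closing_speed <= 0:
        raise ValueError("follower must be faster than leader to close the gap")
    time_s = (initial_gap_m - target_gap_m) / closing_speed
    return time_s, follower_speed * time_s

# Hypothetical case: 510 m apart, 10 m target gap, leader at 20 m/s, follower at 25 m/s.
time_needed, distance_needed = catch_up(510, 10, 20, 25)
print(time_needed, distance_needed)  # 100.0 2500.0
```

A larger speed difference shortens the formation time but implies stronger speed changes for both vehicles, which is the fuel and time trade-off that the optimization objective has to balance.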
Table 3 [
13], [
50], [
62], [
71], [
108], [
121] summarizes related work on the velocity planning of platoon formation.
Although real-world small-scale vehicle experiments have been conducted on vehicle velocity planning to form a platoon, many details have not yet been considered, such as traffic jams, ramps, and other road conditions. In addition, in some countries, mandatory rest requirements for drivers must be complied with. In the future, larger scale and heterogeneous vehicle experiments will need to be carried out. At the same time, since the following car can save more fuel than the leading car, there is interest and value in developing a business model for profit distribution.
The platoon formation applications described above can be categorized by means of task release dynamics. In most platoon formation scheduling programs, task details are known beforehand and then scheduled. A study has demonstrated that proper scheduling and planning in advance can result in substantial fuel savings and other advantages [
122]. Nevertheless, to ensure accurate and appropriate planning in advance, it is vital to verify that the input data for each activity is exact and definitive. In a few cases, however, task conditions will alter in real time, requiring immediate scheduling adjustments. A natural solution to this problem would be to retrigger scheduling optimizations based on events [
123], [
124]. On this basis, improved scheduling effects will be realized if past data is used to predict future information in order to cope with real-time planning [
125]. Due to the unpredictability of traffic circumstances, uncertainty and randomness must also be factored into the real-time scheduling considerations of various inputs [
13]. In general, real-time scheduling methods strive to achieve a balance between real-time and scheduling performance. In the future, however, when the vast majority of cars on the road have the platoon function, it will be practical and effective to directly construct an opportunistic platoon by means of a simple scheduling technique, given the ongoing development of platoon technology [
126].
4.3. Platoon scheduling at signalized intersections
When a vehicle platoon arrives at a signalized intersection, a scheduling algorithm is required for vehicle velocity planning and signal light timing scheduling, as shown in
Fig. 7 [
79], [
90]. Two methods are used for vehicle platoon scheduling at intersections with traffic lights: manipulating vehicles [
127], [
128], [
129], [
130], [
131], [
132], [
133] and manipulating traffic lights [
47], [
75], [
105], [
134]. At present, the most widely studied method is to design a speed trajectory curve according to the state of the vehicle platoon. Just like a single vehicle driving through a traffic light, a vehicle platoon must plan its speed to pass through the intersection during the green light period. However, because a vehicle platoon composed of multiple trucks is usually very long, it is sometimes difficult for it to pass the intersection as a whole, and a reasonable separation strategy is required.
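To see why a long platoon may need to be split, the sketch below counts how many vehicles clear the stop line before the green phase ends. It assumes a constant platoon speed and uniform vehicle lengths and gaps; the function name and all numbers are hypothetical.

```python
def vehicles_through_green(n_vehicles, vehicle_len_m, gap_m, speed_mps,
                           dist_to_stop_line_m, green_remaining_s):
    """Count how many platoon vehicles fully clear the stop line before the green ends."""
    cleared = 0
    for k in range(n_vehicles):
        # Rear bumper of vehicle k (k = 0 is the leader), measured from the leader's front.
        rear_offset = k * (vehicle_len_m + gap_m) + vehicle_len_m
        clear_time = (dist_to_stop_line_m + rear_offset) / speed_mps
        if clear_time <= green_remaining_s:
            cleared += 1
        else:
            break  # every later vehicle would also miss the green
    return cleared

# Hypothetical platoon: six 16 m trucks, 4 m gaps, 10 m/s, 50 m from the stop line.
passing = vehicles_through_green(6, 16.0, 4.0, 10.0, 50.0, 12.0)
print(passing)  # 3
```

In this example only the first three trucks pass within the 12 s of remaining green, so the platoon would either be separated after the third vehicle or would need to adjust its speed.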
4.3.1. Methods that manipulate vehicles
Adjusting the speed of a vehicle platoon and allowing it to pass through an intersection within an existing green light window is a relatively simple and widely studied method [
119], [
135], [
136]. Feng et al. [
90] proposed a vehicle platoon velocity trajectory planning strategy, as shown in
Fig. 7(b) [
90], which improves the throughput of signalized intersections compared with predictive CC (PCC). It includes three components: platoon leader trajectory planning based on a green light optimal speed advisory (GLOSA), platoon follower spacing control based on the DDPG method, and a vehicle platoon length decision algorithm. The DDPG-based RL follower spacing controller observes the position and speed of the surrounding vehicles, outputs acceleration as the action, and is rewarded for maintaining appropriate inter-vehicle distance and safety.
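The basic idea behind a green light optimal speed advisory can be sketched as follows. This is a simplified illustration that assumes a single known green window, a constant advised speed, and no queue at the stop line; it is not the planner used in the cited work, and the function name and parameters are hypothetical.

```python
def glosa_speed(distance_m, t_green_start_s, t_green_end_s, v_min, v_max):
    """Return the highest constant speed that arrives during the green window, or None."""
    if t_green_end_s <= 0:
        return None  # the green window has already passed
    # Arriving exactly at green start needs this speed; going faster would hit red.
    if t_green_start_s > 0:
        fastest_useful = distance_m / t_green_start_s
    else:
        fastest_useful = float("inf")  # the light is already green
    # Arriving by the end of green needs at least this speed.
    slowest_useful = distance_m / t_green_end_s
    low = max(v_min, slowest_useful)
    high = min(v_max, fastest_useful)
    return high if low <= high else None

# Hypothetical case: 300 m ahead, green from t = 20 s to t = 40 s, speed range 5 to 20 m/s.
advised = glosa_speed(300.0, 20.0, 40.0, 5.0, 20.0)
print(advised)  # 15.0
```

When no feasible speed exists, as in glosa_speed(300.0, 0.0, 10.0, 5.0, 20.0), the function returns None, which corresponds to the leader having to stop or wait for the next cycle.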
A vehicle platoon guidance strategy (VPGS) system can be adopted to separately formulate the trajectory of the leader vehicle and the following vehicle, while considering fuel consumption based on the vehicle specific power (VSP) [
63]. The research scenario can also be expanded from a single signalized intersection to multiple signalized intersections based on CAV, by changing the vehicle platoon acceleration to obtain better safety and fuel consumption, less travel delay, increased comfort, and better intersection throughput [
64]. By optimizing the speed of each car, Timmerman and Boon [
72] were able to make the vehicles in a platoon cross an intersection in a shorter time. The researchers also considered the issue of fairness, because the strategy of lower overall delay at an intersection may sometimes be unfair to individual vehicles. Chen and Sun [
79] considered the separation of a platoon at signalized intersections, so that the platoon can use multiple lanes to quickly pass through intersections. At the same time, the driving style of other manned vehicles can be predicted, which makes the vehicle separation strategy safer and more efficient [
137].
4.3.2. Methods that manipulate traffic lights
Another traffic scheduling method for dealing with signalized intersections is to adjust the offset between arterial intersections. A layered method combining a branch and bound algorithm with a relaxation coefficient can be used to optimize the dual goals of reducing delay and decreasing emissions [
75]. Khamis and Gomaa [
105] applied MARL to traffic light control. They adopted a new collaboration method between traffic light agent controllers, which is based on spreading learned knowledge from a more knowledgeable agent to a neighboring agent with less knowledge. Simulation experiments on the Green Light District (GLD) platform showed that their method reduced the number of stops, saved travel time, improved safety, and reduced energy consumption. Since the traffic light is used as the RL agent, the observation space is the location of the surrounding vehicles, the action space is the display of red or green lights, and the reward is based on the traffic flow and vehicle waiting time. A joint control model has also been established to simultaneously adjust the vehicle platoon speed and traffic signal light timing in order to maximize the traffic capacity at an intersection [
47]. Research related to platoon scheduling at signalized intersections is summarized in
Table 4 [
47], [
63], [
64], [
72], [
75], [
79], [
90], [
105], [
138].
At present, most traffic intersection scheduling work is still at the simulation level and lacks practical verification; thus, more consideration must be given to situations involving crossing intersections along with non-autonomous vehicles. In the future, it will be necessary to extend research on a single intersection or a series of intersections to include large-scale intersection networks and to further combine vehicle platoon velocity planning with signal light duration control. In combination with this work, the influence of signal intersections on the string stability of the vehicle platoon can be studied.
4.4. Platoon lane-changing planning
During the driving of an autonomously driven platoon, it is difficult to perform lane-changing maneuvers due to the long overall length. Many studies have used V2V communication, as shown in
Fig. 8, to cooperate among the vehicles in the platoon in order to complete platoon lane changing [
139], [
140], [
141].
Amoozadeh et al. [
50] took lane-changing into consideration when formulating a vehicle platoon management agreement and adopted SUMO’s default lane-changing model LC2013 as the specific maneuver strategy. Dos Santos et al. [
49] expanded the Grand Cooperative Driving Challenge (GCDC) 2016 protocol, focusing on the two steps of sequential pair-up and safe-to-merge. Compared with the original protocol and SUMO’s default lane-changing strategy, the new protocol significantly reduces the total lane-merging time. Dos Santos and Wolf [
92] further adopted a multi-agent distributed probability collectives method for vehicle lane-changing planning; its performance is close to that of centralized planning.
An and Talebpour [
66] used the MPC method to plan the lane-changing trajectory of an autonomous vehicle platoon. The lane-changing trajectory they used was a sixth-order polynomial curve, and the optimization goal was to reduce the lane-changing acceleration so as to obtain a smooth lane-changing trajectory. The researchers combined and optimized the lateral and longitudinal movement of the vehicles during the lane changing, reduced the distance shock wave caused by the lane changing, and ensured the string stability of the platoon. Huang et al. [
65] performed MPC based on the APF model and the vehicle kinematics model to obtain the approximate optimal path of a vehicle platoon. We summarize the related work on the lane-changing maneuvers of platoons in
Table 5 [
49], [
50], [
66], [
92], [
142].
Because a practical vehicle platoon lane-changing experiment is quite dangerous, the existing studies are generally based on simulation platforms, and practical problems such as communication delay are rarely considered. Most current research assumes that all vehicles have communication functions. In practice, however, most vehicles are driven manually, and it is necessary to consider the platoon lane-changing problem in a scenario with mixed traffic including both autonomous vehicles and manual vehicles. In a congested environment, it may be difficult for a long platoon to change lanes, so more flexible methods are needed, such as changing the vehicle gap and changing lanes in different groups.
4.5. Platoon resource-allocation scheduling
With the widespread application of the Internet of Vehicles (IoVs) and the growth of freight logistics, the amount of multi-source information that needs to be processed by a highway system is rapidly increasing, and processing this information requires a great deal of computing and storage resources [
143]. In a platoon system, the ability of a single vehicle or a single platoon to perceive and process information is limited, and the system must transmit information between a cloud computing system and the platoons for joint processing. Therefore, the issue of how to improve the utilization of vehicle cloud computing resources is the key in improving the information-processing efficiency of vehicle platoon systems. Existing research has studied the resource-allocation optimization problem in cloud computing and communication engineering, as well as resource-allocation scheduling optimization schemes for a vehicle platoon system. The specific resources to be scheduled include communication resources and computing resources.
4.5.1. Communication resources
As we know, V2V communication is the foundation of platoon technology. Based on the increasingly rich perception of information and the increase in communication content, Xu et al. [
144] proposed a cooperative perception approach among multiple agents, which optimizes the balance between performance and communication bandwidth resources. Reasonable communication resource allocation can achieve lower communication delay and power consumption, as well as better platoon control performance [
145], [
146], [
147], [
148], [
149], [
150]. Device-to-device (D2D) multicast and enhanced multimedia broadcast multicast service (E-MBMS) communications can be used for sub-channel allocation and power control, as shown in
Fig. 9, thereby reducing communication interference, communication delay, and power consumption [
146]. Mei et al. [
73] combined the radio resource-allocation problem with the platoon longitudinal control problem for joint scheduling. They first used the bipartite graph method for wireless resource allocation and then used heuristic gradient descent to adjust the platoon member vehicle parameters to minimize tracking error. Sharma and Singh [
106] applied cooperative RL (C-RL) to the problem of wireless resource allocation between vehicles to improve V2V throughput.
In the process of communication resource scheduling, network security and resistance to network attacks are important aspects to be considered. In terms of network attack resistance, a distributed filtering method has been proposed to resist spoofing attacks and denial-of-service (DoS) attacks [
151]. A neural network-based adaptive event-triggering method can also be used to control communication scheduling in order to defend against network attacks [
152]. With the continuous development of security control and attack-detection technology in industrial cyber-physical systems [
153], [
154], communication control based on dynamic event triggering has been successfully applied in the field of vehicle platoon scheduling [
155]. Under the condition of limited communication resources, the use of dynamic event-triggered scheduling and platoon control co-design can effectively save the communication bandwidth to ensure a beneficial tradeoff between robust platoon control performance and communication efficiency [
156].
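The bandwidth saving offered by event-triggered communication can be illustrated with a very simple rule: a vehicle broadcasts its state only when it has drifted far enough from the last transmitted value. The sketch below uses a static threshold as a simplification; dynamic event-triggered schemes typically adapt the triggering condition over time. The function name, threshold, and speed samples are hypothetical.

```python
def event_triggered_broadcast(measurements, threshold):
    """Return the sample indices at which a vehicle transmits its state."""
    sent_at = []
    last_sent = None
    for k, value in enumerate(measurements):
        # Transmit the first sample, then only when the deviation exceeds the threshold.
        if last_sent is None or abs(value - last_sent) > threshold:
            sent_at.append(k)
            last_sent = value
    return sent_at

# Hypothetical speed samples (m/s) taken at a fixed rate.
SPEEDS = [20.0, 20.1, 20.2, 20.8, 20.9, 21.5, 21.5, 21.4]

transmissions = event_triggered_broadcast(SPEEDS, threshold=0.5)
print(transmissions)  # [0, 3, 5]
```

Here only three of the eight samples are transmitted. A larger threshold saves more bandwidth but gives the receiving vehicles a staler picture of their neighbors, which is the control performance versus communication efficiency trade-off discussed above.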
4.5.2. Computing resources
Despite its advantages, vehicle intelligence involves many time-consuming computing tasks. The vehicles in a platoon can transmit computing tasks through the communication network to utilize idle computing resources. Ma et al. [
97] developed a price strategy for edge computing in a vehicle platoon, which uses an RL method. The researchers combined game theory with RL algorithms to identify a reasonable task flow offloading and recovery scheme among platoon members. The RL method takes the pricing strategy as the action and can complete computing tasks at a lower price than an iterative algorithm.
Table 6 [
73], [
97], [
106], [
146] provides a summary of related work on platoon resource allocation.
However, current platoon resource scheduling still has the problem of insufficient versatility, and there is a lack of unified standards and test cases for computing, communication, information, and other resources. In the process of resource scheduling, more factors need to be considered, such as lane width, train driving direction, the long-term impact of scheduling decisions, errors and delays in perception information, and the overall cost of equipment, communication, and computing. Moreover, with the development of 5G and multi-platform communication technologies, resource-scheduling technology can be extended to drones and other collaborative compensation fields, thereby improving the efficiency of all participants in the entire transportation system.
4.6. Platoon projects with a scheduling technique
In recent years, with the maturity of autonomous driving technology, an increasing number of autonomous vehicle platoon projects have begun practical tests, including EnergyITS [
38], Safe Road Trains for the Environment (SARTRE) [
157], the GALVP, and the West Well and SAIC MOTOR projects [
7], as shown in
Fig. 3. Moreover, some simulation projects have been specifically designed for autonomous vehicle platoons [
158], [
159], [
160].
The GALVP project is an autonomous driving platoon project launched in 2020 by Tongji University (China) and Tsinghua University (China). In this project, the leader vehicle in the platoon is manned, and the following vehicles realize low-cost unmanned driving. This project will be applied to mine truck operations in the future, and it will be necessary to schedule mine muck transportation. The GCDC competition [
161], [
162], [
163] focuses on testing the communication and collaboration capabilities between platoons. It includes two scheduling and planning scenarios: a platoon-merging scenario and an intersection-passing scenario. Companion [
164] is another practical application project focusing on vehicle platoon scheduling and planning techniques. The platoon system of autonomous vehicles is divided into three layers: the strategic layer, tactical layer, and operational layer. The strategic layer is the top level of planning, including transportation planning and route optimization. It assigns cargo transportation tasks to trucks and assigns them routes that can form vehicle platoons. After the strategic layer outputs a rough speed profile for each vehicle, the tactical layer is responsible for detailed velocity planning while considering the dynamics of each vehicle. The operational layer includes the control of the platoon and the control of the vehicle itself. The platoon controller is a distributed one that is responsible for maintaining the correct distance between vehicles, performing merging actions, and ensuring platoon stability. The vehicle controller is an internal control loop for operation control. It controls the speed of the vehicle according to the platoon controller, other advanced driver-assistance systems (ADASs), and the driver’s input.
Most of the project application scenarios mentioned above are freeway or arterial scenarios; however, container terminals are another promising application scenario for platoon scheduling technology. A new concept has been proposed of an automated container transportation system between an inland port and terminals (ACTIPOT), which uses automated trucks to transfer containers between inland ports and container terminals [
165]. Microscopic simulation models have been built to demonstrate the overall performance of the ACTIPOT system. In the port of Rotterdam, the Netherlands, the formation of platoons allows vehicles to directly meet the demand points of the inland dry port [
166]. The experimental results show that the use of vehicle platoons in the port of Rotterdam does indeed hold great potential to reduce costs, dwell time, and emissions. At present, automated trucks are being used to solve the problem of container transshipment scheduling between the two seaport terminals in the port of Singapore [
167]. Autonomous trucks can travel in platoons with short inter-vehicle distances, and the following vehicles in the platoon can save on fuel consumption due to reduced air resistance. In this problem, a mixed-integer second-order cone programming (MISOCP) model is built to minimize the total running cost.
Although the main research and application field of platoon technology focuses on heavy-duty truck transportation, related research on platoon technology for passenger vehicles has also been carried out [
168]. At present, passenger vehicle platooning mainly involves bus platooning or modular rail transit systems. As an early prototype of a bus platooning system, the problem of multi-model bus fleet scheduling has been widely studied. In bus fleet scheduling tasks, the usual goal is to generate optimal bus vehicle schedules across multiple vehicle types, thereby reducing passenger waiting time and lowering operating costs [
169]. Existing studies have used the transportation network method to carry out the multi-objective optimization of passenger waiting time and vehicle occupancy level [
170]. The least-cost network traffic model can take advantage of Pareto efficient schedules, thereby reducing overall fleet size and operating cost [
171]. In addition, it is possible to switch a bus fleet’s flexible routes or fixed routes to reduce costs by building a market entry and exit real option model with an average return demand density [
172]. By directly establishing an integer nonlinear programming model and using DP to solve it, it is possible to balance operating costs and customer waiting time by scheduling different types of buses [
173].
Later, when the concept of modular transportation emerged, actual bus platooning began to take shape. Joint design of the dispatch interval and vehicle capacity of a modular bus platooning system was carried out under overloaded traffic conditions [
174]. Similar to the fleet size design, a bus platooning system can be modeled as a MILP model and solved with DP [
175]. In addition, bus platooning scheduling problems modeled as MILP models can be solved using advanced commercial solvers such as Gurobi [
176]. The flexible route design problem also exists in a modular transportation bus platooning system, and can be solved by means of a two-stage solution using DP and heuristics [
177]. With the gradual deepening of the research on modular transportation bus platooning, the concept of a transportation corridor based on modular autonomous vehicles was proposed [
178]. Inbound and outbound strategies for modular autonomous vehicles were studied, and the theoretical properties of a passenger boarding sequence and a minimum dispatch interval were given.
However, most current bus platooning system experiments are carried out through small-scale data simulations. In the future, large-scale tests in real environments will be able to verify the generalization performance of the scheduling algorithm and its sensitivity to environmental parameters. In order to conduct large-scale experiments in a real-world environment, more complex factors in real-world tasks must be considered, such as the randomness of driving speed due to traffic conditions, capacity requirements for emergency situations, changes in demand on different workdays, and dynamic connectivity strategies for modular vehicles. Moreover, in the real world, a wider range of joint scheduling can be carried out, such as the joint scheduling of multiple bus lines in multiple regions, the joint scheduling of bus vehicles and drivers, and fleet size management scheduling. The scheduling algorithm proposed for a bus platooning system can also be applied to other train-based transportation systems, such as subways, high-speed rail, and other forms of modular autonomous driving. In terms of algorithms, parallelization or approximation can be considered to improve the computational efficiency of DP; alternatively, more efficient algorithms can be designed to replace DP.
Compared with a truck platoon, a passenger vehicle platoon has higher requirements for comfort, safety, and efficiency. Passenger comfort considerations include both lateral comfort and longitudinal comfort. In the lateral dispatching process of a passenger vehicle platoon, the influence of lateral acceleration on passenger comfort must be considered in situations such as the entry of vehicles at the gate [
179], the lateral merging of the platoon [
69], and lane changes by the platoon [
180]. Passenger safety has been evaluated during the emergency braking of a passenger vehicle platoon [
181]. Moreover, when a passenger vehicle platoon is passing through a traffic intersection, traffic signal control can optimize safety for private cars and public transport vehicles [
182]. Finally, reducing the waiting time of passengers in order to improve the efficiency of public transportation is another key point in the development of passenger vehicle platoon technology. In addition to adjusting the timing of traffic lights to allow a passenger vehicle platoon to pass through a traffic intersection quickly [
182], dynamic optimization of bus platoon scheduling can reduce the waiting time of passengers at the stations [
183].
In short, it can be seen that the performance of scheduling and planning algorithms is gradually improving in various scenarios. Application research on platoon scheduling and planning has gradually shifted in focus from single agents to multiple agents, from a static environment to dynamic environments, and from a single goal to multiple goals. In the process of algorithm evaluation, various platoon-related simulation platforms and datasets are used, which are summarized in the next section.
5. Simulation platforms and datasets
Since platoon technology is not yet fully mature, there are risks in conducting experiments directly on real roads; thus, there are relatively few related experiments. Instead, many experiments rely on simulation environments and datasets for testing, which are currently the two main verification methods. A simulation environment may capture the interactions between the agent and the environment more accurately, whereas a dataset has better authenticity.
5.1. Simulation platforms
A simulation platform not only provides a test environment for various algorithms but also provides predictive models for model-based methods and a training environment for RL methods. Therefore, a realistic and efficient simulation platform is of great importance. Although some reviews have included content on simulation platforms [
23], [
24], [
33], this paper provides a more comprehensive and concise summary of research on simulation platforms for platoon scheduling and planning, including works with a focus on traffic, vehicles, networks, and integrated simulation platforms, as shown in
Fig. 10. The application of each simulation platform in the field of platoon scheduling and planning is also summarized.
5.1.1. Traffic-simulation platforms
Traffic-simulation platforms are classified as macroscopic, mesoscopic, or microscopic. In the scheduling and planning of vehicle platoons, it is necessary to consider the complex interactions between vehicles; therefore, microscopic or mesoscopic simulation platforms are usually used, such as SUMO, VISSIM, and CORSIM.
• SUMO [
29] is open-source, microscopic, multi-modal traffic-simulation software. The SUMO simulation platform finely simulates the travel route of each vehicle from a microscopic perspective and gathers the vehicles into a traffic flow for traffic demand analysis. Since SUMO is open source and supports Linux, it is convenient for secondary development. In addition, users can set parameters such as random route selection, random follow-up mode, and random departure time to improve the randomness of the simulation. Currently, SUMO is mainly used to conduct microscopic traffic simulations in order to verify platoon formation and lane-changing techniques [
49], [
50], [
72], [
92], [
184].
• VISSIM [
31] is microscopic traffic-simulation software like SUMO, but it has a more intuitive graphical user interface (GUI), which makes it convenient for users to configure roads, cars, and buses. Therefore, VISSIM is widely used to evaluate urban traffic planning and engineering design. Abundant materials are available for learning, calibrating, and applying VISSIM, which makes it convenient to use. Its simulation results are accurate, but it has clear shortcomings in terms of extended applications, and its road network generation process is cumbersome. VISSIM is also suitable for controlling traffic lights at intersections for vehicle platoon microscopic traffic simulation [
47], [
63], [
72].
• Traffic software integrated system (TSIS)/CORSIM [
30] can simulate complex geometric conditions, different traffic phenomena, different types of traffic control, and management and operations; it can also account for the interactions between different components of a road network and has an interface with external control logic and programs. TSIS/CORSIM has relatively few functions, but it is completely open-source software, which helps scientific researchers and developers to carry out bottom-level learning and research work.
• Parallel microscopic simulation (Paramics) [
185] has features such as detailed road network modeling, flexible signal and vehicle control, comprehensive path guidance, a rich programming interface, and detailed data analysis. Paramics can simulate traffic signals, ramp control, detectors, variable information boards, in-car information display devices, in-car information consultants, route guidance, and more. Paramics is particularly strong in practical applications and provides a wealth of application programming interfaces (APIs) for external applications. It can be used to simulate the collaborative control between platoon vehicles [
186].
• The Polaris [
112] framework is designed to solve the problem of the lack of interoperable models in the field of transportation simulation. It is used to implement prototype integration requirements, network supply, and operational feature models. In addition, it is a fully functional agent-based simulation model that includes travel requirements, traffic simulation, and basic intelligent transportation system (ITS) operations. Through continuous model development and extensive calibration and verification work, it can be extended into a tool suitable for planning purposes; thus, this framework is widely used to simulate platoon formation at the map level [
70].
• Synchro plus SimTraffic [
187] is a powerful urban traffic-modeling and drawing tool. SimTraffic can be used to assess a city’s transportation network, including its road distribution and the soundness of each road’s design, to help city planners develop a well-founded and effective road network. It has an interface with traditional popular traffic-simulation software, is easy to understand, and has high practical value for engineering.
• TransModeler [
188] is a traffic-simulation model based on a geographic information system (GIS) that provides an effective solution for many traffic-planning and simulation-modeling tasks. It uses advanced driving behavior models to simulate traffic flow phenomena on a road network and is notable for its ability to reproduce complex traffic flows under increasing traffic volumes.
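To make concrete what a microscopic simulator computes at each time step, the following minimal sketch advances a short column of vehicles behind a leader using the Intelligent Driver Model (IDM), a common car-following model from the traffic-flow literature. It does not use the API of any platform listed above; the parameter values, initial conditions, and function names are illustrative assumptions.

```python
import math


def idm_accel(v, gap, dv, v0=30.0, T=1.5, s0=2.0, a_max=1.0, b=2.0):
    """IDM acceleration of a follower.

    v: own speed (m/s), gap: bumper-to-bumper gap (m),
    dv: approach rate v - v_leader (m/s). All parameters are assumed values.
    """
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** 4 - (s_star / gap) ** 2)


def simulate(n_followers=4, leader_speed=20.0, steps=3000, dt=0.1, length=5.0):
    """Euler-integrate a leader at constant speed plus n_followers behind it."""
    pos = [-40.0 * i for i in range(n_followers + 1)]  # index 0 is the leader
    vel = [leader_speed] + [15.0] * n_followers
    for _ in range(steps):
        acc = [0.0]  # the leader keeps a constant speed
        for i in range(1, len(pos)):
            gap = pos[i - 1] - pos[i] - length
            acc.append(idm_accel(vel[i], gap, vel[i] - vel[i - 1]))
        for i in range(len(pos)):
            vel[i] = max(0.0, vel[i] + acc[i] * dt)
            pos[i] += vel[i] * dt
    gaps = [pos[i - 1] - pos[i] - length for i in range(1, len(pos))]
    return vel, gaps


if __name__ == "__main__":
    speeds, gaps = simulate()
    print([round(v, 2) for v in speeds])
    print([round(g, 2) for g in gaps])
```

Each follower reacts only to the vehicle directly ahead, which is the vehicle-level interaction that makes microscopic platforms suitable for studying platoon formation and lane changing.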
5.1.2. Network simulation platforms
A network simulator can simulate the information interaction and communication process between platoon vehicles. Commonly used software packages include NS-2, NS-3, and OMNET++.
• NS-3 [
28] is a discrete-event network simulator for Internet systems. It provides models of how packet data networks work and perform, along with a simulation engine for running simulation experiments. As open-source software, NS-3 has good scalability, and users can develop with it in C++ or Python. It can be used for platoon V2V communication simulation [
184], [
189].
• OMNET++ [
120] is another free and open-source multi-protocol network simulation software that is of great significance in the field of network simulation [
190]. It is simple and intuitive when used to create network topologies, due to its integrated development environment (IDE) and GUI. The source code of OMNET++ is based on C++, and developers can use its defined network description (NED) language to build and simulate network modules. OMNET++ network communication simulation has been used for the formation and lane-changing process of vehicle platoons [
50], [
191].
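The discrete-event principle behind these network simulators can be sketched in a few lines. The example below is a simplified illustration, not NS-3 or OMNET++ code: a leader broadcasts periodic beacons, each follower receives them after an assumed per-hop delay, and packets are dropped with an assumed loss probability. The function name and all parameter values are hypothetical.

```python
import heapq
import random


def simulate_beacons(n_followers=3, period=0.1, duration=1.0,
                     hop_delay=0.002, loss_prob=0.1, seed=0):
    """Return {follower index: list of end-to-end delays of received beacons}."""
    rng = random.Random(seed)
    events = []   # priority queue of (time, sequence number, kind, payload)
    seq = 0       # unique tie-breaker so payloads are never compared

    # Schedule all beacon transmissions of the leader up front.
    for k in range(int(round(duration / period))):
        heapq.heappush(events, (k * period, seq, "send", None))
        seq += 1

    received = {i: [] for i in range(1, n_followers + 1)}
    while events:
        now, _, kind, payload = heapq.heappop(events)  # earliest event first
        if kind == "send":
            for i in range(1, n_followers + 1):
                if rng.random() >= loss_prob:  # packet survives the channel
                    # Assumption: follower i is i hops behind the leader.
                    arrival = now + i * hop_delay
                    heapq.heappush(events, (arrival, seq, "recv", (i, now)))
                    seq += 1
        else:
            follower, sent_at = payload
            received[follower].append(now - sent_at)
    return received


if __name__ == "__main__":
    for follower, delays in simulate_beacons().items():
        print(follower, len(delays), "beacons received")
```

The simulator never advances a wall clock; it jumps from one scheduled event to the next, which is what allows such tools to model many communicating vehicles efficiently.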
5.1.3. Integrated simulation platforms
With the continuous deepening of platoon research, integrated simulation platforms have begun to appear in order to carry out professional platoon traffic simulation and communication simulation at the same time.
• iTETRIS [
192] integrates the wireless communication platform NS-3 and the road traffic-simulation platform SUMO in an environment that is easily tailored to specific situations, so that a performance analysis of collaborative ITS can be performed at the city level. iTETRIS uses the accuracy and scale of the simulation to clearly reveal the impact of traffic engineering on urban road traffic efficiency, operational strategies, and communication interoperability.
• Veins [
193] includes a comprehensive set of models that can simultaneously ensure simulation accuracy and real-time computing. The Veins framework is composed of OMNET++ and SUMO—both of which have a good GUI and IDE for easy development—and has been applied to vehicle platoon microscopic traffic simulation [
194].
• VENTOS [
50] is an open-source integrated VANET C++ and SUMO simulator used to study vehicle traffic, cooperative driving, and vehicle-infrastructure interaction by enabling the wireless communication capabilities of dedicated short-range communications (DSRC). VENTOS can help researchers in the fields of transportation engineering, control theory, and vehicle networks, and VENTOS-based simulation can simulate vehicle platoon formation and lane-changing behavior [
50].
5.2. Datasets
Although there are numerous simulation platforms that can simulate the traffic, vehicle, and network conditions of a platoon, real-world datasets are still indispensable to verify the performance of an algorithm in the real environment. However, since the platoon technique has not yet been widely popularized, pure platoon datasets are scarce, and most platoon-related research uses other traffic datasets as an alternative. Numerous researchers have constructed and utilized platoon-related datasets to carry out a great deal of significant research; we introduce these studies from the United States, Europe, and Asia, according to the regions where the datasets were recorded. Platoon-related datasets are summarized in
Table 7 [
42], [
62], [
71], [
100], [
108], [
195], [
196], [
197], [
198], [
199].
5.2.1. The United States
The United States was one of the first countries to carry out research on platooning technology, so some of the early, well-known datasets were recorded there. A well-known dataset in the transportation field, Next Generation Simulation (NGSIM) [
200], was recorded in California and Georgia, USA, and is now used to study the impact of the platoon technique on traffic flow [
195]. In addition, Yu [
196] collected vehicle trajectory data near a traffic intersection to calibrate the proposed platoon dispersion models.
5.2.2. Europe
Europe followed the United States in platooning studies, with Germany, Sweden, and Switzerland as representatives. The Europe Flagship Program helps numerous universities and businesses conduct extensive research. A dataset of randomly generated driving tasks using German highway network graphics was produced to verify the fuel-saving performance of a vehicle platoon formation algorithm [
42]. It was found that the more vehicles there are in a network, the easier it is to form a vehicle platoon to save fuel. By collecting the speed of trucks on real highways, Saeednia and Menendez [
100], [
108] constructed an optimization problem of vehicle platoon formation. Two real vehicles have also been used to perform platoon merging on a highway, forming a dataset for testing the impact of different traffic flow densities on platoon formation [
62]. Sebe and Müller [
197] used real urban road network data and background traffic density models to simulate the platoon formation process in order to verify the performance of their algorithm. Based on Scania’s simulation model, Liang et al. [
71] simulated platoon driving 280 km on a Swedish highway to obtain simulation data and verify that the average air resistance is reduced by 8.4% when vehicles drive in a platoon.
5.2.3. Asia
In the past two years, countries in Asia have also attempted to launch datasets for their own road conditions. Compared with Europe and the United States, which have accumulated decades of technical reserves, Asian countries started their development later. However, the technical solutions obtained by different nations are surprisingly similar, as all rely on expensive active-sensing sensors. Automatic number plate recognition (ANPR) data recorded in Beijing has been used to discover companion patterns [
198], and Saha et al. [
199] used a dataset recorded in New Delhi, India, to validate their proposed vehicle platoon dispersion prediction model.
In general, the traffic-simulation platforms available today are sufficient for simulating platoon scheduling and planning techniques. There is no doubt that a more realistic and detailed simulation platform will be more conducive to the efficient evaluation and training of scheduling and planning algorithms. If a dedicated traffic-simulation platform for platoons appears in the future, it will make the experimental environment configuration more convenient. Nevertheless, at present, the available datasets are still relatively inadequate, especially datasets dedicated to platoons. We believe that this problem can be alleviated as more and more practical platoon projects are launched.
6. Challenges and discussions
Although the scheduling and planning techniques of vehicle platoons have been greatly developed in recent years, there is still a long way to go before platoon scheduling and planning techniques mature to large-scale applications. Major challenges and future development directions are elaborated in this section.
6.1. Lack of real-vehicle verification on a large scale
Although some autonomous vehicle platoon projects have begun testing on real-world roads in recent years, they are still far from large-scale practical commercial operations. Due to the lack of large-scale practical operations, existing algorithms do not respond well to real and complex environmental uncertainties, such as traffic jams, accidents, ramps, communication delays, forced driver rests, and uncooperative human drivers. At the same time, the lack of a large-scale vehicle platoon operation environment makes it impossible to verify cargo scheduling and vehicle platoon formation algorithms in practice. Large-scale application scenarios challenge the real-time computing and communication capabilities of the algorithms. At present, a unified evaluation system for platoon scheduling is lacking, and large-scale real-vehicle verification would help to establish one.
The lack of practical vehicle platoon datasets is another current problem. As a result, many vehicle platoon scheduling algorithms can only be verified using general road datasets, which may deviate from actual platoon situations. With the launch of an increasing number of platoon projects and the continuous development of ITS, the lack of large-scale real-vehicle verification of platoon technology is expected to be resolved.
6.2. Communication challenges in practical scheduling
At present, most of the existing platoon scheduling studies do not take practical communication problems into account; rather, they assume that all vehicles can achieve perfect communication without delay, error, or bandwidth limits, so as to carry out centralized scheduling. In practice, however, due to communication delays, vehicles cannot execute scheduling instructions immediately and may reject the scheduling instruction. Moreover, due to the lack of communication capacity, it is necessary to consider situations involving mixed driving between network-connected vehicles and conventional vehicles. Security and privacy issues must also be considered when developing practical communication processes, and future platoon scheduling research needs to take the various issues that may affect communication processes into account. One possible way to deal with communication delays is to adopt distributed control, in which vehicles make decisions that are in both their own interest and the overall interest.
Furthermore, it is challenging to handle highly dynamic communication channels and network topology while providing diverse services with stringent quality-of-service (QoS) requirements in real time. Therefore, AI-based engineering solutions are needed to achieve efficient network slicing, mobility management, and cooperative content caching and delivery.
6.3. Human-machine interaction patterns in platoons
As more and more practical platoon project experiments are being carried out, human-machine interaction problems in platoon systems are beginning to appear. Unlike self-driving vehicles, in which the driver can take over alone at any time, a human driver cannot take over a platoon system at any time, because it is difficult for a human to maintain a very close distance at high speeds, as is done in a platoon system. Manual driving is not as accurate as automatic driving in terms of distance control, so a poorly executed manual takeover can easily lead to accidents, and the driver is placed under high stress in such circumstances. At present, few studies have researched human-machine interactions in platoon driving mode [
201]. In the future, emphasis must be put on human-machine interactions in platoon formation, driving, and emergency separation scenarios [
202], [
203].
6.4. Application of learning-based methods
In recent years, learning-based algorithms based on deep neural networks and bio-inspired learning networks have attracted widespread attention and research [
204]. Since vehicle platoon scheduling and planning are decision-making processes that are easy to model, the RL algorithm is easy to apply and has been widely adopted. However, the RL algorithm still has some bottlenecks regarding its training efficiency and its transfer to real applications. RL is inefficient when sampling experience in complex observation and action spaces, so its training time is long. In addition to the problem of the algorithm itself, a simulation environment for interacting with other drivers lacks authenticity, resulting in RL agents exhibiting poor performance when transferred to the real world. Meanwhile, algorithms based on deep neural networks are difficult to explain, and it is difficult to trace responsibility in the case of an accident.
In order to improve RL training speed, the experience replay strategy of RL must be improved so that the driving experience generated by time-consuming simulations can be better utilized. Combining residual RL and multi-expert RL methods with a traditional rule controller makes it possible to avoid overfitting during training and improves the transfer robustness of the agent. More platoon-specific datasets can also be provided for imitation learning, which can significantly improve the training speed compared with standard RL. Finally, saliency maps and other methods can be used to provide a preliminary explanation of the vehicle decision-making behavior in a platoon.
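As a reference point for the experience replay strategy mentioned above, the following sketch shows a plain uniform replay buffer, the baseline that improved strategies (for example, prioritized sampling) would build on. The class name, the transition layout, and the capacity are illustrative assumptions rather than part of any cited method.

```python
import random


class ReplayBuffer:
    """Fixed-size ring buffer of transitions with uniform random sampling."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.storage = []
        self.next_idx = 0                 # slot to overwrite once full
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        item = (state, action, reward, next_state, done)
        if len(self.storage) < self.capacity:
            self.storage.append(item)
        else:
            self.storage[self.next_idx] = item  # overwrite the oldest entry
        self.next_idx = (self.next_idx + 1) % self.capacity

    def sample(self, batch_size):
        # Sample without replacement; shrink the batch if the buffer is small.
        return self.rng.sample(self.storage, min(batch_size, len(self.storage)))

    def __len__(self):
        return len(self.storage)


if __name__ == "__main__":
    buffer = ReplayBuffer(capacity=3, seed=0)
    for step in range(5):
        # Hypothetical transition: state is just the step counter here.
        buffer.add(step, 0, 1.0, step + 1, False)
    print(len(buffer), buffer.sample(2))
```

Because each stored transition can be reused in many training batches, expensive simulated driving experience is not discarded after a single update.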
6.5. Efficient computing resource scheduling
With the gradually increasing complexity of scheduling algorithms and a gradual increase in the scale of the vehicle platoon, the amount of calculation required for a vehicle platoon will also gradually increase, causing on-board computing resources to become increasingly constrained. Existing computing resource-scheduling methods based on IoV communication must become more efficient in order to meet the real-time requirements of vehicle platoon scheduling and planning. Cabin computing is an emerging method for scheduling computing resources [
205]. It is a network-accessible integrated computing environment for cross-domain resource configuration and collaboration throughout the entire life cycle of information technology (IT) tasks. Compared with existing computing modes and technologies, cabin computing possesses the following characteristics: ① It is constructed flexibly in response to the requirements of IT tasks, is scaled and managed in response to the execution of IT tasks, and is released dynamically in response to the conclusion of IT tasks. ② In the vertical dimension of the full life cycle of IT tasks, it performs the four functions of identifying requirements, allocating resources, executing tasks, and concluding tasks. ③ In the horizontal dimension of the resources required by IT tasks, cabin computing realizes the overall configuration and coordinated operation of data resources and physical resources. Compared with grid computing, virtual organization, cloud computing, and edge computing, cabin computing unifies mobility and customization and optimizes and coordinates cross-domain resources, thereby enabling complex vehicle platoon scheduling and planning algorithms to be computed efficiently and in real time.
6.6. Distribution of economic benefits
With the development of vehicle platoon technology, the distribution of economic benefits related to vehicle platoons has become increasingly important, yet has not been the subject of extensive research [
148]. It is necessary to distribute economic benefits in a vehicle platoon for a few different reasons. First, the vehicle at the rear of the platoon can save more fuel than the vehicle at the front [
206]. Second, in order to form a platoon to save fuel, some vehicles may need to slow down or stop to wait for vehicles behind them to catch up, which results in a loss of their time. Third, the platoon technique reduces the distance between cars, thereby improving road capacity, so road managers should consider reducing tolls for platoons. In the future, game theory should be used to design a set of rules so that the overall solution can be close to the optimum even when all drivers and managers are selfish. It is only in the case of a reasonable distribution of benefits that all stakeholders will be motivated to carry out platoon formation. Furthermore, an increase in the number of schedulable vehicles creates more space for scheduling, thus resulting in more revenue. Combining various freight companies to form a unified freight scheduling system can bring greater economic benefits.
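One candidate rule from cooperative game theory for sharing platoon savings is the Shapley value, which pays each vehicle its average marginal contribution over all joining orders. The sketch below is only an illustration of that idea, not a method proposed in the cited works: the saving rates for the lead and following vehicles, the baseline fuel figures, and the function names are hypothetical.

```python
from itertools import permutations


def platoon_saving(coalition, base_fuel, lead_rate=0.04, follow_rate=0.10):
    """Fuel saved by a coalition driving as one platoon (assumed model).

    The vehicle with the lowest baseline consumption leads and saves
    lead_rate of its fuel; every follower saves follow_rate of its fuel.
    """
    if len(coalition) < 2:
        return 0.0  # a single vehicle cannot form a platoon
    fuels = sorted(base_fuel[v] for v in coalition)
    return lead_rate * fuels[0] + follow_rate * sum(fuels[1:])


def shapley(base_fuel):
    """Exact Shapley values by enumerating all joining orders (small n only)."""
    players = list(base_fuel)
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        joined = []
        for p in order:
            before = platoon_saving(joined, base_fuel)
            joined.append(p)
            totals[p] += platoon_saving(joined, base_fuel) - before
    return {p: totals[p] / len(orders) for p in players}


if __name__ == "__main__":
    # Hypothetical baseline fuel use in litres for the shared route.
    print(shapley({"A": 30.0, "B": 30.0, "C": 30.0}))
    print(shapley({"A": 25.0, "B": 30.0, "C": 35.0}))
```

With identical vehicles the rule splits the total saving equally, so the leader is compensated even though it physically saves less fuel than the followers, which addresses the first reason given above.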
6.7. The intelligent platoon transportation system
Although platoon technology is currently mainly used in the field of truck freight, with the continuous development of the platoon scheduling technique, platoons are expected to be applied to urban passenger transportation systems and to compete with buses and subways. In city centers, due to road traffic jams and the close distance between stations, buses and subways start and stop repeatedly, resulting in lower average speeds and lower passenger transport efficiency. The core idea of the intelligent platoon transportation system is to realize the point-to-point transportation of passengers, so it is necessary to split “trains” into “platoons.” Passengers with the same boarding and alighting stations can take the same vehicle; then, the vehicles can form platoons when they drive on the road. When a platoon passes the station, the vehicles that need to stop can separate from the platoon and park, while the other vehicles continue to travel without slowing down. In this way, the average speed of transport vehicles can be increased while improving transport efficiency.
6.8. Future development of real-world platoon projects
As summarized in Section 4.6, many real-world platoon application projects have been completed, and good planning and scheduling results have been achieved. These projects will continue to progress in the future, with continuous iterations and updates. For example, the PATH project will focus on improving the safety of each vehicle during emergency braking in a platoon [
40]. The EnergyITS project will further shorten the distance between vehicles in a platoon, improve equipment reliability, and explore a more mature business operation model [
38]. However, as concluded in the GCDC challenge, although each team has implemented the scheduling and planning functions of a platoon system in the real world, there is a lack of a unified communication protocol and test verification evaluation standards in all systems [
161]. This hinders platoon technology from moving toward larger scale practical applications. In addition, current laws and regulations do not fully support the development of various platoon application projects. In the future, breakthroughs in relevant laws and regulations will be required to enable platoon technology to carry out a wider range of technical verification and application.
7. Conclusions
Since platoon scheduling and planning techniques are the key to obtaining the economic benefits of platoons, their development is crucial to platoons’ large-scale promotion and implementation. Scheduling and planning techniques for vehicle platoons have been significantly refined over the past decade. Rule-based, model-based, optimization planning, learning-based, and multi-agent platoon scheduling methods have been successfully applied to platoon vehicle formation and separation, lateral and longitudinal planning, and resource allocation. However, due to the lack of large-scale industrial operation, this technology still has a number of flaws. We hope that our review will aid researchers in understanding diverse platoon scheduling and planning techniques and application scenarios, as well as in advancing the development of a unified and coordinated platoon scheduling and planning system.
A primary future development direction for platoons is to conduct more practical road test projects. Based on the road testing of actual vehicles, more platoon-specific datasets can be made available to researchers. In addition, rapidly evolving neural learning-based algorithms can use these datasets to learn how to handle complex traffic situations in the real world. Through the continuous efforts of researchers to improve platoon scheduling techniques, it is hoped that the advantages of platoon technology will be fully utilized and this technology will be widely adopted, thereby transforming the transportation industry and vastly improving transportation efficiency.
Acknowledgments
This work is funded by the Shanghai Municipal Science and Technology Major Project (2018SHZDZX01) of ZhangJiang Laboratory and Shanghai Center for Brain Science and Brain-Inspired Technology, Shanghai Rising Star Program (21QC1400900), and Tongji-Westwell Autonomous Vehicle Joint Lab Project.
Compliance with ethics guidelines
Jing Hou, Guang Chen, Jin Huang, Yingjun Qiao, Lu Xiong, Fuxi Wen, Alois Knoll, and Changjun Jiang declare that they have no conflicts of interest or financial conflicts to disclose.