General Optimal Trajectory Planning: Enabling Autonomous Vehicles with the Principle of Least Action

Heye Huang , Yicong Liu , Jinxin Liu , Qisong Yang , Jianqiang Wang , David Abbink , Arkady Zgonnikov

Engineering ›› 2024, Vol. 33 ›› Issue (2) : 72 -85.

PDF (4934KB)
Engineering ›› 2024, Vol. 33 ›› Issue (2) : 72 -85. DOI: 10.1016/j.eng.2023.10.001
Research
Article

General Optimal Trajectory Planning: Enabling Autonomous Vehicles with the Principle of Least Action

Author information +
History +
PDF (4934KB)

Abstract

This study presents a general optimal trajectory planning (GOTP) framework for autonomous vehicles (AVs) that can effectively avoid obstacles and guide AVs to complete driving tasks safely and efficiently. Firstly, we employ the fifth-order Bezier curve to generate and smooth the reference path along the road centerline. Cartesian coordinates are then transformed to achieve the curvature continuity of the generated curve. Considering the road constraints and vehicle dynamics, limited polynomial candidate trajectories are generated and smoothed in a curvilinear coordinate system. Furthermore, in selecting the optimal trajectory, we develop a unified and auto-tune objective function based on the principle of least action by employing AVs to simulate drivers’ behavior and summarizing their manipulation characteristics of “seeking benefits and avoiding losses.” Finally, by integrating the idea of receding-horizon optimization, the proposed framework is achieved by considering dynamic multi-performance objectives and selecting trajectories that satisfy feasibility, optimality, and adaptability. Extensive simulations and experiments are performed, and the results demonstrate the framework’s feasibility and effectiveness, which avoids both dynamic and static obstacles and applies to various scenarios with multi-source interactive traffic participants. Moreover, we prove that the proposed method can guarantee real-time planning and safety requirements compared to drivers’ manipulation.

Graphical abstract

Keywords

Autonomous vehicle / Trajectory planning / Multi-performance objectives / Principle of least action

Cite this article

Download citation ▾
Heye Huang,Yicong Liu,Jinxin Liu,Qisong Yang,Jianqiang Wang,David Abbink,Arkady Zgonnikov. General Optimal Trajectory Planning: Enabling Autonomous Vehicles with the Principle of Least Action. Engineering, 2024, 33(2): 72-85 DOI:10.1016/j.eng.2023.10.001

登录浏览全文

4963

注册一个新账户 忘记密码

1. Introduction

1.1. Motivation

Autonomous vehicles (AVs) are currently an active area of research in academia and industry and are believed to help improve driving safety and mobility. While the development of autonomous driving technology faces some challenges [1], [2], [3]. One of the key issues is how to plan a feasible trajectory for a vehicle in a dynamic environment such that it does not collide with obstacles and satisfies kinematic, environmental, and temporal constraints [4], [5]. Generally, one seeks a safe, efficient, and reliable trajectory planning approach with the following abilities: ① trajectory feasibility [4]. The generated trajectory is addressed to satisfy computational resources, time, and kinematic constraints, and the controlled vehicle can reach the destination executable along the trajectory within a limited time rather than being trapped in local values. ② The trajectory optimality [3]. In the planning process, it is essential that AVs meet multi-performance objectives, such as driving safety, efficiency, and comfort, and achieve optimal performance with the shortest path and least energy. ③ The method adaptability [5]. A planning method is employed to define specific driving strategies and trajectories. The developed method is expected to have the capability to handle diverse scenarios like excellent drivers, thereby facilitating the traffic transferability of AVs. Specifically, the complexity of the traffic environment, the individual randomness of traffic participants, and the interactive game require AVs to collaborate on multiple dynamic objectives in the planning process. This choice can be achieved through simple rules, such as avoidance behavior, while significantly reducing passing ability and ensuring safety.

Therefore, considering the feasibility, optimality, and adaptability is significant for AV’s safe and reliable trajectory planning. This study aims to design an autonomous vehicle trajectory-planning method to plan optimal and complete trajectories with a unified objective function under arbitrary uncertain scenarios.

1.2. Related work

Much research has focused on real-time trajectory planning for AVs in dynamic environments. These algorithms can be classified into five main types: graph search, sampling, artificial potential field (APF)-, artificial intelligence (AI)-, and optimization-based approaches [6], [7]. A graph search algorithm generally represents the state space as occupying the grid and provides a solution for path planning by accessing different states in the grid [8], such as A* [9] and Dijkstra. This representation allows AVs to find a feasible solution suitable for the low-speed planning process; however, it is challenging in high-speed driving and has a high calculation cost. The primary representative sampling algorithms are the probability road map (PRM) and rapidly exploring random tree (RRT) [10]. RRT allows random searches through navigation areas and fast planning in semi-structured spaces [11]. However, it is difficult to generate an optimal path, and it usually generates a discontinuous trajectory curvature. Lattice planner [12], [13] is also using trajectory sampling or lattice searches. It aims to generate a smooth and safe local trajectory that satisfies vehicle kinematics constraints. It is suitable for simple and low-speed scenarios but has limited applications in complex traffic. The APF-based approach has been verified and applied in several fields [14], [15], [16]. It can achieve online dynamic obstacle avoidance and can be implemented easily in trajectory planning. While local optimization problems, which are limited in achieving trajectory feasibility, will restrict their application. Although some strategies, such as the escape strategy and heuristic function, have been studied in recent years to solve this problem [17], [18], they also increase the difficulty of parameter adjustment, and making it challenging to ensure the convergence, optimality, and feasibility of the search algorithm.

AI-based or data-driven approaches are becoming popular for devising trajectory planning processes for AVs with the rapid advances in learning-based algorithms [5], [19]. For example, in combination with an end-to-end decision-making approach, input images are directly converted into the control commands of an AV [20]. This approach is more efficient than traditional approaches in simple scenarios such as highways; however, its stability and interpretability need to be improved because of the limitation of training data and missing intermediate outputs in complex scenarios [19]. Deep reinforcement learning (DRL) has also been effectively applied to decision-making and motion planning. Hoel et al. [21] proposed a flexible framework for the tactical decision-making of AVs by combining planning and DRL. The proposed framework can effectively realize the interaction of a dynamic environment, resulting in low online planning efficiency and high training costs. Vazquez et al. [22] proposed a planner called the game-theoretic model predictive controller, which enables the output of an interactive multi-agent neural network policy for trajectory planning. This planner is effective in static obstacle scenarios but struggles with complex dynamic traffic.

In trajectory planning, discrete interpolation optimization approaches provide a planner with promising mathematical foundations for fitting a given road description [23], [24]. In particular, feasible trajectories can be obtained by considering feasibility, comfort, vehicle dynamics, and other requirements. Interpolation optimization algorithms usually generate a series of dynamically feasible candidate trajectories using the clothoid, polynomial, Bezier, and spline curves within an existing reference point set [25], [26]. For example, the Bezier curve can generate a safe and smooth collision-free path when obtaining the real-time positions of obstacles [27]. These curves satisfy the constraints required by their interpolation points and achieve vehicle dynamic constraints. It also significantly reduces the constrained space and ensures the ability to express the generated trajectory [27]. Meanwhile, a mathematical optimization method that considers the cost function, such as the expectation–maximization (EM) planner [12], can also be effectively applied in complex scenarios through layered optimization iterations. However, similar to the lattice planner, the cost function of the planners adopts multiple single-objective weighting methods, which have difficulty determining the weights of single objectives and the inconsistency of dimensions among various objectives.

1.3. Contribution and organization

Aiming at the limitations of existing methods and the performance requirements for real-time trajectory planning of AVs, this work mainly focuses on:

(1)How driver behavior strategies inspire the vehicle trajectory planning process;

(2) How to ensure the feasibility of the planning approach;

(3) How to dynamically coordinate multi-performance objectives in the trajectory planning process.

Therefore, we formulate a unified framework that simultaneously achieves multi-scenario adaptation and multi-performance objective coordination. As illustrated in Fig. 1, we propose a framework that satisfies the requirements of flexibility, optimality, and adaptability. First, we generate a reference path and feasible trajectory clusters in a curvilinear coordinate system using discrete interpolation optimization. Subsequently, by defining a unified objective function that considers optimality and adaptability based on the principle of least action, the optimal trajectory is selected by calculating the actual action of each feasible trajectory. Finally, we optimize the speed of the generated trajectory based on receding-horizon optimization and output it to the AVs for trajectory tracking and control. The main contributions are as follows:

(1) We propose a risk-bounded trajectory planning framework by integrating receding horizon optimization for reliable autonomous driving in a dynamic environment, ensuring trajectory feasibility for AVs.

(2) We develop a general trajectory planning approach based on the principle of least action by combining the purpose of “optimization” in nature with the characteristics of “seeking benefits and avoiding losses” of drivers. This is theoretically proven to enable human-centered adaptability.

(3) We develop a unified objective function from the perspective of driving expectations to avoid weighting each objective separately based on driver experience but unify and auto-tune the pursuit of multiple objectives, such as safety and efficiency. Compared with existing studies, our proposed general optimal trajectory planning (GOTP) framework can perform in complex multi-obstacle scenarios that are completely general and conform to the expected driver behaviors.

The remainder of this paper is organized as follows. Section 2 describes the generation of the feasible trajectory clusters. Section 3 presents the comprehensive objective function applied to select the optimal trajectory. Experiments to verify the proposed algorithm are described in Section 4. Finally, Section 5 summarizes the study.

2. Feasible trajectory generation

For AVs, trajectory generation focuses mainly on generating a series of actions to allow an agent to reach the target state from the initial state. The entire process of generating feasible trajectory clusters can be divided into feasible trajectories and speed generation.

2.1. Reference path generation and smoothing

Through statistical analysis of a large number of naturalistic driving data (the naturalistic driving database contains multiple original data of 33 drivers’ free driving process, and 469 680 and 278 838 frames of experimental data are collected on urban roads and high-speed sections respectively) [28], we can obtain the driving characteristics that drivers usually drive along the centerline as much as possible in the driving process, and 90% of the lateral positions of the vehicle are within a distance of 0.7 m from the centerline of the lane, thereby ensuring higher driving safety.

Therefore, according to the driver’s driving characteristics, we adopt human-like driving thinking to embody the phenomenon, in which the driver pursues centerline driving in generating the reference paths of AVs. Furthermore, curvature continuity is essential to ensure a feasible trajectory. Assuming that the precise waypoints of structured roads have been given, we can avoid the sampling process of global planning. We employ a Bezier curve to generate and smooth the reference path when considering local planning. The Bezier curve has certain advantages over other curve generation methods for path generation efficiency and flexibility, ensuring local smoothness and curvature continuity of the generated trajectory [29]. A quintic Bezier curve is used to smooth the reference path. The expression is:

Pt=1-t5P0+51-t4tP1+101-t3t2P2+101-t2t3P3+51-tt4P4+t5P5

where Pi(i=0,1,,5) represents six control points of the quintic Bezier curve, and t represents the time parameter. Considering the Bezier curve properties, the quintic Bezier curve has a second-order derivative; thus, curvature ρ satisfies Eq. 2.

ρt=P't×PntP't2

where Pt and Pnt represent for the firstand n derivative of control points, respectively.

When smoothing the entire path, the two Bezier curves satisfy the continuity constraint conditions at the splicing location: ① The positions are continuous, ② the tangent vectors are continuous, and ③ the curvature is continuous. As shown in Fig. 2, the reference path generated using the quantic Bezier curve as the basic frame can satisfy curvature continuity. Assume that the vehicle drives at a constant speed, regardless of the up-and-down movement along the z-axis direction; then, on the structured road, the state of a vehicle can be decomposed into four-dimensional states (x,y,θi,ρ). Here steering angle θi and the curvature ρt of Bezier curve at point Pt=(x(t),y(t)) along the path are defined as:

θi=tan-1ẋẏ,ρt=x·y··-y·x··x·2+y·232

where ẋ,ẏ and ẋ̇,ẏ̇ are the lateral and longitudinal velocities and accelerations of ego vehicle, respectively.

2.2. Feasible path cluster generation

Trajectory generation is expected to follow road condition constraints and conform to the geometric shape of the road to achieve better comfort and curvature continuity in the driving process. Therefore, the Cartesian coordinates must be converted into curvilinear coordinates to generate feasible path clusters for vehicles during driving [25]. First, we map the reference path from the Cartesian coordinates by mapping the points on the reference path to the curvilinear coordinate system. The arc length s along the road centerline, lateral offset l, and reference path were converted from the Cartesian coordinate system to the curvilinear coordinate system.

Based on the curvilinear coordinate system, we consider the reference path as the baseline, and two parameters corresponding to the axes of the curvilinear coordinate system are adopted to represent the state of the generated path for the feasible path cluster. Given the arc length si and lateral offset li at the current position of the generation path, the arc length sf and lateral offset lf at the end position when sampling the terminal state are used to generate the design parameters of various paths. As shown in Fig. 3, we generate a set of feasible path clusters by laterally shifting the reference path along the s-axis in the curvilinear coordinate system. The lateral offset can satisfy all possible paths generated by changing it to cover the road. Furthermore, the generation along the curvilinear coordinates can be more consistent with driving maneuverability, and the constraints of road conditions and vehicle dynamics can be satisfied by applying differential equation constraints to the generated curve.

We then apply a cubic polynomial to smooth the curvature of the generated paths to realize a smooth transition from the initial lateral offset to the sampling end offset, and used first- and second-order derivations of the lateral offset to calculate their curvatures [30].

ls=as3+bs2+cs+d,sis<sfsf,sfs
dldts=3as2+2bs+c,sis<sf0,sfs
d2ldts=6as+2b,sis<sf0,sfs

where a,b,c,d are the coefficients of the cubic polynomial, which can be determined from boundary conditions such as initial and final position and velocity. The boundary constraint and the constraint of heading angle difference θ(s) between the vehicle and baseline for the cubic polynomial can be written as follows:

ls0=l0,lsf=lf,lsi=li
dldtsi=tanθsi,dldtsf=0
θs0=θ0,θsf=θf,θsi=θi

In the path generation process, the vehicle kinematics model needs to be described in detail; the vehicle kinematics model can be expressed by the following Eq. 10.

ẋ=vcosθ,ẏ=vsinθ,θ=tan-1(ẏ/ẋ),θ̇=vρ

where (x,y) is the position, θ is the vehicle velocity angle, v is the linear velocity, and ρ is the curvature. Considering that there are limits to the motion and dynamic characteristics, the curvature and acceleration must be checked in the kinematic model to ensure the physical feasibility of reasonable steering behavior and to improve the safety and smoothness of lateral tracking and control [31].

The entire process was divided into path generation and speed planning when generating a feasible path. For the partial planning and elimination of speed, the curvature differentially constrains the motion equation, and the vehicle motion is expressed by distance instead of time. Therefore, a transformation can naturally decompose path generation into space-path generation and speed generation. Adopting the above method can generate a path independent of vehicle speed. Furthermore, the path generated after the curvature constraint is consistent with the reference path trend, maintaining a fixed lateral offset and forming a feasible path cluster, as shown in Fig. 3.

2.3. Speed generation

The appropriate execution speed should satisfy the following characteristics after collecting a wide range of path candidates: ① continuous velocity and acceleration, ② no sideways movement of the vehicle while following the trajectory at the speed, ③ a feasible speed limit based on the acceleration capability, and ④ soft constraints such as traffic rules and particular road condition limits that need to be considered to achieve enforceability. Referring to Ref. [25], we adopted a trapezoidal frame for speed planning.

We employed a polynomial approach to smooth the trapezoidal speed curve following the trapezoidal speed frame and avoid adverse consequences, such as extreme conditions caused by abrupt changes in speed, resulting in reduced comfort. Accordingly, the acceleration at and path length st can be easily obtained through derivation and velocity integration, respectively. The initial velocity v0, initial acceleration a0, terminal velocity vf, and terminal acceleration af, total path length sf, terminal time tf, and unknown parameters {a,b,c}. Based on the trapezoidal velocity profile, the velocity can be described as a cubic polynomial function of time, and the following equation gives its corresponding relationship:

vtf=v0+atf+btf2+ctf3=vf
atf=a0+2btf+3ctf2=af
stf=v0tf+12atf2+13btf3+14ctf4=sf

The trapezoidal curve after polynomial smoothing can be obtained using the above formulas, guaranteeing the continuity of acceleration and speed smoothness. In the actual planning process, the speed must be further optimized according to the actual situation and dynamically adjusted according to the speed constraints under different situations. Furthermore, the acceleration and deceleration settings in the real vehicle experiments were relatively conservative to ensure that the acceleration produced by the cubic polynomial speed curve did not exist at certain points when the acceleration of the linear speed curve exceeded the constraints.

3. Trajectory evaluation and optimization

Given the trajectory planning processes in a rolling pattern, we must optimize and select the optimal path after generating feasible finite candidate trajectory curves by utilizing the structural characteristics of the traffic environment. This section introduces a general trajectory optimization framework with a unified and auto-tuned objective function that employs the principle of least action to select the optimal solution.

3.1. Objective function based on the principle of least action

The trajectory planning subsystem is crucial for AVs to handle the complexity of driving scenarios, the unpredictable behavior of other road users, and the need for driving safety, efficiency, and driver comfort. This subsystem functions similarly to the driver’s brain when carrying out a transport mission. The key challenges in designing a planning algorithm are how to act like a human, improve adaptability by learning from drivers, and acquire optimal strategies.

To achieve anthropomorphic driving of AVs, we first consider the drivers’ manipulation behavior (tactical and operational) in the driving process, then extract relevant attributes to which drivers pay attention and analyze the main objectives to pursue. Finally, by imitating drivers’ tactical and operational decision-making strategies, we developed a general trajectory planning approach. The fundamental objectives most drivers pursue while driving are efficiency and safety. According to Aart and van Schagen’s work [32], there is a relationship between driving speed and driving safety, and the degree of safety decreases as the speed change rate increases. Similar results were obtained in other studies [33], [34], which confirmed that one of the main goals was driving safety. Further psychological research [35] revealed that drivers consistently seek to reach their desired destination faster. Research has demonstrated that the pursuit of safety and efficiency is universal.

When establishing an objective function considering safety and efficiency, we propose that the pursuits in the driver’s brain can be described as pursuing an extreme state of various targets in nature. The anthropomorphic driving process of AVs can be regarded as the result of natural laws restricting the interaction between AVs and the traffic environment. Therefore, to explain the driver’s choice of the optimal path in the driving process, we introduce the principle of least action [28] to reveal the driving mechanism of “seeking benefits and avoiding losses.” The principle of least action demonstrates that when there are multiple ways to realize a process, nature always selects the one that minimizes time and energy consumption. Although the energy level of a system determines the state stability according to the concept of analogy entropy, the system is in a safe state when it is stable, and the pursuit of high efficiency is reflected in shortening the driving time. The pursuit of safety and efficiency in the driving process can also be expressed by integrating the Lagrangian L with time t, where L=Lx,ẋ=T-U, T is the kinetic energy, U is the potential energy, x,ẋ are the lengths of the lateral coordinates of a small path section and its slope, respectively.

Specifically, the introduction of the principle of least action is inspired by relevant mechanical principles and the modeling process of the Hamiltonian framework for the optimal path in the particle motion process, as shown in Fig. 4. A physical system is referred to as a dynamic system if the behavioral description of the system is time-dependent. A system’s behavior can be formally described as motion in phase space or “state space.” The driver–vehicle–road system is an objectively existing generalized dynamic system. The principle of least actions can represent a physical system that employs T and U to represent the integration of a Lagrangian quantity with time. In a system composed of free particles, based on the principle of least action, the real motion of the particles from points 1 to 2 causes the action to have a minimum value. The laws of mechanical systems are described by the generalized coordinates in variational problems δS=0 when the actual action S=t0tfLdt is taken as an extreme value. t0 and tf are the start and end time points of the vehicle, and L is the actual process’ Lagrangian. Therefore, establishing an objective function for AVs in pursuit of safety and high efficiency can be transformed into finding the path with the least action. Furthermore, we use S to represent the least action corresponding to the actual action of the vehicle in the virtual mechanical system.

3.2. Path calculation and selection

The autonomous vehicle reaches the target area at a minimum cost while avoiding obstacles and satisfying the corresponding constraints. Its linear time-invariant system dynamics are described in [36]:

Xt+1=AXt+But
Yt=CXt+Du(t)R(t)

where Xt is the state vector and ut is the input vector. A is the state transition matrix, B is the input matrix, C is the output matrix, and D is the direct transmission matrix, respectively. Also, vehicles have dynamic constraints for the presence of static and dynamic obstacles in the environment. Therefore, the local constraints of the vehicle are described by output set R(t). Eq. 14 is the state–space model of the vehicle, Eq. 15 is the constraint condition that needs to be satisfied by the state and input, R(t) is the constraint condition that needs to be satisfied at time t, and the output state vector YtR(t).

Receding-horizon optimization can obtain the optimal input in a limited period by considering the objective function and constraints and solving the problem iteratively. Therefore, the optimization process is performed in a limited domain, new control inputs are executed to form new states, and the next optimization is output based on the current state until the target set Φ is reached. Therefore, the resulting trajectory comprises a series of locally optimal segments. The objective function can be expressed in the receding horizon as:

S=k=0J(ut,Xt,Φ)
S=min{τ=0HwJut+τ|t,Xt+τ|t,Φ+Z[X(t+Hw+1),Φ]}

where J() is the cost function and Φ is the target set. ut+τ|t and Xt+τ|t represent the control input value and the predicted value at time t to future time t+τ, respectively. Hw is a receding domain and Z() represents the terminal penalty term. Specifically, referring to the existing research [37] and dynamic parameter adjustments, the receding domain Hw was set to 15.

Autonomous vehicles are driven in complex traffic environments with many vehicles, cyclists, pedestrians, and road facilities. Dynamic interactions between vehicles can describe system dynamics. Multi-vehicle systems can be classified as multiple two-vehicle systems. We then divided the interactions between the vehicles based on a typical car-following scenario, as shown in Fig. 5. Specifically, when a vehicle is driven, its driving force moves it from the starting point to the endpoint. When there is no lane-changing (LC) process, the driver’s target driving force only exists in the longitudinal direction; when the driver chooses to change lanes, owing to lateral movement, the driver’s target driving force will have a lateral component. The longitudinal constraints of the road speed limit and the lateral constraints of road markings were specifically considered. The Lagrange equation in car-following process is:

Li=Ti-Ui=12mivi2-t0tfRi-Gi,xvi,x+Fli,2-Fli,1-Gi,yvi,ydt-t0tfFjivi-vjdt

where i represents vehicle i; mi is the mass of vehicle i; vi and vj are the velocities of vehicles i,j, respectively; and the subscripts x and y denote the longitudinal and lateral dimensions, respectively. Gi is the virtual driving force of vehicle i driven by the driving target and Ri is the longitudinal restraint resistance of traffic rules to drivers. Ri,Gi can be referred to in traffic [38]. Fli,1 and Fli,2 represent the transverse binding forces of both lane lines on the target vehicle, and Fji represents the external force of the risk caused by vehicle j to vehicle i.

The virtual driving force caused by the driver’s inner target can be expressed using Eq. 19:

Gi=migsinφi,x,sinθi,x=ξvdervl

where g is the acceleration of gravity; φi is the driver’s pursuit of driving speed in vehicle i, specifically, φi,x means in the longitudinal direction; ξ is a constant, generally an average acceleration of approximately 0.28g when the vehicle accelerates from 0 to the average road velocity (100 km∙h–1), and hence in this paper ξ = 0.2; vder is the driver’s expected velocity; and vl is the lane velocity limit.

A lane boundary (such as a lane line) restrains the lateral motion of a vehicle. Generally, researchers use a spring model to describe this constraint [30], [31]:

Fli=1.5miBllw2-rli

where Bl represents the line type of the road boundary (for example, we set Bl=2 for the dotted line and Bl=3 for the solid line), rli is the distance between the vehicle and road boundary, and lw is the lane width.

During the driving process, the speed of the ego vehicle and the relative speed between the ego vehicle and surrounding vehicles directly relate to the potential collision risk. Generally, the higher the speed of the ego vehicle, the greater the collision risk. The greater the relative speed between the ego vehicle and surrounding vehicles, the greater the traffic disturbance and potential impact on the front and rear vehicles. Therefore, the vehicle safety margin in the longitudinal direction is positively related to the ego vehicle speed and relative speed and is limited by the driver’s elliptical observation angle distribution. The lateral safety margin is related to the driver’s risk sensitivity and is also affected by lane constraints. Considering the limited change in lateral speed, the margin can only be related to the dynamic relative distance.

Furthermore, considering that a driver’s visual recognition capabilities are significantly affected during the driving process, for example, the field of view becomes narrower as the vehicle speed increases. Therefore, similar to social forces [14], [39], the driver’s field-of-view distribution is regarded as an ellipse, and considering that there are significant differences in the risks caused by vehicles driving to the road traffic environment along the longitudinal and lateral axes, the risk distribution is constructed finally into an elliptical distribution with dynamically varying major and minor axes. In particular, the longitudinal and lateral dynamic safety margins are described.

rx=rij,x+Ψ[vix,vix-vjx]Δt
ry=rij,y+

where Ψ() is the positive correlation function, Δt represents the receding time step, and is the lateral safety margin. rij,x, rij,y are the car-following distances of vehicles i and j in the longitudinal and lateral directions, respectively. Therefore, the interaction between traffic participants can be expressed as:

Fji=12mjvj2rij1rx2-1ry2

For any driving scenario, we assume that there are n vehicles in the traffic system, and the Lagrange of vehicle i can be described as:

Li=12mivi2-t0tfRi-Gi,xvi,x+Fli,2-Fli,1-Gi,yvi,ydt-t0tfj=1n-1Fjivi-vjdt

Aiming at the safety and efficiency goals pursued by drivers, we establish a Lagrangian Li, which reflects the stable states of vehicle i. Furthermore, efficiency can be expressed as the time consumed in the driving process. According to the principle of least action, the cost of generating a feasible path can be defined as Srisk.

Srisk=t0tfLidt=t0tfTi-Uidt

3.3. Speed optimization

After optimizing and adjusting the vehicle driving strategy within the time window, the optimal dynamic objective function is obtained as:

Srisk=min{τ=0Hwt0tfLiut+τ|t,Xt+τ|t,Φdt+Z[X(t+Hw+1),Φ]}

After calculating the optimal path through the above steps, the optimal path is the path that satisfies “F=ma.” Therefore, it is appropriate to calculate the minimum value of Srisk to solve this path, and the variational method is typically used to solve the functional extremum.

argSrisk=argminτ=0Hwt0tfLiut+τ|t,Xt+τ|t,Φdt+ZXt+Hw+1,Φ=0

where Srisk is the theoretical least action and is the extreme value of the actual action Srisk of each trajectory. The variation in the daily Lagrangian value for this process is 0. Further, we decompose the Lagrangian value to the Cartesian coordinate system for convenience at each time point t, and update the Lagrangianvalue by considering the receding optimal search (t+τ).

Lx=12mivi,x2-t0tfRi-Gi,xvi,x+j=1n-1Fji,xvi,xdtLy=12mivi,y2-t0tfFli,2-Fli,1-Gi,yvi,y+j=1n-1Fji,yvi,ydt

Therefore, we need to calculate Eq. 28 using the Euler–Lagrange equation, the form of which is:

ddtLq̇-Lq=0

where q denotes a generalized coordinate, in a mechanical system, it could represent the position; q̂. is the time derivative of the generalized coordinate q, and it represents the generalized velocity.

Therefore, for each time step t, the Euler–Lagrange equation was substituted to obtain the optimal velocity (vi,x,vi,y) of the corresponding trajectory TraO:

miv̇i,x-[Ri-Gi,x+j=1n-1Fji,x]=0miv̇i,y-[(Fli,2-Fli,1-Gi,y)+j=1n-1Fji,y]=0

3.4. Algorithm framework

The implementation algorithm (GOTP) for the entire process is presented in Table 1. Specifically, it can be described in four ways:

(1) Generate the reference centerline;

(2) Generate the trajectory cluster and corresponding velocity;

(3) Calculate the action of the feasible trajectory;

(4) Output the speed corresponding to the optimal trajectory.

Furthermore, after obtaining the trajectory candidates to seek the desired multi-performance optimal trajectory, we further adjust the vehicle driving strategy based on receding horizon optimization within the time window [12]. By satisfying constraints such as curvature continuity and collision avoidance, the optimal solution is obtained by a continuous rolling dynamic search.

Specifically, assuming that the upper layer of AV (behavior planning layer) requires the ego vehicle to complete a lane change from t0 to t1, the vehicle needs to complete a Δl in the lateral direction and a Δs movement in the longitudinal direction. s and l can transfer as s(t) and l(t). By obtaining a finite set of lateral trajectory candidates at each period, completing the entire planning process, and calculating the action of each feasible trajectory, we can obtain the optimal trajectory and reference speed that meets the driver’s expectations.

The specific steps are shown in Fig. 6, Fig. 7. We designed a multi-lane highway with multiple obstacles and two driving lanes. When the vehicle detects an obstacle (red rectangle), the current lane is considered a dangerous candidate. Based on the collision rates in different lanes, vehicles change lanes to avoid obstacles. Meanwhile, considering the pursuit of efficiency and comfort, the vehicle will attempt to drive along the centerline of the lane and choose the shortest path. Therefore, the lowest-cost trajectory, shown in Fig. 6, is the green trajectory. In Fig. 6, trajectories 2, 4, 6, 8, and 10 interact with red rectangular obstacles, thereby generating spatiotemporal conflict points with the obstacles. Therefore, these trajectories are considered unsafe. To find a feasible trajectory, we rescan and evaluate the trajectory using a deterministic planner. The principle for selecting the optimal trajectory is shown in Fig. 7. To satisfy optimality, it is necessary to perform multi-objective trade-offs simultaneously in the process of searching for the optimal trajectory. For example, the calculation process of Methods A and B may find a safe but not necessarily optimal trajectory and stop the search. In contrast, our approach simulates the driver’s optimization process and finds a satisfactory solution. Based on this, the optimal solution is searched iteratively.

4. Experiment and results analysis

In this section, we describe the simulations and real-vehicle experiments conducted in typical traffic scenarios to verify the performance of our proposed GOTP framework for autonomous vehicles, including its feasibility, optimality, and adaptability. In simulation experiments, we compare the effectiveness of the proposed framework with that of traditional algorithms in different scenarios based on Matlab. In the real vehicle test, road experiments are conducted on vehicles owned by the Tsinghua University Research Group to ensure the intuitiveness of the results and safety. We set the static obstacle position and dynamic obstacle range in advance based on the set navigation points and then obtain real driver data for comparison and verification.

4.1. LC simulations

LC scenarios are typical and frequent in actual traffic. A successful lane-change trajectory planner should first ensure that it does not collide with other obstacles and simultaneously satisfy the pursuit of multiple goals, such as efficiency and comfort.

To demonstrate the advantages of our method qualitatively, Fig. 8 shows an example of a typical LC scenario. The yellow vehicle represents the ego vehicle, and the green, blue, and purple vehicles represent the surrounding vehicles. During the planning process, to better visualize the planning process, we use dynamic rolling planning to demonstrate further that the planner predicts the location over some time in the future. When the positions of two vehicles on the same time axis do not overlap, a safe lane change occurs, as shown in Fig. 8(a)–(c). However, if the safety gap is short over time, there will be an overlap in the space–time positions. An example is shown in Fig. 8(d). Furthermore, the corresponding process can be described by the spatiotemporal evolution of the vehicle interaction position, as shown in Fig. 8(e).

Based on the Matlab Automated Driving Toolbox, we compared the proposed method with Matlab’s lane-change decision logic and trajectory-planning algorithm. The time headway (THW) was set as a safety constraint to estimate the criticality of a certain traffic situation for comparison. As shown in Table 2, by comparing our method with the planning method based on the THW safety constraint in the two cases of constant velocity (CV) and constant acceleration (CA) of the surrounding vehicles, the results show that our method performs well in this multi-vehicle lane-change situation. In a multi-vehicle environment, because many surrounding vehicles need to merge, the gap in the target lane is not sufficiently large; therefore, it is necessary to find a suitable safety boundary to complete the lane change. Compared with the THW method, the proposed GOTP framework is scalable for finding safe LC trajectories in this challenging situation while satisfying the efficiency requirement.

4.2. Multi-lane overtaking simulations

4.2.1. Case study 1

Challenging driving scenarios such as multi-lane overtaking occur when static and moving obstacles are simultaneously present at the same time. To qualitatively demonstrate the advantages of our approach, we tested the GOTP in typical overtaking scenarios. Fig. 9(a) shows that the ego vehicle was driven along the road centerline. Static obstacles were located in the current and left lanes. The ego vehicle must plan a reasonable and feasible trajectory from the origin to the endpoint. The GOTP framework calculated the distance from each possible curve to the nearest obstacle to determine the allowable velocity. The trajectory labeled “GOTP” in Fig. 9(b) is the globally optimal trajectory found by the proposed planner. We plotted the trajectory outputs from the APF and A* planning methods for comparison. The specific planned path lengths and times are listed in Table 3. The results show that GOTP can effectively determine the global optimal trajectory with the shortest planning path and planning time. The APF may fall into the local optima process, resulting in the longest planning paths and times. The curve obtained by A* deviated from the road centerline, and the trajectory quality was not high. Fig. 9(c) shows the ego vehicle’s longitudinal and lateral velocity curves embedded in the GOTP. The vehicle speed changed smoothly owing to the constrained control of the jerk rate, avoiding sudden acceleration or rapid deceleration of the vehicle during the LC process and ensuring the comfort of trajectory generation.

4.2.2. Case study 2

Furthermore, there are obstacles ahead when AVs drive in dynamic traffic, including slow and fast vehicles merging under various conditions, and it is often necessary to complete the task of continuous overtaking. Fig. 9(d) shows the initial configuration of the overtaking scenario on an oncoming two-lane road with surrounding vehicles at a constant speed of 10 m∙s–1. The overtaking scenario requires fast vehicles to understand the potential LC intention of the surrounding vehicles and react appropriately to complete the overtaking. The ego vehicle must decide the overtaking timing and optimal trajectory of the overtaking process without colliding with the surrounding vehicles.

Experiments show that the ego vehicle accelerates slightly during overtaking to use the space between the first and second oncoming vehicles to overtake. Simultaneously, the ego vehicle avoids colliding with all vehicles in its lane. Roll planning must be considered in the presence of moving obstacles, and all obstacles are expected to be tested safely to avoid collisions. Therefore, in the process of dynamic planning, through continuous optimization, as shown in Table 3, our plannar GOTP finds a safe and continuous overtaking curve; however, it is difficult for the benchmarks (A* and APF planners) to complete the continuous overtaking process under various parameter settings.

The results of tests on different cases show that the GOTP can achieve safe and comfortable trajectories in various challenging scenarios, avoiding static and moving obstacles. Additionally, this approach can provide appropriate acceleration to control the vehicle for throttling and braking. Appropriate velocities are used to ensure that the vehicle was driving at a safe velocity.

4.3. Real vehicle test

Furthermore, we test the proposed planning approach using a real vehicle. Based on a real vehicle platform, we first collect the driving data of experienced drivers. Two classic driver models, the car-following model (Gipps) and the LC trajectory model (MOBIL) [40], [41], are employed to compare the differences between the proposed GOTP framework and existing typical driver models. We test and verify these models in typical car-following, cut-in, and rear-vehicle acceleration scenarios and compare them with the driving trajectories of experienced drivers.

When analyzing the “perception–decision–control” process of AVs in dynamic interactive traffic, the performance of trajectory planning can be quantified by comparing some evaluation metrics of the ego driver, the GOTP framework, and the Gipps/MOBIL model under the same conditions. Based on the speed changes and driving trajectory characteristics, we select three metrics: speed similarity, average temporal consistency, and critical moments of acceleration and deceleration. Fig. 10 shows that the conflict vehicle suddenly brakes due to the disturbance of the front bicycle obstacle in the car-following scenario, and the ego driver GOTP takes action to avoid collisions. Results show that the ego driver’s and GOTP’s speed curves are similar (both speed curves are smoothed and simplified), and the critical acceleration and deceleration moments achieve high consistency (Δtc10.17s,Δtc20.25s). For example, the deceleration peak moment output by the GOTP is consistent with the driver’s braking behavior.

The results of three different scenarios (car-following, cut-in, and rear vehicle acceleration) are shown in Fig. 11(a)–(c), respectively. Specifically, in Fig. 11(a) and (c), in the car-following and rear vehicle acceleration scenarios, compared with the typical Gipps model, the GOTP can drive the vehicle to have a high similarity with the driver at speed tracking and early warning of acceleration and deceleration. Specifically, the average speed is greater than that of the Gipps model and achieves higher efficiency while ensuring driving safety. In Fig. 11(c), the GOTP and Gipps models perceive changes in the relative distance and speed between vehicles owing to the rear vehicle acceleration; therefore, they adopt acceleration behavior and increase the speed. The driver has a blind spot and is insensitive to the underlying risk, which increases the risk to the ego vehicle. Thus, the GOTP model allows for multi-directional risk sensitivity.

As shown in Fig. 11(b), compared with the MOBIL model, the GOTP method has a higher consistency with the driver speed in the cut-in scenario. It is more sensitive to the cut-in response of the surrounding vehicle. In contrast, the MOBIL model reacts slowly in the deceleration process, resulting in a higher average speed during the cut-in process, which reduces safety with higher efficiency. By testing real vehicles in different scenarios, the results show that the GOTP is more risk-sensitive in longitudinal and lateral omnidirectional scenarios, can achieve efficient and safe dynamic balance for different scenarios, and pursues the optimality of the output trajectory.

4.4. Discussion

In complex dynamic collision scenarios, the driver, proposed GOTP, and classical methods act differently, resulting in various application results. As shown in Fig. 12, identifying natural human responses to emergencies and potential hazards is critical for making safety decisions in uncertain situations. An interesting topic of discussion is the extent to which safety systems can be developed using natural human risk-sensitive behavior. Every time drivers look at the road, they re-plan a driving policy. The driver obtains the current vehicle-driving strategy through experience. However, owing to the limitations of drivers (including different driving experiences and driving styles), experience-based policies will fail in extreme scenarios and cause traffic accidents.

Classical methods are usually rule-based, and when trying to predetermine how to handle interactive traffic participants, the difficulties are obvious: ① It is difficult to understand and predict how obstacles will interact with the ego vehicle, so the following motions are difficult to describe, and defining rules is challenging to apply; ② when multiple obstacles block the road, the probability of not being able to find a trajectory that satisfies all predetermined decisions is greatly increased, resulting in planning failure. Therefore, the failure probability of rule-based applications increases in different scenarios.

This study proposes a trajectory planning method based on the principle of least action to address the problems of the various forms of existing trajectory quality evaluation functions, strong subjectivity of weight assignment, and poor scenario adaptability. The GOTP method divides the planning process into two stages: feasible trajectory generation and trajectory decision-making. It learns an excellent driver manipulation mechanism to establish a comprehensive evaluation function of trajectory quality that considers safety and efficiency and realizes the objective expression of the trajectory quality evaluation function in different scenarios. The inner pursuit of “seeking benefits and avoiding losses” is a uniform rule that AVs will follow. The receding-horizon optimization concept can determine an optimized performance index at each moment. With the help of a dynamic objective function design related to the interaction relationship of traffic participants, it is beneficial to jump out of the local optimal value and complete the global optimization during planning. Meanwhile, because the GOTP algorithm considers interactions with surrounding traffic participants (dynamic vehicles and other road users) in the design of the objective function, risk-bounded constraints can theoretically enable AVs to achieve safe driving in a dynamic environment. Based on simulations and real vehicle verification, the GOTP can be employed to solve trajectory planning problems in complex environments and ensure that the vehicle achieves positive performance during the optimization solution.

Furthermore, as shown in Fig. 13, when planning trajectories in different scenarios, it is necessary to achieve multiple objectives, such as driving safety, efficiency, and comfort, and achieve the optimal performance of the output trajectory with the shortest path and least energy. The GOTP framework, based on the principle of least action, mines behavioral characteristics from a large amount of drivers’ naturalistic driving data and combines them with the properties of natural physical systems, which can avoid the bounded rationality and cognitive bias of drivers. In particular, owing to the unified objective function, the need to separately adjust the weights for different objectives in different scenarios can be effectively avoided. Instead, a dynamic balance of pursuing objectives in various scenarios according to the intrinsic needs of drivers can be achieved.

5. Conclusions

This study proposes a GOTP framework for AVs that applies to various types of roads and enables AVs to avoid dynamic and static obstacles efficiently. The proposed GOTP framework incorporates the unified objective function of the principle of least action, which conforms to the driver’s characteristic of “seeking benefits and avoiding losses.” It allows safe and efficient multi-objective optimization and adaptability in the driving process. A dynamic integrated structure based on receding-horizon optimization solves the trajectory optimization problem by considering risk constraints, model-based feedback control, and continuous optimization, resulting in trajectories with positive performance in typical scenarios.

Through simulation verification in an LC scenario, multi-lane overtaking scenarios, and multi-type real vehicle experimental tests, the comparative results show that the proposed GOTP framework based on the principle of least action can be applied effectively for reliable lane changes, safe-margin keeping, and continuous overtaking. It considers driver behavior characteristics and the desired speed for the decision-making process while maintaining safe and efficient driving, with a 100% collision-free rate and an 87% complete lane change rate. Compared to existing methods, our proposed GOTP allows AVs to execute safely and efficiently in complex multi-type scenarios and conforms to the expected driver behavior of dynamic multi-objectives. Future work will include inputting sources of uncertain interactive objects, such as dynamic cyclists and pedestrians, to extend the GOTP framework’s dynamic safety boundaries and general optimization functions.

Acknowledgments

This research was supported by the National Natural Science Foundation of China (the Key Project, 52131201; Science Fund for Creative Research Groups, 52221005), the China Scholarship Council, and the Joint Laboratory for Internet of Vehicles, Ministry of Education-China MOBILE Communications Corporation.

Compliance with ethics guidelines

Heye Huang, Yicong Liu, Jinxin Liu, Qisong Yang, Jianqiang Wang, David Abbink, and Arkady Zgonnikov declare they have no conflict of interest or financial conflicts to disclose.

Special issue articles

References

RIGHTS & PERMISSIONS

THE AUTHOR

AI Summary AI Mindmap
PDF (4934KB)

4996

Accesses

0

Citation

Detail

Sections
Recommended

AI思维导图

/