PATH PLANNING ALGORITHM BASED ON TEACHING-LEARNING-BASED-OPTIMIZATION FOR AN AUTONOMOUS VEHICLE

Abstract
This study presents a teaching-learning-based optimization (TLBO) path planning method for an autonomous vehicle in a cluttered environment, which takes into account path smoothness and the possibility of collision with nearby obstacles. The path planning problem is tackled as a multiobjective optimization in order to plan an efficient path that allows the vehicle to travel autonomously in crowded settings. The TLBO algorithm is used to find the near-optimal path, with the goals of finding the shortest path to the target site and maximizing path smoothness, while avoiding obstacles and taking into account the vehicle's dynamic and algebraic constraints.
Article info: Received 1 October 2021; Accepted 18 January 2022; Online 22 February 2022

ISSN 1335-4205 (print version), ISSN 2585-7878 (online version)

of the local environment. Based on the on-board sensory data obtained, local path planning generates a low-level, high-resolution path. However, when the environment is congested or the goal is a significant distance away, this method is ineffective. As a result, it is preferable to combine the two approaches in order to maximize their benefits while minimizing some of their drawbacks [2].
Path planning approaches can be categorized into classical and bio-inspired approaches [3]. Before the 2000s, traditional approaches were mostly employed, but bio-inspired methods have since become the most popular [4]. The main idea behind traditional approaches is to either find a feasible solution or confirm that there is not one. Cell decomposition [5][6], roadmap [7], sampling-based algorithms [8] and artificial potential field (APF) [9][10] are the most common traditional techniques. These methods are usually not mutually exclusive and, to optimize the path planning process, a hybrid algorithm combining two traditional techniques is often used [11]. Although traditional techniques are straightforward, they have a number of limitations, including long calculation times, inability to execute in real time and local minima trapping. As a result, bio-inspired techniques

Introduction
The abilities to reach a target and to avoid obstacles are critical elements of autonomous vehicle navigation. To achieve them, the vehicle interacts with its surroundings and perceives the world in which it works. Path planning is essential for the vehicle to be totally autonomous and dependable.
Its goal is to plan a series of suitable paths via multiple points and segments that allow the vehicle to complete its tasks and arrive at the desired destination while avoiding collisions with surrounding obstacles, as well as to achieve other objectives such as minimizing the travelled distance, mission time, consumed energy and/or any other objectives based on the mission type. For path planning, there are two approaches: (i) off-line or global path planning and (ii) online or local path planning [1]. Typically, global path planning generates a high-level, low-resolution path based on knowledge of the environmental map, as well as current and historical perceptive environmental information. This method is effective in creating an optimized path. However, its response to dynamic or unanticipated obstacles is insufficient. The local path planning approach, on the other hand, does not require a priori knowledge

The TLBO technique is a metaheuristic algorithm proposed by Rao et al. in 2011 [20]. It is inspired by the process of teaching and learning, via a simplified mathematical model of the knowledge improvements gained by students in a class [21]. The basic idea of the TLBO is that a group of learners constitutes the population, while the teacher is considered the most learned person, i.e., the learner with the best solution. The TLBO is divided into two phases, the teaching phase and the learning phase. The first phase means learning from the teacher, while the second one means learning through the interaction between learners.
The TLBO has frequently been proposed for the path planning of robotic manipulators [22][23]. However, just a few studies have used the TLBO to plan UGV paths. For mobile robot path planning, the nonlinear inertia weighted teaching-learning-based optimization (NIWTLBO) technique is proposed in [24]. A coordinate system transformation creates a new map representation of the path between the start and goal points. To obtain a globally optimal path, the NIWTLBO method is used to optimize the path's objective function. The effectiveness of the proposed method is demonstrated by simulation results. However, experimental results are not included. In [25], the TLBO is proposed to acquire the parameters of an adaptive neural fuzzy inference system (ANFIS). The results are validated by comparing them with those of other intelligent algorithms such as PSO, invasive weed optimization (IWO) and biogeography-based optimization (BBO). The high quality of the simulation results confirms that the TLBO-based ANFIS is an effective alternative strategy for tackling the differential drive wheeled mobile robot navigation problem.
In [26], conformal geometric algebra and TLBO are proposed for the path planning of a mobile robot that avoids all possible obstacles in the navigation task. The proposed method uses conformal spheres to represent the mobile robot and the obstacles. Then, a conformal translator is proposed to translate the mobile robot sphere to a new sphere position. The TLBO algorithm optimizes this translator by minimizing the distance between the new position and the objective position while maximizing the distance between the new position and any potential obstacles. In [27], the TLBO technique is proposed for path planning to guide a three-wheeled robot. An objective function is proposed to optimize the path planning and the reaching of the destination point. The obtained simulation results show that the TLBO technique finds the near-optimal path. In [28], a hybrid strategy combining TLBO and the shuffled frog leaping algorithm (SFLA) is presented for a mobile robot to enhance exploitation efficiency and overcome SFLA's sluggish convergence rate. The proposed approach is named the shuffled teaching-learning-based optimization (STLBO) algorithm. In comparison to the standard TLBO and PSO, simulation findings demonstrate that

to circumvent these shortcomings have been proposed [3,12].
Due to their ability to handle environmental uncertainties and find the near-optimal path, while considering the vehicle's algebraic and dynamic constraints, as well as any other constraints corresponding to the vehicle or its assigned mission, computational intelligence approaches have recently become the most dominant in the field of mobile robot navigation. Genetic algorithm (GA), fuzzy logic (FL), neural network (NN), particle swarm optimization (PSO), firefly algorithm (FFA) and, most recently, teaching-learning-based optimization (TLBO) are the main bio-inspired methodologies utilized for unmanned ground vehicle (UGV) path planning. Bhaskar et al. [13] propose a GA-based path planner for a mobile robot to design a collision-free path. The proposed method uses GA with a modified mutation operation as a solution strategy to guarantee a random selection of grids from the available zones without any repetition. The suggested model aims to reduce computing complexity by shrinking the search space. The application of PSO helps to reduce calculations and maintain steadier convergence characteristics. In [14], an improved PSO algorithm is proposed to plan an optimized smooth path for mobile robots and to tackle the local trapping and premature convergence issues. This algorithm is combined with a continuous high-degree Bezier curve to smooth the path. In [15], an optimal path planning method based on FFA with self-adaptive population size is proposed. The evaluation of the degree of collision is established at the cost of avoiding the collision. Two nonlinear functions are presented to determine the population size based on the degree of population collision. Individuals are then added to or removed from the firefly population. In terms of solution stability, running time and convergence speed, the proposed algorithm performs better.
It should be mentioned that all the bio-inspired techniques have advantages and disadvantages. As a result, various hybrid algorithms combining two strategies have been developed to increase the path planner's overall performance [16][17][18][19]. It is also worth noting that all the evolutionary and swarm intelligence-based algorithms are probabilistic and require common controlling parameters, such as population size and number of generations. Furthermore, different algorithms require their own algorithm-specific control parameters. The GA, for instance, employs mutation and crossover rates. The PSO, similarly, makes use of an inertia weight, as well as social and cognitive parameters. The proper tuning of algorithm-specific parameters is a critical aspect affecting optimization algorithm performance. Incorrect tuning of these parameters either increases the computational effort or yields a local optimum. Considering this fact, the TLBO is proposed, as it does not need any algorithm-specific parameters.
center of gravity with coordinates (x, y) and orientation angle θ.
In the vehicle configuration space $C$, the start and the target points are connected with a straight line-of-sight (LoS), regardless of the obstacles. Considered as the shortest path, the LoS is divided into (M − 1) segments and M points from $P_1$ to $P_M$, where every two successive points define a segment which is a part of the path linking $S_p$ to $T_p$. Consequently, to avoid a potential collision with the surrounding obstacles and considering all other constraints, a new path consisting of (M − 1) segments and M points (including the source and target locations) is planned. Based on this explanation, the objective of the path planning algorithm is to find the optimum coordinates of the (M − 2) intermediate points (as $S_p$ and $T_p$ are already known) that minimize the path length, maximize the path smoothness and avoid a potential collision with the surrounding obstacles. For this purpose, TLBO is proposed.
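The division of the LoS into M points can be illustrated with a minimal sketch. The function name `line_of_sight_points` is hypothetical, and equal spacing of the points is an assumption, since the text only states that the LoS is divided into (M − 1) segments.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def line_of_sight_points(start: Point, target: Point, m: int) -> List[Point]:
    """Return m points on the straight line from start to target (inclusive).

    Assumption: the points are equally spaced; the paper does not state this.
    """
    if m < 2:
        raise ValueError("m must be at least 2 (start and target)")
    (xs, ys), (xt, yt) = start, target
    return [
        (xs + (xt - xs) * i / (m - 1), ys + (yt - ys) * i / (m - 1))
        for i in range(m)
    ]


# Start and target taken from the simulation section; M = 6 is illustrative.
points = line_of_sight_points((200.0, 0.0), (700.0, 300.0), 6)
```

Only the (M − 2) interior points would then be treated as design variables, since the first and last points are fixed to the start and target locations.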
As mentioned before, the vector of vehicle position is defined as $q(t) = [x, y, \theta]^T$. As the path is divided into (M − 1) segments, at the end of any segment, the vehicle position can be obtained as:
In this work, points $P_1$ and $P_M$ are considered as the starting and target points $S_p$ and $T_p$, respectively. Consequently, the optimal path planning problem can be stated as: finding the optimum coordinates $(x_i, y_i)$ and orientation angle $\theta_i$, $i = 2, \dots, M-1$, to minimize the path length and maximize the path smoothness such that:

the STLBO produces better results. From a review of the literature, the following results are summarized for the problem of unmanned vehicles' path planning:
• Traditional methods are straightforward. However, they have significant drawbacks, such as long computation times, difficulty with online implementation and local minima trapping;
• Bio-inspired approaches are able to find the near-optimal path. However, their main drawback is the selection and tuning of the algorithm-specific parameters; and
• The TLBO is widely applied in the path planning of robotic manipulators [22][23]. However, very few works have focused on applying TLBO to UGV path planning [26][27][28].
Compared to the related works in the literature, the main contributions of this paper can be summarized as follows:
• Developing the near-optimal path planning algorithm based on the TLBO that obtains the desired collision-free path that the vehicle must follow to reach its target; and
• Modelling and simulation of the proposed algorithm to validate its efficiency.
2 Path planning problem
Let $q \in C$ be the vector of generalized coordinates for the vehicle, $q(t) = [x, y, \theta]^T$, where $C$ is the vehicle's configuration space. The vehicle moves in an unknown cluttered environment starting from $S_p$ with coordinates $(x_s, y_s)$, while the goal is to reach the target location $T_p$ with coordinates $(x_t, y_t)$, as shown in Figure 1. During motion, let $q$ be the current posture of the vehicle's

Optimization constraints:
The following constraints apply to the optimization problem:
• The boundary constraints: in which the coordinates of $P_i$ are always within the vehicle's configuration space $C$, i.e.
• The collision avoidance constraint: During the vehicle motion, the distance $d_j$ between the vehicle and the $j$-th obstacle, $j = 1, 2, \dots, N_{obs}$, where $N_{obs}$ is the number of detected obstacles, can be defined as: where $obs_j$ denotes the center of gravity of the $j$-th obstacle. As shown in Figure 2, each obstacle $j$ is surrounded by a circle with radius $r_{obs_j}$. To avoid a collision between the vehicle and the $j$-th obstacle, $d_j$ must be greater than the minimum safety distance $d_{obs_j}$, presented as: where $\zeta$ is a weighting factor and $b_r$ is the vehicle width.
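The collision avoidance check can be sketched as follows. This is only an illustration: the Euclidean distance to the obstacle's center of gravity is assumed for $d_j$, the minimum safety distance is passed in as a given number because its formula is not recoverable from the text, and the function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def distance_to_obstacle(point: Point, obstacle_center: Point) -> float:
    """Euclidean distance between a path point and an obstacle's center."""
    return math.hypot(point[0] - obstacle_center[0], point[1] - obstacle_center[1])


def violates_safety(point: Point, obstacle_center: Point, d_min: float) -> bool:
    """True when the distance is not strictly greater than the safety distance."""
    return distance_to_obstacle(point, obstacle_center) <= d_min
```

For example, `violates_safety((3.0, 4.0), (0.0, 0.0), 5.0)` reports a violation, because the text requires the distance to be strictly greater than the minimum safety distance.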
where $J(u)$ is the total objective function, $J_1(u)$ is the objective function that minimizes the path length, $J_2(u)$ is the objective function that maximizes the path smoothness, $w_1$ and $w_2$ are weighting factors and $u$ is the vector of $m$ design variables, such that:

Objective function calculation
The values of $J_1$ and $J_2$ can be calculated as follows:
• Minimizing the path length via $J_1$: The total path length is determined by: where $S(P_i, P_{i+1})$ denotes the distance between two successive points $P_i$ and $P_{i+1}$.
• Maximizing the path smoothness via $J_2$: The smoothness of a path is a very important attribute in vehicle path planning, since the vehicle should not change its direction abruptly. A smooth path reduces energy consumption and wasted time. Smoothness is therefore considered as a second objective. Maximizing the path smoothness is equivalent to minimizing the total turning angle of the vehicle. As a result, the objective function of the vehicle's turning angle is defined as follows:

This means that if $d_j$ is greater than $d_{obs_j}$, the distance between the vehicle and the $j$-th obstacle is safe. Therefore, the collision avoidance constraint is not violated and no penalty is added to the objective function. However, if $d_j$ is less than $d_{obs_j}$, the vehicle will collide with the $j$-th obstacle. As a result, a penalty is added to the objective function in order to avoid the potential collision.
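The two objectives and the penalty can be combined in a minimal sketch. Several points here are assumptions rather than the paper's exact formulas: the path length is taken as the sum of Euclidean distances between successive points, the smoothness term as the sum of absolute heading changes at interior points, and the penalty as the amount by which each point falls inside an obstacle's safety distance. All function names and the default values of `w1`, `w2` and `v` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def path_length(points: Sequence[Point]) -> float:
    """Sum of Euclidean distances between successive path points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def total_turning_angle(points: Sequence[Point]) -> float:
    """Sum of absolute heading changes (radians) at the interior points."""
    total = 0.0
    for a, b, c in zip(points, points[1:], points[2:]):
        h1 = math.atan2(b[1] - a[1], b[0] - a[0])
        h2 = math.atan2(c[1] - b[1], c[0] - b[0])
        diff = abs(h2 - h1)
        total += min(diff, 2.0 * math.pi - diff)  # wrap into [0, pi]
    return total


def collision_penalty(points: Sequence[Point], centers: Sequence[Point],
                      d_min: Sequence[float]) -> float:
    """Assumed penalty: total intrusion depth into each safety distance."""
    total = 0.0
    for p in points:
        for center, d_safe in zip(centers, d_min):
            d = math.dist(p, center)
            if d <= d_safe:  # constraint requires d > d_safe
                total += d_safe - d
    return total


def total_objective(points: Sequence[Point], centers: Sequence[Point],
                    d_min: Sequence[float], w1: float = 1.0, w2: float = 1.0,
                    v: float = 100.0) -> float:
    """Weighted sum of both objectives plus the penalty term (illustrative)."""
    return (w1 * path_length(points)
            + w2 * total_turning_angle(points)
            + v * collision_penalty(points, centers, d_min))
```

Note that this sketch only checks the path points themselves; a segment between two safe points could still cross an obstacle, so a denser sampling along each segment may be needed in practice.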

TLBO algorithm
After defining the optimization problem in Section 2, the TLBO algorithm is applied to obtain the optimum path that minimizes the total objective function presented in Equation (15).
Teaching-learning is an important process in which each individual seeks to learn from others in order to improve themselves. Rao et al. [20,30] and Rao and Patel [31] proposed the TLBO algorithm, which simulates the traditional teaching-learning phenomenon of a classroom. The algorithm mimics two essential types of learning: (i) from the teacher (known as the teacher phase) and (ii) through the interaction between all the learners (known as the learner phase). Because the TLBO algorithm is a population-based optimization method, a group of learners is considered as the population and the design variables are the various subjects offered to the learners. The mean result of each learner is analogous to the objective function of the optimization problem. The learner with the best objective function value is considered the teacher, and in each iteration the TLBO tries to improve the solution. The execution of the TLBO algorithm is as follows:
Step 1. Construction of the vector of design variables $X$ (the vector of learners' subjects): For each learner $k$, the vector of $D_v$ subjects (the number of design variables) is: As a result, the length of the vector of design variables is:
Step 2. Initialization:
• Select the number of learners $N_p$.
• Select the maximum number of iterations $i_{max}$.
• Generate a random population of learners as follows: $X_{k,j} = X_j^{min} + r_j \left( X_j^{max} - X_j^{min} \right)$, $k = 1, 2, \dots, N_p$, $j = 1, 2, \dots, D_v$, where $r_j$ is a uniformly distributed random number within the range [0, 1], $N_p$ denotes the number of learners in a class and $X_j^{max}$ and $X_j^{min}$ represent the upper and lower bounds of each design variable $j$, respectively.
• Set the iteration counter $i$ to 1.

The weighting factor $\zeta$ is a critical value for ensuring the collision avoidance of the vehicle, with $0.5 < \zeta < 1$. Its value must be selected carefully: while increasing $\zeta$ helps to ensure collision avoidance, it decreases the possibility of finding a feasible solution of the optimization problem, resulting in a high computational burden. In this work, $\zeta$ is selected to be 2/3.
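The random initialization of Step 2 can be sketched as below. The function name `init_population` is hypothetical, and drawing a fresh random number for every learner and every design variable is an assumption about how $r_j$ is sampled.

```python
import random
from typing import List, Optional, Sequence


def init_population(n_learners: int, lower: Sequence[float],
                    upper: Sequence[float],
                    rng: Optional[random.Random] = None) -> List[List[float]]:
    """Draw each design variable uniformly between its lower and upper bound."""
    rng = rng or random.Random()
    return [
        [lo + rng.random() * (hi - lo) for lo, hi in zip(lower, upper)]
        for _ in range(n_learners)
    ]


# Illustrative bounds for two design variables and five learners.
population = init_population(5, [0.0, -1.0], [10.0, 1.0], random.Random(0))
```

Passing an explicit `random.Random` instance keeps repeated runs reproducible, which is convenient when the algorithm is executed many times to assess its stability.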
Consequently, the collision avoidance constraint can be described as: As a result, the path planning multi-objective optimization problem can be mathematically represented as follows: subject to:

As the TLBO is a meta-heuristic optimization technique, the optimization problem should be solved as an unconstrained problem. Therefore, the collision avoidance constraint should be included in the objective function. This can be achieved by adding penalties to the objective function for violating the constraints [29]. For a constrained optimization problem with $c$ constraints, the objective function can be represented as an unconstrained problem as follows: where $G_j$ is the penalty function of the constraint $g_j$ and $a_k$ is a positive constant known as the penalty parameter. Equation (15) indicates that while minimizing the objective function, a positive penalty is added whenever a constraint is violated [29]. There are many common penalty functions. In this paper, the considered penalty is proportional to the amount of violation. The values of the constants $a_k$ can be adapted to change the contribution of the penalty terms relative to the magnitude of the whole objective function. As a result, Equation (15) can be re-written as: where $v$ is the penalty coefficient and $R_{obs_j}$ can be calculated as follows:

Step 3. Teacher Phase: In this phase, the objective is to find the best student in the class to be the teacher. Then, all the other students are considered as learners.
Step 3.1 Calculate the mean result ($M_{i,j}$): The mean result of all learners on a particular subject $j$ is given as follows:
Step 3.2 Calculate the objective function: For each learner $k$, calculate the objective function ($J$) presented in Equation (15).
Step 3.3 Find the teacher $X_{teacher}^{i}$: Determine the best learner of the generation, with its corresponding values of design variables, to be the teacher $X_{teacher}^{i}$ in the current iteration $i$, i.e., the best solution (the lowest value of $J$) among the learners. Under normal conditions, a teacher is thought to be a person with a high level of learning capacity who trains students to improve their learning outcomes. The teacher attempts to raise the average learning success of learners in the subjects they teach, in accordance with his or her ability.
Step 3.4 Evaluate the difference mean: Evaluate the difference between the current mean result $M_{i,j}$ and the teacher's result in each subject $j$ as follows: where $T_f$ is the teaching factor [23], $T_f \in [1, 2]$, which is decided randomly with equal probability as: $T_f$ is not a parameter of the TLBO algorithm. The value of $T_f$ is not given as an input to the algorithm; it is randomly decided by the algorithm using Equation (22). The algorithm performs better if the value of $T_f$ is between 1 and 2 [23]. However, the algorithm is found to perform much better if the value of $T_f$ is either 1 or 2; hence, to simplify the algorithm, the teaching factor is suggested to take the value 1 or 2, depending on the rounding criterion given by Equation (22).

preserved. The accepted individuals of the population are passed and moved on to the next iteration.
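The teacher phase can be illustrated with a minimal sketch that follows the standard TLBO formulation, in which the difference mean is $r_j (X_{teacher,j} - T_f M_j)$ and a candidate replaces a learner only when it improves the objective. These expressions are assumed here because the corresponding equations are not recoverable from the text, and the function name `teacher_phase` is hypothetical. Sampling $T_f$ once per iteration and $r_j$ once per subject is likewise an assumption.

```python
import random
from typing import Callable, List, Optional

Learner = List[float]


def teacher_phase(population: List[Learner],
                  objective: Callable[[Learner], float],
                  rng: Optional[random.Random] = None) -> List[Learner]:
    """One teacher-phase pass: move learners toward the teacher, keep improvements."""
    rng = rng or random.Random()
    n_learners = len(population)
    n_vars = len(population[0])

    # Steps 3.1 and 3.2: mean of every subject and objective of every learner.
    mean = [sum(x[j] for x in population) / n_learners for j in range(n_vars)]
    costs = [objective(x) for x in population]

    # Step 3.3: the teacher is the learner with the lowest objective value.
    teacher = population[costs.index(min(costs))]

    # Step 3.4: teaching factor is 1 or 2 with equal probability.
    t_f = rng.randint(1, 2)
    dif_mean = [rng.random() * (teacher[j] - t_f * mean[j]) for j in range(n_vars)]

    # Step 3.5: shift each learner and accept the candidate only if it is better.
    updated = []
    for x, cost in zip(population, costs):
        candidate = [x[j] + dif_mean[j] for j in range(n_vars)]
        updated.append(candidate if objective(candidate) < cost else list(x))
    return updated
```

This sketch does not clip candidates to the variable bounds; a path planner would normally enforce the boundary constraints after the update.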
Step 5. Termination criterion: The proposed algorithm checks for convergence of the iterative process after each iteration. If the termination criterion is met, the algorithm terminates and outputs the best solution. Otherwise, set i = i + 1 and proceed to Step 3. In this work, the algorithm is terminated if the objective function does not change within 30 consecutive iterations. The flowchart of the TLBO procedure is shown in Figure 3.
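The termination criterion can be sketched as a stall counter. The helper `run_until_stalled`, its `step` callback and the `max_iter` cap are hypothetical names introduced for illustration; `step` is assumed to perform one full TLBO iteration and return the best objective value found so far.

```python
from typing import Callable, Tuple


def run_until_stalled(step: Callable[[], float], patience: int = 30,
                      max_iter: int = 1000) -> Tuple[float, int]:
    """Iterate until the best value has not improved for `patience` iterations.

    Returns the best value and the number of iterations performed.
    """
    best = float("inf")
    stalled = 0
    iteration = 0
    while iteration < max_iter:
        iteration += 1
        value = step()
        if value < best:
            best = value
            stalled = 0  # improvement resets the stall counter
        else:
            stalled += 1
            if stalled >= patience:
                break
    return best, iteration
```

With `patience=30`, a run whose best value last improves at iteration 3 stops at iteration 33. The `max_iter` cap plays the role of the maximum number of iterations selected in Step 2.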

Simulation results analysis
The proposed path planning algorithm is implemented in simulation. The goal of the simulation is to show that the proposed path planning algorithm is able to find the near-optimal global offline obstacle-free path and to check the stability and performance of the proposed TLBO algorithm.
In the simulation, it is assumed that all the obstacles are already detected and that the vehicle's workspace is totally known. The vehicle's start point is $S_p$ = (200, 0) and the target point is $T_p$ = (700, 300). In addition, there are four detected obstacles in the workspace with known locations, while the safety distance for each obstacle is $d_{safe}$ = 30 cm.
As can be seen from Figure 4, the proposed algorithm succeeded in generating a planned path while avoiding the

Step 3.5 Update the values of the design variables: Based on $Dif\_Mean_{i,j}$, the existing solution is updated according to the following expression:

Conclusions
This paper proposed a TLBO algorithm to provide a path planning method for an autonomous UGV in a cluttered environment. The suggested approach was successful in obtaining near-optimal paths while avoiding nearby obstacles, and it can handle the system's kinematic and algebraic constraints. The TLBO algorithm is used to find the near-optimal path between the start and destination locations, taking into account path length minimization, path smoothness maximization and the avoidance of potential collisions with nearby obstacles. The proposed algorithm has two notable features: 1) it is simple and can be easily modelled and simulated; and 2) unlike GA, FFA, PSO and other meta-heuristic optimization techniques, TLBO requires no tuning of algorithm-specific parameters. Finally,

collision with surrounding obstacles. In addition, Figure 5 shows that the objective function converges to the minimum value within an average of 84 iterations.
Since the TLBO is a meta-heuristic optimization technique that starts from a random population in the first iteration, it is not guaranteed to find the global optimum solution. Therefore, it is of great importance to check its performance and stability. This can be done by executing the proposed algorithm many times to ensure that the obtained solution is near the optimal one. As can be seen from Table 1, the proposed algorithm is executed 100 times. The objective function converges to the minimum value within an average of 114 iterations. In addition, the standard deviation of the objective function shows that most of the values are close to the average. All the values of the TLBO parameters are given in Table 3.

the simulations evaluate the proposed path planning algorithm in a cluttered environment, the simulations