LaTex2Web logo

Documents Live, a web authoring and publishing system

If you see this, something is wrong

Collapse and expand sections

To get acquainted with the document, the best thing to do is to select the "Collapse all sections" item from the "View" menu. This will leave visible only the titles of the top-level sections.

Clicking on a section title toggles the visibility of the section content. If you have collapsed all of the sections, this will let you discover the document progressively, from the top-level sections to the lower-level ones.

Cross-references and related material

Generally speaking, anything that is blue is clickable.

Clicking on a reference link (like an equation number, for instance) will display the reference as close as possible, without breaking the layout. Clicking on the displayed content or on the reference link hides the content. This is recursive: if the content includes a reference, clicking on it will have the same effect. These "links" are not necessarily numbers, as it is possible in LaTeX2Web to use full text for a reference.

Clicking on a bibliographical reference (i.e., a number within brackets) will display the reference.

Speech bubbles indicate a footnote. Click on the bubble to reveal the footnote (there is no page in a web document, so footnotes are placed inside the text flow). Acronyms work the same way as footnotes, except that you have the acronym instead of the speech bubble.

Discussions

By default, discussions are open in a document. Click on the discussion button below to reveal the discussion thread. However, you must be registered to participate in the discussion.

If a thread has been initialized, you can reply to it. Any modification to any comment, or a reply to it, in the discussion is signified by email to the owner of the document and to the author of the comment.

Table of contents

First published on Wednesday, Jul 2, 2025 and last modified on Wednesday, Jul 2, 2025 by François Chaplais.

Flatness-based Finite-Horizon Multi-UAV Formation Trajectory Planning and Directionally Aware Collision Avoidance Tracking
arXiv
Published version: 10.48550/arXiv.2506.23129

Hossein B. Jond Department of Cybernetics, Czech Technical University in Prague, Prague, Czechia

Logan Beaver Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, VA, USA

Martin Jiroušek Department of Cybernetics, Czech Technical University in Prague, Prague, Czechia

Naiemeh Ahmadlou Mechanical Engineering Faculty, Sahand University of Technology, New Sahand Town, Tabriz, Iran

Veli Bakırcıoğlu Department of Cybernetics, Czech Technical University in Prague, Prague, Czechia and Aksaray University, Aksaray, Türkiye

Martin Saska Department of Cybernetics, Czech Technical University in Prague, Prague, Czechia

Keywords: Differential flatness; Formation control; Pontryagin’s principle

Abstract

©2025 Elsevier Accepted for Journal of The Franklin Institute. Personal use of this material is permitted. Permission from Elsevier must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Optimal collision-free formation control of the unmanned aerial vehicle (UAV) is a challenge. The state-of-the-art optimal control approaches often rely on numerical methods sensitive to initial guesses. This paper presents an innovative collision-free finite-time formation control scheme for multiple UAVs leveraging the differential flatness of the UAV dynamics, eliminating the need for numerical methods. We formulate a finite-time optimal control problem to plan a formation trajectory for feasible initial states. This optimal control problem in formation trajectory planning involves a collective performance index to meet the formation requirements to achieve relative positions and velocity consensus. It is solved by applying Pontryagin’s principle. Subsequently, a collision-constrained regulating problem is addressed to ensure collision-free tracking of the planned formation trajectory. The tracking problem incorporates a directionally aware collision avoidance strategy that prioritizes avoiding UAVs in the forward path and relative approach. It assigns lower priority to those on the sides with an oblique relative approach, disregarding UAVs behind and not in the relative approach. The high-fidelity simulation results validate the effectiveness of the proposed control scheme.

1 Introduction

Unmanned aerial vehicle (UAV) formation control has gained significant attention due to its broad applications, such as exploring hazardous environments [1], transporting loads [2], conducting critical inspections [3], and surveying historical structures [4]. A fundamental aspect of UAV formation control involves trajectory planning and trajectory tracking. In formation trajectory planning, UAVs plan optimal or sub-optimal collision-free trajectories between their initial and terminal states [5, 6]. Formation trajectory tracking ensures that UAVs converge to these pre-planned formation trajectories [7, 8, 9]. Formation trajectory planning and tracking present significant challenges due to the nonlinear dynamics of UAVs, inter-UAV collision avoidance, and dynamic environmental constraints. For quadrotor UAVs, their underactuated nature further complicates control and trajectory planning. These challenges become more pronounced in time-sensitive operations, where UAV groups must rapidly form formations, execute tasks, and ensure collision-free motion within finite time horizons [10].

A quadrotor UAV, with six degrees of freedom and only four control inputs, is inherently underactuated. Differential flatness resolves underactuation by representing states and inputs as functions of flat outputs (e.g., position and yaw) and their derivatives. This simplifies trajectory planning and control [11, 12, 13]. The application of differential flatness theory to quadrotor formation control has garnered research attention. The study in [14] has tackled the leader-follower formation control for quadrotors using differential flatness theory and proposed an observer-based controller to reconstruct the leader’s states and reject disturbances. In [15], a leader-follower formation control strategy is presented for a team of quadrotors, integrating trajectory planning (using the flatness-based trajectory generation algorithm in [11]), distributed consensus, and trajectory tracking. In [16], a leader-follower control strategy is developed for UAV formation in GPS-denied environments, incorporating feedback linearization, differential flatness, and linear-quadratic regulator (LQR) techniques. In [17], a virtual structure-based approach is introduced to generate formation trajectories in a quadrotor swarm, where quadrotors utilize differential flatness for control. The virtual structure is integrated with multiple layered potential fields to ensure collision avoidance.

Networked UAVs can exchange information with neighbors, enabling consensus algorithms to facilitate collaboration. These algorithms can be integrated with leader-follower [18], behavioral, or virtual structure methods for trajectory tracking or employed independently for trajectory planning and tracking in team formation control [19]. Finite-horizon formation control unifies consensus-based coordination with an optimal control framework by optimizing a collective performance index of UAV states and control inputs within a constrained time frame [10]. Some research applies discrete-time optimal control with fixed-time performance guarantees, such as the method for the seeker stabilized platform (SSP) in [20], which uses iterative updates and dynamic programming. In [21], a framework is proposed that uses fixed-time performance functions and adaptive neural approximations to efficiently handle unknown nonlinearities and directions, avoiding the reliance on Nussbaum-type functions commonly used in existing strategies for prescribed performance control of discrete-time non-affine systems. In [22], a discrete-time fuzzy-neural intelligent control approach with fixed-time prescribed performance is introduced, addressing the limitations of existing methods that struggle with time-varying sampling intervals. However, these approaches can be computationally intensive due to time-stepping algorithms [20], adaptive neural approximations [21], and fuzzy-neural design [22]. In contrast, our method uses continuous-time finite-horizon optimal control and leverages the differential flatness of UAV dynamics to enable efficient real-time coordination via linear-quadratic optimization.

Existing finite-horizon methods exhibit limitations due to their dependence on numerical optimization, leader-follower hierarchies, or pre-planned trajectory interpolations, each of which introduces practical constraints. Numerical methods for trajectory optimization require well-chosen initial guesses to converge to feasible solutions [23]. In the leader-follower approach [18], the leader serves as a single point of failure, the behavioral approach lacks a formalized control framework, and the virtual structure approach suffers from limited adaptability [14, 15, 17]. Trajectory interpolations using Bézier curves, high-order piecewise polynomials, or recursive constructions [24, 5, 6] require predefined initial and terminal states, and often waypoints. Existing collision-avoidance strategies often consider all neighboring UAVs equally without prioritizing evasive actions based on their relative motion [8, 25]. This can lead to inefficient maneuvers and unnecessary trajectory deviations. Although infinite-horizon approaches provide stability guarantees [26], they lack responsiveness to time-sensitive formation control tasks.

This paper addresses the above challenges by leveraging the differential flatness property of quadrotor UAV dynamics to develop a finite-horizon formation trajectory planning and tracking control scheme based solely on the initial states of the UAVs. Specifically, the proposed approach involves the following features:

  1. Consensus-Based Formation Control: Unlike most prior works that rely on a virtual or actual leader as an external reference for formation control [27, 14, 28, 26], the proposed approach is entirely consensus-based. This improves robustness and scalability by eliminating potential single-point failures associated with leader-based strategies. Formation direction can be adjusted by incorporating the coordinates of the target area as a deviation term in the performance index, allowing the UAV group to maintain formation while approaching the target without relying on a designated leader.
  2. Optimal Trajectory Planning Using Only Initial State Knowledge: The proposed framework optimally plans UAV trajectories using only their initial states by applying the necessary and sufficient conditions from the Pontryagin principle. This significantly reduces computational complexity, as trajectory planning and generation are performed only once.
  3. Finite Horizon Real-Time Adaptability: Unlike infinite-horizon formulations [26], our approach employs a finite-horizon strategy, optimizing UAV motion within a limited time window to ensure timely responsiveness, adaptability to dynamic tasks and environments, and guaranteed finite-time convergence.
  4. Directionally Aware Collision Avoidance: The proposed method prioritizes evasive actions based on relative motion, emphasizing avoidance maneuvers on the forward path and along the approach trajectory, thus reducing unnecessary deviations and ensuring efficient collision-free multi-UAV operations.

The main contribution of this paper is a unified framework for UAV formation control that integrates the features discussed above. It ensures collision-free trajectory planning and tracking while maintaining scalability, computational efficiency, and real-time adaptability. Furthermore, the approach leverages the linear-quadratic paradigm from optimal control, which is known for its analytic tractability and computational efficiency [29].

The rest of this paper is organized as follows. Section 2 provides preliminaries. The collision-free finite-time formation control is defined in Section 3. Section 4 recasts the problem definition as optimal control problems, where their solutions are investigated. The directionally aware collision avoidance strategy is introduced in Section 5. Section 6 presents illustrative examples and simulation results for the (re)formation of a four-UAV and seven-UAV team. Section 7 concludes the paper.

2 Preliminaries

2.1 graph theory

A directed graph is a pair \( \mathcal{G}(\mathcal{V},\mathcal{E})\) where \( \mathcal{V}=\{1,…,N\}\) is a finite set of vertices/nodes and \( \mathcal{E}\subseteq \{(i,j):i,j \in \mathcal{V}\}\) is a set of directed edges/arcs. Each edge \( (i,j) \in \mathcal{E}\) represents a directional information flow and is assigned a positive scalar weight \( \mu_{ij}>0\) . A graph \( \mathcal{G} \) is connected if, for every pair of distinct vertices \( (i, j) \in \mathcal{V} \times \mathcal{V} \) with \( i \neq j \), there exists a path of undirected edges in \( \mathcal{E} \) that connects \( i \) to \( j \).

The graph Laplacian matrix is defined as

\[ \mathbf{L} = \mathbf{D} \mathbf{W} \mathbf{D}^\top \in \mathbb{R}^{N}, \]

where \( \mathbf{D} \in \mathbb{R}^{|\mathcal{V}| \times |\mathcal{E}|} \) is the incidence matrix of \( \mathcal{G} \), with its \( (i,j) \)th element equal to \( 1 \) if node \( i \) is the tail of edge \( j \), \( -1 \) if node \( i \) is the head, and \( 0 \) otherwise. Additionally, \( \mathbf{W} = \mathrm{diag}(\dots, \mu_{ij}, \dots) \) for all \( (i,j) \in \mathcal{E} \), where \( \mathbf{W} \in \mathbb{R}^{|\mathcal{E}|} \). Matrix \( \mathbf{L}\) is symmetric (\( \mathbf{L}=\mathbf{L}^\top\) ), positive semidefinite (PSD) (\( \mathbf{L}\succeq \mathbf{0}\) ), and satisfies the sum-of-squares property

\[ \begin{equation} \sum_{(i,j)\in\mathcal{E}}\mu_{ij}\|\mathbf{x}_i-\mathbf{x}_j\|^2=\mathbf{x}^\top (\mathbf{L}\otimes\mathbf{I}_m) \mathbf{x}, \end{equation} \]

(1)

where \( \mathbf{x}=[\mathbf{x}^\top_1,…,\mathbf{x}_N^\top]^\top\in \mathbb{R} ^{mN}\) , \( \mathbf{x}_i\in \mathbb{R} ^{m}\) , \( \mathbf{I}_m\in \mathbb{R} ^{m}\) is the identity matrix of dimension \( m\) , \( \otimes\) demonstrates the Kronecker product operator, and \( \lVert .\rVert\) is the Euclidean norm.

Let \( \mathbf{Y} = [y_{ij}] \in \mathbb{R}^{p \times q} \). Then, the Kronecker product of \( \mathbf{Y} \) and the identity matrix \( \mathbf{I}_m \) is given by

\[ \mathbf{Y} \otimes \mathbf{I}_m = \begin{bmatrix} y_{11} \mathbf{I}_m & \cdots & y_{1q} \mathbf{I}_m \\ \vdots & \ddots & \vdots \\ y_{p1} \mathbf{I}_m & \cdots & y_{pq} \mathbf{I}_m \end{bmatrix} \in \mathbb{R}^{mp \times mq}. \]

2.2 UAV model

The UAV model is a quadrotor consisting of a rigid cross frame with four rotors, as depicted in Fig. 1. The equations of motion governing the acceleration of the center of mass for a quadrotor UAV indexed by \( i\) are given as follows [11]

\[ \begin{align} m\ddot{\mathbf{p}}_i(t) =& -mg\mathbf{z}_\mathcal{W} + u_{i1}(t) \mathbf{z}_\mathcal{B}, \end{align} \]

(2)

\[ \begin{align} \dot{\boldsymbol{\omega}}_{\mathcal{BW}} (t)=& \boldsymbol{\mathcal{I}}^{-1} \big\{-\boldsymbol{\omega}_{\mathcal{BW}}(t) \times \boldsymbol{\mathcal{I}}\boldsymbol{\omega}_{\mathcal{BW}}(t) + \begin{bmatrix} u_{i2}(t) & u_{i3}(t) & u_{i4}(t) \end{bmatrix}^\top\big\}, \end{align} \]

(3)

where \( \mathbf{p}_i(t)=[x_i(t),y_i(t),z_i(t)]^\top\in\mathbb{R}^{3}\) is the position vector of the center of mass in the world frame \( \mathcal{W}\) , \( \boldsymbol{\omega}_{\mathcal{BW}}(t)\) is the angular velocity of the body frame \( \mathcal{B}\) relative to the world frame \( \mathcal{W}\) , \( \boldsymbol{\mathcal{I}}\) is the moment of inertia matrix referenced to the center of mass along the body axes \( \mathbf{x}_\mathcal{B}\) ,\( \mathbf{y}_\mathcal{B}\) , \( \mathbf{z}_\mathcal{B}\) , \( m\) is the mass of the quadrotor, and \( g\) is the gravitational acceleration. The control input to the quadrotor is \( \mathbf{u}_i(t)\in\mathbb{R}^{4}\) where \( u_{i1}(t)\) is the net body force and \( u_{i2}(t)\) , \( u_{i3}(t)\) , \( u_{i4}(t)\) are the body moments which can be expressed according to the rotor speeds.

Quadrotor configuration. W) and B) denote the world frame and body frame, respectively. The roll, pitch, and yaw angles are denoted by _i) , _i) , and _i) , respectively.
Figure 1. Quadrotor configuration. \( \mathcal{W}\) and \( \mathcal{B}\) denote the world frame and body frame, respectively. The roll, pitch, and yaw angles are denoted by \( \phi_i\) , \( \theta_i\) , and \( \psi_i\) , respectively.

3 Problem definition

Consider a multi-UAV team consisting of \( N \) quadrotors, indexed by \( \mathcal{V} = \{1, \dots, N\} \). The dynamics of each UAV \( i \in \mathcal{V} \) is governed by (2)–(3). Two time-invariant, connected, directed graphs are defined: the formation graph \( \mathcal{G}_f(\mathcal{V}, \mathcal{E}_f) \) and the communication or sensing graph \( \mathcal{G}_c(\mathcal{V}, \mathcal{E}_c) \). The formation graph encodes the desired spatial relationships between UAVs, whereas the communication graph represents all potential sensing and communication links needed for collision avoidance.

Assumption 1

(Connectivity) The formation graph is directed, with its underlying undirected graph being connected and containing at least one globally reachable node, i.e., a node \(i\) such that there is a directed path from \(i\) to node \(j\) for all \(j\in\mathcal{V}\) with \(j\neq i\) [30].

Assumption 1 implies the existence of a directed spanning tree with a root node \(i\).

Assumption 2

The communication graph is a complete directed graph to account for all inter-UAV collisions (thus, \( \mathcal{G}_f\subseteq\mathcal{G}_c\) ), where every pair of UAVs is connected by directed edges in both directions.

The formation control problem involves steering all UAVs from their initial state to a desired formation. Additionally, collisions between UAVs must be avoided. The formation shape is determined by the offset vectors \( \mathbf{d}_{ij}\in\mathbb{R}^3\) for all UAV pairs \( (i,j)\in\mathcal{E}_f\) . Each UAV \( i\in\mathcal{V}\) has an avoidance region radius \( r_i\) , meaning the minimum safe distance between two UAVs \( i\) and \( j\) is \( r_i + r_j\) . Additionally, let \( R_i\) denote the radius of the collision reaction region of each UAV \( i\in\mathcal{V}\) and \( R_i>r_i\) (see Fig. 2).

Define the UAV team position vector \( \mathbf{p}(t)=[\mathbf{p}_1^\top(t),…,\mathbf{p}_N^\top(t)]^\top\in\mathbb{R}^{3N}\) . The following regions are defined accordingly

  1. safety region: \( \Psi(\mathbf{p})\triangleq\{\mathbf{p}(t)|\mathbf{p}_i(t)\in\mathbb{R}^3, \|\mathbf{p}_i(t)-\mathbf{p}_j(t)\|\geq R_i+R_j,\forall (i,j)\in\mathcal{E}_c\}\) ,
  2. collision reaction region: \( \Gamma(\mathbf{p})\triangleq\{\mathbf{p}(t)|\mathbf{p}_i(t)\in\mathbb{R}^3, r_i+r_j<\|\mathbf{p}_i(t)-\mathbf{p}_j(t)\|< R_i+R_j,\exists (i,j)\in\mathcal{E}_c\}\) , and
  3. collision region: \( \Lambda(\mathbf{p})\triangleq\{\mathbf{p}(t)|\mathbf{p}_i(t)\in\mathbb{R}^3, \|\mathbf{p}_i(t)-\mathbf{p}_j(t)\|< r_i+r_j,\exists (i,j)\in\mathcal{E}_c\}\) ,
  4. singularity: \( \Phi(\mathbf{p})\triangleq\{\mathbf{p}(t)|\mathbf{p}_i(t)\in\mathbb{R}^3, \|\mathbf{p}_i(t)-\mathbf{p}_j(t)\|= r_i+r_j,\exists (i,j)\in\mathcal{E}_c\}\) .
UAV workspace configuration showing the safe distance r_i) and collision reaction distance R_i) .
Figure 2. UAV workspace configuration showing the safe distance \( r_i\) and collision reaction distance \( R_i\) .

We make the following assumption.

Assumption 3

\( \Lambda(\mathbf{p}(0))=\Phi(\mathbf{p}(0))=\Lambda(\mathbf{p}(t_f))=\Phi(\mathbf{p})(t_f)= \emptyset\) .

Assumption 3 ensures no collisions occur at the initial and final UAV positions. This guarantees the feasibility and safety of the formation configuration. By enforcing this condition, collision avoidance is required only during the transition. The multi-UAV formation control problem is defined as follows.

Problem 1

(Multi-UAV Formation Control in State Space with Collision Avoidance) (a) Consider a multi-UAV system on the directed connected graph \( \mathcal{G}_f(\mathcal{V},\mathcal{E}_f)\) with UAV dynamics (2)-(3). The multi-UAV formation control problem is to determine control strategies that drive all UAVs from their initial state to a desired formation specified by \( \mathbf{d}_{ij},\forall(i,j)\in\mathcal{E}_f\) in the state space in a finite time horizon \( t_f>0\) . (b) All UAVs must avoid mutual collisions, i.e., \( \Lambda(\mathbf{p})\) and \( \Phi(\mathbf{p})\) must be \( \emptyset\) for all \( t\in(0,t_f)\) .

The consensus-based formation control objective is to ensure that

\[ \begin{align} \sum_{(i,j)\in\mathcal{E}_f}\big\{\|\mathbf{p}_i(t)&-\mathbf{p}_j(t)-\mathbf{d}_{ij}\|^2+\|\dot{\mathbf{p}}_i(t)-\dot{\mathbf{p}}_j(t)\|^2\big\}\rightarrow 0,~ \text{as}~t\rightarrow t_f, \end{align} \]

(4)

for any initial position \( \mathbf{p}_i(0)\) and velocity \( \dot{\mathbf{p}}_i(0)\) with \( i\in\mathcal{V}\) and while satisfying the collision avoidance constraint

\[ \begin{align} &\|\mathbf{p}_i(t)-\mathbf{p}_j(t)\|>r_i+r_j,~ \forall (i,j)\in\mathcal{E}_c,\forall t\in(0,t_f). \end{align} \]

(5)

The UAV dynamics, formation control objective, and collision avoidance constraint described by (2)-(5) can be recast as a constrained optimal control problem. This allows for investigating optimal formation control strategies for Problem 1. However, finding a solution is problematic due to the nonlinear UAV dynamics. Numerical methods can be sensitive to initialization [23] and unstable, often relegating them to a lower priority [31, 32]. Moreover, the quadrotor system is inherently underactuated, with six degrees of freedom but only four control inputs. This characteristic poses an additional challenge in finding a solution that adheres to the system’s underactuation dynamics.

In the next section, we present a collision-free formation trajectory planning and tracking scheme leveraging the differential flatness property of the UAV dynamics.

4 Collision-Free Trajectory Planning and Tracking

Leveraging the differential flatness property of the UAV dynamics provides a fully actuated system with four flat outputs [11, 33, 14]. By applying a diffeomorphism (see Definition 1 from [34]), Problem 1 is transformed into a flat space where it becomes easier to handle due to the linear UAV dynamics. The simplified problem, denoted as Problem 2, is recast and addressed as optimal control trajectory planning and tracking problems in flat space. The flat space control strategies are then transformed back to the original coordinate system using the inverse of the diffeomorphism. This approach is summarized in the control scheme diagram in Fig. 3.

Outline of the control scheme for generating collision-free motion primitives for UAV formation. The formation trajectory planning signal is optimal. Subsequently, this trajectory is tracked within the safety region using an optimal tracking control law. The tracking law becomes sub-optimal in the collision reaction region, ensuring collision-free motion by sacrificing optimality.
Figure 3. Outline of the control scheme for generating collision-free motion primitives for UAV formation. The formation trajectory planning signal is optimal. Subsequently, this trajectory is tracked within the safety region using an optimal tracking control law. The tracking law becomes sub-optimal in the collision reaction region, ensuring collision-free motion by sacrificing optimality.

4.1 optimal trajectories in flat coordinates

The flat output vector is defined as \( \mathbf{z}_i(t)=[\mathbf{p}_i^\top(t),\psi_i(t)]^\top\in\mathbb{R}^{4}\) where \( \mathbf{p}_i(t)\) is the coordinate of the center of mass of the \( i\) th UAV in the world coordinate system and \( \psi_i(t)\) is its yaw angle. The flat input vector is \( \dddot{\mathbf{z}}_i(t)\in\mathbb{R}^{3}\) . The collective thrust \( u_{i1}(t)\) exerted by the propellers of the \( i\) th quadrotor is directly determined by the flat outputs and their derivatives. Since angular velocity and acceleration are dependent on these outputs and their derivatives, (3) is employed to calculate the inputs \( u_{i2}(t)\) , \( u_{i3}(t)\) , and \( u_{i4}(t)\) (see [11]).

Define \( \boldsymbol{\psi}(t)=[\psi_1(t),…,\psi_N(t)]^\top\in\mathbb{R}^{N}\) . The UAV team flat state and control input vectors are defined as \( \mathbf{r}(t)=[\mathbf{p}^\top(t),\boldsymbol{\psi}^\top(t),\dot{\mathbf{p}}^\top(t),\dot{\boldsymbol{\psi}}^\top(t),\ddot{\mathbf{p}}^\top(t), \ddot{\boldsymbol{\psi}}^\top(t),1]^\top\in\mathbb{R}^{12N+1}\) and \( \mathbf{u}_{\mathbf{r}}(t)=[\dddot{\mathbf{p}}(t),\dddot{\boldsymbol{\psi}}(t)]^\top\in\mathbb{R}^{4N}\) , respectively. The multi-UAV team dynamics in flat space is given by

\[ \begin{equation} \dot{\mathbf{r}}(t)=\mathbf{A}\mathbf{r}(t)+\mathbf{B}\mathbf{u}_{\mathbf{r}}(t), ~ \mathbf{r}(0)=\mathbf{r}_0, \end{equation} \]

(6)

where \( \mathbf{A}=\begin{bmatrix} \mathbf{A}_0 & \mathbf{0}\\ \mathbf{0} & 0 \end{bmatrix}\in\mathbb{R}^{12N+1}\) , \( \mathbf{A}_0=\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}\otimes\mathbf{I}_{4N}\) , \( \mathbf{B}=\begin{bmatrix} \mathbf{B}_0\\ \mathbf{0} \end{bmatrix}\in\mathbb{R}^{(12N+1)\times 4N}\) , and \( \mathbf{B}_0=\begin{bmatrix} 0& 0 & 1 \end{bmatrix}^\top\otimes\mathbf{I}_{4N}\) .

Problem 2

(Multi-UAV Formation Control in Flat Space with Collision Avoidance) (a) Consider a multi-UAV system on the directed connected graph \( \mathcal{G}_f(\mathcal{V},\mathcal{E}_f)\) in flat space with dynamics (6). The multi-UAV formation control problem is to determine control strategies for the UAV team that drive all UAVs from their initial state to a desired formation specified by \( \mathbf{d}_{ij},\forall(i,j)\in\mathcal{E}_f\) in the flat state space in a finite time horizon \( t_f\) . (b) Additionally, \( \Lambda(\mathbf{p})\) and \( \Phi(\mathbf{p})\) must be \( \emptyset\) for all \( t\in(0,t_f)\) .

The formation control objective (4) and collision avoidance constraint (5) in the original coordinates are transferred unchanged to flat coordinates. Problem 2 (a) pertains to an unconstrained control problem. In the following, we will first address Problem 2 (a) for formation trajectory planning and then use the resulting trajectory in a collision-free tracking controller to solve Problem 2 in its entirety (see Fig. 3).

Using the finite-time linear quadratic paradigm, Problem 2 (a) is recast as the following optimal control problem

\[ \begin{align} \min_{\mathbf{u}_{\mathbf{r}}}~C(\mathbf{r}(t_f))+\int_{0}^{t_f}\big(L(\mathbf{r}(t))+\mathbf{u}_{\mathbf{r}}^\top(t) \mathbf{R} \mathbf{u}_{\mathbf{r}}(t)\big)\,dt, \end{align} \]

(7)

subject to state dynamics (6) where

\[ \begin{align*} &C(\mathbf{r}(t_f))=\sum_{(i,j)\in\mathcal{E}_f}\omega_{ij}(\|\mathbf{p}_i(t_f)-\mathbf{p}_j(t_f)-\mathbf{d}_{ij}\|^2+\|\dot{\mathbf{p}}_i(t_f)-\dot{\mathbf{p}}_j(t_f)\|^2),\\ &L(\mathbf{r}(t))=\sum_{(i,j)\in\mathcal{E}_f}\mu_{ij}(\|\mathbf{p}_i(t)-\mathbf{p}_j(t)-\mathbf{d}_{ij}\|^2+\|\dot{\mathbf{p}}_i(t)-\dot{\mathbf{p}}_j(t)\|^2), \end{align*} \]

\( \omega_{ij}>0\) , \( \mathbf{R}=\mathrm{diag}(\gamma_1,…,\gamma_N)\otimes \mathbf{I}_4\) , and \( \gamma_i>0\) . The collective performance index (7) quadratically penalizes the UAV team’s deviations from cooperative formation behavior (i.e., achieving relative distances and velocity consensus) while expending the least control effort.

Using the sum-of-squares property of graph Laplacian matrix in (1), the terminal and running formation error components in (7), respectively, are compacted as [35, 36]

\[ \begin{align*} &C(\mathbf{r}(t_f))=\mathbf{r}^\top(t_f) \mathbf{Q}_f \mathbf{r}(t_f), ~ L(\mathbf{z}(t))=\mathbf{r}^\top(t) \mathbf{Q}\mathbf{r}(t), \end{align*} \]

where \( \mathbf{Q}_f=\begin{bmatrix} \hat{\mathbf{L}}_f& \mathbf{0}_{4N} & \mathbf{0}_{4N} &-\boldsymbol{\Theta}_f\\ \mathbf{0}_{4N} & \hat{\mathbf{L}}_f& \mathbf{0}_{4N} &\mathbf{0}_{4N\times 1} \\ \mathbf{0}_{4N} & \mathbf{0}_{4N}& \mathbf{0}_{4N} &\mathbf{0}_{4N\times 1}\\ -\boldsymbol{\Theta}_f^\top &\mathbf{0}_{1\times 4N}&\mathbf{0}_{1\times 4N}& \boldsymbol{\Upsilon}_f \end{bmatrix}\in\mathbb{R}^{12N+1}\) , \( \mathbf{Q}=\begin{bmatrix} \hat{\mathbf{L}}& \mathbf{0}_{4N} & \mathbf{0}_{4N} &-\boldsymbol{\Theta} \\ \mathbf{0}_{4N} & \hat{\mathbf{L}}& \mathbf{0}_{4N} &\mathbf{0}_{4N\times 1} \\ \mathbf{0}_{4N} & \mathbf{0}_{4N}& \mathbf{0}_{4N} &\mathbf{0}_{4N\times 1}\\ -\boldsymbol{\Theta}^\top &\mathbf{0}_{1\times 4N}&\mathbf{0}_{1\times 4N}& \boldsymbol{\Upsilon} \end{bmatrix}\in\mathbb{R}^{12N+1}\) , \( \hat{\mathbf{L}}_f=\begin{bmatrix} \mathbf{L}_f\otimes\mathbf{I}_3& \mathbf{0}_{3N\times N}\\ \mathbf{0}_{N\times 3N}& \mathbf{0}_{N} \end{bmatrix}\in\mathbb{R}^{4N}\) , \( \hat{\mathbf{L}}=\begin{bmatrix} \mathbf{L}\otimes\mathbf{I}_3& \mathbf{0}_{3N\times N}\\ \mathbf{0}_{N\times 3N}& \mathbf{0}_{N} \end{bmatrix}\in\mathbb{R}^{4N}\) , \( \boldsymbol{\Theta}_f=\begin{bmatrix} (\mathbf{D}\mathbf{W}_f\otimes\mathbf{I}_3)\mathbf{d} \\ \mathbf{0}_{N\times 1} \end{bmatrix}\in\mathbb{R}^{4N\times 1}\) , \( \boldsymbol{\Theta}=\begin{bmatrix} (\mathbf{D}\mathbf{W}\otimes\mathbf{I}_3)\mathbf{d} \\ \mathbf{0}_{N\times 1} \end{bmatrix}\in\mathbb{R}^{4N\times 1}\) , \( \boldsymbol{\Upsilon}_f=\mathbf{d}^\top(\mathbf{W}_f\otimes\mathbf{I}_3)\mathbf{d}\in\mathbb{R}\) , \( \boldsymbol{\Upsilon}=\mathbf{d}^\top(\mathbf{W}\otimes\mathbf{I}_3)\mathbf{d}\in\mathbb{R}\) , \( \mathbf{L}_f=\mathbf{D}\mathbf{W}_f\mathbf{D}^\top\in\mathbb{R}^{N}\) , \( \mathbf{W}_f=\mathrm{diag}(…,\omega_{ij},…) \forall (i,j)\in\mathcal{E}\in \mathbb{R} ^{|\mathcal{E}|}\) , and \( \mathbf{d}=[…,\mathbf{d}_{ij}^\top,…]^\top\in \mathbb{R} ^{3|\mathcal{E}|}\) . The matrices \( \mathbf{Q}_f\) and \( \mathbf{Q}\) are symmetric and PSD (note that \( \mathbf{r}^\top(t_f) \mathbf{Q}_f \mathbf{r}(t_f)\geq 0\) , \( \mathbf{r}^\top(t) \mathbf{Q} \mathbf{r}(t)\geq 0\) ).

The objective functional (7) and the system dynamics (6) are both convex. Thus, Pontryagin’s principle is both a necessary and sufficient condition for optimality. Applying these conditions, the optimal formation trajectory planning signal for any initial state of the UAV team is presented as follows.

Theorem 1

Consider the multi-UAV formation control problem in flat coordinates defined by (6) and (7). The unique globally optimal control input and corresponding optimal formation trajectory for any initial state \( \mathbf{r}_0\) are given by

\[ \begin{align} &\mathbf{u}_{\mathbf{r}}(t)=-\mathbf{R}^{-1}\mathbf{B}^\top\mathbf{G}(t_f-t)\mathbf{H}^{-1}(t_f)\mathbf{r}_0, \end{align} \]

(8)

\[ \begin{align} &\mathbf{r}(t)=\mathbf{H}(t_f-t)\mathbf{H}^{-1}(t_f)\mathbf{r}_0, \end{align} \]

(9)

where

\[ \begin{align} &\mathbf{H}(t)=\begin{bmatrix} \mathbf{I} & \mathbf{0} \end{bmatrix}\exp{(-t\mathbf{M})}\begin{bmatrix} \mathbf{I} \\ \mathbf{Q}_f \end{bmatrix}, \end{align} \]

(10)

\[ \begin{align} &\mathbf{G}(t)=\begin{bmatrix} \mathbf{0} & \mathbf{I} \end{bmatrix}\exp{(-t\mathbf{M})}\begin{bmatrix} \mathbf{I} \\ \mathbf{Q}_f \end{bmatrix}, \end{align} \]

(11)

\[ \begin{align} &\mathbf{M}=\begin{bmatrix} \mathbf{A} & -\mathbf{B}\mathbf{R}^{-1} \mathbf{B}^\top \\ -\mathbf{Q} & -\mathbf{A}^\top \end{bmatrix}. \end{align} \]

(12)

Proof

See Appendix 9.

4.2 Formation trajectory tracking

In the control scheme shown in Fig. 3, the formation trajectory planning signal, as given by (9), is followed by the formation trajectory tracker. Meanwhile, the tracker must ensure that each UAV avoids collisions with its neighboring UAVs along its specified trajectory whenever \( \Gamma(\mathbf{p}) \) becomes non-empty.

4.2.1 safety region control strategy

When \( \Psi(\mathbf{p}) \neq \emptyset \), the UAVs are within the safety region, and as a result, the inter-UAV collision avoidance constraint is inactive. The control strategy for the UAVs to follow the formation trajectory planning signal within the safety region is derived as follows.

Using the finite-time LQR paradigm, formation trajectory tracking within the safety region is recast as the following optimal control problem

\[ \begin{align} &\min_{\mathbf{u}_{\mathbf{z}}}~(\mathbf{z}(t_f)-\mathbf{r}(t_f))^\top \mathbf{K}_f (\mathbf{z}(t_f)-\mathbf{r}(t_f))+\nonumber \end{align} \]

(13)

\[ \begin{align} &\int_{0}^{t_f}\Big\{(\mathbf{z}(t)-\mathbf{r}(t))^\top \mathbf{K} (\mathbf{z}(t)-\mathbf{r}(t))+\mathbf{u}_{\mathbf{z}}^\top(t) \mathbf{R}_{\mathbf{z}}\mathbf{u}_{\mathbf{z}}(t)\Big\}\,dt, \\\end{align} \]

(14)

subject to

\[ \begin{equation} \dot{\mathbf{z}}(t)=\mathbf{A}\mathbf{z}(t)+\mathbf{B}\mathbf{u}_{\mathbf{z}}(t), ~ \mathbf{z}(0)=\mathbf{r}_0, \end{equation} \]

(15)

where \( \mathbf{K}_f=\mathrm{diag}(\mathbf{K}_{0f},0)\) , \( \mathbf{K}_{0f}=\mathbf{I}_{3}\otimes\mathrm{diag}(\delta_1,…,\delta_N)\otimes \mathbf{I}_4\) , \( \mathbf{K}=\mathrm{diag}(\mathbf{K}_0,0)\) , \( \mathbf{K}_0=\mathbf{I}_{3}\otimes\mathrm{diag}(\zeta_1,…,\zeta_N)\otimes \mathbf{I}_4\) , \( \mathbf{R}_{\mathbf{z}}=\mathrm{diag}(\eta_1,…,\eta_N)\otimes\mathbf{I}_{4}\) , \( \delta_i>0\) , \( \zeta_i>0\) , and \( \eta_i>0\) .

The optimal feedback control for (13) and (15) is given by

\[ \begin{equation} \mathbf{u}_{\mathbf{z}}(t)=-\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top\mathbf{P}(t)(\mathbf{z}(t)-\mathbf{r}(t)), \end{equation} \]

(16)

where \( \mathbf{P}(t)\) is the solution to the Riccati differential equation

\[ \begin{align} &\dot{\mathbf{P}}(t)+\mathbf{P}(t)\mathbf{A}-\mathbf{P}(t)\mathbf{B}\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top\mathbf{P}(t)+\mathbf{K}+\mathbf{A}^\top\mathbf{P}(t)=\mathbf{0},~\mathbf{P}(t_f)=\mathbf{K}_f. \end{align} \]

(17)

The pair \( (\mathbf{A},\mathbf{B})\) is controllable, \( \mathbf{R}_{\mathbf{z}}\succ 0\) , and \( \mathbf{K}\succ 0\) , thus, there exists a unique solution \( \mathbf{P}(t)=\mathbf{P}^\top(t)\succeq 0\) to (17).

The optimal state feedback is given by

\[ \begin{equation} \dot{\mathbf{z}}(t)=\left(\mathbf{A}-\mathbf{B}\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top\mathbf{P}(t)\right) (\mathbf{z}(t)-\mathbf{r}(t)). \end{equation} \]

(18)

The minimal cost attained at any \( t\in[0,t_f]\) and state \( \mathbf{z}(t)\) is yielded by the value function

\[ \begin{equation} n(\mathbf{z})=(\mathbf{z}(t)-\mathbf{r}(t))^\top \mathbf{P}(t) (\mathbf{z}(t)-\mathbf{r}(t)). \end{equation} \]

(19)

The Hamilton–Jacobi–Bellman (HJB) equation is given by

\[ \begin{align} \frac{d n}{d t}+&\min_{\mathbf{u}_{\mathbf{z}}}\big\{(\mathbf{z}(t)-\mathbf{r}(t))^\top \mathbf{K} (\mathbf{z}(t)-\mathbf{r}(t))+\mathbf{u}_{\mathbf{z}}^\top(t) \mathbf{R}_{\mathbf{z}} \mathbf{u}_{\mathbf{z}}(t)+\frac{\partial n}{\partial \mathbf{z}}\big(\mathbf{A}\mathbf{z}(t)+\mathbf{B}\mathbf{u}_{\mathbf{z}}(t) \big) \big\}=0. \end{align} \]

(20)

The optimal input (16), equivalently is derived from the HJB equation (20) as

\[ \begin{align*} \mathbf{u}_{\mathbf{z}}(t)=-\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top\frac{\partial n}{\partial \mathbf{z}}. \end{align*} \]

4.2.2 collision avoidance control strategy

When \( \Gamma(\mathbf{p})\neq \emptyset\) , the formation trajectory tracker applies the collision avoidance control strategy derived below for the collision reaction region. First, we define a collision penalty function [37, 8] as

\[ \begin{align} v_{ij}=&\big(\min\big\{0,\frac{\|\mathbf{p}_j(t)-\mathbf{p}_i(t)\|^2-(R_i+R_j)^2}{\|\mathbf{p}_j(t)-\mathbf{p}_i(t)\|^2-(r_i+r_j)^2}\big\}\big)^2=\big(\min\big\{0,\frac{\|\mathbf{e}_{ij}^\top\mathbf{z}(t)\|^2-(R_i+R_j)^2}{\|\mathbf{e}_{ij}^\top\mathbf{z}(t)\|^2-(r_i+r_j)^2}\big\}\big)^2, \end{align} \]

(21)

for all \( (i,j)\in\mathcal{E}_c\) where \( \mathbf{e}_{ij}=[-\mathbf{D}_{ij}\otimes\mathbf{I}_3,\mathbf{0}]\in\mathbb{R}^{(12N+1)\times 3}\) and \( \mathbf{D}_{ij}\) is the column of \( \mathbf{D}\) corresponding to the edge \( (i,j)\) . Note that \( v_{ij}\) is nonzero only within the collision reaction region and is either zero or undefined outside of this region.

The partial derivative of \( v_{ij}\) with respect to \( \mathbf{z}_i\) is given by

\[ \frac{\partial v_{ij}}{\partial \mathbf{z}_i}=\left \{ \begin{aligned} &\mathbf{0}, && \text{if}~\Psi(\mathbf{p})\neq \emptyset \\ &\mathbf{z}^\top(t)\mathbf{e}_{ij}\phi_{ij},&& \text{if}~\Gamma(\mathbf{p})\neq \emptyset \\ &\mathbf{0}, && \text{if}~\Lambda(\mathbf{p})\neq \emptyset\\ &\text{not defined}, && \text{if}~\Phi(\mathbf{p})\neq \emptyset \end{aligned} \right. \]

where

\[ \begin{align*} \phi_{ij}=&4((R_i+R_j)^2-(r_i+r_j)^2)\frac{\|\mathbf{e}_{ij}^\top\mathbf{z}(t)\|^2-(R_i+R_j)^2}{(\|\mathbf{e}_{ij}^\top\mathbf{z}(t)\|^2-(r_i+r_j)^2)^3}. \end{align*} \]

Moreover, the time derivative of \( v_{ij}\) at the collision reaction region (i.e., at \( \Gamma(\mathbf{p})\neq\emptyset\) ) is

\[ \begin{equation} \frac{d v_{ij}}{dt}=\frac{\partial v_{ij}}{\partial \mathbf{z}_i}\mathbf{e}_{ij}^\top\dot{\mathbf{z}}(t). \end{equation} \]

(22)

Let \( V_i=\sum_{j\in\mathcal{N}_i}v_{ij}\) where \( \frac{\partial V_i}{\partial \mathbf{z}_i}=\sum_{j\in\mathcal{N}_i}\mathbf{z}^\top(t)\mathbf{e}_{ij}\phi_{ij}\) for all \( i\in\mathcal{V}\) . Define \( \frac{\partial V}{\partial \mathbf{z}}=[\mathbf{0}_{8N},\frac{\partial V_1}{\partial \mathbf{z}_1},…,\\ \frac{\partial V_N}{\partial \mathbf{z}_N},\mathbf{0}_{N+1}]\in\mathbb{R}^{12N+1}\) . Now, augment the value function of the HJB (20) with the collision avoidance penalty \(V\)

\[ \begin{align} &\big(\frac{d n}{d t}+\frac{d V}{d t}\big)+\min_{\mathbf{u}_{\mathbf{z}}}\big\{(\mathbf{z}(t)-\mathbf{r}(t))^\top \mathbf{K} (\mathbf{z}(t)-\mathbf{r}(t))+\mathbf{u}_{\mathbf{z}}^\top(t) \mathbf{R}_{\mathbf{z}} \mathbf{u}_{\mathbf{z}}(t)+\nonumber \end{align} \]

(23)

\[ \begin{align} &\big(\frac{\partial n}{\partial \mathbf{z}}+\frac{\partial V}{\partial \mathbf{z}}\big)\big(\mathbf{A}\mathbf{z}(t)+\mathbf{B}\mathbf{u}_{\mathbf{z}}(t)\big)\big\}=0. \\\end{align} \]

(24)

Note that the HJB in (23) is constructed for the tracking optimal control problem with the performance index (13), augmented with a penalty function to indirectly designate the collision region instead of the inequality constraint, ensuring safety [25].

The collision avoidance control strategy is derived by minimizing the HJB equation in (23), leading to the following feedback control law

\[ \begin{align} \mathbf{u}_{\mathbf{z}}(t) = -\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top( \mathbf{P}(t) \mathbf{z}(t) + \frac{\partial V}{\partial \mathbf{z}}^\top ). \end{align} \]

(25)

The corresponding state dynamics is given by

\[ \begin{equation} \dot{\mathbf{z}}(t) = ( \mathbf{A} - \mathbf{B}\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top \mathbf{P}(t) ) \mathbf{z}(t) - \mathbf{B}\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top \frac{\partial V}{\partial \mathbf{z}}^\top. \end{equation} \]

(26)

It is important to note that the collision avoidance control law within the safety region (i.e., when UAVs are not within the collision reaction zone) coincides with the optimal control law for the system.

To demonstrate that (25) provides a collision-free control strategy, we prove that (23) is non-increasing in the region \(\Gamma(\mathbf{p})\), following the approach in [37].

We define \(\hat{V} = \sum_{i=1}^N V_i\), where \(V_i\) represents the sum of collision penalty functions for UAV \(i\) regarding its neighbors. The following lemma provides the basis for showing that UAVs maintain a collision-free formation.

Lemma 1

If the initial conditions of the UAVs satisfy

\[ \begin{equation} ||\mathbf{p}_i(0) - \mathbf{p}_j(0)|| \geq r_i + r_j + \epsilon, \end{equation} \]

(27)

for all \(i,j \in \mathcal{V}\), \(i \neq j\), and some \(\epsilon > 0\), then the UAVs can achieve \(\frac{d\hat{V}}{dt} \geq 0\) using bounded control effort, thereby avoiding collisions.

Proof

Let \( \frac{d\hat{V}}{dt}< 0\) , since any other condition trivially satisfies Lemma 1. By the fundamental theorem of calculus,

\[ \begin{equation} \hat{V}(t) = \hat{V}(0) + \int_0^{t} \frac{d\hat{V}}{d\tau}\, d\tau. \end{equation} \]

(28)

The control input \( \mathbf{u}_{\mathbf{z}}\) appears in \( \frac{d^{n+1} \hat{V}}{dt^{n+1}}\) (i.e., \( n=2\) here). We can obtain \( \hat{V}(t)\) and \( \frac{d\hat{V}}{dt}\) as a function of \( \mathbf{u}_z\) ,

\[ \begin{align} \hat{V}(t) =& \hat{V}(0) +\sum_{i=1}^n\left. \frac{d^n\hat{V}}{dt^n} \right|_{t=0}\frac{t^n}{n!}+ \underbrace{\int_0^t \dots \int_0^t}_{n+1} \frac{d^{n+1}\hat{V}}{d\tau^{n+1}} \underbrace{d\tau \dots d\tau}_{n+1}, \end{align} \]

(29)

\[ \begin{align} \frac{d\hat{V}}{dt} =& \sum_{i=1}^n\left. \frac{d^n\hat{V}}{dt^n} \right|_{t=0}\frac{t^{n-1}}{(n-1)!}+\underbrace{\int_0^t \dots \int_0^t}_{n} \frac{d^{n+1}\hat{V}}{d\tau^{n+1}} \underbrace{d\tau \dots d\tau}_{n}, \end{align} \]

(30)

where \( \left. \frac{d^n\hat{V}}{dt^n} \right|_{t=0}\) is the initial condition of the \( n^{th}\) derivative of \( \hat{V}(t)\) .

To guarantee collision avoidance while \( \frac{d\hat{V}}{dt}\to 0\) at time \( t\) , we require

\[ \begin{align} \hat{V}(t) &\leq \hat{V}(0) + \delta, \end{align} \]

(31)

\[ \begin{align} \frac{d\hat{V}}{dt} &\leq 0, \end{align} \]

(32)

where \( \delta>0\) and (31) implies collision avoidance, as \( \frac{d\hat{V}}{dt} > 0\) implies that \( \hat{V}(t) = \max_{\tau\in[0,t]} \hat{V}(\tau)\) . Substituting (29) and (30) into (31) and (32), respectively, and rearranging their terms yield

\[ \begin{align} &\underbrace{\int_0^t \dots \int_0^t}_{n+1} \frac{d^{n+1}\hat{V}}{d\tau^{n+1}} \underbrace{d\tau \dots d\tau}_{n+1} \leq -\big(\delta+\sum_{i=1}^n\left. \frac{d^n\hat{V}}{dt^n} \right|_{t=0}\frac{t^n}{n!}\big), \end{align} \]

(33)

\[ \begin{align} &\underbrace{\int_0^t\dots \int_0^t}_{n}\frac{d^{n+1}\hat{V}}{d\tau^{n+1}}\underbrace{d\tau\dots d\tau}_{n} \leq -\sum_{i=1}^n\left. \frac{d^n\hat{V}}{dt^n} \right|_{t=0}\frac{t^{n-1}}{(n-1)!} . \end{align} \]

(34)

Both (33) and (34) are finite lower bounds on the function \( \frac{d^{n+1}\hat{V}}{dt^{n+1}}\) . Therefore, we can always construct a finite function \( \frac{d^{n+1}\hat{V}}{dt^{n+1}}\) to satisfy (33) and (34) for an arbitrarily small \( t > 0\) . This implies that (31) and (32) are satisfied for any \( \delta > 0\) . Finally, by continuity in \( \frac{d\hat{V}}{dt}\) , we can select \( \delta\) sufficiently small to ensure (31) and (32) for any \( \epsilon > 0\) . This concludes the proof.

Theorem 2

Given the conditions of Lemma 1, any trajectory that results in a collision is sub-optimal.

Proof

We use a proof by contradiction. Assume that a trajectory where two UAVs collide is optimal, resulting in a cost of \( J^* = \infty\) by definition. By Lemma 1, there exists a bounded control input \( \mathbf{u}_{\mathbf{z}}^\dagger\) such that \( \left. \frac{d\hat{V}}{dt} \right|_{t=t^\dagger}=0\) in some finite time \( t^\dagger<t_f\) without collisions. The corresponding cost is

\[ \begin{align} J^\dagger &= \int_0^{t^\dagger}\Big\{(\mathbf{z}(t)-\mathbf{r}(t))^\top \mathbf{K} (\mathbf{z}(t)-\mathbf{r}(t))+(\mathbf{u}_{\mathbf{z}}^\dagger(t))^\top \mathbf{R}_{\mathbf{z}}\mathbf{u}_{\mathbf{z}}^\dagger(t)+ \frac{d\hat{V}}{dt}\Big\}\, dt \notag \end{align} \]

(35)

\[ \begin{align} &+ \int_{t^\dagger}^{t_f}\Big\{(\mathbf{z}(t)-\mathbf{r}(t))^\top \mathbf{K} (\mathbf{z}(t)-\mathbf{r}(t))+(\mathbf{u}_{\mathbf{z}}^\dagger(t)) ^\top\mathbf{R}_{\mathbf{z}}\mathbf{u}_{\mathbf{z}}^\dagger(t)\Big\}\, dt, \\\end{align} \]

(36)

where \( V\) remains stationary for \( t \geq t^\dagger\) . Every term in (35) is finite. Therefore, \( J^\dagger < J^*\) and \( J^*\) is not optimal. This contradiction implies that our original assumption is incorrect, proving Theorem 2.

5 Directionally Aware Collision Avoidance

In this section, we introduce two collision avoidance strategies that incorporate directional awareness by integrating the UAVs’ velocities into the collision penalty function in Subsection 4.2.2. Recognizing that combining these strategies enhances the effectiveness of collision avoidance, we unify them into a single approach to provide UAVs with a more robust, directionally aware collision avoidance capability.

5.1 Forward-path aware collision avoidance

To derive a collision avoidance control strategy that is directionally aware, we adjust the penalty function to prioritize avoiding collisions in the forward path of each UAV. We incorporate a forward-path aware weighting factor \(\alpha_{ij} \) into the collision penalty function (21) as

\[ v^F_{ij} = \alpha_{ij}v_{ij}, \]

where

\[ \begin{equation} \alpha_{ij} = \left\{ \begin{array}{ll} \max\{0, \cos(\eta_{ij})\}, & \text{if } \|\mathbf{v}_i(t)\| \neq 0, \\ 0, & \text{if } \|\mathbf{v}_i(t)\| = 0, \end{array}\right. \end{equation} \]

(37)

with

\[ \cos(\eta_{ij}) = \frac{(\mathbf{p}_j(t) - \mathbf{p}_i(t))^\top \mathbf{v}_i(t)}{\|\mathbf{p}_j(t) - \mathbf{p}_i(t)\|\|\mathbf{v}_i(t)\|}. \]

The forward-path aware penalty function \(v^F_{ij}\) accounts for whether UAV \( j \) lies within the forward path of UAV \( i \), and it is zero otherwise. Specifically, when UAV \(j\) is in the forward path, the directionally aware weighting factor \(\alpha_{ij} \) is given by the cosine of the angle between the relative position vector and the velocity vector, as depicted in Fig. 4. The cosine is obtained using the dot product as shown in (37). If UAV \(j\) is positioned behind UAV \(i\), the weighting factor \(\alpha_{ij} \) becomes zero.

The forward-path aware collision avoidance prioritizes avoiding collisions in the forward path. The velocity vector v_i) indicates the forward motion path of UAV i). Within the collision reaction region, when UAV j) is located in the forward path of UAV i), the directionally aware weighting factor _{ij} ) for UAV i) with respect to UAV j) is defined as the cosine of the angle between the UAV i)&amp;rsquo;s velocity vector v_i) and their relative distance vector p_j-p_i). The value of _{ij}) decreases as the UAV j) moves further to the sides. By definition, _{ik}) is zero when UAV k) is located behind UAV i).
Figure 4. The forward-path aware collision avoidance prioritizes avoiding collisions in the forward path. The velocity vector \(\mathbf{v}_i\) indicates the forward motion path of UAV \(i\). Within the collision reaction region, when UAV \(j\) is located in the forward path of UAV \(i\), the directionally aware weighting factor \(\alpha_{ij} \) for UAV \(i\) with respect to UAV \(j\) is defined as the cosine of the angle between the UAV \(i\)’s velocity vector \(\mathbf{v}_i\) and their relative distance vector \(\mathbf{p}_j-\mathbf{p}_i\). The value of \(\alpha_{ij}\) decreases as the UAV \(j\) moves further to the sides. By definition, \(\alpha_{ik}\) is zero when UAV \(k\) is located behind UAV \(i\).

5.2 Approach-aware collision avoidance

In approach-aware collision avoidance, the collision penalty function is adjusted to prevent a collision when two UAVs are on an approach trajectory. This approach leverages the relative velocities to assess whether the UAVs are approaching each other. The approach-aware weighting factor \(\beta_{ij} \) is incorporated into (21) as

\[ v^A_{ij} = \beta_{ij}v_{ij}, \]

where

\[ \beta_{ij} = \left\{ \begin{array}{ll} \max\left\{ 0, -\cos(\zeta_{ij}) \right\}, & \text{if } \|\mathbf{v}_j(t) - \mathbf{v}_i(t)\| \neq 0, \\ 0, & \text{if } \|\mathbf{v}_j(t) - \mathbf{v}_i(t)\| = 0, \end{array}\right. \]

with

\[ \cos(\zeta_{ij}) = \frac{(\mathbf{p}_j(t) - \mathbf{p}_i(t))^\top (\mathbf{v}_j(t) - \mathbf{v}_i(t))}{\|\mathbf{p}_j(t) - \mathbf{p}_i(t)\| \|\mathbf{v}_j(t) - \mathbf{v}_i(t)\|}. \]

Here, \(\cos(\zeta_{ij})\) captures the alignment between the relative position \(\mathbf{p}_j(t) - \mathbf{p}_i(t)\) and velocity \(\mathbf{v}_j(t) - \mathbf{v}_i(t)\). When \(\cos(\zeta_{ij}) < 0\), the UAVs are approaching, posing a collision risk. If their velocities are identical, i.e., \(\|\mathbf{v}_j(t) - \mathbf{v}_i(t)\| = 0\), they remain at a constant separation. A higher \(\beta_{ij}\) (near 1) indicates a direct, high-speed approach, while a lower \(\beta_{ij}\) (closer to 0) suggests an oblique approach with reduced collision risk. When \(\beta_{ij} = 0\), the UAVs are either parallel, diverging, or stationary relative to each other.

5.3 Directionally aware unified collision avoidance

We combine the collision avoidance strategies from Subsections 5.1 and 5.2 into a unified approach that accounts for both UAV \(j\) being within UAV \(i\)’s forward path and the relative approach. This is achieved by defining the weighting factor \(\xi_{ij}\) as

\[ \xi_{ij} = \alpha_{ij} \beta_{ij}, \]

or

\[ \xi_{ij} = \left\{ \begin{array}{ll} \max\{0, \cos(\eta_{ij})\} \max\{0, -\cos(\zeta_{ij})\}, & \text{if } \|\mathbf{v}_i(t)\| \neq 0 \text{ and } \|\mathbf{v}_j(t) - \mathbf{v}_i(t)\| \neq 0, \\ 0, & \text{otherwise}. \end{array}\right. \]

Here, \(\alpha_{ij}\) ensures that UAV \(j\) is in the forward path of UAV \(i\). If UAV \(j\) is outside the forward path, \(\alpha_{ij} = 0\), making \(\xi_{ij} = 0\). \(\beta_{ij}\) ensures that UAV \(j\) is approaching UAV \(i\). If UAV \(j\) is moving away or stationary relative to UAV \(i\), \(\beta_{ij} = 0\), making \(\xi_{ij} = 0\). Thus, \(\xi_{ij}\) is nonzero only when UAV \(j\) lies in the forward path of UAV \(i\) (\(\alpha_{ij} > 0\)) and UAV \(j\) is approaching UAV \(i\) (\(\beta_{ij} > 0\)) and is zero if either condition is not met.

To demonstrate the improved effectiveness of the unified collision avoidance approach, we analyze two scenarios with two UAVs having evolving positions and velocities. In the first scenario, \(\mathbf{p}_1(0) = [1, 1,1]^\top\) and \(\mathbf{p}_2(0) = [7, 0,3]^\top\); in the second, \(\mathbf{p}_2(0)\) is changed to \([7, 3,1]^\top\). Both scenarios feature constant velocities \(\mathbf{v}_1(0) = [2, 1,0.5]^\top\) and \(\mathbf{v}_2(0) = [-2, 1,-0.5]^\top\). The evolution of relative vectors, \(\alpha_{12}\), \(\beta_{12}\), and \(\xi_{12}\) at some time steps are shown in Fig. 5. Figs. 5(6)–(9) represent the first scenario, while Figs. 5(10)–(13) depict the second. Figs. 5(9) and 5(13) compare \(\alpha_{12}\) and \(\beta_{12}\) over 20 time steps. It is evident that \(\xi_{12}\) consistently captures collision risks more effectively and shows to be more reliable overall.

Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13
Figure 5. Evolution of relative vectors, \(\alpha_{12}\), \(\beta_{12}\), and \(\xi_{12}\) for two UAVs in two scenarios. Scenario 1 is depicted in subfigures (a) to (d), and Scenario 2 in subfigures (e) to (h).

We then incorporate the directionally aware penalty function \(v^U_{ij}\) defined as

\[ v^U_{ij} = \xi_{ij}v_{ij}, \]

into the HJB equation, replacing \(v_{ij}\) with \(v^U_{ij}\) in the minimization to account for the new penalty function. In the feedback control law \( \mathbf{u}_{\mathbf{z}}(t)\) in (25) now \(\frac{\partial V}{\partial \mathbf{z}}\) accounts for the directionally aware collision avoidance strategy. This approach provides a collision avoidance control that emphasizes avoiding collisions in the forward path as well as along an approach trajectory, promoting adaptive behavior, and enhancing collision-free formation tracking efficiency.

Remark 1

The weighting factors \( \alpha_{ij},\beta_{ij},\xi_{ij} \) are introduced to dynamically adjust the sensitivity of the penalty function \( v_{ij} \) based on the relative motion of the agents. In the context of directionally aware collision avoidance, \( v_{ij} \) quantifies the collision risk between agents based on their proximity, and \( \alpha_{ij},\beta_{ij},\xi_{ij} \) modulate the impact of \( v_{ij} \) by accounting for the agents’ forward paths and relative approach. This distinction ensures that the weighting factors do not alter the fundamental role of \( v_{ij} \) in the collision avoidance mechanism. Rather, they dynamically scale its effect based on directional considerations.

Assumption 4

(Constancy of weighting factors in derivatives of the penalty function) Given the distinct roles of the weighting factors \( \alpha_{ij}, \beta_{ij}, \xi_{ij} \) in dynamically adjusting the sensitivity of the penalty function \( v_{ij} \), as stated in Remark 1, we assume that these factors are treated as constants when calculating the time and partial derivatives of \( v_{ij} \).

From their definitions, the boundedness of the weighting factors is evident (\( \alpha_{ij}, \beta_{ij}, \xi_{ij} \in [0, 1] \)). In Subsection 4.2.2, we demonstrate that the HJB equation, constructed using the time and partial derivatives of \( v_{ij} \), remains non-increasing within the collision reaction region \( \Gamma(\mathbf{p}) \). The inclusion of the bounded weighting factors as coefficients to \( v_{ij} \), in accordance with Remark 1, does not alter the directional behavior of the HJB equation nor increase its magnitude in \( \Gamma(\mathbf{p}) \). This is because the weighting factors do not amplify the penalty function beyond its original bounds, preventing any unbounded growth that could destabilize the HJB equation. Therefore, under Assumption 4, the conditions outlined in Subsection 4.2.2 are preserved.

Remark 2

Assumption 4 simplifies the computation of the penalty function derivatives and improves computational efficiency, robustness, and smoothness of control responses. However, as demonstrated in Appendix 10, the control strategy in (25), utilizing the forward-path-aware penalty function \( v_{ij}^F \), ensures bounded control inputs even without Assumption 4, thereby guaranteeing stability. Analogous stability conditions can be derived for the approach-aware penalty function \( v_{ij}^A \) and the unified penalty function \( v_{ij}^U \), reinforcing the robustness of the directionally aware collision avoidance framework.

6 Simulation Results

In this section, we first consider a four-UAV (\( N=4\) ) (re)formation problem to demonstrate the efficacy of the presented control scheme outlined in Fig. 3. In the initial configuration, the UAVs are positioned in a square formation in the \( x-y\) plane from the top, as illustrated in Fig. 14(15). The formation shape is determined by the offset vectors \( \mathbf{d}_{12}=[-4,-4,0]^\top\, \mathrm{m}\) and \( \mathbf{d}_{13}=\mathbf{d}_{24}=[4,-4,0]^\top\, \mathrm{m}\) , which characterize a diamond alignment in the \( x-y\) plane, as depicted in Fig. 14(16).

Communication graph
Figure 15. Communication graph
Desired formation graph
Figure 16. Desired formation graph
Figure 14. The communication graph \(\mathcal{G}_c(\mathcal{V},\mathcal{E}_c)\) and the desired formation graph \(\mathcal{G}_f(\mathcal{V},\mathcal{E}_f)\) for a team of four UAVs. The graphs illustrate the spatial arrangement of the UAVs in the \(x\)–\(y\) plane, corresponding to their initial and desired formations, respectively.

The initial states of the UAVs in the flat coordinates are \( \mathbf{p}_1(0)=[0,0,5]^\top\, \mathrm{m}\) , \( \mathbf{p}_2(0)=[5,0,0]^\top\, \mathrm{m}\) , \( \mathbf{p}_3(0)=[0,5,0]^\top\, \mathrm{m}\) , \( \mathbf{p}_4(0)=[5,5,0]^\top\, \mathrm{m}\) , \( \dot{\mathbf{p}}_i(0)=[0,0,0]^\top\, \mathrm{m/s}\) , \( \ddot{\mathbf{p}}_i(0)=[0,0,0]^\top\, \mathrm{m/s^2}\) , and \( \psi_i(0)=0\, \mathrm{rad}\) , \( \dot{\psi}_i(0)=0\, \mathrm{rad/s}\) , \( \ddot{\psi}_i(0)=0\, \mathrm{rad/s^2}\) for \( i\in\{1,2,3,4\}\) . The weighting parameters of the performance index (7) are as \( \mu_{12}=0.9\) , \( \mu_{13}=0.7\) , \( \mu_{14}=0.5\) , \( \mu_{23}=0.6\) , \( \mu_{24}=0.8\) , \( \mu_{34}=0.4\) , \( \omega_{ij}=1\) for all \( (i,j)\in\mathcal{E}_f\) , and \( \gamma_i=1\) for all \( i\in\{1,2,3,4\}\) . The horizon length is \( t_f=10\, \mathrm{s}\) . For each UAV, its mass, distance from the rotor to the center of mass, gravity acceleration, and moment of inertia are respectively set as \(m=1.0\, \mathrm{kg}\), \(l=0.2\, \mathrm{m}\), \(g=9.81\, \mathrm{m/s^2}\), and \(\mathcal{I}_{xx}=\mathcal{I}_{xx}=\mathcal{I}_{xx}= 0.016\, \mathrm{kg.m^2}\).

Figs. 1720 show the optimal formation trajectories (from Theorem 1) and their time histories. Without collision avoidance, these trajectories demonstrate optimal control strategies for achieving the diamond formation within \( t_f=10\, \mathrm{s}\) . The decreasing control inputs \(\mathbf{u}_{\mathbf{r}}\) over time, as shown in the time histories in Figs. 20, reflect the achievement of the optimal performance index, which includes a penalty on control effort in addition to the formation requirements.

Optimal formation trajectories
Figure 18. Optimal formation trajectories
Top view of the trajectories
Figure 19. Top view of the trajectories
Figure 17. Optimal formation trajectories of UAVs. Initially, the UAVs are placed such that from the top, they are seen in a square formation and then transition into a diamond formation in the \( x-y\) plane.
Figure 22
Figure 23
Figure 24
Figure 25
Figure 26
Figure 27
Figure 21. Positions and velocities
Figure 29
Figure 30
Figure 31
Figure 32
Figure 33
Figure 34
Figure 28. Accelerations and \(\mathbf{u}_{\mathbf{r}}\)
Figure 36
Figure 37
Figure 38
Figure 39
Figure 40
Figure 41
Figure 35. Attitudes with rates
Euclidean distances
Figure 42. Euclidean distances
Figure 20. Time histories of positions, velocities, accelerations, jerks (i.e., control inputs \(\mathbf{u}_{\mathbf{r}}\)), attitudes with rates, and Euclidean distances for all UAV pairs \( (i,j) \in \mathcal{E}_f\) . The blue dashed line represents the distance before entering the collision reaction region (i.e., \( R_i+R_j\) ), and the red dashed line represents the distance before entering the collision region (i.e., \( r_i+r_j\) ).

The formation trajectory tracking involves tracking the optimal formation trajectory planning signal while considering inter-UAV collisions, as outlined in Fig. 3. The avoidance region radius for all UAVs is defined as the same value \( r_i=1.5\, \mathrm{m}\) for all \( i={1,…,4}\) . This implies that a collision occurs for every UAV pair \( (i,j)\) if they approach each other closer than \( 3\, \mathrm{m}\) . We choose the radius of the collision reaction region \( R_i=2r_i\) for all \( i={1,…,4}\) . The parameters of the tracking performance index (13) are set as \( \zeta_{ij}=10\mu_{ij}\) , \( \delta_{ij}=10\omega_{ij}\) , and \( \eta_i=\gamma_i\) for all \( i,j\in\{1,2,3,4\}\) and \( (i,j)\) in the communication graph. Utilizing the collision penalty function from Subsection 4.2.2, the formation tracking trajectories and their corresponding time histories are shown in Figs. 43-46, respectively. It is apparent from both figures that the UAV team tracks the optimal trajectories and successfully achieves the desired formation at the terminal time. The relative distances between any UAV pair never decrease below \( r_i+r_j\) , even though some UAV pairs initially have a relative distance between \( r_i+r_j\) and \( R_i+R_j\) . Despite the collision avoidance control strategy being applied from the outset, it separates the UAV pairs before they reach the collision region.

Formation tracking trajectories
Figure 44. Formation tracking trajectories
Top view of the tracking trajectories
Figure 45. Top view of the tracking trajectories
Figure 43. Formation trajectory tracking, which involves following the optimal formation trajectory while accounting for inter-UAV collisions.
Figure 48
Figure 49
Figure 50
Figure 51
Figure 52
Figure 53
Figure 47. Positions and velocities
Figure 55
Figure 56
Figure 57
Figure 58
Figure 59
Figure 60
Figure 54. Accelerations and \(\mathbf{u}_{\mathbf{z}}\)
Figure 62
Figure 63
Figure 64
Figure 65
Figure 66
Figure 67
Figure 61. Attitudes with rates
Euclidean distances
Figure 68. Euclidean distances
Figure 46. Time histories of positions, velocities, accelerations, jerks (\(\mathbf{u}_{\mathbf{z}}\)), attitudes with rates, and Euclidean distances for formation tracking trajectories.

The formation trajectories and their corresponding time histories using the directionally aware collision avoidance tracking strategy are shown in Fig. 69-72, respectively. Compared to the non-directionally aware approach (Figs. 4346), the directionally aware strategy minimizes deviations from optimal trajectories while avoiding collisions. The results highlight the effectiveness of the proposed formation trajectory planning and tracking, demonstrating successful collision-free multi-UAV formation control. Notably, UAVs employing the directionally aware collision avoidance strategy require minimal deviation from their optimal paths to ensure collision avoidance. Thus, the directionally aware collision avoidance approach proves more efficient by adaptively prioritizing collision avoidance in both the forward path and the relative approach for each UAV.

Directionally aware formation tracking trajectories
Figure 70. Directionally aware formation tracking trajectories
Top view of the tracking trajectories
Figure 71. Top view of the tracking trajectories
Figure 69. Formation trajectory tracking utilizing the directionally aware collision avoidance strategy.
Figure 74
Figure 75
Figure 76
Figure 77
Figure 78
Figure 79
Figure 73. Positions and velocities
Figure 81
Figure 82
Figure 83
Figure 84
Figure 85
Figure 86
Figure 80. Accelerations and \(\mathbf{u}_{\mathbf{z}}\)
Figure 88
Figure 89
Figure 90
Figure 91
Figure 92
Figure 93
Figure 87. Attitudes with rates
Euclidean distances
Figure 94. Euclidean distances
Figure 72. Time histories of positions, velocities, accelerations, jerks (\(\mathbf{u}_{\mathbf{z}}\)), attitudes with rates, and Euclidean distances for formation tracking trajectories utilizing the directionally aware collision avoidance strategy.

To further assess the performance of the proposed UAV formation control scheme, we extend our analysis to a larger group of seven UAVs (\(N=7\)). Initially, the UAVs are arranged in a cubic configuration, as shown in Fig. 95(96). The desired formation shape is defined by the offset vectors \(\mathbf{d}_{12}=\mathbf{d}_{24}=\mathbf{d}_{46}=[-8,8,0]^\top\, \mathrm{m}\) and \(\mathbf{d}_{13}=\mathbf{d}_{35}=\mathbf{d}_{57}=[8,-8,0]^\top\, \mathrm{m}\), forming a V-shape in the \(x\)-\(y\) plane, as illustrated in Fig. 95(97). Initially, \(\dot{\mathbf{p}}_i(0) = [0,0,0]^\top\, \mathrm{m/s}\), \(\ddot{\mathbf{p}}_i(0) = [0,0,0]^\top\, \mathrm{m/s^2}\), and \(\psi_i(0) = 0\, \mathrm{rad}\), \(\dot{\psi}_i(0) = 0\, \mathrm{rad/s}\), and \(\ddot{\psi}_i(0) = 0\, \mathrm{rad/s^2}\) for \(i \in \{1, \dots, 7\}\). The weighting parameters of the performance index (7) are assigned as \(\mu_{12} = 0.9\), \(\mu_{13} = 0.7\), \(\mu_{24} = 0.8\), \(\mu_{35} = 0.5\), \(\mu_{46} = 0.7\), \(\mu_{57} = 0.6\), with \(\omega_{ij} = 1\) and \(\gamma_i = 1\) for all \(i\). All other parameters remain consistent with the first example.

Initial UAV placement
Figure 96. Initial UAV placement
Desired formation graph
Figure 97. Desired formation graph
Figure 95. Initial configuration and desired V-shape formation graph for a team of seven UAVs.

The optimal formation trajectories and their corresponding time histories are depicted in Figs. 98-101. The relative distances between UAV pairs occasionally fall slightly below \( r_i + r_j \), indicating a collision situation. The formation tracking trajectories and their time histories are presented in Figs. 124-127. While the UAV team achieves collision-free formation tracking (no violations of \( r_i + r_j\) ), the control input \( \mathbf{u}_{\mathbf{z}}\) shows abrupt jumps (e.g., at \( t=2.5\, \mathrm{s}\) ), which are infeasible for physical UAVs due to actuation limits. The formation trajectories and their time histories under the directionally aware collision avoidance tracking strategy are shown in Figs. 150-153. Similarly to the first example, this strategy ensures that no UAV pair violates the \( r_i + r_j \) constraint while allowing only minimal deviations from the optimal trajectory. These results further validate the effectiveness of the proposed formation trajectory planning and tracking framework, particularly demonstrating the efficiency of the directionally aware collision avoidance approach.

Optimal formation trajectories
Figure 99. Optimal formation trajectories
Top view of the trajectories
Figure 100. Top view of the trajectories
Figure 98. Optimal formation trajectories of the seven-UAV team.
Figure 103
Figure 104
Figure 105
Figure 106
Figure 107
Figure 108
Figure 102. Positions and velocities
Figure 110
Figure 111
Figure 112
Figure 113
Figure 114
Figure 115
Figure 109. Accelerations and \(\mathbf{u}_{\mathbf{r}}\)
Figure 117
Figure 118
Figure 119
Figure 120
Figure 121
Figure 122
Figure 116. Attitudes with rates
Euclidean distances
Figure 123. Euclidean distances
Figure 101. Time histories of positions, velocities, accelerations, jerks, attitudes with their rates, and Euclidean distances for the optimal formation trajectories of the seven-UAV team. (d) indicates a collision situation.
Formation tracking trajectories
Figure 125. Formation tracking trajectories
Top view of the tracking trajectories
Figure 126. Top view of the tracking trajectories
Figure 124. Formation trajectory tracking of the seven-UAV team.
Figure 129
Figure 130
Figure 131
Figure 132
Figure 133
Figure 134
Figure 128. Positions and velocities
Figure 136
Figure 137
Figure 138
Figure 139
Figure 140
Figure 141
Figure 135. Accelerations and \(\mathbf{u}_{\mathbf{z}}\)
Figure 143
Figure 144
Figure 145
Figure 146
Figure 147
Figure 148
Figure 142. Attitudes with rates
Euclidean distances
Figure 149. Euclidean distances
Figure 127. Time histories of positions, velocities, accelerations, jerks, attitudes with their rates, and Euclidean distances for formation tracking trajectories of the seven-UAV team. (b) shows a sudden large jump in the tracking control input \( \mathbf{u}_z \) around \(t=2.5\, \mathrm{s}\).
Directionally aware formation tracking trajectories
Figure 151. Directionally aware formation tracking trajectories
Top view of the tracking trajectories
Figure 152. Top view of the tracking trajectories
Figure 150. Formation trajectory tracking of the seven-UAV team utilizing the directionally aware collision avoidance strategy.
Figure 155
Figure 156
Figure 157
Figure 158
Figure 159
Figure 160
Figure 154. Positions and velocities
Figure 162
Figure 163
Figure 164
Figure 165
Figure 166
Figure 167
Figure 161. Accelerations and \(\mathbf{u}_{\mathbf{z}}\)
Figure 169
Figure 170
Figure 171
Figure 172
Figure 173
Figure 174
Figure 168. Attitudes with rates
Euclidean distances
Figure 175. Euclidean distances
Figure 153. Time histories of positions, velocities, accelerations, jerks, attitudes with their rates, and Euclidean distances for formation tracking trajectories of the seven-UAV team utilizing the directionally aware collision avoidance strategy.

The trajectory in Fig. 153 may still be impractical for some physical UAVs due to actuator constraints. However, the proposed formation control design can adapt to these limitations. For instance, reducing the weighting parameters \(\mu_{ij}\) of the formation performance index improves trajectory feasibility. Fig. 183 shows the time histories of trajectories using the directionally aware collision avoidance strategy from Fig. 153, with \(\mu_{ij}\) adjusted to 10  To validate the feasibility of the planned trajectories, a simulation experiment was conducted using the high-fidelity Gazebo simulator in conjunction with the MRS UAV system . The simulation incorporates full system dynamics, aerodynamic effects, and sensor noise while utilizing the same control and estimation software as the real platforms. Over years of validation through extensive use and comparisons with real-world experiments, it has demonstrated high accuracy, rendering the simulation-to-reality gap negligible [38]. The simulation was performed using UAV platforms based on the DJI Flame Wheel F450.

Snapshots of Gazebo simulator-based formation trajectories and their time histories under the directionally aware collision avoidance tracking strategy with improved feasibility are shown in Figs. 176-183. The reference trajectories from the latest example (shown in green in Fig. 176) generate feasible control commands suitable for direct hardware execution [39, 40]. By leveraging this high-fidelity simulator, we validated the feasibility of the formation control scheme in a realistic simulation environment, ensuring minimal discrepancies between simulation and real-world implementation. These results further substantiate the effectiveness of the proposed formation trajectory planning and tracking framework, demonstrating its potential for the deployment of real-world UAV teams.

t=0,s)
Figure 177. \(t=0\,\mathrm{s}\)
t=2,s)
Figure 178. \(t=2\,\mathrm{s}\)
t=4,s)
Figure 179. \(t=4\,\mathrm{s}\)
t=6,s)
Figure 180. \(t=6\,\mathrm{s}\)
t=8,s)
Figure 181. \(t=8\,\mathrm{s}\)
t=10,s)
Figure 182. \(t=10\,\mathrm{s}\)
Figure 176. Snapshots of a seven-UAV team tracking the formation trajectories using the directionally aware collision avoidance strategy with improved feasibility (green) in the high fidelity Gazebo simulator using MRS UAV System.
Figure 185
Figure 186
Figure 187
Figure 188
Figure 189
Figure 190
Figure 184. Positions and velocities
Figure 192
Figure 193
Figure 194
Figure 195
Figure 196
Figure 197
Figure 191. Accelerations and \(\mathbf{u}_{\mathbf{z}}\)
Figure 199
Figure 200
Figure 201
Figure 202
Figure 203
Figure 204
Figure 198. Attitudes with rates
Euclidean distances
Figure 205. Euclidean distances
Figure 183. Time histories of positions, velocities, accelerations, jerks, attitudes with their rates, and Euclidean distances for the seven-UAV team formation trajectories utilizing the directionally aware collision avoidance strategy with improved feasibility. The faded lines correspond to the simulation results from the ROS-Gazebo-based simulator, each matching the color of its reference dashed line.

7 Conclusion

This paper presents a collision-free, finite-time formation control scheme for multiple UAVs based on the differential flatness property of UAV dynamics. Unlike traditional methods that rely on a leader, predefined tracking signals, or waypoints, our approach utilizes the initial states of the UAVs to plan optimal formation trajectories based on consensus. Additionally, we introduce a directionally aware collision avoidance strategy that prioritizes avoiding collisions both in the forward path and the relative approach for each UAV, promoting adaptive behavior and enhancing efficiency. By formulating and solving a finite-time optimal control problem, we derive trajectories that optimize a collective performance index, ensuring efficient formation and movement of the UAV team. To maintain collision-free tracking of the formation trajectory, we incorporate safety guarantees into the tracking controller when necessary, striking a balance between optimality and safety. High-fidelity simulations are performed using the ROS-Gazebo-based simulator. The results confirm the effectiveness of our control scheme, demonstrating successful formation trajectory planning and formation trajectory tracking while avoiding collisions. However, the assumption of a complete communication graph for global collision awareness may pose scalability challenges for larger UAV teams. Future work could address this limitation by developing distributed control strategies that enable real-time, directionally aware, collision-free trajectory planning and tracking, using onboard sensors such as UVDAR [41].

Acknowledgments

This work was partially funded by the Czech Science Foundation (GAČR) under research project no. \(\mathrm{23-07517S}\) and the European Union under the project Robotics and Advanced Industrial Production (reg. no. \(\mathrm{CZ.02.01.01/00/22\_008/0004590}\)), Czech Technical University in Prague (CTU) within the CTU Global Postdoc Fellowship program, and CTU grant no. \(\mathrm{SGS23/177/OHK3/3T/13}\).

Appendix

9 proof of theorem 1

Proof

Define the Hamiltonian

\[ \begin{align} H=\mathbf{r}^\top(t) \mathbf{Q} \mathbf{r}(t)&+\mathbf{u}_{\mathbf{r}}^\top(t) \mathbf{R} \mathbf{u}_{\mathbf{r}}(t)+\boldsymbol{\psi}^\top(t)(\mathbf{A}\mathbf{r}(t)+\mathbf{B}\mathbf{u}_{\mathbf{r}}(t)), \\\end{align} \]

(38)

where \( \boldsymbol{\psi}(t)\) is co-state. According to Pontryagin’s principle, the necessary conditions for optimality are \( \frac{\partial H}{\partial \mathbf{u}_{\mathbf{r}}}=0\) and \( \dot{\boldsymbol{\psi}}(t)=-\frac{\partial H}{\partial \mathbf{r}}\) . Applying the necessary conditions yields

\[ \begin{align} &\mathbf{u}_{\mathbf{r}}(t)=-\mathbf{R}^{-1}\mathbf{B}^\top\boldsymbol{\psi}(t), \end{align} \]

(39)

\[ \begin{align} &\dot{\boldsymbol{\psi}}(t)=-\mathbf{Q}\mathbf{r}(t)-\mathbf{A}^\top\boldsymbol{\psi}(t) ,~\boldsymbol{\psi}(t_f)=\mathbf{Q}_f \mathbf{r}(t_f). \end{align} \]

(40)

Note that (7) is a strictly convex function of \( \mathbf{u}_{\mathbf{r}}(t)\) for all admissible control functions, and the dynamics are linear. Therefore, the conditions obtained from Pontryagin’s principle are both necessary and sufficient [42, 43].

Substituting (39) into the state dynamics (6), we have

\[ \begin{equation} \dot{\mathbf{r}}(t)=\mathbf{A}\mathbf{r}(t)-\mathbf{B}\mathbf{R}^{-1}\mathbf{B}^\top\boldsymbol{\psi}(t),~ \mathbf{r}(0)=\mathbf{r}_0. \end{equation} \]

(41)

Equations (41) and (40) are unified into the following differential equation

\[ \begin{equation} \begin{bmatrix}\dot{\mathbf{r}}(t) \\ \dot{\boldsymbol{\psi}}(t)\end{bmatrix}=\mathbf{M}\begin{bmatrix}\mathbf{r}(t) \\ \boldsymbol{\psi}(t)\end{bmatrix}, ~ \end{equation} \]

(42)

where \( \mathbf{M}\) is given by (12). Moreover, the initial and terminal conditions are unified as

\[ \begin{equation} \begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}\begin{bmatrix}\mathbf{r}(0) \\ \boldsymbol{\psi}(0)\end{bmatrix}+\begin{bmatrix} \mathbf{0} & \mathbf{0} \\ -\mathbf{Q}_f & \mathbf{I} \end{bmatrix}\begin{bmatrix}\mathbf{r}(t_f) \\ \boldsymbol{\psi}(t_f)\end{bmatrix}=\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}. \end{equation} \]

(43)

The solution of (42) at \( t_f\) is given by

\[ \begin{align} \begin{bmatrix}\mathbf{r}(t_f) \\ \boldsymbol{\psi}(t_f)\end{bmatrix}&=\exp{(t_f\mathbf{M})}\begin{bmatrix}\mathbf{r}(0) \\ \boldsymbol{\psi}(0)\end{bmatrix}. \end{align} \]

(44)

Substituting (44) into (43), we obtain

\[ \begin{align} \Bigg(\begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}+\begin{bmatrix} \mathbf{0} & \mathbf{0} \\ -\mathbf{Q}_f & \mathbf{I} \end{bmatrix}\exp{(t_f\mathbf{M})}\Bigg)\begin{bmatrix}\mathbf{r}(0) \\ \boldsymbol{\psi}(0)\end{bmatrix}=\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}, \end{align} \]

(45)

or equivalently,

\[ \begin{align} &\Bigg(\begin{bmatrix} \mathbf{I} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} \end{bmatrix}\exp{(-t_f\mathbf{M})}+\begin{bmatrix} \mathbf{0} & \mathbf{0} \\ -\mathbf{Q}_f & \mathbf{I} \end{bmatrix}\Bigg)\exp{(t_f\mathbf{M})}\begin{bmatrix}\mathbf{r}(0) \\ \boldsymbol{\psi}(0)\end{bmatrix}=\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}. \end{align} \]

(46)

Let

\[ \begin{equation} \exp{(-t_f\mathbf{M})}=\begin{bmatrix} \mathbf{C}_{11} & \mathbf{C}_{12} \\ \mathbf{C}_{21} & \mathbf{C}_{22} \end{bmatrix}. \end{equation} \]

(47)

Multiplying both sides of (46) by \( \begin{bmatrix} \mathbf{I} & -\mathbf{C}_{12} \\ \mathbf{0} & \mathbf{I} \end{bmatrix}\) , we get

\[ \begin{align} &\begin{bmatrix} \mathbf{C}_{11}+\mathbf{C}_{12}\mathbf{Q}_f & \mathbf{0} \\ -\mathbf{Q}_f & \mathbf{I} \end{bmatrix}\exp{(t_f\mathbf{M})}\begin{bmatrix}\mathbf{r}(0) \\ \boldsymbol{\psi}(0)\end{bmatrix}=\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}, \end{align} \]

(48)

or equivalently,

\[ \begin{align} &\begin{bmatrix} \mathbf{C}_{11}+\mathbf{C}_{12}\mathbf{Q}_f & \mathbf{0} \\ -\mathbf{Q}_f & \mathbf{I} \end{bmatrix}\begin{bmatrix}\mathbf{r}(t_f) \\ \boldsymbol{\psi}(t_f)\end{bmatrix}=\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}. \end{align} \]

(49)

Substituting (47) into (10) yields

\[ \begin{align} \mathbf{H}(t_f)=\mathbf{C}_{11}+\mathbf{C}_{12}\mathbf{Q}_f. \end{align} \]

(50)

Thus, (48) is rewritten as

\[ \begin{align} &\begin{bmatrix} \mathbf{H}(t_f) & \mathbf{0} \\ -\mathbf{Q}_f & \mathbf{I} \end{bmatrix}\begin{bmatrix}\mathbf{r}(t_f) \\ \boldsymbol{\psi}(t_f)\end{bmatrix}=\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}. \end{align} \]

(51)

From (51), we have

\[ \begin{align} \begin{bmatrix}\mathbf{r}(t_f) \\ \boldsymbol{\psi}(t_f)\end{bmatrix}&=\begin{bmatrix} \mathbf{H}(t_f) & \mathbf{0} \\ -\mathbf{Q}_f & \mathbf{I} \end{bmatrix}^{-1}\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}=\begin{bmatrix} \mathbf{H}^{-1}(t_f) & \mathbf{0} \\ \mathbf{Q}_f\mathbf{H}^{-1}(t_f) & \mathbf{I} \end{bmatrix}\begin{bmatrix}\mathbf{r}_0 \\ \mathbf{0}\end{bmatrix}\nonumber \end{align} \]

(52)

\[ \begin{align} &=\begin{bmatrix} \mathbf{H}^{-1}(t_f)\mathbf{r}_0 \\ \mathbf{Q}_f\mathbf{H}^{-1}(t_f)\mathbf{r}_0 \end{bmatrix}=\begin{bmatrix} \mathbf{I} \\ \mathbf{Q}_f \end{bmatrix}\mathbf{H}^{-1}(t_f)\mathbf{r}_0, \\\end{align} \]

(53)

Substituting (52) into (44) and rearranging it, we get

\[ \begin{align} &\begin{bmatrix}\mathbf{r}(0) \\ \boldsymbol{\psi}(0)\end{bmatrix}=\exp{(-t_f\mathbf{M})}\begin{bmatrix} \mathbf{I} \\ \mathbf{Q}_f \end{bmatrix}\mathbf{H}^{-1}(t_f)\mathbf{r}_0. \end{align} \]

(54)

Substituting (54) into the solution of (42) at \( t\) yields

\[ \begin{align} \begin{bmatrix}\mathbf{r}(t) \\ \boldsymbol{\psi}(t)\end{bmatrix}&=\exp{(t\mathbf{M})}\begin{bmatrix}\mathbf{r}(0) \\ \boldsymbol{\psi}(0)\end{bmatrix}=\exp{((t-t_f)\mathbf{M})}\begin{bmatrix} \mathbf{I} \\ \mathbf{Q}_f \end{bmatrix}\mathbf{H}^{-1}(t_f)\mathbf{r}_0. \end{align} \]

(55)

From (55), we have

\[ \begin{align} & \mathbf{r}(t) =\begin{bmatrix} \mathbf{I} & \mathbf{0} \end{bmatrix}\exp{((t-t_f)\mathbf{M})}\begin{bmatrix} \mathbf{I} \\ \mathbf{Q}_f \end{bmatrix}\mathbf{H}^{-1}(t_f)\mathbf{r}_0, \\ & \boldsymbol{\psi}(t) =\begin{bmatrix} \mathbf{0} & \mathbf{I} \end{bmatrix}\exp{((t-t_f)\mathbf{M})}\begin{bmatrix} \mathbf{I} \\ \mathbf{Q}_f \end{bmatrix}\mathbf{H}^{-1}(t_f)\mathbf{r}_0. \\\end{align} \]

(56)

Using the notation in (10) and (11), we get (8) and (9). This concludes the proof.

10 stability analysis of forward-path-aware collision avoidance control

This appendix analyzes the stability of the collision avoidance control strategy under the forward-path-aware penalty function \( v_{ij}^F \), as introduced in Subsection 5.1. Specifically, it demonstrates that the control inputs remain bounded without relying on Assumption 4, ensuring the stability of the directionally aware collision avoidance framework. The analysis focuses on the gradient of the penalty function and its impact on the control law in (25).

Recall the collision avoidance control strategy (25) under the forward-path-aware penalty function as follows,

\[ \begin{align} \mathbf{u}_{\mathbf{z}}(t) =-\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top \mathbf{P}(t) \mathbf{z}(t)-\mathbf{R}_{\mathbf{z}}^{-1}\mathbf{B}^\top \frac{\partial V^F}{\partial \mathbf{z}}^\top, \end{align} \]

(57)

where \( V_i^F = \sum_{j \in \mathcal{N}_i} v_{ij}^F = \sum_{j \in \mathcal{N}_i} \alpha_{ij} v_{ij} \) for UAV \( i \) and the total penalty is \( V^F = \sum_i V_i^F \).

Then, we have

\[ \begin{equation} \frac{\partial V_i^F}{\partial \mathbf{z}_i}=\sum_{j\in\mathcal{N}_i}\big(\alpha_{ij}\underbrace{\frac{\partial v_{ij}}{\partial \mathbf{z}_i}}_{\text{original gradient}}+v_{ij}\underbrace{\frac{\partial \alpha_{ij}}{\partial \mathbf{z}_i}}_{\text{directional adjustment}}\big). \end{equation} \]

(58)

The gradient decomposition in (58) shows how directional awareness modifies the collision response. The original gradient term maintains baseline collision avoidance, while the directional adjustment term modulates responses based on the forward-path awareness.

To ensure stability and avoid unbounded control inputs, we must guarantee that the gradient of the total penalty \(\frac{\partial V^F}{\partial \mathbf{z}}\) does not grow indefinitely.

The total penalty \( V\) is guaranteed to be non-increasing because of bounded components \( \alpha_{ij}\) and \( v_{ij}\) (where \( \alpha_{ij} \in [0,1]\) and \( v_{ij} \to 0\) as \( \|\mathbf{p}_j-\mathbf{p}_i\| \to \infty\) ) and vanishing gradients

\[ \left\|\frac{\partial V_i^F}{\partial \mathbf{z}_i}\right\| \leq \sum_{j \in \mathcal{N}_i} \left(\alpha_{ij}\left\|\frac{\partial v_{ij}}{\partial \mathbf{z}_i}\right\| + v_{ij}\left\|\frac{\partial \alpha_{ij}}{\partial \mathbf{z}_i}\right\|\right), \]

where \(\frac{\partial v_{ij}}{\partial \mathbf{z}_i}\) is proven to be bounded in Lemma 1, and for \(\frac{\partial \alpha_{ij}}{\partial \mathbf{z}_i}\), its bounds are derived as follows.

For \( \alpha_{ij} = \frac{(\mathbf{p}_j - \mathbf{p}_i)^\top \mathbf{v}_i}{\|\mathbf{p}_j - \mathbf{p}_i\| \|\mathbf{v}_i\|}\) (when \( \|\mathbf{v}_i\| \neq 0\) and \( \cos\eta_{ij} > 0\) ), the partial position and velocity derivatives are

\[ \begin{align*} &\frac{\partial \alpha_{ij}}{\partial \mathbf{p}_i} = -\frac{\mathbf{v}_i}{\|\mathbf{p}_j - \mathbf{p}_i\| \|\mathbf{v}_i\|} + \frac{(\mathbf{p}_j - \mathbf{p}_i)^\top \mathbf{v}_i}{\|\mathbf{p}_j - \mathbf{p}_i\|^3 \|\mathbf{v}_i\|}(\mathbf{p}_j - \mathbf{p}_i),\\ &\frac{\partial \alpha_{ij}}{\partial \mathbf{v}_i} = \frac{\mathbf{p}_j - \mathbf{p}_i}{\|\mathbf{p}_j - \mathbf{p}_i\| \|\mathbf{v}_i\|} - \frac{(\mathbf{p}_j - \mathbf{p}_i)^\top \mathbf{v}_i}{\|\mathbf{p}_j - \mathbf{p}_i\| \|\mathbf{v}_i\|^3}\mathbf{v}_i. \end{align*} \]

Taking their Euclidean norm, we have

\[ \begin{align*} \left\|\frac{\partial \alpha_{ij}}{\partial \mathbf{p}_i}\right\| &\leq \left\|-\frac{\mathbf{v}_i}{\|\mathbf{p}_j - \mathbf{p}_i\| \|\mathbf{v}_i\|}\right\| + \left\|\frac{(\mathbf{p}_j - \mathbf{p}_i)^\top \mathbf{v}_i}{\|\mathbf{p}_j - \mathbf{p}_i\|^3 \|\mathbf{v}_i\|}(\mathbf{p}_j - \mathbf{p}_i)\right\| \\ &= \frac{1}{\|\mathbf{p}_j - \mathbf{p}_i\|} + \frac{|\cos\eta_{ij}|}{\|\mathbf{p}_j - \mathbf{p}_i\|} \leq \frac{2}{\|\mathbf{p}_j - \mathbf{p}_i\|}, \end{align*} \]

and

\[ \begin{align*} \left\|\frac{\partial \alpha_{ij}}{\partial \mathbf{v}_i}\right\| &\leq \left\|\frac{\mathbf{p}_j - \mathbf{p}_i}{\|\mathbf{p}_j - \mathbf{p}_i\| \|\mathbf{v}_i\|}\right\| + \left\|\frac{(\mathbf{p}_j - \mathbf{p}_i)^\top \mathbf{v}_i}{\|\mathbf{p}_j - \mathbf{p}_i\| \|\mathbf{v}_i\|^3}\mathbf{v}_i\right\| \\ &= \frac{1}{\|\mathbf{v}_i\|} + \frac{|\cos\eta_{ij}|}{\|\mathbf{v}_i\|} \leq \frac{2}{\|\mathbf{v}_i\|}. \end{align*} \]

Thus, the total derivative norm is bounded by

\[ \left\|\frac{\partial \alpha_{ij}}{\partial \mathbf{z}_i}\right\| \leq \left\|\frac{\partial \alpha_{ij}}{\partial \mathbf{p}_i}\right\| + \left\|\frac{\partial \alpha_{ij}}{\partial \mathbf{v}_i}\right\| \leq \frac{2}{\|\mathbf{p}_j - \mathbf{p}_i\|} + \frac{2}{\|\mathbf{v}_i\|}. \]

Since \( v_{ij} \to 0 \) as \(\|\mathbf{p}_j - \mathbf{p}_i\| \to \infty\), the term \( v_{ij} \frac{\partial \alpha_{ij}}{\partial \mathbf{z}_i} \) in (58) diminishes at large separations, further ensuring that the directional adjustment term does not contribute to unbounded growth in the total gradient.

The boundedness of \(\frac{\partial \alpha_{ij}}{\partial \mathbf{z}_i}\), combined with the boundedness of \(\frac{\partial v_{ij}}{\partial \mathbf{z}_i}\) (as proven in Lemma 1) and the decay of \( v_{ij} \) as \(\|\mathbf{p}_j - \mathbf{p}_i\| \to \infty\), ensures that the total gradient \(\frac{\partial V_i^F}{\partial \mathbf{z}_i}\) in (58) remains bounded. Consequently, the control input \(\mathbf{u}_{\mathbf{z}}(t)\) in (57) avoids unbounded growth, preserving the stability of the HJB equation and the collision avoidance strategy without Assumption 4.

Similar derivations can be applied to the approach-aware penalty function \( v_{ij}^A = \beta_{ij} v_{ij} \) and the unified penalty function \( v_{ij}^U = \xi_{ij} v_{ij} \), as their weighting factors (\(\beta_{ij}\), \(\xi_{ij}\)) are also bounded in \([0, 1]\) and their gradients can be shown to be bounded using analogous techniques, ensuring stability across the directionally aware framework.

References

[1] Pavel Petráček and Vít Krátký and Matěj Petrlík and Tomáš Báča and Radim Kratochvíl and Martin Saska Large-Scale Exploration of Cave Environments by Unmanned Aerial Vehicles IEEE Robotics and Automation Letters 2021 6 4 7596-7603

[2] Celso Oliveira Barcelos and Leonardo Alves Fagundes-Júnior and Daniel Khéde Dourado Villa and Mário Sarcinelli-Filho and Amanda Piaia Silvatti and Daniel Ceferino Gandolfo and Alexandre Santos Brandão Robot Formation Performing a Collaborative Load Transport and Delivery Task by Using Lifting Electromagnets Applied Sciences 2023 13 2

[3] Timur Uzakov and Tiago P. Nascimento and Martin Saska UAV 2020 International Conference on Unmanned Aircraft Systems (ICUAS) 2020 1301-1308

[4] Saska, Martin and Hert, Daniel and Baca, Tomas and Kratky, Vit and Nascimento, Tiago Formation control of unmanned micro aerial vehicles for straitened environments Autonomous Robots 2020 44 6 991-1008

[5] Yunes Alqudsi and Murat Makaraci and Ayman Kassem and Gamal El-Bayoumi A numerically-stable trajectory generation and optimization algorithm for autonomous quadrotor UAVs Robotics and Autonomous Systems 2023 170 104532

[6] Nunzio A. Letizia and Babak Salamat and Andrea M. Tonello A Novel Recursive Smooth Trajectory Generation Method for Unmanned Vehicles IEEE Transactions on Robotics 2021 37 5 1792-1805

[7] Jianan Wang and Ming Xin Integrated Optimal Formation Control of Multiple Unmanned Aerial Vehicles IEEE Transactions on Control Systems Technology 2013 21 5 1731-1744

[8] Youfang Huang and Wen Liu and Bo Li and Yongsheng Yang and Bing Xiao Finite-time formation tracking control with collision avoidance for quadrotor UAVs Journal of the Franklin Institute 2020 357 7 4034-4058

[9] Zhixu Du and Hao Zhang and Zhuping Wang and Huaicheng Yan Model Predictive Formation Tracking-Containment Control for Multi-UAVs With Obstacle Avoidance IEEE Transactions on Systems, Man, and Cybernetics: Systems 2024 1-11

[10] Bo Li and Wenquan Gong and Yongsheng Yang and Bing Xiao Distributed Fixed-Time Leader-Following Formation Control for Multiquadrotors With Prescribed Performance and Collision Avoidance IEEE Transactions on Aerospace and Electronic Systems 2023 59 5 7281-7294

[11] Daniel mellinger and Vijay Kumar Minimum snap trajectory generation and control for quadrotors 2011 IEEE International Conference on Robotics and Automation 2011 2520-2525 10.1109/ICRA.2011.5980409

[12] Ezra Tal and Sertac Karaman Accurate Tracking of Aggressive Quadrotor Trajectories Using Incremental Nonlinear Dynamic Inversion and Differential Flatness IEEE Transactions on Control Systems Technology 2021 29 3 1203-1218 10.1109/TCST.2020.3001117

[13] Pengyu Wang and Jiawei Tang and Hin Wang Lin and Fan Zhang and Chaoqun Wang and Jiankun Wang and Ling Shi and Max Q.-H. Meng MINER-RRT IEEE Transactions on Automation Science and Engineering 2025 1-1 10.1109/TASE.2025.3531504

[14] Xiaolin Ai and Jianqiao Yu Flatness-based finite-time leader–follower formation control of multiple quadrotors with external disturbances Aerospace Science and Technology 2019 92 20-33

[15] Daniele Ravasio and Luca Bascetta and Gian Paolo Incremona and Maria Prandini A Leader-Follower Strategy with Distributed Consensus for the Coordinated Navigation of a Team of Quadrotors in an Environment with Obstacles 2024 European Control Conference (ECC) 2024 1443-1448 10.23919/ECC64448.2024.10590881

[16] Robert Selje and Amer Al-Radaideh and Liang Sun Efficient Three Dimensional Formation Control for Unmanned Aerial Vehicles in GPS-Denied Environments 2021 American Control Conference (ACC) 2021 3761-3766 10.23919/ACC50511.2021.9483176

[17] Dingjiang Zhou and Zijian Wang and Mac Schwager Agile Coordination and Assistive Collision Avoidance for Quadrotor Swarms Using Virtual Structures IEEE Transactions on Robotics 2018 34 4 916-923 10.1109/TRO.2018.2857477

[18] Haibo Du and Wenwu Zhu and Guanghui Wen and Zhisheng Duan and Jinhu Lü Distributed Formation Control of Multiple Quadrotor Aircraft Based on Nonsmooth Consensus Algorithms IEEE Transactions on Cybernetics 2019 49 1 342-353 10.1109/TCYB.2017.2777463

[19] Y. Wu, J. Gou, X. Hu, and Y. Huang, “A new consensus theory-based method for formation control and obstacle avoidance of UAVs,'' Aerospace Science and Technology, vol. 107, p. 106332, 2020.

[20] Xiangwei Bu and Maolong Lv and Humin Lei Discrete-Time Optimal Control Ensuring Fixed-Time Prescribed Performance for SSP IEEE Transactions on Aerospace and Electronic Systems 2024 1-9 10.1109/TAES.2024.3485012

[21] Xiangwei Bu and Humin Lei Fixed-Time Prescribed Performance Unknown Direction Control of Discrete-Time Non-Affine Systems Without Nussbaum-Type Function IEEE Transactions on Automation Science and Engineering 2024 21 4 7064-7072 10.1109/TASE.2023.3336976

[22] Xiangwei Bu and Ruining Luo and Humin Lei Fuzzy-Neural Intelligent Control With Fixed-Time Pre-Configured Qualities for Electromechanical Dynamics of Electric Vehicles Motors IEEE Transactions on Intelligent Transportation Systems 2025 26 1 1193-1202 10.1109/TITS.2024.3487958

[23] Runqi Chai and Al Savvaris and Antonios Tsourdos and Senchun Chai and Yuanqing Xia Trajectory Optimization of Space Maneuver Vehicle Using a Hybrid Optimal Control Solver IEEE Transactions on Cybernetics 2019 49 2 467-480

[24] Bahareh Sabetghadam and Rita Cunha and António Pascoal A Distributed Algorithm for Real-Time Multi-Drone Collision-Free Trajectory Replanning Sensors 2022 22 5

[25] K. Lee, C. Park, and S. Y. Park, “Near-optimal continuous control for spacecraft collision avoidance maneuvers via generating functions,'' Aerospace Science and Technology, vol. 62, pp. 65–74, 2017.

[26] Z. Shi, J. He, T. Wang, C. Zhou, and J. Guo, “Optimal formation control and collision avoidance in environment with multiple rectangle obstacles,'' Journal of the Franklin Institute, vol. 355, no. 15, pp. 7626–7642, 2018.

[27] Huiming Li and Hao Chen and Xiangke Wang Collision-free formation generation and tracking of unmanned aerial vehicles based on physicomimetics approach Asian Journal of Control 2023 25 4 2752-2775

[28] Hao Liu and Teng Ma and Frank L. Lewis and Yan Wan Robust Formation Trajectory Tracking Control for Multiple Quadrotors With Communication Delays IEEE Transactions on Control Systems Technology 2020 28 6 2633-2640

[29] J. C. Engwerda, “Robust open-loop Nash equilibria in the noncooperative LQ game revisited,'' Optimal Control Applications and Methods, vol. 38, no. 5, pp. 795–813, 2017.

[30] C. Li and Z. Qu, “Distributed finite-time consensus of nonlinear systems under switching topologies,'' Automatica, vol. 50, no. 6, pp. 1626–1631, 2014.

[31] Arthur E Bryson Optimal control-1950 to 1985 IEEE Control Systems Magazine 1996 16 3 26–33

[32] John T Betts Practical methods for optimal control and estimation using nonlinear programming SIAM 2010

[33] X. Ai and J. Yu, “Fixed-time trajectory tracking for a quadrotor with external disturbances: A flatness-based sliding mode control approach,'' Aerospace Science and Technology, vol. 89, pp. 58–76, 2019.

[34] L. E. Beaver and A. A. Malikopoulos, “Optimal control of differentially flat systems is surprisingly easy,'' Automatica, vol. 159, p. 111404, 2024.

[35] Hossein B. Jond and Vasif Nabiyev On the finite horizon Nash equilibrium solution in the differential game approach to formation control Journal of Systems Engineering and Electronics 2019 30 6 1233-1242 10.21629/JSEE.2019.06.17

[36] Hossein B. Jond and Vasif Nabiyev and Nurhan Özmen and Dalibor Lukáš Existence of Nash equilibrium in differential game approach to formation control International Journal of Robotics and Automation 2018 33 206 07 10.2316/Journal.206.2018.4.206-5308

[37] Dušan M. Stipanović and Peter F. Hokayem and Mark W. Spong and Dragoslav D. Šiljak Cooperative Avoidance Control for Multiagent Systems Journal of Dynamic Systems, Measurement, and Control 2007 129 5 699-707 04

[38] T. Baca, M. Petrlik, M. Vrba, V. Spurny, R. Penicka, D. Hert, and M. Saska, “The MRS UAV system: Pushing the frontiers of reproducible research, real-world deployment, and education with autonomous unmanned aerial vehicles,'' Journal of Intelligent & Robotic Systems, vol. 102, no. 1, p. 26, Apr. 2021.

[39] T Baca and D Hert and G Loianno and M Saska and V Kumar Model Predictive Trajectory Tracking and Collision Avoidance for Reliable Outdoor Deployment of Unmanned Aerial Vehicles 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 1–8 IEEE

[40] M. Petrlik and T. Baca and D. Hert and M. Vrba and T. Krajnik and M. Saska A Robust UAV System for Operations in a Constrained Environment IEEE Robotics and Automation Letters 2169-2176 4 10.1109/LRA.2020.2970980

[41] Viktor Walter and Nicolas Staub and Antonio Franchi and Martin Saska UVDAR IEEE Robotics and Automation Letters 2019 4 3 2637-2644

[42] J. C. Engwerda, “On the open-loop Nash equilibrium in LQ-games,'' Journal of Economic Dynamics and Control, vol. 22, no. 5, pp. 729–762, 1998.

[43] Urszula Ledzewicz and Heinz Schättler Pitfalls in applying optimal control to dynamical systems: An overview and editorial perspective Discrete and Continuous Dynamical Systems-B 2022 27 11 6711–6722