Researchers are tackling the challenge of coordinating swarms of unmanned aerial vehicles (UAVs) in environments where communication is impossible. Myong-Yol Choi, Hankyoul Ko, and Hanse Cho of the Department of Mechanical Engineering at the Ulsan National Institute of Science and Technology, together with Changseung Kim, Seunghwan Kim, Jaemin Seo, and others, present a deep reinforcement learning (DRL) controller that enables UAVs to navigate collectively using only onboard LiDAR sensing, an important step towards robust, autonomous group behavior. Inspired by the collective behavior of biological groups, the system allows followers to implicitly track a leader without explicit communication or external positioning, overcoming a key limitation of many current UAV swarm technologies. Validated through extensive simulations and real-world experiments, the approach promises to unlock UAV swarms for applications such as search and rescue, infrastructure inspection, and environmental monitoring, even in communication-denied scenarios.
Leader-follower DRL for UAV fleets in communication-denied environments
Scientists have demonstrated a deep reinforcement learning (DRL)-based controller for collective navigation of unmanned aerial vehicle (UAV) swarms in communication-free environments. Inspired by the coordination observed in biological flocks, where informed individuals lead the group without explicit signaling, the team adopted an implicit leader-follower framework to achieve robust behavior in complex, obstacle-filled environments. The approach lets swarms move effectively even when inter-agent communication is unavailable or unreliable, a significant limitation of many existing systems. At the core of the research is a system in which only the leader UAV possesses goal information, while follower UAVs learn robust policies solely from onboard LiDAR sensing, without inter-agent communication or explicit leader identification.
The researchers combined LiDAR point clustering with an extended Kalman filter to provide stable neighbor tracking and reliable perception independent of external positioning systems, allowing the swarm to operate autonomously without the external infrastructure and constant communication links that constrain many existing systems. The central innovation is a DRL controller trained in GPU-accelerated NVIDIA Isaac Sim, which lets follower UAVs learn complex emergent behaviors and balance swarming against robust obstacle avoidance using only local perception. As a result, the swarm can implicitly track the leader even under perceptual challenges such as occlusion or limited visibility.
Extensive simulations and difficult real-world experiments with a swarm of five UAVs confirmed the robustness of this approach and its ability to transfer from simulation to reality. The team successfully demonstrated collective navigation across diverse indoor and outdoor environments without any communication or external localization, a remarkable achievement in autonomous swarm robotics. This breakthrough opens exciting possibilities for applications in scenarios where communications are compromised, such as search and rescue operations in disaster areas, surveillance in conflict environments, and environmental monitoring in remote areas. This research establishes a new paradigm for resilient and scalable UAV swarm navigation, paving the way for more adaptive and reliable autonomous systems.
Leader-follower DRL for UAV fleets in communication-denied environments
Scientists have developed a new deep reinforcement learning (DRL) controller for collective navigation of unmanned aerial vehicle (UAV) swarms that operates without agent-to-agent communication, addressing critical limitations in challenging environments. Inspired by biological groups, the team designed an implicit leader-follower framework in which a single informed leader guides the group without explicit signaling, enabling robust operation even in communication-denied scenarios. The approach avoids the bandwidth and latency constraints of traditional communication-dependent methods and offers improved scalability and real-time performance. For robust perception, the study combines onboard LiDAR sensing with an extended Kalman filter to achieve stable neighbor tracking and reliable relative localization independent of external systems.
The team used LiDAR point clustering to detect surrounding UAVs and fed the cluster data into a Kalman filter to maintain accurate, continuous tracking of each follower's position relative to the leader and its peers. The system provides accurate relative localization even under occlusion and a limited field of view, which is essential for maintaining group cohesion. The core innovation is a DRL controller trained in a GPU-accelerated Isaac Sim environment, which allows follower UAVs to learn complex behaviors from local sensory input alone while balancing swarming and obstacle avoidance. The reward function was designed to encourage both approaching neighbors and successfully avoiding obstacles, promoting emergent collective behavior.
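As a rough illustration of this kind of reward shaping, the sketch below combines a cohesion term (keep a desired spacing to the nearest neighbor) with a safety penalty (stay clear of obstacles). All names, weights, and distances here are illustrative assumptions, not the paper's actual reward.

```python
def follower_reward(dist_to_nearest_neighbor: float,
                    min_obstacle_range: float,
                    desired_spacing: float = 1.5,
                    safe_range: float = 0.5,
                    w_cohesion: float = 1.0,
                    w_safety: float = 2.0) -> float:
    """Toy shaping reward: pull toward neighbors, penalize near-obstacles.

    All parameter values are assumed for illustration only.
    """
    # Cohesion term: highest (zero) when the follower sits at the desired spacing.
    cohesion = -abs(dist_to_nearest_neighbor - desired_spacing)
    # Safety term: penalty grows as the closest obstacle enters the safe range.
    safety = -max(0.0, safe_range - min_obstacle_range)
    return w_cohesion * cohesion + w_safety * safety
```

A follower at the desired spacing with clear surroundings scores zero; drifting away from neighbors or toward obstacles makes the reward increasingly negative, which is the balance the text describes.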
This training process allowed followers to implicitly track the leader's movements, producing a cohesive swarm without explicit leader identification or mission broadcast. Robustness and simulation-to-reality transfer were rigorously validated through extensive simulations and real-world experiments, including a swarm of five UAVs navigating diverse indoor and outdoor environments. The swarm successfully demonstrated collective navigation without communication or external localization, confirming the effectiveness of the proposed method and its potential for real-world deployment. The system navigated varied terrain and demonstrated adaptability and resilience in complex, dynamic scenarios.
Leader and follower UAV fleets navigate without communication
Scientists have developed a deep reinforcement learning (DRL)-based controller for collective navigation of unmanned aerial vehicle (UAV) swarms in communication-free environments, achieving robust behavior amid complex obstacles. The research team employs an implicit leader-follower framework mirroring biological flocks: an informed individual guides the group without explicit communication, only the leader possesses goal information, and follower UAVs learn their policies using only onboard LiDAR sensing. Experiments revealed that follower UAVs can navigate collectively by reacting to changes in the leader's position rather than being constrained to the direction of the swarm's average velocity. The main objective of the study was to train a robust control policy π via DRL that lets each follower balance obstacle avoidance against swarming behaviors, namely cohesion and separation, using only onboard LiDAR data.
Results show that successful execution of these behaviors yields emergent leader following: the swarm naturally moves toward the destination as followers maintain positional ties with their neighbors under the influence of the leader's goal-directed movements, enabling group movement in a completely communication-free manner. Measurements confirm the effectiveness of the LiDAR-based perception system, which consists of ego-state estimation, object tracking, and point-downsampling modules. The perception pipeline filters the raw point cloud, stacks the two most recent point clouds, and retains points within a range of 0.05 to 10.0 meters.
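The stacking-and-range-filtering step can be sketched as follows. Only the 0.05–10.0 m range and the two-scan stacking are taken from the text; the function name and array layout are assumptions for illustration.

```python
import numpy as np

def preprocess_scan(prev_pts: np.ndarray, curr_pts: np.ndarray,
                    r_min: float = 0.05, r_max: float = 10.0) -> np.ndarray:
    """Stack the two most recent point clouds and keep points whose range
    from the sensor lies within [r_min, r_max] meters.

    Both inputs are assumed to be (N, 3) arrays of xyz points in the
    sensor frame; this layout is an illustrative assumption.
    """
    stacked = np.vstack([prev_pts, curr_pts])    # combine the two scans
    ranges = np.linalg.norm(stacked, axis=1)     # distance from the sensor
    return stacked[(ranges >= r_min) & (ranges <= r_max)]
```

Points closer than 5 cm (typically self-returns from the airframe) and farther than 10 m are discarded before any clustering is attempted.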
High-intensity points, identified with an intensity threshold of 170, serve as seeds for object detection, while region-of-interest points within a 0.3-meter radius of an existing track centroid are also retained. DBSCAN clustering groups the filtered points using a distance threshold of 0.1 meters and a minimum cluster size scaled by the number of stacked frames. Each cluster is then tracked by an extended Kalman filter (EKF) with a constant-velocity model, and tracks are validated using a high-intensity ratio threshold of 0.05 sustained for 0.01 seconds. The resulting perception output, representing verified neighbors, feeds the DRL control policy. The DRL framework models the follower control problem as a partially observable Markov decision process and uses encoders to compress observations into latent vectors, from which actor and critic networks select actions and estimate values, enabling stable collective navigation without an external positioning system. Through extensive simulations and real-world experiments with a fleet of five UAVs, the team successfully demonstrated collective navigation across diverse indoor and outdoor environments without communication or external localization.
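A simplified density-clustering stand-in for the DBSCAN step might look like the sketch below: points within the distance threshold are merged, and groups smaller than the minimum size are discarded as noise. Real DBSCAN additionally distinguishes core from border points, and everything here except the 0.1 m threshold is an illustrative assumption.

```python
import numpy as np

def cluster_points(pts: np.ndarray, eps: float = 0.1, min_pts: int = 4):
    """DBSCAN-like grouping via union-find: merge points within `eps` meters,
    drop groups smaller than `min_pts` as noise.

    In the paper's pipeline the minimum cluster size would be scaled by the
    number of stacked frames; `min_pts=4` is an assumed value.
    """
    n = len(pts)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Union every pair of points closer than eps.
    for i in range(n):
        for j in range(i + 1, n):
            if np.linalg.norm(pts[i] - pts[j]) <= eps:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return [pts[idx] for idx in groups.values() if len(idx) >= min_pts]
```

Each surviving group approximates one neighboring UAV's LiDAR return; its centroid would then be handed to the tracking filter.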
Implicit leadership enables robust UAV swarm navigation
Researchers have developed a deep reinforcement learning (DRL) controller that enables collective navigation of unmanned aerial vehicle (UAV) swarms operating without inter-agent communication. The system takes inspiration from biological swarm behavior, where leadership emerges naturally, and addresses the challenge of coordinating multiple UAVs in complex, obstacle-filled environments. The core innovation is an implicit leader-follower framework in which a single "leader" UAV possesses goal information and follower UAVs learn to navigate using only onboard LiDAR sensing, eliminating the need for agent-to-agent communication or explicit leader identification. The system employs LiDAR point clustering and an extended Kalman filter to ensure stable neighbor tracking and reliable perception independent of external positioning systems.
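A minimal sketch of constant-velocity tracking in the spirit of the EKF stage is shown below. With a linear motion model and position-only cluster-centroid measurements, the EKF reduces to a standard Kalman filter; the state layout and noise values are assumptions, not the paper's.

```python
import numpy as np

# State: [x, y, z, vx, vy, vz]; constant-velocity motion model.
def make_cv_matrices(dt: float):
    F = np.eye(6)
    F[:3, 3:] = dt * np.eye(3)                    # position advances by velocity * dt
    H = np.hstack([np.eye(3), np.zeros((3, 3))])  # only position is measured
    return F, H

def kf_step(x, P, z, dt, q=0.1, r=0.05):
    """One predict/update cycle of a linear Kalman filter.

    `q` and `r` (process and measurement noise scales) are assumed values.
    """
    F, H = make_cv_matrices(dt)
    Q = q * np.eye(6)
    R = r * np.eye(3)
    # Predict: propagate state and covariance through the motion model.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update: correct with a cluster-centroid position measurement z.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (z - H @ x)
    P = (np.eye(6) - K @ H) @ P
    return x, P
```

Because the filter carries a velocity estimate, it can coast a track through brief occlusions by predicting without an update, which is what makes the neighbor tracking stable under limited visibility.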
DRL controllers trained in GPU-accelerated simulation enable follower UAVs to learn complex behaviors while balancing swarming and obstacle avoidance based solely on local perception. Extensive simulations and real-world experiments with a fleet of five UAVs confirm the robustness and simulation-to-reality transferability of the approach, successfully demonstrating collective navigation in diverse indoor and outdoor environments without communication or external localization. The authors acknowledge limits to current scalability, and future work will address larger swarms alongside more complex collective behaviors such as adaptive role switching. The study demonstrates the practicality and robustness of DRL for communication-free collective navigation and offers a promising solution for UAV swarm operations in challenging, communication-denied scenarios.
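To make the encoder-plus-actor-critic structure described above concrete, here is a toy NumPy forward pass: an encoder compresses a flattened LiDAR observation into a latent vector, and separate actor and critic heads read that latent. All layer sizes, the observation layout, and the random weights are invented for illustration; no trained policy is implied.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random weight matrices for a small multilayer perceptron (toy values)."""
    return [rng.standard_normal((m, n)) * 0.1 for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    """tanh hidden layers, linear output layer."""
    for W in layers[:-1]:
        x = np.tanh(x @ W)
    return x @ layers[-1]

# Assumed sizes: 64-dim flattened observation, 16-dim latent, 3-dim action.
obs_dim, latent_dim, act_dim = 64, 16, 3
encoder = mlp([obs_dim, 32, latent_dim])
actor   = mlp([latent_dim, 32, act_dim])   # e.g. a body-frame velocity command
critic  = mlp([latent_dim, 32, 1])         # scalar state-value estimate

obs = rng.standard_normal(obs_dim)         # stand-in for processed LiDAR input
z = np.tanh(forward(encoder, obs))         # shared latent vector
action = forward(actor, z)
value = forward(critic, z)
```

The key design point mirrored here is that actor and critic both consume the same compact latent, so the policy never sees raw global state, only what onboard perception encodes.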
👉 More information
🗞 Communication-free collective navigation of UAV swarms using LiDAR-based deep reinforcement learning
🧠 arXiv: https://arxiv.org/abs/2601.13657
