

Deep Reinforcement Learning for drone obstacle avoidance

Unmanned Aerial Vehicles (UAVs) are increasingly deployed in autonomous missions across complex, cluttered environments where reliable obstacle avoidance is critical. Traditional navigation frameworks rely on modular pipelines—separating perception, mapping, planning, and control—which often suffer from error accumulation, high computational overhead, and poor reactivity in dynamic scenarios. To address these limitations, this thesis investigates an end-to-end deep reinforcement learning (DRL) framework for real-time UAV obstacle avoidance using onboard depth sensing.

We compare two state-of-the-art DRL algorithms, Proximal Policy Optimization (PPO) and Twin Delayed DDPG (TD3), in a continuous control setting, evaluating their training dynamics and performance in diverse simulated environments. Our initial experiments highlight key failure modes such as collisions with overhead obstacles and dead-end traps, caused by the policy's limited temporal awareness. To overcome these, we propose a neural architecture that incorporates both a pretrained ResNet8-based depth encoder and two temporal reasoning mechanisms: (1) an LSTM module for recurrent memory, and (2) a stacked buffer of recent depth observations. This temporal augmentation allows the agent to recover from occlusions and partial observability, significantly improving navigation robustness.

Trained in a curriculum-based Gym-PyBullet-Drones environment, our final memory-based policy achieves a 96% success rate across randomized 3D obstacle courses and outperforms EGO-Planner-v2 in both success rate and adaptability. The results demonstrate that DRL policies with temporal context can match or exceed the performance of traditional planning pipelines while offering greater generalization and simplicity in deployment.
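The second temporal mechanism mentioned above, a stacked buffer of recent depth observations, can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual implementation: the frame count, frame resolution, and the `DepthFrameStack` class interface are assumptions made for the example.

```python
from collections import deque
import numpy as np

class DepthFrameStack:
    """Keeps the N most recent depth frames and exposes them as one stacked
    observation, giving a reactive policy short-term temporal context.
    (Illustrative sketch; names and sizes are assumptions.)"""

    def __init__(self, num_frames=4, shape=(64, 64)):
        self.num_frames = num_frames
        self.shape = shape
        self.frames = deque(maxlen=num_frames)

    def reset(self, first_frame):
        # At episode start, fill the buffer with the first frame so the
        # observation shape is constant from the very first step.
        self.frames.clear()
        for _ in range(self.num_frames):
            self.frames.append(first_frame)
        return self.observation()

    def step(self, frame):
        # Appending to a bounded deque drops the oldest frame automatically.
        self.frames.append(frame)
        return self.observation()

    def observation(self):
        # Shape (num_frames, H, W): oldest frame first, newest last.
        return np.stack(list(self.frames), axis=0)
```

A policy consuming this observation sees how obstacles moved across the last few frames, which is what lets the agent react to objects that have just left the camera's field of view (e.g. an overhead beam it is passing under).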


Universitat de Girona. Institut de Recerca en Visió per Computador i Robòtica

Supervisor: Vasiljević, Goran
Manen, Benjamin van
Author: Loc Pham, Thanh
Date: 2025
Format: application/pdf
Document access: http://hdl.handle.net/10256/28367
Language: eng
Publisher: Universitat de Girona. Institut de Recerca en Visió per Computador i Robòtica
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International
Rights URI: http://creativecommons.org/licenses/by-nc-nd/4.0/
Subject: DRL (Deep Reinforcement Learning)
Machine learning
Deep learning (Machine learning)
Autonomous aerial vehicles
UAV (Unmanned Aerial Vehicle)
Drone aircraft
Robots -- Navigation systems
Obstacle avoidance
Title: Deep Reinforcement Learning for drone obstacle avoidance
Type: info:eu-repo/semantics/masterThesis
Repository: DUGiDocs
