Semi-online neural-Q_learning for real-time robot learning

Reinforcement learning (RL) is well suited to robot learning because it can learn in unknown environments and in real time. The main difficulties in adapting classic RL algorithms to robotic systems are the generalization problem and the correct observation of the Markovian state. This paper addresses the generalization problem by proposing the semi-online neural-Q_learning algorithm (SONQL). The algorithm uses the classic Q_learning technique with two modifications. First, a neural network (NN) approximates the Q_function, allowing the use of continuous states and actions. Second, a database of the most representative learning samples accelerates and stabilizes convergence. The term semi-online refers to the fact that the algorithm uses both current and past learning samples. Nevertheless, the algorithm can learn in real time while the robot interacts with the environment. The paper presents simulated results on the "mountain-car" benchmark as well as real results with an underwater robot performing a target-following behavior.
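
The two modifications named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `TinyQNet` and `SemiOnlineQLearner`, the network size, the learning rate, and the rule that a new sample replaces stored samples closer than `min_dist` in (state, action) space are all assumptions made for illustration. For brevity the sketch maximizes over a small discrete action set, whereas the abstract states that SONQL handles continuous actions.

```python
import numpy as np


class TinyQNet:
    """One-hidden-layer network mapping a (state, action) vector to a scalar Q estimate."""

    def __init__(self, n_in, n_hidden=16, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, n_hidden)
        self.b2 = 0.0
        self.lr = lr

    def predict(self, x):
        h = np.tanh(self.W1 @ x + self.b1)
        return float(self.W2 @ h + self.b2)

    def train(self, x, target):
        # One gradient step on the squared error 0.5 * (Q(x) - target)^2.
        h = np.tanh(self.W1 @ x + self.b1)
        err = (self.W2 @ h + self.b2) - target
        grad_pre = err * self.W2 * (1.0 - h ** 2)  # back-propagate through tanh
        self.W2 -= self.lr * err * h
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(grad_pre, x)
        self.b1 -= self.lr * grad_pre
        return float(err)


class SemiOnlineQLearner:
    """Q-learning with an NN approximator and a database of past samples (hypothetical sketch)."""

    def __init__(self, state_dim, actions, gamma=0.95, min_dist=0.1, max_samples=200):
        self.actions = list(actions)
        self.gamma = gamma
        self.min_dist = min_dist        # assumed replacement threshold
        self.max_samples = max_samples  # assumed cap on database size
        self.net = TinyQNet(state_dim + 1)
        self.database = []              # entries are (s, a, r, s_next)

    def q(self, s, a):
        return self.net.predict(np.append(s, a))

    def greedy_action(self, s):
        return max(self.actions, key=lambda a: self.q(s, a))

    def _store(self, sample):
        # Assumed rule: a new sample replaces stored samples that lie close to it,
        # so the database keeps a sparse, representative set.
        key = np.append(sample[0], sample[1])
        self.database = [
            d for d in self.database
            if np.linalg.norm(np.append(d[0], d[1]) - key) >= self.min_dist
        ]
        self.database.append(sample)
        self.database = self.database[-self.max_samples:]

    def update(self, s, a, r, s_next):
        # "Semi-online": the current sample is stored, then the network is trained
        # on the current sample together with the stored past samples.
        self._store((np.asarray(s, float), float(a), float(r), np.asarray(s_next, float)))
        for ds, da, dr, dn in self.database:
            target = dr + self.gamma * max(self.q(dn, b) for b in self.actions)
            self.net.train(np.append(ds, da), target)


agent = SemiOnlineQLearner(state_dim=1, actions=[-1.0, 0.0, 1.0])
agent.update([0.0], 1.0, 0.0, [0.1])
agent.update([0.01], 1.0, 0.0, [0.11])  # close to the first sample, so it replaces it
agent.update([0.5], -1.0, 1.0, [0.4])   # far from the others, so it is added
print(len(agent.database))
```

Calling `update` once per robot step keeps the learning loop incremental while every step still revisits the stored samples, which is the sense in which the abstract describes the method as semi-online.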

© IEEE/RSJ International Conference on Intelligent Robots and Systems : IROS 2003 : Proceedings, 2003, vol. 1, p. 662-667

IEEE

Author: Carreras Pérez, Marc
Ridao Rodríguez, Pere
El-Fakdi Sencianes, Andrés
Date: 2003
Format: application/pdf
Citation: Carreras, M., Ridao, P., and El-Fakdi, A. (2003). Semi-online neural-Q_learning for real-time robot learning. IEEE/RSJ International Conference on Intelligent Robots and Systems : IROS 2003 : Proceedings, 1, 662-667. Retrieved April 5, 2010, from http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1250705
ISBN: 0-7803-7860-1
Document access: http://hdl.handle.net/10256/2167
Language: eng
Publisher: IEEE
Collection: Digital reproduction of the document published at: http://dx.doi.org/10.1109/IROS.2003.1250705
Published articles (D-ATC)
Is part of: © IEEE/RSJ International Conference on Intelligent Robots and Systems : IROS 2003 : Proceedings, 2003, vol. 1, p. 662-667
Rights: All rights reserved
Subject: Reinforcement learning
Machine learning
Robots
Title: Semi-online neural-Q_learning for real-time robot learning
Type: info:eu-repo/semantics/article
Repository: DUGiDocs
