Article

TEXPLORE: real-time sample-efficient reinforcement learning for robots

Journal

MACHINE LEARNING
Volume 90, Issue 3, Pages 385-429

Publisher

SPRINGER
DOI: 10.1007/s10994-012-5322-7

Keywords

Reinforcement learning; Robotics; MDP; Real-time

Funding

  1. National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Computer and Network Systems [1305287]
  2. National Science Foundation, Directorate for Computer & Information Science & Engineering, Division of Information & Intelligent Systems [0917122]

The use of robots in society could be expanded by using reinforcement learning (RL) to allow robots to learn and adapt to new situations online. RL is a paradigm for learning sequential decision-making tasks, usually formulated as a Markov Decision Process (MDP). For an RL algorithm to be practical for robotic control tasks, it must learn in very few samples, while continually taking actions in real time. In addition, the algorithm must learn efficiently in the face of noise, sensor/actuator delays, and continuous state features. In this article, we present TEXPLORE, the first algorithm to address all of these challenges together. TEXPLORE is a model-based RL method that learns a random forest model of the domain which generalizes dynamics to unseen states. The agent explores states that are promising for the final policy, while ignoring states that do not appear promising. With sample-based planning and a novel parallel architecture, TEXPLORE can select actions continually in real time whenever necessary. We empirically evaluate the importance of each component of TEXPLORE in isolation and then demonstrate the complete algorithm learning to control the velocity of an autonomous vehicle in real time.
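To make the phrase "model-based RL with sample-based planning" concrete, the following is a minimal sketch and not the authors' implementation. It makes several assumptions that do not come from the abstract: a hypothetical five-state chain environment (`env_step`), a simple table of observed outcomes (`SampleModel`) standing in for the random forest model, and plain Monte Carlo rollouts with a random rollout policy (`plan`) standing in for whatever planner the paper actually uses. The parallel architecture and the exploration strategy are not modeled here.

```python
import random
from collections import defaultdict

N_STATES = 5  # hypothetical chain: states 0..4, reward for arriving in state 4


def env_step(state, action):
    """Toy environment (assumption): action 1 moves right, action 0 moves left."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


class SampleModel:
    """Stores observed outcomes per (state, action); a stand-in for a learned model."""

    def __init__(self):
        self.outcomes = defaultdict(list)

    def update(self, state, action, nxt, reward):
        self.outcomes[(state, action)].append((nxt, reward))

    def sample(self, state, action, rng):
        seen = self.outcomes.get((state, action))
        if not seen:
            # Unseen pair: predict "nothing changes, no reward" (an assumption).
            return state, 0.0
        return rng.choice(seen)


def plan(model, state, rng, n_rollouts=200, depth=6, gamma=0.9):
    """Return the first action with the highest average sampled discounted return."""
    best_action, best_value = None, float("-inf")
    for first_action in (0, 1):
        total = 0.0
        for _ in range(n_rollouts):
            s, a, discount, ret = state, first_action, 1.0, 0.0
            for _ in range(depth):
                s, r = model.sample(s, a, rng)  # simulate one step in the model
                ret += discount * r
                discount *= gamma
                a = rng.choice((0, 1))  # random rollout policy after the first step
            total += ret
        value = total / n_rollouts
        if value > best_value:
            best_action, best_value = first_action, value
    return best_action


# Usage: learn the model from one pass over real transitions, then plan in it.
model = SampleModel()
for s in range(N_STATES):
    for a in (0, 1):
        model.update(s, a, *env_step(s, a))

rng = random.Random(0)
action = plan(model, 3, rng)  # plan from state 3 using only the learned model
```

The point of the sketch is the separation of concerns: real experience only updates the model, and action selection queries the model by sampling rather than by touching the environment. A generalizing model such as the random forest described in the abstract would replace the lookup table so that predictions extend to state-action pairs never visited.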
