Article

Learning human activities and object affordances from RGB-D videos

Journal

INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH
Volume 32, Issue 8, Pages 951-970

Publisher

SAGE PUBLICATIONS LTD
DOI: 10.1177/0278364913478446

Keywords

3D perception; human activity detection; object affordance; supervised learning; spatio-temporal context; personal robots

Funding

  1. ARO [W911NF-12-1-0267]
  2. Microsoft
  3. Alfred P. Sloan Research Fellowship

Abstract

Understanding human activities and object affordances are two very important skills, especially for personal robots operating in human environments. In this work, we consider the problem of extracting a descriptive labeling of the sequence of sub-activities being performed by a human and, more importantly, of their interactions with objects in the form of associated affordances. Given an RGB-D video, we jointly model the human activities and object affordances as a Markov random field, where the nodes represent objects and sub-activities, and the edges represent the relationships between object affordances, their relations with sub-activities, and their evolution over time. We formulate the learning problem using a structural support vector machine (SSVM) approach, where labelings over various alternate temporal segmentations are treated as latent variables. We tested our method on a challenging dataset comprising 120 activity videos collected from 4 subjects, and obtained an accuracy of 79.4% for affordance labeling, 63.4% for sub-activity labeling, and 75.0% for high-level activity labeling. We then demonstrate the use of such descriptive labeling in performing assistive tasks with a PR2 robot.
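The abstract describes a joint model whose score is a sum of node potentials (object affordance and sub-activity labels per temporal segment) and edge potentials (object-activity relations within a segment and temporal transitions across segments), with weights learned by a structural SVM. The following is a minimal illustrative sketch of such a scoring function w · ψ(x, y) for one candidate segmentation and labeling; the label sets, feature dimensions, and random weights below are hypothetical placeholders, not the authors' actual features or learned parameters.

```python
"""Illustrative sketch (not the authors' code) of an MRF-style scoring
function over a temporally segmented RGB-D video, as outlined in the
abstract. All features and weights are random stand-ins."""
import numpy as np

AFFORDANCES = ["reachable", "movable", "pourable", "drinkable", "stationary"]
SUB_ACTIVITIES = ["reaching", "moving", "pouring", "drinking", "null"]

rng = np.random.default_rng(0)
D_NODE, D_EDGE = 8, 6  # assumed feature dimensions

# One weight vector per label (node terms) and per label pair (edge terms);
# in the paper these would be learned jointly by the SSVM.
w_obj = {a: rng.normal(size=D_NODE) for a in AFFORDANCES}
w_act = {s: rng.normal(size=D_NODE) for s in SUB_ACTIVITIES}
w_obj_act = {(a, s): rng.normal(size=D_EDGE)
             for a in AFFORDANCES for s in SUB_ACTIVITIES}
w_temporal = {(s1, s2): rng.normal(size=D_EDGE)
              for s1 in SUB_ACTIVITIES for s2 in SUB_ACTIVITIES}


def segment_score(obj_feats, act_feat, obj_labels, act_label, edge_feats):
    """Node potentials plus object-activity edge potentials for one segment."""
    score = w_act[act_label] @ act_feat
    for i, a in enumerate(obj_labels):
        score += w_obj[a] @ obj_feats[i]
        score += w_obj_act[(a, act_label)] @ edge_feats[i]
    return score


def video_score(segments, labeling, temporal_feats):
    """Total score w . psi(x, y) for one candidate segmentation and labeling."""
    total, prev_act = 0.0, None
    for t, (seg, (obj_labels, act_label)) in enumerate(zip(segments, labeling)):
        total += segment_score(seg["obj_feats"], seg["act_feat"],
                               obj_labels, act_label, seg["edge_feats"])
        if prev_act is not None:  # temporal edge between consecutive segments
            total += w_temporal[(prev_act, act_label)] @ temporal_feats[t - 1]
        prev_act = act_label
    return total


# Tiny synthetic example: two temporal segments, two objects in the scene.
segments = [{"obj_feats": rng.normal(size=(2, D_NODE)),
             "act_feat": rng.normal(size=D_NODE),
             "edge_feats": rng.normal(size=(2, D_EDGE))} for _ in range(2)]
temporal_feats = [rng.normal(size=D_EDGE)]
labeling = [(["reachable", "stationary"], "reaching"),
            (["movable", "stationary"], "moving")]
print("score:", video_score(segments, labeling, temporal_feats))
```

In the actual approach, inference would search over candidate labelings (and over alternate temporal segmentations, treated as latent variables) to maximize such a score; the sketch only evaluates one fixed labeling.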

