4.7 Article

Optimizing task scheduling in human-robot collaboration with deep multi-agent reinforcement learning

Journal

JOURNAL OF MANUFACTURING SYSTEMS
Volume 60, Issue -, Pages 487-499

Publisher

ELSEVIER SCI LTD
DOI: 10.1016/j.jmsy.2021.07.015

Keywords

Human-Robot Collaboration; Real-time task scheduling; Multi-agent reinforcement learning

Funding

  1. U.S. National Science Foundation (NSF) Grant [CMMI1853454]


This paper introduces a chessboard setting that simulates decision-making in human-robot collaboration (HRC) assembly processes and optimizes completion time through a Markov game model. A deep-Q-network (DQN) based multi-agent reinforcement learning method is applied to improve scheduling efficiency and is compared with other approaches; a case study demonstrates its effectiveness.
Human-Robot Collaboration (HRC) presents an opportunity to improve the efficiency of manufacturing processes. However, existing task planning approaches for HRC are still limited in many ways; e.g., co-robot encoding must rely on experts' knowledge, and real-time task scheduling is applicable only within small state-action spaces or simplified problem settings. In this paper, the HRC assembly working process is cast as a novel chessboard setting, in which the selection of a chess piece move serves as an analogy for the decision making by both humans and robots in the HRC assembly working process. To optimize the completion time, a Markov game model is considered, which takes the task structure and the agent status as the state input and the overall completion time as the reward. Without experts' knowledge, this game model is capable of seeking a correlated equilibrium policy among agents, with convergence, when making real-time decisions in a dynamic environment. To improve the efficiency of finding an optimal task scheduling policy, a deep-Q-network (DQN) based multi-agent reinforcement learning (MARL) method is applied and compared with Nash-Q learning, dynamic programming, and the DQN-based single-agent reinforcement learning method. A height-adjustable desk assembly is used as a case study to demonstrate the effectiveness of the proposed algorithm with different numbers of tasks and agents.
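
To make the state and objective described above more concrete, here is a minimal Python sketch. It is not the paper's DQN-based MARL method or its chessboard encoding. It is a small exhaustive dynamic-programming search, in the spirit of the dynamic programming baseline the abstract mentions, over a state made of the completed tasks and each agent's status, and it minimizes the overall completion time. The task names, precedence relations, durations, and the function name `min_completion_time` are all hypothetical and chosen only for illustration.

```python
from functools import lru_cache
from itertools import product

# Hypothetical toy instance (not from the paper): task -> prerequisite tasks.
PRECEDENCE = {"A": (), "B": (), "C": ("A", "B")}

# Hypothetical durations in time units, indexed as DURATION[agent][task].
DURATION = {
    "human": {"A": 2, "B": 3, "C": 2},
    "robot": {"A": 3, "B": 2, "C": 4},
}
AGENTS = ("human", "robot")


@lru_cache(maxsize=None)
def min_completion_time(done=frozenset(), running=(None, None)):
    """Return the minimum remaining time to finish every task.

    State = (set of completed tasks, per-agent status), where an agent's
    status is None when idle or (task, remaining_time) when busy.
    """
    if len(done) == len(PRECEDENCE):
        return 0

    busy = {r[0] for r in running if r is not None}
    # Tasks that are not finished, not in progress, and have all prerequisites done.
    ready = [
        t for t in PRECEDENCE
        if t not in done and t not in busy
        and all(p in done for p in PRECEDENCE[t])
    ]

    # Busy agents keep their task; idle agents may wait or start a ready task.
    options = []
    for i, r in enumerate(running):
        if r is not None:
            options.append([r])
        else:
            agent = AGENTS[i]
            options.append([None] + [(t, DURATION[agent][t]) for t in ready])

    best = float("inf")
    for joint in product(*options):
        tasks = [r[0] for r in joint if r is not None]
        # Skip joint actions where everyone idles or two agents pick the same task.
        if not tasks or len(tasks) != len(set(tasks)):
            continue
        # Advance time to the next task completion.
        dt = min(r[1] for r in joint if r is not None)
        new_done = done | {r[0] for r in joint if r is not None and r[1] == dt}
        new_running = tuple(
            None if r is None or r[1] == dt else (r[0], r[1] - dt)
            for r in joint
        )
        best = min(best, dt + min_completion_time(new_done, new_running))
    return best


if __name__ == "__main__":
    print(min_completion_time())
```

On this toy instance the search assigns A to the human and B to the robot in parallel, then C to the human. The number of states grows quickly with the number of tasks and agents, which is the kind of scaling limit that motivates a learned policy in the abstract.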
