Path Planning of Coastal Ships Based on Optimized DQN Reward Function

JOURNAL OF MARINE SCIENCE AND ENGINEERING (2021)

Journal

JOURNAL OF MARINE SCIENCE AND ENGINEERING

Volume 9, Issue 2, Pages -

Publisher

MDPI

DOI: 10.3390/jmse9020210

Keywords

path planning; deep reinforcement learning; decision-making; obstacle avoidance

Funding

National Key R&D Program of China [2018YFB1601502]
LiaoNing Revitalization Talents Program [XLYC1902071]

Abstract

This paper proposes an optimized deep Q network (DQN) algorithm for coastal ship path planning, which improves learning efficiency and convergence speed of the model in navigation.

Path planning is a key issue for coastal ships and a core foundation of intelligent ship development. To better realize ship path planning during navigation, this paper proposes a coastal ship path planning model based on an optimized deep Q network (DQN) algorithm. The model consists mainly of environment status information and the DQN algorithm. The environment status information provides the training space for the DQN algorithm and is quantified according to the actual navigation environment and the international rules for collision avoidance at sea. The DQN algorithm has four main components: the ship state space, the action space, the action exploration strategy, and the reward function. The traditional DQN reward function may lead to low learning efficiency and slow convergence of the model. This paper optimizes the traditional reward function in three ways: (a) a potential energy reward from the target point to the ship is set; (b) a reward area is added near the target point; and (c) a danger area is added near obstacles. With these optimizations, the ship avoids obstacles and reaches the target point faster, and the model converges more quickly. The traditional DQN algorithm, the A* algorithm, the BUG2 algorithm, and the artificial potential field (APF) algorithm are selected for experimental comparison, and the experimental data are analyzed in terms of path length, planning time, and number of path corners. The experimental results show that the optimized DQN algorithm has better stability and convergence and greatly reduces the calculation time. It can plan the optimal path in line with actual navigation rules and improve the safety, economy, and autonomous decision-making ability of ship navigation.
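The three reward-function changes listed in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the grid-cell positions, the name `shaped_reward`, all numeric constants, and in particular the choice to model the potential energy reward as the reduction in distance to the target between two steps.

```python
import math

# All constants are hypothetical values chosen for illustration only;
# none of them come from the paper.
GOAL_REWARD = 10.0          # terminal reward for reaching the target point
COLLISION_PENALTY = -10.0   # terminal penalty for hitting an obstacle
STEP_COST = -0.1            # small cost per move
POTENTIAL_GAIN = 0.5        # weight of the potential energy term (a)
REWARD_RADIUS = 3.0         # size of the reward area around the target (b)
REWARD_AREA_BONUS = 1.0
DANGER_RADIUS = 2.0         # size of the danger area around obstacles (c)
DANGER_AREA_PENALTY = -1.0


def distance(a, b):
    """Euclidean distance between two (x, y) grid positions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def shaped_reward(prev_pos, pos, goal, obstacles):
    """Return a shaped reward for moving from prev_pos to pos.

    `obstacles` is a set of (x, y) cells. This is one possible reading of
    the three optimizations, not the authors' formulation.
    """
    if pos == goal:
        return GOAL_REWARD
    if pos in obstacles:
        return COLLISION_PENALTY

    reward = STEP_COST
    # (a) Potential energy term: positive when the move reduces the
    #     distance to the target, negative when it increases it.
    reward += POTENTIAL_GAIN * (distance(prev_pos, goal) - distance(pos, goal))
    # (b) Reward area: extra bonus once the ship is close to the target.
    if distance(pos, goal) <= REWARD_RADIUS:
        reward += REWARD_AREA_BONUS
    # (c) Danger area: penalty for passing close to any obstacle.
    if any(distance(pos, obs) <= DANGER_RADIUS for obs in obstacles):
        reward += DANGER_AREA_PENALTY
    return reward
```

Compared with a sparse reward that only signals on arrival or collision, a shaped reward of this kind gives the agent feedback on every step, which is the general mechanism by which such shaping can speed up learning.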
