Article

Deep Reinforcement Learning for Multiobjective Optimization

Journal

IEEE TRANSACTIONS ON CYBERNETICS
Volume 51, Issue 6, Pages 3103-3114

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TCYB.2020.2977661

Keywords

Optimization; Training; Urban areas; Neural networks; Reinforcement learning; Traveling salesman problems; Modeling; Deep reinforcement learning (DRL); multiobjective optimization; Pointer Network; traveling salesman problem

Funding

  1. National Natural Science Foundation of China [61773390, 71571187]

This article proposes DRL-MOA, an end-to-end framework for solving multiobjective optimization problems: it decomposes the problem into scalar subproblems, models each subproblem as a neural network, and trains the network parameters collaboratively. Experimental results show that the method generalizes well and solves problems quickly.
This article proposes an end-to-end framework for solving multiobjective optimization problems (MOPs) using deep reinforcement learning (DRL), which we call the DRL-based multiobjective optimization algorithm (DRL-MOA). The idea of decomposition is adopted to decompose the MOP into a set of scalar optimization subproblems. Then, each subproblem is modeled as a neural network. Model parameters of all the subproblems are optimized collaboratively according to a neighborhood-based parameter-transfer strategy and the DRL training algorithm. Pareto-optimal solutions can be directly obtained through the trained neural-network models. Specifically, the multiobjective traveling salesman problem (MOTSP) is solved in this article using the DRL-MOA method by modeling the subproblem as a Pointer Network. Extensive experiments have been conducted to study the DRL-MOA, and it is compared with various benchmark methods. It is found that once the trained model is available, it can scale to newly encountered problems with no need for retraining. The solutions can be directly obtained by a simple forward calculation of the neural network; thus, no iteration is required and the MOP can always be solved in a reasonable time. The proposed method provides a new way of solving the MOP by means of DRL. It shows a set of new characteristics, for example, strong generalization ability and fast solving speed in comparison with existing methods for multiobjective optimization. The experimental results show the effectiveness and competitiveness of the proposed method in terms of model performance and running time.
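
The decomposition and parameter-transfer ideas described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which scalarization is used, so the weighted sum here is an assumption, as are the bi-objective setting, the function names, and the `train_one` callback, which is a hypothetical stand-in for one round of DRL training of a subproblem's network. Warm-starting each subproblem from the previous one is only one possible reading of the "neighborhood-based parameter-transfer strategy".

```python
import math


def weight_vectors(n):
    """Return n evenly spread weights (w1, w2) with w1 + w2 = 1 (bi-objective case)."""
    if n == 1:
        return [(0.5, 0.5)]
    return [(i / (n - 1), 1 - i / (n - 1)) for i in range(n)]


def tour_length(tour, coords):
    """Length of the closed tour visiting 2-D city coordinates in the given order."""
    return sum(
        math.dist(coords[tour[i]], coords[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def scalar_cost(tour, coords_a, coords_b, w):
    """Weighted-sum scalarization (an assumption) of two tour-length objectives."""
    return w[0] * tour_length(tour, coords_a) + w[1] * tour_length(tour, coords_b)


def train_subproblems(weights, init_params, train_one):
    """Train subproblems in order, warm-starting each from its neighbor's parameters.

    train_one(params, w) is a hypothetical callback that returns the updated
    parameters for the subproblem with weight vector w.
    """
    params = init_params
    trained = []
    for w in weights:
        params = train_one(params, w)  # start from the previous subproblem's result
        trained.append(params)
    return trained
```

Each weight vector defines one scalar subproblem, so evaluating the trained models across all weight vectors yields one candidate solution per subproblem, which together approximate the Pareto front.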

