Article

A method for model selection using reinforcement learning when viewing design as a sequential decision process

Journal

STRUCTURAL AND MULTIDISCIPLINARY OPTIMIZATION
Volume 59, Issue 5, Pages 1521-1542

Publisher

SPRINGER
DOI: 10.1007/s00158-018-2145-6

Keywords

Reinforcement learning; Tradespace; Decision making under uncertainty; Sequential decision process; Design; Multi-fidelity

Funding

  1. National Science Foundation (NSF) under grant CMMI-1455444
  2. College of Engineering at Pennsylvania State University

In an emerging paradigm, design is viewed as a sequential decision process (SDP) in which mathematical models of increasing fidelity are used in sequence to systematically contract sets of design alternatives. The key idea behind the SDP is to sequence models of increasing fidelity to provide sequentially tighter bounds on the decision criteria, thereby removing inefficient designs from the tradespace with the guarantee that each antecedent model removes only design solutions that are dominated when analyzed using the more detailed, high-fidelity model. In general, efficiency in the SDP is achieved by using less expensive (low-fidelity) models early in the design process and high-fidelity models later in the process. However, the set of multi-fidelity models and discrete decision states gives rise to a combinatorially large number of possible model sequences, some of which require significantly fewer model evaluations than others. Unfortunately, the optimal modeling policy cannot be determined at the outset of the SDP because the computational costs of executing all models on all designs and the discriminatory power of the resulting bounds are unknown. In this paper, the model selection problem is formulated as a finite Markov decision process (MDP), and an online reinforcement learning (RL) algorithm, namely Q-learning, is used to obtain and follow an approximately optimal modeling policy, thereby overcoming the optimal-modeling-policy limitation of the current SDP. The outcome is a Reinforcement Learning-based Design (RL-D) methodology able to learn efficient sequencing of models from sample estimates of the computational cost and discriminatory power of different models while analyzing design alternatives in the tradespace throughout the design process. Through application to two different design examples, the RL-D is shown to (1) effectively identify the approximately optimal modeling policy and (2) efficiently converge upon a choice set.
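The abstract describes casting model selection as a finite MDP and learning a modeling policy online with Q-learning. The sketch below is a minimal tabular illustration of that idea under invented assumptions, not the authors' RL-D implementation: the state is simply the number of design alternatives remaining, the actions are which fidelity model to run next, and the reward is the negative evaluation cost, so maximizing return minimizes total computational cost. The model names, costs, pruning fractions, tradespace size, and stopping threshold are all hypothetical placeholders.

```python
import random
from collections import defaultdict

# Hypothetical three-model suite: (name, cost per design evaluation,
# assumed fraction of the remaining non-choice designs pruned per pass).
# These numbers are illustrative placeholders, not values from the paper.
MODELS = [("low_fi", 1.0, 0.3), ("mid_fi", 5.0, 0.6), ("high_fi", 25.0, 0.9)]
N_DESIGNS = 100   # initial tradespace size (assumed)
TARGET = 5        # stop once contracted to a choice set of this size (assumed)

def step(n_remaining, action):
    """Run one pass of the chosen model over the remaining designs.
    Returns (next_state, reward); reward is the negative evaluation cost."""
    _, cost, power = MODELS[action]
    pruned = max(1, int(power * (n_remaining - TARGET)))
    return max(TARGET, n_remaining - pruned), -cost * n_remaining

def q_learning(episodes=2000, alpha=0.1, gamma=1.0, eps=0.1):
    """Tabular Q-learning over states = number of designs remaining."""
    Q = defaultdict(float)
    n_actions = len(MODELS)
    for _ in range(episodes):
        s = N_DESIGNS
        while s > TARGET:
            # epsilon-greedy choice of which fidelity model to run next
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[(s, i)])
            s_next, r = step(s, a)
            best_next = 0.0 if s_next <= TARGET else max(
                Q[(s_next, i)] for i in range(n_actions))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s_next
    return Q

# Extract and print the greedy modeling policy from the learned Q-table.
Q = q_learning()
s, sequence = N_DESIGNS, []
while s > TARGET:
    a = max(range(len(MODELS)), key=lambda i: Q[(s, i)])
    sequence.append(MODELS[a][0])
    s, _ = step(s, a)
print("learned model sequence:", sequence)
```

With these placeholder costs, the learned greedy policy tends to front-load cheap low-fidelity passes while the tradespace is large and defer expensive high-fidelity passes until few designs remain, mirroring the efficiency argument made in the abstract.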

