Article
Mathematics, Applied
Nicole Baeuerle
Summary: This article investigates mean-field control problems in discrete time with discounted reward, an infinite time horizon, and compact state and action spaces. The existence of optimal policies is proven, and the limiting mean-field problem is derived as the number of individuals approaches infinity. The average reward problem is also considered, and it is shown that the optimal policy of the mean-field limit is ε-optimal for the discounted problem when the number of individuals is large and the discount factor is close to one. This result makes it possible to obtain an average reward optimal policy in problems where the reward depends only on the distribution of individuals: one first computes an optimal measure from a static optimization problem and then attains it with Markov chain Monte Carlo methods. Two applications, congestion avoidance on a graph and optimal positioning on a marketplace, are solved explicitly.
APPLIED MATHEMATICS AND OPTIMIZATION
(2023)
Article
Statistics & Probability
Peng Liao, Zhengling Qi, Runzhe Wan, Predrag Klasnja, Susan A. Murphy
Summary: This study focuses on the batch (offline) policy learning problem in infinite-horizon Markov decision processes and proposes a doubly robust estimator of the average reward. An optimization algorithm is also developed to compute the optimal policy within a parameterized stochastic policy class.
ANNALS OF STATISTICS
(2022)
Article
Engineering, Mechanical
Jiannan Yang
Summary: Probabilistic sensitivity analysis is used to identify influential uncertain inputs for decision-making. A general sensitivity framework is proposed that unifies various sensitivity measures, including Fisher information. The framework is derived analytically, and the sensitivity analysis is reformulated as an eigenvalue problem. Implementation involves two main steps: Monte Carlo type sampling and solving an eigenvalue equation. The resulting eigenvectors guide simultaneous variations of the input parameters, focusing perturbations on the most influential uncertainties. The framework is conceptually simple and provides new sensitivity insights for applied mechanics problems.
PROBABILISTIC ENGINEERING MECHANICS
(2023)
Article
Psychology, Clinical
Chloe Slaney, Adam M. Perkins, Robert Davis, Ian Penton-Voak, Marcus R. Munafo, Conor J. Houghton, Emma S. J. Robinson
Summary: Anhedonia, a core symptom of depression, is poorly understood. This study examined reward motivation and sensitivity in individuals with high and low anhedonia. The results suggest that anhedonia is associated with impairments in decision-making and reward sensitivity.
PSYCHOLOGICAL MEDICINE
(2023)
Article
Green & Sustainable Science & Technology
Pisanu Chuaiwate, Saravut Jaritngam, Pattamad Panedpojaman, Nirut Konkong
Summary: This article investigates the influence of uncertain soil parameters on slope stability problems using a probabilistic method. The inherent spatial variability of soil properties, and its impact on the slope safety factor, is found to be the dominant source of slope instability. The study combines randomly sampled uncertain parameters with traditional analysis, using Bishop's simplified method and Monte Carlo simulation to determine the minimum safety factor and critical slip surface of the slope. Based on the results, new soil strength parameters are recommended for construction. The probabilistic analysis also shows that insufficient knowledge of the groundwater level distribution, together with the assumption of a uniform distribution, increases the probability of failure.
Article
Computer Science, Artificial Intelligence
Jianghui Sang, Yongli Wang, Weiping Ding, Zaki Ahmadkhan, Lin Xu
Summary: This paper presents a reward shaping method called HGT, which propagates reward information through hierarchical graph topology to shape potential functions for complex tasks. Compared to cutting-edge RL techniques, HGT achieves faster learning rates in experiments on Atari and Mujoco tasks.
PATTERN RECOGNITION
(2023)
Article
Multidisciplinary Sciences
Ryan Smith, Namik Kirlic, Jennifer L. Stewart, James Touthang, Rayus Kuplicki, Timothy J. McDermott, Samuel Taylor, Sahib S. Khalsa, Martin P. Paulus, Robin L. Aupperle
Summary: Analysis of 1-year follow-up data shows that computational modeling parameters exhibit moderate stability over time and correlate with other clinical indicators. Patients differ from healthy controls in decision uncertainty and emotional conflict.
SCIENTIFIC REPORTS
(2021)
Article
Management
Michal Golan, Nahum Shimkin
Summary: This paper discusses a Markov Decision Process (MDP) model with (sigma, rho)-burstiness constraints over a finite or infinite horizon and explores the corresponding constrained optimization problems. By introducing a recursive form of the constraints, an augmented-state model is proposed that recovers the sufficiency of Markov or stationary policies and allows standard theory to be applied.
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
(2024)
Article
Computer Science, Artificial Intelligence
Daniel A. Melo Moreira, Karina Valdivia Delgado, Leliane Nunes de Barros, Denis Deratani Maua
Summary: This study focuses on finding optimal policies for Markov Decision Processes by optimizing a risk-sensitive non-linear cumulative cost function. Two algorithms were developed to improve efficiency and solve large-scale problems without sacrificing optimality.
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
(2021)
Article
Computer Science, Artificial Intelligence
Jianghui Sang, Yongli Wang
Summary: This study proposes a method called Graph Convolution with Topology Refinement (GTR) for automatic reinforcement learning, which constructs a new latent graph to enhance reward shaping. The most suitable state node is identified using graph entropy, and the original graph is adaptively mapped to a subset of nodes to form a more compact latent graph. GTR uses trainable projection vectors for node feature projection, ensuring consistent interconnections between nodes in the new latent graph.
Article
Engineering, Manufacturing
Li Xia, Luyao Zhang, Peter W. Glynn
Summary: This paper studies the optimization of CVaR in an infinite-horizon discrete-time MDP model. By introducing a pseudo-CVaR metric and deriving a CVaR difference formula and optimality conditions for deterministic policies, the authors develop algorithms for efficient optimization and establish properties of the optimal pseudo-CVaR function. Numerical experiments demonstrate the main results.
PRODUCTION AND OPERATIONS MANAGEMENT
(2023)
Article
Management
Emre Nadar, Mustafa Akan, Laurens Debo, Alan Scheller-Wolf
Summary: This article investigates a single-product remanufacture-to-order system with uncertain quality levels for used items, random procurement lead times, and lost sales. The quality level of used items is only known after acquisition and inspection, with higher-quality items having lower remanufacturing costs. The system is modeled as a Markov decision process, and an optimal policy is sought regarding procurement, demand satisfaction, and remanufacturing. The optimal procurement policy is characterized as a state-dependent noncongestive acquisition strategy, taking into account system congestion. It is also shown that meeting demand with the highest-quality item is always optimal. Extensions of the model to cases with known used-item conditions and remanufacture-to-stock systems are discussed, where the standard push strategy is optimal in the remanufacturing stage.
OPERATIONS RESEARCH
(2023)
Article
Engineering, Industrial
Lauren N. Steimle, Vinayak S. Ahluwalia, Charmee Kamdar, Brian T. Denton
Summary: The article explores decision-making in MDPs with uncertain parameters and introduces new solution methods. Numerical experiments show that the customized implementation significantly outperforms traditional methods and that the variance among model parameters can strongly affect the value of the resulting solution.
Article
Management
Nicole Baeuerle, Alexander Glauner
Summary: This paper investigates risk-sensitive Markov Decision Processes with unbounded cost and finite/infinite planning horizons. By recursively applying static risk measures and making direct assumptions on model data, we derive a Bellman equation and prove the existence of optimal Markov policies. Additionally, our approach unifies results for various well-known risk measures and establishes a connection to distributionally robust MDPs.
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
(2022)
Article
Computer Science, Artificial Intelligence
S. Voorberg, R. Eshuis, W. van Jaarsveld, G. J. van Houtum
Summary: This paper introduces an approach that supports decision makers in balancing information gathering against cost-effective decision making. Using the CMMN modeling notation and Markov Decision Process optimization, it provides decision makers with an optimal information-gathering solution and configures a run-time recommendation tool.
DECISION SUPPORT SYSTEMS
(2021)