Article
Mathematics, Applied
Nicole Baeuerle
Summary: This article investigates mean-field control problems in discrete time with discounted reward, an infinite time horizon, and compact state and action spaces. The existence of optimal policies is proven, and the limiting mean-field problem is derived as the number of individuals approaches infinity. The average reward problem is also considered, and it is shown that the optimal policy of the mean-field limit is ε-optimal for the discounted problem when the number of individuals is large and the discount factor is close to one. This result makes it possible to obtain an average reward optimal policy in problems where the reward depends only on the distribution of individuals: one first computes an optimal measure from a static optimization problem and then attains it with Markov chain Monte Carlo methods. Two applications, congestion avoidance on a graph and optimal positioning on a marketplace, are solved explicitly.
APPLIED MATHEMATICS AND OPTIMIZATION
(2023)
Article
Statistics & Probability
Peng Liao, Zhengling Qi, Runzhe Wan, Predrag Klasnja, Susan A. Murphy
Summary: This study focuses on the batch (offline) policy learning problem in infinite-horizon Markov decision processes and proposes a doubly robust estimator of the average reward. An optimization algorithm is also developed to compute the optimal policy within a parameterized stochastic policy class.
ANNALS OF STATISTICS
(2022)
Article
Engineering, Mechanical
Jiannan Yang
Summary: Probabilistic sensitivity analysis is used to identify influential uncertain inputs for decision-making. A general sensitivity framework is proposed that unifies various sensitivity measures, including Fisher information. The framework is derived analytically, and the sensitivity analysis is reformulated as an eigenvalue problem. Implementation involves two main steps: Monte Carlo type sampling and solving an eigenvalue equation. The resulting eigenvectors guide simultaneous variations of the input parameters, focusing perturbations on the most influential uncertainties. The framework is conceptually simple and provides new sensitivity insights for applied mechanics problems.
PROBABILISTIC ENGINEERING MECHANICS
(2023)
Article
Psychology, Clinical
Chloe Slaney, Adam M. Perkins, Robert Davis, Ian Penton-Voak, Marcus R. Munafo, Conor J. Houghton, Emma S. J. Robinson
Summary: Anhedonia, a core symptom of depression, is poorly understood. This study examined reward motivation and sensitivity in individuals with high and low anhedonia. The results suggest that anhedonia is associated with impairments in decision-making and reward sensitivity.
PSYCHOLOGICAL MEDICINE
(2023)
Article
Green & Sustainable Science & Technology
Pisanu Chuaiwate, Saravut Jaritngam, Pattamad Panedpojaman, Nirut Konkong
Summary: This article investigates the influence of uncertain soil parameters on slope stability problems using a probabilistic method. The inherent spatial variability of soil properties, and its impact on the slope safety factor, is found to be the dominant source of slope instability. The study combines randomly sampled uncertain parameters with traditional analysis, using Bishop's simplified method and Monte Carlo simulation to determine the minimum safety factor and critical slip surface of the slope. Based on the results, new soil strength parameters are recommended for construction. The probabilistic analysis also shows that insufficient knowledge of the groundwater level distribution, together with the assumption of a uniform distribution, increases the probability of failure.
Article
Computer Science, Artificial Intelligence
Jianghui Sang, Yongli Wang, Weiping Ding, Zaki Ahmadkhan, Lin Xu
Summary: This paper presents a reward shaping method called HGT, which propagates reward information through hierarchical graph topology to shape potential functions for complex tasks. Compared to cutting-edge RL techniques, HGT achieves faster learning rates in experiments on Atari and Mujoco tasks.
PATTERN RECOGNITION
(2023)
Article
Multidisciplinary Sciences
Ryan Smith, Namik Kirlic, Jennifer L. Stewart, James Touthang, Rayus Kuplicki, Timothy J. McDermott, Samuel Taylor, Sahib S. Khalsa, Martin P. Paulus, Robin L. Aupperle
Summary: Analysis of 1-year follow-up data shows that computational modeling parameters exhibit moderate stability over time and correlate with other clinical indicators. Patients differ from healthy controls in decision uncertainty and emotional conflict.
SCIENTIFIC REPORTS
(2021)
Article
Management
Michal Golan, Nahum Shimkin
Summary: This paper discusses a Markov Decision Process (MDP) model with (sigma, rho)-burstiness constraints over a finite or infinite horizon and explores the corresponding constrained optimization problems. By introducing a recursive form of the constraints, an augmented-state model is proposed that recovers the sufficiency of Markov or stationary policies and allows standard theory to be applied.
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
(2024)
Article
Computer Science, Artificial Intelligence
Daniel A. Melo Moreira, Karina Valdivia Delgado, Leliane Nunes de Barros, Denis Deratani Maua
Summary: This study focuses on finding optimal policies for Markov Decision Processes by optimizing a risk-sensitive non-linear cumulative cost function. Two algorithms were developed to improve efficiency and solve large-scale problems without sacrificing optimality.
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING
(2021)
Article
Computer Science, Artificial Intelligence
Jianghui Sang, Yongli Wang
Summary: This study proposes a method called Graph Convolution with Topology Refinement (GTR) for automatic reinforcement learning, which constructs a new latent graph to enhance reward shaping. The most suitable state node is identified using graph entropy, and the original graph is adaptively mapped to a subset of nodes to form a more compact latent graph. GTR uses trainable projection vectors for node feature projection, ensuring consistent interconnections between nodes in the new latent graph.
Article
Engineering, Manufacturing
Li Xia, Luyao Zhang, Peter W. Glynn
Summary: This paper studies the optimization of CVaR in an infinite-horizon discrete-time MDP model. By introducing a pseudo-CVaR metric and deriving a CVaR difference formula and optimality conditions for deterministic policies, the authors develop algorithms for efficient optimization and establish properties of the optimal pseudo-CVaR function. Numerical experiments demonstrate the main results.
PRODUCTION AND OPERATIONS MANAGEMENT
(2023)
Article
Management
Emre Nadar, Mustafa Akan, Laurens Debo, Alan Scheller-Wolf
Summary: This article investigates a single-product remanufacture-to-order system with uncertain quality levels for used items, random procurement lead times, and lost sales. The quality level of used items is only known after acquisition and inspection, with higher-quality items having lower remanufacturing costs. The system is modeled as a Markov decision process, and an optimal policy is sought regarding procurement, demand satisfaction, and remanufacturing. The optimal procurement policy is characterized as a state-dependent noncongestive acquisition strategy, taking into account system congestion. It is also shown that meeting demand with the highest-quality item is always optimal. Extensions of the model to cases with known used-item conditions and remanufacture-to-stock systems are discussed, where the standard push strategy is optimal in the remanufacturing stage.
OPERATIONS RESEARCH
(2023)
Article
Engineering, Industrial
Lauren N. Steimle, Vinayak S. Ahluwalia, Charmee Kamdar, Brian T. Denton
Summary: The article explores decision-making in MDPs with uncertain parameters and introduces new solution methods. Numerical experiments show that the customized implementation significantly outperforms traditional methods and that the variance among model parameters can strongly affect the value of the resulting solution.
Article
Management
Nicole Baeuerle, Alexander Glauner
Summary: This paper investigates risk-sensitive Markov Decision Processes with unbounded cost and finite/infinite planning horizons. By recursively applying static risk measures and making direct assumptions on model data, we derive a Bellman equation and prove the existence of optimal Markov policies. Additionally, our approach unifies results for various well-known risk measures and establishes a connection to distributionally robust MDPs.
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH
(2022)
Article
Computer Science, Artificial Intelligence
S. Voorberg, R. Eshuis, W. van Jaarsveld, G. J. van Houtum
Summary: This paper introduces an approach that supports decision makers in balancing information gathering against cost-effective decision making. Using the CMMN modeling notation and Markov Decision Process optimization, it provides decision makers with an optimal information-gathering solution and configures a run-time recommendation tool.
DECISION SUPPORT SYSTEMS
(2021)