Article

Learning the value of information and reward over time when solving exploration-exploitation problems

SCIENTIFIC REPORTS (2017)

Journal

SCIENTIFIC REPORTS

Volume 7

Publisher

NATURE PUBLISHING GROUP

DOI: 10.1038/s41598-017-17237-w

Categories

Multidisciplinary Sciences

Funding

F.R.S.-FNRS grant (Belgium)
NSF CRCNS grant [BCS-1309346]
FWO-Flanders Odysseus II Award [G.OC44.13N]

Abstract

To flexibly adapt to the demands of their environment, animals are constantly exposed to the conflict resulting from having to choose between predictably rewarding familiar options (exploitation) and risky novel options, the value of which essentially consists of obtaining new information about the space of possible rewards (exploration). Despite extensive research, the mechanisms that underlie how animals solve this exploitation-exploration dilemma are still poorly understood. Here, we investigate human decision-making in a gambling task in which the informational value of each trial and the reward potential were separately manipulated. To better characterize the mechanisms that underlie the observed behavioural choices, we introduce a computational model that augments the standard reward-based reinforcement learning formulation by associating a value to information. We find that both reward and information gained during learning influence the balance between exploitation and exploration, and that this influence depends on the reward context. Our results shed light on the mechanisms that underpin decision-making under uncertainty, and suggest new approaches for investigating the exploration-exploitation dilemma throughout the animal kingdom.
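
The abstract describes a model that adds a value of information to standard reward-based reinforcement learning. The Python sketch below illustrates that general idea only; it is not the authors' model. The class name `InfoBonusBandit`, the information term `1 / (1 + count)`, the softmax choice rule, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def softmax(values, temperature=1.0):
    """Convert a list of values into choice probabilities."""
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


class InfoBonusBandit:
    """Hypothetical bandit learner: reward estimate plus an information bonus."""

    def __init__(self, n_options, learning_rate=0.3, info_weight=1.0,
                 temperature=1.0):
        self.q = [0.0] * n_options      # learned reward estimates
        self.counts = [0] * n_options   # how often each option was observed
        self.learning_rate = learning_rate
        self.info_weight = info_weight  # 0.0 gives a purely reward-based learner
        self.temperature = temperature

    def information_value(self, option):
        # Assumed form: less-sampled options carry more information.
        return 1.0 / (1 + self.counts[option])

    def combined_values(self):
        return [
            self.q[i] + self.info_weight * self.information_value(i)
            for i in range(len(self.q))
        ]

    def choice_probabilities(self):
        return softmax(self.combined_values(), self.temperature)

    def choose(self, rng=random):
        options = range(len(self.q))
        return rng.choices(options, weights=self.choice_probabilities())[0]

    def update(self, option, reward):
        # Standard delta-rule update of the reward estimate.
        self.q[option] += self.learning_rate * (reward - self.q[option])
        self.counts[option] += 1
```

With `info_weight` set to 0 the sketch reduces to a plain delta-rule learner, so comparing the two settings shows how an information term shifts choices toward options that have been sampled less often.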

Recommended

Article Automation & Control Systems

Solving hard-exploration problems with counting and replay approach

Bo-Ying Huang, Shi-Chun Tsai

Summary: The reinforcement learning agent has achieved success in Atari 2600 games, but it tends to fall into local optima in complex and challenging environments. To address this issue, a Trajectory Evaluation Module is developed and integrated with count-based exploration and trajectory replay methods. Experiment results show that this module helps the agent successfully pass all levels.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2022)

Article Multidisciplinary Sciences

Sensing prior constraints in deep neural networks for solving exploration geophysical problems

Xinming Wu, Jianwei Ma, Xu Si, Zhengfa Bi, Jiarun Yang, Hui Gao, Dongzi Xie, Zhixiang Guo, Jie Zhang

Summary: One of the key objectives in geophysics is to characterize the subsurface through analyzing and interpreting geophysical field data. Data-driven deep learning methods have potential for simplifying the process but face challenges such as poor generalizability and weak interpretability. This study presents three strategies for imposing domain knowledge constraints on deep neural networks (DNNs) to address these challenges, including generating synthetic training datasets, designing nontrainable custom layers, and implementing prior knowledge as regularization terms in the loss functions.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2023)

Article Computer Science, Information Systems

PCOBL: A Novel Opposition-Based Learning Strategy to Improve Metaheuristics Exploration and Exploitation for Solving Global Optimization Problems

Tapas Si, Debolina Bhattacharya, Somen Nayak, Pericles B. C. Miranda, Utpal Nandi, Saurav Mallik, Ujjwal Maulik, Hong Qin

Summary: This manuscript proposes a novel Opposition-based learning scheme, called PCOBL, to improve the performance of meta-heuristics by maintaining an effective balance between exploration and exploitation. The empirical results demonstrate that PCOBL positively impacts the performance of meta-heuristics, outperforming state-of-the-art algorithms in terms of best-error runs and convergence in most optimization problems. Moreover, the inclusion of PCOBL in the meta-heuristic algorithm has a low impact on its efficiency.

IEEE ACCESS (2023)

Article Psychology, Biological

The role of intolerance of uncertainty when solving the exploration-exploitation dilemma

Angelos-Miltiadis Krypotos, Maryna Alves, Geert Crombez, Johan W. S. Vlaeyen

Summary: When making behavioral decisions, individuals need to balance between exploiting known options or exploring new ones. The relationship between intolerance of uncertainty (IU) and performance in an exploration-exploitation dilemma (EED) task was tested using computational models. The results did not provide strong evidence for a clear relationship between EED and IU, except for the decay rate and the tendency to become paralyzed in the face of uncertainty.

INTERNATIONAL JOURNAL OF PSYCHOPHYSIOLOGY (2022)

Article Computer Science, Theory & Methods

Leveraging transition exploratory bonus for efficient exploration in Hard-Transiting reinforcement learning problems

Shangdong Yang, Huihui Wang, Shaokang Dong, Xingguo Chen

Summary: In reinforcement learning, agents learn policies from spatiotemporal data generated through interaction with the environment. However, the reward signals in the data are often sparse, making policy learning challenging. Prior knowledge of task structure has been used to address this issue, and in this paper, we consider the Hard-Transiting task structure. We propose two novel algorithms for efficient exploration and test them on various tasks.

FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE (2023)

Article Computer Science, Information Systems

Solving combinatorial optimization problems over graphs with BERT-Based Deep Reinforcement Learning

Qi Wang, Kenneth H. Lai, Chunlei Tang

Summary: This study proposes a novel framework (BDRL) that combines BERT and deep reinforcement learning to solve combinatorial optimization problems over graphs. The transformer encoder of BERT is improved to effectively embed the combinatorial optimization graph, and BERT-like training is extended to reinforcement learning using contrastive objectives to acquire self-attention-consistent representations. Hierarchical reinforcement learning is employed to pre-train and fine-tune the model for specific combinatorial optimization problems. The results demonstrate the generalization ability, efficiency, and effectiveness of the proposed framework in multiple tasks.

INFORMATION SCIENCES (2023)

Article Psychology, Biological

Polarity of uncertainty representation during exploration and exploitation in ventromedial prefrontal cortex

Nadescha Trudel, Jacqueline Scholl, Miriam C. Klein-Flugge, Elsa Fouragnan, Lev Tankelevitch, Marco K. Wittmann, Matthew F. S. Rushworth

Summary: In a study conducted by Trudel et al., it was found that the ventromedial prefrontal cortex carries multiple decision variables with varying strength and polarity depending on the behavioral context. Initially, participants tend to select predictors with higher uncertainty, but as time progresses, they shift towards more accurate predictors and avoid uncertain ones. This transition is accompanied by changes in representations of belief uncertainty in the vmPFC.

NATURE HUMAN BEHAVIOUR (2021)

Article Statistics & Probability

NONPARAMETRIC LEARNING FOR IMPULSE CONTROL PROBLEMS-EXPLORATION VS. EXPLOITATION

Soren Christensen, Claudia Strauch

Summary: The paper aims to combine techniques from stochastic control with methods from statistics for stochastic processes to learn the dynamics of the underlying process and control it in a reasonable manner. By studying a long-term average impulse control problem, the authors propose a solution to the exploration-exploitation dilemma and find that it can be based on the convergence rates of estimators for the invariant density.

ANNALS OF APPLIED PROBABILITY (2023)

Article Multidisciplinary Sciences

Humans monitor learning progress in curiosity-driven exploration

Alexandr Ten, Pramod Kaushik, Pierre-Yves Oudeyer, Jacqueline Gottlieb

Summary: Curiosity-driven learning is foundational to human cognition, allowing individuals to autonomously decide what to learn. Computational theories propose competence measures and learning progress as intrinsic utility functions for efficient exploration, with empirical evidence supporting the importance of these concepts in task selection.

NATURE COMMUNICATIONS (2021)

Article Computer Science, Artificial Intelligence

Continuous human learning optimization with enhanced exploitation and exploration

Ling Wang, Yihao Jia, Bowen Huang, Xian Wu, Wenju Zhou, Minrui Fei

Summary: This paper proposes a new continuous HLO variant algorithm, named CHLOEEE, which enhances the exploration and exploitation capabilities by introducing a novel social learning operator. By comparing and analyzing the search behaviors of CHLO variants, the superiority of CHLOEEE algorithm on benchmark problems is validated.

SOFT COMPUTING (2023)

Article Physics, Multidisciplinary

Solving Graph Problems Using Gaussian Boson Sampling

Yu-Hao Deng, Si-Qiu Gong, Yi-Chao Gu, Zhi-Jiong Zhang, Hua-Liang Liu, Hao Su, Hao-Yang Tang, Jia-Min Xu, Meng-Hao Jia, Ming-Cheng Chen, Han-Sen Zhong, Hui Wang, Jiarong Yan, Yi Hu, Jia Huang, Wei-Jun Zhang, Hao Li, Xiao Jiang, Lixing You, Zhen Wang, Li Li, Nai-Le Liu, Chao-Yang Lu, Jian-Wei Pan

Summary: Gaussian boson sampling (GBS) is a protocol for demonstrating quantum computational advantage and is mathematically associated with graph-related and quantum chemistry problems. This study investigates the enhancement of GBS over classical stochastic algorithms on noisy quantum devices in the computationally interesting regime. Experimental results show the presence of GBS enhancement with a large photon-click number and robustness under certain noise, which may stimulate the development of more efficient classical and quantum-inspired algorithms.

PHYSICAL REVIEW LETTERS (2023)

Article Computer Science, Artificial Intelligence

Balancing exploration and exploitation in episodic reinforcement learning

Qihang Chen, Qiwei Zhang, Yunlong Liu

Summary: One of the major challenges in reinforcement learning is the sparse and delayed rewards in episodic tasks. The existing techniques have difficulties in assigning credits to explored transitions or are misled by behavioral policies, leading to sluggish learning efficiency. To address this, we propose an approach called EMR, which combines intrinsic rewards of exploration mechanisms with reward redistribution to balance exploration and exploitation in such tasks.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Automation & Control Systems

Step-Wise Deep Learning Models for Solving Routing Problems

Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang

Summary: The article proposes a novel step-wise scheme to remove visited nodes in each node selection step, addressing the issue of suboptimal policies in routing problems. By applying this scheme, the performance of two deep models is significantly improved, and an approximate step-wise TAM model is introduced to reduce computational complexity.

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS (2021)

Article Computer Science, Artificial Intelligence

An adaptive estimation method with exploration and exploitation modes for non-stationary environments

Kutalmis Coskun, Borahan Tumer

Summary: Modeling and analysis of dynamic systems are crucial for addressing complex real-world problems. This paper proposes a stochastic learning method that can handle non-stationarity and detect changes in stability using a statistical model. Experimental results demonstrate the effectiveness of this method in different types of drifts.

PATTERN RECOGNITION (2022)

Article Anesthesiology

The exploration-exploitation dilemma in pain: an experimental investigation

Angelos-Miltiadis Krypotos, Geert Crombez, Maryna Alves, Nathalie Claes, Johan W. S. Vlaeyen

Summary: This study investigates how individuals solve the exploration-exploitation dilemma when facing pain and finds that participants tend to choose the safest option, prioritize rewards over pain, and are more inclined to explore after experiencing pain.

PAIN (2022)

Correction Multidisciplinary Sciences

Learning the value of information and reward over time when solving exploration-exploitation problems (vol 7, 2017)

Irene Cogliati Dezza, Angela J. Yu, Axel Cleeremans, William Alexander

SCIENTIFIC REPORTS (2018)

Article Clinical Neurology

Functional and structural balances of homologous sensorimotor regions in multiple sclerosis fatigue

I. Cogliati Dezza, G. Zito, L. Tomasevic, M. M. Filippi, A. Ghazaryan, C. Porcaro, R. Squitti, M. Ventriglia, D. Lupoi, F. Tecchio

JOURNAL OF NEUROLOGY (2015)

Article Psychiatry

Distinct motivations to seek out information in healthy individuals and problem gamblers

Irene Cogliati Dezza, Xavier Noel, Axel Cleeremans, Angela J. Yu

Summary: This study uses a novel decision-making task and computational model to investigate the motivations driving information-seeking behavior in healthy individuals and problem gamblers. The results suggest that healthy subjects and problem gamblers have distinct information-seeking modes, with healthy individuals being more motivated by novelty-seeking and problem gamblers showing a preference for accumulating knowledge. These findings have important implications for the diagnosis and treatment of behavioral addiction.

TRANSLATIONAL PSYCHIATRY (2021)

Article Multidisciplinary Sciences

Anxiety increases information-seeking in response to large changes

Caroline J. Charpentier, Irene Cogliati Dezza, Valentina Vellani, Laura K. Globig, Maria Gaedeke, Tali Sharot

Summary: Anxiety does not lead to a general increase in information-seeking, but rather increases it when there are large changes in the environment. This suggests that greater information-seeking in anxious individuals in changing environments may be an adaptive compensatory mechanism.

SCIENTIFIC REPORTS (2022)

Article Biology

Independent and interacting value systems for reward and information in the human brain

Irene Cogliati Dezza, Axel Cleeremans, William H. Alexander, David Badre

Summary: This study uses computational modeling, model-based functional magnetic resonance imaging analysis, and a novel experimental paradigm to identify a dedicated and independent value system for information in the human PFC. The results provide empirical evidence for PFC as an optimizer of independent information and reward signals during decision-making.

ELIFE (2022)

Article Psychology, Experimental

People adaptively use information to improve their internal states and external outcomes

I. Cogliati Dezza, C. Maher, T. Sharot

Summary: This study demonstrates that people can accurately predict the impact of information on their internal states and external outcomes, and use these predictions to guide their information-seeking choices. Participants achieve happiness, certainty and make better decisions when they seek information that aligns with their expectations.

COGNITION (2022)

Article Multidisciplinary Sciences

Multifaceted information-seeking motives in children

Gaia Molinaro, Irene Cogliati Dezza, Sarah Katharina Buehler, Christina Moutsiana, Tali Sharot

Summary: From a young age, children need to gather information to understand their environment. This study examines the developmental trajectories of diverse information-seeking motives in children, finding that school-age children integrate factors such as reducing uncertainty, directing action, and positive outcomes into their information-seeking choices. The study suggests that motives related to usefulness and uncertainty reduction become stronger with age, while seeking positive news remains relatively constant throughout development.

NATURE COMMUNICATIONS (2023)

Meeting Abstract Neurosciences

Anxiety Selectively Increases Information-Seeking in Response to Large Changes

Valentina Vellani, Caroline Charpentier, Irene Cogliati Dezza, Laura K. Globig, Maria Gadeke, Tali Sharot

BIOLOGICAL PSYCHIATRY (2022)

Article Psychology, Experimental

Should We Control? The Interplay Between Cognitive Control and Information Integration in the Resolution of the Exploration-Exploitation Dilemma

Irene Cogliati Dezza, Axel Cleeremans, William Alexander

JOURNAL OF EXPERIMENTAL PSYCHOLOGY-GENERAL (2019)

Article Psychology, Multidisciplinary

Baby schema in human and animal faces induces cuteness perception and gaze allocation in children

Marta Borgi, Irene Cogliati-Dezza, Victoria Brelsford, Kerstin Meints, Francesca Cirulli

FRONTIERS IN PSYCHOLOGY (2014)
