4.8 Article

Belief state representation in the dopamine system

Journal

NATURE COMMUNICATIONS
Volume 9, Issue -, Pages -

Publisher

NATURE PUBLISHING GROUP
DOI: 10.1038/s41467-018-04397-0

Funding

  1. National Institutes of Health [R01MH095953, R01MH101207, R01MH109177]
  2. Harvard Mind Brain and Behavior faculty grant
  3. Fondation pour la Recherche Medicale grant [SPE20150331860]

Learning to predict future outcomes is critical for driving appropriate behaviors. Reinforcement learning (RL) models have successfully accounted for such learning, relying on reward prediction errors (RPEs) signaled by midbrain dopamine neurons. It has been proposed that when sensory data provide only ambiguous information about which state an animal is in, it can predict reward based on a set of probabilities assigned to hypothetical states (called the belief state). Here we examine how dopamine RPEs and subsequent learning are regulated under state uncertainty. Mice are first trained in a task with two potential states defined by different reward amounts. During testing, intermediate-sized rewards are given in rare trials. Dopamine activity is a non-monotonic function of reward size, consistent with RL models operating on belief states. Furthermore, the magnitude of dopamine responses quantitatively predicts changes in behavior. These results establish the critical role of state inference in RL.
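To make the abstract's central claim concrete, here is a toy sketch of how an RPE computed on a belief state can be a non-monotonic function of reward size. It is not the authors' model. It assumes two states with Gaussian-distributed reward sizes, and all names and numbers (`mu_small`, `mu_big`, `sigma`, `prior_big`) are illustrative assumptions rather than values from the paper.

```python
import math


def belief_big(reward, mu_small=1.0, mu_big=10.0, sigma=2.0, prior_big=0.5):
    """Posterior probability of the 'big reward' state given an observed reward.

    Assumption (not from the paper): reward size is perceived with Gaussian
    noise of standard deviation `sigma` around each state's typical reward.
    Intended for rewards near the [mu_small, mu_big] range; far outside it,
    both likelihoods can underflow to zero.
    """
    def likelihood(mu):
        return math.exp(-0.5 * ((reward - mu) / sigma) ** 2)

    w_big = prior_big * likelihood(mu_big)
    w_small = (1.0 - prior_big) * likelihood(mu_small)
    return w_big / (w_big + w_small)


def belief_state_rpe(reward, mu_small=1.0, mu_big=10.0, sigma=2.0):
    """Reward prediction error relative to the belief-weighted expected reward."""
    b = belief_big(reward, mu_small, mu_big, sigma)
    expected = (1.0 - b) * mu_small + b * mu_big
    return reward - expected


if __name__ == "__main__":
    # Sweep reward sizes between the two trained values.
    for r in [1.0, 3.0, 5.5, 8.0, 10.0]:
        print(f"reward={r:4.1f}  P(big)={belief_big(r):.3f}  RPE={belief_state_rpe(r):+.3f}")
```

Under these assumptions, a reward slightly above the small amount is attributed to the small state and yields a positive error, while a reward slightly below the big amount is attributed to the big state and yields a negative error. The error therefore rises and then falls as reward size increases, instead of growing monotonically.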

Recommended

Article Psychology, Biological

Multi-task reinforcement learning in humans

Momchil S. Tomov, Eric Schulz, Samuel J. Gershman

Summary: The study shows that participants in complex environments tend to map previously learned strategies to new scenarios, with a strategy that combines successor features and generalized policy iteration predicting behavior best.

NATURE HUMAN BEHAVIOUR (2021)

Article Neurosciences

Impulsivity and risk-seeking as Bayesian inference under dopaminergic control

John G. Mikhael, Samuel J. Gershman

Summary: The Bayesian model suggests that dopamine levels play a significant role in influencing the precision of stimulus encoding and contextual information during decision-making processes.

NEUROPSYCHOPHARMACOLOGY (2022)

Article Biochemistry & Molecular Biology

The role of state uncertainty in the dynamics of dopamine

John G. Mikhael, HyungGoo R. Kim, Naoshige Uchida, Samuel J. Gershman

Summary: This study investigates the relationship between basal ganglia dopamine activity and reward prediction errors using reinforcement learning models. The researchers found that in certain conditions, dopamine activity ramps up even after learning. They also validated their model predictions through experiments on mice.

CURRENT BIOLOGY (2022)

Article Psychology

Heuristics From Bounded Meta-Learned Inference

Marcel Binz, Samuel J. Gershman, Eric Schulz, Dominik Endres

Summary: This study proposes a novel computational model (BMI) that explains the discovery and selection of different heuristics, advancing our understanding of heuristic decision-making. The model provides predictions about when each heuristic should be applied and has been verified through empirical studies.

PSYCHOLOGICAL REVIEW (2022)

Article Neurosciences

A gradual temporal shift of dopamine responses mirrors the progression of temporal difference error in machine learning

Ryunosuke Amo, Sara Matias, Akihiro Yamanaka, Kenji F. Tanaka, Naoshige Uchida, Mitsuko Watabe-Uchida

Summary: The authors have discovered that dopamine signals gradually move from the time of reward to the time of cue, similar to the evaluation signals used in temporal difference learning. This finding bridges the gap between computational theories and the brain, and provides fundamental insights into how the brain associates cues and rewards that are separated in time.

NATURE NEUROSCIENCE (2022)

Article Behavioral Sciences

Dopamine Mediates the Bidirectional Update of Interval Timing

Anthony M. Jakob, John G. Mikhael, Allison E. Hamilos, John A. Assad, Samuel J. Gershman

Summary: The role of dopamine as a reward prediction error signal in reinforcement learning tasks has been well-established, and it also affects the speed of subjective time. According to the theory, the timing of dopamine relative to reward delivery determines whether subjective time speeds up or slows down. Reanalyzing measurements of dopaminergic neurons in mice performing a self-timed movement task, it was found that dopamine activity timing could predict changes in subjective time speed.

BEHAVIORAL NEUROSCIENCE (2022)

Article Psychology, Biological

Trait somatic anxiety is associated with reduced directed exploration and underestimation of uncertainty

Haoxue Fan, Samuel J. Gershman, Elizabeth A. Phelps

Summary: Trait somatic anxiety is associated with reduced tendency for exploration, manifested as a lesser likelihood for choosing uncertain options and reducing choice stochasticity. It is also associated with underestimation of relative uncertainty.

NATURE HUMAN BEHAVIOUR (2023)

Review Biology

The molecular memory code and synaptic plasticity: A synthesis

Samuel J. Gershman

Summary: The widely accepted view of memory in the brain suggests that synapses store memory and memories are formed through modification of synapses. An alternative view proposes that molecules within the cell body store memory and memories are formed through biochemical operations on these molecules. This paper presents a computational model that integrates both views by considering synapses as storage sites for probability distribution parameters and intracellular molecules as storage sites for generative model parameters, offering a framework for learning and inference.

BIOSYSTEMS (2023)

Article Physics, Multidisciplinary

Compositional Sequence Generation in the Entorhinal-Hippocampal System

Daniel C. McNamee, Kimberly L. Stachenfeld, Matthew M. Botvinick, Samuel J. Gershman

Summary: Medial entorhinal cortex neurons exhibit multiple periodically organized firing fields, forming an internal representation of space. This grid coding is not limited to this cortex region, but also present in other areas like the prefrontal cortex as a general principle. By applying dynamical systems theory, we showed how grid coding can generate diverse sequential reactivations of hippocampal place cells, corresponding to cognitive map traversals. We expanded this model to describe how multiple dynamical systems synthesis can support compositional cognitive computations.

ENTROPY (2022)

Article Neurosciences

Undermatching Is a Consequence of Policy Compression

Bilal A. Bari, Samuel J. Gershman

Summary: The matching law explains how agents tend to match their choice ratios to the ratios of rewards they receive when faced with multiple options. However, perfect matching is rare: agents often undermatch, biasing their choices toward the poorer option, whereas overmatching is seldom observed. This article proposes that agents not only aim to maximize rewards but also minimize cognitive cost, measured as policy complexity. The theory suggests that capacity-constrained agents can only undermatch or perfectly match, consistent with empirical evidence.

JOURNAL OF NEUROSCIENCE (2023)

Article Multidisciplinary Sciences

Visual motion perception as online hierarchical inference

Johannes Bill, Samuel J. Gershman, Jan Drugowitsch

Summary: This article presents a theory of how the human brain infers motion relations in real time and offers a unified explanation for various perceptual phenomena. The proposed online hierarchical Bayesian inference provides a principled solution for this complex perceptual task and explains human percepts for different stimuli. It also makes testable predictions for future psychophysics experiments and motivates targeted neural network implementations.

NATURE COMMUNICATIONS (2022)

Article Multidisciplinary Sciences

Causal implicatures from correlational statements

Samuel J. Gershman, Tomer D. Ullman

Summary: Correlation does not imply causation, but people still tend to infer causality from correlational statements. In Study 1, participants interpreted statements of association to imply causality in one direction. In Studies 2 and 3, participants interpreted statements of increased risk to imply causality in the opposite direction. Therefore, even correlational language can give rise to causal inferences.

PLOS ONE (2023)

Article Multidisciplinary Sciences

Teachers recruit mentalizing regions to represent learners' beliefs

Natalia Velez, Alicia M. Chen, Taylor Burke, Fiery A. Cushman, Samuel J. Gershman

Summary: Teaching allows humans to pass on culturally specific knowledge and skills, but little is known about the neural computations behind teachers' decision-making process. In this study, participants played the role of teachers while undergoing fMRI scans, selecting examples to teach learners how to answer abstract multiple-choice questions. The findings suggest that participants' example selections were guided by a model that maximizes learners' belief in the correct answer. Furthermore, brain regions involved in processing social information were found to track learners' confidence in the correct answer. These results provide insights into the computational and neural mechanisms underlying our remarkable teaching abilities.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2023)

Article Biology

Rate-distortion theory of neural coding and its implications for working memory

Anthony M. V. Jakob, Samuel J. Gershman

Summary: Rate-distortion theory offers a powerful framework for understanding human memory, while neural population coding models can implement this framework and reproduce key regularities of visual working memory.

Article Psychology, Biological

Empowerment contributes to exploration behaviour in a creative video game

Franziska Brandle, Lena J. Stocks, Joshua B. Tenenbaum, Samuel J. Gershman, Eric Schulz

Summary: The authors demonstrate that in a creative environment, people's choices are influenced by the empowerment those choices bring. Previous studies portray individuals as stumbling upon good options by chance, but such accounts may not fully capture the complexity of the exploration strategies that individuals exhibit in more intricate settings. In this study, the authors investigate the behavior of 29,493 players in the online game 'Little Alchemy 2'. They find that players are motivated not only by external rewards but also by an intrinsic desire to create objects that enable them to create even more. Additionally, players' drive for empowerment is absent when playing a game variant lacking recognizable semantics, indicating that individuals use their knowledge of the world and its possibilities to guide their exploration.

NATURE HUMAN BEHAVIOUR (2023)
