Reinforcement Learning Explains Conditional Cooperation and Its Moody Cousin

PLOS COMPUTATIONAL BIOLOGY (2016)

Journal

PLOS COMPUTATIONAL BIOLOGY

Volume 12, Issue 7, Pages -

Publisher

PUBLIC LIBRARY SCIENCE

DOI: 10.1371/journal.pcbi.1005034

Categories

Biochemical Research Methods; Mathematical & Computational Biology

Funding

Japan Society for the Promotion of Science [13J05086, 15K13111, 25285176]
Japan Science and Technology Agency
Exploratory Research for Advanced Technology
Kawarabayashi Large Graph Project
Grants-in-Aid for Scientific Research [15K13111, 13J05086, 25285176, 15K17262] Funding Source: KAKEN

Abstract

Direct reciprocity, or repeated interaction, is a main mechanism sustaining cooperation in social dilemmas involving two individuals. For larger groups and networks, which are probably more relevant to understanding and engineering our society, experiments employing repeated multiplayer social dilemma games have suggested that humans often show conditional cooperation and its moody variant. The mechanisms underlying these behaviors remain largely unclear. Here we provide a proximate account of this behavior by showing that individuals adopting a type of reinforcement learning, called aspiration learning, phenomenologically behave as conditional cooperators. By definition, individuals are satisfied if and only if the obtained payoff is larger than a fixed aspiration level. They reinforce actions that have resulted in satisfactory outcomes and anti-reinforce those yielding unsatisfactory outcomes. The results obtained in the present study are general in that they explain extant experimental results for both so-called moody and non-moody conditional cooperation, for prisoner's dilemma and public goods games, and for well-mixed groups and networks. Unlike in previous theory, individuals are assumed to have no access to information about what other individuals are doing, so they cannot explicitly use conditional cooperation rules. In this sense, myopic aspiration learning, in which the unconditional propensity to cooperate is modulated in every discrete time step, explains the conditional behavior of humans. Aspiration learners showing (moody) conditional cooperation obeyed a noisy GRIM-like strategy. This differs from Pavlov, a reinforcement learning strategy that promotes mutual cooperation in two-player situations.
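
The reinforce / anti-reinforce rule described in the abstract can be illustrated with a minimal sketch. The abstract does not state the update formula, so the code below assumes a Bush-Mosteller-style rule with a tanh-shaped stimulus. The names `stimulus`, `update_propensity`, `sensitivity`, and the numerical values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def stimulus(payoff, aspiration, sensitivity=1.0):
    """Return a value in (-1, 1): positive if payoff exceeds the aspiration
    level (satisfied), negative otherwise. The tanh shape is an assumption."""
    return math.tanh(sensitivity * (payoff - aspiration))

def update_propensity(p, cooperated, s):
    """Assumed Bush-Mosteller-style update of p, the probability of cooperating.

    A satisfactory outcome (s >= 0) reinforces the action just taken;
    an unsatisfactory one (s < 0) anti-reinforces it.
    """
    if cooperated:
        if s >= 0:
            return p + (1.0 - p) * s   # reinforce cooperation: p moves toward 1
        return p + p * s               # anti-reinforce cooperation: p moves toward 0
    if s >= 0:
        return p - p * s               # reinforce defection: p moves toward 0
    return p - (1.0 - p) * s           # anti-reinforce defection: p moves toward 1

# Illustrative use with made-up numbers: a player who cooperated and earned
# less than the aspiration level becomes less likely to cooperate next round.
p_next = update_propensity(0.5, cooperated=True, s=stimulus(0.0, 1.0))
```

Note that the update uses only the player's own action and payoff, in line with the abstract's assumption that individuals have no information about what others are doing.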
