Article
Anesthesiology
Angelos-Miltiadis Krypotos, Geert Crombez, Maryna Alves, Nathalie Claes, Johan W. S. Vlaeyen
Summary: This study investigates how individuals solve the exploration-exploitation dilemma when facing pain and finds that participants tend to choose the safest option, prioritize rewards over pain, and are more inclined to explore after experiencing pain.
Article
Robotics
Andrew Silva, Nina Moorman, William Silva, Zulfiqar Zaidi, Nakul Gopalan, Matthew Gombolay
Summary: Researchers have developed a language-conditioned multi-task learning method called LanCon-Learn, which helps robots understand the relationship between tasks and objectives for better application in manipulation domains. Experimental results show that LanCon-Learn achieves significant improvement in task success rate and skill transfer compared to non-language baselines.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2022)
Article
Computer Science, Artificial Intelligence
Qihang Chen, Qiwei Zhang, Yunlong Liu
Summary: A major challenge in reinforcement learning is the sparse, delayed reward signal of episodic tasks. Existing techniques either struggle to assign credit to explored transitions or are misled by behavioral policies, leading to slow learning. To address this, the authors propose EMR, an approach that combines the intrinsic rewards of exploration mechanisms with reward redistribution to balance exploration and exploitation in such tasks.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
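The two EMR ingredients named above, intrinsic exploration rewards and redistribution of a sparse terminal reward, can be illustrated with a minimal Python sketch. The count-based bonus, the uniform redistribution rule, and the toy chain environment (`chain_step`, `run_episode`) are illustrative assumptions, not the paper's actual design:

```python
from collections import defaultdict

def chain_step(state, action):
    """Toy 5-state chain: action 1 moves right, anything else moves left.
    A sparse reward of 1.0 arrives only when state 4 is reached."""
    next_state = state + 1 if action == 1 else max(state - 1, 0)
    done = next_state == 4
    return next_state, (1.0 if done else 0.0), done

def run_episode(env_step, policy, visit_counts, beta=0.1, max_steps=50):
    """Roll out one episode, augmenting each step with a count-based
    intrinsic bonus and redistributing the sparse terminal reward
    uniformly over the visited transitions."""
    trajectory, state = [], 0
    for _ in range(max_steps):
        action = policy(state)
        next_state, extrinsic, done = env_step(state, action)
        visit_counts[(state, action)] += 1
        # Intrinsic bonus decays with visitation, rewarding novel transitions.
        intrinsic = beta / visit_counts[(state, action)] ** 0.5
        trajectory.append([state, action, intrinsic])
        state = next_state
        if done:
            # Spread the delayed terminal reward evenly over all steps taken.
            share = extrinsic / len(trajectory)
            for step in trajectory:
                step[2] += share
            break
    return trajectory

counts = defaultdict(int)
# Always move right: 4 steps to the goal, each carrying
# a 0.1 intrinsic bonus plus a 0.25 redistributed share.
traj = run_episode(chain_step, lambda s: 1, counts)
```

The redistribution turns one delayed reward into a dense per-step signal, while the decaying bonus keeps early exploration alive.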
Review
Computer Science, Artificial Intelligence
Anthony Triche, Anthony S. Maida, Ashok Kumar
Summary: Recent works have connected Hebbian plasticity with reinforcement learning, resulting in a class of trial-and-error learning called neo-Hebbian plasticity. Inspired by the role of dopamine in synaptic modification, neo-Hebbian RL methods selectively reinforce associations to enable learning exploitative behaviors. This review focuses on the exploration-exploitation balance under the neo-Hebbian RL framework and suggests potential improvements through stronger incorporation of intrinsic motivators.
Article
Computer Science, Artificial Intelligence
Guoyu Zuo, Zhipeng Tian, Gao Huang
Summary: Learning from visual observations is a challenging RL problem that requires solving both representation learning and task learning. Existing data-augmentation methods can improve RL generalization but often cause instability and divergence. The authors propose DAR-EEE, a unified method that incorporates bootstrap ensembles to stabilize and accelerate task learning, and their experimental evaluation demonstrates improved sample efficiency and state-of-the-art performance on difficult image-based control tasks.
APPLIED INTELLIGENCE
(2023)
Article
Computer Science, Artificial Intelligence
Min Li, Tianyi Huang, William Zhu
Summary: This research proposes an adaptive exploration policy to address the exploration-exploitation tradeoff by adjusting the exploration noise based on training stability. The effectiveness of this policy is demonstrated through theoretical analysis and experiments.
INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS
(2021)
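The adaptive policy described above can be sketched as a simple noise scheduler. The stability signal used here (standard deviation of recent episode returns) and the multiplicative update rule in `AdaptiveNoise` are hypothetical stand-ins for the paper's actual criterion:

```python
import statistics

class AdaptiveNoise:
    """Scale Gaussian exploration noise by a crude stability signal:
    the spread of recent episode returns. Volatile returns (unstable
    training) shrink the noise; flat returns let it grow, inviting
    more exploration."""

    def __init__(self, sigma=0.2, lo=0.01, hi=0.5, window=10):
        self.sigma, self.lo, self.hi = sigma, lo, hi
        self.window = window
        self.returns = []

    def update(self, episode_return):
        """Record an episode return and return the adjusted noise scale."""
        self.returns.append(episode_return)
        recent = self.returns[-self.window:]
        if len(recent) >= 2:
            spread = statistics.pstdev(recent)
            # Shrink noise when returns fluctuate, grow it when they are stable.
            self.sigma *= 0.9 if spread > 1.0 else 1.05
            self.sigma = min(max(self.sigma, self.lo), self.hi)
        return self.sigma
```

The clamp to `[lo, hi]` keeps the agent from ever becoming fully deterministic or fully random, whatever the training trace looks like.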
Article
Robotics
Jingxi Xu, Shuran Song, Matei Ciocarlie
Summary: Inspired by human abilities, the robotic manipulation field aims to develop new methods for tactile-based object interaction. TANDEM, an architecture for learning efficient exploration strategies and decision making, is proposed in this study. The results show that TANDEM achieves higher accuracy with fewer actions in a tactile object recognition task and is more robust to sensor noise.
IEEE ROBOTICS AND AUTOMATION LETTERS
(2022)
Article
Management
H. Henry Cao, Liye Ma, Z. Eddie Ning, Baohong Sun
Summary: In this paper, the authors use a continuous time bandit model to analyze the effectiveness of recommendation algorithms in a monopoly and duopoly market. They find that in a competitive market, firms focus more on exploitation rather than exploration. Additionally, competition decreases the return from developing a forward-looking algorithm for impatient users. However, the development of a forward-looking algorithm always benefits users in a competitive market. The decision of competing firms to invest in a forward-looking algorithm can create a prisoner's dilemma, highlighting the implications for artificial intelligence adoption and policy makers.
MANAGEMENT SCIENCE
(2023)
Article
Computer Science, Artificial Intelligence
Igor Q. Lordeiro, Diego B. Haddad, Douglas O. Cardoso
Summary: The research assessed the feasibility of using reinforcement learning and multi-armed bandit algorithms to tackle Minesweeper, with successful results particularly on smaller game boards such as the beginner level.
IEEE TRANSACTIONS ON GAMES
(2022)
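A minimal sketch of the UCB1 rule that underlies many multi-armed bandit approaches like the one above; the arm/value bookkeeping here is generic, not the paper's Minesweeper-specific formulation:

```python
import math

def ucb1_select(counts, values, total, c=2.0):
    """UCB1: pick the arm maximizing its empirical mean reward plus an
    exploration bonus that shrinks as the arm accumulates pulls.

    counts: pulls per arm; values: summed reward per arm; total: total pulls.
    """
    best_arm, best_score = None, float("-inf")
    for arm in counts:
        if counts[arm] == 0:
            return arm  # pull every arm at least once before comparing
        score = values[arm] / counts[arm] + math.sqrt(
            c * math.log(total) / counts[arm]
        )
        if score > best_score:
            best_arm, best_score = arm, score
    return best_arm
```

The bonus term lets a rarely-tried arm with a mediocre mean beat a well-sampled arm with a slightly better one, which is exactly the exploration-exploitation tradeoff the entries above revolve around.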
Article
Biology
Jose Segovia-Martin, Felix Creutzig, James Winters
Summary: Higher levels of economic activity lead to higher energy use and consumption of natural resources, and the use of fossil fuels remains a significant contributor to greenhouse gas emissions and climate change. The Jevons Paradox suggests that increasing resource efficiency can actually lead to increased resource consumption. This study develops a mathematical model and computer simulator to analyze the effects of exploration-exploitation strategies on efficiency, consumption, and sustainability, and highlights the importance of demand reduction measures in achieving sustainable development goals.
PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY B-BIOLOGICAL SCIENCES
(2023)
Article
Automation & Control Systems
Chengbin Xuan, Feng Zhang, Hak-Keung Lam
Summary: This paper presents a method to improve the safety of agents during the exploration stage of Q-learning. By introducing a safety indicator function and a safe exploration mask, the algorithm reduces the likelihood of unsafe actions and improves its applicability in industrial settings.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
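The masked exploration idea can be sketched as a small action-selection routine; `safe_epsilon_greedy`, its boolean mask interface, and the all-blocked fallback are hypothetical, standing in for the paper's safety indicator function:

```python
import random

def safe_epsilon_greedy(q_row, safe_mask, epsilon=0.1):
    """Epsilon-greedy action selection restricted to actions that a
    safety indicator has flagged as safe.

    q_row: Q-values for one state; safe_mask: per-action booleans.
    """
    allowed = [a for a, ok in enumerate(safe_mask) if ok]
    if not allowed:
        # If the mask blocks every action, fall back to the full action set.
        allowed = list(range(len(q_row)))
    if random.random() < epsilon:
        return random.choice(allowed)  # explore, but only among safe actions
    return max(allowed, key=lambda a: q_row[a])  # exploit the best safe action
```

Both the explore and exploit branches draw from the masked set, so random exploration never wanders into actions the indicator has ruled out.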
Article
Neurosciences
Jean-Paul Noel, Baptiste Caziot, Stefania Bruni, Nora E. Fitzgerald, Eric Avila, Dora E. Angelaki
Summary: The study emphasizes the importance of closed loops between action and perception in understanding complex behaviors, introducing the framework of reinforcement learning and control. It highlights active sensing, dynamical planning, and leveraging structural regularities as key operations for intelligent behavior. The approach allows for flexible and generalizable behaviors, while also exploring the neural underpinnings of intelligence properties such as flexibility, prediction, and generalization.
PROGRESS IN NEUROBIOLOGY
(2021)
Article
Computer Science, Artificial Intelligence
Antoine Theberge, Christian Desrosiers, Maxime Descoteaux, Pierre-Marc Jodoin
Summary: Diffusion MRI tractography is the only non-invasive tool to assess the white-matter structural connectivity of a brain. Using deep reinforcement learning to address tractography issues has shown competitive results and stable performance when generalizing to new data.
MEDICAL IMAGE ANALYSIS
(2021)
Article
Computer Science, Artificial Intelligence
Wen-Hua Chen
Summary: This paper discusses the relationship between Reinforcement Learning (RL) and the recently developed Dual Control for Exploitation and Exploration (DCEE), highlighting DCEE's potential to solve problems similar to those RL addresses in unknown environments, along with its advantages in coping with uncertainty, its learning efficiency, and its potential to establish formal properties. The paper also explores the links between DCEE and other relevant methods, offering insights for cross-fertilisation between control, machine learning, and neuroscience in developing autonomous control under uncertain environments.
Article
Computer Science, Information Systems
Ryusei Maeda, Mamoru Mimura
Summary: This paper proposes a method of automating post-exploitation by combining deep reinforcement learning and PowerShell Empire, with A2C showing the most efficient learning progress and the ability for trained agents to gain administrator privileges in a test domain network.
COMPUTERS & SECURITY
(2021)