Article

Multi-armed bandit problem with known trend

NEUROCOMPUTING (2016)

Journal

NEUROCOMPUTING

Volume 205, Issue -, Pages 16-21

Publisher

ELSEVIER SCIENCE BV

DOI: 10.1016/j.neucom.2016.02.052

Keywords

Multi-armed bandit; Online learning; Recommender systems

Category

Computer Science, Artificial Intelligence

Abstract

We consider a variant of the multi-armed bandit model, which we call the multi-armed bandit problem with known trend, where the gambler knows the shape of the reward function of each arm but not its distribution. This new problem is motivated by online problems such as active learning and music and interface recommendation applications, where the reward received when the model samples an arm changes according to a known trend. By adapting the standard multi-armed bandit algorithm UCB1 to take advantage of this setting, we propose a new algorithm named Adjusted Upper Confidence Bound (A-UCB) that assumes a stochastic model. We provide upper bounds on the regret which compare favorably with those of UCB1. We also confirm this experimentally with different simulations. (C) 2016 Elsevier B.V. All rights reserved.
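To make the setting more concrete, here is a minimal Python sketch of a UCB1-style policy that accounts for a known trend. It is an illustration under stated assumptions, not the paper's A-UCB: the abstract does not give the reward model or the index, so the sketch assumes each reward is a known positive multiplier `trend(n)` (for the n-th pull of that arm) times a stationary Bernoulli draw. The names `run_bandit`, `base_means`, and `trends` are hypothetical. With constant trends the policy reduces to standard UCB1.

```python
import math
import random


def ucb1_index(mean, pulls, t):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_bandit(base_means, trends, horizon, seed=0):
    """Play a trend-aware UCB1 sketch and return the pull count of each arm.

    base_means: hypothetical Bernoulli means of each arm's stationary part.
    trends: hypothetical known positive functions mapping the pull number n
            of an arm to the multiplier applied to its n-th reward.
    """
    rng = random.Random(seed)
    k = len(base_means)
    pulls = [0] * k
    base_sum = [0.0] * k  # running sum of de-trended rewards per arm

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialisation: pull every arm once
        else:
            # Assumed index: scale the UCB1 index of the de-trended mean
            # by the trend value the arm would have on its next pull.
            arm = max(
                range(k),
                key=lambda i: trends[i](pulls[i] + 1)
                * ucb1_index(base_sum[i] / pulls[i], pulls[i], t),
            )
        n = pulls[arm] + 1
        base = 1.0 if rng.random() < base_means[arm] else 0.0
        reward = trends[arm](n) * base  # observed reward follows the trend
        base_sum[arm] += reward / trends[arm](n)  # divide the trend back out
        pulls[arm] = n
    return pulls


def flat(n):
    """Constant trend: recovers the standard stationary bandit."""
    return 1.0


if __name__ == "__main__":
    print(run_bandit([0.2, 0.8], [flat, flat], horizon=2000))
```

Passing a decaying function such as `lambda n: 1.0 / n` for one arm models a reward that wears off with repeated pulls, in the spirit of the recommendation examples mentioned in the abstract.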

Recommended

Article Computer Science, Artificial Intelligence

Dynamic clustering based contextual combinatorial multi-armed bandit for online recommendation

Cairong Yan, Haixia Han, Yanting Zhang, Dandan Zhu, Yongquan Wan

Summary: This study proposes an algorithm named DC(3)MAB to address the challenges faced by recommender systems in large-scale user and sparse interaction scenarios. The algorithm improves recommendation performance through dynamic user clustering, dynamic item partitioning based on collaborative filtering, and a multi-class reward mechanism based on fine-grained implicit feedback.

KNOWLEDGE-BASED SYSTEMS (2022)

Article Mathematics, Applied

Multi-armed bandit problem with online clustering as side information

Andrii Dzhoha, Iryna Rozora

Summary: In this paper, we address the problem of sequential resource allocation under the multi-armed bandit model in a non-stationary stochastic environment. We propose a two-stage algorithm that combines a modified k-means clustering approach with Thompson Sampling policy to handle the dynamic nature of the problem. The algorithm effectively deals with cluster drift and potential misclassification, providing a solution for online clustering with side information.

JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS (2023)

Article Computer Science, Artificial Intelligence

Multi-objective multi-armed bandit with lexicographically ordered and satisficing objectives

Alihan Huyuk, Cem Tekin

Summary: The study focuses on multi-objective multi-armed bandits with lexicographically ordered and satisficing objectives, proposing an algorithm that achieves bounded regret. By exploring two different settings, it demonstrates uniform expected regret bounded in time and shows the effectiveness of the proposed algorithm.

MACHINE LEARNING (2021)

Article Computer Science, Artificial Intelligence

Prioritized Experience Replay based on Multi-armed Bandit

Ximing Liu, Tianqing Zhu, Cuiqing Jiang, Dayong Ye, Fuqing Zhao

Summary: The proposed Prioritized Experience Replay based on Multi-armed Bandit (PERMAB) is a dynamic experience replay strategy that adapts based on the interaction between the agent and environment. It combines multiple priority criteria to measure the importance of experiences, with weights adjusted adaptively from episode to episode. This strategy improves learning efficiency and performance by considering both sample informativeness and diversity.

EXPERT SYSTEMS WITH APPLICATIONS (2022)

Article Computer Science, Artificial Intelligence

Ballooning multi-armed bandits

Ganesh Ghalme, Swapnil Dhamal, Shweta Jain, Sujit Gujar, Y. Narahari

Summary: The paper introduces the ballooning multi-armed bandits (BL-MAB) model, where the regret is computed with respect to the best available arm at each time. It proves that achieving sub-linear regret is possible if the best arm is more likely to arrive in the early rounds. The proposed algorithm focuses on exploring newly arriving arms and determining the sequence of arm pulls for exploitation, ultimately achieving sub-linear regret.

ARTIFICIAL INTELLIGENCE (2021)

Article Computer Science, Hardware & Architecture

Distributed learning dynamics of Multi-Armed Bandits for edge intelligence

Shuzhen Chen, Youming Tao, Dongxiao Yu, Feng Li, Bei Gong

Summary: This paper investigates the problem of multi-agent decision making in IoT networks using the distributed Multi-Armed Bandits (MAB) model and proposes a lightweight and robust learning algorithm for dynamic networks. Rigorous analysis and extensive experiments show that the algorithm demonstrates good efficiency and stability in mobile settings despite resource constraints.

JOURNAL OF SYSTEMS ARCHITECTURE (2021)

Article Automation & Control Systems

An index-based deterministic convergent optimal algorithm for constrained multi-armed bandit problems

Hyeong Soo Chang

Summary: In the constrained multi-armed bandit model, a deterministic convergent optimal algorithm is constructed by ensuring the probability of choosing an optimal feasible arm converges to one over infinite horizon. This algorithm is based on the anytime parameter-free thresholding algorithm and provides a finite-time lower bound for the convergent optimality. A relaxed version of the algorithm is also studied for estimating the optimal value and discussing its convergent optimality after a sufficiently large horizon size.

AUTOMATICA (2021)

Article Computer Science, Information Systems

Online Learning of Time-Varying Unbalanced Networks in Non-Convex Environments: A Multi-Armed Bandit Approach

Olusola T. Odeyomi

Summary: This study discusses how agents in a time-varying distributed network can converge to the global minimizer of the network. A multi-armed bandit algorithm CD EXP3 is proposed to help agents find the minimizer by observing their losses. The simulations demonstrate the effectiveness of the algorithm and analyze the effects of different topologies and parameters on convergence.

IEEE ACCESS (2023)

Article Engineering, Electrical & Electronic

Privacy-Preserving Communication-Efficient Federated Multi-Armed Bandits

Tan Li, Linqi Song

Summary: This paper investigates the privacy-preserving communication-efficient algorithm in federated multi-armed bandit problems and explores the relationship between privacy, communication, and learning performance. By designing learning algorithms and communication protocols, protecting privacy and reducing communication costs, we obtain theoretical and empirical results.

IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS (2022)

Article Engineering, Electrical & Electronic

Risk-Aware Multi-Armed Bandits With Refined Upper Confidence Bounds

Xingchi Liu, Mahsa Derakhshani, Sangarapillai Lambotharan, Mihaela van der Schaar

Summary: The traditional MAB framework assumes the arm with the highest expected reward is the best choice, but this may be risky if the variance is high. This study investigates a mean-variance metric to consider uncertainty in rewards and develops risk-aware algorithms for arm-selection, achieving O(log(T)) regret in theoretical analysis. Numerical results show that the proposed algorithms outperform other risk-aware MAB algorithms.

IEEE SIGNAL PROCESSING LETTERS (2021)

Article Engineering, Electrical & Electronic

Multi-Agent Multi-Armed Bandit Learning for Online Management of Edge-Assisted Computing

Bochun Wu, Tianyi Chen, Wei Ni, Xin Wang

Summary: This paper presents a new online learning-based approach to offloading scheduling, utilizing multi-agent multi-armed bandit learning to exploit randomly varying conditions. The proposed combinatorial and distributed bandit upper confidence bound algorithms aim to minimize delays in edge-assisted computing by orchestrating resources efficiently. The study establishes the asymptotic optimality of the algorithms through sublinearity of regrets, showing that random turns in decision-making do not compromise performance.

IEEE TRANSACTIONS ON COMMUNICATIONS (2021)

Review Neurosciences

Multi-Armed Bandits in Brain-Computer Interfaces

Frida Heskebeck, Carolina Bergeling, Bo Bernhardsson

Summary: This review discusses the application of multi-armed bandit (MAB) problems in the field of brain-computer interfaces (BCIs). Although MAB optimization has great potential in BCI, there is still relatively little research on this topic. The review provides background information on MAB problems and solution methods, and explores the latest concepts and future research directions of MAB in BCI systems.

FRONTIERS IN HUMAN NEUROSCIENCE (2022)

Article Computer Science, Artificial Intelligence

Multi-armed bandit heterogeneous ensemble learning for imbalanced data

Qi Dai, Jian-wei Liu, Jiapeng Yang

Summary: Ensemble learning is a widely used approach for handling class imbalance issues. However, selecting the most suitable resampling method and base classifier has been a challenge. This study proposes a multi-armed bandit heterogeneous ensemble framework that uses the multi-armed bandit technique to choose the best base classifier and resampling techniques, resulting in a competitive ensemble model. The experimental results show that this model performs well on various evaluation metrics.

COMPUTATIONAL INTELLIGENCE (2023)

Article Engineering, Multidisciplinary

Earning While Learning: An Adversarial Multi-Armed Bandit Based Real-Time Bidding Scheme in Deregulated Electricity Market

Yufeng Wang, Bo Zhang, Jianhua Ma, Qun Jin

Summary: In deregulated electricity markets, market gaming behaviors can significantly impact electricity costs. Optimizing bids using an adversarial multiarmed bandit model, like Exp3C, leads to increased profits. Experimental results show that Exp3C outperforms other heuristic schemes.

IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING (2022)

Article Physics, Multidisciplinary

Non Stationary Multi-Armed Bandit: Empirical Evaluation of a New Concept Drift-Aware Algorithm

Emanuele Cavenaghi, Gabriele Sottocornola, Fabio Stella, Markus Zanker

Summary: The Multi-Armed Bandit problem addresses sequential decision-making challenges, but in reality, the reward distribution may change. The f-Discounted-Sliding-Window Thompson Sampling algorithm is proposed to combat concept drift in non-stationary environments by introducing a discount factor and a sliding window mechanism.

ENTROPY (2021)

Article Computer Science, Artificial Intelligence

3D-KCPNet: Efficient 3DCNNs based on tensor mapping theory

Rui Lv, Dingheng Wang, Jiangbin Zheng, Zhao-Xu Yang

Summary: In this paper, the authors investigate tensor decomposition for neural network compression. They analyze the convergence and precision of tensor mapping theory, validate the rationality of tensor mapping and its superiority over traditional tensor approximation based on the Lottery Ticket Hypothesis. They propose an efficient method called 3D-KCPNet to compress 3D convolutional neural networks using the Kronecker canonical polyadic (KCP) tensor decomposition. Experimental results show that 3D-KCPNet achieves higher accuracy compared to the original baseline model and the corresponding tensor approximation model.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Personalized robotic control via constrained multi-objective reinforcement learning

Xiangkun He, Zhongxu Hu, Haohan Yang, Chen Lv

Summary: In this paper, a novel constrained multi-objective reinforcement learning algorithm is proposed for personalized end-to-end robotic control with continuous actions. The approach trains a single model using constraint design and a comprehensive index to achieve optimal policies based on user-specified preferences.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Overlapping community detection using expansion with contraction

Zhijian Zhuo, Bilian Chen, Shenbao Yu, Langcai Cao

Summary: In this paper, a novel method called Expansion with Contraction Method for Overlapping Community Detection (ECOCD) is proposed, which utilizes non-negative matrix factorization to obtain disjoint communities and applies expansion and contraction processes to adjust the degree of overlap. ECOCD is applicable to various networks with different properties and achieves high-quality overlapping community detection.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

High-compressed deepfake video detection with contrastive spatiotemporal distillation

Yizhe Zhu, Chunhui Zhang, Jialin Gao, Xin Sun, Zihan Rui, Xi Zhou

Summary: In this work, the authors propose a Contrastive Spatio-Temporal Distilling (CSTD) approach to improve the detection of high-compressed deepfake videos. The approach leverages spatial-frequency cues and temporal-contrastive alignment to fully exploit spatiotemporal inconsistency information.

NEUROCOMPUTING (2024)

Review Computer Science, Artificial Intelligence

A review of coverless steganography

Laijin Meng, Xinghao Jiang, Tanfeng Sun

Summary: This paper provides a review of coverless steganographic algorithms, including the development process, known contributions, and general issues in image and video algorithms. It also discusses the security of coverless steganography from theoretical analysis to actual investigation for the first time.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Confidence-based interactable neural-symbolic visual question answering

Yajie Bao, Tianwei Xing, Xun Chen

Summary: Visual question answering requires processing multi-modal information and effective reasoning. Neural-symbolic learning is a promising method, but current approaches lack uncertainty handling and can only provide a single answer. To address this, we propose a confidence based neural-symbolic approach that evaluates NN inferences and conducts reasoning based on confidence.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

A framework-based transformer and knowledge distillation for interior style classification

Anh H. Vo, Bao T. Nguyen

Summary: Interior style classification is an interesting problem with potential applications in both commercial and academic domains. This project proposes a method named ISC-DeIT, which combines data-efficient image transformer architectures and knowledge distillation, to address the interior style classification problem. Experimental results demonstrate a significant improvement in predictive accuracy compared to other state-of-the-art methods.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Improving robustness for vision transformer with a simple dynamic scanning augmentation

Shashank Kotyan, Danilo Vasconcellos Vargas

Summary: This article introduces a novel augmentation technique called Dynamic Scanning Augmentation to improve the accuracy and robustness of Vision Transformer (ViT). The technique leverages dynamic input sequences to adaptively focus on different patches, resulting in significant changes in ViT's attention mechanism. Experimental results demonstrate that Dynamic Scanning Augmentation outperforms ViT in terms of both robustness to adversarial attacks and accuracy against natural images.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Introducing shape priors in Siamese networks for image classification

Hiba Alqasir, Damien Muselet, Christophe Ducottet

Summary: The article proposes a solution to improve the learning process of a classification network by providing shape priors, reducing the need for annotated data. The solution is tested on cross-domain digit classification tasks and a video surveillance application.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Neural dynamics solver for time-dependent infinity-norm optimization based on ACP framework with robot application

Dexiu Ma, Mei Liu, Mingsheng Shang

Summary: This paper proposes a method using neural dynamics solvers to solve infinity-norm optimization problems. Two improved solvers are constructed and their effectiveness and superiority are demonstrated through theoretical analysis and simulation experiments.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

cpp-AIF: A multi-core C++ implementation of Active Inference for Partially Observable Markov Decision Processes

Francesco Gregoretti, Giovanni Pezzulo, Domenico Maisto

Summary: Active Inference is a computational framework that uses probabilistic inference and variational free energy minimization to describe perception, planning, and action. cpp-AIF is a header-only C++ library that provides a powerful tool for implementing Active Inference for Partially Observable Markov Decision Processes through multi-core computing. It is cross-platform and improves performance, memory management, and usability compared to existing software.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Predicting stock market trends with self-supervised learning

Zelin Ying, Dawei Cheng, Cen Chen, Xiang Li, Peng Zhu, Yifeng Luo, Yuqi Liang

Summary: This paper proposes a novel stock market trends prediction framework called SMART, which includes a self-supervised stock technical data sequence embedding model S3E. By training with multiple self-supervised auxiliary tasks, the model encodes stock technical data sequences into embeddings and uses the learned sequence embeddings for predicting stock market trends. Extensive experiments on China A-Shares market and NASDAQ market prove the high effectiveness of our model in stock market trends prediction, and its effectiveness is further validated in real-world applications in a leading financial service provider in China.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

DHGAT: Hyperbolic representation learning on dynamic graphs via attention networks

Hao Li, Hao Jiang, Dongsheng Ye, Qiang Wang, Liang Du, Yuanyuan Zeng, Liu Yuan, Yingxue Wang, C. Chen

Summary: DHGAT, a dynamic hyperbolic graph attention network, utilizes hyperbolic metric properties to embed dynamic graphs. It employs a spatiotemporal self-attention mechanism and weighted node representations, resulting in excellent performance in link prediction tasks.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Progressive network based on detail scaling and texture extraction: A more general framework for image deraining

Jiehui Huang, Zhenchao Tang, Xuedong He, Jun Zhou, Defeng Zhou, Calvin Yu-Chian Chen

Summary: This study proposes a progressive learning multi-scale feature blending model for image deraining tasks. The model utilizes detail dilation and texture extraction to improve the restoration of rainy images. Experimental results show that the model achieves near state-of-the-art performance in rain removal tasks and exhibits better rain removal realism.

NEUROCOMPUTING (2024)

Article Computer Science, Artificial Intelligence

Stabilization and synchronization control for discrete-time complex networks via the auxiliary role of edges subsystem

Lizhi Liu, Zilin Gao, Yinhe Wang, Yongfu Li

Summary: This paper proposes a novel discrete-time interconnected model for depicting complex dynamical networks. The model consists of nodes and edges subsystems, which consider the dynamic characteristic of both nodes and edges. By designing control strategies and coupling modes, the stabilization and synchronization of the network are achieved. Simulation results demonstrate the effectiveness of the proposed methods.

NEUROCOMPUTING (2024)
