4.5 Article

Personalised Reranking of Paper Recommendations Using Paper Content and User Behavior

Journal

Publisher

ASSOC COMPUTING MACHINERY
DOI: 10.1145/3312528

Keywords

Academic search; paper recommendation; reranking

Funding

  1. Ahold Delhaize
  2. China Scholarship Council
  3. Innovation Center for Artificial Intelligence (ICAI)

Academic search engines have been widely used to access academic papers, where users' information needs are explicitly represented as search queries. Some modern recommender systems have gone one step further by predicting users' information needs without the presence of an explicit query. In this article, we examine an academic paper recommender that sends out paper recommendations in email newsletters, based on the users' browsing history on the academic search engine. Specifically, we look at users who regularly browse papers on the search engine and who sign up for the recommendation newsletters for the first time. We address the task of reranking the recommendation candidates that are generated by a production system for such users. We face the challenge that the users on whom we focus have not interacted with the recommender system before, which is a common scenario that every recommender system encounters when new users sign up. We propose an approach to reranking candidate recommendations that utilizes both paper content and user behavior. The approach is designed to suit the characteristics unique to our academic recommendation setting. For instance, content similarity measures can be used to find the closest match between candidate recommendations and the papers previously browsed by the user. To this end, we use a knowledge graph derived from paper metadata to compare entity similarities (papers, authors, and journals) in the embedding space. Since the users on whom we focus have no prior interactions with the recommender system, we propose a model to learn a mapping from users' browsed articles to user clicks on the recommendations. We combine both content and behavior into a hybrid reranking model that outperforms the production baseline significantly, with relative increases of 13% in Mean Average Precision and 28% in Precision@1. Moreover, we provide a detailed analysis of the model components, highlighting where the performance boost comes from.
The insights obtained reveal components that are useful for the reranking process, such as graph embedding similarity, and can be generalized to other academic recommendation settings. In addition, papers recently browsed by users provide stronger evidence for recommendation than older ones.
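
As a rough illustration of the ideas in the abstract, the sketch below reranks candidates by their embedding similarity to the user's browsed papers, weighting recent papers more heavily, and then scores a ranking with Precision@1 and Average Precision (Mean Average Precision is the mean of the latter over users). It is not the authors' model: the abstract does not give the scoring function, so the cosine similarity, the exponential recency weights, the `decay` parameter, and all function names are assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rerank(candidates, browsed, decay=0.5):
    """Order candidate ids by recency-weighted similarity to browsed papers.

    candidates: dict mapping paper id -> embedding (list of floats).
    browsed: list of embeddings, most recently browsed first.
    decay: hypothetical weight multiplier applied per step back in time.
    """
    if not browsed:
        # Nothing to compare against: keep the incoming (production) order.
        return list(candidates)
    weights = [decay ** i for i in range(len(browsed))]
    total = sum(weights)

    def score(vec):
        return sum(w * cosine(vec, b) for w, b in zip(weights, browsed)) / total

    return sorted(candidates, key=lambda pid: score(candidates[pid]), reverse=True)


def precision_at_1(ranking, clicked):
    """1.0 if the top-ranked item was clicked, else 0.0."""
    return 1.0 if ranking and ranking[0] in clicked else 0.0


def average_precision(ranking, clicked):
    """Average of precision values at the rank of each clicked item."""
    hits, total = 0, 0.0
    for rank, pid in enumerate(ranking, start=1):
        if pid in clicked:
            hits += 1
            total += hits / rank
    return total / len(clicked) if clicked else 0.0


# Toy data: two browsed papers (most recent first) and three candidates.
browsed = [[1.0, 0.0], [0.0, 1.0]]
candidates = {"a": [0.0, 1.0], "b": [1.0, 0.1], "c": [-1.0, 0.0]}
ranking = rerank(candidates, browsed)
print(ranking, precision_at_1(ranking, {"b"}), average_precision(ranking, {"b", "c"}))
```

In the toy data, candidate `b` is closest to the most recently browsed paper and therefore moves to the top, ahead of `a`, which matches only the older paper.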

Recommended

Article Computer Science, Information Systems

Generating Relevant and Informative Questions for Open-Domain Conversations

Yanxiang Ling, Fei Cai, Jun Liu, Honghui Chen, Maarten de Rijke

Summary: Recent research emphasizes the importance of mixed-initiative interactions in conversational search. The task of question generation (QG) in open-domain conversational systems aims to enhance human-machine interactions. However, the limited availability of QG-specific data in conversations makes this task challenging. In this study, we propose a context-enhanced neural question generation (CNQG) model that leverages conversational context to predict question content and pattern. We also use multi-task learning with auxiliary training objectives and a self-supervised approach to train our question generator.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Economics

Parameter-efficient deep probabilistic forecasting

Olivier Sprangers, Sebastian Schelter, Maarten de Rijke

Summary: Probabilistic time series forecasting is crucial in various domains, and Transformer-based methods have achieved state-of-the-art performance. However, they require a large number of parameters and a large amount of memory. To address this, we propose a novel bidirectional temporal convolutional network with significantly fewer parameters. Our method performs on par with state-of-the-art approaches and requires less memory, reducing infrastructure cost.

INTERNATIONAL JOURNAL OF FORECASTING (2023)

Article Computer Science, Information Systems

Evaluating the Robustness of Click Models to Policy Distributional Shift

Romain Deffayet, Jean-Michel Renders, Maarten de Rijke

Summary: The performance of click models under policy distributional shift (PDS) is examined, and a new evaluation protocol is proposed to predict their performance under PDS, along with guidelines to mitigate risks.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Information Systems

Improving Transformer-based Sequential Recommenders through Preference Editing

Muyang Ma, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Huasheng Liang, Jun Ma, Maarten de Rijke

Summary: One of the key challenges in sequential recommendation is how to extract and represent user preferences. We propose a transformer-based sequential recommendation model, named MrTransformer, to explore multiple user preferences. MrTransformer employs a preference-editing-based self-supervised learning mechanism to disentangle user preferences into multiple independent representations, improving preference extraction and representation. Experiments show that MrTransformer with preference editing outperforms state-of-the-art methods in terms of Recall, MRR, and NDCG, especially for long sequences of interactions.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Information Systems

PRADA: Practical Black-box Adversarial Attacks against Neural Ranking Models

Chen Wu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

Summary: This article introduces the Word Substitution Ranking Attack (WSRA) task against Neural Ranking Models (NRMs), which aims to promote a target document's ranking by adding adversarial perturbations to its text. The proposed Pseudo Relevance-based ADversarial ranking Attack (PRADA) method outperforms existing attack strategies and successfully fools the NRM with small indiscernible perturbations of text.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Information Systems

A Next Basket Recommendation Reality Check

Ming Li, Sami Jullien, Mozhdeh Ariannezhad, Maarten de Rijke

Summary: The study investigates the performance of next basket recommendation (NBR) methods in practical applications and proposes a new set of evaluation metrics for NBR models. Through an experimental analysis of state-of-the-art NBR models, it reveals the actual progress made by NBR methods.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Parallel Split-Join Networks for Shared Account Cross-Domain Sequential Recommendations

Wenchao Sun, Muyang Ma, Pengjie Ren, Yujie Lin, Zhumin Chen, Zhaochun Ren, Jun Ma, Maarten de Rijke

Summary: This study addresses the challenges of sequential recommendation in a context where multiple users share a single account and behavior is available in multiple domains. The proposed PSJNet network learns role-specific representations and filters out irrelevant information using a gating mechanism. It also combines split and join techniques to learn cross-domain representations. Experimental results demonstrate that PSJNet outperforms state-of-the-art baselines in terms of MRR and Recall.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Article Chemistry, Multidisciplinary

Exploiting the Rolling Shutter Read-Out Time for ENF-Based Camera Identification

Ericmoore Ngharamike, Li-Minn Ang, Kah Phooi Seng, Mingzhong Wang

Summary: The electric network frequency (ENF) is a fluctuating signal representing the frequency of the mains power system. This fluctuation can be used to extract information about the source camera of a video recorded under ENF-affected lighting. By considering the ENF and the camera-specific read-out time (T-ro), the approach suggested in this paper aims to identify the source camera of an unknown video. Experimental results demonstrate the effectiveness of the proposed method.

APPLIED SCIENCES-BASEL (2023)

Article Computer Science, Artificial Intelligence

Keep and Select: Improving Hierarchical Context Modeling for Multi-Turn Response Generation

Yanxiang Ling, Fei Cai, Jun Liu, Honghui Chen, Maarten de Rijke

Summary: Hierarchical context modeling is crucial for response generation in multi-turn conversational systems. We propose a model named KS-CQ, which utilizes the Keep and Select modules to generate neighbor-aware context representations and context-enriched query representations. Extensive experiments demonstrate the effectiveness of our approach compared to state-of-the-art baselines in both automatic and human evaluations.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Proceedings Paper Computer Science, Information Systems

Contrasting Neural Click Models and Pointwise IPS Rankers

Philipp Hager, Maarten de Rijke, Onno Zoeter

Summary: Inverse-propensity scoring and neural click models are compared in this study for learning rankers from user clicks affected by position bias. Theoretical differences are explored and empirical comparisons are conducted on a prevalent evaluation setup. It is shown that both methods optimize for true document relevance when position bias is known, but small empirical differences are found when neural click models learn from shared, conflicting features.

ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT I (2023)

Proceedings Paper Computer Science, Information Systems

Scene-Centric vs. Object-Centric Image-Text Cross-Modal Retrieval: A Reproducibility Study

Mariya Hendriksen, Svitlana Vakulenko, Ernst Kuiper, Maarten de Rijke

Summary: This article investigates the reproducibility and replicability of state-of-the-art cross-modal retrieval (CMR) results when evaluated on object-centric and scene-centric datasets. Two CMR models with different architectures are evaluated on two scene-centric datasets and three object-centric datasets. The study finds that the reproducibility and replicability of the experimental results are problematic, and that the scores obtained by the models on object-centric datasets are significantly lower than those obtained on scene-centric datasets.

ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III (2023)

Proceedings Paper Computer Science, Information Systems

Improving the Generalizability of the Dense Passage Retriever Using Generated Datasets

Thilina C. Rajapakse, Maarten de Rijke

Summary: Dense retrieval methods have outperformed traditional sparse retrieval methods in open-domain retrieval. However, there is a noticeable decrease in accuracy when these methods are applied to out-of-distribution and out-of-domain datasets. This may be due to the mismatch in information available to the context encoder and the query encoder during training. By training on datasets with multiple queries per passage, we show that dense passage retriever models perform better on out-of-distribution and out-of-domain test datasets than models trained on datasets with a single query per passage.

ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT II (2023)

Article Computer Science, Artificial Intelligence

Differentiable Logic Policy for Interpretable Deep Reinforcement Learning: A Study From an Optimization Perspective

Xin Li, Haojie Lei, Li Zhang, Mingzhong Wang

Summary: This paper explores interpretable Deep Reinforcement Learning (DRL) by representing policy using Differentiable Inductive Logic Programming (DILP). The research focuses on the optimization perspective of DILP-based policy learning and proposes using Mirror Descent for policy optimization. The theoretical and empirical studies verify the effectiveness of the proposed approach.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2023)

Proceedings Paper Computer Science, Information Systems

eCom'22: The SIGIR 2022 Workshop on eCommerce

Ajinkya Kale, Surya Kallumadi, Tracy Holloway King, Shervin Malmasi, Maarten de Rijke, Jacopo Tagliabue

Summary: eCommerce Information Retrieval is gaining attention in the academic literature and is essential for major eCommerce websites. The workshop aims to bring together researchers and practitioners to discuss unique topics in eCommerce IR and explore ways to improve search relevance using the combination of free text, structured data, and customer behavioral data.

PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22) (2022)

Proceedings Paper Computer Science, Information Systems

ReCANet: A Repeat Consumption-Aware Neural Network for Next Basket Recommendation in Grocery Shopping

Mozhdeh Ariannezhad, Sami Jullien, Ming Li, Min Fang, Sebastian Schelter, Maarten de Rijke

Summary: This paper presents an empirical study on the repeat consumption behavior of users in the context of grocery shopping. The study highlights the significance of repeat purchases in the performance of next basket recommendation (NBR). To address this, the authors propose ReCANet, a neural network model that explicitly models the repeat consumption behavior of users, and demonstrate its superior performance compared to state-of-the-art models for the NBR task.

PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22) (2022)
