Article

On the assessment of expertise profiles

Publisher

WILEY
DOI: 10.1002/asi.22908

Keywords

information retrieval; knowledge management

Funding

  1. IOP-MMI program of SenterNovem/The Dutch Ministry of Economic Affairs, as part of the A Propos project
  2. Radio Culture and Auditory Resources Infrastructure Project (LARM)
  3. Danish National Research Infrastructures Program [09-067292]
  4. European Union's ICT Policy Support Programme as part of the Competitiveness and Innovation Framework Programme, CIP ICT-PSP [250430]
  5. European Community's Seventh Framework Programme (FP7) [258191, 288024]
  6. Netherlands Organisation for Scientific Research (NWO) [612.061.814, 612.061.815, 640.004.802, 380-70-011, 727.011.005, 612.001.116, 277-70-004]
  7. Center for Creation, Content and Technology (CCCT)
  8. Hyperlocal Service Platform project
  9. Service Innovation ICT program
  10. WAHSP project
  11. BILAND project
  12. CLARIN-nl program
  13. Dutch national program COMMIT
  14. ESF Research Network Program ELIAS

Expertise retrieval has attracted significant interest in the field of information retrieval. Expert finding has been studied extensively, with less attention going to the complementary task of expert profiling, that is, automatically identifying topics about which a person is knowledgeable. We describe a test collection for expert profiling in which expert users have self-selected their knowledge areas. Motivated by the sparseness of this set of knowledge areas, we report on an assessment experiment in which academic experts judge a profile that has been automatically generated by state-of-the-art expert-profiling algorithms; optionally, experts can indicate a level of expertise for relevant areas. Experts may also give feedback on the quality of the system-generated knowledge areas. We report on a content analysis of these comments and gain insights into what aspects of profiles matter to experts. We provide an error analysis of the system-generated profiles, identifying factors that help explain why certain experts may be harder to profile than others. We also analyze the impact on evaluating expert-profiling systems of using self-selected versus judged system-generated knowledge areas as ground truth; the two sets of ground truth rank systems somewhat differently but detect about the same number of pairwise significant differences, despite the fact that the judged system-generated assessments are sparser.

Recommended

Article Computer Science, Information Systems

Generating Relevant and Informative Questions for Open-Domain Conversations

Yanxiang Ling, Fei Cai, Jun Liu, Honghui Chen, Maarten de Rijke

Summary: Recent research emphasizes the importance of mixed-initiative interactions in conversational search. The task of question generation (QG) in open-domain conversational systems aims to enhance human-machine interactions. However, the limited availability of QG-specific data in conversations makes this task challenging. In this study, we propose a context-enhanced neural question generation (CNQG) model that leverages conversational context to predict question content and pattern. We also use multi-task learning with auxiliary training objectives and a self-supervised approach to train our question generator.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Economics

Parameter-efficient deep probabilistic forecasting

Olivier Sprangers, Sebastian Schelter, Maarten de Rijke

Summary: Probabilistic time series forecasting is crucial in various domains, and Transformer-based methods have achieved state-of-the-art performance. However, they require a large number of parameters and have high memory requirements. To address this, we propose a novel bidirectional temporal convolutional network with significantly fewer parameters. Our method performs on par with state-of-the-art approaches and requires less memory, reducing infrastructure cost.

INTERNATIONAL JOURNAL OF FORECASTING (2023)

Article Computer Science, Information Systems

Evaluating the Robustness of Click Models to Policy Distributional Shift

Romain Deffayet, Jean-Michel Renders, Maarten de Rijke

Summary: The performance of click models under policy distributional shift (PDS) is examined, and a new evaluation protocol is proposed to predict their performance under PDS, along with guidelines to mitigate risks.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Information Systems

Improving Transformer-based Sequential Recommenders through Preference Editing

Muyang Ma, Pengjie Ren, Zhumin Chen, Zhaochun Ren, Huasheng Liang, Jun Ma, Maarten de Rijke

Summary: One of the key challenges in sequential recommendation is how to extract and represent user preferences. We propose a transformer-based sequential recommendation model, named MrTransformer, to explore multiple user preferences. MrTransformer employs a preference-editing-based self-supervised learning mechanism to disentangle user preferences into multiple independent representations, improving preference extraction and representation. Experiments show that MrTransformer with preference editing outperforms state-of-the-art methods in terms of Recall, MRR, and NDCG, especially for long sequences of interactions.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Information Systems

PRADA: Practical Black-box Adversarial Attacks against Neural Ranking Models

Chen Wu, Ruqing Zhang, Jiafeng Guo, Maarten de Rijke, Yixing Fan, Xueqi Cheng

Summary: This article introduces the Word Substitution Ranking Attack (WSRA) task against Neural Ranking Models (NRMs), which aims to promote a target document's ranking by adding adversarial perturbations to its text. The proposed Pseudo Relevance-based ADversarial ranking Attack (PRADA) method outperforms existing attack strategies and successfully fools the NRM with small indiscernible perturbations of text.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Information Systems

A Next Basket Recommendation Reality Check

Ming Li, Sami Jullien, Mozhdeh Ariannezhad, Maarten de Rijke

Summary: The study investigates the performance of next basket recommendation (NBR) methods in practical applications and proposes a new set of evaluation metrics for NBR models. Through an experimental analysis of state-of-the-art NBR models, it reveals the actual progress that NBR methods have made.

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Parallel Split-Join Networks for Shared Account Cross-Domain Sequential Recommendations

Wenchao Sun, Muyang Ma, Pengjie Ren, Yujie Lin, Zhumin Chen, Zhaochun Ren, Jun Ma, Maarten de Rijke

Summary: This study addresses the challenges of sequential recommendation in a context where multiple users share a single account and behavior is available in multiple domains. The proposed PSJNet network learns role-specific representations and filters out irrelevant information using a gating mechanism. It also combines split and join techniques to learn cross-domain representations. Experimental results demonstrate that PSJNet outperforms state-of-the-art baselines in terms of MRR and Recall.

IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING (2023)

Article Computer Science, Artificial Intelligence

Keep and Select: Improving Hierarchical Context Modeling for Multi-Turn Response Generation

Yanxiang Ling, Fei Cai, Jun Liu, Honghui Chen, Maarten de Rijke

Summary: Hierarchical context modeling is crucial for response generation in multi-turn conversational systems. We propose a model named KS-CQ, which uses Keep and Select modules to generate neighbor-aware context representations and context-enriched query representations. Extensive experiments demonstrate the effectiveness of our approach compared to state-of-the-art baselines in both automatic and human evaluations.

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Foundation models and the privatization of public knowledge

Fabian Ferrari, Jose van Dijck, Antal van den Bosch

Summary: To ensure the integrity of knowledge production, regulators and researchers need access to the training procedures of foundation models like GPT-4. Foundation models need to be both open and accessible, although openness and accessibility are not synonymous.

NATURE MACHINE INTELLIGENCE (2023)

Proceedings Paper Computer Science, Information Systems

Contrasting Neural Click Models and Pointwise IPS Rankers

Philipp Hager, Maarten de Rijke, Onno Zoeter

Summary: Inverse-propensity scoring and neural click models are compared in this study for learning rankers from user clicks affected by position bias. Theoretical differences are explored and empirical comparisons are conducted on a prevalent evaluation setup. It is shown that both methods optimize for true document relevance when position bias is known, but small empirical differences are found when neural click models learn from shared, conflicting features.

ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT I (2023)

Proceedings Paper Computer Science, Information Systems

Scene-Centric vs. Object-Centric Image-Text Cross-Modal Retrieval: A Reproducibility Study

Mariya Hendriksen, Svitlana Vakulenko, Ernst Kuiper, Maarten de Rijke

Summary: This article investigates the reproducibility and replicability of state-of-the-art cross-modal retrieval (CMR) results when evaluated on object-centric and scene-centric datasets. Two CMR models with different architectures are evaluated on two scene-centric datasets and three object-centric datasets. The study finds that the reproducibility and replicability of the experimental results are problematic, and that the scores the models obtain on object-centric datasets are significantly lower than those on scene-centric datasets.

ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III (2023)

Proceedings Paper Computer Science, Information Systems

From Baseline to Top Performer: A Reproducibility Study of Approaches at the TREC 2021 Conversational Assistance Track

Weronika Lajewska, Krisztian Balog

Summary: This paper reports on reproducing the organizers' baseline and top participant submission at the TREC Conversational Assistance track in 2021. It highlights the challenges of reproducibility due to less strict requirements in accompanying papers. Results show key practical information is missing and indicate a smaller relative difference between baseline and top approach. The impact of pipeline components and dataset selection on system performance is explored, with findings suggesting the benefits of advanced retrieval techniques and different query rewriting methods.

ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III (2023)

Proceedings Paper Computer Science, Information Systems

Improving the Generalizability of the Dense Passage Retriever Using Generated Datasets

Thilina C. Rajapakse, Maarten de Rijke

Summary: Dense retrieval methods have outperformed traditional sparse retrieval methods in open-domain retrieval. However, there is a noticeable decrease in accuracy when these methods are applied to out-of-distribution and out-of-domain datasets. This may be due to the mismatch in information available to the context encoder and the query encoder during training. By training on datasets with multiple queries per passage, we show that dense passage retriever models perform better on out-of-distribution and out-of-domain test datasets than models trained on datasets with a single query per passage.

ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT II (2023)

Article Communication

Observe, inspect, modify: Three conditions for generative AI governance

Fabian Ferrari, Jose van Dijck, Antal van den Bosch

Summary: The absence of benchmarks to examine the effectiveness of oversight mechanisms for generative AI systems is a problem for research and policy. This article introduces the conditions of industrial observability, public inspectability, and technical modifiability as structural elements for governing generative AI systems. These conditions are exemplified using the EU's AI Act, grounding the analysis of oversight mechanisms in the material properties of generative AI systems.

NEW MEDIA & SOCIETY (2023)

Proceedings Paper Computer Science, Artificial Intelligence

Reproducibility as a Mechanism for Teaching Fairness, Accountability, Confidentiality, and Transparency in Artificial Intelligence

Ana Lucic, Maurits Bleeker, Sami Jullien, Samarth Bhargav, Maarten de Rijke

Summary: This paper explains the setup of a graduate-level course on Fairness, Accountability, Confidentiality, and Transparency in Artificial Intelligence (FACT-AI) at the University of Amsterdam, focusing on teaching FACT-AI concepts through reproducibility. The course involves a group project where students reproduce existing FACT-AI algorithms and write corresponding reports. The authors reflect on their experience teaching the course over two years, including during a global pandemic, and propose guidelines for teaching FACT-AI through reproducibility in graduate-level AI study programs.

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE (2022)
