Article

A Survey on Tag Recommendation Methods

Publisher

WILEY
DOI: 10.1002/asi.23736

Funding

  1. Google
  2. FAPEMIG-PRONEX-MASWeb project Models, Algorithms and Systems for the Web [APQ-01400-14]
  3. National Institute of Science and Technology for the Web (INWEB)
  4. CNPq
  5. FAPEMIG

Tags (keywords freely assigned by users to describe web content) have become highly popular in Web 2.0 applications, because users have strong incentives to create and describe their own content and can do so easily. This growth in tag popularity has led to a vast literature on tag recommendation methods. These methods aim to assist users in the tagging process, possibly increasing the quality of the generated tags and, consequently, improving the quality of the information retrieval (IR) services that rely on tags as data sources. Despite the numerous and diverse previous studies on tag recommendation, to our knowledge, no previous work has summarized and organized them into a single survey article. In this article, we propose a taxonomy for tag recommendation methods, classifying them according to the target of the recommendations, their objectives, the data sources they exploit, and their underlying techniques. Moreover, we provide a critical overview of these methods, pointing out their advantages and disadvantages. Finally, we describe the main open challenges in the field, such as tag ambiguity, cold start, and evaluation issues.

Recommended

Article Computer Science, Information Systems

10SENT: A stable sentiment analysis method based on the combination of off-the-shelf approaches

Philipe F. Melo, Daniel H. Dalip, Manoel M. Junior, Marcos A. Goncalves, Fabricio Benevenuto

JOURNAL OF THE ASSOCIATION FOR INFORMATION SCIENCE AND TECHNOLOGY (2019)

Article Computer Science, Information Systems

Risk-Sensitive Learning to Rank with Evolutionary Multi-Objective Feature Selection

Daniel Xavier Sousa, Sergio Canuto, Marcos Andre Goncalves, Thierson Couto Rosa, Wellington Santos Martins

ACM TRANSACTIONS ON INFORMATION SYSTEMS (2019)

Article Computer Science, Information Systems

Fine-grained tourism prediction: Impact of social and environmental features

Amir Khatibi, Fabiano Belem, Ana Paula Couto da Silva, Jussara M. Almeida, Marcos A. Goncalves

INFORMATION PROCESSING & MANAGEMENT (2020)

Article Computer Science, Information Systems

Extended pre-processing pipeline for text classification: On the role of meta-feature representations, sparsification and selective sampling

Washington Cunha, Sergio Canuto, Felipe Viegas, Thiago Salles, Christian Gomes, Vitor Mangaravite, Elaine Resende, Thierson Rosa, Marcos Andre Goncalves, Leonardo Rocha

INFORMATION PROCESSING & MANAGEMENT (2020)

Article Statistics & Probability

A bias-variance analysis of state-of-the-art random forest text classifiers

Thiago Salles, Leonardo Rocha, Marcos Goncalves

Summary: The study analyzed variants of random forest (RF) classifiers in the case of noisy data, exploring the bias-variance decomposition of error rate and showing significant improvements in variance and bias stability for lazy and boosted RF variants. The research provides promising directions for further enhancements in RF-based learners.

ADVANCES IN DATA ANALYSIS AND CLASSIFICATION (2021)

Article Computer Science, Information Systems

Fixing the curse of the bad product descriptions - Search-boosted tag recommendation for E-commerce products

Fabiano M. Belem, Rodrigo M. Silva, Claudio M. de Andrade, Gabriel Person, Felipe Mingote, Raphael Ballet, Helton Alponti, Henrique P. de Oliveira, Jussara M. Almeida, Marcos A. Goncalves

INFORMATION PROCESSING & MANAGEMENT (2020)

Article Computer Science, Information Systems

Exploiting semantic relationships for unsupervised expansion of sentiment lexicons

Felipe Viegas, Mario S. Alvim, Sergio Canuto, Thierson Rosa, Marcos Andre Goncalves, Leonardo Rocha

INFORMATION SYSTEMS (2020)

Article Computer Science, Information Systems

On the cost-effectiveness of neural and non-neural approaches and representations for text classification: A comprehensive comparative study

Washington Cunha, Vitor Mangaravite, Christian Gomes, Sergio Canuto, Elaine Resende, Cecilia Nascimento, Felipe Viegas, Celso Franca, Wellington Santos Martins, Jussara M. Almeida, Thierson Rosa, Leonardo Rocha, Marcos Andre Goncalves

Summary: This article brings two major contributions. Firstly, it critically analyses recent scientific articles about different approaches for automatic text classification, revealing potential issues related to experimental procedures. Secondly, it provides a comparison between neural and non-neural ATC solutions, showing that simpler non-neural methods perform well in smaller datasets, while neural Transformers are better in larger datasets. However, the gains in effectiveness of neural methods are not significant compared to properly tuned non-neural solutions.

INFORMATION PROCESSING & MANAGEMENT (2021)

Review Health Care Sciences & Services

Impact of Big Data Analytics on People's Health: Overview of Systematic Reviews and Recommendations for Future Studies

Israel Junior Borges do Nascimento, Milena Soriano Marcolino, Hebatullah Mohamed Abdulazeem, Ishanka Weerasekara, Natasha Azzopardi-Muscat, Marcos Andre Goncalves, David Novillo-Ortiz

Summary: The study aimed to assess the impact of big data analytics on people's health, focusing on improving the accuracy of diagnosis for certain diseases, managing chronic diseases, and supporting real-time analysis of large, varied data inputs for disease prediction and diagnosis.

JOURNAL OF MEDICAL INTERNET RESEARCH (2021)

Article Computer Science, Information Systems

Individualized extreme dominance (IndED): A new preference-based method for multi-objective recommender systems

Reinaldo Silva Fortes, Daniel Xavier de Sousa, Dayanne G. Coelho, Anisio M. Lacerda, Marcos A. Goncalves

Summary: The study introduces a new preference-based multi-objective recommendation method, IndED, which better satisfies individual user preferences and balances objectives more effectively. By utilizing the concepts of extreme dominance and statistical significance tests, IndED defines a new Pareto-based dominance relation to guide optimization search based on user preferences.

INFORMATION SCIENCES (2021)

Article Multidisciplinary Sciences

FISETIO: A FIne-grained, Structured and Enriched Tourism Dataset for Indoor and Outdoor attractions

Amir Khatibi, Ana Paula Couto da Silva, Jussara M. Almeida, Marcos A. Goncalves

DATA IN BRIEF (2020)

Article Information Science & Library Science

A pragmatic approach to hierarchical categorization of research expertise in the presence of scarce information

Gustavo Oliveira de Siqueira, Sergio Canuto, Marcos Andre Goncalves, Alberto H. F. Laender

INTERNATIONAL JOURNAL ON DIGITAL LIBRARIES (2020)

Proceedings Paper Computer Science, Information Systems

Automatic Generation of Initial Reading Lists: Requirements and Solutions

Pablo Figueira, Fabiano Belem, Jussara M. Almeida, Marcos A. Goncalves

2019 ACM/IEEE JOINT CONFERENCE ON DIGITAL LIBRARIES (JCDL 2019) (2019)

Proceedings Paper Computer Science, Information Systems

Similarity-Based Synthetic Document Representations for Meta-Feature Generation in Text Classification

Sergio Canuto, Thiago Salles, Thierson C. Rosa, Marcos A. Goncalves

PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19) (2019)

Proceedings Paper Computer Science, Artificial Intelligence

CluWords: Exploiting Semantic Word Clustering Representation for Enhanced Topic Modeling

Felipe Viegas, Sergio Canuto, Christian Gomes, Washington Luiz, Thierson Rosa, Sabir Ribas, Leonardo Rocha, Marcos Andre Goncalves

PROCEEDINGS OF THE TWELFTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'19) (2019)
