Article

Multi-label thresholding for cost-sensitive classification

Journal

NEUROCOMPUTING
Volume 436, Pages 232-247

Publisher

ELSEVIER
DOI: 10.1016/j.neucom.2020.12.004

Keywords

Multi-label classification; Cost-sensitive learning; Threshold choice methods; Global threshold; Context; Misclassification costs

Funding

  1. Deanship of Scientific Research (DSR), King Abdulaziz University, Jeddah, Saudi Arabia [J1596121440]


This paper investigates cost-sensitive methods for multi-label classification, adopting a simple but general thresholding method that is applicable to most classification algorithms. It explores the choice of single and multiple thresholds and proposes cost curves and scatter diagrams for performance evaluation. Experimental evaluation on 13 multi-label datasets demonstrates that adjusting a single global threshold instead of per-label thresholds does not lead to significant performance loss.
Multi-label classification associates each instance with a set of labels, which reflects the nature of a wide range of real-world applications. However, existing approaches assume that all labels have the same misclassification cost, whereas in real-world problems different types of misclassification errors have different costs, which are generally unknown in the training context or might change from one context to another. Thus, there is a demand for cost-sensitive classification methods that minimise the average misclassification cost rather than error rates or counts.

In this paper, we adopt a simple yet general method, called thresholding, which applies to most classification algorithms to adapt them to cost-sensitive multi-label classification. This paper investigates current threshold choice approaches for multi-label classification. It explores the choice of single and multiple thresholds and extends some of the current techniques to support multi-label problems. Moreover, it proposes cost curves and scatter diagrams for performance evaluation in the multi-label setting. Experimental evaluation on 13 multi-label datasets demonstrates that there is no significant loss by adjusting a global threshold rather than a per-label threshold considering different misclassification costs across labels. Although tuning multiple thresholds is the obvious solution, the global threshold can also be valid. (c) 2020 Elsevier B.V. All rights reserved.
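
To make the distinction between per-label and global thresholds concrete, here is a minimal Python sketch. It is not the paper's method: it assumes calibrated scores and uses the standard cost-based rule (predict positive when the score is at least c_FP / (c_FP + c_FN)). The helper names, the toy scores, the cost values, and the choice to derive the global threshold from costs pooled over labels are all hypothetical, chosen for illustration only.

```python
from typing import List, Sequence


def cost_threshold(c_fp: float, c_fn: float) -> float:
    """Cost-based threshold for a calibrated score (standard rule, assumed here)."""
    return c_fp / (c_fp + c_fn)


def apply_thresholds(scores: Sequence[Sequence[float]],
                     thresholds: Sequence[float]) -> List[List[int]]:
    """Binarise each label's score with that label's threshold."""
    return [[int(s >= t) for s, t in zip(row, thresholds)] for row in scores]


def average_cost(y_true: Sequence[Sequence[int]],
                 y_pred: Sequence[Sequence[int]],
                 c_fp: Sequence[float],
                 c_fn: Sequence[float]) -> float:
    """Mean misclassification cost over all (instance, label) pairs."""
    total, count = 0.0, 0
    for true_row, pred_row in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(true_row, pred_row)):
            if p == 1 and t == 0:
                total += c_fp[j]      # false positive on label j
            elif p == 0 and t == 1:
                total += c_fn[j]      # false negative on label j
            count += 1
    return total / count


# Hypothetical toy data: 4 instances, 3 labels.
scores = [[0.9, 0.2, 0.6],
          [0.4, 0.7, 0.1],
          [0.3, 0.3, 0.85],
          [0.6, 0.55, 0.4]]
y_true = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1],
          [1, 1, 0]]
c_fp = [1.0, 1.0, 4.0]   # assumed per-label false-positive costs
c_fn = [4.0, 1.0, 1.0]   # assumed per-label false-negative costs

# One threshold per label, each from that label's own costs.
per_label = [cost_threshold(fp, fn) for fp, fn in zip(c_fp, c_fn)]

# One shared threshold; pooling the costs is an assumption of this sketch.
global_t = cost_threshold(sum(c_fp), sum(c_fn))

cost_per_label = average_cost(y_true, apply_thresholds(scores, per_label), c_fp, c_fn)
cost_global = average_cost(y_true, apply_thresholds(scores, [global_t] * 3), c_fp, c_fn)
print(per_label, global_t, cost_per_label, cost_global)
```

The toy numbers are arbitrary, so the two resulting costs say nothing about the paper's experimental finding; the sketch only shows where the two strategies differ in code, namely in whether `apply_thresholds` receives one value per label or the same value repeated.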
