Journal
PATTERN RECOGNITION
Volume 114, Issue -, Pages -
Publisher: ELSEVIER SCI LTD
DOI: 10.1016/j.patcog.2020.107526
Keywords
Multi-label classification; Binary relevance; Nearest neighbor; Adaptive distance measure; Prototype weighting
The generalized Prototype Weighting (PW) scheme introduced in this paper supports objective functions beyond the error rate, including the F-measure, and uses gradient descent to learn its parameters, significantly improving performance in multi-label classification.
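The F-measure referred to above is computed directly from the elements of the confusion matrix, which is what allows it to serve as an optimization objective in place of the error rate. A minimal sketch (the function name `f_measure` is illustrative, not from the paper):

```python
def f_measure(tp, fp, fn, beta=1.0):
    # F-measure from confusion-matrix counts:
    # F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)
    # With beta = 1 this is the usual harmonic mean of precision and recall.
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom > 0 else 0.0

# Example: 8 true positives, 2 false positives, 2 false negatives.
print(f_measure(8, 2, 2))  # 0.8
```

Unlike the error rate, this objective weights the (rare) positive class through TP, FP and FN only, which is why it is better suited to the imbalanced per-label problems that binary relevance produces.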
In multi-label classification, each instance is associated with a set of pre-specified labels. A common approach is the Binary Relevance (BR) paradigm, which learns each label separately with a base classifier. Using k-Nearest Neighbor (kNN) as the base classifier (denoted BRkNN) is a simple, descriptive and powerful approach. However, binary relevance induces a highly imbalanced view of the dataset, and kNN is known to perform poorly on imbalanced data. One way to deal with this is to define the distance function in parametric form and use the training data to adjust the parameters (i.e., to adjust the boundaries between classes) by optimizing a performance measure suited to imbalanced data, e.g., the F-measure. The Prototype Weighting (PW) scheme presented in the literature (Paredes & Vidal, 2006) uses gradient descent to set the parameters by minimizing the classification error rate on the training data. This paper presents a generalized version of PW. First, instead of minimizing the error rate as in PW, the generalized PW also supports other objective functions built from the elements of the confusion matrix (including the F-measure). Second, PW, originally presented for 1NN, is extended to the general case of kNN (i.e., k >= 1). For problems with highly overlapping classes, this is expected to perform better, since a value of k > 1 produces smoother decision boundaries, which in turn can improve generalization. In multi-label problems with many labels, or problems with highly overlapping classes, the proposed generalized PW is expected to significantly improve performance, since many decision boundaries are involved. The performance of the proposed method has been compared with state-of-the-art methods in multi-label classification, comprising 6 lazy classifiers based on kNN. Experiments show that the proposed method significantly outperforms the other methods. (c) 2020 Published by Elsevier Ltd.
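The BRkNN setup the abstract describes can be sketched as follows: each training instance acts as a prototype carrying a learned multiplicative weight, query points are ranked by weighted distance, and each label is then predicted independently by majority vote among the k nearest prototypes. This is a minimal illustration only; the weights are fixed here, whereas in the PW scheme they would be learned by gradient descent on an objective such as the F-measure, and all function names are hypothetical:

```python
import math

def weighted_distance(x, prototype, weight):
    # Prototype-weighted Euclidean distance: each prototype p carries a
    # multiplicative weight w_p. Shrinking w_p enlarges p's region of
    # influence, which is how PW shifts decision boundaries between classes.
    return weight * math.dist(x, prototype)

def br_knn_predict(x, prototypes, label_matrix, weights, k=3):
    # Binary Relevance kNN: rank prototypes by weighted distance, then
    # predict each label independently by majority vote among the k
    # nearest prototypes (one binary vote per label).
    order = sorted(range(len(prototypes)),
                   key=lambda i: weighted_distance(x, prototypes[i], weights[i]))
    neighbors = order[:k]
    n_labels = len(label_matrix[0])
    return [int(sum(label_matrix[i][j] for i in neighbors) > k / 2)
            for j in range(n_labels)]

# Toy data: two clusters, two labels (binary indicator rows per prototype).
prototypes = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
label_matrix = [[1, 0], [1, 0], [0, 1], [0, 1]]
weights = [1.0, 1.0, 1.0, 1.0]  # uniform weights reduce to plain BRkNN
print(br_knn_predict((0.2, 0.5), prototypes, label_matrix, weights, k=3))  # [1, 0]
```

With uniform weights this is ordinary BRkNN; the generalized PW of the paper would instead tune `weights` to optimize a confusion-matrix-based objective on the training data, giving each of the many per-label decision boundaries a chance to move.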