☆ 4.7 Article

Feature selection using Lebesgue and entropy measures for incomplete neighborhood decision systems

KNOWLEDGE-BASED SYSTEMS (2019)

Journal

KNOWLEDGE-BASED SYSTEMS

Volume 186, Issue -, Pages -

Publisher

ELSEVIER

DOI: 10.1016/j.knosys.2019.104942

Keywords

Neighborhood rough sets; Feature selection; Neighborhood entropy; Lebesgue measure; Incomplete neighborhood decision systems

Categories

Computer Science, Artificial Intelligence

Funding

National Natural Science Foundation of China [61772176, 61402153, 61672332]
Plan for Scientific Innovation Talent of Henan Province, China [184100510003]
Key Scientific and Technological Project of Henan Province, China [182102210362]
Young Scholar Program of Henan Province, China [2017GGJS041]

Abstract

Feature selection for mixed and incomplete data, that is, data with numerical and categorical features and missing values, has recently gained considerable attention. Developing feature selection methods based on neighborhood rough sets is an important step in improving classification performance, especially for incomplete data with mixed continuous numerical and categorical features. This paper proposes a novel feature selection method based on neighborhood rough sets that uses Lebesgue and entropy measures in incomplete neighborhood decision systems. The method can handle mixed and incomplete datasets while maintaining the original classification information. First, a Lebesgue measure based on the neighborhood tolerance class is developed to study the positive region and dependency degree. To thoroughly analyze the uncertainty, noise and incompleteness of incomplete neighborhood decision systems, several neighborhood tolerance entropy-based uncertainty measures are presented based on Lebesgue and entropy measures. Then, by combining the algebraic view with the information view in neighborhood rough sets, the neighborhood tolerance dependency joint entropy is defined in incomplete neighborhood decision systems. Moreover, the corresponding properties are discussed, and the relationships among these measures are established to convey the knowledge essence and investigate the uncertainty of incomplete neighborhood decision systems. Finally, for all high-dimensional datasets, the Fisher score method is used to preliminarily eliminate irrelevant features and thereby significantly reduce the computational complexity, and a heuristic feature selection algorithm is designed to improve the classification performance on mixed and incomplete datasets. Experiments on an instance and fifteen public datasets demonstrate that the proposed feature selection method is effective in selecting the most relevant features and achieves strong classification ability for incomplete neighborhood decision systems. (C) 2019 Elsevier B.V. All rights reserved.
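
To make the notions of neighborhood tolerance class, positive region and dependency degree more concrete, the following is a minimal Python sketch. It is not the paper's definition: it assumes a common textbook-style tolerance relation in which a missing value (represented here as None) is compatible with any value, numerical values match when they differ by at most a radius delta, and categorical values match when they are equal. On a finite sample set it simply counts elements, where the paper works with a Lebesgue measure. The names tolerant, dependency_degree, data, labels and delta are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Sequence, Union

# A feature value is a number, a category label, or None for "missing".
Value = Optional[Union[float, str]]


def tolerant(x: Sequence[Value], y: Sequence[Value],
             features: Sequence[int], delta: float) -> bool:
    """Return True if x and y cannot be told apart on the given features.

    Assumed relation (not taken from the paper): a missing value matches
    anything, numbers match within delta, categories match when equal.
    """
    for j in features:
        a, b = x[j], y[j]
        if a is None or b is None:
            continue  # missing value: compatible with any value
        if isinstance(a, str) or isinstance(b, str):
            if a != b:
                return False  # categorical mismatch
        elif abs(a - b) > delta:
            return False  # numerical values too far apart
    return True


def dependency_degree(data: List[Sequence[Value]], labels: Sequence[int],
                      features: Sequence[int], delta: float) -> float:
    """Fraction of samples whose tolerance class has a single decision label."""
    n = len(data)
    positive = 0
    for i in range(n):
        # Neighborhood tolerance class of sample i on the chosen features.
        tol_class = [k for k in range(n)
                     if tolerant(data[i], data[k], features, delta)]
        # Sample i is in the positive region if its class is label-pure.
        if all(labels[k] == labels[i] for k in tol_class):
            positive += 1
    return positive / n


# Toy mixed dataset: feature 0 is numerical, feature 1 is categorical.
data = [
    [0.10, "a"],
    [0.15, "a"],
    [0.80, "b"],
    [None, "b"],  # numerical value missing
]
labels = [0, 0, 1, 1]

print(dependency_degree(data, labels, [0], 0.1))
print(dependency_degree(data, labels, [0, 1], 0.1))
```

In this toy data the missing numerical value makes the fourth sample tolerant with every other sample when only feature 0 is used, so most tolerance classes mix both decision labels. Adding the categorical feature separates them again, which is the kind of gain in dependency a heuristic selection procedure can use to rank candidate features.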
