4.7 Article

Feature selection using Lebesgue and entropy measures for incomplete neighborhood decision systems

期刊

KNOWLEDGE-BASED SYSTEMS
卷 186, 期 -, 页码 -

出版社

ELSEVIER
DOI: 10.1016/j.knosys.2019.104942

关键词

Neighborhood rough sets; Feature selection; Neighborhood entropy; Lebesgue measure; Incomplete neighborhood decision systems

资金

  1. National Natural Science Foundation of China [61772176, 61402153, 61672332]
  2. Plan for Scientific Innovation Talent of Henan Province, China [184100510003]
  3. Key Scientific and Technological Project of Henan Province, China [182102210362]
  4. Young Scholar Program of Henan Province, China [2017GGJS041]

向作者/读者索取更多资源

Feature selection for mixed and incomplete data in terms of numerical and categorical features with missing values has currently gained considerable attention. The development of the neighborhood rough sets-based feature selection method is an important step in improving classification performance, especially in incomplete data with mixed continuous numerical and categorical features. In this paper, a novel feature selection method based on the neighborhood rough sets using Lebesgue and entropy measures in incomplete neighborhood decision systems is proposed, and the method has the capacity to handle mixed and incomplete datasets; further, it can simultaneously maintain the original classification information. First, a Lebesgue measure based on the neighborhood tolerance class is developed to study the positive region and dependency degree. To thoroughly analyze the uncertainty, noise and incompleteness of incomplete neighborhood decision systems, some neighborhood tolerance entropy-based uncertainty measures are presented based on Lebesgue and entropy measures. Then, by combining an algebraic view with an information view in neighborhood rough sets, the neighborhood tolerance dependency joint entropy is defined in incomplete neighborhood decision systems. Moreover, all the corresponding properties are discussed, and the relationships among these measures are established to meaningfully convey the knowledge essence and investigate the uncertainty of incomplete neighborhood decision systems. Finally, for all high-dimensional datasets, the Fisher score method is used to preliminarily eliminate irrelevant features to significantly reduce the computational complexity, and a heuristic feature selection algorithm is designed to improve the classification performance of mixed and incomplete datasets. Experiments under an instance and fifteen public datasets demonstrate that the proposed feature selection method is effective in selecting the most relevant features, achieving great classification ability for incomplete neighborhood decision systems. (C) 2019 Elsevier B.V. All rights reserved.

作者

我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。

评论

主要评分

4.7
评分不足

次要评分

新颖性
-
重要性
-
科学严谨性
-
评价这篇论文

推荐

暂无数据
暂无数据