Article

Occam's razor in dimension reduction: Using reduced row Echelon form for finding linear independent features in high dimensional microarray datasets

Journal

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.engappai.2017.04.006

Keywords

Occam's razor philosophy; Machine learning; Feature selection; Linear independent columns; Row reduced Echelon form

High-dimensional microarray datasets suffer from a small sample size and an extremely large number of features. Feature selection therefore plays a crucial role in the performance of models trained on those datasets. A typical feature selection method consists of two main parts: a problem criterion and a search strategy. Common datasets do not have a huge number of features relative to their number of samples; hence, the search strategies in their feature selection methods are able to explore the search space. In contrast, high-dimensional microarray datasets have a huge number of features; their search space is therefore very large, and searching it is computationally prohibitive. In this paper, we apply the philosophy of Occam's razor to feature subset selection in order to free high-dimensional datasets from computational search methods. The proposed method selects features in two stages. In the first stage, features are rearranged by their importance in the dataset; in the second stage, the fundamental concept of the reduced row echelon form is applied to the dataset in order to find linearly independent features. To determine the effectiveness of the proposed method, experiments are carried out on nine binary high-dimensional microarray datasets. The results are compared with eleven state-of-the-art feature selection algorithms, including Correlation-based Feature Selection (CFS), Fast Correlation-Based Filter (FCBF), Interact (INT) and Maximum Relevancy Minimum Redundancy (MRMR). The average results are analyzed with a nonparametric statistical test, which reveals that the proposed method is meaningfully superior to the others in terms of accuracy, sensitivity, specificity, G-mean, number of selected features and computational complexity.
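
The two stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how importance is measured, so the ranking score used here (absolute difference between class means) is purely an assumption, and the names `rank_features`, `rref_pivot_columns` and `select_features` are hypothetical. Exact rational arithmetic keeps the toy example free of rounding issues; real-valued microarray data would need a numerical tolerance instead.

```python
from fractions import Fraction


def rank_features(X, y):
    """Stage 1 (assumed score): order feature indices by the absolute
    difference between the class-1 mean and the class-0 mean, largest first."""
    def score(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    return sorted(range(len(X[0])), key=score, reverse=True)


def rref_pivot_columns(rows):
    """Stage 2: reduce the matrix to reduced row echelon form and return the
    pivot column indices; those columns are linearly independent."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    pivots = []
    r = 0  # index of the next pivot row
    for c in range(n_cols):
        if r == n_rows:
            break
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, n_rows) if m[i][c] != 0), None)
        if p is None:
            continue  # column c depends on earlier pivot columns
        m[r], m[p] = m[p], m[r]
        pivot_value = m[r][c]
        m[r] = [v / pivot_value for v in m[r]]
        # Clear column c in every other row.
        for i in range(n_rows):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    return pivots


def select_features(X, y):
    """Reorder columns by importance, then keep only the pivot columns."""
    order = rank_features(X, y)
    reordered = [[row[j] for j in order] for row in X]
    return [order[c] for c in rref_pivot_columns(reordered)]


# Toy data: 3 samples, 4 features; feature 1 is exactly 2 x feature 0.
X = [[1, 2, 3, 0],
     [2, 4, 1, 1],
     [3, 6, 0, 5]]
y = [0, 0, 1]
selected = select_features(X, y)
print(selected)
```

Because the columns are reordered before the reduction, a redundant feature is dropped in favour of the higher-ranked feature it depends on: in the toy data, feature 0 is discarded and feature 1 is kept. The number of pivot columns can never exceed the number of samples, so on data with few samples and many features the selected subset is necessarily small.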
