Journal
EXPERT SYSTEMS WITH APPLICATIONS
Volume 39, Issue 8, Pages 7270-7280
Publisher
PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2012.01.096
Keywords
Microarray data classification; Feature selection; Machine learning; Efficient classification with few genes
Funding
- Junta de Castilla y Leon [VA100A08]
Microarray data classification is a task involving high dimensionality and small sample sizes. A common criterion for deciding on the number of selected genes is maximizing accuracy, which risks overfitting and usually selects more genes than are actually needed. Relaxing the maximum-accuracy criterion, we propose selecting the combination of attribute selection and classification algorithm that, using fewer attributes, achieves an accuracy not statistically significantly worse than the best. We also give some advice on choosing a suitable combination of attribute selection and classification algorithms to obtain good accuracy with a low number of gene expressions. We used several well-known attribute selection methods (FCBF, ReliefF and SVM-RFE, plus random selection as a baseline technique) and classification techniques (Naive Bayes, 3-Nearest Neighbor and SVM with a linear kernel), applied to 30 data sets involving different cancer types. (C) 2012 Elsevier Ltd. All rights reserved.
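
The relaxed criterion described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test the authors used, so the exact one-sided sign test below is an assumption chosen because it needs only the standard library. The function names (`sign_test_p_value`, `select_fewest_genes`), the `alpha` threshold, and the per-fold accuracy values are all hypothetical and invented for illustration; they are not results from the paper.

```python
from math import comb


def sign_test_p_value(best, other):
    """Exact one-sided sign test on paired per-fold accuracies.

    Returns the p-value for the alternative "other is worse than best".
    ASSUMPTION: the source does not name its test; this is a stand-in.
    """
    wins = sum(b > o for b, o in zip(best, other))    # folds where best is ahead
    losses = sum(b < o for b, o in zip(best, other))  # folds where other is ahead
    n = wins + losses                                 # ties are discarded
    if n == 0:
        return 1.0
    # P(X >= wins) for X ~ Binomial(n, 0.5)
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


def select_fewest_genes(results, alpha=0.05):
    """Pick the configuration with the fewest genes whose accuracy is not
    statistically significantly worse than the most accurate one.

    results maps (selector, classifier, n_genes) -> list of fold accuracies.
    """
    def mean(xs):
        return sum(xs) / len(xs)

    best_key = max(results, key=lambda k: mean(results[k]))
    best = results[best_key]
    # Keep every configuration that the test cannot separate from the best.
    candidates = [k for k in results
                  if sign_test_p_value(best, results[k]) >= alpha]
    # Fewest genes first; break ties by higher mean accuracy.
    return min(candidates, key=lambda k: (k[2], -mean(results[k])))


# Hypothetical 10-fold accuracies (illustrative numbers only).
results = {
    ("SVM-RFE", "SVM", 100): [0.95, 0.94, 0.96, 0.95, 0.97,
                              0.94, 0.96, 0.95, 0.96, 0.95],
    ("FCBF", "NB", 10): [0.94, 0.95, 0.95, 0.96, 0.96,
                         0.93, 0.97, 0.94, 0.96, 0.95],
    ("Random", "3NN", 5): [0.70, 0.72, 0.68, 0.71, 0.69,
                           0.70, 0.73, 0.69, 0.70, 0.71],
}

chosen = select_fewest_genes(results)
print(chosen)
```

With these made-up numbers, the 100-gene configuration has the highest mean accuracy, but the 10-gene configuration cannot be distinguished from it at the 0.05 level, so the sketch prefers the smaller one. The random 5-gene baseline is rejected because it is worse on every fold. A pure maximum-accuracy rule would have kept all 100 genes.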