Article

Credal-C4.5: Decision tree based on imprecise probabilities to classify noisy data

Journal

EXPERT SYSTEMS WITH APPLICATIONS
Volume 41, Issue 10, Pages 4625-4637

Publisher

PERGAMON-ELSEVIER SCIENCE LTD
DOI: 10.1016/j.eswa.2014.01.017

Keywords

Imprecise probabilities; Imprecise Dirichlet Model; Uncertainty measures; Credal decision trees; C4.5 algorithm; Noisy data

Funding

  1. Spanish Consejeria de Economia, Innovacion y Ciencia de la Junta de Andalucia [TIC-6016]
  2. Spanish MEC project [TIN2012-38969]

In classification, C4.5 is a well-known algorithm widely used to build decision trees. It relies on a pruning process to address the problem of overfitting. This paper presents a modification of C4.5, called Credal-C4.5, which uses a mathematical theory based on imprecise probabilities and uncertainty measures. Credal-C4.5 estimates the probabilities of the features and the class variable by using imprecise probabilities. It also uses a new split criterion, called Imprecise Information Gain Ratio, which applies uncertainty measures to convex sets of probability distributions (credal sets). In this manner, Credal-C4.5 builds trees for solving classification problems under the assumption that the training set is not fully reliable. We carried out several experimental studies comparing this new procedure with other ones and obtained the following principal conclusion: in domains with class noise, Credal-C4.5 obtains smaller trees and better performance than classic C4.5. © 2014 Elsevier Ltd. All rights reserved.
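
As a rough illustration of the kind of uncertainty measure the abstract refers to, the Python sketch below computes the maximum entropy over a credal set built from class counts with the Imprecise Dirichlet Model. It is not taken from the paper: the function names, the hyperparameter value s = 1, and the example counts are assumptions made here for illustration, and the full Imprecise Information Gain Ratio is not reproduced.

```python
import math


def idm_max_entropy_distribution(counts, s=1.0):
    """Maximum-entropy distribution in an IDM credal set (illustrative sketch).

    Assumes the credal set contains every distribution p with
    n_c / (N + s) <= p_c <= (n_c + s) / (N + s), where n_c are the class
    counts, N is their sum, and s > 0 is the IDM hyperparameter.
    The extra mass s is poured onto the smallest counts (water-filling),
    which makes the distribution as uniform as the bounds allow.
    """
    total = sum(counts)
    levels = sorted(counts)
    remaining = s
    level = levels[0]
    k = 1  # number of classes currently sitting at the lowest level
    while k < len(levels):
        need = (levels[k] - level) * k  # mass needed to reach the next count
        if need >= remaining:
            break
        remaining -= need
        level = levels[k]
        k += 1
    level += remaining / k  # spread what is left over the k lowest classes
    return [max(c, level) / (total + s) for c in counts]


def entropy(probs):
    """Shannon entropy in bits, treating 0 * log(0) as 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical class counts at a tree node.
counts = [3, 0, 1]
upper = idm_max_entropy_distribution(counts, s=1.0)
print(upper, entropy(upper))
```

With the hypothetical counts (3, 0, 1) and s = 1, the relative frequencies (0.75, 0, 0.25) shift to (0.6, 0.2, 0.2). The unseen class receives some probability mass, so the estimate is more cautious about classes that are rare in the training data.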
