Article

Alternative prior assumptions for improving the performance of naive Bayesian classifiers

Journal

DATA MINING AND KNOWLEDGE DISCOVERY
Volume 18, Issue 2, Pages 183-213

Publisher

SPRINGER
DOI: 10.1007/s10618-008-0101-6

Keywords

Conjugate; Dirichlet assumption; Generalized Dirichlet distribution; Liouville distribution; Naive Bayesian classifier

The prior distribution of an attribute in a naive Bayesian classifier is typically assumed to be a Dirichlet distribution, and this is called the Dirichlet assumption. The variables in a Dirichlet random vector can never be positively correlated and must have the same confidence level as measured by normalized variance. Both the generalized Dirichlet and the Liouville distributions include the Dirichlet distribution as a special case. These two multivariate distributions, also defined on the unit simplex, are employed to investigate the impact of the Dirichlet assumption in naive Bayesian classifiers. We propose methods to construct appropriate generalized Dirichlet and Liouville priors for naive Bayesian classifiers. Our experimental results on 18 data sets reveal that the generalized Dirichlet distribution has the best performance among the three distribution families. Not only is the Dirichlet assumption inappropriate, but also forcing the variables in a prior to be all positively correlated can deteriorate the performance of the naive Bayesian classifier.
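
The two restrictions of the Dirichlet assumption named in the abstract can be checked from the standard closed-form Dirichlet moments. The sketch below is illustrative only and is not taken from the paper: the function names, the parameter values, and the example counts are hypothetical, and "normalized variance" is assumed here to mean Var(X_i) / (E[X_i] (1 - E[X_i])), which may differ from the paper's exact definition. It also shows the usual posterior-mean estimate that a Dirichlet prior yields for an attribute's value probabilities.

```python
from itertools import combinations


def dirichlet_moments(alpha):
    """Closed-form mean, variance, and pairwise covariance of Dirichlet(alpha)."""
    a0 = sum(alpha)
    scale = a0 ** 2 * (a0 + 1)
    mean = [a / a0 for a in alpha]
    var = [a * (a0 - a) / scale for a in alpha]
    # Every off-diagonal covariance is -alpha_i * alpha_j / scale, hence negative.
    cov = {
        (i, j): -alpha[i] * alpha[j] / scale
        for i, j in combinations(range(len(alpha)), 2)
    }
    return mean, var, cov


def posterior_mean(alpha, counts):
    """Posterior mean of the value probabilities after observing counts."""
    total = sum(alpha) + sum(counts)
    return [(a + n) / total for a, n in zip(alpha, counts)]


# Hypothetical prior parameters, chosen only for illustration.
alpha = [1.0, 2.0, 5.0]
mean, var, cov = dirichlet_moments(alpha)

# Assumed definition of normalized variance (see the lead-in above).
normalized = [v / (m * (1 - m)) for v, m in zip(var, mean)]

# A uniform Dirichlet(1, 1, 1) prior reproduces Laplace smoothing.
laplace = posterior_mean([1.0, 1.0, 1.0], [3, 0, 7])
```

Under these assumptions, every entry of `cov` is negative and every entry of `normalized` equals 1 / (sum(alpha) + 1), whatever the individual parameters are. The generalized Dirichlet and Liouville families discussed in the abstract relax these constraints; they are not implemented here.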
