Journal
KNOWLEDGE-BASED SYSTEMS
Volume 24, Issue 6, Pages 740-748
Publisher
ELSEVIER
DOI: 10.1016/j.knosys.2010.12.010
Keywords
Virtual sample; Regularization theory; Cost-sensitive learning; Gaussian distribution; Prior knowledge
Funding
- National Natural Science Foundation of China [60873037, 61073043]
- Natural Science Foundation of Heilongjiang Province of China [F200901]
- China Postdoctoral Science Foundation [20090460880]
- Heilongjiang Province Postdoctoral Science Foundation [LBH-Z09214]
- Harbin Outstanding Academic Leader Foundation of Heilongjiang Province of China [2010RFXXG054]
Traditional machine learning algorithms do not achieve satisfactory generalization on noisy, imbalanced, or small-sample training sets. In this work, a novel virtual sample generation (VSG) method based on the Gaussian distribution is proposed. First, the method determines the mean and the standard error of the Gaussian distribution. Then, virtual samples are generated from this Gaussian distribution. Finally, a new training set is constructed by adding the virtual samples to the original training set. This work shows that training on the new training set is equivalent to a form of regularization for small-sample problems, or to cost-sensitive learning for imbalanced-sample problems. Experiments show that, given a suitable number of virtual sample replicates, classifiers trained on the new training sets can generalize better than those trained on the original training sets. (C) 2011 Published by Elsevier B.V.
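The pipeline the abstract describes — estimate Gaussian parameters, draw virtual samples, and append them to the original training set — can be sketched as follows. This is a minimal illustration, not the authors' exact method: the function name `generate_virtual_samples`, the per-class, per-feature parameter estimates, and the `n_replicates` interface are assumptions for demonstration; the paper's specific choice of mean and standard error may differ.

```python
import numpy as np

def generate_virtual_samples(X, y, n_replicates=2, rng=None):
    """Hypothetical sketch of Gaussian-based virtual sample generation.

    For each class, estimate the per-feature mean and standard deviation
    (assumed stand-ins for the paper's Gaussian parameters), then draw
    `n_replicates` batches of virtual samples from that Gaussian and
    append them to the original data.
    """
    rng = np.random.default_rng(rng)
    X_virtual, y_virtual = [], []
    for label in np.unique(y):
        Xc = X[y == label]
        mu = Xc.mean(axis=0)
        sigma = Xc.std(axis=0)  # ddof=0 so a tiny class still yields finite sigma
        for _ in range(n_replicates):
            X_virtual.append(rng.normal(mu, sigma, size=Xc.shape))
            y_virtual.append(np.full(len(Xc), label))
    # New training set = original samples + virtual samples
    X_new = np.vstack([X] + X_virtual)
    y_new = np.concatenate([y] + y_virtual)
    return X_new, y_new
```

With 4 original samples in 2 classes and `n_replicates=2`, the new training set contains the 4 originals plus 8 virtual samples; a classifier would then be trained on `(X_new, y_new)` instead of the original set.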