期刊
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
卷 22, 期 6, 页码 812-825出版社
IEEE COMPUTER SOC
DOI: 10.1109/TKDE.2010.33
关键词
Active learning; domain expert; generalized query
With the assistance of a domain expert, active learning can often select or construct fewer examples to request their labels to build an accurate classifier. However, previous works of active learning can only generate and ask specific queries. In real-world applications, the domain experts (or oracles) are often more readily to answer generalized queries with don't-care attributes. The power of such generalized queries is that one generalized query is often equivalent to many specific ones. However, overly general queries are not good as answers from the domain experts (or oracles) can be highly uncertain, and this makes learning difficult. In this paper, we propose a novel active learning algorithm that asks good generalized queries. We, then, extend our algorithm to construct new, hierarchical features for both nominal and numeric attributes. We demonstrate experimentally that our new method asks significantly fewer queries compared with the previous works of active learning, even when the initial labeled data set is very small, and the oracle is inaccurate in class probability estimations. Our method can be readily deployed in real-world data mining tasks where obtaining labeled examples is costly.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据