4.5 Article

Information theoretic-PSO-based feature selection: an application in biomedical entity extraction

Journal

KNOWLEDGE AND INFORMATION SYSTEMS
Volume 60, Issue 3, Pages 1453-1478

Publisher

SPRINGER LONDON LTD
DOI: 10.1007/s10115-018-1265-z

Keywords

Named entity recognition; Feature selection; Binary PSO; Correlation; Mutual information; Normalized mutual information; Particle swarm optimization

Named entity recognition is a vital task for various applications in biomedical natural language processing. It aims to extract biomedical entities from text and classify them into predefined categories. The entity types vary with genre and domain, such as gene versus non-gene in a coarse-grained scenario, or protein, DNA, RNA, cell line, and cell type in a fine-grained scenario. In this paper, we present a novel filter-based feature selection technique that uses the search capability of particle swarm optimization (PSO) to determine the best feature combination. The technique yields an optimized feature set that, when used for classifier learning, enhances system performance. The proposed approach is assessed on four popular biomedical corpora, namely GENIA, GENETAG, AIMed, and BioCreative-II Gene Mention Recognition (BC-II). Our proposed model obtains F-scores of 74.49%, 91.11%, 90.47%, and 88.64% on the GENIA, GENETAG, AIMed, and BC-II datasets, respectively. The efficiency of feature pruning through PSO is evident from the significant performance gains, even with a much reduced set of features.
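
To make the idea of filter-based feature selection driven by binary PSO concrete, here is a minimal sketch in plain Python. The abstract does not state the paper's actual objective function, so the fitness used here (mean mutual information with the label minus mean pairwise mutual information among the selected features) is an assumption in the spirit of the listed keywords, not the authors' method. The names `mutual_information`, `fitness`, and `binary_pso`, the parameter defaults, and the toy data are all hypothetical.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def fitness(mask, columns, labels):
    """Assumed filter criterion: relevance to the label minus redundancy."""
    chosen = [col for bit, col in zip(mask, columns) if bit]
    if not chosen:
        return float("-inf")  # an empty feature set is never acceptable
    relevance = sum(mutual_information(c, labels) for c in chosen) / len(chosen)
    pairs = [(a, b) for i, a in enumerate(chosen) for b in chosen[i + 1:]]
    redundancy = (
        sum(mutual_information(a, b) for a, b in pairs) / len(pairs)
        if pairs else 0.0
    )
    return relevance - redundancy


def binary_pso(columns, labels, particles=10, iterations=30,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO with a sigmoid transfer function over feature bit masks."""
    rng = random.Random(seed)
    dim = len(columns)
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, columns, labels) for p in pos]
    g = max(range(particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                # Velocity update: inertia + cognitive pull + social pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, vel[i][d]))  # clamp velocity
                # Sigmoid turns velocity into the probability of bit = 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], columns, labels)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


# Toy data (hypothetical): feature 0 matches the label, feature 1 duplicates
# feature 0 (redundant), and feature 2 is independent of the label.
labels = [0, 1, 0, 1, 0, 1, 0, 1]
columns = [
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
]
best_mask, best_fit = binary_pso(columns, labels)
print(best_mask, best_fit)
```

Under this assumed criterion, selecting only one of the two duplicated features scores higher than selecting both, which illustrates how a filter objective can prune redundant features without training a classifier inside the search loop.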
