Article

Feature selection for entity extraction from multiple biomedical corpora: A PSO-based approach

Journal

SOFT COMPUTING
Volume 22, Issue 20, Pages 6881-6904

Publisher

SPRINGER
DOI: 10.1007/s00500-017-2714-4

Keywords

Particle swarm optimization (PSO); Feature selection; Conditional random field; Entity extraction

Entity extraction is an important step in biomedical text mining. Among many other challenges, two issues are crucial: determining the most applicable feature set, so that the model is precise and less complex, and adapting the system across multiple benchmark corpora. In this paper, we propose a novel method for feature selection that uses the search capability of particle swarm optimization. The compact feature set used for training the classifier yields much better results than the baseline model, which was developed with the complete set of features. A large number of features suitable for the named entity recognition task in the biomedical domain are also developed in this paper. The complete set of features is implemented by studying the properties of the datasets and by drawing on domain knowledge. As the underlying learning algorithm we use the conditional random field, a robust classifier that has shown success in solving similar kinds of problems. Our experiments on multiple benchmark corpora yield a level of performance that is on par with state-of-the-art techniques.
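The abstract does not spell out how the PSO search is set up, so the following is only a minimal sketch of generic binary PSO for feature selection, not the authors' method. Each particle is a 0/1 mask over the features, and a sigmoid of the velocity gives the probability of switching a feature on. The names `binary_pso`, `toy_fitness` and `USEFUL`, and all parameter values, are hypothetical. In a setting like the paper's, the fitness would presumably be the held-out score of a conditional random field trained on the selected features; here a toy function stands in for it.

```python
import math
import random

def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic binary PSO: maximise `fitness` over 0/1 feature masks."""
    rng = random.Random(seed)
    # The first particle is the complete feature set (the baseline),
    # so the returned mask can never score below it.
    positions = [[1] * n_features] + [
        [rng.randint(0, 1) for _ in range(n_features)]
        for _ in range(n_particles - 1)
    ]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in positions]
    pbest_score = [fitness(p) for p in positions]
    g = max(range(n_particles), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * velocities[i][d]
                     + c1 * r1 * (pbest[i][d] - positions[i][d])
                     + c2 * r2 * (gbest[d] - positions[i][d]))
                v = max(-6.0, min(6.0, v))  # clamp to keep the sigmoid stable
                velocities[i][d] = v
                # Sigmoid of the velocity = probability the bit is set to 1.
                prob = 1.0 / (1.0 + math.exp(-v))
                positions[i][d] = 1 if rng.random() < prob else 0
            score = fitness(positions[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = positions[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = positions[i][:], score
    return gbest, gbest_score

# Hypothetical stand-in for "train a classifier, score it on held-out data":
# reward three useful features and penalise every selected feature.
USEFUL = {0, 2, 5}

def toy_fitness(mask):
    hits = sum(mask[i] for i in USEFUL)
    return hits - 0.1 * sum(mask)

if __name__ == "__main__":
    mask, score = binary_pso(toy_fitness, n_features=8)
    print("selected mask:", mask, "fitness:", score)
```

Seeding the swarm with the all-ones mask mirrors the comparison in the abstract between a compact feature set and a baseline built on the complete set. The size penalty in `toy_fitness` is one simple way to favour less complex models.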
