Article

Moara: a Java library for extracting and normalizing gene and protein mentions

Journal

BMC BIOINFORMATICS
Volume 11

Publisher

BIOMED CENTRAL LTD
DOI: 10.1186/1471-2105-11-157

Funding

  1. Spanish grants [BIO2007-67150-C03-02, S-Gen-0166/2006, PS-010000-2008-1]
  2. European Union [FP7-HEALTH-F4-2008-202047]
  3. Spanish Ramon y Cajal program
  4. Integromics, S.L.

Background: Gene/protein recognition and normalization are important preliminary steps for many biological text mining tasks, such as information retrieval, extraction of protein-protein interactions, and extraction of semantic information. Although these problems have received considerable attention and effective solutions have been reported, easily integrated tools for performing these tasks are not readily available. Results: This study proposes a versatile and trainable Java library that implements gene/protein tagging and normalization steps based on machine learning approaches. The system has been trained for several model organisms and corpora but can be expanded to support new organisms and documents. Conclusions: Moara is a flexible, trainable and open-source system that is not specifically oriented to any organism and therefore does not require specific tuning of the algorithms or dictionaries used. Moara can be used as a stand-alone application or can be incorporated into the workflow of a more general text mining system.
