期刊
BIOINFORMATICS
卷 27, 期 18, 页码 2586-2594出版社
OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btr358
关键词
-
类别
资金
- National Science Council [NSC99-3112-B-001-005, NSC98-2221-E-155-060-MY3]
- Academia Sinica [95-02]
- research center for Humanities and Social Sciences [IIS-50-23]
Motivation: Gene normalization (GN) is the task of normalizing a textual gene mention to a unique gene database ID. Traditional top performing GN systems usually need to consider several constraints to make decisions in the normalization process, including filtering out false positives, or disambiguating an ambiguous gene mention, to improve system performance. However, these constraints are usually executed in several separate stages and cannot use each other's input/output interactively. In this article, we propose a novel approach that employs a Markov logic network (MLN) to model the constraints used in the GN task. Firstly, we show how various constraints can be formulated and combined in an MLN. Secondly, we are the first to apply the two main concepts of co-reference resolution-discourse salience in centering theory and transitivity-to GN models. Furthermore, to make our results more relevant to developers of information extraction applications, we adopt the instance-based precision/recall/F-measure (PRF) in addition to the article-wide PRF to assess system performance. Results: Experimental results show that our system outperforms baseline and state-of-the-art systems under two evaluation schemes. Through further analysis, we have found several unexplored challenges in the GN task.
作者
我是这篇论文的作者
点击您的名字以认领此论文并将其添加到您的个人资料中。
推荐
暂无数据