Article

Coreference resolution of medical concepts in discharge summaries by exploiting contextual information

Publisher

BMJ Publishing Group
DOI: 10.1136/amiajnl-2012-000808

Funding

  1. National Library of Medicine [2U54LM008748]
  2. Consortium for Healthcare Informatics Research (CHIR)
  3. VA HSR HIR [08-374, 08-204]
  4. VA Informatics and Computing Infrastructure (VINCI)
  5. National Institutes of Health, National Library of Medicine [R13LM010743-01]
  6. National Core Facility Program for Biotechnology, Taiwan (Bioinformatics Consortium of Taiwan) [NSC100-2319-B-010-002, NSC98-2221-E-155-060-MY3]
  7. [U54LM008748]

Objective: Patient discharge summaries provide detailed medical information about hospitalized patients and are a rich source of data for clinical record text mining. The textual expressions of this information are highly variable. To build a precise picture of the patient, it is important to uncover the relationships between all instances in the text. In natural language processing (NLP), this task falls under the category of coreference resolution.

Design: A key contribution of this paper is the application of context-dependent rules that describe relationships between coreference pairs. To resolve phrases that refer to the same entity, the authors use these rules in three representative NLP systems: one rule-based, one based on the maximum entropy model, and one built on the Markov logic network (MLN) model.

Results: The experimental results show that the proposed MLN-based system outperforms the baseline system (exact match) by average F-scores of 4.3% and 5.7% on the Beth and Partners datasets, respectively. Finally, the three systems were integrated into an ensemble system, further improving performance to 87.21%, which is 4.5% more than the official i2b2 Track 1C average (82.7%).

Conclusion: This paper describes the main challenges in resolving coreference relations in patient discharge summaries. Several rules are proposed to exploit contextual information, and three approaches are presented. While the single systems provided promising results, an ensemble combining the three systems performed better than even the best single system.
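
To make the exact-match baseline and the ensemble idea concrete, the following is a minimal Python sketch. It is not the authors' implementation: the abstract does not say how mentions are represented or how the three systems are combined, so the `(mention_id, concept_type, text)` tuple layout, the `normalize` helper, and the majority-vote rule with `min_votes=2` are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace (an assumed, simple normalization)."""
    return " ".join(text.lower().split())


def exact_match_pairs(mentions):
    """Link every two mentions that share a concept type and normalized text.

    `mentions` is a list of (mention_id, concept_type, text) tuples. This
    layout is hypothetical and is not the i2b2 annotation format.
    """
    groups = defaultdict(list)
    for mention_id, concept_type, text in mentions:
        groups[(concept_type, normalize(text))].append(mention_id)

    pairs = set()
    for ids in groups.values():
        # Every unordered pair inside a group is treated as coreferent.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


def majority_vote(system_outputs, min_votes=2):
    """Keep a pair when at least `min_votes` systems proposed it.

    Majority voting is an assumed combination scheme; the abstract only
    says that the three systems were integrated into an ensemble.
    """
    votes = defaultdict(int)
    for pairs in system_outputs:
        for pair in pairs:
            votes[pair] += 1
    return {pair for pair, count in votes.items() if count >= min_votes}


if __name__ == "__main__":
    mentions = [
        ("m1", "problem", "Chest pain"),
        ("m2", "problem", "chest  pain"),
        ("m3", "test", "chest pain"),
        ("m4", "problem", "fever"),
    ]
    baseline = exact_match_pairs(mentions)
    print(baseline)

    # Three hypothetical system outputs, each a set of mention-id pairs.
    outputs = [baseline, {("m1", "m2"), ("m3", "m4")}, {("m2", "m4")}]
    print(majority_vote(outputs))
```

In this sketch, `m3` is not linked to `m1` or `m2` even though the strings match, because its concept type differs. Exact matching of this kind cannot use the surrounding context at all, which is the gap that the contextual rules described in the abstract are meant to address.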
