Article

CRFs based de-identification of medical records

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 58, Issue -, Pages S39-S46

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2015.08.012

Keywords

Protected health information; De-identification; Medical records; Conditional random fields

Funding

  1. NIH NLM [2U54LM008748, 5R13LM011411]
  2. NIH NIGMS [5R01GM102282]

De-identification is a shared task of the 2014 i2b2/UTHealth challenge. The purpose of this task is to remove protected health information (PHI) from medical records. In this paper, we propose a novel de-identifier, WI-deId, based on conditional random fields (CRFs). A preprocessing module tokenizes the medical records using regular expressions and an off-the-shelf tokenizer, and three groups of features are extracted to train the de-identifier model. Experiments show that the system is effective in the de-identification of medical records, achieving a micro-F1 of 0.9232 at the i2b2 strict entity evaluation level. (C) 2015 Elsevier Inc. All rights reserved.
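
The reported metric can be illustrated with a minimal sketch. It assumes that strict entity matching counts a predicted PHI span as correct only when its document, start offset, end offset, and PHI type all agree with a gold annotation, and that micro-averaging pools counts over all entities before computing precision and recall. The function name, the tuple layout, and the sample data are hypothetical and are not taken from the paper or from the official i2b2 evaluation script.

```python
from typing import Iterable, Set, Tuple

# Hypothetical entity layout: (document id, start offset, end offset, PHI type).
Entity = Tuple[str, int, int, str]


def micro_f1_strict(gold: Iterable[Entity], predicted: Iterable[Entity]) -> float:
    """Micro-averaged F1 where only exact span-and-type matches count."""
    gold_set: Set[Entity] = set(gold)
    pred_set: Set[Entity] = set(predicted)
    true_positives = len(gold_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Made-up annotations: the second prediction misses the end boundary by one
# character, so under strict matching it is both a false positive and a miss.
gold = [("doc1", 0, 10, "NAME"), ("doc1", 25, 35, "DATE"), ("doc2", 4, 12, "PHONE")]
pred = [("doc1", 0, 10, "NAME"), ("doc1", 25, 34, "DATE"), ("doc2", 4, 12, "PHONE")]
score = micro_f1_strict(gold, pred)
```

Because the counts are pooled before averaging, frequent PHI categories weigh more heavily in the score than rare ones.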