4.6 Article

Automatic de-identification of electronic medical records using token-level and character-level conditional random fields

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 58, Issue -, Pages S47-S52

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2015.06.009

Keywords

De-identification; Protected health information; Electronic medical records; i2b2; Natural language processing; Hybrid method

Funding

  1. National 863 Program of China [2015AA015405]
  2. NSFCs (National Natural Science Foundation of China) [61402128, 61473101, 61173075, 61272383]
  3. Strategic Emerging Industry Development Special Funds of Shenzhen [JCYJ20140508161040764, JCYJ20140417172417105, JCYJ20140627163809422]

De-identification, the identification and removal of all protected health information (PHI) in clinical data such as electronic medical records (EMRs), is a critical step in making clinical data publicly available. The 2014 i2b2 (Informatics for Integrating Biology and the Bedside) clinical natural language processing (NLP) challenge included a de-identification track (track 1). In this study, we propose a hybrid system for that track that combines machine learning and rule-based approaches. PHI instances are first identified by two conditional random fields (CRFs), one token-level and one character-level, and by a rule-based classifier; the outputs are then merged by a set of rules. Experiments on the i2b2 corpus show that the system we submitted to the challenge achieved its highest micro F-scores of 94.64%, 91.24% and 91.63% under the token, strict and relaxed criteria respectively, placing it among the top-ranked systems of the 2014 i2b2 challenge. After integrating refined localization dictionaries, the system improves further, to F-scores of 94.83%, 91.57% and 91.95% under the token, strict and relaxed criteria respectively. © 2015 Elsevier Inc. All rights reserved.
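
To make the reported metric concrete, the following is a minimal sketch of a micro-averaged F-score over PHI spans. It assumes that the strict criterion means an exact match on document, character offsets and PHI type; the challenge's official evaluation script may define the criteria differently. The `Span` record, the `micro_prf` function and the sample annotations are hypothetical and are not taken from the paper.

```python
from collections import namedtuple

# Hypothetical annotation record: document id, character offsets, PHI type.
Span = namedtuple("Span", ["doc", "start", "end", "phi_type"])


def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F1 over all spans in all documents.

    A predicted span counts as correct only if an identical gold span exists
    (assumed "strict" matching: same document, offsets and PHI type).
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented toy annotations, for illustration only.
gold = [
    Span("d1", 0, 10, "NAME"),
    Span("d1", 25, 35, "DATE"),
    Span("d2", 5, 12, "HOSPITAL"),
]
predicted = [
    Span("d1", 0, 10, "NAME"),
    Span("d1", 25, 33, "DATE"),   # wrong end offset: no credit under strict matching
    Span("d2", 5, 12, "HOSPITAL"),
    Span("d2", 40, 44, "AGE"),    # spurious prediction
]

precision, recall, f1 = micro_prf(gold, predicted)
print(f"P={precision:.4f} R={recall:.4f} F1={f1:.4f}")
```

Micro-averaging pools true positives, predictions and gold spans across all PHI types before computing the scores, so frequent types weigh more than rare ones. A relaxed or token-level variant would change only the rule that decides whether a prediction matches a gold annotation.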
