Article

Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes

Journal

JOURNAL OF BIOMEDICAL INFORMATICS
Volume 75, Pages S28-S33

Publisher

ACADEMIC PRESS INC ELSEVIER SCIENCE
DOI: 10.1016/j.jbi.2017.06.005

Keywords

De-identification; Named entity recognition; Information extraction; Clinical text mining; Electronic health record

Funding

  1. UK's Farr Institute of Health Informatics Research, Health eResearch Centre
  2. Serbian Ministry of Education and Science [III44006, III47003]
  3. [NIH P50 MH106933]
  4. [NIH 4R13LM011411]
  5. MRC [MR/K006665/1] Funding Source: UKRI

De-identification of clinical narratives is one of the main obstacles to making healthcare free text available for research. In this paper we describe our experience in expanding and tailoring two existing tools as part of the 2016 CEGS N-GRID Shared Tasks Track 1, which evaluated de-identification methods on a set of psychiatric evaluation notes for up to 25 different types of Protected Health Information (PHI). The methods we used rely on machine learning on either a large or small feature space, with additional strategies, including two-pass tagging and multi-class models, which both proved to be beneficial. The results show that the integration of the proposed methods can identify Health Insurance Portability and Accountability Act (HIPAA)-defined PHI with overall F1-scores of approximately 90% and above. Yet some classes (Profession, Organization) again proved challenging, given the variability of the expressions used to refer to this information. © 2017 Published by Elsevier Inc.

