Automated Identification of Surveillance Colonoscopy in Inflammatory Bowel Disease Using Natural Language Processing

DIGESTIVE DISEASES AND SCIENCES (2013)

Journal

DIGESTIVE DISEASES AND SCIENCES

Volume 58, Issue 4, Pages 936-941

Publisher

SPRINGER

DOI: 10.1007/s10620-012-2433-8

Keywords

Crohn's disease; Ulcerative colitis; Machine learning; Automated retrieval console

Category

Gastroenterology & Hepatology

Funding

American College of Gastroenterology Junior Faculty Development Award
Houston VA HSR&D Center of Excellence [HFP90-020]
Department of Veterans Affairs, Veterans Health Administration, Health Services Research and Development Service [MRP05-305]
National Institutes of Health/National Institute of Diabetes and Digestive and Kidney Disease Center [P30 DK56338, K24 DK078154-05]

Abstract

Differentiating surveillance from non-surveillance colonoscopy for colorectal cancer in patients with inflammatory bowel disease (IBD) using electronic medical records (EMR) is important for practice improvement and research purposes, but diagnosis code algorithms are lacking. The automated retrieval console (ARC) is natural language processing (NLP)-based software that allows text-based document-level classification. The purpose of this study was to test the feasibility and accuracy of ARC in identifying surveillance and non-surveillance colonoscopy in IBD using EMR. We performed a split validation study of electronic colonoscopy pathology reports for patients with IBD from the Michael E. DeBakey VA Medical Center. A gastroenterologist manually classified each pathology report as derived from either surveillance or non-surveillance colonoscopy. Pathology reports were randomly split into two sets: 70% for algorithm derivation and 30% for validation. An ARC-generated classification model was applied to the validation set of pathology reports, and its performance was compared with manual classification for surveillance and non-surveillance colonoscopy. A total of 575 colonoscopy pathology reports were available for 195 IBD patients, of which 400 were designated as the training set and 175 as the testing set. Within the testing set, 69 pathology reports were classified as surveillance by manual review, whereas the ARC model classified 66 reports as surveillance, for a recall of 0.77, precision of 0.80, and specificity of 0.88. ARC was able to identify surveillance colonoscopy for IBD without customized software programming. NLP-based document-level classification may be used to differentiate surveillance from non-surveillance colonoscopy in IBD.
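The reported recall, precision, and specificity can be checked against the counts given in the abstract. The abstract does not state the number of true positives, so the Python sketch below infers it by searching for the integer counts that reproduce all three rounded metrics. The function and variable names are illustrative only, and the resulting confusion matrix is a reconstruction from the abstract, not a table taken from the paper.

```python
# Counts stated in the abstract (testing set).
TOTAL = 175        # pathology reports in the testing set
MANUAL_POS = 69    # classified as surveillance by manual review
ARC_POS = 66       # classified as surveillance by the ARC model
REPORTED = {"recall": 0.77, "precision": 0.80, "specificity": 0.88}


def confusion_cells(total, actual_pos, predicted_pos, true_pos):
    """Derive the remaining confusion-matrix cells from the marginal counts."""
    fn = actual_pos - true_pos           # surveillance reports the model missed
    fp = predicted_pos - true_pos        # non-surveillance flagged as surveillance
    tn = total - actual_pos - fp         # non-surveillance correctly rejected
    return {"tp": true_pos, "fp": fp, "fn": fn, "tn": tn}


def metrics(cells):
    """Recall, precision, and specificity from confusion-matrix cells."""
    return {
        "recall": cells["tp"] / (cells["tp"] + cells["fn"]),
        "precision": cells["tp"] / (cells["tp"] + cells["fp"]),
        "specificity": cells["tn"] / (cells["tn"] + cells["fp"]),
    }


def matches_reported(cells):
    """True if every metric rounds to the value reported in the abstract."""
    computed = metrics(cells)
    return all(round(computed[name], 2) == value for name, value in REPORTED.items())


# The true-positive count is not given in the abstract, so try every feasible value.
candidates = [
    tp
    for tp in range(1, min(MANUAL_POS, ARC_POS) + 1)
    if matches_reported(confusion_cells(TOTAL, MANUAL_POS, ARC_POS, tp))
]

cells = confusion_cells(TOTAL, MANUAL_POS, ARC_POS, candidates[0])
rounded = {name: round(value, 2) for name, value in metrics(cells).items()}
print(candidates, cells, rounded)
```

Under these counts, only one true-positive value is consistent with all three reported figures, which suggests the abstract's numbers are internally coherent.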
