☆ 4.6 Article

Five-way smoking status classification using text hot-spot identification and error-correcting output codes

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION (2008)

期刊

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION

卷 15, 期 1, 页码 32-35

出版社

OXFORD UNIV PRESS

DOI: 10.1197/jamia.M2434

关键词

类别

Computer Science, Information Systems Computer Science, Interdisciplinary Applications Health Care Sciences & Services Information Science & Library Science Medical Informatics

向作者/读者索取更多资源

Protocol

社区支持

Reagent

社区支持

摘要

We participated in the i2b2 smoking status classification challenge task. The purpose of this task was to evaluate the ability of systems to automatically identify patient smoking status from discharge summaries. Our submission included several techniques that we compared and studied, including hot-spot identification, zero-vector filtering, inverse class frequency weighting, error-correcting output codes, and post-processing rules. We evaluated our approaches using the same methods as the i2b2 task organizers, using micro- and macro-averaged F1 as the primary performance metric. Our best performing system achieved a micro-F1 of 0.9000 on the test collection, equivalent to the best performing system submitted to the i2b2 challenge. Hot-spot identification, zero-vector filtering, classifier weighting, and error correcting output coding contributed additively to increased performance, with hot-spot identification having by far the largest positive effect. High performance on automatic identification of patient smoking status from discharge summaries is achievable with the efficient and straightforward machine learning techniques studied here.

Five-way smoking status classification using text hot-spot identification and error-correcting output codes

期刊

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION

出版社

OXFORD UNIV PRESS

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

Five-way smoking status classification using text hot-spot identification and error-correcting output codes

期刊

JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION

出版社

OXFORD UNIV PRESS

关键词

类别

向作者/读者索取更多资源

Protocol

Reagent

作者

我是这篇论文的作者

评论

主要评分

次要评分

新颖性

重要性

科学严谨性

评价这篇论文

推荐

导出引文

分享论文