Article

Development of an optical character recognition pipeline for handwritten form fields from an electronic health record

Journal

Publisher

OXFORD UNIV PRESS
DOI: 10.1136/amiajnl-2011-000182

Funding

  1. NIH from the National Human Genome Research Institute (NHGRI) [5U01HG004608-02]
  2. Clinical and Translational Science Award (CTSA) program of the National Center for Research Resources, National Institutes of Health [1UL1RR025011]

Background: Although the penetration of electronic health records is increasing rapidly, much of the historical medical record is only available in handwritten notes and forms, which require labor-intensive, human chart abstraction for some clinical research. The few previous studies on automated extraction of data from these handwritten notes have focused on monolithic, custom-developed recognition systems or third-party systems that require proprietary forms.

Methods: We present an optical character recognition processing pipeline, which leverages the capabilities of existing third-party optical character recognition engines, and provides the flexibility offered by a modular custom-developed system. The system was configured and run on a selected set of form fields extracted from a corpus of handwritten ophthalmology forms.

Observations: The processing pipeline allowed multiple configurations to be run, with the optimal configuration consisting of the Nuance and LEADTOOLS engines running in parallel with a positive predictive value of 94.6% and a sensitivity of 13.5%.

Discussion: While limitations exist, preliminary experience from this project yielded insights on the generalizability and applicability of integrating multiple, inexpensive general-purpose third-party optical character recognition engines in a modular pipeline.
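To make the reported metrics concrete, the following is a minimal sketch of how a parallel, multi-engine configuration could be scored. It is not the authors' implementation. The `Engine` interface, the `run_parallel` and `ppv_and_sensitivity` helpers, and the unanimous-agreement rule are all assumptions introduced for illustration; the abstract does not say how the outputs of the parallel engines were reconciled, and no real Nuance or LEADTOOLS API is used here. Sensitivity is taken to be the share of all ground-truth fields that were recognized correctly, which is also an assumption.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical engine interface: takes a field image as bytes and returns the
# recognized text, or None when the engine declines to answer.
Engine = Callable[[bytes], Optional[str]]


def run_parallel(engines: Sequence[Engine], field_image: bytes) -> Optional[str]:
    """Return a value only when every engine returns the same non-empty text.

    Assumption: unanimous agreement is the combination rule. "Parallel" here
    means each engine sees the same input; the calls themselves run in turn.
    """
    results = [engine(field_image) for engine in engines]
    first = results[0] if results else None
    if first and all(result == first for result in results):
        return first
    return None  # engines disagreed or abstained, so the field is left blank


def ppv_and_sensitivity(
    predictions: Sequence[Optional[str]], truths: Sequence[str]
) -> Tuple[float, float]:
    """Score pipeline output against manually abstracted ground truth.

    PPV = correct answers / fields the pipeline answered.
    Sensitivity = correct answers / all ground-truth fields (assumed definition).
    """
    answered = sum(1 for p in predictions if p is not None)
    correct = sum(
        1 for p, t in zip(predictions, truths) if p is not None and p == t
    )
    ppv = correct / answered if answered else 0.0
    sensitivity = correct / len(truths) if truths else 0.0
    return ppv, sensitivity
```

Under a strict agreement rule like this one, the pipeline answers rarely but is usually right when it does, which is one way a configuration could pair a high positive predictive value with a low sensitivity.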
