Journal
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
Volume 14, Issue 3, Pages 545-554Publisher
IEEE COMPUTER SOC
DOI: 10.1109/TCBB.2016.2591523
Keywords
Electronic health records; healthcare quality; embedding models; neural language models
Categories
Funding
- DARPA grant [FA9550-12-1-0406]
- NSF BIGDATA grant [14476570]
- ONR grant [N00014-15-1-2729]
Ask authors/readers for more resources
Increased availability of Electronic Health Record (EHR) data provides unique opportunities for improving the quality of health services. In this study, we couple EHRs with the advanced machine learning tools to predict three important parameters of healthcare quality. More specifically, we describe how to learn low-dimensional vector representations of patient conditions and clinical procedures in an unsupervised manner, and generate feature vectors of hospitalized patients useful for predicting their length of stay, total incurred charges, and mortality rates. In order to learn vector representations, we propose to employ state-of-the-art language models specifically designed for modeling co-occurrence of diseases and applied clinical procedures. The proposed model is trained on a large-scale EHR database comprising more than 35 million hospitalizations in California over a period of nine years. We compared the proposed approach to several alternatives and evaluated their effectiveness by measuring accuracy of regression and classification models used for three predictive tasks considered in this study. Our model outperformed the baseline models on all tasks, indicating a strong potential of the proposed approach for advancing quality of the healthcare system.
Authors
I am an author on this paper
Click your name to claim this paper and add it to your profile.
Reviews
Recommended
No Data Available