Article

Applying machine learning algorithms to electronic health records to predict pneumonia after respiratory tract infection

JOURNAL OF CLINICAL EPIDEMIOLOGY (2022)

Journal

JOURNAL OF CLINICAL EPIDEMIOLOGY

Volume 145, Pages 154-163

Publisher

ELSEVIER SCIENCE INC

DOI: 10.1016/j.jclinepi.2022.01.009

Keywords

Respiratory tract infection; Pneumonia; Machine learning; Primary care; Electronic health records; Prediction modeling

Automated Summary

This study aimed to predict community-acquired pneumonia after respiratory tract infection (RTI) consultations in primary care by applying machine learning to electronic health records. The main predictors of a pneumonia diagnosis were older age, comorbidity, and initial presentation with lower respiratory tract infection (LRTI). The developed models achieved good discrimination in internal and temporal validations.

Abstract

Objectives: To predict community-acquired pneumonia after respiratory tract infection (RTI) consultations in primary care by applying machine learning to electronic health records.

Study design and Setting: A population-based cohort study was conducted using primary care electronic health records from 2002 to 2017. A total of 16,289 patients who consulted with RTIs and were subsequently diagnosed with pneumonia within 30 days were compared with a random sample of eligible RTI patients. Variable selection compared logistic regression, random forest, and penalized regression models. Prediction models were developed using classification and regression trees (CART) and logistic regression. Model performance was assessed through internal and temporal validations.

Results: Older age, comorbidity, and initial presentation with lower respiratory tract infection (LRTI) were identified as the main predictors of pneumonia diagnosis. The developed models achieved good discrimination: during internal validation, AUROC was 0.81 (0.80, 0.84) for the logistic regression model and 0.70 (0.69, 0.71) for CART; during temporal validation, it was 0.80 (0.79, 0.81) for logistic regression vs. 0.68 (0.67, 0.69) for CART.

Conclusion: From a large number of candidate variables, a small number of predictors of pneumonia were consistently identified through machine learning variable selection procedures. Logistic regression generally provided better model performance than CART models. © 2022 Elsevier Inc. All rights reserved.
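The abstract reports discrimination as AUROC under internal and temporal validation. As a rough illustration of what those two terms mean, the sketch below computes AUROC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, and splits records by calendar year so that a model is evaluated on later consultations than it was developed on. This is not the authors' code: the function names, the toy records, the `score` field, and the 2014 cutoff year are all assumptions made for illustration, and the abstract does not state where the study's temporal split was made.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the fraction of (case, non-case) pairs ranked correctly.

    Ties in predicted risk count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def temporal_split(records: List[Dict], cutoff_year: int) -> Tuple[List[Dict], List[Dict]]:
    """Develop on consultations before cutoff_year, validate on the rest."""
    develop = [r for r in records if r["year"] < cutoff_year]
    validate = [r for r in records if r["year"] >= cutoff_year]
    return develop, validate


# Hypothetical toy records: 'score' stands in for a model's predicted risk.
records = [
    {"year": 2005, "score": 0.10, "pneumonia": 0},
    {"year": 2008, "score": 0.70, "pneumonia": 1},
    {"year": 2011, "score": 0.40, "pneumonia": 0},
    {"year": 2015, "score": 0.80, "pneumonia": 1},
    {"year": 2016, "score": 0.30, "pneumonia": 0},
    {"year": 2016, "score": 0.20, "pneumonia": 1},
    {"year": 2017, "score": 0.60, "pneumonia": 0},
]

# The cutoff year is an assumption, not taken from the study.
develop, validate = temporal_split(records, cutoff_year=2014)
validation_auroc = auroc(
    [r["pneumonia"] for r in validate],
    [r["score"] for r in validate],
)
print(f"temporal-validation AUROC on toy data: {validation_auroc:.2f}")
```

The pairwise definition is quadratic in the number of records, which is acceptable for a sketch; a rank-based computation would be the usual choice for a cohort of the size described in the abstract.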
