4.7 Review

Natural Language Processing in Surgery A Systematic Review and Meta-analysis

Journal

ANNALS OF SURGERY
Volume 273, Issue 5, Pages 900-908

Publisher

LIPPINCOTT WILLIAMS & WILKINS
DOI: 10.1097/SLA.0000000000004419

Keywords

artificial intelligence; electronic health record; machine learning; natural language processing; outcomes research; surgery

Categories

Ask authors/readers for more resources

The study systematically evaluated the use of natural language processing (NLP) in surgical outcomes research, finding that NLP models have a higher sensitivity in identifying postoperative complications compared to traditional non-NLP models. NLP is particularly effective in ruling out documentation of surgical outcomes, while demonstrating similar performance measures to traditional approaches.
Objective: The aim of this study was to systematically assess the application and potential benefits of natural language processing (NLP) in surgical outcomes research. Summary Background Data: Widespread implementation of electronic health records (EHRs) has generated a massive patient data source. Traditional methods of data capture, such as billing codes and/or manual review of free-text narratives in EHRs, are highly labor-intensive, costly, subjective, and potentially prone to bias. Methods: A literature search of PubMed, MEDLINE, Web of Science, and Embase identified all articles published starting in 2000 that used NLP models to assess perioperative surgical outcomes. Evaluation metrics of NLP systems were assessed by means of pooled analysis and meta-analysis. Qualitative synthesis was carried out to assess the results and risk of bias on outcomes. Results: The present study included 29 articles, with over half (n = 15) published after 2018. The most common outcome identified using NLP was postoperative complications (n = 14). Compared to traditional non-NLP models, NLP models identified postoperative complications with higher sensitivity [0.92 (0.87-0.95) vs 0.58 (0.33-0.79), P < 0.001]. The specificities were comparable at 0.99 (0.96-1.00) and 0.98 (0.95-0.99), respectively. Using summary of likelihood ratio matrices, traditional non-NLP models have clinical utility for confirming documentation of outcomes/diagnoses, whereas NLP models may be reliably utilized for both confirming and ruling out documentation of outcomes/diagnoses. Conclusions: NLP usage to extract a range of surgical outcomes, particularly postoperative complications, is accelerating across disciplines and areas of clinical outcomes research. NLP and traditional non-NLP approaches demonstrate similar performance measures, but NLP is superior in ruling out documentation of surgical outcomes.

Authors

I am an author on this paper
Click your name to claim this paper and add it to your profile.

Reviews

Primary Rating

4.7
Not enough ratings

Secondary Ratings

Novelty
-
Significance
-
Scientific rigor
-
Rate this paper

Recommended

No Data Available
No Data Available