4.7 Review

Screening nonrandomized studies for medical systematic reviews: A comparative study of classifiers

Journal

ARTIFICIAL INTELLIGENCE IN MEDICINE
Volume 55, Issue 3, Pages 197-207

Publisher

ELSEVIER SCIENCE BV
DOI: 10.1016/j.artmed.2012.05.002

Keywords

Medical informatics; Clinical research informatics; Text mining; Document classification; Systematic reviews

Funding

  1. Pittsburgh Biomedical Informatics Training Program [5 T15 LM/DE07059]
  2. NIH from the US National Library of Medicine [K99LM010943]

Objectives: To investigate whether (1) machine learning classifiers can help identify nonrandomized studies eligible for full-text screening by systematic reviewers; (2) classifier performance varies with optimization; and (3) the number of citations to screen can be reduced.

Methods: We used an open-source data-mining suite to process and classify biomedical citations that point to mostly nonrandomized studies from 2 systematic reviews. We built training and test sets for citation portions and compared classifier performance by considering the value of indexing, various feature sets, and optimization. We conducted our experiments in 2 phases. The design of phase I with no optimization was: 4 classifiers × 3 feature sets × 3 citation portions. Classifiers included k-nearest neighbor, naive Bayes, complement naive Bayes, and evolutionary support vector machine. Feature sets included bag of words, and 2- and 3-term n-grams. Citation portions included titles, titles and abstracts, and full citations with metadata. Phase II with optimization involved a subset of the classifiers, as well as features extracted from full citations, and full citations with overweighted titles. We optimized features and classifier parameters by manually setting information gain thresholds outside of a process for iterative grid optimization with 10-fold cross-validations. We independently tested models on data reserved for that purpose and statistically compared classifier performance on 2 types of feature sets. We estimated the number of citations needed to screen by reviewers during a second pass through a reduced set of citations.

Results: In phase I, the evolutionary support vector machine returned the best recall for bag of words extracted from full citations; the best classifier with respect to overall performance was k-nearest neighbor. No classifier attained good enough recall for this task without optimization. In phase II, we boosted performance with optimization for evolutionary support vector machine and complement naive Bayes classifiers. Generalization performance was better for the latter in the independent tests. For evolutionary support vector machine and complement naive Bayes classifiers, the initial retrieval set was reduced by 46% and 35%, respectively.

Conclusions: Machine learning classifiers can help identify nonrandomized studies eligible for full-text screening by systematic reviewers. Optimization can markedly improve performance of classifiers. However, generalizability varies with the classifier. The number of citations to screen during a second independent pass through the citations can be substantially reduced. © 2012 Elsevier B.V. All rights reserved.
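
As a rough illustration of the phase I design and of the two quantities the abstract reports (recall and the reduction in citations to screen), the following minimal Python sketch may help. It is not the authors' code: the function names are hypothetical, the toy labels are invented for the usage example, and the definition of "reduction" as the fraction of citations the classifier excludes is an assumption, since the abstract does not state the exact formula.

```python
from itertools import product

# Factor levels as named in the abstract (phase I, no optimization).
CLASSIFIERS = [
    "k-nearest neighbor",
    "naive Bayes",
    "complement naive Bayes",
    "evolutionary support vector machine",
]
FEATURE_SETS = ["bag of words", "2-term n-grams", "3-term n-grams"]
CITATION_PORTIONS = [
    "titles",
    "titles and abstracts",
    "full citations with metadata",
]


def phase_one_conditions():
    """Enumerate every cell of the 4 x 3 x 3 factorial design."""
    return list(product(CLASSIFIERS, FEATURE_SETS, CITATION_PORTIONS))


def recall(true_labels, predicted_labels):
    """Share of truly eligible citations that the classifier kept."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)
    total = true_pos + false_neg
    return true_pos / total if total else 0.0


def screening_reduction(predicted_labels):
    """Assumed definition: fraction of citations the classifier excludes,
    i.e. citations reviewers would not need to screen in a second pass."""
    excluded = sum(1 for p in predicted_labels if not p)
    return excluded / len(predicted_labels)


if __name__ == "__main__":
    print(len(phase_one_conditions()), "experimental conditions")
    # Hypothetical toy labels: 1 = eligible for full-text screening.
    truth = [1, 1, 0, 0]
    predicted = [1, 0, 0, 1]
    print("recall:", recall(truth, predicted))
    print("reduction:", screening_reduction(predicted))
```

For screening tasks, recall is usually the quantity to protect, because an eligible study that is excluded never reaches the reviewers; a large reduction is only useful if recall stays high.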

Recommended

Article Health Care Sciences & Services

A Taxonomy of Delivery and Documentation Deviations During Delivery of High-Fidelity Simulations

William R. McIvor, Arna Banerjee, John R. Boulet, Tanja Bekhuis, Eugene Tseytlin, Laurence Torsher, Samuel DeMaria, John P. Rask, Matthew S. Shotwell, Amanda Burden, Jeffrey B. Cooper, David M. Gaba, Adam Levine, Christine Park, Elizabeth Sinz, Randolph H. Steadman, Matthew B. Weinger

SIMULATION IN HEALTHCARE-JOURNAL OF THE SOCIETY FOR SIMULATION IN HEALTHCARE (2017)

Article Computer Science, Interdisciplinary Applications

Automated annotation and classification of BI-RADS assessment from radiology reports

Sergio M. Castro, Eugene Tseytlin, Olga Medvedeva, Kevin Mitchell, Shyam Visweswaran, Tanja Bekhuis, Rebecca S. Jacobson

JOURNAL OF BIOMEDICAL INFORMATICS (2017)

Article Health Care Sciences & Services

Using Natural Language Processing to Enable In-depth Analysis of Clinical Messages Posted to an Internet Mailing List: A Feasibility Study

Tanja Bekhuis, Marcos Kreinacke, Heiko Spallek, Mei Song, Jean A. O'Donnell

JOURNAL OF MEDICAL INTERNET RESEARCH (2011)

Review Multidisciplinary Sciences

Feature Engineering and a Proposed Decision-Support System for Systematic Reviewers of Medical Evidence

Tanja Bekhuis, Eugene Tseytlin, Kevin J. Mitchell, Dina Demner-Fushman

PLOS ONE (2014)

Article Dentistry, Oral Surgery & Medicine

Are dentists interested in the oral-systemic disease connection? A qualitative study of an online community of 450 practitioners

Mei Song, Jean A. O'Donnell, Tanja Bekhuis, Heiko Spallek

BMC ORAL HEALTH (2013)

Article Dentistry, Oral Surgery & Medicine

Evidence-Based Practice Knowledge, Perceptions, and Behavior: A Multi-Institutional, Cross-Sectional Study of a Population of US Dental Students

Cheryl L. Straub-Morarend, Christine R. Wankiiri-Hale, Derek R. Blanchette, Sharon K. Lanning, Tanja Bekhuis, Becky M. Smith, Abby J. Brodie, Deise Cruz Oliveira, Robert A. Handysides, Deborah V. Dawson, Heiko Spallek

JOURNAL OF DENTAL EDUCATION (2016)

Proceedings Paper Computer Science, Interdisciplinary Applications

A Prototype for a Hybrid System to Support Systematic Review Teams: A Case Study of Organ Transplantation

Tanja Bekhuis, Eugene Tseytlin, Kevin J. Mitchell

PROCEEDINGS 2015 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (2015)

Article Information Science & Library Science

Building a gold standard to construct search filters: a case study with biomarkers for oral cancer

John J. Frazier, Corey D. Stein, Eugene Tseytlin, Tanja Bekhuis

JOURNAL OF THE MEDICAL LIBRARY ASSOCIATION (2015)

Article Information Science & Library Science

Comparative effectiveness research designs: an analysis of terms and coverage in Medical Subject Headings (MeSH) and Emtree

Tanja Bekhuis, Dina Demner-Fushman, Rebecca Crowley

JOURNAL OF THE MEDICAL LIBRARY ASSOCIATION (2013)

Article Nutrition & Dietetics

High-soluble-fiber foods in conjunction with a telephone-based, personalized behavior change support service result in favorable changes in lipids and lifestyles after 7 weeks

PM Kris-Etherton, DS Taylor, H Smiciklas-Wright, DC Mitchell, TC Bekhuis, BH Olson, AB Slonim

JOURNAL OF THE AMERICAN DIETETIC ASSOCIATION (2002)

Article Computer Science, Artificial Intelligence

A few-shot disease diagnosis decision making model based on meta-learning for general practice

Qianghua Liu, Yu Tian, Tianshu Zhou, Kewei Lyu, Ran Xin, Yong Shang, Ying Liu, Jingjing Ren, Jingsong Li

Summary: This study proposes a few-shot disease diagnosis decision making model based on a model-agnostic meta-learning algorithm (FSDD-MAML). It significantly improves the diagnostic process in primary health care and helps general practitioners diagnose few-shot diseases more accurately.

ARTIFICIAL INTELLIGENCE IN MEDICINE (2024)

Article Computer Science, Artificial Intelligence

Predicting stroke outcome: A case for multimodal deep learning methods with tabular and CT Perfusion data

Balazs Borsos, Corinne G. Allaart, Aart van Halteren

Summary: The study demonstrates the feasibility of predicting functional outcomes for ischemic stroke patients and the usability of multimodal deep learning architectures for this purpose.

ARTIFICIAL INTELLIGENCE IN MEDICINE (2024)

Article Computer Science, Artificial Intelligence

Depression detection for twitter users using sentiment analysis in English and Arabic tweets

Abdelmoniem Helmy, Radwa Nassar, Nagy Ramdan

Summary: This study utilizes machine learning models to detect depression symptoms in Arabic and English texts, and provides manually and automatically annotated tweet corpora. The study also develops an application that can detect tweets with depression symptoms and predict depression trends.

ARTIFICIAL INTELLIGENCE IN MEDICINE (2024)