Article
Computer Science, Information Systems
Inyoung Jun, Shannan N. Rich, Zhaoyi Chen, Jiang Bian, Mattia Prosperi
Summary: Replicating prediction models using EHR data, especially for MRSA outcomes, is challenging due to the lack of well-defined computable phenotypes for predictors and outcomes. While most original variables can be (re)computed, certain variables may only be approximated by proxy computable phenotypes, leading to mild discriminatory ability in the replicated prediction model. Despite the richness of EHR data, limited availability of validated computable phenotypes remains a challenge when replicating complex prediction models.
INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS
(2021)
Review
Biology
Martin Chapman, Shahzad Mumtaz, Luke Rasmussen, Andreas Karwath, Georgios Gkoutos, Chuang Gao, Dan Thayer, Jennifer A. Pacheco, Helen Parkinson, Rachel L. Richesson, Emily Jefferson, Spiros Denaxas, Vasa Curcin
Summary: This study presents a set of desiderata for the design of a next-generation phenotype library that aims to ensure the quality of hosted definitions by combining functionality currently offered by disparate tooling. Researchers examined phenotype models, implementation, validation, and contemporary phenotype libraries, resulting in 14 library desiderata that promote high-quality phenotype definitions.
Review
Computer Science, Interdisciplinary Applications
Yuqi Si, Jingcheng Du, Zhao Li, Xiaoqian Jiang, Timothy Miller, Fei Wang, W. Jim Zheng, Kirk Roberts
Summary: Patient representation learning involves developing dense mathematical representations of patients from Electronic Health Records (EHRs) using advanced deep learning methods. Studies from 2015 to 2019 saw a doubling in publications on this topic, with structured EHR data, recurrent neural networks, and supervised learning being commonly used approaches. Disease prediction was the most common application, while privacy concerns and lack of benchmark datasets were challenges faced by researchers in this field.
JOURNAL OF BIOMEDICAL INFORMATICS
(2021)
Article
Public, Environmental & Occupational Health
Mira D. Vale, Denise White Perkins
Summary: This study investigates clinicians' strategies for working with social determinants of health (SDOH) data and the challenges confronting SDOH standardization. The findings reveal that clinicians use different strategies to integrate SDOH data into patient care, but these strategies have limitations for coordinating care across institutions and standardizing SDOH data in electronic health records.
SOCIAL SCIENCE & MEDICINE
(2022)
Review
Computer Science, Interdisciplinary Applications
Feng Xie, Han Yuan, Yilin Ning, Marcus Eng Hock Ong, Mengling Feng, Wynne Hsu, Bibhas Chakraborty, Nan Liu
Summary: This study systematically examines deep learning solutions for temporal data representation in electronic health records (EHRs). The study identifies challenges in representing temporal data, such as irregularity, heterogeneity, sparsity, and model opacity. It explores how deep learning techniques address these challenges and discusses open challenges in the field. The study concludes that deep learning solutions can partially address the challenges of temporal EHR data, but future research should focus on designing comprehensive and integrated solutions and incorporating clinical domain knowledge and model interpretability.
JOURNAL OF BIOMEDICAL INFORMATICS
(2022)
Article
Medicine, General & Internal
Hossein Estiri, Alaleh Azhir, Deborah L. Blacker, Christine S. Ritchie, Chirag J. Patel, Shawn N. Murphy
Summary: This study developed computational models for identifying Alzheimer's Disease (AD) cohorts and compared the utility of AD diagnosis codes and temporal representations from electronic health records (EHRs) for characterizing AD cohorts. The models with sequential features improved AD classification by 3-16% over the use of diagnosis codes alone. These findings have important implications for accelerating AD research and precision drug development.
Article
Health Care Sciences & Services
Ning Shang, Atlas Khan, Fernanda Polubriaginof, Francesca Zanoni, Karla Mehl, David Fasel, Paul E. Drawz, Robert J. Carrol, Joshua C. Denny, Matthew A. Hathcock, Adelaide M. Arruda-Olson, Peggy L. Peissig, Richard A. Dart, Murray H. Brilliant, Eric B. Larson, David S. Carrell, Sarah Pendergrass, Shefali Setia Verma, Marylyn D. Ritchie, Barbara Benoit, Vivian S. Gainer, Elizabeth W. Karlson, Adam S. Gordon, Gail P. Jarvik, Ian B. Stanaway, David R. Crosslin, Sumit Mohan, Iuliana Ionita-Laza, Nicholas P. Tatonetti, Ali G. Gharavi, George Hripcsak, Chunhua Weng, Krzysztof Kiryluk
Summary: The study implemented a portable and scalable electronic CKD phenotype to aid in early disease recognition and large-scale observational and genetic studies. Through manual validation and case-control validation, the algorithm showed high accuracy and detected a significant number of undetected CKD cases.
NPJ DIGITAL MEDICINE
(2021)
Review
Computer Science, Information Systems
Siqi Li, Pinyan Liu, Gustavo G. Nascimento, Xinru Wang, Fabio Renato Manzolli Leite, Bibhas Chakraborty, Chuan Hong, Yilin Ning, Feng Xie, Zhen Ling Teo, Daniel Shu Wei Ting, Hamed Haddadi, Marcus Eng Hock Ong, Marco Aurelio Peres, Nan Liu
Summary: This review examines the application of federated learning (FL) on structured medical data, identifies limitations, and discusses potential innovations. Out of 34 included articles, most utilized data from electronic health records and focused on clinical predictions and association studies. However, there is a lack of sufficient evaluation of clinically meaningful benefits and comparisons with single-site analyses. Future FL applications should prioritize clinical motivations and develop designs and methodologies to support clinical practice and research effectively.
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
(2023)
Article
Cardiac & Cardiovascular Systems
Alanna M. Chamberlain, Veronique L. Roger, Peter A. Noseworthy, Lin Y. Chen, Susan A. Weston, Ruoxiang Jiang, Alvaro Alonso
Summary: This study developed simple computable phenotypes for atrial fibrillation using electronic medical record data. However, using diagnostic codes to identify incident atrial fibrillation is prone to some misclassification. Further research is needed to determine if more complex phenotypes, including unstructured data sources or machine learning techniques, can improve the accuracy of identifying incident atrial fibrillation.
JOURNAL OF THE AMERICAN HEART ASSOCIATION
(2022)
Article
Computer Science, Interdisciplinary Applications
Xiao Luo, Priyanka Gandhi, Zuoyi Zhang, Wei Shao, Zhi Han, Vasu Chandrasekaran, Vladimir Turzhitsky, Vishal Bali, Anna R. Roberts, Megan Metzger, Jarod Baker, Carmen La Rosa, Jessica Weaver, Paul Dexter, Kun Huang
Summary: The study effectively predicted chronic cough patients using deep learning algorithms with structured and unstructured EHR data, achieving high sensitivity and specificity in patient identification.
COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE
(2021)
Review
Computer Science, Information Systems
Lucia A. Carrasco-Ribelles, Jose Llanes-Jurado, Carlos Gallego-Moll, Margarita Cabrera-Bean, Monica Monteagudo-Zaragoza, Concepcion Violan, Edurne Zabaleta-del-Olmo
Summary: The objective of this study is to describe and evaluate the use of artificial intelligence techniques in handling longitudinal data from electronic health records to predict health-related outcomes. The review included 81 studies and found heterogeneity in reporting methodology and results, as well as a lack of public EHR datasets and code sharing, making replication of the research complex.
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
(2023)
Article
Computer Science, Interdisciplinary Applications
Ethan Steinberg, Ken Jung, Jason A. Fries, Conor K. Corbin, Stephen R. Pfohl, Nigam H. Shah
Summary: Patient representation schemes improve the accuracy of clinical prediction models by transferring information learned from the entire patient population to the task of training specific models, particularly when only a small number of patient records are available.
JOURNAL OF BIOMEDICAL INFORMATICS
(2021)
Article
Computer Science, Information Systems
Yiyang Liu, Khairul A. Siddiqi, Robert L. Cook, Jiang Bian, Patrick J. Squires, Elizabeth A. Shenkman, Mattia Prosperi, Dushyantha T. Jayaweera
Summary: The computable phenotype algorithm for HIV patients achieved high sensitivity and comparable specificity in identifying patients from diverse backgrounds. The sample consisted of HIV patients with a mean age of 42.7 years, predominantly male and half Black or African American, with an average follow-up period of 4.6 years and a high number of encounters.
METHODS OF INFORMATION IN MEDICINE
(2021)
Article
Multidisciplinary Sciences
Chengxi Zang, Yongkang Zhang, Jie Xu, Jiang Bian, Dmitry Morozyuk, Edward J. Schenck, Dhruv Khullar, Anna S. Nordvig, Elizabeth A. Shenkman, Russell L. Rothman, Jason P. Block, Kristin Lyman, Mark G. Weiner, Thomas W. Carton, Fei Wang, Rainu Kaushal
Summary: In this study, the authors used electronic health records to characterize post-acute sequelae of SARS-CoV-2 infection and found possible heterogeneity between populations. They identified a broad range of PASC-related conditions and replicated some of them across two cohorts.
NATURE COMMUNICATIONS
(2023)
Article
Genetics & Heredity
Sarah D. Huang, Vaneeta Bamba, Samantha Bothwell, Patricia Y. Fechner, Anna Furniss, Chijioke Ikomi, Leena Nahata, Natalie J. Nokoff, Laura Pyle, Helina Seyoum, Shanlee M. Davis
Summary: Turner syndrome is a genetic condition characterized by the absence of the second sex chromosome. This study developed a computable phenotype to accurately identify patients with Turner syndrome using electronic health record data. The algorithm showed high sensitivity and specificity, making it a powerful tool for studying rare pediatric conditions like Turner syndrome.
AMERICAN JOURNAL OF MEDICAL GENETICS PART A
(2023)
Article
Computer Science, Information Systems
Pascal S. Brandt, Abel Kho, Yuan Luo, Jennifer A. Pacheco, Theresa L. Walunas, Hakon Hakonarson, George Hripcsak, Cong Liu, Ning Shang, Chunhua Weng, Nephi Walton, David S. Carrell, Paul K. Crane, Eric B. Larson, Christopher G. Chute, Iftikhar J. Kullo, Robert Carroll, Josh Denny, Andrea Ramirez, Wei-Qi Wei, Jyoti Pathak, Laura K. Wiley, Rachel Richesson, Justin B. Starren, Luke Rasmussen
Summary: This study analyzed a publicly available sample of rule-based phenotype definitions and found significant variability in the logical constructs and terminologies used. Despite the range of conditions, all phenotype definitions consisted of logical criteria and tabular data. This study highlights the importance of standardizing the representation of phenotype definitions.
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
(2023)
Article
Biochemistry & Molecular Biology
Munyaradzi Musvosvi, Huang Huang, Chunlin Wang, Qiong Xia, Virginie Rozot, Akshaya Krishnan, Peter Acs, Abhilasha Cheruku, Gerlinde Obermoser, Alasdair Leslie, Samuel M. Behar, Willem A. Hanekom, Nicole Bilek, Michelle Fisher, Stefan H. E. Kaufmann, Gerhard Walzl, Mark Hatherill, Mark M. Davis, Thomas J. Scriba
Summary: In this study, single-cell and bulk T-cell receptor (TCR) sequencing and the GLIPH2 algorithm were used to analyze M. tuberculosis-specific sequences in two longitudinal cohorts. The findings identified T-cell similarity groups associated with control of infection or progression to disease, and proposed antigens recognized by T-cell similarity groups associated with infection control as high-priority targets for future vaccine development.
Correction
Multidisciplinary Sciences
Lili Liu, Atlas Khan, Elena Sanchez-Rodriguez, Francesca Zanoni, Yifu Li, Nicholas Steers, Olivia Balderes, Junying Zhang, Priya Krithivasan, Robert A. LeDesma, Clara Fischman, Scott J. Hebbring, John B. Harley, Halima Moncrieffe, Leah C. Kottyan, Bahram Namjou-Khales, Theresa L. Walunas, Rachel Knevel, Soumya Raychaudhuri, Elizabeth W. Karlson, Joshua C. Denny, Ian B. Stanaway, David Crosslin, Thomas Rauen, Juergen Floege, Frank Eitner, Zina Moldoveanu, Colin Reily, Barbora Knoppova, Stacy Hall, Justin T. Sheff, Bruce A. Julian, Robert J. Wyatt, Hitoshi Suzuki, Jingyuan Xie, Nan Chen, Xujie Zhou, Hong Zhang, Lennart Hammarstroem, Alexander Viktorin, Patrik K. E. Magnusson, Ning Shang, George Hripcsak, Chunhua Weng, Tatjana Rundek, Mitchell S. V. Elkind, Elizabeth C. Oelsner, R. Graham Barr, Iuliana Ionita-Laza, Jan Novak, Ali G. Gharavi, Krzysztof Kiryluk
NATURE COMMUNICATIONS
(2023)
Article
Pathology
Mahsa Khanlari, Huan Mo, Do Hwan Kim, Ali Sakhdari, Ken H. Young, Preetesh Jain, Michael Wang, Shaoying Li, Rashmi Kanagal-Shamanna, Roberto N. Miranda, Francisco Vega, L. Jeffrey Medeiros, Chi Young Ok
Summary: This study collected 102 cases of untreated B-MCL and P-MCL and found significant biological differences between the two, including chromatin pattern, cell size and shape variation, cell proliferation rate, and overall survival.
AMERICAN JOURNAL OF SURGICAL PATHOLOGY
(2023)
Article
Pharmacology & Pharmacy
Shaopeng Gu, Govarthanan Rajendiran, Kennedy Forest, Tam C. Tran, Joshua C. Denny, Eric A. Larson, Russell A. Wilke
Summary: This study used retrospective analysis of clinical data from the All of Us database to identify cases of drug-induced liver injury (DILI) related to the use of common antibiotics. The results showed that amoxicillin-clavulanate was the most common cause of DILI among the study participants. The findings highlight the efficiency of mining data from electronic health record-linked research cohorts to identify DILI cases associated with the use of common antibiotics.
CLINICAL PHARMACOLOGY & THERAPEUTICS
(2023)
Article
Medical Laboratory Technology
Alexandre Bazinet, Alan Wang, Xinmei Li, Fuli Jia, Huan Mo, Wei Wang, Sa A. Wang
Summary: Detection of measurable residual disease (MRD) is crucial in chronic lymphocytic leukemia (CLL). An artificial intelligence (AI)-assisted multiparameter flow cytometry (MFC) workflow was evaluated for CLL MRD detection. The AI-assisted analysis performed well in classifying MRD-positive and MRD-negative cases. However, it showed sub-optimal performance in atypical immunophenotype CLL and cases lacking residual normal B cells. Further improvement is needed for these cases.
CYTOMETRY PART B-CLINICAL CYTOMETRY
(2023)
Article
Medicine, Research & Experimental
William Z. Kariampuzha, Gioconda Alyea, Sue Qu, Jaleal Sanjak, Ewy Mathe, Eric Sid, Haley Chatelaine, Arjun Yadaw, Yanji Xu, Qian Zhu
Summary: A new system for extracting epidemiological information from rare disease literature has been developed, and its efficiency and accuracy have been demonstrated through three case studies.
JOURNAL OF TRANSLATIONAL MEDICINE
(2023)
Correction
Medicine, Research & Experimental
William Z. Kariampuzha, Gioconda Alyea, Sue Qu, Jaleal Sanjak, Ewy Mathe, Eric Sid, Haley Chatelaine, Arjun Yadaw, Yanji Xu, Qian Zhu
JOURNAL OF TRANSLATIONAL MEDICINE
(2023)
Article
Multidisciplinary Sciences
Yoonjung Yoonie Joo, Jennifer A. Pacheco, William K. Thompson, Laura J. Rasmussen-Torvik, Luke V. Rasmussen, Frederick T. J. Lin, Mariza de Andrade, Kenneth M. Borthwick, Erwin Bottinger, Andrew Cagan, David S. Carrell, Joshua C. Denny, Stephen B. Ellis, Omri Gottesman, James G. Linneman, Jyotishman Pathak, Peggy L. Peissig, Ning Shang, Gerard Tromp, Annapoorani Veerappan, Maureen E. Smith, Rex L. Chisholm, Andrew J. Gawron, M. Geoffrey Hayes, Abel N. Kho
Summary: We identified genetic risk variants and clinical phenotypes associated with diverticular disease (DD) using NLP and multiple EHR data sources. Our algorithm improved patient classification for DD analysis and replicated known associations between ARHGAP15 loci and DD. Additionally, we found significant associations between DD GWAS variants and circulatory system, genitourinary, and neoplastic EHR phenotypes.
Article
Multidisciplinary Sciences
Jane Alexandra Shaw, Maynard Meiring, Devon Allies, Lauren Cruywagen, Tarryn-Lee Fisher, Kesheera Kasavan, Kelly Roos, Stefan Marc Botha, Candice MacDonald, Andritte M. Hiemstra, Donald Simon, Ilana M. van Rensburg, Marika A. Flinn, Ayanda B. Shabangu, Helena Kuivaniemi, Gerard Tromp, Stephanus Malherbe, Gerhard Walzl, Nelita du Plessis
Summary: Bronchoalveolar lavage (BAL) is a common procedure for studying infectious disease immunology. This study analyzed the factors that influence the outcomes of BAL and identified associations with participant characteristics such as active tuberculosis (TB) disease, HIV infection, and recent SARS-CoV-2 infection. The results showed correlations between BAL volume and cell count in participants with active TB disease and current smokers. Older participants had lower BAL cell and volume yields, and higher neutrophils. Current smokers had lower volumes, higher cell counts, and black pellets. The findings provide insights for researchers to optimize participant selection and assay for projects involving lung immune cells.
SCIENTIFIC REPORTS
(2023)
Article
Multidisciplinary Sciences
Laura D. Hughes, Ginger Tsueng, Jack DiGiovanna, Thomas D. Horvath, Luke V. Rasmussen, Tor C. Savidge, Thomas Stoeger, Serdar Turkarslan, Qinglong Wu, Chunlei Wu, Andrew I. Su, Lars Pache
Article
Infectious Diseases
Tracy R. Richardson, Bronwyn Smith, Stephanus T. N. Malherbe, Jane Alexandra Shaw, Firdows Noor, Candice MacDonald, Gian D. van der Spuy, Kim Stanley, Alida Carstens, Tarryn-Lee Fisher, Ilana van Rensburg, Marika Flinn, Candice Snyders, Isaac Johnson, Bernadine Fransman, Hazel Dockrell, Guy Thwaites, Nguyen Thuy Thuong Thuong, Claudia Schacht, Harriet Mayanja-Kizza, Mary Nsereko, Elisa M. Tjon Kon Fat, Paul L. A. M. Corstjens, Annemieke Geluk, Morton Ruhwald, Adam Penn-Nicholson, Novel Chegou, Jayne Sutherland, Gerhard Walzl
Summary: To improve TB diagnosis, WHO is calling for a non-sputum based triage test to focus testing on high-risk individuals. The TriageTB study aims to assess the accuracy of diagnostic test candidates and validate a multi-biomarker point-of-care test. By targeting confirmatory testing to those with a positive triage test, diagnostic costs can be reduced and TB care improved.
BMC INFECTIOUS DISEASES
(2023)
Article
Computer Science, Information Systems
David J. Schlueter, Lina Sulieman, Huan Mo, Jacob M. Keaton, Tracey M. Ferrara, Ariel Williams, Jun Qian, Onajia Stubblefield, Chenjie Zeng, Tam C. Tran, Lisa Bastarache, Jian Dai, Anav Babbar, Andrea Ramirez, Slavina B. Goleva, Joshua C. Denny
Summary: This study evaluated the replication of known cigarette smoking associations using All of Us data. The results showed that most phenotypes found in published meta-analyses associated with smoking were nominally or fully replicated in All of Us, demonstrating the feasibility of studying common exposures using this dataset.
JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION
(2023)
Article
Genetics & Heredity
Erin Mcgowan, Jaleal Sanjak, Ewy A. Mathe, Qian Zhu
Summary: By developing an integrative rare disease profile network, potential candidates for drug repurposing or repositioning for glioblastoma were effectively identified.
ORPHANET JOURNAL OF RARE DISEASES
(2023)
Article
Immunology
Jane Alexandra Shaw, Maynard Meiring, Candice Snyders, Frans Everson, Lovemore Nyasha Sigwadhi, Veranyay Ngah, Gerard Tromp, Brian Allwood, Coenraad F. N. Koegelenberg, Elvis M. Irusen, Usha Lalla, Nicola Baines, Annalise E. Zemlin, Rajiv T. Erasmus, Zivanai C. Chapanduka, Tandi E. Matsha, Gerhard Walzl, Hans Strijdom, Nelita du Plessis, Alimuddin Zumla, Novel Chegou, Stephanus T. Malherbe, Peter S. Nyasulu
Summary: This study collected samples and clinical data from COVID-19 patients in Sub-Saharan African populations and found dysregulation in biomarkers among critical patients. These dysregulations were associated with abnormal cytokine responses, bacterial infections, and endothelial dysfunction, which may contribute to mortality.
FRONTIERS IN IMMUNOLOGY
(2023)