Article

Fast and scalable search of whole-slide images via self-supervised deep learning

Journal

NATURE BIOMEDICAL ENGINEERING
Volume 6, Issue 12, Pages 1420-+

Publisher

NATURE PORTFOLIO
DOI: 10.1038/s41551-022-00929-8

Funding

  1. National Institute of General Medical Sciences (NIGMS) [R35GM138216]
  2. Brigham President's Fund
  3. BWH and MGH Pathology
  4. Google Cloud Research Grant
  5. Nvidia GPU Grant Program
  6. Tau Beta Pi Fellowship
  7. Siebel Foundation
  8. NIH National Cancer Institute (NCI) Ruth L. Kirschstein National Service Award [T32CA251062]
  9. BWH Precision Medicine Program

A self-supervised deep-learning algorithm enables fast searching and retrieval of gigapixel whole-slide images at speeds that do not depend on the size of the image repository.
A self-supervised deep-learning algorithm searches for and retrieves gigapixel whole-slide images at speeds that are independent of the size of the image repository.

The adoption of digital pathology has enabled the curation of large repositories of gigapixel whole-slide images (WSIs). Computationally identifying WSIs with similar morphologic features within large repositories without requiring supervised training can have significant applications. However, the retrieval speeds of algorithms for searching similar WSIs often scale with the repository size, which limits their clinical and research potential. Here we show that self-supervised deep learning can be leveraged to search for and retrieve WSIs at speeds that are independent of repository size. The algorithm, which we named SISH (for self-supervised image search for histology) and provide as an open-source package, requires only slide-level annotations for training, encodes WSIs into meaningful discrete latent representations and leverages a tree data structure for fast searching followed by an uncertainty-based ranking algorithm for WSI retrieval. We evaluated SISH on multiple tasks (including retrieval tasks based on tissue-patch queries) and on datasets spanning over 22,000 patient cases and 56 disease subtypes. SISH can also be used to aid the diagnosis of rare cancer types for which the number of available WSIs is often insufficient to train supervised deep-learning models.
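The abstract's pipeline (discrete codes, an ordered index, then ranking) can be illustrated with a toy sketch. This is not the SISH implementation: the class name CodeIndex, the helper rank_labels, and the integer codes are all hypothetical, a sorted list with binary search stands in for the paper's tree data structure, and a plain majority vote stands in for its uncertainty-based ranking. Lookup cost in this sketch grows logarithmically with the number of stored codes, so it does not reproduce the size-independent speed the authors report.

```python
import bisect
from collections import Counter


class CodeIndex:
    """Toy index (hypothetical, not SISH): integer codes -> slide entries."""

    def __init__(self):
        self._keys = []    # sorted, unique integer codes
        self._slides = {}  # code -> list of (slide_id, label)

    def add(self, code, slide_id, label):
        """Store one slide under its discrete integer code."""
        if code not in self._slides:
            bisect.insort(self._keys, code)
            self._slides[code] = []
        self._slides[code].append((slide_id, label))

    def nearest(self, code, k=3):
        """Return the entries stored under the k keys closest to `code`."""
        pos = bisect.bisect_left(self._keys, code)
        lo, hi = pos - 1, pos
        picked = []
        # Walk outwards from the insertion point, taking the closer side first.
        while len(picked) < k and (lo >= 0 or hi < len(self._keys)):
            left_gap = code - self._keys[lo] if lo >= 0 else float("inf")
            right_gap = (
                self._keys[hi] - code if hi < len(self._keys) else float("inf")
            )
            if left_gap <= right_gap:
                picked.append(self._keys[lo])
                lo -= 1
            else:
                picked.append(self._keys[hi])
                hi += 1
        return [entry for key in picked for entry in self._slides[key]]


def rank_labels(hits):
    """Majority vote over retrieved labels (a stand-in for the paper's ranking)."""
    return Counter(label for _, label in hits).most_common()


if __name__ == "__main__":
    index = CodeIndex()
    index.add(10, "slide-1", "A")
    index.add(12, "slide-2", "A")
    index.add(50, "slide-3", "B")
    index.add(100, "slide-4", "B")
    hits = index.nearest(11, k=2)
    print(hits)
    print(rank_labels(hits))
```

The point of the sketch is only that, once slides are reduced to discrete integer codes, a query touches a handful of neighbouring keys instead of comparing against every slide in the repository.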

Recommended

Article Computer Science, Interdisciplinary Applications

Pathomic Fusion: An Integrated Framework for Fusing Histopathology and Genomic Features for Cancer Diagnosis and Prognosis

Richard J. Chen, Ming Y. Lu, Jingwen Wang, Drew F. K. Williamson, Scott J. Rodig, Neal Lindeman, Faisal Mahmood

Summary: This study proposes an interpretable strategy for multimodal fusion of histology image and genomic features for survival outcome prediction. The results on glioma and clear cell renal cell carcinoma datasets demonstrate that this approach improves the prognostic determinations.

IEEE TRANSACTIONS ON MEDICAL IMAGING (2022)

Article Biochemistry & Molecular Biology

Deep learning-enabled assessment of cardiac allograft rejection from endomyocardial biopsies

Jana Lipkova, Tiffany Y. Chen, Ming Y. Lu, Richard J. Chen, Maha Shady, Mane Williams, Jingwen Wang, Zahra Noor, Richard N. Mitchell, Mehmet Turan, Gulfize Coskun, Funda Yilmaz, Derya Demir, Deniz Nart, Kayhan Basak, Nesrin Turhan, Selvinaz Ozkara, Yara Banz, Katja E. Odening, Faisal Mahmood

Summary: A deep learning-based AI system has been developed for automated assessment of gigapixel whole-slide images obtained from endomyocardial biopsy, addressing the detection, subtyping and grading of allograft rejection. The system showed non-inferior performance to conventional assessment, reducing interobserver variability and assessment time.

NATURE MEDICINE (2022)

Correction Computer Science, Artificial Intelligence

'Benchmarking weakly-supervised deep learning pipelines for whole slide classification in computational pathology' Medical Image Analysis (vol 79, 102474, 2022)

Narmin Ghaffari Laleh, Hannah Sophie Muti, Chiara Maria Lavinia Loeffler, Amelie Echle, Oliver Lester Saldanha, Faisal Mahmood, Ming Y. Lu, Christian Trautwein, Rupert Langer, Bastian Dislich, Roman D. Buelow, Heike Irmgard Grabsch, Hermann Brenner, Jenny Chang-Claude, Elizabeth Alwers, Titus J. Brinker, Firas Khader, Daniel Truhn, Nadine T. Gaisa, Peter Boor, Michael Hoffmeister, Volkmar Schulz, Jakob Nikolas Kather

MEDICAL IMAGE ANALYSIS (2022)

Article Biology

A low-cost, open-source evolutionary bioreactor and its educational use

Vishhvaan Gopalakrishnan, Dena Crozier, Kyle J. Card, Lacy D. Chick, Nikhil P. Krishnan, Erin McClure, Julia Pelesko, Drew F. K. Williamson, Daniel Nichol, Soumyajit Mandal, Robert A. Bonomo, Jacob G. Scott

Summary: A morbidostat is a bioreactor that uses antibiotics to control bacterial growth, making it suitable for studying antibiotic resistance evolution. We present a low-cost morbidostat called the EVolutionary biorEactor (EVE) that can be constructed by students with minimal engineering and programming experience. We validate EVE in a real classroom setting by evolving replicate Escherichia coli populations under chloramphenicol challenge, providing students the opportunity to learn about bacterial growth and antibiotic resistance.

ELIFE (2022)

Editorial Material Biochemistry & Molecular Biology

Harnessing medical twitter data for pathology AI

Ming Y. Lu, Bowen Chen, Faisal Mahmood

Summary: Researchers have utilized pathology data from Twitter to develop a visual-language model that can classify and retrieve histopathology images. This achievement represents a significant milestone in the advancement of multifunctional foundational artificial intelligence models in computational pathology.

NATURE MEDICINE (2023)

Article Engineering, Biomedical

Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status

Jordan Anaya, John-William Sidhom, Faisal Mahmood, Alexander S. Baras

Summary: This study demonstrates that a weakly supervised multiple-instance learning model can encode and aggregate the local sequence context or genomic position of somatic mutations, providing enhanced explainability for sample-level classification. The model achieves best-in-class performance in tumor type classification and microsatellite status prediction, potentially generating biological insight from genomic datasets.

NATURE BIOMEDICAL ENGINEERING (2023)

Article Engineering, Biomedical

Algorithmic fairness in artificial intelligence for medicine and healthcare

Richard J. Chen, Judy J. Wang, Drew F. K. Williamson, Tiffany Y. Chen, Jana Lipkova, Ming Y. Lu, Sharifa Sahai, Faisal Mahmood

Summary: This Perspective discusses the impact of algorithmic biases on healthcare disparities in machine learning. Insufficiently fair AI systems can lead to unequal diagnosis, treatment, and billing of patients. The article explores how biases arise in clinical workflows and outlines emerging technologies such as disentanglement, federated learning, and model explainability to mitigate biases in AI-based medical software development.

NATURE BIOMEDICAL ENGINEERING (2023)

Meeting Abstract Oncology

Deep learning-based multimodal integration of histology and genomics improves cancer origin prediction

Muhammad Shaban, Ming Y. Lu, Drew F. K. Williamson, Richard J. Chen, Jana Lipkova, Tiffany Y. Chen, Faisal Mahmood

CANCER RESEARCH (2023)

Meeting Abstract Oncology

Racial disparities bias oncology AI models

Richard J. Chen, Drew F. K. Williamson, Ming Y. Lu, Tiffany Y. Chen, Jana Lipkova, Muhammad Shaban, Faisal Mahmood

CANCER EPIDEMIOLOGY BIOMARKERS & PREVENTION (2023)