Article

A new approach for interpreting Random Forest models and its application to the biology of ageing

Journal

BIOINFORMATICS
Volume 34, Issue 14, Pages 2449-2456

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/bty087

Funding

  1. Leverhulme Trust [RPG-2016-015]

Motivation: This work uses the Random Forest (RF) classification algorithm to predict if a gene is over-expressed, under-expressed or has no change in expression with age in the brain. RFs have high predictive power, and RF models can be interpreted using a feature (variable) importance measure. However, current feature importance measures evaluate a feature as a whole (all feature values). We show that, for a popular type of biological data (Gene Ontology-based), usually only one value of a feature is particularly important for classification and the interpretation of the RF model. Hence, we propose a new algorithm for identifying the most important and most informative feature values in an RF model.

Results: The new feature importance measure identified highly relevant Gene Ontology terms for the aforementioned gene classification task, producing a feature ranking that is much more informative to biologists than an alternative, state-of-the-art feature importance measure.
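
The contrast drawn in the abstract, between scoring a feature as a whole and scoring its individual values, can be illustrated with a small sketch. This is not the paper's algorithm, which this page does not describe, and it does not build a Random Forest. It only compares a whole-feature score (information gain) with a per-value score (class purity) on hypothetical toy data. The function names, the data and the choice of scores are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Whole-feature score: entropy reduction from splitting on the feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [y for x, y in zip(values, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


def per_value_purity(values, labels):
    """Per-value score: for each feature value, return a tuple of
    (majority class, share of that class, number of instances)."""
    result = {}
    for v in sorted(set(values)):
        subset = [y for x, y in zip(values, labels) if x == v]
        majority, count = Counter(subset).most_common(1)[0]
        result[v] = (majority, count / len(subset), len(subset))
    return result


# Hypothetical toy data (an assumption, not from the paper):
# 1 = gene is annotated with some GO term, 0 = it is not.
go_term = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
label = ["over", "over", "over", "over",
         "under", "none", "under", "none",
         "over", "under", "none", "under"]

gain = information_gain(go_term, label)
purity = per_value_purity(go_term, label)

print(f"information gain of the whole feature: {gain:.3f}")
for value, (cls, share, n) in purity.items():
    print(f"value {value}: majority class {cls!r}, share {share:.2f}, n={n}")
```

In this toy data the single information-gain number hides an asymmetry: every gene with value 1 is over-expressed, while genes with value 0 are spread across all three classes. A per-value view makes it visible that only the "annotated" value is informative.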

Recommended

Article Management

Stochastic local search and parameters recommendation: a case study on flowshop problems

Lucas M. Pavelski, Myriam Delgado, Marie-Eleonore Kessaci, Alex A. Freitas

Summary: This paper investigates the use of meta-learning to recommend different stochastic local searches and their parameters to solve permutation flowshop problems. The proposed approach builds a performance database and trains multiple recommendation models to achieve good performance and quality of recommendations. Experiments show that this method outperforms the state-of-the-art algorithm with tuned configuration.

INTERNATIONAL TRANSACTIONS IN OPERATIONAL RESEARCH (2023)

Review Cell Biology

Sex-specific aging in animals: Perspective and future directions

Anne M. Bronikowski, Richard P. Meisel, Peggy R. Biga, James R. Walters, Judith E. Mank, Erica Larschan, Gerald S. Wilkinson, Nicole Valenzuela, Ashley Mae Conard, Joao Pedro de Magalhaes, Jingyue (Ellie) Duan, Amy E. Elias, Tony Gamble, Rita M. Graze, Kristin E. Gribble, Jill A. Kreiling, Nicole C. Riddle

Summary: Sex differences in aging, including lifespan, age-associated decline, and physiological markers, vary greatly across animal species. The underlying causes of these differences remain mostly unknown, highlighting the need for further research on the role of sex-determination mechanisms and their impact on aging.

AGING CELL (2022)

Review Food Science & Technology

Biomarkers of geroprotection and cardiovascular health: An overview of omics studies and established clinical biomarkers in the context of diet

Riccardo Secci, Alexander Hartmann, Michael Walter, Hans Joergen Grabe, Sandra van der Auwera-Palitschka, Axel Kowald, Daniel Palmer, Gerald Rimbach, Georg Fuellen, Israel Barrantes

Summary: This review focuses on the role of diet in protecting against aging, discussing biomarkers based on omics and clinical data, as well as the effects of berry-based interventions. It also emphasizes the importance of individuals' dietary history in geroprotection.

CRITICAL REVIEWS IN FOOD SCIENCE AND NUTRITION (2023)

Editorial Material Biotechnology & Applied Microbiology

Cellular reprogramming and the rise of rejuvenation biotech

Joao Pedro de Magalhaes, Alejandro Ocampo

Summary: Cells can be rejuvenated and biological clocks reset through cellular reprogramming, and many companies are now developing therapies using this technique to rejuvenate human beings. However, the current research in rejuvenation is mostly based on in vitro studies, and it remains to be seen if it can be translated into clinical applications.

TRENDS IN BIOTECHNOLOGY (2022)

Review Cell Biology

Protein Biomarkers in Blood Reflect the Interrelationships Between Stroke Outcome, Inflammation, Coagulation, Adhesion, Senescence and Cancer

Georg Fuellen, Uwe Walter, Larissa Henze, Jan Bohmert, Daniel Palmer, Soyoung Lee, Clemens A. Schmitt, Henrik Rudolf, Axel Kowald

Summary: The most important predictors for outcomes after ischemic stroke are chronological age and stroke severity, with gender, genetics, and lifestyle/environmental factors also playing a role. Recurrent stroke can be prevented through various therapies and treatment of risk factors. Protein biomarkers, particularly those related to immune-inflammatory, coagulation, and adhesion processes, can provide insight into predicting health deterioration following stroke. These processes also connect stroke to cancer and other conditions, indicating potential overlap in biomarkers.

CELLULAR AND MOLECULAR NEUROBIOLOGY (2023)

Article Biochemistry & Molecular Biology

RMDisease V2.0: an updated database of genetic variants that affect RNA modifications with disease and trait implication

Bowen Song, Xuan Wang, Zhanmin Liang, Jiongming Ma, Daiyun Huang, Yue Wang, Joao Pedro de Magalhaes, Daniel J. Rigden, Jia Meng, Gang Liu, Kunqi Chen, Zhen Wei

Summary: Recent advances in epitranscriptomics have revealed the functional associations between RNA modifications (RMs) and multiple human diseases. This study presents an updated database, RMDisease v2.0, which identifies RM-associated genetic variants that may affect different types of RNA modifications in various organisms. These variants, including disease- and trait-associated genetic variants, may function through perturbations of epitranscriptomic markers. RMDisease v2.0 serves as a valuable resource for studying the genetic drivers of phenotypes in the epitranscriptome layer circuitry.

NUCLEIC ACIDS RESEARCH (2023)

Article Oncology

Evaluation of the Synergistic Potential of Simultaneous Pan- or Isoform-Specific BET and SYK Inhibition in B-Cell Lymphoma: An In Vitro Approach

Sina Sender, Ahmad Wael Sultan, Daniel Palmer, Dirk Koczan, Anett Sekora, Julia Beck, Ekkehard Schuetz, Leila Taher, Bertram Brenig, Georg Fuellen, Christian Junghanss, Hugo Murua Escobar

Summary: This study evaluated the impact of simultaneous inhibition of SYK and BET, two key regulators involved in B-cell lymphoma progression. The results showed enhanced anti-proliferative effects and a distinct gene expression profile when SYK and BET were inhibited simultaneously. This suggests that simultaneous inhibition of SYK and BET could be a promising combination therapy for B-cell lymphoma.

CANCERS (2022)

Article Cell Biology

Rilmenidine extends lifespan and healthspan in Caenorhabditis elegans via a nischarin I1-imidazoline receptor

Dominic F. Bennett, Anita Goyala, Cyril Statzer, Charles W. Beckett, Alexander Tyshkovskiy, Vadim N. Gladyshev, Collin Y. Ewald, Joao Pedro de Magalhaes

Summary: In a search for drugs with similar gene expression patterns, rilmenidine was found to extend the lifespan of nematodes and rats, an effect mediated by the I1-imidazoline receptor. This study suggests the potential of rilmenidine as a longevity-promoting drug.

AGING CELL (2023)

Article Biochemistry & Molecular Biology

GeneFriends: gene co-expression databases and tools for humans and model organisms

Priyanka Raina, Rodrigo Guinea, Kasit Chatsirisupachai, Ines Lopes, Zoya Farooq, Cristina Guinea, Csaba-Attila Solyom, Joao Pedro de Magalhaes

Summary: Gene co-expression analysis is a powerful method for understanding gene function and regulation. The GeneFriends database provides researchers with updated gene co-expression networks for humans, mice, and other model organisms, based on RNA-seq data. This valuable tool can help researchers decipher the complexity of genomes and assign functions to poorly annotated genes.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemistry & Molecular Biology

DirectRMDB: a database of post-transcriptional RNA modifications unveiled from direct RNA sequencing technology

Yuxin Zhang, Jie Jiang, Jiongming Ma, Zhen Wei, Yue Wang, Bowen Song, Jia Meng, Guifang Jia, Joao Pedro de Magalhaes, Daniel J. Rigden, Daiyun Huang, Kunqi Chen

Summary: Mapping RNA modifications with advanced technologies has revolutionized our understanding of them. Current modification profiling methods are limited and require selective treatments. Direct RNA sequencing enables the direct study of modifications and has the potential to overcome the limitations of previous methods. The DirectRMDB database provides a fresh perspective on RNA modifications and allows exploration of the epitranscriptome in an isoform-specific manner.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemical Research Methods

Interpretable Ensembles of Classifiers for Uncertain Data With Bioinformatics Applications

Marcelo Rodrigues de Holanda Maia, Alexandre Plastino, Alex Freitas, Joao Pedro de Magalhaes

Summary: This paper proposes two new approaches for dealing with data uncertainty. One approach is to select training instances for each model in an ensemble, and the other is to sample features when splitting a node in a Random Forest training. These approaches are applied to classify ageing-related genes and predict drugs' side effects, and the results show that ensembles based on these approaches achieve better predictive performance.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Article Biochemistry & Molecular Biology

m7GHub V2.0: an updated database for decoding the N7-methylguanosine (m7G) epitranscriptome

Xuan Wang, Yuxin Zhang, Kunqi Chen, Zhanmin Liang, Jiongming Ma, Rong Xia, Joao Pedro de Magalhaes, Daniel J. Rigden, Jia Meng, Bowen Song

Summary: With the advancement in mapping N7-methylguanosine (m7G) RNA methylation sites, a comprehensive resource (m7GHub V2.0) has been developed to study m7G modification under various physiological contexts. The resource includes the m7GDB database collecting hundreds of thousands of putative m7G sites identified in 23 species, the m7GDiseaseDB hosting m7G-associated variants including disease-relevant m7G-SNPs, and two enhanced analysis modules for interactive analyses.

NUCLEIC ACIDS RESEARCH (2023)

Article Biochemistry & Molecular Biology

Human Ageing Genomic Resources: updates on key databases in ageing research

Joao Pedro de Magalhaes, Zoya Abidi, Gabriel Arantes dos Santos, Roberto A. Avelar, Diogo Barardo, Kasit Chatsirisupachai, Peter Clark, Evandro A. De-Souza, Emily J. Johnson, Ines Lopes, Guy Novoa, Ludovic Senez, Angelo Talay, Daniel Thornton, Paul Ka Po To

Summary: This article introduces the key features and recent enhancements of the Human Ageing Genomic Resources (HAGR), focusing on its six main databases. These databases cover information related to genes and ageing, longevity, life-history, cellular senescence, and genetic variants associated with human longevity. HAGR also provides various tools and gene expression signatures.

NUCLEIC ACIDS RESEARCH (2023)

Article Cell Biology

scDiffCom: a tool for differential analysis of cell-cell interactions provides a mouse atlas of aging changes in intercellular communication

Cyril Lagger, Eugen Ursu, Anais Equey, Roberto A. Avelar, Angela Oliveira Pisco, Robi Tacutu, Joao Pedro de Magalhaes

Summary: Dysregulation of intercellular communication is a hallmark of aging. Here the authors provide a bioinformatics tool to infer changes in cell-cell signaling and an atlas of age-related communication changes in 23 mouse tissues.

NATURE AGING (2023)
