Article

Pathway analysis in metabolomics: Recommendations for the use of over-representation analysis

Journal

PLOS COMPUTATIONAL BIOLOGY
Volume 17, Issue 9

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pcbi.1009105

Funding

  1. Wellcome Trust [222837/Z/21/Z]
  2. UK Medical Research Council [MR/R008922/1]
  3. French Ministry of Research [ANR-INBS-0010, ANR-19-CE45-0021, DFG: 431572533]
  4. National Research Agency, French MetaboHUB [ANR-INBS-0010, ANR-19-CE45-0021, DFG: 431572533]
  5. BBSRC [BB/T007974/1]
  6. NIH [R01 HL133932-01]
  7. NIHR Imperial Biomedical Research Centre (BRC)
  8. MESRI (Minister of Higher Education, Research and Innovation)
  9. Agence Nationale de la Recherche (ANR) [ANR-19-CE45-0021]

The study found that changes in different parameters (such as background set, differential metabolite selection methods, and pathway databases) can significantly alter the results of over-representation analysis in metabolomics, which has practical implications for users and researchers. The study offers some recommendations to help ensure the reliability and reproducibility of pathway analysis in metabolomics.
Over-representation analysis (ORA) is one of the commonest pathway analysis approaches used for the functional interpretation of metabolomics datasets. Despite the widespread use of ORA in metabolomics, the community lacks guidelines detailing its best-practice use. Many factors have a pronounced impact on the results, but to date their effects have received little systematic attention. Using five publicly available datasets, we demonstrated that changes in parameters such as the background set, differential metabolite selection methods, and pathway database used can result in profoundly different ORA results. The use of a non-assay-specific background set, for example, resulted in large numbers of false-positive pathways. Pathway database choice, evaluated using three of the most popular metabolic pathway databases (KEGG, Reactome, and BioCyc), led to vastly different results in both the number and function of significantly enriched pathways. Factors that are specific to metabolomics data, such as the reliability of compound identification and the chemical bias of different analytical platforms, also impacted ORA results. Simulated metabolite misidentification rates as low as 4% resulted in both gain of false-positive pathways and loss of truly significant pathways across all datasets. Our results have several practical implications for ORA users, as well as those using alternative pathway analysis methods. We offer a set of recommendations for the use of ORA in metabolomics, alongside a set of minimal reporting guidelines, as a first step towards the standardisation of pathway analysis in metabolomics.

Author summary

Metabolomics is a rapidly growing field of study involving the profiling of small molecules within an organism. It allows researchers to understand the effects of biological status (such as health or disease) on cellular biochemistry, and has wide-ranging applications, from biomarker discovery and personalised medicine in healthcare to crop protection and food security in agriculture. Pathway analysis helps to understand which biological pathways, representing collections of molecules performing a particular function, may be involved in response to a disease phenotype, or drug treatment, for example. Over-representation analysis (ORA) is perhaps the most common pathway analysis method used in the metabolomics community. However, ORA can give drastically different results depending on the input data and parameters used. Here, we have established the effects of these factors on ORA results using computational modifications applied to five real-world datasets. Based on our results, we offer the research community a set of best-practice recommendations applicable not only to ORA but also to other pathway analysis methods to help ensure the reliability and reproducibility of results.
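
To make the background-set effect concrete, the sketch below computes an ORA p-value as a one-sided hypergeometric test, which is how ORA is typically implemented. The text above does not say which test the authors used, so treat this as an illustration rather than their method. The function name `ora_p_value`, the compound identifiers, and all set sizes are hypothetical toy values, not data from the study.

```python
from math import comb


def ora_p_value(pathway, differential, background):
    """One-sided hypergeometric p-value for pathway over-representation.

    Only compounds present in the background set are counted, so the
    choice of background directly changes the result.
    """
    background = set(background)
    pathway = set(pathway) & background
    differential = set(differential) & background

    N = len(background)              # compounds that could have been observed
    K = len(pathway)                 # pathway members within the background
    n = len(differential)            # differential compounds
    k = len(pathway & differential)  # differential compounds in the pathway

    # P(X >= k) for X ~ Hypergeometric(N, K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return tail / comb(N, n)


# Hypothetical toy data: the assay measures 20 compounds, while a generic
# (non-assay-specific) background contains 1000 compounds.
assay_background = {f"C{i}" for i in range(1, 21)}
generic_background = {f"C{i}" for i in range(1, 1001)}

# The pathway has 10 members, only 5 of which the assay can detect.
pathway = {f"C{i}" for i in (1, 2, 3, 4, 5, 21, 22, 23, 24, 25)}
differential = {"C1", "C2", "C3", "C10"}

p_assay = ora_p_value(pathway, differential, assay_background)
p_generic = ora_p_value(pathway, differential, generic_background)

print(f"assay-specific background: p = {p_assay:.4g}")
print(f"generic background:        p = {p_generic:.4g}")
```

In this toy case the generic background yields a far smaller p-value for the same pathway and the same differential compounds, which is the mechanism by which a non-assay-specific background can inflate the number of apparently significant pathways.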

Recommended

Article Chemistry, Analytical

Automated Annotation of Untargeted All-Ion Fragmentation LC-MS Metabolomics Data with MetaboAnnotatoR

Goncalo Graca, Yuheng Cai, Chung-Ho E. Lau, Panagiotis A. Vorkas, Matthew R. Lewis, Elizabeth J. Want, David Herrington, Timothy M. D. Ebbels

Summary: This paper proposes a novel approach for automated annotation of isotopologues, adducts, and in-source fragments from untargeted metabolomics and lipidomics LC-MS datasets. The method combines correlation-based parent-fragment linking with molecular fragment matching. The workflow demonstrates high precision and recall values and outperforms current state-of-the-art software for all-ion fragmentation (AIF) data annotation.

ANALYTICAL CHEMISTRY (2022)

Article Nutrition & Dietetics

Blood pressure interactions with the DASH dietary pattern, sodium, and potassium: The International Study of Macro-/Micronutrients and Blood Pressure (INTERMAP)

Queenie Chan, Gina M. Wren, Chung-Ho E. Lau, Timothy M. D. Ebbels, Rachel Gibson, Ruey Leng Loo, Ghadeer S. Aljuraiban, Joram M. Posma, Alan R. Dyer, Lyn M. Steffen, Beatriz L. Rodriguez, Lawrence J. Appel, Martha L. Daviglus, Paul Elliott, Jeremiah Stamler, Elaine Holmes, Linda Van Horn

Summary: Adherence to the DASH diet is associated with lower blood pressure and increased potassium intake. The DASH diet recommends consuming more fruits, vegetables, and potassium-rich foods to replace sodium-rich processed foods and influence blood pressure through metabolic pathways.

AMERICAN JOURNAL OF CLINICAL NUTRITION (2022)

Article Chemistry, Analytical

Finding Correspondence between Metabolomic Features in Untargeted Liquid Chromatography-Mass Spectrometry Metabolomics Datasets

Rui Climaco Pinto, Ibrahim Karaman, Matthew R. Lewis, Jenny Hallqvist, Manuja Kaluarachchi, Goncalo Graca, Elena Chekmeneva, Brenan Durainayagam, Mohsen Ghanbari, M. Arfan Ikram, Henrik Zetterberg, Julian Griffin, Paul Elliott, Ioanna Tzoulaki, Abbas Dehghan, David Herrington, Timothy Ebbels

Summary: This article introduces a method to find feature correspondence between two similar LC-MS metabolomics experiments or batches using only the features' retention time, mass-to-charge ratio, and feature intensity. The effectiveness of the method is demonstrated through experiments on real and synthetic datasets.

ANALYTICAL CHEMISTRY (2022)

Article Immunology

Host transcriptomic signatures of tuberculosis can predict immune reconstitution inflammatory syndrome in HIV patients

Stanley Kimbung Mbandi, Hannah Painter, Adam Penn-Nicholson, Asma Toefy, Mzwandile Erasmus, Willem A. Hanekom, Thomas J. Scriba, Rachel P. J. Lai, Suzaan Marais, Helen A. Fletcher, Graeme Meintjes, Robert J. Wilkinson, Mark F. Cotton, Savita Pahwa, Mark J. Cameron, Elisa Nemes

Summary: Immune reconstitution inflammatory syndrome (IRIS) is a complication of antiretroviral therapy (ART) in patients with advanced HIV, but its pathogenesis is uncertain. This study found that immune-based blood transcriptomic signatures (RISK6 and Sweeney3) have the potential to predict and diagnose IRIS in HIV+ children and adults.

EUROPEAN JOURNAL OF IMMUNOLOGY (2022)

Review Biochemistry & Molecular Biology

Networks and Graphs Discovery in Metabolomics Data Analysis and Interpretation

Adam Amara, Clement Frainay, Fabien Jourdan, Thomas Naake, Steffen Neumann, Elva Maria Novoa-del-Toro, Reza M. Salek, Liesa Salzer, Sarah Scharfenberg, Michael Witting

Summary: Both targeted and untargeted mass spectrometry-based metabolomics approaches can be used to understand metabolic processes through network analysis. Network-based methods can provide insights into metabolism by connecting metabolites based on various relationships. These methods can address the challenges in metabolite identification and interpretation in untargeted metabolomics data analysis.

FRONTIERS IN MOLECULAR BIOSCIENCES (2022)

Article Oncology

Identification of prediagnostic metabolites associated with prostate cancer risk by untargeted mass spectrometry-based metabolomics: A case-control study nested in the Northern Sweden Health and Disease Study

Johnny R. Ostman, Rui C. Pinto, Timothy M. D. Ebbels, Elin Thysell, Goran Hallmans, Ali A. Moazzami

Summary: Prostate cancer (PCa) is the most common form of cancer in males in many European and American countries, but its etiology is still unclear. This study used untargeted metabolomics to analyze plasma samples from 752 PCa case-control pairs and identified new metabolites associated with PCa risk. The associations between different metabolites and risk varied depending on disease aggressiveness and baseline age.

INTERNATIONAL JOURNAL OF CANCER (2022)

Editorial Material Engineering, Environmental

Avoiding the Misuse of Pathway Analysis Tools in Environmental Metabolomics

Cecilia Wieder, Jacob G. Bundy, Clement Frainay, Nathalie Poupin, Pablo Rodriguez-Mier, Florence Vinson, Juliette Cooke, Rachel P. J. Lai, Fabien Jourdan, Timothy M. D. Ebbels

ENVIRONMENTAL SCIENCE & TECHNOLOGY (2022)

Article Biochemical Research Methods

Single sample pathway analysis in metabolomics: performance evaluation and application

Cecilia Wieder, Rachel P. J. Lai, Timothy M. D. Ebbels

Summary: This study evaluates the applicability of single sample pathway analysis (ssPA) methods in metabolomics, demonstrating their potential through benchmarking with semi-synthetic metabolomics data and a case study on inflammatory bowel disease. Clustering/dimensionality reduction-based methods provide higher precision at moderate-to-high effect sizes, offering a deeper level of interpretation that conventional methods cannot provide.

BMC BIOINFORMATICS (2022)

Review Biochemistry & Molecular Biology

Recent advances in mass spectrometry-based computational metabolomics

Timothy M. D. Ebbels, Justin J. J. van der Hooft, Haley Chatelaine, Corey Broeckling, Nicola Zamboni, Soha Hassoun, Ewy A. Mathe

Summary: The computational metabolomics field brings together experts from various disciplines to maximize the impact of metabolomics research. Advances in technology have generated complex datasets that require processing, annotation, modeling, and interpretation. Techniques for visualization, integration, and interpretation of metabolomics data have evolved alongside the development of databases and knowledge resources. This review highlights recent advances and discusses opportunities and innovations in response to challenges in the field.

CURRENT OPINION IN CHEMICAL BIOLOGY (2023)

Article Nutrition & Dietetics

Untargeted metabolomic analysis investigating links between unprocessed red meat intake and markers of inflammation

Alexis C. Wood, Goncalo Graca, Meghana Gadgil, Mackenzie K. Senn, Matthew A. Allison, Ioanna Tzoulaki, Philip Greenland, Timothy Ebbels, Paul Elliott, Mark O. Goodarzi, Russell Tracy, Jerome I. Rotter, David Herrington

Summary: This study investigated the relationship between red meat intake and inflammation. The results showed no significant association between processed or unprocessed red meat and markers of inflammation. However, unprocessed red meat intake was inversely associated with the plasma metabolite glutamine, which was also inversely associated with C-reactive protein levels.

AMERICAN JOURNAL OF CLINICAL NUTRITION (2023)

Article Nutrition & Dietetics

Associations between Metabolomic Biomarkers of Avocado Intake and Glycemia in the Multi-Ethnic Study of Atherosclerosis

Alexis C. Wood, Mark O. Goodarzi, Mackenzie K. Senn, Meghana D. Gadgil, Goncalo Graca, Matthew A. Allison, Ioanna Tzoulaki, Michael Y. Mi, Philip Greenland, Timothy Ebbels, Paul Elliott, Russell P. Tracy, David M. Herrington, Jerome I. Rotter

Summary: Avocado intake is associated with metabolomic biomarkers related to glycemia. These biomarkers are strongly associated with lower fasting glucose, lower fasting insulin, and lower incidence of type 2 diabetes. However, the association between avocado intake and fasting insulin is attenuated when controlling for body mass index.

JOURNAL OF NUTRITION (2023)

Review Endocrinology & Metabolism

Problems, principles and progress in computational annotation of NMR metabolomics data

Michael T. Judge, Timothy M. D. Ebbels

Summary: This review aims to broaden the application of automated annotation tools by discussing the key ideas of spectral matching and describing a set of terms for classifying this information, thus advancing standards for communicating annotation confidence. Additionally, it hopes to facilitate collaboration between chemical data scientists, software developers, and the NMR metabolomics community for long-term software solutions.

METABOLOMICS (2022)
