Article

Mspire-Simulator: LC-MS Shotgun Proteomic Simulator for Creating Realistic Gold Standard Data

Journal

JOURNAL OF PROTEOME RESEARCH
Volume 12, Issue 12, Pages 5742-5749

Publisher

AMER CHEMICAL SOC
DOI: 10.1021/pr400727e

Keywords

simulation; simulator; model; mspire; mass spectroscopy; proteomics

Funding

  1. Brigham Young University
  2. NSF GRF [DGE-0750759]


The most important step in any quantitative proteomic pipeline is feature detection (aka peak picking). However, generating quality hand-annotated data sets to validate the algorithms, especially for lower abundance peaks, is nearly impossible. An alternative for creating gold standard data is to simulate it with features closely mimicking real data. We present Mspire-Simulator, a free, open-source shotgun proteomic simulator that goes beyond previous simulation attempts by generating LC-MS features with realistic m/z and intensity variance along with other noise components. It also includes machine-learned models for retention time and peak intensity prediction and a genetic algorithm to custom fit model parameters for experimental data sets. We show that these methods are applicable to data from three different mass spectrometers, including two fundamentally different types, and show visually and analytically that simulated peaks are nearly indistinguishable from actual data. Researchers can use simulated data to rigorously test quantitation software, and proteomic researchers may benefit from overlaying simulated data on actual data sets.

Recommended

Article Biochemical Research Methods

Current controlled vocabularies are insufficient to uniquely map molecular entities to mass spectrometry signal

Rob Smith, Ryan M. Taylor, John T. Prince

BMC BIOINFORMATICS (2015)

Article Biochemical Research Methods

A coherent mathematical characterization of isotope trace extraction, isotopic envelope extraction, and LC-MS correspondence

Rob Smith, John T. Prince, Dan Ventura

BMC BIOINFORMATICS (2015)

Article Multidisciplinary Sciences

Structures of the Gβ-CCT and PhLP1-Gβ-CCT complexes reveal a mechanism for G-protein β-subunit folding and Gβγ dimer assembly

Rebecca L. Plimpton, Jorge Cuellar, Chun Wan J. Lai, Takuma Aoba, Aman Makaju, Sarah Franklin, Andrew D. Mathis, John T. Prince, Jose L. Carrascosa, Jose M. Valpuesta, Barry M. Willardson

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2015)

Article Biochemical Research Methods

Probabilistic Generation of Mass Spectrometry Molecular Abundance Variance for Case and Control Replicates

John T. Prince, Rob Smith

JOURNAL OF PROTEOME RESEARCH (2017)

Article Biochemical Research Methods

JAMSS: proteomics mass spectrometry simulation in Java

Rob Smith, John T. Prince

BIOINFORMATICS (2015)

Article Biochemical Research Methods

Automated structural classification of lipids by machine learning

Ryan Taylor, Ryan H. Miller, Ryan D. Miller, Michael Porter, James Dalgleish, John T. Prince

BIOINFORMATICS (2015)

Review Biochemical Research Methods

LC-MS alignment in theory and practice: a comprehensive algorithmic review

Rob Smith, Dan Ventura, John T. Prince

BRIEFINGS IN BIOINFORMATICS (2015)

Article Immunology

Yersinia pseudotuberculosis BarA-UvrY Two-Component Regulatory System Represses Biofilms via CsrB

Jeffrey K. Schachterle, Ryan M. Stewart, M. Brett Schachterle, Joshua T. Calder, Huan Kang, John T. Prince, David L. Erickson

FRONTIERS IN CELLULAR AND INFECTION MICROBIOLOGY (2018)

Article Multidisciplinary Sciences

Powerful gene-based testing by integrating long-range chromatin interactions and knockoff genotypes

Shiyang Ma, James Dalgleish, Justin Lee, Chen Wang, Linxi Liu, Richard Gill, Joseph D. Buxbaum, Wendy K. Chung, Hugues Aschard, Edwin K. Silverman, Michael H. Cho, Zihuai He, Iuliana Ionita-Laza

Summary: The study proposes a gene-based testing framework that enhances gene discovery by incorporating long-range chromatin interaction data and leveraging the knockoff framework. Through simulations and applications to multiple diseases and traits, the test shows improved power over other gene-based tests and provides a more focused approach to identifying possible causal genes.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2021)

Article Multidisciplinary Sciences

HIV-1 Vpr suppresses expression of the thiazide-sensitive sodium chloride co-transporter in the distal convoluted tubule

Shashi Shrivastav, Hewang Lee, Koji H. Okamoto, Huiyan Lu, Teruhiko D. Yoshida, Khun Zaw Latt, Hidefumi Wakashin, James L. T. J. Dalgleish, Erik H. A. Koritzinsky, Peng A. Xu, Laureano D. Z. Asico, Joon-Yong A. Chung, Stephen Hewitt, John J. B. Gildea, Robin A. Felder, Pedro A. Jose, Avi Z. Rosenberg, Mark A. Knepper, Tomoshige Kino, Jeffrey B. Kopp

Summary: HIV-associated nephropathy (HIVAN) impairs both glomerular and tubular functions. This study found that viral protein R (Vpr) plays an important role in urinary sodium wasting by inhibiting the expression of Na+-Cl- cotransporter (NCC) and reducing the transcriptional activity of mineralocorticoid receptor (MR) in the distal convoluted tubule.

PLOS ONE (2022)

Article Biotechnology & Applied Microbiology

BIGKnock: fine-mapping gene-based associations via knockoff analysis of biobank-scale data

Shiyang Ma, Chen Wang, Atlas Khan, Linxi Liu, James Dalgleish, Krzysztof Kiryluk, Zihuai He, Iuliana Ionita-Laza

Summary: We propose BIGKnock, a computationally efficient gene-based testing approach that leverages long-range chromatin interaction data and performs conditional genome-wide testing via knockoffs. Applying BIGKnock to the UK Biobank data, we show that it produces smaller sets of significant genes that are likely to contain the causal gene(s), compared to conventional gene-based tests. We also demonstrate its ability to pinpoint potential causal genes at more than 80% of the associated loci.

GENOME BIOLOGY (2023)

Review Oncology

CNVScope: Visually Exploring Copy Number Aberrations in Cancer Genomes

James L. T. Dalgleish, Yonghong Wang, Jack Zhu, Paul S. Meltzer

CANCER INFORMATICS (2019)

Article Biochemical Research Methods

Whole blood and urine bioactive Hepcidin-25 determination using liquid chromatography mass spectrometry

Adam C. Swensen, Jordan G. Finnell, Catalina Matias, Andrew J. Gross, John T. Prince, Richard K. Watt, John C. Price

ANALYTICAL BIOCHEMISTRY (2017)
