Article Proceedings Paper

pNovo 3: precise de novo peptide sequencing using a learning-to-rank framework

Journal

BIOINFORMATICS
Volume 35, Issue 14, Pages I183-I190

Publisher

OXFORD UNIV PRESS
DOI: 10.1093/bioinformatics/btz366

Funding

  1. National Key Research and Development Program of China [2016YFA0501300]
  2. National Natural Science Foundation of China [31470805]
  3. Youth Innovation Promotion Association CAS [2014091]
  4. National High Technology Research and Development Program of China (863) [2014AA020902, 2014AA020901]

Motivation: De novo peptide sequencing based on tandem mass spectrometry data is the key technology of shotgun proteomics for identifying peptides without any database and for assembling unknown proteins. However, owing to the low ion coverage in tandem mass spectra, the order of certain consecutive amino acids cannot be determined if all of their supporting fragment ions are missing, which results in low de novo sequencing precision.

Results: To solve this problem, we developed pNovo 3, which uses a learning-to-rank framework to distinguish similar peptide candidates for each spectrum. Three metrics that measure the similarity between each experimental spectrum and its corresponding theoretical spectrum are used as important features; the theoretical spectra are predicted precisely by the deep-learning-based pDeep algorithm. On seven benchmark datasets from six diverse species, pNovo 3 recalled 29-102% more correct spectra, and its precision was 11-89% higher than that of three other state-of-the-art de novo sequencing algorithms. Furthermore, compared with the newly developed DeepNovo, which also uses a deep learning approach, pNovo 3 still identified 21-50% more spectra on the nine datasets used in the DeepNovo study. In summary, the deep learning and learning-to-rank techniques implemented in pNovo 3 significantly improve the precision of de novo sequencing, and such a machine learning framework is worth extending to other related research fields to distinguish similar sequences.

Availability and implementation: pNovo 3 can be freely downloaded from http://pfind.ict.ac.cn/software/pNovo/index.html.

Supplementary information: Supplementary data are available at Bioinformatics online.
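The abstract describes scoring each peptide candidate by how closely its theoretical spectrum matches the experimental one, then ranking the candidates. The following Python sketch illustrates that general idea only. It is not the pNovo 3 implementation: the abstract does not name the three similarity metrics, so the cosine similarity, Pearson correlation, and matched-ion fraction used here are assumptions, the fixed `weights` stand in for a trained learning-to-rank model, and the intensity vectors are made-up numbers rather than pDeep predictions. All function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two aligned intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pearson_correlation(a, b):
    """Pearson correlation coefficient of two aligned intensity vectors."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def matched_fraction(experimental, predicted):
    """Fraction of predicted fragment ions that are observed experimentally."""
    expected = [i for i, p in enumerate(predicted) if p > 0]
    if not expected:
        return 0.0
    hits = sum(1 for i in expected if experimental[i] > 0)
    return hits / len(expected)


def rank_candidates(experimental, candidates, weights=(1.0, 1.0, 1.0)):
    """Rank candidate peptides by a weighted sum of similarity features.

    `candidates` maps a peptide sequence to its predicted intensity vector.
    The fixed weights are an assumption standing in for a learned ranker.
    """
    scored = []
    for peptide, predicted in candidates.items():
        features = (
            cosine_similarity(experimental, predicted),
            pearson_correlation(experimental, predicted),
            matched_fraction(experimental, predicted),
        )
        score = sum(w * f for w, f in zip(weights, features))
        scored.append((score, peptide))
    scored.sort(reverse=True)
    return [peptide for _, peptide in scored]


# Illustrative, made-up intensities over the same six fragment-ion positions.
experimental = [0.0, 0.8, 0.1, 1.0, 0.0, 0.5]
candidates = {
    "PEPTIDE": [0.0, 0.7, 0.2, 1.0, 0.0, 0.4],
    "PEPTIED": [0.9, 0.0, 0.1, 0.2, 1.0, 0.0],
}
ranking = rank_candidates(experimental, candidates)
```

In this toy input the two candidates differ only in the order of their last two residues, which is the kind of ambiguity the abstract says arises when supporting fragment ions are missing. A real system would learn the feature weights from labeled spectra instead of fixing them by hand.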

Recommended

Article Chemistry, Analytical

pDeep3: Toward More Accurate Spectrum Prediction with Fast Few-Shot Learning

Ching Tarn, Wen-Feng Zeng

Summary: This study adopts a few-shot learning method to improve the accuracy of deep-learning-based spectrum prediction. Validated on multiple datasets, the method shows a significant improvement in prediction accuracy within seconds.

ANALYTICAL CHEMISTRY (2021)

Article Medicine, Research & Experimental

[Mn(PaPy2Q)(NO)]ClO4, a Near-Infrared Light activated release of Nitric Oxide drug as a nitric oxide donor for therapy of human prostate cancer cells in vitro and in vivo

Yuwan Zhao, Zhuo Li, Huancheng Tang, Shanhong Lin, Wenfeng Zeng, Dongcai Ye, Xin Zeng, Qiuming Luo, Jianwei Li, Zhixian Ao, Jierong Mo, Lixin Chen, Yiqiu Yang, Yunsheng Huang, Jianjun Liu

Summary: This study investigated the synthesis of a near-infrared light-sensitive NO prodrug and its effects on prostate cancer cells. The results showed that the drug effectively inhibited cell proliferation and promoted apoptosis in a concentration-dependent manner. Furthermore, in vivo experiments demonstrated the anti-cancer effects of the drug, with increased NO concentration in tumors after near-infrared light irradiation.

BIOMEDICINE & PHARMACOTHERAPY (2021)

Article Biochemical Research Methods

pDeepXL: MS/MS Spectrum Prediction for Cross-Linked Peptide Pairs by Deep Learning

Zhen-Lin Chen, Peng-Zhi Mao, Wen-Feng Zeng, Hao Chi, Si-Min He

Summary: pDeepXL is a deep learning tool for predicting MS/MS spectra of cross-linked peptide pairs. Trained using transfer learning, it accurately predicts spectra of both noncleavable and cleavable cross-linked peptide pairs, and shows improved robustness through online fine-tuning. Integration of pDeepXL into a database search engine increases the identification of cross-link spectra by 18% on average.

JOURNAL OF PROTEOME RESEARCH (2021)

Article Biochemistry & Molecular Biology

Artificial intelligence for proteomics and biomarker discovery

Matthias Mann, Chanchal Kumar, Wen-Feng Zeng, Maximilian T. Strauss

Summary: The rapid growth of biomedical data generation and computational capabilities has led to advancements in utilizing machine learning and deep learning in proteomics for predictive modeling and biomarker discovery. These technologies are essential for improving analytical workflows and integrating multi-omics data, while also raising concerns about model transparency, explainability, and data privacy when deploying MS-based biomarkers in clinical settings.

CELL SYSTEMS (2021)

Article Biochemical Research Methods

pValid 2: A deep learning based validation method for peptide identification in shotgun proteomics with increased discriminating power

Wen-Jing Zhou, Zhuo-Hong Wei, Si-Min He, Hao Chi

Summary: In this study, we developed pValid 2, a more comprehensive validation method that overcomes the limitations of previous validation methods by introducing a new feature, leading to improved accuracy and efficiency in identifications.

JOURNAL OF PROTEOMICS (2022)

Article Biochemical Research Methods

Precise, fast and comprehensive analysis of intact glycopeptides and modified glycans with pGlyco3

Wen-Feng Zeng, Wei-Qian Cao, Ming-Qi Liu, Si-Min He, Peng-Yuan Yang

Summary: The study introduces a new glycan-first glycopeptide search engine, pGlyco3, which can comprehensively analyze intact N- and O-glycopeptides, including glycopeptides with modified saccharide units, in a fast and accurate manner.

NATURE METHODS (2021)

Article Computer Science, Interdisciplinary Applications

Eight-element fifth-generation multiple-input multiple-output antenna designed by modal currents cancelation

Wen-Feng Zeng, Qing-Xin Chu

Summary: An antenna decoupling method based on modal control is proposed in this paper, which excites a pair of decoupling modes simultaneously to achieve decoupling. The effectiveness of this method is validated through the analysis and design of a head-to-head antenna pair. Additionally, an eight-element MIMO antenna is designed, fabricated, and measured to demonstrate the good performance of the proposed method.

INTERNATIONAL JOURNAL OF RF AND MICROWAVE COMPUTER-AIDED ENGINEERING (2022)

Article Biochemistry & Molecular Biology

The structural context of posttranslational modifications at a proteome-wide scale

Isabell Bludau, Sander Willems, Wen-Feng Zeng, Maximilian T. Strauss, Fynn M. Hansen, Maria C. Tanzer, Ozge Karayel, Brenda A. Schulman, Matthias Mann

Summary: The recent revolution in computational protein structure prediction has provided new insights into the study of the entire proteome. In this study, the researchers analyze posttranslational modifications (PTMs) of proteins to determine their structural context and investigate their potential regulatory sites. The analysis reveals global patterns of PTM occurrence and spatial coregulation of different types of PTMs.

PLOS BIOLOGY (2022)

Article Engineering, Electrical & Electronic

Design of Self-Decoupling Dielectric Resonator Antenna With Shared Radiator

Yu-Zhong Liang, Fu-Chang Chen, Wen-Feng Zeng, Qing-Xin Chu

Summary: This communication investigates the method of mode cancellation for designing a two-port dielectric resonator antenna (DRA) for in-band full-duplex (IBFD) applications. The antenna structure is simple, consisting only of a single DRA element, a pair of feeding lines, and a pair of metallic probes. By utilizing different modes, the mutual coupling between the exciting port and the passive port can be suppressed to a very low level without the need for an extra decoupling structure. A prototype is fabricated and measured to verify the design, with the results demonstrating broad bandwidth and high isolation throughout the working band.

IEEE TRANSACTIONS ON ANTENNAS AND PROPAGATION (2023)

Article Multidisciplinary Sciences

AlphaPeptDeep: a modular deep learning framework to predict peptide properties for proteomics

Wen-Feng Zeng, Xie-Xuan Zhou, Sander Willems, Constantin Ammar, Maria Wahle, Isabell Bludau, Eugenia Voytik, Maximillian T. Strauss, Matthias Mann

Summary: Machine learning and deep learning are becoming increasingly important in MS-based proteomics. AlphaPeptDeep is a modular Python framework built on PyTorch that can learn and predict peptide properties. It features a model shop that allows non-specialists to create models easily. AlphaPeptDeep can also predict sequence-based properties and performs well in predicting retention time, collisional cross sections, and fragment intensities.

NATURE COMMUNICATIONS (2022)

Article Multidisciplinary Sciences

pGlycoQuant with a deep residual network for quantitative glycoproteomics at intact glycopeptide level

Siyuan Kong, Pengyun Gong, Wen-Feng Zeng, Biyun Jiang, Xinhang Hou, Yang Zhang, Huanhuan Zhao, Mingqi Liu, Guoquan Yan, Xinwen Zhou, Xihua Qiao, Mengxi Wu, Pengyuan Yang, Chao Liu, Weiqian Cao

Summary: pGlycoQuant is a generic tool for quantitative analysis of intact glycopeptides using both primary and tandem mass spectrometry. It employs a deep learning model and a Match In Run algorithm to improve glycopeptide matching and to extend the quantitative function of various search engines. Its application in an N-glycoproteomic study demonstrates its potential for exploring site-specific glycosylation and its role in biological processes.

NATURE COMMUNICATIONS (2022)

Article Cardiac & Cardiovascular Systems

CYP2C19 loss-of-function is associated with increased risk of hypertension in a Hakka population: a case-control study

Nan Cai, Cunren Li, Xianfang Gu, Wenfeng Zeng, Jiawei Zhong, Jingfeng Liu, Guopeng Zeng, Junxing Zhu, Haifeng Hong

Summary: The study found that there is a relationship between CYP2C19 gene polymorphisms and hypertension in the Hakka population. Loss-of-function genotypes of CYP2C19 increase the risk of hypertension.

BMC CARDIOVASCULAR DISORDERS (2023)

Article Medicine, Research & Experimental

Quantitative multiorgan proteomics of fatal COVID-19 uncovers tissue-specific effects beyond inflammation

Lisa Schweizer, Tina Schaller, Maximilian Zwiebel, Oezge Karayel, Johannes Bruno Mueller-Reif, Wen-Feng Zeng, Sebastian Dintner, Thierry M. Nordmann, Klaus Hirschbuehl, Bruno Maerkl, Rainer Claus, Matthias Mann

Summary: SARS-CoV-2 can cause damage to lung tissue and other organs in the human body, and this study aimed to analyze these effects comprehensively. Using a mass spectrometry proteomics workflow, the researchers identified inflammatory responses as the initial reaction in all tissues. They also found specific patterns of damage in different organs, such as diffuse alveolar damage in the lungs and organ-specific changes in the kidneys, liver, and lymphatic and vascular systems. In the brain, secondary inflammatory effects were linked to neurotransmitter receptors and myelin degradation. These findings contribute to our understanding of the mechanisms of COVID-19 and provide insights for organ-specific therapeutic interventions.

EMBO MOLECULAR MEDICINE (2023)

Article Biochemistry & Molecular Biology

Robust dimethyl-based multiplex-DIA doubles single-cell proteome depth via a reference channel

Marvin Thielert, Ericka C. M. Itang, Constantin Ammar, Florian A. Rosenberger, Isabell Bludau, Lisa Schweizer, Thierry M. Nordmann, Patricia Skowronek, Maria Wahle, Wen-Feng Zeng, Xie-Xuan Zhou, Andreas-David Brunner, Sabrina Richter, Mitchell P. Levesque, Fabian J. Theis, Martin Steger, Matthias Mann

Summary: Single-cell proteomics allows unbiased characterization of biological function and heterogeneity at the protein level. However, current limitations include proteomic depth, throughput, and robustness. In this study, we introduce a streamlined multiplexed workflow using mDIA to address these limitations. Our approach enables automated and complete dimethyl labeling of bulk or single-cell samples, without compromising proteomic depth. We also demonstrate the ability to quantify twice as many proteins per single cell compared to previous methods, and our workflow allows routine analysis of 80 single cells per day. Additionally, we combine mDIA with spatial proteomics to increase the throughput for microdissection and MS analysis, and successfully identify proteomic signatures of cells within distinct tumor microenvironments in primary cutaneous melanoma.

MOLECULAR SYSTEMS BIOLOGY (2023)

Article Engineering, Electrical & Electronic

Antenna Decoupling Based on Characteristic Modes Cancellation

Qingxin Chu, Wenfeng Zeng

Summary: This paper focuses on the antenna coupling within the MIMO system in 5G and summarizes decoupling techniques. It also elaborates on the new decoupling research developments based on the theory of characteristic mode and provides design examples to validate the proposed decoupling method.

CHINESE JOURNAL OF ELECTRONICS (2022)
