Article

Integrative analysis with expanded DNA methylation data reveals common key regulators and pathways in cancers

Journal

NPJ GENOMIC MEDICINE
Volume 4

Publisher

SPRINGER NATURE, CO-PUBL CTR EXCELLENCE GENOMIC MED RES
DOI: 10.1038/s41525-019-0077-8

Funding

  1. NIH [R01 HG009626]
  2. National Natural Science Foundation of China [61503061, 61872063]
  3. Fundamental Research Funds for the Central Universities [ZYGX2016J102]
  4. Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education [93K172017K02]

The integration of genomic and DNA methylation data has proven to be a powerful strategy for understanding cancer mechanisms and identifying therapeutic targets. The TCGA consortium has mapped DNA methylation in thousands of cancer samples using the Illumina Infinium HumanMethylation450 BeadChip (Illumina 450K array), which covers only about 1.5% of the CpGs in the human genome. Increasing the coverage of the DNA methylome would therefore substantially increase the value of the TCGA data. Here, we present a new model called EAGLING that expands the Illumina 450K array data 18-fold to cover about 30% of the CpGs in the human genome. We applied it to 13 cancers in TCGA. By integrating the expanded methylation, gene expression, and somatic mutation data, we identified the genes showing differential patterns in each of the 13 cancers. Many of the triple-evidenced genes identified in the majority of the cancers are biomarkers or potential biomarkers. Pan-cancer analysis also revealed the pathways in which the triple-evidenced genes are enriched, which include well-known pathways as well as new ones, such as the axonal guidance signaling pathway and pathways related to inflammatory processes or the inflammatory response. Triple-evidenced genes, particularly TNXB, RRM2, CELSR3, SLC16A3, FANCI, MMP9, MMP11, SIK1, and TRIM59, showed superior predictive power in both tumor diagnosis and prognosis. These results demonstrate that integrative analysis using the expanded methylation data is powerful in identifying critical genes and pathways that may serve as new therapeutic targets.
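
One way to read "triple-evidenced" is as the set of genes supported by all three data types at once: differential methylation, differential expression, and somatic mutation. The sketch below illustrates only that reading; it is not the authors' EAGLING pipeline, and the function names and gene labels (`GENE_A` and so on) are hypothetical placeholders rather than results from the paper. It also checks the coverage arithmetic from the abstract: an 18-fold expansion of about 1.5% coverage gives about 27%, consistent with the stated "about 30%".

```python
def triple_evidenced(diff_methylated, diff_expressed, mutated):
    """Return genes present in all three evidence sets, sorted by name.

    Assumption: each argument is a collection of gene symbols already
    called as significant by some upstream test (not shown here).
    """
    return sorted(set(diff_methylated) & set(diff_expressed) & set(mutated))


def expanded_coverage(base_fraction, fold):
    """Fraction of CpGs covered after a `fold`-times expansion."""
    return base_fraction * fold


# Toy inputs with made-up gene labels (hypothetical, for illustration only).
methylation_hits = {"GENE_A", "GENE_B", "GENE_C"}
expression_hits = {"GENE_B", "GENE_C", "GENE_D"}
mutation_hits = {"GENE_B", "GENE_C", "GENE_E"}

genes = triple_evidenced(methylation_hits, expression_hits, mutation_hits)
coverage = expanded_coverage(0.015, 18)

print(genes)               # genes supported by all three evidence types
print(f"{coverage:.0%}")   # approximate CpG coverage after expansion
```

A plain set intersection is the strictest possible combination rule; a real analysis would also need per-cancer statistical testing and multiple-testing correction before any gene enters one of the three sets.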

Recommended

Article Biochemical Research Methods

DEEPSMP: A deep learning model for predicting the ectodomain shedding events of membrane proteins

Zhongbo Cao, Wei Du, Gaoyang Li, Huansheng Cao

JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY (2020)

Article Biochemical Research Methods

CyanoPATH: a knowledgebase of genome-scale functional repertoire for toxic cyanobacterial blooms

Wei Du, Gaoyang Li, Nicholas Ho, Landon Jenkins, Drew Hockaday, Jiankang Tan, Huansheng Cao

Summary: CyanoPATH is a database that curates and analyzes the common genomic functional repertoire for cyanobacteria harmful algal blooms in eutrophic waters, summarizing 19 pathways involved in the utilization of nutrients, stress resistance, and more. It provides valuable assistance in analyzing aquatic metagenomes and metatranscriptomes in CyanoHAB research. Most importantly, it bridges the gap between genome and ecology.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Biochemical Research Methods

Deep forest ensemble learning for classification of alignments of non-coding RNA sequences based on multi-view structure representations

Ying Li, Qi Zhang, Zhaoqian Liu, Cankun Wang, Siyu Han, Qin Ma, Wei Du

Summary: This paper introduces a novel deep fusion learning framework, GCFM, based on a convolutional neural network and a deep forest algorithm, for accurate clustering of ncRNAs. By classifying alignments of ncRNA sequences, GCFM shows a 6% improvement in F-value compared to existing methods, and outperforms RNAclust, Ensembleclust, and CNNclust in ncRNA family clustering with 20% increased accuracy. Additionally, GCFM is utilized in constructing phylogenetic trees for ncRNAs and predicting RNA interactions, with an accuracy rate of 90.63%.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Biochemical Research Methods

Capsule-LPI: a LncRNA-protein interaction predicting tool based on a capsule network

Ying Li, Hang Sun, Shiyao Feng, Qi Zhang, Siyu Han, Wei Du

Summary: This study presents a novel multichannel capsule network framework, Capsule-LPI, to integrate multimodal features for LPI prediction. Through comprehensive experimental comparisons and evaluations, it is demonstrated that both multimodal features and the architecture of the multichannel capsule network can significantly improve the performance of LPI prediction. The experimental results show that Capsule-LPI outperforms the existing state-of-the-art tools.

BMC BIOINFORMATICS (2021)

Article Computer Science, Hardware & Architecture

DeepHBSP: A Deep Learning Framework for Predicting Human Blood-Secretory Proteins Using Transfer Learning

Wei Du, Yu Sun, Hui-Min Bao, Liang Chen, Ying Li, Yan-Chun Liang

Summary: This study introduces a novel deep learning model for accurate identification of blood-secretory proteins using amino acid sequence information. The model combines a binary classification network and a ranking network, applying descriptive loss and compactness loss to improve prediction accuracy. Transfer learning is utilized to train a highly accurate generalized model with limited samples.

JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY (2021)

Article Genetics & Heredity

De novo Prediction of Moonlighting Proteins Using Multimodal Deep Ensemble Learning

Ying Li, Jianing Zhao, Zhaoqian Liu, Cankun Wang, Lizheng Wei, Siyu Han, Wei Du

Summary: In this paper, we propose a novel computational model MEL-MP for predicting moonlighting proteins (MPs) by utilizing specific classifiers for different feature types, resulting in superior prediction performance. Through experiments, it is shown that MEL-MP outperforms the existing machine learning model MPFit, demonstrating its effectiveness in predicting MPs.

FRONTIERS IN GENETICS (2021)

Article Multidisciplinary Sciences

Cardiac cell type-specific gene regulatory programs and disease risk association

James D. Hocker, Olivier B. Poirion, Fugui Zhu, Justin Buchanan, Kai Zhang, Joshua Chiou, Tsui-Min Wang, Qingquan Zhang, Xiaomeng Hou, Yang E. Li, Yanxiao Zhang, Elie N. Farah, Allen Wang, Andrew D. McCulloch, Kyle J. Gaulton, Bing Ren, Neil C. Chi, Sebastian Preissl

Summary: This study identified cCREs in the human heart and found their associations with cardiac cell types and heart failure. It also discovered that genetic variants associated with cardiovascular diseases are enriched within these cCREs, with some potentially linked to atrial fibrillation.

SCIENCE ADVANCES (2021)

Article Biochemistry & Molecular Biology

SecProCT: In Silico Prediction of Human Secretory Proteins Based on Capsule Network and Transformer

Wei Du, Xuan Zhao, Yu Sun, Lei Zheng, Ying Li, Yu Zhang

Summary: This study introduces a novel deep learning model, SecProCT, which predicts secretory proteins using only amino acid sequences with higher accuracy compared to traditional machine learning and other deep learning approaches. The model's innovation lies in its reliance solely on amino acid sequences, overcoming the dependency on annotated protein features in existing methods, and accurately predicting secretory proteins and cancer protein biomarkers in blood and saliva.

INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES (2021)

Article Biochemistry & Molecular Biology

A single-cell atlas of chromatin accessibility in the human genome

Kai Zhang, James D. Hocker, Michael Miller, Xiaomeng Hou, Joshua Chiou, Olivier B. Poirion, Yunjiang Qiu, Yang E. Li, Kyle J. Gaulton, Allen Wang, Sebastian Preissl, Bing Ren

Summary: This study utilized single-cell chromatin accessibility assays to analyze gene regulatory elements in diverse cell types and tissues in the human body, integrating data from adult and fetal tissue to reveal the specificity of cCREs in 222 distinct cell types. The findings provide a foundation for understanding gene regulatory programs across tissues, life stages, and organ systems in humans.

Article Mathematical & Computational Biology

HBFP: a new repository for human body fluid proteome

Dan Shao, Lan Huang, Yan Wang, Xueteng Cui, Yufei Li, Yao Wang, Qin Ma, Wei Du, Juan Cui

Summary: The study highlights the importance of human body fluid proteome as a source for disease biomarker discovery and the lack of a centralized database for published body fluid proteins. The newly developed Human Body Fluid Proteome (HBFP) database provides a valuable resource for research in clinical proteomics and biomarker discovery.

DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION (2021)

Article Biochemical Research Methods

LPInsider: a webserver for lncRNA-protein interaction extraction from the literature

Ying Li, Lizheng Wei, Cankun Wang, Jianing Zhao, Siyu Han, Yu Zhang, Wei Du

Summary: LncRNAs play important roles in biological processes, and extracting lncRNA-protein interactions (LPIs) from the biomedical literature is challenging. LPInsider is the first webserver for extracting LPIs from the literature, using multiple text features and logistic regression. It saves researchers time and helps them better understand lncRNAs through text mining.

BMC BIOINFORMATICS (2022)

Article Computer Science, Information Systems

SURE: Screening unlabeled samples for reliable negative samples based on reinforcement learning

Ying Li, Hang Sun, Wensi Fang, Qin Ma, Siyu Han, Rui Wang-Sattler, Wei Du, Qiong Yu

Summary: For classification tasks in the bioinformatics field with limited negative samples, the researchers propose a novel deep reinforcement learning-based model called SURE, which can screen reliable negative samples from unlabeled samples. SURE consists of a sample selector and a sample inspector, which are trained together to optimize the screening strategies. Experimental results show that SURE has a robust negative sample screening capability and outperforms other methods in the task of NPI prediction. Additionally, 5 NPI datasets refined by SURE are provided through a web server.

INFORMATION SCIENCES (2023)

Article Computer Science, Information Systems

PecidRL: Petition expectation correction and identification based on deep reinforcement learning

Ying Li, Wensi Fang, Hang Sun, Xiangyu Liu, Wei Du, Yijun Liu, Qianqian Li

Summary: In this paper, a novel deep reinforcement learning method named PecidRL is proposed for the correction and identification of petition expectation. A dataset containing 237,042 petitions from the largest official petition platform in China is collected. The results show that the proposed method significantly improves the performance of petition expectation identification models.

INFORMATION PROCESSING & MANAGEMENT (2023)

Article Multidisciplinary Sciences

Integrated analysis of single-cell chromatin state and transcriptome identified common vulnerability despite glioblastoma heterogeneity

Ramya Raviram, Anugraha Raman, Sebastian Preissl, Jiangfang Ning, Shaoping Wu, Tomoyuki Koga, Kai Zhang, Cameron W. Brennan, Chenxu Zhu, Jens Luebeck, Kinsey Van Deynze, Jee Yun Han, Xiaomeng Hou, Zhen Ye, Anna K. Mischel, Yang Eric Li, Rongxin Fang, Tomas Baback, Joshua Mugford, Claudia Z. Han, Christopher K. Glass, Cathy L. Barr, Paul S. Mischel, Vineet Bafna, Laure Escoubet, Bing Ren, Clark C. Chen

Summary: In 2021, glioblastoma, the most common form of adult brain cancer, was reclassified by the World Health Organization into two subtypes based on genetic characteristics. The study analyzed the chromatin accessibility and transcription profiles of clinical samples from both types of tumors, revealing intratumoral heterogeneity and shared chromatin structure among tumor cells. Silencing specific transcription factors suppressed tumor growth, suggesting a potential therapeutic target for addressing the challenges associated with intratumoral heterogeneity.

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA (2023)

Article Multidisciplinary Sciences

Local flux coordination and global gene expression regulation in metabolic modeling

Gaoyang Li, Li Liu, Wei Du, Huansheng Cao

Summary: The authors develop a method called Decrem, which integrates locally coupled reactions and global transcriptional regulation of metabolism, to reconstruct genome-scale metabolic networks. Decrem achieves accurate predictions of phenotypes and has broad applications in bioengineering, synthetic biology, and microbial pathology.

NATURE COMMUNICATIONS (2023)