4.6 Article

Classification of early and late stage liver hepatocellular carcinoma patients from their genomics and epigenomics profiles

Journal

PLOS ONE
Volume 14, Issue 9, Pages -

Publisher

PUBLIC LIBRARY SCIENCE
DOI: 10.1371/journal.pone.0221476

Keywords

-

Funding

  1. J. C. Bose National Fellowship (DST, Department of Science & Technology, Ministry of Science and Technology, India)
  2. CSIR (Council of Scientific and Industrial Research, India)
  3. ICMR (Indian Council Of Medical Research, India)

Background: Liver Hepatocellular Carcinoma (LIHC) is one of the major cancers worldwide, responsible for millions of premature deaths every year. Predicting the clinical stage is vital for choosing an optimal therapeutic strategy and for prognosis in cancer patients. However, to date, no method has been developed for predicting the stage of LIHC from the genomic profile of samples.

Methods: The Cancer Genome Atlas (TCGA) dataset of 173 early stage (stage-I), 177 late stage (stage-II, stage-III and stage-IV) and 50 adjacent normal tissue samples, covering 60,483 RNA transcripts and 485,577 methylation CpG sites, was extensively analyzed to identify key transcriptomic expression and methylation-based features using different feature selection techniques. Classification models based on the selected key features were then developed with different machine learning algorithms to categorize the different classes of samples.

Results: In the current study, in silico models were developed for classifying LIHC patients as early vs. late stage and samples as cancerous vs. normal, using RNA expression and DNA methylation data. TCGA datasets were extensively analyzed to identify differentially expressed RNA transcripts and methylated CpG sites that can discriminate early vs. late stages and cancer vs. normal samples of LIHC with high precision. A Naive Bayes model developed using 51 features, combining 21 CpG methylation sites and 30 RNA transcripts, achieved a maximum MCC (Matthews correlation coefficient) of 0.58 with an accuracy of 78.87% on the validation dataset in discriminating early from late stage. Additionally, the prediction models based on 5 RNA transcripts and 5 CpG sites classify LIHC and normal samples with an accuracy of 96-98% and an AUC (Area Under the Receiver Operating Characteristic curve) of 0.99. Multiclass models were also developed for classifying samples as normal, early stage or late stage, and achieved an accuracy of 76.54% and an AUC of 0.86.

Conclusion: Our study reveals that predicting the stage of LIHC samples with high accuracy from genomic and epigenomic profiles is a challenging task compared with classifying cancerous and normal samples. The comprehensive analysis, differentially expressed RNA transcripts, methylated CpG sites in LIHC samples and prediction models are available from CancerLSP (http://webs.iiitd.edu.in/raghava/cancerlsp/).
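
The abstract reports performance as MCC and accuracy. As a minimal sketch of how these two metrics relate to a binary confusion matrix (for example, late stage as the positive class and early stage as the negative class), the Python below computes both from raw counts. The function names and the example counts are hypothetical illustrations and are not taken from the study.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    if denominator == 0:
        return 0.0
    return numerator / denominator


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the paper's results).
    tp, tn, fp, fn = 40, 30, 20, 10
    print(f"MCC      = {matthews_corrcoef(tp, tn, fp, fn):.3f}")
    print(f"Accuracy = {accuracy(tp, tn, fp, fn):.2%}")
```

MCC ranges from -1 to 1 and uses all four cells of the confusion matrix, so it is often reported alongside accuracy when judging how well two classes are separated.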

Recommended

Article Biochemical Research Methods

DMPPred: a tool for identification of antigenic regions responsible for inducing type 1 diabetes mellitus

Nishant Kumar, Sumeet Patiyal, Shubham Choudhury, Ritu Tomer, Anjali Dhall, Gajendra P. S. Raghava

Summary: In this study, a high-precision method for predicting, designing, and scanning T1DM associated peptides was developed. By using alignment and machine learning techniques, this method provides accurate results. A web server and standalone server were also developed for practical use.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biochemical Research Methods

Prediction of RNA-interacting residues in a protein using CNN and evolutionary profile

Sumeet Patiyal, Anjali Dhall, Khushboo Bajaj, Harshita Sahu, Gajendra P. S. Raghava

Summary: This paper describes a method called Pprint2 for predicting RNA-interacting residues in proteins. The study found that positively charged amino acids are more prominent in these residues. By using evolutionary profiles and a convolutional neural network, the researchers developed a final model that performed well on the validation dataset.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Immunology

Prediction of celiac disease associated epitopes and motifs in a protein

Ritu Tomer, Sumeet Patiyal, Anjali Dhall, Gajendra P. S. Raghava

Summary: In this study, computational tools were used to predict CD associated epitopes and motifs in protein-based foods and therapeutics. The analysis revealed that proline (P) and glutamine (Q) are highly abundant in CD associated peptides. Machine learning based models and a motif-based approach were developed, and the best models and motifs were integrated into a web server and standalone software package, CDpred.

FRONTIERS IN IMMUNOLOGY (2023)

Article Biology

TNFepitope: A webserver for the prediction of TNF-α inducing epitopes

Anjali Dhall, Sumeet Patiyal, Shubham Choudhury, Shipra Jain, Kashish Narang, Gajendra P. S. Raghava

Summary: Tumor Necrosis Factor alpha (TNF-α) is a pleiotropic pro-inflammatory cytokine that plays a crucial role in immune cell signaling pathways. This study aimed to predict and design TNF-α inducing epitopes using an in silico tool. The proposed models achieved high predictive performance in identifying TNF-α inducing peptides in both human and mouse hosts. Additionally, potential TNF-α inducing peptides were identified in proteins of HIV-1, HIV-2, SARS-CoV-2, and human insulin.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Biology

A web server for predicting and scanning of IL-5 inducing peptides using alignment-free and alignment-based method

Leimarembi Devi Naorem, Neelam Sharma, Gajendra P. S. Raghava

Summary: This study aims to develop a model for accurately predicting IL-5 inducing antigenic regions in proteins. The study identified certain residues, such as Ile, Asn, and Tyr, that dominate IL-5 inducing peptides. Alignment-based methods provided high precision but limited coverage, while alignment-free methods, including machine learning models, improved performance. The hybrid model combining alignment-based and alignment-free methods achieved excellent results on the validation dataset.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Review Pharmacology & Pharmacy

ViralVacDB: A manually curated repository of viral vaccines

Sadhana Tripathi, Neelam Sharma, Leimarembi Devi Naorem, Gajendra P. S. Raghava

Summary: Over the years, numerous vaccines have been developed against viral infections, but a comprehensive database providing detailed information on viral vaccines has been lacking. In this review, we introduce our freely accessible database ViralVacDB, which includes details of viral vaccines, their types, routes of administration, and approving agencies. This repository systematically covers additional information on 422 viral vaccines, including 145 approved vaccines and 277 in clinical trials. We believe that this database will greatly benefit researchers and professionals in pharmaceuticals and immuno-informatics.

DRUG DISCOVERY TODAY (2023)

Article Endocrinology & Metabolism

Hmrbase2: a comprehensive database of hormones and their receptors

Dashleen Kaur, Akanksha Arora, Sumeet Patiyal, Gajendra Pal Singh Raghava

Summary: Hmrbase2 is a comprehensive platform that provides extensive information on hormones, which is essential for the therapeutics and diagnostics of hormonal diseases.

HORMONES-INTERNATIONAL JOURNAL OF ENDOCRINOLOGY AND METABOLISM (2023)

Article Biochemical Research Methods

RNA capture pin technology: investigating long-term stability and mRNA purification specificity of oligonucleotide immobilization on gold and streptavidin surfaces

Deriesha Gaines, Elia Brodsky, Harpreet Kaur, Gergana G. Nestorova

Summary: Advancing biomedical studies requires the development of advanced technologies for the rapid extraction of nucleic acid. We characterized an RNA capture pin (RCP) tool that enables non-destructive, rapid purification and enrichment of mRNA for genetic analysis. The RCP, functionalized with dT15 capture sequences, demonstrated high RNA capture efficiency and selectivity, with 70% messenger RNA, 10% ribosomal RNA, and 20% non-coding RNA. Evaluation of long-term stability showed that gold-thiol RNA capture pins retained 40% of the oligos after 4 months of storage, while streptavidin-coated pins showed a significant decrease in dT15 surface coverage after 2 weeks of storage at 4°C.

ANALYTICAL AND BIOANALYTICAL CHEMISTRY (2023)

Article Genetics & Heredity

Glioma-BioDP: database for visualization of molecular profiles to improve prognosis of brain cancer

Xiang Deng, Shaoli Das, Harpreet Kaur, Evan Wilson, Kevin Camphausen, Uma Shankavaram

Summary: To assist cancer researchers in validation, exploration, analysis, and visualization of molecular profiles in cancer patient samples, we developed Glioma-BioDP, a user-friendly web tool for exploring and visualizing RNA and protein expression profiles in low- and high-grade gliomas. Glioma-BioDP includes expression data from The Cancer Genome Atlas and allows querying by mRNA, microRNA, and protein levels. An advanced query interface enables exploring the association of expression profiles with molecular and histological subtypes, surgical resection status, and survival. The tool facilitates the validation and generation of hypotheses for novel therapies and personalized treatment for gliomas.

BMC MEDICAL GENOMICS (2023)

Review Biochemistry & Molecular Biology

Multi-perspectives and challenges in identifying B-cell epitopes

Nishant Kumar, Nisha Bajiya, Sumeet Patiyal, Gajendra P. S. Raghava

Summary: This paper provides a comprehensive review of the identification of B-cell epitopes (BCEs), covering experimental techniques, historical perspectives, and computational methods. The overall challenge of identifying BCEs is also discussed.

PROTEIN SCIENCE (2023)

Article Biochemical Research Methods

A random forest model for predicting exosomal proteins using evolutionary information and motifs

Akanksha Arora, Sumeet Patiyal, Neelam Sharma, Naorem Leimarembi Devi, Dashleen Kaur, Gajendra P. S. Raghava

Summary: Non-invasive diagnostics and therapies are important for minimizing patient discomfort. Exosomal proteins are identified as potential biomarkers. This study presents a model for predicting exosomal proteins based on machine learning and sequence motifs. The hybrid model outperforms existing methods, and a web server and standalone software have been developed for researchers to predict and discover exosomal proteins.

PROTEOMICS (2023)

Article Genetics & Heredity

In silico transcriptional analysis of asymptomatic and severe COVID-19 patients reveals the susceptibility of severe patients to other comorbidities and non-viral pathological conditions

Poonam Sen, Harpreet Kaur

Summary: This study conducted a comparative transcriptomics analysis on seven asymptomatic and eight severe COVID-19 patients. The results showed differential gene expression between severe and asymptomatic patients, with the upregulation of biological processes and pathways related to viral infection and inflammation in severe patients. These findings can aid researchers in finding effective therapeutic targets and assist clinicians in managing COVID-19 patients and their post-COVID-19 effects.

HUMAN GENE (2023)

Article Biochemical Research Methods

Pfeature: A Tool for Computing Wide Range of Protein Features and Building Prediction Models

Akshara Pande, Sumeet Patiyal, Anjali Lathwal, Chakit Arora, Dilraj Kaur, Anjali Dhall, Gaurav Mishra, Harpreet Kaur, Neelam Sharma, Shipra Jain, Salman Sadullah Usmani, Piyush Agrawal, Rajesh Kumar, Vinod Kumar, Gajendra P. S. Raghava

Summary: In the past 30 years, numerous protein features have been discovered for protein annotation. To integrate these features, we developed a method called Pfeature, which can compute over 200,000 features for predicting protein function, residue-level annotation, and chemically modified peptides. Pfeature includes six major modules: composition, binary profiles, evolutionary information, structural features, patterns, and model building.

JOURNAL OF COMPUTATIONAL BIOLOGY (2023)

Article Biology

Risk assessment of cancer patients based on HLA-I alleles, neobinders and expression of cytokines

Anjali Dhall, Sumeet Patiyal, Harpreet Kaur, Gajendra P. S. Raghava

Summary: Advancements in cancer immunotherapy have shown significant outcomes in treating cancers. To design effective immunotherapy, it is important to understand a patient's immune response based on their genomic profile. However, such analyses require proficiency in bioinformatic methods. Here, we provide a web-based resource that gives scientists with no bioinformatics expertise the ability to obtain prognostic biomarkers for different cancer types at different levels.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)
