4.7 Article Proceedings Paper

Measuring similarity between gene expression profiles: a Bayesian approach

Journal

BMC GENOMICS
Volume 10, Issue -, Pages -

Publisher

BMC
DOI: 10.1186/1471-2164-10-S3-S14

Keywords

-

Background: Grouping genes into clusters on the basis of similarity between their expression profiles has been the main approach to predicting functional modules, from which important inferences can be drawn or decisions about further investigation made. While an unambiguous choice of similarity metric is important, current practice normally relies on Euclidean distance and Pearson correlation, whose assumptions are unlikely to hold for high-throughput microarray data.

Results: We advocate the use of a novel metric, BayesGen, to measure similarity between gene expression profiles, and demonstrate its performance on two important applications: constructing a genome-wide co-expression network and clustering human cancer tissues into subtypes. BayesGen is formulated as the evidence ratio between two alternative hypotheses about the generating mechanism of a given pair of genes, and incorporates the global characteristics of the whole dataset as prior knowledge. Through the joint modelling of expected intensity levels and noise variances, it addresses the inherent nonlinearity and the association of noise levels across different microarray value ranges. The full Bayesian formulation also facilitates meta-analysis.

Conclusion: BayesGen allows more effective extraction of similarity information between genes from microarray expression data, which has a significant effect on various inference tasks. It also provides a robust choice for other object-feature data, as illustrated by the results of a test on synthetic data.
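As a rough illustration of what an evidence ratio between two generating hypotheses can look like, the following Python sketch scores a pair of profiles under a deliberately simplified Gaussian model. It is not the BayesGen formulation: the function name `toy_log_evidence_ratio`, the shared-mean versus separate-means hypotheses, and the fixed hyperparameters `prior_mean`, `prior_var`, and `noise_var` are all assumptions made for this example, and the noise variance is held constant rather than modelled jointly with intensity.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_bivariate_pdf(x, y, mean, var, cov):
    """Log density of a bivariate normal with equal means and variances."""
    det = var * var - cov * cov
    dx, dy = x - mean, y - mean
    quad = (var * dx * dx - 2 * cov * dx * dy + var * dy * dy) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def toy_log_evidence_ratio(a, b, prior_mean, prior_var, noise_var):
    """Toy log evidence ratio for two equal-length profiles (hypothetical model).

    Shared hypothesis: at each condition both genes are noisy readings of one
    latent level drawn from N(prior_mean, prior_var).
    Separate hypothesis: each gene has its own independent latent level.
    The latent levels are integrated out analytically in both cases.
    """
    total_var = prior_var + noise_var
    score = 0.0
    for x, y in zip(a, b):
        # A shared latent level induces covariance prior_var between readings.
        shared = log_bivariate_pdf(x, y, prior_mean, total_var, prior_var)
        # Independent latent levels give two independent marginals.
        separate = (log_normal_pdf(x, prior_mean, total_var)
                    + log_normal_pdf(y, prior_mean, total_var))
        score += shared - separate
    return score


# Hypothetical hyperparameters; in practice they might be estimated
# from the whole dataset, as the abstract suggests for the prior.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [1.0, 2.0, 3.0, 4.0]
gene_c = [4.0, 3.0, 2.0, 1.0]
similar = toy_log_evidence_ratio(gene_a, gene_b, 2.5, 1.0, 0.1)
dissimilar = toy_log_evidence_ratio(gene_a, gene_c, 2.5, 1.0, 0.1)
```

A positive score favours the shared hypothesis and a negative score favours the separate one, so `similar` comes out above zero and `dissimilar` below it. Unlike Euclidean distance, the score depends on how plausible the observed values are under the prior, which is the general idea the abstract points to when it mentions using global dataset characteristics as prior knowledge.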

Recommended

Article Biochemical Research Methods

Adversarial generation of gene expression data

Ramon Vinas, Helena Andres-Terre, Pietro Lio, Kevin Bryson

Summary: The study developed a method based on conditional generative adversarial networks to generate realistic transcriptomics data for Escherichia coli and humans. Results showed that the approach performed better in preserving gene expression properties compared to existing simulators, maintaining tissue- and cancer-specific attributes, and exhibiting real gene clusters and ontologies at different scales.

BIOINFORMATICS (2022)

Article Biochemical Research Methods

Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases

Paul Scherer, Maja Trebacz, Nikola Simidjievski, Ramon Vinas, Zohreh Shams, Helena Andres Terre, Mateja Jamnik, Pietro Lio

Summary: Gene expression data is often high dimensional, noisy, and has a low number of samples, making it challenging for learning algorithms. In this article, a method called Gene Interaction Network Constrained Construction (GINCCo) is proposed to construct computational graph models for gene expression data by incorporating the structure of gene interaction networks. The results of a case study on cancer phenotype prediction tasks show that GINCCo outperforms other models while greatly reducing model complexity.

BIOINFORMATICS (2022)

Article Neurosciences

Metabolite and lipoprotein profiles reveal sex-related oxidative stress imbalance in de novo drug-naive Parkinson's disease patients

Gaia Meoni, Leonardo Tenori, Sebastian Schade, Cristina Licari, Chiara Pirazzini, Maria Giulia Bacalini, Paolo Garagnani, Paola Turano, Claudia Trenkwalder, Claudio Franceschi, Brit Mollenhauer, Claudio Luchinat

Summary: Parkinson's disease is the neurological disorder with the highest increase in prevalence. The lack of precise diagnosis at early stages remains a challenge. Metabolomics has provided valuable insights into the molecular basis of PD and potential biomarkers for early detection and treatment efficacy. In this study, NMR was used to analyze serum samples from German PD patients, revealing more pronounced pathological characteristics in male patients and confirming altered levels of acetone and cholesterol. Additionally, stronger oxidative stress markers were detected.

NPJ PARKINSONS DISEASE (2022)

Article Multidisciplinary Sciences

SCORPION is a stacking-based ensemble learning framework for accurate prediction of phage virion proteins

Saeed Ahmad, Phasit Charoenkwan, Julian M. W. Quinn, Mohammad Ali Moni, Md Mehedi Hasan, Pietro Lio, Watshara Shoombuatong

Summary: In this study, a computational approach called SCORPION is proposed for the accurate identification of phage virion proteins (PVPs) using only protein primary sequences. By exploring various feature descriptors and machine learning algorithms, optimal baseline models were constructed and a two-step feature selection strategy was used to determine the optimal feature vector. Results demonstrate that SCORPION outperforms existing methods and shows superior predictive performance.

SCIENTIFIC REPORTS (2022)

Article Multidisciplinary Sciences

AMYPred-FRL is a novel approach for accurate prediction of amyloid proteins by using feature representation learning

Phasit Charoenkwan, Saeed Ahmed, Chanin Nantasenamat, Julian M. W. Quinn, Mohammad Ali Moni, Pietro Lio, Watshara Shoombuatong

Summary: This study presents a novel meta-predictor, AMYPred-FRL, which utilizes a feature representation learning approach to identify amyloid proteins more accurately. By combining multiple machine learning algorithms and sequence-based feature descriptors, AMYPred-FRL generates 60 probabilistic features and forms a hybrid model. Through cross-validation and independent tests, AMYPred-FRL outperforms existing methods in predictive performance.

SCIENTIFIC REPORTS (2022)

Article Computer Science, Information Systems

Heterogeneous Model Fusion Federated Learning Mechanism Based on Model Mapping

Xiaofeng Lu, Yuying Liao, Chao Liu, Pietro Lio, Pan Hui

Summary: This article proposes a heterogeneous model fusion federated learning mechanism to address the resource waste issue caused by computing power imbalance in IoT devices. It trains learning models of different scales on devices with lower computing power and evaluates the effectiveness of the method through experiments.

IEEE INTERNET OF THINGS JOURNAL (2022)

Article Biology

SAPPHIRE: A stacking-based ensemble learning framework for accurate prediction of thermophilic proteins

Phasit Charoenkwan, Nalini Schaduangrat, Mohammad Ali Moni, Pietro Lio, Balachandran Manavalan, Watshara Shoombuatong

Summary: This study presents a novel computational method, SAPPHIRE, for accurately identifying thermophilic proteins (TPPs) using sequence information. The method combines different feature encodings and machine learning algorithms to train baseline models and extract key information of TPPs. SAPPHIRE outperforms existing methods in terms of predictive performance and achieves higher accuracy and correlation coefficient.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Biology

NEPTUNE: A novel computational approach for accurate and large-scale identification of tumor homing peptides

Phasit Charoenkwan, Nalini Schaduangrat, Pietro Lio, Mohammad Ali Moni, Balachandran Manavalan, Watshara Shoombuatong

Summary: This study proposes a novel computational approach, NEPTUNE, for the accurate and large-scale identification of Tumor Homing Peptides (THPs) from sequence information. The results demonstrate that NEPTUNE achieves superior performance in THP prediction and improves interpretability using the SHapley additive explanations method.

COMPUTERS IN BIOLOGY AND MEDICINE (2022)

Article Computer Science, Artificial Intelligence

A deep graph neural network architecture for modelling spatio-temporal dynamics in resting-state functional MRI data

Tiago Azevedo, Alexander Campbell, Rafael Romero-Garcia, Luca Passamonti, Richard A. I. Bethlehem, Pietro Lio, Nicola Toschi

Summary: In this paper, a novel deep neural network architecture is proposed that combines graph neural networks and temporal convolutional networks for learning from both the spatial and temporal components of resting-state functional magnetic resonance imaging (rs-fMRI) data. The model is evaluated using samples from the UK Biobank and Human Connectome Project datasets, showing effectiveness and explainability-related features. This approach lays the groundwork for future deep learning architectures focused on the spatio-temporal nature of rs-fMRI data.

MEDICAL IMAGE ANALYSIS (2022)

Article Oncology

Cell graph neural networks enable the precise prediction of patient survival in gastric cancer

Yanan Wang, Yu Guang Wang, Changyuan Hu, Ming Li, Yanan Fan, Nina Otter, Ikuan Sam, Hongquan Gou, Yiqun Hu, Terry Kwok, John Zalcberg, Alex Boussioutas, Roger J. Daly, Guido Montufar, Pietro Lio, Dakang Xu, Geoffrey I. Webb, Jiangning Song

Summary: This study proposes an AI-powered digital staging system that analyzes spatial patterns in the tumor microenvironment to accurately predict survival rates and staging of gastric cancer patients. The results show outstanding model performance and significant improvement over traditional staging systems.

NPJ PRECISION ONCOLOGY (2022)

Article Biochemical Research Methods

Modular Multi-Source Prediction of Drug Side-Effects With DruGNN

Pietro Bongini, Franco Scarselli, Monica Bianchini, Giovanna Maria Dimitri, Niccolo Pancino, Pietro Lio

Summary: Drug side-effects have a significant impact on public health, care system costs, and drug discovery processes. Predicting the probability of side-effects before their occurrence is crucial to reduce this impact, especially in drug discovery. By integrating heterogeneous data into a graph dataset, this study successfully utilizes Graph Neural Networks (GNNs) to predict drug side-effects, showing promising results. The experimental results highlight the significance of utilizing relationships between data entities and suggest potential future developments in this field.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2023)

Article Computer Science, Information Systems

SafetyMed: A Novel IoMT Intrusion Detection System Using CNN-LSTM Hybridization

Nuruzzaman Faruqui, Mohammad Abu Yousuf, Md Whaiduzzaman, A. K. M. Azad, Salem A. Alyami, Pietro Lio, Muhammad Ashad Kabir, Mohammad Ali Moni

Summary: The Internet of Medical Things (IoMT) has become an attractive target for cybercriminals due to its market value and rapid growth. However, IoMT devices have limited computational capabilities, making them vulnerable to cyber-attacks. To address this, a novel Intrusion Detection System (IDS) called SafetyMed is proposed, which combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks to defend against intrusion from sequential and grid data. SafetyMed has shown high detection rates and accuracy, making it a potential game-changer in vulnerable sectors like the medical industry.

ELECTRONICS (2023)

Article Health Care Sciences & Services

Adverse Effects of COVID-19 Vaccination: Machine Learning and Statistical Approach to Identify and Classify Incidences of Morbidity and Postvaccination Reactogenicity

Md. Martuza Ahamad, Sakifa Aktar, Md. Jamal Uddin, Md. Rashed-Al-Mahfuz, A. K. M. Azad, Shahadat Uddin, Salem A. Alyami, Iqbal H. Sarker, Asaduzzaman Khan, Pietro Lio, Julian M. W. Quinn, Mohammad Ali Moni

Summary: Good vaccine safety and reliability are crucial for countering infectious diseases effectively. This study aims to reduce adverse reactions to COVID-19 vaccines by identifying common factors through patient data analysis and classification. Patient medical histories and postvaccination effects were examined, and statistical and machine learning approaches were used. The analysis revealed that prior illnesses, hospital admissions, and SARS-CoV-2 reinfection were significantly associated with poor patient reactions.

HEALTHCARE (2023)

Proceedings Paper Acoustics

ROBUST AND EFFICIENT UNCERTAINTY AWARE BIOSIGNAL CLASSIFICATION VIA EARLY EXIT ENSEMBLES

Alexander Campbell, Lorena Qendro, Pietro Lio, Cecilia Mascolo

Summary: This article proposes an approach for estimating predictive uncertainty using early exit ensembles. Empirical evaluation shows that this method performs well in terms of accuracy and uncertainty metrics, while also providing significant computational speed-up and memory reduction compared to single model baselines.

2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) (2022)

Review Biology

EMPIRICAL COMPARISON AND ANALYSIS OF MACHINE LEARNING-BASED PREDICTORS FOR PREDICTING AND ANALYZING OF THERMOPHILIC PROTEINS

Phasit Charoenkwan, Nalini Schaduangrat, Md Mehedi Hasan, Mohammad Ali Moni, Pietro Lio, Watshara Shoombuatong

Summary: This article comprehensively investigates 14 state-of-the-art TPP predictors and summarizes their characteristics and advantages and disadvantages. Through comparative analysis, it provides future perspectives for the development of more accurate and efficient TPP predictors.

EXCLI JOURNAL (2022)
