Article
Medical Informatics
Luke T. Slater, Sophie Russell, Silver Makepeace, Alexander Carberry, Andreas Karwath, John A. Williams, Hilary Fanning, Simon Ball, Robert Hoehndorf, Georgios Gkoutos
Summary: Semantic similarity is a valuable tool in biomedical analysis, especially for patient phenotype analysis in clinical tasks. This study developed a reproducible benchmarking platform to evaluate patient phenotype similarity in uncurated phenotype profiles. The results showed that term-specificity and annotation-frequency measures performed the best among the evaluated configurations.
BMC MEDICAL INFORMATICS AND DECISION MAKING
(2022)
Article
Computer Science, Artificial Intelligence
Maryam Daniali, Peter D. Galer, David Lewis-Smith, Shridhar Parthasarathy, Edward Kim, Dario D. Salvucci, Jeffrey M. Miller, Scott Haag, Ingo Helbig
Summary: The Human Phenotype Ontology (HPO) is a standardized dictionary of clinical phenotypic terms used in precision medicine. This study presents a novel approach to phenotype representation by incorporating phenotypic frequencies based on a large dataset. The proposed embedding technique exceeds current models in identifying phenotypic similarities, showing high agreement with experts' judgment.
ARTIFICIAL INTELLIGENCE IN MEDICINE
(2023)
Article
Biochemical Research Methods
Shujie Ren, Liang Yu, Lin Gao
Summary: In this study, we propose a pretraining framework called MGP-DR for drug pair representation learning. By integrating drug molecular graph information and target information, the model utilizes self-supervised learning strategies to predict drug-drug interactions and drug combinations. It achieves promising performance across multiple metrics compared to other state-of-the-art methods.
Article
Biochemistry & Molecular Biology
Petri Toronen, Liisa Holm
Summary: The advent of next-generation sequencing technology has resulted in a massive increase in gene catalogs for new genomes, transcriptomes, and metagenomes that require computational inference for functional annotation. PANNZER is a high-throughput functional annotation web server that supports annotation of up to 100,000 protein sequences and provides Gene Ontology annotations and free text description predictions. Two case studies highlight issues related to data quality and method evaluation, arguing that commonly used evaluation metrics and datasets may bias the development of automated function prediction methods.
Article
Biochemical Research Methods
Gokhan Ozsari, Ahmet Sureyya Rifaioglu, Ahmet Atakan, Tunca Dogan, Maria Jesus Martin, Rengul Cetin Atalay, Volkan Atalay
Summary: In this study, the researchers propose SLPred, an ensemble-based multi-view and multi-label protein subcellular localization prediction tool, which provides predictions for nine main subcellular locations using independent machine-learning models. The results show that SLPred outperforms other tools in most cases.
Article
Biology
Luke T. Slater, Andreas Karwath, John A. Williams, Sophie Russell, Silver Makepeace, Alexander Carberry, Robert Hoehndorf, Georgios Gkoutos
Summary: This study developed a method to extract patient phenotype profiles from clinical narrative text and used semantic similarity to classify primary patient diagnosis. The results showed that uncurated text phenotypes can be a powerful tool for the differential diagnosis of common diseases.
COMPUTERS IN BIOLOGY AND MEDICINE
(2021)
Article
Biochemical Research Methods
Malcolm E. Fisher, Erik Segerdell, Nicolas Matentzoglu, Mardi J. Nenni, Joshua D. Fortriede, Stanley Chu, Troy J. Pells, David Osumi-Sutherland, Praneet Chaturvedi, Christina James-Zorn, Nivitha Sundararaj, Vaneet S. Lotay, Virgilio Ponferrada, Dong Zhuo Wang, Eugene Kim, Sergei Agalakov, Bradley Arshinoff, Kamran Karimi, Peter D. Vize, Aaron M. Zorn
Summary: This article introduces the design and application of Xenopus Phenotype Ontology (XPO), which annotates phenotypic data in Xenopus experiments and enables interoperability and ontology management with other species. The XPO combines different ontologies to facilitate the integration and management of phenotypic data.
BMC BIOINFORMATICS
(2022)
Article
Biochemical Research Methods
Yang Li, Wang Keqi, Guohua Wang
Summary: The article introduces a novel approach to compute disease similarity by integrating disease-related genes and gene ontology hierarchy to learn disease representation based on deep representation learning. In the experiments, the AUC value of this method is 0.8074, improving the most competitive baseline method by 10.1%.
Article
Computer Science, Information Systems
Shimaa Ibrahim, Said Fathalla, Jens Lehmann, Hajira Jabeen
Summary: This paper proposes a Multilingual Ontology Matching (MoMatch) approach for matching ontologies in different natural languages. It uses machine translation and various string similarity techniques to identify correspondences across different ontologies. The paper also presents a Quality Assessment Suite for Ontologies (QASO) that evaluates the quality of the matching process and the ontology. The results show that MoMatch outperforms five state-of-the-art matching approaches in terms of precision, recall, and F-measure.
Article
Biochemical Research Methods
Pieter Verschaffelt, Tim Van den Bossche, Wassim Gabriel, Michal Burdukiewicz, Alessio Soggiu, Lennart Martens, Bernhard Y. Renard, Henning Schiebenhoefer, Bart Mesuere
Summary: The study of microbiomes has become increasingly important, leading to the emergence of tools like MegaGO, which calculates functional similarity between data sets using semantic similarity between Gene Ontology terms. MegaGO is user-friendly, high-performing, and available as a web application or standalone command line tool.
JOURNAL OF PROTEOME RESEARCH
(2021)
Article
Biochemical Research Methods
Parnal Joshi, Sagnik Banerjee, Xiao Hu, Pranav M. Khade, Iddo Friedberg
Summary: With the rise in genomic data from sequencing technologies, the functions of many gene products remain unknown. High-throughput experiments are being conducted to address this gap, but the resulting annotations are biased towards less informative Gene Ontology terms. GOThresher, a Python tool, is introduced to identify and remove biases in protein function annotation databases, which is crucial for accurate understanding of protein function and training unbiased machine learning methods.
Article
Biochemical Research Methods
Weiqi Zhai, Xiaodi Huang, Nan Shen, Shanfeng Zhu
Summary: HPO-based approaches are popular for genomic diagnostics of rare diseases, but they do not fully utilize available information on disease and patient phenotypes. We present a new method called Phen2Disease that prioritizes diseases and genes using semantic similarity between phenotype sets. Our experiments show that Phen2Disease outperforms state-of-the-art methods, especially in cohorts with fewer HPO terms. We also find that patients with higher information content scores have more accurate predictions. Phen2Disease provides ranked diseases and patient HPO terms, offering a novel approach for rare disease diagnostics.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Immunology
Willem Maassen, Geertje Legger, Ovgu Kul Cinar, Paul van Daele, Marco Gattorno, Brigitte Bader-Meunier, Carine Wouters, Tracy Briggs, Lennart Johansson, Joeri van der Velde, Morris Swertz, Ebun Omoyinmi, Esther Hoppenreijs, Alexandre Belot, Despina Eleftheriou, Roberta Caorsi, Florence Aeschlimann, Guilaine Boursier, Paul Brogan, Matthias Haimel, Marielle van Gijn
Summary: This study demonstrates that improved curation of HPO terms can increase the accuracy of diagnosis for systemic autoinflammatory diseases, highlighting the high potential of HPO-based genome diagnostics in this disease category.
FRONTIERS IN IMMUNOLOGY
(2023)
Article
Computer Science, Information Systems
Sawsan Almahmoud, Bassam Hammo, Bashar Al-Shboul, Nadim Obeid
Summary: This paper proposes a hybrid approach using two-level fingerprints to detect illegitimate non-human traffic, and experimental results show that it can effectively detect fake clicks.
MULTIMEDIA TOOLS AND APPLICATIONS
(2022)
Article
Computer Science, Artificial Intelligence
Sengodan Mani, Samukutty Annadurai
Summary: A new modified model of similarity spreading for ontology mapping is proposed in this paper, which aims to address the heterogeneity issue between ontologies for interoperability. By utilizing node clustering based on edge affinity and coefficient similarity propagation, the model achieves graph matching. The evaluation shows that the proposed model outperforms similar systems.
INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS
(2022)
Article
Biochemical Research Methods
Tao Wang, Yongzhuang Liu, Quanwei Yin, Jiaquan Geng, Jin Chen, Xipeng Yin, Yongtian Wang, Xuequn Shang, Chunwei Tian, Yadong Wang, Jiajie Peng
Summary: QTL analyses of multiomic molecular traits play a significant role in inferring the functional effects of genome variants. However, limited study sample size restricts QTL discovery and leads to missing molecular trait-variant associations. This study presents xQTLImp, a computational framework, to efficiently impute missing molecular QTL associations. Experimental results demonstrate high imputation accuracy and novel QTL discovery ability of xQTLImp.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Biochemical Research Methods
Wei Wang, Ruijiang Han, Menghan Zhang, Yuxian Wang, Tao Wang, Yongtian Wang, Xuequn Shang, Jiajie Peng
Summary: BrainMI is a novel framework that integrates brain connectome data and molecular-based gene association networks to predict brain disease genes. It constructs a new gene network based on resting-state functional magnetic resonance imaging data and brain region-specific gene expression data, and utilizes a multiple network integration method to learn low-dimensional features of genes. BrainMI achieves higher performance in predicting brain disease genes compared to existing state-of-the-art methods.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Computer Science, Information Systems
Jiajie Peng, Jinjin Yang, D. Vijay Anand, Xuequn Shang, Kelin Xia
Summary: The packing of genomic DNA into highly ordered hierarchical assemblies greatly affects chromosome flexibility, dynamics, and functions. This study proposes an FRI-based model to quantify chromosome flexibility, which shows better accuracy and computational efficiency compared to the Gaussian network model (GNM). The model is based on the correlation between flexibility index and measurements for chromosome accessibility, and it can easily incorporate interchromosome interactions for improved accuracy.
FRONTIERS OF COMPUTER SCIENCE
(2022)
Article
Biotechnology & Applied Microbiology
Yongtian Wang, Liran Juan, Jiajie Peng, Tao Wang, Tianyi Zang, Yadong Wang
Summary: This paper proposes a computational model for exploring metabolite-disease pairs; in validation, it performs well at predicting potential disease-related metabolites. The results show that DLMPM prioritizes candidate disease-related metabolites better than previous methods and could help researchers reveal more information about human diseases.
Correction
Biochemical Research Methods
Tao Wang, Yongzhuang Liu, Quanwei Yin, Jiaquan Geng, Jin Chen, Xipeng Yin, Yongtian Wang, Xuequn Shang, Chunwei Tian, Yadong Wang, Jiajie Peng
BRIEFINGS IN BIOINFORMATICS
(2022)
Editorial Material
Genetics & Heredity
Tao Wang, Miguel E. Renteria, Jiajie Peng
FRONTIERS IN GENETICS
(2022)
Article
Biochemical Research Methods
Cheng Zhong, Kangenbei Liao, Wei Chen, Qianlong Liu, Baolin Peng, Xuanjing Huang, Jiajie Peng, Zhongyu Wei
Summary: This study proposes a hierarchical dialog system model for disease diagnosis, which achieves better accuracy and symptom recall compared to existing systems. The model integrates a two-level policy structure and is capable of handling a large number of diseases and symptoms.
Article
Neurosciences
Shuhui Liu, Yupei Zhang, Jiajie Peng, Tao Wang, Xuequn Shang
Summary: Mathematical learning has been found to significantly impact the plasticity and cognitive functions of the brain. This study identifies non-math students using magnetic resonance imaging scans (MRIs) and employs subspace enhanced contrastive learning and multiple-layer-perceptron models for student classification.
Article
Biochemical Research Methods
Wei Chen, Zhiwei Li, Hongyi Fang, Qianyuan Yao, Cheng Zhong, Jianye Hao, Qi Zhang, Xuanjing Huang, Jiajie Peng, Zhongyu Wei
Summary: In this article, two frameworks are proposed to support automatic medical consultation, which are doctor-patient dialogue understanding and task-oriented interaction. A new large medical dialogue dataset with multi-level fine-grained annotations is created, and five independent tasks are established, including named entity recognition, dialogue act classification, symptom label inference, medical report generation, and diagnosis-oriented dialogue policy. Benchmark results for each task are reported to demonstrate the usability of the dataset and establish a baseline for future studies.
Article
Multidisciplinary Sciences
Yafei Dai, Qiangqiang Zhang, Fei Wu, Jiajie Peng, Xiaobao Xu, Quansheng Du, Qing Pan, Yongjun Chen
Summary: With the development of natural science, interdisciplinary scientific research has become an inevitable trend in pursuit of scientific and technological innovations. Countries and regions like the United States, European Union, and China have established institutions to promote interdisciplinary research, but face challenges such as disciplinary barriers and funding mechanisms. To encourage interdisciplinary research, the National Natural Science Foundation of China established the Department of Interdisciplinary Sciences in 2020, aiming to create a culture of interdisciplinary cooperation and reform funding mechanisms.
CHINESE SCIENCE BULLETIN-CHINESE
(2023)
Article
Biochemical Research Methods
Yongtian Wang, Xinmeng Liu, Yewei Shen, Xuerui Song, Tao Wang, Xuequn Shang, Jiajie Peng
Summary: Circular RNAs (circRNAs) are important in biological processes and closely related to disease diagnosis, treatment, and inference. A computational model based on collaborative deep learning with circRNA multi-view functional annotations is proposed to predict potential circRNA-disease associations efficiently. The model shows better performance in predicting candidate disease-related circRNAs and has high practicality for the diagnosis and treatment of human diseases.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Biotechnology & Applied Microbiology
Shuhui Liu, Yupei Zhang, Jiajie Peng, Xuequn Shang
Summary: Analyzing cell-cell communication in the tumor micro-environment helps understand cancer progression and drug tolerance. Existing methods based on known molecular interactions have limitations in predicting cellular communications. In this study, we propose an improved hierarchical variational autoencoder (HiVAE) model that utilizes single-cell RNA-seq data to estimate cell-cell communication scores.
BRIEFINGS IN FUNCTIONAL GENOMICS
(2023)
Article
Biochemical Research Methods
Tao Wang, Jinjin Yang, Yifu Xiao, Jingru Wang, Yuxian Wang, Xi Zeng, Yongtian Wang, Jiajie Peng
Summary: Drug-food interactions (DFIs) occur when constituents of food affect the bioaccessibility or efficacy of a drug by participating in its pharmacodynamic and/or pharmacokinetic processes. This article proposes a novel end-to-end graph embedding-based method named DFinder to identify DFIs. DFinder combines node attribute features and topological structure features to learn the representations of drugs and food constituents. The evaluation results indicate that DFinder outperforms other baseline methods.
Article
Biochemical Research Methods
Wei Chen, Cheng Zhong, Jiajie Peng, Zhongyu Wei
Summary: The automatic diagnostic system queries potential symptoms from patients and predicts possible diseases. Existing methods overlook the importance of symptom inquiry, resulting in low diagnostic accuracy. To address this, a new framework called DxFormer is proposed, which decouples symptom inquiry and disease diagnosis and optimizes them separately. Experimental results confirm that improving symptom recall can enhance diagnostic accuracy.
Proceedings Paper
Computer Science, Artificial Intelligence
Ruijiang Han, Wei Wang, Yuxi Long, Jiajie Peng
Summary: In this work, a post-processing unsupervised deep representation debiasing algorithm called DeepMinMax is proposed, which obtains unbiased representations directly from pre-trained representations without re-training or fine-tuning the entire model. Experimental results on synthetic and real-world datasets show that DeepMinMax outperforms existing state-of-the-art algorithms on downstream tasks.
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE
(2022)