Review
Genetics & Heredity
Wenbin Ye, Qiwei Lian, Congting Ye, Xiaohui Wu
Summary: Alternative polyadenylation (APA) plays important roles in modulating mRNA stability, translation, and subcellular localization, and contributes extensively to shaping eukaryotic transcriptome complexity and proteome diversity. In this article, we provide an exhaustive overview of computational approaches for predicting poly(A) sites (pAs) from DNA sequences, bulk RNA sequencing (RNA-seq) data, and single-cell RNA sequencing (scRNA-seq) data. We examine several representative tools and provide operable suggestions on assessing the reliability of pAs predicted by different tools. We also propose practical guidelines on choosing appropriate methods applicable to diverse scenarios and discuss the challenges and opportunities in improving pA prediction using new techniques.
GENOMICS PROTEOMICS & BIOINFORMATICS
(2023)
Article
Genetics & Heredity
Qihan Long, Yangyang Yuan, Miaoxin Li
Summary: The RNA-SSNV framework allows for the accurate identification of expressed somatic mutations and enables a more insightful analysis of cancer driver genes and carcinogenic mechanisms.
FRONTIERS IN GENETICS
(2022)
Review
Biochemical Research Methods
Erik Christensen, Ping Luo, Andrei Turinsky, Mia Husic, Alaina Mahalanabis, Alaine Naidas, Juan Javier Diaz-Mejia, Michael Brudno, Trevor Pugh, Arun Ramani, Parisa Shooshtari
Summary: Single-cell RNA sequencing (scRNA-seq) clustering and labelling methods are used to determine cellular composition. This study assesses different scRNA-seq labelling algorithms using cancer datasets. Results show that cell-based methods generally outperform cluster-based methods. The study also provides suggestions for algorithm selection.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Biochemical Research Methods
Xuan Liu, Sara J. C. Gosline, Lance T. Pflieger, Pierre Wallet, Archana Iyer, Justin Guinney, Andrea H. Bild, Jeffrey T. Chang
Summary: Single-cell RNA sequencing (scRNA-Seq) is a promising strategy for characterizing immune cell populations, but manually classifying immune cells from transcriptional profiles remains a challenge. ImmClassifier, a machine learning classifier, has been shown to be effective in distinguishing fine-grained cell types in scRNA-Seq experiments, outperforming other tools in precision and recall.
BRIEFINGS IN BIOINFORMATICS
(2021)
Review
Biochemical Research Methods
Erik Christensen, Ping Luo, Andrei Turinsky, Mia Husic, Alaina Mahalanabis, Alaine Naidas, Juan Javier Diaz-Mejia, Michael Brudno, Trevor Pugh, Arun Ramani, Parisa Shooshtari
Summary: This study evaluates 17 cell-based and 9 cluster-based scRNA-seq labelling algorithms using 8 cancer datasets, providing a comprehensive assessment of these methods in a cancer-specific context. The results show that cell-based methods generally outperform cluster-based methods in both prediction performance and speed. Additionally, larger cell numbers in training data have a positive impact on prediction scores for cell-based methods. The best-performing algorithms are scPred and SVM, and suggestions for algorithm selection are provided.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Biotechnology & Applied Microbiology
Chengkui Zhao, Qi Cheng, Weixin Xie, Jiayu Xu, Siwen Xu, Ying Wang, Weixing Feng
Summary: In this study, a computational method was developed to infer the expression level of miRNA at the single-cell level. The method was applied to single-cell data from breast cancer patients, leading to the discovery of potential miRNA markers.
Article
Biochemistry & Molecular Biology
Andrew Jiang, Klaus Lehnert, Linya You, Russell G. Snell
Summary: ICARUS is a web server designed to assist inexperienced R users in conducting single-cell RNA-seq analysis. It features an intuitive tutorial-style user interface, allowing users to navigate through various preprocessing, analysis, and visualization steps. ICARUS can be accessed easily through a dedicated web server and offers a range of features, including quality control, dimensionality reduction, and cell clustering. It supports multiple organisms and allows for differential expression analysis and gene set enrichment analysis.
NUCLEIC ACIDS RESEARCH
(2022)
Article
Biochemistry & Molecular Biology
Andrew Jiang, Klaus Lehnert, Linya You, Russell G. Snell
Summary: ICARUS is a web server that allows users without experience in R to perform single-cell RNA-seq analysis. It features an intuitive tutorial-style user interface for logical navigation through preprocessing, analysis, and visualization steps. ICARUS can be accessed through a dedicated web server and does not require software installation. Users can apply quality control thresholds, adjust parameters for dimensionality reduction and cell clustering, and visualize data using 2D/3D UMAP and t-SNE plots. ICARUS also offers flexible differential expression analysis and gene set enrichment analysis. It supports multiple organisms and can handle multimodal data. Users can save the analysis state, and inexperienced users can complete a full analysis in 1-2 hours.
NUCLEIC ACIDS RESEARCH
(2022)
Article
Biochemical Research Methods
Dylan Kotliar, Andres Colubri
Summary: Sciviewer is an interactive tool for visualizing cells within Python, with a novel method to identify genes locally varying along any user-specified direction on the embedding. It enables rapid and flexible iteration between interactive and programmatic modes of scRNA-Seq exploration, demonstrating an effective approach for analyzing high-dimensional data.
Article
Computer Science, Information Systems
Weiping Zhu, Jiaojiao Chen, Lin Xu, Jiannong Cao
Summary: In this study, a method called interactive group recognizing (IGR) is proposed to improve group recognition accuracy by collecting sensing data from individuals to deduce their interactions.
COMPUTER COMMUNICATIONS
(2022)
Article
Multidisciplinary Sciences
Richard C. Tyser, Elmir Mahammadov, Shota Nakanoh, Ludovic Vallier, Antonio Scialdone, Shankar Srinivas
Summary: The single-cell transcriptional profile of a human embryo between 16 and 19 days after fertilization shows similarities and differences in gastrulation compared to mouse and non-human primate models. This study provides new insights into human development and offers valuable information for directed differentiation of human cells in vitro.
Article
Computer Science, Artificial Intelligence
Mothe Rajesh, Sheshikala Martha
Summary: Single-cell RNA sequencing technology is utilized to analyze the transcriptomes of individual cells and identify rare cell populations. Traditional methods struggle to analyze transcriptomic profiles on a single-cell level, thus machine learning techniques have become crucial. In this study, we analyzed single-cell RNA sequencing data using linear dimensional reduction, identification of highly variable features, cell clustering, nonlinear dimensional reduction, and identification of gene markers. This analysis is important for identifying transcriptomic challenges and heterogeneity in cellular characteristics. This work is intended to assist researchers in bioinformatics and computational biology who study single-cell RNA sequencing data.
Article
Multidisciplinary Sciences
Kai Battenberg, S. Thomas Kelly, Radu Abu Ras, Nicola A. Hetherington, Makoto Hayashi, Aki Minoda
Summary: Single-cell RNA-sequencing analysis has gained popularity, and UniverSC is a universal tool for processing single-cell RNA-seq data on any platform. It provides a command-line tool, docker image, and containerized graphical application for consistent and comprehensive integration, comparison, and evaluation of data from various platforms. Additionally, a cross-platform application with a graphical user interface is available to address the bottleneck of data processing for researchers without bioinformatics expertise.
NATURE COMMUNICATIONS
(2022)
Article
Biochemical Research Methods
Malte D. Luecken, M. Buettner, K. Chaichoompu, A. Danese, M. Interlandi, M. F. Mueller, D. C. Strobl, L. Zappia, M. Dugas, M. Colome-Tatche, Fabian J. Theis
Summary: This study benchmarked 68 method and preprocessing combinations on 85 batches of gene expression data, highlighting the importance of highly variable gene selection in improving method performance. When dealing with complex integration tasks, scANVI, Scanorama, scVI, and scGen consistently performed well, while the performance of single-cell ATAC-sequencing integration was strongly influenced by the choice of feature space.
Article
Biotechnology & Applied Microbiology
Kazi Ferdous Mahin, Md Robiuddin, Mujahidul Islam, Shayed Ashraf, Farjana Yeasmin, Swakkhar Shatabda
Summary: The paper proposes PanClassif, a method for cancer detection from RNA-seq data that improves the performance of various machine learning classifiers. The method outperforms existing methods and shows consistent performance across different datasets.
Article
Computer Science, Artificial Intelligence
Xiucai Ye, Hongmin Li, Akira Imakura, Tetsuya Sakurai
Review
Biochemistry & Molecular Biology
Yuyang Xue, Xiucai Ye, Lesong Wei, Xin Zhang, Tetsuya Sakurai, Leyi Wei
Summary: The Transformer model has become the mainstream model in natural language processing, and bioinformatics has made remarkable progress in drug design and protein property prediction through machine learning. This study proposes a deep model-based method called CPPFormer, which achieves the best performance in drug penetration tasks for cell-penetrating peptides (CPPs) by implementing the attention structure of the Transformer and a feature extractor.
CURRENT MEDICINAL CHEMISTRY
(2022)
Article
Computer Science, Information Systems
Tian-Fu Lee, Xiucai Ye, Syuan-Han Lin
Summary: Group authenticated key agreements (GAKAs) based on physical unclonable functions (PUFs) address the security and efficiency problems of the Internet of Medical Things (IoMT) by enabling authentication and secure information exchange among medical sensor devices.
IEEE INTERNET OF THINGS JOURNAL
(2022)
Article
Biochemical Research Methods
Zhao-Yue Zhang, Lin Ning, Xiucai Ye, Yu-He Yang, Yasunori Futamura, Tetsuya Sakurai, Hao Lin
Summary: The location of miRNAs in cells plays a crucial role in their regulatory function. Current prediction algorithms for miRNA subcellular localization have limitations. In this study, a new data partitioning strategy and deep learning algorithm were proposed to accurately predict miRNA subcellular localization and explore the underlying mechanisms through motif analysis. Additionally, a user-friendly web server was established for convenient use.
BRIEFINGS IN BIOINFORMATICS
(2022)
Article
Physics, Multidisciplinary
Jing Zhang, Yanzi Li, Qian Ding, Liwei Lin, Xiucai Ye
Summary: The paper introduces a semantics- and prediction-based differential privacy protection scheme for trajectory data that ensures data privacy while balancing the privacy budget against service quality.
Article
Chemistry, Analytical
Tian-Fu Lee, Xiucai Ye, Wei-Yu Chen, Chi-Chang Chang
Summary: The Tactile Internet plays a significant role in electronic medicine, but inappropriate and insecure authentication key agreements may endanger patients' lives. Researchers propose an enhanced scheme to address the limitations of the Kamil scheme, providing better security functionalities and maintaining a lightweight computational cost.
Article
Biochemical Research Methods
Xin Zhang, Lesong Wei, Xiucai Ye, Kai Zhang, Saisai Teng, Zhongshen Li, Junru Jin, Minjae Kim, Tetsuya Sakurai, Lizhen Cui, Balachandran Manavalan, Leyi Wei
Summary: A novel deep learning framework SiameseCPP is proposed for automated prediction of cell-penetrating peptides (CPPs). SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network comprising a transformer and gated recurrent units. Comprehensive experiments demonstrate that SiameseCPP outperforms existing baseline models for CPP prediction and exhibits satisfactory generalization ability on other functional peptide datasets.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Biotechnology & Applied Microbiology
Yangyang Chen, Zixu Wang, Xiangxiang Zeng, Yayang Li, Pengyong Li, Xiucai Ye, Tetsuya Sakurai
Summary: Language models can learn complex molecular distributions. This research investigates the differences between RNNs and the Transformer-Layer in learning complex molecular distributions and evaluates their performance on various molecular generation tasks. The results show that both language models can learn complex molecular distributions and the SMILES-based representation outperforms SELFIES.
BRIEFINGS IN FUNCTIONAL GENOMICS
(2023)
Article
Biochemical Research Methods
Shihu Jiao, Xiucai Ye, Chunyan Ao, Tetsuya Sakurai, Quan Zou, Lei Xu
Summary: The rapid and extensive transmission of SARS-CoV-2 has caused a global health emergency, and identifying phosphorylation sites plays a crucial role in understanding the molecular mechanisms of infection. However, current prediction tools lack accuracy and efficiency. This study presents a comprehensive analysis of SARS-CoV-2 infection in human lung cells and introduces a deep learning predictor, PSPred-ALE, designed to identify phosphorylation sites. The predictor utilizes a self-adaptive learning embedding algorithm and a multihead attention module to improve prediction accuracy. Comparative analysis shows that PSPred-ALE outperforms existing prediction tools. The proposed model can assist biomedical scientists in understanding phosphorylation mechanisms in SARS-CoV-2 infection.
Article
Biology
Xiucai Ye, Yifan Shang, Tianyi Shi, Weihang Zhang, Tetsuya Sakurai
Summary: The increased availability of high-throughput technologies has allowed researchers to study disease etiology across multiple omics layers, particularly in improving cancer subtype identification. In this study, a novel multi-omics clustering method called MCLS is proposed, which effectively deals with missing multi-omics data. The method utilizes complete multi-omics data to construct a latent subspace using feature extraction and decomposition techniques, and then applies spectral clustering to identify clusters. Experimental results demonstrate that MCLS outperforms state-of-the-art methods in cancer subtype identification, providing valuable insights into cancer research.
COMPUTERS IN BIOLOGY AND MEDICINE
(2023)
Article
Biochemical Research Methods
Dong Huang, Xiucai Ye, Ying Zhang, Tetsuya Sakurai
Summary: Collaborative analysis has become a promising approach for drug discovery with the availability of large-scale QSAR datasets. This paper proposes a novel framework for collaborative drug discovery using federated learning on non-IID datasets, addressing the difficulty of training on non-IID data by globally sharing a small subset of data among all institutions. The experimental results demonstrate competitive predictive accuracy while respecting data privacy.
Article
Biochemical Research Methods
Xin Zhang, Lesong Wei, Xiucai Ye, Kai Zhang, Saisai Teng, Zhongshen Li, Junru Jin, Min Jae Kim, Tetsuya Sakurai, Lizhen Cui, Balachandran Manavalan, Leyi Wei
Summary: In this study, a novel deep learning framework called SiameseCPP is proposed for automated prediction of CPPs. SiameseCPP utilizes contrastive learning to build a CPP predictive model and demonstrates superior performance and generalizability on multiple functional peptide datasets.
BRIEFINGS IN BIOINFORMATICS
(2022)
Proceedings Paper
Computer Science, Artificial Intelligence
Weihang Zhang, Xiucai Ye, Tetsuya Sakurai
Summary: This paper proposes a clustering method that automatically generates candidate cluster numbers and corresponding cluster partitions, and recovers a shared low-rank similarity matrix through ensemble learning. Spectral clustering is applied to detect the cluster number using the candidate cluster numbers. Experimental results demonstrate that the proposed method can accurately detect the cluster numbers and achieve better clustering results compared to existing methods.
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN)
(2022)
Article
Computer Science, Information Systems
Xiucai Ye, Weihang Zhang, Tetsuya Sakurai