Article
Biochemical Research Methods
Mingming Jiang, Bowen Zhao, Shenggan Luo, Qiankun Wang, Yanyi Chu, Tianhang Chen, Xueying Mao, Yatong Liu, Yanjing Wang, Xue Jiang, Dong-Qing Wei, Yi Xiong
Summary: This study developed an interpretable stacking model, NeuroPpred-Fuse, for neuropeptide prediction by fusing sequence-derived features with feature selection methods. The model achieved 90.6% accuracy and 95.8% AUC on the independent test set, outperforming current state-of-the-art models and demonstrating strong generalization ability.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Chemistry, Analytical
Valentina Brancato, Nadia Brancati, Giusy Esposito, Massimo La Rosa, Carlo Cavaliere, Ciro Allara, Valeria Romeo, Giuseppe De Pietro, Marco Salvatore, Marco Aiello, Mara Sangiovanni
Summary: Breast cancer is the most common cancer among women worldwide, and its heterogeneity can be predicted through radiomics using medical imaging. However, the lack of comprehensive datasets and a general methodology limits the routine use of radiomics in breast cancer clinical practice.
Article
Radiology, Nuclear Medicine & Medical Imaging
Aydin Demircioglu
Summary: In radiomic studies, performing feature selection before cross-validation can bias the resulting performance estimates. Feature selection should instead be conducted within cross-validation to reduce this bias.
INSIGHTS INTO IMAGING
(2021)
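The point made in the summary above can be illustrated with a small sketch. It is not taken from the cited article: the synthetic pure-noise data, the mean-difference feature ranking, the nearest-centroid classifier, and all function names are assumptions chosen only to keep the example dependency-free. Because the features carry no signal, an honest accuracy estimate should sit near 50%; selecting features on the full dataset before splitting lets the test folds influence the selection and typically inflates the estimate.

```python
import random
from statistics import mean


def make_noise_data(n_samples=40, n_features=300, seed=0):
    """Hypothetical data: pure-noise features with balanced, shuffled labels."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]
    rng.shuffle(y)
    return X, y


def select_top_k(X, y, k):
    """Rank features by absolute difference of class means; return top-k indices."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(mean(pos) - mean(neg)))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def nearest_centroid_predict(X_train, y_train, X_test, features):
    """Assign each test row to the class whose centroid is closest."""
    centroids = {}
    for label in (0, 1):
        rows = [row for row, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [mean(r[j] for r in rows) for j in features]
    preds = []
    for row in X_test:
        dist = {
            label: sum((row[j] - c) ** 2 for j, c in zip(features, centroids[label]))
            for label in (0, 1)
        }
        preds.append(min(dist, key=dist.get))
    return preds


def cv_accuracy(X, y, k=10, n_folds=5, select_inside=True):
    """k-fold CV accuracy, selecting features inside each fold or once up front."""
    n = len(X)
    # Leaky variant: the selection sees every sample, including future test folds.
    global_features = None if select_inside else select_top_k(X, y, k)
    correct = 0
    for fold in range(n_folds):
        test_idx = [i for i in range(n) if i % n_folds == fold]
        train_idx = [i for i in range(n) if i % n_folds != fold]
        X_tr, y_tr = [X[i] for i in train_idx], [y[i] for i in train_idx]
        X_te, y_te = [X[i] for i in test_idx], [y[i] for i in test_idx]
        # Unbiased variant: the selection only ever sees the training part of the fold.
        features = select_top_k(X_tr, y_tr, k) if select_inside else global_features
        preds = nearest_centroid_predict(X_tr, y_tr, X_te, features)
        correct += sum(p == t for p, t in zip(preds, y_te))
    return correct / n


X, y = make_noise_data()
print("selection before CV:", cv_accuracy(X, y, select_inside=False))
print("selection inside CV:", cv_accuracy(X, y, select_inside=True))
```

The only difference between the two runs is where `select_top_k` is called, which isolates the effect of the selection step from everything else in the evaluation.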
Article
Computer Science, Hardware & Architecture
Yosef Masoudi-Sobhanzadeh, Shabnam Emami-Moghaddam
Summary: This study proposes a machine learning-based method to predict botnets, addressing the limitations of existing methods in real-time application, functionality, and consideration of attack types. The results show that the proposed method accurately classifies data streams into relevant groups and achieves a trade-off between feature selection and prediction model performance.
Article
Computer Science, Artificial Intelligence
Felix Mohr, Jan N. van Rijn
Summary: Traditional cross-validation methods are slow and provide limited information on the learning process. This article introduces a new validation approach called learning curve cross-validation (LCCV), which iteratively increases the number of training instances. Experiments on 75 datasets show that LCCV performs comparably to 5/10-fold CV, with a maximum difference of 2.5% in performance, while significantly reducing runtime (median runtime reductions of over 50%).
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2023)
Article
Biochemical Research Methods
Shouzhi Chen, Yanhong Liao, Jianping Zhao, Yannan Bin, Chunhou Zheng
Summary: Due to the global outbreak of COVID-19 and its variants, antiviral peptides with anti-coronavirus activity (ACVPs) have become promising new drug candidates for the treatment of coronavirus infection. In this study, an efficient and reliable prediction model, PACVP, was constructed to identify ACVPs based on effective feature representation and a two-layer stacking learning framework.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2023)
Review
Automation & Control Systems
Jiarui Xie, Manuel Sage, Yaoyao Fiona Zhao
Summary: The progress of machine learning has provided new opportunities for gas turbine modelling. Feature selection and feature learning techniques are important for addressing the challenges in this field. This review paper examines 46 studies that utilized FSFL techniques for GT modelling, and provides a categorization framework and implementation recommendations.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
Article
Engineering, Industrial
Jiarui Xie, Chonghui Zhang, Manuel Sage, Mutahar Safdar, Yaoyao Fiona Zhao
Summary: Machine learning is a promising method for modeling production processes and predicting product quality. Data scarcity poses challenges in accurately modeling complex systems, especially for mass customization with high-variety low-volume products. This study introduces knowledge accumulation, extraction, and transfer (KAET) as a solution to the data scarcity problem. It proposes a sequential cross-product KAET (SeqTrans) approach that integrates data preparation, feature selection (FS), feature learning (FL), and transfer learning (TL) to address practical challenges and achieve effective knowledge transfer among multiple entities.
INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH
(2023)
Article
Genetics & Heredity
Minchao Jiang, Renfeng Zhang, Yixiao Xia, Gangyong Jia, Yuyu Yin, Pu Wang, Jian Wu, Ruiquan Ge
Summary: This article introduces a computational method called i2APP that can efficiently predict antiparasitic peptides. By extracting multi-level features and using machine learning algorithms for classification, this method achieves higher accuracy and AUC than existing methods on independent datasets.
FRONTIERS IN GENETICS
(2022)
Article
Biochemistry & Molecular Biology
Shihu Jiao, Quan Zou
Summary: A new predictor called iPVP-DRLF was developed to specifically and effectively identify plant vacuole proteins. By using hybrid features and the light gradient boosting machine algorithm, iPVP-DRLF outperforms other predictors in terms of accuracy. Experimental results also indicate that deep representation learning features have an advantage in the identification of plant vacuole proteins.
COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL
(2022)
Article
Computer Science, Information Systems
Joohwan Sung, Sungmin Han, Heesu Park, Soree Hwang, Song Joo Lee, Jong Woong Park, Inchan Youn
Summary: This study proposes a novel stroke severity classification method using symmetric gait features and RFECV. The experiment conducted on chronic stroke patients and elderly participants showed that the classification performance increased when using symmetric gait data and RFECV technique. These findings can assist clinicians in diagnosing stroke severity based on patient data obtained using ML technology.
Article
Engineering, Civil
Marco Zanetti, Elena Allegri, Anna Sperotto, Silvia Torresan, Andrea Critto
Summary: Due to climate change and urbanization, pluvial flooding is expected to increase in the future, posing threats to properties and people. However, current machine learning methods for pluvial flood risk assessment often neglect spatio-temporal constraints, leading to underestimation of prediction error. This paper proposes a novel machine learning methodology that incorporates feature selection and spatio-temporal cross-validation to improve the accuracy of pluvial flood risk prediction in the Metropolitan City of Venice.
JOURNAL OF HYDROLOGY
(2022)
Article
Biology
Ya-Bian Luo, Yan-Yao Hou, Zhen Wang, Xin-Man Hu, Wei Li, Yan Li, Yong Liu, Tong-Jiang Li, Chun-Zhi Ai
Summary: This study developed machine learning models to predict the metabolic properties of UGT1A1 substrates. The models demonstrated good accuracy and robustness, and were validated with in vitro assays. This strategy is important for optimizing drug metabolism and avoiding drug-drug interactions in clinical practice.
COMPUTERS IN BIOLOGY AND MEDICINE
(2022)
Article
Energy & Fuels
Ladislav Zjavka
Summary: Autonomous off-grid systems dependent upon Renewable Energy (RE) sources face challenges with fluctuating supplies and need self-adapting Power Quality (PQ) prediction models based on Artificial Intelligence (AI) to maintain stability and efficiency. The proposed multi-step PQ prediction algorithm gradually improves accuracy by developing AI models with increasing input PQ-parameters, enhancing the capability to approximate target quantities for unknown combinations of off-grid connected household appliances.
SUSTAINABLE ENERGY GRIDS & NETWORKS
(2021)
Article
Computer Science, Artificial Intelligence
Zhen Zhong, Guobao Xiao, Kun Zeng, Shiping Wang
Summary: In this work, we address the feature matching problem by proposing a novel end-to-end network called TSSN-Net. We introduce a Two-step Sparse Switchable Normalization Block to adaptively normalize different convolution layers and a Multi-Scale Correspondence Grouping algorithm to capture local information of correspondences. The experimental results demonstrate that our network achieves state-of-the-art performance on benchmark datasets.
Article
Biochemical Research Methods
Xin Zhang, Lesong Wei, Xiucai Ye, Kai Zhang, Saisai Teng, Zhongshen Li, Junru Jin, Minjae Kim, Tetsuya Sakurai, Lizhen Cui, Balachandran Manavalan, Leyi Wei
Summary: A novel deep learning framework SiameseCPP is proposed for automated prediction of cell-penetrating peptides (CPPs). SiameseCPP learns discriminative representations of CPPs based on a well-pretrained model and a Siamese neural network comprising a transformer and gated recurrent units. Comprehensive experiments demonstrate that SiameseCPP outperforms existing baseline models for CPP prediction and exhibits satisfactory generalization ability on other functional peptide datasets.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Biology
Phasit Charoenkwan, Chonlatip Pipattanaboon, Chanin Nantasenamat, Md Mehedi Hasan, Mohammad Ali Moni, Pietro Lio, Watshara Shoombuatong
Summary: Despite existing cancer therapies, the development of new and effective treatments is necessary to address the ongoing cancer recurrence and new cases. This study proposes a new machine learning-based approach, PSRTTCA, for improving the identification and characterization of tumor T cell antigens (TTCAs) based on their primary sequences.
COMPUTERS IN BIOLOGY AND MEDICINE
(2023)
Article
Biochemistry & Molecular Biology
Adeel Malik, Watshara Shoombuatong, Chang-Bae Kim, Balachandran Manavalan
Summary: A machine learning-based predictor called GPApred was developed to identify LPXTG-like proteins from their primary sequences. This predictor can be utilized for functional characterization and drug targeting in further research.
INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES
(2023)
Article
Biology
Saraswathy Nithiyanandam, Vinoth Kumar Sangaraju, Balachandran Manavalan, Gwang Lee
Summary: Protein folding is a complex process where a polymer of amino acids transitions from an unfolded state to a unique three-dimensional structure. Previous studies have identified structural parameters and examined their relationship with protein folding rate, but these parameters are only applicable to a limited set of proteins. Machine learning models have been proposed, but they fail to explain plausible folding mechanisms. In this study, ten different machine learning algorithms were evaluated using various structural parameters and network centrality measures, with support vector machine showing the best predictive capability.
COMPUTERS IN BIOLOGY AND MEDICINE
(2023)
Article
Biology
Phasit Charoenkwan, Pramote Chumnanpuen, Nalini Schaduangrat, Changmin Oh, Balachandran Manavalan, Watshara Shoombuatong
Summary: In this study, a novel computational approach called PSRQSP was developed to improve the prediction and analysis of QSPs. Experimental results showed that PSRQSP outperformed conventional methods in identifying QSPs, demonstrating its predictive capability and effectiveness. An easy-to-use PSRQSP web server was also constructed to accelerate the discovery of potential QSPs for drug development.
COMPUTERS IN BIOLOGY AND MEDICINE
(2023)
Article
Biochemistry & Molecular Biology
Phasit Charoenkwan, Nalini Schaduangrat, Nhat Truong Pham, Balachandran Manavalan, Watshara Shoombuatong
Summary: This study proposed the first stack-based approach, Pretoria, for accurate and large-scale identification of CD8+ T-cell epitopes (TCEs) of eukaryotic pathogens. A pool of 144 different machine learning (ML)-based classifiers was constructed from 12 popular ML algorithms, and a feature selection method was used to determine the important ML classifiers for building the stacked model. Experimental results demonstrated that Pretoria outperformed several conventional ML classifiers and the existing method, with an accuracy of 0.866, MCC of 0.732, and AUC of 0.921 in the independent test.
INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES
(2023)
Article
Biochemistry & Molecular Biology
Ahmad Firoz, Adeel Malik, Hani Mohammed Ali, Yusuf Akhter, Balachandran Manavalan, Chang-Bae Kim
Summary: In this study, a new two-layer hybrid framework called PRR-HyPred was constructed to simultaneously predict and classify PRRs. Using support vector machine and random forest-based classifiers, PRR-HyPred achieved accuracies of 83.4% and 95% in the first and second layers, respectively. This is the first study that can predict and classify PRRs into specific families, and it can be a valuable tool for large-scale PRR prediction and classification, facilitating future studies.
INTERNATIONAL JOURNAL OF BIOLOGICAL MACROMOLECULES
(2023)
Article
Biochemistry & Molecular Biology
Kazuhiro Maeda, Hiroyuki Kurata
Summary: This article presents a new approach called KinModGPT that generates kinetic models directly from natural language text. KinModGPT utilizes GPT as a natural language interpreter and Tellurium as an SBML generator. The effectiveness of KinModGPT in creating SBML kinetic models from complex natural language descriptions is demonstrated, including metabolic pathways, protein-protein interaction networks, and heat shock response. This article showcases the potential of KinModGPT in kinetic modeling automation.
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2023)
Article
Computer Science, Artificial Intelligence
Diponkor Bala, Md. Shamim Hossain, Mohammad Alamgir Hossain, Md. Ibrahim Abdullah, Md. Mizanur Rahman, Balachandran Manavalan, Naijie Gu, Mohammad S. Islam, Zhangjin Huang
Summary: The monkeypox virus poses a new pandemic threat. However, there is currently no reliable monkeypox database available for training and testing deep learning models. The MSID dataset has been developed for this purpose, providing a collection of monkeypox patient images for building confident deep learning models. The proposed MonkeyNet model can accurately identify monkeypox disease and assist doctors in making early diagnoses.
Review
Biochemical Research Methods
Le Thi Phan, Changmin Oh, Tao He, Balachandran Manavalan
Summary: Enhancers are non-coding DNA elements that enhance the transcription rate of specific genes. Computational platforms have been developed to complement experimental methods in identifying enhancers. This review provides an overview of machine learning-based prediction methods and databases for enhancer identification and discusses the advantages and drawbacks of these methods, as well as guidelines for developing more efficient enhancer predictors.
Article
Computer Science, Artificial Intelligence
Nhat Truong Pham, Duc Ngoc Minh Dang, Ngoc Duy Nguyen, Thanh Thi Nguyen, Hai Nguyen, Balachandran Manavalan, Chee Peng Lim, Sy Dzung Nguyen
Summary: This paper proposes a deep learning framework for speech emotion recognition, which combines a hybrid data augmentation method and deep attention-based dilated convolutional-recurrent neural networks. The framework is able to extract high-level representations from three-dimensional log Mel spectrogram features. Experimental results show that the proposed framework outperforms other state-of-the-art methods on the EmoDB and ERC datasets.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Biology
Shaherin Basith, Balachandran Manavalan, Gwang Lee
Summary: This study combined microsecond-scale unbiased molecular dynamics simulation with network analysis to elucidate the local and global conformational changes and allosteric communications in SOD1 systems. Structural analyses revealed significant variations in catalytic sites and stability due to unmetallated SOD1 systems and cysteine mutations. Dynamic motion analysis showed balanced atomic displacement and highly correlated motions in the Holo system.
COMPUTERS IN BIOLOGY AND MEDICINE
(2024)
Article
Multidisciplinary Sciences
Phasit Charoenkwan, Sajee Waramit, Pramote Chumnanpuen, Nalini Schaduangrat, Watshara Shoombuatong
Summary: HCV infection causes chronic liver diseases, and there is no effective vaccine available. This study proposes a novel approach called TROLLOPE to accurately identify TCE-HCVs from sequence information, with superior predictive performance.