Article
Computer Science, Information Systems
Mehrdad Vatankhah, Mohammadreza Momenzadeh
Summary: In this article, a new method is introduced to improve the performance of the Lasso feature selection model. The method finds the best regularization parameter automatically to achieve optimal performance in DNA microarray data classification. Experimental results demonstrate that the proposed Lasso outperforms other feature selection methods in terms of selecting the best features for microarray data classification, showing robustness and stability. It is a powerful algorithm for selecting informative features, which can be applied in cancer diagnosis using gene expression profiles.
MULTIMEDIA TOOLS AND APPLICATIONS
(2023)
Article
Computer Science, Artificial Intelligence
Prativa Agarwalla, Sumitra Mukhopadhyay
Summary: This paper proposes a framework called GENEmops for gene selection and subsequent cancer classification. The framework uses a multi-objective player selection strategy and employs multi-filtering and adaptive parameter tuning methods for gene selection. By introducing a new graded rotational blending operator, the framework improves the performance of the hybrid wrapper scheme. Experimental results demonstrate the efficiency of the proposed framework.
APPLIED SOFT COMPUTING
(2022)
Article
Computer Science, Artificial Intelligence
Zaixing He, Mengtian Wu, Xinyue Zhao, Shuyou Zhang, Jianrong Tan
Summary: NLDA, initially proposed to overcome singularity issues, has been enhanced through the introduction of RNLDA to prevent overfitting and improve performance. Extensive experiments show that RNLDA outperforms state-of-the-art DDR methods.
PATTERN RECOGNITION
(2021)
Article
Computer Science, Interdisciplinary Applications
Muhammed Abd-Elnaby, Marco Alfonse, Mohamed Roushdy
Summary: Researchers reviewed and studied feature selection and classification techniques in order to improve cancer classification based on microarray data.
JOURNAL OF BIOMEDICAL INFORMATICS
(2021)
Article
Computer Science, Information Systems
Heyam H. Al-Baity, Nourah Al-Mutlaq
Summary: A new optimized wrapper gene selection method based on the simulated annealing algorithm is proposed to assist in breast cancer prediction. Experiments show superior accuracy and execution time.
CMC-COMPUTERS MATERIALS & CONTINUA
(2021)
Article
Engineering, Chemical
Waleed Ali, Faisal Saeed
Summary: Advancements in intelligent systems have greatly contributed to the fields of bioinformatics, health, and medicine. This paper proposes a hybrid filter-genetic feature selection approach to improve the performance of cancer classification by addressing the high-dimensionality and noisy nature of microarray data. Experimental results demonstrate that the proposed method outperforms common machine learning methods in terms of Accuracy, Recall, Precision, and F-measure.
Article
Computer Science, Artificial Intelligence
Kushal Kanti Ghosh, Shemim Begum, Aritra Sardar, Sukdev Adhikary, Manosij Ghosh, Munish Kumar, Ram Sarkar
Summary: DNA microarray experiments provide information about cell and tissue states, with only a few genes playing a significant role in disease classification. Feature selection algorithms aim to efficiently identify relevant features, with feature ranking techniques assigning importance to features without using learning algorithms. This paper extensively studies 10 popular filter ranking methods and their performance on various microarray datasets using different classifiers. The experiments show that Mutual Information is the most effective method among Entropy based methods, ReliefF is best in the Similarity based methods category, and Chi-square performs well in the Statistics based methods category.
EXPERT SYSTEMS WITH APPLICATIONS
(2021)
Article
Automation & Control Systems
Mohammad Ahmadi Ganjei, Reza Boostani
Summary: In this paper, a new hybrid feature selection approach that combines filter and wrapper methods is proposed. By ranking, clustering, and searching the features, this method achieves better performance on high-dimensional datasets.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2022)
Article
Engineering, Chemical
Nurhawani Ahmad Zamri, Nor Azlina Ab Aziz, Thangavel Bhuvaneswari, Nor Hidayati Abdul Aziz, Anith Khairunnisa Ghazali
Summary: This paper proposes the use of a simulated Kalman filter with mutation (SKF-MUT) for feature selection of microarray data to enhance the classification accuracy of ANN. The algorithm effectively selects informative gene features, leading to classification accuracy ranging from 95% to 100% on various cancer datasets.
Review
Computer Science, Artificial Intelligence
Sarah Osama, Hassan Shaban, Abdelmgeid A. Ali
Summary: This review explores the applications of machine learning-based data reduction and classification algorithms in microarray gene expression data. It summarizes various data preprocessing methods, reviews different feature selection algorithms, and discusses feature extraction and hybrid methods. It also examines widely used machine learning algorithms for tumor and nontumor classification. Finally, the challenges and unanswered questions in accurate cancer classification and detection are highlighted.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Mathematics
Sebastian Alberto Grillo, Jose Luis Vazquez Noguera, Julio Cesar Mello Roman, Miguel Garcia-Torres, Jacques Facon, Diego P. Pinto-Roa, Luis Salgueiro Romero, Francisco Gomez-Vela, Laura Raquel Bareiro Paniagua, Deysi Natalia Leguizamon Correa
Summary: This study analyzes the impact of redundant features on classification model performance and proposes a theoretical framework for analyzing feature construction and selection. The experimental results suggest that a large number of redundant features can reduce the classification error.
Article
Biochemical Research Methods
Fengsheng Wang, Leyi Wei
Summary: In this study, we propose a novel multi-scale end-to-end deep learning model, MSTLoc, for identifying protein subcellular locations in the imbalanced multi-label immunohistochemistry (IHC) images dataset. We demonstrate that the proposed MSTLoc outperforms current state-of-the-art models in multi-label subcellular location prediction. Through feature visualization and interpretation analysis, we show that the multi-scale deep features learned from our model exhibit better ability in capturing discriminative patterns underlying protein subcellular locations, and the features from different scales are complementary for the improvement in performance. Case study results indicate that our MSTLoc can successfully identify some biomarkers from proteins that are closely involved in cancer development.
Article
Computer Science, Artificial Intelligence
Mehrdad Rostami, Saman Forouzandeh, Kamal Berahmand, Mina Soltani, Meisam Shahsavari, Mourad Oussalah
Summary: The proposed social network analysis-based gene selection approach aims to maximize relevance and minimize redundancy of selected genes by repetitively selecting maximum communities and using node centrality-based criteria. This method improves classification accuracy of microarray data while reducing time complexity.
ARTIFICIAL INTELLIGENCE IN MEDICINE
(2022)
Article
Automation & Control Systems
Shoujia Zhang, Weidong Xie, Wei Li, Linjie Wang, Chaolu Feng
Summary: Microarray data plays a significant role in cancer classification and prediction. This paper proposes a GAMB-GNN model that utilizes gene attributes and multi-type relation networks to address the limitations of previous methods. Using a gene ranking algorithm based on the Markov Blanket, GAMB-GNN obtains gene scores and rankings and constructs a multi-type gene relations graph. Experimental results on six microarray datasets demonstrate that GAMB-GNN significantly outperforms baseline and state-of-the-art methods in terms of accuracy and F1-score.
CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS
(2023)
Article
Computer Science, Artificial Intelligence
Weibo Shu, Yaqiang Yao, Shengfei Lyu, Jinlong Li, Huanhuan Chen
Summary: This paper introduces a novel time series classification algorithm, the short isometric shapelet transform (SIST), which reduces time complexity by fixing the shapelet length and training a single linear classifier. Theoretical evidence and empirical experiments show that the algorithm achieves near-lossless accuracy while reducing time complexity.
KNOWLEDGE AND INFORMATION SYSTEMS
(2021)
Article
Biochemical Research Methods
Ronesh Sharma, Shiu Kumar, Tatsuhiko Tsunoda, Thirumananseri Kumarevel, Alok Sharma
Summary: DNA-binding proteins play essential roles in cellular processes and are classified as single-stranded or double-stranded DNA-binding proteins based on their interactions with DNA. Computational prediction of these proteins aids in understanding their functions and binding domains. A proposed method using hidden Markov model profiles improved on benchmark methods by approximately 3% overall.
ANALYTICAL BIOCHEMISTRY
(2021)
Article
Computer Science, Artificial Intelligence
Shiu Kumar, Ronesh Sharma, Alok Sharma
Summary: This study introduces a frequency-based approach using an LSTM network for recognizing different brain wave signals, incorporating adaptive filtering with a genetic algorithm to achieve improved performance compared to existing methods.
PEERJ COMPUTER SCIENCE
(2021)
Article
Biochemical Research Methods
Marco Necci, Damiano Piovesan, Silvio C. E. Tosatto
Summary: Intrinsically disordered proteins present a challenge to traditional protein structure-function analysis, with computational methods, particularly deep learning techniques, showing superior performance in predicting disorder. However, predicting disordered binding regions remains difficult, and there is a significant variation in computational times among methods.
Article
Biochemical Research Methods
Shiu Kumar, Ronesh Sharma, Tatsuhiko Tsunoda, Thirumananseri Kumarevel, Alok Sharma
Summary: The rapid spread of the COVID-19 pandemic globally has had a profound impact, making it crucial to predict when a country may be able to contain the virus. Researchers successfully forecasted the date when New Zealand contained the virus using a long short-term memory network model and have applied it to other countries as well.
BMC BIOINFORMATICS
(2021)
Article
Biochemical Research Methods
Shiu Kumar, Tatsuhiko Tsunoda, Alok Sharma
Summary: The proposed SPECTRA predictor achieved the lowest average error rates and highest average kappa coefficient values compared to other methods, demonstrating its effectiveness in improving brain wave signal recognition for the development of computationally efficient real-time BCI systems.
BMC BIOINFORMATICS
(2021)
Article
Biochemical Research Methods
Alok Sharma, Artem Lysenko, Keith A. Boroevich, Edwin Vans, Tatsuhiko Tsunoda
Summary: Artificial intelligence methods, particularly deep neural networks such as convolutional neural networks (CNNs), offer capabilities for discovering complex biological mechanisms from raw data. However, interpreting the results of these methods in a biomedical context remains a challenge. This study introduces an approach that uses a CNN for feature selection on nonimage data, showing promising results for predicting cancer types and identifying key pathways.
BRIEFINGS IN BIOINFORMATICS
(2021)
Article
Chemistry, Medicinal
Jaswinder Singh, Kuldip Paliwal, Jaspreet Singh, Yaoqi Zhou
Summary: The dilated convolutional neural network method SPOT-RNA-1D predicts RNA backbone torsion and pseudotorsion angles with smaller mean absolute errors compared to random and helix prediction methods. It accurately recovers overall patterns of angle distributions but faces difficulty in predicting angles further away from bases involved in tertiary interactions. SPOT-RNA-1D yields more accurate dihedral angles than the best models in RNA-puzzles experiments, showing potential as model quality indicators and restraints for RNA structure prediction.
JOURNAL OF CHEMICAL INFORMATION AND MODELING
(2021)
Article
Biochemical Research Methods
Md Ochiuddin Miah, Rafsanjani Muhammod, Khondaker Abdullah Al Mamun, Dewan Md Farid, Shiu Kumar, Alok Sharma, Abdollah Dehzangi
Summary: This paper introduces a novel clustering-based ensemble technique called CluSem to enhance the classification performance of real-time BCI applications. A new brain game named CluGame is developed using this method to evaluate the classification performance of real-time motor imagery movements. Results show that CluSem improves classification accuracy by 5% to 15% compared to existing methods on collected and publicly available EEG datasets.
JOURNAL OF NEUROSCIENCE METHODS
(2021)
Article
Multidisciplinary Sciences
Sajid Ahmed, Rafsanjani Muhammod, Zahid Hossain Khan, Sheikh Adilina, Alok Sharma, Swakkhar Shatabda, Abdollah Dehzangi
Summary: In this study, a new multi-headed deep convolutional neural network model called ACP-MHCNN is proposed for extracting and combining discriminative features from different information sources in an interactive way to identify anticancer peptides. The model outperforms other existing models for anticancer peptide identification by a substantial margin, demonstrating higher accuracy, sensitivity, specificity, precision, and MCC.
SCIENTIFIC REPORTS
(2021)
Article
Multidisciplinary Sciences
Sayed Mehedi Azim, Alok Sharma, Iman Noshadi, Swakkhar Shatabda, Iman Dehzangi
Summary: AMPylation is an emerging post-translational modification that plays a role in neurodevelopment and neurodegeneration. However, there is a lack of computational approaches for predicting AMPylation due to a lack of peptide sequence datasets. In this study, a new dataset and machine learning tool called DeepAmp were introduced, achieving promising results in predicting AMPylation sites in proteins.
SCIENTIFIC REPORTS
(2022)
Article
Genetics & Heredity
Farnoush Manavi, Alok Sharma, Ronesh Sharma, Tatsuhiko Tsunoda, Swakkhar Shatabda, Iman Dehzangi
Summary: DNA-binding proteins play a vital role in biological activity, including the replication, packing, and repair of DNA. They can be classified as single-stranded DNA-binding proteins (SSBs) or double-stranded DNA-binding proteins (DSBs), a distinction that helps determine their function. Despite previous efforts, the prediction accuracy for DSBs and SSBs remains limited. In this study, a new method called CNN-Pred is proposed, which predicts DSBs and SSBs using evolutionary-based features extracted from the position-specific scoring matrix (PSSM), with a 1D convolutional neural network (CNN) as the classifier. The results show that CNN-Pred improves DSB and SSB prediction accuracies by more than 4% compared to previous studies. CNN-Pred is available as a standalone tool with its source code on GitHub: https://github.com/MLBC-lab/CNN-Pred.
Article
Biochemistry & Molecular Biology
Alessio Del Conte, Adel Bouhraoua, Mahta Mehdiabadi, Damiano Clementel, Alexander Miguel Monzon, CAID Predictors, Silvio C. E. Tosatto, Damiano Piovesan
Summary: Intrinsic disorder (ID) in proteins is a well-established phenomenon in structural biology, but measuring its behavior on a large scale is challenging. To address this issue, CAID benchmarks ID predictors and creates a web server, the CAID Prediction Portal, which executes all CAID methods on user-defined sequences. The server generates standardized output, facilitates comparison between methods, and provides a valuable resource for researchers studying ID in proteins.
NUCLEIC ACIDS RESEARCH
(2023)
Article
Multidisciplinary Sciences
Alok Sharma, Artem Lysenko, Keith A. Boroevich, Tatsuhiko Tsunoda
Summary: Modern oncology offers a wide range of treatments, and selecting the best option for each patient is crucial for optimal outcomes. Multi-omics profiling combined with AI-based predictive models shows promise in streamlining treatment decisions, but is hindered by the high dimensionality of datasets and limited annotated samples. Here, we propose a novel deep learning-based method, DeepInsight-3D, to predict patient-specific anticancer drug response using multi-omics data. This approach converts structured data into images and leverages convolutional neural networks to handle high dimensionality while modeling complex relationships between variables. DeepInsight-3D outperforms other state-of-the-art methods and has the potential to aid in the development of personalized treatment strategies for various cancers.
SCIENTIFIC REPORTS
(2023)
Article
Biochemical Research Methods
Shangru Jia, Artem Lysenko, Keith A. Boroevich, Alok Sharma, Tatsuhiko Tsunoda
Summary: Cell-type annotation is a critical step in analyzing scRNA-seq data. Most current methods use unsupervised clustering algorithms, resulting in rough classification. To address this issue, we propose scDeepInsight, a supervised annotation method that performs manifold assignments, data integration, supervised training, outlier detection, and cell-type annotation. It can also identify active genes related to cell-types.
BRIEFINGS IN BIOINFORMATICS
(2023)
Article
Computer Science, Information Systems
Sujan Kumar Roy, Aaron Nicolson, Kuldip K. Paliwal
Summary: The study investigates the use of MHANet for LPC estimation to reduce bias and improve speech enhancement quality, validated through subjective AB listening tests and seven objective measures.