Article

Null space based feature selection method for gene expression data

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS (2012)

Journal

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS

Volume 3, Issue 4, Pages 269-276

Publisher

SPRINGER HEIDELBERG

DOI: 10.1007/s13042-011-0061-9

Keywords

Feature selection; Null space; DNA microarray gene expression data; Classification accuracy; Biological significance

Categories

Computer Science, Artificial Intelligence

Funding

Grants-in-Aid for Scientific Research [10F00364] Funding Source: KAKEN

Abstract

Feature selection is quite an important process in gene expression data analysis. Feature selection methods discard unimportant genes from several thousands of genes for finding important genes or pathways for the target biological phenomenon like cancer. The obtained gene subset is used for statistical analysis for prediction such as survival as well as functional analysis for understanding biological characteristics. In this paper we propose a null space based feature selection method for gene expression data in terms of supervised classification. The proposed method discards the redundant genes by applying the information of null space of scatter matrices. We derive the method theoretically and demonstrate its effectiveness on several DNA gene expression datasets. The method is easy to implement and computationally efficient.
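
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way null space information from scatter matrices could be used to rank genes. It assumes (this is an assumption, not the paper's stated procedure) that a null-space LDA orientation W is computed inside the range of the total scatter matrix, and that each gene is scored by the norm of its row in W. The function name `null_space_feature_ranking` and the `tol` parameter are hypothetical.

```python
import numpy as np

def null_space_feature_ranking(X, y, tol=1e-8):
    """Rank features by row norms of a null-space LDA orientation (illustrative sketch)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    Xc = X - X.mean(axis=0)

    # Orthonormal basis U (d x r) for the range of the total scatter matrix,
    # obtained from the thin SVD of the centered data.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    rank = int(np.sum(s > tol * s.max()))
    U = Vt[:rank].T

    # Within-class deviations, projected into the reduced r-dimensional space.
    Hw = np.vstack([X[y == c] - X[y == c].mean(axis=0) for c in np.unique(y)])
    Zw = Hw @ U
    Sw_r = Zw.T @ Zw  # reduced within-class scatter (r x r)

    # Null space of the reduced within-class scatter: eigenvectors whose
    # eigenvalues are numerically zero.
    evals, evecs = np.linalg.eigh(Sw_r)
    null_basis = evecs[:, evals < tol * max(evals.max(), 1.0)]

    # Orientation in the original gene space; one row per gene.
    W = U @ null_basis
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(scores)[::-1], scores

# Synthetic usage example (made-up data): 20 samples, 50 genes, two classes,
# with gene 0 carrying the class signal.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 10)
X = rng.normal(size=(20, 50))
X[:, 0] += 5.0 * np.where(y == 1, 1.0, -1.0)

order, scores = null_space_feature_ranking(X, y)
print("top-ranked genes:", order[:5])
```

Genes at the end of `order` contribute least to the null-space orientation and would be the first candidates to discard under this assumed scoring rule. Working in the reduced space keeps the eigenproblem at most n x n, where n is the number of samples, which matters when there are thousands of genes.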

Recommended

Article Computer Science, Information Systems

Self-regularized Lasso for selection of most informative features in microarray cancer classification

Mehrdad Vatankhah, Mohammadreza Momenzadeh

Summary: In this article, a new method is introduced to improve the performance of the Lasso feature selection model. The method finds the best regularization parameter automatically to achieve optimal performance in DNA microarray data classification. Experimental results demonstrate that the proposed Lasso outperforms other feature selection methods in terms of selecting the best features for microarray data classification, showing robustness and stability. It is a powerful algorithm for selecting informative features, which can be applied in cancer diagnosis using gene expression profiles.

MULTIMEDIA TOOLS AND APPLICATIONS (2023)

Article Computer Science, Artificial Intelligence

GENEmops: Supervised feature selection from high dimensional biomedical dataset

Prativa Agarwalla, Sumitra Mukhopadhyay

Summary: This paper proposes a framework called GENEmops for gene selection and subsequent cancer classification. The framework uses a multi-objective player selection strategy and employs multi-filtering and adaptive parameter tuning methods for gene selection. By introducing a new graded rotational blending operator, the framework improves the performance of the hybrid wrapper scheme. Experimental results demonstrate the efficiency of the proposed framework.

APPLIED SOFT COMPUTING (2022)

Article Computer Science, Artificial Intelligence

Representative null space LDA for discriminative dimensionality reduction

Zaixing He, Mengtian Wu, Xinyue Zhao, Shuyou Zhang, Jianrong Tan

Summary: NLDA, initially proposed to overcome singularity issues, has been enhanced through the introduction of RNLDA to prevent overfitting and improve performance. Extensive experiments show that RNLDA outperforms state-of-the-art DDR methods.

PATTERN RECOGNITION (2021)

Article Computer Science, Interdisciplinary Applications

Classification of breast cancer using microarray gene expression data: A survey

Muhammed Abd-Elnaby, Marco Alfonse, Mohamed Roushdy

Summary: Researchers reviewed and studied feature selection and classification techniques in order to improve cancer classification based on microarray data.

JOURNAL OF BIOMEDICAL INFORMATICS (2021)

Article Computer Science, Information Systems

A New Optimized Wrapper Gene Selection Method for Breast Cancer Prediction

Heyam H. Al-Baity, Nourah Al-Mutlaq

Summary: A new optimized wrapper gene selection method based on simulated annealing algorithm was proposed to assist in breast cancer prediction, showing superior performance in accuracy and execution time through experiments.

CMC-COMPUTERS MATERIALS & CONTINUA (2021)

Article Engineering, Chemical

Hybrid Filter and Genetic Algorithm-Based Feature Selection for Improving Cancer Classification in High-Dimensional Microarray Data

Waleed Ali, Faisal Saeed

Summary: Advancements in intelligent systems have greatly contributed to the fields of bioinformatics, health, and medicine. This paper proposes a hybrid filter-genetic feature selection approach to improve the performance of cancer classification by addressing the high-dimensionality and noisy nature of microarray data. Experimental results demonstrate that the proposed method outperforms common machine learning methods in terms of Accuracy, Recall, Precision, and F-measure.

PROCESSES (2023)

Article Computer Science, Artificial Intelligence

Theoretical and empirical analysis of filter ranking methods: Experimental study on benchmark DNA microarray data

Kushal Kanti Ghosh, Shemim Begum, Aritra Sardar, Sukdev Adhikary, Manosij Ghosh, Munish Kumar, Ram Sarkar

Summary: DNA microarray experiments provide information about cell and tissue states, with only a few genes playing a significant role in disease classification. Feature selection algorithms aim to efficiently identify relevant features, with feature ranking techniques assigning importance to features without using learning algorithms. This paper extensively studies 10 popular filter ranking methods and their performance on various microarray datasets using different classifiers. The experiments show that Mutual Information is the most effective method among Entropy based methods, ReliefF is best in the Similarity based methods category, and Chi-square performs well in the Statistics based methods category.

EXPERT SYSTEMS WITH APPLICATIONS (2021)

Article Automation & Control Systems

A hybrid feature selection scheme for high-dimensional data

Mohammad Ahmadi Ganjei, Reza Boostani

Summary: In this paper, a new hybrid feature selection approach that combines filter and wrapper methods is proposed. By ranking, clustering, and searching the features, this method achieves better performance on high-dimensional datasets.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2022)

Article Engineering, Chemical

Feature Selection of Microarray Data Using Simulated Kalman Filter with Mutation

Nurhawani Ahmad Zamri, Nor Azlina Ab Aziz, Thangavel Bhuvaneswari, Nor Hidayati Abdul Aziz, Anith Khairunnisa Ghazali

Summary: This paper proposes the use of a simulated Kalman filter with mutation (SKF-MUT) for feature selection of microarray data to enhance the classification accuracy of ANN. The algorithm effectively selects informative gene features, leading to classification accuracy ranging from 95% to 100% on various cancer datasets.

PROCESSES (2023)

Review Computer Science, Artificial Intelligence

Gene reduction and machine learning algorithms for cancer classification based on microarray gene expression data: A comprehensive review

Sarah Osama, Hassan Shaban, Abdelmgeid A. Ali

Summary: This review explores the applications of machine learning-based data reduction and classification algorithms in microarray gene expression data. It summarizes various data preprocessing methods, reviews different feature selection algorithms, and discusses feature extraction and hybrid methods. It also examines widely used machine learning algorithms for tumor and nontumor classification. Finally, the challenges and unanswered questions in accurate cancer classification and detection are highlighted.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Mathematics

Redundancy Is Not Necessarily Detrimental in Classification Problems

Sebastian Alberto Grillo, Jose Luis Vazquez Noguera, Julio Cesar Mello Roman, Miguel Garcia-Torres, Jacques Facon, Diego P. Pinto-Roa, Luis Salgueiro Romero, Francisco Gomez-Vela, Laura Raquel Bareiro Paniagua, Deysi Natalia Leguizamon Correa

Summary: This study analyzes the impact of redundant features on classification model performance and proposes a theoretical framework for analyzing feature construction and selection. The experimental results suggest that a large number of redundant features can reduce the classification error.

MATHEMATICS (2021)

Article Biochemical Research Methods

Multi-scale deep learning for the imbalanced multi-label protein subcellular localization prediction based on immunohistochemistry images

Fengsheng Wang, Leyi Wei

Summary: In this study, we propose a novel multi-scale end-to-end deep learning model, MSTLoc, for identifying protein subcellular locations in the imbalanced multi-label immunohistochemistry (IHC) images dataset. We demonstrate that the proposed MSTLoc outperforms current state-of-the-art models in multi-label subcellular location prediction. Through feature visualization and interpretation analysis, we show that the multi-scale deep features learned from our model exhibit better ability in capturing discriminative patterns underlying protein subcellular locations, and the features from different scales are complementary for the improvement in performance. Case study results indicate that our MSTLoc can successfully identify some biomarkers from proteins that are closely involved in cancer development.

BIOINFORMATICS (2022)

Article Computer Science, Artificial Intelligence

Gene selection for microarray data classification via multi-objective graph theoretic-based method

Mehrdad Rostami, Saman Forouzandeh, Kamal Berahmand, Mina Soltani, Meisam Shahsavari, Mourad Oussalah

Summary: The proposed social network analysis-based gene selection approach aims to maximize relevance and minimize redundancy of selected genes by repetitively selecting maximum communities and using node centrality-based criteria. This method improves classification accuracy of microarray data while reducing time complexity.

ARTIFICIAL INTELLIGENCE IN MEDICINE (2022)

Article Automation & Control Systems

GAMB-GNN: Graph Neural Networks learning from gene structure relations and Markov Blanket ranking for cancer classification in microarray data

Shoujia Zhang, Weidong Xie, Wei Li, Linjie Wang, Chaolu Feng

Summary: Microarray data plays a significant role in cancer classification and prediction. This paper proposes a GAMB-GNN model that utilizes gene attributes and multi-type relation networks to address the limitations of previous methods. By using a gene ranking algorithm based on Markov Blanket, GAMB-GNN obtains gene scores and rankings, and constructs a multi-type gene relations graph. Experimental results on six microarray datasets demonstrate that GAMB-GNN significantly outperforms baseline and state-of-the-art methods in terms of accuracy and f1-score.

CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS (2023)

Article Computer Science, Artificial Intelligence

Short isometric shapelet transform for binary time series classification

Weibo Shu, Yaqiang Yao, Shengfei Lyu, Jinlong Li, Huanhuan Chen

Summary: In the research area of time series classification, a novel algorithm called short isometric shapelet transform (SIST) is introduced in this paper to reduce time complexity by fixing the length of shapelet and training a single linear classifier. The theoretical evidence and empirical experiments demonstrate the superior performance of the proposed algorithm in terms of near-lossless accuracy while reducing time complexity.

KNOWLEDGE AND INFORMATION SYSTEMS (2021)

Article Biochemical Research Methods

Single-stranded and double-stranded DNA-binding protein prediction using HMM profiles

Ronesh Sharma, Shiu Kumar, Tatsuhiko Tsunoda, Thirumananseri Kumarevel, Alok Sharma

Summary: DNA-binding proteins play essential roles in cellular processes, with single-stranded and double-stranded proteins classified based on their interactions with DNA. Computational prediction of these proteins aids in understanding their functions and binding domains. A proposed method using hidden Markov model profiles achieved improved performance compared to benchmark methods, with approximately 3% overall improvement.

ANALYTICAL BIOCHEMISTRY (2021)

Article Computer Science, Artificial Intelligence

OPTICAL+: a frequency-based deep learning scheme for recognizing brain wave signals

Shiu Kumar, Ronesh Sharma, Alok Sharma

Summary: This study introduces a frequency-based approach using LSTM network for recognizing different brain wave signals, incorporating adaptive filtering with genetic algorithm to achieve improved performance compared to existing methods.

PEERJ COMPUTER SCIENCE (2021)

Article Biochemical Research Methods

Critical assessment of protein intrinsic disorder prediction

Marco Necci, Damiano Piovesan, Silvio C. E. Tosatto

Summary: Intrinsically disordered proteins present a challenge to traditional protein structure-function analysis, with computational methods, particularly deep learning techniques, showing superior performance in predicting disorder. However, predicting disordered binding regions remains difficult, and there is a significant variation in computational times among methods.

NATURE METHODS (2021)

Article Biochemical Research Methods

Forecasting the spread of COVID-19 using LSTM network

Shiu Kumar, Ronesh Sharma, Tatsuhiko Tsunoda, Thirumananseri Kumarevel, Alok Sharma

Summary: The rapid spread of the COVID-19 pandemic globally has had a profound impact, making it crucial to predict when a country may be able to contain the virus. Researchers successfully forecasted the date when New Zealand contained the virus using a long short-term memory network model and have applied it to other countries as well.

BMC BIOINFORMATICS (2021)

Article Biochemical Research Methods

SPECTRA: a tool for enhanced brain wave signal recognition

Shiu Kumar, Tatsuhiko Tsunoda, Alok Sharma

Summary: The proposed SPECTRA predictor achieved the lowest average error rates and highest average kappa coefficient values compared to other methods, demonstrating its effectiveness in improving brain wave signal recognition for the development of computationally efficient real-time BCI systems.

BMC BIOINFORMATICS (2021)

Article Biochemical Research Methods

DeepFeature: feature selection in nonimage data using convolutional neural network

Alok Sharma, Artem Lysenko, Keith A. Boroevich, Edwin Vans, Tatsuhiko Tsunoda

Summary: Artificial intelligence methods, particularly deep neural networks such as convolutional neural networks, offer capabilities for discovering complex biological mechanisms from raw data. However, interpreting the results of these methods in a biomedical context remains a challenge. This study introduces an approach using CNN for nonimage data feature selection, showing promising results for predicting cancer types and identifying key pathways.

BRIEFINGS IN BIOINFORMATICS (2021)

Article Chemistry, Medicinal

RNA Backbone Torsion and Pseudotorsion Angle Prediction Using Dilated Convolutional Neural Networks

Jaswinder Singh, Kuldip Paliwal, Jaspreet Singh, Yaoqi Zhou

Summary: The dilated convolutional neural network method SPOT-RNA-1D predicts RNA backbone torsion and pseudotorsion angles with smaller mean absolute errors compared to random and helix prediction methods. It accurately recovers overall patterns of angle distributions but faces difficulty in predicting angles further away from bases involved in tertiary interactions. SPOT-RNA-1D yields more accurate dihedral angles than the best models in RNA-puzzles experiments, showing potential as model quality indicators and restraints for RNA structure prediction.

JOURNAL OF CHEMICAL INFORMATION AND MODELING (2021)

Article Biochemical Research Methods

CluSem: Accurate clustering-based ensemble method to predict motor imagery tasks from multi-channel EEG data

Md Ochiuddin Miah, Rafsanjani Muhammod, Khondaker Abdullah Al Mamun, Dewan Md Farid, Shiu Kumar, Alok Sharma, Abdollah Dehzangi

Summary: This paper introduces a novel clustering-based ensemble technique called CluSem to enhance the classification performance of real-time BCI applications. A new brain game named CluGame is developed using this method to evaluate the classification performance of real-time motor imagery movements. Results show that CluSem improves classification accuracy by 5% to 15% compared to existing methods on collected and publicly available EEG datasets.

JOURNAL OF NEUROSCIENCE METHODS (2021)

Article Multidisciplinary Sciences

ACP-MHCNN: an accurate multi-headed deep-convolutional neural network to predict anticancer peptides

Sajid Ahmed, Rafsanjani Muhammod, Zahid Hossain Khan, Sheikh Adilina, Alok Sharma, Swakkhar Shatabda, Abdollah Dehzangi

Summary: In this study, a new multi-headed deep convolutional neural network model called ACP-MHCNN is proposed for extracting and combining discriminative features from different information sources in an interactive way to identify anticancer peptides. The model outperforms other existing models for anticancer peptide identification by a substantial margin, demonstrating higher accuracy, sensitivity, specificity, precision, and MCC.

SCIENTIFIC REPORTS (2021)

Article Multidisciplinary Sciences

A convolutional neural network based tool for predicting protein AMPylation sites from binary profile representation

Sayed Mehedi Azim, Alok Sharma, Iman Noshadi, Swakkhar Shatabda, Iman Dehzangi

Summary: AMPylation is an emerging post-translational modification that plays a role in neurodevelopment and neurodegeneration. However, there is a lack of computational approaches for predicting AMPylation due to a lack of peptide sequence datasets. In this study, a new dataset and machine learning tool called DeepAmp were introduced, achieving promising results in predicting AMPylation sites in proteins.

SCIENTIFIC REPORTS (2022)

Article Genetics & Heredity

CNN-Pred: Prediction of single-stranded and double-stranded DNA-binding protein using convolutional neural networks

Farnoush Manavi, Alok Sharma, Ronesh Sharma, Tatsuhiko Tsunoda, Swakkhar Shatabda, Iman Dehzangi

Summary: DNA-binding proteins play a vital role in biological activity including replication, packing, and reparation of DNA. They can be classified into single-stranded DNA-binding proteins (SSBs) or double-stranded DNA-binding proteins (DSBs) which help determine their function. Despite previous efforts, the prediction accuracy of DSB and SSB remains limited. In this study, a new method called CNN-Pred is proposed, which accurately predicts DSB and SSB using evolutionary-based features extracted from position specific scoring matrix (PSSM) with a 1D-convolutional neural network (CNN) as the classifier. The results show that CNN-Pred improves DSB and SSB prediction accuracies by more than 4% compared to previous studies. CNN-Pred is available as a standalone tool with its source codes on GitHub: https://github.com/MLBC-lab/CNN-Pred.

GENE (2023)

Article Biochemistry & Molecular Biology

CAID prediction portal: a comprehensive service for predicting intrinsic disorder and binding regions in proteins

Alessio Del Conte, Adel Bouhraoua, Mahta Mehdiabadi, Damiano Clementel, Alexander Miguel Monzon, CAID Predictors, Silvio C. E. Tosatto, Damiano Piovesan

Summary: Intrinsic disorder (ID) in proteins is a well-established phenomenon in structural biology, but measuring its behavior on a large scale is challenging. To address this issue, CAID benchmarks ID predictors and creates a web server, the CAID Prediction Portal, which executes all CAID methods on user-defined sequences. The server generates standardized output, facilitates comparison between methods, and provides a valuable resource for researchers studying ID in proteins.

NUCLEIC ACIDS RESEARCH (2023)

Article Multidisciplinary Sciences

DeepInsight-3D architecture for anti-cancer drug response prediction with deep-learning on multi-omics

Alok Sharma, Artem Lysenko, Keith A. Boroevich, Tatsuhiko Tsunoda

Summary: Modern oncology offers a wide range of treatments, and selecting the best option for each patient is crucial for optimal outcomes. Multi-omics profiling combined with AI-based predictive models show promise in streamlining treatment decisions, but are hindered by high dimensionality of datasets and limited annotated samples. Here, we propose a novel deep learning-based method, DeepInsight-3D, to predict patient-specific anticancer drug response using multi-omics data. This approach converts structured data into images and leverages convolutional neural networks to handle high dimensionality while modeling complex relationships between variables. DeepInsight-3D outperforms other state-of-the-art methods and has the potential to aid in the development of personalized treatment strategies for various cancers.

SCIENTIFIC REPORTS (2023)

Article Biochemical Research Methods

scDeepInsight: a supervised cell-type identification method for scRNA-seq data with deep learning

Shangru Jia, Artem Lysenko, Keith A. Boroevich, Alok Sharma, Tatsuhiko Tsunoda

Summary: Cell-type annotation is a critical step in analyzing scRNA-seq data. Most current methods use unsupervised clustering algorithms, resulting in rough classification. To address this issue, we propose scDeepInsight, a supervised annotation method that performs manifold assignments, data integration, supervised training, outlier detection, and cell-type annotation. It can also identify active genes related to cell-types.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Computer Science, Information Systems

DeepLPC-MHANet: Multi-Head Self-Attention for Augmented Kalman Filter-Based Speech Enhancement

Sujan Kumar Roy, Aaron Nicolson, Kuldip K. Paliwal

Summary: The study investigates the use of MHANet for LPC estimation to reduce bias and improve speech enhancement quality, validated through subjective AB listening tests and seven objective measures.

IEEE ACCESS (2021)
