4.3 Article

A Two-Stage Feature Selection Method for Gene Expression Data

Journal

OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY
Volume 13, Issue 2, Pages 127-137

Publisher

MARY ANN LIEBERT, INC
DOI: 10.1089/omi.2008.0083

Funding

  1. National Science Council in Taiwan [NSC96-2622-E-151-019-CC3, NSC96-2622-E-214-004-CC3, NSC95-2221-E-151-004-MY3, NSC95-2221-E-214-087, NSC95-2622-E-214-004, NSC94-2622-E-151-025-CC3, KMU-EM-97-2.1a]

Microarray data on gene expression profiles help answer a variety of biological questions and contribute to advances in clinical medicine. Gene expression data typically have a high dimension and a small sample size, and generally only a relatively small number of genes are strongly correlated with a given phenotype. Feature (gene) selection is therefore crucial for classifying gene expression profiles correctly: it extracts the genes that influence classification accuracy, eliminates irrelevant genes, and improves the calculation of classification accuracy. In this paper, we propose a two-stage feature selection method. The first stage uses information gain to rank genes; the second combines an improved particle swarm optimization with K-nearest neighbor and support vector machine classifiers, which are used to calculate classification accuracy. The experimental results show that the proposed method effectively selects relevant gene subsets and achieves higher classification accuracy than methods reported in previous studies.
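
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the improved particle swarm optimization, so a plain binary PSO with a sigmoid transfer function stands in for it, fitness is the leave-one-out accuracy of a 1-nearest-neighbor classifier only (no support vector machine), and genes are discretized at their median before information gain is computed. All function names, parameter values, and the synthetic data are assumptions made for illustration.

```python
import math
import random

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def information_gain(column, labels):
    """Gain from splitting one gene at its median (assumed discretization)."""
    cut = sorted(column)[len(column) // 2]
    low = [t for v, t in zip(column, labels) if v < cut]
    high = [t for v, t in zip(column, labels) if v >= cut]
    cond = sum(len(p) / len(labels) * entropy(p) for p in (low, high) if p)
    return entropy(labels) - cond

def rank_genes(X, y, keep):
    """Stage 1: indices of the `keep` genes with the highest information gain."""
    gains = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(gains)), key=lambda j: -gains[j])[:keep]

def loo_1nn_accuracy(X, y, genes):
    """Leave-one-out accuracy of a 1-nearest-neighbor classifier on `genes`."""
    if not genes:
        return 0.0
    hits = 0
    for i, xi in enumerate(X):
        nearest = min((k for k in range(len(X)) if k != i),
                      key=lambda k: sum((xi[g] - X[k][g]) ** 2 for g in genes))
        hits += y[nearest] == y[i]
    return hits / len(X)

def binary_pso(X, y, candidates, particles=8, iters=15, seed=0):
    """Stage 2: plain binary PSO over candidate genes (stand-in, not the paper's variant)."""
    rng = random.Random(seed)
    d = len(candidates)

    def fitness(bits):
        return loo_1nn_accuracy(X, y, [c for c, b in zip(candidates, bits) if b])

    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(particles)]
    vel = [[0.0] * d for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p) for p in pos]
    lead = max(range(particles), key=lambda i: best_fit[i])
    gbest, gfit = best_pos[lead][:], best_fit[lead]
    for _ in range(iters):
        for i in range(particles):
            for j in range(d):
                v = (0.7 * vel[i][j]
                     + 1.5 * rng.random() * (best_pos[i][j] - pos[i][j])
                     + 1.5 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-4.0, min(4.0, v))
                # Sigmoid transfer: velocity sets the probability that the bit is 1.
                pos[i][j] = 1 if rng.random() < 1 / (1 + math.exp(-vel[i][j])) else 0
            f = fitness(pos[i])
            if f > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], f
                if f > gfit:
                    gbest, gfit = pos[i][:], f
    return [c for c, b in zip(candidates, gbest) if b], gfit

def make_toy_data(samples=20, genes=30, seed=1):
    """Synthetic two-class data in which only genes 0 and 1 carry signal."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(samples)]
    X = [[rng.gauss(0, 1) for _ in range(genes)] for _ in range(samples)]
    for row, label in zip(X, y):
        row[0] += 8 * label
        row[1] -= 8 * label
    return X, y

X, y = make_toy_data()
top = rank_genes(X, y, keep=10)           # stage 1: filter by information gain
subset, accuracy = binary_pso(X, y, top)  # stage 2: wrapper search on survivors
```

Ranking first keeps the swarm's search space small: the wrapper stage only explores subsets of the genes that survive the filter, which matters when the number of genes far exceeds the number of samples.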

Recommended

Article Nutrition & Dietetics

Associations between Circulating Markers of Cholesterol Homeostasis and Macrovascular Events among Patients Undergoing Hemodialysis

Wen-Chin Lee, Wei-Hung Kuo, Sin-Hua Moi, Barry Chiu, Jin-Bor Chen, Cheng-Hong Yang

Summary: This study aimed to investigate the differences in cholesterol synthesis and absorption between hemodialysis patients and healthy controls. Results showed that markers for cholesterol homeostasis were not significantly associated with macrovascular events during a 1-year follow-up, shedding light on potential novel therapeutic targets in managing cholesterol absorption in hemodialysis patients.

NUTRIENTS (2021)

Article Engineering, Civil

Deep Learning for Imputation and Forecasting Tidal Level

Cheng-Hong Yang, Chih-Hsien Wu, Chih-Min Hsieh, Yi-Chuan Wang, I-Fan Tsen, Shih-Huan Tseng

Summary: Tidal observations can be influenced by mechanical failures or typhoon-induced storms, leading to data interruptions or anomalies, reducing data applicability. A deep learning algorithm for missing value imputation and tide level forecasting has been proposed, with experimental results showing better performance compared to traditional methods.

IEEE JOURNAL OF OCEANIC ENGINEERING (2021)

Article Computer Science, Artificial Intelligence

Applications of Deep Learning and Fuzzy Systems to Detect Cancer Mortality in Next-Generation Genomic Data

Cheng-Hong Yang, Sin-Hua Moi, Ming-Feng Hou, Li-Yeh Chuang, Yu-Da Lin

Summary: In this study, a method named FuzzyDeepCoxPH that combines machine learning and traditional survival analysis was proposed to identify high-risk missense mutation variants and candidate genes associated with cancer mortality. The results showed that FuzzyDeepCoxPH can effectively distinguish high-risk variants and candidate genes related to cancer mortality, providing comprehensive risk estimation for cancer medicine.

IEEE TRANSACTIONS ON FUZZY SYSTEMS (2021)

Article Engineering, Multidisciplinary

Flexible Resource Scheduling for Software-Defined Cloud Manufacturing with Edge Computing

Chen Yang, Fangyin Liao, Shulin Lan, Lihui Wang, Weiming Shen, George Q. Huang

Summary: This research focuses on achieving rapid reconfiguration in a cloud manufacturing environment by proposing a new manufacturing model called software-defined cloud manufacturing (SDCM), which transfers control logic from hardware to software. Edge computing is introduced to complement cloud computing with computation and storage capabilities near end devices. The study also addresses the management of network congestion caused by transmitting a large amount of Internet of Things (IoT) data with different quality of service (QoS) values. An approach integrating genetic algorithm, Dijkstra's shortest path algorithm, and queuing algorithm is proposed to solve the optimization problem. Experimental results demonstrate that the proposed method effectively prevents network congestion and reduces communication latency in the SDCM.

ENGINEERING (2023)

Article Food Science & Technology

Analyzing the Performance of Machine Learning Techniques in Disease Prediction

Khongdet Phasinam, Tamal Mondal, Dony Novaliendry, Cheng-Hong Yang, Chiranjit Dutta, Mohammad Shabaz

Summary: Stored historical data can help companies predict potential patterns and make competitive decisions. This study focuses on the diagnosis and estimation of heart disease; previous research has shown the effectiveness of knowledge exploration methods in predicting heart disease. Currently, there are no real-time methods for analyzing and forecasting heart disease in its early stages.

JOURNAL OF FOOD QUALITY (2022)

Article Mathematics

Identifying the Association of Time-Averaged Serum Albumin Levels with Clinical Factors among Patients on Hemodialysis Using Whale Optimization Algorithm

Cheng-Hong Yang, Yin-Syuan Chen, Sin-Hua Moi, Jin-Bor Chen, Li-Yeh Chuang

Summary: This study employed a whale optimization algorithm-based feature selection model to interpret the complex association between time-averaged serum albumin (TSA) and clinical factors among hemodialysis patients. By conducting a multifactor analysis, an optimal multifactor TSA-associated model was constructed, which exhibited superior performance.

MATHEMATICS (2022)

Article Biochemical Research Methods

DeepBarcoding: Deep Learning for Species Classification Using DNA Barcoding

Cheng-Hong Yang, Kuo-Chuan Wu, Li-Yeh Chuang, Hsueh-Wei Chang

Summary: DNA barcodes are short sequence fragments used for species identification. This study proposes a deep learning framework, called deep barcoding, for species classification using DNA barcodes. By utilizing raw sequence data and deep convolutional neural networks, the deep barcoding model achieves high accuracy in species identification. Although there are challenges, the deep barcoding model has the potential to be an effective tool for species classification.

IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (2022)

Article Mathematics

Deep Learning for Vessel Trajectory Prediction Using Clustered AIS Data

Cheng-Hong Yang, Guan-Cheng Lin, Chih-Hsien Wu, Yen-Hsien Liu, Yi-Chuan Wang, Kuo-Chang Chen

Summary: Accurate vessel track prediction is crucial for maritime traffic control and management to improve navigation efficiency and safety. This study proposed a DLSTM model for vessel prediction, which combines clustering and training techniques. The results demonstrated that the DLSTM model outperformed other models in terms of prediction accuracy.

MATHEMATICS (2022)

Article Biochemical Research Methods

Dimensionality reduction approach for many-objective epistasis analysis

Cheng-Hong Yang, Ming-Feng Hou, Li-Yeh Chuang, Cheng-San Yang, Yu-Da Lin

Summary: This study extended MOMDR to the many-objective version (MaODR) for better identification of SSI between cases and controls. The MaODR-CLN model, with three objective functions - correct classification rate, likelihood ratio, and normalized mutual information, showed higher detection success rates compared to MOMDR and MDR. MaODR-CLN successfully identified significant SSIs associated with coronary artery disease.

BRIEFINGS IN BIOINFORMATICS (2023)

Article Biology

Overall mortality risk analysis for rectal cancer using deep learning-based fuzzy systems

Cheng-Hong Yang, Wen-Ching Chen, Jin-Bor Chen, Hsiu-Chen Huang, Li-Yeh Chuang

Summary: This study proposed an advanced analytic approach, called Fuzzy-based RNNCoxPH, for detecting missense variants associated with a high risk of all-cause mortality in rectum adenocarcinoma. The Fuzzy-based RNNCoxPH model was more effective than the other tested methods at identifying and classifying missense variants related to mortality risk in rectum adenocarcinoma.

COMPUTERS IN BIOLOGY AND MEDICINE (2023)

Article Computer Science, Artificial Intelligence

Export- and import-based economic models for predicting global trade using deep learning

Cheng-Hong Yang, Cheng-Feng Lee, Po-Yin Chang

Summary: Forecasting global foreign trade is crucial for governments and multinational corporations, but accurate predictions are challenging due to complex relationships between exports, imports, and economic variables. Traditional models provide less accurate forecasts for trade data. This study proposes an ensemble learning approach that combines trade and deep learning models to improve forecasting performance. The method establishes cointegration relationships between variables and uses them to predict future trade data. Experimental results show that the ensemble learning method outperforms traditional models in terms of forecasting accuracy.

EXPERT SYSTEMS WITH APPLICATIONS (2023)

Article Biochemical Research Methods

Dimensionality reduction approach for many-objective epistasis analysis

Cheng-Hong Yang, Ming-Feng Hou, Li-Yeh Chuang, Cheng-San Yang, Yu-Da Lin

Summary: This study extended the multiobjective approach-based multifactor dimensionality reduction (MOMDR) to the many-objective version (MaODR) to improve the identification of single-nucleotide polymorphism-single-nucleotide polymorphism interactions (SSIs) between cases and controls. An objective function selection approach was introduced to determine the optimal measure combination in MaODR among 10 well-known measures. The results showed that the MaODR-CLN model exhibited higher detection success rates in identifying SSIs with weak marginal effects.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Pharmacology & Pharmacy

Machine Learning approaches for the mortality risk assessment of patients undergoing hemodialysis

Cheng-Hong Yang, Yin-Syuan Chen, Sin-Hua Moi, Jin-Bor Chen, Lin Wang, Li-Yeh Chuang

Summary: This study aimed to assess the all-cause mortality risk in hemodialysis (HD) patients and compared the performance of different Cox proportional hazards (CoxPH) models. The whale optimization algorithm (WOA)-CoxPH model showed the highest concordance index and provided better risk assessment compared to other models. Patients with seven or more risk characteristics of eight selected parameters were found to have a potentially increased risk of all-cause mortality in the HD population.

THERAPEUTIC ADVANCES IN CHRONIC DISEASE (2022)

Article Computer Science, Information Systems

AIS-Based Intelligent Vessel Trajectory Prediction Using Bi-LSTM

Cheng-Hong Yang, Chih-Hsien Wu, Jen-Chung Shao, Yi-Chuan Wang, Chih-Min Hsieh

Summary: Accurate vessel trajectory prediction is crucial for maritime traffic control and management, aiding in route planning, distance reduction, and increased efficiency. This study proposes a method that combines data denoising and deep learning prediction to improve accuracy. Experimental results demonstrate the effectiveness of the proposed method.

IEEE ACCESS (2022)

Article Pharmacology & Pharmacy

Identification of mortality-risk-related missense variant for renal clear cell carcinoma using deep learning

Jin-Bor Chen, Huai-Shuo Yang, Sin-Hua Moi, Li-Yeh Chuang, Cheng-Hong Yang

Summary: The improved DeepSurv model achieved greater balanced accuracy compared with the DeepSurv model and identified 610 high-risk variants associated with overall mortality. The results of gene differential expression analysis indicated nine KIRCC mortality-risk-related pathways, suggesting their associations with cancer cell growth, cancer cell differentiation, and immune response inhibition. The findings support the effectiveness of the improved DeepSurv model in identifying mortality-related high-risk variants and candidate genes in the context of KIRCC overall mortality.

THERAPEUTIC ADVANCES IN CHRONIC DISEASE (2021)
